python has a built-in sort method for lists:
>>> L = range(10)
>>> print L
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> from random import *
>>> shuffle(L)
>>> print L
[3, 1, 7, 4, 5, 9, 0, 6, 2, 8]
>>> L.sort()
>>> print L
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
How does L.sort()
work, and is it a good sort to use?
How would you sort numbers in a list? If you had to write a function to sort a given list, how would you do it? Try it out using some playing cards -- see if you can come up with a sorting algorithm. Once you have an idea, demo it for me or a ninja. Make sure your function:
Once you have shown us a working sort algorithm, see if you
can code it up in a file called sorts.py
. There are many
different ways to sort a list. We will add 3 or 4 different
sorting functions to this file. Then we can just add
from sorts import *
in our other programs that need a sorting
function.
Make sure you add a testing section to sorts.py
. Here is a
small example (assuming you called your sort function mysort()
:
if __name__ == "__main__":
from random import shuffle
N = 100
L = range(N)
shuffle(L)
mysort(L)
assert L == range(N)
In most languages, to swap two items in a list, you would do this
(assuming i
and j
are valid indecies):
tmp = L[i] # save ith value to temporary location
L[i] = L[j] # copy jth value to ith position
L[j] = tmp # copy from tmp to jth position
Since this is a common operation, python provides a shortcut:
L[j],L[i] = L[i],L[j]
This is the selection sort algorithm:
L
It is called selection sort because each time you are selecting the smallest number from the remaining unsorted elements.
I assume you understand the "find the smallest number in a list" part, but here it is:
ismall = 0 # index of first element in list
for i in range(len(L)):
if L[i] < L[ismall]:
ismall = i # found a smaller item, so save position
# after the for loop, ismall should be the index of the
# smallest item in the list
print "smallest is ", L[ismall]
The selection sort uses the above code, but not just for position 0.
A second outer for loop
is needed to run the above code for all
values from 0 to the last index of the list.
Here is a video of the selection sort algorithm (click to play):
See if you can add selectionSort()
to our sorts.py
file of sorts.
Include some test code at the bottom of sorts.py
to make sure your
selectionSort()
function is working properly.
This is the bubble sort algorithm:
L
Here's a video of the bubble sort algorithm (click to play):
See if you can add bubbleSort()
to our sorts.py
file.
Include some test code at the bottom of sorts.py
to make sure your
bubbleSort()
function is working properly.
This is the insertion sort algorithm:
Notice that, for each index, all items to the left are in order, and you are inserting the next item into the correct spot.
Here is a video of the insertion sort algorithm (click to play):
See if you can add insertionSort()
to our sorts.py
file.
Include some test code at the bottom of sorts.py
!
Each of the sorting algorithms we considered last time (selection,
bubble, and insertion sort) has nested loops, where each loop's
length depends on the size on the list being sorted (N). This means
the number of steps required for each sort, or the runtime, depends
on the size of the list squared. These are called quadratic or
N-squared sorts, because if the size of the list doubles, the number
of steps quadruples. For example, if the runtime of the bubble
sort was 10 seconds for a list of size N
, it would take approximately
40 seconds to sort a list of size 2N
.
Let's see if we can show this by actually timing a few sorts. Here is the code we will use to time a particular sort:
from time import time
t1 = time()
selectionSort(L)
t2 = time()
print "selection sort time: %8.4f" % (t2-t1)
If we do the above for lists of size N
, 2N
, and 4N
, we
should see the quadratic behavior.
$ python timesorts.py
N: 2000
selection sort time: 0.1416
bubble sort time: 0.4503
insertion sort time: 0.2378
python sort time: 0.0003
$ python timesorts.py
N: 4000
selection sort time: 0.5836
bubble sort time: 1.7433
insertion sort time: 0.9037
python sort time: 0.0007
$ python timesorts.py
N: 8000
selection sort time: 2.2732
bubble sort time: 7.1467
insertion sort time: 4.2049
python sort time: 0.0015
Notice three things:
How does the python L.sort()
method work? Are there other, better
sorting algorithms?
Here is the basic merge sort algorithm:
The merge step is actually not that hard, given two sorted lists. If you know each list is already sorted, you can just compare the first item in each list, and take the smaller of the two.
A more interesting question is, how do we sort each of the split lists? Do we use selection sort, bubble sort, or some other sort?
Since we are talking about merge sort, can we use merge sort to sort
each of the split lists? This is called recursion: we are writing
a function (mergeSort()
), and somewhere in that function we call
that function (or a copy of that function) to help solve the problem.
This may seem odd, but it is perfectly fine. If you think about stack
diagrams, calling any function just puts another function frame on the
stack. Assuming there is an end somewhere, it is perfectly fine for a
function to call itself and put another copy on the stack.
So here is a version of the mergeSort()
function, assuming we have
a valid merge()
function, which is not hard to write (we will write
it next week):
def mergeSort(L):
"""use merge sort to sort L from low to high"""
if len(L) > 1:
# split into two lists
half = len(L)/2
L1 = L[0:half]
L2 = L[half:]
# sort each list
mergeSort(L1)
mergeSort(L2)
# merge them back into one sorted list
merge(L1,L2,L)
We will write this and test it next week, after we learn all about recursion...