Our last sorting technique is similar to merge sort in that it is
recursive: we break our array into parts and sort each part. However,
what we do with this sorting technique, called quick sort, is
significantly different. When dividing the array, we set things up so
that the largest element of one part is smaller than the smallest
element of the other part. We then recursively sort each of these
parts.
Suppose that instead of sorting our baseball cards using merge sort, we had
used quick sort. We would have needed a way to divide our big stack into two
smaller stacks, one for each of us to sort. There are something like 792
cards in a set of baseball cards. So, my brother would give me all of the
cards whose numbers were less than 400, and he would take all of the cards
numbered 400 or greater. We would then sort each of our halves. Note that
there would be no need to perform any merging when we finished: every card
in my stack would be smaller than every card in his stack. So, we would just
set the two stacks next to each other and be done with it. Assuming our
division was good, this
technique would not be any worse than merge sort: the division process
would take O(n) time, and the combining would be O(1), whereas with merge
sort the division was O(1), and the combining was O(n).
We could then do something similar to before: have friends come over
and divide the work. I would divide my cards into blocks numbered 1 - 199
and 200 - 399. My friend would sort one, and I would sort the other.
We would then combine them, and then combine this single stack with
the result of what my brother and his friend did.
There are names for the important things that went on here: breaking a
stack into two smaller stacks is known as partitioning, and the value we
choose to use as the division point in partitioning is called the "pivot".
The algorithm for quick sort is fairly simple: if we want to sort elements
left through right of our array, choose some pivot value, partition the
elements into two parts, put the pivot value into the appropriate location,
and then recursively sort each of our two parts.
The difficulty comes in describing the partitioning process. We begin by
choosing an index into our array. The value at that index will be our pivot.
We then swap the element at this location with the very last element of the
portion of the array we want sorted, so the pivot is out of the way while we
work. Next, we create a left reference into our array, initially set to the
leftmost element of the part of the array we want sorted. We march this
reference along, comparing the value at this reference to our pivot value.
We keep incrementing this reference until the value we are examining is
greater than the pivot value (and therefore belongs in the right portion of
the array). We then stop marching the left reference and start moving a
right reference, which begins just to the left of the stashed pivot and is
decremented until it reaches a value which is smaller than the pivot, and
thus belongs in the left portion of the array. We swap these two out-of-place
values and then resume our marching, repeating this process until the left
and right references meet. The meeting point is where our pivot belongs, so
we exchange the element at that index with the last element, where we
stashed the pivot.
It isn't really as bad as the description above makes it sound. The main
reason quick sort is complicated is that the partitioning may not create two
perfectly equal halves. What makes it worse is that we do not know ahead of
time how our array will divide: we don't know the sizes of the two portions.
So we can't simply create two separate arrays the way we did with merge
sort. That is why it is necessary to use all of these "left references" and
"right references" and do the rearranging within the array itself. The
technique is formalized below.
quickSort(A, left, right)
Input: An unsorted array, A, left and right indices into A
Postcondition: Elements of A between left and right are sorted
    if (left < right)
        middle = partition(A, left, right)
        quickSort(A, left, middle - 1)
        quickSort(A, middle + 1, right)
partition(A, left, right)
Input: An unsorted array, A, left and right indices into A
Postcondition: creates a middle index such that for all indices l in
[left, middle), A[l] <= A[middle], and for all indices r in (middle, right],
A[r] >= A[middle]
Returns: the middle index which satisfies the above
    if right - left <= 1
        if A[left] > A[right]
            swap(A, left, right)
        return left
    pivot_position = choosePivot(A, left, right)
    pivot = A[pivot_position]
    swap(A, pivot_position, right)      // stash the pivot at the right end
    l = left
    r = right - 1                       // scan only the elements left of the stashed pivot
    while l < r
        while A[l] <= pivot && l < r    // march l right past elements that belong on the left
            l++
        while A[r] > pivot && r > l     // march r left past elements that belong on the right
            r--
        if l < r
            swap(A, l, r)               // both elements are out of place; exchange them
    if A[l] <= pivot                    // l == r here; step past it if it belongs on the left
        l++
    middle = l
    swap(A, middle, right)              // move the pivot into its final position
    return middle
choosePivot(A, left, right)
Input: An unsorted array, A, left and right indices into A
Returns: The pivot position that will be used in quickSort
    Several possibilities:
        return left
        return random(left, right)
        return the index (left, (left+right)/2, or right) holding the median
            of A[left], A[(left+right)/2], A[right]
I have a series which shows partitioning.
It uses the left element as our pivot.
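If you prefer code to pictures, here is a rough Python sketch of the
pseudocode above (the function names and the small test at the end are mine,
not part of the algorithm). It uses the simplest choosePivot strategy, the
leftmost element, to match the series:

def choose_pivot(a, left, right):
    # Simplest strategy from choosePivot above: always use the leftmost element.
    return left

def partition(a, left, right):
    # Rearrange a[left..right] around a pivot and return the pivot's final index.
    if right - left <= 1:
        if a[left] > a[right]:
            a[left], a[right] = a[right], a[left]
        return left
    pivot_position = choose_pivot(a, left, right)
    pivot = a[pivot_position]
    a[pivot_position], a[right] = a[right], a[pivot_position]  # stash the pivot at the right end
    l, r = left, right - 1              # scan only the elements left of the stashed pivot
    while l < r:
        while a[l] <= pivot and l < r:  # march l right past elements that belong on the left
            l += 1
        while a[r] > pivot and r > l:   # march r left past elements that belong on the right
            r -= 1
        if l < r:
            a[l], a[r] = a[r], a[l]     # both are out of place; exchange them
    if a[l] <= pivot:                   # l == r here; step past it if it belongs on the left
        l += 1
    middle = l
    a[middle], a[right] = a[right], a[middle]  # move the pivot into its final spot
    return middle

def quick_sort(a, left, right):
    if left < right:
        middle = partition(a, left, right)
        quick_sort(a, left, middle - 1)
        quick_sort(a, middle + 1, right)

cards = [312, 17, 488, 790, 205, 17, 651]   # made-up card numbers
quick_sort(cards, 0, len(cards) - 1)
print(cards)                                # [17, 17, 205, 312, 488, 651, 790]

Notice that all of the rearranging happens inside the one list; no second
array is ever created, which is the "in place" property discussed at the end
of this section.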
Time Analysis: It should be evident that we can partition an
n-element array in O(n) time. So now the question becomes how many times
do we have to partition? The answer is "it depends": it depends on how
lucky we are in choosing our pivot.
Suppose we choose our pivot such that each time, we split our array
directly in half. We will then partition each of these halves. We
begin the time analysis below, again with T(n) denoting the total
amount of time it takes to quicksort an n-element array:
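Each call partitions its portion in linear time, say cn steps for an
n-element portion, and then makes two half-size recursive calls:
T(1) = c
T(n) = 2T(n/2) + cn
     = 4T(n/4) + 2cn
     = 8T(n/8) + 3cn
     = ...
     = nT(1) + cn log n      (after log n levels of halving)
     = O(n log n)
This is the same O(n log n) total we got for merge sort; the linear work is
simply done before the recursive calls (partitioning) instead of after
(merging).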
Suppose we are unlucky, though: suppose we break our array so that n-1
elements land in one part and 0 elements land in the other (the pivot takes
the one remaining spot). Then, our time becomes:
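Each partition still costs linear time, but now one of the two recursive
calls is on an empty part, so the problem only shrinks by one element per
call:
T(1) = c
T(n) = T(n-1) + cn
     = T(n-2) + c(n-1) + cn
     = ...
     = c(1 + 2 + ... + n)
     = c n(n+1)/2
     = O(n^2)
In this unlucky case quick sort is no faster than bubble sort or insertion
sort.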
That is why we have to choose our pivot carefully: suppose we always
just choose the right element of our array as the pivot. Well, if our
array is already sorted, when we partition, we partition into two
blocks: one with n-1 elements, and one with 0 elements (the pivot is
already at the correct location). Ideally, what we want is to choose
the value that is exactly the median of the array. Unfortunately, it
is very hard to figure out what the median is when the array is
unsorted. Instead, the recommended technique is to choose a "median
of three": look at three values in our array, and choose the median of
those three as our pivot. In that case (assuming the values are distinct),
we are guaranteed that neither part of the partition is empty: at least one
of the sampled values is smaller than the pivot, and another is larger.
This does not guarantee we get a perfect split, but it does avoid the bad
behavior on an array that is already sorted.
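As a rough sketch (the function name is mine, and it assumes the same
array-and-indices conventions as the Python version above), median-of-three
pivot selection can be written as:

def choose_pivot_median_of_three(a, left, right):
    # Return whichever of the three sampled indices holds the median value.
    mid = (left + right) // 2
    # Sort the three (value, index) pairs by value and take the middle one's index.
    candidates = sorted([(a[left], left), (a[mid], mid), (a[right], right)])
    return candidates[1][1]

Swapping this in for choose_pivot means an already sorted array splits
perfectly every time, since the middle element of a sorted range is its true
median.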
What is the running time of quick sort "on average"? A randomly chosen pivot
lands somewhere in the middle half of the values about half the time, so
most splits are reasonably balanced, and the average time works out to
O(n log n).
So, which sorting technique is the best? Well, it depends on what you
are looking for. Some people only care about how long it takes to run the
algorithm, others care about how long it takes to implement, and others
care about completely different things, like:
Stable sorting: Stability has to do with how sorting algorithms deal
with duplicate values in our array. A stable algorithm is one which keeps
such elements in the same relative order: if one of two equal elements was
to the left of the other before sorting, it should still be to the left
when we are done (a short example of what this means appears after these
definitions). Of the four algorithms we examined, bubble, insertion, and
merge are stable, while quick sort is not.
In place: An algorithm is "in place" if it requires no more
memory than the initial array (not counting a temporary holder for
swapping values, and the extra memory needed in recursive calls). The
in place algorithms we examined are bubble, insertion, and quick sort,
while merge sort is not. An algorithm being in place was a big deal
long ago when memory was expensive. It is still fairly important
nowadays, but only if you are sorting very large data sets.
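As a small illustration of what stability buys you (the card data below is
made up), suppose we sort records by one field and two of them tie on it:

cards = [("Ruth", "NYY"), ("Aaron", "ATL"), ("Gehrig", "NYY"), ("Mays", "SF")]
# Sort by team. Python's built-in sort is stable, so "Ruth" stays ahead of
# "Gehrig": both are filed under "NYY", and Ruth came first in the input.
by_team = sorted(cards, key=lambda card: card[1])
print(by_team)
# [('Aaron', 'ATL'), ('Ruth', 'NYY'), ('Gehrig', 'NYY'), ('Mays', 'SF')]
# An unstable sort, such as the quick sort above, would be free to emit
# Gehrig before Ruth, which matters whenever the original order carried
# meaning of its own.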
I summarize the advantages and disadvantages of each algorithm in the
table below:
Algorithm  | Best case                       | Worst case             | Stable | In place
-----------+---------------------------------+------------------------+--------+---------
Bubble     | Sorted: O(n^2)                  | Reverse sorted: O(n^2) | Yes    | Yes
Insertion  | Sorted: O(n)                    | Reverse sorted: O(n^2) | Yes    | Yes
Merge      | None: O(n log n)                | None: O(n log n)       | Yes    | No
Quick      | Median partitioning: O(n log n) | Sorted: O(n^2)         | No     | Yes
I will leave it to you to think of situations where any single sorting
algorithm is more desirable than any other.