Further Notes on Sorting

Quicksort: another divide and conquer algorithm

divide A into L and R:
- select a pivot element
- put all elements less than or equal to the pivot into L
- put all elements greater than the pivot into R
conquer:
- quicksort L
- quicksort R
combine: A is now L + pivot + R

Unlike merge sort: divide is non-trivial; combine is easy.

top-down example:

A : [ 275, 185, 331,  45, 494, 242, 228, 450]
       ^
       |
     pivot


A : [ 228, 185,  45, 242, 275, 494, 450, 331]
      \________________/   ^   \___________/
              |            |         |
              L          pivot       R

                   .
                   .
                   .

A : [  45, 185, 228, 242, 275, 331, 450, 494]
      \________________/       \___________/

pseudocode for quickSort

quickSort array A of size n
  // divide: partition A around value at A[0];
  // partition gives back the final index of that value
  pivot = partition(A, n)

  // conquer:
  if (pivot > 1) 
    // sort `L' if it has more than 1 element
    quickSort(A, pivot)

  if (pivot + 2 < n) 
    // sort `R' if it has more than 1 element
    quickSort(A + (pivot + 1), n - (pivot+1))
  //           \---------/
  //         pointer arithmetic

  // combine: happens automatically in place

A more detailed example

[  62, 195, 375,  57, 313,   9, 388, 490]
    ^
  pivot

[  57,   9]  62 [313, 375, 388, 490, 195]
    ^             ^
  pivot         pivot


[   9]  57   62 [195] 313 [490, 388, 375]
                            ^
                          pivot

    9   57  62  195  313  [375, 388]  490
                            ^
                          pivot

[   9,  57,  62, 195, 313, 375, 388, 490]

Partitioning

most important (and difficult) step: choosing the pivot
ideally, we want the pivot to be the median of the array
finding the median efficiently is possible, but complicated
to simplify matters, we partition around the value of the first element in the array

`partition` pseudocode

partition array A of size n around value at A[0]
  low=1, high=n
  while(low < high) {
    if (A[low] > A[0]) {
      high--
      swap(A[low],A[high])
    }
    else
      low++
  }
  low--
  if (low > 0)
    swap(A[0],A[low])
  pivot = low
  return pivot
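
The two pieces of pseudocode can be put together as a minimal C sketch. The names `swap_int`, `partition`, and `quick_sort` are illustrative choices, not part of the notes, and the sketch assumes an array of `int`.

```c
#include <assert.h>
#include <stddef.h>

/* Swap two ints through pointers. */
static void swap_int(int *x, int *y) {
    int tmp = *x;
    *x = *y;
    *y = tmp;
}

/* Partition a[0..n-1] around the value at a[0] and return that value's
   final index p. Afterwards a[0..p-1] <= a[p] < a[p+1..n-1]. */
static size_t partition(int *a, size_t n) {
    assert(n > 0);  /* assumption: never called on an empty array */
    size_t low = 1, high = n;
    while (low < high) {
        if (a[low] > a[0]) {
            high--;
            swap_int(&a[low], &a[high]);
        } else {
            low++;
        }
    }
    low--;
    if (low > 0)
        swap_int(&a[0], &a[low]);
    return low;
}

/* Sort a[0..n-1] in place. */
void quick_sort(int *a, size_t n) {
    if (n < 2)
        return;
    size_t pivot = partition(a, n);
    if (pivot > 1)                    /* L has more than 1 element */
        quick_sort(a, pivot);
    if (n - (pivot + 1) > 1)          /* R has more than 1 element */
        quick_sort(a + (pivot + 1), n - (pivot + 1));
}
```

Sorting R is a call on `a + (pivot + 1)`, the same pointer arithmetic as in the pseudocode, so no copying is needed.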

A detailed example of `partition`

[  62, 195, 375,  57, 313,   9, 388, 490]
    ^   ^                                 ^
  pivot |                                high
       low


[  62, 490, 375,  57, 313,   9, 388, 195]
        ^                             ^
       low                           high


[  62, 388, 375,  57, 313,   9, 490, 195]
        ^                        ^
       low                      high


[  62,   9, 375,  57, 313, 388, 490, 195]
         ^                  ^
        low                high


[  62,   9, 375,  57, 313, 388, 490, 195]
             ^              ^
            low            high


[  62,   9, 313,  57, 375, 388, 490, 195]
             ^         ^
            low       high


[  62,   9,  57, 313, 375, 388, 490, 195]
             ^    ^
            low  high

[  62,   9,  57, 313, 375, 388, 490, 195]
                  ^
                 high
                 low


[  57,   9,  62, 313, 375, 388, 490, 195]
             ^
           pivot

Quicksort Complexity

complexity of `partition`

low, high start at opposite ends of array
each iteration either increments low or decrements high
so the loop runs n-1 times, each iteration doing constant work
partition is O(n)

complexity of `quickSort`

In the worst case, the pivot value is extreme (either the largest or smallest element in the subarray), so one of the subdivisions has length 0 and the other has length n-1.

If that happens at every level, we do O(n) partition operations, each costing O(n), so the worst-case running time of quicksort is O(n^2).

When does this happen? With the first element as pivot, an array that is already sorted (or nearly sorted) produces badly lopsided splits like this.
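
The worst-case bound can be written as a recurrence. This is a sketch under two assumptions that are not in the notes: partitioning n elements costs at most c*n for some constant c, and T(1) = c.

```latex
% Assumption: partitioning n elements costs at most c n, and T(1) = c.
\begin{align*}
T(n) &= T(n-1) + cn \\
     &= T(n-2) + c(n-1) + cn \\
     &\;\;\vdots \\
     &= c\,(1 + 2 + \cdots + n) \\
     &= \frac{c\,n(n+1)}{2} = O(n^2)
\end{align*}
```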

What is the best case? Each partition splits the array evenly. Then our analysis becomes like merge sort: We can view quicksort as a binary tree of partition operations:

The root is the entire array
The left and right children are the result of the partition
Depth k of the tree represents 2^k partition operations on arrays of size about n/2^k, so there is O(n) work done at every depth
So in general, quicksort does O(hn) work where h is the height of the tree
If the partitions break the array roughly in half then the height is O(lgn), so quicksort is O(n*lgn) in the best case
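
The same tree argument can be written as a recurrence. This is a sketch assuming that n is a power of two, that partitioning n elements costs c*n for some constant c, and that T(1) = c. None of these assumptions come from the notes.

```latex
% Assumptions: n is a power of two, partitioning costs c n, T(1) = c.
\begin{align*}
T(n) &= 2\,T(n/2) + cn \\
     &= 4\,T(n/4) + 2cn \\
     &= 2^k\,T(n/2^k) + k\,cn \\
     &= n\,T(1) + cn\lg n \qquad (k = \lg n) \\
     &= O(n \lg n)
\end{align*}
```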

It turns out, for random input, quicksort is O(n*lgn) in average case as well.

We can make quicksort randomized by choosing the pivot at random. Then no particular input (such as nearly sorted data) is consistently bad: the expected running time is O(nlgn) on every input. (The worst case is still O(n^2), but it now depends on unlucky random choices rather than on the order of the input.)
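
One simple way to randomize is to swap a randomly chosen element into position 0 and then partition around A[0] as before. The sketch below takes that approach. The names `random_partition` and `random_quick_sort` are hypothetical, and `rand() % n` is a convenient but slightly biased way to pick an index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Swap two ints through pointers (safe when x == y). */
static void swap_int(int *x, int *y) {
    int tmp = *x;
    *x = *y;
    *y = tmp;
}

/* Move a randomly chosen element into a[0], then partition around it
   with the first-element scheme. Returns the pivot's final index. */
static size_t random_partition(int *a, size_t n) {
    assert(n > 0);
    size_t r = (size_t)rand() % n;   /* simple, slightly biased choice */
    swap_int(&a[0], &a[r]);

    size_t low = 1, high = n;
    while (low < high) {
        if (a[low] > a[0]) {
            high--;
            swap_int(&a[low], &a[high]);
        } else {
            low++;
        }
    }
    low--;
    swap_int(&a[0], &a[low]);        /* harmless self-swap when low == 0 */
    return low;
}

/* Sort a[0..n-1] in place using random pivots. */
void random_quick_sort(int *a, size_t n) {
    if (n < 2)
        return;
    size_t pivot = random_partition(a, n);
    random_quick_sort(a, pivot);                      /* L */
    random_quick_sort(a + pivot + 1, n - pivot - 1);  /* R */
}
```

A caller would normally seed the generator once, for example with `srand`, before sorting. Without a seed the pivot choices are the same on every run.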

(It's possible to adapt quicksort to be O(nlgn) worst-case, by using an O(n) algorithm to find the median of the array (the ideal pivot). This turns out to be slower in practice.)

The upshot: in practice, quicksort is really fast!

Radix Sorting

Consider sorting a deck of cards with an order like this:

all clubs < all diamonds < all hearts < all spades

               and

Ace < 2 < 3 < ... < 10 < Jack < Queen < King

club ace is the lowest
spade king is highest
club 10 < diamond 8 < diamond 10 < heart 3 < spade 2

(This defines a total order.)

We can sort these with any of our comparison-based sorts. Insertion sort mimics how we might do this by drawing one card at a time into a hand.

But we might sort a different way:

Form thirteen piles, one for each card rank: A, 2, .., 10, J, Q, K
For each card, place into appropriate pile.
Pick up piles in order: aces on top of twos on top of threes ... on top of queens on top of kings.
Now make four piles, one for each suit.
For each card, place into appropriate pile.
Pick up piles in order: clubs on top of diamonds on top of hearts on top of spades.

The collection of cards is now sorted (and we didn't use any comparisons).

This procedure can be generalized to what is known as a radix sort.

The idea: consider the keys to be sorted as having k components, each of which can be thought of as its own key.
In cards, we think of the suit as one component (the most significant key) and the rank as being another (the least significant key).

For any integer, we might think of each of its digits as being a key:

32768
^   ^
|   |___________________ least significant digit
|
most significant digit

Alphabetical strings of the same length can also be thought of this way:

"foobar"
 ^    ^
 |    |___________________ least significant letter
 |
 most significant letter

Radix sort proceeds as follows:

Sort input based on least significant key (herein called first key): place each item in the "pile" that corresponds to its value at that key.
Make as many piles as there are possible values for that key. (For digits: 10 piles; for letters: 26 piles; for card rank: 13 piles; for card suit: 4 piles.)
Keep the piles in order of arrival. (If our first card is heart 3, and then later we see diamond 3, we put the diamond 3 after the heart 3 in the three pile.)
Put the piles together with the smallest-keyed pile first, followed by the second-smallest, etc.
Repeat with the next key, and so on up to the most significant key. Because piles keep arrival order, ties on the current key stay in the order set by the earlier, less significant keys.

A simple example with decimal digits

313, 446, 043, 412, 273, 981

pile 0: 
pile 1: 981
pile 2: 412
pile 3: 313, 043, 273
pile 4:
pile 5:
pile 6: 446
pile 7:
pile 8:
pile 9:

981, 412, 313, 043, 273, 446

pile 0: 
pile 1: 412, 313
pile 2: 
pile 3:
pile 4: 043, 446
pile 5:
pile 6: 
pile 7: 273
pile 8: 981
pile 9:

412, 313, 043, 446, 273, 981

pile 0: 043
pile 1: 
pile 2: 273
pile 3: 313
pile 4: 412, 446
pile 5:
pile 6: 
pile 7: 
pile 8: 
pile 9: 981

043, 273, 313, 412, 446, 981

radix is another name for base (for decimal digits, radix is 10, for letters, radix is 26)

consider decimal radix sort:

given a list of non-negative integers (or items that can be mapped to non-negative integers by an order-preserving function)

find the largest integer M. figure out how many keys there are, i.e. how many digits M has: floor(log base 10 of M) + 1 for M > 0. let this number be k.

create 10 "piles" (lists)

for(j=0; j < k; j++) {
  for(i=0; i < n; i++) {
    // 10^j means 10 to the power j; / is integer division
    determine the jth digit of A[i] ((A[i]/(10^j))%10), call it d
    append A[i] onto pile d
  }
  for(d=0; d < 10; d++)
    place pile d in A (after pile d-1, before pile d+1), then empty pile d
}
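
The pseudocode above can be sketched in C as follows. This is one possible realization, not the only one: instead of ten separate lists, the piles are stored as ten consecutive regions of a single scratch array, whose sizes are counted first. The names `digit_count` and `radix_sort_decimal` are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Number of decimal digits in m (1 when m == 0). */
static unsigned digit_count(unsigned m) {
    unsigned k = 1;
    while (m >= 10) {
        m /= 10;
        k++;
    }
    return k;
}

/* Least-significant-digit radix sort of a[0..n-1].
   Returns 0 on success, -1 if the scratch array cannot be allocated. */
int radix_sort_decimal(unsigned *a, size_t n) {
    if (n < 2)
        return 0;
    assert(a != NULL);
    unsigned *out = malloc(n * sizeof *out);
    if (out == NULL)
        return -1;

    unsigned max = a[0];
    for (size_t i = 1; i < n; i++)
        if (a[i] > max)
            max = a[i];
    unsigned k = digit_count(max);

    unsigned div = 1;                       /* 10 to the power j */
    for (unsigned j = 0; j < k; j++) {
        size_t start[10] = {0};
        for (size_t i = 0; i < n; i++)      /* count the size of each pile */
            start[(a[i] / div) % 10]++;
        size_t offset = 0;                  /* turn sizes into start offsets */
        for (int d = 0; d < 10; d++) {
            size_t size = start[d];
            start[d] = offset;
            offset += size;
        }
        for (size_t i = 0; i < n; i++)      /* append in order of arrival */
            out[start[(a[i] / div) % 10]++] = a[i];
        memcpy(a, out, n * sizeof *a);      /* piles 0..9 back into a */
        if (j + 1 < k)
            div *= 10;                      /* skip the last, unused multiply */
    }
    free(out);
    return 0;
}
```

Each pass appends items to their pile in the order they are read, which is what keeps the earlier passes' ordering intact.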

binary radix sort

In practice, we can do this faster if we think of integers as binary numbers and we consider radix to be two.

For example:

7, 2, 5, 3

in binary: 111, 010, 101, 011

pile 0: 010
pile 1: 111, 101, 011

010, 111, 101, 011  (2, 7, 5, 3)

pile 0: 101
pile 1: 010, 111, 011 

101, 010, 111, 011  (5, 2, 7, 3)

pile 0: 010, 011
pile 1: 101, 111

010, 011, 101, 111  (2, 3, 5, 7)

pseudocode is very similar (k is now the number of bits in M), but we have more efficient bit operations:

for(j=0; j < k; j++) {
  for(i=0; i < n; i++) {
    if (A[i] & (1 << j))  // check if jth bit is set
      append A[i] onto pile 1
    else 
      append A[i] onto pile 0
  }
  place pile 0 at front of A, pile 1 after it in A, then empty both piles
}
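
A minimal C sketch of the binary version, assuming `unsigned` keys. The name `radix_sort_binary` is illustrative. As a space-saving choice that is not in the pseudocode, pile 0 is built directly at the front of the array and only pile 1 uses scratch space.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Binary least-significant-bit radix sort of a[0..n-1].
   Returns 0 on success, -1 if the scratch array cannot be allocated. */
int radix_sort_binary(unsigned *a, size_t n) {
    if (n < 2)
        return 0;
    assert(a != NULL);
    unsigned *pile1 = malloc(n * sizeof *pile1);
    if (pile1 == NULL)
        return -1;

    unsigned max = a[0];
    for (size_t i = 1; i < n; i++)
        if (a[i] > max)
            max = a[i];

    const unsigned width = (unsigned)(sizeof(unsigned) * CHAR_BIT);
    /* Stop once no remaining bit of the maximum is set. */
    for (unsigned j = 0; j < width && (max >> j) != 0; j++) {
        size_t n0 = 0, n1 = 0;
        for (size_t i = 0; i < n; i++) {
            if (a[i] & (1u << j))           /* check if jth bit is set */
                pile1[n1++] = a[i];
            else
                a[n0++] = a[i];             /* n0 <= i, so nothing unread is lost */
        }
        memcpy(a + n0, pile1, n1 * sizeof *a);  /* pile 1 after pile 0 */
    }
    free(pile1);
    return 0;
}
```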

Complexity of binary radix sort

O(lg(M) * n) where M is the maximum value
If M is on the order of n, then this is an O(nlgn) sort. If M is bounded by a constant no matter how large n gets (so there must be many duplicates), then this is an O(n) sort.
What if M is really big?

Mixing and matching

We can often mix and match sorts. E.g., start with a "faster" sort, then switch to insertion sort on small subarrays.
We can sometimes use one pass of radix sort on the most significant key, then finish with a simple sort like insertion sort. This might work well on strings.
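
As an illustration of the second idea, here is a minimal C sketch that makes one radix pass on the first letter and then runs insertion sort inside each pile. It assumes every string is non-empty and starts with a lowercase letter 'a' to 'z' in an ASCII-like encoding. The names `insertion_sort_strings` and `sort_words` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Insertion sort on an array of C strings, compared with strcmp. */
static void insertion_sort_strings(const char **s, size_t n) {
    for (size_t i = 1; i < n; i++) {
        const char *key = s[i];
        size_t j = i;
        while (j > 0 && strcmp(s[j - 1], key) > 0) {
            s[j] = s[j - 1];
            j--;
        }
        s[j] = key;
    }
}

/* One radix pass on the most significant letter, then insertion sort
   within each of the 26 piles. Returns 0 on success, -1 on allocation
   failure. Assumption: each string starts with 'a'..'z'. */
int sort_words(const char **s, size_t n) {
    if (n < 2)
        return 0;
    const char **out = malloc(n * sizeof *out);
    if (out == NULL)
        return -1;

    size_t start[27] = {0};             /* start[c + 1] counts pile c */
    for (size_t i = 0; i < n; i++) {
        int c = s[i][0] - 'a';
        assert(c >= 0 && c < 26);
        start[c + 1]++;
    }
    for (int c = 0; c < 26; c++)        /* start[c] = first index of pile c */
        start[c + 1] += start[c];

    size_t next[26];
    memcpy(next, start, sizeof next);
    for (size_t i = 0; i < n; i++)      /* deal each word into its pile */
        out[next[s[i][0] - 'a']++] = s[i];
    memcpy(s, out, n * sizeof *s);
    free(out);

    for (int c = 0; c < 26; c++)        /* finish each pile separately */
        insertion_sort_strings(s + start[c], start[c + 1] - start[c]);
    return 0;
}
```

Insertion sort is quadratic in the pile size, so this only pays off when the first letter spreads the words over many small piles.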