Further Notes on Sorting

Quicksort: another divide and conquer algorithm

Unlike merge sort: divide is non-trivial; combine is easy.

top-down example:

A : [ 275, 185, 331,  45, 494, 242, 228, 450]
       ^
       |
     pivot


A : [ 228, 185,  45, 242, 275, 494, 450, 331]
      \________________/   ^   \___________/
              |            |         |
              L          pivot       R

                   .
                   .
                   .

A : [  45, 185, 228, 242, 275, 331, 450, 494]
      \________________/       \___________/

pseudocode for quickSort

quickSort array A of size n
  // divide: partition A around value at A[0];
  // pivot is the index where that value ends up
  pivot = partition(A, n)

  // conquer:
  if (pivot > 1) 
    // sort `L' if it has more than 1 element
    quickSort(A, pivot)

  if (pivot + 2 < n)
    // sort `R' if it has more than 1 element
    quickSort(A + (pivot + 1), n - (pivot+1))
  //           \---------/
  //         pointer arithmetic

  // combine: happens automatically in place

A more detailed example

[  62, 195, 375,  57, 313,   9, 388, 490]
    ^
  pivot

[  57,   9]  62 [313, 375, 388, 490, 195]
    ^             ^
  pivot         pivot


[   9]  57   62 [195] 313 [490, 388, 375]
                            ^
                          pivot

    9   57  62  195  313  [375, 388]  490
                            ^
                          pivot

[   9,  57,  62, 195, 313, 375, 388, 490]

Partitioning

partition pseudocode

partition array A of size n around value at A[0]
  low=1, high=n
  while(low < high) {
    if (A[low] > A[0]) {
      high--
      swap(A[low],A[high])
    }
    else
      low++
  }
  low--
  if (low > 0)
    swap(A[0],A[low])
  pivot = low   // final index of the pivot value (the result of partition)
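
A minimal C sketch of the two pseudocode routines above, assuming int elements. The helper swapInts and the n < 2 guard in quickSort are additions for illustration and are not part of the pseudocode; partition hands back the pivot's final index as its return value.

```c
/* Swap two ints through pointers (helper added for this sketch). */
static void swapInts(int *a, int *b) {
    int t = *a;
    *a = *b;
    *b = t;
}

/* Partition A[0..n-1] around the value at A[0].
   Returns the index where that value ends up. */
int partition(int *A, int n) {
    int low = 1, high = n;
    while (low < high) {
        if (A[low] > A[0]) {
            /* A[low] belongs in R: move it to the back */
            high--;
            swapInts(&A[low], &A[high]);
        } else {
            /* A[low] belongs in L: leave it and move on */
            low++;
        }
    }
    low--;                       /* last position of L */
    if (low > 0)
        swapInts(&A[0], &A[low]);
    return low;
}

void quickSort(int *A, int n) {
    if (n < 2)                   /* guard added for this sketch */
        return;

    /* divide */
    int pivot = partition(A, n);

    /* conquer: sort L if it has more than 1 element */
    if (pivot > 1)
        quickSort(A, pivot);

    /* sort R if it has more than 1 element (pointer arithmetic) */
    if (n - (pivot + 1) > 1)
        quickSort(A + (pivot + 1), n - (pivot + 1));

    /* combine: nothing to do, the sort is in place */
}
```

For the array [62, 195, 375, 57, 313, 9, 388, 490], partition(A, 8) returns 2 and leaves 62 at index 2.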

A detailed example of partition

[  62, 195, 375,  57, 313,   9, 388, 490]
    ^   ^                                 ^
  pivot |                                high
       low


[  62, 490, 375,  57, 313,   9, 388, 195]
        ^                             ^
       low                           high


[  62, 388, 375,  57, 313,   9, 490, 195]
        ^                        ^
       low                      high


[  62,   9, 375,  57, 313, 388, 490, 195]
         ^                  ^
        low                high


[  62,   9, 375,  57, 313, 388, 490, 195]
             ^              ^
            low            high


[  62,   9, 313,  57, 375, 388, 490, 195]
             ^         ^
            low       high


[  62,   9,  57, 313, 375, 388, 490, 195]
             ^    ^
            low  high

[  62,   9,  57, 313, 375, 388, 490, 195]
                  ^
                 high
                 low


[  57,   9,  62, 313, 375, 388, 490, 195]
             ^
           pivot

Quicksort Complexity

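complexity of partition

Each pass through the while loop either increments low or decrements high, so the loop runs exactly n-1 times and partition is O(n).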

complexity of quickSort

In the worst case, the pivot value is extreme (either the largest or the smallest element in the subarray), so one of the subdivisions has length 0 and the other has length n-1.

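In that case we need O(n) partition operations, on subarrays of size n, n-1, ..., 2. Each one costs time linear in the size of its subarray, for (n-1) + (n-2) + ... + 1 = n(n-1)/2 steps in total, so the worst-case running time of quicksort is O(n^2).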

When does this happen? If the array is already sorted in increasing order.

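What is the best case? Each partition splits the array evenly. Then the analysis is like that of merge sort: we can view quicksort as a binary tree of partition operations with about lg n levels, where the partitions on any one level do O(n) work in total. The best-case running time is therefore O(n lg n).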

It turns out that, for random input, quicksort is O(n lg n) in the average case as well.

We can make quicksort randomized by choosing the pivot at random. Nearly sorted data then causes quicksort to perform no worse than random data. (The average and worst-case complexities remain O(n lg n) and O(n^2) respectively.)
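
One way to randomize the pivot choice, sketched in C: swap a randomly chosen element into A[0] so that a partition routine that pivots on A[0] can be reused unchanged. The function name chooseRandomPivot and the use of rand() are assumptions made for this sketch.

```c
#include <stdlib.h>

/* Move a randomly chosen element of A[0..n-1] into A[0].
   rand() % n has a slight modulo bias, which is acceptable
   for an illustration. */
void chooseRandomPivot(int *A, int n) {
    if (n < 2)
        return;              /* nothing to choose from */
    int r = rand() % n;      /* random index in 0..n-1 */
    int t = A[0];
    A[0] = A[r];
    A[r] = t;
}
```

Calling this on each subarray just before partitioning it leaves the rest of quicksort as it was. Seeding with srand() once at program start gives a different sequence on each run.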

(It's possible to adapt quicksort to be O(nlgn) worst-case, by using an O(n) algorithm to find the median of the array (the ideal pivot). This turns out to be slower in practice.)

The upshot: in practice, quicksort is really fast!

Radix Sorting

Consider sorting a deck of cards with an order like this:

all clubs < all diamonds < all hearts < all spades

               and

Ace < 2 < 3 < ... < 10 < Jack < Queen < King

(This defines a total order.)

We can sort these with any of our comparison-based sorts. Insertion sort mimics how we might do this drawing one card at a time into a hand.

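But we might sort a different way:

  • deal the cards into 13 piles by rank (Ace, 2, ..., King), adding each card to the back of its pile
  • stack the piles in rank order, Ace pile first
  • deal that stack into 4 piles by suit, again adding each card to the back of its pile
  • stack the piles in suit order: clubs, diamonds, hearts, spades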

The collection of cards is now sorted (and we didn't use any comparisons).

This procedure can be generalized to what is known as a radix sort.

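Radix sort proceeds as follows:

  • treat each key as a sequence of digits
  • starting with the least significant digit, deal the items into one pile per digit value, keeping each pile in arrival order
  • gather the piles in order of digit value, then repeat with the next digit
  • once the most significant digit has been processed, the items are sorted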

A simple example with decimal digits (piles by ones digit, then tens, then hundreds)

313, 446, 043, 412, 273, 981

pile 0: 
pile 1: 981
pile 2: 412
pile 3: 313, 043, 273
pile 4:
pile 5:
pile 6: 446
pile 7:
pile 8:
pile 9:

981, 412, 313, 043, 273, 446

pile 0: 
pile 1: 412, 313
pile 2: 
pile 3:
pile 4: 043, 446
pile 5:
pile 6: 
pile 7: 273
pile 8: 981
pile 9:

412, 313, 043, 446, 273, 981

pile 0: 043
pile 1: 
pile 2: 273
pile 3: 313
pile 4: 412, 446
pile 5:
pile 6: 
pile 7: 
pile 8: 
pile 9: 981

043, 273, 313, 412, 446, 981
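
The three passes traced above can be reproduced with a short C sketch for non-negative ints. Instead of ten separate lists, this version counts how many items fall in each pile and copies them into one scratch buffer in pile order, which is the same stable distribution. The name radixSortDecimal and the 0 / -1 return codes are assumptions made for this sketch.

```c
#include <stdlib.h>
#include <string.h>

/* Least-significant-digit radix sort, base 10.
   Assumes every A[i] is non-negative.
   Returns 0 on success, -1 if the scratch buffer cannot be allocated. */
int radixSortDecimal(int *A, int n) {
    if (n < 2)
        return 0;

    int max = A[0];
    for (int i = 1; i < n; i++)
        if (A[i] > max)
            max = A[i];

    int *tmp = malloc((size_t)n * sizeof *tmp);
    if (tmp == NULL)
        return -1;

    /* place = 1, 10, 100, ... : one pass per decimal digit of max */
    for (long long place = 1; max / place > 0; place *= 10) {
        int count[10] = {0};
        for (int i = 0; i < n; i++)
            count[(int)((A[i] / place) % 10)]++;

        /* start[d] = number of items in piles 0..d-1 */
        int start[10];
        start[0] = 0;
        for (int d = 1; d < 10; d++)
            start[d] = start[d - 1] + count[d - 1];

        /* append each item to its pile, preserving arrival order */
        for (int i = 0; i < n; i++) {
            int d = (int)((A[i] / place) % 10);
            tmp[start[d]++] = A[i];
        }
        memcpy(A, tmp, (size_t)n * sizeof *A);
    }

    free(tmp);
    return 0;
}
```

When trying this on the example, write 043 as 43: a leading zero makes a C integer literal octal.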

radix is another name for base (for decimal digits, radix is 10, for letters, radix is 26)

consider decimal radix sort:

  • given a list of non-negative integers (or items that can be monotonically hashed to non-negative integers)
  • find the largest integer M and count its decimal digits: k = floor(log base 10 of M) + 1 (take k = 1 when M is 0). k is the number of passes.
  • create 10 "piles" (lists)
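  • for(j=0; j < k; j++) {
      for(i=0; i < n; i++) {
        determine the jth digit of A[i], counting from the right starting at 0:
        (A[i]/(10^j))%10, where 10^j is a power and / is integer division; call it d
        append A[i] onto pile d
      }
      for(d=0; d < 10; d++)
        place pile d in A (after pile d-1, before pile d+1), then empty pile d
    }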
    
  • binary radix sort

    In practice, we can do this faster if we treat the integers as binary numbers and use a radix of two.

    For example:
    7, 2, 5, 3
    
    in binary: 111, 010, 101, 011
    
    pile 0: 010
    pile 1: 111, 101, 011
    
    010, 111, 101, 011  (2, 7, 5, 3)
    
    pile 0: 101
    pile 1: 010, 111, 011 
    
    101, 010, 111, 011  (5, 2, 7, 3)
    
    pile 0: 010, 011
    pile 1: 101, 111
    
    010, 011, 101, 111  (2, 3, 5, 7)
    
    pseudocode is very similar (k is now the number of bits in M), but we have more efficient bit operations:
    for(j=0; j < k; j++) {
      for(i=0; i < n; i++) {
        if (A[i] & (1 << j))  // check if jth bit is set
          append A[i] onto pile 1
        else 
          append A[i] onto pile 0
      }
      place pile 0 at front of A, pile 1 after it in A
    }
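
A minimal C sketch of the binary pseudocode above, assuming unsigned int keys. The two piles are plain arrays of size n; the name radixSortBinary and the 0 / -1 return codes are assumptions made for this sketch.

```c
#include <stdlib.h>
#include <string.h>

/* Least-significant-bit radix sort with radix 2.
   Returns 0 on success, -1 if the piles cannot be allocated. */
int radixSortBinary(unsigned *A, int n) {
    if (n < 2)
        return 0;

    unsigned max = A[0];
    for (int i = 1; i < n; i++)
        if (A[i] > max)
            max = A[i];

    /* k = number of bits needed to write max */
    int k = 0;
    for (unsigned m = max; m != 0; m >>= 1)
        k++;

    unsigned *pile0 = malloc((size_t)n * sizeof *pile0);
    unsigned *pile1 = malloc((size_t)n * sizeof *pile1);
    if (pile0 == NULL || pile1 == NULL) {
        free(pile0);
        free(pile1);
        return -1;
    }

    for (int j = 0; j < k; j++) {
        int n0 = 0, n1 = 0;
        for (int i = 0; i < n; i++) {
            if (A[i] & (1u << j))          /* check if jth bit is set */
                pile1[n1++] = A[i];
            else
                pile0[n0++] = A[i];
        }
        /* pile 0 goes at the front of A, pile 1 after it */
        memcpy(A, pile0, (size_t)n0 * sizeof *A);
        memcpy(A + n0, pile1, (size_t)n1 * sizeof *A);
    }

    free(pile0);
    free(pile1);
    return 0;
}
```

Computing k first keeps the shift count j below the width of unsigned, which a loop condition such as (max >> j) != 0 would not guarantee.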
    

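    Complexity of binary radix sort

    Each of the k passes looks at every element once, so the running time is O(k*n), where k is the number of bits in the largest key. For fixed-width integers k is a constant (e.g. 32), so this is O(n).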

    Mixing and matching