Lecture 29:  Sorting

Announcements:

·       practice exams for the final exam are posted – try them

·       homework #10 due midnight Thursday

·       p5 due midnight Thursday 

(small differences in similarity measures are due to choice of delimiters – that's ok)

·       final exam:  Monday May 9, 168 Noland, 10:05am

 

 

You try: 

·       each student or pair of students take 10 playing cards

·       place cards face down

·       invent a sorting algorithm that looks at two cards at a time

o      consider using the data structures we've covered

o      can use the fact that values are in [1,13]

·       after 10-15 min., report your sorting algorithm to class

 

After describing 3-4 algorithms: 

time the sorting algorithms on 13 random larger cards

Some possible results:

Algorithm #1:

1st step: 

hash each card to position in an aux. array of size 13

2nd step: 

for each position in the aux. array, put all values into a sorted array

        time:  O(N)

        space:  O(largest possible value)
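The two steps of Algorithm #1 can be sketched in Java as below. This is only an illustration: the class name CountingSortSketch, the method name, and the maxValue parameter are made up for this example, and the "hash" is assumed to be simply the card's value used as an array index.

```java
class CountingSortSketch {
    // Sorts values known to lie in [1, maxValue] without comparing them to each other.
    static int[] sort(int[] cards, int maxValue) {
        int[] counts = new int[maxValue + 1];      // aux. array; index = card value
        for (int card : cards) {
            counts[card]++;                        // 1st step: send each card to its slot
        }
        int[] sorted = new int[cards.length];
        int next = 0;
        for (int value = 1; value <= maxValue; value++) {   // 2nd step: read the slots in order
            for (int c = 0; c < counts[value]; c++) {
                sorted[next++] = value;
            }
        }
        return sorted;
    }
}
```

For playing cards, call CountingSortSketch.sort(cards, 13). The second step visits every slot, so the time is really proportional to N + (largest possible value), which is O(N) when the range is a fixed 13.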

Algorithm #2:

        1st step:  put each card into a BST

        2nd step: 

do an in-order traversal, putting values into a sorted array

        time:  first pass:  O(N log N),  2nd pass:  O(N)

Algorithm #3:   [called "bubble sort"]

        1st step:

compare adjacent values, if out of order, swap

        at the end, the largest value is at the end of the array

        repeat N times

        time:  O(N^2)
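A minimal sketch of bubble sort as described above. The class and method names are hypothetical, and the sketch assumes two small refinements that the description does not mention: N-1 passes are enough, and each pass can stop before the values already in place at the end.

```java
class BubbleSortSketch {
    static void sort(int[] a) {
        int n = a.length;
        for (int pass = 0; pass < n - 1; pass++) {
            // After this pass, the largest remaining value sits at index n-1-pass.
            for (int j = 0; j < n - 1 - pass; j++) {
                if (a[j] > a[j + 1]) {          // adjacent values out of order: swap
                    int tmp = a[j];
                    a[j] = a[j + 1];
                    a[j + 1] = tmp;
                }
            }
        }
    }
}
```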

Algorithm #4:   [called "quicksort"]

        look at middle value

        partition all other values:  smaller values to the left, larger values to the right

 

Time the algorithms, discuss which ones are overall best

        consider arrays that are already sorted, or almost sorted

Sorting

problem:  given an array A of N values,

arrange the values in sorted order

·       most sorting algorithms involve comparing values

·       "obvious" algorithms are O(N^2)

·       "clever" algorithms are O(N log N)

·       can prove that it's not possible for a comparison sort to have worst case time better than O(N log N)

·       if values are in some fixed range, then can avoid comparisons

Interesting questions for comparison sorts:

·       does the algorithm always take worst case time?

·       what happens when array A is already sorted?

5 well-known comparison algorithms:

1.      selection sort – always O(N^2)

2.    insertion sort – worst case O(N^2), but O(N) if already sorted

3.    merge sort – always O(N log N)

4.    quick sort – worst case O(N^2) but on average O(N log N)

5.    heap sort – always O(N log N)

Selection Sort

idea: 

·       find the smallest value in A, put it in A[0]

·       find the 2nd smallest value, put it in A[1]

etc.

approach:

·       use a nested loop

o      outer loop, k from 0 to A.length-1, indicates which position to fill

o      inner loop, j from k+1 to A.length-1 finds which value to put in position k (need value & index)

o      after the inner loop, put the min value in position k

simulate:  6 students hold cards, initially facing in

        write value of k on the board

smallest-so-far is turned face out

        when the value is placed in A[k], hold it up high

You try:  write the code for

public static void SelectionSort (int[] A) {

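public static void SelectionSort (int[] A) {
  int N = A.length, minIndex, min;
  for (int k=0; k<N; k++) {
    // assume A[k] is the smallest value in A[k..N-1] until a smaller one is found
    min = A[k];
    minIndex = k;
    for (int j=k+1; j < N; j++){
      if (A[j] < min){
        min = A[j];
        minIndex = j;
      }
    }
    // swap the smallest value into position k
    A[minIndex] = A[k];
    A[k] = min;
  }
}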

Time for selection sort:

·       inner loop executes a different number of times on each iteration of the outer loop, so count either

(a)                    (N-1) + (N-2) + (N-3) + … + 2 + 1 + 0 = O(N^2)  or

(b)                   N × avg length of inner loop = N × N/2 = O(N^2)

·       Selection sort is always O(N^2)

(even if the array is already sorted)
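One way to check this claim is to count the inner-loop comparisons directly. The sketch below is the same selection sort with a counter added; the class name and the counter are assumptions made for illustration. For any array of length N it performs (N-1) + (N-2) + … + 0 = N(N-1)/2 comparisons, regardless of the initial order.

```java
class SelectionSortCount {
    // Sorts A with selection sort and returns how many times A[j] < min was evaluated.
    static long comparisons(int[] A) {
        int N = A.length;
        long count = 0;
        for (int k = 0; k < N; k++) {
            int min = A[k], minIndex = k;
            for (int j = k + 1; j < N; j++) {
                count++;                         // one comparison per inner-loop iteration
                if (A[j] < min) {
                    min = A[j];
                    minIndex = j;
                }
            }
            A[minIndex] = A[k];
            A[k] = min;
        }
        return count;
    }
}
```

With 6 values, both a sorted and a reverse-sorted array give 6 × 5 / 2 = 15 comparisons.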

Insertion sort

idea:

·       put the first two items in the correct order

·       find the right place for the 3rd item relative to the previous items, working right-to-left, shifting each larger value to the right to make room; insert the 3rd value when the next value to the left is not larger (or the front of the array is reached).

·       repeat for the 4th item, etc.

approach:

·       use a nested loop

·       outer loop index k from 1 to A.length-1 tells which item you're putting in its place relative to the previous ones

·       inner loop index j from k-1 down to the position where the kth value goes, indicates which previous value you're comparing with the kth value

simulate Insertion Sort with cards

·       afterwards, discuss:  what if we use binary search to find the place to insert?

o      Only faster if you don't have to move the item

o      Won’t be faster overall, due to needed shifting

Code for insertion sort is in the on-line notes
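For reference, a minimal sketch that follows the approach above. It is not necessarily the version in the on-line notes, and the class and method names are hypothetical.

```java
class InsertionSortSketch {
    static void sort(int[] A) {
        for (int k = 1; k < A.length; k++) {
            int value = A[k];                  // the item being put in its place
            int j = k - 1;
            // Work right-to-left, shifting each larger value one position to the right.
            while (j >= 0 && A[j] > value) {
                A[j + 1] = A[j];
                j--;
            }
            A[j + 1] = value;                  // A[j] is not larger (or j == -1): insert here
        }
    }
}
```

If the array is already sorted, the while loop body never runs, so only the N-1 outer iterations remain.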

Time for insertion sort:

·       worst case:  inner loop executes 1 + 2 + … + (N-1) times = O(N^2)

·       best case:  array already sorted:  inner loop never executes – O(N)

·       what if array is in reverse sorted order?  - worst case

Merge sort

idea: 

2 sorted arrays (each of length N/2) can be merged into a sorted array of length N in time O(N)

example:

pair of students each sort 4 cards, then merge

to sort an array of length N:

·       divide the array into two halves, sort each half then merge

·       to sort each half, divide it in half, sort each half and merge

·       base case:  length = 1  (already sorted, so return)
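The recursive description above can be sketched as follows. The names are hypothetical, and this version assumes it is acceptable to return a new array and copy each half, rather than merging within the original array.

```java
class MergeSortSketch {
    static int[] sort(int[] a) {
        if (a.length <= 1) {
            return a;                               // base case: already sorted
        }
        int mid = a.length / 2;
        int[] left = sort(java.util.Arrays.copyOfRange(a, 0, mid));
        int[] right = sort(java.util.Arrays.copyOfRange(a, mid, a.length));
        return merge(left, right);
    }

    // Merges two sorted arrays into one sorted array in time proportional to the total length.
    static int[] merge(int[] left, int[] right) {
        int[] out = new int[left.length + right.length];
        int i = 0, j = 0, k = 0;
        while (i < left.length && j < right.length) {
            out[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
        }
        while (i < left.length) out[k++] = left[i++];     // copy what remains of left
        while (j < right.length) out[k++] = right[j++];   // copy what remains of right
        return out;
    }
}
```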

simulate the recursive merge sort:

one student takes 8 cards,

gives first half of the cards to another student

that student gives first half to another student, etc.

when one half comes back sorted, give the other half to a student, etc.

when second half comes back sorted, merge the two halves and return the sorted cards

Time for Merge Sort:

·       base case (1 value to be sorted):  T(1) = 1

·       recursive case:  solve 2 problems each of size N/2

o      2 recursive calls

o      merge the 2 solutions

T(N) = 2 T(N/2) + N

You try:  fill in the table of

        N          T(N)         log_2 N

        1             1               0

        2            4               1

        4           12              2

        8           32              3

guess:  T(N) = N log_2 N + N

verify:  N log N + N = 2 [ (N/2) log(N/2) + N/2 ] + N

                                =  N [(log N) – 1] + N + N = N log N + N

        note:  log(N/2) =  (log N) – (log 2) = (log N) – 1    (logs are base 2)
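The guess can also be checked numerically. The sketch below evaluates the recurrence and the closed form for powers of 2; the class and method names are made up for this illustration, and it assumes N is a power of 2 so that N/2 is always exact.

```java
class MergeSortRecurrence {
    // T(1) = 1, T(N) = 2 T(N/2) + N, for N a power of 2.
    static long t(long n) {
        return (n == 1) ? 1 : 2 * t(n / 2) + n;
    }

    // Closed form N log2 N + N, for N a power of 2.
    static long closedForm(long n) {
        int log2 = Long.numberOfTrailingZeros(n);   // exact log2 when n is a power of 2
        return n * log2 + n;
    }
}
```

For N = 8 both methods give 32, matching the table.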

Quick Sort

·       choose a "pivot" value v from A

·       partition the array

o      values ≤ v on the left

o      values > v on the right

·       recursively Quick sort the left and the right

How to choose the pivot value?  -  "median of 3"

        choose the median of

the leftmost, rightmost, & middle value

Simulate Quick Sort

use 7 cards

        choose the pivot, swap it with value in position 0,

partition in place:

        traverse from the left until find value n > pivot

        traverse from the right until find value m < pivot

        swap n and m; 

continue the traversals until all values are partitioned

        swap the pivot with the last value < pivot
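A sketch of quick sort with median-of-3 pivot selection and in-place partitioning, following the steps above. All names are hypothetical. One assumption differs slightly from the card simulation: the right-to-left scan stops at a value ≤ pivot (not strictly < pivot), so that values equal to the pivot end up on the left.

```java
class QuickSortSketch {
    static void sort(int[] A) {
        sort(A, 0, A.length - 1);
    }

    static void sort(int[] A, int low, int high) {
        if (low >= high) return;                   // 0 or 1 values: already sorted
        int p = partition(A, low, high);
        sort(A, low, p - 1);                       // values <= pivot
        sort(A, p + 1, high);                      // values > pivot
    }

    // Partitions A[low..high] and returns the final index of the pivot.
    static int partition(int[] A, int low, int high) {
        swap(A, low, medianOf3(A, low, high));     // move the pivot to the leftmost position
        int pivot = A[low];
        int left = low + 1, right = high;
        while (true) {
            while (left <= right && A[left] <= pivot) left++;    // find a value > pivot
            while (left <= right && A[right] > pivot) right--;   // find a value <= pivot
            if (left > right) break;
            swap(A, left, right);
            left++;
            right--;
        }
        swap(A, low, right);                       // right is the last index holding a value <= pivot
        return right;
    }

    // Returns the index of the median of the leftmost, middle, and rightmost values.
    static int medianOf3(int[] A, int low, int high) {
        int mid = low + (high - low) / 2;
        int a = A[low], b = A[mid], c = A[high];
        if ((a <= b && b <= c) || (c <= b && b <= a)) return mid;
        if ((b <= a && a <= c) || (c <= a && a <= b)) return low;
        return high;
    }

    static void swap(int[] A, int i, int j) {
        int tmp = A[i];
        A[i] = A[j];
        A[j] = tmp;
    }
}
```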

Runtime efficiency of Quick Sort

Time to partition proportional to the number of values that are partitioned

best case, pivot is middle value

        two recursive calls are problems of ½ size

        T(1) = 1

        T(N) = N + 2 T(N/2)

        T(N) = O(N log N)

Worst case:  pivot is smallest or largest value:

        Each recursive call has problem of size one smaller

        N calls, each must partition its part of the array

        So T(N) = N + (N-1) + (N-2) + … + 1 = O(N^2)

Sorting using a BST:

·       insert the N values from the array, one at a time, into a BST;  T(N) = O(N log N) on average (O(N^2) in the worst case, when the tree degenerates into a list)

·       do an in-order traversal of the BST  T(N) = O(N)

total time is proportional to N log N + N, which is O(N log N)
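The two passes can be sketched as below. The names are hypothetical, and the tree is assumed to be a plain unbalanced BST with duplicates sent to the right, so inserting values that are already sorted produces a list-shaped tree and the worst case time.

```java
class BstSortSketch {
    static class Node {
        int value;
        Node left, right;
        Node(int value) { this.value = value; }
    }

    // Inserts value into the BST rooted at root and returns the (possibly new) root.
    static Node insert(Node root, int value) {
        if (root == null) return new Node(value);
        if (value < root.value) {
            root.left = insert(root.left, value);
        } else {
            root.right = insert(root.right, value);   // duplicates go to the right
        }
        return root;
    }

    // In-order traversal: writes values into out starting at index next, returns the next free index.
    static int inOrder(Node node, int[] out, int next) {
        if (node == null) return next;
        next = inOrder(node.left, out, next);
        out[next++] = node.value;
        return inOrder(node.right, out, next);
    }

    static int[] sort(int[] a) {
        Node root = null;
        for (int v : a) {
            root = insert(root, v);                   // 1st pass: build the BST
        }
        int[] sorted = new int[a.length];
        inOrder(root, sorted, 0);                     // 2nd pass: in-order traversal
        return sorted;
    }
}
```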

 

Heap Sort:

Step 1:  heapify the array;    T(N) = O(N)

Step 2:  remove the largest value

& place at the end, N times;     

T(N) = [log(N) + 1] + [log (N-1) + 1] +  … + [log 2 + 1] + 2

                = O(N log N)

total time is O(N log N)
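A sketch of the two steps, sorting in place. The names are hypothetical, and it assumes a max-heap stored in the array itself with the root at index 0 (children of index i at 2i+1 and 2i+2); a heap that leaves index 0 unused would need different index arithmetic.

```java
class HeapSortSketch {
    static void sort(int[] A) {
        int n = A.length;
        // Step 1: heapify, sifting down each non-leaf node from the last one up to the root.
        for (int i = n / 2 - 1; i >= 0; i--) {
            siftDown(A, i, n);
        }
        // Step 2: repeatedly move the largest value to the end of the shrinking heap.
        for (int end = n - 1; end > 0; end--) {
            int tmp = A[0];
            A[0] = A[end];
            A[end] = tmp;
            siftDown(A, 0, end);                 // restore the heap on A[0..end-1]
        }
    }

    // Moves A[i] down until neither child within A[0..size-1] is larger.
    static void siftDown(int[] A, int i, int size) {
        while (true) {
            int largest = i;
            int l = 2 * i + 1, r = 2 * i + 2;
            if (l < size && A[l] > A[largest]) largest = l;
            if (r < size && A[r] > A[largest]) largest = r;
            if (largest == i) return;
            int tmp = A[i];
            A[i] = A[largest];
            A[largest] = tmp;
            i = largest;
        }
    }
}
```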

 

simulate heap sort with 7 (or 15) values

Summary

 

                        Best case       Worst case      Avg case

Selection Sort          N^2             N^2             N^2

Insertion Sort          N               N^2             N^2

Merge Sort              N log N         N log N         N log N

Quick Sort              N log N         N^2             N log N

BST                     N log N         N^2             N log N

Heap Sort               N log N         N log N         N log N

 

You try: 

practice the Quick Sort using 7-10 playing cards

practice Heap sort, other sorts