Lecture 29: Sorting
Announcements:
· practice exams for the final exam are posted; try them
· homework #10 due
· p5 due (small differences in similarity measures are due to choice of delimiters; that's OK)
· final exam:
You try:
· each student or pair of students takes 10 playing cards
· place cards face down
· invent a sorting algorithm that looks at two cards at a time
  o consider using the data structures we've covered
  o can use the fact that values are in [1,13]
· after 10-15 min., report your sorting algorithm to class
After describing 3-4 algorithms:
time the sorting algorithms on 13 random larger cards
Some possible results:
Algorithm #1:
1st step: hash each card to its position in an aux. array of size 13
2nd step: for each position in the aux. array, put all values into a sorted array
time: O(N)
space: O(largest possible value)
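A minimal sketch of Algorithm #1, assuming the cards are plain int values in [1, maxValue] (13 for playing cards). The class name CardCountingSort and the parameter maxValue are made up for illustration; the "hash" here is just using the card value as the array index.

```java
class CardCountingSort {
    // Assumption: every value in cards lies in [1, maxValue].
    static void sort(int[] cards, int maxValue) {
        int[] count = new int[maxValue + 1];   // aux. array; index 0 is unused

        // 1st step: "hash" each card to its position in the aux. array
        for (int card : cards) {
            count[card]++;
        }

        // 2nd step: walk the aux. array in order, writing the values back
        int next = 0;
        for (int value = 1; value <= maxValue; value++) {
            for (int c = 0; c < count[value]; c++) {
                cards[next++] = value;
            }
        }
    }

    public static void main(String[] args) {
        int[] cards = {7, 13, 2, 7, 1, 10, 4, 12, 9, 3};
        sort(cards, 13);
        System.out.println(java.util.Arrays.toString(cards));
    }
}
```

No two cards are ever compared with each other. Strictly the loops take time proportional to N + maxValue, which is O(N) when the range is a fixed constant such as 13.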
Algorithm #2:
1st step: put each card into a BST
2nd step: do an in-order traversal, putting values into a sorted array
time: first pass: O(N log N), 2nd pass: O(N)
Algorithm #3: [called "bubble sort"]
1st step: compare adjacent values; if out of order, swap
at the end of the pass, the largest value is at the end of the array
repeat N times
time: O(N^2)
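One way Algorithm #3 might look in code, as a sketch only (the class name BubbleSortSketch is hypothetical). It makes N-1 passes rather than N, since after N-1 passes the one remaining value must already be in place, and each pass stops before the part of the array that is already final.

```java
class BubbleSortSketch {
    static void sort(int[] a) {
        int n = a.length;
        for (int pass = 0; pass < n - 1; pass++) {
            // compare adjacent values; after this pass the largest
            // remaining value sits at index n-1-pass
            for (int j = 0; j < n - 1 - pass; j++) {
                if (a[j] > a[j + 1]) {      // out of order: swap
                    int tmp = a[j];
                    a[j] = a[j + 1];
                    a[j + 1] = tmp;
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] a = {5, 1, 4, 2, 8};
        sort(a);
        System.out.println(java.util.Arrays.toString(a));
    }
}
```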
Algorithm #4: [called "quicksort"]
look at the middle value
partition all other values: to the left if smaller, to the right if larger
Time the algorithms, discuss which ones are overall best
consider arrays that are already sorted, or almost sorted
Sorting problem: given an array A of N values, arrange the values in sorted order
· most sorting algorithms involve comparing values
· "obvious" algorithms are O(N^2)
· "clever" algorithms are O(N log N)
· can prove that it's not possible for a comparison sort to have worst case time better than O(N log N)
· if values are in some fixed range, then can avoid comparisons
Interesting questions for comparison sorts:
· does the algorithm always take worst case time?
· what happens when array A is already sorted?
5 well-known comparison algorithms:
1. selection sort: always O(N^2)
2. insertion sort: worst case O(N^2), but O(N) if already sorted
3. merge sort: always O(N log N)
4. quick sort: worst case O(N^2), but on average O(N log N)
5. heap sort: always O(N log N)
Selection Sort
idea:
· find the smallest value in A, put it in A[0]
· find the 2nd smallest value, put it in A[1]
· etc.
approach:
· use a nested loop
  o outer loop, k from 0 to A.length-1, indicates which position to fill
  o inner loop, j from k+1 to A.length-1, finds which value to put in position k (need value & index)
  o after the inner loop, put the min value in position k
simulate: 6 students hold cards, initially facing in
write value of k on the board
smallest-so-far is turned face out
when the value is placed in A[k], hold it up high
You try:
write the code for
public static void SelectionSort (int[] A) {
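public static void SelectionSort (int[] A) {
    int N = A.length, minIndex, min;
    for (int k = 0; k < N; k++) {
        // start by assuming A[k] is the smallest value in A[k..N-1]
        min = A[k];
        minIndex = k;
        // inner loop: find the value and index of the actual smallest
        for (int j = k+1; j < N; j++) {
            if (A[j] < min) {
                min = A[j];
                minIndex = j;
            }
        }
        // swap the smallest value into position k
        A[minIndex] = A[k];
        A[k] = min;
    }
}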
Time for selection sort:
· the inner loop runs a different number of times on each iteration of the outer loop, so either
  (a) (N-1) + (N-2) + (N-3) + ... + 2 + 1 + 0 = N(N-1)/2 = O(N^2), or
  (b) N × avg length of inner loop = N × N/2 = O(N^2)
· Selection sort is always O(N^2) (even if the array is already sorted)
Insertion sort
idea:
· put the first two items in the correct order
· find the right place for the 3rd item relative to the previous items, working right-to-left, shifting each larger value to the right to make room; insert the 3rd value when the next value to the left is smaller (or equal).
· repeat for the 4th item, etc.
approach:
· use a nested loop
· outer loop index k from 1 to A.length-1 tells which item you're putting in its place relative to the previous ones
· inner loop index j from k-1 down to the position where the kth value goes, indicates which previous value you're comparing with the kth value
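The approach above can be sketched as follows. This is an illustration only and may differ from the version in the course's on-line notes; the class name InsertionSortSketch is made up.

```java
class InsertionSortSketch {
    static void sort(int[] a) {
        // outer loop: k is the item being put in its place
        // relative to a[0..k-1], which is already sorted
        for (int k = 1; k < a.length; k++) {
            int value = a[k];
            int j = k - 1;
            // inner loop: work right-to-left, shifting each larger
            // value one place to the right to make room
            while (j >= 0 && a[j] > value) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = value;   // a[j] <= value (or j == -1): insert here
        }
    }

    public static void main(String[] args) {
        int[] a = {6, 2, 9, 4, 4, 1};
        sort(a);
        System.out.println(java.util.Arrays.toString(a));
    }
}
```

On an already-sorted array the while condition fails immediately for every k, so only the outer loop does any work.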
simulate Insertion Sort with cards
· afterwards, discuss: what if we use binary search to find the place to insert?
  o Only faster if you don't have to move the item
  o Won't be faster overall, due to the needed shifting
Code for insertion sort is in the on-line notes
Time for insertion sort:
· worst case: inner loop executes 1 + 2 + ... + (N-1) times: O(N^2)
· best case: array already sorted: inner loop never executes: O(N)
· what if array is in reverse sorted order? - worst case
Merge sort
idea:
2 sorted arrays (each of length N/2) can be merged into a sorted array of length N in time O(N)
example: pair of students each sort 4 cards, then merge
to sort an array of length N:
· divide the array into two halves, sort each half, then merge
· to sort each half, divide it in half, sort each half and merge
· base case: length = 1 (already sorted, so return)
simulate the recursive merge sort:
one student takes 8 cards, gives first half of the cards to another student
that student gives first half to another student, etc.
when one half comes back sorted, give the other half to a student, etc.
when second half comes back sorted, merge the two halves and return the sorted cards
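A sketch of the recursive merge sort described above. The class name MergeSortSketch and the single shared scratch array tmp are illustrative choices, not something the notes prescribe.

```java
class MergeSortSketch {
    static void sort(int[] a) {
        if (a.length > 1) {
            sort(a, new int[a.length], 0, a.length - 1);
        }
    }

    // sorts a[low..high]; tmp is scratch space of the same length as a
    private static void sort(int[] a, int[] tmp, int low, int high) {
        if (low >= high) return;            // base case: length 1, already sorted
        int mid = low + (high - low) / 2;
        sort(a, tmp, low, mid);             // sort first half
        sort(a, tmp, mid + 1, high);        // sort second half
        merge(a, tmp, low, mid, high);      // merge the two sorted halves
    }

    // merges sorted runs a[low..mid] and a[mid+1..high] in O(N) time
    private static void merge(int[] a, int[] tmp, int low, int mid, int high) {
        int left = low, right = mid + 1, out = low;
        while (left <= mid && right <= high) {
            tmp[out++] = (a[left] <= a[right]) ? a[left++] : a[right++];
        }
        while (left <= mid)   tmp[out++] = a[left++];    // leftovers, first half
        while (right <= high) tmp[out++] = a[right++];   // leftovers, second half
        for (int i = low; i <= high; i++) {
            a[i] = tmp[i];                                // copy back
        }
    }

    public static void main(String[] args) {
        int[] a = {8, 3, 5, 1, 9, 2, 7, 4};
        sort(a);
        System.out.println(java.util.Arrays.toString(a));
    }
}
```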
Time for Merge Sort:
· base case (1 value to be sorted): T(1) = 1
· recursive case: solve 2 problems each of size N/2
  o 2 recursive calls
  o merge the 2 solutions
T(N) = 2 T(N/2) + N
You try: fill in the table of
N    T(N)   log2 N
1    1      0
2    4      1
4    12     2
8    32     3
guess: T(N) = N log2 N + N
verify: N log N + N = 2 ((N/2) log (N/2) + N/2) + N
= N [(log N) - 1] + N + N = N log N + N
note: log (N/2) = (log N) - (log 2) = (log N) - 1
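To check the guess beyond the four table rows, the recurrence can be evaluated directly. A small sketch, assuming N is a power of 2 (the method names t and guess are made up here):

```java
class MergeSortRecurrence {
    // T(1) = 1, T(N) = 2 T(N/2) + N; assumes n is a power of 2
    static long t(long n) {
        return (n == 1) ? 1 : 2 * t(n / 2) + n;
    }

    // the guessed closed form: N log2 N + N
    static long guess(long n) {
        long log2 = 63 - Long.numberOfLeadingZeros(n);   // exact for powers of 2
        return n * log2 + n;
    }

    public static void main(String[] args) {
        for (long n = 1; n <= 1024; n *= 2) {
            System.out.println(n + "\t" + t(n) + "\t" + guess(n));
        }
    }
}
```

The two columns printed for T(N) and the guess should agree on every row, matching the table for N = 1, 2, 4, 8.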
Quick Sort
· choose a "pivot" value v from A
· partition the array
  o values ≤ v on the left
  o values > v on the right
· recursively Quick sort the left and the right
How to choose the pivot value? - "median of 3"
choose the median of the leftmost, rightmost, & middle value
Simulate Quick Sort
use 7 cards
choose the pivot, swap it with value in position 0,
partition in place:
traverse from the left until find value n > pivot
traverse from the right until find value m < pivot
swap n and m;
continue the traversals until all values are partitioned
swap the pivot with the last value < pivot
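The simulation above might translate to code roughly as follows. This is a sketch under assumptions: the class and helper names (QuickSortSketch, medianIndex, swap) are invented, and the right-to-left scan stops at a value ≤ pivot rather than strictly < pivot, so that duplicates of the pivot end up on the left as in the "values ≤ v" rule.

```java
class QuickSortSketch {
    static void sort(int[] a) {
        sort(a, 0, a.length - 1);
    }

    private static void sort(int[] a, int low, int high) {
        if (low >= high) return;                 // 0 or 1 values: nothing to do
        int p = partition(a, low, high);
        sort(a, low, p - 1);                     // values <= pivot
        sort(a, p + 1, high);                    // values > pivot
    }

    // partitions a[low..high] in place and returns the pivot's final index
    private static int partition(int[] a, int low, int high) {
        int mid = low + (high - low) / 2;
        swap(a, low, medianIndex(a, low, mid, high));   // pivot to leftmost slot
        int pivot = a[low];
        int left = low + 1, right = high;
        while (true) {
            while (left <= right && a[left] <= pivot) left++;    // find n > pivot
            while (left <= right && a[right] > pivot) right--;   // find m <= pivot
            if (left > right) break;
            swap(a, left, right);
            left++;
            right--;
        }
        swap(a, low, right);    // right is the last position holding a value <= pivot
        return right;
    }

    // "median of 3": index of the median of a[i], a[j], a[k]
    private static int medianIndex(int[] a, int i, int j, int k) {
        if (a[i] > a[j]) { int t = i; i = j; j = t; }
        if (a[j] > a[k]) { int t = j; j = k; k = t; }
        if (a[i] > a[j]) { int t = i; i = j; j = t; }
        return j;               // now a[i] <= a[j] <= a[k]
    }

    private static void swap(int[] a, int x, int y) {
        int tmp = a[x];
        a[x] = a[y];
        a[y] = tmp;
    }

    public static void main(String[] args) {
        int[] a = {8, 3, 5, 1, 9, 2, 7};
        sort(a);
        System.out.println(java.util.Arrays.toString(a));
    }
}
```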
Runtime efficiency of Quick Sort
Time to partition is proportional to the number of values that are partitioned
best case, pivot is middle value
two recursive calls are problems of ½ size
T(1) = 1
T(N) = N + 2 T(N/2)
T(N) = O(N log N)
Worst case:
pivot is smallest or largest value:
Each recursive call has problem of size one smaller
N calls, each must partition its part of the array
So T(N) = N + (N-1) + (N-2) + ... + 1 = O(N^2)
Sorting using a BST:
· insert the N values from the array, one at a time, into a BST: T(N) = O(N log N) on average (O(N^2) in the worst case, when the tree degenerates into a list)
· do an in-order traversal of the BST: T(N) = O(N)
total time is proportional to N log N + N, which is O(N log N) on average
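A minimal sketch of the two steps, using a plain unbalanced BST (the names BstSortSketch and Node are hypothetical, and sending duplicates to the right subtree is just one possible convention):

```java
class BstSortSketch {
    private static class Node {
        int value;
        Node left, right;
        Node(int value) { this.value = value; }
    }

    // standard unbalanced BST insert; duplicates go to the right
    private static Node insert(Node root, int value) {
        if (root == null) return new Node(value);
        if (value < root.value) {
            root.left = insert(root.left, value);
        } else {
            root.right = insert(root.right, value);
        }
        return root;
    }

    // in-order traversal writing values into out starting at index next;
    // returns the next free index
    private static int inOrder(Node root, int[] out, int next) {
        if (root == null) return next;
        next = inOrder(root.left, out, next);
        out[next++] = root.value;
        return inOrder(root.right, out, next);
    }

    static void sort(int[] a) {
        Node root = null;
        for (int v : a) {
            root = insert(root, v);     // step 1: N inserts
        }
        inOrder(root, a, 0);            // step 2: one O(N) traversal
    }

    public static void main(String[] args) {
        int[] a = {7, 2, 11, 2, 5, 13, 1};
        sort(a);
        System.out.println(java.util.Arrays.toString(a));
    }
}
```

With this unbalanced tree, an already-sorted input produces a tree that is one long right spine, which is where the N^2 worst case comes from.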
Heap Sort:
Step 1: heapify the array; T(N) = O(N)
Step 2: remove the largest value & place at the end, N times;
T(N) = [log(N) + 1] + [log(N-1) + 1] + ... + [log 2 + 1] + 2 = O(N log N)
total time is O(N log N)
simulate heap sort with 7 (or 15) values
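The two steps can be sketched in place on the array itself. Assumptions in this sketch: a max-heap stored from index 0 (children of i at 2i+1 and 2i+2), which may differ from the heap layout used elsewhere in the course; the names HeapSortSketch and siftDown are made up.

```java
class HeapSortSketch {
    static void sort(int[] a) {
        int n = a.length;

        // Step 1: heapify, sifting down every non-leaf from the bottom up
        for (int i = n / 2 - 1; i >= 0; i--) {
            siftDown(a, i, n);
        }

        // Step 2: repeatedly remove the largest value and place it at the end
        for (int end = n - 1; end > 0; end--) {
            int largest = a[0];
            a[0] = a[end];
            a[end] = largest;
            siftDown(a, 0, end);        // heap now occupies a[0..end-1]
        }
    }

    // restores the max-heap property for the subtree rooted at i,
    // looking only at the first size elements of a
    private static void siftDown(int[] a, int i, int size) {
        while (true) {
            int left = 2 * i + 1, right = left + 1, largest = i;
            if (left < size && a[left] > a[largest]) largest = left;
            if (right < size && a[right] > a[largest]) largest = right;
            if (largest == i) return;
            int tmp = a[i];
            a[i] = a[largest];
            a[largest] = tmp;
            i = largest;
        }
    }

    public static void main(String[] args) {
        int[] a = {4, 10, 3, 5, 1, 8, 7};
        sort(a);
        System.out.println(java.util.Arrays.toString(a));
    }
}
```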
Summary
               | Best case | Worst case | Avg case |
Selection Sort | N^2       | N^2        | N^2      |
Insertion Sort | N         | N^2        | N^2      |
Merge Sort     | N log N   | N log N    | N log N  |
Quick Sort     | N log N   | N^2        | N log N  |
BST            | N log N   | N^2        | N log N  |
Heap Sort      | N log N   | N log N    | N log N  |
You try:
practice the Quick Sort using 7-10 playing cards
practice Heap sort, other sorts