Section#16: Asymptotically optimal mesh sorting algorithm
(CS838: Topics in parallel computing, CS1221, Thu, Mar 25, 1999, 8:00-9:15 a.m.)


Let N=n2. Consider M(n,n) of N nodes, each holding a number. Divide the mesh into n1/2 submeshes M(n3/4,n3/4), called shortly blocks. One column of n1/4 blocks is called a vertical slice and one row of n1/4 blocks a horizontal slice. Let m=n1/4 be the # of blocks in one slice (horizontal or vertical). Block (i,j) is the block in the i-th horizontal slice and j-th vertical slice.

The algorithm consists of 8 phases:

Lemma

The algorithm sorts N=n2 numbers on mesh M(n,n) into a snake-like order in
3n+o(n)
parallel steps.
Proof
(by 0-1 Sorting Lemma) Proof of the parallel time complexity: Each block is mesh M(n3/4,n3/4). If we use Shearsort within blocks and even-odd transposition in rows and columns, we have Hence the total parallel time complexity is \sumi=18 Ti(N)=3n+O(n3/4log n)=3n+o(n).
Back to the beginning of the page Back to the CS838 class schedule