Section#4: Homework 1
(CS838: Topics in parallel computing, CS1221, Thu, Jan 28, 1999, 8:00-9:15 a.m.)

Simple parallel PRAM sorting algorithm

An array of n numbers x1,...,xn is stored in shared memory of a p-processor PRAM machine, where p=p(n)<= n. For simplicity, assume the unit cost PRAM model, where any READ, WRITE, or local operation takes unit time. Your task is to sort the numbers using some parallel comparison-based sorting algorithm. Consider this simple divide-and-conquer approach:
  1. Processors split the numbers uniformly among themselves so that each takes care of a subarray of n/p numbers.
  2. Each processor sorts its subarray using an optimal sequential sort algorithm.
  3. Then they merge pairwise the sorted subarray as follows:
    1. p/2 processors merge p/2 pairs of subarrays, each of size n/p,
    2. p/4 processors merge p/4 pairs of subarrays, each of size 2n/p.
    3. and so on
    4. finally: 1 processor merges the last pair of subarrays, each of size n/2.
Solve the following problems:
  1. Find the best possible asymptotic formula for the parallel time T(n,p).
  2. Using this formula, calculate the cost C(n,p) and efficiency E(n,p).
  3. Given 0<E0<1, find the asymptotically maximal function f1 such that
    for all  pn=O(f1(n)):  E(n,pn)>= E0.
  4. Given 0<E0<1, find the asymptotically minimal function f2 such that
    for all  np=\Omega(f2(p)):  E(np,p)>= E0.
  5. Find the asymptotically minimal function f3 such that
    for all  p=\Omega(f3(n)):  T(n,p)=T_{opt}(n,p)
  6. What is the scalability of this algorithm?