Section#3: Parallel complexity theory - NC algorithms
(CS838: Topics in parallel computing, CS1221, Tue, Jan 26, 1999, 8:00-9:15 a.m.)


The contents

  1. Basics notions of sequential complexity theory
  2. Highly parallel feasibility
  3. NC algorithms
  4. Drawbacks of the class NC theory
A common sense on feasibility of some computing resource is that the growth rate for the resource is bounded by a polynomial in input size. In the world of sequential problem solving, the most valuable resource is time. Hence, a problem is feasible, or tractable, if it is solvable in polynomial time and intractable if the best known algorithm has higher than polynomial time complexity and all other intractable problems are known to be reducible to this problem. (Recent advances in complexity theory consider approximatively tractable problems which are intractable if we seek exact solutions, but are tractable if we can accept approximate solutions, but this is beyond the scope of this lecture).

Obviously, a sequentially intractable problem remains intractable in parallel if we use at most polynomial number of processors. And polynomial number of processors is the maximum we can hope for in practice. Hence, parallel complexity theory deals only with sequential polynomial problems. Most of the machinery of the parallel complexity theory is derived from the sequential one, which uses specific formal framework to make its results and conclusions independent on particular implementation details of algorithms and robust with respect to various models and architectures of computing devices. So before explain the basic ideas of the parallel complexity, we review the basic notions of the classical complexity theory, so that we can understand how the parallel complexity theory relates to the classical one.
Back to the beginning of the page Back to the CS838 class schedule

Basics notions of sequential complexity theory

Famous open problem P\not=NP?

Obviously, P\subset NP, but it is not known whether P=NP or P\not=NP. It is conjectured that P\not=NP and that NP-P contains another subclass of called NP-complete problems, typically of combinatorial flavor, such as search in exponentially large search spaces. To define NP-completeness, we need the notion of reducibility.

Lemma

P is closed under the relation of P-reduction: If L1<=p L2 and L2 in P, then L1 in P.

Lemma

The class of NP-complete problems is closed under P-reduction. If L2 in NP and there exists NP-complete L1 such that L1<=p L2, then L2 is NP-complete.

Corollary: If there exists an NP-complete problem such that is in P, then P=NP. If there exists an NP-complete problem which is proved not to be in P, then no NP-complete problem is in P. There is a strong evidence for the second case to be true. There are hundreds of NP-complete problems. To show that a language L is NP-complete, we have to show that L is in NP (which is usually easy) and then there are two choices:

Back to the beginning of the page Back to the CS838 class schedule

Highly parallel feasibility

In parallel computing world, we have another valuable resource, the number of processors. Except for this difference, all the notions of the classical complexity theory apply here. The parallel feasibility usually means decidability in polynomial time with polynomial number of processors. So a problem is parallel feasible iff it is sequentially feasible and parallel complexity theory works only with complexity class P.

Lemma

L is decidable in sequential time nO(1) iff it is decidable in parallel time nO(1) with nO(1) processors.

However, the goal of parallel computation is to achieve a highly parallel feasibility: to reduce the sequential polynomial complexity of problems to subpolynomial parallel time!!!! For that, we must use polynomial number of processors, since

p(n).T(n,p(n))>= SU(n).
There are two important practical considerations:
  1. ``Which polynomial number of processors is acceptable/affordable?'' In practice, the polynomial number of processors is meant to be linear or close.
  2. ``Which subpolynomial time is achievable''? It turned out that almost all highly parallel feasible problems can achieve polylogarithmic parallel time.
Therefore, the traditional class of highly parallel feasible algorithms is the following.
Back to the beginning of the page Back to the CS838 class schedule

NC algorithms

The class NC is the set of languages decidable in parallel time T(n,p(n))=O(logO(1)n) with p(n)=O(nO(1)) processors.

Again, we may assume PRAM model, and as we have seen in the previous lecture, if some algorithm is in NC, it remains in NC regardless of which PRAM submodel we assume. More generally, the class NC is robust with respect to any other accepted model of parallel computation (Bollean circuit model, LogP, Parallel memory hierarchy model, etc). The class NC includes:

  1. parallel prefix (scan) computation
  2. parallel sorting and selection
  3. matrix operations (matrix multiplication, solving special cases of systems of linear equations)
  4. parallel tree contraction and expression evaluation
  5. Euler tour technique and derived algorithms
  6. parallel algorithms for connected components of graphs
  7. parallel graph biconnectivity and triconnectivity
  8. parallel algorithms for many problems in computational geometry
  9. parallel string matching algorithms
Many NC algorithms are cost optimal. For example, they have T(n,p(n))=O(log n) with p(n)=n/log n if SU(n)=O(n) (for example parallel scan). Some cost optimal NC algorithms are superfast, they achieve even sublogarithmic time. For example, there are string matching algorithms with O(log log n) parallel time if p(n)=n/(log log n) processors, where n is the length of the string.

All parallel algorithms included in this course are NC, so that we will see many examples in detail during the course.

On the other hand, there are many problems for which no NC algorithms are known. Within this theory, they are considered badly parallelizable and usually called inherently sequential. This is going to be discussed in the next lecture. But before that, we take a look on problems with the NC class theory.
Back to the beginning of the page Back to the CS838 class schedule

Drawbacks of the class NC theory

  1. The class NC may include some algorithms which are not efficiently parallelizable. The most infamous example is parallel binary search. The trouble is that this problem has polylogarithmic time complexity even for p(n)=1. If we apply p>1 processors, we get T(n,p)=logp+1n and therefore the cost C(n,p)=(p/log p)log n and the speedup is S(n,p)=log(p+1) which is far from what could be considered as a highly parallel solution.
    Any sequential algorithm with logarithmic time is in NC regardless of its parallel feasibility!!!
  2. NC theory assumes situations where a huge machine (polynomial number of processors, e.g., millions) solves very quickly (polylogarithmic time, e.g., seconds) moderately sized problems (e.g., hundred of thousands input items). In practice, however, moderately sized machines (hundreds, at most thousands of processors) are used to solve large problems (e.g., millions of input items). So that the number of processors tends to be subpolynomial, even sublinear.
  3. Some problems in P have sublinear parallel time O(nx), 0<x<1, and are not therefore in NC. But sublinear functions may be less progressive for practical values of n and the asymptotic behavior becomes relevant only for impractically large n. For example, \sqrt{n}<log3n for n<= 0.5 x 109. So, sublinear parallel time algorithms may run faster than NC algorithms.
Back to the beginning of the page Back to the CS838 class schedule

Last modified: Mon Jan 25 by Pavel Tvrdik