COMPUTATIONAL COMPLEXITY
Computational complexity is the study of how much of a given resource a
program uses. The resource in question is usually either space (how
much memory) or time (how many basic operations).
Complexity can be looked at from many angles:
- Worst-case complexity
is the most commonly discussed (and what we will focus on in this
course). It gives a guaranteed upper bound for how much of a resource a
program will use.
- Average-case complexity is often useful, but more difficult
to compute.
- Best-case complexity is rarely useful, except as a contrast
with the average and worst cases. It gives a guaranteed lower bound
for how much of a resource a program will use.
Space vs. Time Complexity
In general, the amount of space used is less than the amount of time used
because space can be reused and time cannot. For most applications, we
focus on time complexity, since memory tends to be cheap compared to time
(these days). Sometimes, space is still an issue - particularly if the
problem size is very large and the solution requires relatively low time
complexity. (An example would be sorting very large data sets, on the
order of one billion elements.)
A simple space example - adding two arrays of integers. Assume our array
size is N and A,B are input arrays.
- The following code requires N+N+N=3N space:
    int C[N];
    for (int i = 0; i < N; i++)
        C[i] = A[i] + B[i];
- Suppose we no longer need B - we can reuse it. The
following code requires only N+N=2N space:
    for (int i = 0; i < N; i++)
        B[i] = A[i] + B[i];
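As a quick illustration, here is a minimal self-contained sketch of both
versions (the array size and contents are made up for the example):

    #include <cstddef>
    #include <iostream>

    int main() {
        const std::size_t N = 5;
        int A[N] = {1, 2, 3, 4, 5};
        int B[N] = {10, 20, 30, 40, 50};

        // Version 1: allocate a third array C, so roughly 3N ints of space.
        int C[N];
        for (std::size_t i = 0; i < N; i++)
            C[i] = A[i] + B[i];

        // Version 2: if B is no longer needed, reuse it, so roughly 2N ints.
        for (std::size_t i = 0; i < N; i++)
            B[i] = A[i] + B[i];

        for (std::size_t i = 0; i < N; i++)
            std::cout << C[i] << ' ' << B[i] << '\n';   // both hold the sums
        return 0;
    }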
Running-time analysis
- How much time does each function require?
- Actual time (e.g., seconds) is hard to measure - it varies from machine
to machine and system to system
- Consider time as number of basic operations:
- one arithmetic op, e.g. + - *
- one assignment
- one read
- one write
- etc.
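For example, counting a simple summation loop this way (the data here is
made up for illustration, and counting conventions vary slightly from text
to text):

    #include <iostream>

    int main() {
        const int n = 4;
        int a[n] = {3, 1, 4, 1};     // example data, made up for illustration

        int sum = 0;                 // 1 assignment
        for (int i = 0; i < n; i++)  // 1 assignment, n+1 comparisons, n increments
            sum = sum + a[i];        // per iteration: 1 read, 1 addition, 1 assignment

        // Counting this way, the loop costs roughly 3n + 2n + 2 = 5n + 2 basic
        // operations (plus 1 for initializing sum) -- a count proportional to n.
        std::cout << sum << '\n';    // prints 9
        return 0;
    }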
Operations required by Bag members
member function                 | approximate number of operations
--------------------------------+---------------------------------
size_t size()                   | 1
void insert(int entry)          | 2
size_t occurrences(int target)  | 4n + 3
void grab()  // slow version    | 3n + 5
void grab()  // fast version    | 5
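To see where a count like 4n + 3 comes from, here is a hedged sketch of the
idea behind occurrences, written as a free function over an array rather
than the actual Bag member (the layout and names are assumptions for
illustration only):

    #include <cstddef>
    #include <iostream>

    // Scan every item and count how many equal the target.
    std::size_t occurrences(const int data[], std::size_t used, int target) {
        std::size_t answer = 0;                 // 1 assignment
        for (std::size_t i = 0; i < used; i++)  // 1 assignment, n+1 comparisons, n increments
            if (data[i] == target)              // per pass: 1 read + 1 comparison
                answer++;                       // 1 increment on each match
        return answer;                          // 1 operation
    }

    int main() {
        int data[] = {4, 8, 4, 15, 4};
        std::cout << occurrences(data, 5, 4) << '\n';   // prints 3
        return 0;
    }
    // Counting this way gives a total of the form c*n + d (a few operations per
    // item plus constant setup), which is why the table lists roughly 4n + 3.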
- For some functions, the number of operations is the same on every
call; e.g., the size and insert functions. We say that such functions
require constant time.
- For other functions, times vary according to some parameter's
"value", e.g., time for occurrences is proportional to number of
items already in the bag. We call the important factor the problem
size.
- We don't care about the exact number of operations, just how the
time is related to the problem size.
- To express time requirements we use "Big-O" notation.
Big-O Notation
Definition: function f(n) is O(g(n)) if there exist
constants k and N such that for all n>=N: f(n) <= k * g(n).
(The notation is often confusing: f = O(g) is read "f is big-oh of g.")
As a simple example, let f(n)=2n and g(n)=n. Is f=O(g)? Yes, we can pick
k=2 and N=1, and then the definition is easily satisfied, since 2n
<= 2*n for all n >= 1. Although simple, this example illustrates a key
property of big-oh: it ignores constant factors.
Generally, when we see a statement of the form f(n)=O(g(n)):
- f(n) is the formula that tells us exactly how many operations the
function/algorithm in question will perform when the problem size is n.
- g(n) is like an upper bound for f(n). Within a constant
factor, the number of operations required by your function is no
worse than g(n).
- In practice, we try to choose the simplest g(n) possible -
usually a single term with a coefficient of 1.
Below are some more big-oh examples:
- f(n) = 4 is O(1)
- f(n) = n is O(n^2)
- f(n) = n is O(n)
- f(n) = 3n + 1 is not O(4)
- f(n) = 3n + 1 is O(n)
- f(n) = 3n is O(n^3)
- f(n) = 3n is O(n)
- f(n) = 36*sqrt(n) is O(sqrt(n))
- f(n) = 36*sqrt(n) is O(4n^2 + 1)
- f(n) = 2^n is not O(17n^5 + 4n^2)
- f(n) = 2^n is O(2^n)
- f(n) = 4lg(n) is O(sqrt(n))
- f(n) = 4lg(n) is O(lg(n))
- f(n) = n! is not O(n^2 + 2^n + 1)
- f(n) = n! is O(n!)
- f(n) = 1134 is O(lg(n))
- f(n) = 1134 is O(1)
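To make the definition concrete, here is a small spot-check (not a proof)
of one claim above, that 3n + 1 is O(n): pick k = 4 and N = 1, since
3n + 1 <= 4n whenever n >= 1.

    #include <cassert>

    int main() {
        // Check f(n) = 3n + 1 <= k*g(n) = 4n for n >= N = 1.
        // A loop can only spot-check finitely many n; the algebraic argument
        // (3n + 1 <= 4n whenever n >= 1) covers all of them.
        for (long n = 1; n <= 1000000; n++)
            assert(3*n + 1 <= 4*n);
        return 0;
    }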
Growth Rates of Functions
f(n)    |   n = 1 |   n = 2 |   n = 4 |    n = 10 |   n = 100 | n = 1,000
--------+---------+---------+---------+-----------+-----------+----------
7       |       7 |       7 |       7 |         7 |         7 |         7
lg(n)   |       0 |       1 |       2 |       3.3 |       6.6 |       9.9
sqrt(n) |       1 |     1.4 |       2 |       3.2 |        10 |        32
n       |       1 |       2 |       4 |        10 |       100 |     1,000
n^2     |       1 |       4 |      16 |       100 |    10,000 | 1,000,000
n^3     |       1 |       8 |      64 |     1,000 | 1,000,000 |      10^9
2^n     |       2 |       4 |      16 |     1,024 |     10^30 |   10^300
n!      |       1 |       2 |      24 | 3,628,800 |    10^158 |  10^2568
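One way to reproduce (approximately) the larger entries above is to work
with base-10 logarithms, since 2^n and n! overflow ordinary integer and
floating-point types very quickly. A minimal sketch, using std::lgamma(n+1)
for the natural log of n!:

    #include <cmath>
    #include <cstdio>

    int main() {
        const double ns[] = {1, 2, 4, 10, 100, 1000};
        for (double n : ns) {
            double log10_pow2 = n * std::log10(2.0);                  // log10 of 2^n
            double log10_fact = std::lgamma(n + 1) / std::log(10.0);  // log10 of n!
            std::printf("n=%6.0f  lg(n)=%5.1f  sqrt(n)=%6.1f  "
                        "2^n ~ 10^%.0f  n! ~ 10^%.0f\n",
                        n, std::log2(n), std::sqrt(n), log10_pow2, log10_fact);
        }
        return 0;
    }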
Why `big-oh'?
- `O' for order as in order of magnitude.
- f=O(g) means that asymptotically (as n gets really large),
g(n) grows at least as fast as f(n).
- Why is this useful? We want our algorithms to be
scalable. Often, we write programs and test them on relatively
small inputs. Yet, we expect users to run our programs on larger
inputs. Running-time analysis helps us
predict how efficient our program will be in the `real world'.