- Authors:
H. Buhrman, T. Lee, and D. van Melkebeek.

- Reference:
*Computational Complexity*, 14: 228-255, 2005.
Special issue for selected papers from the
19th Annual IEEE Conference on
Computational Complexity.

- Earlier version:

#### Abstract

The language compression problem asks for succinct descriptions of
the strings in a language *A* such that the strings can be efficiently
recovered from their description when given a membership oracle for
*A*. We study randomized and nondeterministic decompression schemes
and investigate how close we can get to the information theoretic lower
bound of log |*A*^{=n}| for the description length of strings
of length *n*.
Using nondeterminism alone, we can achieve the information theoretic
lower bound up to an additive term of
*O*(((log |*A*^{=n}|)^{1/2} + log *n*)
log *n*);
using both nondeterminism and randomness, we can make do with an
excess term of *O*(log^{3} *n*).
With randomness alone, we show a lower bound of
*n* - log |*A*^{=n}| - *O*(log *n*)
on the description length of strings in *A* of length *n*,
and a lower bound of 2 log |*A*^{=n}| - *O*(1) on the
length of any program that distinguishes a given string of length *n*
in *A* from any other string. The latter lower bound is
tight up to an additive term of *O*(log *n*).

The key ingredient for our upper bounds is the relativizable hardness
versus randomness tradeoffs based on the Nisan-Wigderson pseudorandom
generator construction.