Language Compression and Pseudorandom Generators Download as PS


Abstract

The language compression problem asks for succinct descriptions of the strings in a language A such that the strings can be efficiently recovered from their description when given a membership oracle for A. We study randomized and nondeterministic decompression schemes and investigate how close we can get to the information theoretic lower bound of log |A=n| for the description length of strings of length n.

Using nondeterminism alone, we can achieve the information theoretic lower bound up to an additive term of O(((log |A=n|)1/2 + log n) log n); using both nondeterminism and randomness, we can make do with an excess term of O(log3 n). With randomness alone, we show a lower bound of n - log |A=n| - O(log n) on the description length of strings in A of length n, and a lower bound of 2 log |A=n| - O(1) on the length of any program that distinguishes a given string of length n in A from any other string. The latter lower bound is tight up to an additive term of O(log n).

The key ingredient for our upper bounds is the relativizable hardness versus randomness tradeoffs based on the Nisan-Wigderson pseudorandom generator construction.


dieter@cs.wisc.edu