Language Compression and Pseudorandom Generators

Authors: H. Buhrman, T. Lee, and D. van Melkebeek.
Reference: Computational Complexity, 14: 228-255, 2005. Special issue for selected papers from the 19th Annual IEEE Conference on Computational Complexity.
Earlier version:
- H. Buhrman, T. Lee, and D. van Melkebeek. Language Compression and Pseudorandom Generators. In Proceedings of the 19th Annual IEEE Conference on Computational Complexity, pages 15-28, 2004.

Abstract

The language compression problem asks for succinct descriptions of the strings in a language A such that the strings can be efficiently recovered from their description when given a membership oracle for A. We study randomized and nondeterministic decompression schemes and investigate how close we can get to the information theoretic lower bound of log |A⁼ⁿ| for the description length of strings of length n.

Using nondeterminism alone, we can achieve the information theoretic lower bound up to an additive term of O(((log |A⁼ⁿ|)^1/2 + log n) log n); using both nondeterminism and randomness, we can make do with an excess term of O(log³ n). With randomness alone, we show a lower bound of n - log |A⁼ⁿ| - O(log n) on the description length of strings in A of length n, and a lower bound of 2 log |A⁼ⁿ| - O(1) on the length of any program that distinguishes a given string of length n in A from any other string. The latter lower bound is tight up to an additive term of O(log n).

The key ingredient for our upper bounds is the relativizable hardness versus randomness tradeoffs based on the Nisan-Wigderson pseudorandom generator construction.

dieter@cs.wisc.edu