0$ we want to find a list that, with high probability, includes all information words $x \in \{0,1\}^K$ such that $\Delta(E(x),r) \leq \frac{1}{2} - \epsilon$, where $E(x)$ denotes the encoding of $x$ and $\Delta$ the relative Hamming distance. We require $1/2 - \epsilon$ because, if we allowed errors up to $1/2$ and the received word happened to be a valid codeword, the list would have to include all information words (any two distinct codewords have relative distance $1/2$), which we cannot allow since we would like to produce the list efficiently. We will present a randomized procedure that outputs the list in time polynomial in $K/\epsilon$. This also implies that the list contains at most polynomially many (in $K/\epsilon$) words. Recall that to find the $i^{\text{th}}$ bit of the information word $x$, the local decoding procedure for Hadamard codes picked a point $a$ uniformly at random and queried two positions of the received word, $r(e_i + a)$ and $r(a)$, to obtain $x_i$. When the fraction of errors is less than $1/4 - \epsilon$, this procedure gives the correct value with probability at least $1/2 + 2\epsilon$. In the current situation, however, the fraction of errors can be as large as $1/2 - \epsilon$, in which case the standard decoding procedure is only guaranteed to give the correct value with probability $\geq 2\epsilon$, which is of no use to us. Still, we do something similar: the idea is to query the first point while assuming that we already have the correct value for the second point. We focus on obtaining a particular bit of the information word $x$, say the $i^{\text{th}}$ bit $x_i$. Pick $t$ points $a_1, a_2, \ldots, a_t \in \{0,1\}^K$ uniformly at random ($t$ will be determined later). Let $x$ be the information word.
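To make the recap concrete, here is a small Python sketch of the Hadamard encoding and the two-query local decoder described above. The function names and the dictionary representation of the received word are our own illustration, not notation from the lecture.

```python
import random
from itertools import product

def inner(x, a):
    """Inner product <x, a> over GF(2); x and a are bit tuples."""
    return sum(xi & ai for xi, ai in zip(x, a)) % 2

def hadamard_encode(x):
    """Encoding of x: one bit <x, a> for every a in {0,1}^K,
    stored as a dict mapping each position a to its bit."""
    K = len(x)
    return {a: inner(x, a) for a in product((0, 1), repeat=K)}

def local_decode_bit(r, i, K):
    """Two-query local decoder: pick a random a and return
    r(e_i + a) + r(a), which equals x_i whenever both queried
    positions of the received word r are uncorrupted."""
    a = tuple(random.randrange(2) for _ in range(K))
    ei_plus_a = tuple(b ^ (1 if j == i else 0) for j, b in enumerate(a))
    return (r[ei_plus_a] + r[a]) % 2
```

With an uncorrupted received word, each call returns the corresponding bit of $x$; once the error rate approaches $1/2$, the success probability of a single call degrades to the useless $2\epsilon$ bound discussed above.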
Recall that the encoding is obtained by taking the inner product of $x$ with all possible vectors $a \in \{0,1\}^K$. Consider $y_c = \sum_{j=1}^{t} c_j a_j$ for every $c \in \{0,1\}^t$, $c \neq 0$. For every $c$, $y_c$ is uniformly distributed since the $a_j$'s are picked uniformly at random. Moreover, different values of $c$ give pairwise independent random variables. Now, suppose $y_c + e_i$ (once the $a_j$'s are fixed) is a position where $r$ is not corrupted. Then $(x,y_c) + r(y_c + e_i) = x_i$. Since at most a $1/2 - \epsilon$ fraction is corrupted, $$\Pr_{a_1, a_2, \ldots, a_t \sim \{0,1\}^K}[(x,y_c) + r(y_c + e_i) = x_i] \geq \frac{1}{2} + \epsilon.$$ Now we would like to boost our confidence above $1/2 + \epsilon$ by making not $1$ but $2^t - 1$ queries (one for each nonzero value of $c$) and taking the majority vote. We will choose $t$ so that this can be done efficiently, and we will see that pairwise independence is enough to make the majority vote work. Let $I_c$ be the indicator random variable for the event $(x,y_c) + r(y_c + e_i) = x_i$, and let $I = \sum_c I_c$. The majority vote is incorrect only if at most half the votes are correct: $$\Pr[\text{MAJ is incorrect}] \leq \Pr\left[I \leq \frac{1}{2}(2^t - 1)\right].$$ Since at most a $(\frac{1}{2} - \epsilon)$ fraction is corrupted, the expected value of $I$ satisfies $$E[I] \geq \left(\frac{1}{2} + \epsilon\right)(2^t - 1).$$ Therefore $$\Pr[\text{MAJ is incorrect}] \leq \Pr[I \leq E[I] - \epsilon(2^t - 1)],$$ and by Chebyshev's inequality, \begin{align*} \Pr[\text{MAJ is incorrect}] & \leq \frac{\sigma^2(I)}{(\epsilon(2^t-1))^2} \\ & \leq \frac{(2^t-1)/4}{(\epsilon(2^t-1))^2} \\ & = \frac{1}{4\epsilon^2(2^t-1)}. \end{align*} Here we used the fact that the variance of an indicator random variable is at most $1/4$, and that since the indicator random variables are pairwise independent, their variances are additive.
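The majority-vote step can be sketched in Python as follows (our own illustration and naming, not the lecture's notation), assuming we are handed guesses $g_j$ for the inner products $(x, a_j)$; by linearity these determine $(x, y_c)$ for every $c$.

```python
def decode_bit(r, i, K, A, g):
    """Recover x_i by a majority vote over all nonzero c in {0,1}^t.
    A = [a_1, ..., a_t] are the random points (bit tuples) and
    g = [g_1, ..., g_t] are guesses for the inner products <x, a_j>,
    so <x, y_c> = sum_j c_j * g_j by linearity."""
    t = len(A)
    votes_for_one = 0
    for c in range(1, 2 ** t):
        cbits = [(c >> j) & 1 for j in range(t)]
        # y_c = sum_j c_j * a_j over GF(2)
        y_c = tuple(
            sum(cbits[j] * A[j][k] for j in range(t)) % 2 for k in range(K)
        )
        # guessed value of <x, y_c>, by linearity of the inner product
        inner_guess = sum(cbits[j] * g[j] for j in range(t)) % 2
        y_c_plus_ei = tuple(b ^ (1 if k == i else 0) for k, b in enumerate(y_c))
        # one vote: <x, y_c> + r(y_c + e_i), which equals x_i when r is
        # uncorrupted at position y_c + e_i and the guesses are correct
        votes_for_one += (inner_guess + r[y_c_plus_ei]) % 2
    return 1 if 2 * votes_for_one > 2 ** t - 1 else 0
```

Running this for every $i$ and every guess vector $g \in \{0,1\}^t$ yields a list of $2^t$ candidate information words; the analysis says that the candidate produced by the correct guesses equals $x$ with probability at least $2/3$.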
We obtain the following by a union bound: $$\Pr[\exists j: \text{we output } x_j \text{ incorrectly}] \leq \frac{K}{4\epsilon^2(2^t-1)}.$$ We need this probability to be at most $\frac{1}{3}$, and $t = O(\log \frac{K}{\epsilon})$ works. Thus far, we have shown how one can recover each bit of the information word provided we know the value of $(x,y_c)$ for all $c$. This seems an unreasonable assumption, since $x$ is what we want to find. This is where list decoding comes in. The algorithm needs a sequence of values $((x,y_c))_{c\neq 0^t}$. We run the algorithm on all possible values of this sequence and output the list of information words obtained. Then, with probability $\geq 2/3$, the run that uses the correct sequence recovers $x$, so $x$ is present in the list. However, a potential problem is that there are $2^{2^t -1}$ possible sequences. But since the inner product is linear, $$(x,y_c)=\left(x,\sum_{j=1}^t c_ja_j\right)=\sum_{j=1}^t c_j(x,a_j),$$ which means we only need to run through all possible sequences of values of $((x,a_j))_{j \in \{1, \ldots, t\}}$, of which there are just $2^t = \text{poly}(\frac{K}{\epsilon})$. \section*{Next Time} In the next lecture we will wrap up our discussion of worst-case to average-case reductions, introduce extractors, and look at some of their applications. \section*{Acknowledgements} In writing the notes for this lecture, I perused the notes by Tom Watson and Matt Elder for lectures 16 and 17 from the Spring 2007 offering of CS~810, and the notes by Brian Rice and Theodora Hinkle for lectures 19 and 20 from the Spring 2010 offering of CS~710. \begin{thebibliography}{} \bibitem{stv} Madhu Sudan, Luca Trevisan, and Salil Vadhan. \newblock {\em Pseudorandom Generators without the XOR Lemma}. \newblock J. of Computer and System Sciences, 62(2):236--266, 2001. \newblock Preliminary version: Proc. of 31st ACM STOC, 1999.
\newblock Accessed at http://www.cs.berkeley.edu/\%7Eluca/pubs/ \end{thebibliography} \end{document}