\documentclass[11pt]{article}
\include{lecture}
\usepackage{subfigure}
\begin{document}
\lecture{6}{9/16/10}{Quantum Search}{Mark Wellons}
In the previous class, we began to explore Grover's quantum search algorithm. Today, we will illustrate the algorithm and analyze its runtime complexity.
\section{Grover's Quantum Search Overview}
Grover's algorithm is an excellent example of the potential power of a quantum computer over a classical one, as it can search an unsorted array of elements in $\mathcal O(\sqrt N)$ operations with constant error. A classical computer requires $\Omega(N)$ operations, as it must traverse the entire array in the worst case.
Formally, Grover's algorithm solves the following problem: given some function $f: \{0, 1\}^n \rightarrow \{0, 1\}$, and possibly the value $t = \left \vert f^{-1}(1) \right \vert$, find any $x \in f^{-1}(1)$.
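For comparison, the classical baseline treats $f$ as a black box and can do no better than querying candidates one at a time. A minimal Python sketch (the marked element and the value of $n$ below are arbitrary example choices, not from the lecture):

```python
# Classical brute-force search with black-box access to f: in the worst
# case every one of the 2^n candidates must be queried.
def classical_search(f, n):
    queries = 0
    for x in range(2 ** n):
        queries += 1
        if f(x) == 1:
            return x, queries
    return None, queries

# Example: a single marked element at x = 6 with n = 3.
x, queries = classical_search(lambda v: 1 if v == 6 else 0, n=3)
print(x, queries)  # finds x = 6 after 7 queries
```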
We begin by placing the register in a uniform superposition over all possible inputs, so that the state vector looks like
\begin{equation}
\ket{ \psi} = \sum \alpha_x \ket{x}
\end{equation}
where
\begin{equation}
\alpha_x = \frac{1}{\sqrt{2^n}}.
\end{equation}
For brevity, we will define
\begin{equation}
N \equiv 2^n.
\end{equation}
If we were to plot the amplitude $ \alpha_x $ of each $ \ket{x}$, it would look as shown in figure \ref{6:fig:base amp}.
\begin{figure}[h!]
\begin{center}
\includegraphics[height=4.5 cm]{base_amp}
\end{center}
\caption{The initial state of the system. Every state is equally likely to be observed if a measurement is taken. We sometimes refer to this state as the \textit{uniform distribution}. \label{6:fig:base amp} }
\end{figure}
At this point, we introduce a new operator, $U_1$, which performs a phase kick only on states where $f(x) = 1$ and leaves states where $f(x) = 0$ unchanged. The resulting amplitudes are shown in figure \ref{6:fig:kick amp}.
\begin{figure}[h!]
\begin{center}
\includegraphics[height=5 cm]{kick_amp}
\end{center}
\caption{The state of the system after a phase kick on all states where $f(x) = 1$. In this example, there were three states affected, which were reflected across the $x$-axis. \label{6:fig:kick amp} }
\end{figure}
We also introduce another operator, $U_2$, which reflects each amplitude across the average value of the $\alpha_x$, as shown in figure \ref{6:fig:average amp}.
\begin{figure}[h!]
\begin{center}
\includegraphics[height=5 cm]{average_amp}
\end{center}
\caption{The state of the system after being reflected across the average, which is indicated by a dotted line. Note that the states where $f(x) = 1$ are now much more probable if a measurement is taken. \label{6:fig:average amp} }
\end{figure}
We now repeatedly apply $U_1$ and $U_2$ until the amplitudes of the desired states vastly exceed the amplitudes of the other states. We then take a measurement, and with high probability we will obtain some state $x$ such that $f(x) = 1$. Which particular $x$ we get is uniformly random among the valid states.
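Since the amplitudes stay real throughout, the whole procedure can be simulated classically with a vector of amplitudes. Below is a minimal sketch of one iteration of $U_2 \circ U_1$ in Python; the values of $n$ and the marked set are example choices, not from the lecture:

```python
import numpy as np

n = 3
N = 2 ** n
marked = {5}                        # states with f(x) = 1 (example choice)

amp = np.full(N, 1 / np.sqrt(N))    # uniform superposition

def grover_iteration(amp):
    amp = amp.copy()
    for x in marked:                # U_1: phase kick on the marked states
        amp[x] = -amp[x]
    return 2 * amp.mean() - amp     # U_2: reflect across the average

amp = grover_iteration(amp)
prob_marked = sum(amp[x] ** 2 for x in marked)
print(round(prob_marked, 5))        # 0.78125
```

For $N = 8$ and one marked state, a single iteration already raises the probability of observing the marked state from $1/8$ to $25/32$.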
%\section{Quantum Operators}
%\subsection{Periodic Property of $U_1$ and $U_2$ }
%Let us now analyze $U_1$ and $U_2$. We first define a new operator
%\begin{equation}
%U \equiv U_2 \circ U_1,
%\end{equation}
%which is simply application of $U_1$ followed by $U_2$. For now, let us assume $U$ is unitary. It follows that for any $l$ and $\epsilon$, there exist some $k > l$ such that
%\begin{equation}
%\left\| U^k - U^l \right\| \le \epsilon.
%\end{equation}
%That is, if we apply $U$ enough, we will return arbitrarily close to a previous state. We know this to be true, as these are finite dimensional matrices, thus the resulting space has finite volume. If we draw some $\epsilon$ ball around every matrix we realize, we must eventually fill the space or have overlap among the balls.
% To see this, consider applying $U$ to the state in figure \ref{6:fig:average amp}. Note that every time $U$ is applied, the average value decreases. Eventually, it will become negative, and then applying $U$ will \textit{decrease} the amplitudes of the desirable states. After enough operations, the state will become close to or equal to the initial state.
% More formally,
%\begin{equation}
%\left\| U^k - U^l \right\| = \left\| U^l \left(U^{k-l} - I \right) \right\| .
%\end{equation}
%However, $U^l$ is unitary, so it does not affect the value of the norm. Thus,
%\begin{equation}
% \left\| U^l \left(U^{k-l} - I \right) \right\| = \left\| U^{k-l} - I \right\| \le \epsilon.
%\end{equation}
%Thus after $k-l$ iterations of applying $U$, we are within $\epsilon$ of the initial state.
\section{Unitary Property of $U_1$ and $U_2$ }
We omit the proof that $U_1$ is unitary as it is simply a phase kick, which was shown to be unitary in a previous lecture.
To show $U_2$ is unitary, we first show it is linear. To understand how $U_2$ might be implemented, we note that reflecting around the average is equivalent to subtracting the average, reflecting across the $x$-axis, and then adding the average back. In formal notation, we can describe $U_2$ as
\begin{equation}
U_2 \ket{\psi} \equiv - \left( \ket{\psi} - \textit{AVG} \left(\alpha_x \right) \sum \ket{x} \right) + \textit{AVG}(\alpha_x) \sum \ket{x},
\end{equation}
where
\begin{equation}
\textit{AVG} (\alpha_x) \equiv \frac{1}{N} \sum \alpha_x.
\end{equation}
This is clearly linear in the $\alpha_x$, as $\textit{AVG} (\alpha_x) $ is simply a linear combination of the $\alpha_x$'s and all of the operators are linear.
We finish this proof by showing that all the eigenvalues of $U_2$ have magnitude 1, a condition required of unitary matrices. First consider what happens when we apply $U_2$ to the initial state shown in figure \ref{6:fig:base amp}. Nothing will happen, as the reflection across the average transforms this state to itself. Thus, the uniform distribution is an eigenvector and its eigenvalue is 1.
Now consider the case shown in figure \ref{6:fig:eigenvector}. On the left, we have a system where the average is zero, and after applying $U_2$, we have the system mirrored across the $x$-axis. Thus this state is another eigenvector, with eigenvalue $-1$. In fact, all the eigenvectors orthogonal to the uniform distribution are states that $U_2$ simply reflects across the $x$-axis. Therefore, all eigenvalues are either $1$ or $-1$, as we can consider $U_2$ to be a reflection across the $\sum_x \ket{x}$ axis.
\begin{figure}[h!]
\begin{center}
\includegraphics[height=5 cm]{eigenvector}
\end{center}
\caption{The state of the system before applying $U_2$ is on the left, and the system afterwards is on the right. As the average is zero, the system is merely reflected across the $x$-axis. \label{6:fig:eigenvector} }
\end{figure}
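As a numeric sanity check (not part of the lecture), the matrix form of $U_2$ follows from the definition above: reflection about the average is $U_2 = \frac{2}{N} J - I$, where $J$ is the all-ones matrix, and its spectrum is exactly $\{+1, -1\}$:

```python
import numpy as np

N = 8                                           # example dimension
U2 = (2 / N) * np.ones((N, N)) - np.eye(N)      # reflection about the average

# U_2 is real and orthogonal, hence unitary.
assert np.allclose(U2.T @ U2, np.eye(N))

# Eigenvalues: +1 once (the uniform distribution) and -1 with
# multiplicity N - 1 (everything orthogonal to it).
print(sorted(set(np.round(np.linalg.eigvalsh(U2), 6))))  # [-1.0, 1.0]
```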
\section{Quantum Circuit}
Now we would like to construct the quantum circuit that implements Grover's algorithm. We naturally start with the uniform superposition. Since $U_1$ is simply a phase kick, it can be implemented by adding an additional $\ket{-}$ qubit as described in previous lectures. To implement $U_2$, recall that $U_2$ is a reflection across an axis. If this axis were a basis axis, the reflection would be easy to realize. Unfortunately, it is instead the axis determined by the uniform superposition. However, we can change basis via Hadamard gates, which map the uniform superposition to the all-zeros basis state. Now we simply reflect across this basis state, and then change basis back, and we have implemented $U_2$. We can repeat $U$ as many times as desired. The full circuit is shown below.
\begin{equation*}
\Qcircuit @C=1em @R=2.0em
{
& &\mbox{$U_1$} & & & &\mbox{$U_2$}&&&&&\mbox{repeat $k$ times}\\
\lstick{\ket{+}} & \qw & \multigate{4}{U_f} & \qw & \gate{H}& \qw & \multigate{3}{\mbox{Reflection across $\ket{0^n}$ }}& \gate{H}& \qw &\qw &&\cdots&& \qw &\meter & \cw& \rstick{x_1} \\
\lstick{\ket{+}} & \qw & \ghost{U_f} & \qw &\gate{H} & \qw &\ghost{\mbox{Reflection across $\ket{0^n}$ }} & \gate{H}& \qw& \qw &&\cdots&&\qw &\meter & \cw& \rstick{x_2} \\
\lstick{\vdots}& \qw & \ghost{U_f} & \qw &\gate{H} & \qw &\ghost{\mbox{Reflection across $\ket{0^n}$ }} & \gate{H}& \qw&\qw & &\cdots&&\qw &\meter & \cw& \rstick{x_i} \\
\lstick{\ket{+}} & \qw & \ghost{U_f} & \qw &\gate{H} & \qw & \ghost{\mbox{Reflection across $\ket{0^n}$ }} & \gate{H}&\qw&\qw & &\cdots&&\qw &\meter & \cw& \rstick{x_n} \\
\lstick{\ket{-}} & \qw & \ghost{U_f} & \qw &\qw & \qw & \qw& \qw& \qw&\qw &&\cdots&&\qw &\meter & \cw& \rstick{y}
}
\end{equation*}
There are alternative ways to implement $U$, but this one is adequate for our purposes.
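The Hadamard sandwich above can be checked numerically. The sketch below (example $n$; not from the lecture) verifies that conjugating the reflection across $\ket{0^n}$, i.e. $2\ket{0^n}\bra{0^n} - I$, by a Hadamard on every qubit yields exactly the reflection about the average:

```python
import numpy as np

n = 3
N = 2 ** n
H1 = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # single-qubit Hadamard
H = H1
for _ in range(n - 1):
    H = np.kron(H, H1)                          # H tensored n times

e0 = np.eye(N)[0]
R0 = 2 * np.outer(e0, e0) - np.eye(N)           # reflection across |0^n>
U2 = (2 / N) * np.ones((N, N)) - np.eye(N)      # reflection about the average

assert np.allclose(H @ R0 @ H, U2)              # H R0 H = U_2
print("identity holds")
```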
\section{Algorithm Complexity}
\subsection{For a known $t$}
We now seek to determine the optimal value of $k$, where $k$ is the number of applications of $U$. Consider that the amplitude $\alpha_x$ of $\ket{x}$ at any point in time depends only on whether $f(x) = 0$ or $f(x) = 1$. Since $\alpha_x^{(i)}$ only depends on $f(x)$, we can describe the system state after $i$ iterations of $U$ as
\begin{equation}
\ket{\psi^{(i)}} = \beta_i \frac{1}{\sqrt{N -t}} \sum_{x: f(x) = 0} \ket{x} +\gamma_i \frac{1}{\sqrt{t}} \sum_{x: f(x) = 1} \ket{x},
\end{equation}
where $\beta_i$ and $\gamma_i$ are constants and are constrained by
\begin{equation}
\beta_i^2 + \gamma_i^2 = 1.
\end{equation}
It follows that
\begin{eqnarray*}
\beta_0 &= &\sqrt{\frac{N- t}{N}}, \\
\gamma_0 & =& \sqrt{\frac{ t}{N}}.
\end{eqnarray*}
We can thus describe the system as a two-dimensional system with parameters $\beta$ and $\gamma$, where $(\beta, \gamma)$ lie on the unit circle, as shown in figure \ref{6:fig:unit circle}. Here we plot $\beta$ on the $B$ axis and $\gamma$ on the $C$ axis.
\begin{figure}[h!]
\begin{center}
\includegraphics[height=7 cm]{unit_circle}
\end{center}
\caption{$\beta$ and $\gamma$ can be mapped to the unit circle, with $\beta$ on the $B$ axis and $\gamma$ on the $C$ axis. \label{6:fig:unit circle} }
\end{figure}
This unit circle allows us to define a new variable $\theta$, which is the angle between the $B$-axis and the point ($\beta$,$\gamma$) as measured from the origin. We can describe the initial value of $\theta$ as
\begin{equation}\label{6:equ:theta}
\sin (\theta_0) = \sqrt{\frac{t}{N}}.
\end{equation}
Given some point ($\beta$,$\gamma$) on this unit circle, what will the effect of the $U_1$ and $U_2$ operators be on this point? Since $U_1$ is a phase kick, it transforms ($\beta$,$\gamma$) by
\begin{equation}
(\beta,\gamma) \rightarrow (\beta,-\gamma)
\end{equation}
which is simply a reflection across the $B$-axis. $U_2$ reflects the point across the line defined by the origin and the point ($\beta_0$,$\gamma_0$). Taken together, these two reflections form a rotation: composing two reflections yields a rotation by twice the angle between their axes, which here is $2\theta_0$. That is, every application of $U$ rotates the point by $2\theta_0$ counterclockwise. It follows that after $i$ iterations,
\begin{eqnarray*}
\theta_i &= &(2i + 1)\theta_0, \\
\beta_i & = &\cos( \theta_i ), \\
\gamma_i & =& \sin ( \theta_i).
\end{eqnarray*}
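These formulas can be checked against a direct simulation of the amplitudes (a sketch with example $N$ and $t$; which particular states are marked does not matter):

```python
import numpy as np

N, t = 64, 3
theta0 = np.arcsin(np.sqrt(t / N))
marked = range(t)                    # mark the first t states (arbitrary)

amp = np.full(N, 1 / np.sqrt(N))
for i in range(5):
    for x in marked:
        amp[x] = -amp[x]             # U_1
    amp = 2 * amp.mean() - amp       # U_2
    gamma = np.sqrt(t) * amp[0]      # gamma_i from any marked amplitude
    assert np.isclose(gamma, np.sin((2 * (i + 1) + 1) * theta0))
print("gamma_i = sin((2i+1) theta_0) holds for i = 1..5")
```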
From looking at the unit circle, it should be clear that the best time to make a measurement is when ($\beta$,$\gamma$) is on or very close to the $C$-axis, as that is when the amplitudes of the valid states are highest. It follows that the ideal value of $k$ would satisfy
\begin{equation}
(2k+1)\theta_0 = \frac{\pi}{2}
\end{equation}
which leads to
\begin{equation}
k = \frac{1}{2} \left( \frac{\pi}{2 \theta_0} - 1 \right).
\end{equation}
This may not be an integer, so we simply round to the closest integer, denoted $\left \lceil \cdot \right\rfloor$. We now claim that if we choose $k$ such that
\begin{equation}\label{6:eqn:k}
k = \left \lceil \frac{1}{2} \left( \frac{\pi}{2 \theta_0} - 1 \right) \right\rfloor
\end{equation}
then
\begin{equation}\label{6:eqn:measurement prob}
\textrm{Prob}\left [ \textrm{observe } x \in f^{-1}(1) \right] \ge \frac{1}{2}.
\end{equation}
We know this because this choice of $k$ must bring us within the top quarter of the unit circle, as shown in figure \ref{6:fig:unit circle arc}.
\begin{figure}[h!tb]
\begin{center}
\includegraphics[height=7 cm]{unit_circle_arc}
\end{center}
\caption{ \label{6:fig:unit circle arc} Here is an example where we have applied $U$ three times, which brings us into the shaded part of the unit circle. Each application of $U$ rotates us by $2 \theta_0$, and there is no value of $\theta_0 < \pi/2$ that will allow us to completely jump over the shaded area when applying $U$. Measurements taken in the shaded region have probability $\ge 1/2$ of observing a valid state. }
\end{figure}
The advantage of being in the shaded area is that, in absolute value, the amplitudes of the valid states exceed the amplitudes of the invalid states, giving us a probability $\ge 1/2$ of observing a valid state when taking a measurement.
We can now show that
\begin{equation}\label{6:eqn:k_complexity}
k =\mathcal O \left (\sqrt \frac {N}{t} \right)
\end{equation}
for small values of $t$. Using the small angle approximation, we can rewrite equation (\ref{6:equ:theta}) as
\begin{equation}
\theta_0 \approx \sqrt{\frac{t}{N}}.
\end{equation}
Substituting this into equation (\ref{6:eqn:k}) gives equation (\ref{6:eqn:k_complexity}).
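Putting the pieces together for a known $t$: the sketch below (example $N$ and $t$, not from the lecture) computes $k$ from equation (\ref{6:eqn:k}) and confirms that the resulting success probability $\sin^2\left((2k+1)\theta_0\right)$ is at least $1/2$:

```python
import numpy as np

N, t = 1 << 10, 5                             # example values
theta0 = np.arcsin(np.sqrt(t / N))
k = round(0.5 * (np.pi / (2 * theta0) - 1))   # nearest integer

success_prob = np.sin((2 * k + 1) * theta0) ** 2
print(k, round(success_prob, 4))
assert success_prob >= 0.5
```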
\subsection{For an unknown $t$}
If $t$ is unknown, there are several things we can try. For now, let us assume that $t$ is positive.
\subsubsection{First Attempt}
We can try $k=1$, and then double $k$ with each step until the first success. This algorithm will have some iteration $i^*$ where $\textrm{Prob}[\textrm{success}] \ge 1/2$. This is clearly true, as if we double $k$ every step, then there is no way we can skip the top quarter of the unit circle.
How many times do we use $U$ in the algorithm? As each iteration doubles the number of times $U$ is applied, the total is the sum of a geometric series, which is at most twice its largest term.
So the number of applications of $U$ up to iteration $i^*$ is still $\mathcal O \left (\sqrt \frac {N}{t} \right)$.
However, this algorithm does not quite work, as $i^*$ is only guaranteed to have probability of success $\ge 1/2$. So it is very possible that we will reach $i^*$, fail the measurement, and then move past $i^*$. If we move past $i^*$, the amplitudes of the valid states begin \textit{decreasing}, thus lowering the probability of measuring a valid state. In other words, the problem with this algorithm is we do not know when to stop if we do not get a success.
\subsubsection{Second Attempt}
In our first attempt, the amplitudes of the valid states were improving until we reached $i^*$, at which point they declined. In our second attempt, we correct for that by trying to maintain our position in the desirable region. We do that by setting $l=1$ and doubling $l$ in each iteration, and each time, we pick $k$ uniformly at random from the set $\{1, 2, 3, \ldots, l \}$. This has the advantage that if we overstep $i^*$, there is still a probability of at least $1/2$ that we will pick a point in the good region. It follows that we expect to overstep $i^*$ by only a constant number of iterations.
In any case, the expected number of applications of $U$ is
\begin{equation}\label{6:eqn:U_complexity}
\left \langle \textrm{number of } U \right \rangle =\left \langle \textrm{number of $U$ up to }i^* \right \rangle + \left \langle \textrm{number of $U$ after }i^* \right \rangle
\end{equation}
We showed in the first attempt that
\begin{equation}\label{6:eqn:up to i* complexity}
\left \langle \textrm{number of $U$ up to }i^* \right \rangle = \mathcal O \left (\sqrt \frac {N}{t} \right),
\end{equation}
which leaves us to bound the right term in equation (\ref{6:eqn:U_complexity}). Since the number of applications of $U$ doubles every step, we can express this term as
\begin{equation}
\left \langle \textrm{number of $U$ after }i^* \right \rangle \le \sum_{i > i^*} 2^i \left(\frac{3}{4} \right)^{i-i^*}
\end{equation}
The $3/4$ arises from the fact that each iteration after $i^*$ succeeds with probability at least $1/4$: the chosen $k$ lands in the good region with probability at least $1/2$, and a measurement taken in the good region succeeds with probability at least $1/2$. Thus iteration $i$ is reached only with probability at most $(3/4)^{i-i^*}$.
However, this series diverges, as the ratio in our geometric series is greater than 1. This can easily be fixed by not doubling between each iteration. Instead, we choose some other growth factor $\lambda < 4/3$, and now the series converges as shown.
\begin{eqnarray*}
\left \langle \textrm{number of $U$ after }i^* \right \rangle &\le &\sum_{i > i^*} \lambda^i \left( \frac{3}{4} \right)^{i-i^*}, \\
& \le & \lambda^{i^*} \sum_{i > i^*} \lambda^{i-i^*} \left( \frac{3}{4} \right)^{i-i^*}, \\
& \le & \lambda^{i^*} \sum_{i > 0} \lambda^{i} \left( \frac{3}{4} \right)^{i} .
\end{eqnarray*}
The summation converges, as it is a simple geometric series with ratio $3\lambda/4 < 1$. We also know that $\lambda^{i^*} = \mathcal O \left (\sqrt \frac {N}{t} \right)$ by the argument behind equation (\ref{6:eqn:up to i* complexity}). As we now know both of the terms on the right side of equation (\ref{6:eqn:U_complexity}), it follows that Grover's algorithm runs in $\mathcal O \left (\sqrt \frac {N}{t} \right)$ expected applications of $U$.
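The whole unknown-$t$ strategy can be sketched as follows. The measurement is simulated classically using the known success probability $\sin^2\left((2k+1)\theta_0\right)$, and the growth factor, random seed, and problem sizes are example choices, not from the lecture:

```python
import math
import random

def grover_unknown_t(N, t, lam=1.25, rng=random.Random(0)):
    """Search with unknown t: expected O(sqrt(N/t)) applications of U."""
    theta0 = math.asin(math.sqrt(t / N))
    l, total_U = 1.0, 0
    while True:
        k = rng.randint(1, max(1, math.ceil(l)))   # k uniform in {1,...,l}
        total_U += k                               # k applications of U
        if rng.random() < math.sin((2 * k + 1) * theta0) ** 2:
            return total_U                         # measured a valid state
        l *= lam                                   # grow the cap by lambda

print(grover_unknown_t(1 << 12, 1))                # total U used (random)
```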
\end{document}