\documentclass[11pt]{article}
\include{lecture}
\usepackage{subfigure}
\begin{document}
\lecture{12}{9/30/2010}{Order Finding}{Kenneth Rudinger}
We continue our discussion of quantum algorithms based on eigenvalue estimation. We finish up the topic of solving sparse well-conditioned systems of linear equations over the reals. Additionally, we examine the topic of order finding and begin the development of an efficient order finding algorithm.
\section{``Solving" Linear Equations}
To briefly recap from last lecture, we wish to solve a system of linear equations $A\mathbf{x} = \mathbf{b}$, where $A$ is an $N \times N$ invertible hermitian matrix. We are given $A$ and $\ket{\mathbf{b}}$ (normalized); it is our goal to find $\ket{\mathbf{x}}$ (renormalized, as we don't require $A$ to be unitary).
Note that this is \emph{not} quite the same as solving for $\mathbf{x}$, because we are not actually solving for all components of $\mathbf{x}$; rather, we hope to obtain as output the normalized state $\ket{\mathbf{x}}$.
The quantum algorithm we will develop runs in time $\widetilde{O}((\log N)s^2\kappa^2\frac{1}{\epsilon})$. Here $s$ is the sparseness of the matrix (the maximum number of non-zero entries in any row of $A$), $\epsilon$ is the allowed error ($||\ket{\widetilde{\mathbf{x}}}-\ket{\mathbf{x}}|| < \epsilon$, where $\ket{\widetilde{\mathbf{x}}}$ is the actual output state), and $\kappa$ bounds the condition number in the sense that $||A^{-1}||\leq\kappa$. We also have the requirement that $\Omega(1)\leq||A||\leq1$.
Because $A$ is hermitian, we know we can decompose $\ket{\mathbf{b}}$ into a linear combination of the eigenvectors of $A$:
\begin{align}
\ket{\mathbf{b}} = \sum_j\beta_j\ket{\varphi_j}\label{12:eqn:1}
\end{align}
\noindent where $\ket{\varphi_j}$ are the eigenvectors of $A$, with eigenvalues $\lambda_j$. It is this state we wish to transform into $\ket{\mathbf{x}}$, as we know $\ket{\mathbf{x}}$ can be decomposed in the following manner (up
to normalization):
\begin{align}
\ket{\mathbf{x}}=\sum_j\beta_j\lambda_j^{-1}\ket{\varphi_j}\label{12:eqn:2}
\end{align}
\noindent The proof of \eqref{12:eqn:2} is straightforward: $A\ket{\mathbf{x}}=\ket{\mathbf{b}}=\sum_j\beta_j\ket{\varphi_j}$; applying $A^{-1}$ yields $A^{-1}A\ket{\mathbf{x}}=\ket{\mathbf{x}}=\sum_j\beta_jA^{-1}\ket{\varphi_j}=\sum_j\beta_j\lambda_j^{-1}\ket{\varphi_j}$.
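As a sanity check, the spectral formula for $\ket{\mathbf{x}}$ can be verified numerically. The following sketch (the matrix $A$ and vector $\mathbf{b}$ are arbitrary illustrative data, not from the lecture) compares the eigenbasis reconstruction against a direct linear solve:

```python
import numpy as np

# Illustrative data: a random real symmetric (hence Hermitian) matrix A
# and a normalized vector b.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
A = (A + A.T) / 2                      # symmetrize

b = rng.standard_normal(3)
b /= np.linalg.norm(b)                 # |b> is normalized

lam, phi = np.linalg.eigh(A)           # A = sum_j lam[j] |phi_j><phi_j|
beta = phi.T @ b                       # beta_j = <phi_j | b>

x_spectral = phi @ (beta / lam)        # sum_j beta_j lambda_j^{-1} |phi_j>
x_direct = np.linalg.solve(A, b)       # solve A x = b directly
assert np.allclose(x_spectral, x_direct)
```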
Step 1 in creating $\ket{\mathbf{x}}$ is performing the transformation $\ket{\mathbf{b}}\ket{\mathbf{0}}\to\sum_j\beta_j\ket{\varphi_j}\ket{\widetilde{\lambda_j}}$ using eigenvalue estimation.
Step 2 will be transforming this new state into $\ket{\mathbf{x}}$.
\subsection{Step 1}
To perform the first step of this algorithm, we need a unitary operator $U$ with the same eigenvectors as $A$, and with eigenvalues in close relation to the eigenvalues of $A$. If we choose $U=e^{iA}$, these conditions are satisfied, as $U\ket{\varphi_j}=e^{i\lambda_j}\ket{\varphi_j}$. (The process of matrix exponentiation is discussed in Lecture 11.)
Because $||A||\leq1$, we know that each eigenvalue $\lambda_j$ satisfies
$|\lambda_j| \leq 1$.
Therefore, we know there is a one-to-one correspondence between $\lambda_j$
and $e^{i \lambda_j} = e^{2\pi i \omega_j}$ where $|\omega_j| \leq \frac{1}{2}$.
(In fact, this even holds if we relax the condition $||A|| \leq 1$ to
$||A|| < \pi$.)
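This correspondence is easy to check numerically: for $|\lambda|<\pi$, $\lambda$ is exactly the principal phase of $e^{i\lambda}$. A small sketch (the sample values are illustrative):

```python
import numpy as np

# For |lam| < pi, lam is the unique principal phase of e^{i lam}, so
# lam <-> omega = lam / (2 pi) with |omega| <= 1/2 is one-to-one.
for lam in [-0.99, -0.25, 0.0, 0.3, 0.99]:
    omega = np.angle(np.exp(1j * lam)) / (2 * np.pi)  # np.angle in (-pi, pi]
    assert abs(omega) <= 0.5
    assert np.isclose(2 * np.pi * omega, lam)          # lam recovered exactly
```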
Given an eigenvector $\ket{\varphi_j}$ of $U$, we can obtain $\ket{\widetilde{\omega_j}}$ through eigenvalue estimation on $U$. We are then able to obtain the state $\ket{\widetilde{\lambda_j}}$, because of the one-to-one correspondence between $\lambda_j$ and $\omega_j$.
If $A$ is sparse, we can efficiently compute $U$ with fixed error in a time of approximately $\widetilde{O}(\log(N)s^2)$. (This computation will be discussed at greater length in the upcoming lecture on random walks.)
Once we have our hands on $U$, we can run eigenvalue estimation on $\sum_j\beta_j\ket{\varphi_j}\ket{\mathbf{0}}$ to get $\sum_j\beta_j\ket{\varphi_j}\ket{\widetilde{\lambda_j}}$ (which, we recall, we will not actually observe). Each $\ket{\widetilde{\lambda_j}}$ will be concentrated near $\ket{\lambda_j}$, but they will not be exactly equal. In order to realize our overall error bound of $\epsilon$, we must ensure that the relative error for each $\ket{\widetilde{\lambda_j}}$ is at most $\epsilon$. If we apply $U$ $k$ times in the eigenvalue estimation step, then the absolute error is at most $k^{-1}$ with high probability.
To ensure that the relative error for all $j$ is at most $\epsilon$ with
high probability, we want $k^{-1}/|\lambda_j| \leq \epsilon$, i.e.,
$k \geq 1/(\epsilon|\lambda_j|)$ for all $j$. Because of our condition on $\kappa$, we see that this inequality is satisfied if $k\geq\frac{\kappa}{\epsilon}$. (Recall that $||A^{-1}||\leq\kappa$ means that $|\lambda_j^{-1}|\leq\kappa$.)
This is the first step in our algorithm. The complexity cost is the cost of eigenvalue estimation, so we see that we have a complexity of $O(\frac{\kappa}{\epsilon}\log(N)s^2)$: the cost of running $U$ once is $O(\log(N)s^2)$, and we do it $\frac{\kappa}{\epsilon}$ times. As we are now assured that each $\ket{\widetilde{\lambda_j}}$ is sufficiently concentrated near its respective $\ket{\lambda_j}$ for our total error to be less than $\epsilon$, we will simplify the rest of our analysis by assuming $\ket{\widetilde{\lambda_j}}=\ket{\lambda_j}$.
\subsection{Step 2}
Next we discuss the second step of our algorithm. We now have the state $\sum_j\beta_j\ket{\varphi_j}\ket{\lambda_j}$. We cannot simply extract $\ket{\mathbf{x}}$ by multiplying each term in the sum by $\lambda_j^{-1}$, as this would not be a unitary transformation. However, if we attach a single $\ket{0}$ qubit to our state, we can perform a pseudo-non-unitary action on the $\ket{\lambda_j}$ part of the state, absorbing the ``non-unitary" part of the transformation into the added $\ket{0}$ state, preserving unitarity. Such a transformation can be achieved with elementary operations, and will have the following action
for some $C$ to be determined:
\begin{align*}
\ket{\lambda_j}\ket{0}\to\ket{\lambda_j}\left(\sqrt{1-\left(\frac{C}{\lambda_j}\right)^2}\,\ket{0}+\frac{C}{\lambda_j}\ket{1}\right)
\end{align*}
For this operation to be unitary, we need that $\frac{C}{|\lambda_j|}\leq1$ for all $j$, so we choose the value of $C$ to be $C=\frac{1}{\kappa}$.
We now make a \emph{partial observation}, only observing the last qubit, measuring either $\ket{0}$ or $\ket{1}$, and thus collapsing the rest of the (now renormalized) state into the subspace consistent with our observation. If we observe $\ket{1}$, the new state of the system is (modulo normalization) $\sum_j\beta_j\lambda_j^{-1}\ket{\varphi_j}\ket{\lambda_j}$, which is the state $\ket{\mathbf{x}}$ we had wanted to extract from the beginning, with the addition of the $\ket{\lambda_j}$ qubits. To reset these $\ket{\lambda_j}$ states to their initialized state of $\ket{0}$ (and thus independent of $j$), we simply run eigenvalue estimation in reverse. As $\ket{1}$ is the ``good" state, we want to be guaranteed to have a sufficiently high probability of observing it. We see that the probability of this event is
\begin{align*}
\Pr[\text{observing } \ket{1}]=\sum_j\left|\frac{\beta_jC}{\lambda_j}\right|^2\geq \min_j\left|\frac{C}{\lambda_j}\right|^2\geq C^2=\frac{1}{\kappa^2}
\end{align*}
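The rotation-and-postselect step, together with the probability bound above, can be simulated classically on the amplitudes. In this sketch (arbitrary illustrative $A$ and $\mathbf{b}$, with $A$ rescaled so that $||A||\leq1$), postselecting the ancilla on $\ket{1}$ yields the normalized solution state, with probability at least $1/\kappa^2$:

```python
import numpy as np

# Illustrative data: random real symmetric A rescaled so ||A|| <= 1.
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
A = (A + A.T) / 2
A /= np.linalg.norm(A, 2)              # spectral norm ||A|| = 1

b = rng.standard_normal(4)
b /= np.linalg.norm(b)

lam, phi = np.linalg.eigh(A)
beta = phi.T @ b

kappa = 1 / np.min(np.abs(lam))        # ||A^{-1}|| = kappa for this A
C = 1 / kappa

amp1 = beta * C / lam                  # amplitudes on the ancilla-|1> branch
p1 = np.sum(amp1**2)                   # Pr[observing |1>]
assert p1 >= 1 / kappa**2 - 1e-12      # the bound Pr >= C^2 = 1/kappa^2

# Postselect on |1> and renormalize: this is |x> up to a global sign.
x_state = phi @ (amp1 / np.linalg.norm(amp1))
x_exact = np.linalg.solve(A, b)
x_exact /= np.linalg.norm(x_exact)
assert np.isclose(abs(x_state @ x_exact), 1.0)
```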
Therefore, if we run this algorithm $O(\kappa^2)$ times, we will have a good probability of observing $\ket{1}$. This would make our overall runtime $O(\kappa^3\log(N)s^2\frac{1}{\epsilon})$. However, we can use amplitude amplification (as is done in Grover's algorithm) to decrease the runtime, as it is desirable to increase the probability of observing a ``good" state (and decrease the probability of observing a ``bad" state). We recall that if, after one run, the probability of observing a good state is $p$, amplitude amplification reduces the number of iterations needed for a good probability of success from $O(\frac{1}{p})$ to $O(\frac{1}{\sqrt{p}})$. Therefore, as the probability of success (observing $\ket{1}$) after one run is on the order of $\frac{1}{\kappa^2}$, we can save a factor of $\kappa$ in the runtime, yielding a final runtime of $O(\kappa^2\log(N)s^2\frac{1}{\epsilon})$.
\section{Order Finding}
We now turn our attention to the problem of order finding. We are given integers $M$ and $a$ such that $M>0$ and $0\leq a<M$; we wish to find the \emph{order} of $a$ modulo $M$, i.e., the smallest integer $r>0$ such that $a^r\equiv1\pmod M$.
\begin{exercise}
Prove that $r$ exists if and only if $\gcd(a,M)=1$ (i.e., $M$ and $a$ are relatively prime).
\end{exercise}
\noindent From now on, we shall assume that $\gcd(a,M)=1$, and will not concern ourselves with the need to check this condition, which can be done in $O(\polylog(M))$ time.
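For intuition (and for checking the exercise on small cases), the order can be computed classically by brute force; this sketch takes time exponential in the bit length of $M$, in contrast to the quantum algorithm developed below:

```python
from math import gcd

def order(a, M):
    """Smallest r > 0 with a^r = 1 (mod M); exists iff gcd(a, M) = 1."""
    if gcd(a, M) != 1:
        raise ValueError("order undefined when gcd(a, M) != 1")
    r, x = 1, a % M
    while x != 1:
        x = (x * a) % M
        r += 1
    return r

assert order(2, 15) == 4    # powers of 2 mod 15: 2, 4, 8, 1
assert order(4, 15) == 2    # powers of 4 mod 15: 4, 1
```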
Classically, the best known algorithms solve this problem in time exponential in the bit length of $M$. However, we will demonstrate the existence of a quantum algorithm that solves this problem in time polynomial in the bit length of $M$. Our algorithm will be an application of eigenvalue estimation. However, this problem of order finding can also be solved as a special case of period finding, as was done by Shor (which will be discussed in a future lecture, when we discuss Shor's algorithm for factoring).
In order to find $r$ using eigenvalue estimation, we need a unitary operator with its eigenvalues connected to $a$ in some manner. The following is such an operator:
\begin{align*}
U_a\ket{x} &= \ket{ax \bmod M} && \text{if } 0\leq x < M,\\
U_a\ket{x} &= \ket{x} && \text{otherwise.}
\end{align*}
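Since $x\mapsto ax \bmod M$ permutes $\{0,\dots,M-1\}$ when $\gcd(a,M)=1$, the operator $U_a$ is a permutation matrix, hence unitary. A quick numerical check for the illustrative values $a=2$, $M=15$ on $n=4$ qubits:

```python
import numpy as np

def U_matrix(a, M, n):
    """Permutation matrix of U_a on n qubits (dimension 2^n >= M)."""
    N = 2 ** n
    U = np.zeros((N, N))
    for x in range(N):
        y = (a * x) % M if x < M else x    # identity on x >= M
        U[y, x] = 1.0
    return U

U = U_matrix(2, 15, 4)
assert np.allclose(U @ U.T, np.eye(16))    # unitary (real orthogonal)
```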
For future reference, we note that we can efficiently compute high powers
of $U$. If we simply applied $U$ as a black box, this would take time linear
in the exponent, and recall that we need an exponent of $N=2^n$ if we want
$n$ bits of accuracy in the eigenvalue estimation procedure for $U$.
However, we can compute the powers of $U_a$ more efficiently because
$U_a^N\ket{x}=\ket{a^Nx \bmod M}$. Since $a^N \bmod M$ is modular
exponentiation, we can compute it with a number of steps polynomial in
$\log(M+N)$. Therefore, computing high powers of $U_a$ is relatively easy.
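The modular exponentiation in question is the standard square-and-multiply routine, using $O(\log N)$ multiplications mod $M$. A minimal sketch (Python's built-in three-argument \texttt{pow} computes the same thing):

```python
def modexp(a, e, M):
    """Compute a^e mod M by repeated squaring: O(log e) multiplications."""
    result, base = 1, a % M
    while e > 0:
        if e & 1:                       # current bit of the exponent is set
            result = (result * base) % M
        base = (base * base) % M        # square for the next bit
        e >>= 1
    return result

# agrees with Python's built-in modular exponentiation
assert modexp(7, 2**20, 15) == pow(7, 2**20, 15)
```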
Now, let us note that $U_a^r=I$, because we know that $a^r \bmod M = 1$. Therefore, for any eigenvalue $\lambda$ of $U_a$, $\lambda^r=1$, and thus the eigenvalues are of the form $\lambda_j=e^{\frac{2\pi ij}{r}}$ for $0\leq j<r$.
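Numerically, for the illustrative case $a=2$, $M=15$ (where the order is $r=4$), every eigenvalue of $U_a$ is indeed an $r$th root of unity:

```python
import numpy as np

a, M, r = 2, 15, 4                      # the order of 2 mod 15 is 4
N = 16                                  # 4-qubit register, 2^4 >= M
U = np.zeros((N, N))
for x in range(N):
    y = (a * x) % M if x < M else x     # U_a: |x> -> |ax mod M>, identity above M
    U[y, x] = 1.0

eigvals = np.linalg.eigvals(U)
assert np.allclose(eigvals ** r, 1.0)   # every eigenvalue satisfies lambda^r = 1
```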