\documentclass[11pt]{article}
\include{lecture}
\begin{document}
\lecture{18}{10/19/2010}{Simulating Hamiltonian Dynamics}{Tyson Williams}
Last lecture, we finished our discussion of discrete quantum walks. Today we discuss continuous-time quantum walks and how to simulate them using quantum gates. Their applications include solving sparse linear systems of equations and formula evaluation.
\section{Continuous-time Walks}
\subsection{Classical}
A classical continuous-time random walk on a graph $G$ is specified by a vector $P(t)$ where
\begin{align}
P_v(t) = \Pr[\text{walk is at vertex } v \text{ at time } t].
\end{align}
Each vertex sends probability to its neighbors proportional to its own probability. This process is described by
\begin{align}
\frac{dP(t)}{dt} = (A - D) P(t), \label{18:equ:ccrw}
\end{align}
where $A$ is the adjacency matrix of $G$ (not normalized) and $D$ is the diagonal matrix with $D_{ii} = \deg(v_i)$. The matrix $L = A - D$ is known as the Laplacian of $G$.
\begin{exercise} \label{18:exe:ccrw}
Prove that (\ref{18:equ:ccrw}) describes a valid probabilistic process. That is, show that for all $t$, every component of $P(t)$ is nonnegative and the components sum to 1.
\end{exercise}
There is, in fact, a closed-form solution to (\ref{18:equ:ccrw}), namely
\begin{align} \label{18:equ:exe_sol}
P(t) = e^{L t} P(0).
\end{align}
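As a quick numerical sanity check on (\ref{18:equ:exe_sol}), the following sketch evolves a walk on the 4-cycle (a toy graph of our choosing, not from the lecture) by exponentiating the Laplacian through its spectral decomposition:

```python
import numpy as np

# Continuous-time random walk on the 4-cycle via P(t) = e^{Lt} P(0).
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
D = np.diag(A.sum(axis=1))   # degree matrix
L = A - D                    # Laplacian, with the sign convention above

def walk(P0, t):
    # e^{Lt} from the eigendecomposition of the symmetric matrix L.
    w, V = np.linalg.eigh(L)
    return V @ np.diag(np.exp(w * t)) @ V.T @ P0

P0 = np.array([1.0, 0.0, 0.0, 0.0])  # start at the first vertex
P = walk(P0, 2.0)
# P stays a probability distribution (nonnegative, summing to 1),
# which is exactly what the exercise asks you to prove.
```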
\subsection{Quantum}
The transition from classical to quantum is easier in the continuous-time setting than in the discrete-time setting. The quantum analogue of (\ref{18:equ:ccrw}) is
\begin{align}
i \frac{d\ket{\psi(t)}}{dt} = (A - D) \ket{\psi(t)}. \label{18:equ:qcrw1}
\end{align}
While (\ref{18:equ:ccrw}) preserves probability, (\ref{18:equ:qcrw1}) preserves the 2-norm. In bra-ket notation,
\begin{align}
\bra{\psi(t)} \ket{\psi(t)} = \ket{\psi(t)}^\dag \ket{\psi(t)} = 1,
\end{align}
which is just the inner product of $\ket{\psi(t)}$ with itself. Equation (\ref{18:equ:qcrw1}) then becomes
\begin{align}
i \frac{d\ket{\psi(t)}}{dt} = L \ket{\psi(t)}. \label{18:equ:qcrw2}
\end{align}
\begin{proof}[Proof that equation (\ref{18:equ:qcrw2}) is a valid quantum process]
Since
\begin{align*}
\frac{d}{dt} \bra{\psi(t)} \ket{\psi(t)}
&= \left(\frac{d}{dt} \bra{\psi(t)}\right) \ket{\psi(t)} + \bra{\psi(t)}\left(\frac{d}{dt} \ket{\psi(t)}\right)\\
&= i \bra{\psi(t)} L^\dag \ket{\psi(t)} - i \bra{\psi(t)} L \ket{\psi(t)}\\
&= i \bra{\psi(t)} (L^\dag - L) \ket{\psi(t)}\\
&= i \bra{\psi(t)} (L - L) \ket{\psi(t)} \tag{Since $L$ is Hermitian}\\
&= 0,
\end{align*}
the 2-norm of the state is constant over time. Thus, if $\bra{\psi(0)} \ket{\psi(0)} = 1$, then $\bra{\psi(t)} \ket{\psi(t)} = 1$ for all $t$.
\end{proof}
Physicists will recognize (\ref{18:equ:qcrw2}) as Schr\"{o}dinger's equation, which holds even when $L$ is replaced with any Hermitian matrix $H$, possibly varying over time. For constant $H$, the closed-form solution in the quantum case is similar to the classical one, namely
\begin{align}
\ket{\psi(t)} = e^{-i H t} \ket{\psi(0)}.
\end{align}
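The norm preservation established above is easy to observe numerically. The sketch below (the $2 \times 2$ Hamiltonian is an arbitrary illustrative choice) builds $e^{-i H t}$ from the eigendecomposition of $H$ and applies it to a state:

```python
import numpy as np

# Evolve a state by U = e^{-iHt} and check that the 2-norm is preserved.
H = np.array([[1.0, 2.0],
              [2.0, -1.0]])   # any Hermitian matrix works here

def evolve(psi0, t):
    w, V = np.linalg.eigh(H)  # H = V diag(w) V^dag
    U = V @ np.diag(np.exp(-1j * w * t)) @ V.conj().T
    return U @ psi0

psi0 = np.array([1.0, 0.0], dtype=complex)
psi = evolve(psi0, 1.7)
# <psi|psi> remains 1 for every t, exactly as the proof above shows.
```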
When discussing how to solve well-conditioned systems of linear equations, $e^{-i H t}$ was the operator $U$ that we used. There we used the fact that if $H$ is efficiently sparse, then we can approximate $U$ efficiently. We now show how to do that.
\section{Simulating Sparse Hamiltonians}
To simulate a sparse Hamiltonian $H$ efficiently, we need a handle on $H$. It is not enough for $H$ to be sparse; it needs to be sparse in an ``efficient'' way: when looking at a row, we need to be able to efficiently locate and approximately compute the nonzero entries. We say that $H$ is sparse when it has at most $s$ nonzero entries per row/column, where $s = \poly\log(N)$. We can efficiently approximate $U = e^{-i H t}$ when $H$ is efficiently sparse.
Our algorithm will have slightly worse parameters than the one we used while discussing well-conditioned systems of linear equations.
\subsection{$H$ is Diagonal} \label{18:sec:diag}
If $H$ is diagonal, then $e^{-i H t}$ is just a combination of rotations.
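A minimal sketch of the diagonal case (the entries of $H$ are illustrative): the exponential is computed entrywise as a phase rotation on each basis state, with no general matrix exponential needed.

```python
import numpy as np

# If H = diag(h_1, ..., h_N), then e^{-iHt} = diag(e^{-i h_1 t}, ..., e^{-i h_N t}).
h = np.array([0.5, -1.0, 2.0])     # illustrative diagonal entries of H
t = 0.3
U = np.diag(np.exp(-1j * h * t))   # each basis state just acquires a phase
```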
\subsection{$H$ is Efficiently Diagonalizable}
Being a Hermitian matrix, $H$ has an orthonormal basis of eigenvectors. This implies that there exists a unitary matrix $V$ such that $H V = V D$, where the columns of $V$ are the eigenvectors of $H$ and $D$ is a diagonal matrix. Suppose $V$ is efficiently computable. Then
\begin{align}
e^{-i H t} = V e^{-i D t} V^{-1}
\end{align}
and we have reduced the problem to the case with a diagonal matrix.
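A sketch of this reduction (the Hermitian matrix below is an arbitrary example): diagonalize once, exponentiate the diagonal, conjugate back, and compare against a truncated Taylor series of $e^{-i H t}$ as an independent check.

```python
import numpy as np

H = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])   # illustrative Hermitian matrix
t = 0.8

d, V = np.linalg.eigh(H)          # columns of V are eigenvectors: H V = V diag(d)
U = V @ np.diag(np.exp(-1j * d * t)) @ V.conj().T   # V^{-1} = V^dag, V unitary

# Independent check: partial sums of the series sum_k (-iHt)^k / k!.
S = np.zeros((3, 3), dtype=complex)
term = np.eye(3, dtype=complex)
for k in range(30):
    S += term
    term = term @ (-1j * t * H) / (k + 1)
```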
\subsection{$H$ is a Matching} \label{18:sec:matching}
This is actually a special case of $H$ being efficiently diagonalizable. We single it out and discuss it further because we will use it later.
If the graph underlying $H$ is a matching, then $H$ has at most one nonzero entry in each row/column. We can simultaneously permute the rows and columns to get a matrix of the form
\begin{align*}
\left[
\begin{array}{cc|cc|cc}
& * & & & & \\
* & & & & & \\\hline
& & & * & & \\
& & * & & & \\\hline
& & & & & * \\
& & & & * &
\end{array}
\right].
\end{align*}
Since a $2 \times 2$ matrix is always efficiently diagonalizable when its entries are efficiently computable, we are done.
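The closed form for a single block can be made concrete. In the sketch below (entries and dimensions chosen for illustration), each off-diagonal block equals $a$ times the Pauli-$X$ matrix, and $e^{-i a t X} = \cos(at) I - i \sin(at) X$:

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)  # Pauli-X matrix

def exp_block(a, t):
    # Closed form for e^{-i a t X}: a rotation between the two paired vertices.
    return np.cos(a * t) * np.eye(2) - 1j * np.sin(a * t) * X

# A 4x4 matching Hamiltonian pairing vertices (1,2) and (3,4); entries chosen freely.
t = 0.4
U = np.zeros((4, 4), dtype=complex)
U[0:2, 0:2] = exp_block(1.5, t)
U[2:4, 2:4] = exp_block(-0.7, t)
# U is unitary and equals e^{-iHt} for the block-diagonal matching H.
```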
\subsection{Closure Under Addition} \label{18:sec:cua}
If we can efficiently compute $U_1 = e^{-i H_1 t}$ and $U_2 = e^{-i H_2 t}$, then we can efficiently compute $U = e^{-i (H_1 + H_2) t}$. This is easy when $H_1$ and $H_2$ commute because then
\begin{align}
U = e^{-i (H_1 + H_2)t} = e^{-i H_1 t} e^{-i H_2 t} = U_1 U_2.
\end{align}
When $H_1$ and $H_2$ do not commute, we take advantage of the fact that we only need to approximate $e^{-i (H_1 + H_2) t}$. The Taylor series expansion for $e^{-i H t}$ is
\begin{align}
e^{-i H t} = I - i H t + O\left(||H||^2 t^2\right). \label{18:equ:taylor1}
\end{align}
Since
\begin{align}
e^{-i H t} = \left(e^{-i H t/n}\right)^n,
\end{align}
we are also interested in the Taylor series expansion for $e^{-i H t/n}$, which is
\begin{align}
e^{-i H t/n} = I - i H \frac{t}{n} + O\left(||H||^2 \frac{t^2}{n^2}\right).
\end{align}
Then
\begin{align*}
e^{-i H_1 t/n} e^{-i H_2 t/n}
&= I - i (H_1 + H_2) \frac{t}{n} + O\left(\left(||H_1||^2 + ||H_2||^2\right) \frac{t^2}{n^2}\right)\\
&= e^{-i (H_1 + H_2) t/n} + O\left(\left(||H_1||^2 + ||H_2||^2\right) \frac{t^2}{n^2}\right),
\end{align*}
so
\begin{align*}
e^{-i (H_1 + H_2) t}
&= \left(e^{-i (H_1 + H_2) t/n}\right)^n\\
&= e^{-i H_1 t/n} e^{-i H_2 t/n} + O\left(\left(||H_1||^2 + ||H_2||^2\right) \frac{t^2}{n}\right).
\end{align*}
Closure under addition generalizes to
\begin{align}
e^{-i \sum_{j=1}^k H_j t} = \left(\prod_{j=1}^k e^{-i H_j t/n}\right)^n + O\left(\left(\sum_{j=1}^k ||H_j||^2\right) \frac{t^2 k}{n}\right). \label{18:equ:gen_cua}
\end{align}
In order to make the error term in (\ref{18:equ:gen_cua}) no more than $\epsilon$, it suffices for $n$ to be at least
\begin{align*}
\poly\left(\max_{j} ||H_j||, k, t\right) \frac{1}{\epsilon}.
\end{align*}
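The Trotterization above is easy to observe numerically. In this sketch, $H_1$ and $H_2$ are the non-commuting Pauli matrices $X$ and $Z$, chosen purely for illustration:

```python
import numpy as np

# (e^{-iH1 t/n} e^{-iH2 t/n})^n approaches e^{-i(H1+H2)t} as n grows,
# even though H1 and H2 do not commute.
H1 = np.array([[0.0, 1.0], [1.0, 0.0]])   # Pauli X
H2 = np.array([[1.0, 0.0], [0.0, -1.0]])  # Pauli Z (XZ != ZX)
t = 1.0

def expm_herm(H, t):
    # Exact e^{-iHt} for a Hermitian H, via its eigendecomposition.
    w, V = np.linalg.eigh(H)
    return V @ np.diag(np.exp(-1j * w * t)) @ V.conj().T

exact = expm_herm(H1 + H2, t)

def trotter(n):
    step = expm_herm(H1, t / n) @ expm_herm(H2, t / n)
    return np.linalg.matrix_power(step, n)

err = [np.linalg.norm(trotter(n) - exact) for n in (1, 10, 100)]
# err shrinks roughly like 1/n, matching the O(t^2 / n) bound above.
```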
\subsection{$H$ is Sparse}
When $H$ is sparse, the idea is to efficiently write $H$ as a sum of efficiently computable matchings and apply cases \ref{18:sec:matching} and \ref{18:sec:cua}.
\paragraph{First Attempt}
Let $H_j$ be the matrix that agrees with $H$ on its $j$th nonzero entry and is zero everywhere else. This will not work because the $H_j$'s will not always be Hermitian.
\paragraph{Second Attempt}
Decompose $H$ into matchings. By Vizing's Theorem, every graph $G$ can be edge colored with at most $\Delta(G) + 1$ colors, where $\Delta(G)$ is the maximum degree of the graph. Notice that the set of edges of each color forms a matching. Unfortunately, we would need to compute such an edge coloring efficiently, and no known constructive proof of Vizing's Theorem yields an efficient algorithm in our setting. If we instead allow $O(s^2 \log^2 N)$ colors, then efficient constructions are known.
For a graph $G = (V, E)$, label the vertices with 1, \ldots, $N = |V|$. If $G$ has any self loops, we can take care of them by adding a diagonal matrix, which is efficiently computable by cases \ref{18:sec:diag} and \ref{18:sec:cua}. Thus we can assume that $G$ has no self loops. To the edge $(v, w) \in E$, assign the color
\begin{align*}
c(v, w) =
\left\{
\begin{array}{ll}
(\text{index of } v \text{ as a neighbor of } w, & \\
\ \text{index of } w \text{ as a neighbor of } v, & \\
\ m(v,w),\ w \bmod m(v,w)) & v < w\\
c(w,v) & v > w,
\end{array}
\right.
\end{align*}
where
\begin{align*}
m(v,w) = \min \{\mu \in \mathbb{Z}^+ \ |\ v \not\equiv w \pmod{\mu}\},
\end{align*}
which exists and is $O(\log N)$ since $0 < v < w \le N$. To see the bound, suppose for contradiction that $m(v,w) = \omega(\log N)$. Then $v \equiv w \pmod{\mu}$ for every $\mu$ up to $c \log N$ for some constant $c$; in particular, $v$ and $w$ are congruent modulo every prime in that range. A nontrivial fact from number theory is that the product of the primes less than a number $n$ is at least $2^n$ (for $n$ sufficiently large). By the Chinese remainder theorem, $v \equiv w$ modulo the product of these primes, which exceeds $N$; since $0 < v, w \le N$, this forces $v = w$, a contradiction.
This coloring is consistent (since $c(v,w) = c(w,v)$) and is efficient to compute. It remains to show that it is a valid coloring.
\begin{proof}[Proof that this coloring is valid]
It suffices to show that $c(v,w) = c(v,w') \implies w = w'$.
\paragraph{Case 1: $v < w$ and $v < w'$}
The second component of the color implies that $w = w'$.
\paragraph{Case 2: $v > w$ and $v > w'$}
The first component of the color implies that $w = w'$ (and it is also symmetric to case 1).
\paragraph{Case 3: $v < w$ and $v > w'$}
The third component of the color gives $\mu = m(v,w) = m(w',v)$, and the fourth component gives $w \equiv v \pmod{\mu}$, which contradicts the definition of $\mu = m(v,w)$. Hence this case cannot occur.
\paragraph{Case 4: $v > w$ and $v < w'$}
This case is symmetric to case 3.
\end{proof}
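The coloring and the function $m(v,w)$ can be sketched directly (the convention of indexing neighbors by position in a sorted adjacency list is our assumption; any consistent local indexing works):

```python
def m(v, w):
    # Smallest positive mu with v != w (mod mu); exists whenever v != w.
    mu = 1
    while v % mu == w % mu:
        mu += 1
    return mu

def color(v, w, adj):
    if v > w:
        return color(w, v, adj)
    # Here v < w; indices are positions in each vertex's sorted neighbor list.
    return (adj[w].index(v), adj[v].index(w), m(v, w), w % m(v, w))

# Small test graph: the 4-cycle 1-2-3-4-1, as an adjacency list.
adj = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [1, 3]}
edges = [(1, 2), (2, 3), (3, 4), (1, 4)]
colors = {e: color(*e, adj) for e in edges}
# Validity: any two distinct edges sharing a vertex get different colors.
```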
Based on these techniques, our runtime for efficiently (and approximately) computing $e^{-i H t}$ when $H$ is sparse is
\begin{align*}
\poly\left(s, \log N, \max_j ||H_j||, t, \frac{1}{\epsilon}\right)
\end{align*}
for an accuracy of $\epsilon$. We refrain from specifying the exact polynomial since it is worse than the one we mentioned while discussing well-conditioned systems of linear equations.
\section{Application: Formula Evaluation}
Consider the formula for determining which player in a two-player game (where each player always has two possible moves unless the game is over) has a winning strategy. Say that the formula evaluates to 1 if the first player to move has a winning strategy and to 0 if the second player to move has a winning strategy. This type of formula is known as a game tree and has alternating levels of OR and AND gates with leaf nodes that indicate which player wins. The question is this: how many of the $N$ leaves do we need to query to determine who wins?
Deterministically, we need to query $\Theta(N)$ leaves. Using randomness to recursively decide which branch to evaluate first, the expected number of leaves we need to query drops to $\Theta(N^{d})$, where $d \approx 0.753$. This improvement comes from the fact that while evaluating an OR (respectively, AND) gate, we might find a branch that evaluates to 1 (respectively, 0) before evaluating the other branch (as long as such a branch exists), letting us skip the rest.
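The randomized short-circuit evaluation described above can be sketched as follows (the tuple encoding of the tree and the leaf counter are our own illustrative choices):

```python
import random

# Leaves are 0/1 integers; internal nodes are tuples of children,
# with levels alternating between OR and AND gates.
def evaluate(node, is_or, counter):
    if isinstance(node, int):            # leaf: one query
        counter[0] += 1
        return node
    children = list(node)
    random.shuffle(children)             # randomize evaluation order
    for child in children:
        val = evaluate(child, not is_or, counter)
        if val == (1 if is_or else 0):   # short-circuit the gate
            return val
    return 0 if is_or else 1

# Depth-2 example: OR(AND(1, 1), AND(0, 1)) = 1.
counter = [0]
result = evaluate(((1, 1), (0, 1)), True, counter)
# counter[0] counts queried leaves; short-circuiting often skips some.
```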
The best known quantum algorithm uses $O(\sqrt{N} \log N)$ queries. The $\sqrt{N}$ term looks like Grover search, and the $\log N$ term would intuitively come from amplifying Grover's success probability. However, Grover search cannot achieve this result. It is actually an application of discrete quantum walks that were inspired by continuous-time quantum walks.
\section{Next Time}
In our next lecture, we will discuss adiabatic quantum computation. This is an alternate model of quantum computation that is similar to continuous-time quantum walks. We will show that this model is universal and can be simulated by the quantum circuit model with a polynomial amount of overhead in time.
\end{document}