Categories: Algorithms, Cryptography, Quantum information.
Shor’s algorithm
Shor’s algorithm was the first truly useful quantum algorithm. It can solve important problems, most notably integer factorization, much more efficiently than any classical algorithm. It weakens widely-used cryptographic schemes, such as RSA and Diffie-Hellman.
In essence, Shor’s algorithm’s revolutionary achievement is that it can efficiently find the periods of a function on a discrete finite field, where:
This is a so-called hidden subgroup problem for a finite Abelian group. With minimal modifications, Shor’s algorithm can solve practically every such problem.
Integer factorization
Originally, Shor’s algorithm was designed to factorize an integer . For reasons explained later, this means our goal is to find the period of the modular exponentiation function :
For a given and . The period is the smallest integer satisfying . To do this, the following -qubit quantum circuit is used, with chosen so that :
Here, refers to the -qubit quantum Fourier transform, and the oracle calculates for predetermined values of and . It is an XOR oracle, working as follows:
Execution starts by applying the Hadamard gate to the first qubits, yielding:
Where , and is the computational basis state . Moving on to :
Then we measure , causing it collapse as follows for an unknown arbitrary value of :
Due to entanglement, the unmeasured (top ) qubits change state into a superposition:
Clearly, there is a periodic structure here, but we cannot measure it directly, because we do not know the value of , which, to make matters worse, changes every time we run the algorithm. This is where the QFT comes in, which outputs the following state:
Where is a th root of unity. Measuring this state yields a , with a probability :
The last step holds because . Surprisingly, this implies that we did not need to perform the measurement of earlier! This makes sense: the period does not depend on , so why would we need an implicit to determine ?
So, what does the above probability work out to? There are two cases:
Where the latter case was evaluated as a geometric series. The condition is equivalent to asking if is a multiple of , i.e. if , for an integer .
Recall that is the number of times that fits in , so . Assuming is an integer, then and , which tells us that . This implies that if is a multiple of (i.e. ), then , so , which is exactly what we got earlier!
In other words, the condition is equivalent to being an integer. In that case, we have that , which we substitute into from earlier:
And because is a multiple of , and fits times in , there must be exactly values of that satisfy . Therefore the probability of all other -values is zero! This becomes clearer when you look at the sum used to calculate : if , then it sums over , leading to perfect destructive interference for the “bad” -values, leaving only the “good” ones.
So, to summarize: if is an integer, then measuring only yields -values that are multiples of . Running Shor’s algorithm several times then gives several -values separated by . That tells us what is, and we already know , so we finally find the period .
That begs the question: what if is not an integer? We cannot check this, since is unknown! Instead, we rewrite the probability as follows:
This function peaks if is close to a multiple of , i.e. , which we rearrange:
We know the left-hand side, and, from the definition of , clearly . We chose , so is quite small, and consequently is too, since .
In other words, is a “simple” fraction, so our goal is to find a “simple” fraction that is close to the “complicated” fraction . For example, if , then probably .
This can be done rigorously using the continued fractions algorithm: write as a continued fraction, until the non-integer part of the denominator becomes small enough. This part is then neglected, and we calculate whatever is left, to get an estimate of .
Of course, is a probability distribution, so even though the odds are in our favour, we might occasionally measure a misleading -value. Running Shor’s algorithm several times “fixes” this.
So, to summarize: if is not an integer, the measured -values are generally close to for an integer . By approximating using the continued fraction algorithm, we estimate . Repeating this procedure gives several values of , such that is easy to deduce by taking the least common multiple of the denominators.
In any case, once we think we have , we can easily verify that . Whether is the smallest such integer depends on how lucky we are, but fortunately, for most applications of this algorithm, that does not actually matter, and usually we find the smallest anyway.
You typically need to repeat the algorithm times, and the QFT is . The bottleneck is modular exponentiation , which is and therefore worse than the QFT, yielding a total complexity of .
OK, but what does have to do with factorizing integers? Well, recall that is given by:
is the number to factorize, and is a random integer coprime to , meaning . The fact that is the period of for a certain -value, implies that:
Suppose that is even. In that case, we can rewrite the above equation as follows:
In other words, is a multiple of . We then use that :
Because is even by assumption, the two factors on the left are integers, and as just mentioned, their product is a multiple of . Then we only need to calculate:
And there we have the factors of ! The can be calculated efficiently in time.
But what if is odd? No problem, then we just choose a new coprime to , and keep repeating Shor’s algorithm until we do find an even . We do the same if is itself a multiple of .
References
- J.S. Neergaard-Nielsen, Quantum information: lectures notes, 2021, unpublished.
- S. Aaronson, Introduction to quantum information science: lecture notes, 2018, unpublished.