A new quantum computing benchmark has revealed the strengths and weaknesses of several quantum processing units (QPUs).

The benchmarking tests, led by a team at the Jülich Research Centre in Germany, compared 19 different QPUs from five suppliers – IBM, Quantinuum, IonQ, Rigetti and IQM – to determine which chips were more stable and reliable for high-performance computing (HPC).

These quantum systems were tested both at different “widths” (the total number of qubits) and at different “depths” for two-qubit gates. Two-qubit gates are operations that act on a pair of qubits at once and can entangle them, and depth measures the number of sequential gate layers in a circuit – in other words, its complexity and execution time.
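
To make the two axes concrete, here is a minimal sketch using Qiskit (assuming version 1.x; the toy circuit below is illustrative and not taken from the study):

```python
# Illustrative only: a toy 4-qubit circuit showing "width" vs. "depth".
# Assumes Qiskit 1.x is installed; this is not a circuit from the study.
from qiskit import QuantumCircuit

qc = QuantumCircuit(4)              # width: 4 qubits
for _ in range(3):                  # three rounds of entangling gates
    qc.cx(0, 1)
    qc.cx(2, 3)
    qc.cx(1, 2)

print("width (qubits):", qc.num_qubits)
print("total depth:", qc.depth())
# Depth counted over two-qubit gates only, the axis the benchmark stresses:
print("two-qubit depth:", qc.depth(lambda inst: inst.operation.num_qubits == 2))
```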

IBM’s QPUs showed the greatest strength in terms of depth, while Quantinuum performed best in the width category (where larger numbers of qubits were tested). The QPUs from IBM also showed significant improvement in performance across iterations, particularly between the earlier Eagle and more recent Heron chip generations.

These results, outlined in a study uploaded Feb. 10 to the preprint arXiv database, suggest that the performance improvements can be attributed not only to better and more efficient hardware, but also to improvements in firmware and to the integration of fractional gates — custom gates available on Heron that can reduce the complexity of a circuit.

However, the latest version of the Heron chip, dubbed IBM Marrakesh, did not demonstrate the expected performance improvements, despite having half the error per layered gate (EPLG) of the computing giant’s previous QPU, IBM Fez.

Beyond classical computing

Smaller companies have made relatively big gains, too. Notably, one Quantinuum chip passed the benchmark at a width of 56 qubits. This is significant because it shows a quantum computing system surpassing existing classical computers in specific contexts.

“In the case of Quantinuum H2-1, the experiments of 50 and 56 qubits are already above the capabilities of exact simulation in HPC systems and the results are still meaningful,” the researchers wrote in their preprint study.

Specifically, the Quantinuum H2-1 chip produced meaningful results at 56 qubits, running three layers of the Linear Ramp Quantum Approximate Optimization Algorithm (LR-QAOA) — a benchmarking algorithm — involving 4,620 two-qubit gates (three layers across each of the 1,540 edges of a fully connected 56-node graph).
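
The “linear ramp” in LR-QAOA refers to schedule angles that change linearly from layer to layer instead of being variationally optimised per layer. A sketch of that idea in plain Python follows; the ramp endpoints (delta_gamma, delta_beta) are placeholders, and the exact discretisation used in the study may differ:

```python
# Sketch of a linear-ramp QAOA schedule: the cost angle gamma ramps up
# while the mixer angle beta ramps down, with no per-layer optimisation.
# delta_gamma and delta_beta are placeholder endpoints, not the study's values.
def lr_qaoa_schedule(p, delta_gamma=0.6, delta_beta=0.6):
    """Return (gammas, betas) for p layers under a linear ramp."""
    gammas = [delta_gamma * (k + 1) / p for k in range(p)]      # ramps up
    betas = [delta_beta * (1 - (k + 1) / p) for k in range(p)]  # ramps down
    return gammas, betas

print(lr_qaoa_schedule(3))

# Gate-count arithmetic behind the 4,620 figure: one two-qubit interaction
# per graph edge per layer, on a fully connected 56-node graph.
n, p = 56, 3
edges = n * (n - 1) // 2   # 1,540 edges
print(edges * p)           # 4,620 two-qubit gates
```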

“To the best of our knowledge, this is the largest implementation of QAOA to solve an FC [fully connected] combinatorial optimization problem on real quantum hardware that is certified to give a better result over random guessing,” the scientists said in the study.

IBM’s Fez handled problems at the greatest depth of the systems tested. In a test that included a 100-qubit problem using up to 10,000 layers of LR-QAOA (nearly a million two-qubit gates), Fez retained some coherent information until nearly the 300-layer mark. The lowest-performing QPU in testing was the Ankaa-2 from Rigetti.

The team developed the benchmark to measure a QPU’s potential to perform practical applications. With that in mind, they sought to devise a test with a clear, consistent set of rules. This test had to be easy to run, platform agnostic (so it could work across the widest possible range of quantum systems) and able to provide meaningful performance metrics.

Their benchmark is built around a test called the MaxCut problem. It presents a graph with several vertices (nodes) and edges (connections), then asks the system to divide the nodes into two sets so that the number of edges running between the two sets is as large as possible.

This is useful as a benchmark because it is computationally very difficult, and the difficulty can be scaled up by increasing the size of the graph, the scientists said in the paper.
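
For small graphs, MaxCut can be solved by brute force, which also shows why it scales so badly: the number of possible bipartitions doubles with every added node. A self-contained sketch on a hypothetical toy graph:

```python
# Brute-force MaxCut on a toy graph: try every bipartition of the nodes
# and count the edges crossing between the two sets. With n nodes there
# are 2**n assignments, which is why the problem scales so badly.
from itertools import product

edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]  # a toy 4-node graph
n = 4

best_cut, best_assignment = -1, None
for assignment in product([0, 1], repeat=n):      # each node goes to set 0 or 1
    cut = sum(1 for u, v in edges if assignment[u] != assignment[v])
    if cut > best_cut:
        best_cut, best_assignment = cut, assignment

print(best_cut, best_assignment)  # best crossing-edge count and the split
```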

A system was considered to have failed the test when the results reached a fully mixed state — when they were indistinguishable from those of a random sampler.
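
One way such a pass/fail check can look in practice is to compare the average cut value of a QPU’s samples against the average from uniformly random bitstrings, flagging failure when the two are indistinguishable. The sketch below is a hedged illustration of that idea only; the sample counts, the margin and the function names are placeholders, not the study’s protocol:

```python
# Illustrative pass/fail check: does the QPU's output beat random guessing?
# Sample counts and the margin are placeholders, not the study's values.
import random

def cut_value(bits, edges):
    return sum(1 for u, v in edges if bits[u] != bits[v])

def beats_random(qpu_samples, edges, n, trials=10_000, margin=0.0):
    """True if the mean cut of qpu_samples exceeds the random-sampler mean."""
    random_mean = sum(
        cut_value([random.randint(0, 1) for _ in range(n)], edges)
        for _ in range(trials)
    ) / trials
    qpu_mean = sum(cut_value(s, edges) for s in qpu_samples) / len(qpu_samples)
    return qpu_mean > random_mean + margin

# For MaxCut, a uniformly random assignment cuts each edge with probability
# 1/2, so the random baseline is len(edges) / 2 in expectation.
```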

Because the benchmark relies on a testing protocol that’s relatively simple and scalable, and can produce meaningful results with a small sample set, it’s reasonably inexpensive to run, the computer scientists added.

The new benchmark is not without its flaws. Performance depends, for instance, on fixed schedule parameters: because these are set beforehand rather than adjusted dynamically during the computation, they cannot be optimised for a given device or problem. The scientists suggested that alongside their own test, “different candidate benchmarks to capture essential aspects of performance should be proposed, and the best of them with the most explicit set of rules and utility will remain.”
