QuEST v4 has completely overhauled the API, software architecture, algorithms, implementations and testing. This page details the new features, divided into those relevant to users, developers who integrate QuEST into larger software stacks, and contributors who develop QuEST or otherwise peep at the source code!

TOC:

For users

For developers

For contributors

Acknowledgements

For users

auto-deployer
Functions like createQureg() and createFullStateDiagMatr() will automatically decide whether to make use of the compiled and available hardware facilities, like multithreading, GPU-acceleration and distribution. The user no longer needs to consider which deployments are optimal for their simulation sizes, nor which devices have sufficient memory to fit their Qureg!
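To make the idea concrete, here is a minimal sketch of how such a decision could be made from the state size and the available memory. It is not QuEST's actual logic: the types Hardware and Deployment, the function chooseDeployment(), and the 12-qubit threshold are all hypothetical and chosen only for illustration.

```cpp
#include <cstdint>

// Hypothetical summary of what was compiled in and what the machine offers.
struct Hardware {
    bool gpuCompiled;
    bool mpiCompiled;
    bool ompCompiled;
    std::uint64_t gpuMemBytes;  // memory of a single GPU
    std::uint64_t cpuMemBytes;  // RAM of a single node
    int numNodes;
};

struct Deployment {
    bool useGpu;
    bool useDistribution;
    bool useMultithreading;
};

// Bytes needed by a statevector of double-precision complex amplitudes.
constexpr std::uint64_t statevectorBytes(int numQubits) {
    return (std::uint64_t{1} << numQubits) * 16;
}

// Assumed threshold below which parallelisation overheads outweigh the gains.
constexpr int kMinParallelQubits = 12;

Deployment chooseDeployment(int numQubits, const Hardware& hw) {
    const std::uint64_t total = statevectorBytes(numQubits);
    const bool worthParallelising = numQubits >= kMinParallelQubits;

    Deployment d{false, false, false};

    // Prefer a single GPU when the whole state fits in its memory.
    d.useGpu = hw.gpuCompiled && worthParallelising && total <= hw.gpuMemBytes;

    // Distribute only when a single node cannot hold the state.
    d.useDistribution = !d.useGpu && hw.mpiCompiled && hw.numNodes > 1
                        && total > hw.cpuMemBytes;

    // Multithread CPU work whenever the state is large enough to benefit.
    d.useMultithreading = !d.useGpu && hw.ompCompiled && worthParallelising;
    return d;
}
```

Under these assumed rules, a 4-qubit state runs serially, a 20-qubit state lands on a single 16 GiB GPU, and a 33-qubit state (128 GiB) is distributed across multithreaded nodes.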
new deployments
QuEST can now make use of multiple GPUs, distributed across different machines on a network, or tightly-coupled by a high-bandwidth interconnect (like NVLink); or both! This holds regardless of whether you are using cuQuantum, and whether your GPUs are NVIDIA or AMD. Further, deployments are heterogeneous; simultaneous Quregs may use different facilities at runtime!
much faster
Practically all backend algorithms have been replaced with novel, optimised, bespoke routines - primarily documented in arXiv 2311.01512. Further, special care has been paid to enabling compile-time optimisations, giving structures (like matrices) persistent GPU memory, and lazily evaluating properties like matrix unitarity only once!
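The lazy evaluation of unitarity can be pictured as a cached property. The sketch below is illustrative only and is not taken from QuEST's source: the Matrix struct, its unitarityCache field and the isUnitary() helper are hypothetical names, and the cache is assumed to be reset by whatever code modifies the elements.

```cpp
#include <complex>
#include <cstddef>
#include <optional>
#include <vector>

using cplx = std::complex<double>;

struct Matrix {
    std::size_t dim;
    std::vector<cplx> elems;                     // row-major, dim*dim entries
    mutable std::optional<bool> unitarityCache;  // empty until first queried
    mutable int numFullChecks = 0;               // counts the expensive evaluations
};

// Expensive O(dim^3) check that M * M^dagger equals the identity, within eps.
bool computeIsUnitary(const Matrix& m, double eps) {
    for (std::size_t r = 0; r < m.dim; ++r)
        for (std::size_t c = 0; c < m.dim; ++c) {
            cplx sum(0.0, 0.0);
            for (std::size_t k = 0; k < m.dim; ++k)
                sum += m.elems[r * m.dim + k] * std::conj(m.elems[c * m.dim + k]);
            const cplx expected((r == c) ? 1.0 : 0.0, 0.0);
            if (std::abs(sum - expected) > eps)
                return false;
        }
    return true;
}

// Cheap after the first call: the result is computed once and then reused.
bool isUnitary(const Matrix& m, double eps = 1e-12) {
    if (!m.unitarityCache.has_value()) {
        ++m.numFullChecks;
        m.unitarityCache = computeIsUnitary(m, eps);
    }
    return *m.unitarityCache;
}
```

Applying the same matrix many times in a circuit then pays for the cubic check only once; any code that overwrites elems would need to call unitarityCache.reset().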
cleaner interface
The API has been polished; functions are more consistently named, and accept an arithmetic-overloaded complex scalar type (called qcomp) and more natural data structures. Matrices, Pauli tensors and their weighted sums are now easier to initialise and populate, accepting even matrix and string literals! Some functions now have type overloads - even the C ones! - and additional C++ overloads accepting native containers like std::vector.
reporters
The API also includes utilities for prettily printing all QuEST data structures (like states, operators and scalars) and reporting on the environment and Qureg hardware accelerations being used, the memory available, and the maximum possible simulation sizes. Input validation has also been massively broadened and error messages made precise and dynamic. Usability is through the roof!
new functions
The set of supported quantum operations has greatly expanded. All unitaries can be effected with any number of control qubits (in any state), diagonal matrices can be raised to powers, density matrices can undergo partial tracing and inhomogeneous Pauli channels (in addition to general Kraus maps and superoperators), and multi-qubit projectors can now be applied, with and without renormalisation.
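As an illustration of what "control qubits in any state" means, the following sketch applies a one-qubit matrix to a statevector only where every control qubit holds its required bit value. It is a naive serial reference, not QuEST's optimised routine, and the function name applyStateControlledGate() and its signature are hypothetical.

```cpp
#include <array>
#include <complex>
#include <cstddef>
#include <vector>

using cplx = std::complex<double>;

// Applies the 2x2 matrix m (row-major) to qubit `target`, but only on the
// amplitudes where each qubit ctrls[k] is in the bit state ctrlStates[k].
void applyStateControlledGate(
    std::vector<cplx>& amps,
    const std::vector<int>& ctrls, const std::vector<int>& ctrlStates,
    int target, const std::array<cplx, 4>& m)
{
    const std::size_t targetMask = std::size_t{1} << target;

    for (std::size_t i = 0; i < amps.size(); ++i) {
        // Visit each amplitude pair once, via the index whose target bit is 0.
        if (i & targetMask)
            continue;

        // Skip pairs where any control qubit differs from its required state.
        bool active = true;
        for (std::size_t k = 0; k < ctrls.size(); ++k)
            if (static_cast<int>((i >> ctrls[k]) & 1) != ctrlStates[k])
                active = false;
        if (!active)
            continue;

        const std::size_t j = i | targetMask;
        const cplx a0 = amps[i], a1 = amps[j];
        amps[i] = m[0] * a0 + m[1] * a1;
        amps[j] = m[2] * a0 + m[3] * a1;
    }
}
```

With ctrlStates set to all ones this reduces to an ordinary controlled gate; a zero entry gives an "anti-control", which would otherwise require wrapping the control qubit in a pair of X gates.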
more control
Extensive new debugging facilities allow disabling or changing the validation precision and error response at runtime, and controlling how many amplitudes and significant figures of Quregs and matrices are printed.
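The effect of limiting the number of printed amplitudes and significant figures can be sketched with standard stream formatting. The function formatAmps() below and its output layout are assumptions made for illustration; QuEST's own reporters format their output differently.

```cpp
#include <algorithm>
#include <cmath>
#include <complex>
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

using cplx = std::complex<double>;

// Formats at most maxItems amplitudes, each to sigFigs significant figures,
// and notes how many amplitudes were left out.
std::string formatAmps(const std::vector<cplx>& amps, std::size_t maxItems, int sigFigs) {
    std::ostringstream out;
    out << std::setprecision(sigFigs);

    const std::size_t n = std::min(maxItems, amps.size());
    for (std::size_t i = 0; i < n; ++i)
        out << amps[i].real()
            << (amps[i].imag() < 0 ? "-" : "+")
            << std::abs(amps[i].imag()) << "i\n";

    if (n < amps.size())
        out << "(" << (amps.size() - n) << " more)\n";
    return out.str();
}
```

Truncating output this way keeps reports of large states readable, since a 30-qubit statevector has over a billion amplitudes.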
better documentation
The documentation has been rewritten from the ground up, and the API doc grouped into sub-categories and aesthetically overhauled with Doxygen Awesome. It is now more consistently structured, mathematically explicit, and a treat for the eyes!

For developers

new build
The CMake build has been overhauled and modernised, with wider platform support and facilities to ease QuEST's integration into larger stacks. The build is more modular, limiting specialised compilers (like nvcc and mpicc) to compiling only their essential files. This minimises friction and widens QuEST's compiler support.
easier integration
QuEST's backend now uses the standard C++ complex primitive to represent quantum amplitudes and matrix elements, made precision agnostic via the new qcomp type. Further, dense matrices now have both 1D row-major and 2D (aliasing the 1D) memory pointers. This permits Qureg and matrix data to be seamlessly accessed by third-party libraries, such as for linear algebra, without the need for adapters or expensive copying.
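The dual 1D and 2D views of a dense matrix can be sketched as follows. This is an illustration of the aliasing idea rather than QuEST's actual data structure: the DenseMatrix struct and its field names are hypothetical, and qcomp is fixed here to double precision as a stand-in for QuEST's precision-agnostic type.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

// Stand-in for QuEST's qcomp; assumed double precision for this sketch.
using qcomp = std::complex<double>;

struct DenseMatrix {
    std::size_t dim;
    std::vector<qcomp> flat;   // 1D row-major storage of dim*dim elements
    std::vector<qcomp*> rows;  // rows[r] aliases flat[r*dim]; no data is copied

    explicit DenseMatrix(std::size_t d) : dim(d), flat(d * d), rows(d) {
        for (std::size_t r = 0; r < dim; ++r)
            rows[r] = flat.data() + r * dim;
    }

    // A default copy would leave rows pointing into the other object's storage.
    DenseMatrix(const DenseMatrix&) = delete;
    DenseMatrix& operator=(const DenseMatrix&) = delete;
};
```

Code that prefers matrix notation can write m.rows[r][c], while a linear algebra routine expecting contiguous row-major data can be handed m.flat.data() directly; both views refer to the same memory.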

For contributors

modular architecture
QuEST's new software architecture is highly modular, separating the responsibilities of interfacing, validating user input, core pre-processing, localising distributed data, choosing which accelerator to use (CPUs or GPUs), and modifying local data using an accelerator. The core pre-processing is further modularised into modules responsible for autodeploying, inlining, performing maths and bitwise routines, probing available memory, checking internal preconditions, parsing user text, printing output, managing data structures and generating random numbers. See architecture.md for more information.
C++ backend
While QuEST's frontend remains C and C++ agnostic, the backend has become consistently C++17, affording development luxuries like overloading, templating, type inference, namespaces, smart pointers, constant expressions, type-traiting, structured bindings, range-based looping and use of standard containers like vector. We have, however, endeavoured to keep the use of C++ facilities simple so that the code remains readable and editable by C programmers.
internal preconditions
QuEST's defensive design has been massively improved by the extensive use of precondition checks, which cheaply validate that internal functions receive correct inputs wherever there is room for insidious bugs or future changes. This greatly aids the development process and helps spot bugs earlier, and makes assumptions more explicit and ergo the code easier to read and understand.
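A precondition check of this kind might look like the sketch below. The helper assertThat() and the routine getPairIndex() are hypothetical, and throwing an exception is an assumption made so the example is easy to test; QuEST's own internal error response may differ.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical internal check: reports which function's assumption was broken.
inline void assertThat(bool condition, const char* msg, const char* func) {
    if (!condition)
        throw std::logic_error(std::string(func) + ": " + msg);
}

// Internal routine: returns the index of the amplitude paired with `index`
// under a one-qubit gate on `target`. Callers are expected to have validated
// user input already, so a failure here indicates an internal bug.
std::size_t getPairIndex(std::size_t index, int target, int numQubits) {
    assertThat(target >= 0 && target < numQubits,
               "target qubit is out of range", __func__);
    assertThat(index < (std::size_t{1} << numQubits),
               "amplitude index exceeds the state size", __func__);

    return index ^ (std::size_t{1} << target);
}
```

Each check costs a comparison or two, yet it documents the function's assumptions in place and turns a silent out-of-bounds access into an immediate, named failure.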

Acknowledgements

QuEST v4 development was led by Tyson Jones, with notable contributions from Oliver Thomson Brown, Richard Meister, Erich Essmann, Ali Rezaei and Simon C. Benjamin. Development was financially supported by the UK National Quantum Computing Centre (NQCC200921), the UKRI SEEQA project, the University of Oxford, and the University of Edinburgh’s Chancellor’s Fellowship scheme. Developer time was contributed by AMD, the QTechTheory group at the University of Oxford, the EPCC of the University of Edinburgh, and Quantum Motion Technologies. Many helpful discussions were had with, and troubleshooting support given by, NVIDIA's cuQuantum team.

In addition, Tyson sincerely thanks Zoë Holmes of EPFL's QIC lab for her endless patience while juggling his development and postdoctoral duties! So too he thanks Simon Benjamin for his limitless support, and Oliver Brown for help accessing the tested supercomputers - in addition to his fantastic code contributions! Tyson further apologizes to Richard Meister, Sinan Shi, and Chris Whittle for collective hours (if not days) of rubber duck debugging.