# Quantra

Quantra is a QuantLib-based pricing service built for parallel execution. It exposes pricing functionality over gRPC with FlatBuffers and through an HTTP/JSON gateway for easier integration and generated OpenAPI documentation.

## Why This Exists

QuantLib is powerful, but it is not naturally suited to high-concurrency service workloads because important state such as `Settings::instance().evaluationDate()` is global to the process.

Quantra works around that by running multiple isolated pricing workers and placing Envoy in front of them as a load balancer.

## What You Get

- A C++ pricing server built on QuantLib
- A gRPC API using FlatBuffers messages
- A JSON/HTTP gateway in `jsonserver/`
- A C++ client in `client/`
- A Python client package in `quantra-python/`

## Supported Pricing Coverage

Representative supported request types include:

- Fixed-rate bonds
- Floating-rate bonds
- Vanilla swaps
- OIS swaps
- Basis swaps
- Zero-coupon inflation swaps
- Year-on-year inflation swaps
- FRAs
- Caps and floors
- Swaptions
- CDS
- Equity options

See `examples/data/` for sample payloads.

## Architecture

The main runtime model is a multi-process gRPC service fronted by Envoy:

```text
JSON client -> json_server (:8080) -> Envoy (:50051) -> sync_server workers (:50055+)
gRPC client ------------------------> Envoy (:50051) -> sync_server workers (:50055+)
```

## Performance

Measured on an AMD Ryzen 9 3900X (12 cores / 24 threads), 62 GiB RAM, Debian 13, Linux 6.1. Both benchmarks are informational (not part of the test gate) and live in `tests/bench/`.

### Parallel throughput

Pricing the same request across N worker processes behind Envoy, versus pricing it single-threaded with QuantLib. Workload: one EUR multicurve swap (2 curves, 24 bootstrap helpers). Generated by `tests/bench/run_throughput.sh`.

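As a reading aid for the throughput figures in this section: speedup is the throughput at N workers divided by the throughput at one worker, and parallel efficiency is that speedup divided by N. The sketch below recomputes both from the reported throughput values. It is only an illustration; the names `THROUGHPUT` and `scaling_summary` are hypothetical and not part of the repository or of `tests/bench/`.

```python
# Throughput (req/s) per worker count, copied from the benchmark table in this section.
THROUGHPUT = {1: 8.5, 2: 15.7, 4: 31.0, 8: 58.0, 12: 75.4}


def scaling_summary(throughput):
    """Return {workers: (speedup, efficiency)} relative to the 1-worker run."""
    baseline = throughput[1]
    summary = {}
    for workers, rate in sorted(throughput.items()):
        speedup = rate / baseline        # throughput relative to one worker
        efficiency = speedup / workers   # 1.0 would be perfectly linear scaling
        summary[workers] = (speedup, efficiency)
    return summary


if __name__ == "__main__":
    for workers, (speedup, efficiency) in scaling_summary(THROUGHPUT).items():
        print(f"{workers:>2} workers: {speedup:.1f}x speedup, {efficiency:.0%} efficiency")
```

An efficiency of 1.0 corresponds to perfectly linear scaling, so the value gives a quick way to see how far each worker count is from that ideal.
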

| Workers | Throughput (req/s) | Speedup vs 1 worker |
|--------:|-------------------:|--------------------:|
| 1 | 8.5 | 1.0× |
| 2 | 15.7 | 1.8× |
| 4 | 31.0 | 3.6× |
| 8 | 58.0 | 6.8× |
| 12 | 75.4 | 8.9× |

Single-threaded QuantLib reference: ~16 req/s. Scaling is near-linear through 8 workers (about 85% parallel efficiency) and tapers to about 74% at 12 workers, the machine's physical core count.

### Curve cache

Mean per-request latency over 200 requests, with the curve cache off vs on. Generated by `tests/bench/run_bench.sh`. A cache hit reuses the bootstrapped curve and skips re-bootstrapping.

| Workload | Cache off | Cache on | Speedup |
|---|--:|--:|--:|
| Bond (1 curve, 8 helpers) | 2.11 ms | 1.10 ms | 1.9× |
| Swap (2 curves, 24 helpers) | 117.12 ms | 2.02 ms | 57.9× |

The gain scales with how much of the request is curve bootstrapping: large for a heavy multicurve with few instruments, small for a light single-curve request.

## Quick Start

### Container Image

The published GHCR image starts both the JSON API and the gRPC/Envoy endpoint:

- HTTP/JSON API: `8080`
- gRPC/Envoy endpoint: `50051`

```bash
docker pull ghcr.io/joseprupi/quantra-server:0.1.1
docker run --rm \
  -p 8080:8080 \
  -p 50051:50051 \
  ghcr.io/joseprupi/quantra-server:0.1.1
```

Check the running service:

```bash
curl http://localhost:8080/health
curl http://localhost:8080/meta
```

Change the worker count with `QUANTRA_WORKERS`:

```bash
docker run --rm \
  -e QUANTRA_WORKERS=2 \
  -p 8080:8080 \
  -p 50051:50051 \
  ghcr.io/joseprupi/quantra-server:0.1.1
```

The public API reference is available at .

### Local Build

See `docs/build.md` for environment setup details.
Once dependencies are available:

```bash
./scripts/build.sh Release
./scripts/quantra start --workers 4 --foreground
./build/jsonserver/json_server localhost:50051 8080
```

You can then call the HTTP API with sample requests from `examples/data/`:

```bash
curl -X POST http://localhost:8080/price-fixed-rate-bond \
  -H "Content-Type: application/json" \
  -d @examples/data/fixed_rate_bond_request.json
```

The generated OpenAPI files live in `jsonserver/openapi/`.

## Development Workflow

### Build

`./scripts/build.sh` regenerates schemas, recreates `build/`, and compiles the project.

```bash
./scripts/build.sh
./scripts/build.sh Release
```

### Regenerate Schemas Only

If you are editing FlatBuffers schemas and want to regenerate artifacts without a full build:

```bash
./scripts/generate_schemas.sh
```

### Run Tests

```bash
bash tests/run_all_tests.sh
```

The test suite exercises:

- C++ pricing parity against QuantLib
- C++ gRPC integration
- JSON HTTP API scenarios
- Python client scenarios

## Repository Map

- `server/`: gRPC pricing server
- `jsonserver/`: HTTP/JSON gateway and generated OpenAPI docs
- `request/`: request entrypoints and endpoint orchestration
- `parser/`: parsing, domain conversion, pricing helpers, and builders
- `client/`: C++ client library
- `quantra-python/`: Python client package
- `flatbuffers/`: schema sources plus generated C++, Python, and JSON artifacts
- `grpc/`: gRPC service definitions and generated service bindings
- `examples/data/`: example JSON requests
- `tests/`: parity, integration, and client tests
- `scripts/`: build, code generation, and runtime helpers
- `tools/quantra-manager/`: packaged process-manager implementation
- `docs/`: project documentation and reference notes

## Documentation

- `docs/README.md`: documentation index
- `docs/build.md`: environment setup and build details
- `docs/scripts.md`: build and schema tooling
- `docs/testing.md`: test suite details
- `docs/process-manager.md`: process-manager behavior and runtime model
- `docs/client.md`: C++ client notes
- `docs/parser.md`: parser/service/builder conventions
- `docs/versioning.md`: versioning policy
- `CONTRIBUTING.md`: contribution workflow

## Requirements

The repository currently documents and builds around:

- CMake `3.16+`
- GCC `12+` or Clang `14+`
- gRPC `v1.60.0`
- FlatBuffers `v24.12.23`
- QuantLib `1.41` in Docker builds
- Envoy for worker load balancing

## License

MIT / Apache 2.0