Deep dives into certified compilation and GPU optimization
We publish tokens/sec (TPS) from median CUDA timing, with JSONL certificates with sha256 checksum you can reproduce locally.