Benchmarking¶

Limen keeps repeatable benchmark commands and raw snapshots under benchmarks.

Local Benchmarks¶

Run all benchmark suites:

./scripts/run-benchmarks.sh

Run with PostgreSQL 18 through Docker Compose:

BENCH_COUNT=3 ./scripts/ci-benchmarks.sh

ci-benchmarks.sh starts PostgreSQL 18, sets LIMEN_POSTGRES_DSN, runs root, SQL, GORM, OAuth, session JWT, and credential-password benchmarks, and writes raw files to benchmarks/results.

GitHub Benchmarks¶

The Benchmarks workflow is manual and scheduled weekly. It uses the same Docker Compose PostgreSQL 18 harness and uploads the raw result files as the benchmark-results artifact.

Manual dispatch:

gh workflow run benchmarks.yml --repo ragokan/limen --ref main

Current Checked-In Snapshots¶

The checked-in snapshots are historical records, not release-wide guarantees.

micro-optimizations summary records the optimization branch comparison.
next-steps summary records the PostgreSQL 18 smoke run for the benchmark harness.
CI benchmark snapshot 26961254493 records the GitHub Actions benchmark workflow verification run.

The PostgreSQL smoke snapshot used BENCH_COUNT=1 and exists to prove the harness, not to make final performance claims. Use BENCH_COUNT=10 or higher for decisions that depend on small deltas.

Comparing Results¶

For release-facing comparisons:

Start from a clean worktree.
Run baseline and candidate benchmarks with the same BENCH_COUNT.
Compare matching raw files with benchstat.
Record the Go version, CPU, database image, sample count, and git commit.

go install golang.org/x/perf/cmd/benchstat@latest
benchstat benchmarks/results/baseline-root.txt benchmarks/results/candidate-root.txt

Dirty-worktree benchmark manifests are useful for smoke tests only.

SQL Snapshot¶

From the PostgreSQL 18 smoke run:

Operation	SQL PostgreSQL	GORM PostgreSQL
`FindOne`	`70.948 us/op`	`77.232 us/op`
`FindMany`	`107.716 us/op`	`145.104 us/op`
`Create`	`1.174 ms/op`	`1.440 ms/op`

In that run, the SQL adapter was about 8.1% faster for FindOne, 25.8% faster for FindMany, and 18.5% faster for Create.

Root Snapshot¶

The ten-sample root comparison in the micro-optimizations branch recorded a -24.47% geomean improvement across root benchmarks, with the largest win in email validation (2380.7 ns/op to 210.7 ns/op).