
How to Read Your Load Curl Report Card (2026 Guide)

By James Okonkwo · Performance Engineer

Every Load Curl test produces more than raw charts. The report card distills latency, errors, throughput, and stability into a single letter grade — plus actionable recommendations per dimension.

Why a report card instead of dashboards alone?

Dashboards are essential for drill-down, but during a release crunch you need a fast answer: did we pass or not? The grade is a weighted summary aligned with common SLO thinking. Teams using CI/CD can fail builds automatically when the grade drops below a threshold (Pro plan).
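A grade threshold in CI can be as small as an ordered comparison of letters. The sketch below is a minimal illustration, not Load Curl's own integration: the function names and the default minimum of "B" are assumptions, and how your pipeline obtains the grade is left out.

```python
# Grades ordered best to worst, matching the letters used in the report card.
GRADE_ORDER = ["A", "B", "C", "D", "F"]


def passes_gate(grade: str, minimum: str = "B") -> bool:
    """Return True when `grade` is at least as good as `minimum`."""
    grade, minimum = grade.strip().upper(), minimum.strip().upper()
    if grade not in GRADE_ORDER or minimum not in GRADE_ORDER:
        raise ValueError(f"unknown grade: {grade!r} or {minimum!r}")
    return GRADE_ORDER.index(grade) <= GRADE_ORDER.index(minimum)


def exit_code(grade: str, minimum: str = "B") -> int:
    """Map the gate result to a process exit code (0 = pass, 1 = fail)."""
    return 0 if passes_gate(grade, minimum) else 1
```

In a pipeline step, passing `exit_code(grade)` to `sys.exit` would fail the build whenever the grade is worse than the minimum.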

The four grading dimensions

Latency (p50, p95, p99)

Latency measures how long requests take end-to-end. We weight p95 and p99 heavily because tail latency defines user experience for many APIs.

  • A — Tail latency well within typical SLO ranges
  • C — p95 acceptable but p99 showing strain
  • F — Sustained high latency or growing tails during the test
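To see why tails matter, it helps to compute the percentiles by hand. The sketch below uses the nearest-rank definition, which is one of several common percentile definitions and not necessarily the one Load Curl applies; the function names are hypothetical.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of
    all samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def latency_summary(samples_ms):
    """Return p50, p95, and p99 for a list of latencies in milliseconds."""
    return {f"p{p}": percentile(samples_ms, p) for p in (50, 95, 99)}


# 95 fast requests and 5 slow ones: the median and p95 look fine,
# and only p99 exposes the slow tail.
summary = latency_summary([100] * 95 + [900] * 5)
# summary == {"p50": 100, "p95": 100, "p99": 900}
```

This is the "C" pattern from the list above in miniature: p95 is acceptable while p99 shows strain.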

What to do when latency fails: Check database query plans, connection pool sizes, N+1 patterns, and upstream timeouts. Profile hot paths and compare against a baseline test from last week.

Error rate

Errors under load often differ from errors at low traffic. Connection resets, 503s from gateways, and rate limits appear only when concurrency rises.

  • Watch for error spikes correlated with ramp-up — often capacity, not code bugs
  • Compare 4xx vs 5xx: 4xx usually points to the request side (auth misconfiguration, rate limits), 5xx to server overload
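The 4xx vs 5xx comparison can be made concrete by bucketing response codes. This is a rough sketch under stated assumptions: the input is a plain list of HTTP status codes, and `None` is a hypothetical convention for requests that failed without a response, such as connection resets.

```python
from collections import Counter


def error_breakdown(status_codes):
    """Split responses into 4xx, 5xx, and transport failures (None)."""
    total = len(status_codes)
    if total == 0:
        raise ValueError("no responses recorded")
    counts = Counter()
    for code in status_codes:
        if code is None:  # assumed marker for resets and timeouts
            counts["transport"] += 1
        elif 400 <= code < 500:
            counts["4xx"] += 1
        elif 500 <= code < 600:
            counts["5xx"] += 1
    errors = counts["transport"] + counts["4xx"] + counts["5xx"]
    return {
        "4xx_rate": counts["4xx"] / total,
        "5xx_rate": counts["5xx"] / total,
        "transport_rate": counts["transport"] / total,
        "error_rate": errors / total,
    }
```

Running it per ramp-up stage rather than over the whole test makes it easier to see whether errors rise with concurrency.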

Throughput

Throughput confirms your system delivers expected requests per second at target concurrency. A fast but low-throughput system may be serializing work unnecessarily.
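One way to sanity-check throughput against concurrency is Little's law, which says the average number of requests in flight equals throughput multiplied by mean latency. The sketch below is illustrative only; the function names are hypothetical, and it assumes a steady-state test with latency measured in seconds.

```python
def observed_rps(completed_requests: int, duration_s: float) -> float:
    """Requests per second actually delivered during the test."""
    return completed_requests / duration_s


def effective_concurrency(rps: float, mean_latency_s: float) -> float:
    """Little's law: average in-flight requests = throughput * mean latency."""
    return rps * mean_latency_s


# 12,000 requests in 60 s at 50 ms mean latency is about 10 requests in
# flight on average. If the test was configured for 100 concurrent users,
# most of them are waiting somewhere the latency metric does not see.
rps = observed_rps(12_000, 60)
in_flight = effective_concurrency(rps, 0.05)
```

When effective concurrency sits far below the configured concurrency, that gap is a hint that work is being serialized or throttled before it reaches the measured path.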

Stability

Stability captures whether metrics degrade over the test duration. Latency that creeps upward for 30 minutes suggests memory leaks, GC pressure, or queue buildup.
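A simple way to quantify "creeps upward" is the least-squares slope of latency over elapsed time. This is a sketch of that idea, not a description of how Load Curl scores stability; the sample format and the 1 ms per minute threshold are assumptions chosen for illustration.

```python
def latency_slope(points):
    """Least-squares slope for (elapsed_minutes, p95_ms) samples,
    in milliseconds of added latency per minute."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in points)
    if var_t == 0:
        raise ValueError("all samples share the same timestamp")
    cov = sum((t - mean_t) * (y - mean_y) for t, y in points)
    return cov / var_t


def is_drifting(points, max_ms_per_min=1.0):
    """Flag a run whose latency grows faster than the (assumed) threshold."""
    return latency_slope(points) > max_ms_per_min


# p95 rising from 100 ms to 160 ms over 30 minutes: slope is 2 ms per minute.
samples = [(0, 100), (10, 120), (20, 140), (30, 160)]
```

A positive slope on its own does not identify the cause, but it tells you the run is worth checking for the leak, GC, or queueing patterns described above.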

Reading the letter grade

| Grade | Typical meaning |
|-------|-----------------|
| A | Ship with confidence for this load profile |
| B | Minor issues; monitor in production |
| C | Fix before high-traffic events |
| D | Significant risk under target load |
| F | Do not deploy; critical failures observed |

Grades are relative to the load profile you configured: an A at 50 concurrent users says nothing about how the system behaves at 5,000.

Using recommendations effectively

Each failing dimension includes specific next steps, not generic advice. Work through them in order:

  1. Fix errors first — they invalidate latency readings
  2. Address stability — otherwise optimizations mask leaks
  3. Tune latency — caching, indexing, async where appropriate
  4. Validate throughput — scale horizontally if the app is healthy but capped
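The ordering above can be applied mechanically to a set of per-dimension grades. The sketch below assumes the grades are available as a dictionary keyed by dimension name; the key names and the passing grade of "B" are hypothetical.

```python
GRADE_ORDER = ["A", "B", "C", "D", "F"]

# Order taken from the numbered list: errors, stability, latency, throughput.
TRIAGE_ORDER = ["errors", "stability", "latency", "throughput"]


def triage(grades: dict[str, str], passing: str = "B") -> list[str]:
    """Return the failing dimensions, sorted in the order to work on them.

    Raises ValueError for a dimension name not listed in TRIAGE_ORDER.
    """
    limit = GRADE_ORDER.index(passing)
    failing = [d for d, g in grades.items() if GRADE_ORDER.index(g) > limit]
    return sorted(failing, key=TRIAGE_ORDER.index)


todo = triage({"latency": "D", "errors": "C", "throughput": "A", "stability": "B"})
# todo == ["errors", "latency"]
```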

Comparing runs over time

Export PDF reports on Pro plans and store baselines per release tag. A B → A improvement after a cache change is evidence your fix worked. An A → C regression after a dependency upgrade is a signal to roll back or patch.
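If you record the overall grade per release tag, spotting regressions between consecutive releases takes only a few lines. This is a minimal sketch assuming you keep the history yourself as a tag-to-grade mapping in release order; the tags and function names are hypothetical.

```python
GRADE_ORDER = ["A", "B", "C", "D", "F"]


def compare_grades(baseline: str, current: str) -> str:
    """Classify the change from a baseline grade to the current grade."""
    delta = GRADE_ORDER.index(current) - GRADE_ORDER.index(baseline)
    if delta < 0:
        return "improved"
    if delta > 0:
        return "regressed"
    return "unchanged"


def regressions(history: dict[str, str]) -> list[tuple[str, str]]:
    """Return (previous_tag, tag) pairs where the grade got worse.

    `history` maps release tag to grade, inserted in release order.
    """
    tags = list(history)
    return [
        (prev, cur)
        for prev, cur in zip(tags, tags[1:])
        if compare_grades(history[prev], history[cur]) == "regressed"
    ]


flagged = regressions({"v1.0": "B", "v1.1": "A", "v1.2": "C"})
# flagged == [("v1.1", "v1.2")]
```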

Putting it into practice

Run the same test weekly against staging with identical concurrency and duration. Treat grade drift as technical debt. When the report card is green, your load test becomes a gate — not a guess — before production deploys.
