Rate limiting and resilience
Protect services under load with rate limiting, circuit breakers, timeouts, adaptive overload shedding, body limits, and request coalescing.
When a service is healthy, every request gets served. When it’s unhealthy — a
hot client hammering one endpoint, a slow upstream dragging latency up, a CPU
pegged at 100% — the difference between a brief blip and a full outage is how
gracefully you shed the excess. Celeris ships six in-tree middleware packages for
exactly this, each a standalone celeris.HandlerFunc you install with Use,
Pre, Group, or per-route:
| Package | What it does | Default reject status |
|---|---|---|
middleware/ratelimit | Per-key token bucket / sliding window | 429 Too Many Requests |
middleware/circuitbreaker | Stop calling a failing dependency | 503 Service Unavailable |
middleware/timeout | Bound per-request latency | 503 Service Unavailable |
middleware/overload | Adaptive CPU/depth/latency load shedding | 503 Service Unavailable |
middleware/bodylimit | Cap request body size | 413 / 411 |
middleware/singleflight & middleware/idempotency | Coalesce duplicate work; make retries safe | 409 on conflict |
Every package follows the same shape: a New(config ...Config) celeris.HandlerFunc
constructor, a Config struct with sensible zero-value defaults, and a sentinel
error you can match with errors.Is. You can install them with no config at all
and get a reasonable default, then tune from there.
import (
"github.com/goceleris/celeris"
"github.com/goceleris/celeris/middleware/ratelimit"
)
s := celeris.New(celeris.Config{Addr: ":8080"})
s.Use(ratelimit.New()) // 10 RPS, burst 20, per client IP
Ordering matters. Middleware runs outside-in. Put cheap, broad rejections first so an overloaded server spends as little as possible on requests it’s about to drop:
bodylimit→ratelimit→overload→circuitbreaker→timeout→ your handlers is a sane default. See Middleware for the full ordering model and wherePre(pre-routing) fits.
Rate limiting
ratelimit.New installs a per-key limiter. By default it’s a token bucket:
each key gets a bucket that refills at RPS tokens per second up to a ceiling of
Burst tokens; each request spends one token, and a request with no tokens left
is rejected with 429. This lets short bursts through (up to Burst) while
holding the long-run average at RPS.
Source: celeris/middleware/ratelimit/ratelimit.go:36.
import "github.com/goceleris/celeris/middleware/ratelimit"
// 100 requests/second sustained, bursts up to 200, per client IP.
s.Use(ratelimit.New(ratelimit.Config{
RPS: 100,
Burst: 200,
}))
Rate as a human string
Instead of RPS + Burst you can write a Rate string in <count>-<unit>
form, where the unit is S (second), M (minute), H (hour), or D (day).
Rate takes precedence over RPS, and sets Burst equal to the count unless
you set Burst explicitly. Source: celeris/middleware/ratelimit/config.go:208.
s.Use(ratelimit.New(ratelimit.Config{Rate: "100-M"})) // 100/min, burst 100
s.Use(ratelimit.New(ratelimit.Config{Rate: "1000-H"})) // 1000/hour
s.Use(ratelimit.New(ratelimit.Config{Rate: "5-S", Burst: 20})) // 5/s sustained, burst 20
Rate | Meaning | Resulting RPS | Default Burst |
|---|---|---|---|
"5-S" | 5 per second | 5 | 5 |
"100-M" | 100 per minute | 1.6667 | 100 |
"1000-H" | 1000 per hour | 0.2778 | 1000 |
"10000-D" | 10000 per day | 0.1157 | 10000 |
You can parse and validate a rate string yourself before constructing the middleware — handy when the value comes from config or an untrusted source:
rps, burst, err := ratelimit.ParseRate("100-M") // 1.6667, 100, nil
if err != nil { /* invalid format */ }
// Validate a whole Config without panicking (New panics on a bad Rate).
if err := ratelimit.ValidateConfig(cfg); err != nil {
log.Fatalf("bad ratelimit config: %v", err)
}
ValidateConfig is the safe pre-flight check; New itself panics on an
invalid Rate. Source: celeris/middleware/ratelimit/config.go:179.
The key: KeyFunc and the proxy caveat
A limiter is only as good as its key. By default the key is the client IP
(c.ClientIP(), falling back to c.RemoteAddr()).
Source: celeris/middleware/ratelimit/config.go:155.
Read this before you ship.
c.ClientIP()only returns the real client address when the proxy middleware is installed (s.Pre(proxy.New(...))) and configured with your load balancer’s address inTrustedProxies. Without it,ClientIP()returns the immediate peer — i.e. your load balancer — so every real client shares one bucket and a single noisy client triggers a global 429 for everyone behind that hop. With a misconfiguredTrustedProxiesrange, attackers can spoofX-Forwarded-Forand escape their bucket. Verify the chain before relying on the default. Source: theKeyFuncdoc comment,celeris/middleware/ratelimit/config.go:62.
For anything other than raw IP, supply your own KeyFunc:
// Rate-limit per API key (read from a header set by your auth middleware).
s.Use(ratelimit.New(ratelimit.Config{
Rate: "1000-M",
KeyFunc: func(c *celeris.Context) string {
if k := c.Header("x-api-key"); k != "" {
return "key:" + k
}
return "ip:" + c.ClientIP() // fall back for anonymous traffic
},
}))
X-RateLimit headers
Unless you set DisableHeaders: true, every response carries the standard
limit headers, and rejected requests also get Retry-After:
| Header | Meaning |
|---|---|
X-RateLimit-Limit | The bucket capacity (Burst) |
X-RateLimit-Remaining | Tokens left in this key’s bucket |
X-RateLimit-Reset | Seconds until the bucket refills |
Retry-After | (on 429 only) Seconds the client should wait |
Source: celeris/middleware/ratelimit/ratelimit.go:198-207.
Sliding window
Token bucket can let a burst land right at a window edge. Set
SlidingWindow: true to use a sliding-window counter instead — it weights the
previous and current window by the elapsed fraction of the current window,
smoothing the rate near boundaries. Source: celeris/middleware/ratelimit/config.go:91.
s.Use(ratelimit.New(ratelimit.Config{Rate: "100-M", SlidingWindow: true}))
Dynamic per-request limits with RateFunc
When different callers deserve different limits (free vs. paid tiers, say), set
RateFunc. It runs per request and returns a Rate string; an empty string
falls back to the static Rate/RPS/Burst. Each distinct rate string gets its
own limiter, cached up to MaxDynamicLimiters (default 1024); beyond that, new
rate strings are rejected. Source: celeris/middleware/ratelimit/config.go:56,
celeris/middleware/ratelimit/ratelimit.go:176-188.
s.Use(ratelimit.New(ratelimit.Config{
Rate: "100-M", // default tier
RateFunc: func(c *celeris.Context) (string, error) {
switch tier(c) {
case "enterprise":
return "10000-M", nil
case "pro":
return "1000-M", nil
default:
return "", nil // use the static default
}
},
}))
Distributed limiting with a Store
The built-in limiter is in-process: each instance counts independently, so N
replicas means N× the effective limit. To enforce a global limit across a
fleet, plug in a Store. When Store is set, it owns the rate/burst logic and
RPS, Burst, Shards, and CleanupInterval are ignored.
Source: celeris/middleware/ratelimit/config.go:38.
Celeris ships a Redis-backed store at
github.com/goceleris/celeris/middleware/ratelimit/redisstore:
import (
"github.com/goceleris/celeris/middleware/ratelimit"
"github.com/goceleris/celeris/middleware/ratelimit/redisstore"
"github.com/redis/go-redis/v9"
)
rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})
store, err := redisstore.New(ctx, rdb, redisstore.Options{
RPS: 100, // tokens/second — required
Burst: 200, // bucket capacity — required
})
if err != nil {
log.Fatal(err)
}
s.Use(ratelimit.New(ratelimit.Config{Store: store}))
The Redis store uses an atomic Lua token-bucket script (loaded via SCRIPT LOAD
at New time, then run with EVALSHA on the hot path), so the limit holds even
under concurrent requests across replicas. Source:
celeris/middleware/ratelimit/redisstore/redisstore.go:61-70 (the script),
redisstore.go:114-116 (New). A memcached store ships alongside it at
github.com/goceleris/celeris/middleware/ratelimit/memcachedstore.
Refunding tokens
Sometimes you don’t want a request to cost a token if it failed (or
succeeded). SkipFailedRequests refunds the token when the handler returns a
status >= 400; SkipSuccessfulRequests refunds it for status < 400. For a
Store, refunds require the store to implement StoreUndo (the Redis store
does). Source: celeris/middleware/ratelimit/config.go:98-106.
// Only count requests that actually hit your backend (4xx don't burn quota).
s.Use(ratelimit.New(ratelimit.Config{Rate: "100-M", SkipFailedRequests: true}))
Other ratelimit.Config fields
| Field | Type | Default | Purpose |
|---|---|---|---|
RPS | float64 | 10 | Refill rate, requests/sec per key |
Burst | int | 20 | Bucket capacity (max tokens) |
Rate | string | — | Human rate string; overrides RPS |
RateFunc | func(*Context) (string, error) | — | Per-request rate string |
KeyFunc | func(*Context) string | client IP | Extract the limit key |
Store | Store | — | External backend; ignores RPS/Burst/Shards |
SlidingWindow | bool | false | Sliding window instead of token bucket |
Shards | int | NumCPU | Lock shards (rounded up to power of 2) |
CleanupInterval | time.Duration | 1m | How often expired buckets are reaped |
CleanupContext | context.Context | — | Cancel to stop the cleanup goroutine |
DisableHeaders | bool | false | Suppress X-RateLimit-* headers |
SkipFailedRequests | bool | false | Refund token on >= 400 |
SkipSuccessfulRequests | bool | false | Refund token on < 400 |
MaxDynamicLimiters | int | 1024 | Cap on cached RateFunc rate strings |
Skip / SkipPaths | func / []string | — | Bypass certain requests/paths |
ErrorHandler | func(*Context, error) error | 429 | Custom rejection response |
ErrorHandler receives the sentinel ratelimit.ErrTooManyRequests (a 429
*celeris.HTTPError) so you can distinguish the cause:
s.Use(ratelimit.New(ratelimit.Config{
Rate: "100-M",
ErrorHandler: func(c *celeris.Context, err error) error {
return c.JSON(429, map[string]string{"error": "slow down"})
},
}))
LimitReached func(c *celeris.Context) erroris the deprecated predecessor ofErrorHandler; if both are set,ErrorHandlerwins. PreferErrorHandlerfor consistency with the rest of the middleware family. Source:celeris/middleware/ratelimit/config.go:108-122.
Circuit breakers
A rate limiter protects you from your callers. A circuit breaker protects you from a failing dependency — when a downstream is throwing errors, the breaker “opens” and fails fast for a cooldown instead of piling more doomed requests onto a service that’s already down.
The breaker is a three-state machine:
- Closed — normal. All requests pass; the breaker watches the error rate.
- Open — tripped. Every request is rejected immediately with
503, no call to the handler, until the cooldown elapses. - Half-open — probing. After the cooldown, a few probe requests are let through. If they succeed the breaker closes; if any fails it re-opens.
Source: celeris/middleware/circuitbreaker/config.go:9-19.
import "github.com/goceleris/celeris/middleware/circuitbreaker"
// Wrap only the routes that call the flaky upstream.
api := s.Group("/api")
api.Use(circuitbreaker.New(circuitbreaker.Config{
Threshold: 0.5, // trip at 50% failure rate
MinRequests: 20, // …but only after 20 requests in the window
WindowSize: 10 * time.Second, // sliding observation window
CooldownPeriod: 30 * time.Second, // stay open this long before probing
HalfOpenMax: 3, // allow 3 probes in half-open
}))
When does it trip?
The breaker trips (Closed → Open) when, within the current WindowSize, the
total request count is at least MinRequests and failures/total >= Threshold. MinRequests stops a tiny sample (2 of 3 requests failing) from
tripping prematurely. Source: celeris/middleware/circuitbreaker/circuitbreaker.go:180.
| Field | Type | Default | Purpose |
|---|---|---|---|
Threshold | float64 | 0.5 | Failure ratio that trips the breaker; must be in (0, 1] |
MinRequests | int | 10 | Min requests in the window before it can trip |
WindowSize | time.Duration | 10s | Sliding observation window (must be >= 10ms) |
CooldownPeriod | time.Duration | 30s | Time Open before transitioning to Half-open |
HalfOpenMax | int | 1 | Probe requests allowed in Half-open |
IsError | func(err error, status int) bool | status >= 500 | What counts as a failure |
OnStateChange | func(from, to State) | — | Observe transitions (called under a mutex — keep it fast) |
ErrorHandler | func(*Context, error) error | 503 | Response when the breaker is open |
Skip / SkipPaths | func / []string | — | Bypass certain requests/paths |
Out-of-range values panic at construction:
Thresholdmust be in(0, 1],WindowSize >= 10ms,CooldownPeriod > 0,HalfOpenMax >= 1, `MinRequests= 1
. Source:celeris/middleware/circuitbreaker/config.go:120`.
What counts as a failure: IsError
By default any response with status >= 500 (or a non-HTTPError returned by
the handler) is a failure. A 429 or 404 is not — those are the client’s
fault, not the dependency’s. Override IsError to refine this:
api.Use(circuitbreaker.New(circuitbreaker.Config{
IsError: func(err error, status int) bool {
// Treat upstream timeouts (504) and any 5xx as failures.
return status >= 500
},
}))
Observing and controlling the breaker
New returns just the handler. Use NewWithBreaker when you want a handle on the
*Breaker for metrics, health checks, or a manual reset:
mw, brk := circuitbreaker.NewWithBreaker(circuitbreaker.Config{
OnStateChange: func(from, to circuitbreaker.State) {
log.Printf("breaker: %s -> %s", from, to) // e.g. "closed -> open"
},
})
api.Use(mw)
// Elsewhere — expose the state on a health endpoint:
s.GET("/internal/breaker", func(c *celeris.Context) error {
total, failures := brk.Counts()
return c.JSON(200, map[string]any{
"state": brk.State().String(), // "closed" | "open" | "half-open"
"total": total,
"failures": failures,
})
})
// brk.Reset() forces the breaker back to Closed and clears the window.
Breaker exposes State() State, Counts() (total, failures int64), and
Reset(). Source: celeris/middleware/circuitbreaker/circuitbreaker.go:54-74.
The sentinel for the open state is circuitbreaker.ErrServiceUnavailable, which
aliases celeris.ErrServiceUnavailable (503) — the same sentinel timeout
uses, so errors.Is(err, celeris.ErrServiceUnavailable) matches both. (The
overload middleware also returns 503, but it does so via
c.AbortWithStatus(503) rather than this sentinel, and ratelimit returns its
own 429 sentinel ratelimit.ErrTooManyRequests.)
Source: celeris/middleware/circuitbreaker/config.go:78-81,
celeris/errors.go:50.
Timeouts
A handler that hangs forever ties up a worker forever. timeout.New bounds how
long a request may take. There are two modes.
Non-preemptive (the default)
The middleware wraps the request context with a deadline and lets the handler
run. Your handler cooperatively observes the deadline by selecting on
c.Context().Done() (or passing c.Context() to an http.Client, DB driver,
etc.). If the handler overruns the deadline, the middleware returns the timeout
response. Source: celeris/middleware/timeout/timeout.go:36.
import "github.com/goceleris/celeris/middleware/timeout"
s.Use(timeout.New(timeout.Config{Timeout: 3 * time.Second}))
s.GET("/slow", func(c *celeris.Context) error {
// Honour cancellation — pass the request context to downstream calls.
row := db.QueryRowContext(c.Context(), "SELECT ...")
// …
return c.JSON(200, result)
}).Async() // blocking I/O → run off the event loop (see /docs/routing)
This mode is cheap: it uses a single lazy-deadline context and allocates no
goroutine. Its limitation is that a CPU-bound or blocking handler that never
checks Done() won’t actually be interrupted — the deadline is reported once it
returns, but it can still run long.
Preemptive
Set Preemptive: true to run the handler in a goroutine with a buffered
response. If it doesn’t finish within the timeout, the middleware discards the
buffered output and returns the error handler’s response — bounding the client’s
wait even when the handler ignores cancellation.
s.Use(timeout.New(timeout.Config{
Timeout: 2 * time.Second,
Preemptive: true,
}))
Preemptive mode has two hard constraints:
- Handlers must still honour
c.Context().Done(). Preemptive mode bounds the client wait, but a handler that never returns keeps leaking a goroutine. Always select on cancellation. Source:celeris/middleware/timeout/config.go:38.- It is incompatible with streaming. Buffered mode captures the whole response in memory, defeating
StreamWriterand risking OOM on large payloads. Use the non-preemptive default for SSE/streaming endpoints. See Streaming, SSE & WebSocket.Preemptive mode allocates a goroutine and a
context.WithTimeoutper request — a measurable ~1–3% throughput cost at very high RPS. Reach for it only when you genuinely need to interrupt uncooperative handlers.
Per-request timeouts and treating upstream errors as timeouts
TimeoutFunc computes a per-request deadline (falling back to Timeout when it
returns zero). TimeoutErrors lists errors that should be treated as a timeout
even before the deadline — e.g. a DB driver’s own query-timeout error — so they
flow through your ErrorHandler as a 503 instead of a raw 500.
s.Use(timeout.New(timeout.Config{
Timeout: 5 * time.Second,
TimeoutFunc: func(c *celeris.Context) time.Duration {
if c.Path() == "/report" {
return 30 * time.Second // reports get longer
}
return 0 // use the static Timeout
},
TimeoutErrors: []error{context.DeadlineExceeded, sql.ErrConnDone},
ErrorHandler: func(c *celeris.Context, err error) error {
return c.JSON(503, map[string]string{"error": "request timed out"})
},
}))
| Field | Type | Default | Purpose |
|---|---|---|---|
Timeout | time.Duration | 5s | Static deadline; fallback for TimeoutFunc |
TimeoutFunc | func(*Context) time.Duration | — | Per-request deadline |
Preemptive | bool | false | Run handler in a goroutine; bound client wait |
TimeoutErrors | []error | — | Errors treated as a timeout via errors.Is |
ErrorHandler | func(*Context, error) error | 503 | Timeout response |
Skip / SkipPaths | func / []string | — | Bypass certain requests/paths |
Newpanics ifTimeout <= 0andTimeoutFuncis nil — you must give it some deadline. Source:celeris/middleware/timeout/config.go:80. TheErrorHandlerreceivescontext.DeadlineExceededfor a deadline timeout, the matchedTimeoutErrorsentry for a semantic timeout, or a panic-wrapped error for a recovered panic. Source:celeris/middleware/timeout/config.go:27.
Adaptive overload shedding
Rate limits and circuit breakers are static rules. The overload middleware is
adaptive: a background goroutine samples real server pressure — CPU
utilization, in-flight request depth, and tail latency — and walks a five-stage
ladder, degrading more aggressively as pressure rises and recovering (with
hysteresis) as it falls. Source: celeris/middleware/overload/overload.go:1-23.
| Stage | CPU default | Behaviour |
|---|---|---|
| Normal | < 0.70 | Pass through unchanged |
| Expand | ≥ 0.70 | Signal best-effort worker widening; pass through |
| Reap | ≥ 0.80 | Opt-in runtime.GC(), then pass through |
| Reorder | ≥ 0.85 | Low-priority requests get 503; others pass |
| Backpressure | ≥ 0.90 | Low-priority requests get 503 (BackpressureStatus); others sleep BackpressureDelay (default 50ms) then pass; exempt pass |
| Reject | ≥ 0.95 | All non-exempt requests get 503 + Retry-After |
Source: celeris/middleware/overload/config.go:16-24 and config.go:177.
CollectorProvider is required
The middleware reads CPU from an observe.Collector, and the collector must have
a CPU monitor attached. New panics if CollectorProvider is nil. The
provider is a function (not a value) because the server only creates its
collector at Start, so you pass a closure that fetches it lazily.
Source: celeris/middleware/overload/overload.go:103-105.
import (
"github.com/goceleris/celeris/middleware/overload"
"github.com/goceleris/celeris/observe"
)
// Attach a CPU monitor to the server's collector before installing the mw.
if col := s.Collector(); col != nil {
if mon, err := observe.NewCPUMonitor(); err == nil {
col.SetCPUMonitor(mon)
}
}
s.Use(overload.New(overload.Config{
CollectorProvider: s.Collector, // method value — resolved per poll tick
}))
s.Collector() is created eagerly in New, so it is non-nil before Start
— the if col := s.Collector(); col != nil snippet above can attach the CPU
monitor at setup time. It returns nil only when Config.DisableMetrics is true.
Source: celeris/server.go:542.
observe.NewCPUMonitor() returns a platform-appropriate monitor (Linux reads
/proc/stat; others use runtime/metrics) and a non-nil error, so check it
before calling SetCPUMonitor. Source: celeris/observe/cpumon_linux.go:10,
cpumon_other.go:9, celeris/observe/collector.go:130 (SetCPUMonitor),
collector.go:77 (CPUMonitor). For the full metrics surface see
Configuration.
Depth and latency signals
CPU alone misses the most common production overload: handlers blocked on a slow
upstream. CPU stays low while requests queue up. Add DepthThresholds
(absolute in-flight counts) and/or LatencyThresholds (EMA tail-latency targets)
— whichever signal produces the highest stage wins. A depth/latency threshold
can escalate the stage beyond the CPU-driven one, never below it.
Source: celeris/middleware/overload/overload.go:159-177, overload.go:253-257.
workers := 12
s.Use(overload.New(overload.Config{
CollectorProvider: s.Collector,
DepthThresholds: overload.DepthThresholds{
Reorder: int32(2 * workers), // queue building → shed low priority
Backpressure: int32(4 * workers),
Reject: int32(8 * workers), // queue out of control → shed all
},
LatencyThresholds: overload.LatencyThresholds{
Backpressure: 500 * time.Millisecond, // SLO-aware: latency spiked
Reject: 2 * time.Second,
},
}))
Zero-valued threshold fields disable that signal for that stage, so you opt into
exactly the signals you want. DepthThresholds guidance from the source: try
Reorder = 2×workers, Backpressure = 4×workers, Reject = 8×workers.
Source: celeris/middleware/overload/config.go:62-66.
Priority: deciding who to shed
At Reorder and Backpressure, the middleware needs to know which requests are
expendable. PriorityFunc classifies each request (higher = more important);
requests scoring below PriorityThreshold (default 0) are the ones
rejected at Reorder and delayed/rejected at Backpressure. Without a PriorityFunc
all requests share priority 0, so Reorder passes everything and Backpressure
delays everything. Source: celeris/middleware/overload/config.go:128-137.
s.Use(overload.New(overload.Config{
CollectorProvider: s.Collector,
PriorityFunc: func(c *celeris.Context) int {
switch {
case c.Path() == "/checkout":
return 10 // protect revenue paths first
case strings.HasPrefix(c.Path(), "/api/"):
return 1
default:
return -1 // background/scrapers shed first
}
},
PriorityThreshold: 0, // priority < 0 is "low"
BackpressureDelay: 100 * time.Millisecond, // throttle, don't drop, mid-priority
RetryAfter: 10 * time.Second, // Retry-After on the 503s
}))
ExemptPaths and ExemptFunc mark requests that never get degraded — use them
for health checks and other endpoints that must always answer:
overload.Config{
CollectorProvider: s.Collector,
ExemptPaths: []string{"/healthz", "/readyz"},
}
Key overload.Config fields
| Field | Type | Default | Purpose |
|---|---|---|---|
CollectorProvider | func() *observe.Collector | — (required) | Source of CPU samples |
Thresholds | Thresholds | 0.70/0.80/0.85/0.90/0.95 | Per-stage CPU fractions; Hysteresis 0.05 |
DepthThresholds | DepthThresholds | all 0 (off) | Per-stage in-flight counts |
LatencyThresholds | LatencyThresholds | all 0 (off) | Per-stage EMA latency targets |
LatencyEMAAlpha | float64 | 0.1 | EMA smoothing factor |
PollInterval | time.Duration | 1s | How often CPU is sampled |
PriorityFunc | func(*Context) int | — | Classify request priority |
PriorityThreshold | int | 0 | Cutoff below which requests are “low” |
BackpressureDelay | time.Duration | 50ms | Added latency at Backpressure |
BackpressureStatus | int | 503 | Status for non-delayable at Backpressure |
RejectStatus | int | 503 | Status at Reject |
RetryAfter | time.Duration | 5s | Retry-After header on 503s |
ExemptPaths / ExemptFunc | []string / func | — | Never-degraded requests |
EnableReap | bool | false | Allow runtime.GC() at Reap |
StopContext | context.Context | Background | Cancel to stop the sampler |
Skip / SkipPaths | func / []string | — | Bypass certain requests/paths |
Inspecting the controller
NewWithController returns the handler and a *Controller you can poll for
the live stage, in-flight count, latency EMA, and last CPU sample — ideal for a
debug endpoint or your metrics exporter:
mw, ctrl := overload.NewWithController(overload.Config{CollectorProvider: s.Collector})
s.Use(mw)
s.GET("/internal/overload", func(c *celeris.Context) error {
return c.JSON(200, map[string]any{
"stage": ctrl.Stage().String(), // "normal" … "reject"
"inflight": ctrl.InFlight(),
"latency": ctrl.LatencyEMA().String(),
"cpu": ctrl.CPUSample(), // -1 until first sample
})
})
// ctrl.Stop() halts the sampler (the stage then freezes at its last value).
Source: celeris/middleware/overload/overload.go:48-85.
Body limits
bodylimit.New rejects oversized request bodies before your handler runs. Set a
size in bytes with MaxBytes, or with a human-readable Limit string (which
takes precedence). Source: celeris/middleware/bodylimit/config.go:11.
import "github.com/goceleris/celeris/middleware/bodylimit"
s.Use(bodylimit.New(bodylimit.Config{Limit: "10MB"})) // 413 if larger
Limit accepts both plain (KB/MB/GB/TB/PB/EB) and IEC binary
(KiB/MiB/…) suffixes, with optional fractions: "1.5GB", "512KiB". All
suffixes — including the plain KB/MB/GB forms — are interpreted as powers
of 1024, so "10MB" means 10 × 1024 × 1024 bytes. An invalid Limit string
panics at construction.
Source: celeris/middleware/bodylimit/config.go:83.
The middleware checks in two phases: first the Content-Length header (fast
reject), then the actual buffered body length (catches a lying or absent
Content-Length). Bodyless methods (GET, HEAD, DELETE, OPTIONS, TRACE,
CONNECT) are auto-skipped. Source: celeris/middleware/bodylimit/bodylimit.go:53-82.
ContentLengthRequired
Important architectural note. Celeris buffers the full request body before middleware runs, so
bodylimitcannot stop an oversized payload from entering memory whenContent-Lengthis absent or dishonest — the framework’s ownmaxRequestBodySize(100 MB) is the hard ceiling. To force clients to declare body size up front, setContentLengthRequired: true; requests without aContent-Lengthheader are then rejected with411 Length Requiredbefore the body check. Source:celeris/middleware/bodylimit/bodylimit.go:12-23,config.go:31.
s.Use(bodylimit.New(bodylimit.Config{
Limit: "5MB",
ContentLengthRequired: true, // 411 if no Content-Length
ErrorHandler: func(c *celeris.Context, err error) error {
if errors.Is(err, bodylimit.ErrLengthRequired) {
return c.JSON(411, map[string]string{"error": "Content-Length required"})
}
return c.JSON(413, map[string]string{"error": "body too large"})
},
}))
| Field | Type | Default | Purpose |
|---|---|---|---|
Limit | string | — | Human size string; overrides MaxBytes |
MaxBytes | int64 | 4 MiB | Max body bytes |
ContentLengthRequired | bool | false | Reject missing Content-Length with 411 |
ErrorHandler | func(*Context, error) error | 413/411 | Custom rejection |
Skip / SkipPaths | func / []string | — | Bypass certain requests/paths |
The sentinels are bodylimit.ErrBodyTooLarge (413) and
bodylimit.ErrLengthRequired (411).
Source: celeris/middleware/bodylimit/bodylimit.go:5-10.
Request coalescing & idempotency
These two packages both stop duplicate work, but solve different problems.
singleflight — coalesce identical in-flight requests
When many identical requests arrive at once (a cache stampede, a dashboard that
N clients all refresh on the same tick), singleflight lets the first
request run the handler and serves every concurrent duplicate a copy of that
one response. Coalesced responses carry an X-Singleflight: HIT header.
Source: celeris/middleware/singleflight/singleflight.go:59, singleflight.go:115.
import "github.com/goceleris/celeris/middleware/singleflight"
// Coalesce expensive, cacheable GETs.
s.GET("/report/:id", singleflight.New(), buildExpensiveReport)
The key must include user identity. The default
KeyFuncismethod + path + sorted-query + Authorization header + Cookie header— the auth and cookie components are what stop one user’s response from being served to another. If you supply your ownKeyFunc, you MUST incorporate user identity for any endpoint that returns user-specific data, or you will leak data across users. Source:celeris/middleware/singleflight/config.go:18-31.
s.GET("/me/feed", singleflight.New(singleflight.Config{
KeyFunc: func(c *celeris.Context) string {
return c.Method() + "\x00" + c.Path() + "\x00" + userID(c) // identity!
},
}), buildFeed)
singleflight only deduplicates requests that are in flight at the same time —
it is not a cache. Once the leader returns, the next request runs the handler
fresh. It’s purely a stampede guard.
idempotency — make retries safe
idempotency implements the HTTP Idempotency-Key pattern: a client sends a
unique key with a write request, and if it has to retry (network blip, timeout),
the server replays the original response instead of performing the operation
twice. By default it applies to POST, PUT, PATCH, DELETE; other methods
pass through. Source: celeris/middleware/idempotency/idempotency.go:1-18,
config.go:37-39.
import "github.com/goceleris/celeris/middleware/idempotency"
s.Use(idempotency.New()) // in-memory store, "Idempotency-Key" header, 24h TTL
How it behaves for a given key:
- First request — acquires an atomic lock, runs the handler, stores the response under the key, releases.
- Retry after completion — replays the stored response (same status, headers, body). No second execution.
- Concurrent duplicate while the first is still running — returns
409 Conflict(override viaOnConflict). - Handler crashed mid-flight — the lock expires after
LockTimeout(default 30s) so the next request can retry.
Source: celeris/middleware/idempotency/idempotency.go:70-200.
The store needs SetNX
The store must implement both store.KV and store.SetNXer (atomic
set-if-not-exists) — SetNX is how the lock is acquired race-free. These two
interfaces are combined in the idempotency.KVStore type, which is the static
type of Config.Store, so a backend that lacks SetNX simply won’t compile
when you assign it. The default in-memory store satisfies KVStore out of the
box; a Redis/Postgres store works too as long as it implements both interfaces.
Source: celeris/middleware/idempotency/config.go:15-26. When Store is left
nil, New installs the in-memory store. Source:
celeris/middleware/idempotency/idempotency.go:47-49.
s.Use(idempotency.New(idempotency.Config{
Store: myRedisKVStore, // must implement store.KV + store.SetNXer
TTL: 48 * time.Hour,
LockTimeout: 1 * time.Minute,
Methods: []string{"POST", "PATCH"},
}))
BodyHash — catch key reuse with a different payload
A client that reuses an idempotency key with a different body is almost always
a bug. Set BodyHash: true to store a SHA-256 of the request body alongside the
response; on replay, a body mismatch returns 422 Unprocessable Entity.
Source: celeris/middleware/idempotency/config.go:50-53,
idempotency.go:104-110.
s.Use(idempotency.New(idempotency.Config{BodyHash: true}))
With BodyHash enabled, a request whose body exceeds MaxBodyBytes (default
1 MiB) is rejected with 413 before hashing, since the hash can’t be computed
over a truncated body. Source: celeris/middleware/idempotency/idempotency.go:89-95.
On the leader path, a response larger than MaxBodyBytes is still served, but
it is not cached — the lock is released and later retries re-run the handler
rather than replaying. Source: celeris/middleware/idempotency/idempotency.go:160-170.
| Field | Type | Default | Purpose |
|---|---|---|---|
Store | KVStore | in-memory | Persists responses + locks; needs KV + SetNXer |
KeyHeader | string | Idempotency-Key | Header carrying the key |
TTL | time.Duration | 24h | Lifetime of a stored response |
LockTimeout | time.Duration | 30s | Lock lifetime (recovers crashed handlers) |
Methods | []string | POST,PUT,PATCH,DELETE | Methods the mw applies to |
OnConflict | func(*Context) error | 409 | Response for an in-flight duplicate |
BodyHash | bool | false | Reject key reuse with a different body (422) |
MaxBodyBytes | int | 1 MiB | Cap on hashed/stored body |
MaxKeyLength | int | 255 | Max key header length (over → 400) |
Skip / SkipPaths | func / []string | — | Bypass certain requests/paths |
A missing key header just passes the request through — idempotency is opt-in per request. An invalid key (non-printable, or longer than
MaxKeyLength) returns400. Source:celeris/middleware/idempotency/idempotency.go:77-83.
When to use which
| Situation | Use |
|---|---|
| Many concurrent reads of the same expensive resource | singleflight |
| A client may retry a write and must not double-charge | idempotency |
| Both — a write you also want to dedupe under concurrency | idempotency (it holds a lock; singleflight does not persist) |
They compose: singleflight in front collapses simultaneous duplicates;
idempotency behind it makes the surviving request’s retries safe.
Common pitfalls
- Rate limiting by the wrong key. Without the proxy middleware,
c.ClientIP()is your load balancer’s address, so all clients share one bucket. Installproxy.Newwith a correctTrustedProxiesrange, or supply aKeyFuncthat keys on something you control (API key, user ID). - Distributed limit that isn’t. The default limiter is per-instance; N
replicas means N× the limit. Use a
Store(Redis/memcached) for a global cap. Ratepanics,ValidateConfigdoesn’t. Load rate strings from config throughratelimit.ValidateConfig(orParseRate) before callingNew, which panics on a badRate.- Preemptive timeout on a streaming route. Buffered mode captures the whole
response — never combine it with
StreamWriter/SSE. Use the non-preemptive default there. - Timeout with handlers that ignore
Done(). Non-preemptive mode can only report a deadline overrun, not interrupt it. Always passc.Context()downstream and select onDone(). overloadwithout a CPU monitor.CollectorProvideris required and the collector needsSetCPUMonitor— otherwise CPU reads as-1and the CPU ladder never escalates (depth/latency signals still work). And remember CPU alone misses I/O-bound overload; addDepthThresholds.bodylimitis not a hard memory guard. The body is already buffered before it runs. Pair it withContentLengthRequired: trueto reject undeclared bodies early.singleflightcross-user leak. A customKeyFuncthat omits identity will serve one user’s data to another. Always include the user in the key.idempotencystore withoutSetNX. A store that doesn’t implementstore.SetNXercan’t be used —Config.Storeis typedidempotency.KVStore(store.KV+store.SetNXer), so an incompatible store is a compile error, not a runtime surprise. LeaveStorenil to get the in-memory default.
FAQ
Which status codes do these return?
ratelimit → 429; circuitbreaker, timeout, and overload → 503;
bodylimit → 413 (too large) or 411 (length required); idempotency → 409
on an in-flight conflict, 422 on a body-hash mismatch, 400 on a bad key, and
(with BodyHash) 413 on a request body over MaxBodyBytes. Most are
customizable via the relevant ErrorHandler/OnConflict.
Can I match all “service unavailable” causes with one errors.Is?
Yes. circuitbreaker.ErrServiceUnavailable and timeout.ErrServiceUnavailable
both alias celeris.ErrServiceUnavailable, so
errors.Is(err, celeris.ErrServiceUnavailable) matches both.
Do I install these globally or per route?
Either. s.Use(...) applies to every route; group.Use(...) scopes to a group;
passing the handler as a leading argument to s.GET(...) scopes it to one route.
Scope the circuit breaker to the routes that call the flaky dependency, not the
whole server. See Middleware and Routing.
How do these interact with async handlers?
They’re ordinary middleware and run in the chain regardless of dispatch mode. For
blocking I/O handlers wrapped by timeout/circuitbreaker, mark the route
.Async() so the blocking call runs off the event loop — see Routing.
Where do the overload CPU samples come from?
From the server’s observe.Collector with a CPUMonitor attached
(observe.NewCPUMonitor()). The collector is created automatically unless
Config.DisableMetrics is set; see Configuration.
See also
- Middleware — the full in-tree catalog and ordering model,
plus the
proxymiddleware that makesClientIP()trustworthy. - Configuration — server timeouts,
maxRequestBodySize, the metrics collector, andDisableMetrics. - Routing — per-route middleware and the
Async/Syncdispatch model. - Streaming, SSE & WebSocket — why preemptive timeouts and streaming don’t mix.