Architecture¶
RPC Plane sits between your application and your Solana RPC providers. It makes sub-millisecond routing decisions using pre-computed health scores and returns the response from the selected provider.
                  ┌──────────────────────────────────────────┐
                  │                RPC Plane                 │
                  │                                          │
App request ─────▶│  Ingress (HTTP)                          │
                  │       │                                  │
                  │       ▼                                  │
                  │  Request Classifier                      │
                  │  (read vs write, method)                 │
                  │       │                                  │
                  │       ▼                                  │
                  │  Router ◄──── Health Scorer              │
                  │  (select provider)  ▲                    │
                  │       │             │                    │
                  │       │        Slot Tracker              │
                  │       │      (background loop)           │
                  │       ▼                                  │
                  │  ┌─────────┐ ┌─────────┐ ┌─────────┐     │
                  │  │Provider │ │Provider │ │Provider │     │
                  │  │    A    │ │    B    │ │    C    │     │
                  │  └────┬────┘ └────┬────┘ └────┬────┘     │
                  │       │           │           │          │
                  │       ▼           ▼           ▼          │
                  │  Response / Retry logic                  │
                  │       │                                  │
App response ◄────│  Egress                                  │
                  │                                          │
                  │  Prometheus metrics ──▶ :9401/metrics    │
                  └──────────────────────────────────────────┘
Request flow¶
- Ingress — the proxy accepts the JSON-RPC request (HTTP POST).
- Request classifier — parses the method field and classifies it as a read or a write.
- Router — selects a provider (or set of providers) based on current health scores and the configured strategy.
- Provider call — the proxy forwards the request. For broadcasts it fans out concurrently.
- Retry / failover — on retryable errors (429, 5xx, timeout) the proxy tries the next provider in the ordered list.
- Egress — the first successful response is returned to the caller.
All steps happen in the hot path. Health scores are pre-computed by background tasks; the router reads them from a shared atomic snapshot without any blocking I/O.
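The classification step can be sketched as a set lookup. This is a minimal illustration, not the proxy's actual code: it assumes the `method` string has already been extracted from the JSON-RPC body, and the names `RequestClass`, `classify`, and `default_write_methods` are hypothetical. The default set mirrors the documented default of `routing.write_methods`.

```rust
use std::collections::HashSet;

/// Coarse request class used to pick a routing path (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RequestClass {
    Read,
    Write,
}

/// Assumed default, mirroring `routing.write_methods = ["sendTransaction"]`.
pub fn default_write_methods() -> HashSet<String> {
    HashSet::from(["sendTransaction".to_string()])
}

/// A method is a write only if it is listed in `write_methods`;
/// everything else, including `simulateTransaction`, routes as a read.
pub fn classify(method: &str, write_methods: &HashSet<String>) -> RequestClass {
    if write_methods.contains(method) {
        RequestClass::Write
    } else {
        RequestClass::Read
    }
}
```

Keeping the write list as data rather than a hard-coded match means a method such as `simulateTransaction` can be moved to the write path through configuration alone.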
Background tasks¶
Two independent background loops run per provider:
Health probe loop (default: every 2s)¶
Sends getSlot and getHealth to each provider, measures round-trip latency, tracks success/failure. Updates the rolling error rate and consecutive-failure counter. Triggers circuit state transitions.
Slot tracker loop (default: every 1s)¶
Polls each provider with getSlot commitment=processed and records the result. Computes the network tip (max slot across all providers) and the per-provider slot drift. Slot freshness is fed into the health score.
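The tip and drift computation described above is small enough to show directly. The sketch below assumes the latest polled slot per provider is available as a map keyed by provider name; the function name `slot_drift` and the map layout are illustrative, not taken from the proxy's source.

```rust
use std::collections::HashMap;

/// Given the latest slot reported by each provider, returns the network tip
/// (the maximum slot seen) and each provider's drift behind that tip.
/// Returns `None` when no provider has reported a slot yet.
pub fn slot_drift(slots: &HashMap<String, u64>) -> Option<(u64, HashMap<String, u64>)> {
    let tip = *slots.values().max()?;
    let drift = slots
        .iter()
        // `tip` is the maximum, so the subtraction cannot underflow.
        .map(|(name, slot)| (name.clone(), tip - slot))
        .collect();
    Some((tip, drift))
}
```

A provider reporting slot 97 while the tip is 100 has a drift of 3; the provider at the tip has a drift of 0.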
Health score¶
See Health scoring for details.
Routing strategies¶
See Routing for details on each strategy.
Write-path broadcast¶
Optional and off by default, in which case writes use the same strategy as reads. When routing.broadcast_writes = true, every method in routing.write_methods (default: sendTransaction) is sent to all healthy providers simultaneously, and the fastest successful response is returned. simulateTransaction is read-only and routes like a read unless it is added to write_methods. Providers can be scoped to specific methods (e.g. a submission-only landing service) via a per-provider methods allowlist. See Routing for details.
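The "send to all, return the fastest success" pattern can be sketched with threads and a channel. This is an illustration under assumptions: the real proxy presumably uses async I/O, and `ProviderCall` and `broadcast_first_success` are hypothetical names. Each provider call is modelled as a closure that returns either a response body or an error string.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for one outbound provider request.
pub type ProviderCall = Box<dyn FnOnce() -> Result<String, String> + Send + 'static>;

/// Runs every call on its own thread and returns the first `Ok` to arrive.
/// If every call fails, the last error received is returned.
pub fn broadcast_first_success(calls: Vec<ProviderCall>) -> Result<String, String> {
    let (tx, rx) = mpsc::channel();
    for call in calls {
        let tx = tx.clone();
        thread::spawn(move || {
            // The receiver may already be gone if another provider won;
            // a failed send is harmless, so it is ignored.
            let _ = tx.send(call());
        });
    }
    // Drop the original sender so the loop below ends once all workers finish.
    drop(tx);

    let mut last_err = Err("no healthy providers".to_string());
    for result in rx {
        match result {
            Ok(body) => return Ok(body),
            Err(e) => last_err = Err(e),
        }
    }
    last_err
}
```

Slower providers keep running after the winner returns and their results are discarded, which is acceptable for an idempotent submission such as a signed transaction.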
Retry logic¶
On a retryable error the proxy tries the next provider in the ordered list (up to max_retries, default 2). See Routing — Retries for the full error-code table.
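A rough sketch of the failover loop follows. It assumes that `max_retries` counts attempts after the first one (so the default of 2 allows up to three providers to be tried) and that non-retryable errors are returned to the caller immediately. `CallOutcome`, `is_retryable`, and `call_with_failover` are hypothetical names, and the retryable set covers only the cases named in this document (429, 5xx, timeout), not the full table in the Routing page.

```rust
/// Result of one provider call, reduced to what the retry loop needs.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum CallOutcome {
    Success(String),
    HttpError(u16),
    Timeout,
}

/// Retryable: HTTP 429, any 5xx status, or a timeout.
pub fn is_retryable(outcome: &CallOutcome) -> bool {
    match outcome {
        CallOutcome::Success(_) => false,
        CallOutcome::HttpError(status) => *status == 429 || (500..=599).contains(status),
        CallOutcome::Timeout => true,
    }
}

/// Walks the ordered provider list: one initial attempt plus up to
/// `max_retries` further attempts. Returns the first non-retryable outcome,
/// or the last retryable one if every attempt failed. `None` means the
/// provider list was empty.
pub fn call_with_failover<F>(
    ordered: &[&str],
    max_retries: usize,
    mut call: F,
) -> Option<CallOutcome>
where
    F: FnMut(&str) -> CallOutcome,
{
    let mut last = None;
    for &provider in ordered.iter().take(max_retries + 1) {
        let outcome = call(provider);
        if !is_retryable(&outcome) {
            return Some(outcome);
        }
        last = Some(outcome);
    }
    last
}
```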
Hot reload¶
The proxy watches the config file for changes (checked every 2 seconds). When the file is modified:
- Added providers — new health monitor and slot tracker tasks start immediately.
- Removed providers — their background tasks are stopped.
- Client-input changes — a provider whose url or http3 flag changed, or any provider when server.pool_max_idle_per_host changed, has its outbound client rebuilt (treated as remove + add, which resets health state).
- Unchanged providers — accumulated health history is preserved. A weight-only change applies live without a rebuild.
server.listen, server.metrics_listen, server.listen_backlog, and server.worker_threads require a restart; a warning is logged if one of them changes.
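The reload rules above amount to a diff between the old and new provider sets. The sketch below shows one way to express that diff; `ProviderConfig`, `ReloadAction`, and `plan_reload` are hypothetical, the `u32` weight type is an assumption, and only the fields mentioned in this section are modelled.

```rust
use std::collections::HashMap;

/// Only the fields this section mentions; the real config has more.
#[derive(Debug, Clone, PartialEq)]
pub struct ProviderConfig {
    pub url: String,
    pub http3: bool,
    pub weight: u32, // assumed integer weight
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ReloadAction {
    Start,        // added: start health monitor and slot tracker
    Stop,         // removed: stop background tasks
    Rebuild,      // client inputs changed: remove + add, health state reset
    UpdateWeight, // weight-only change: applied live, history preserved
    Keep,         // unchanged: history preserved
}

/// Decides what to do with each provider after the config file changes.
/// `pool_changed` stands for a change to `server.pool_max_idle_per_host`.
pub fn plan_reload(
    old: &HashMap<String, ProviderConfig>,
    new: &HashMap<String, ProviderConfig>,
    pool_changed: bool,
) -> HashMap<String, ReloadAction> {
    let mut plan = HashMap::new();
    for (name, next) in new {
        let action = match old.get(name) {
            None => ReloadAction::Start,
            Some(prev) if pool_changed || prev.url != next.url || prev.http3 != next.http3 => {
                ReloadAction::Rebuild
            }
            Some(prev) if prev.weight != next.weight => ReloadAction::UpdateWeight,
            Some(_) => ReloadAction::Keep,
        };
        plan.insert(name.clone(), action);
    }
    for name in old.keys() {
        if !new.contains_key(name) {
            plan.insert(name.clone(), ReloadAction::Stop);
        }
    }
    plan
}
```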
Key design constraints¶
Sub-millisecond routing. Health scores are pre-computed in background tasks and stored in a RwLock<Arc<...>> snapshot. The router reads the snapshot and sorts the N providers in memory (O(N log N) for a comparison sort); there is no I/O in the hot path.
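The snapshot pattern can be sketched with the standard library alone. This is an illustration, not the proxy's code: `ProviderScore`, `ScoreBoard`, and `ranked` are hypothetical names, and it assumes a higher score means a healthier provider. The point it shows is that the read lock is held only long enough to clone an `Arc`, so request handlers never wait on a background task doing network I/O.

```rust
use std::sync::{Arc, RwLock};

#[derive(Debug, Clone)]
pub struct ProviderScore {
    pub name: String,
    pub score: f64, // assumption: higher is healthier
}

/// Shared, immutable snapshot that background tasks replace wholesale.
pub struct ScoreBoard {
    current: RwLock<Arc<Vec<ProviderScore>>>,
}

impl ScoreBoard {
    pub fn new(initial: Vec<ProviderScore>) -> Self {
        Self { current: RwLock::new(Arc::new(initial)) }
    }

    /// Background side: swap in a freshly computed snapshot.
    pub fn publish(&self, scores: Vec<ProviderScore>) {
        *self.current.write().unwrap() = Arc::new(scores);
    }

    /// Hot path: take the read lock just long enough to clone the Arc.
    pub fn snapshot(&self) -> Arc<Vec<ProviderScore>> {
        Arc::clone(&self.current.read().unwrap())
    }
}

/// Orders provider names from highest to lowest score, purely in memory.
pub fn ranked(snapshot: &[ProviderScore]) -> Vec<&str> {
    let mut order: Vec<&ProviderScore> = snapshot.iter().collect();
    order.sort_by(|a, b| b.score.total_cmp(&a.score));
    order.into_iter().map(|p| p.name.as_str()).collect()
}
```

A request that took its snapshot before a `publish` keeps routing against the old, internally consistent view until it finishes.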
No unbounded state. Error rate uses a fixed sliding window. Slot history is a single atomic integer per provider. Memory usage is constant under sustained load.
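One way to get a fixed-memory error rate is a ring buffer over the most recent outcomes. The sketch below assumes a count-based window; the document does not say whether the real window is count-based or time-based, and `ErrorWindow` is a hypothetical name.

```rust
/// Error rate over the last `capacity` probe outcomes, in constant memory.
pub struct ErrorWindow {
    outcomes: Vec<bool>, // true = failure; allocated once, never grows
    next: usize,         // slot that the next outcome overwrites
    filled: usize,       // number of valid entries, at most capacity
    failures: usize,     // running count of failures inside the window
}

impl ErrorWindow {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "window capacity must be non-zero");
        Self { outcomes: vec![false; capacity], next: 0, filled: 0, failures: 0 }
    }

    /// Records one outcome, evicting the oldest once the window is full.
    pub fn record(&mut self, failed: bool) {
        if self.filled == self.outcomes.len() {
            // Window is full: the slot being overwritten holds the oldest outcome.
            if self.outcomes[self.next] {
                self.failures -= 1;
            }
        } else {
            self.filled += 1;
        }
        self.outcomes[self.next] = failed;
        if failed {
            self.failures += 1;
        }
        self.next = (self.next + 1) % self.outcomes.len();
    }

    /// Fraction of failures among recorded outcomes; 0.0 before any data.
    pub fn error_rate(&self) -> f64 {
        if self.filled == 0 {
            0.0
        } else {
            self.failures as f64 / self.filled as f64
        }
    }
}
```

Both `record` and `error_rate` run in constant time, and the buffer is allocated once in `new`, so memory stays flat regardless of request volume.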
No single point of failure. All panic paths are caught at the request handler level. One bad request or provider response cannot crash the proxy.
Zero mandatory infrastructure. Everything runs in-process. No external databases, caches, or service dependencies.