
Scaling Architecture

Mockarty is designed as a distributed system from the ground up. Whether you are running a single instance for local development or deploying dozens of nodes across datacenters, the same architecture applies — you just add more pieces.

This guide explains how the components fit together, how to scale them, and what to monitor once they are running.


Architecture Overview

Mockarty consists of four component types that communicate over HTTP and gRPC:

                         ┌─────────────────────────────────────┐
                         │           ADMIN NODE                │
                         │          (port 5770)                │
                         │                                     │
                         │  ┌───────────┐  ┌───────────────┐  │
                         │  │  Web UI   │  │  Coordinator  │  │
                         │  │  REST API │  │  gRPC :5773   │  │
                         │  └───────────┘  └───┬───────┬───┘  │
                         │         │            │       │      │
                         │  ┌──────▼──────┐     │       │      │
                         │  │ Composite   │     │       │      │
                         │  │ Repository  │     │       │      │
                         │  └──┬──────┬───┘     │       │      │
                         └─────┼──────┼─────────┼───────┼──────┘
                               │      │         │       │
                     ┌──────────▼┐  ┌──▼─────┐   │       │
                     │PostgreSQL │  │ Redis  │   │       │
                     │ (required)│  │(option)│   │       │
                     └───────────┘  └────────┘   │       │
                                                │       │
              ┌─────────────────────────────────┘       │
              │  gRPC registration + heartbeat          │
              │                                         │
    ┌─────────▼──────────┐              ┌───────────────▼────────────┐
    │  MOCK RESOLVER #1  │              │     RUNNER AGENT #1        │
    │    (port 5780)     │              │      (port 6770)           │
    │                    │              │                            │
    │  Lightweight HTTP  │   ...more    │  api_test, performance     │
    │  mock resolution   │   resolvers  │  capabilities              │
    └────────────────────┘              └────────────────────────────┘

    ┌────────────────────┐              ┌────────────────────────────┐
    │  MOCK RESOLVER #2  │              │     RUNNER AGENT #2        │
    │    (port 5781)     │              │      (port 6771)           │
    └────────────────────┘              └────────────────────────────┘

Component Roles

| Component | Default Port | Role |
|---|---|---|
| Admin Node | 5770 (HTTP), 5773 (gRPC) | The brain. Manages mocks, serves the UI, coordinates runners, runs migrations. There is exactly one admin node per deployment. |
| Mock Resolver | 5780+ | Lightweight nodes that handle incoming mock requests. They read mock definitions from the database (with caching) but never write. Run as many as you need. |
| Runner Agent | 6770+ | Distributed workers that execute API tests and performance tests. They register with the Coordinator over gRPC and pull tasks from the queue. |
| Coordinator | 5773 (gRPC, hosted by Admin) | A gRPC service embedded in the Admin Node. Runners and resolvers register here, receive tasks, and send heartbeats. |

Key insight: The Admin Node is the only component that writes to the database. Resolvers only read. This separation means you can scale read-heavy mock resolution independently of the admin workload.


How Mock Resolution Works

When a client sends a request to a resolver node, here is what happens:

  Client
    │
    ▼
┌────────────────────┐
│  MOCK RESOLVER     │
│                    │
│  1. Match route    │
│  2. Check cache ───┼──► Cache hit? Return immediately.
│  3. Read from DB ──┼──► Cache miss? Query PostgreSQL.
│  4. Evaluate       │
│     conditions     │
│  5. Render         │
│     response with  │
│     Faker/JsonPath │
│  6. Return         │
│     response       │
└────────────────────┘
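The six steps above can be sketched in a few lines. This is an illustrative model, not Mockarty's actual internals — the function and field names are hypothetical, and the cache and database are modeled as plain dicts:

```python
def resolve(request, cache, db):
    route = request["method"] + " " + request["path"]   # 1. match route
    mock = cache.get(route)                             # 2. check cache
    if mock is None:
        mock = db.get(route)                            # 3. read from DB on miss
        if mock is None:
            return {"status": 404}
        cache[route] = mock                             # warm the cache for next time
    for condition in mock.get("conditions", []):        # 4. evaluate conditions
        if not condition(request):
            return {"status": 404}
    return {"status": 200, "body": mock["body"]}        # 5-6. render and return
```

Step 5 (Faker/JsonPath rendering) is elided here; in the real resolver the stored body is a template, not a literal string.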

The Composite Repository Pattern

Every node (admin and resolver) uses a Composite Repository that layers three storage tiers:

  1. Ristretto (in-memory cache) — Microsecond lookups. Always available, no external dependencies. Holds recently-accessed mocks in a bounded LRU cache.
  2. Redis (optional) — Shared cache across all nodes. Sub-millisecond lookups. Useful when multiple resolvers need to see the same cached data.
  3. PostgreSQL (required) — Source of truth. All writes go here first. Reads fall through to PostgreSQL when caches miss.

The read path follows a read-through pattern:

Request → Ristretto → Redis → PostgreSQL
              ↓           ↓         ↓
           (hit?)      (hit?)    (always authoritative)
              │           │         │
              ▼           ▼         ▼
         Return +    Return +    Return +
         done       update       update both
                    Ristretto    Redis & Ristretto

Writes always go to PostgreSQL first, then update caches synchronously to prevent stale reads immediately after a write. The composite layer also handles serialization conflicts with automatic retries (up to 3 attempts) for high-concurrency scenarios.
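The read-through and write-through behavior can be sketched as follows. This is a hedged model with the three tiers as plain dicts (real Mockarty uses Ristretto, Redis, and PostgreSQL); the serialization-retry logic is omitted, and the class and method names are illustrative:

```python
class CompositeRepo:
    def __init__(self, l1, l2, db):
        # l1: in-process cache (Ristretto stand-in), l2: shared cache
        # (Redis stand-in, may be None), db: source of truth (PostgreSQL stand-in)
        self.l1, self.l2, self.db = l1, l2, db

    def get(self, key):
        if key in self.l1:                          # L1 hit: return immediately
            return self.l1[key]
        if self.l2 is not None and key in self.l2:  # L2 hit: promote to L1
            self.l1[key] = self.l2[key]
            return self.l1[key]
        value = self.db.get(key)                    # miss: authoritative read
        if value is not None:                       # update both cache tiers
            if self.l2 is not None:
                self.l2[key] = value
            self.l1[key] = value
        return value

    def put(self, key, value):
        self.db[key] = value                        # writes hit PostgreSQL first
        if self.l2 is not None:
            self.l2[key] = value                    # then caches, synchronously
        self.l1[key] = value
```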


Horizontal Scaling with Resolvers

Why Resolvers?

The Admin Node does a lot: it serves the UI, runs background jobs (cleanup, backups, scheduling), coordinates runners, and handles mock resolution. Under heavy load, mock resolution — which is the most frequent operation — can starve the admin functions.

Resolvers solve this by offloading mock resolution to dedicated, lightweight processes. Each resolver:

  • Handles only mock requests (HTTP, gRPC, GraphQL, SOAP, SSE, WebSocket)
  • Connects directly to PostgreSQL (read-only workload)
  • Optionally uses Redis for shared caching
  • Has its own Ristretto in-memory cache
  • Registers with the Coordinator for health tracking

Business value: Adding 3 resolver nodes lets you handle roughly 4x the mock traffic without touching the Admin Node. The admin stays responsive for UI operations, API management, and test orchestration.

When to Add Resolvers

| Symptom | Action |
|---|---|
| Mock response latency increasing under load | Add resolver nodes behind a load balancer |
| Admin UI becomes sluggish during load tests | Separate mock traffic to resolvers, keep admin for UI/API |
| Need geographic distribution | Deploy resolvers closer to consuming services |
| Want zero-downtime mock updates | Resolvers pick up changes from the DB; roll them without touching admin |

Example: 3 Resolvers Behind Nginx

# docker-compose.scaling.yml
version: "3.8"

services:
  postgres:
    image: postgres:16
    environment:
      POSTGRES_DB: mockarty
      POSTGRES_USER: mockarty
      POSTGRES_PASSWORD: secret
    ports:
      - "5432:5432"
    volumes:
      - pgdata:/var/lib/postgresql/data

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"

  admin:
    image: mockarty/admin:latest
    environment:
      DB_DSN: "postgres://mockarty:secret@postgres:5432/mockarty?sslmode=disable"
      CACHE_TYPE: redis
      REPO_REDIS_HOST: redis
      REPO_REDIS_PORT: "6379"
      HTTP_PORT: "5770"
      RUNNER_GRPC_PORT: "5773"
    ports:
      - "5770:5770"
      - "5773:5773"
    depends_on:
      - postgres
      - redis

  resolver-1:
    image: mockarty/resolver:latest
    environment:
      DB_DSN: "postgres://mockarty:secret@postgres:5432/mockarty?sslmode=disable"
      CACHE_TYPE: redis
      REPO_REDIS_HOST: redis
      REPO_REDIS_PORT: "6379"
      HTTP_PORT: "5780"
      COORDINATOR_ADDR: admin:5773
      API_TOKEN: "${RESOLVER_TOKEN}"
    depends_on:
      - admin

  resolver-2:
    image: mockarty/resolver:latest
    environment:
      DB_DSN: "postgres://mockarty:secret@postgres:5432/mockarty?sslmode=disable"
      CACHE_TYPE: redis
      REPO_REDIS_HOST: redis
      REPO_REDIS_PORT: "6379"
      HTTP_PORT: "5780"
      COORDINATOR_ADDR: admin:5773
      API_TOKEN: "${RESOLVER_TOKEN}"
    depends_on:
      - admin

  resolver-3:
    image: mockarty/resolver:latest
    environment:
      DB_DSN: "postgres://mockarty:secret@postgres:5432/mockarty?sslmode=disable"
      CACHE_TYPE: redis
      REPO_REDIS_HOST: redis
      REPO_REDIS_PORT: "6379"
      HTTP_PORT: "5780"
      COORDINATOR_ADDR: admin:5773
      API_TOKEN: "${RESOLVER_TOKEN}"
    depends_on:
      - admin

  nginx:
    image: nginx:alpine
    ports:
      - "8080:80"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    depends_on:
      - resolver-1
      - resolver-2
      - resolver-3

volumes:
  pgdata:

nginx.conf for load balancing:

events {
    worker_connections 1024;
}

http {
    upstream resolvers {
        least_conn;
        server resolver-1:5780;
        server resolver-2:5780;
        server resolver-3:5780;
    }

    server {
        listen 80;

        # Mock resolution traffic → resolvers
        location / {
            proxy_pass http://resolvers;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_connect_timeout 5s;
            proxy_read_timeout 30s;
        }

        # Health checks
        location /health {
            proxy_pass http://resolvers;
        }
    }
}

Your consuming services point at nginx:8080 for mock resolution, while developers access admin:5770 for the UI and API management.
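In practice that split looks like this (the /users/42 route is a hypothetical mock — substitute one of your own):

curl -s http://nginx:8080/users/42      # mock traffic, load-balanced across resolvers
curl -s http://admin:5770/health        # admin UI/API traffic, direct to the admin node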


Runner Agent Architecture

Runner Agents are distributed workers that execute long-running tasks such as API test collections and performance tests.

Capabilities

Each runner declares its capabilities when it registers:

| Capability | What it runs |
|---|---|
| api_test | API test collections, scheduled test suites |
| performance | Performance/load test scripts |

A runner can have multiple capabilities. Set them via the CAPABILITIES environment variable:

CAPABILITIES="api_test,performance"

Shared vs Namespace Runners

Runners can operate in two scopes:

  • Shared runners (scope: admin) — Accept tasks from any namespace. Created with admin-scoped integration tokens. Best for shared infrastructure.
  • Namespace runners — Accept tasks only from their assigned namespace. Created with namespace-scoped integration tokens. Best for team isolation.

  Admin Node (Coordinator)
       │
       ├── Shared Runner A ──► handles tasks from ALL namespaces
       ├── Shared Runner B ──► handles tasks from ALL namespaces
       │
       ├── Team-Alpha Runner ──► handles tasks from "alpha" namespace only
       └── Team-Beta Runner  ──► handles tasks from "beta" namespace only

Task Dispatching Flow

1. User triggers test run (UI or API)
       │
2. Admin Node creates task in DB
       │
3. Coordinator assigns task to a runner
   (matching capabilities + namespace scope)
       │
4. Runner pulls task via gRPC stream
       │
5. Runner executes, sends progress updates
   (real-time via gRPC → SSE to browser)
       │
6. Runner reports results back to Coordinator
       │
7. Results stored in DB, visible in UI
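The matching rule in step 3 can be sketched as a small function. Field names here are illustrative, not Mockarty's actual schema — the point is that a runner must have the task's capability, and must either be shared or belong to the task's namespace:

```python
def pick_runner(task, runners):
    for r in runners:
        if task["capability"] not in r["capabilities"]:
            continue                                   # runner cannot execute this task type
        if not r["shared"] and r["namespace"] != task["namespace"]:
            continue                                   # namespace runner, wrong namespace
        return r["name"]
    return None  # no match: the task stays queued until a suitable runner appears
```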

Runner Agent Configuration

# Required
COORDINATOR_ADDR=admin-node:5773    # gRPC address of the Coordinator
API_TOKEN=mki_xxxxx                 # Integration token (mki_* format)
RUNNER_NAME=runner-1                # Unique name for this runner

# Optional
CAPABILITIES=api_test,performance   # What this runner can do
SHARED=true                         # Accept tasks from all namespaces
NAMESPACE=team-alpha                # Only if SHARED=false
MAX_CONCURRENT=5                    # Max parallel tasks (default: varies)
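Putting those variables together, launching a shared runner might look like this (the mockarty/runner image name and the token are placeholders — check your registry for the actual image):

docker run -d --name runner-1 \
  -e COORDINATOR_ADDR=admin-node:5773 \
  -e API_TOKEN=mki_xxxxx \
  -e RUNNER_NAME=runner-1 \
  -e CAPABILITIES=api_test,performance \
  -e SHARED=true \
  mockarty/runner:latest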

Heartbeats and Fault Tolerance

Runners send heartbeats to the Coordinator every few seconds. If a runner stops responding for longer than the heartbeat timeout (RUNNER_HEARTBEAT_TIMEOUT, default 30s):

  1. The Coordinator marks it as offline after the heartbeat timeout
  2. Any in-progress tasks are re-queued for other runners
  3. When the runner comes back, it re-registers automatically

Task timeout defaults to 30 minutes (RUNNER_TASK_TIMEOUT), preventing stuck tasks from blocking the queue.
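Steps 1-2 of that recovery amount to a timeout sweep. A minimal sketch, assuming in-memory runner records (real Mockarty drives this from gRPC heartbeats and the task queue in PostgreSQL):

```python
HEARTBEAT_TIMEOUT = 30.0  # seconds, mirroring the RUNNER_HEARTBEAT_TIMEOUT default

def reap_offline(runners, queue, now):
    for r in runners:
        if r["online"] and now - r["last_heartbeat"] > HEARTBEAT_TIMEOUT:
            r["online"] = False       # step 1: mark offline after the timeout
            queue.extend(r["tasks"])  # step 2: re-queue its in-progress tasks
            r["tasks"] = []
    return queue
```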


Database and Cache Tiers

PostgreSQL — The Source of Truth

PostgreSQL is required for any production deployment. It stores:

  • All mock definitions and their conditions
  • Store data (Global, Chain, Mock stores)
  • API test collections, results, and schedules
  • User accounts, sessions, RBAC policies
  • Audit logs and webhook configurations
  • Runner task queue and results

Recommended version: PostgreSQL 14+ (for improved JSON performance and query optimization).

Redis — Shared Cache Layer

Redis is optional but recommended for multi-node deployments. When enabled (CACHE_TYPE=redis):

  • All resolver nodes share the same cache, reducing redundant DB queries
  • Cache invalidation propagates automatically across nodes
  • Mock resolution latency drops to sub-millisecond for cached mocks

Configuration:

CACHE_TYPE=redis
REPO_REDIS_HOST=redis
REPO_REDIS_PORT=6379
REPO_REDIS_PASSWORD=secret    # if auth is enabled

Ristretto — In-Memory Fallback

Every node always has a Ristretto in-memory cache, regardless of Redis configuration. This provides:

  • Zero-latency lookups for hot mocks (microseconds)
  • No external dependency — works even if Redis is down
  • Bounded memory usage with LRU eviction

When Redis is also configured, Ristretto acts as L1 cache and Redis as L2:

Request → Ristretto (L1, in-process) → Redis (L2, shared) → PostgreSQL (L3, persistent)

Choosing Your Cache Strategy

| Deployment | CACHE_TYPE | Why |
|---|---|---|
| Single node, dev/test | inmemory (default) | No Redis needed. Ristretto handles everything. |
| Multiple resolvers | redis | Shared cache prevents each resolver from independently warming its cache. |
| Air-gapped / minimal deps | inmemory | Fewer moving parts. Acceptable if resolver count is low. |

Deployment Patterns

Pattern 1: Single Node (Development / Small Teams)

┌─────────────────────┐
│     Admin Node      │
│    SQLite or PG     │
│   (port 5770)       │
│                     │
│  UI + API + Mocks   │
└─────────────────────┘

# Minimal start with SQLite
DB_USE=sqlite ./mockarty

  • When: Local development, demos, small teams (< 5 people), < 100 mocks
  • Pros: Zero infrastructure, single binary, instant startup
  • Cons: No horizontal scaling, SQLite limitations for concurrent writes

Pattern 2: Small Team (PostgreSQL, Admin + 1 Resolver)

┌──────────┐     ┌──────────┐
│  Admin   │     │ Resolver │
│  :5770   │     │  :5780   │
└────┬─────┘     └────┬─────┘
     │                │
     └───────┬────────┘
             │
       ┌──────▼──────┐
       │ PostgreSQL  │
       └─────────────┘

  • When: Team of 5-20, moderate mock traffic, want admin UI to stay fast
  • Pros: Admin offloaded from mock traffic, easy to set up
  • Cons: Resolver is a single point of failure for mock resolution

Pattern 3: Medium (PostgreSQL + Redis, 2 Resolvers, 1 Runner)

         ┌──────────┐
         │  Nginx   │
         │  :8080   │
         └────┬─────┘
              │
     ┌────────┼────────┐
     │        │        │
┌────▼───┐ ┌─▼──────┐ │
│Resolver│ │Resolver│ │
│  #1    │ │  #2    │ │
└───┬────┘ └───┬────┘ │
    │          │      │
    └────┬─────┘      │
         │            │
  ┌──────▼──────┐     │
  │    Redis    │     │
  └──────┬──────┘     │
         │            │
  ┌──────▼──────┐  ┌──▼────────┐
  │ PostgreSQL  │  │  Admin    │──── Runner Agent
  └─────────────┘  │  :5770    │    (api_test +
                   └───────────┘     performance)

  • When: 20-100 users, hundreds of mocks, automated test pipelines
  • Pros: Redundant resolvers, shared Redis cache, distributed test execution
  • Recommended hardware: 2 CPU / 4 GB RAM per resolver, 4 CPU / 8 GB RAM for admin

Pattern 4: Large / Enterprise

                    ┌───────────────┐
                    │ Load Balancer │
                    │   (L7/L4)     │
                    └───────┬───────┘
                            │
         ┌──────────────────┼──────────────────┐
         │                  │                  │
    ┌────▼───┐        ┌────▼───┐        ┌────▼───┐
    │Resolver│        │Resolver│        │Resolver│  ... N resolvers
    │  #1    │        │  #2    │        │  #N    │
    └───┬────┘        └───┬────┘        └───┬────┘
        │                 │                 │
        └────────┬────────┴────────┬────────┘
                 │                 │
          ┌──────▼──────┐  ┌──────▼──────┐
          │ Redis       │  │ Redis       │
          │ Primary     │  │ Replica     │
          └──────┬──────┘  └─────────────┘
                 │
          ┌──────▼──────┐
          │ PostgreSQL  │
          │ Primary     │──── Read Replicas (for resolvers)
          └─────────────┘

    ┌───────────┐    ┌──────────┐    ┌──────────┐
    │ Admin     │    │ Runner   │    │ Runner   │  ... M runners
    │ :5770     │    │ Agent #1 │    │ Agent #M │
    │ :5773     │    │ shared   │    │ ns:beta  │
    └───────────┘    └──────────┘    └──────────┘

  • When: 100+ users, thousands of mocks, multi-team namespaces, SLA requirements
  • Infra: PostgreSQL HA (primary + replicas), Redis Sentinel or Cluster, N resolvers behind L7 load balancer, M runner agents (shared + per-namespace)
  • Pros: Fault tolerant, independently scalable tiers, namespace isolation

Network Topology

Ports and Protocols

| Component | Port | Protocol | Direction | Purpose |
|---|---|---|---|---|
| Admin Node | 5770 | HTTP/HTTPS | Inbound | Web UI, REST API, mock resolution |
| Coordinator | 5773 | gRPC | Inbound | Runner/resolver registration, task dispatch |
| Resolver | 5780+ | HTTP/HTTPS | Inbound | Mock resolution only |
| Runner Agent | 6770+ | HTTP | Inbound (optional) | Runner dashboard (monitoring) |
| PostgreSQL | 5432 | TCP | Internal | Database connections from admin + resolvers |
| Redis | 6379 | TCP | Internal | Cache connections from admin + resolvers |

TLS Between Components

The Coordinator (gRPC) supports TLS for runner-to-admin communication:

# Admin Node (Coordinator TLS)
RUNNER_GRPC_TLS_ENABLED=true
RUNNER_GRPC_TLS_CERT_FILE=/path/to/server.crt
RUNNER_GRPC_TLS_KEY_FILE=/path/to/server.key
RUNNER_GRPC_TLS_DIR=.mockarty/tls

# Optional: mTLS (mutual TLS) — require client certificates
RUNNER_GRPC_TLS_CLIENT_CA_CERT=/path/to/ca.crt

If RUNNER_GRPC_TLS_ENABLED=true but no cert/key files are specified, Mockarty auto-generates a self-signed certificate in the TLS directory.

For production deployments, use proper certificates signed by your organization’s CA, especially if runners communicate over untrusted networks.
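If you need certificates before a CA is available (for example in staging), a self-signed CA and server certificate can be generated with openssl. The subject names below are illustrative; production gRPC clients typically also require a subjectAltName matching the Coordinator's hostname:

```shell
# Create a throwaway CA (illustrative subject names)
openssl req -x509 -newkey rsa:2048 -nodes -keyout ca.key -out ca.crt \
  -days 365 -subj "/CN=mockarty-ca"

# Create a server key + CSR for the Coordinator
openssl req -newkey rsa:2048 -nodes -keyout server.key -out server.csr \
  -subj "/CN=admin-node"

# Sign the server certificate with the CA
openssl x509 -req -in server.csr -CA ca.crt -CAkey ca.key \
  -CAcreateserial -out server.crt -days 365
```

Point RUNNER_GRPC_TLS_CERT_FILE and RUNNER_GRPC_TLS_KEY_FILE at server.crt and server.key, and distribute ca.crt to runners if they verify the server.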

Firewall Rules (Minimum)

Admin  → PostgreSQL :5432  (required)
Admin  → Redis      :6379  (if CACHE_TYPE=redis)

Resolver → PostgreSQL :5432  (required, read-only workload)
Resolver → Redis      :6379  (if CACHE_TYPE=redis)
Resolver → Admin      :5773  (gRPC registration)

Runner → Admin :5773  (gRPC, task dispatch + heartbeats)

Clients → Resolver :5780  (mock requests, via load balancer)
Developers → Admin :5770  (UI and API management)

Capacity Planning

These are rough guidelines based on typical mock workloads. Actual numbers depend on mock complexity (number of conditions, Faker functions, store lookups, response size).

Mock Resolution Throughput

| Setup | Approx. requests/sec | Notes |
|---|---|---|
| 1 Admin Node (no resolver) | ~2,000 | Adequate for development and small teams |
| 1 Resolver (no Redis) | ~5,000 | Ristretto cache handles most reads |
| 1 Resolver + Redis | ~7,000 | Redis prevents cold-cache penalties |
| 3 Resolvers + Redis + Nginx | ~20,000 | Near-linear scaling with least_conn |
| 10 Resolvers + Redis + LB | ~60,000+ | Production-grade for large organizations |

Sizing Guidelines

| Component | CPU | RAM | Disk | Notes |
|---|---|---|---|---|
| Admin Node | 2-4 cores | 4-8 GB | 20 GB | More CPU if many background jobs |
| Resolver | 1-2 cores | 2-4 GB | Minimal | Stateless; scale horizontally |
| Runner Agent | 2-4 cores | 4-8 GB | 10 GB | More for performance tests |
| PostgreSQL | 2-8 cores | 8-32 GB | SSD | Size based on mock count + history |
| Redis | 1-2 cores | 2-8 GB | — | Size based on active mock count |

When to Scale

| Metric | Threshold | Action |
|---|---|---|
| Mock response p95 | > 100ms | Add resolvers or enable Redis |
| Admin UI response | > 2s | Move mock traffic to resolvers |
| DB connection pool | Exhaustion | Add PgBouncer or increase max_connections |
| Runner task queue | Growing | Add runner agents |
| Redis memory | > 80% | Increase Redis memory or review TTLs |

Health Monitoring

The /health Endpoint

Every component exposes a /health endpoint that returns detailed status:

curl -s http://localhost:5770/health | jq .
{
  "status": "pass",
  "releaseId": "1.2.3",
  "uptime": "72h15m30s",
  "system": {
    "goVersion": "go1.24.1",
    "goroutines": 142,
    "cpus": 4,
    "memAllocMb": "85.3",
    "memSysMb": "210.7"
  },
  "components": {
    "database": {
      "status": "up",
      "latency": "1.2ms"
    },
    "redis": {
      "status": "up",
      "latency": "0.3ms"
    },
    "scheduler": {
      "status": "up"
    },
    "coordinator": {
      "status": "up"
    }
  }
}

The status field is "pass" when all critical components are healthy, or "fail" if any required component (like the database) is down. Non-critical components (like Redis) report "not_configured" when disabled.
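That aggregation rule can be expressed in a few lines. The split between critical and optional components below is an assumption based on the description above, not Mockarty's exact list:

```python
# Components whose failure makes the whole node unhealthy (assumed set)
CRITICAL = {"database", "scheduler", "coordinator"}

def overall_status(components):
    for name, comp in components.items():
        if name in CRITICAL and comp["status"] != "up":
            return "fail"   # a required component is down
    return "pass"           # optional components may be "not_configured"
```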

Prometheus Metrics

Mockarty exposes Prometheus-compatible metrics at /metrics:

curl http://localhost:5770/metrics

Key metrics to monitor:

| Metric | What it tells you |
|---|---|
| http_request_duration_seconds | Mock resolution latency distribution |
| http_requests_total | Request count by route, method, status |
| db_query_duration_seconds | Database query performance |
| cache_hit_ratio | Effectiveness of Ristretto/Redis caching |
| runner_active_tasks | Number of tasks currently executing |
| runner_queue_depth | Number of tasks waiting for a runner |
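Two example queries over these metrics (label names beyond le are deployment-specific assumptions — adjust to your scrape config):

```promql
# p95 mock resolution latency over the last 5 minutes
histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) by (le))

# request rate broken down by HTTP status
sum(rate(http_requests_total[5m])) by (status)
```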

Alerting Recommendations

# Example Prometheus alerting rules
groups:
  - name: mockarty
    rules:
      - alert: MockartyDown
        expr: up{job="mockarty-admin"} == 0
        for: 1m
        labels:
          severity: critical

      - alert: HighMockLatency
        expr: histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) by (le)) > 0.5
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Mock response p95 exceeds 500ms — consider adding resolvers"

      - alert: DatabaseSlow
        expr: db_query_duration_seconds{quantile="0.99"} > 1
        for: 5m
        labels:
          severity: warning

      - alert: RunnerQueueBacklog
        expr: runner_queue_depth > 10
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Task queue growing — consider adding runner agents"

Best Practices

1. Separate Mock Traffic from Admin Traffic

The single most impactful scaling decision: route your service-under-test traffic to dedicated resolver nodes, not the admin. The admin should only handle UI access and API management.

2. Start Simple, Scale When Needed

Dev       →  Single node with SQLite
Staging   →  Admin + PostgreSQL + 1 Resolver
Production →  Admin + PostgreSQL + Redis + 2+ Resolvers + Load Balancer

Do not over-engineer. A single admin node with PostgreSQL handles thousands of requests per second. Add resolvers only when you observe latency or throughput issues.

3. Use Redis for Multi-Resolver Deployments

Without Redis, each resolver warms its own Ristretto cache independently. With Redis, a mock cached by resolver #1 is immediately available to resolver #2. This matters most during cold starts and after mock updates.

4. Pin Resolver Versions to Admin Version

All components share a unified version. When upgrading, update the admin node first (it runs migrations), then roll resolvers and runners. Never run resolvers on a newer version than the admin.

5. Use Namespace Runners for Team Isolation

In multi-team environments, give each team a namespace-scoped runner. This prevents one team’s expensive performance test from blocking another team’s API test suite.

6. Monitor the Task Queue

A growing task queue means your runners cannot keep up. Either add more runner agents or review whether tests are hanging (check RUNNER_TASK_TIMEOUT).

7. Use Connection Pooling for PostgreSQL

In large deployments with many resolvers, each resolver opens its own connection pool. Use PgBouncer between resolvers and PostgreSQL to multiplex connections and avoid hitting max_connections.
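A minimal PgBouncer sketch for that topology (all values illustrative — tune pool sizes to your resolver count and PostgreSQL's max_connections):

```ini
[databases]
mockarty = host=postgres port=5432 dbname=mockarty

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
default_pool_size = 20
max_client_conn = 500
```

Resolvers then point their DB_DSN at pgbouncer:6432 instead of PostgreSQL directly.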

8. Scale Resolvers Before Adding CPU to Admin

If mock latency is the problem, adding CPU to the admin node gives diminishing returns because the admin does many things. A dedicated resolver uses all its resources for mock resolution. Two small resolvers outperform one large admin node for mock traffic.