Architecture Overview¶

This document describes the high-level architecture of Routstr Core, helping contributors understand how the system works.

System Overview¶

Routstr Core is a FastAPI-based reverse proxy that adds Bitcoin micropayments to OpenAI-compatible APIs.

graph TB
    subgraph "External Services"
        Client[API Client]
        Mint[Cashu Mint]
        Provider[AI Provider]
        Nostr[Nostr Relays]
    end

    subgraph "Routstr Core"
        API[FastAPI Server]
        Auth[Auth Module]
        Payment[Payment Module]
        Proxy[Proxy Module]
        DB[(SQLite DB)]

        API --> Auth
        Auth --> Payment
        Auth --> DB
        Payment --> Proxy
        Proxy --> Provider
        Payment --> Mint
        API --> Nostr
    end

    Client --> API

Core Components¶

FastAPI Application¶

The main application is initialized in routstr/core/main.py:

Lifespan Management: Handles startup/shutdown tasks
Middleware: CORS, logging, error handling
Routers: Modular endpoint organization
Background Tasks: Price updates, automatic payouts

Authentication System¶

The authentication system lives in routstr/auth.py and handles:

API Key Validation: Hashed key storage and lookup
Balance Checking: Ensures sufficient funds
Rate Limiting: Optional request throttling
Token Redemption: Converts eCash to balance

Payment Processing¶

The routstr/payment/ module manages:

Cost Calculation: Token-based or fixed pricing
Model Pricing: Dynamic pricing from models.json
Currency Conversion: BTC/USD rate management
Fee Application: Exchange and provider fees
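
As a rough illustration of how token-based pricing, currency conversion, and fees could combine, the sketch below prices one request in millisatoshis. The `ModelPricing` fields, the single `fee_rate` parameter, and the function name are assumptions made for this example, not Routstr's actual pricing code.

```python
from dataclasses import dataclass
from decimal import Decimal, ROUND_CEILING

MSATS_PER_BTC = Decimal(100_000_000_000)  # 1 BTC = 1e8 sats = 1e11 msats


@dataclass(frozen=True)
class ModelPricing:
    """Hypothetical per-model prices in USD per 1,000 tokens."""
    usd_per_1k_input: Decimal
    usd_per_1k_output: Decimal


def request_cost_msats(
    pricing: ModelPricing,
    input_tokens: int,
    output_tokens: int,
    btc_usd: Decimal,
    fee_rate: Decimal = Decimal("0.01"),  # assumed combined exchange + provider fee
) -> int:
    """Return the cost of one request in msats, rounded up to a whole msat."""
    usd = (
        pricing.usd_per_1k_input * input_tokens
        + pricing.usd_per_1k_output * output_tokens
    ) / 1000
    usd_with_fees = usd * (1 + fee_rate)
    msats = usd_with_fees / btc_usd * MSATS_PER_BTC
    return int(msats.to_integral_value(rounding=ROUND_CEILING))
```

Using `Decimal` and rounding up keeps the proxy from undercharging through float error; whether the real module rounds this way is not stated here.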

Request Proxying¶

routstr/proxy.py handles:

Request Forwarding: Preserves headers and body
Response Streaming: Efficient memory usage
Usage Tracking: Counts tokens and costs
Error Handling: Graceful upstream failures
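
One way to read "preserves headers" is that the proxy forwards client headers except those that only make sense for a single connection or that carry the client's own credentials. The sketch below assumes the client's Routstr key is replaced by a provider key before forwarding; `build_upstream_headers` and the exact header set are hypothetical.

```python
# Connection-specific (hop-by-hop) headers that a proxy should not forward.
HOP_BY_HOP = frozenset({
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "transfer-encoding", "upgrade",
})

# Headers this sketch rewrites instead of forwarding (an assumption).
REWRITTEN = frozenset({"host", "authorization"})


def build_upstream_headers(
    client_headers: dict[str, str], upstream_api_key: str
) -> dict[str, str]:
    """Copy end-to-end headers and swap in the provider's credentials."""
    forwarded = {
        name: value
        for name, value in client_headers.items()
        if name.lower() not in HOP_BY_HOP | REWRITTEN
    }
    forwarded["Authorization"] = f"Bearer {upstream_api_key}"
    return forwarded
```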

Database Layer¶

The database layer uses SQLModel and is defined in routstr/core/db.py:

# Core models
APIKey:
  - id: Primary key
  - key_hash: Hashed API key
  - balance: Current balance (msats)
  - created_at: Timestamp
  - metadata: JSON field

Transaction:
  - id: Primary key
  - api_key_id: Foreign key
  - amount: Transaction amount
  - type: deposit/usage/withdrawal
  - timestamp: When occurred

Request Flow¶

Standard API Request¶

sequenceDiagram
    participant C as Client
    participant R as Routstr
    participant D as Database
    participant P as AI Provider

    C->>R: API Request + Key
    R->>D: Validate Key
    D-->>R: Key Info + Balance
    R->>R: Check Balance
    R->>P: Forward Request
    P-->>R: AI Response
    R->>D: Deduct Cost
    R-->>C: Return Response
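
The sequence above can be sketched with in-memory stand-ins for the database and the provider. Everything here (the `BALANCES_MSATS` dict, `call_provider`, the flat per-token price) is a placeholder chosen so the example runs on its own, not the project's real code.

```python
import asyncio
import hashlib


class InsufficientBalance(Exception):
    """Raised when a key has no funds left."""


# Stand-in for the database: key hash -> balance in msats.
BALANCES_MSATS = {hashlib.sha256(b"sk-demo").hexdigest(): 5_000}


async def call_provider(payload: dict) -> dict:
    await asyncio.sleep(0)  # placeholder for the upstream HTTP call
    return {"choices": [{"text": "hi"}], "usage": {"total_tokens": 10}}


async def handle(api_key: str, payload: dict, msats_per_token: int = 100) -> dict:
    key_hash = hashlib.sha256(api_key.encode()).hexdigest()
    balance = BALANCES_MSATS.get(key_hash)       # validate key
    if balance is None:
        raise PermissionError("unknown API key")
    if balance <= 0:                             # check balance
        raise InsufficientBalance(key_hash)
    response = await call_provider(payload)      # forward request
    cost = response["usage"]["total_tokens"] * msats_per_token
    BALANCES_MSATS[key_hash] = balance - cost    # deduct cost
    return response                              # return response
```

The read-then-write on the balance is not safe under concurrency; a real implementation would need the deduction to be atomic.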

Payment Flow¶

sequenceDiagram
    participant C as Client
    participant R as Routstr
    participant W as Wallet Module
    participant M as Cashu Mint
    participant D as Database

    C->>R: Create Key Request + Token
    R->>W: Validate Token
    W->>M: Verify with Mint
    M-->>W: Token Valid
    W-->>R: Token Amount
    R->>D: Create Key + Balance
    R-->>C: Return API Key
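
A compact sketch of the same flow, with the wallet and mint collapsed into a `redeem` callable that returns the token's value in msats. The callable, the `sk-` prefix, and the dict used as a store are assumptions for illustration; the actual wallet interface is not shown in this document.

```python
import hashlib
import secrets
from typing import Callable


def create_key_from_token(
    token: str,
    redeem: Callable[[str], int],
    store: dict[str, int],
) -> str:
    """Redeem an eCash token and create an API key funded with its value."""
    amount_msats = redeem(token)  # wallet validates the token with the mint
    if amount_msats <= 0:
        raise ValueError("token has no redeemable value")
    api_key = "sk-" + secrets.token_urlsafe(32)
    # Only the hash is stored; the plaintext key is returned to the client once.
    store[hashlib.sha256(api_key.encode()).hexdigest()] = amount_msats
    return api_key
```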

Key Design Decisions¶

1. Async Architecture¶

Request handling is async end to end, so time spent waiting on the database, the mint, or the upstream provider does not block other requests:

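import asyncio
from typing import AsyncIterator

from fastapi import Request

# Because it yields, this is an async generator, not a function returning Response
async def handle_request(request: Request) -> AsyncIterator[bytes]:
    # Non-blocking database query
    api_key = await get_api_key(request.headers["Authorization"])

    # Run independent checks concurrently (assumed here to raise on failure)
    await asyncio.gather(
        check_balance(api_key),
        check_rate_limit(api_key),
    )

    # Stream the upstream response chunk by chunk without buffering it
    async for chunk in proxy_request(request):
        yield chunk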

2. Modular Design¶

Components are loosely coupled:

Routers: Separate files for different endpoints
Dependencies: Injected via FastAPI's DI system
Models: Shared data structures
Services: Business logic separated from routes

3. Error Handling¶

Errors degrade gracefully and map to structured responses with clear messages:

class RoutstrError(Exception):
    """Base exception with structured error response"""
    status_code: int = 500
    error_type: str = "internal_error"

class InsufficientBalanceError(RoutstrError):
    status_code = 402
    error_type = "insufficient_balance"
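
To show how exceptions carrying `status_code` and `error_type` could become a structured response, here is a minimal, framework-free sketch. The `to_error_response` helper and the JSON shape are assumptions; the exception class is repeated only so the block runs on its own.

```python
import json


class InsufficientBalanceError(Exception):
    """Stand-in for the exception defined above."""
    status_code = 402
    error_type = "insufficient_balance"


def to_error_response(exc: Exception) -> tuple[int, str]:
    """Map an exception to (HTTP status, JSON body)."""
    if hasattr(exc, "status_code"):
        status = exc.status_code
        error = {"type": exc.error_type, "message": str(exc)}
    else:
        # Unknown errors get a generic message so internals are not leaked.
        status = 500
        error = {"type": "internal_error", "message": "Internal server error"}
    return status, json.dumps({"error": error})
```

In a FastAPI app this mapping would typically live in an exception handler registered on the application.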

4. Database Migrations¶

Schema changes are managed with Alembic:

Auto-migrations on startup
Version control for schema changes
Rollback capability
Zero-downtime updates

Security Architecture¶

API Key Security¶

Storage: SHA-256 hashed keys
Generation: Cryptographically secure random
Validation: Constant-time comparison
Rotation: Support for key expiry
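
The first three points can be illustrated with the standard library alone. This is a sketch under assumptions: the `sk-` prefix, the key length, and the function names are not taken from the Routstr code base.

```python
import hashlib
import hmac
import secrets


def generate_api_key() -> str:
    """Create a key from a cryptographically secure random source."""
    return "sk-" + secrets.token_urlsafe(32)


def hash_api_key(api_key: str) -> str:
    """Return the SHA-256 hex digest that would be stored instead of the key."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


def verify_api_key(presented_key: str, stored_hash: str) -> bool:
    """Compare digests in constant time to avoid timing side channels."""
    return hmac.compare_digest(hash_api_key(presented_key), stored_hash)
```

Because API keys are long random strings rather than human-chosen passwords, a fast hash such as SHA-256 is a common choice here.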

Payment Security¶

Token Validation: Cryptographic verification
Double-Spend Prevention: Mint verification
Balance Protection: Atomic transactions
Audit Trail: All transactions logged
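
Balance protection and the audit trail can be combined in one database transaction. The sketch below uses the standard `sqlite3` module rather than SQLModel, and the table names `api_key` and `tx` are hypothetical; it shows a conditional update that cannot overdraw a balance.

```python
import sqlite3

SCHEMA = """
CREATE TABLE api_key (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL);
CREATE TABLE tx (
    id INTEGER PRIMARY KEY,
    api_key_id INTEGER NOT NULL REFERENCES api_key(id),
    amount INTEGER NOT NULL,
    type TEXT NOT NULL
);
"""


def deduct(conn: sqlite3.Connection, key_id: int, cost_msats: int) -> bool:
    """Deduct the cost only if the balance covers it, and log the usage."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE api_key SET balance = balance - ? "
            "WHERE id = ? AND balance >= ?",
            (cost_msats, key_id, cost_msats),
        )
        if cur.rowcount != 1:
            return False  # unknown key or insufficient funds; nothing changed
        conn.execute(
            "INSERT INTO tx (api_key_id, amount, type) VALUES (?, ?, 'usage')",
            (key_id, cost_msats),
        )
        return True
```

Putting the balance check inside the `UPDATE` avoids a separate read-then-write step that two concurrent requests could interleave.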

Network Security¶

HTTPS: Enforced in production
CORS: Configurable origins
Rate Limiting: Per-key limits
Input Validation: Pydantic models
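
Per-key rate limiting is often implemented as a token bucket. The document does not say which algorithm Routstr uses, so the following is only one plausible sketch; the class, the defaults, and the in-process `buckets` dict are assumptions.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TokenBucket:
    rate: float      # tokens refilled per second
    capacity: float  # maximum burst size
    tokens: float = field(init=False)
    updated: float = field(init=False)

    def __post_init__(self) -> None:
        self.tokens = self.capacity
        self.updated = time.monotonic()

    def allow(self, now: Optional[float] = None) -> bool:
        """Refill based on elapsed time, then spend one token if available."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


buckets: dict[str, TokenBucket] = {}


def check_rate_limit(key_id: str, rate: float = 5.0, capacity: float = 10.0) -> bool:
    """Look up (or create) the bucket for this key and try to spend a token."""
    bucket = buckets.setdefault(key_id, TokenBucket(rate, capacity))
    return bucket.allow()
```

An in-process dict only works for a single instance; with several instances behind a load balancer the counters would need shared storage.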

Performance Considerations¶

Caching Strategy¶

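from functools import lru_cache
from typing import Optional

from cachetools import TTLCache

# Model pricing cache
@lru_cache(maxsize=100)
def get_model_price(model_id: str) -> Optional[ModelPrice]:
    # Returns None for unknown models; note that lru_cache caches that None too
    return MODELS.get(model_id)

# Balance cache: entries expire 60 seconds after insertion
balance_cache = TTLCache(maxsize=1000, ttl=60)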

Database Optimization¶

Connection Pooling: Reuse connections
Indexed Queries: Key lookups use an index, which is O(log n) with SQLite's B-tree indexes
Batch Operations: Group updates
Async I/O: Non-blocking queries

Streaming Responses¶

Efficient memory usage for large responses:

async def stream_response(upstream_response):
    async for chunk in upstream_response.aiter_bytes():
        # Process chunk without loading full response
        yield process_chunk(chunk)
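
Streaming and usage tracking can share one pass over the data. The sketch below wraps any async byte stream, forwards each chunk unchanged, and reports the total size once the stream ends. `fake_upstream`, `stream_and_count`, and the `on_done` callback are hypothetical names, and counting bytes stands in for counting tokens.

```python
import asyncio
from typing import AsyncIterator, Callable


async def fake_upstream() -> AsyncIterator[bytes]:
    """Stand-in for an upstream streaming response."""
    for chunk in (b"data: hello\n\n", b"data: world\n\n", b"data: [DONE]\n\n"):
        await asyncio.sleep(0)
        yield chunk


async def stream_and_count(
    upstream: AsyncIterator[bytes],
    on_done: Callable[[int], None],
) -> AsyncIterator[bytes]:
    """Forward chunks as they arrive; report the byte total when finished."""
    total = 0
    try:
        async for chunk in upstream:
            total += len(chunk)
            yield chunk
    finally:
        # Also runs if the consumer stops early and the generator is closed.
        on_done(total)


async def collect() -> tuple[bytes, int]:
    """Consume the wrapped stream and return (body, reported total)."""
    reported: list[int] = []
    chunks = [c async for c in stream_and_count(fake_upstream(), reported.append)]
    return b"".join(chunks), reported[0]
```

Only one chunk is held at a time, so memory use stays flat regardless of response size.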

Extension Points¶

Adding New Endpoints¶

Create router module
Define Pydantic models
Implement business logic
Register with main app
Add tests

Custom Pricing Models¶

Extend ModelPrice class
Implement calculation logic
Add to pricing registry
Update configuration

Payment Methods¶

Create payment handler
Implement validation
Add to payment router
Update balance logic

Testing Strategy¶

Unit Tests¶

Mock external dependencies
Test business logic in isolation
Fast execution (< 1 second per test)
High coverage target (> 80%)
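
A unit test in this style might replace the database and provider with `unittest.mock.AsyncMock` objects. The function under test, `charge_for_request`, and the methods called on the mocks are invented for this example.

```python
import asyncio
from unittest.mock import AsyncMock


async def charge_for_request(db, provider, key_id: str, payload: dict) -> dict:
    """Hypothetical business logic: check balance, call provider, deduct cost."""
    balance = await db.get_balance(key_id)
    if balance <= 0:
        raise PermissionError("insufficient balance")
    response = await provider.complete(payload)
    await db.deduct(key_id, response["cost_msats"])
    return response


def test_charge_deducts_reported_cost() -> None:
    # External dependencies are mocked, so no database or network is needed.
    db = AsyncMock()
    db.get_balance.return_value = 1_000
    provider = AsyncMock()
    provider.complete.return_value = {"cost_msats": 250}

    result = asyncio.run(charge_for_request(db, provider, "key-1", {"model": "m"}))

    assert result == {"cost_msats": 250}
    db.deduct.assert_awaited_once_with("key-1", 250)
```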

Integration Tests¶

Test component interactions
Use test database
Mock external services
Verify end-to-end flows

Performance Tests¶

Load testing with locust
Memory profiling
Database query optimization
Response time benchmarks

Monitoring and Observability¶

Structured Logging¶

logger.info("api_request", extra={
    "request_id": request_id,
    "api_key": api_key_id,
    "endpoint": endpoint,
    "model": model,
    "tokens": token_count,
    "cost_sats": cost,
    "duration_ms": duration
})
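
Fields passed via `extra` become attributes on the `LogRecord`, so a formatter has to emit them explicitly. The sketch below is one way to render them as JSON using only the standard library; `JsonFormatter`, `make_logger`, and the logger name are assumptions, not the project's logging setup.

```python
import io
import json
import logging

# Attribute names every LogRecord has; anything else came from `extra`.
STANDARD_ATTRS = set(vars(logging.LogRecord("", 0, "", 0, "", (), None)))
STANDARD_ATTRS |= {"message", "asctime"}  # set later by Formatter.format()


class JsonFormatter(logging.Formatter):
    """Render the event name plus all `extra` fields as one JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {"level": record.levelname, "event": record.getMessage()}
        payload.update(
            {k: v for k, v in vars(record).items() if k not in STANDARD_ATTRS}
        )
        return json.dumps(payload, default=str)


def make_logger(stream) -> logging.Logger:
    """Build a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger("routstr.sketch")
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```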

Metrics Collection¶

Key metrics tracked:

Request rate by endpoint
Token usage by model
Balance changes
Error rates
Response times

Health Checks¶

@app.get("/health")
async def health_check():
    return {
        "status": "healthy",
        "version": __version__,
        "database": await check_db(),
        "upstream": await check_upstream(),
        "mints": await check_mints()
    }
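
If the overall status should reflect the individual checks, the checks can run concurrently with a timeout and be folded into one report. This is a sketch only: `run_checks`, the stub checks, and the "degraded" status value are assumptions.

```python
import asyncio
from typing import Awaitable, Callable

Check = Callable[[], Awaitable[bool]]


async def check_db_stub() -> bool:
    return True  # placeholder for a real database ping


async def check_upstream_stub() -> bool:
    raise RuntimeError("upstream unreachable")  # simulated failure


async def run_checks(checks: dict[str, Check], timeout: float = 2.0) -> dict:
    """Run all checks concurrently; a failure or timeout counts as False."""

    async def guarded(check: Check) -> bool:
        try:
            return bool(await asyncio.wait_for(check(), timeout))
        except Exception:
            return False

    results = await asyncio.gather(*(guarded(c) for c in checks.values()))
    report = dict(zip(checks.keys(), results))
    return {"status": "healthy" if all(results) else "degraded", **report}
```

Catching exceptions per check keeps one failing dependency from turning the health endpoint itself into a 500.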

Deployment Architecture¶

Container Structure¶

# Multi-stage build
FROM python:3.11-slim AS builder
# Install dependencies

FROM python:3.11-slim
# Copy only runtime needs
# Run as non-root user

Environment Configuration¶

Development: Local SQLite, debug logging
Testing: In-memory database, mock services
Production: Persistent storage, structured logs

Scaling Considerations¶

Horizontal: Multiple instances behind load balancer
Vertical: Async handles high concurrency
Database: Consider PostgreSQL for scale
Caching: Redis for distributed cache

Future Architecture¶

Planned Improvements¶

WebSocket Support: Real-time balance updates
Plugin System: Extensible pricing/auth
Multi-Region: Geographic distribution
Event Sourcing: Complete audit trail

Technical Debt¶

Areas for improvement:

Database query optimization
Response caching layer
Metric aggregation
API versioning strategy

Next Steps¶

Review Code Structure for detailed organization
See Testing Guide for test architecture