Models & Pricing

Understanding how Routstr calculates costs is essential for managing your API usage efficiently. This guide explains the pricing models and how to configure them.

Pricing Models

Routstr supports three pricing models:

1. Fixed Pricing

Simple per-request charging:

FIXED_PRICING=true
FIXED_COST_PER_REQUEST=10  # 10 sats per request

Best for:

  • Uniform API usage
  • Simple applications
  • Predictable costs

2. Token-Based Pricing

Charge based on actual token usage:

FIXED_PRICING=false             # use model pricing
FIXED_COST_PER_REQUEST=1        # optional base fee
FIXED_PER_1K_INPUT_TOKENS=5     # optional override
FIXED_PER_1K_OUTPUT_TOKENS=15   # optional override

Best for:

  • Varied request sizes
  • Fair usage billing
  • Cost optimization

3. Model-Based Pricing

Dynamic pricing based on model costs:

FIXED_PRICING=false
EXCHANGE_FEE=1.005      # 0.5% exchange fee
UPSTREAM_PROVIDER_FEE=1.05  # 5% provider fee

Best for:

  • Multiple models
  • Market-based pricing
  • Automatic updates

Model Configuration

Default Models

Routstr includes pricing for popular models:

Model             Input ($/1K)  Output ($/1K)  Context  Notes
gpt-3.5-turbo     $0.0015       $0.002         16K      Fast, economical
gpt-4             $0.03         $0.06          8K       Advanced reasoning
gpt-4-turbo       $0.01         $0.03          128K     Large context
claude-3-opus     $0.015        $0.075         200K     Best quality
claude-3-sonnet   $0.003        $0.015         200K     Balanced
llama-2-70b       $0.0007       $0.0009        4K       Open source

Custom Models File

Create models.json to override defaults:

{
  "models": [
    {
      "id": "gpt-4-vision",
      "name": "GPT-4 Vision",
      "pricing": {
        "prompt": "0.00003",
        "completion": "0.00006",
        "request": "0",
        "image": "0.00255"
      },
      "context_length": 128000,
      "supports_vision": true
    },
    {
      "id": "custom-model",
      "name": "My Custom Model",
      "pricing": {
        "prompt": "0.001",
        "completion": "0.002",
        "request": "0.0001"
      },
      "context_length": 8192
    }
  ]
}
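Before pointing the node at an override file, it can help to sanity-check it. The sketch below is not part of Routstr; the required field names follow the example above, and pricing values are assumed to be decimal strings in USD per token.

```python
import json

# Fields every entry in the example above carries
REQUIRED = {"id", "name", "pricing"}

def validate_models(doc: dict) -> int:
    """Return the number of models, raising on malformed entries."""
    for model in doc["models"]:
        missing = REQUIRED - model.keys()
        if missing:
            raise ValueError(f"{model.get('id', '?')}: missing {missing}")
        for value in model["pricing"].values():
            float(value)  # pricing values must parse as decimal strings
    return len(doc["models"])

# Usage:
# with open("models.json") as f:
#     print(validate_models(json.load(f)))
```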

Auto-updating Models

Fetch latest models from OpenRouter:

# Update models from API
python scripts/models_meta.py

# Or manually
curl https://openrouter.ai/api/v1/models > models.json

Cost Calculation

Understanding the Formula

Base Cost = (Input Tokens × Input Rate) + (Output Tokens × Output Rate) + Request Fee

Bitcoin Price = Current BTC/USD rate (e.g., $50,000)
Sats Cost = (Base Cost / Bitcoin Price) × 100,000,000

Final Cost = Sats Cost × Exchange Fee × Provider Fee
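The formula above can be sketched in Python. The rates and the $50,000 BTC price are illustrative inputs, and the default fee multipliers mirror the EXCHANGE_FEE and UPSTREAM_PROVIDER_FEE examples in this guide.

```python
SATS_PER_BTC = 100_000_000

def cost_in_sats(prompt_tokens, completion_tokens,
                 input_rate_usd_per_1k, output_rate_usd_per_1k,
                 btc_usd, request_fee_usd=0.0,
                 exchange_fee=1.005, provider_fee=1.05):
    """Final cost in sats for one request, per the formula above."""
    base_usd = (prompt_tokens / 1000 * input_rate_usd_per_1k
                + completion_tokens / 1000 * output_rate_usd_per_1k
                + request_fee_usd)
    sats = base_usd / btc_usd * SATS_PER_BTC
    return sats * exchange_fee * provider_fee

# 50 input + 150 output tokens on gpt-3.5-turbo at $50,000/BTC
print(round(cost_in_sats(50, 150, 0.0015, 0.002, 50_000), 2))  # → 0.79
```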

Example Calculations

Example 1: Simple Chat (gpt-3.5-turbo)

Input: 50 tokens
Output: 150 tokens
Model rates: $0.0015/1K input, $0.002/1K output

USD Cost = (50/1000 × 0.0015) + (150/1000 × 0.002)
         = $0.000075 + $0.0003
         = $0.000375

At $50,000/BTC: 0.75 sats
With 5.525% total fees: 0.79 sats

Example 2: Large Context (gpt-4)

Input: 2,000 tokens
Output: 500 tokens
Model rates: $0.03/1K input, $0.06/1K output

USD Cost = (2000/1000 × 0.03) + (500/1000 × 0.06)
         = $0.06 + $0.03
         = $0.09

At $50,000/BTC: 180 sats
With 5.525% total fees: 190 sats

Example 3: Image Generation (dall-e-3)

Model: dall-e-3
Size: 1024x1024
Quality: standard
Cost: $0.04 per image

At $50,000/BTC: 80 sats
With 5.525% fees: 84 sats

Fee Structure

Exchange Fee

Covers Bitcoin/USD conversion costs:

EXCHANGE_FEE=1.005  # 0.5% default

Factors:

  • Exchange rate volatility
  • Conversion costs
  • Price update frequency

Provider Fee

Node operator's margin:

UPSTREAM_PROVIDER_FEE=1.05  # 5% default

Covers:

  • Infrastructure costs
  • Maintenance
  • Support
  • Profit margin

Calculating Total Fees

Total Multiplier = EXCHANGE_FEE × UPSTREAM_PROVIDER_FEE
Example: 1.005 × 1.05 = 1.05525 (5.525% total)

Special Pricing

Image Models

Image generation uses per-image pricing:

Model     Size       Quality   Price
dall-e-2  256x256    -         $0.016
dall-e-2  512x512    -         $0.018
dall-e-2  1024x1024  -         $0.02
dall-e-3  1024x1024  standard  $0.04
dall-e-3  1024x1024  hd        $0.08
dall-e-3  1024x1792  standard  $0.08
dall-e-3  1024x1792  hd        $0.12

Audio Models

Audio pricing by duration:

Model      Type            Price
whisper-1  Transcription   $0.006/minute
whisper-1  Translation     $0.006/minute
tts-1      Text-to-speech  $0.015/1K chars
tts-1-hd   HD speech       $0.03/1K chars

Embedding Models

Lower costs for embeddings:

Model                   Price/1K tokens
text-embedding-3-small  $0.00002
text-embedding-3-large  $0.00013
text-embedding-ada-002  $0.0001

Monitoring Costs

Per-Request Tracking

Each API response includes usage data:

{
  "usage": {
    "prompt_tokens": 50,
    "completion_tokens": 150,
    "total_tokens": 200
  },
  "x-routstr-cost": {
    "sats": 79,
    "usd": 0.000375,
    "breakdown": {
      "prompt_cost": 15,
      "completion_cost": 60,
      "fees": 4
    }
  }
}
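A client can use these fields to keep a running tally of spend. This is a sketch that assumes the response-body layout shown above; the ledger dict is a stand-in for whatever store you use.

```python
def record_cost(response_json: dict, ledger: dict) -> dict:
    """Accumulate token and sat totals from one response body."""
    usage = response_json.get("usage", {})
    cost = response_json.get("x-routstr-cost", {})
    ledger["tokens"] = ledger.get("tokens", 0) + usage.get("total_tokens", 0)
    ledger["sats"] = ledger.get("sats", 0) + cost.get("sats", 0)
    return ledger

ledger = {}
record_cost({"usage": {"total_tokens": 200}, "x-routstr-cost": {"sats": 79}}, ledger)
print(ledger)  # → {'tokens': 200, 'sats': 79}
```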

Daily Summaries

View in admin dashboard:

  • Total requests
  • Token usage by model
  • Cost distribution
  • Trending patterns

Cost Alerts

Set up notifications:

# Example monitoring script; get_balance, send_alert and DAILY_LIMIT
# are placeholders for your own balance lookup and alerting helpers
def check_daily_spend(api_key):
    balance_start = get_balance(api_key, "00:00")  # balance at start of day
    balance_now = get_balance(api_key)             # current balance
    spent = balance_start - balance_now

    if spent > DAILY_LIMIT:
        send_alert(f"Daily spend exceeded: {spent} sats")

Optimization Strategies

Model Selection

Choose the right model for each task:

Task              Recommended Model       Why
Simple Q&A        gpt-3.5-turbo           Fast, cheap, sufficient
Code generation   gpt-4                   Better reasoning
Summarization     claude-3-haiku          Good balance
Creative writing  claude-3-opus           Best quality
Embeddings        text-embedding-3-small  Optimized for vectors

Prompt Engineering

Reduce costs with efficient prompts:

# Expensive
prompt = """
You are an AI assistant. Your task is to help users.
Please provide detailed, comprehensive answers.
Now, answer this question: What is 2+2?
"""

# Economical
prompt = "Calculate: 2+2"

Caching Strategies

Implement smart caching:

from functools import lru_cache

# Cache embedding results (repeated inputs are billed only once)
@lru_cache(maxsize=1000)
def get_embedding(text):
    return client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )

# Cache common responses
COMMON_RESPONSES = {
    "greeting": "Hello! How can I help you?",
    "goodbye": "Goodbye! Have a great day!"
}

Batch Processing

Process multiple items efficiently:

# Instead of one call per item
for item in items:
    response = client.chat.completions.create(...)

# Use a single call with a formatted prompt
prompt = "\n".join(f"{i+1}. {item}" for i, item in enumerate(items))
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": f"Process these items:\n{prompt}"}]
)

Custom Pricing Rules

Time-Based Pricing

Implement off-peak discounts:

from datetime import datetime

def calculate_multiplier():
    hour = datetime.now().hour
    if 2 <= hour <= 6:  # 2 AM - 6 AM
        return 0.8  # 20% discount
    elif 18 <= hour <= 22:  # 6 PM - 10 PM
        return 1.2  # 20% premium
    return 1.0

Model-Specific Rules

Custom pricing logic:

def adjust_model_price(model, base_price):
    # Premium for latest models
    if "turbo" in model or "latest" in model:
        return base_price * 1.1

    # Discount for older models
    if "legacy" in model:
        return base_price * 0.8

    return base_price

Pricing Transparency

Public Pricing Page

Display current rates:

<!-- Available at /pricing -->
<table>
  <tr>
    <th>Model</th>
    <th>Input (sats/1K)</th>
    <th>Output (sats/1K)</th>
  </tr>
  <!-- Dynamically generated from models.json -->
</table>
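One way to generate those rows, as a sketch: it assumes the models.json format shown earlier, a fixed BTC/USD rate, and pricing values in USD per token (so ×1000 gives USD per 1K before the sats conversion).

```python
SATS_PER_BTC = 100_000_000

def pricing_rows(models: list, btc_usd: float) -> list:
    """Render one <tr> per model with input/output rates in sats/1K."""
    rows = []
    for m in models:
        # pricing values are USD per token; ×1000 gives USD per 1K tokens
        in_sats = float(m["pricing"]["prompt"]) * 1000 / btc_usd * SATS_PER_BTC
        out_sats = float(m["pricing"]["completion"]) * 1000 / btc_usd * SATS_PER_BTC
        rows.append(f"<tr><td>{m['id']}</td>"
                    f"<td>{in_sats:.2f}</td><td>{out_sats:.2f}</td></tr>")
    return rows

models = [{"id": "gpt-4", "pricing": {"prompt": "0.00003", "completion": "0.00006"}}]
print(pricing_rows(models, 50_000)[0])  # → <tr><td>gpt-4</td><td>60.00</td><td>120.00</td></tr>
```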

Cost Estimation API

Provide cost estimates:

POST /v1/estimate
{
  "model": "gpt-4",
  "prompt_tokens": 500,
  "max_tokens": 200
}

Response:
{
  "estimated_cost_sats": 45,
  "breakdown": {
    "prompt": 30,
    "completion": 12,
    "fees": 3
  }
}

Troubleshooting

Pricing Mismatches

Issue: Costs don't match expectations

  • Check current BTC/USD rate
  • Verify fee settings
  • Review model configuration

Issue: Models not found

  • Update models.json
  • Check model ID spelling
  • Verify upstream support

Fee Calculations

Issue: Fees seem too high

  • Review EXCHANGE_FEE setting
  • Check UPSTREAM_PROVIDER_FEE
  • Calculate total multiplier

Next Steps