Quick Start
Copy .env.example to .env and configure the required variables. All configuration can also be provided via the config/hoodcloud.yaml file, with environment variables acting as overrides.
Configuration Hierarchy
HoodCloud uses a two-tier configuration system:

- Primary: YAML configuration file (config/hoodcloud.yaml)
- Override: Environment variables (take precedence over YAML)

The HOODCLOUD_CONFIG environment variable can specify a custom config file path.
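The precedence rule can be sketched as a simple merge. This is a minimal illustration, not HoodCloud's actual loader: the function name resolve_config is hypothetical, a plain dict stands in for the parsed YAML file, and the values are keyed by environment variable name purely for simplicity.

```python
import os


def resolve_config(yaml_values, env=None):
    """Merge YAML values with environment overrides; the environment wins."""
    env = os.environ if env is None else env
    resolved = dict(yaml_values)
    for key in yaml_values:
        if key in env:
            resolved[key] = env[key]
    return resolved


# Hypothetical values: the YAML file says info, the environment says debug.
yaml_values = {"LOG_LEVEL": "info", "SERVER_PORT": "8080"}
config = resolve_config(yaml_values, env={"LOG_LEVEL": "debug"})
print(config)
```

Keys that are absent from the environment keep their YAML value, which is the behavior the two-tier description implies.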
Variables by Category
Configuration Management
| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| HOODCLOUD_CONFIG | No | - | public | Path to YAML configuration file (if not specified, uses env-only mode) |
| ENVIRONMENT | No | dev | public | Deployment environment (dev, staging, production). Used for logging context and production readiness validation. |
| LOG_LEVEL | No | info | public | Logging level (debug, info, warn, error) |
HTTP Server
| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| SERVER_HOST | No | 0.0.0.0 | public | HTTP server bind address |
| SERVER_PORT | No | 8080 | public | HTTP server port |
| SERVER_PUBLIC_URL | Yes | - | sensitive | Public URL for API server (used by ops-agent) |
| SERVER_READ_TIMEOUT | No | 30s | public | HTTP server read timeout |
| SERVER_WRITE_TIMEOUT | No | 30s | public | HTTP server write timeout |
Database (PostgreSQL)
| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| DB_HOST | Yes | localhost | sensitive | PostgreSQL host |
| DB_PORT | No | 5432 | public | PostgreSQL port |
| DB_USER | Yes | hoodcloud | sensitive | PostgreSQL username |
| DB_PASSWORD | Yes | - | secret | PostgreSQL password |
| DB_NAME | Yes | hoodcloud | sensitive | PostgreSQL database name |
| DB_SSL_MODE | No | require | public | PostgreSQL SSL mode (disable, require, verify-ca, verify-full). Default changed to require in docker-compose. |
Database Pool Configuration
| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| DB_MAX_CONNS | No | 10 | public | Maximum number of connections in the pool |
| DB_MIN_CONNS | No | 2 | public | Minimum number of connections in the pool |
| DB_MAX_CONN_LIFETIME | No | 1h | public | Maximum lifetime of a connection before it is closed and replaced |
| DB_MAX_CONN_IDLE_TIME | No | 30m | public | Maximum time a connection can be idle before it is closed |
| DB_STATEMENT_TIMEOUT | No | 0 (disabled) | public | PostgreSQL statement_timeout per connection (e.g., 2s, 5s, 10s). Queries exceeding this duration are canceled. |
| Service | MaxConns | MinConns | StatementTimeout | Rationale |
|---|---|---|---|---|
| Health Evaluator | 5 | 2 | 5s | Single instance, periodic batch queries |
| Auth Server | 5 | 2 | 2s | Lightweight auth queries |
| API Server | 10 | 2 | 2s | User-facing, variable load |
| Agent Gateway | 10 | 2 | 2s | Agent heartbeats, moderate load |
| Orchestrator | 10 | 2 | 10s | Workflow activities, Terraform operations |
sum(MaxConns × instances) < PostgreSQL max_connections
Example: With 2 instances per service: (5×2) + (5×2) + (10×2) + (10×2) + (10×2) = 80 < 100 (default max_connections)
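The budget check above is easy to automate when planning instance counts. The sketch below uses the MaxConns values from the table; the service keys, the function names, and the 100-connection limit passed as a default are illustrative assumptions, so substitute the max_connections value of your own PostgreSQL server.

```python
# MaxConns per service, taken from the table above.
MAX_CONNS = {
    "health-evaluator": 5,
    "auth-server": 5,
    "api-server": 10,
    "agent-gateway": 10,
    "orchestrator": 10,
}


def total_connections(instances_per_service):
    """Sum MaxConns x instances across all services."""
    return sum(MAX_CONNS[name] * count for name, count in instances_per_service.items())


def fits_budget(instances_per_service, max_connections=100):
    """True when the worst-case pool usage stays below max_connections."""
    return total_connections(instances_per_service) < max_connections


two_each = {name: 2 for name in MAX_CONNS}
print(total_connections(two_each), fits_budget(two_each))
```

With three instances per service the same calculation gives 120, which no longer fits under a limit of 100.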
Redis
| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| REDIS_HOST | Yes | localhost | sensitive | Redis host |
| REDIS_PORT | No | 6379 | public | Redis port |
| REDIS_PASSWORD | No | - | secret | Redis password (if authentication is enabled) |
| REDIS_DB | No | 0 | public | Redis database number |
Temporal
| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| TEMPORAL_HOST | Yes | localhost | sensitive | Temporal server host |
| TEMPORAL_PORT | No | 7233 | public | Temporal server port |
| TEMPORAL_NAMESPACE | No | hoodcloud | public | Temporal namespace |
| TEMPORAL_TASK_QUEUE | No | hoodcloud-tasks | public | Temporal task queue name |
HashiCorp Vault
Vault is the secrets provider for all application secrets. AWS credentials may still be needed for S3 key backups (either via environment or Vault AWS secrets engine).

Local Development: All Vault variables below are auto-configured by docker-compose.dev.yml. The vault-dev-init.sh sidecar seeds secrets and writes AppRole credentials to a shared volume. No manual Vault configuration is needed for local dev.
| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| VAULT_ADDR | Yes | - | sensitive | Vault server address (e.g., https://<vault-ip>:8200). TLS is enabled by default in vault.hcl. |
| VAULT_ROLE_ID | Yes | - | sensitive | AppRole role_id for authentication |
| VAULT_SECRET_ID_PATH | Yes | - | sensitive | Path to file containing AppRole secret_id (e.g., /secrets/vault-secret-id). File must be readable by the container user (chmod 644). |
| VAULT_NAMESPACE | No | - | public | Vault namespace (enterprise feature, leave empty for OSS) |
| VAULT_CACHE_TTL | No | 5m | public | Cache TTL for Vault secrets |
| VAULT_TLS_CA_FILE | Conditional | - | sensitive | Path to CA certificate for Vault TLS verification (e.g., /secrets/vault-ca.crt). Required when Vault uses self-signed TLS certificates. |
| VAULT_TLS_SKIP_VERIFY | No | false | public | Skip TLS certificate verification (dev only). Rejected in production. |
| VAULT_MASTER_KEY_PATH | Yes | - | sensitive | Vault KV path for master encryption key |
| VAULT_APP_CREDENTIALS_PATH | Yes | - | sensitive | Vault KV path for application credentials |
| VAULT_TRANSIT_KEY_NAME | No | - | public | Transit engine key name for DEK encryption (optional) |
| VAULT_PKI_ROLE | No | - | public | PKI role for gRPC TLS certificates (optional) |
| VAULT_PKI_CERT_TTL | No | 720h | public | PKI certificate TTL (30 days default) |
| VAULT_PKI_RENEW_BEFORE | No | 24h | public | Renew PKI certificates this long before expiry |
See also: Vault Secret Structure for the full Vault KV layout and all credential fields.

Key notes:

- sealed_box_public_key and sealed_box_private_key are mandatory: api-server and orchestrator fail fast at startup without them. See Vault - Generate X25519 Keypair.
- Payment mTLS fields (payment_client_cert, payment_client_key, payment_ca_cert) are optional and only needed when payment service integration is enabled.
- Incident notification fields are optional. When present, health-evaluator enables the corresponding notification channels. If none are configured, incidents are tracked in DB only.

AppRole credential setup:

- Generate role_id: vault read -field=role_id auth/approle/role/hoodcloud-control-plane/role-id
- Generate secret_id: vault write -f -field=secret_id auth/approle/role/hoodcloud-control-plane/secret-id
- Deploy secret_id to /opt/hoodcloud/secrets/vault-secret-id on the service host with chmod 644
- The host path is mounted into containers as /secrets/vault-secret-id via the /opt/hoodcloud/secrets:/secrets:ro volume mount
- Service reads the secret_id file during AppRole authentication
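
The last step, reading the secret_id file and logging in, can be sketched as follows. This is an illustration under assumptions rather than the service's real client code: build_approle_login is a hypothetical helper, and it only constructs the request for Vault's AppRole login endpoint (POST /v1/auth/approle/login) without sending it.

```python
import json
import urllib.request
from pathlib import Path


def build_approle_login(vault_addr, role_id, secret_id_path):
    """Build (but do not send) an AppRole login request.

    vault_addr, role_id and secret_id_path correspond to VAULT_ADDR,
    VAULT_ROLE_ID and VAULT_SECRET_ID_PATH.
    """
    # The secret_id lives in a file so it never appears in the environment.
    secret_id = Path(secret_id_path).read_text().strip()
    body = json.dumps({"role_id": role_id, "secret_id": secret_id}).encode()
    return urllib.request.Request(
        url=vault_addr.rstrip("/") + "/v1/auth/approle/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending the request with urllib.request.urlopen would return a JSON document containing a client token. When Vault uses a self-signed certificate, the TLS context would additionally need the CA file referenced by VAULT_TLS_CA_FILE.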
AWS (S3 Key Backups)
| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| AWS_REGION | No | eu-central-1 | public | AWS region for S3 |
| AWS_ENDPOINT_URL | No | - | sensitive | Custom AWS endpoint URL (for LocalStack in testing) |
| AWS_ACCESS_KEY_ID | Conditional | - | secret | AWS access key ID (required if using AWS S3 without Vault AWS engine) |
| AWS_SECRET_ACCESS_KEY | Conditional | - | secret | AWS secret access key (required if using AWS S3 without Vault AWS engine) |
| AWS_S3_BUCKET | Yes | - | sensitive | S3 bucket for encrypted key backups |
| AWS_S3_BACKUP_PREFIX | No | backups/ | public | S3 prefix for backup objects |
gRPC Server (Ops Agent Communication)
| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| GRPC_HOST | No | 0.0.0.0 | public | gRPC server bind address |
| GRPC_PORT | No | 9090 | public | gRPC server port |
| GRPC_PUBLIC_URL | Yes | - | sensitive | Public gRPC address in host:port format (for ops-agent connection) |
| GRPC_USE_TLS | No | false | public | Enable TLS with system root CAs for ops-agent gRPC connection |
| GRPC_CERT_FILE | Conditional | - | sensitive | Path to TLS certificate file (required if using custom TLS) |
| GRPC_KEY_FILE | Conditional | - | secret | Path to TLS private key file (required if using custom TLS) |
| GRPC_CA_FILE | Conditional | - | sensitive | Path to TLS CA certificate file (for mTLS) |
| GRPC_CONFIG_SIGNING_KEY | No | - | secret | Path to ECDSA private key for signing configuration commands |
gRPC Rate Limiting
| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| GRPC_RATE_LIMIT_RPS | No | 10 | public | Requests per second per node for gRPC rate limiting |
| GRPC_RATE_LIMIT_BURST | No | 20 | public | Maximum burst size for gRPC rate limiting |
| GRPC_RATE_LIMIT_CLEANUP_INTERVAL | No | 5m | public | Interval for cleaning up stale rate limiters |
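
To show how the RPS and burst settings interact, here is a minimal token-bucket sketch. The token-bucket algorithm is an assumption, since the section does not name the limiter HoodCloud uses, and the TokenBucket class is hypothetical. A real deployment would keep one bucket per node and drop idle ones every GRPC_RATE_LIMIT_CLEANUP_INTERVAL.

```python
import time


class TokenBucket:
    """Allows bursts up to `burst`, refilled at `rps` tokens per second."""

    def __init__(self, rps=10.0, burst=20, clock=time.monotonic):
        self.rps = rps
        self.burst = burst
        self.clock = clock
        self.tokens = float(burst)  # start full so an initial burst is allowed
        self.last = clock()

    def allow(self):
        now = self.clock()
        # Refill for the elapsed time, never exceeding the burst capacity.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rps)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

With the defaults, a node can send 20 requests at once, after which it is limited to a sustained 10 requests per second.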
Cloud Provider Defaults
| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| DEFAULT_PROVIDER | No | hetzner | public | Default cloud provider (hetzner) |
| DEFAULT_REGION | No | fsn1 | public | Default region for node provisioning |
Terraform
| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| TERRAFORM_BINARY | No | terraform | public | Path to Terraform binary |
| TERRAFORM_MODULE_PATH | No | - | public | Path to Terraform module (provider-specific) |
| TERRAFORM_STATE_DIR | No | terraform-state | public | Directory for Terraform state files (local backend) |
| TERRAFORM_PARALLELISM | No | 10 | public | Terraform parallelism setting |
| SKIP_TERRAFORM | No | false | public | Skip Terraform execution (for local dev) |
| DEV_MODE | No | false | public | Enable development mode features |
| TERRAFORM_STATE_BACKEND | No | local | public | State backend type: local or s3 |
| TERRAFORM_S3_BUCKET | Conditional | - | sensitive | S3 bucket for state storage (required when backend=s3) |
| TERRAFORM_S3_REGION | No | eu-central-1 | public | AWS region for S3 state bucket |
| TERRAFORM_DYNAMODB_TABLE | Conditional | - | sensitive | DynamoDB table for state locking (required when backend=s3) |
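
The conditional requirements for the S3 backend can be expressed as a small pre-flight check. This is a sketch only: validate_terraform_backend is a hypothetical helper, not part of HoodCloud, and it takes the environment as a plain dict.

```python
def validate_terraform_backend(env):
    """Return a list of problems with the state backend settings (empty if OK)."""
    backend = env.get("TERRAFORM_STATE_BACKEND", "local")
    if backend not in ("local", "s3"):
        return [f"TERRAFORM_STATE_BACKEND must be local or s3, got {backend!r}"]
    if backend == "local":
        return []
    # The S3 backend needs both a bucket and a DynamoDB lock table.
    required = ("TERRAFORM_S3_BUCKET", "TERRAFORM_DYNAMODB_TABLE")
    return [f"{name} is required when backend=s3" for name in required if not env.get(name)]


print(validate_terraform_backend({"TERRAFORM_STATE_BACKEND": "s3"}))
```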
Terraform Operation Timeouts
| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| TERRAFORM_INIT_TIMEOUT | No | 5m | public | Timeout for terraform init command |
| TERRAFORM_PLAN_TIMEOUT | No | 10m | public | Timeout for terraform plan command |
| TERRAFORM_APPLY_TIMEOUT | No | 30m | public | Timeout for terraform apply command |
| TERRAFORM_DESTROY_TIMEOUT | No | 20m | public | Timeout for terraform destroy command |
| TERRAFORM_OUTPUT_TIMEOUT | No | 1m | public | Timeout for terraform output command |
Command Execution
| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| COMMAND_PROGRESS_TTL | No | 24h | public | TTL for command progress data in Redis. Progress data is automatically cleaned up after this duration. |
Chain Configurations
Dynamic Chain Config Provider
The control plane services can load chain configs from either the local filesystem or S3, with hot-reload support.

| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| CHAIN_PROFILES_DIR | No | hoodcloud-chain-configs | public | Directory containing chain profile definitions (local mode) |
| CHAIN_CONFIGS_SOURCE | No | auto | public | Config source: local, s3, or auto (auto uses S3 if bucket configured) |
| CHAINS_S3_BUCKET | Conditional | - | sensitive | S3 bucket for chain configs (required for S3 mode) |
| CHAINS_S3_REGION | No | AWS_REGION | public | AWS region for chain configs S3 bucket |
| CHAINS_S3_PREFIX | No | (empty) | public | S3 key prefix for chain config files |
| CHAIN_PROFILE_VERSION | No | latest | public | Config version: latest (auto-update) or pinned like v1.0.1 |
| CHAINS_CHECK_INTERVAL | No | 1m | public | Interval to check for version updates (auto-update mode only) |
| CHAIN_CONFIGS_WATCH | No | false | public | Enable fsnotify file watching for hot reload (local mode) |
| Mode | CHAIN_PROFILE_VERSION | Behavior |
|---|---|---|
| Auto-update | latest (default) | Downloads from latest/, polls checksums.txt for changes |
| Pinned | v1.0.1 | Downloads specific version once, no polling |
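
The source and version rules above can be summarized in a few lines. The helpers below are hypothetical and only mirror the documented behavior: auto resolves to S3 when a bucket is configured, and only the latest version is polled for updates.

```python
def resolve_chain_source(source="auto", bucket=""):
    """Map CHAIN_CONFIGS_SOURCE and CHAINS_S3_BUCKET to the effective source."""
    if source not in ("auto", "local", "s3"):
        raise ValueError(f"unknown CHAIN_CONFIGS_SOURCE: {source!r}")
    if source == "auto":
        return "s3" if bucket else "local"
    if source == "s3" and not bucket:
        raise ValueError("CHAINS_S3_BUCKET is required for S3 mode")
    return source


def should_poll(version="latest"):
    """Only auto-update mode polls; a pinned version is downloaded once."""
    return version == "latest"


print(resolve_chain_source("auto", bucket="my-chain-configs"), should_poll("v1.0.1"))
```

The bucket name in the usage line is a placeholder.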
Config-as-Code (Ops Agent Runtime)
These variables control config bundle distribution to ops-agents (separate from control plane loading):

| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| CHAIN_CONFIGS_ENABLED | No | false | public | Enable config-as-code mode for ops-agent |
| CONFIG_BUNDLE_VERSION | Conditional | - | public | Current config bundle version (e.g., v1.2.3) - required if config-as-code enabled |
| CHAIN_CONFIGS_S3_BUCKET | Conditional | - | sensitive | S3 bucket for config bundles - required if config-as-code enabled |
| CHAIN_CONFIGS_S3_REGION | No | AWS_REGION | public | S3 bucket region for config bundles |
| CHAIN_CONFIGS_URL_EXPIRY | No | 1h | public | Pre-signed URL expiry duration |
| CHAIN_CONFIGS_LOCAL_DIR | No | - | public | Local directory for config bundles (development/testing) |
| S3_PUBLIC_ENDPOINT_URL | No | - | sensitive | Public S3 endpoint for pre-signed URLs (E2E testing) |
Ops Agent
| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| OPS_AGENT_VERSION | No | 0.1.0 | public | Ops agent version to deploy |
| OPS_AGENT_HEARTBEAT_INTERVAL | No | 15s | public | Heartbeat interval for keep-alive |
| OPS_AGENT_DOWNLOAD_URL | No | SERVER_PUBLIC_URL | sensitive | Base URL for ops agent binary downloads |
| OPS_AGENT_BINARIES_DIR | No | - | public | Directory containing ops agent binaries for download |
| REQUIRE_CONFIG_SIGNATURE | No | true | public | Require ECDSA signature verification for config commands |
Ops Agent Runtime (on VMs)
The ops-agent supports two configuration modes:

- YAML + Environment Override: Set OPS_AGENT_CONFIG to a YAML config file path. Environment variables override YAML values.
- Environment-Only (Legacy): If OPS_AGENT_CONFIG is not set, configuration loads purely from environment variables.

See config/ops-agent.yaml.example for a YAML configuration template.
| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| OPS_AGENT_CONFIG | No | - | public | Path to YAML configuration file. If set, loads from file with env overrides. |
| NODE_ID | Yes | - | sensitive | Unique node identifier |
| NODE_TOKEN | No | - | secret | Authentication token for control plane (nt_xxx format) |
| CONTROL_PLANE_URL | Yes | - | sensitive | Control plane gRPC URL (host:port) for backward compatibility |
| GRPC_URL | Yes | - | sensitive | Control plane gRPC address (host:port) - preferred over CONTROL_PLANE_URL |
| GRPC_ADDRESS | No | - | sensitive | Alias for GRPC_URL in YAML config |
| GRPC_USE_TLS | No | false | public | Use TLS with system root CAs for control plane connection |
| GRPC_PORT | No | 9091 | public | gRPC server port on ops-agent |
| TLS_CERT_FILE | Conditional | - | sensitive | Path to TLS certificate for mTLS |
| TLS_KEY_FILE | Conditional | - | secret | Path to TLS private key for mTLS |
| TLS_CA_FILE | Conditional | - | sensitive | Path to CA certificate for gRPC mTLS verification. Not needed for NATS (uses system root CAs with Let’s Encrypt). |
| CHAIN_PROFILE_ID | No | - | public | Chain profile identifier (e.g., celestia-mocha) |
| CHAIN_DATA_DIR | No | /opt/hoodcloud/data/chain | public | Chain data directory |
| CHAIN_RPC_PORT | No | 0 | public | Chain-specific RPC port (0 = auto-detect) |
| CHAIN_TYPE | No | - | public | Chain type (cosmos, ethereum, celestia) - loaded from runtime.yaml |
| HEARTBEAT_INTERVAL | No | 5m | public | Keep-alive heartbeat interval |
| GRACE_PERIOD | No | 30s | public | Graceful shutdown grace period |
| CONFIG_SIGNING_PUBLIC_KEY | No | /opt/hoodcloud/config/signing.pub | sensitive | Path to ECDSA public key for config signature verification |
| REQUIRE_CONFIG_SIGNATURE | No | true | public | Require signature verification for config commands |
| CONFIG_DIR | No | - | public | Directory containing chain config bundles |
| CONFIG_VERSION | No | - | public | Version of config bundle |
| METRICS_ENABLED | No | true | public | Enable Prometheus metrics endpoint on ops-agent |
| METRICS_PORT | No | 9101 | public | Port for Prometheus metrics endpoint |
| OBSERVATION_ENABLED | No | true | public | Enable observation collectors and NATS metrics transport (RFC-007) |
| NATS_URL | No | nats://localhost:4222 | sensitive | NATS server URL for metrics transport |
| HOST_ID | No | - | public | Host identifier for metrics labels |
| SYSTEM_METRICS_INTERVAL | No | 60s | public | How often to collect system metrics (disk, CPU, memory) |
| SYSTEM_METRICS_DATA_PATH | No | / | public | Path to monitor for disk usage (defaults to root filesystem) |
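
Since GRPC_URL is preferred over CONTROL_PLANE_URL, the address lookup on the agent side might look like the sketch below. resolve_grpc_address is a hypothetical helper, and the host:port parsing is deliberately simple (it does not handle bracketed IPv6 literals).

```python
def resolve_grpc_address(env):
    """Pick the control plane address, preferring GRPC_URL over CONTROL_PLANE_URL."""
    address = env.get("GRPC_URL") or env.get("CONTROL_PLANE_URL")
    if not address:
        raise ValueError("set GRPC_URL (or CONTROL_PLANE_URL for backward compatibility)")
    host, sep, port = address.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"expected host:port, got {address!r}")
    return host, int(port)


# Hypothetical hostnames; GRPC_URL wins when both variables are set.
print(resolve_grpc_address({"GRPC_URL": "cp.example:9090", "CONTROL_PLANE_URL": "old.example:9090"}))
```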
Health Evaluator
| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| HEALTH_EVALUATION_INTERVAL | No | 30s | public | How often to evaluate node health |
| HEALTH_HEARTBEAT_TIMEOUT | No | 60s | public | Heartbeat timeout before marking node as DOWN |
| HEALTH_MIGRATION_COOLDOWN | No | 1h | public | Cooldown period between automatic migrations |
| SUBSCRIPTION_CLEANUP_INTERVAL | No | 1h | public | How often to check for expired subscriptions |
| MAINTENANCE_CLEANUP_INTERVAL | No | 5m | public | How often to check for nodes stuck in maintenance state |
| MAINTENANCE_CLEANUP_TIMEOUT | No | 30m | public | How long a node can stay in maintenance before considered stuck |
Uptime Worker (Hardcoded Defaults)
The UptimeWorker runs inside health-evaluator and uses hardcoded defaults (no environment variables). Values are defined in internal/defaults/defaults.go:
| Setting | Default | Description |
|---|---|---|
| Interval | 5m | How often the worker materializes hourly uptime buckets |
| BatchSize | 500 | Number of nodes processed per batch |
| GracePeriod | 10m | Provisioning grace period counted as uptime |
| RetentionDays | 90 | Hourly buckets older than this are purged automatically |
Observability
| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| LOKI_URL | No | http://localhost:3100 | sensitive | Loki URL for log aggregation |
| PROMETHEUS_URL | No | http://localhost:9090 | sensitive | Prometheus URL for metrics collection |
| VICTORIA_METRICS_URL | No | - | sensitive | Victoria Metrics URL for observation metrics TSDB (RFC-007). Enables metrics ingester and policy evaluation when set. |
| METRICS_CONSUMER_NAME | No | metrics-ingester | public | NATS consumer name for metrics ingester (useful for multi-instance deployments) |
| EVALUATION_CONCURRENCY | No | 10 | public | Number of chains to evaluate in parallel for policy evaluation |
| CIRCUIT_BREAKER_FAILURE_THRESHOLD | No | 5 | public | Number of consecutive failures before opening the circuit breaker |
| CIRCUIT_BREAKER_SUCCESS_THRESHOLD | No | 3 | public | Number of successes in half-open state before closing the circuit |
| CIRCUIT_BREAKER_TIMEOUT | No | 30s | public | How long to stay open before transitioning to half-open state |
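
The three circuit breaker settings describe a closed, open, half-open state machine. The sketch below is a generic illustration with the documented defaults, not HoodCloud's implementation. The CircuitBreaker class is hypothetical, and reopening on a failure in the half-open state is an assumption the table does not confirm.

```python
import time


class CircuitBreaker:
    def __init__(self, failure_threshold=5, success_threshold=3, timeout=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.success_threshold = success_threshold
        self.timeout = timeout
        self.clock = clock
        self.state = "closed"
        self.failures = 0
        self.successes = 0
        self.opened_at = 0.0

    def allow(self):
        """Return True if a call may proceed."""
        if self.state == "open":
            if self.clock() - self.opened_at >= self.timeout:
                self.state = "half-open"  # let trial calls through
                self.successes = 0
                return True
            return False
        return True

    def record_success(self):
        if self.state == "half-open":
            self.successes += 1
            if self.successes >= self.success_threshold:
                self.state = "closed"
                self.failures = 0
        elif self.state == "closed":
            self.failures = 0  # only consecutive failures count

    def record_failure(self):
        if self.state == "half-open":
            self._open()  # assumption: any trial failure reopens the circuit
        elif self.state == "closed":
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self._open()

    def _open(self):
        self.state = "open"
        self.opened_at = self.clock()
        self.failures = 0
```

Callers check allow() before each request and report the outcome with record_success() or record_failure().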
Telemetry (OpenTelemetry)
| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| OTEL_ENABLED | No | false | public | Enable OpenTelemetry distributed tracing |
| OTEL_COLLECTOR_URL | No | otel-collector:4317 | sensitive | OTLP gRPC endpoint for the OTel Collector |
| OTEL_SAMPLE_RATIO | No | 1.0 | public | Trace sampling ratio (0.0 to 1.0). 1.0 = sample all traces |
| OTEL_INSECURE | No | true | public | Disable TLS for OTLP exporter connection (true for local dev) |
Authentication (Auth Server)
JWT RS256 (Asymmetric)
RS256 uses asymmetric cryptography: the auth-server signs tokens with a private key, and api-server verifies with the public key.

| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| JWT_PRIVATE_KEY_FILE | Yes (auth-server) | - | secret | Path to RSA private key PEM file (auth-server only) |
| JWT_PUBLIC_KEY_FILE | Yes | - | sensitive | Path to RSA public key PEM file (all services) |
| JWT_ACCESS_TOKEN_TTL | No | 15m | public | JWT access token time-to-live |
| JWT_REFRESH_TOKEN_TTL | No | 168h | public | JWT refresh token time-to-live (7 days) |
| JWT_ISSUER | No | hoodcloud | public | JWT issuer claim |
| JWT_AUDIENCE | No | hoodcloud-api | public | JWT audience claim. Auth-server embeds this in issued tokens; all verifying services reject tokens with a mismatched audience (returns 401 Unauthorized). Must be identical across auth-server and api-server. |
| JWT_KEY_ID | No | - | public | Key ID for JWT header (kid), enables key rotation |
When JWT_AUDIENCE is set, the auth-server includes it in every issued token and all verifying services (api-server, agent-gateway) validate that incoming tokens contain this exact audience value. A mismatch causes token validation to fail with 401 Unauthorized. If deploying multiple services, ensure JWT_AUDIENCE is consistent across all of them. Omitting JWT_AUDIENCE skips audience validation entirely (backward compatibility).
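
The audience rule can be illustrated with a small check on already-verified claims. This is a sketch under assumptions: audience_ok is a hypothetical function, signature verification is out of scope, and the handling of a list-valued aud claim follows the JWT specification rather than anything stated in this document.

```python
def audience_ok(claims, expected_audience):
    """Apply the JWT_AUDIENCE rule to a dict of verified token claims."""
    if not expected_audience:
        return True  # JWT_AUDIENCE unset: audience validation is skipped
    aud = claims.get("aud")
    if isinstance(aud, str):
        return aud == expected_audience
    if isinstance(aud, (list, tuple)):
        return expected_audience in aud  # aud may carry several audiences
    return False  # missing or malformed aud claim


print(audience_ok({"aud": "hoodcloud-api"}, "hoodcloud-api"))
```

A False result corresponds to the 401 Unauthorized response described above.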
Key Generation:

- Vault (jwt_private_key, jwt_public_key fields)
- File paths (JWT_PRIVATE_KEY_FILE, JWT_PUBLIC_KEY_FILE)

| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| SIWE_DOMAIN | No | localhost | public | (Deprecated) SIWE message domain. Only used when AUTH_PROVIDER=siwe or both. |
| SIWE_URI | No | http://localhost:3000 | public | (Deprecated) SIWE message URI. Only used when AUTH_PROVIDER=siwe or both. |
| AUTH_NONCE_EXPIRY | No | 5m | public | (Deprecated) Auth nonce expiry time. Only used when AUTH_PROVIDER=siwe or both. |
| AUTH_RATE_LIMIT_PER_MINUTE | No | 20 | public | Rate limit for auth endpoints per IP |
| AUTH_MAX_REFRESH_TOKENS_PER_USER | No | 10 | public | Maximum concurrent active sessions per user |
| AUTH_ALLOWED_ORIGINS | No | - | public | Comma-separated allowed CORS origins for auth server |
Clerk (Primary Auth Provider)
| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| AUTH_PROVIDER | No | clerk | public | Auth provider: clerk (default), siwe (legacy), or both (migration) |
| AUTH_CLERK_SECRET_KEY | Conditional | - | secret | Clerk secret key for JWKS JWT verification (required when provider is clerk or both) |
| AUTH_CLERK_AUTHORIZED_PARTY | Conditional | - | public | Comma-separated list of allowed frontend origins for Clerk JWT azp claim, e.g. https://hoodcloud.io,https://app.hoodcloud.io (required when provider is clerk or both) |
| AUTH_CLERK_WEBHOOK_SIGNING_SECRET | Conditional | - | secret | Clerk webhook signing secret for svix signature verification (required for auth-server when provider is clerk or both) |
API Security
| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| API_AUTH_ENABLED | No | false | public | Enable API key authentication |
| DANGEROUS_DISABLE_AUTH | No | false | public | Must be set to true to confirm disabling authentication when API_AUTH_ENABLED=false. Panics at startup if auth is disabled without this flag. |
| API_RATE_LIMIT_ENABLED | No | false | public | Enable rate limiting |
| API_RATE_LIMIT_REQUESTS | No | 100 | public | Requests per minute per API key |
| API_RATE_LIMIT_REDIS | No | false | public | Enable Redis-backed distributed rate limiting. When true, uses Redis INCR+EXPIRE sliding window instead of in-memory token bucket. Includes circuit breaker: after 5 consecutive Redis failures, falls back to in-memory limiter automatically. |
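
The interaction between API_AUTH_ENABLED and DANGEROUS_DISABLE_AUTH amounts to a startup guard. The sketch below is hypothetical (check_auth_config is not a HoodCloud function) and raises an exception where the real service is described as panicking. Parsing the flags as the literal string true is also an assumption.

```python
def check_auth_config(env):
    """Refuse to start with authentication off unless it is explicitly confirmed."""
    auth_enabled = env.get("API_AUTH_ENABLED", "false").lower() == "true"
    confirmed = env.get("DANGEROUS_DISABLE_AUTH", "false").lower() == "true"
    if not auth_enabled and not confirmed:
        raise RuntimeError(
            "authentication is disabled; set DANGEROUS_DISABLE_AUTH=true to confirm"
        )
    return auth_enabled


print(check_auth_config({"API_AUTH_ENABLED": "true"}))
```

Note that with both defaults left untouched the guard fails, so one of the two variables must always be set explicitly.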
NATS Event Streaming
| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| NATS_ENABLED | No | false | public | Enable NATS event streaming |
| NATS_URL | No | nats://localhost:4222 | sensitive | NATS server URL (internal, for control plane) |
| NATS_PUBLIC_URL | No | - | sensitive | Public NATS URL (for ops-agent on external VMs) |
| NATS_ACCOUNT | No | - | sensitive | NATS account name for multi-account auth |
| NATS_STREAM_NAME | No | HOODCLOUD_EVENTS | public | JetStream stream name |
| NATS_CONSUMER_NAME | No | control-plane | public | Durable consumer name |
| NATS_MAX_MESSAGE_SIZE | No | 1048576 | public | Maximum message size in bytes (default: 1MB) |
| NATS_STREAM_REPLICAS | No | 1 | public | JetStream stream replica count. Set to 3 for production with a 3-node NATS cluster. Applies to HOODCLOUD_EVENTS and HOODCLOUD_METRICS streams. |
Cross-repo dependency: The PAYMENTS stream is managed by the payment service repository. Its replica count must be configured separately in the payment service.
NATS Multi-Account Authentication
When NATS_ACCOUNTS_ENABLED=true, each service component connects with its own account and subject-level permissions. Tokens are delivered to ops-agents via gRPC after registration (no shared token in cloud-init).
| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| NATS_ACCOUNTS_ENABLED | Yes (prod) | false | public | Enable per-service NATS account auth. Required true in production when NATS is enabled. |
| NATS_ACCOUNT_OPS_AGENT_TOKEN | Conditional | - | secret | NATS token for ops-agent account (delivered via gRPC) |
| NATS_ACCOUNT_PAYMENT_SVC_TOKEN | Conditional | - | secret | NATS token for payment-service account |
| NATS_ACCOUNT_CONTROL_PLANE_TOKEN | Conditional | - | secret | NATS token for control-plane account |
| NATS_ACCOUNT_HEALTH_EVAL_TOKEN | Conditional | - | secret | NATS token for health-evaluator account |
Payment Service Integration
The payment service runs as a separate microservice with its own database. These variables configure the main app’s connection to the payment service.

Payment Service Client (gRPC)
| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| PAYMENT_SERVICE_ENABLED | No | false | public | Enable payment service integration |
| PAYMENT_SERVICE_ADDRESS | Conditional | - | sensitive | Payment service gRPC address (e.g., payment-service.hoodcloud.internal:50051) |
| PAYMENT_SERVICE_CERT_FILE | Conditional | - | sensitive | Client certificate file for mTLS (see note below) |
| PAYMENT_SERVICE_KEY_FILE | Conditional | - | secret | Client key file for mTLS (see note below) |
| PAYMENT_SERVICE_CA_FILE | Conditional | - | sensitive | CA certificate for server verification (see note below) |
| PAYMENT_SERVICE_SERVER_NAME | No | - | public | Expected server name for TLS verification (SNI) |
| PAYMENT_SERVICE_TIMEOUT | No | 30s | public | Default timeout for gRPC calls |
| PAYMENT_SERVICE_INSECURE | No | false | public | Skip TLS (development only) |
Payment Event Consumer (NATS)
| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| PAYMENT_CONSUMER_ENABLED | No | false | public | Enable payment event consumer |
| PAYMENT_CONSUMER_URL | No | NATS_URL | sensitive | NATS server URL for payment events |
| PAYMENT_CONSUMER_STREAM_NAME | No | PAYMENTS | public | JetStream stream name for payment events |
| PAYMENT_CONSUMER_NAME | No | main-app-payment | public | Durable consumer name |
| PAYMENT_CONSUMER_ACK_WAIT | No | 30s | public | Time before unacked message is redelivered |
| PAYMENT_CONSUMER_MAX_DELIVER | No | 5 | public | Maximum delivery attempts |
| PAYMENT_CONSUMER_IDEMPOTENCY_TTL | No | 24h | public | TTL for processed idempotency keys |
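
Because unacknowledged messages are redelivered (up to PAYMENT_CONSUMER_MAX_DELIVER times), the consumer needs to recognize events it has already processed. The sketch below shows the idea behind PAYMENT_CONSUMER_IDEMPOTENCY_TTL with an in-memory store. The IdempotencyStore class is hypothetical, and where HoodCloud actually keeps these keys is not stated in this document.

```python
import time


class IdempotencyStore:
    """Remembers processed keys for ttl_seconds (24h by default)."""

    def __init__(self, ttl_seconds=24 * 3600, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._seen = {}  # idempotency key -> expiry timestamp

    def first_time(self, key):
        """Return True if the key has not been processed within the TTL."""
        now = self.clock()
        expiry = self._seen.get(key)
        if expiry is not None and expiry > now:
            return False  # duplicate delivery: skip processing, just ack
        self._seen[key] = now + self.ttl
        return True
```

A handler would call first_time with the event's idempotency key and process the event only when it returns True.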
Payment Service (Separate Microservice)
The payment service runs as an isolated microservice with its own configuration. See payment-service/config/config.example.yaml for full configuration reference.
| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| PAYMENT_CONFIG | No | - | public | Path to YAML configuration file |
| GRPC_ADDRESS | No | :50051 | public | gRPC server listen address |
| HTTP_ADDRESS | No | :8080 | public | HTTP server (health checks) |
| METRICS_ADDRESS | No | :9090 | public | Prometheus metrics server listen address |
| DB_HOST | Yes | localhost | sensitive | PostgreSQL host |
| DB_PORT | No | 5432 | public | PostgreSQL port |
| DB_USER | Yes | payment | sensitive | PostgreSQL user |
| DB_PASSWORD | Yes | - | secret | PostgreSQL password |
| DB_NAME | Yes | payment | sensitive | PostgreSQL database |
| DB_SSL_MODE | No | require | public | PostgreSQL SSL mode |
| NATS_URL | No | nats://localhost:4222 | sensitive | NATS server URL |
| NATS_STREAM_NAME | No | PAYMENTS | public | JetStream stream name |
| NATS_CTRL_SIGNING_SEED | Conditional | - | secret | CTRL account signing seed for JWT auth (from Vault in production) |
| NATS_CTRL_ACCOUNT_PUB | Conditional | - | sensitive | CTRL account public key for JWT auth |
| PRICING_CONFIG_PATH | No | config/pricing.yaml | public | Path to pricing configuration |
| TLS_CERT_FILE | Conditional | - | sensitive | Server TLS certificate (mTLS) |
| TLS_KEY_FILE | Conditional | - | secret | Server TLS private key (mTLS) |
| TLS_CA_FILE | Conditional | - | sensitive | CA certificate for client verification (mTLS) |
| TLS_INSECURE | No | false | public | Disable TLS (development only) |
| TLS_ALLOWED_CN | No | - | sensitive | Required Common Name for client certificate authentication. When set, gRPC clients must present a certificate with this CN. |
Payment Service Vault Integration
When Vault is enabled for the payment service, it provides AWS credentials for S3 operations instead of static environment variables.

| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| VAULT_ADDRESS | Yes (if Vault) | - | sensitive | Vault server address (e.g., https://<vault-ip>:8200). Note: uses VAULT_ADDRESS not VAULT_ADDR. |
| VAULT_ROLE_ID | Yes (if Vault) | - | sensitive | AppRole role_id for authentication |
| VAULT_SECRET_ID_PATH | Yes (if Vault) | - | sensitive | Path to file containing AppRole secret_id |
| VAULT_NAMESPACE | No | - | public | Vault namespace (enterprise feature) |
| VAULT_TLS_SKIP_VERIFY | No | false | public | Skip TLS certificate verification. Rejected in production. |
| VAULT_TLS_CA_FILE | Yes (if Vault TLS) | - | sensitive | Path to CA certificate for Vault TLS verification |
| VAULT_APP_CREDENTIALS_PATH | No | hoodcloud/app-credentials | sensitive | Vault KV path for payment service credentials |
| VAULT_CACHE_TTL | No | 5m | public | Cache TTL for Vault secrets |
Tempo Crypto Provider
The payment service supports Tempo network TIP-20 stablecoins for crypto payments. Tempo uses TransferWithMemo events for exact payment matching via the memo field (payment ID as bytes32).
| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| TEMPO_ENABLED | No | false | public | Enable Tempo crypto payment provider |
| TEMPO_RPC_URL | No | https://rpc.moderato.tempo.xyz | public | Tempo network RPC endpoint |
| TEMPO_RECEIVER_ADDRESS | Yes (if Tempo) | - | sensitive | Wallet address for receiving payments |
| TEMPO_CHAIN_ID | No | 42431 | public | Tempo chain ID (42431 = Moderato testnet) |
| TEMPO_POLL_INTERVAL | No | 10s | public | Interval for polling transfer events |
| TEMPO_TOKEN_ALPHAUSD | No | 0x20c0…001 | public | AlphaUSD TIP-20 contract address |
| TEMPO_TOKEN_BETAUSD | No | - | public | BetaUSD TIP-20 contract address |
Stripe Payment Provider
The payment service supports Stripe Checkout for card payments. Stripe secrets are sourced from Vault at secret/payment-service/credentials.
| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| STRIPE_ENABLED | No | false | public | Enable Stripe payment provider |
| Vault Field | Vault Path | Description |
|---|---|---|
| stripe_secret_key | secret/payment-service/credentials | Stripe API secret key (sk_live_... or sk_test_...) |
| stripe_webhook_secret | secret/payment-service/credentials | Stripe webhook signing secret (whsec_...) |
Cloud Provider Credentials
Hetzner
| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| HCLOUD_TOKEN | Yes (Hetzner) | - | secret | Hetzner Cloud API token for infrastructure provisioning |
OVH
OVH credentials are required only when using OVH providers (ovh-public-cloud, ovh-vps, ovh-dedicated).
Create API credentials at: https://api.ovh.com/createToken/
| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| OVH_ENDPOINT | No | ovh-eu | public | OVH API endpoint (ovh-eu, ovh-ca, ovh-us, kimsufi-eu, soyoustart-eu) |
| OVH_APPLICATION_KEY | Yes (OVH) | - | secret | OVH Application Key from createToken page |
| OVH_APPLICATION_SECRET | Yes (OVH) | - | secret | OVH Application Secret from createToken page |
| OVH_CONSUMER_KEY | Yes (OVH) | - | secret | OVH Consumer Key from createToken page |
| OVH_CLOUD_PROJECT_SERVICE | Yes (Public Cloud) | - | sensitive | OVH Cloud Project Service ID (required for ovh-public-cloud provider) |
| OVH_SUBSIDIARY | Yes (VPS/Dedicated) | EU | public | OVH billing subsidiary (EU, FR, GB, DE, US, CA) |
| Provider | Description | Use Case |
|---|---|---|
| ovh-public-cloud | Public Cloud instances (Beta API) | Most Hetzner-like, supports cloud-init |
| ovh-vps | Traditional VPS (order-based) | Lower cost, no external volumes |
| ovh-dedicated | Bare-metal dedicated servers | High performance, 2h delivery time |
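The per-provider requirements in the OVH tables can be checked before provisioning. This is a minimal sketch with a hypothetical helper name; it treats OVH_SUBSIDIARY and OVH_ENDPOINT as optional because both have documented defaults.

```python
from typing import List, Mapping

# Credentials every OVH provider needs.
OVH_COMMON = ("OVH_APPLICATION_KEY", "OVH_APPLICATION_SECRET", "OVH_CONSUMER_KEY")

# Extra variables required by specific providers.
OVH_EXTRA = {
    "ovh-public-cloud": ("OVH_CLOUD_PROJECT_SERVICE",),
    "ovh-vps": (),
    "ovh-dedicated": (),
}


def missing_ovh_vars(provider: str, env: Mapping[str, str]) -> List[str]:
    """Return the required OVH variables that are unset or empty for this provider."""
    if provider not in OVH_EXTRA:
        raise ValueError(f"unknown OVH provider: {provider}")
    required = OVH_COMMON + OVH_EXTRA[provider]
    return [name for name in required if not env.get(name)]
```

Calling `missing_ovh_vars("ovh-public-cloud", os.environ)` at startup would surface a missing project service ID before any API call is attempted.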
E2E Testing
| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| E2E_API_URL | No | http://localhost:8081 | sensitive | API base URL for E2E tests |
| E2E_GRPC_ADDRESS | No | localhost:9091 | sensitive | gRPC address for E2E tests |
| E2E_DB_HOST | No | localhost | sensitive | Database host for E2E tests |
| E2E_DB_PORT | No | 5433 | public | Database port for E2E tests |
| E2E_DB_USER | No | hoodcloud | sensitive | Database user for E2E tests |
| E2E_DB_PASSWORD | No | e2e_test_password | secret | Database password for E2E tests |
| E2E_DB_NAME | No | hoodcloud_e2e | sensitive | Database name for E2E tests |
| E2E_TEMPORAL_HOST | No | localhost | sensitive | Temporal host for E2E tests |
| E2E_TEMPORAL_PORT | No | 7234 | public | Temporal port for E2E tests |
| E2E_SSH_KEY_PATH | No | ~/.ssh/e2e_test_key | sensitive | Path to SSH key for E2E VM access |
| E2E_SSH_KEY_NAME | No | hoodcloud-e2e-test | public | SSH key name in Hetzner |
| E2E_TEST_REGION | No | fsn1 | public | Hetzner region for E2E tests |
| E2E_TEST_INSTANCE_TYPE | No | cpx22 | public | Hetzner instance type for E2E tests |
| E2E_CLEANUP_ON_FAILURE | No | true | public | Clean up resources on test failure |
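As an example of how the E2E defaults combine, the sketch below assembles a PostgreSQL connection URL from the E2E_DB_* variables, falling back to the defaults in the table. The helper name is hypothetical, and whether the test suite builds its connection string this way is an assumption.

```python
import os
from typing import Mapping, Optional
from urllib.parse import quote

# Defaults copied from the E2E Testing table.
E2E_DB_DEFAULTS = {
    "E2E_DB_HOST": "localhost",
    "E2E_DB_PORT": "5433",
    "E2E_DB_USER": "hoodcloud",
    "E2E_DB_PASSWORD": "e2e_test_password",
    "E2E_DB_NAME": "hoodcloud_e2e",
}


def e2e_database_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Build a postgresql:// URL from E2E_DB_* variables, applying defaults."""
    env = os.environ if env is None else env
    v = {name: env.get(name, default) for name, default in E2E_DB_DEFAULTS.items()}
    # Percent-encode credentials so characters like '@' or ':' stay unambiguous.
    user = quote(v["E2E_DB_USER"], safe="")
    password = quote(v["E2E_DB_PASSWORD"], safe="")
    return f"postgresql://{user}:{password}@{v['E2E_DB_HOST']}:{v['E2E_DB_PORT']}/{v['E2E_DB_NAME']}"
```

Note the default port 5433 rather than 5432, which keeps the E2E database separate from a development instance on the same host.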
Docker Compose Environment
| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| POSTGRES_USER | No | hoodcloud | sensitive | PostgreSQL user for Docker Compose |
| POSTGRES_PASSWORD | Yes (Compose) | - | secret | PostgreSQL password for Docker Compose |
| POSTGRES_DB | No | hoodcloud | public | PostgreSQL database name for Docker Compose |
| GRAFANA_ADMIN_USER | No | admin | sensitive | Grafana admin username |
| GRAFANA_ADMIN_PASSWORD | Yes | - | secret | Grafana admin password (no default — must be set explicitly) |
| NGROK_AUTHTOKEN | Conditional | - | secret | ngrok auth token (required for E2E with external VMs) |
| OPS_AGENT_VERSION | No | latest | public | Ops agent version for orchestrator deployment |
| NATS_JWT_ENABLED | No | false | public | Enable NATS JWT operator mode authentication |
| NATS_JWT_CTRL_ACCOUNT_PUB | Conditional | - | sensitive | CTRL account public key (from nats-jwt-setup) |
| NATS_JWT_AGENT_ACCOUNT_PUB | Conditional | - | sensitive | AGENT account public key (from nats-jwt-setup) |
| NATS_JWT_EXTERNAL_URL | Conditional | - | sensitive | Public NATS URL for ops-agents (tls://…) |
| DOMAIN_API | No | api.localhost | public | Domain for API server reverse proxy (Caddy) |
| DOMAIN_GRAFANA | No | grafana.localhost | public | Domain for Grafana dashboard reverse proxy (Caddy) |
| DOMAIN_STATUS | No | status.localhost | public | Domain for Gatus status page reverse proxy (Caddy) |
| DOMAIN_PAY | No | pay.localhost | public | Domain for payment service reverse proxy (Caddy, docker-compose.payment.yml) |
Payment Service (docker-compose.payment.yml)
These variables are used when running the payment service via docker compose -f docker-compose.yml -f docker-compose.payment.yml up:
| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| PAYMENT_DB_USER | No | payment | sensitive | Payment service PostgreSQL user |
| PAYMENT_DB_PASSWORD | Yes | - | secret | Payment service PostgreSQL password |
| PAYMENT_DB_NAME | No | payment | public | Payment service PostgreSQL database |
Variables by Service
auth-server
Required:
- DB_HOST, DB_USER, DB_PASSWORD, DB_NAME
- REDIS_HOST (for Redis-backed rate limiting)
- JWT_PRIVATE_KEY_FILE + JWT_PUBLIC_KEY_FILE (or via Vault)
- AUTH_CLERK_SECRET_KEY, AUTH_CLERK_AUTHORIZED_PARTY, AUTH_CLERK_WEBHOOK_SIGNING_SECRET (when using Clerk, which is the default)
Optional:
- SERVER_HOST, SERVER_PORT
- DB_PORT, DB_SSL_MODE, DB_MAX_CONNS, DB_MIN_CONNS, DB_MAX_CONN_LIFETIME, DB_MAX_CONN_IDLE_TIME, DB_STATEMENT_TIMEOUT
- REDIS_PORT, REDIS_PASSWORD, REDIS_DB
- JWT_ACCESS_TOKEN_TTL, JWT_REFRESH_TOKEN_TTL, JWT_ISSUER, JWT_AUDIENCE
- AUTH_PROVIDER (default: clerk)
- AUTH_RATE_LIMIT_PER_MINUTE, AUTH_ALLOWED_ORIGINS
- ENVIRONMENT, LOG_LEVEL
- SIWE_DOMAIN, SIWE_URI, AUTH_NONCE_EXPIRY (only needed when AUTH_PROVIDER=siwe or both)
api-server
Required:
- SERVER_PUBLIC_URL
- DB_HOST, DB_USER, DB_PASSWORD, DB_NAME
- REDIS_HOST
- TEMPORAL_HOST
- AWS_S3_BUCKET
Optional:
- SERVER_HOST, SERVER_PORT, SERVER_READ_TIMEOUT, SERVER_WRITE_TIMEOUT
- DB_PORT, DB_SSL_MODE, DB_MAX_CONNS, DB_MIN_CONNS, DB_MAX_CONN_LIFETIME, DB_MAX_CONN_IDLE_TIME, DB_STATEMENT_TIMEOUT
- REDIS_PORT, REDIS_PASSWORD, REDIS_DB
- TEMPORAL_PORT, TEMPORAL_NAMESPACE, TEMPORAL_TASK_QUEUE
- AWS_REGION, AWS_ENDPOINT_URL, AWS_S3_BACKUP_PREFIX
- OPS_AGENT_BINARIES_DIR
- CHAIN_PROFILES_DIR, CHAIN_CONFIGS_SOURCE, CHAINS_S3_BUCKET, CHAINS_S3_REGION, CHAINS_S3_PREFIX, CHAINS_CHECK_INTERVAL, CHAIN_CONFIGS_WATCH
- JWT_PUBLIC_KEY_FILE, JWT_AUDIENCE for JWT validation (RS256)
- AUTH_PROVIDER, AUTH_CLERK_SECRET_KEY, AUTH_CLERK_AUTHORIZED_PARTY (when Clerk provider enabled; no webhook secret — api-server does not handle webhooks)
- API_AUTH_ENABLED, API_RATE_LIMIT_ENABLED, API_RATE_LIMIT_REQUESTS
- PAYMENT_CONSUMER_ENABLED, PAYMENT_CONSUMER_URL, PAYMENT_CONSUMER_STREAM_NAME (for processing payment events)
- ENVIRONMENT, LOG_LEVEL
agent-gateway
Required:
- DB_HOST, DB_USER, DB_PASSWORD, DB_NAME
- REDIS_HOST
- GRPC_PUBLIC_URL
Optional:
- DB_PORT, DB_SSL_MODE, DB_MAX_CONNS, DB_MIN_CONNS, DB_MAX_CONN_LIFETIME, DB_MAX_CONN_IDLE_TIME, DB_STATEMENT_TIMEOUT
- REDIS_PORT, REDIS_PASSWORD, REDIS_DB
- AWS_REGION, AWS_ENDPOINT_URL
- GRPC_HOST, GRPC_PORT, GRPC_USE_TLS, GRPC_CERT_FILE, GRPC_KEY_FILE, GRPC_CA_FILE
- GRPC_CONFIG_SIGNING_KEY
- GRPC_RATE_LIMIT_RPS, GRPC_RATE_LIMIT_BURST, GRPC_RATE_LIMIT_CLEANUP_INTERVAL
- COMMAND_PROGRESS_TTL
- NATS_ENABLED, NATS_URL, NATS_STREAM_NAME, NATS_MAX_MESSAGE_SIZE
- ENVIRONMENT, LOG_LEVEL
orchestrator
Required:
- DB_HOST, DB_USER, DB_PASSWORD, DB_NAME
- REDIS_HOST
- TEMPORAL_HOST
- AWS_S3_BUCKET
- SERVER_PUBLIC_URL, GRPC_PUBLIC_URL
- HCLOUD_TOKEN (if using Hetzner provider)
Optional:
- DB_PORT, DB_SSL_MODE, DB_MAX_CONNS, DB_MIN_CONNS, DB_MAX_CONN_LIFETIME, DB_MAX_CONN_IDLE_TIME, DB_STATEMENT_TIMEOUT
- REDIS_PORT, REDIS_PASSWORD, REDIS_DB
- TEMPORAL_PORT, TEMPORAL_NAMESPACE, TEMPORAL_TASK_QUEUE
- AWS_REGION, AWS_ENDPOINT_URL, AWS_S3_BACKUP_PREFIX
- GRPC_USE_TLS, GRPC_CONFIG_SIGNING_KEY
- TERRAFORM_BINARY, TERRAFORM_STATE_DIR, TERRAFORM_PARALLELISM
- TERRAFORM_STATE_BACKEND, TERRAFORM_S3_BUCKET, TERRAFORM_S3_REGION, TERRAFORM_DYNAMODB_TABLE
- TERRAFORM_INIT_TIMEOUT, TERRAFORM_PLAN_TIMEOUT, TERRAFORM_APPLY_TIMEOUT, TERRAFORM_DESTROY_TIMEOUT, TERRAFORM_OUTPUT_TIMEOUT
- COMMAND_PROGRESS_TTL
- SKIP_TERRAFORM, DEV_MODE
- CHAIN_PROFILES_DIR, CHAIN_CONFIGS_SOURCE, CHAINS_S3_BUCKET, CHAINS_S3_REGION, CHAINS_S3_PREFIX, CHAINS_CHECK_INTERVAL, CHAIN_CONFIGS_WATCH
- CHAIN_CONFIGS_ENABLED, CONFIG_BUNDLE_VERSION, CHAIN_CONFIGS_S3_BUCKET, CHAIN_CONFIGS_S3_REGION, CHAIN_CONFIGS_URL_EXPIRY, CHAIN_CONFIGS_LOCAL_DIR, S3_PUBLIC_ENDPOINT_URL (for ops-agent config bundles)
- OPS_AGENT_VERSION
- DEFAULT_PROVIDER, DEFAULT_REGION
- OVH_ENDPOINT, OVH_APPLICATION_KEY, OVH_APPLICATION_SECRET, OVH_CONSUMER_KEY, OVH_CLOUD_PROJECT_SERVICE (if using OVH providers)
- ENVIRONMENT, LOG_LEVEL
health-evaluator
Required:
- DB_HOST, DB_USER, DB_PASSWORD, DB_NAME
- TEMPORAL_HOST
Optional:
- DB_PORT, DB_SSL_MODE, DB_MAX_CONNS, DB_MIN_CONNS, DB_MAX_CONN_LIFETIME, DB_MAX_CONN_IDLE_TIME, DB_STATEMENT_TIMEOUT
- TEMPORAL_PORT, TEMPORAL_NAMESPACE, TEMPORAL_TASK_QUEUE
- AWS_REGION, AWS_S3_BUCKET, AWS_S3_BACKUP_PREFIX (for backup cleanup)
- HEALTH_EVALUATION_INTERVAL, HEALTH_HEARTBEAT_TIMEOUT, HEALTH_MIGRATION_COOLDOWN
- SUBSCRIPTION_CLEANUP_INTERVAL
- MAINTENANCE_CLEANUP_INTERVAL, MAINTENANCE_CLEANUP_TIMEOUT
- NATS_ENABLED, NATS_URL, NATS_STREAM_NAME, NATS_CONSUMER_NAME, NATS_MAX_MESSAGE_SIZE, NATS_STREAM_REPLICAS
- VICTORIA_METRICS_URL, METRICS_CONSUMER_NAME (for observation metrics ingestion and policy evaluation)
- EVALUATION_CONCURRENCY (for parallel chain policy evaluation)
- CIRCUIT_BREAKER_FAILURE_THRESHOLD, CIRCUIT_BREAKER_SUCCESS_THRESHOLD, CIRCUIT_BREAKER_TIMEOUT (for metrics ingester protection)
- CHAIN_PROFILES_DIR, CHAIN_CONFIGS_SOURCE, CHAINS_S3_BUCKET, CHAINS_S3_REGION, CHAINS_S3_PREFIX, CHAINS_CHECK_INTERVAL (for loading observation specs and policies)
- ENVIRONMENT, LOG_LEVEL
- UptimeWorker — materializes rolling uptime buckets (5m interval, 500 batch size, 90-day retention). See Deployment & Operations - Rolling Uptime.
Incident notification credentials:
- incident_slack_webhook_url: Slack incoming webhook URL (optional, enables Slack notifications)
- incident_telegram_bot_token: Telegram bot token (optional, enables Telegram notifications)
- incident_telegram_chat_id: Telegram chat ID (required if bot token is set)
- incident_email_api_url: Email delivery service API URL (optional, enables Email notifications)
- incident_email_api_key: Email delivery service API key (required if API URL is set)
- incident_email_from: Sender email address for incident notifications
- incident_email_to: Recipient email address for incident notifications
- incident_webhook_url: Outbound webhook URL (optional, enables Webhook notifications)
- incident_webhook_secret: Shared secret sent in the X-Webhook-Secret header for request verification
These credentials are read from Vault at secret/hoodcloud/app-credentials, not from environment variables. The health-evaluator initializes the incident notification pipeline at startup based on which credentials are present.
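The optional and conditionally required incident credentials can be expressed as a small decision function. This sketch assumes the credentials arrive as a flat dictionary and that a half-configured channel is reported as an error rather than silently skipped; the health-evaluator's actual behavior in that case is not documented here.

```python
from typing import List, Mapping, Tuple


def enabled_channels(creds: Mapping[str, str]) -> Tuple[List[str], List[str]]:
    """Return (enabled channel names, configuration errors) for incident notifications."""
    channels: List[str] = []
    errors: List[str] = []

    if creds.get("incident_slack_webhook_url"):
        channels.append("slack")

    # Telegram needs both the bot token and a chat ID.
    if creds.get("incident_telegram_bot_token"):
        if creds.get("incident_telegram_chat_id"):
            channels.append("telegram")
        else:
            errors.append("incident_telegram_chat_id is required when a bot token is set")

    # Email needs both the API URL and its API key.
    if creds.get("incident_email_api_url"):
        if creds.get("incident_email_api_key"):
            channels.append("email")
        else:
            errors.append("incident_email_api_key is required when an API URL is set")

    if creds.get("incident_webhook_url"):
        channels.append("webhook")

    return channels, errors
```

With no credentials present the function returns no channels and no errors, which matches the description of every channel as optional.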
ops-agent (on VMs)
Required:
- NODE_ID
- GRPC_URL (or CONTROL_PLANE_URL for backward compatibility)
- OPS_AGENT_CONFIG (path to YAML configuration file)
- NODE_TOKEN
- GRPC_USE_TLS, GRPC_PORT, GRPC_ADDRESS
- TLS_CERT_FILE, TLS_KEY_FILE, TLS_CA_FILE
- CHAIN_PROFILE_ID, CHAIN_DATA_DIR, CHAIN_RPC_PORT, CHAIN_TYPE
- HEARTBEAT_INTERVAL, GRACE_PERIOD
- CONFIG_SIGNING_PUBLIC_KEY, REQUIRE_CONFIG_SIGNATURE
- CONFIG_DIR, CONFIG_VERSION
- METRICS_ENABLED, METRICS_PORT
- OBSERVATION_ENABLED, NATS_URL, HOST_ID (for observation metrics transport; NATS auth via JWT operator mode)
- SYSTEM_METRICS_INTERVAL, SYSTEM_METRICS_DATA_PATH (for system metrics collection)
- ENVIRONMENT, LOG_LEVEL
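The backward-compatible GRPC_URL / CONTROL_PLANE_URL pair can be resolved with a simple fallback. The sketch assumes GRPC_URL wins when both are set, which is the natural reading of "for backward compatibility" but is not confirmed by the ops-agent source; the function name is hypothetical.

```python
from typing import Mapping, Optional


def resolve_grpc_url(env: Mapping[str, str]) -> Optional[str]:
    """Prefer GRPC_URL; fall back to the legacy CONTROL_PLANE_URL; None if neither is set."""
    return env.get("GRPC_URL") or env.get("CONTROL_PLANE_URL") or None
```

A caller would treat a `None` result as a missing required variable and fail at startup.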
payment-service (Isolated Microservice)
Required:
- DB_HOST, DB_USER, DB_PASSWORD, DB_NAME
- TLS_CERT_FILE, TLS_KEY_FILE, TLS_CA_FILE (unless TLS_INSECURE=true)
- TLS_ALLOWED_CN (for client certificate CN verification)
- VAULT_ADDRESS, VAULT_ROLE_ID, VAULT_SECRET_ID_PATH
- VAULT_TLS_CA_FILE, VAULT_APP_CREDENTIALS_PATH
- PAYMENT_CONFIG
- GRPC_ADDRESS, HTTP_ADDRESS, METRICS_ADDRESS
- DB_PORT, DB_SSL_MODE, DB_MAX_CONNS, DB_MIN_CONNS
- NATS_URL, NATS_STREAM_NAME, NATS_CTRL_SIGNING_SEED, NATS_CTRL_ACCOUNT_PUB
- PRICING_CONFIG_PATH
- TLS_INSECURE (development only)
- VAULT_NAMESPACE, VAULT_TLS_SKIP_VERIFY (dev only — rejected in production), VAULT_CACHE_TTL
- OTEL_ENABLED, OTEL_COLLECTOR_URL, OTEL_SAMPLE_RATIO
- TEMPO_ENABLED, TEMPO_RPC_URL, TEMPO_RECEIVER_ADDRESS, TEMPO_CHAIN_ID, TEMPO_POLL_INTERVAL, TEMPO_TOKEN_ALPHAUSD, TEMPO_TOKEN_BETAUSD
- STRIPE_ENABLED (Stripe secrets sourced from Vault, not environment variables)
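The TLS and Vault rules for the payment service lend themselves to a preflight check. The sketch below is a hypothetical validator: the boolean parsing and the decision to flag TLS_INSECURE in production are assumptions, while the VAULT_TLS_SKIP_VERIFY rejection and the "unless TLS_INSECURE=true" exemption come from the list above.

```python
from typing import List, Mapping


def _is_true(value: str) -> bool:
    # Assumed truthy spellings; the service may accept a different set.
    return value.strip().lower() in ("1", "true", "yes")


def payment_service_errors(env: Mapping[str, str]) -> List[str]:
    """Return configuration errors for the payment service's TLS and Vault settings."""
    errors: List[str] = []
    production = env.get("ENVIRONMENT", "dev") == "production"
    insecure = _is_true(env.get("TLS_INSECURE", "false"))

    # TLS material is required unless TLS is explicitly disabled.
    if not insecure:
        for name in ("TLS_CERT_FILE", "TLS_KEY_FILE", "TLS_CA_FILE"):
            if not env.get(name):
                errors.append(f"{name} is required unless TLS_INSECURE=true")

    if production and insecure:
        errors.append("TLS_INSECURE is for development only")
    if production and _is_true(env.get("VAULT_TLS_SKIP_VERIFY", "false")):
        errors.append("VAULT_TLS_SKIP_VERIFY is rejected in production")
    return errors
```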
Security Classifications
Secret (Must Never Be Exposed)
These variables contain credentials, tokens, or cryptographic keys:
- JWT_PRIVATE_KEY_FILE (file content is secret)
- DB_PASSWORD
- REDIS_PASSWORD
- AWS_ACCESS_KEY_ID
- AWS_SECRET_ACCESS_KEY
- VAULT_ROLE_ID
- VAULT_SECRET_ID_PATH (file content is secret)
- GRPC_KEY_FILE
- GRPC_CONFIG_SIGNING_KEY
- TLS_KEY_FILE
- HCLOUD_TOKEN
- OVH_APPLICATION_KEY
- OVH_APPLICATION_SECRET
- OVH_CONSUMER_KEY
- E2E_DB_PASSWORD
- POSTGRES_PASSWORD
- GRAFANA_ADMIN_PASSWORD
- NGROK_AUTHTOKEN
- NATS_JWT_CTRL_ACCOUNT_PUB (sensitive, not secret — public key only)
- NATS_CTRL_SIGNING_SEED (secret — loaded from Vault in production)
- AUTH_CLERK_SECRET_KEY
- AUTH_CLERK_WEBHOOK_SIGNING_SECRET
Handling guidelines for secret values:
- Never commit to version control
- Store in secure secret management systems (HashiCorp Vault)
- Use environment-specific .env files (gitignored)
- Rotate regularly
- Use IAM roles/instance profiles instead of static credentials where possible
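One practical consequence of the secret classification is that these values should be masked before configuration is logged or dumped. The sketch below is illustrative only: the set lists a subset of the secret variables named above, and the helper name and mask string are assumptions.

```python
from typing import Dict, Mapping

# Subset of the variables classified as secret in this section.
SECRET_VARS = frozenset({
    "DB_PASSWORD",
    "REDIS_PASSWORD",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "VAULT_ROLE_ID",
    "GRPC_CONFIG_SIGNING_KEY",
    "HCLOUD_TOKEN",
    "OVH_APPLICATION_SECRET",
    "AUTH_CLERK_SECRET_KEY",
    "NATS_CTRL_SIGNING_SEED",
})


def redact(env: Mapping[str, str], mask: str = "***") -> Dict[str, str]:
    """Return a copy of env that is safe to log: non-empty secret values are masked."""
    return {
        name: (mask if name in SECRET_VARS and value else value)
        for name, value in env.items()
    }
```

Empty secrets are left empty so that a log line still reveals that a required value is missing.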
Sensitive (Should Not Be Public)
These variables contain internal URLs, usernames, or configuration that should not be publicly disclosed:
- SERVER_PUBLIC_URL
- GRPC_PUBLIC_URL
- DB_HOST, DB_USER, DB_NAME
- REDIS_HOST
- TEMPORAL_HOST
- AWS_S3_BUCKET
- AWS_ENDPOINT_URL
- VAULT_ADDR
- VAULT_MASTER_KEY_PATH, VAULT_APP_CREDENTIALS_PATH
- VAULT_TLS_CA_FILE
- JWT_PUBLIC_KEY_FILE
- GRPC_CERT_FILE, GRPC_CA_FILE
- TLS_CERT_FILE, TLS_CA_FILE
- NODE_ID
- CONTROL_PLANE_URL, GRPC_URL
- CONFIG_SIGNING_PUBLIC_KEY
- LOKI_URL, PROMETHEUS_URL, VICTORIA_METRICS_URL
- NATS_URL, NATS_PUBLIC_URL
- OPS_AGENT_DOWNLOAD_URL
- S3_PUBLIC_ENDPOINT_URL
- E2E_API_URL, E2E_GRPC_ADDRESS, E2E_DB_HOST, E2E_DB_USER, E2E_DB_NAME, E2E_TEMPORAL_HOST, E2E_SSH_KEY_PATH
- POSTGRES_USER
- GRAFANA_ADMIN_USER
- CHAIN_CONFIGS_S3_BUCKET
- OVH_CLOUD_PROJECT_SERVICE
- TERRAFORM_S3_BUCKET
- TERRAFORM_DYNAMODB_TABLE
Public (Safe to Commit)
These variables contain non-sensitive configuration:
- All timeouts, intervals, and thresholds
- Port numbers
- Feature flags (API_AUTH_ENABLED, NATS_ENABLED, etc.)
- Default values and limits
- ENVIRONMENT, LOG_LEVEL
- Public configuration like CHAIN_PROFILES_DIR
Production Checklist
See: Deployment Checklist for the full pre-deployment checklist covering configuration, authentication, security, and infrastructure validation.
Variable Reference by File
Source Code References
internal/config/config.go
- Authoritative source for all environment variable mappings
- Implements koanf-based configuration with env var overrides
- TerraformConfig: timeout fields (init, plan, apply, destroy, output)
- CommandsConfig: progress_ttl field
- Environment variable to config path mappings
- TERRAFORM_*_TIMEOUT and COMMAND_PROGRESS_TTL mappings
- Ops agent configuration loading with YAML support
- LoadFromFile() for YAML configuration
- LoadConfig() with OPS_AGENT_CONFIG support
- All NODE_ID, GRPC_URL, CHAIN_* variables
- ENVIRONMENT variable usage for logging
- HTTP-only API server (gRPC moved to agent-gateway)
- gRPC server for ops-agent communication
- Uses COMMAND_PROGRESS_TTL for Redis progress storage
- GRPC_* variables for server configuration
- ENVIRONMENT variable usage
- HCLOUD_TOKEN direct usage (passed to Terraform)
- TERRAFORM_*_TIMEOUT variables for provisioning operations
- COMMAND_PROGRESS_TTL for progress reader
- ENVIRONMENT variable usage
- HOODCLOUD_CONFIG file path loading
- LOG_LEVEL parsing and configuration
- Timeout configuration from Config struct
- Falls back to defaults.Terraform.* when not configured
- ProgressStore with configurable TTL
- Uses functional options pattern (WithProgressTTL)
- Falls back to defaults.Commands.ProgressTTL
- All E2E_* test configuration variables
Configuration File References
docker-compose.dev.yml
- Development environment defaults
- All service environment configurations
- E2E testing environment
- Isolated ports and test-specific configuration
- Production-like Docker Compose setup
- Template with all user-configurable variables
- YAML configuration structure
- Default values for all settings
Troubleshooting
Missing Required Variables
If services fail to start with missing variable errors:
- Check that the .env file exists and contains the required variables
- Verify environment variables are exported in your shell
- For Docker Compose, ensure .env is in the same directory as docker-compose.yml
- Check service-specific required variables in the “Variables by Service” section
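The first check can be automated with a small script that parses a .env file and reports which required variables are absent or empty. This is a simplified sketch: the parser handles only plain KEY=VALUE lines with optional quotes (no `export` prefixes or multi-line values), and the function names are hypothetical.

```python
from typing import Dict, Iterable, List


def parse_dotenv(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop one layer of surrounding double or single quotes.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_required(values: Dict[str, str], required: Iterable[str]) -> List[str]:
    """Return required names that are absent or set to an empty string."""
    return [name for name in required if not values.get(name)]


# Required variables for api-server, taken from "Variables by Service".
API_SERVER_REQUIRED = [
    "SERVER_PUBLIC_URL", "DB_HOST", "DB_USER", "DB_PASSWORD", "DB_NAME",
    "REDIS_HOST", "TEMPORAL_HOST", "AWS_S3_BUCKET",
]
```

For example, `missing_required(parse_dotenv(open(".env").read()), API_SERVER_REQUIRED)` lists what still needs to be filled in before starting api-server.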
Configuration Not Applied
If environment variables seem to be ignored:
- YAML file values apply unless overridden by environment variables, so an unset or misspelled variable leaves the YAML value in effect
- Check HOODCLOUD_CONFIG is pointing to the correct file
- Verify variable names match exactly (case-sensitive)
- For Docker, check environment variables are passed through in docker-compose.yml
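When debugging precedence, it helps to see where each effective value came from. The sketch below models the two-tier lookup with flat dotted config paths; the env-to-path mapping shown in the usage note is hypothetical (the authoritative mapping lives in internal/config/config.go), and the function does not reproduce the actual koanf-based loader.

```python
from typing import Dict, Mapping, Tuple


def resolve(yaml_values: Mapping[str, str],
            env: Mapping[str, str],
            env_to_path: Mapping[str, str]) -> Dict[str, Tuple[str, str]]:
    """Return {config path: (value, source)} with environment variables overriding YAML."""
    resolved = {path: (value, "yaml") for path, value in yaml_values.items()}
    for var, path in env_to_path.items():
        # Exact, case-sensitive name match: 'db_host' does not override DB_HOST's path.
        if var in env:
            resolved[path] = (env[var], f"env:{var}")
    return resolved
```

Given `resolve({"database.host": "localhost"}, {"DB_HOST": "db.internal"}, {"DB_HOST": "database.host"})`, the entry for `database.host` carries the value from the environment and the source `env:DB_HOST`; with a misspelled variable name it would still report `yaml`.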
Connection Issues
If services can’t connect to each other:
- Verify hostnames match service names in Docker Compose
- Check port numbers match between services
- Ensure URLs include protocol (http://, nats://, etc.)
- For external VMs, ensure *_PUBLIC_URL variables are set to externally-accessible addresses
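Two of these checks (a missing protocol, and a public URL that points at a loopback address) are easy to catch mechanically. The following sketch uses a hypothetical helper and an assumed list of local host names.

```python
from typing import List
from urllib.parse import urlsplit

# Hosts that an external VM cannot reach (assumed list).
LOCAL_HOSTS = {"localhost", "127.0.0.1", "0.0.0.0", "::1"}


def url_problems(name: str, value: str) -> List[str]:
    """Return human-readable problems with a URL-valued variable."""
    if "://" not in value:
        return [f"{name} has no protocol (expected e.g. http:// or nats://)"]
    host = urlsplit(value).hostname
    problems: List[str] = []
    if not host:
        problems.append(f"{name} has no hostname")
    elif name.endswith("_PUBLIC_URL") and host in LOCAL_HOSTS:
        problems.append(f"{name} points at {host}, which external VMs cannot reach")
    return problems
```

Running this over SERVER_PUBLIC_URL, GRPC_PUBLIC_URL, and NATS_URL before deployment catches the most common misconfigurations without opening any network connections.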
Migration Guide
From Environment-Only to YAML + Environment
- Create config/hoodcloud.yaml with your base configuration
- Set environment variables only for values that differ from YAML or are secrets
- Set HOODCLOUD_CONFIG=config/hoodcloud.yaml
- Restart services
From Development to Production
- Copy .env.example to .env.production
- Replace all placeholder values with production values
- Move all secrets to HashiCorp Vault
- Configure Vault AppRole authentication
- Enable security features (TLS, auth, rate limiting)
- Set appropriate timeouts and intervals for production scale