Quick Start

Copy .env.example to .env and configure required variables. All configuration can also be provided via the config/hoodcloud.yaml file with environment variable overrides.

Configuration Hierarchy

HoodCloud uses a two-tier configuration system:
  1. Primary: YAML configuration file (config/hoodcloud.yaml)
  2. Override: Environment variables (take precedence over YAML)
The HOODCLOUD_CONFIG environment variable can specify a custom config file path.
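The precedence rule can be illustrated with a small sketch. It assumes, purely for illustration, that YAML values have already been flattened to the same names as the environment variables; the real key layout of config/hoodcloud.yaml is not shown on this page, and `resolve_config` is a hypothetical helper, not part of HoodCloud.

```python
def resolve_config(yaml_values, environ):
    """Return effective settings: YAML values first, environment on top."""
    effective = dict(yaml_values)   # tier 1: values from the YAML file
    effective.update(environ)       # tier 2: environment variables win
    return effective


# Hypothetical flattened YAML content (assumption, see lead-in).
yaml_values = {"SERVER_PORT": "8080", "LOG_LEVEL": "info"}
effective = resolve_config(yaml_values, {"LOG_LEVEL": "debug"})
print(effective)
```

Here LOG_LEVEL comes from the environment while SERVER_PORT keeps its YAML value.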

Variables by Category

Configuration Management

| Variable | Required | Default | Security | Description |
| --- | --- | --- | --- | --- |
| HOODCLOUD_CONFIG | No | - | public | Path to YAML configuration file (if not specified, uses env-only mode) |
| ENVIRONMENT | No | dev | public | Deployment environment (dev, staging, production). Used for logging context and production readiness validation. |
| LOG_LEVEL | No | info | public | Logging level (debug, info, warn, error) |

HTTP Server

| Variable | Required | Default | Security | Description |
| --- | --- | --- | --- | --- |
| SERVER_HOST | No | 0.0.0.0 | public | HTTP server bind address |
| SERVER_PORT | No | 8080 | public | HTTP server port |
| SERVER_PUBLIC_URL | Yes | - | sensitive | Public URL for API server (used by ops-agent) |
| SERVER_READ_TIMEOUT | No | 30s | public | HTTP server read timeout |
| SERVER_WRITE_TIMEOUT | No | 30s | public | HTTP server write timeout |

Database (PostgreSQL)

| Variable | Required | Default | Security | Description |
| --- | --- | --- | --- | --- |
| DB_HOST | Yes | localhost | sensitive | PostgreSQL host |
| DB_PORT | No | 5432 | public | PostgreSQL port |
| DB_USER | Yes | hoodcloud | sensitive | PostgreSQL username |
| DB_PASSWORD | Yes | - | secret | PostgreSQL password |
| DB_NAME | Yes | hoodcloud | sensitive | PostgreSQL database name |
| DB_SSL_MODE | No | require | public | PostgreSQL SSL mode (disable, require, verify-ca, verify-full). Default changed to require in docker-compose. |

Database Pool Configuration

| Variable | Required | Default | Security | Description |
| --- | --- | --- | --- | --- |
| DB_MAX_CONNS | No | 10 | public | Maximum number of connections in the pool |
| DB_MIN_CONNS | No | 2 | public | Minimum number of connections in the pool |
| DB_MAX_CONN_LIFETIME | No | 1h | public | Maximum lifetime of a connection before it is closed and replaced |
| DB_MAX_CONN_IDLE_TIME | No | 30m | public | Maximum time a connection can be idle before it is closed |
| DB_STATEMENT_TIMEOUT | No | 0 (disabled) | public | PostgreSQL statement_timeout per connection (e.g., 2s, 5s, 10s). Queries exceeding this duration are canceled. |
Per-Service Connection Pool Sizing:

| Service | MaxConns | MinConns | StatementTimeout | Rationale |
| --- | --- | --- | --- | --- |
| Health Evaluator | 5 | 2 | 5s | Single instance, periodic batch queries |
| Auth Server | 5 | 2 | 2s | Lightweight auth queries |
| API Server | 10 | 2 | 2s | User-facing, variable load |
| Agent Gateway | 10 | 2 | 2s | Agent heartbeats, moderate load |
| Orchestrator | 10 | 2 | 10s | Workflow activities, Terraform operations |

Connection Budget Formula: sum(MaxConns × instances) < PostgreSQL max_connections
Example: With 2 instances per service: (5×2) + (5×2) + (10×2) + (10×2) + (10×2) = 80 < 100 (default max_connections)
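The budget check is easy to script before changing pool sizes or instance counts. The sketch below reuses the MaxConns values from the sizing table and the 2-instances-per-service example; the dictionary layout and the `connection_budget` helper are illustrative assumptions.

```python
# service -> (MaxConns, instances); values follow the sizing table and example
pools = {
    "health-evaluator": (5, 2),
    "auth-server": (5, 2),
    "api-server": (10, 2),
    "agent-gateway": (10, 2),
    "orchestrator": (10, 2),
}


def connection_budget(pools):
    """Sum MaxConns x instances over all services."""
    return sum(max_conns * instances for max_conns, instances in pools.values())


PG_MAX_CONNECTIONS = 100  # PostgreSQL default max_connections
total = connection_budget(pools)
print(total, total < PG_MAX_CONNECTIONS)  # 80 True
```

If the total reaches or exceeds max_connections, either lower DB_MAX_CONNS for some services or raise the PostgreSQL limit.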

Redis

| Variable | Required | Default | Security | Description |
| --- | --- | --- | --- | --- |
| REDIS_HOST | Yes | localhost | sensitive | Redis host |
| REDIS_PORT | No | 6379 | public | Redis port |
| REDIS_PASSWORD | No | - | secret | Redis password (if authentication is enabled) |
| REDIS_DB | No | 0 | public | Redis database number |

Temporal

| Variable | Required | Default | Security | Description |
| --- | --- | --- | --- | --- |
| TEMPORAL_HOST | Yes | localhost | sensitive | Temporal server host |
| TEMPORAL_PORT | No | 7233 | public | Temporal server port |
| TEMPORAL_NAMESPACE | No | hoodcloud | public | Temporal namespace |
| TEMPORAL_TASK_QUEUE | No | hoodcloud-tasks | public | Temporal task queue name |

HashiCorp Vault

Vault is the secrets provider for all application secrets. AWS credentials may still be needed for S3 key backups (either via environment or Vault AWS secrets engine).
Local Development: All Vault variables below are auto-configured by docker-compose.dev.yml. The vault-dev-init.sh sidecar seeds secrets and writes AppRole credentials to a shared volume. No manual Vault configuration is needed for local dev.
| Variable | Required | Default | Security | Description |
| --- | --- | --- | --- | --- |
| VAULT_ADDR | Yes | - | sensitive | Vault server address (e.g., https://<vault-ip>:8200). TLS is enabled by default in vault.hcl. |
| VAULT_ROLE_ID | Yes | - | sensitive | AppRole role_id for authentication |
| VAULT_SECRET_ID_PATH | Yes | - | sensitive | Path to file containing AppRole secret_id (e.g., /secrets/vault-secret-id). File must be readable by the container user (chmod 644). |
| VAULT_NAMESPACE | No | - | public | Vault namespace (enterprise feature, leave empty for OSS) |
| VAULT_CACHE_TTL | No | 5m | public | Cache TTL for Vault secrets |
| VAULT_TLS_CA_FILE | Conditional | - | sensitive | Path to CA certificate for Vault TLS verification (e.g., /secrets/vault-ca.crt). Required when Vault uses self-signed TLS certificates. |
| VAULT_TLS_SKIP_VERIFY | No | false | public | Skip TLS certificate verification (dev only). Rejected in production. |
| VAULT_MASTER_KEY_PATH | Yes | - | sensitive | Vault KV path for master encryption key |
| VAULT_APP_CREDENTIALS_PATH | Yes | - | sensitive | Vault KV path for application credentials |
| VAULT_TRANSIT_KEY_NAME | No | - | public | Transit engine key name for DEK encryption (optional) |
| VAULT_PKI_ROLE | No | - | public | PKI role for gRPC TLS certificates (optional) |
| VAULT_PKI_CERT_TTL | No | 720h | public | PKI certificate TTL (30 days default) |
| VAULT_PKI_RENEW_BEFORE | No | 24h | public | Renew PKI certificates this long before expiry |
See also: Vault Secret Structure for the full Vault KV layout and all credential fields.
Key notes:
  • sealed_box_public_key and sealed_box_private_key are mandatory — api-server and orchestrator fail fast at startup without them. See Vault - Generate X25519 Keypair.
  • Payment mTLS fields (payment_client_cert, payment_client_key, payment_ca_cert) are optional — only needed when payment service integration is enabled.
  • Incident notification fields are optional. When present, health-evaluator enables the corresponding notification channels. If none are configured, incidents are tracked in DB only.
AppRole Authentication:
  1. Generate role_id: vault read -field=role_id auth/approle/role/hoodcloud-control-plane/role-id
  2. Generate secret_id: vault write -f -field=secret_id auth/approle/role/hoodcloud-control-plane/secret-id
  3. Deploy secret_id to /opt/hoodcloud/secrets/vault-secret-id on the service host with chmod 644
  4. The host path is mounted into containers as /secrets/vault-secret-id via the /opt/hoodcloud/secrets:/secrets:ro volume mount
  5. Service reads the secret_id file during AppRole authentication
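As a rough illustration of step 5, the sketch below reads the secret_id file and prepares a login request for Vault's AppRole HTTP endpoint (`/v1/auth/approle/login`, assuming the auth method is mounted at the default `approle` path, as the CLI commands above suggest). `build_approle_login` is a hypothetical helper; how the HoodCloud services actually perform the login is not shown on this page.

```python
import json
import urllib.request
from pathlib import Path


def build_approle_login(vault_addr, role_id, secret_id_path):
    """Build (but do not send) the AppRole login request."""
    # The secret_id lives in a file, e.g. /secrets/vault-secret-id
    secret_id = Path(secret_id_path).read_text().strip()
    body = json.dumps({"role_id": role_id, "secret_id": secret_id}).encode()
    return urllib.request.Request(
        url=f"{vault_addr.rstrip('/')}/v1/auth/approle/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Usage (not executed here): send with urllib.request.urlopen(req, context=...)
# using a TLS context that trusts VAULT_TLS_CA_FILE; the client token is
# returned under auth.client_token in the JSON response.
```

Reading the file at login time rather than at process start means a rotated secret_id is picked up on the next authentication.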

AWS (S3 Key Backups)

| Variable | Required | Default | Security | Description |
| --- | --- | --- | --- | --- |
| AWS_REGION | No | eu-central-1 | public | AWS region for S3 |
| AWS_ENDPOINT_URL | No | - | sensitive | Custom AWS endpoint URL (for LocalStack in testing) |
| AWS_ACCESS_KEY_ID | Conditional | - | secret | AWS access key ID (required if using AWS S3 without Vault AWS engine) |
| AWS_SECRET_ACCESS_KEY | Conditional | - | secret | AWS secret access key (required if using AWS S3 without Vault AWS engine) |
| AWS_S3_BUCKET | Yes | - | sensitive | S3 bucket for encrypted key backups |
| AWS_S3_BACKUP_PREFIX | No | backups/ | public | S3 prefix for backup objects |

gRPC Server (Ops Agent Communication)

| Variable | Required | Default | Security | Description |
| --- | --- | --- | --- | --- |
| GRPC_HOST | No | 0.0.0.0 | public | gRPC server bind address |
| GRPC_PORT | No | 9090 | public | gRPC server port |
| GRPC_PUBLIC_URL | Yes | - | sensitive | Public gRPC address in host:port format (for ops-agent connection) |
| GRPC_USE_TLS | No | false | public | Enable TLS with system root CAs for ops-agent gRPC connection |
| GRPC_CERT_FILE | Conditional | - | sensitive | Path to TLS certificate file (required if using custom TLS) |
| GRPC_KEY_FILE | Conditional | - | secret | Path to TLS private key file (required if using custom TLS) |
| GRPC_CA_FILE | Conditional | - | sensitive | Path to TLS CA certificate file (for mTLS) |
| GRPC_CONFIG_SIGNING_KEY | No | - | secret | Path to ECDSA private key for signing configuration commands |

gRPC Rate Limiting

| Variable | Required | Default | Security | Description |
| --- | --- | --- | --- | --- |
| GRPC_RATE_LIMIT_RPS | No | 10 | public | Requests per second per node for gRPC rate limiting |
| GRPC_RATE_LIMIT_BURST | No | 20 | public | Maximum burst size for gRPC rate limiting |
| GRPC_RATE_LIMIT_CLEANUP_INTERVAL | No | 5m | public | Interval for cleaning up stale rate limiters |
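To see how the RPS and burst settings interact, here is a minimal sketch that assumes token-bucket semantics (a common reading of an "RPS plus burst" pair; this page does not name the algorithm the gateway uses). `TokenBucket` is a hypothetical class, one instance per node.

```python
import time


class TokenBucket:
    """Token bucket: refills at `rate` tokens/second, holds at most `burst`."""

    def __init__(self, rate=10, burst=20, now=None):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)  # start full so an initial burst is allowed
        self.last = time.monotonic() if now is None else now

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.last)
        # Refill for the elapsed time, capped at the burst size
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

With the defaults, a node can send 20 requests at once, after which it is held to roughly 10 per second. A per-node map of such buckets would need periodic pruning of idle entries, which is presumably what the cleanup interval controls.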

Cloud Provider Defaults

| Variable | Required | Default | Security | Description |
| --- | --- | --- | --- | --- |
| DEFAULT_PROVIDER | No | hetzner | public | Default cloud provider (hetzner) |
| DEFAULT_REGION | No | fsn1 | public | Default region for node provisioning |

Terraform

| Variable | Required | Default | Security | Description |
| --- | --- | --- | --- | --- |
| TERRAFORM_BINARY | No | terraform | public | Path to Terraform binary |
| TERRAFORM_MODULE_PATH | No | - | public | Path to Terraform module (provider-specific) |
| TERRAFORM_STATE_DIR | No | terraform-state | public | Directory for Terraform state files (local backend) |
| TERRAFORM_PARALLELISM | No | 10 | public | Terraform parallelism setting |
| SKIP_TERRAFORM | No | false | public | Skip Terraform execution (for local dev) |
| DEV_MODE | No | false | public | Enable development mode features |
| TERRAFORM_STATE_BACKEND | No | local | public | State backend type: local or s3 |
| TERRAFORM_S3_BUCKET | Conditional | - | sensitive | S3 bucket for state storage (required when backend=s3) |
| TERRAFORM_S3_REGION | No | eu-central-1 | public | AWS region for S3 state bucket |
| TERRAFORM_DYNAMODB_TABLE | Conditional | - | sensitive | DynamoDB table for state locking (required when backend=s3) |

Terraform Operation Timeouts

| Variable | Required | Default | Security | Description |
| --- | --- | --- | --- | --- |
| TERRAFORM_INIT_TIMEOUT | No | 5m | public | Timeout for terraform init command |
| TERRAFORM_PLAN_TIMEOUT | No | 10m | public | Timeout for terraform plan command |
| TERRAFORM_APPLY_TIMEOUT | No | 30m | public | Timeout for terraform apply command |
| TERRAFORM_DESTROY_TIMEOUT | No | 20m | public | Timeout for terraform destroy command |
| TERRAFORM_OUTPUT_TIMEOUT | No | 1m | public | Timeout for terraform output command |

Command Execution

| Variable | Required | Default | Security | Description |
| --- | --- | --- | --- | --- |
| COMMAND_PROGRESS_TTL | No | 24h | public | TTL for command progress data in Redis. Progress data is automatically cleaned up after this duration. |

Chain Configurations

Dynamic Chain Config Provider

The control plane services can load chain configs from either local filesystem or S3 with hot-reload support.
| Variable | Required | Default | Security | Description |
| --- | --- | --- | --- | --- |
| CHAIN_PROFILES_DIR | No | hoodcloud-chain-configs | public | Directory containing chain profile definitions (local mode) |
| CHAIN_CONFIGS_SOURCE | No | auto | public | Config source: local, s3, or auto (auto uses S3 if bucket configured) |
| CHAINS_S3_BUCKET | Conditional | - | sensitive | S3 bucket for chain configs (required for S3 mode) |
| CHAINS_S3_REGION | No | AWS_REGION | public | AWS region for chain configs S3 bucket |
| CHAINS_S3_PREFIX | No | (empty) | public | S3 key prefix for chain config files |
| CHAIN_PROFILE_VERSION | No | latest | public | Config version: latest (auto-update) or pinned like v1.0.1 |
| CHAINS_CHECK_INTERVAL | No | 1m | public | Interval to check for version updates (auto-update mode only) |
| CHAIN_CONFIGS_WATCH | No | false | public | Enable fsnotify file watching for hot reload (local mode) |
Version Modes:
| Mode | CHAIN_PROFILE_VERSION | Behavior |
| --- | --- | --- |
| Auto-update | latest (default) | Downloads from latest/, polls checksums.txt for changes |
| Pinned | v1.0.1 | Downloads specific version once, no polling |
S3 Bucket Structure:
s3://hoodcloud-chain-configs/
├── latest/
│   ├── checksums.txt      # SHA256 hash for change detection
│   └── configs.tar.gz     # Current production tarball
├── v1.0.0/
│   ├── checksums.txt
│   └── configs-v1.0.0.tar.gz
├── v1.0.1/
│   └── ...
└── v1.0.2/
    ├── checksums.txt
    └── configs-v1.0.2.tar.gz
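The mapping from CHAIN_PROFILE_VERSION to object keys in this layout can be sketched as follows. The helper names are hypothetical, and the way CHAINS_S3_PREFIX is joined to the keys is an assumption; only the folder and file naming is taken from the bucket structure above.

```python
def bundle_keys(version="latest", prefix=""):
    """Return (checksums_key, tarball_key) for a chain config version."""
    base = prefix.strip("/")  # assumed: prefix is a plain folder path
    folder = f"{base}/{version}" if base else version
    # latest/ holds configs.tar.gz; pinned folders embed the version in the name
    name = "configs.tar.gz" if version == "latest" else f"configs-{version}.tar.gz"
    return f"{folder}/checksums.txt", f"{folder}/{name}"


def polls_for_updates(version):
    """Only auto-update mode (latest) re-checks checksums.txt."""
    return version == "latest"


print(bundle_keys())
print(bundle_keys("v1.0.1"))
```

In auto-update mode a loader would fetch the checksums key every CHAINS_CHECK_INTERVAL and download the tarball again only when the stored hash differs.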
Tarball Contents:
chains/
├── celestia-mocha/
│   ├── profile.yaml       # Chain metadata, provisioning config
│   ├── runtime.yaml       # Node lifecycle configuration
│   └── observation.yaml   # Health policies, metrics collectors
├── ethereum-holesky/
└── ...
recipes/
├── celestia/
├── ethereum/
└── ...

Config-as-Code (Ops Agent Runtime)

These variables control config bundle distribution to ops-agents (separate from control plane loading):
| Variable | Required | Default | Security | Description |
| --- | --- | --- | --- | --- |
| CHAIN_CONFIGS_ENABLED | No | false | public | Enable config-as-code mode for ops-agent |
| CONFIG_BUNDLE_VERSION | Conditional | - | public | Current config bundle version (e.g., v1.2.3) - required if config-as-code enabled |
| CHAIN_CONFIGS_S3_BUCKET | Conditional | - | sensitive | S3 bucket for config bundles - required if config-as-code enabled |
| CHAIN_CONFIGS_S3_REGION | No | AWS_REGION | public | S3 bucket region for config bundles |
| CHAIN_CONFIGS_URL_EXPIRY | No | 1h | public | Pre-signed URL expiry duration |
| CHAIN_CONFIGS_LOCAL_DIR | No | - | public | Local directory for config bundles (development/testing) |
| S3_PUBLIC_ENDPOINT_URL | No | - | sensitive | Public S3 endpoint for pre-signed URLs (E2E testing) |

Ops Agent

| Variable | Required | Default | Security | Description |
| --- | --- | --- | --- | --- |
| OPS_AGENT_VERSION | No | 0.1.0 | public | Ops agent version to deploy |
| OPS_AGENT_HEARTBEAT_INTERVAL | No | 15s | public | Heartbeat interval for keep-alive |
| OPS_AGENT_DOWNLOAD_URL | No | SERVER_PUBLIC_URL | sensitive | Base URL for ops agent binary downloads |
| OPS_AGENT_BINARIES_DIR | No | - | public | Directory containing ops agent binaries for download |
| REQUIRE_CONFIG_SIGNATURE | No | true | public | Require ECDSA signature verification for config commands |

Ops Agent Runtime (on VMs)

The ops-agent supports two configuration modes:
  1. YAML + Environment Override: Set OPS_AGENT_CONFIG to a YAML config file path. Environment variables override YAML values.
  2. Environment-Only (Legacy): If OPS_AGENT_CONFIG is not set, configuration loads purely from environment variables.
See config/ops-agent.yaml.example for a YAML configuration template.
| Variable | Required | Default | Security | Description |
| --- | --- | --- | --- | --- |
| OPS_AGENT_CONFIG | No | - | public | Path to YAML configuration file. If set, loads from file with env overrides. |
| NODE_ID | Yes | - | sensitive | Unique node identifier |
| NODE_TOKEN | No | - | secret | Authentication token for control plane (nt_xxx format) |
| CONTROL_PLANE_URL | Yes | - | sensitive | Control plane gRPC URL (host:port) for backward compatibility |
| GRPC_URL | Yes | - | sensitive | Control plane gRPC address (host:port) - preferred over CONTROL_PLANE_URL |
| GRPC_ADDRESS | No | - | sensitive | Alias for GRPC_URL in YAML config |
| GRPC_USE_TLS | No | false | public | Use TLS with system root CAs for control plane connection |
| GRPC_PORT | No | 9091 | public | gRPC server port on ops-agent |
| TLS_CERT_FILE | Conditional | - | sensitive | Path to TLS certificate for mTLS |
| TLS_KEY_FILE | Conditional | - | secret | Path to TLS private key for mTLS |
| TLS_CA_FILE | Conditional | - | sensitive | Path to CA certificate for gRPC mTLS verification. Not needed for NATS (uses system root CAs with Let’s Encrypt). |
| CHAIN_PROFILE_ID | No | - | public | Chain profile identifier (e.g., celestia-mocha) |
| CHAIN_DATA_DIR | No | /opt/hoodcloud/data/chain | public | Chain data directory |
| CHAIN_RPC_PORT | No | 0 | public | Chain-specific RPC port (0 = auto-detect) |
| CHAIN_TYPE | No | - | public | Chain type (cosmos, ethereum, celestia) - loaded from runtime.yaml |
| HEARTBEAT_INTERVAL | No | 5m | public | Keep-alive heartbeat interval |
| GRACE_PERIOD | No | 30s | public | Graceful shutdown grace period |
| CONFIG_SIGNING_PUBLIC_KEY | No | /opt/hoodcloud/config/signing.pub | sensitive | Path to ECDSA public key for config signature verification |
| REQUIRE_CONFIG_SIGNATURE | No | true | public | Require signature verification for config commands |
| CONFIG_DIR | No | - | public | Directory containing chain config bundles |
| CONFIG_VERSION | No | - | public | Version of config bundle |
| METRICS_ENABLED | No | true | public | Enable Prometheus metrics endpoint on ops-agent |
| METRICS_PORT | No | 9101 | public | Port for Prometheus metrics endpoint |
| OBSERVATION_ENABLED | No | true | public | Enable observation collectors and NATS metrics transport (RFC-007) |
| NATS_URL | No | nats://localhost:4222 | sensitive | NATS server URL for metrics transport |
| HOST_ID | No | - | public | Host identifier for metrics labels |
| SYSTEM_METRICS_INTERVAL | No | 60s | public | How often to collect system metrics (disk, CPU, memory) |
| SYSTEM_METRICS_DATA_PATH | No | / | public | Path to monitor for disk usage (defaults to root filesystem) |

Health Evaluator

| Variable | Required | Default | Security | Description |
| --- | --- | --- | --- | --- |
| HEALTH_EVALUATION_INTERVAL | No | 30s | public | How often to evaluate node health |
| HEALTH_HEARTBEAT_TIMEOUT | No | 60s | public | Heartbeat timeout before marking node as DOWN |
| HEALTH_MIGRATION_COOLDOWN | No | 1h | public | Cooldown period between automatic migrations |
| SUBSCRIPTION_CLEANUP_INTERVAL | No | 1h | public | How often to check for expired subscriptions |
| MAINTENANCE_CLEANUP_INTERVAL | No | 5m | public | How often to check for nodes stuck in maintenance state |
| MAINTENANCE_CLEANUP_TIMEOUT | No | 30m | public | How long a node can stay in maintenance before considered stuck |

Uptime Worker (Hardcoded Defaults)

The UptimeWorker runs inside health-evaluator and uses hardcoded defaults (no environment variables). Values are defined in internal/defaults/defaults.go:
| Setting | Default | Description |
| --- | --- | --- |
| Interval | 5m | How often the worker materializes hourly uptime buckets |
| BatchSize | 500 | Number of nodes processed per batch |
| GracePeriod | 10m | Provisioning grace period counted as uptime |
| RetentionDays | 90 | Hourly buckets older than this are purged automatically |

Observability

| Variable | Required | Default | Security | Description |
| --- | --- | --- | --- | --- |
| LOKI_URL | No | http://localhost:3100 | sensitive | Loki URL for log aggregation |
| PROMETHEUS_URL | No | http://localhost:9090 | sensitive | Prometheus URL for metrics collection |
| VICTORIA_METRICS_URL | No | - | sensitive | Victoria Metrics URL for observation metrics TSDB (RFC-007). Enables metrics ingester and policy evaluation when set. |
| METRICS_CONSUMER_NAME | No | metrics-ingester | public | NATS consumer name for metrics ingester (useful for multi-instance deployments) |
| EVALUATION_CONCURRENCY | No | 10 | public | Number of chains to evaluate in parallel for policy evaluation |
| CIRCUIT_BREAKER_FAILURE_THRESHOLD | No | 5 | public | Number of consecutive failures before opening the circuit breaker |
| CIRCUIT_BREAKER_SUCCESS_THRESHOLD | No | 3 | public | Number of successes in half-open state before closing the circuit |
| CIRCUIT_BREAKER_TIMEOUT | No | 30s | public | How long to stay open before transitioning to half-open state |
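The three circuit breaker settings describe a closed / open / half-open state machine. The sketch below is one plausible reading using the default values; `CircuitBreaker` is a hypothetical class, and reopening on a failure while half-open is an assumption this page does not confirm.

```python
class CircuitBreaker:
    def __init__(self, failure_threshold=5, success_threshold=3, timeout=30.0):
        self.failure_threshold = failure_threshold
        self.success_threshold = success_threshold
        self.timeout = timeout  # seconds to stay open
        self.state = "closed"
        self.failures = 0       # consecutive failures while closed
        self.successes = 0      # successes while half-open
        self.opened_at = None

    def allow(self, now):
        """Return True if a call may proceed at time `now` (seconds)."""
        if self.state == "open":
            if now - self.opened_at < self.timeout:
                return False
            self.state = "half-open"  # timeout elapsed: probe again
            self.successes = 0
        return True

    def record_success(self):
        if self.state == "half-open":
            self.successes += 1
            if self.successes >= self.success_threshold:
                self.state, self.failures = "closed", 0
        else:
            self.failures = 0  # a success breaks the consecutive-failure run

    def record_failure(self, now):
        if self.state == "half-open":
            self._open(now)  # assumed: any half-open failure reopens
            return
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self._open(now)

    def _open(self, now):
        self.state, self.opened_at, self.failures = "open", now, 0
```

Callers check `allow(now)` before each request and report the outcome with `record_success()` or `record_failure(now)`.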

Telemetry (OpenTelemetry)

| Variable | Required | Default | Security | Description |
| --- | --- | --- | --- | --- |
| OTEL_ENABLED | No | false | public | Enable OpenTelemetry distributed tracing |
| OTEL_COLLECTOR_URL | No | otel-collector:4317 | sensitive | OTLP gRPC endpoint for the OTel Collector |
| OTEL_SAMPLE_RATIO | No | 1.0 | public | Trace sampling ratio (0.0 to 1.0). 1.0 = sample all traces |
| OTEL_INSECURE | No | true | public | Disable TLS for OTLP exporter connection (true for local dev) |

Authentication (Auth Server)

JWT RS256 (Asymmetric)

RS256 uses asymmetric cryptography: the auth-server signs tokens with a private key, and api-server verifies with the public key.
| Variable | Required | Default | Security | Description |
| --- | --- | --- | --- | --- |
| JWT_PRIVATE_KEY_FILE | Yes (auth-server) | - | secret | Path to RSA private key PEM file (auth-server only) |
| JWT_PUBLIC_KEY_FILE | Yes | - | sensitive | Path to RSA public key PEM file (all services) |
| JWT_ACCESS_TOKEN_TTL | No | 15m | public | JWT access token time-to-live |
| JWT_REFRESH_TOKEN_TTL | No | 168h | public | JWT refresh token time-to-live (7 days) |
| JWT_ISSUER | No | hoodcloud | public | JWT issuer claim |
| JWT_AUDIENCE | No | hoodcloud-api | public | JWT audience claim. Auth-server embeds this in issued tokens; all verifying services reject tokens with a mismatched audience (returns 401 Unauthorized). Must be identical across auth-server and api-server. |
| JWT_KEY_ID | No | - | public | Key ID for JWT header (kid), enables key rotation |
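The audience rule for JWT_AUDIENCE can be sketched as a small check on already-decoded claims. `audience_ok` is a hypothetical helper, not the services' actual code; it assumes the `aud` claim may be either a string or a list (both forms are allowed by RFC 7519) and that an unset JWT_AUDIENCE skips the check, as this page notes for backward compatibility.

```python
def audience_ok(claims, expected_audience):
    """Return True if the token's aud claim satisfies JWT_AUDIENCE."""
    if not expected_audience:
        return True  # JWT_AUDIENCE unset: validation skipped
    aud = claims.get("aud")
    if isinstance(aud, str):
        return aud == expected_audience
    if isinstance(aud, (list, tuple)):
        return expected_audience in aud
    return False  # claim missing or of an unexpected type


print(audience_ok({"aud": "hoodcloud-api"}, "hoodcloud-api"))  # True
print(audience_ok({"aud": "other-api"}, "hoodcloud-api"))      # False
```

A verifying service would answer 401 Unauthorized whenever this check returns False.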
Audience claim: When JWT_AUDIENCE is set, the auth-server includes it in every issued token and all verifying services (api-server, agent-gateway) validate that incoming tokens contain this exact audience value. A mismatch causes token validation to fail with 401 Unauthorized. If deploying multiple services, ensure JWT_AUDIENCE is consistent across all of them. Omitting JWT_AUDIENCE skips audience validation entirely (backward compatibility).
Key Generation:
# Generate 2048-bit RSA key pair
openssl genrsa -out jwt_private.pem 2048
openssl rsa -in jwt_private.pem -pubout -out jwt_public.pem

# Store in Vault (production)
vault kv put secret/hoodcloud/app-credentials \
  jwt_private_key=@jwt_private.pem \
  jwt_public_key=@jwt_public.pem \
  # ... other existing fields
Key Resolution Order:
  1. Vault (jwt_private_key, jwt_public_key fields)
  2. File paths (JWT_PRIVATE_KEY_FILE, JWT_PUBLIC_KEY_FILE)

| Variable | Required | Default | Security | Description |
| --- | --- | --- | --- | --- |
| SIWE_DOMAIN | No | localhost | public | (Deprecated) SIWE message domain. Only used when AUTH_PROVIDER=siwe or both. |
| SIWE_URI | No | http://localhost:3000 | public | (Deprecated) SIWE message URI. Only used when AUTH_PROVIDER=siwe or both. |
| AUTH_NONCE_EXPIRY | No | 5m | public | (Deprecated) Auth nonce expiry time. Only used when AUTH_PROVIDER=siwe or both. |
| AUTH_RATE_LIMIT_PER_MINUTE | No | 20 | public | Rate limit for auth endpoints per IP |
| AUTH_MAX_REFRESH_TOKENS_PER_USER | No | 10 | public | Maximum concurrent active sessions per user |
| AUTH_ALLOWED_ORIGINS | No | - | public | Comma-separated allowed CORS origins for auth server |

Clerk (Primary Auth Provider)

| Variable | Required | Default | Security | Description |
| --- | --- | --- | --- | --- |
| AUTH_PROVIDER | No | clerk | public | Auth provider: clerk (default), siwe (legacy), or both (migration) |
| AUTH_CLERK_SECRET_KEY | Conditional | - | secret | Clerk secret key for JWKS JWT verification (required when provider is clerk or both) |
| AUTH_CLERK_AUTHORIZED_PARTY | Conditional | - | public | Comma-separated list of allowed frontend origins for Clerk JWT azp claim, e.g. https://hoodcloud.io,https://app.hoodcloud.io (required when provider is clerk or both) |
| AUTH_CLERK_WEBHOOK_SIGNING_SECRET | Conditional | - | secret | Clerk webhook signing secret for svix signature verification (required for auth-server when provider is clerk or both) |

API Security

| Variable | Required | Default | Security | Description |
| --- | --- | --- | --- | --- |
| API_AUTH_ENABLED | No | false | public | Enable API key authentication |
| DANGEROUS_DISABLE_AUTH | No | false | public | Must be set to true to confirm disabling authentication when API_AUTH_ENABLED=false. Panics at startup if auth is disabled without this flag. |
| API_RATE_LIMIT_ENABLED | No | false | public | Enable rate limiting |
| API_RATE_LIMIT_REQUESTS | No | 100 | public | Requests per minute per API key |
| API_RATE_LIMIT_REDIS | No | false | public | Enable Redis-backed distributed rate limiting. When true, uses Redis INCR+EXPIRE sliding window instead of in-memory token bucket. Includes circuit breaker: after 5 consecutive Redis failures, falls back to in-memory limiter automatically. |
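The interaction between API_AUTH_ENABLED and DANGEROUS_DISABLE_AUTH amounts to a startup guard. A minimal sketch follows; `check_auth_flags` is hypothetical, and treating only the literal string "true" (case-insensitive) as enabled is an assumption about how booleans are parsed.

```python
def check_auth_flags(env):
    """Return whether auth is enabled; refuse to start if it is off unconfirmed."""
    auth_enabled = env.get("API_AUTH_ENABLED", "false").lower() == "true"
    confirmed = env.get("DANGEROUS_DISABLE_AUTH", "false").lower() == "true"
    if not auth_enabled and not confirmed:
        # Mirrors the documented startup panic
        raise RuntimeError(
            "API_AUTH_ENABLED=false requires DANGEROUS_DISABLE_AUTH=true"
        )
    return auth_enabled
```

Note that with both variables left at their defaults the guard fails, so a deployment must either enable authentication or explicitly confirm that it is disabled.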

NATS Event Streaming

| Variable | Required | Default | Security | Description |
| --- | --- | --- | --- | --- |
| NATS_ENABLED | No | false | public | Enable NATS event streaming |
| NATS_URL | No | nats://localhost:4222 | sensitive | NATS server URL (internal, for control plane) |
| NATS_PUBLIC_URL | No | - | sensitive | Public NATS URL (for ops-agent on external VMs) |
| NATS_ACCOUNT | No | - | sensitive | NATS account name for multi-account auth |
| NATS_STREAM_NAME | No | HOODCLOUD_EVENTS | public | JetStream stream name |
| NATS_CONSUMER_NAME | No | control-plane | public | Durable consumer name |
| NATS_MAX_MESSAGE_SIZE | No | 1048576 | public | Maximum message size in bytes (default: 1MB) |
| NATS_STREAM_REPLICAS | No | 1 | public | JetStream stream replica count. Set to 3 for production with a 3-node NATS cluster. Applies to HOODCLOUD_EVENTS and HOODCLOUD_METRICS streams. |
Cross-repo dependency: The PAYMENTS stream is managed by the payment service repository. Its replica count must be configured separately in the payment service.

NATS Multi-Account Authentication

When NATS_ACCOUNTS_ENABLED=true, each service component connects with its own account and subject-level permissions. Tokens are delivered to ops-agents via gRPC after registration (no shared token in cloud-init).
| Variable | Required | Default | Security | Description |
| --- | --- | --- | --- | --- |
| NATS_ACCOUNTS_ENABLED | Yes (prod) | false | public | Enable per-service NATS account auth. Required true in production when NATS is enabled. |
| NATS_ACCOUNT_OPS_AGENT_TOKEN | Conditional | - | secret | NATS token for ops-agent account (delivered via gRPC) |
| NATS_ACCOUNT_PAYMENT_SVC_TOKEN | Conditional | - | secret | NATS token for payment-service account |
| NATS_ACCOUNT_CONTROL_PLANE_TOKEN | Conditional | - | secret | NATS token for control-plane account |
| NATS_ACCOUNT_HEALTH_EVAL_TOKEN | Conditional | - | secret | NATS token for health-evaluator account |

Payment Service Integration

The payment service runs as a separate microservice with its own database. These variables configure the main app’s connection to the payment service.

Payment Service Client (gRPC)

| Variable | Required | Default | Security | Description |
| --- | --- | --- | --- | --- |
| PAYMENT_SERVICE_ENABLED | No | false | public | Enable payment service integration |
| PAYMENT_SERVICE_ADDRESS | Conditional | - | sensitive | Payment service gRPC address (e.g., payment-service.hoodcloud.internal:50051) |
| PAYMENT_SERVICE_CERT_FILE | Conditional | - | sensitive | Client certificate file for mTLS (see note below) |
| PAYMENT_SERVICE_KEY_FILE | Conditional | - | secret | Client key file for mTLS (see note below) |
| PAYMENT_SERVICE_CA_FILE | Conditional | - | sensitive | CA certificate for server verification (see note below) |
| PAYMENT_SERVICE_SERVER_NAME | No | - | public | Expected server name for TLS verification (SNI) |
| PAYMENT_SERVICE_TIMEOUT | No | 30s | public | Default timeout for gRPC calls |
| PAYMENT_SERVICE_INSECURE | No | false | public | Skip TLS (development only) |
mTLS Credentials via Vault: When using Vault for secrets management, payment service mTLS credentials can be stored in the app credentials secret instead of using file paths. Add these fields to your app credentials JSON:
{
  "payment_client_cert": "-----BEGIN CERTIFICATE-----\n...",
  "payment_client_key": "-----BEGIN RSA PRIVATE KEY-----\n...",
  "payment_ca_cert": "-----BEGIN CERTIFICATE-----\n..."
}
When these fields are present in Vault, they take precedence over the file path environment variables. This is the recommended approach for production deployments.

Payment Event Consumer (NATS)

| Variable | Required | Default | Security | Description |
| --- | --- | --- | --- | --- |
| PAYMENT_CONSUMER_ENABLED | No | false | public | Enable payment event consumer |
| PAYMENT_CONSUMER_URL | No | NATS_URL | sensitive | NATS server URL for payment events |
| PAYMENT_CONSUMER_STREAM_NAME | No | PAYMENTS | public | JetStream stream name for payment events |
| PAYMENT_CONSUMER_NAME | No | main-app-payment | public | Durable consumer name |
| PAYMENT_CONSUMER_ACK_WAIT | No | 30s | public | Time before unacked message is redelivered |
| PAYMENT_CONSUMER_MAX_DELIVER | No | 5 | public | Maximum delivery attempts |
| PAYMENT_CONSUMER_IDEMPOTENCY_TTL | No | 24h | public | TTL for processed idempotency keys |
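Because unacknowledged messages are redelivered (up to the maximum delivery count), the consumer has to tolerate seeing the same event more than once; the idempotency TTL bounds how long processed keys are remembered. The sketch below shows the idea with an in-memory dictionary. `IdempotencyStore` is hypothetical, and where HoodCloud actually keeps these keys is not stated on this page.

```python
class IdempotencyStore:
    """Remember processed event keys for `ttl_seconds` (default 24h)."""

    def __init__(self, ttl_seconds=24 * 3600):
        self.ttl = ttl_seconds
        self._expiry = {}  # key -> time after which it is forgotten

    def first_time(self, key, now):
        """Return True (and record the key) unless it was seen within the TTL."""
        expiry = self._expiry.get(key)
        if expiry is not None and expiry > now:
            return False  # duplicate delivery: skip processing, just ack
        self._expiry[key] = now + self.ttl
        return True


store = IdempotencyStore()
print(store.first_time("evt-1", now=0))     # True: process the event
print(store.first_time("evt-1", now=3600))  # False: redelivery is ignored
```

The TTL should comfortably exceed the longest redelivery window (ack wait times maximum deliveries) so a late retry is still recognized as a duplicate.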

Payment Service (Separate Microservice)

The payment service runs as an isolated microservice with its own configuration. See payment-service/config/config.example.yaml for full configuration reference.
| Variable | Required | Default | Security | Description |
| --- | --- | --- | --- | --- |
| PAYMENT_CONFIG | No | - | public | Path to YAML configuration file |
| GRPC_ADDRESS | No | :50051 | public | gRPC server listen address |
| HTTP_ADDRESS | No | :8080 | public | HTTP server (health checks) |
| METRICS_ADDRESS | No | :9090 | public | Prometheus metrics server listen address |
| DB_HOST | Yes | localhost | sensitive | PostgreSQL host |
| DB_PORT | No | 5432 | public | PostgreSQL port |
| DB_USER | Yes | payment | sensitive | PostgreSQL user |
| DB_PASSWORD | Yes | - | secret | PostgreSQL password |
| DB_NAME | Yes | payment | sensitive | PostgreSQL database |
| DB_SSL_MODE | No | require | public | PostgreSQL SSL mode |
| NATS_URL | No | nats://localhost:4222 | sensitive | NATS server URL |
| NATS_STREAM_NAME | No | PAYMENTS | public | JetStream stream name |
| NATS_CTRL_SIGNING_SEED | Conditional | - | secret | CTRL account signing seed for JWT auth (from Vault in production) |
| NATS_CTRL_ACCOUNT_PUB | Conditional | - | sensitive | CTRL account public key for JWT auth |
| PRICING_CONFIG_PATH | No | config/pricing.yaml | public | Path to pricing configuration |
| TLS_CERT_FILE | Conditional | - | sensitive | Server TLS certificate (mTLS) |
| TLS_KEY_FILE | Conditional | - | secret | Server TLS private key (mTLS) |
| TLS_CA_FILE | Conditional | - | sensitive | CA certificate for client verification (mTLS) |
| TLS_INSECURE | No | false | public | Disable TLS (development only) |
| TLS_ALLOWED_CN | No | - | sensitive | Required Common Name for client certificate authentication. When set, gRPC clients must present a certificate with this CN. |

Payment Service Vault Integration

When Vault is enabled for the payment service, it provides AWS credentials for S3 operations instead of static environment variables.
| Variable | Required | Default | Security | Description |
| --- | --- | --- | --- | --- |
| VAULT_ADDRESS | Yes (if Vault) | - | sensitive | Vault server address (e.g., https://<vault-ip>:8200). Note: uses VAULT_ADDRESS not VAULT_ADDR. |
| VAULT_ROLE_ID | Yes (if Vault) | - | sensitive | AppRole role_id for authentication |
| VAULT_SECRET_ID_PATH | Yes (if Vault) | - | sensitive | Path to file containing AppRole secret_id |
| VAULT_NAMESPACE | No | - | public | Vault namespace (enterprise feature) |
| VAULT_TLS_SKIP_VERIFY | No | false | public | Skip TLS certificate verification. Rejected in production. |
| VAULT_TLS_CA_FILE | Yes (if Vault TLS) | - | sensitive | Path to CA certificate for Vault TLS verification |
| VAULT_APP_CREDENTIALS_PATH | No | hoodcloud/app-credentials | sensitive | Vault KV path for payment service credentials |
| VAULT_CACHE_TTL | No | 5m | public | Cache TTL for Vault secrets |
Note: Payment service uses koanf for configuration (same pattern as main app). Environment variables override YAML configuration values.

Tempo Crypto Provider

The payment service supports Tempo network TIP-20 stablecoins for crypto payments. Tempo uses TransferWithMemo events for exact payment matching via memo field (payment ID as bytes32).
| Variable | Required | Default | Security | Description |
| --- | --- | --- | --- | --- |
| TEMPO_ENABLED | No | false | public | Enable Tempo crypto payment provider |
| TEMPO_RPC_URL | No | https://rpc.moderato.tempo.xyz | public | Tempo network RPC endpoint |
| TEMPO_RECEIVER_ADDRESS | Yes (if Tempo) | - | sensitive | Wallet address for receiving payments |
| TEMPO_CHAIN_ID | No | 42431 | public | Tempo chain ID (42431 = Moderato testnet) |
| TEMPO_POLL_INTERVAL | No | 10s | public | Interval for polling transfer events |
| TEMPO_TOKEN_ALPHAUSD | No | 0x20c0…001 | public | AlphaUSD TIP-20 contract address |
| TEMPO_TOKEN_BETAUSD | No | - | public | BetaUSD TIP-20 contract address |

Stripe Payment Provider

The payment service supports Stripe Checkout for card payments. Stripe secrets are sourced from Vault at secret/payment-service/credentials.
| Variable | Required | Default | Security | Description |
| --- | --- | --- | --- | --- |
| STRIPE_ENABLED | No | false | public | Enable Stripe payment provider |
Vault-Sourced Secrets:
| Vault Field | Vault Path | Description |
| --- | --- | --- |
| stripe_secret_key | secret/payment-service/credentials | Stripe API secret key (sk_live_... or sk_test_...) |
| stripe_webhook_secret | secret/payment-service/credentials | Stripe webhook signing secret (whsec_...) |
These values are loaded by the payment service Vault client at startup and injected into the Stripe adapter configuration. They are not set via environment variables.

Cloud Provider Credentials

Hetzner

| Variable | Required | Default | Security | Description |
| --- | --- | --- | --- | --- |
| HCLOUD_TOKEN | Yes (Hetzner) | - | secret | Hetzner Cloud API token for infrastructure provisioning |

OVH

OVH credentials are required only when using OVH providers (ovh-public-cloud, ovh-vps, ovh-dedicated). Create API credentials at: https://api.ovh.com/createToken/
| Variable | Required | Default | Security | Description |
| --- | --- | --- | --- | --- |
| OVH_ENDPOINT | No | ovh-eu | public | OVH API endpoint (ovh-eu, ovh-ca, ovh-us, kimsufi-eu, soyoustart-eu) |
| OVH_APPLICATION_KEY | Yes (OVH) | - | secret | OVH Application Key from createToken page |
| OVH_APPLICATION_SECRET | Yes (OVH) | - | secret | OVH Application Secret from createToken page |
| OVH_CONSUMER_KEY | Yes (OVH) | - | secret | OVH Consumer Key from createToken page |
| OVH_CLOUD_PROJECT_SERVICE | Yes (Public Cloud) | - | sensitive | OVH Cloud Project Service ID (required for ovh-public-cloud provider) |
| OVH_SUBSIDIARY | Yes (VPS/Dedicated) | EU | public | OVH billing subsidiary (EU, FR, GB, DE, US, CA) |
OVH Provider Types:
| Provider | Description | Use Case |
| --- | --- | --- |
| ovh-public-cloud | Public Cloud instances (Beta API) | Most Hetzner-like, supports cloud-init |
| ovh-vps | Traditional VPS (order-based) | Lower cost, no external volumes |
| ovh-dedicated | Bare-metal dedicated servers | High performance, 2h delivery time |

E2E Testing

| Variable | Required | Default | Security | Description |
| --- | --- | --- | --- | --- |
| E2E_API_URL | No | http://localhost:8081 | sensitive | API base URL for E2E tests |
| E2E_GRPC_ADDRESS | No | localhost:9091 | sensitive | gRPC address for E2E tests |
| E2E_DB_HOST | No | localhost | sensitive | Database host for E2E tests |
| E2E_DB_PORT | No | 5433 | public | Database port for E2E tests |
| E2E_DB_USER | No | hoodcloud | sensitive | Database user for E2E tests |
| E2E_DB_PASSWORD | No | e2e_test_password | secret | Database password for E2E tests |
| E2E_DB_NAME | No | hoodcloud_e2e | sensitive | Database name for E2E tests |
| E2E_TEMPORAL_HOST | No | localhost | sensitive | Temporal host for E2E tests |
| E2E_TEMPORAL_PORT | No | 7234 | public | Temporal port for E2E tests |
| E2E_SSH_KEY_PATH | No | ~/.ssh/e2e_test_key | sensitive | Path to SSH key for E2E VM access |
| E2E_SSH_KEY_NAME | No | hoodcloud-e2e-test | public | SSH key name in Hetzner |
| E2E_TEST_REGION | No | fsn1 | public | Hetzner region for E2E tests |
| E2E_TEST_INSTANCE_TYPE | No | cpx22 | public | Hetzner instance type for E2E tests |
| E2E_CLEANUP_ON_FAILURE | No | true | public | Clean up resources on test failure |

Docker Compose Environment

| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| POSTGRES_USER | No | hoodcloud | sensitive | PostgreSQL user for Docker Compose |
| POSTGRES_PASSWORD | Yes (Compose) | - | secret | PostgreSQL password for Docker Compose |
| POSTGRES_DB | No | hoodcloud | public | PostgreSQL database name for Docker Compose |
| GRAFANA_ADMIN_USER | No | admin | sensitive | Grafana admin username |
| GRAFANA_ADMIN_PASSWORD | Yes | - | secret | Grafana admin password (no default; must be set explicitly) |
| NGROK_AUTHTOKEN | Conditional | - | secret | ngrok auth token (required for E2E with external VMs) |
| OPS_AGENT_VERSION | No | latest | public | Ops agent version for orchestrator deployment |
| NATS_JWT_ENABLED | No | false | public | Enable NATS JWT operator mode authentication |
| NATS_JWT_CTRL_ACCOUNT_PUB | Conditional | - | sensitive | CTRL account public key (from nats-jwt-setup) |
| NATS_JWT_AGENT_ACCOUNT_PUB | Conditional | - | sensitive | AGENT account public key (from nats-jwt-setup) |
| NATS_JWT_EXTERNAL_URL | Conditional | - | sensitive | Public NATS URL for ops-agents (tls://…) |
| DOMAIN_API | No | api.localhost | public | Domain for API server reverse proxy (Caddy) |
| DOMAIN_GRAFANA | No | grafana.localhost | public | Domain for Grafana dashboard reverse proxy (Caddy) |
| DOMAIN_STATUS | No | status.localhost | public | Domain for Gatus status page reverse proxy (Caddy) |
| DOMAIN_PAY | No | pay.localhost | public | Domain for payment service reverse proxy (Caddy, docker-compose.payment.yml) |
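A pre-flight check for the Docker Compose variables might look like the sketch below. It assumes that the three "Conditional" NATS_JWT_* variables become required once NATS_JWT_ENABLED is true; the table does not state that rule explicitly, and the function name missing_compose_vars is hypothetical.

```python
import os

# Variables marked "Yes" in the Docker Compose table.
ALWAYS_REQUIRED = ("POSTGRES_PASSWORD", "GRAFANA_ADMIN_PASSWORD")

# Assumption: these are needed only when NATS JWT operator mode is enabled.
NATS_JWT_CONDITIONAL = (
    "NATS_JWT_CTRL_ACCOUNT_PUB",
    "NATS_JWT_AGENT_ACCOUNT_PUB",
    "NATS_JWT_EXTERNAL_URL",
)


def missing_compose_vars(env=None):
    """Return the names of required Compose variables that are unset or empty."""
    env = os.environ if env is None else env
    needed = list(ALWAYS_REQUIRED)
    if env.get("NATS_JWT_ENABLED", "false").strip().lower() == "true":
        needed.extend(NATS_JWT_CONDITIONAL)
    return [name for name in needed if not env.get(name)]
```

Running such a check before docker compose up surfaces an unset GRAFANA_ADMIN_PASSWORD early instead of at container start.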

Payment Service (docker-compose.payment.yml)

These variables are used when running the payment service via docker compose -f docker-compose.yml -f docker-compose.payment.yml up:
| Variable | Required | Default | Security | Description |
|---|---|---|---|---|
| PAYMENT_DB_USER | No | payment | sensitive | Payment service PostgreSQL user |
| PAYMENT_DB_PASSWORD | Yes | - | secret | Payment service PostgreSQL password |
| PAYMENT_DB_NAME | No | payment | public | Payment service PostgreSQL database |

Variables by Service

auth-server

Required:
  • DB_HOST, DB_USER, DB_PASSWORD, DB_NAME
  • REDIS_HOST (for Redis-backed rate limiting)
  • JWT_PRIVATE_KEY_FILE + JWT_PUBLIC_KEY_FILE (or via Vault)
  • AUTH_CLERK_SECRET_KEY, AUTH_CLERK_AUTHORIZED_PARTY, AUTH_CLERK_WEBHOOK_SIGNING_SECRET (when using Clerk, which is the default)
Optional:
  • SERVER_HOST, SERVER_PORT
  • DB_PORT, DB_SSL_MODE, DB_MAX_CONNS, DB_MIN_CONNS, DB_MAX_CONN_LIFETIME, DB_MAX_CONN_IDLE_TIME, DB_STATEMENT_TIMEOUT
  • REDIS_PORT, REDIS_PASSWORD, REDIS_DB
  • JWT_ACCESS_TOKEN_TTL, JWT_REFRESH_TOKEN_TTL, JWT_ISSUER, JWT_AUDIENCE
  • AUTH_PROVIDER (default: clerk)
  • AUTH_RATE_LIMIT_PER_MINUTE, AUTH_ALLOWED_ORIGINS
  • ENVIRONMENT, LOG_LEVEL
Deprecated (SIWE legacy):
  • SIWE_DOMAIN, SIWE_URI, AUTH_NONCE_EXPIRY (only needed when AUTH_PROVIDER=siwe or both)
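The provider-dependent requirements above can be expressed as a small lookup. This is a sketch under stated assumptions, not the auth-server's actual validation: it leaves out the JWT key files because they may come from Vault, and it treats SIWE_DOMAIN and SIWE_URI as needed for the siwe and both providers while assuming AUTH_NONCE_EXPIRY has a usable default.

```python
import os

BASE_REQUIRED = ["DB_HOST", "DB_USER", "DB_PASSWORD", "DB_NAME", "REDIS_HOST"]
CLERK_REQUIRED = [
    "AUTH_CLERK_SECRET_KEY",
    "AUTH_CLERK_AUTHORIZED_PARTY",
    "AUTH_CLERK_WEBHOOK_SIGNING_SECRET",
]
# Assumption: these two must be set explicitly when SIWE is enabled.
SIWE_REQUIRED = ["SIWE_DOMAIN", "SIWE_URI"]


def auth_server_missing(env=None):
    """List auth-server variables that are unset for the selected AUTH_PROVIDER."""
    env = os.environ if env is None else env
    provider = env.get("AUTH_PROVIDER", "clerk")  # documented default is clerk
    needed = list(BASE_REQUIRED)
    if provider in ("clerk", "both"):
        needed += CLERK_REQUIRED
    if provider in ("siwe", "both"):
        needed += SIWE_REQUIRED
    return [name for name in needed if not env.get(name)]
```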

api-server

Required:
  • SERVER_PUBLIC_URL
  • DB_HOST, DB_USER, DB_PASSWORD, DB_NAME
  • REDIS_HOST
  • TEMPORAL_HOST
  • AWS_S3_BUCKET
Optional:
  • SERVER_HOST, SERVER_PORT, SERVER_READ_TIMEOUT, SERVER_WRITE_TIMEOUT
  • DB_PORT, DB_SSL_MODE, DB_MAX_CONNS, DB_MIN_CONNS, DB_MAX_CONN_LIFETIME, DB_MAX_CONN_IDLE_TIME, DB_STATEMENT_TIMEOUT
  • REDIS_PORT, REDIS_PASSWORD, REDIS_DB
  • TEMPORAL_PORT, TEMPORAL_NAMESPACE, TEMPORAL_TASK_QUEUE
  • AWS_REGION, AWS_ENDPOINT_URL, AWS_S3_BACKUP_PREFIX
  • OPS_AGENT_BINARIES_DIR
  • CHAIN_PROFILES_DIR, CHAIN_CONFIGS_SOURCE, CHAINS_S3_BUCKET, CHAINS_S3_REGION, CHAINS_S3_PREFIX, CHAINS_CHECK_INTERVAL, CHAIN_CONFIGS_WATCH
  • JWT_PUBLIC_KEY_FILE, JWT_AUDIENCE for JWT validation (RS256)
  • AUTH_PROVIDER, AUTH_CLERK_SECRET_KEY, AUTH_CLERK_AUTHORIZED_PARTY (when Clerk provider enabled; no webhook secret — api-server does not handle webhooks)
  • API_AUTH_ENABLED, API_RATE_LIMIT_ENABLED, API_RATE_LIMIT_REQUESTS
  • PAYMENT_CONSUMER_ENABLED, PAYMENT_CONSUMER_URL, PAYMENT_CONSUMER_STREAM_NAME (for processing payment events)
  • ENVIRONMENT, LOG_LEVEL

agent-gateway

Required:
  • DB_HOST, DB_USER, DB_PASSWORD, DB_NAME
  • REDIS_HOST
  • GRPC_PUBLIC_URL
Optional:
  • DB_PORT, DB_SSL_MODE, DB_MAX_CONNS, DB_MIN_CONNS, DB_MAX_CONN_LIFETIME, DB_MAX_CONN_IDLE_TIME, DB_STATEMENT_TIMEOUT
  • REDIS_PORT, REDIS_PASSWORD, REDIS_DB
  • AWS_REGION, AWS_ENDPOINT_URL
  • GRPC_HOST, GRPC_PORT, GRPC_USE_TLS, GRPC_CERT_FILE, GRPC_KEY_FILE, GRPC_CA_FILE
  • GRPC_CONFIG_SIGNING_KEY
  • GRPC_RATE_LIMIT_RPS, GRPC_RATE_LIMIT_BURST, GRPC_RATE_LIMIT_CLEANUP_INTERVAL
  • COMMAND_PROGRESS_TTL
  • NATS_ENABLED, NATS_URL, NATS_STREAM_NAME, NATS_MAX_MESSAGE_SIZE
  • ENVIRONMENT, LOG_LEVEL

orchestrator

Required:
  • DB_HOST, DB_USER, DB_PASSWORD, DB_NAME
  • REDIS_HOST
  • TEMPORAL_HOST
  • AWS_S3_BUCKET
  • SERVER_PUBLIC_URL, GRPC_PUBLIC_URL
  • HCLOUD_TOKEN (if using Hetzner provider)
Optional:
  • DB_PORT, DB_SSL_MODE, DB_MAX_CONNS, DB_MIN_CONNS, DB_MAX_CONN_LIFETIME, DB_MAX_CONN_IDLE_TIME, DB_STATEMENT_TIMEOUT
  • REDIS_PORT, REDIS_PASSWORD, REDIS_DB
  • TEMPORAL_PORT, TEMPORAL_NAMESPACE, TEMPORAL_TASK_QUEUE
  • AWS_REGION, AWS_ENDPOINT_URL, AWS_S3_BACKUP_PREFIX
  • GRPC_USE_TLS, GRPC_CONFIG_SIGNING_KEY
  • TERRAFORM_BINARY, TERRAFORM_STATE_DIR, TERRAFORM_PARALLELISM
  • TERRAFORM_STATE_BACKEND, TERRAFORM_S3_BUCKET, TERRAFORM_S3_REGION, TERRAFORM_DYNAMODB_TABLE
  • TERRAFORM_INIT_TIMEOUT, TERRAFORM_PLAN_TIMEOUT, TERRAFORM_APPLY_TIMEOUT, TERRAFORM_DESTROY_TIMEOUT, TERRAFORM_OUTPUT_TIMEOUT
  • COMMAND_PROGRESS_TTL
  • SKIP_TERRAFORM, DEV_MODE
  • CHAIN_PROFILES_DIR, CHAIN_CONFIGS_SOURCE, CHAINS_S3_BUCKET, CHAINS_S3_REGION, CHAINS_S3_PREFIX, CHAINS_CHECK_INTERVAL, CHAIN_CONFIGS_WATCH
  • CHAIN_CONFIGS_ENABLED, CONFIG_BUNDLE_VERSION, CHAIN_CONFIGS_S3_BUCKET, CHAIN_CONFIGS_S3_REGION, CHAIN_CONFIGS_URL_EXPIRY, CHAIN_CONFIGS_LOCAL_DIR, S3_PUBLIC_ENDPOINT_URL (for ops-agent config bundles)
  • OPS_AGENT_VERSION
  • DEFAULT_PROVIDER, DEFAULT_REGION
  • OVH_ENDPOINT, OVH_APPLICATION_KEY, OVH_APPLICATION_SECRET, OVH_CONSUMER_KEY, OVH_CLOUD_PROJECT_SERVICE (if using OVH providers)
  • ENVIRONMENT, LOG_LEVEL

health-evaluator

Required:
  • DB_HOST, DB_USER, DB_PASSWORD, DB_NAME
  • TEMPORAL_HOST
Optional:
  • DB_PORT, DB_SSL_MODE, DB_MAX_CONNS, DB_MIN_CONNS, DB_MAX_CONN_LIFETIME, DB_MAX_CONN_IDLE_TIME, DB_STATEMENT_TIMEOUT
  • TEMPORAL_PORT, TEMPORAL_NAMESPACE, TEMPORAL_TASK_QUEUE
  • AWS_REGION, AWS_S3_BUCKET, AWS_S3_BACKUP_PREFIX (for backup cleanup)
  • HEALTH_EVALUATION_INTERVAL, HEALTH_HEARTBEAT_TIMEOUT, HEALTH_MIGRATION_COOLDOWN
  • SUBSCRIPTION_CLEANUP_INTERVAL
  • MAINTENANCE_CLEANUP_INTERVAL, MAINTENANCE_CLEANUP_TIMEOUT
  • NATS_ENABLED, NATS_URL, NATS_STREAM_NAME, NATS_CONSUMER_NAME, NATS_MAX_MESSAGE_SIZE, NATS_STREAM_REPLICAS
  • VICTORIA_METRICS_URL, METRICS_CONSUMER_NAME (for observation metrics ingestion and policy evaluation)
  • EVALUATION_CONCURRENCY (for parallel chain policy evaluation)
  • CIRCUIT_BREAKER_FAILURE_THRESHOLD, CIRCUIT_BREAKER_SUCCESS_THRESHOLD, CIRCUIT_BREAKER_TIMEOUT (for metrics ingester protection)
  • CHAIN_PROFILES_DIR, CHAIN_CONFIGS_SOURCE, CHAINS_S3_BUCKET, CHAINS_S3_REGION, CHAINS_S3_PREFIX, CHAINS_CHECK_INTERVAL (for loading observation specs and policies)
  • ENVIRONMENT, LOG_LEVEL
Built-in workers (no env vars needed) use Vault-sourced settings for incident notifications:
  • incident_slack_webhook_url — Slack incoming webhook URL (optional, enables Slack notifications)
  • incident_telegram_bot_token — Telegram bot token (optional, enables Telegram notifications)
  • incident_telegram_chat_id — Telegram chat ID (required if bot token is set)
  • incident_email_api_url — Email delivery service API URL (optional, enables Email notifications)
  • incident_email_api_key — Email delivery service API key (required if API URL is set)
  • incident_email_from — Sender email address for incident notifications
  • incident_email_to — Recipient email address for incident notifications
  • incident_webhook_url — Outbound webhook URL (optional, enables Webhook notifications)
  • incident_webhook_secret — Shared secret sent in X-Webhook-Secret header for request verification
These are read from Vault at secret/hoodcloud/app-credentials, not from environment variables. The health-evaluator initializes the incident notification pipeline at startup based on which credentials are present.
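The "enabled when present" rules for the notification channels can be sketched as follows. The function takes an already-fetched dictionary of credentials, so no Vault client is involved; the function name enabled_channels and the returned channel labels are hypothetical, and the real pipeline may apply additional checks (for example on incident_email_from and incident_email_to).

```python
def enabled_channels(creds):
    """Return (channels, problems) for a dict of Vault-sourced incident credentials."""
    channels, problems = [], []

    if creds.get("incident_slack_webhook_url"):
        channels.append("slack")

    if creds.get("incident_telegram_bot_token"):
        if creds.get("incident_telegram_chat_id"):
            channels.append("telegram")
        else:
            problems.append("incident_telegram_chat_id is required when a bot token is set")

    if creds.get("incident_email_api_url"):
        if creds.get("incident_email_api_key"):
            channels.append("email")
        else:
            problems.append("incident_email_api_key is required when an API URL is set")

    if creds.get("incident_webhook_url"):
        channels.append("webhook")

    return channels, problems
```

For example, a secret containing only a Slack webhook URL and a Telegram bot token yields the slack channel plus one reported problem for the missing chat ID.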

ops-agent (on VMs)

Required:
  • NODE_ID
  • GRPC_URL (or CONTROL_PLANE_URL for backward compatibility)
Optional:
  • OPS_AGENT_CONFIG (path to YAML configuration file)
  • NODE_TOKEN
  • GRPC_USE_TLS, GRPC_PORT, GRPC_ADDRESS
  • TLS_CERT_FILE, TLS_KEY_FILE, TLS_CA_FILE
  • CHAIN_PROFILE_ID, CHAIN_DATA_DIR, CHAIN_RPC_PORT, CHAIN_TYPE
  • HEARTBEAT_INTERVAL, GRACE_PERIOD
  • CONFIG_SIGNING_PUBLIC_KEY, REQUIRE_CONFIG_SIGNATURE
  • CONFIG_DIR, CONFIG_VERSION
  • METRICS_ENABLED, METRICS_PORT
  • OBSERVATION_ENABLED, NATS_URL, HOST_ID (for observation metrics transport; NATS auth via JWT operator mode)
  • SYSTEM_METRICS_INTERVAL, SYSTEM_METRICS_DATA_PATH (for system metrics collection)
  • ENVIRONMENT, LOG_LEVEL
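The backward-compatible lookup for the control-plane address can be pictured like this. The sketch assumes GRPC_URL wins when both variables are set; the actual precedence is defined in internal/opsagent/config.go, and resolve_grpc_url is a hypothetical name.

```python
import os


def resolve_grpc_url(env=None):
    """Prefer GRPC_URL, falling back to the legacy CONTROL_PLANE_URL."""
    env = os.environ if env is None else env
    # Assumption: GRPC_URL takes priority when both variables are present.
    url = env.get("GRPC_URL") or env.get("CONTROL_PLANE_URL")
    if not url:
        raise ValueError("set GRPC_URL (or CONTROL_PLANE_URL for backward compatibility)")
    return url
```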

payment-service (Isolated Microservice)

Required:
  • DB_HOST, DB_USER, DB_PASSWORD, DB_NAME
Conditional (for mTLS):
  • TLS_CERT_FILE, TLS_KEY_FILE, TLS_CA_FILE (unless TLS_INSECURE=true)
  • TLS_ALLOWED_CN (for client certificate CN verification)
Conditional (for Vault):
  • VAULT_ADDRESS, VAULT_ROLE_ID, VAULT_SECRET_ID_PATH
  • VAULT_TLS_CA_FILE, VAULT_APP_CREDENTIALS_PATH
Optional:
  • PAYMENT_CONFIG
  • GRPC_ADDRESS, HTTP_ADDRESS, METRICS_ADDRESS
  • DB_PORT, DB_SSL_MODE, DB_MAX_CONNS, DB_MIN_CONNS
  • NATS_URL, NATS_STREAM_NAME, NATS_CTRL_SIGNING_SEED, NATS_CTRL_ACCOUNT_PUB
  • PRICING_CONFIG_PATH
  • TLS_INSECURE (development only)
  • VAULT_NAMESPACE, VAULT_TLS_SKIP_VERIFY (dev only — rejected in production), VAULT_CACHE_TTL
  • OTEL_ENABLED, OTEL_COLLECTOR_URL, OTEL_SAMPLE_RATIO
  • TEMPO_ENABLED, TEMPO_RPC_URL, TEMPO_RECEIVER_ADDRESS, TEMPO_CHAIN_ID, TEMPO_POLL_INTERVAL, TEMPO_TOKEN_ALPHAUSD, TEMPO_TOKEN_BETAUSD
  • STRIPE_ENABLED (Stripe secrets sourced from Vault, not environment variables)

Security Classifications

Secret (Must Never Be Exposed)

These variables contain credentials, tokens, or cryptographic keys:
  • JWT_PRIVATE_KEY_FILE (file content is secret)
  • DB_PASSWORD
  • REDIS_PASSWORD
  • AWS_ACCESS_KEY_ID
  • AWS_SECRET_ACCESS_KEY
  • VAULT_ROLE_ID
  • VAULT_SECRET_ID_PATH (file content is secret)
  • GRPC_KEY_FILE
  • GRPC_CONFIG_SIGNING_KEY
  • TLS_KEY_FILE
  • HCLOUD_TOKEN
  • OVH_APPLICATION_KEY
  • OVH_APPLICATION_SECRET
  • OVH_CONSUMER_KEY
  • E2E_DB_PASSWORD
  • POSTGRES_PASSWORD
  • GRAFANA_ADMIN_PASSWORD
  • NGROK_AUTHTOKEN
  • NATS_JWT_CTRL_ACCOUNT_PUB is listed here for reference only: it holds a public key and is classified as sensitive, not secret
  • NATS_CTRL_SIGNING_SEED (secret — loaded from Vault in production)
  • AUTH_CLERK_SECRET_KEY
  • AUTH_CLERK_WEBHOOK_SIGNING_SECRET
Best Practices:
  • Never commit to version control
  • Store in secure secret management systems (HashiCorp Vault)
  • Use environment-specific .env files (gitignored)
  • Rotate regularly
  • Use IAM roles/instance profiles instead of static credentials where possible

Sensitive (Should Not Be Public)

These variables contain internal URLs, usernames, or configuration that should not be publicly disclosed:
  • SERVER_PUBLIC_URL
  • GRPC_PUBLIC_URL
  • DB_HOST, DB_USER, DB_NAME
  • REDIS_HOST
  • TEMPORAL_HOST
  • AWS_S3_BUCKET
  • AWS_ENDPOINT_URL
  • VAULT_ADDR
  • VAULT_MASTER_KEY_PATH, VAULT_APP_CREDENTIALS_PATH
  • VAULT_TLS_CA_FILE
  • JWT_PUBLIC_KEY_FILE
  • GRPC_CERT_FILE, GRPC_CA_FILE
  • TLS_CERT_FILE, TLS_CA_FILE
  • NODE_ID
  • CONTROL_PLANE_URL, GRPC_URL
  • CONFIG_SIGNING_PUBLIC_KEY
  • LOKI_URL, PROMETHEUS_URL, VICTORIA_METRICS_URL
  • NATS_URL, NATS_PUBLIC_URL
  • OPS_AGENT_DOWNLOAD_URL
  • S3_PUBLIC_ENDPOINT_URL
  • E2E_API_URL, E2E_GRPC_ADDRESS, E2E_DB_HOST, E2E_DB_USER, E2E_DB_NAME, E2E_TEMPORAL_HOST, E2E_SSH_KEY_PATH
  • POSTGRES_USER
  • GRAFANA_ADMIN_USER
  • CHAIN_CONFIGS_S3_BUCKET
  • OVH_CLOUD_PROJECT_SERVICE
  • TERRAFORM_S3_BUCKET
  • TERRAFORM_DYNAMODB_TABLE

Public (Safe to Commit)

These variables contain non-sensitive configuration:
  • All timeouts, intervals, and thresholds
  • Port numbers
  • Feature flags (API_AUTH_ENABLED, NATS_ENABLED, etc.)
  • Default values and limits
  • ENVIRONMENT, LOG_LEVEL
  • Public configuration like CHAIN_PROFILES_DIR
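One practical use of these classifications is masking values before they reach logs. The sketch below uses small illustrative subsets of the secret and sensitive lists above rather than the full lists, and the helper name redact_for_logs is hypothetical. Note that it treats unlisted names as public; a stricter variant would mask anything not explicitly known to be public.

```python
# Illustrative subsets of the classification lists, not the complete lists.
SECRET = {"DB_PASSWORD", "REDIS_PASSWORD", "HCLOUD_TOKEN", "AUTH_CLERK_SECRET_KEY"}
SENSITIVE = {"DB_HOST", "DB_USER", "DB_NAME", "REDIS_HOST", "SERVER_PUBLIC_URL"}


def redact_for_logs(env):
    """Return a copy of env with secret and sensitive values masked."""
    redacted = {}
    for name, value in env.items():
        if name in SECRET:
            redacted[name] = "[secret]"
        elif name in SENSITIVE:
            redacted[name] = "[sensitive]"
        else:
            # Assumption: names outside both sets are public and safe to print.
            redacted[name] = value
    return redacted
```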

Production Checklist

See: Deployment Checklist for the full pre-deployment checklist covering configuration, authentication, security, and infrastructure validation.

Variable Reference by File

Source Code References

internal/config/config.go
  • Authoritative source for all environment variable mappings
  • Implements koanf-based configuration with env var overrides
  • TerraformConfig: timeout fields (init, plan, apply, destroy, output)
  • CommandsConfig: progress_ttl field
internal/config/loader.go
  • Environment variable to config path mappings
  • TERRAFORM_*_TIMEOUT and COMMAND_PROGRESS_TTL mappings
internal/opsagent/config.go
  • Ops agent configuration loading with YAML support
  • LoadFromFile() for YAML configuration
  • LoadConfig() with OPS_AGENT_CONFIG support
  • All NODE_ID, GRPC_URL, CHAIN_* variables
cmd/api-server/main.go
  • ENVIRONMENT variable usage for logging
  • HTTP-only API server (gRPC moved to agent-gateway)
cmd/agent-gateway/main.go
  • gRPC server for ops-agent communication
  • Uses COMMAND_PROGRESS_TTL for Redis progress storage
  • GRPC_* variables for server configuration
cmd/orchestrator/main.go
  • ENVIRONMENT variable usage
  • HCLOUD_TOKEN direct usage (passed to Terraform)
  • TERRAFORM_*_TIMEOUT variables for provisioning operations
  • COMMAND_PROGRESS_TTL for progress reader
cmd/health-evaluator/main.go
  • ENVIRONMENT variable usage
  • HOODCLOUD_CONFIG file path loading
internal/logging/logger.go
  • LOG_LEVEL parsing and configuration
internal/terraform/runner.go
  • Timeout configuration from Config struct
  • Falls back to defaults.Terraform.* when not configured
internal/commandqueue/progress.go
  • ProgressStore with configurable TTL
  • Uses functional options pattern (WithProgressTTL)
  • Falls back to defaults.Commands.ProgressTTL
tests/e2e/helpers/environment.go (Lines 58-79)
  • All E2E_* test configuration variables

Configuration File References

docker-compose.dev.yml
  • Development environment defaults
  • All service environment configurations
tests/e2e/docker-compose.e2e.yml
  • E2E testing environment
  • Isolated ports and test-specific configuration
infrastructure/docker/docker-compose.yml
  • Production-like Docker Compose setup
.env.example
  • Template with all user-configurable variables
config/hoodcloud.yaml
  • YAML configuration structure
  • Default values for all settings

Troubleshooting

Missing Required Variables

If services fail to start with missing variable errors:
  1. Check that the .env file exists and contains the required variables
  2. Verify that environment variables are exported in your shell
  3. For Docker Compose, ensure .env is in the same directory as docker-compose.yml
  4. Check the service-specific required variables in the “Variables by Service” section
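The first two checks can be combined in a short script. This sketch parses only simple KEY=VALUE lines (no export prefixes, multi-line values, or variable interpolation), so it is not a full .env parser; the function names are hypothetical, and the required list passed in should come from the relevant service's "Required" list.

```python
import os


def parse_dotenv(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_required(required, dotenv_text, env=None):
    """Return required names found neither in the .env text nor in the environment."""
    env = os.environ if env is None else env
    available = {**parse_dotenv(dotenv_text), **env}
    return [name for name in required if not available.get(name)]
```

For the api-server, the required list would contain names such as SERVER_PUBLIC_URL, DB_HOST, and AWS_S3_BUCKET.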

Configuration Not Applied

If environment variables seem to be ignored:
  1. Environment variables take precedence over YAML values; a YAML value applies only when no matching environment variable is set
  2. Check HOODCLOUD_CONFIG is pointing to the correct file
  3. Verify variable names match exactly (case-sensitive)
  4. For Docker, check environment variables are passed through in docker-compose.yml
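When debugging which source a value came from, it can help to model the precedence explicitly. The sketch below is a simplification: it assumes the YAML settings have already been flattened to their environment variable names, whereas the real mapping between variable names and config paths lives in internal/config/loader.go. The function name explain_config is hypothetical.

```python
def explain_config(yaml_values, env):
    """Return {name: (value, source)}; an environment variable overrides YAML."""
    result = {}
    for name, yaml_value in yaml_values.items():
        # Lookup is case-sensitive, so "server_port" does not override SERVER_PORT.
        if name in env:
            result[name] = (env[name], "env")
        else:
            result[name] = (yaml_value, "yaml")
    return result
```

A mistyped or wrongly cased variable shows up here as a value whose source is still yaml.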

Connection Issues

If services can’t connect to each other:
  1. Verify hostnames match service names in Docker Compose
  2. Check port numbers match between services
  3. Ensure URLs include protocol (http://, nats://, etc.)
  4. For external VMs, ensure *_PUBLIC_URL variables are set to externally-accessible addresses
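Checks 3 and 4 lend themselves to a quick automated test. The sketch below uses Python's standard urllib.parse to flag URLs without a protocol and *_PUBLIC_URL values that point at the local machine; the function name url_problem and the message wording are hypothetical.

```python
from urllib.parse import urlsplit

LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def url_problem(name, value):
    """Return a description of the problem with a URL variable, or None if it looks fine."""
    parts = urlsplit(value)
    # Without "scheme://", urlsplit leaves the network location empty.
    if not parts.scheme or not parts.netloc:
        return f"{name}={value!r} is missing a protocol such as http:// or nats://"
    if name.endswith("_PUBLIC_URL") and parts.hostname in LOCAL_HOSTS:
        return f"{name}={value!r} is not reachable from external VMs"
    return None
```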

Migration Guide

From Environment-Only to YAML + Environment

  1. Create config/hoodcloud.yaml with your base configuration
  2. Set environment variables only for values that differ from YAML or are secrets
  3. Set HOODCLOUD_CONFIG=config/hoodcloud.yaml
  4. Restart services

From Development to Production

  1. Copy .env.example to .env.production
  2. Replace all placeholder values with production values
  3. Move all secrets to HashiCorp Vault
  4. Configure Vault AppRole authentication
  5. Enable security features (TLS, auth, rate limiting)
  6. Set appropriate timeouts and intervals for production scale