Secrets

dagstack/config does not store secrets — it transports them correctly and masks them in diagnostics. Storage of the secrets themselves remains the responsibility of external tools (env variables, HashiCorp Vault, Kubernetes Secret, cloud secret managers).

The secrets surface has two independent layers:

  1. Auto-masking by field name (Phase 1, ADR-0001) — the loader recognises secret-shaped field names and replaces their values with [MASKED] in diagnostics. Works on every value, regardless of where it came from.
  2. ${secret:<scheme>:<path>} references with SecretSource adapters (Phase 2, ADR-0002) — pluggable backends (env, Vault, …) resolve secrets on demand. Resolution is lazy or eager per binding, cached for the Config lifetime, and refreshable.

Layer 1 is the foundation; layer 2 builds on top of it (resolved SecretRef values still pass through the same field-name mask in diagnostic output).

Layer 1: auto-masking by field name

When the configuration is rendered into logs / diagnostics / error messages, secret fields are automatically replaced with [MASKED]. The pattern list is fixed normatively in config-spec/_meta/secret_patterns.yaml:

| Exact match | Suffix |
| --- | --- |
| api_key | *_secret |
| secret_key | *_token |
| access_token | *_password |
| password | *_key |
| client_secret | |

Examples of fields that get masked:

  • database.password (exact)
  • cache.auth_token (suffix _token)
  • auth.jwt_secret (suffix _secret)
  • payment.stripe_api_key (suffix _key)
  • webhook.signing_password (suffix _password)
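
The matching rule can be sketched in a few lines. This is a hypothetical re-implementation for illustration, not the library's code; it assumes matching is case-sensitive and applies to the last dotted segment of the path. In real code, prefer the binding's own is_secret_field.

```python
# Hypothetical sketch of the 9 standard patterns (5 exact names, 4 suffixes).
EXACT_NAMES = frozenset(
    {"api_key", "secret_key", "access_token", "password", "client_secret"}
)
SUFFIXES = ("_secret", "_token", "_password", "_key")


def looks_secret(path: str) -> bool:
    """Return True when the last dotted segment matches a standard pattern."""
    name = path.rsplit(".", 1)[-1]
    return name in EXACT_NAMES or name.endswith(SUFFIXES)


for path in ("database.password", "cache.auth_token", "database.host"):
    print(path, looks_secret(path))
```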

Not masked:

  • database.host — does not match any pattern.
  • database.url — URLs may contain credentials (postgresql://user:pass@host), but the pattern is not designed for that; pass password as a separate field.
  • A custom field internal_key_id: contains _key, BUT matching is by suffix _key, so it matches. If this is not a secret (just an id), rename it: internal_id / key_name.

Diagnostics are masked

Masking is applied automatically inside the ConfigError.details message: when the loader or a typed getter raises an error on a value in a secret-named field, the exception text shows [MASKED] instead of the raw value. You do not have to opt in.

ConfigError(
    reason=type_mismatch,
    path=database.password,
    details=expected string, got int with value [MASKED]
)

For your own diagnostic output the bindings expose three primitives:

from dagstack.config.secrets_mask import (
    MASKED_PLACEHOLDER,
    is_secret_field,
    mask_value,
)

is_secret_field("password")            # True
is_secret_field("host")                # False
mask_value("api_key", "sk-live-...")   # "[MASKED]"
mask_value("host", "localhost")        # "localhost"
print(MASKED_PLACEHOLDER)              # "[MASKED]"

Use them when you build a custom dump of the configuration tree, when you log specific fields, or when you want a stable check for whether a field name is a secret.
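
A custom dump of a nested configuration tree might look like the sketch below. The helper mask_tree is hypothetical; it takes the masking function as a parameter so that the real mask_value (with its (name, value) call shape shown above) can be passed in. The demo_mask stand-in only exists so the sketch runs without the package installed.

```python
from typing import Any, Callable


def mask_tree(node: Any, mask: Callable[[str, Any], Any], name: str = "") -> Any:
    """Return a copy of a nested dict/list tree with every leaf passed through mask."""
    if isinstance(node, dict):
        return {key: mask_tree(value, mask, key) for key, value in node.items()}
    if isinstance(node, list):
        # List items inherit the name of the field that holds the list.
        return [mask_tree(item, mask, name) for item in node]
    return mask(name, node)


def demo_mask(name: str, value: Any) -> Any:
    # Stand-in for mask_value; covers only two patterns for the demo.
    return "[MASKED]" if name == "password" or name.endswith("_token") else value


tree = {
    "database": {"host": "localhost", "password": "pw"},
    "cache": {"auth_token": "t"},
}
print(mask_tree(tree, demo_mask))
```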

The 9 standard patterns already cover roughly 95% of cases; rename project-specific secret fields to match a standard suffix (*_secret, *_token, *_password, *_key) when you can — that keeps masking automatic.

Layer 2: ${secret:<scheme>:<path>} references

A ${secret:...} token in a YAML file is a typed secret reference. The loader does not resolve it during file reading — it emits a SecretRef placeholder at the corresponding leaf. Resolution happens on first read (lazy) or at load_from time (eager), via a SecretSource adapter registered for the token's scheme.

app-config.yaml
llm:
  api_key: "${secret:env:OPENAI_API_KEY}"
database:
  password: "${secret:vault:secret/dagstack/prod/db#password}"
external_api:
  fallback: "${secret:env:EXTERNAL_TOKEN:-dev-placeholder}"

${OPENAI_API_KEY} from Phase 1 keeps working — it is semantically identical to ${secret:env:OPENAI_API_KEY}. The Phase 2 syntax is strictly additive; the migration is a mechanical sed.

Reference grammar

${secret:<scheme>:<path>[?<query>][#<field>][:-<default>]}
  • <scheme>: lowercase ASCII identifier matching [a-z][a-z0-9_]*. Maps to a registered SecretSource. Phase 2 ships env (mandatory) and vault (optional opt-in). Schemes are an operator-extensible space: register two Vault clusters as vault and vault_dr if you need them.
  • <path> — adapter-specific. For env it is an env-var name; for Vault it is the KV v2 path (secret/dagstack/prod/openai).
  • ?<query> — backend-specific options as key=value pairs joined by &. The only Phase 2 normative key is version (Vault only — ?version=3 pins a specific KV v2 version). Unknown query keys raise secret_unresolved; the loader does not silently drop them.
  • #<field> — sub-key projection for JSON-typed secrets (secret/.../db#password). The cache key strips the projection, so #password and #username of the same path share one backend round-trip.
  • :-<default> — literal fallback when the reference does not resolve. The default is a string and is not itself interpolated.
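
To make the grammar concrete, here is a minimal parsing sketch. It is not the loader's parser: parse_ref and ParsedRef are hypothetical names, and the sketch deliberately ignores the doubled-separator escapes, so it only handles references whose path contains no literal ?, # or :-.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, Optional
from urllib.parse import unquote

_REF = re.compile(r"^\$\{secret:(?P<scheme>[a-z][a-z0-9_]*):(?P<body>.+)\}$")


@dataclass
class ParsedRef:
    scheme: str
    path: str
    query: Dict[str, str] = field(default_factory=dict)
    field_name: Optional[str] = None
    default: Optional[str] = None


def parse_ref(token: str) -> ParsedRef:
    match = _REF.match(token)
    if match is None:
        raise ValueError(f"not a secret reference: {token!r}")
    # Peel the optional parts off from the right: default, then field, then query.
    body, has_default, default = match["body"].partition(":-")
    body, has_field, field_name = body.partition("#")
    path, _, raw_query = body.partition("?")
    query = {}
    for pair in filter(None, raw_query.split("&")):
        key, _, value = pair.partition("=")
        query[key] = unquote(value)  # query values are percent-encoded
    return ParsedRef(
        scheme=match["scheme"],
        path=path,
        query=query,
        field_name=field_name if has_field else None,
        default=default if has_default else None,
    )


print(parse_ref("${secret:vault:secret/dagstack/prod/db#password}"))
```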

Escape rules

Inside <path>, the structural separators ?, # and the :-default marker are escaped by doubling:

| Want literal | Write |
| --- | --- |
| ? in a path segment | ?? |
| # in a path segment | ## |
| :- in a path segment | ::- |

Inside ?<query> values, the characters &, =, } and # MUST be percent-encoded per RFC 3986. This aligns with HTTP query-string conventions, so the standard library helpers (urllib.parse.quote / encodeURIComponent / url.QueryEscape) produce correct encodings.

# Path that legitimately contains '#' (after de-escaping → secret/v1#tag/value).
exotic: "${secret:vault:secret/v1##tag/value}"
# Query value with literal '&'.
multi: "${secret:vault:secret/keys?label=team%26ops}"
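
When references are generated by a script rather than typed by hand, both escape rules can be applied programmatically. The helper names below are hypothetical; only urllib.parse.quote is a real standard-library call.

```python
from urllib.parse import quote


def escape_path(path: str) -> str:
    """Double the structural separators so they are read literally inside <path>."""
    return path.replace(":-", "::-").replace("?", "??").replace("#", "##")


def encode_query_value(value: str) -> str:
    # safe="" percent-encodes every reserved character, including & = } and #.
    return quote(value, safe="")


print(escape_path("secret/v1#tag/value"))   # secret/v1##tag/value
print(encode_query_value("team&ops"))       # team%26ops
```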

Scheme registry

The Phase 2 normative schemes are listed in _meta/secret_schemes.yaml:

| Scheme | Adapter | Phase | Kind | Notes |
| --- | --- | --- | --- | --- |
| env | EnvSecretSource | 2 | in-process | Mandatory; auto-registered. Backwards-compat with Phase 1 ${VAR}. |
| vault | VaultSource | 2 | remote | Optional opt-in extra. HashiCorp Vault KV v2. |
| awssm | AwsSecretsManagerSource | 3 | remote | Reserved. AWS Secrets Manager (Phase 3 candidate). |
| gcpsm | GcpSecretManagerSource | 3 | remote | Reserved. GCP Secret Manager (Phase 3 candidate). |
| k8ssecret | K8sSecretSource | 3 | in-cluster | Reserved. Kubernetes Secret reader (Phase 3 candidate). |

Phase 3 entries are listed in the registry so the names are fixed across bindings, but the adapters do not ship yet. See ADR-0003 candidate for the Phase 3 design discussion.

Resolution timing — lazy vs eager

The SecretRef placeholder lives in the merged tree after the file sources have loaded. When the reference resolves to a real value depends on the binding:

| Binding | Default mode | Opt-in |
| --- | --- | --- |
| Python | Lazy: first get* on the path | load_from(..., eager_secrets=True) resolves all refs at load time |
| TypeScript | Eager: Config.loadFrom(...) resolves every ref before returning | No lazy mode in Phase 2 |
| Go | Eager: LoadFrom(ctx, ...) resolves every ref before returning | No lazy mode in Phase 2 |

Lazy mode minimises startup time and avoids touching the backend if the application reads only a subset of the tree. The trade-off is that a misconfigured reference surfaces on the first request, not at startup. For long-lived servers, eager mode is recommended — the TS and Go bindings make this the default, and the Python binding exposes eager_secrets=True for the same effect.
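
The difference between the two modes can be illustrated with a toy loader. ToyConfig and backend are hypothetical stand-ins, not the binding's classes; the point is only when the backend gets called.

```python
class ToyConfig:
    """Toy loader: resolve is a hypothetical callable mapping a reference to its value."""

    def __init__(self, refs, resolve, eager=False):
        self._refs = dict(refs)   # config path -> backend reference
        self._resolve = resolve
        self._values = {}         # resolved values, filled on demand
        if eager:
            for path in self._refs:
                self.get(path)    # a bad reference raises here, at load time

    def get(self, path):
        if path not in self._values:
            self._values[path] = self._resolve(self._refs[path])
        return self._values[path]


calls = []


def backend(ref):
    calls.append(ref)             # record every backend round-trip
    return f"value-of-{ref}"


refs = {"llm.api_key": "OPENAI_API_KEY", "database.password": "DB_PASSWORD"}

lazy = ToyConfig(refs, backend)
print(calls)                      # [] (nothing resolved at load time)
lazy.get("llm.api_key")
print(calls)                      # ['OPENAI_API_KEY']
```

With eager=True the same constructor call would touch both references immediately, so a misspelled key fails at startup instead of on the first request.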

A SecretRef survives Config.load_from(...) in lazy mode:

from dagstack.config import (
    Config,
    EnvSecretSource,
    SecretRef,
    YamlFileSource,
)

cfg = Config.load_from([
    YamlFileSource("app-config.yaml"),
    EnvSecretSource(),
])
# ${secret:env:OPENAI_API_KEY} in the YAML → SecretRef in the tree.
# snapshot() with the default (include_secrets=False) masks it.
masked = cfg.snapshot()
assert masked["llm"]["api_key"] == "[MASKED]"

# First read triggers EnvSecretSource.resolve() and caches it.
api_key = cfg.get_string("llm.api_key")  # the resolved string

Snapshot behaviour and audit mode

Config.snapshot() is the diagnostic-dump entry point. Its default behaviour matches the resolution-timing trigger table in ADR-0002 §3:

  • Every SecretRef placeholder is replaced with [MASKED] — the reference itself is never resolved by snapshot(). No backend round-trip happens.
  • Field-name suffix masking (Phase 1) is applied on top, so plain string values whose key matches a secret pattern are also masked.

For audit-mode dumps where the operator needs the resolved values (for example, to verify a Vault read returned the expected payload), opt in with the per-binding flag:

audit = cfg.snapshot(include_secrets=True)
# SecretRef placeholders ARE resolved, then field-name suffix
# masking still runs over the result — so `api_key`-named fields
# remain "[MASKED]" but custom-named fields show real values.

Treat the returned object as sensitive — it contains real secret material under fields whose name does not match a Phase 1 pattern.
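
The two snapshot modes can be modelled with a toy function. Everything here (Ref, name_is_secret, toy_snapshot) is a hypothetical stand-in with a reduced pattern list; it only illustrates why a custom-named field shows its real value in audit mode.

```python
MASKED = "[MASKED]"


class Ref:
    """Stand-in for an unresolved secret reference."""

    def __init__(self, target):
        self.target = target


def name_is_secret(name):
    # Reduced to two patterns for the demo.
    return name == "password" or name.endswith("_key")


def toy_snapshot(tree, resolve, include_secrets=False):
    out = {}
    for name, value in tree.items():
        if isinstance(value, dict):
            out[name] = toy_snapshot(value, resolve, include_secrets)
            continue
        if isinstance(value, Ref):
            # Default mode never calls the backend.
            value = resolve(value.target) if include_secrets else MASKED
        # Field-name masking runs on top in both modes.
        out[name] = MASKED if name_is_secret(name) else value
    return out


tree = {
    "llm": {"api_key": Ref("OPENAI_API_KEY"), "endpoint": Ref("LLM_ENDPOINT")},
    "database": {"host": "localhost"},
}
print(toy_snapshot(tree, lambda target: "real-" + target, include_secrets=True))
```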

Caching and TTL

The loader caches a resolved SecretValue for the lifetime of the Config object, keyed by <scheme>:<full-ref-path> (including any ?query and #field). Repeat reads of the same reference do not re-hit the backend.

If the adapter returns a SecretValue with a populated expires_at (Python) / expiresAt (TypeScript) / ExpiresAt (Go) — for example Vault dynamic credentials or AWS-SM rotation hints — the cache treats the entry as a miss after that timestamp and re-resolves on the next read.

Forced refresh — drop the cache and re-resolve on next access:

# Rotate the OPENAI_API_KEY in your secret store, then:
cfg.refresh_secrets()
# Next cfg.get_string("llm.api_key") re-resolves through
# EnvSecretSource / VaultSource — picks up the new value.
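
The caching behaviour described in this section can be sketched as follows. SecretCache and CachedSecret are hypothetical; the sketch assumes expires_at is expressed in epoch seconds (the real field type is not specified here) and takes an injectable clock so that expiry is easy to test.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class CachedSecret:
    value: str
    expires_at: Optional[float] = None  # epoch seconds; None means no expiry


class SecretCache:
    def __init__(self, resolve: Callable[[str], CachedSecret],
                 clock: Callable[[], float] = time.time):
        self._resolve = resolve
        self._clock = clock
        self._entries: Dict[str, CachedSecret] = {}

    def get(self, ref: str) -> str:
        entry = self._entries.get(ref)
        expired = (
            entry is not None
            and entry.expires_at is not None
            and self._clock() >= entry.expires_at
        )
        if entry is None or expired:
            # Miss or expired entry: go back to the backend.
            entry = self._entries[ref] = self._resolve(ref)
        return entry.value

    def refresh(self) -> None:
        # Forced refresh: drop everything, re-resolve on next access.
        self._entries.clear()
```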

Push-based rotation (Vault lease watcher, AWS-SM EventBridge, GCP Pub/Sub) is deferred to Phase 3. Phase 2 ships only the operator-driven refresh_secrets hook above. See ADR-0003 candidate.

Error taxonomy

ADR-0002 §5 adds three new ConfigErrorReason values, distinct so operators can dispatch on them programmatically:

| Reason | When | Operator action |
| --- | --- | --- |
| secret_unresolved | Reference cannot be resolved: no source registered for the scheme, key missing in the backend, ?version=N is destroyed, or #field points at an absent sub-key. | Check the YAML reference and the backend key spelling. |
| secret_backend_unavailable | Backend is unreachable: network failure, DNS, Vault sealed, auth-method credentials rejected at connect. | Check connectivity, credentials and backend health. |
| secret_permission_denied | Backend rejected the read with an authorisation error (Vault 403, AWS-SM AccessDeniedException, K8s RBAC denial). | Check backend policy (Vault policy / AWS IAM / K8s RBAC) for read permission on this key. |

The error carries source_id of the SecretSource (for example vault:https://vault.example.com), distinct from a ConfigSource.id. The details string also references the YAML location the bad token came from, so the operator does not have to grep:

ConfigError(
    path = "llm.api_key",
    reason = secret_unresolved,
    details = "vault:secret/dagstack/prod/openai → 404 Not Found "
              "(referenced from yaml:app-config.yaml)",
    source_id = "vault:https://vault.example.com",
)
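
Because the three reasons are distinct, calling code can treat them differently. The sketch below retries only on secret_backend_unavailable, since the other two reasons point at configuration or policy problems that a retry will not fix. ToyConfigError is a hypothetical stand-in with reasons modelled as plain strings; the real ConfigError and ConfigErrorReason types may look different.

```python
import time


class ToyConfigError(Exception):
    """Stand-in for ConfigError; reason is a plain string in this sketch."""

    def __init__(self, reason, details=""):
        super().__init__(f"{reason}: {details}")
        self.reason = reason


RETRYABLE = {"secret_backend_unavailable"}


def read_with_retry(read, attempts=3, delay=0.0):
    """Call read(); retry only when the failure is a transient backend outage."""
    for attempt in range(1, attempts + 1):
        try:
            return read()
        except ToyConfigError as err:
            if err.reason not in RETRYABLE or attempt == attempts:
                raise
            time.sleep(delay)
```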

Loader bootstrap — registering sources

Config.load_from / Config.loadFrom / LoadFrom accept a heterogeneous list of ConfigSource and SecretSource instances. The loader dispatches by interface; ConfigSource order defines merge priority, SecretSource order does not (each scheme has at most one registered source). An EnvSecretSource is auto-registered if you don't pass one.

import os

from dagstack.config import (
    Config,
    EnvSecretSource,
    YamlFileSource,
)
from dagstack.config.vault import AppRoleAuth, VaultSource

cfg = Config.load_from(
    [
        YamlFileSource("app-config.yaml"),
        YamlFileSource("app-config.production.yaml"),
        VaultSource(
            addr="https://vault.example.com",
            auth=AppRoleAuth(
                role_id=os.environ["VAULT_ROLE_ID"],
                secret_id=os.environ["VAULT_SECRET_ID"],
            ),
            namespace="dagstack/prod",
        ),
        # EnvSecretSource auto-registered — pass one only when you
        # want a custom getenv (tests inject a dict-backed lookup).
    ],
    eager_secrets=True,  # recommended for long-lived servers
)

Misconfiguration is caught fast:

  • Two SecretSource instances with the same scheme raise ConfigError(reason=validation_failed) at load_from construction time — the issue is in your bootstrap code, not in backend health.
  • A ${secret:<scheme>:...} token whose <scheme> has no registered source AND no :-default fallback raises ConfigError(reason=secret_unresolved) at load_from time, not at first read. The intent is to surface configuration errors at startup.
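
Both startup checks amount to simple bookkeeping over the registered schemes. The sketch below is a toy model with hypothetical names (ToySource, build_registry, unknown_schemes); it raises ValueError where the real loader raises ConfigError.

```python
class ToySource:
    """Stand-in for a SecretSource; only the scheme matters here."""

    def __init__(self, scheme):
        self.scheme = scheme


def build_registry(sources):
    registry = {}
    for source in sources:
        if source.scheme in registry:
            # Mirrors the validation_failed case: a bootstrap bug, not a backend issue.
            raise ValueError(f"duplicate secret source for scheme {source.scheme!r}")
        registry[source.scheme] = source
    return registry


def unknown_schemes(refs, registry):
    """refs: (scheme, has_default) pairs collected from the merged tree."""
    return sorted(
        {scheme for scheme, has_default in refs
         if scheme not in registry and not has_default}
    )


registry = build_registry([ToySource("env"), ToySource("vault")])
print(unknown_schemes([("env", False), ("awssm", False), ("gcpsm", True)], registry))
```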

Migration from Phase 1

ADR-0002 is strictly additive — Phase 1 ${VAR} syntax keeps working. Three migration steps in order of operator effort:

Step 0: no change. If you don't deploy Vault, do nothing. ${OPENAI_API_KEY} is identical to ${secret:env:OPENAI_API_KEY}; EnvSecretSource resolves both.

Step 1 — opt into the secret namespace, still using env. A mechanical rename readies fields for a backend swap later:

# Before
llm:
  api_key: "${OPENAI_API_KEY}"

# After
llm:
  api_key: "${secret:env:OPENAI_API_KEY}"
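
The rename is mechanical enough to script. The sketch below is one possible rewrite, not an official migration tool: to_secret_env is a hypothetical helper, it assumes env-var names are identifier-like, and it assumes a bare ${VAR:-default} form should carry its default over unchanged. Review the diff before committing.

```python
import re

# Matches ${NAME} and ${NAME:-default}; never matches ${secret:...} because
# "secret" is followed by ":" rather than "}" or ":-".
_BARE = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)((?::-[^}]*)?)\}")


def to_secret_env(text: str) -> str:
    """Rewrite bare ${VAR} references to ${secret:env:VAR}."""
    return _BARE.sub(
        lambda m: "${secret:env:" + m.group(1) + m.group(2) + "}", text
    )


print(to_secret_env('api_key: "${OPENAI_API_KEY}"'))
```

Running the function on already-migrated text leaves it unchanged, so it is safe to apply repeatedly.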

Step 2 — point at Vault (or other backend). Stage the secret in Vault, register VaultSource in your loader bootstrap, remove the env-var from the process environment:

llm:
  api_key: "${secret:vault:secret/dagstack/prod/openai#api_key}"

The YAML wire format is forward-compatible: future bindings may emit a deprecation warning under a strict-mode lint for the bare ${VAR} form, but the syntax remains supported.

Where to store secrets locally

For developer-only overrides that should never be committed, use app-config.local.yaml (always in .gitignore):

app-config.local.yaml
database:
  password: "local-dev-pw"  # ok: this file is gitignored

The standard .gitignore template for dagstack projects contains:

# dagstack config — local developer overrides, never committed.
app-config.local.yaml

An alternative is a .env file referenced through ${secret:env:VAR}:

.env (gitignored)
DB_PASSWORD=local-dev-pw

app-config.yaml
database:
  password: "${secret:env:DB_PASSWORD}"

Production deployment patterns

In production, secrets typically come from a centralised secret manager. Three common shapes:

1. Process env from a Kubernetes Secret — works without any backend integration:

kubernetes/deployment.yaml
env:
  - name: OPENAI_API_KEY
    valueFrom:
      secretKeyRef:
        name: llm-credentials
        key: api_key

The YAML stays ${secret:env:OPENAI_API_KEY}; the source of the env var (k8s Secret, .env, External Secrets Operator) is transparent.

2. Direct Vault read with VaultSource — no env-var intermediary:

app-config.yaml
llm:
  api_key: "${secret:vault:secret/dagstack/prod/openai#api_key}"

The application bootstrap authenticates to Vault (token / AppRole / ServiceAccount) once, then the loader fetches secrets directly on demand. Rotation requires only a Vault update plus refresh_secrets — no container restart.

3. HashiCorp Vault through envconsul / agent (legacy pattern, no application-side integration needed):

# envconsul launches the process with env vars sourced from Vault.
envconsul -config=vault.hcl -- python main.py

dagstack/config sees ${secret:env:OPENAI_API_KEY} — it does not need to know envconsul is in the chain.

See Secret sources for the operator guide to configuring VaultSource in each binding (install instructions, auth methods, token-renewal boundaries).

What to do if a secret leaks

If you discover that a secret reached git (for example, app-config.yaml with a plaintext password or an API key):

  1. Revoke the secret immediately in the relevant service.
  2. Issue a new secret and update the env / Vault / cloud-secret-manager entry.
  3. Rewrite the YAML to use ${secret:env:VAR} or ${secret:vault:...} so the literal value no longer lives in any file in the repository.
  4. Rewrite the git history (git filter-branch / BFG Repo-Cleaner) if the repository is not yet public.
  5. If the repository was public, treat the secret as compromised regardless of history rewrites.
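
To catch the problem before it reaches git, a small pre-commit style check can flag secret-named fields that hold a literal instead of a reference. This is a sketch under assumptions: find_plaintext_secrets is a hypothetical helper that works on an already-parsed config dict and takes the name check as a parameter, so the binding's is_secret_field can be passed in. The demo uses a reduced stand-in.

```python
def find_plaintext_secrets(tree, is_secret_name, prefix=""):
    """Return dotted paths of secret-named string fields that are not ${...} references."""
    findings = []
    for name, value in tree.items():
        path = f"{prefix}.{name}" if prefix else name
        if isinstance(value, dict):
            findings.extend(find_plaintext_secrets(value, is_secret_name, path))
        elif (
            isinstance(value, str)
            and is_secret_name(name)
            and not value.startswith("${")
        ):
            findings.append(path)
    return findings


def demo_is_secret(name):
    # Stand-in for is_secret_field; two patterns only.
    return name == "password" or name.endswith("_key")


config = {
    "database": {"host": "db.internal", "password": "hunter2"},
    "llm": {"api_key": "${secret:env:OPENAI_API_KEY}"},
}
print(find_plaintext_secrets(config, demo_is_secret))  # ['database.password']
```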

Adding your own secret patterns

The pattern list is fixed in v0.1: _meta/secret_patterns.yaml is the normative source and the bindings do not yet expose a way to extend it at load time. If you need to mask a custom-named field, do it in your own diagnostic dump, on top of the binding's masking primitives (mask_value / maskValue / MaskValue and the related helpers). In Python, is_secret_field and MASKED_PLACEHOLDER are enough:

from dagstack.config.secrets_mask import (
    MASKED_PLACEHOLDER,
    is_secret_field,
)

def custom_masked_string(name: str, value: object) -> object:
    if name == "connection_string" or is_secret_field(name):
        return MASKED_PLACEHOLDER
    return value

Rename project-specific secret fields to match a standard suffix when you can — that keeps masking automatic in every diagnostic path.

See also