ADR-0002 · Secret references and SecretSource adapters
Status: accepted v1.1 (2026-05-03; v1.0 — 2026-05-03) · Full normative text
Amends: ADR-0001 §6 (Secrets handling — Phase 2 placeholder), §8 (Source adapters).
Why a second ADR
ADR-0001 fixed Phase 1 secrets handling as ${ENV_VAR} interpolation
plus value masking driven by _meta/secret_patterns.yaml. The
env-only model is sufficient for single-process deployments where
the operator is willing to pre-stage every credential into the
process environment, but operator feedback flagged three real-world
gaps:
- Centralised secret lifecycle: three services sharing one
  OPENAI_API_KEY need three env-injection points; rotating the key
  means rolling all three.
- Audit and access control: env vars are visible to sibling
  processes under the same UID, leak into core dumps and
  /proc/<pid>/environ, and carry no audit trail.
- Per-environment scoping without per-environment YAML: today
  the operator either commits app-config.staging.yaml with
  ${OPENAI_API_KEY_STAGING} per environment, or maintains parallel
  secret stores with identical key names.
ADR-0002 closes those gaps with two parallel interfaces under a single
Config.loadFrom([...]) call: ConfigSource keeps returning whole
trees; SecretSource resolves single keys lazily. The two are
first-class peers; the loader dispatches by interface.
Six key decisions
1. Reference syntax — ${secret:<scheme>[:<path>]}
A new normative interpolation token, parallel to ADR-0001 §2 ${VAR}:
${secret:<scheme>:<path>} → resolved value of <scheme>+<path>
${secret:<scheme>:<path>:-<default>} → literal fallback if reference does not resolve
Examples:
llm:
  api_key: "${secret:env:OPENAI_API_KEY}" # passthrough; env is the default scheme
  fallback: "${secret:env:OPENAI_API_KEY:-sk-dev-placeholder}"
database:
  password: "${secret:vault:secret/dagstack/prod/db#password}"
external_api:
  token: "${secret:awssm:arn:aws:secretsmanager:eu-west-1:...:secret/openai-key}"
  regional: "${secret:gcpsm:projects/foo/secrets/openai/versions/latest}"
Grammar (normative). From
_meta/secret_ref_grammar.yaml:
secret_ref      := "${" "secret" ":" scheme ":" path_with_query [field_proj] [":-" default] "}"
scheme          := [a-z][a-z0-9_]*
path_with_query := path ["?" query]
path            := <any chars except "}", "#"; literal "?" is "??", literal "#" is "##", ":-" is "::-">
query           := query_kv ("&" query_kv)*
query_kv        := query_key "=" query_value
query_key       := [a-z][a-z0-9_]*
query_value     := <percent-encoded per RFC 3986; literal "&", "=", "}", "#" MUST be %-encoded>
field_proj      := "#" field
field           := <any chars except "}">
default         := <any chars except "}">
The ?key=value query block is reserved for backend-specific
options. Phase 2 normative key: version (Vault only — ?version=3
selects a specific KV v2 version). Bindings MUST reject unknown
keys with secret_unresolved. Token order is fixed:
path → ?query → #field → :-default → }.
Escape rules. Literal #, ? and :- inside a path segment
are escaped by doubling (##, ??, ::-). Literal &, =, },
# inside a query value MUST be percent-encoded per RFC 3986.
Standard-library URL helpers (urllib.parse.quote /
encodeURIComponent / url.QueryEscape) produce correct
encodings.
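The grammar and escape rules above can be sketched as a small hand-written scanner. This is a minimal illustration, not the reference parser: the names `ParsedRef` and `parse_secret_ref` are hypothetical, and the sketch assumes the first unescaped `?`, `#` or `:-` ends the path, in the fixed token order stated above.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, Optional
from urllib.parse import unquote

PREFIX = "${secret:"
SCHEME_RE = re.compile(r"[a-z][a-z0-9_]*")


@dataclass
class ParsedRef:  # hypothetical container, not part of the spec
    scheme: str
    path: str
    query: Dict[str, str] = field(default_factory=dict)
    field_name: Optional[str] = None
    default: Optional[str] = None


def parse_secret_ref(token: str) -> ParsedRef:
    if not (token.startswith(PREFIX) and token.endswith("}")):
        raise ValueError("not a ${secret:...} token")
    scheme, sep, rest = token[len(PREFIX):-1].partition(":")
    if not sep or not SCHEME_RE.fullmatch(scheme):
        raise ValueError("missing or malformed scheme")

    # Scan the path, undoing the doubled escapes (??, ##, ::-).
    path, i = [], 0
    while i < len(rest):
        if rest.startswith("::-", i):
            path.append(":-")
            i += 3
        elif rest.startswith("??", i) or rest.startswith("##", i):
            path.append(rest[i])
            i += 2
        elif rest[i] in "?#" or rest.startswith(":-", i):
            break  # first unescaped delimiter ends the path
        else:
            path.append(rest[i])
            i += 1

    # Remaining tokens in fixed order: ?query, #field, :-default.
    tail, default, field_name = rest[i:], None, None
    if ":-" in tail:
        tail, _, default = tail.partition(":-")
    if "#" in tail:
        tail, _, field_name = tail.partition("#")
    query = {}
    if tail.startswith("?"):
        for pair in tail[1:].split("&"):
            key, _, value = pair.partition("=")
            query[key] = unquote(value)  # query values are percent-encoded
    return ParsedRef(scheme, "".join(path), query, field_name, default)
```

A path containing plain colons, such as an AWS ARN, passes through untouched because only the two-character sequence `:-` is treated as the default delimiter.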
1.1 The env scheme — backwards compatibility
${secret:env:OPENAI_API_KEY} is semantically identical to
${OPENAI_API_KEY}. The env scheme is a degenerate case of secret
resolution implemented by an EnvSecretSource that the loader
auto-registers if no explicit one is passed. Migration from Phase 1
is a mechanical sed; no behavioural change.
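As an illustration of how mechanical that rewrite is, the sketch below does the same substitution in Python. The variable-name pattern is an assumption (ADR-0001 §2 defines the real one), and `to_secret_env` is a hypothetical helper name.

```python
import re

# Matches ${VAR} and ${VAR:-default}. Existing ${secret:...} tokens do not
# match, because "secret" is followed by ":<scheme>" rather than "}" or ":-".
# Assumption: variable names follow the usual [A-Za-z_][A-Za-z0-9_]* shape.
_ENV_TOKEN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(:-[^}]*)?\}")


def to_secret_env(text: str) -> str:
    """Rewrite every ${VAR} token in text to ${secret:env:VAR}."""
    def repl(match: "re.Match[str]") -> str:
        name, default = match.group(1), match.group(2) or ""
        return "${secret:env:" + name + default + "}"
    return _ENV_TOKEN.sub(repl, text)
```

The helper rewrites every `${VAR}` it sees, including non-secret values such as a host name, so an operator who only wants secrets in the new namespace would apply it to selected keys.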
1.2 Sub-key projection via #field
Most secret managers store a "secret" as a JSON object with multiple
fields (Vault KV v2, AWS-SM JSON-typed secrets). The #field syntax
projects one sub-key from a multi-key envelope:
${secret:vault:secret/dagstack/prod/db#password}
${secret:vault:secret/dagstack/prod/db#username}
Both references hit the same Vault read; the loader caches by
<scheme>:<path-up-to-#> so one round-trip serves both. If #field
is omitted and the resolved value is an object, the binding raises
secret_unresolved rather than auto-stringifying.
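The projection and shared-read behaviour can be sketched as follows. `EnvelopeCache`, its `fetch` callable and the `SecretUnresolved` exception are hypothetical stand-ins; the sketch assumes a backend read returns either a plain string or a dict of sub-keys.

```python
from typing import Callable, Dict, Optional, Union

Envelope = Union[str, Dict[str, str]]


class SecretUnresolved(Exception):
    """Stand-in for ConfigError(reason=secret_unresolved)."""


class EnvelopeCache:
    def __init__(self, fetch: Callable[[str, str], Envelope]) -> None:
        self._fetch = fetch  # one backend round-trip per call
        self._envelopes: Dict[str, Envelope] = {}

    def get(self, scheme: str, path: str, field: Optional[str] = None) -> str:
        key = f"{scheme}:{path}"  # path up to, but excluding, "#field"
        if key not in self._envelopes:
            self._envelopes[key] = self._fetch(scheme, path)
        envelope = self._envelopes[key]

        if isinstance(envelope, dict):
            if field is None:
                # No auto-stringifying of a multi-key envelope.
                raise SecretUnresolved(f"{key} is an object; add #field")
            if field not in envelope:
                raise SecretUnresolved(f"{key} has no field {field!r}")
            return str(envelope[field])
        if field is not None:
            raise SecretUnresolved(f"{key} is a scalar; #{field} cannot apply")
        return envelope
```

Reading `#password` and `#username` from the same path triggers a single call to `fetch`.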
2. The SecretSource contract — separate from ConfigSource
Pseudocode for the contract (each implementation realises it idiomatically):
SecretSource {
    scheme: string    # short scheme name (matches ${secret:<scheme>:...})
    id: string        # human-readable identifier (URI-style)
    resolve(path: string, ctx: ResolveContext): SecretValue    # binding picks sync/async idiom
    close?(): void    # release resources
    watch?(path: string, callback: (SecretChangeEvent) => void): Subscription    # Phase 3
}

SecretValue {
    value: string           # always string at the wire level
    version?: string        # opaque version id from the backend
    expires_at?: ISO-8601   # if the backend returns a TTL
    source_id: string       # echoed from SecretSource.id for diagnostics
}

ResolveContext {
    cancellation?: <binding-native cancellation handle>
    deadline?: ISO-8601 / native deadline
    attempt: int            # 1-based attempt counter
}
Why two interfaces rather than a marker capability on
ConfigSource:
- Type safety: ConfigSource.load() -> ConfigTree is total;
  SecretSource.resolve(path) -> SecretValue is keyed and partial.
- Watch semantics: config watch is a tree-level event; secret
  rotation is a key-level versioned event. Two signals, two shapes.
- Cache lifecycle: config sources cache for the process lifetime;
  secrets MAY cache with TTL or per-lease.
Sync vs async — per-binding choice (same rule as ADR-0001 §4):
- Go: Resolve(ctx, path) (SecretValue, error).
- TypeScript: resolve(path, ctx): Promise<SecretValue>.
- Python: def resolve(self, path, ctx), sync by default; a parallel
  AsyncSecretSource protocol with async def resolve_async is provided
  for non-blocking event loops.
SecretValue.value is always a string. Type coercion happens at
the Config.get* call site, exactly like for env-interpolated
values (ADR-0001 §4.4). The binding MUST NOT JSON-parse the value
into a sub-tree.
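One way the synchronous Python shape of this contract might look is sketched below, using `typing.Protocol`. The field layout follows the pseudocode above; the `env:process` identifier, the optional `environ` argument and the use of `LookupError` in place of `ConfigError(reason=secret_unresolved)` are illustrative assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional, Protocol, runtime_checkable


@dataclass(frozen=True)
class SecretValue:
    value: str                        # always a string at the wire level
    source_id: str                    # echoed from SecretSource.id
    version: Optional[str] = None
    expires_at: Optional[str] = None  # ISO-8601, if the backend returns a TTL


@dataclass
class ResolveContext:
    attempt: int = 1                  # 1-based attempt counter
    deadline: Optional[str] = None


@runtime_checkable
class SecretSource(Protocol):
    scheme: str
    id: str

    def resolve(self, path: str, ctx: ResolveContext) -> SecretValue: ...


class EnvSecretSource:
    """Degenerate source: the path is the environment variable name."""

    scheme = "env"
    id = "env:process"  # hypothetical identifier

    def __init__(self, environ: Optional[Mapping[str, str]] = None) -> None:
        self._environ = os.environ if environ is None else environ

    def resolve(self, path: str, ctx: ResolveContext) -> SecretValue:
        try:
            return SecretValue(value=self._environ[path], source_id=self.id)
        except KeyError:
            # A real binding would raise ConfigError(reason=secret_unresolved).
            raise LookupError(f"env:{path} is not set") from None
```

Because the value stays a string, a caller that wants an integer still goes through the typed getter on the config object rather than the source.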
3. SecretRef — opaque placeholder and resolution timing
A ${secret:...} token does not trigger a secret-manager
round-trip at Source.load() time. The file source emits a
SecretRef placeholder at the corresponding tree leaf:
SecretRef {
    scheme: string
    path: string            # full path including any #field projection
    default?: string        # the literal after ":-", if any
    origin_source: string   # ConfigSource.id where this token was found
}
The merged tree may contain SecretRef instances mixed with regular
scalars. Resolution happens at one of three read-time triggers;
snapshot(), listed last, is the exception and never resolves by default:
| Trigger | Behaviour |
|---|---|
| config.get(path) returns a SecretRef | The binding MUST resolve transparently and return the resolved string. |
| config.get_string / get_int etc. | Resolve transparently; apply primitive coercion per ADR-0001 §4.3. |
| config.get_section(path, schema) | Resolve every SecretRef inside the subsection, then run the schema validator. |
| config.snapshot() | Replace every SecretRef with [MASKED] per _meta/secret_patterns.yaml. The reference itself is never resolved by snapshot(). An audit-mode opt-in (include_secrets=True / { includeSecrets: true } / WithIncludeSecrets()) MAY resolve and mask by suffix-pattern only. |
Lazy by default with eager opt-in. Per-binding choice:
- Python: lazy by default; Config.load_from(..., eager_secrets=True)
  walks the merged tree at load time.
- TypeScript: eager by default (loadFrom is async; pays the cost up-front).
- Go: eager by default (same rationale as TypeScript).
Pilot consumer recommendation (long-lived servers): eager mode.
Surfacing secret_unresolved at startup is observably better than a
5xx on the first inbound request.
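The eager walk and the masking snapshot are both a traversal of the merged tree that swaps out SecretRef leaves. The sketch below shows that idea under simplifying assumptions: the tree is plain dicts and lists, `resolve_tree` and `snapshot` are hypothetical helper names, and masking of ordinary values via _meta/secret_patterns.yaml is left out.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class SecretRef:
    scheme: str
    path: str
    default: Optional[str] = None
    origin_source: str = ""


def resolve_tree(node: Any, resolve: Callable[[SecretRef], str]) -> Any:
    """Return a copy of the tree with every SecretRef leaf replaced."""
    if isinstance(node, SecretRef):
        return resolve(node)  # any resolution error surfaces here, at load time
    if isinstance(node, dict):
        return {key: resolve_tree(value, resolve) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_tree(value, resolve) for value in node]
    return node


def snapshot(node: Any) -> Any:
    """Mask every SecretRef without touching a backend."""
    return resolve_tree(node, lambda ref: "[MASKED]")
```

In eager mode the loader would call `resolve_tree` once after merging; in lazy mode the same resolution callable would run on individual `get` calls instead.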
Caching. A binding MUST cache resolved secrets in-process for
the lifetime of the Config object, keyed by <scheme>:<full path>.
The cache MUST honour expires_at from SecretValue (a value with
expires_at in the past is treated as a cache miss).
Forced refresh. config.refresh_secrets() /
config.refreshSecrets() / config.RefreshSecrets() drops the
cache and triggers re-resolution on next access — the manual
rotation hook for Phase 2. Push-based rotation is deferred to
Phase 3.
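The caching and forced-refresh rules can be illustrated with a small in-process cache. This is a sketch under assumptions: `SecretCache` is a hypothetical name, the `resolve` callable returns a `(value, expires_at)` pair, and `expires_at` has already been parsed from ISO-8601 into a timezone-aware `datetime`.

```python
from datetime import datetime, timezone
from typing import Callable, Dict, Optional, Tuple

Entry = Tuple[str, Optional[datetime]]


def _utc_now() -> datetime:
    return datetime.now(timezone.utc)


class SecretCache:
    def __init__(
        self,
        resolve: Callable[[str], Entry],
        now: Callable[[], datetime] = _utc_now,
    ) -> None:
        self._resolve = resolve  # key is "<scheme>:<full path>"
        self._now = now          # injectable clock, useful in tests
        self._entries: Dict[str, Entry] = {}

    def get(self, key: str) -> str:
        entry = self._entries.get(key)
        if entry is not None:
            value, expires_at = entry
            # An entry whose expires_at is in the past counts as a miss.
            if expires_at is None or expires_at > self._now():
                return value
        value, expires_at = self._resolve(key)
        self._entries[key] = (value, expires_at)
        return value

    def refresh_secrets(self) -> None:
        """Drop everything; the next access re-resolves."""
        self._entries.clear()
```

Entries without an `expires_at` live as long as the cache object, which mirrors the lifetime-of-Config rule stated above.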
4. Loader integration
Config.load_from / loadFrom accepts a heterogeneous list of
ConfigSource and SecretSource instances. The loader dispatches
by interface.
Normative loader rules:
- Source ordering. ConfigSource order continues to define merge
  priority (ADR-0001 §3). SecretSource order does not define priority:
  each scheme has at most one registered source. Registering two
  SecretSource instances with the same scheme is a programming error
  (ConfigError(reason=validation_failed, details="duplicate SecretSource scheme")).
- Implicit env source. The loader MUST register a default
  EnvSecretSource if none is passed explicitly.
- Unknown scheme at load time. If a ${secret:<scheme>:...} token uses
  a scheme with no registered source AND no :- default, the binding
  raises ConfigError(reason=secret_unresolved) at load time, not at
  first read.
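The three loader rules can be sketched as two small functions. Everything here is illustrative: `partition_sources` and `check_unknown_schemes` are hypothetical names, interface dispatch is approximated by duck typing, and the `ConfigError` class is a stand-in carrying only `reason` and `details`.

```python
from typing import Any, Dict, Iterable, List, Optional, Tuple


class ConfigError(Exception):
    def __init__(self, reason: str, details: str = "") -> None:
        super().__init__(f"{reason}: {details}")
        self.reason = reason
        self.details = details


class EnvSecretSource:
    scheme = "env"

    def resolve(self, path: str, ctx: Any = None) -> str:
        raise NotImplementedError  # resolution is out of scope for this sketch


def partition_sources(sources: Iterable[Any]) -> Tuple[List[Any], Dict[str, Any]]:
    """Split a heterogeneous list into ordered config sources and a scheme map."""
    config_sources: List[Any] = []
    secret_sources: Dict[str, Any] = {}
    for src in sources:
        if hasattr(src, "scheme") and hasattr(src, "resolve"):
            if src.scheme in secret_sources:
                raise ConfigError("validation_failed", "duplicate SecretSource scheme")
            secret_sources[src.scheme] = src
        elif hasattr(src, "load"):
            config_sources.append(src)  # list order is merge priority
        else:
            raise TypeError(f"{src!r} is neither a ConfigSource nor a SecretSource")
    if "env" not in secret_sources:
        secret_sources["env"] = EnvSecretSource()  # implicit env source
    return config_sources, secret_sources


def check_unknown_schemes(
    refs: Iterable[Tuple[str, Optional[str]]], secret_sources: Dict[str, Any]
) -> None:
    """refs holds (scheme, default) pairs found while loading the tree."""
    for scheme, default in refs:
        if scheme not in secret_sources and default is None:
            raise ConfigError(
                "secret_unresolved", f"no SecretSource registered for scheme {scheme!r}"
            )
```

A token with an unregistered scheme but a `:-` default passes the load-time check, since the default can still satisfy the read.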
5. Error model — three new reasons
Three new entries in _meta/error_reasons.yaml:
| Name | Value | When |
|---|---|---|
| SECRET_UNRESOLVED | secret_unresolved | Reference cannot be resolved (no source, key missing, ?version= destroyed, #field absent). |
| SECRET_BACKEND_UNAVAILABLE | secret_backend_unavailable | Backend unreachable (network, DNS, auth at connect time). |
| SECRET_PERMISSION_DENIED | secret_permission_denied | Backend rejected the read with an authorisation error (Vault 403, AWS-SM AccessDeniedException). |
Three reasons (not one) because operators react differently:
- secret_unresolved → check the YAML and the backend key spelling.
- secret_backend_unavailable → check network / DNS / credentials.
- secret_permission_denied → check the Vault policy / AWS IAM.
source_id on these errors is the SecretSource.id (e.g.
vault:https://vault.example.com), distinct from a
ConfigSource.id. The details string also references the YAML
file the token came from:
ConfigError(
    path = "llm.api_key",
    reason = secret_unresolved,
    details = "vault:secret/dagstack/prod/openai → 404 Not Found "
              "(referenced from yaml:app-config.yaml)",
    source_id = "vault:https://vault.example.com",
)
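An adapter has to map backend outcomes onto these three reasons. The sketch below shows one possible mapping for an HTTP backend. Only the 403 and 404 cases come from the text above; treating a missing response, a 401 and any 5xx as secret_backend_unavailable is an assumption of this sketch, and `classify` is a hypothetical helper.

```python
from enum import Enum
from typing import Optional


class Reason(str, Enum):
    SECRET_UNRESOLVED = "secret_unresolved"
    SECRET_BACKEND_UNAVAILABLE = "secret_backend_unavailable"
    SECRET_PERMISSION_DENIED = "secret_permission_denied"


# Operator hints, taken from the list above.
HINTS = {
    Reason.SECRET_UNRESOLVED: "check the YAML and the backend key spelling",
    Reason.SECRET_BACKEND_UNAVAILABLE: "check network / DNS / credentials",
    Reason.SECRET_PERMISSION_DENIED: "check the Vault policy / AWS IAM",
}


def classify(status: Optional[int]) -> Reason:
    """Map an HTTP status (None = no response at all) to an error reason."""
    if status is None:
        return Reason.SECRET_BACKEND_UNAVAILABLE  # network or DNS failure
    if status == 403:
        return Reason.SECRET_PERMISSION_DENIED
    if status == 404:
        return Reason.SECRET_UNRESOLVED
    if status == 401 or 500 <= status < 600:
        # Assumption: failed auth at connect time and server errors both
        # count as "backend unavailable".
        return Reason.SECRET_BACKEND_UNAVAILABLE
    raise ValueError(f"no mapping for HTTP status {status}")
```

Raising on unmapped statuses keeps the sketch from silently guessing a reason the spec does not define.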
6. Pilot adapter — VaultSource (HashiCorp Vault KV v2)
The first adapter shipped in all three bindings.
- KV version: KV v2 only in Phase 2. KV v1 lacks versioning and
  soft-delete; ships in Phase 3 if requested.
- Auth methods (Phase 2 mandatory): Token, AppRole. Optional:
  Kubernetes ServiceAccount. Phase 3 adds AWS IAM, JWT/OIDC, TLS client cert.
- Namespace: passed at construction time (namespace="dagstack/prod");
  the adapter prepends automatically.
- Versioning: ${secret:vault:secret/.../db?version=3#password}.
  Cache key includes the version.
- #field projection: pluck a sub-key from a JSON envelope.
SDK choice per binding:
| Binding | Library | Packaging |
|---|---|---|
| Python | hvac>=2.0,<3.0 | pip install 'dagstack-config[vault]' |
| TypeScript | node-vault>=0.10 | npm install @dagstack/config node-vault (optional peer-dep) |
| Go | github.com/hashicorp/vault/api (official) | go get go.dagstack.dev/config/vault (separate sub-module) |
Each binding records its SDK choice in a per-language ADR
(adr/0001-vault-source.md) with version constraints and
deprecation policy.
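For orientation, the sketch below shows how a reference path and ?version= option could translate into a Vault KV v2 HTTP read, independent of any SDK. The `/v1/<mount>/data/<path>` route and the `X-Vault-Token` / `X-Vault-Namespace` headers are Vault's documented HTTP API. Treating the first path segment as the mount point and sending the namespace as a header are assumptions of this sketch, and both helper names are hypothetical.

```python
from typing import Dict, Optional
from urllib.parse import quote, urlencode


def kv2_read_url(addr: str, path: str, version: Optional[int] = None) -> str:
    """Build the KV v2 read URL for a reference path such as secret/dagstack/prod/db."""
    # Assumption: the mount point is the first segment of the reference path.
    mount, _, rest = path.partition("/")
    if not mount or not rest:
        raise ValueError("expected <mount>/<path>")
    url = f"{addr.rstrip('/')}/v1/{quote(mount)}/data/{quote(rest)}"
    if version is not None:
        url += "?" + urlencode({"version": version})
    return url


def kv2_headers(token: str, namespace: Optional[str] = None) -> Dict[str, str]:
    """Request headers for an authenticated read, optionally namespaced."""
    headers = {"X-Vault-Token": token}
    if namespace:
        headers["X-Vault-Namespace"] = namespace
    return headers
```

The field projection happens after the read, on the `data.data` object of the JSON response, so `#password` never appears in the URL.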
Migration story
# Phase 1
llm:
  api_key: "${OPENAI_API_KEY}"
Three steps, in operator-effort order:
Step 0 — no change. ${OPENAI_API_KEY} is identical to
${secret:env:OPENAI_API_KEY} (§1.1). Operators with no Vault do
nothing.
Step 1 — opt into the secret namespace, still using env.
llm:
  api_key: "${secret:env:OPENAI_API_KEY}"
Step 2 — point at Vault.
llm:
  api_key: "${secret:vault:secret/dagstack/prod/openai#api_key}"
The YAML wire format keeps ${VAR} working indefinitely. Future
bindings may emit a deprecation warning under a strict-mode lint;
the syntax remains normative.
A configuration example with secrets
The YAML below is shared by all three bindings; the loader code
follows for Python, TypeScript, and Go, in that order.
llm:
  api_key: "${secret:vault:secret/dagstack/prod/openai#api_key}"
database:
  host: "${DB_HOST:-localhost}"
  password: "${secret:vault:secret/dagstack/prod/db#password}"
fallback_key: "${secret:env:OPENAI_API_KEY:-sk-dev-placeholder}"
import os

from dagstack.config import Config, YamlFileSource
from dagstack.config.vault import AppRoleAuth, VaultSource

cfg = Config.load_from(
    [
        YamlFileSource("app-config.yaml"),
        VaultSource(
            addr="https://vault.example.com",
            auth=AppRoleAuth(
                role_id=os.environ["VAULT_ROLE_ID"],
                secret_id=os.environ["VAULT_SECRET_ID"],
            ),
            namespace="dagstack/prod",
        ),
        # EnvSecretSource auto-registered for ${secret:env:...}
    ],
    eager_secrets=True,
)

api_key = cfg.get_string("llm.api_key")
import {
  Config,
  YamlFileSource,
  VaultSource,
} from "@dagstack/config";

const cfg = await Config.loadFrom([
  new YamlFileSource("app-config.yaml"),
  new VaultSource({
    addr: "https://vault.example.com",
    auth: {
      kind: "approle",
      roleId: process.env.VAULT_ROLE_ID!,
      secretId: process.env.VAULT_SECRET_ID!,
    },
    namespace: "dagstack/prod",
  }),
  // EnvSecretSource auto-registered.
]);

const apiKey = cfg.getString("llm.api_key");
package main

import (
    "context"
    "os"

    "go.dagstack.dev/config"
    "go.dagstack.dev/config/vault"
)

func loadConfig(ctx context.Context) (*config.Config, error) {
    vaultSrc, err := vault.NewSource(
        "https://vault.example.com",
        vault.AppRoleAuth{
            RoleID:   os.Getenv("VAULT_ROLE_ID"),
            SecretID: os.Getenv("VAULT_SECRET_ID"),
        },
        vault.WithNamespace("dagstack/prod"),
    )
    if err != nil {
        return nil, err
    }
    return config.LoadFrom(ctx, []config.Source{
        config.YamlFileSource("app-config.yaml"),
        vaultSrc,
        // EnvSecretSource auto-registered.
    })
}
Consequences
Positive:
- Operator-grade secrets: Vault (and later cloud secret managers) as first-class config sources.
- No breaking change: ${VAR} keeps working; the new syntax is strictly additive.
- Type safety preserved: get_int / get_string / get_section continue
  to work transparently; secret resolution is invisible to the consumer.
- Pluggability: the SecretSource interface is what every backend
  implements; new schemes ship without changing the loader.
- Audit-ready: every resolution carries source_id and the original YAML location.
Trade-offs:
- Operational complexity: Vault adds a process dependency.
  Mitigated by SDK opt-in extras (Python [vault], TS peer-dep, Go sub-module).
- Spec surface area: three new _meta/*.yaml files, three new error
  reasons, two new interfaces. The cost of solving the real problem.
- Resolution-timing surprise: Python's lazy default means a
  secret error surfaces at first request. Mitigated by recommending
  eager_secrets=True for long-lived servers and making TS/Go eager-by-default.
Spec-distributed artefacts
New files in _meta/:
- _meta/secret_schemes.yaml: normative scheme registry. Source of truth for adapter dispatch and operator-facing "did you mean X?" messages.
- _meta/secret_ref_grammar.yaml: formal grammar for ${secret:...}, per-binding regex fragments, escape rules, query-parameter table.
- _meta/error_reasons.yaml: three new reasons appended.
- _meta/conformance_tags.yaml: phase2_secrets (env-scheme fixtures) and phase2_secrets_vault (Vault-backed fixtures, gated on DAGSTACK_CONFORMANCE_VAULT_ADDR).
Existing _meta/types.yaml gains rows for SecretSource,
AsyncSecretSource, SecretRef, SecretValue, SecretChangeEvent,
ResolveContext, EnvSecretSource, VaultSource.
Related ADRs
- ADR-0001 — Phase 1 base spec.
- ADR-0003 candidate — Phase 3 design discussion (push-based rotation, cloud secret managers, token-renewal background tasks).
- Per-binding Vault SDK ADRs: Python adr/0001-vault-source.md · TypeScript adr/0001-vault-source.md · Go adr/0001-vault-source.md.
Normative source
The full text with all six decisions in detail, the conformance
fixture catalogue, cross-binding CI extension, and the open
questions tracking sheet:
config-spec/adr/0002-secret-references-and-sources.md.