Engineering field note
We moved our Ansible deployment secrets into CertLocker
CertLocker now stores the secrets, PEM files, groups, tokens, and audit trail needed to deploy CertLocker itself.
This was good craic to build because it was not just a feature ticket. It was one of those infrastructure problems that only gives you a proper answer after you chase it through Ansible, Gateway tokens, PEM files, groups, deployment failures, and your own assumptions.
Ansible Vault was not the villain here. It did its job: encrypt sensitive values in a repo so we could keep deploying. The problem was the centre of gravity. As CertLocker matured, it became harder to justify keeping the sensitive deployment material in encrypted files beside the playbooks.
We are trying to be transparent about how CertLocker is being built. Some weeks are shiny UI work. Some weeks are HAProxy, Pebble, Gateway routes, Postgres keys, or deployment scripts. This was the second kind: useful infrastructure work that makes the deployment path cleaner.
CertLocker is meant to be an infrastructure trust control plane. If we still need a separate Vault-shaped dependency to deploy our own certificate platform, then the control plane is not finished. This migration is part of closing that loop.
What changed
The interesting part is the boundary this migration changes. We did not start with "replace Ansible Vault" as a slogan. We started with the pieces CertLocker already needed to support this kind of deployment:
- Gateway support for creating and reading secrets with automation tokens.
- Group-scoped access so records can belong to environments such as dev02 and dev03.
- A Gateway lookup endpoint that resolves a secret by stable name.
- PEM secret support for certificate private keys.
- ACME, HAProxy, Cloudflare DNS, Pebble, and Let's Encrypt certificate delivery.
- Deployment hardening in Ansible: V3 RAM-only secrets, vault hooks, cert material handling, and safer compose rendering.
This is not saying HashiCorp Vault is bad. For a lot of organisations it is the right answer. For this CertLocker deployment path, we now have enough native capability that a separate Vault cluster is not the right dependency.
The first fix was not code
The first fix was deciding the boundary. The repo should describe the system, not carry the secrets.
all.yml -> non-sensitive operational configuration
vault.yml -> values uploaded to CertLocker or resolved from CertLocker
Ports, image tags, public FQDNs, usernames, feature flags, backup schedules, and paths belong in all.yml.
PATs, passwords, master keys, ACME tokens, Cloudflare tokens, SMTP secrets, webhooks, and private keys belong in CertLocker.
The uploader no longer tries to guess whether a variable looks sensitive. If a value is still in vault.yml, it gets uploaded. The inventory layout is the source of truth.
The uploaded records
After the upload, the inventory names appear in CertLocker exactly where the Ansible lookup expects to resolve them.
- ansible.dev02.certlocker_stack... for variable-backed records.
- ansible.dev02.certs... for certificate material.
- CONFIGURATION for ordinary secret values and public certificate material.
- PEM for private keys.
- ansible plus the inventory name as groups.
- ACTIVE status for uploaded records.
The useful part is the shape: names, types, groups, status, and audit history all sit on the same record.
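To make that shape concrete, here is a minimal Python sketch of what one variable-backed record carries. The class and field names (`UploadedRecord`, `record_type`, `groups`, `status`) are illustrative assumptions, not CertLocker's actual schema; only the name prefix, the CONFIGURATION type, the ACTIVE status, and the "ansible plus inventory" grouping come from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UploadedRecord:
    """Hypothetical view of one uploaded record; not the real CertLocker schema."""

    name: str
    record_type: str          # "CONFIGURATION" or "PEM"
    groups: tuple[str, ...]   # CertLocker groups, not Ansible group_vars groups
    status: str = "ACTIVE"


def variable_record(inventory: str, group: str, variable: str) -> UploadedRecord:
    """Describe a variable-backed record for one inventory variable."""
    return UploadedRecord(
        name=f"ansible.{inventory}.{group}.{variable}",
        record_type="CONFIGURATION",
        groups=("ansible", inventory),
    )
```

Audit history is left out on purpose: in this sketch it is something the platform records about the entry, not something the uploader supplies.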
Stable names instead of ID babysitting
Our first version used CertLocker IDs in Ansible:
vault_grafana_admin_password_cl_id: "b5f9830b6"
vault_grafana_admin_password: >-
  {{ lookup('certlocker.secrets.secret',
     id=vault_grafana_admin_password_cl_id,
     field='secret.value') }}
It worked until records were wiped or recreated. Then the IDs changed and someone had to patch Git. That is not infrastructure trust. That is ID babysitting.
The better contract is a stable operational name:
ansible.<inventory>.<group>.<variable>
ansible.dev03.certlocker_stack.vault_grafana_admin_password
Gateway resolves that name:
GET /api/v1/secret/lookup?name=ansible.dev03.certlocker_stack.vault_grafana_admin_password
Then the Ansible collection fetches the current secret by the returned ID. The inventory never commits that ID.
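The two-step resolution can be sketched in Python roughly as follows. This is not the collection's code. The lookup path comes from the request above, but the fetch-by-ID path, the `id` field in the lookup response, the Bearer header, and the way the path joins the base URL are all assumptions made for illustration. The `secret.value` shape mirrors the `field='secret.value'` argument used in the playbooks.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

LOOKUP_PATH = "/api/v1/secret/lookup"      # taken from the note
SECRET_BY_ID_PATH = "/api/v1/secret/{id}"  # hypothetical, for illustration only


def http_get_json(url: str, token: str) -> dict:
    """GET a URL and decode the JSON body. Bearer auth is an assumption."""
    request = Request(url, headers={"Authorization": f"Bearer {token}"})
    with urlopen(request, timeout=10) as response:
        return json.load(response)


def resolve_secret(base_url: str, token: str, name: str, get_json=http_get_json) -> str:
    """Resolve a stable name to an ID, then fetch the current value by that ID."""
    base = base_url.rstrip("/")
    lookup_url = f"{base}{LOOKUP_PATH}?{urlencode({'name': name})}"
    record_id = get_json(lookup_url, token)["id"]        # assumed response field
    secret_url = base + SECRET_BY_ID_PATH.format(id=record_id)
    return get_json(secret_url, token)["secret"]["value"]
```

The ID only ever lives in a local variable for the duration of one call, which is the property that matters: a wiped and recreated record gets a new ID, and nothing in Git has to change.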
The Ansible collection
We built and released a small Ansible collection, certlocker.secrets, so playbooks can resolve CertLocker records at runtime from the controller. The collection is published at github.com/certlocker-io/ansilbe-certlocker; the point is to make this reusable rather than a private deployment trick.
For another team, the runtime path is intentionally small: install the collection, set the Gateway URL and token on the controller, then replace raw vault values with lookups by name. You do not need the managed hosts to know anything about CertLocker.
ansible-galaxy collection install git+https://github.com/certlocker-io/ansilbe-certlocker.git
export CERTLOCKER_API_PROFILE=gateway
export CERTLOCKER_API_URL="https://dev01.certlocker.io/rest"
export CERTLOCKER_TOKEN="${CERTLOCKER_GATEWAY_TOKEN}"
Normal secret lookup:
vault_certlocker_postgres_password: >-
  {{ lookup('certlocker.secrets.secret',
     name='ansible.dev03.certlocker_stack.vault_certlocker_postgres_password',
     field='secret.value') }}
PEM private key lookup:
- name: Install Postgres primary private key
  ansible.builtin.copy:
    content: "{{ lookup('certlocker.secrets.pem_key',
      name='ansible.dev03.certs.postgres-primary.server.key') }}"
    dest: "{{ certlocker_postgres_primary_cert_dir }}/server.key"
    owner: root
    group: root
    mode: "0600"
  no_log: true
The managed host does not need the CertLocker token. The lookup runs on the Ansible controller. Existing roles can keep using the same variable names; the value resolves from CertLocker at deploy time.
The uploader
The uploader is only there to help convert existing Ansible Vault variables and sensitive inventory files. It is not needed on every deploy. It reads what you already have, creates CertLocker records, and lets the playbook move to stable lookups. That tooling now sits alongside the released collection.
For an existing inventory, the conversion looks like this:
cd cl-ansible
CERTLOCKER_API_URL=https://dev01.certlocker.io/rest \
CERTLOCKER_TOKEN="${CERTLOCKER_GATEWAY_TOKEN}" \
./scripts/certlocker-upload-inventory-vaults.py --inventory dev03 --insecure
It scans:
inventory/*/group_vars/*/vault.yml
inventory/*/certs/**
It decrypts Ansible Vault files with ansible-vault view, uploads every remaining variable in vault.yml, skips existing lookup templates and old *_cl_id helpers, validates URI-friendly names, and uploads certificate files as CONFIGURATION or PEM.
.key / .pem -> PEM
.crt / .srl -> CONFIGURATION
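The selection rules above are small enough to sketch in Python. This is not the uploader's source. The function names are made up for illustration, the regular expression is one guess at what "URI-friendly" means, and detecting a lookup template by substring is an assumption. Only the suffix mapping and the skip rules come from the description above.

```python
import re
from pathlib import Path

PEM_SUFFIXES = {".key", ".pem"}
CONFIGURATION_SUFFIXES = {".crt", ".srl"}

# Assumption: "URI-friendly" means letters, digits, dot, underscore, hyphen.
URI_FRIENDLY = re.compile(r"[A-Za-z0-9._-]+")


def record_type_for(path: str):
    """Map a certificate file to a record type, or None if it is not uploaded."""
    suffix = Path(path).suffix.lower()
    if suffix in PEM_SUFFIXES:
        return "PEM"
    if suffix in CONFIGURATION_SUFFIXES:
        return "CONFIGURATION"
    return None


def is_lookup_template(value) -> bool:
    """True for values that already resolve from CertLocker at runtime."""
    return isinstance(value, str) and "lookup('certlocker.secrets." in value


def uploadable_variables(variables: dict) -> dict:
    """Keep every vault.yml variable except old ID helpers and existing lookups."""
    selected = {}
    for key, value in variables.items():
        if key.endswith("_cl_id") or is_lookup_template(value):
            continue
        if not URI_FRIENDLY.fullmatch(key):
            raise ValueError(f"variable name is not URI-friendly: {key!r}")
        selected[key] = value
    return selected
```

Note what is missing: there is no heuristic for whether a value "looks" secret. Anything left in vault.yml that is not already a lookup or an old ID helper is uploaded.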
After that, vault.yml can become lookup-only. That is the part most teams should care about: the conversion is a one-time upload, and Ansible keeps reading variables from the same places.
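A lookup-only entry is mechanical to produce once the stable name is known. The helper below is a hypothetical Python sketch of that rendering step, not part of the released tooling. It reuses the ansible.<inventory>.<group>.<variable> naming contract and the lookup form shown earlier.

```python
def lookup_entry(inventory: str, group: str, variable: str) -> str:
    """Render a lookup-only vault.yml entry for one variable (illustrative helper)."""
    name = f"ansible.{inventory}.{group}.{variable}"
    # The Jinja braces sit in plain (non f-string) literals so they stay literal.
    return (
        f"{variable}: >-\n"
        "  {{ lookup('certlocker.secrets.secret',\n"
        f"     name='{name}',\n"
        "     field='secret.value') }}\n"
    )


if __name__ == "__main__":
    print(lookup_entry("dev03", "certlocker_stack", "vault_certlocker_postgres_password"))
```

Because the variable name on the left does not change, existing roles keep working without edits.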
New development environments
A new environment starts from our own components: Ansible inventory, generated Postgres TLS material, CertLocker Gateway, CertLocker secrets, HAProxy/ACME delivery, and the same group/audit model we expose to users.
cd cl-ansible
CERTLOCKER_TOKEN="${CERTLOCKER_GATEWAY_TOKEN}" \
./scripts/init-new-environment.sh dev09 trust.certlocker.io dev01.certlocker.io
The script creates the inventory, generates bootstrap secrets and Postgres certs, uploads the values and PEM files to CertLocker, then writes lookup-only references back into vault.yml.
vault_certlocker_postgres_password: >-
  {{ lookup('certlocker.secrets.secret',
     name='ansible.dev09.certlocker_stack.vault_certlocker_postgres_password',
     field='secret.value') }}
Why this matters
We are always growing the product, and we are trying to be transparent about the work as we go. I enjoyed this one because it made us run CertLocker through our own deployment process, not just a prepared demo path.
CertLocker already handles certificates, ACME delivery, private secrets, tokens, groups, probes, bastion access, and audit evidence. Moving our own Ansible deployment secrets into CertLocker is where those pieces start to click together.
Ansible owns configuration.
CertLocker owns sensitive material.
Stable names connect the two.
The repo becomes easier to review. Rotations stop being Git edits. A database rebuild does not strand Ansible on dead IDs. PEM files follow the same access model as passwords and PATs. Groups give us an inventory boundary. Audit gives us the story afterwards.
The headline is not "we added an Ansible lookup". The point is that CertLocker is now being used as the trust plane for its own infrastructure.