Implementing DNSSEC: What We Learned the Hard Way
DNSSEC is one of those things where the RFC makes it sound reasonable, the blog posts make it sound easy, and production makes you question your career choices. We deployed DNSSEC across our PowerDNS infrastructure at Hopbox, and while it’s been stable for a while now, the path there was paved with broken resolutions and frantic debugging sessions.
Here’s what we learned. Hopefully it saves you some of the pain.
Why DNSSEC (And Why Most People Skip It)
DNS, by default, has zero authentication. When you query for example.com and get back 93.184.216.34, you’re trusting that nobody between you and the authoritative server tampered with the response. That trust is… optimistic.
DNSSEC adds cryptographic signatures to DNS responses. The resolver can verify that the answer actually came from the zone owner and hasn’t been modified in transit. It’s the difference between a postcard and a sealed letter.
So why do most people skip it? Because:
- It’s operationally complex (key management, rollovers, DS record coordination)
- A misconfiguration doesn’t just degrade service — it breaks resolution entirely for validating resolvers
- The failure modes are silent and confusing
- It doesn’t encrypt anything (that’s what DoH/DoT are for)
Despite all that, we think it’s worth doing. DNS cache poisoning is a real attack vector, and DNSSEC is the only defense that works at the protocol level.
Key Types: KSK vs ZSK
DNSSEC uses two types of keys:
Key Signing Key (KSK) — signs the DNSKEY record set. This is the “root of trust” for your zone. Its hash (the DS record) gets published in the parent zone. KSKs are typically 2048-bit RSA or Algorithm 13 (ECDSA P-256) and are rolled infrequently.
Zone Signing Key (ZSK) — signs everything else (A, AAAA, MX, TXT, etc.). These are rolled more frequently because they’re used more and thus have higher exposure.
In PowerDNS, you can see your zone’s keys with:
$ pdnsutil show-zone example-customer.com
Zone is actively secured
Zone has the following keys:
ID = 1, flags = 257 (KSK), tag = 12345, algo = 13 (ECDSAP256SHA256), bits = 256    Active: 1, Published: 1
DNSKEY = example-customer.com. IN DNSKEY 257 3 13 aBcDeFgH...
DS = example-customer.com. IN DS 12345 13 2 abcdef0123456789...

ID = 2, flags = 256 (ZSK), tag = 54321, algo = 13 (ECDSAP256SHA256), bits = 256    Active: 1, Published: 1
DNSKEY = example-customer.com. IN DNSKEY 256 3 13 xYzAbCdE...

The flags = 257 indicates a KSK (256 + 1, where the 1 is the SEP, or Secure Entry Point, bit). flags = 256 is a ZSK.
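The flag arithmetic is easy to check mechanically. Here is a minimal sketch, assuming a POSIX shell; dnskey_role is a hypothetical helper name of ours, not something PowerDNS or dig ships:

```shell
# Classify a DNSKEY by its flags field.
# Bit value 256 marks a zone key; bit value 1 is the SEP bit.
dnskey_role() {
    if [ $(( $1 & 256 )) -eq 0 ]; then
        echo "not-a-zone-key"
    elif [ $(( $1 & 1 )) -ne 0 ]; then
        echo "KSK"
    else
        echo "ZSK"
    fi
}

# Possible usage (needs network access, so it is left commented out):
#   dig +short example-customer.com DNSKEY |
#       while read -r flags rest; do dnskey_role "$flags"; done
```

Testing the bits rather than comparing against 257 and 256 literally means a key with extra flag bits set still gets classified by its SEP bit.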
The Signing Workflow
With PowerDNS, signing is mostly automatic once you set it up. Here’s the basic flow:
# 1. Enable DNSSEC for a zone
$ pdnsutil secure-zone example-customer.com
Securing zone with default key size
Adding KSK with algorithm ecdsa256sha256
Adding ZSK with algorithm ecdsa256sha256
Zone example-customer.com secured

# 2. Verify the zone signs correctly
$ pdnsutil check-zone example-customer.com
zone example-customer.com is valid

# 3. Get the DS records to publish at the registrar
$ pdnsutil show-zone example-customer.com | grep "^ DS"
 DS = example-customer.com. IN DS 12345 13 2 abcdef0123456789abcdef0123456789abcdef0123456789abcdef0123456789

Step 3 is the critical handoff. You take that DS record and publish it in the parent zone (usually via your domain registrar’s control panel, or via the registry’s API if you’re a registrar yourself). Until the DS record is published upstream, DNSSEC validation won’t work: resolvers have no way to build the chain of trust.
You can verify the chain with dig:
$ dig @8.8.8.8 example-customer.com A +dnssec +short
203.0.113.50
A 13 2 300 20260425120000 20260326120000 54321 example-customer.com. aBcDeFgHiJkLmNoPqRsTuVwXyZ...

The second line is the RRSIG, the signature over the A record set. If you see that, signing is working.
Key Rollover: The Part Everyone Gets Wrong
Keys don’t last forever. Algorithms get deprecated, keys get compromised, or you just want to follow best practices and rotate periodically. The rollover process is where most DNSSEC deployments break.
ZSK Rollover (the easier one)
ZSK rollovers are simpler because they don’t involve the parent zone. You can do a pre-publish rollover:
# 1. Add the new ZSK (published but not yet active)
$ pdnsutil add-zone-key example-customer.com zsk inactive published 256

# 2. Wait for the new DNSKEY to propagate (at least 2x the DNSKEY TTL)
# This ensures all caches have seen both the old and new ZSK

# 3. Activate the new ZSK and deactivate the old one
$ pdnsutil activate-zone-key example-customer.com <new-key-id>
$ pdnsutil deactivate-zone-key example-customer.com <old-key-id>

# 4. Wait again (2x the maximum zone TTL this time)
# Cached RRSIGs signed with the old key need to expire

# 5. Remove the old ZSK
$ pdnsutil remove-zone-key example-customer.com <old-key-id>

The waiting periods are not optional. Skip them and you’ll have a window where resolvers have cached signatures from the old key but can’t verify them because the old DNSKEY is gone. That means SERVFAIL for validating resolvers.
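To make the two waits concrete, here is a small sketch that turns the 2x rule of thumb from the steps above into seconds. It assumes a POSIX shell; zsk_rollover_waits is a hypothetical helper name, and the TTL values in the usage line are made up for illustration:

```shell
# Print the two minimum waits, in seconds, for a pre-publish ZSK rollover,
# using the 2x rule of thumb from the steps above.
# $1 = TTL of the DNSKEY record set
# $2 = largest TTL of any record set in the zone
zsk_rollover_waits() {
    echo "publish_wait=$(( 2 * $1 )) retire_wait=$(( 2 * $2 ))"
}

# Example with illustrative TTLs (1 hour DNSKEY TTL, 1 day maximum zone TTL):
#   zsk_rollover_waits 3600 86400
```

Treat the output as a floor, not a schedule. Waiting longer costs nothing; waiting less is how you end up with SERVFAIL.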
KSK Rollover (the hard one)
KSK rollovers require coordination with the parent zone because the DS record must be updated. The double-DS method works like this:
1. Generate new KSK, publish it in the DNSKEY set
2. Publish DS records for both old and new KSK in the parent zone
3. Wait for parent zone DS propagation (this is the painful part: you’re at the mercy of the registrar/registry)
4. Activate the new KSK, deactivate the old
5. Remove the old DS from the parent zone
6. Remove the old KSK from your zone
Step 3 is where things go wrong. Some registrars update DS records within minutes. Others take hours. A few take days. And there’s no reliable way to check propagation across all caching resolvers except to wait and test from multiple vantage points.
# Check DS records at the parent
$ dig @a.gtld-servers.net example-customer.com DS +short
12345 13 2 abcdef0123456789...
67890 13 2 fedcba9876543210...

When you see both DS records at the parent, you’re safe to proceed. When you see only one, wait.
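Eyeballing that output gets old during a long registrar wait, so the check can be scripted. This is a sketch, assuming a POSIX shell with awk and grep; ds_tags_present is a hypothetical helper name, and the key tags are the example values from above:

```shell
# Succeed only if every expected key tag appears in `dig ... DS +short`
# output read from stdin. The key tag is the first field of each DS line.
ds_tags_present() (
    found=$(awk '{ print $1 }')
    for tag in "$@"; do
        # -x matches whole lines, so tag 1234 does not match 12345
        echo "$found" | grep -qx "$tag" || exit 1
    done
    exit 0
)

# Possible usage (needs network access, so it is left commented out):
#   dig @a.gtld-servers.net example-customer.com DS +short |
#       ds_tags_present 12345 67890 && echo "both DS records are at the parent"
```

Run it against more than one of the parent’s nameservers before moving on, since they don’t all update at the same moment.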
Debugging with delv and dig +dnssec
When DNSSEC breaks, delv is your best friend. It’s like dig but specifically designed for DNSSEC validation debugging:
$ delv @8.8.8.8 example-customer.com A +rtrace +vtrace
;; fetch: example-customer.com/A
;; fetch: example-customer.com/DNSKEY
;; fetch: example-customer.com/DS
;; validating example-customer.com/A: starting
;; validating example-customer.com/A: attempting positive response validation
;; validating example-customer.com/DNSKEY: starting
;; validating example-customer.com/DNSKEY: attempting positive response validation
; fully validated
example-customer.com. 300 IN A 203.0.113.50
example-customer.com. 300 IN RRSIG A 13 2 300 20260425120000 20260326120000 54321 example-customer.com. aBcDeFgH...

The +rtrace flag logs each fetch and +vtrace logs each validation step, so together they show you the validation chain step by step. When something fails, you’ll see exactly where:
$ delv @8.8.8.8 broken-dnssec.example.com A +rtrace +vtrace
;; fetch: broken-dnssec.example.com/A
;; fetch: broken-dnssec.example.com/DNSKEY
;; fetch: broken-dnssec.example.com/DS
;; validating broken-dnssec.example.com/DNSKEY: starting
;; validating broken-dnssec.example.com/DNSKEY: no valid signature found
;; broken trust chain
;; resolution failed: SERVFAIL

“No valid signature found” on the DNSKEY usually means the DS record in the parent doesn’t match any of your published DNSKEYs. This happens after a botched KSK rollover.
For quick checks, dig +dnssec +cd is also useful. The +cd flag (Checking Disabled) tells the resolver to skip validation, so you can see the actual response even when DNSSEC is broken:
$ dig @8.8.8.8 broken-dnssec.example.com A +dnssec +cd
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 28374
;; flags: qr rd ra cd; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; ANSWER SECTION:
broken-dnssec.example.com. 300 IN A 203.0.113.50
broken-dnssec.example.com. 300 IN RRSIG A 13 2 300 20260425120000 20260326120000 54321 broken-dnssec.example.com. ...

The response comes back fine: the data is there, the signatures are there. The problem is upstream in the chain of trust. Without +cd, the same query would return SERVFAIL.
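The with-and-without +cd comparison works as a quick triage rule. Here is a sketch of it, assuming a POSIX shell with sed; dig_status and classify_lookup are hypothetical helper names, and the labels they print are our own:

```shell
# Extract the status (rcode) from full dig output read on stdin.
dig_status() {
    sed -n 's/.*status: \([A-Z]*\),.*/\1/p' | head -n 1
}

# Interpret a pair of statuses for the same query.
# $1 = status from a normal query, $2 = status from the same query with +cd
classify_lookup() {
    if [ "$1" = "SERVFAIL" ] && [ "$2" = "NOERROR" ]; then
        echo "likely-dnssec-validation-failure"
    elif [ "$1" = "SERVFAIL" ]; then
        echo "failure-not-specific-to-dnssec"
    else
        echo "resolves"
    fi
}

# Possible usage (needs network access, so it is left commented out):
#   plain=$(dig @8.8.8.8 broken-dnssec.example.com A +dnssec | dig_status)
#   cd=$(dig @8.8.8.8 broken-dnssec.example.com A +dnssec +cd | dig_status)
#   classify_lookup "$plain" "$cd"
```

If both queries return SERVFAIL, the problem is probably not DNSSEC at all, and you can skip straight to checking the authoritative servers.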
Common Pitfalls
Clock Skew
DNSSEC signatures have inception and expiration timestamps. If your server’s clock is off, it will generate signatures that appear to be from the future or already expired. NTP is not optional on DNSSEC-signing servers.
# Check signature validity window
$ dig example-customer.com RRSIG +short | head -1
A 13 2 300 20260425120000 20260326120000 54321 example-customer.com. ...
#          ^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^
#          expiration     inception

If a resolver’s clock reads earlier than the inception time or later than the expiration time, it rejects the signature and validation fails. Skew on either side, yours or the resolver’s, can cause this.
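The window check itself is a plain comparison, because the YYYYMMDDHHMMSS timestamps that dig prints sort correctly as integers. A minimal sketch, assuming a POSIX shell; rrsig_window_ok is a hypothetical helper name:

```shell
# Succeed if NOW lies inside the RRSIG validity window.
# All three arguments use the YYYYMMDDHHMMSS form (UTC) shown by dig.
# $1 = expiration, $2 = inception, $3 = current time
rrsig_window_ok() {
    [ "$2" -le "$3" ] && [ "$3" -le "$1" ]
}

# Possible usage with the local clock:
#   rrsig_window_ok 20260425120000 20260326120000 "$(date -u +%Y%m%d%H%M%S)" \
#       || echo "signature is outside its validity window"
```

Running this on the signing server with its own clock is a cheap way to catch skew before resolvers do.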
Missing DS Records
You’ve signed your zone, everything looks good locally, but external resolvers never report your answers as validated. The most common cause: you forgot to publish the DS record at the parent. Or the registrar’s API accepted it but it hasn’t propagated yet.

# No DS record = no chain of trust
$ dig @a.gtld-servers.net example.com DS +short
(empty response)

If this is empty but your zone is signed, validating resolvers treat the zone as unsigned: queries still resolve, but nothing is verified and the AD flag is never set, so the signing work buys you no protection. The state that actually breaks resolution is the opposite one, where a DS record exists at the parent but matches none of your DNSKEYs. In that case only validating resolvers return SERVFAIL while non-validating resolvers work fine, which makes the bug reports intermittent and confusing.
Algorithm Mismatches
If your DS record says algorithm 13 (ECDSA) but your DNSKEY is algorithm 8 (RSA-SHA256), validation fails. This typically happens during algorithm rollovers or when DS records are manually entered at the registrar with a typo.
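A mismatch like this can be caught by comparing two fields. The sketch below assumes a POSIX shell with awk and the field order dig prints with +short; ds_dnskey_algo_match is a hypothetical helper name:

```shell
# Compare the algorithm number in a DS record with the one in a DNSKEY.
# $1 = one line of `dig ... DS +short`     (key tag, algorithm, digest type, digest)
# $2 = one line of `dig ... DNSKEY +short` (flags, protocol, algorithm, key)
ds_dnskey_algo_match() {
    ds_algo=$(echo "$1" | awk '{ print $2 }')
    key_algo=$(echo "$2" | awk '{ print $3 }')
    [ -n "$ds_algo" ] && [ "$ds_algo" = "$key_algo" ]
}

# Example with placeholder records:
#   ds_dnskey_algo_match "12345 13 2 abcdef" "257 3 8 AwEAAa" \
#       || echo "DS and DNSKEY algorithms differ"
```

It only compares algorithm numbers. A DS record with the right algorithm but a wrong digest still fails validation.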
# Check algorithm consistency
$ pdnsutil show-zone example-customer.com | grep algo
 algo = 13 (ECDSAP256SHA256)

$ dig example-customer.com DS +short
12345 13 2 abcdef...
#     ^^ algorithm must match

Expired Signatures
PowerDNS re-signs records automatically, but if the signing process breaks (database issues, key problems), you can end up with expired signatures in your zone. The zone still serves responses, but they fail validation.
# Check if signatures are current
$ dig example-customer.com A +dnssec +short
203.0.113.50
A 13 2 300 20260125120000 20251226120000 54321 example-customer.com. ...
#          ^^^^^^^^^^^^^^
# This date is in the past = expired signature = SERVFAIL

Monitoring DNSSEC Health
We monitor DNSSEC health from the outside. An internal check only tells you that signing is working; it doesn’t tell you that the chain of trust is intact from a resolver’s perspective.
Our monitoring does two things:
- Validates from external resolvers: queries our zones through Google (8.8.8.8) and Cloudflare (1.1.1.1) with +dnssec and checks the AD (Authenticated Data) flag
- Checks DS record presence: queries the parent zone’s nameservers for our DS records to ensure they haven’t been accidentally removed
# External validation check
$ dig @8.8.8.8 example-customer.com A +dnssec | grep flags
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
#                  ^^ AD flag = validated successfully

# DS record presence check
$ dig @a.gtld-servers.net example-customer.com DS +short | wc -l
1

If the AD flag disappears or the DS record count drops to 0, we get paged.
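The AD flag check is the piece most worth scripting, because a loose grep for “ad” matches more than you’d like. This is a sketch of one way to do it, not our production monitor; it assumes a POSIX shell with awk, and has_ad_flag is a hypothetical helper name:

```shell
# Succeed if the flags line of dig output (read from stdin) contains
# the AD flag as a whole word.
has_ad_flag() {
    awk -F';' '
        # With ";" as the separator, field 3 holds " flags: qr rd ra ad"
        /^;; flags:/ { if ($3 ~ /(^| )ad( |$)/) found = 1 }
        END { exit !found }
    '
}

# Possible usage (needs network access, so it is left commented out):
#   dig @8.8.8.8 example-customer.com A +dnssec | has_ad_flag \
#       || echo "ALERT: answer was not validated"
```

Matching inside the flags field only keeps stray text elsewhere in the output, such as a hostname containing “ad”, from producing a false positive.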
DNSSEC isn’t fun to deploy. But once it’s running and you have monitoring in place, it mostly stays out of the way. The key (pun intended) is to never rush key rollovers, always test from external vantage points, and keep your clocks synchronized. And maybe keep delv in your muscle memory — you’ll need it.