Implementing DNSSEC: What We Learned the Hard Way

DNSSEC is one of those things where the RFC makes it sound reasonable, the blog posts make it sound easy, and production makes you question your career choices. We deployed DNSSEC across our PowerDNS infrastructure at Hopbox, and while it’s been stable for a while now, the path there was paved with broken resolutions and frantic debugging sessions.

Here’s what we learned. Hopefully it saves you some of the pain.

DNS, by default, has zero authentication. When you query for example.com and get back 93.184.216.34, you’re trusting that nobody between you and the authoritative server tampered with the response. That trust is… optimistic.

DNSSEC adds cryptographic signatures to DNS responses. The resolver can verify that the answer actually came from the zone owner and hasn’t been modified in transit. It’s the difference between a postcard and a sealed letter.

So why do most people skip it? Because:

  • It’s operationally complex (key management, rollovers, DS record coordination)
  • A misconfiguration doesn’t just degrade service — it breaks resolution entirely for validating resolvers
  • The failure modes are silent and confusing
  • It doesn’t encrypt anything (that’s what DoH/DoT are for)

Despite all that, we think it’s worth doing. DNS cache poisoning is a real attack vector, and DNSSEC is the only defense that lets a resolver cryptographically verify the answer instead of just making spoofing harder.

DNSSEC uses two types of keys:

Key Signing Key (KSK): signs the DNSKEY record set. This is the “root of trust” for your zone. A hash of it (the DS record) gets published in the parent zone. KSKs typically use 2048-bit RSA (algorithm 8) or ECDSA P-256 (algorithm 13) and are rolled infrequently.

Zone Signing Key (ZSK): signs everything else (A, AAAA, MX, TXT, etc.). These are rolled more frequently because they’re used more and thus have higher exposure.

In PowerDNS, you can see your zone’s keys with:

$ pdnsutil show-zone example-customer.com
Zone is actively secured
Zone has the following keys:
ID = 1, flags = 257 (KSK), tag = 12345, algo = 13 (ECDSAP256SHA256), bits = 256
Active: 1, Published: 1
DNSKEY = example-customer.com. IN DNSKEY 257 3 13 aBcDeFgH...
DS = example-customer.com. IN DS 12345 13 2 abcdef0123456789...
ID = 2, flags = 256 (ZSK), tag = 54321, algo = 13 (ECDSAP256SHA256), bits = 256
Active: 1, Published: 1
DNSKEY = example-customer.com. IN DNSKEY 256 3 13 xYzAbCdE...

flags = 257 marks a KSK: 256 is the Zone Key bit, and the extra 1 is the SEP (Secure Entry Point) bit. flags = 256, with no SEP bit, is a ZSK.
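
As a quick illustration, the flags value can be decoded bit by bit. This is a minimal sketch in POSIX shell; `decode_dnskey_flags` is a hypothetical helper name, not part of pdnsutil, and the bit values (256 = Zone Key, 128 = Revoke, 1 = SEP) are taken from RFC 4034 and RFC 5011.

```shell
# Hypothetical helper: decode a DNSKEY flags value into its named bits.
decode_dnskey_flags() (
  flags=$1
  out=""
  [ $(( $flags & 256 )) -ne 0 ] && out="$out ZONE"    # Zone Key bit
  [ $(( $flags & 128 )) -ne 0 ] && out="$out REVOKE"  # RFC 5011 revoke bit
  [ $(( $flags & 1 )) -ne 0 ] && out="$out SEP"       # Secure Entry Point bit
  echo "${out# }"                                     # drop the leading space
)

decode_dnskey_flags 257   # prints: ZONE SEP
decode_dnskey_flags 256   # prints: ZONE
```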

With PowerDNS, signing is mostly automatic once you set it up. Here’s the basic flow:

# 1. Enable DNSSEC for a zone
$ pdnsutil secure-zone example-customer.com
Securing zone with default key size
Adding KSK with algorithm ecdsa256sha256
Adding ZSK with algorithm ecdsa256sha256
Zone example-customer.com secured
# 2. Verify the zone signs correctly
$ pdnsutil check-zone example-customer.com
zone example-customer.com is valid
# 3. Get the DS records to publish at the registrar
$ pdnsutil show-zone example-customer.com | grep "^DS"
DS = example-customer.com. IN DS 12345 13 2 abcdef0123456789abcdef0123456789abcdef0123456789abcdef0123456789

Step 3 is the critical handoff. You take that DS record and publish it in the parent zone (usually via your domain registrar’s control panel, or via the registry’s API if you’re a registrar yourself). Until the DS record is published upstream, DNSSEC validation won’t work — resolvers have no way to build the chain of trust.
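
One way to confirm the handoff is to compare the DS you generated locally with what the parent serves. The sketch below is not our production tooling; `normalize_ds` and `ds_published` are hypothetical helper names. It assumes the parent's answer arrives on stdin as DS rdata lines (the `dig +short` format), and that the digest may come back upper-cased or split by a space.

```shell
# Hypothetical helpers. DS rdata is "keytag algorithm digesttype digest".
# normalize_ds joins a digest that was split by spaces and upper-cases it.
normalize_ds() {
  awk 'NF >= 4 { d = ""; for (i = 4; i <= NF; i++) d = d $i; print $1, $2, $3, toupper(d) }'
}

# ds_published "<local DS rdata>" reads the parent's DS rdata on stdin and
# succeeds only if the local DS is among the published ones.
ds_published() (
  want=$(printf '%s\n' "$1" | normalize_ds)
  [ -n "$want" ] || exit 2          # local DS was empty or malformed
  normalize_ds | grep -Fxq -- "$want"
)

# Usage (needs network access, so left as a comment):
#   dig @a.gtld-servers.net example-customer.com DS +short \
#     | ds_published "12345 13 2 abcdef..." && echo "DS is live at the parent"
```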

You can verify the chain with dig:

$ dig @8.8.8.8 example-customer.com A +dnssec +short
203.0.113.50
A 13 2 300 20260425120000 20260326120000 54321 example-customer.com. aBcDeFgHiJkLmNoPqRsTuVwXyZ...

The second line is the RRSIG — the signature over the A record set. If you see that, signing is working.
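
The RRSIG line is dense, so it can help to label its fields. This is a small sketch, assuming the `+short` field order shown above (which follows the RRSIG rdata layout in RFC 4034); `explain_rrsig` is a hypothetical helper name.

```shell
# Hypothetical helper: label the fields of RRSIG rdata read from stdin.
# Lines with fewer than 9 fields (such as the bare A record) are skipped.
explain_rrsig() {
  awk 'NF >= 9 {
    printf "covers=%s algorithm=%s labels=%s original_ttl=%s expiration=%s inception=%s key_tag=%s signer=%s\n",
      $1, $2, $3, $4, $5, $6, $7, $8
  }'
}

# Usage (needs network access, so left as a comment):
#   dig @8.8.8.8 example-customer.com A +dnssec +short | explain_rrsig
```

The key_tag field should match the tag of the active ZSK in the pdnsutil show-zone output.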

Key Rollover: The Part Everyone Gets Wrong

Keys don’t last forever. Algorithms get deprecated, keys get compromised, or you just want to follow best practices and rotate periodically. The rollover process is where most DNSSEC deployments break.

ZSK rollovers are simpler because they don’t involve the parent zone. You can do a pre-publish rollover:

# 1. Add the new ZSK (published but not yet active)
$ pdnsutil add-zone-key example-customer.com zsk inactive published 256
# 2. Wait for the new DNSKEY to propagate (at least 2x the DNSKEY TTL)
# This ensures all caches have seen both the old and new ZSK
# 3. Activate the new ZSK and deactivate the old one
$ pdnsutil activate-zone-key example-customer.com <new-key-id>
$ pdnsutil deactivate-zone-key example-customer.com <old-key-id>
# 4. Wait again (2x the maximum zone TTL this time)
# Cached RRSIGs signed with the old key need to expire
# 5. Remove the old ZSK
$ pdnsutil remove-zone-key example-customer.com <old-key-id>

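The waiting periods are not optional. Skip the first and resolvers holding a cached DNSKEY set without the new key will receive signatures they can’t verify. Skip the second and resolvers will have cached signatures from the old key but can’t verify them because the old DNSKEY is gone. Either way, that means SERVFAIL for validating resolvers.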
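
To put numbers on those waits, something like the following can help. It is a rough sketch: `max_ttl` and `rollover_wait` are hypothetical helper names, the input is assumed to be records in dig's answer format, and the nameserver in the usage comment is made up.

```shell
# Hypothetical helper: read records in dig's answer format
# ("name TTL IN type rdata") on stdin and print the largest TTL seen.
max_ttl() {
  awk '$3 == "IN" && $2 ~ /^[0-9]+$/ { if ($2 + 0 > max) max = $2 + 0 }
       END { print max + 0 }'
}

# rollover_wait <ttl> prints the conservative 2x wait, in seconds.
rollover_wait() {
  echo $(( 2 * $1 ))
}

# Usage, assuming zone transfers are allowed from where you run this
# (ns1.example-customer.com is a placeholder):
#   rollover_wait "$(dig @ns1.example-customer.com example-customer.com AXFR | max_ttl)"
```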

KSK rollovers require coordination with the parent zone because the DS record must be updated. The double-DS method works like this:

  1. Generate new KSK, publish it in the DNSKEY set
  2. Publish DS records for both old and new KSK in the parent zone
  3. Wait for parent zone DS propagation, plus the DS TTL so caches drop the old single-DS set (this is the painful part: you’re at the mercy of the registrar/registry)
  4. Activate the new KSK, deactivate the old
  5. Wait out the DNSKEY TTL, then remove the old DS from the parent zone
  6. Remove the old KSK from your zone

Step 3 is where things go wrong. Some registrars update DS records within minutes. Others take hours. A few take days. And there’s no reliable way to check propagation across all caching resolvers except to wait and test from multiple vantage points.

# Check DS records at the parent
$ dig @a.gtld-servers.net example-customer.com DS +short
12345 13 2 abcdef0123456789...
67890 13 2 fedcba9876543210...

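Seeing both DS records at the parent is necessary but not sufficient: resolvers may still have the old single-DS set cached, so wait at least the DS TTL (drop +short to see it) before proceeding. When you see only one, keep waiting.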
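
That check is easy to script so it can run in a loop while you wait on the registrar. A minimal sketch, assuming the key tags from the example output; `both_ds_present` is a hypothetical helper name.

```shell
# Hypothetical helper: reads "dig ... DS +short" output on stdin and succeeds
# only if DS records for both the old and the new key tag are present.
both_ds_present() (
  old_tag=$1
  new_tag=$2
  tags=$(awk 'NF >= 4 { print $1 }')   # first field of DS rdata is the key tag
  printf '%s\n' "$tags" | grep -qx -- "$old_tag" &&
    printf '%s\n' "$tags" | grep -qx -- "$new_tag"
)

# Usage (needs network access, so left as a comment):
#   dig @a.gtld-servers.net example-customer.com DS +short | both_ds_present 12345 67890
```

It is worth running this against each of the parent's nameservers rather than just one, since they do not always update at the same moment.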

When DNSSEC breaks, delv is your best friend. It’s like dig but specifically designed for DNSSEC validation debugging:

$ delv @8.8.8.8 example-customer.com A +rtrace +vtrace
;; fetch: example-customer.com/A
;; fetch: example-customer.com/DNSKEY
;; fetch: example-customer.com/DS
;; validating example-customer.com/A: starting
;; validating example-customer.com/A: attempting positive response validation
;; validating example-customer.com/DNSKEY: starting
;; validating example-customer.com/DNSKEY: attempting positive response validation
; fully validated
example-customer.com. 300 IN A 203.0.113.50
example-customer.com. 300 IN RRSIG A 13 2 300 20260425120000 20260326120000 54321 example-customer.com. aBcDeFgH...

The +rtrace flag logs every fetch delv makes, and adding +vtrace logs the validator’s decisions step by step. When something fails, you’ll see exactly where:

$ delv @8.8.8.8 broken-dnssec.example.com A +rtrace +vtrace
;; fetch: broken-dnssec.example.com/A
;; fetch: broken-dnssec.example.com/DNSKEY
;; fetch: broken-dnssec.example.com/DS
;; validating broken-dnssec.example.com/DNSKEY: starting
;; validating broken-dnssec.example.com/DNSKEY: no valid signature found
;; broken trust chain
;; resolution failed: SERVFAIL

“No valid signature found” on the DNSKEY usually means the DS record in the parent doesn’t match any of your published DNSKEYs. This happens after a botched KSK rollover.
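
For scripted checks, delv's output can be reduced to a single verdict. This is a sketch rather than a complete parser: `delv_verdict` is a hypothetical helper name, and it keys on the status lines shown above plus `; unsigned answer`, which delv prints for zones with no chain of trust. Treat the exact strings as assumptions that may vary between BIND versions.

```shell
# Hypothetical helper: reduce delv output on stdin to one word.
delv_verdict() {
  awk '
    /^; fully validated/     { v = "secure" }
    /^; unsigned answer/     { v = "insecure" }
    /^;; resolution failed/  { v = "failed" }
    END { print (v == "" ? "unknown" : v) }
  '
}

# Usage (needs network access, so left as a comment):
#   delv @8.8.8.8 example-customer.com A 2>&1 | delv_verdict
```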

For quick checks, dig +dnssec +cd is also useful. The +cd flag (Checking Disabled) tells the resolver to skip validation, so you can see the actual response even when DNSSEC is broken:

$ dig @8.8.8.8 broken-dnssec.example.com A +dnssec +cd
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 28374
;; flags: qr rd ra cd; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
;; ANSWER SECTION:
broken-dnssec.example.com. 300 IN A 203.0.113.50
broken-dnssec.example.com. 300 IN RRSIG A 13 2 300 20260425120000 20260326120000 54321 broken-dnssec.example.com. ...

The response comes back fine — the data is there, the signatures are there. The problem is upstream in the chain of trust. Without +cd, the same query would return SERVFAIL.
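
The with-and-without +cd comparison is mechanical enough to script. The following is a minimal sketch; `dig_status` and `classify_failure` are hypothetical helper names, and the classification covers only the cases discussed here.

```shell
# Hypothetical helper: extract the rcode from dig's header line on stdin.
dig_status() {
  sed -n 's/.*status: \([A-Z]*\),.*/\1/p'
}

# classify_failure <status without +cd> <status with +cd>
classify_failure() (
  normal=$1
  with_cd=$2
  if [ "$normal" = "NOERROR" ]; then
    echo "ok"
  elif [ "$normal" = "SERVFAIL" ] && [ "$with_cd" = "NOERROR" ]; then
    # Data is served, but the resolver refuses it: validation is the problem.
    echo "dnssec-validation-failure"
  else
    echo "not-dnssec-specific"
  fi
)

# Usage (needs network access, so left as a comment):
#   normal=$(dig @8.8.8.8 broken-dnssec.example.com A +dnssec | dig_status)
#   with_cd=$(dig @8.8.8.8 broken-dnssec.example.com A +dnssec +cd | dig_status)
#   classify_failure "$normal" "$with_cd"
```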

DNSSEC signatures have inception and expiration timestamps. If your server’s clock is off, it will generate signatures that appear to be from the future or already expired. NTP is not optional on DNSSEC-signing servers.

# Check signature validity window
$ dig example-customer.com RRSIG +short | head -1
A 13 2 300 20260425120000 20260326120000 54321 example-customer.com. ...
#          ^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^
#          expiration     inception

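A resolver rejects a signature when its own clock reads earlier than the inception time or later than the expiration time. A signer whose clock runs fast produces signatures that are not yet valid; one whose clock runs slow produces signatures with less lifetime left than you think.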
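
Because the timestamps are plain YYYYMMDDHHMMSS values in UTC, the window check is a numeric comparison. A minimal sketch, assuming RRSIG rdata in the `+short` field order; `rrsig_window_status` is a hypothetical helper name.

```shell
# Hypothetical helper: rrsig_window_status <now as YYYYMMDDHHMMSS, UTC>
# Reads one line of RRSIG rdata on stdin and reports where "now" falls.
rrsig_window_status() (
  now=$1
  read -r _covered _algo _labels _ttl expiration inception _rest
  if [ -z "$inception" ]; then
    echo "no-rrsig"
    exit 2
  fi
  if [ "$now" -lt "$inception" ]; then
    echo "not-yet-valid"      # validator's clock is before inception
  elif [ "$now" -gt "$expiration" ]; then
    echo "expired"            # validator's clock is past expiration
  else
    echo "valid"
  fi
)

# Usage (needs network access, so left as a comment):
#   dig example-customer.com RRSIG +short | head -1 \
#     | rrsig_window_status "$(date -u +%Y%m%d%H%M%S)"
```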

You’ve signed your zone, everything looks good locally, but external resolvers never return validated answers. The most common cause: you forgot to publish the DS record at the parent. Or the registrar’s API accepted it but it hasn’t propagated yet.

# No DS record = no chain of trust
$ dig @a.gtld-servers.net example-customer.com DS +short
(empty response)

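If this is empty but your zone is signed, nothing visibly breaks, which is exactly why it goes unnoticed: validating resolvers treat the zone as unsigned, answer without the AD flag, and you get none of the protection you thought you had. The dangerous state is the reverse, a DS at the parent with no matching DNSKEY in the zone. That one returns SERVFAIL on validating resolvers while non-validating resolvers keep working, which makes the bug reports intermittent and confusing.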

If your DS record says algorithm 13 (ECDSA) but your DNSKEY is algorithm 8 (RSA-SHA256), validation fails. This typically happens during algorithm rollovers or when DS records are manually entered at the registrar with a typo.

# Check algorithm consistency
$ pdnsutil show-zone example-customer.com | grep algo
algo = 13 (ECDSAP256SHA256)
$ dig example-customer.com DS +short
12345 13 2 abcdef...
#     ^^ algorithm must match
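
The same comparison can be automated. This is a sketch under stated assumptions: `ds_algos_covered` is a hypothetical helper name, and both arguments are expected to be rdata lines in `dig +short` format (DS as "keytag algo digesttype digest", DNSKEY as "flags protocol algo key").

```shell
# Hypothetical helper: ds_algos_covered "<DS rdata lines>" "<DNSKEY rdata lines>"
# Succeeds if every algorithm named in a DS record also appears in a DNSKEY.
ds_algos_covered() (
  ds_algos=$(printf '%s\n' "$1" | awk 'NF >= 4 { print $2 }' | sort -u)
  key_algos=$(printf '%s\n' "$2" | awk 'NF >= 4 { print $3 }' | sort -u)
  for algo in $ds_algos; do
    if ! printf '%s\n' "$key_algos" | grep -qx -- "$algo"; then
      echo "DS algorithm $algo has no matching DNSKEY"
      exit 1
    fi
  done
  exit 0
)

# Usage (needs network access, so left as a comment):
#   ds_algos_covered "$(dig example-customer.com DS +short)" \
#                    "$(dig example-customer.com DNSKEY +short)"
```

This only compares algorithm numbers. It does not verify that the DS digest matches the key, which is what delv checks.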

PowerDNS re-signs records automatically, but if the signing process breaks (database issues, key problems), you can end up with expired signatures in your zone. The zone still serves responses, but they fail validation.

# Check if signatures are current
$ dig example-customer.com A +dnssec +short
203.0.113.50
A 13 2 300 20260125120000 20251226120000 54321 example-customer.com. ...
#          ^^^^^^^^^^^^^^
#          This date is in the past = expired signature = SERVFAIL

We monitor DNSSEC health from the outside. An internal check only tells you that signing is working — it doesn’t tell you that the chain of trust is intact from a resolver’s perspective.

Our monitoring does two things:

  1. Validates from external resolvers: queries our zones through Google (8.8.8.8) and Cloudflare (1.1.1.1) with +dnssec and checks the AD (Authenticated Data) flag
  2. Checks DS record presence: queries the parent zone’s nameservers for our DS records to ensure they haven’t been accidentally removed

# External validation check
$ dig @8.8.8.8 example-customer.com A +dnssec | grep '^;; flags'
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
#                  ^^ AD flag = validated successfully
# DS record presence check
$ dig @a.gtld-servers.net example-customer.com DS +short | wc -l
1

If the AD flag disappears or the DS record count drops to 0, we get paged.
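
The paging logic boils down to two conditions. The sketch below is an illustration, not our actual monitoring code: `has_ad_flag` and `dnssec_health` are hypothetical helper names, and wiring the result into an alerting system is left out.

```shell
# Hypothetical helper: succeeds if dig's header flags line on stdin has "ad".
has_ad_flag() (
  flags=$(sed -n 's/^;; flags: \([a-z ]*\);.*/\1/p')
  case " $flags " in
    *" ad "*) exit 0 ;;
    *) exit 1 ;;
  esac
)

# dnssec_health "<dig output>" <ds_count> prints OK or the reason to page.
dnssec_health() (
  if ! printf '%s\n' "$1" | has_ad_flag; then
    echo "ALERT: AD flag missing"
    exit 1
  fi
  if [ "$2" -lt 1 ]; then
    echo "ALERT: no DS at parent"
    exit 1
  fi
  echo "OK"
)

# Usage (needs network access, so left as a comment):
#   dnssec_health "$(dig @8.8.8.8 example-customer.com A +dnssec)" \
#                 "$(dig @a.gtld-servers.net example-customer.com DS +short | wc -l)"
```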


DNSSEC isn’t fun to deploy. But once it’s running and you have monitoring in place, it mostly stays out of the way. The key (pun intended) is to never rush key rollovers, always test from external vantage points, and keep your clocks synchronized. And maybe keep delv in your muscle memory — you’ll need it.
