Major outage of AdGuard DNS
Incident Report for AdGuard
Postmortem

The issue was caused by a security update applied by Debian's "unattended-upgrades" package. Security updates usually do not cause any problems, but this one did.

Here's what happened:

  1. At about 5 am UTC, the "unattended-upgrades" package installed a security patch for unbound.
  2. Alongside that patch, it installed AppArmor (a Linux kernel security module that allows the system administrator to restrict programs' capabilities with per-program profiles) with default rules and launched it.
  3. AppArmor's default ruleset is incompatible with our configuration, so unbound simply stopped working. Meanwhile, the AdGuard DNS binary kept running and kept trying to process queries.
  4. This happened automatically and almost simultaneously on all 26 AdGuard DNS servers.
  5. It took us about 30 minutes to work out what had actually happened. Once the issue was identified, it took us about another half an hour to roll out the updated configuration to all the servers and restart everything.
  6. At first, we brought up only DNS-over-HTTPS and plain DNS in order to warm up the DNS caches.
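
The failure in step 3, where the front-end binary stays up while the resolver behind it is dead, is the kind of condition a direct probe of the resolver can catch. The following is only a minimal sketch of such a probe, not a description of AdGuard's actual monitoring. The address 127.0.0.1:53, the probe name "example.org" and all function names are assumptions made for illustration.

```python
import socket
import struct


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Encode a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed with its length.
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    )
    # Root label terminator, then QTYPE=A (1) and QCLASS=IN (1).
    return header + labels + b"\x00" + struct.pack("!HH", 1, 1)


def is_healthy_response(query: bytes, response: bytes) -> bool:
    """True if the reply matches the query ID, is a response, and has RCODE 0."""
    if len(response) < 12:
        return False
    resp_id, flags = struct.unpack("!HH", response[:4])
    (query_id,) = struct.unpack("!H", query[:2])
    is_response = bool(flags & 0x8000)
    rcode = flags & 0x000F
    return resp_id == query_id and is_response and rcode == 0


def resolver_is_healthy(host: str = "127.0.0.1", port: int = 53,
                        name: str = "example.org", timeout: float = 2.0) -> bool:
    """Send one UDP query to the resolver (assumed address) and check the reply."""
    query = build_query(name)
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.settimeout(timeout)
            sock.sendto(query, (host, port))
            response, _ = sock.recvfrom(4096)
    except OSError:
        # Covers timeouts and refused or unreachable sockets.
        return False
    return is_healthy_response(query, response)
```

A probe like this treats a timeout or a SERVFAIL reply as unhealthy, so a stopped resolver is reported even while the process in front of it is still accepting queries.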

What we're going to do to avoid this in the future:

  1. We disabled the "unattended-upgrades" package on the servers. It installs not only security updates but also packages marked "recommended", which is quite dangerous.
  2. All security updates will be reviewed manually and applied to the test server before rolling them out to everyone.
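
The "test server first" policy in item 2 can be sketched as a staged rollout that halts at the first unhealthy host. This is a hypothetical illustration rather than AdGuard's actual tooling: `staged_rollout`, `apply_update` and `is_healthy` are names invented here, and the two callables stand in for whatever deployment and health-check mechanisms are really used.

```python
from typing import Callable, Iterable, List


def staged_rollout(
    test_server: str,
    production_servers: Iterable[str],
    apply_update: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> List[str]:
    """Update the test server first, then production hosts one at a time.

    Returns the hosts that were updated and passed the health check.
    Raises RuntimeError at the first host that fails, so later hosts
    are left untouched.
    """
    updated: List[str] = []
    for host in [test_server, *production_servers]:
        apply_update(host)
        if not is_healthy(host):
            raise RuntimeError(f"health check failed on {host}; rollout halted")
        updated.append(host)
    return updated
```

Because the test server is always first in the sequence, an update that breaks it never reaches a production host.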
Posted May 29, 2020 - 12:35 UTC

Resolved
This incident has been resolved.
Posted May 28, 2020 - 07:02 UTC
Update
DOT is fully restored
Posted May 28, 2020 - 07:00 UTC
Update
DOT is fully restored in Amsterdam; we're rolling it out to other locations
Posted May 28, 2020 - 06:53 UTC
Update
DNS and DOH are operational, we're working on restoring DOT
Posted May 28, 2020 - 06:12 UTC
Identified
The problem is identified, we're working on the fix
Posted May 28, 2020 - 05:54 UTC
Update
We are continuing to investigate this issue.
Posted May 28, 2020 - 05:34 UTC
Investigating
We are currently investigating this issue.
Posted May 28, 2020 - 05:05 UTC
This incident affected: AdGuard DNS.