The issue was caused by a security update applied by Debian's "unattended-upgrades" package. Security updates usually do not cause any problems, but this one did.
Here's what happened:
- At about 5 am UTC, the "unattended-upgrades" package installed a security patch for unbound.
- Alongside that patch, it installed AppArmor (a Linux kernel security module that allows the system administrator to restrict programs' capabilities with per-program profiles) with default rules and launched it.
- AppArmor's default ruleset is incompatible with our configuration, so unbound simply stopped working. Meanwhile, the AdGuard DNS binary kept running and kept trying to process queries.
- This happened automatically and almost simultaneously on all 26 AdGuard DNS servers.
- It took us about 30 minutes to realize what had actually happened. Once the issue was identified, it took us about another half hour to roll out the updated configuration to all the servers and restart everything.
- At first, we brought up only DNS-over-HTTPS and plain DNS in order to warm up the DNS caches.
What we're going to do to avoid this in the future:
- We have disabled the "unattended-upgrades" package on the servers. It installs not only security updates but also packages marked "recommended", which is quite dangerous.
- All security updates will be reviewed manually and applied to the test server before we roll them out to all the servers.
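
The "test server first" rule can be sketched in a few lines of Python. This is only an illustration and not our actual deployment tooling: the function name staged_rollout and the apply_update and is_healthy callbacks are hypothetical placeholders for whatever installs the update and whatever verifies that the server still resolves DNS queries.

```python
from typing import Callable, Iterable, List


def staged_rollout(
    test_server: str,
    production_servers: Iterable[str],
    apply_update: Callable[[str], None],  # hypothetical: installs the update on one server
    is_healthy: Callable[[str], bool],    # hypothetical: e.g. checks that DNS queries are answered
) -> List[str]:
    """Update the test server first, then the production servers one by one.

    Returns the servers that were updated and passed the health check.
    Raises RuntimeError on the first failed check, so the servers that
    have not been reached yet are left untouched.
    """
    updated: List[str] = []
    for server in [test_server, *production_servers]:
        apply_update(server)
        if not is_healthy(server):
            raise RuntimeError(f"health check failed on {server}; rollout halted")
        updated.append(server)
    return updated
```

With a rollout like this, an update that breaks the test server stops there and never reaches the production servers.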