cacert-sysadm AT lists.cacert.org
Subject: CAcert System Admins discussion list
- From: Ian G <iang AT cacert.org>
- To: cacert-sysadm AT lists.cacert.org
- Cc: Wytze van der Raay <wytze AT cacert.org>, Philipp Guehring <philipp AT cacert.org>, Mario AT cacert.org, Daniel Black <daniel AT cacert.org>, Mendel Mobach <mendel AT cacert.org>
- Subject: Re: DNS outage
- Date: Mon, 26 Apr 2010 21:16:42 +1000
- Authentication-results: lists.cacert.org; dkim=pass (1024-bit key) header.i= AT cacert.org; dkim-asp=none
Wytze and team,
Would it be OK if I present this mail as a progress report to the board at the next meeting? We voted on moving the DNS into the critical area a while back, and this little hiccup seems like a good time to see how it's going.
Please let us know if there are any rocks to move, buildings to knock down, or other use of blunt instruments...
iang
On 26/04/10 4:33 AM, Wytze van der Raay wrote:
Hi Philipp and others,
On 25-4-2010 15:11, Philipp Guehring wrote:
We had an outage of the cacert.org zone in the DNS servers today.
I asked the provider go-now.at to restore a backed-up zone, and he
switched the configuration, so that dns1.go-now.at is currently the
authoritative master and dns2.go-now.at and dns4.go-now.at are sourcing
their zone from dns1. (So changes will not be propagated to
dns1.go-now.at at the moment!)
Please check whether the backed-up zone contains all changes that were
done lately, and contact me directly for any necessary changes now.
The currently served zone is identical to the last version, as indicated
by the version number (2010022801) in the SOA record:
$ dig @dns1.go-now.at. +norecurs cacert.org soa
...
;; QUESTION SECTION:
;cacert.org. IN SOA
;; ANSWER SECTION:
cacert.org. 43200 IN SOA dns1.go-now.at. hostmaster.go-now.at. 2010022801 7200 3600 604800 43200
...
so that is fine.
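As a minimal sketch of the check above, the SOA answer can be split into its seven RDATA fields and the serial compared programmatically. The helper names (`parse_soa_rdata`, `decode_date_serial`) are hypothetical, and reading the serial as YYYYMMDDnn is an assumption based on the common convention; nothing in this thread confirms that the zone follows it.

```python
from datetime import date
from typing import NamedTuple, Tuple


class SOA(NamedTuple):
    mname: str    # primary name server
    rname: str    # responsible mailbox, written as a domain name
    serial: int   # zone version number
    refresh: int  # seconds between refresh checks by secondaries
    retry: int    # seconds between retries after a failed refresh
    expire: int   # seconds after which a secondary discards the zone
    minimum: int  # negative-caching TTL in seconds


def parse_soa_rdata(rdata: str) -> SOA:
    """Split SOA RDATA (the text after 'IN SOA') into its seven fields."""
    fields = rdata.split()
    if len(fields) != 7:
        raise ValueError(f"expected 7 SOA fields, got {len(fields)}")
    mname, rname, *numbers = fields
    return SOA(mname, rname, *(int(n) for n in numbers))


def decode_date_serial(serial: int) -> Tuple[date, int]:
    """Decode a YYYYMMDDnn serial (assumed convention) into (date, revision)."""
    text = f"{serial:010d}"
    return date(int(text[:4]), int(text[4:6]), int(text[6:8])), int(text[8:])


# RDATA copied from the dig answer quoted above.
soa = parse_soa_rdata(
    "dns1.go-now.at. hostmaster.go-now.at. 2010022801 7200 3600 604800 43200"
)
zone_date, revision = decode_date_serial(soa.serial)
print(soa.serial, zone_date.isoformat(), revision)
```

Comparing `soa.serial` from each name server against the serial on the master is enough to spot a secondary that is serving an older zone version.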
Please provide me with information about what happened to the cacert.org
zone in the past 72 hours, so that we can analyze the source of the problem.
My current guess is that the update from Wytze's DNS servers to the
go-now.at servers somehow failed, but I am not sure about that yet.
I understood the same from Mendel -- the dns[124].go-now.at servers did not
have valid zone information anymore, presumably because it wasn't refreshed
(zone expire time is 1 week). As to why it wasn't refreshed, I have no idea;
there are no messages logged on the authoritative ns1.cacert.org server about
failing/refused zone transfers by the *.go-now.at servers. But I do note
that the *.go-now.at servers have shown random failures in the past, in not
picking up zone file changes from the authoritative master. Not quite the same
problem, but it could be related, I guess.
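The one-week figure follows from the SOA timers in the dig output (refresh 7200, retry 3600, expire 604800 seconds). The sketch below only does the arithmetic. The timestamp in it is purely illustrative and not taken from any log, and `expiry_time` is a hypothetical helper name.

```python
from datetime import datetime, timedelta

# Timer values copied from the SOA record quoted in this thread.
REFRESH = timedelta(seconds=7200)
RETRY = timedelta(seconds=3600)
EXPIRE = timedelta(seconds=604800)


def expiry_time(last_successful_refresh: datetime) -> datetime:
    """Time at which a secondary stops serving the zone if no refresh succeeds."""
    return last_successful_refresh + EXPIRE


print(EXPIRE.days)      # expire window in whole days
print(EXPIRE // RETRY)  # number of retry intervals that fit in the window

# Illustrative timestamp only (an assumption, not a logged event).
print(expiry_time(datetime(2010, 4, 18, 12, 0)))
```

If expiry really was the cause, the secondaries would have gone without a successful refresh for the whole window, which spans many retry intervals, so a single transient failure would not be enough to explain it.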
Anyway, I would like to see the go-now.at configuration reverted to the
original setup, with the zone being sourced from the authoritative server
ns1.cacert.org at 82.95.226.191. And while on the subject, could you also
please see to it that zone transfers by random hosts *from* the go-now.at
servers are turned off?
To relieve DNS problems caused by a malfunctioning *.go-now.at setup,
I have added the ns[123].cacert.org servers to the list of name servers
for the cacert.org domain. ns[123].cacert.org have established a record
for reliable operation with exclusive serving of the cacert.net and
cacert.com zones, and are fully monitored by Nagios.
As soon as ns1.cacert.org has been migrated to its final destination
inside the CAcert critical infrastructure, we can do away with the
go-now.at setup as far as I am concerned (unless it can comply with the
required standard of TSIG-protected zone transfers and DNSSEC
support).
Regards,
-- wytze