cacert-sysadm AT lists.cacert.org
Subject: CAcert System Admins discussion list
- From: Ian G <iang AT cacert.org>
- To: cacert-sysadm AT lists.cacert.org
- Subject: Re: system replication for availability
- Date: Wed, 24 Jun 2009 12:26:32 +0200
I have no implementation suggestions, but some management perspective. (I think Daniel knows all this, but others might not.)
The really big priority is to get the infrastructure machines out of the BIT rack, so they are not interfering with the work of Wytze and the security policy.
It's also been considered that they should be somewhere else, like another country. It's difficult to predict which country is best, but we will have to look at the various aspects once you guys suggest one. For legal reasons, the current favourite is Switzerland. Vienna would be good because of Philipp, Philipp and also the sonance guys. Australia would be good because of Daniel. Etc etc, it all depends.
Being able to do hot standby would be nice, but it automatically requires two machines, which doubles the trouble described above. I suppose it is OK to experiment and work up to that, but I would be nervous about saddling the above project with a need for two machines as well.
(And we have to consider backups, which may imply we need two machines anyway, or it may imply that the crit & infra teams do a swap deal. And/or maybe that's the second machine for your hot standby experiments?)
Just some thoughts!
iang
On 24/6/09 11:08, Daniel Black wrote:
I'm currently looking for a system to make CAcert's non-critical systems a
little more available in the case of hardware failure.
I've currently got a bunch of linux vservers on one machine and have another
in close network proximity.
My current thoughts are to use chironfs[1] or glusterfs[2] to replicate the
vserver filesystems to the backup node and use heartbeat to bring up the
backup when the primary fails. At the moment I think I can handle manually
restoring the primary node.
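The failover logic described above can be sketched roughly as follows. This is only an illustration of the idea (promote the backup after the primary has been silent too long, and leave failback manual); it is not the heartbeat package's configuration or API, and the names `FailoverMonitor`, `beat` and `check` are made up for the example.

```python
import time


class FailoverMonitor:
    """Runs on the backup node and decides when to take over (hypothetical)."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout          # seconds of silence tolerated
        self.clock = clock              # injectable so the logic can be tested
        self.last_beat = clock()
        self.active = False             # True once the backup has taken over

    def beat(self):
        """Record that a heartbeat from the primary was received."""
        self.last_beat = self.clock()

    def check(self):
        """Promote the backup if the primary has been silent too long."""
        if not self.active and self.clock() - self.last_beat > self.timeout:
            self.active = True          # here the backup vservers would be started
        # No automatic failback: restoring the primary stays a manual step.
        return self.active
```

In use, `beat()` would be called whenever a heartbeat arrives and `check()` would be polled periodically on the backup node.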
chironfs is a userspace filesystem that can read from the local disk when it
needs to and writes to the local disk and the secondary node via NFS.
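To make the read/write behaviour concrete, here is a toy sketch of the mirroring idea: every write goes to all replicas, and a read is served by the first replica that answers. It is not chironfs code; the class `MirroredDir` and its methods are hypothetical, and the replicas are assumed to be plain directories (one of which could be an NFS mount of the secondary node).

```python
import os


class MirroredDir:
    """Toy mirror: write to every replica, read from the first that works."""

    def __init__(self, replicas):
        # Replica directories in read-priority order (local disk first).
        self.replicas = list(replicas)

    def write(self, name, data):
        """Write data to every replica; return how many writes succeeded."""
        ok = 0
        for root in self.replicas:
            try:
                with open(os.path.join(root, name), "wb") as fh:
                    fh.write(data)
                ok += 1
            except OSError:
                continue  # a failed replica is skipped rather than fatal
        if ok == 0:
            raise OSError("no replica accepted the write")
        return ok

    def read(self, name):
        """Return the contents from the first replica that can serve them."""
        for root in self.replicas:
            try:
                with open(os.path.join(root, name), "rb") as fh:
                    return fh.read()
            except OSError:
                continue
        raise FileNotFoundError(name)
```

A real replicating filesystem also has to deal with resynchronising a replica that missed writes, which this sketch ignores.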
Reasons I chose it:
* no need to replace existing filesystem
* can install it without rebooting machine - just need to restart vservers
* realtime sync
* nagios monitoring
Poorer points that I can handle:
Its IO performance is really slow; however, I don't think any of the
non-critical servers are particularly IO intensive.
gluster - only just found - seems to include its own TCP transport. Could be
better. Has a nice booster LD_PRELOAD library to bypass FUSE.
Any other suggested implementations that are easy to implement or any thoughts
on these two?
[1] http://www.furquim.org/chironfs/index.en.html
[2] http://www.gluster.org
Daniel Black
--
Infrastructure System Administrator
CAcert