
cacert-sysadm - CAcert infrastructure ... past, present, future tasks

cacert-sysadm AT lists.cacert.org

Subject: CAcert System Admins discussion list

  • From: Jan Dittberner <jandd AT cacert.org>
  • To: cacert-sysadm-volunteers AT lists.cacert.org
  • Cc: cacert-sysadm AT lists.cacert.org, cacert-board AT lists.cacert.org
  • Subject: CAcert infrastructure ... past, present, future tasks
  • Date: Sun, 27 Oct 2019 21:01:43 +0100

On Sun, Oct 27, 2019 at 12:07:16PM -0400, Don Harris wrote:
> Hi Jan,
>
> Thanks for your email.
>
> I think some of the confusion lies in the urgent call for SysAdmins, sent
> out by Brian McCullough, on September 6th.
> There were roughly a dozen replies to that message offering help.
> As one of those who replied having an interest in helping, we've since
> heard nothing.
>
> So, it appears that there are lots of people willing to help, but no one
> knows what's needed. If you're in a position to further outline what's
> needed, and any vetting process as you mentioned, that will likely help to
> get things started.

Hi Don,

the "help needed" mail from our board addressed, in very broad terms,
several topics where CAcert needs help. I had no time to prepare a list
of open tasks in advance and was a bit overwhelmed by, and unprepared
for, the mail and the many responses. I'm very grateful for all the
offers of help, though.

I will first try to give a bit of background about myself, about where
we came from, and about where we are now with our infrastructure:

About me
========

I am an IT architect from Dresden (Germany) and have been doing
volunteer system administration work for CAcert since 2009. I started
by maintaining the SVN server and took over the administration of other
systems, including our infrastructure server, in 2010, before Daniel
Black left CAcert in 2012 (if I remember correctly). I took over the
infrastructure team lead role from Mario Lipinski in 2015. I also do
volunteer work as a Debian Developer and have helped with Debian booths
in Chemnitz (Germany) for many years.

I have a full-time job and a family with three kids, so my available
time is quite limited.


CAcert's infrastructure
=======================

CAcert's infrastructure is split in two main parts: critical systems
(signer, user database, web frontend, firewalls) and non-critical
systems. The infrastructure team takes care of the non-critical systems.

When I came to CAcert in 2009 there was an ongoing effort to clearly
separate the two areas (critical/non-critical), and we finished this in
2011. Since then we have had a separate hardware machine for the
non-critical infrastructure. In 2015 we got a new machine that was
donated by Thomas Krenn; it is the one we still use today.

All non-critical systems are running on a single hardware machine [1] in
separate LXC containers. This hardware machine was upgraded to Debian
Buster in July [2] and is primarily maintained by me. A second active
administrator with LXC, ferm/iptables/nftables and Debian skills would
be welcome, but this role requires a high level of trust.

Some of our systems have seen no update for many years because they have
no properly documented way to upgrade their application software and/or
configuration.

All our infrastructure is running on Debian GNU/Linux. I prefer to run
all software from official Debian packages but we have some older
systems that do not follow this rule unfortunately.

[1] https://infradocs.cacert.org/systems/infra02.html
[2] https://lists.cacert.org/wws/arc/cacert-sysadm/2019-07/msg00001.html

Documentation
-------------

The documentation for our systems is of varying quality. Some systems
have been set up and documented properly in the last few years. New
documentation is written with the Sphinx documentation system [3] and
maintained in a Git repository. This is what is published automatically
on https://infradocs.cacert.org/. We use some Sphinx extensions to
generate lists of IP addresses, SSH host keys and X.509 certificates
from the individual systems' documentation pages. Documentation was
previously located in the CAcert Wiki [4] but is almost unmaintained
there.

[3] https://www.sphinx-doc.org/
[4] https://wiki.cacert.org/SystemAdministration/Systems
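
To illustrate the idea of deriving such a list from the per-system
pages, here is a minimal Python sketch. It is not the code of the
actual Sphinx extensions: the function name collect_ip_addresses, the
sample page texts and the plain regular-expression scan are assumptions
made for illustration only.

```python
import ipaddress
import re

# Matches anything that looks like a dotted-quad IPv4 address.
# Validation is left to the ipaddress module below.
IPV4_CANDIDATE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def collect_ip_addresses(pages):
    """Return sorted (address, system) pairs found in the page texts.

    `pages` maps a (hypothetical) system name to the raw text of its
    documentation page.
    """
    found = set()
    for system, text in pages.items():
        for candidate in IPV4_CANDIDATE.findall(text):
            try:
                address = ipaddress.ip_address(candidate)
            except ValueError:
                continue  # looked like an address but is not valid
            found.add((address, system))
    # Sort numerically by address, then by system name.
    return [(str(address), system) for address, system in sorted(found)]


# Made-up sample input using documentation-only address ranges.
SAMPLE_PAGES = {
    "alpha": "Internal address 192.0.2.10, bogus value 999.1.1.1",
    "beta": "Internal address 192.0.2.2",
}
```

Calling collect_ip_addresses(SAMPLE_PAGES) yields one sorted list for
all pages, which is the kind of overview a generated index page would
show.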

There are still some sparsely documented and unmaintained systems that
need someone to do some archaeology, and especially to find out which
modifications have been made to the upstream software, so that they can
either be upgraded or replaced with freshly set up systems.

Configuration management
------------------------

A few years ago I started to professionalise our infrastructure setup
and introduced Puppet as our configuration management. The level of
configuration that is managed by Puppet varies between systems. Older
systems are not managed at all or only have etckeeper to provide a
minimum of version-controlled configuration. Some newer systems
(motion.cacert.org and email.cacert.org) have been set up using Puppet
and have all their configuration in Git. My hope is that this will
become the default, but getting there requires a lot of work.

All Puppet code can be found in Git [5].

[5] https://git.cacert.org/gitweb/?p=cacert-puppet.git

Monitoring
----------

We use Icinga 2 for our monitoring needs. We have an Icinga 2 master
running on monitor.cacert.org. Newer machines that are managed by Puppet
are running Icinga 2 agents, and older machines are monitored remotely
via NRPE. I run an external Icinga 2 agent in a different data center
for monitoring external access to our systems.

The Icinga 2 monitoring checks are defined in a Git repository [6] and
automatically applied to the actual systems via a Git post-receive hook.

[6] https://git.cacert.org/gitweb/?p=cacert-icinga2-conf_d.git
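
For readers unfamiliar with Git hooks, this is roughly how a
post-receive hook can decide whether to deploy. It is a minimal sketch
and not our actual hook: the branch name refs/heads/master and the
function names are assumptions. What is standard Git behaviour is that
the hook receives one line per updated ref on standard input, in the
form "<old-id> <new-id> <ref-name>", and that a deleted ref has an
all-zero new id.

```python
import sys


def parse_updates(lines):
    """Parse post-receive input lines of the form '<old> <new> <ref>'."""
    updates = []
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # ignore anything that is not a ref update line
        old, new, ref = parts
        updates.append((old, new, ref))
    return updates


def revision_to_deploy(updates, branch="refs/heads/master"):
    """Return the new commit id pushed to `branch`, or None.

    None is returned if the branch was not part of the push or if it
    was deleted (Git reports an all-zero new id in that case).
    """
    for _old, new, ref in updates:
        if ref == branch and set(new) != {"0"}:
            return new
    return None


def main(stream):
    """Entry point a hook script would call with sys.stdin."""
    revision = revision_to_deploy(parse_updates(stream))
    if revision is None:
        print("nothing to deploy")
    else:
        # A real hook would check out `revision` into the monitoring
        # configuration directory and reload the daemon here.
        print(f"would deploy {revision}")
```

A hook script would end with main(sys.stdin); the sketch only prints
what it would do, so the deployment step itself is left open.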


Tasks
=====

From my point of view, help is needed most with documenting the
currently poorly documented and/or unmaintained systems, so that we can
decide how to update or replace them.

I would also like to have support moving the existing configuration into
Puppet code and using Puppet on the remaining systems that are not
managed by Puppet yet.

It would be great if someone would take over, or share the load of,
maintaining the following systems:

- blog.cacert.org - is running an outdated WordPress setup and would
need upgrades by someone who knows PHP and WordPress well enough to
analyse and port the CAcert customizations

- board.cacert.org - an old OpenERP installation that would have to be
replaced by something that is maintained upstream to be able to update
the system to a modern Debian release

- cats.cacert.org - this is our assurer self-training system, which is
running custom-built PHP software. Upgrading that system would need an
administrator with PHP and Apache httpd skills who can coordinate with
the software development team

- irc.cacert.org - has been upgraded to a current Debian release but has
no active administrator. I take care of this system on a best-effort
basis but do not know IRC administration very well

- lists.cacert.org - is our mailing list system running the Sympa
mailing list software (chosen because of its ability to run S/MIME
encrypted mailing lists). The system needs an upgrade to a current
Debian release and someone who knows Sympa well enough to maintain it
properly

- test.cacert.org, test2.cacert.org, test3.cacert.org,
testmgr.cacert.org - these are the test systems for our software
development team. They are used to test the software that will end up
running on the critical systems (signer and www.cacert.org). These
systems should be as close to the critical systems as possible and
require coordination with the critical team and the software
development team

- wiki.cacert.org - this is our Wiki system, which is running an
unpackaged and most probably outdated version of the Moin wiki software
that could use some love, upgrades and documentation

I am still working on upgrading webmail/community.cacert.org and
splitting out the self-service aspects of this system into a separate
LXC container. This is still work in progress. If someone is interested,
I would be willing to share my ideas and would be grateful for
feedback/comments/reviews.

I would also like to improve backups of the whole infrastructure. At the
moment we can perform backups from the infrastructure host to an
externally attached backup disk, but these do not take application
consistency into account. They are just dumb file system based backups
with no guarantee that the databases used by the applications are
snapshotted consistently.
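
To show what an application-consistent step before the file-level
backup could look like, here is a minimal Python sketch. It uses SQLite
from the Python standard library purely as a stand-in, because it is
easy to run anywhere; which databases our applications actually use is
not covered here, and the function and file names are made up. The
point is only that the database engine writes the snapshot itself, so
the copy reflects a single transactional state instead of whatever
happens to be on disk while files are being copied.

```python
import os
import sqlite3
import tempfile
from contextlib import closing


def snapshot_database(source_path, snapshot_path):
    """Write a transactionally consistent copy of an SQLite database."""
    with closing(sqlite3.connect(source_path)) as source, \
            closing(sqlite3.connect(snapshot_path)) as target:
        # Connection.backup() lets SQLite copy the database page by page
        # under its own locking, unlike a plain file copy.
        source.backup(target)


def demo_snapshot():
    """Create a small database, snapshot it and read the snapshot back."""
    with tempfile.TemporaryDirectory() as workdir:
        live = os.path.join(workdir, "live.db")
        copy = os.path.join(workdir, "snapshot.db")
        with closing(sqlite3.connect(live)) as db:
            db.execute(
                "CREATE TABLE systems (id INTEGER PRIMARY KEY, name TEXT)"
            )
            db.execute("INSERT INTO systems (name) VALUES ('blog')")
            db.commit()
        snapshot_database(live, copy)
        with closing(sqlite3.connect(copy)) as db:
            return db.execute("SELECT id, name FROM systems").fetchall()
```

The file system backup would then pick up the snapshot file rather than
the live database file. For other database servers the equivalent step
would be a dump or snapshot made with that server's own tools.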


These are my thoughts/ideas for the moment. Any feedback and questions
are appreciated. I will try to answer all mails/IRC questions as soon as
possible.


Kind regards
Jan Dittberner

--
Jan Dittberner - CAcert Infrastructure Team Lead
Software Architect, Debian Developer
GPG-key: 4096R/0xA73E0055558FB8DD 2009-05-10
B2FF 1D95 CE8F 7A22 DF4C F09B A73E 0055 558F B8DD
https://jan.dittberner.info/

Attachment: smime.p7s
Description: S/MIME cryptographic signature



