
Re: Some thoughts about BirdShack


  • From: Ian G <iang AT iang.org>
  • To: cacert-devel AT lists.cacert.org
  • Cc: Christopher Etz <Christopher.Etz AT cetz.de>
  • Subject: Re: Some thoughts about BirdShack
  • Date: Sat, 19 Dec 2009 16:02:13 +0100

Hi Christopher,

On 19/12/2009 13:28, Christopher Etz wrote:
First, let me introduce myself: My name is Christopher Etz, I live in
Germany. I met Ulrich and Dirk at the Barcamp in Mainz, three weeks ago.
Last week, I met them again plus Ian, Mario, Andreas, Markus and others
in Essen.


Welcome!  For all, here is the write-up of our Essen talk:

https://dev.cacert.cl/wiki/birdshack/Minutes20091216EssenSoftwareMiniTOP

I would like to support you on the BirdShack project. My
professional background: I have been a freelance consultant for 15 years.
The main topics I work on are: databases, data models, application
architectures, project procedures, data warehouses and BI.


Super!


I would like to share my impressions about the concepts for BirdShack
that I have read so far. Hopefully, I don't offend anybody with my
feedback. If this mail starts a real thread of replies, it might be
worth separating the topics into individual threads.

So far I see nothing offensive, and I'm pretty thick-skinned anyway.


   1. The API and RESTful services
      The plans about a HTTP(S) based communication between the
      (frontend) application and the business logic layer seem to be
      pretty settled.


Yes, this was the major area where the Innsbruck meeting went deep and detailed. The reason is that this is the key battleground between the website and the business logic. If this is established strongly, the two respective teams have at least one foot on the ground. It's also what audit wanted: a clean separation.

https://dev.cacert.cl/wiki/birdshack/Work_History
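
To make the separation concrete, here is a minimal Java sketch of what a
call from a web frontend to the business logic server could look like.
The host name, port and resource path are invented for illustration;
nothing of the sort has been settled:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class ApiCallSketch {
        public static void main(String[] args) throws Exception {
            // Hypothetical resource URL on the business-logic server.
            URL url = new URL("https://birdshack.example.org:8443/members/4711");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("GET");
            conn.setRequestProperty("Accept", "application/xml");

            // Read the (expected) XML response back as plain text.
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
                StringBuilder body = new StringBuilder();
                String line;
                while ((line = in.readLine()) != null) {
                    body.append(line).append('\n');
                }
                System.out.println(body);
            }
        }
    }

The point is the clean line: everything the frontend knows about the
business logic is whatever comes back over that port.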


Obviously, a call via this mechanism is by several
      orders of magnitude more expensive than a direct call within
      whatever programming language. Thus, an activity within a business
      process (German: Geschäftsvorfall; is there a better translation?)
      should result in one or very few such calls in order to achieve
      acceptable performance.


OK, just a point on performance. This is security code, and it has to be auditable security code. So, performance has to take a backseat in this design. Obviously there are some areas where we can get some performance, but the presence of a fully tiered API that is clearly accessed by a call to a port is a very important element in audit; we can very clearly draw a line at that point and confidently ignore what is on the other side of that line, while we audit our side.

https://dev.cacert.cl/wiki/birdshack/Security

Another way of putting it is that auditing performance is more important than CPU performance :) CPUs are cheaper than auditors.

(I realise you are not criticising that decision here, but I recall that Markus brought up the point that a tiered structure raises risks that might cause overall failure of the project, over a monolithic design. Your question reminded me of another reason why we went this way.)


As a consequence, the API should be
      oriented towards the activities of the business processes.
       From what I understand, the API is currently more oriented
      towards the entities (classes, instances) that the business logic
      knows. E.g. creating a new member would result in individual calls
      to create the member, a name, a DoB, an email address and perhaps
      other things. Another problem is that it is hard to preserve the
      consistency of the database. E.g. every application deleting a
      member must know and implement calls to delete the name, the DoB,
      the email addresses, perhaps the assurances and so on.


I see your point. Yes, when I did this approach in my last big similar API (a thing called XML-X for payments transactions) it was a business language approach, not an entities approach.

My recollection here was that the 3 other guys at Innsbruck were quite well agreed that the entities approach was better. They were following the RESTful school, which is outside my experience.

https://dev.cacert.cl/wiki/birdshack/Resources
https://dev.cacert.cl/wiki/birdshack/URL_Space

Having said that, I actually see this as not a big deal, because in my experience, having shaken out the core on this basis, it is often easier to migrate to new calls that provide better matches between the two teams ... as the experience develops. As long as the two teams can clearly express themselves to each other, this can happen. And if they can't, then no API design will help...
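
To illustrate the difference (all resource paths and the little Api
helper below are invented for illustration, not the agreed URL space):
the entity-oriented style has the frontend compose the business activity
out of several calls, while an activity-oriented API would wrap the same
thing into one call.

    // A sketch of the two styles, assuming a hypothetical helper
    // post(path, xmlBody) that does an HTTPS POST to the business logic.
    public class ApiStyleSketch {

        // Entity-oriented: the frontend drives several resource calls
        // and is itself responsible for keeping them consistent.
        void createMemberEntityStyle(Api api) {
            String memberId = api.post("/members", "<member/>");
            api.post("/members/" + memberId + "/names", "<name>Jane Doe</name>");
            api.post("/members/" + memberId + "/dob", "<dob>1980-01-01</dob>");
            api.post("/members/" + memberId + "/emails",
                     "<email>jane@example.org</email>");
        }

        // Activity-oriented: one call per business activity
        // (Geschäftsvorfall); the business logic creates the parts and
        // keeps the database consistent internally.
        void createMemberActivityStyle(Api api) {
            api.post("/activities/new-member",
                     "<newMember><name>Jane Doe</name>"
                   + "<dob>1980-01-01</dob>"
                   + "<email>jane@example.org</email></newMember>");
        }

        // Minimal stand-in for whatever client library eventually emerges.
        interface Api {
            String post(String path, String xmlBody);
        }
    }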


      One point that I miss in the current information within the wiki
      is the kind of response that the services produce. I expect you
      are thinking of XML containing the requested information.

Yes, that was an expectation.  But it wasn't written.


Then, it is up
      to the different applications to decode (unmarshall, deserialize)
      the response. This would require implementing similar code in
      every programming language of the different applications. Or am I
      wrong here?


That's correct, every language would then need to implement a set of objects and a set of unmarshalling / deserializing code. I don't know a way around this. In my experience, trying to cut corners here never works out; better to write it all, fully, and then unit-test it very heavily ...
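
As a rough idea of what that per-language code amounts to (the element
names are invented here; the real response schema is still to be
written), the Java side could unmarshal a member response with the
standard DOM API:

    import java.io.StringReader;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.w3c.dom.Document;
    import org.xml.sax.InputSource;

    public class UnmarshalSketch {
        public static void main(String[] args) throws Exception {
            // Invented example response; the real schema is not defined yet.
            String xml = "<member><id>4711</id>"
                       + "<email>jane@example.org</email></member>";

            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            String id = doc.getElementsByTagName("id").item(0).getTextContent();
            String email = doc.getElementsByTagName("email").item(0).getTextContent();
            System.out.println("member " + id + " <" + email + ">");
        }
    }

And every other frontend language (PHP, Python, ...) needs its own
equivalent of this, which is exactly the duplication you point out.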


   2. Hand-written code
      Between the lines (i.e. not being expressed explicitly) I've got
      the impression that you expect all code to be written manually, like:
          * individually coded SQL statements
          * applications written in PHP, Python or whatever, building
            the HTML pages, communicating with the business layer and
            unmarshalling / deserializing the returned results
          * coding some functionality for a modern ("nice and sexy")
            user interface in JavaScript

One point: I expect the entire business logic middle server to be hand-written, either completely or in the majority.

The reason for this is that it is the major defensive line for security. And it is the major focus of attention for audit. Any code that is acquired from elsewhere has to support that audit. And that means either it comes with an audit, or there is some other equally easy way to say "it's reliable".


      I don't want to go as far as to suggest a model-driven
      architecture
      (http://en.wikipedia.org/wiki/Model-driven_architecture). But
      there are means to automate some of these tasks (see my next point).


Second point: in contrast to that, the web frontends can be less cautious. So I would think that this is a plausible suggestion; I don't know how far we can go in downloading some shlock from the net, but we can certainly be more relaxed about leaning on some good tools.

Alejandro commented here on this:
https://dev.cacert.cl/wiki/birdshack/API


   3. Technologies that might be useful
      Within the architectural overview, a decision should be suggested
      about how to access the database. The technologies of interest are:
          * Pure SQL via JDBC (if the programming language is Java)


Java was the expectation; it depends on the team that does the work, but the feeling is that Java has by far the best readability and maintainability in the long run (it has other sins to compensate for; it's by no means good for everything).
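
For reference, "pure SQL via JDBC" looks roughly like this in Java
(table, column names and connection details are placeholders; no schema
has been designed):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class JdbcSketch {
        public static void main(String[] args) throws Exception {
            // Placeholder connection string and credentials.
            try (Connection conn = DriverManager.getConnection(
                     "jdbc:postgresql://localhost/birdshack", "birdshack", "secret");
                 PreparedStatement stmt = conn.prepareStatement(
                     "SELECT id, email FROM members WHERE id = ?")) {
                stmt.setLong(1, 4711L);
                try (ResultSet rs = stmt.executeQuery()) {
                    while (rs.next()) {
                        System.out.println(rs.getLong("id") + " "
                                         + rs.getString("email"));
                    }
                }
            }
        }
    }

It is verbose, but every statement is there to be read, which matters
for audit.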


          * Use of an ORM (object relational mapper, such as Hibernate)


Possibly, depends on security.  I like the sound of an ORM.

          * Use of JPA (possible with Tomcat) or even JTA (requiring an
            application server like JBoss)


Depending on our security/audit expectation of those tools. The bigger they are the less likely. (In my personal experience in secure Java servers, we didn't use them and we were very happy.)
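
For comparison, the ORM/JPA style hides the SQL behind annotated
classes. A sketch with invented names; whether such a layer can clear
the audit bar is exactly the open question:

    import javax.persistence.Entity;
    import javax.persistence.Id;

    // Hypothetical mapped entity; the real data model is still open.
    @Entity
    public class Member {
        @Id
        private Long id;
        private String email;

        public Long getId() { return id; }
        public String getEmail() { return email; }
    }

    // Elsewhere, given a javax.persistence.EntityManager em (from the
    // container or an EntityManagerFactory), the lookup is a one-liner:
    //     Member m = em.find(Member.class, 4711L);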


      The user interface could also be coded with some standard
      technology such as
          * JSP (JavaServer Pages)
          * JSF (JavaServer Faces)
      I personally would vote for JSF, because there are a lot of
      implementations, even open source, that offer the chance to have a
      modern user interface: Some of them implement AJAX or other
      JavaScript code automatically, without forcing the developer to
      code this manually. Candidates are:
          * Sun's Reference Implementation (minimalistic and poor)
          * Apache MyFaces (widespread, still simple)
          * ICEfaces (good interfaces with lots of AJAX)
          * JBoss RichFaces (even more AJAX, but tricky to use and
            neither very stable nor very robust [in my impression])
          * PrimeFaces (a Turkish implementation with a lot of AJAX; but
            I still haven't worked with it).
      The discussions and decisions on the technologies to use will
      shed another light on the necessity of an HTTP(S) based
      communication.

This depends on the team doing the frontend. It's much more open. If you are doing the frontend, you can pick the above.

https://dev.cacert.cl/wiki/birdshack/Architecture

In the Innsbruck meeting, we made a conscious choice to set a hard API over the business logic, so we could set free the teams doing the web front ends, and so we could anticipate exporting the capability to other, more exotic applications.
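
If JSF were chosen, the frontend team's Java side would mostly be plain
backing beans along these lines (a hypothetical sketch with JSF 2.0
annotations; the bean, field and outcome names are invented, and the
call into the BirdShack API is only indicated):

    import javax.faces.bean.ManagedBean;
    import javax.faces.bean.RequestScoped;

    // Hypothetical backing bean for a "join" page.
    @ManagedBean
    @RequestScoped
    public class JoinBean {
        private String email;

        public String getEmail() { return email; }
        public void setEmail(String email) { this.email = email; }

        // Action method bound to a form button; this is where the bean
        // would call the business-logic API over HTTPS.
        public String join() {
            // e.g. POST /activities/new-member (see the earlier sketch)
            return "joined";   // JSF navigation outcome
        }
    }

None of this constrains the middleware; that is the point of the hard API.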


   4. Other candidates for the database system
       From what I read, I've understood that you want to move from
      MySQL to another database system such as PostgreSQL.


No, I don't think that choice was made. What choice was set was that we expected to have to use a SQL / relational database. We left the final choice to later, for the team to work out.

As it happens, there is a MySQL database in the current system. So depending on the migration / integration plan that falls out, the team may prefer to stick with the existing database (one foot on the ground) or build it afresh elsewhere (fresh start, fresh data).

Although I
      don't really know MySQL, this seems to be a good idea in my opinion.
      PostgreSQL requires regular maintenance (the famous
      vacuumdb, at least) and might not be the best option when
      recoverability is a high requirement.


:-) I must admit this is a mystery to me ... I do not know why people in the database world think recovery is never a high requirement. Is an unrecovered database still a database???

Certainly recoverability is a requirement. The way I've done it in the past is: OK, now throw away your live data set, recover the database from your last backup, and show me it running with the users. That might be seen as obsessive, but my work was in payments, and no penny was ever lost. I don't know that we care so much about that in the CA world; we can probably afford to lose a cert or a revocation or two, as long as we have the other systems to deal with it.


      Other candidates that come to my mind are:
          * Oracle: Might be too big (resource intensive) and it is
            questionable whether they would support a non-profit
            organization with a free license.
          * Ingres: Used to be an important player in that area (agreed,
            that's decades ago), now turned into open-source, has
            excellent features for performance and recovery.
   5. Thoughts on the approach
      I saw some code fragments in the SVN repository already. But I
      believe it's too early to continue here.


I've had a look at the Java that Mario wrote in the SVN; yes, very early stuff, but we have to start somewhere, and I think he chose a good place to start. I refactored it on the train back, but haven't committed anything. I'm not sure yet how to compile it up and test it; without that I'm just fiddling about.


      Instead, I'd suggest an approach like this:
         1. Parallel activities on:
                * Analyzing / defining / collecting the activities of the
                  business processes
                  The wiki page "Actors" seems to be a good starting
                  point here.


yes.

                * Defining a set of "candidate" technologies (there are
                  probably more than I mentioned, both on the layers I
                  mentioned and on other layers)


Yes, agreed. We did try to do that, and to keep an open mind about the flame wars, etc.

The team for the front end has a lot more liberty in choosing their technologies, I think.

For the middleware, the consensus was that Java and POJOs were the way forward. For the backend ("signing server") it was expected to be C, but that's less relevant to this discussion. For the other backend ("SQL") ... that is left open for now.

From the flame wars section:

https://dev.cacert.cl/wiki/birdshack/Choosing_a_Language

                * Defining a set of criteria, against which the
                  technologies will be evaluated, once the business
                  processes are known.
         2. Putting the results together, i.e. agreeing on the
            business processes, evaluating the technologies and defining
            a set of technologies to go with.
         3. Adapting the architectural overview to the planned
            technologies (here, the decision about the API and its
            communication mechanism should be taken).
         4. Starting the implementation. Here, I presume that BirdShack
            will be a rewrite in new code and not a patchwork on old code.

Having said all this, I want to make clear that I never wanted to
diminish the value of the work that has been done so far. Hope you don't
regard me as a negative critic.


Not at all.  We did a lot in that week.  But much more needs to be done!



iang


