Re-architecting mod_parrot

WARNING: This post is long and technical. Approach with caution.

The design of mod_parrot has always closely followed mod_perl2. After all, mod_perl does a lot of what mod_parrot does, at least when it comes to integrating with Apache. And up to this point that design has held up, whether we were using just one language, or multiple languages (e.g. mod_perl6 and PHP on the same server).

That all changed two weeks ago when I was thinking about how to support multiple high-level languages (HLLs) with handlers in the same Apache phase. A good example of this would be a directory with two different acceptable authentication schemes, written in two different languages. Both would have to register an authentication handler with mod_parrot. Seems simple enough, but there are two major problems with this.

The first problem is that mod_parrot does not support "stacked" handlers, that is, multiple handlers for a single phase in a particular configuration section. Adding support for stacked handlers would be a fairly simple task, but then that brings us to our other problem: handling the semantics of each phase. If a handler fails or declines, does mod_parrot immediately pass that status back to Apache, or does it move on to the next stacked handler in the phase? The answer to this question depends on which phase you are in and what status was returned from the handler. I was about to write code to support all of this when I realized something -- I was rewriting Apache!
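To make the phase semantics concrete, here is a minimal sketch in plain C of the two ways Apache itself walks a list of hooks: "run first" (stop at the first handler that does not decline, as the authentication phase does) and "run all" (keep going until a handler returns something other than OK or DECLINED, as the fixup phase does). It does not use the Apache headers; `handler_fn`, `run_first`, `run_all`, and the three sample handlers are illustrative names, not mod_parrot code.

```c
#include <stddef.h>

/* Status codes with the same values Apache uses. */
#define OK                  0
#define DECLINED           -1
#define HTTP_UNAUTHORIZED 401

/* Simplified handler type; a real Apache hook takes a request_rec pointer. */
typedef int (*handler_fn)(void *request);

/* "Run first": return the status of the first handler that does not decline. */
int run_first(handler_fn *handlers, size_t n, void *request)
{
    for (size_t i = 0; i < n; i++) {
        int status = handlers[i](request);
        if (status != DECLINED)
            return status;
    }
    return DECLINED;
}

/* "Run all": continue while handlers return OK or DECLINED,
 * and stop at the first status that is neither. */
int run_all(handler_fn *handlers, size_t n, void *request)
{
    for (size_t i = 0; i < n; i++) {
        int status = handlers[i](request);
        if (status != OK && status != DECLINED)
            return status;
    }
    return OK;
}

/* Hypothetical stacked handlers, used only to exercise the two loops. */
int decline_handler(void *request) { (void)request; return DECLINED; }
int ok_handler(void *request)      { (void)request; return OK; }
int deny_handler(void *request)    { (void)request; return HTTP_UNAUTHORIZED; }
```

With the stack `decline_handler, ok_handler, deny_handler`, `run_first` stops at `ok_handler` and never reaches `deny_handler`, while `run_all` runs all three and reports the failure. Getting this right per phase is exactly the bookkeeping described above.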

Apache does exactly what we are trying to do, except with individual Apache modules. With this in mind, the solution was obvious -- every HLL supported by mod_parrot must be represented by a first-class Apache module. This is a monumental task, but fortunately I had already added support for creating Apache modules when I implemented custom Apache directives. The remaining work was adding HLL hooks for each phase. This is easy in C. But we can't use C -- the point of mod_parrot is to support new HLLs without requiring C. Unfortunately, Apache wants module hooks to be C functions.

I solved this problem by writing a common set of hooks in C that do nothing but figure out which HLL module is in effect*, and call the corresponding HLL metahandler for that hook (metahandlers implement the semantics of an HLL and call the real handler code). And as an added bonus, refactoring the hook code reduced the size of mod_parrot.c by about 8K. Sweet!
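The dispatch idea can be sketched roughly as follows. This is a simplified stand-alone model, not the actual mod_parrot.c: the `hll_module` struct, the `phase` enum, `register_hll`, `set_current_hll`, and the two sample metahandlers are all assumptions made for illustration, and the sketch does not show how the real code keeps the current-module index up to date.

```c
#include <stddef.h>

#define OK                  0
#define DECLINED           -1
#define HTTP_UNAUTHORIZED 401

/* Two phases are enough to show the pattern. */
enum phase { PHASE_AUTHEN, PHASE_RESPONSE, PHASE_COUNT };

typedef int (*metahandler_fn)(void *request);

/* One entry per HLL; a NULL slot means the HLL has no metahandler for that phase. */
struct hll_module {
    const char    *name;
    metahandler_fn metahandlers[PHASE_COUNT];
};

#define MAX_HLL_MODULES 8
struct hll_module *hll_modules[MAX_HLL_MODULES];
size_t hll_module_count = 0;
size_t current_hll = 0;   /* index of the HLL module whose hook is running */

/* Add an HLL module to the list; returns its index, or -1 if the list is full. */
int register_hll(struct hll_module *module)
{
    if (hll_module_count >= MAX_HLL_MODULES)
        return -1;
    hll_modules[hll_module_count] = module;
    return (int)hll_module_count++;
}

void set_current_hll(size_t index) { current_hll = index; }

/* The shared logic: find the current HLL and call its metahandler. */
int dispatch_hook(enum phase p, void *request)
{
    if (current_hll >= hll_module_count)
        return DECLINED;
    metahandler_fn mh = hll_modules[current_hll]->metahandlers[p];
    return mh ? mh(request) : DECLINED;
}

/* The common C hooks do nothing but dispatch. */
int authen_hook(void *request)   { return dispatch_hook(PHASE_AUTHEN, request); }
int response_hook(void *request) { return dispatch_hook(PHASE_RESPONSE, request); }

/* Hypothetical metahandlers standing in for two HLLs. */
int perl6_authen(void *request) { (void)request; return OK; }
int php_authen(void *request)   { (void)request; return HTTP_UNAUTHORIZED; }
```

Because every phase hook is a one-line wrapper around `dispatch_hook`, adding a new HLL in this model means filling in a table of metahandlers rather than writing new C hooks.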

Now we just needed each HLL to manage its own configuration data. Previously, HLLs would register their handlers with mod_parrot, which worked, but wasn't optimal since mod_parrot had to manage all the configuration data. Now that the HLL layer is implemented in a regular Apache module, we can remove that responsibility from mod_parrot. The implementation was fairly straightforward. Using a model similar to the one found in mod_perl2, each HLL can have callbacks to create and merge both server and directory configurations. And of course, callbacks for custom directives can update configurations, and the metahandlers can read from them so they know what handlers to call, etc.
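As a rough illustration of the create/merge model, here is a stand-alone C sketch of a per-directory configuration for one HLL. It uses `calloc` instead of Apache's memory pools, and every name in it (`hll_dir_config`, `create_dir_config`, `merge_dir_config`, `set_authen_handler`, `set_response_handler`) is hypothetical; the real callbacks in mod_parrot are written in the HLL, not in C.

```c
#include <stdlib.h>

/* Per-directory settings an HLL might keep: which handler to call per phase.
 * NULL means "not set in this section". */
struct hll_dir_config {
    const char *authen_handler;
    const char *response_handler;
};

/* Create callback: start with an empty configuration. */
struct hll_dir_config *create_dir_config(void)
{
    return calloc(1, sizeof(struct hll_dir_config));
}

/* Merge callback: settings from the more specific section ("add")
 * override the inherited ones ("base"); unset fields are inherited. */
struct hll_dir_config *merge_dir_config(const struct hll_dir_config *base,
                                        const struct hll_dir_config *add)
{
    struct hll_dir_config *merged = create_dir_config();
    if (merged == NULL || base == NULL || add == NULL)
        return merged;
    merged->authen_handler =
        add->authen_handler ? add->authen_handler : base->authen_handler;
    merged->response_handler =
        add->response_handler ? add->response_handler : base->response_handler;
    return merged;
}

/* Directive callbacks: a custom directive updates the configuration. */
void set_authen_handler(struct hll_dir_config *cfg, const char *name)
{
    cfg->authen_handler = name;
}

void set_response_handler(struct hll_dir_config *cfg, const char *name)
{
    cfg->response_handler = name;
}
```

A metahandler in this model would simply look up `authen_handler` or `response_handler` in the merged configuration for the current request to decide what to call.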

Ok, that was a mouthful. So what have I accomplished with all of this? Well, mod_parrot is now a much thinner layer between Apache and the various HLL modules. The HLL layer is responsible for much more now, but that results in greater flexibility down the road.

That's all I have the energy to type for now. My next post will demonstrate how to use all this new functionality to create an HLL module of your own!

*Apache makes no effort to tell you what module a hook is currently running in because there is usually a one-to-many mapping of modules to hooks. This is not the case in mod_parrot (a many-to-many mapping), so we maintain an index into a list of HLL modules that always points to the current module.