How mod_perl6 works
Submitted by jeff on September 15, 2007 - 10:00am.
NOTE: This article assumes familiarity with mod_perl, Parrot, Perl, and Apache. If any of the code or concepts presented here are unfamiliar to you, please see the references at the end of the article.
Writing the world's first mod_perl6 handler and actually seeing it work was quite a moment for me. It validated years of hard work by me and the rest of the Parrot, Perl6, and Pugs developers. Here is an example of a mod_perl6 response handler:
sub polly_handler($r)
{
    $r.puts("<h1>Polly, a mod_perl6 handler</h1>\n");
    $r.puts("SQUAWK! Polly says "~$r.args());
    0; # Apache OK
}
"Polly" takes the contents of the HTTP query string and returns a simple HTML page repeating the string back to the client. Pretty simple. But what's going on behind the scenes is much more complex and actually pretty exciting! So let's take a little trip, and follow the lifecycle of a mod_perl6 response handler through Apache, mod_parrot, mod_perl6, and back again.
Mapping the request to mod_parrot
We'll put our handler code in /myhandlers/polly.p6 and call it with the following simple URI: /polly. We map that location to the mod_parrot handler parrot-code with the following Apache location block:
<Location /polly>
    SetHandler parrot-code
    ParrotHandler /myhandlers/polly::polly_handler
    ParrotLanguage perl6
</Location>
ParrotHandler tells mod_parrot where to find the code for this handler, and that the name of the handler subroutine is polly_handler. Right now it requires a path and a unique subroutine name, but will accept module names once the Perl6 implementation supports namespaces. Note that the .p6 extension will be appended automatically.
The ParrotLanguage directive tells mod_parrot that this is Perl6 code. mod_parrot can support any language that runs on the Parrot virtual machine, so we need to be explicit here.
Let's say our full request looked like this:
/polly?want_a_cracker
Apache will now take that request, map it to our location block and our mod_parrot handler, and place "want_a_cracker" in the args member of Apache's request_rec structure.
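To make the split between the URI and the query string concrete, here is a minimal sketch in plain C. It is not Apache's actual parsing code: the struct mini_request and the function parse_request_uri() are hypothetical stand-ins for request_rec and Apache's URI handling, and the sketch stores an empty string when no query string is present.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the two request_rec fields
 * discussed in the text. Not the real Apache structure. */
struct mini_request {
    char uri[128];
    char args[128];   /* empty string when there is no query string */
};

/* Split "path?query" into uri and args.
 * Returns 0 on success, -1 if either part does not fit its buffer. */
int parse_request_uri(const char *raw, struct mini_request *req)
{
    const char *q = strchr(raw, '?');
    size_t uri_len = q ? (size_t)(q - raw) : strlen(raw);
    const char *args = q ? q + 1 : "";

    if (uri_len >= sizeof req->uri || strlen(args) >= sizeof req->args)
        return -1;

    memcpy(req->uri, raw, uri_len);
    req->uri[uri_len] = '\0';
    strcpy(req->args, args);   /* safe: length checked above */
    return 0;
}
```

Calling parse_request_uri("/polly?want_a_cracker", &req) leaves "/polly" in req.uri and "want_a_cracker" in req.args, which is the value Polly later reads through $r.args().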
Initializing Parrot
Parrot is initialized not at Apache startup, but on demand for each httpd process. At the time of this writing, this happens at the first invocation of a Parrot handler. This behavior will change in a future release, as there are benefits to starting the interpreter earlier.
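The on-demand behavior can be pictured as an ordinary "initialize on first use" pattern. The sketch below is only an illustration of that pattern: struct interp and get_interp() are hypothetical names, and the counter merely stands in for the real work of creating a Parrot interpreter.

```c
#include <stddef.h>

/* Hypothetical stand-in for a Parrot interpreter handle. */
struct interp {
    int init_count;   /* how many times the initialization step has run */
};

static struct interp process_interp;    /* one per httpd process */
static struct interp *current = NULL;   /* NULL until the first handler call */

/* Return the process-wide interpreter, creating it on the first call only. */
struct interp *get_interp(void)
{
    if (current == NULL) {
        process_interp.init_count++;    /* stands in for real initialization */
        current = &process_interp;
    }
    return current;
}
```

Every later call in the same process returns the same pointer without repeating the initialization step.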
Once the interpreter has been initialized, it performs a few tasks. The first task is to load the initialization code stored in mod_parrot.pbc. This is Parrot bytecode that declares various mod_parrot namespaces, maps the Apache API to Parrot NCI functions, and loads other supporting libraries for accessing Apache's internal data structures, such as the request_rec structure and APR tables.
Read that last paragraph again. One more time for good measure. Now think about what you've read. mod_parrot is handling the interaction with Apache for us. Language modules like mod_perl6 don't need to touch the Apache internals directly, which makes the code for those modules much simpler. In fact, I expect that the majority of mod_perl6 will eventually be written in pure Perl6, relying on the Parrot backend for the nitty-gritty details of dealing with Apache.
Loading and parsing the Perl6 handler
Now that mod_parrot's interpreter has been initialized, it can load the code for our handler. But what do Parrot or mod_parrot know about the naming conventions, file layout, or module loading semantics of any particular language? Nothing! So mod_parrot handles this in a very graceful manner: it delegates the work to someone else!
That someone else is called the HLL layer (High Level Language layer). At the very minimum the HLL layer is responsible for loading, compiling, and running the language handlers. It can also serve as a proxy between the HLL and mod_parrot's Apache API, creating language-specific objects to pass to handlers, or calling handlers with different arguments than mod_parrot would by default. Anything you need to change about mod_parrot to suit your language's implementation belongs here. The HLL layer can be written in PIR (Parrot Intermediate Representation) or the high level language itself (e.g. Perl6). For now, the Perl6 HLL layer is written in PIR.
Each HLL layer defines various subroutines that will be called by mod_parrot (subject to change):
_load(handler_name) should load and compile the code for handler_name.
_config() is reserved for use during the Apache configuration phase and is not currently implemented.
_*handler(handler_name) should run the code for handler_name, calling _load() if necessary. There is one subroutine for each handler in the Apache lifecycle.
A practical example of HLL layer functionality is the ".p6" extension on our handler's name. mod_perl6's _load() subroutine takes care of appending it for us so we don't need to add it to every ParrotHandler directive.
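As a rough illustration of that name handling, the following C sketch splits a ParrotHandler value into a source file and a subroutine name and appends the extension. The real Perl6 HLL layer is written in PIR; split_handler_name() is a hypothetical helper, and splitting at the first "::" is an assumption made for this example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of mod_parrot or mod_perl6: split a value
 * such as "/myhandlers/polly::polly_handler" into a file path with ".p6"
 * appended and a subroutine name.
 * Returns 0 on success, -1 if "::" is missing or a buffer is too small. */
int split_handler_name(const char *handler,
                       char *file, size_t file_len,
                       char *sub, size_t sub_len)
{
    const char *sep = strstr(handler, "::");   /* first "::" (assumption) */
    if (sep == NULL)
        return -1;

    size_t path_len = (size_t)(sep - handler);
    int n = snprintf(file, file_len, "%.*s.p6", (int)path_len, handler);
    if (n < 0 || (size_t)n >= file_len)
        return -1;

    n = snprintf(sub, sub_len, "%s", sep + 2);
    if (n < 0 || (size_t)n >= sub_len)
        return -1;

    return 0;
}
```

For Polly's directive this yields the file /myhandlers/polly.p6 and the subroutine polly_handler.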
Executing the mod_perl6 handler
When our handler for Polly is called, mod_parrot calls the Perl6 _handler() subroutine, passing it the name of the handler, /myhandlers/polly::polly_handler, as an argument. If the code for this handler has not been loaded yet, _handler() loads it by calling _load(), and then executes it.
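The load-if-needed-then-run flow can be sketched in C as a small cache of function pointers. This is an analogy only: the real layer is PIR and compiles Perl6 source, whereas hll_load(), hll_handler(), and the cache below are hypothetical names, and "loading" here just registers a fixed C function.

```c
#include <stddef.h>
#include <string.h>

#define MAX_HANDLERS 8

typedef int (*handler_fn)(const char *args);

struct loaded_handler {
    const char *name;   /* caller must keep this string alive */
    handler_fn fn;
};

static struct loaded_handler cache[MAX_HANDLERS];
static size_t cache_len = 0;
int load_calls = 0;     /* counts how often hll_load() actually runs */

/* Example handler body; returns 0, the value of Apache's OK. */
static int polly(const char *args) { (void)args; return 0; }

/* Stand-in for _load(): "compile" a handler and remember it. */
static handler_fn hll_load(const char *name)
{
    if (cache_len == MAX_HANDLERS)
        return NULL;
    load_calls++;
    cache[cache_len].name = name;
    cache[cache_len].fn = polly;   /* a real layer would compile source here */
    return cache[cache_len++].fn;
}

/* Stand-in for _handler(): load on first use, then run. */
int hll_handler(const char *name, const char *args)
{
    handler_fn fn = NULL;
    for (size_t i = 0; i < cache_len; i++) {
        if (strcmp(cache[i].name, name) == 0) {
            fn = cache[i].fn;
            break;
        }
    }
    if (fn == NULL)
        fn = hll_load(name);
    if (fn == NULL)
        return 500;                /* same value as HTTP_INTERNAL_SERVER_ERROR */
    return fn(args);
}
```

Calling hll_handler() twice with the same name runs the handler twice but triggers only one load.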
The handler can interact with mod_parrot and Apache in various ways during execution. The most common way is through objects that are passed as arguments or requested directly from mod_parrot. For example, unless the HLL layer says otherwise, response handlers are passed a Parrot ['Apache';'RequestRec'] object. This object lets us access various aspects of the request in Apache's request_rec structure, such as the query string that our handler will use.
Perl6 will place the ['Apache';'RequestRec'] object in $r, as specified in our handler's declaration:
sub polly_handler($r)
Perl6 objects are implemented as Parrot objects, so a method call on a Perl6 object will invoke the underlying Parrot method, whether the object was created in Perl6, PIR, or even another Parrot-based language. mod_perl6 exploits this behavior for its own benefit. In our handler we call two methods: args() to retrieve the query string, and puts() to output strings to the client:
$r.puts("SQUAWK! Polly says "~$r.args());
Internally, these methods are actually PIR that call C functions to interact with Apache. In the case of puts(), Parrot calls Apache's ap_rputs() function:
dlfunc func, nul, "ap_rputs", "itp"
set_root_global [ 'Apache'; 'NCI' ], "ap_rputs", func
...
.sub puts :method
    .param string data
    .local pmc r
    .local pmc ap_rputs
    getattribute r, self, 'r'
    ap_rputs = get_root_global [ 'Apache'; 'NCI' ], 'ap_rputs'
    ap_rputs( data, r )
.end
args() is a bit more complex, as Apache doesn't provide an API function for getting or setting this value. So mod_parrot provides its own C function to manipulate the request_rec structure:
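char *mpnci_request_rec_args(Parrot_Interp interp, request_rec *r, char *args, int update_r)
{
    /* When update_r is set, store a pool-allocated copy of args in the request. */
    if (update_r == 1) {
        r->args = (char *)apr_pstrdup(r->pool, args);
    }
    /* Always return the current (possibly just updated) query string. */
    return r->args;
}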
It also provides a corresponding PIR method for calling that function:
.sub args :method
    .param string data :optional
    .param int update_r :opt_flag
    .local pmc r
    .local pmc request_rec_args
    .local string args
    getattribute r, self, 'r'
    if update_r goto call_it
    data = ""
  call_it:
    request_rec_args = get_root_global [ '_modparrot'; 'NCI' ], 'request_rec_args'
    args = request_rec_args( r , data, update_r )
    .return(args)
.end
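The get-or-set convention behind args() (pass a value plus a flag to update, or an empty string and 0 to read) can be tried outside Apache with a small stand-in. In this sketch, struct fake_request and fake_request_args() are hypothetical, and malloc() replaces the APR pool allocation so the code compiles without Apache headers.

```c
#include <stdlib.h>
#include <string.h>

/* Minimal stand-in for Apache's request_rec; only the field used here. */
struct fake_request {
    char *args;
};

/* Same get-or-set shape as mpnci_request_rec_args(), with malloc() in place
 * of an APR pool. Returns the current value, or NULL if allocation fails
 * (the previous value is left untouched in that case). */
char *fake_request_args(struct fake_request *r, const char *args, int update_r)
{
    if (update_r == 1) {
        size_t len = strlen(args) + 1;
        char *copy = malloc(len);
        if (copy == NULL)
            return NULL;
        memcpy(copy, args, len);
        free(r->args);          /* free(NULL) is a no-op */
        r->args = copy;
    }
    return r->args;
}
```

A call with update_r set to 0 ignores the data argument and simply returns the stored value, which matches what the PIR method does when args() is called with no parameter.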
Passing control back to Apache
The last thing a handler does is return a status code to Apache to indicate whether the request was handled successfully, whether it failed, or whether the handler declined to process it. A status of 0 (OK) tells Apache that the request was handled successfully. However, since return has not yet been implemented for Perl6, we put the status on a line of its own at the end of the subroutine, as Perl by default uses the value of the last evaluated expression as a subroutine's return value:
0; # Apache OK
Upon successful execution of a handler subroutine, mod_parrot will pass the return value and control of the request back to Apache. From there the data is sent to the client, and all is well with the world.
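For readers who have not written Apache handlers before, here is a small C sketch of the "handled" versus "declined" distinction. The numeric values match the OK and DECLINED constants in Apache's httpd.h but are redefined under other names so the sketch needs no Apache headers; polly_status() and its URI check are hypothetical.

```c
#include <string.h>

/* Same values as Apache's OK and DECLINED, renamed to avoid clashing
 * with the real headers. */
enum { SKETCH_OK = 0, SKETCH_DECLINED = -1 };

/* Hypothetical handler logic: only accept URIs that start with "/polly". */
int polly_status(const char *uri)
{
    if (strncmp(uri, "/polly", 6) != 0)
        return SKETCH_DECLINED;   /* let another handler try */
    return SKETCH_OK;             /* request handled successfully */
}
```

The bare 0 at the end of polly_handler plays the role of SKETCH_OK here.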
Persistence
Like Perl interpreters in mod_perl, Parrot interpreters in mod_parrot and their data are persistent. So suppose our handler is called again. Assuming the request hits the same Apache process (each process has its own interpreter), it will not bother initializing an interpreter or loading the code; it will just run the handler. Additionally, any global variables will retain their values, so it's easy to maintain persistent data structures like caches or counters.
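The effect of persistent globals is the same as that of file-scope state in a long-running C process. The sketch below is an analogy, not mod_parrot code: request_count and count_request() are hypothetical, and the counter is per process, just as each Apache process has its own interpreter.

```c
/* Process-wide state: keeps its value between calls for as long as the
 * process lives, like a global variable in a persistent interpreter. */
static unsigned long request_count = 0;

/* Record one more request and return the running total for this process. */
unsigned long count_request(void)
{
    return ++request_count;
}
```

Two requests served by the same process see totals of 1 and 2; a request served by a different process starts its own count.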
Concluding remarks
I hope this has been an informative overview of how mod_perl6 works, as well as a peek into the internals of mod_parrot. Remember that most of the concepts and processes described here don't just apply to mod_perl6, but to any other language running on the Parrot VM. For example, assuming someone writes a compiler for Python, mod_python could be implemented in the same fashion as mod_perl6. And a language like PHP would be trivial to implement using mod_parrot, as it doesn't require the level of access to Apache internals that mod_perl6 does. Anyone feel like writing a compiler? ;-)
