Blogs

Writing mod_perl6 in Perl 6

One of the goals of the mod_parrot project is to provide the infrastructure for running the Perl 6 version of mod_perl, a.k.a. mod_perl6. I've already demonstrated that mod_perl6 works, so that goal is slowly being achieved. Many thanks to Patrick Michaud, Jerry Gay, and everyone else who has worked on the Parrot implementation of Perl 6.

Another lesser-known goal of mod_parrot is to allow the high level language (HLL) layers to be written in the HLL itself. That is to say, write mod_perl6 in Perl 6. Up to this point, mod_parrot has five HLL layers (PIR, NQP, Perl6, PHP/Plumhead, Perl1/Punie), all written in Parrot's native PIR. However, yesterday, with some help from Patrick, I was able to port mod_perl6 from PIR to pure Perl 6!

As an example, here is a very bare-bones mod_perl6 (DISCLAIMER: string interpolation in namespaces doesn't actually work yet):

module ModParrot::HLL::perl6;

our %loaded_modules;

# load a Perl 6 handler module
sub load($module)
{
    unless (%loaded_modules{$module}) {
        use $module;
        %loaded_modules{$module} = 1;
    }
}

# call a Perl 6 response handler
sub handler($name)
{
    my $r = Apache::RequestRec.new();
    load($name);
    my $status = ::($name)::handler($r);
    $status;
}

# call a Perl 6 authentication handler
sub authen_handler($name)
{
    my $r = Apache::RequestRec.new();
    load($name);
    my $status = ::($name)::handler($r);
    $status;
}

When calling a Perl 6 handler, mod_parrot loads this module and calls the individual handler routines according to the Apache configuration. It also provides the interface to Apache, including the Apache::RequestRec class needed by mod_perl6. Everything else it leaves to the Perl 6 compiler.

You might think this code doesn't actually do much, and that's the point. It's really just a simple thunking layer between mod_parrot and your handlers, enforcing the rules of the mod_perl6 implementation. For example, in mod_perl, an Apache::RequestRec object is passed to all response handlers. This layer is responsible for making sure that happens.
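To make the thunking idea concrete, here is a minimal, self-contained sketch written in present-day Perl 6 (Raku) syntax rather than the 2008-era syntax above. It is not the mod_perl6 code: `FakeRequest`, the `%handlers` table, and `dispatch` are hypothetical stand-ins for Apache::RequestRec, module loading, and the layer's handler routine. All it shows is the layer's one job: build the request object, hand it to the named handler, and return the status.

```raku
# Hypothetical stand-in for the Apache::RequestRec object mod_parrot supplies.
class FakeRequest {
    has Str $.args = '';     # query string
    has Str @.output;        # collected response body

    method puts(Str $text) { @!output.push($text) }
}

# A response handler in the style used by mod_perl6: request in, status out.
sub polly_handler($r) {
    $r.puts("SQUAWK!  Polly says " ~ $r.args);
    0;  # Apache OK
}

# Hypothetical lookup table standing in for loading a module by name.
my %handlers = polly => &polly_handler;

# The thunk: create the request, find the handler, pass the request in,
# and hand back the status along with whatever the handler wrote.
sub dispatch(Str $name, Str $query) {
    my $r = FakeRequest.new(args => $query);
    my &handler = %handlers{$name};
    my $status = handler($r);
    return $status, $r.output.join;
}

my ($code, $body) = dispatch('polly', 'hello=world');
say $code;
say $body;
```

The handler never constructs its own request object. That rule lives in one place, the dispatch routine, which is the kind of thing this layer is meant to enforce.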

As the Perl 6 compiler matures and mod_parrot adds more functionality, this version of mod_perl6 will inevitably change. But what you see above will remain at its core -- loading Perl 6 modules, juggling arguments, and passing control to handler subroutines. And the fact that it's pure Perl 6 will enable scores of Perl programmers to hack on it without having to know anything about Parrot or C programming. Take that, XS.

PHP on mod_parrot

After only 30 minutes of hacking together the code, mod_parrot now supports an implementation of PHP! Plumhead is a simple PHP interpreter utilizing the Parrot Compiler Toolkit (PCT), and comes bundled with Parrot. Because it uses PCT, it plugs right into mod_parrot with minimal effort. The I/O subsystem is still a complete kludge, as I'm using Parrot's string I/O layer to capture output and feed it back to Apache, but that will be worked out eventually.

Here's the code I was able to run:

Hello
<?php
    echo "World!\n";
?>
I am here.

I HATE legacy systems!

Unless you're working at a very young company or only work on new projects, you will undoubtedly run into legacy systems at some point. These are systems that you didn't design, that nobody else understands, are likely poorly documented, and are now your responsibility to maintain. I just migrated such a system from one data center to another, and tested it. It worked for my test scenarios. In the real world? BOOM! Utter disaster. I found the problem, but that doesn't make me feel any better -- I have a lot of pride in the quality of my work. Unfortunately, even with documentation, you can't possibly understand every little nuance of systems you didn't design. Unless your testing methodologies are akin to those of NASA (and I've been through test plans like that), something is very likely to go wrong.

So, two lessons to take away from this:

  • Always run your final tests in a production-like environment.
  • EXPECT PAIN!

World's first mod_perl6 handlers

From my post to parrot-porters:

It gives me great pleasure to introduce you to the world's first mod_perl6 handlers! They are run using Parrot's Perl6 compiler on top of mod_parrot, and are compiled on the fly the first time a handler is called. Each handler is passed an [Apache;RequestRec] object instantiated by mod_parrot, and the handlers can call methods on that object from Perl6 land.

First is Polly. Polly repeats everything from the query string of a URL. It uses the puts() method for output and args() to retrieve the query string from Apache.

sub polly_handler($r)
{
    $r.puts("<h1>Polly, a mod_perl6 handler</h1>\n");
    $r.puts("SQUAWK!  Polly says "~$r.args());
    0; # Apache OK
}

Second is the counter, which increments a counter each time it is called. It demonstrates the persistence of the interpreter and proper scoping of the counter variable using our. Since each Apache process has its own interpreter, the count might seem to jump between calls, especially if your browser isn't using keepalives.

sub counter_handler($r)
{
    our $x;
    unless ($x) {
        $x = 1;
    }
    $r.puts("<h1>Hello, I'm a mod_perl6 response handler!</h1>\n");
    $r.puts("Page views for this interpreter: $x\n");
    $x++;
    0; # Apache OK
}
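The jumping count is easier to see with a toy model. The sketch below is plain Perl 6 (Raku) in present-day syntax, with no Apache involved: `FakeProcess` is a hypothetical stand-in for one Apache child and its private interpreter, and the order of requests is made up for illustration.

```raku
# Hypothetical model: each "process" owns a private counter, much like
# `our $x` living inside one interpreter.
class FakeProcess {
    has Int $.views = 0;

    # Serve one request and report this process's own page-view count.
    method handle-request() { ++$!views }
}

# Three independent processes, each starting from zero.
my @processes = (^3).map: { FakeProcess.new };

# Six requests that happen to land on processes 0, 1, 0, 2, 1, 0.
my @seen = (0, 1, 0, 2, 1, 0).map: -> $i { @processes[$i].handle-request };

say @seen;   # [1 1 2 1 2 3]
```

From the browser's side the count appears to go backwards, but each process is counting correctly; they just don't share state.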

mod_parrot Registry Scripts

mod_parrot now supports registry scripts for the Perl6 and NQP languages. Behold:

sub handler($r)
{
    our $x;
    unless ($x) {
        $x = 1;
    }
    print "<h1>Hello, I'm a mod_perl6 registry script!</h1>\n";
    print "Page views for this interpreter: $x\n";
    $x++;
}

And the corresponding Apache configuration for the script directory:

Alias /perl6-bin/ /home/jeff/mod_parrot/perl6-bin/
<Directory /home/jeff/mod_parrot/perl6-bin>
    SetHandler parrot-code
    ParrotLanguage perl6
    ParrotHandler ModPerl6::Registry
    Order allow,deny
    Allow from all
</Directory>

This script returns a page with the number of times it's been called for that instance of the Parrot interpreter. It demonstrates the persistence of the interpreter by storing the count in the $x variable between requests. Pretty simple, but it shows how mod_parrot can load, compile, and run code from a file in any supported language.

OSCON wrapup

Just got back from Portland, and while the trip was short, it was fun and completely worth it. The Petfinder development team was there, and as we're all remote it's good to see each other face to face every so often. I also reconnected with some folks I haven't seen in a couple years, and, with some help from particle, resurrected mod_parrot from its feathery ashes. More on that soon.

As for the conference itself, there was a mix of good and not so good, as should be expected. Some of the highlights for me were Simon Peyton-Jones' keynote on software transactional memory, Tim Bunce's session on DBI::Gofer, which may prove to be useful at the day job, and Jonathan Worthington's .NET to Parrot bytecode talk. He may be crazier than I am. Chris Shiflett's Security 2.0 talk was good as well -- no surprise there. (You're welcome, Chris. ;-)

Of course the hallway track was the highlight for me, as always. I spent a lot of time hacking on mod_parrot and catching up with everyone, but I still missed a few people I knew were there. Oh well.

And now I'm off to get some rest. That redeye flight back to Philly was not pleasant at all.

GPL madness

Linus went on a rant yesterday about interpretations of the GPL. As he said, it's a legal license, and in my words, not some moral lesson to be learned at the end of an after-school special (RMS might think differently). Linus takes offense to the FSF defining what "freedom" is, and while I do think he's a bit jaded by the whole thing, I tend to agree with him. Personally, while I respect the GPL for what it is, and adhere to it where necessary, I never have released, and never will release, any of my original code under the GPL. Quite simply, I don't like forcing users of my software to do something just because they've bundled or integrated my software into theirs.

I think it's great that we have so many licensing models to choose from. GPL is a perfectly valid option, and I respect it when I see it. I've just chosen not to use it for my software. Respect that.

Road Trip

And suddenly, I'm going to Portland for OSCON. The day job is paying for a bunch of us to go out this year, and I'm excited to be able to reconnect with people I haven't seen since YAPC 2005 in Toronto. Due to a prior commitment, I'll only be there Tuesday afternoon through Thursday evening, so track me down quick if I owe you a beer. :)

Commentary on "Real World Scalability"

I was reading Scott's (scrottie) journal, and came across his commentary on Ask's Real World Scalability talk. I have some comments of my own, and in the end I think that Ask's presentation, when taken outside of the context of a Perlmongers group meeting, simply suffers from a bad title.

First, I applaud Ask for bringing to light a lot of the relatively unknown open source scalability and high availability solutions. It's important that people know that there are free and open source solutions out there. However, I also applaud Scott for exposing Ask's extreme prejudice toward FOSS solutions. I've been a system administrator for 12 years now, in academia, ISPs, and the corporate world, with a whole bunch of consulting on the side. If I know anything, it's that a good sysadmin will look at all of the available options, free or not, open source or not.

So let's nitpick. Ask's presentation seems to discount any solution whose cost is greater than zero, thereby eliminating the entire world of commercial solutions. And while there are plenty of good FOSS solutions out there, guess what? The commercial ones work too. Often better. And you get support. Not a mailing list, but someone to call to fix things when they break. Yes, all of this costs money. But in the real world, companies are willing to pay for this. And if they're not willing to pay, or the cost breaks the budget, or the open source solution is simply better than the commercial one, then that's where FOSS makes sense.

Let's move on to databases. Why the aversion to Oracle? I'm no fan of Oracle (I loathe their costly licensing scheme), but I've used it (along with MySQL and Postgres) since version 7, and it does the job and more. A properly tuned Oracle database will perform just as well as MySQL or Postgres, so don't tell me not to use something without showing me numbers. I let budget, in-house expertise, and feature/functionality dictate whether or not I use Oracle.

Also regarding databases:

Stored Procedures Dangerous...work in the database server bad

Wow. This couldn't be more wrong. Databases exist to work with data. If I can encapsulate complex data manipulation in an internal stored procedure, what is wrong with that? It divorces SQL from my application layer, where I don't want to see 30 lines of some complex multiple join. Now, Ask does have a point. There is a lot you can do in a stored procedure that does not belong in the database, such as HTML template processing (yes, I've seen this). But don't discount stored procedures as a whole. They are extremely valuable. And if your database's performance is suffering, get a DBA who knows what he's doing to diagnose the problem!

In the "High Availability Shared Storage" slide, Ask says this about shared storage, including Netapp filers, which provide both NAS and SAN interfaces:

all expensive and smells like "the one big server"

Expensive? Without question. "Smells like the one big server?" Yikes. Have you actually used a clustered Netapp filer like the FAS720? If one side goes down due to a hardware failure or maintenance, the other side takes over without missing a beat. And everything is redundant, down to dual loops through each disk controller to guard against FC-AL loop failure. How exactly is that bad? Where is the single point of failure that is "the one big server?"

Finally, we come to disaster recovery. This is a HUGE topic, and I hate seeing it covered so succinctly in any presentation. But at the same time you would be doing your audience a disservice by not mentioning it. So I won't really harp on this, but I do see one glaring thing missing in Ask's presentation -- testing. He never mentions testing your disaster recovery plan. This is the single most difficult and overlooked part of any DR plan, and it is absolutely critical to making sure your recovery procedures actually work in the real world, and for keeping your plan up to date when things change. And things always change.

Now, back to my original assertion that Ask's presentation suffers from a bad title. I think if it were titled, "Open source scalability solutions" or "Inexpensive scalability solutions", there wouldn't be a problem, since he most certainly focuses on FOSS and the price of commercial hardware and software. But the "real world" is a big place, and gives us a much wider range of appropriate solutions than you'd be led to believe from this presentation.

Commentary on "Virtualization's downsides"

A recent Computerworld article, which discusses the downsides of virtualization, looked interesting at first. As a cynic, I enjoy reading articles that deflate overhyped technologies and expose weaknesses, though I am a big fan of virtualization. After reading the article though, I can say that the consultant referenced therein, and possibly the author, see virtualization in a vacuum. The problems presented are perfectly valid, but they all have solutions in the real world. Let's look at each of them:

...enterprises sometimes [sic] have difficulty finding or applying adequate monitoring and management tools that work across both virtual and physical landscapes. Other issues can include support, integration and compatibility of different operating systems on the multivendor hardware being virtualized.

Absolutely true. The solution? Verify that the implementation you choose meets your requirements -- something that a system administrator should do before deploying any software.

Increased uptime requirements arise when enterprises stack multiple workloads onto a single server, making it even more essential to keep the server running.

Okay, so spread your workload around onto multiple hardware nodes, and where applicable, provision redundant services on multiple hardware nodes to minimize single points of failure if a hardware node goes down. You'll never eliminate the hardware node as a point of failure, but you can minimize the effects of any downtime with some careful planning.

Bandwidth problems are also a challenge, Mann says, and are caused by co-locating multiple workloads onto a single system with one network path...in a virtual environment, multiple workloads share a single NIC, and possibly one router or switch as well.

Sure. So make sure you have a hardware node with the capacity for multiple NICs. You can then put certain virtual machines on certain NICs, or trunk the interfaces. If using multiple NICs though, make sure your virtualization software supports that configuration. As for the router or switch, this is solved by using multiple hardware nodes connected to different routers or switches.

So what have we learned?

First, analyze your requirements. If you have 50 high-traffic web servers you'd like to consolidate, it doesn't make sense to have just one hardware node with a single NIC. At the other end of the application stack, a high-volume transactional database server probably doesn't belong on a virtual server at all.

Second, know the resources you have to work with. If all you have is a single server with a single NIC, then plan around that. You might not be able to virtualize everything that you want to, as performance might suffer. But that's a limitation of your hardware and your budget -- not virtualization.

These are all issues a good system administrator would take into account. Get yourself a good sysadmin, and take these kinds of articles with a grain of salt.
