Chapter 12: Custom Mason Subclasses
Something that we have tried very hard to do beginning with the 1.10 release of
Mason is to make it easier to customize Mason's behavior. Jon Swartz was
already on this track even back with the release of 0.80, which saw the first
appearance of the HTML::Mason::Resolver
classes, but 1.10 tries to bring this to new levels.
Starting with 1.10 it has become possible to subclass almost every core class that comes with Mason. Some obvious candidates for subclassing include the Lexer, Compiler, and Resolver. This chapter will demonstrate how you might go about implementing subclasses of various Mason objects.
Class::Container as a Superclass
A number of modules in Mason are subclasses of
Class::Container
. This is a class that was created to encapsulate some common behaviors for
Mason objects. Originally, it was called HTML::Mason::Container
, but Ken Williams decided to package this class separately and release it to
CPAN, as it solves some fundamental problems of a large object-oriented system.
Any Mason object that takes parameters to its constructor must inherit from this module. Of course, since all of the classes that you might
consider subclassing inherit from Class::Container
already, you shouldn't need to inherit from it directly. However, you may
need to use some of its methods. We will briefly cover a few of them here, but
see the Class::Container
documentation for more details.
The modules in the Mason core distribution that are Class::Container
subclasses are HTML::Mason::ApacheHandler
, HTML::Mason::CGIHandler
, HTML::Mason::Interp
, HTML::Mason::Compiler
, HTML::Mason::Lexer
, HTML::Mason::Resolver
, and HTML::Mason::Request
.
The most important methods that Class::Container
provides are valid_params()
and contained_objects()
, both of which are class methods.
The first,
valid_params()
, is called in order to register the valid parameters for a class's new()
constructor. The second method, contained_objects()
, is used to register the objects, if any, that a given class contains.
The contained_objects()
method is not something you will have to use for all of your subclasses, since
most of the time you won't be altering the structure of Mason's
framework, you'll just be plugging your own classes into it. This method is
called with a hash that contains as its keys parameter names that the
class's constructor accepts and as its values the default name of the
contained class.
For example, HTML::Mason::Compiler
contains the following code:
__PACKAGE__->contained_objects( lexer => 'HTML::Mason::Lexer' );
This says that the HTML::Mason::Compiler->new()
method will accept a lexer
parameter and that, if no such parameter is given, then an object of the HTML::Mason::Lexer
class will be constructed.
Class::Container
also implements a bit of magic here, so that if HTML::Mason::Compiler->new()
is called with a lexer_class
parameter, it will load the class, instantiate a new object of that class, and
use that for the lexer. In fact, it's even smart enough to notice if
parameters given to HTML::Mason::Compiler->new()
are really intended for this subclass, and it will make sure that they get
passed along.
The valid_params()
method is a bit more complex. It also takes a list of key/value pairs as
arguments. The keys are the names of parameters accepted by the new()
method, while the values are hash references defining a validation
specification for the parameter. This specification is largely the same as that
used by the
Params::Validate
module, with a few additions (but no subtractions).
One addition is that each parameter, excluding those that represent contained
objects, may also define a value for parse
. This tells Mason how to parse this parameter if it is defined as part of an
Apache configuration file. If no parse
parameter is provided, a sensible default will be guessed from the value of
the Params::Validate
type
argument.
The upshot of this is that your subclasses can define their own constructor parameters and Mason will then check for these parameters in an Apache configuration file.
As an example, HTML::Mason::Compiler
contains the following:
__PACKAGE__->valid_params ( allow_globals => { parse => 'list', type => ARRAYREF, default => [ ], descr => "An array of names of Perl variables that are" . " allowed globally within components" }, default_escape_flags => { parse => 'string', type => SCALAR, default => '', descr => "Escape flags that will apply by default to" . " all Mason tag output" }, lexer => { isa => 'HTML::Mason::Lexer', descr => "A Lexer object that will scan component" . " text during compilation" }, preprocess => { parse => 'code', type => CODEREF, optional => 1, descr => "A subroutine through which all component text" . " will be sent during compilation" }, postprocess_perl => { parse => 'code', type => CODEREF, optional => 1, descr => "A subroutine through which all Perl code" . " will be sent during compilation" }, postprocess_text => { parse => 'code', type => CODEREF, optional => 1, descr => "A subroutine through which all plain text will" . " be sent during compilation" }, ); __PACKAGE__->contained_objects( lexer => 'HTML::Mason::Lexer' );
The type
, default
, and optional
parameters are part of the validation specification used by
Params::Validate
. The various constants used, ARRAYREF
, SCALAR
, and so on, are all exported by Params::Validate
. The parameters passed to valid_params()
correspond to the MasonAllowGlobals
, MasonDefaultEscapeFlags
, MasonLexerClass
, MasonPreprocess
, MasonPostprocessPerl
, and MasonPostprocessText
httpd.conf
configuration variables. Yes, Class
is added automatically to the lexer
param because lexer
was also given to the contained_objects()
method.
The descr
parameter is used when we generate the HTML::Mason::Params
documentation and is probably not something you'd need to use.
For more details, see both the Class::Container
and Params::Validate
documentation.
Syntax: Your Very Own Lexer
A request heard every so often on the Mason users list is for some way to create an XML-based markup language that can be used with Mason and that can be compiled to a Mason component object.
Despite the panic the thought of such a thing inspires in us, in the interests of good documentation, we will show the beginnings of such a lexer.
This lexer object will make use of several modules from CPAN, including XML::SAX::ParserFactory
and XML::SAX::Base
. The former is what it sounds like, a factory for
SAX parsers (SAX2 parsers, actually). The latter is what any SAX2 handler
should use as a base class. It implements a default no-op method for all the
possible SAX2 methods, allowing you to simply implement those that you need.
Our lexer will be a SAX2 handler, so we will inherit from XML::SAX::Base
.
A quick side note on
SAX (Simple API for XML): SAX is an event-based API for parsing XML. As the
parser finds
XML constructs, such as tags or character data, it calls appropriate methods
in a SAX handler, such as start_element()
or characters()
. The parser is an event producer and the handler, like our Lexer, is an event
consumer. In our case, the Lexer will also be generating events for the
Compiler, though these will not be SAX events.
For more information on Perl's implementation of SAX2, see the perl-xml project on Sourceforge at http://perl-xml.sourceforge.net/.
For the purposes of our example, let's assume that any element that is not
in the mason
XML namespace will be output verbatim, as will any text. For tags, we'll
just implement <mason:args>
, <mason:init>
, <mason:perl>
, and <mason:output>
in this example.1 The <mason:init>
tag will contain XML-escaped Perl code, while the <mason:args>
tag will contain zero or more <mason:arg>
tags. Each <mason:arg>
tag will have the attributes name
and default
, with name
being required.
We will also implement a <mason:component>
tag in order to provide a single top-level containing tag for the component,
which is an XML requirement.
This is only a subset of the Mason syntax set, but it's enough to show you how to customize a fairly important part of the system.
Using these tags, we might have some XML like this:
<?xml version="1.0"?> <mason:component xmlns:mason="http://www.masonbook.com/"> This is plain text. <b>This is text in an HTML tag</b> <mason:perl> my $x; if ($y > 10) { $x = 10; } else { $x = 100; } </mason:perl> $x is <mason:output>$x</mason:output> $y is <mason:output>$y</mason:output> <mason:args> <mason:arg name="$y" /> <mason:arg name="@z" default="(2,3)" /> </mason:args> <mason:init> $y *= $_ foreach @z; </mason:init> </mason:component>
OK, that looks just beautiful!
Let's start with the preliminaries.
package HTML::Mason::Lexer::XML; $VERSION = '0.01'; use strict; use HTML::Mason::Exceptions( abbr => [ qw( param_error syntax_error error ) ] ); use HTML::Mason::Lexer; use Params::Validate qw(:all); use XML::SAX::Base; use XML::SAX::ParserFactory; use base qw(HTML::Mason::Lexer XML::SAX::Base); # Lexer comes first
As mentioned before, XML::SAX::Base
provides default no-op methods for all of the possible SAX2 events, of which
there are many. Since we're not interested in most of them, it's nice
to have them safely ignored. We inherit from HTML::Mason::Lexer
because it provides a few methods that the compiler class needs, such as object_id()
.
Because we're staunch generalists, we won't insist that the XML
namespace of our tags needs to be ' mason'
. We'll let the user override this with a parameter if desired:
__PACKAGE__->valid_params ( xml_namespace => { parse => 'string', type => SCALAR, default => 'mason', descr => "Prefix of XML tags indicating Mason sections" }, );
We don't need to make a separate new()
method in our module, since we can just inherit the one provided by our base
Lexer class. The main action will happen in the lex()
method:
sub lex { my ($self, %p) = @_; local $self->{name} = $p{name}; local $self->{compiler} = $p{compiler};
We need a convenient place to keep these, so we stick them into $self
for the duration of lexing. Perl's local()
function makes sure these entries expire at the end of the lex()
method:
$self->{state} = [ ];
We'll need to keep a stack of what tags we've seen so we can check that
tags aren't improperly nested and in order to handle characters()
events correctly:
my $parser = XML::SAX::ParserFactory->parser( Handler => $self );
We could have created the parser object in our new()
method, but to store it we would have had to save it in the lexer object's
structure, which would have created a circular reference. Doing it this way
guarantees that the reference to the parser will go out of scope when we're
finished using it.
$parser->parse_string( $p{comp_source} ); }
The last bit tells the parser to parse the component text we were given. That will cause the parser to in turn call methods for each SAX event that occurs while parsing the string.
Now we'll take a look at our event-handling methods. The first is start_element()
, which will be called whenever an XML tag is first encountered:
sub start_element { my $self = shift; my $elt = shift; if ( ! defined $elt->{Prefix} || $elt->{Prefix} ne $self->{xml_namespace} ) { $self->_verbatim_start_element($elt); return; }
If we got something that isn't in our designated namespace we'll just pass it through to the compiler as text to be output:
if ( $elt->{LocalName} eq 'component' ) { $self->{compiler}->start_component; }
When the component starts, we notify the compiler so it can do any initialization that it needs to do:
foreach my $block ( qw( init perl args ) ) { if ( $elt->{LocalName} eq $block ) { $self->_start_block($block); last; } } if ( $elt->{LocalName} eq 'output' ) { $self->_start_output; } if ( $elt->{LocalName} eq 'arg' ) { $self->_handle_argument($elt); } }
The rest of this method is basically a switch statement. Depending on what type of element we receive, we call the appropriate internal method to handle that element.
Let's look at some of the individual methods that are called:
sub _verbatim_start_element { my $self = shift; my $elt = shift; my $xml = '<' . $elt->{Name}; my @att; foreach my $att ( values %{ $elt->{Attributes} } ) { push @att, qq|$att->{Name}="$att->{Value}"|; } if (@att) { $xml .= ' '; $xml .= join ' ', @att; } $xml .= '>'; $self->{compiler}->text( text => $xml ); }
Basically, this method goes through some contortions to regenerate the original
XML element and then passes it on to the compiler as plain text. It should be
noted that this implementation will end up converting tags like <foo/>
into tag pairs like <foo></foo>
. This is certainly valid XML but it may be a bit confusing to users.
Unfortunately, there is no easy way to retrieve the exact text of the source
document to determine how a tag was originally written, and with XML you're
not supposed to care anyway.
Back to our subclass. The next method to implement is our _start_block()
method. This will handle the beginning of a number of blocks in a simple
generic fashion:
sub _start_block { my $self = shift; my $block = shift; if ( $self->{state}[-1] && $self->{state}[-1] ne 'def' && $self->{state}[-1] ne 'method' ) { syntax_error "Cannot nest a $block tag inside a $self->{state}[-1] tag"; }
What we are doing here is making it impossible to do something like nest a <mason:init>
tag inside a <mason:perl>
block. In fact, the only tags that can contain other tags are method and
subcomponent definition tags, which are unimplemented in this example.
We notify the compiler that a new block has started and then push the block name onto our internal stack so we have access to it later:
$self->{compiler}->start_block( block_type => $block ); push @{ $self->{state} }, $block; }
Again, we check for basic logical errors:
sub _start_output { my $self = shift; if ( $self->{state}[-1] && $self->{state}[-1] ne 'def' && $self->{state}[-1] ne 'method' ) { syntax_error "Cannot nest an output tag inside a $self->{state}[-1] tag"; }
Again, we push this onto the stack so we know that this was the last tag we saw:
push @{ $self->{state} }, 'output'; }
The variable name and default are expressed as attributes of the element. The
weird '{}name
' syntax is intentional. Read the Perl SAX2 spec mentioned earlier for more
details on what this means.
sub _handle_argument { my $self = shift; my $elt = shift; my $var = $elt->{Attributes}{'{}name'}{Value}; my $default = $elt->{Attributes}{'{}default'}{Value};
We want to check that the variable name is a valid Perl variable name:
unless ( $var =~ /^[\$\@%][^\W\d]\w*/ ) { syntax_error "Invalid variable name: $var"; }
Then we tell the compiler that we just found a variable declaration.
$self->{compiler}->variable_declaration( block_type => 'args', type => substr( $var, 0, 1 ), name => substr( $var, 1 ), default => $default ); }
That wraps up all the methods that start_element()
calls. Now let's move on to handling a characters()
SAX event. This happens whenever the SAX parser encounters data outside of an
XML tag.
sub characters { my $self = shift; my $chars = shift; if ( ! $self->{state}[-1] || $self->{state}[-1] eq 'def' || $self->{state}[-1] eq 'method' ) { $self->{compiler}->text( text => $chars->{Data} ); return; }
If we're in the main body of a component, subcomponent, or method, we simply pass the character data on as text:
if ( $self->{state}[-1] eq 'init' || $self->{state}[-1] eq 'perl' ) { $self->{compiler}->raw_block( block_type => $self->{state}[-1], block => $chars->{Data} ); return; }
Character data in a <mason:init>
or <mason:perl>
section is passed to the compiler as the contents of that block. The compiler
knows what type of tag is currently being processed and handles it
appropriately.
if ( $self->{state}[-1] eq 'output' ) { $self->{compiler}->substitution( substitution => $chars->{Data} ); } }
If we are in a substitution tag, we call a different compiler method instead. Otherwise, we'll simply end up discarding the contents.
Since we may be dealing with text where whitespace is significant (as opposed to HTML), we'll want to pass on whitespace as if it were character data:
sub ignorable_whitespace { $_[0]->characters($_[1]->{Data}) }
This method may be called if the XML parser finds a chunk of "ignorable
whitespace." Frankly, we can never ignore whitespace, because it is just so cool, and without it our code would be unreadable. But apparently XML parsers
can ignore it.2 The last thing we need to handle is an
end_element()
event:
sub end_element { my $self = shift; my $elt = shift; if ( ! defined $elt->{Prefix} || $elt->{Prefix} ne $self->{xml_namespace} ) { $self->_verbatim_end_element($elt); return; }
Again, XML elements not in our designated namespace are passed on verbatim to the compiler:
if ( $elt->{LocalName} eq 'component' ) { $self->{compiler}->end_component; return; }
If we have reached the end tag of the component, we inform the compiler that the component is complete and we return:
return if $elt->{LocalName} eq 'arg';
We don't need to do anything to end argument declarations. The work needed
to handle this element happened when we called _handle_argument()
from our start_element()
method.
if ( $self->{state}[-1] ne $elt->{LocalName} ) { syntax_error "Something very weird happened. " . "We encountered an ending tag for a $elt->{LocalName} tag " . "before ending our current tag ($self->{state}[-1])."; }
Actually, this should just never happen: XML does not allow tag overlap and, if the parser finds overlapping tags, it should die rather than passing them to us. But we believe in being paranoid. If there is an error in the logic of this lexer code, this might help us in catching it.
if ( $elt->{LocalName} eq 'output' ) { pop @{ $self->{state} }; return; }
Any output that needed to be sent has already been dealt with via the characters()
method so we simply need to change our state if the end tag was </mason:output>
:
$self->{compiler}->end_block( block_type => $elt->{LocalName} ); pop @{ $self->{state} }; }
The only remaining possibilities at this point are either <mason:perl>
, <mason:init>
, or <mason:args>
. For these we simply tell the compiler that the block is over, change our
state, and finish.
The last method we need to write is _verbatim_end_element()
to pass through tag endings for non-Mason tags:
sub _verbatim_end_element { my $self = shift; my $elt = shift; $self->{compiler}->text( text => "</$elt->{Name}>" ); }
This concludes our sample lexer subclass. Note that there are a couple of things missing here. First of all, there is no handling of subcomponents or methods. This wouldn't be too terribly hard as it's mostly an issue of calling the right methods on the compiler.
We also would want to handle line numbers. The default Mason lexer keeps track
of line numbers in the source file so that the compiler can output appropriate #line
directives in the object file, meaning that errors are reported relative to
the source file. This feature isn't required but can be very nice to have.
Some of the unhandled potential tags like <mason:text>
would be extremely trivial to implement. The <mason:flags>
and <mason:attr>
tags could be modeled on the code for handling <mason:args>
. And of course, we need to handle component calls too. This is the point in
this example where we say, "finishing this is left as an exercise to the
reader."
To use this new lexer class, we would either place the following in the httpd.conf file:
PerlSetVar MasonLexerClass HTML::Mason::Lexer::XML
or, when creating the ApacheHandler object, we would simply pass in 'HTML::Mason::Lexer::XML
' as the value of the lexer_class
parameter.
Output: Compiling to a Different Output
So you've decided that you really hate Mason and you want to use Embperl instead. But you have a number of Mason components you've already written that you'd like to save. Well, you can create your own compiler to generate Embperl code from Mason. In this case, we'll use the lexer as is and rewrite the compiler from scratch. There isn't really a one-to-one match between Mason and Embperl's features so this example will, like the lexer example, be limited in scope. Finding an intelligent way to convert Mason's methods and subcomponents to Embperl is beyond the scope of this book.
In case you are unfamiliar with Embperl, it uses the following syntax: [+ +]
tags contain code whose results should be sent to the browser, like
Mason's substitution tag (<% %>
). The [* *]
tags contain Perl code that is not intended to generate output. This is
equivalent to Mason's %
-lines and <%perl>
blocks. Finally, Embperl also has a
[! !]
tag similar to Mason's <%once>
block.
There are other Embperl tags but, once again, this is a simplified example.
Embperl does have a feature similar to Mason's inheritance system called EmbperlObject, but translating between the two is nontrivial.
So let's make our new compiler:
package HTML::Mason::Compiler::ToEmbperl; $VERSION = '0.01'; use strict; use HTML::Mason::Lexer; use HTML::Mason::Exceptions ( abbr => [qw(syntax_error)] ); use HTML::Mason::Compiler; use base qw(HTML::Mason::Compiler);
This pulls in the basic packages we'll need. Even though we really
aren't inheriting much from HTML::Mason::Compiler
, we still subclass it as anything expecting a compiler will check that what
it is given is a subclass of HTML::Mason::Compiler
.
Of course, in our case, we won't be using this compiler with the HTML::Mason::Interp
class, so the point is moot but important to mention.
sub compile { my ($self, %p) = @_; $self->lexer->lex( comp_source => $p{comp_source}, name => 'Embperl', compiler => $self ); return $self->component_as_embperl; }
The only parameter we expect is
comp_source
. We tell the lexer the name of the component is 'Embperl'
since we don't really care what the name is in this context. Presumably we
are being called by some sort of script that is simply going to take the
Embperl-ized component and write it to disk somewhere. The name is used for
reporting syntax errors when a component is run, but that won't be an issue
in this case.
sub start_component { my $self = shift; $self->{once_header} = ''; $self->{header} = ''; $self->{body} = ''; $self->{footer} = ''; $self->{current_block} = ''; }
This method is called to give the compiler a chance to reset its state, so that's what we do.
We will be storing blocks of code in each of the first four attributes. When we
encounter a <%once>
block, it will go in the once_header
attribute. For <%init>
blocks, we can put then in the
header
attribute. %
-lines, <%perl>
blocks, <%text>
blocks, substitution tags, and text will be placed immediately into the body
attribute. Finally, any <%cleanup>
blocks will go into the footer
attribute.
The current_block()
attribute will be used to keep track of what type of block we are in after a
call to our start_block()
method.
This example will ignore other Mason syntax such as component calls,
subcomponents, methods, and <%shared>.
Again, this will be left as an exercise for the reader.
sub start_block { my ($self, %p) = @_; syntax_error "Cannot nest a $p{block_type} inside a $self->{in_block} block" if $self->{in_block};
This is to make sure that the component is following the syntax rules we expect.
$self->{in_block} = $p{block_type}; }
Then we record what kind of block we are starting, which will be something like init
or perl
.
The next method, raw_block()
, is called for all of the blocks that we handle except the <%text>
block:
sub raw_block { my ($self, %p) = @_; for ($self->{in_block}) { /^once$/ and $self->{once_header} .= $p{block}; /^init$/ and $self->{header} .= $p{block}; /^perl$/ and $self->{body} .= "[* $p{block} *]"; /^cleanup$/ and $self->{footer} .= $p{block}; } }
This switchlike statement stores the code given to us in the appropriate
attribute. If it is a <%perl%>
block, we wrap it in the relevant Embperl tag; otherwise, we simply store it
as is in the appropriate slot.
sub text_block { my ($self, %p) = @_; $self->{body} .= $p{block}; } sub text { my ($self, %p) = @_; $self->{body} .= $p{text}; }
The first method is called when the lexer finds a<%text>
block. The second is called for regular text. Both of these get placed into
the body
attribute for later use.
sub substitution { my ($self, %p) = @_; $self->{body} .= "[+ $p{substitution} +]"; }
This method handles
substitution tags (<% %>
) though it ignores the fact that this method can also be given an escape
parameter. This could be handled via Embperl's $escmode
variable (again, left as an exercise for the reader).
sub perl_line { my ($self, %p) = @_; $self->{body} .= "[* $p{line} *]"; }
This method is called for %
-lines.
Then we need to implement the end_block()
method:
sub end_block { my ($self, %p) = @_; syntax_error "end of $p{block_type} encountered while in $self->{in_block} block" unless $self->{in_block} eq $p{block_type};
Another sanity check is in the start_block()
method. It's always a good thing to make sure that the lexer is giving us
the kind of input that we would expect.
$self->{in_block} = undef; }
And we reset our in_block
attribute so that the next call to start_block()
succeeds.
The last method to implement is the
component_as_embperl()
method, which simply will return a big block of text, our new Embperl page:
sub component_as_embperl { my $self = shift; my $page = ''; if ( length $self->{once_header} ) { $page .= "[! $self->{once_header} !]\n"; } if ( length $self->{header} ) { $page .= "[* $self->{header} *]\n"; } if ( length $self->{body} ) { $page .= "$self->{body}\n"; } if ( length $self->{footer} ) { $page .= "[* $self->{footer} *]\n"; } return $page; }
And there you have it -- a perfectly good Mason component brutally butchered and turned into an Embperl page. I hope you're happy with yourself!
Storage: Replacing the Resolver
Occasionally, people on the Mason users list wonder if they can store their
component source in an RDBMS. The way to achieve this is to create your own
HTML::Mason::Resolver
subclass.
The resolver's job is take a component path and figure out where the corresponding component is.
We will show an example that connects to a MySQL server containing the following table:
MasonComponent ---------------------------------------- path VARCHAR(255) PRIMARY KEY component TEXT NOT NULL last_modified DATETIME NOT NULL
Our code starts as follows:
package HTML::Mason::Resolver::MySQL; $VERSION = '0.01'; use strict; use DBI; use Params::Validate qw(:all); use HTML::Mason::ComponentSource; use HTML::Mason::Resolver; use base qw(HTML::Mason::Resolver); __PACKAGE__->valid_params ( db_name => { parse => 'string', type => SCALAR }, user => { parse => 'string', type => SCALAR, optional => 1 }, password => { parse => 'string', type => SCALAR, optional => 1 }, );
These parameters will be used to connect to the MySQL server containing our components. Readers familiar with the Perl DBI will realize that there are a number of other parameters that we could take.
Our constructor method, new()
, needs to do a bit of initialization to set up the database connection, so we
override our base class's method:
sub new { my $class = shift; my $self = $class->SUPER::new(@_);
We invoke the new()
method provided by our superclass, which validates the parameters in @_
and makes sure they get sent to the right contained objects. The latter
concern doesn't seem so important in this case since we don't have any
contained objects, but the point is that if somebody subclasses our HTML::Mason::Resolver::MySQL
class and adds contained objects, our new()
method will still do the right thing with its parameters.
Now we connect to the database in preparation for retrieving components later:
$self->{dbh} = DBI->connect ( "dbi:mysql:$self->{db_name}", $self->{user}, $self->{password}, { RaiseError => 1 } ); return $self; }
A resolver needs to implement two methods left unimplemented in the parent HTML::Mason::Resolver
class. These are get_info()
and glob_path()
. The first is used to retrieve information about the component matching a
particular component path. The second takes a glob pattern like /path/*
or /path/*/foo/*
and returns the component paths of all the components that match that wildcard
path.
Additionally, if we want this resolver to be usable with the ApacheHandler
module, we need to implement a method called apache_request_to_comp_path()
, which takes an Apache object and translates it into a component path.
Given a path, we want to get the time when this component was last modified, in the form of a Unix timestamp, which is what Mason expects:
sub get_info { my ($self, $path) = @_; my ($last_mod) = $self->{dbh}->selectrow_array ( 'SELECT UNIX_TIMESTAMP(last_modified) FROM MasonComponent WHERE path = ?', {}, $path ); return unless $last_mod;
If there was no entry in the database for the given path, we simply return, which lets Mason know that no matching component was found:
return HTML::Mason::ComponentSource->new ( comp_path => $path, friendly_name => $path, last_modified => $last_mod, comp_id => $path, source_callback => sub { $self->_get_source($path) }, ); }
The get_info()
method returns its information in the form of a HTML::Mason::ComponentSource
object. This is a very simple class that holds information about a component.
Its constructor accepts the following parameters:
-
comp_path
-
friendly_name
The string given for this parameter will be used to identify the component in error messages. For our resolver, the component path works for this parameter as well because it is the primary key for the MasonComponent table in the database, allowing us to uniquely identify a component.
For other resolvers, this might differ from the component path. For example, the filesystem resolver that comes with Mason uses the component's absolute path on the filesystem.
-
last_modified
This is the last modification time for the component, as seconds since the epoch.
-
comp_id
This should be a completely unique identifier for the component. Again, since the component path is our primary key in the database, it works well here.
-
source_callback
This is a subroutine reference that, when called, returns the source text of the component.
Mason could have had you simply create an
HTML::Mason::ComponentSource
subclass that implemented asource()
method for your resolver, but we thought that rather than requiring you to write such a do-nothing subclass, it would be easier to simply use a callback instead.Our
_get_source()
method is trivially simple:sub _get_source { my $self = shift; my $path = shift; return $self->{dbh}->selectrow_array ( 'SELECT component FROM MasonComponent WHERE path = ?', {}, $path ); }
-
comp_class
This is the component class into which this particular component should be blessed when it is created. This must be a subclass of
HTML::Mason::Component
. The default isHTML::Mason::Component
. -
extra
This optional parameter should be a hash reference. It is used to pass information from the resolver to the component class.
This is needed since an
HTML::Mason::Resolver
subclass and anHTML::Mason::Component
subclass can be rather tightly coupled, but they must communicate with each other through the interpreter (this may change in the future).
Next is our glob_path()
method:
sub glob_path { my $self = shift; my $pattern = shift; $pattern =~~ s/*/%/g;
The pattern given will be something that could be passed to Perl's glob()
function. We simply replace this with the SQL equivalent for a LIKE
search:
return $self->{dbh}->selectcol_array ( 'SELECT path FROM MasonComponent WHERE path LIKE ?', {}, $pattern ); }
Then we return all the matching paths in the database.
Since we may want to use this resolver with ApacheHandler, we will also
implement the
apache_request_to_comp_path()
method:
sub apache_request_to_comp_path { my $self = shift; my $r = shift; my $path = $r->uri; return $path if $self->{dbh}->selectrow_array ( 'SELECT 1 FROM MasonComponent WHERE path = ?', {}, $path ); return undef unless $r->path_info; $path .= $r->path_info; return $path if $self->{dbh}->selectrow_array ( 'SELECT 1 FROM MasonComponent WHERE path = ?', {}, $path ); return undef; }
We generate a component path by taking the requested URI and looking for that
in the database. If it doesn't exist, we will try appending the path info
if possible or just give up. Finally, we try the altered path and, if that
doesn't exist either, we just give up and return undef
, which will cause the ApacheHandler module to return a NOT FOUND
status for this request.
That's it, all done. And nothing left as an exercise for the reader this time.
As with the lexer, this can be used either via a httpd.conf directive:
PerlSetVar MasonResolverClass HTML::Mason::Resolver::MySQL
or by passing the resolver_class
parameter to the new()
method for HTML::Mason::Interp
.
Request: A Request Object with a Built-in Session
Wouldn't it be cool to have a request object with a built-in session? "Yes, it would," you answer. "Child's play," we say.
When a request is made using this
object, it should either find an old session or create a new one. Then in our
components we will simply call $m->session()
to get back a hash reference that will persist between requests.
For simplicity's sake, we won't make this class configurable as to what type of session to use, though it could be done.3
package HTML::Mason::Request::WithSession; $VERSION = '0.01'; use strict; # Import a subroutine error( ) which throws an HTML::Mason::Exception # object use HTML::Mason::Exceptions ( abbr => [ 'error' ] ); use HTML::Mason::ApacheHandler; use base qw(HTML::Mason::Request);
One problem unique to subclassing to the Request object is that Mason already
comes with two of its own Request subclasses. These are HTML::Mason::Request::ApacheHandler
and HTML::Mason::Request::CGIHandler
, which are used by the ApacheHandler and CGIHandler, respectively.
In order to cooperate with the ApacheHandler and CGIHandler modules, we want to
subclass the appropriate class. However, we can't know which one to
subclass when we are loaded, because it is possible that we will be loaded before the ApacheHandler or CGIHandler module. We'll take care of this in our new()
method, which will be discussed momentarily.
Our session will be implemented using cookies and
Cache::FileCache
for storage, just as we saw in Chapter 11:
use Apache::Cookie; use Cache::FileCache; use Digest::SHA1;
We solve our subclassing problem with the following code. There is nothing
wrong with changing a class's inheritance dynamically in Perl, so
that's what we do. The alter_superclass()
method is provided by the HTML::Mason::Request
base class, and does the right thing even given multiple inheritance. It also
cooperates with Class:Container
to make sure that it sees any changes made to the inheritance hierarchy:
sub new { my $class = shift; $class->alter_superclass( $HTML::Mason::ApacheHandler::VERSION ? 'HTML::Mason::Request::ApacheHandler' : $HTML::Mason::CGIHandler::VERSION ? 'HTML::Mason::Request::CGI' : 'HTML::Mason::Request' ); return $class->SUPER::new(@_); }
We make a session, call exec()
in our parent class, taking care to preserve the caller's scalar/list
context, and then save the session. If an exception is thrown, we simply
rethrow it:
sub exec { my $self = shift; $self->_make_session; my @result; if (wantarray) { @result = eval { $self->SUPER::exec(@_) }; } elsif (defined wantarray) { $result[0] = eval { $self->SUPER::exec(@_) }; } else { eval { $self->SUPER::exec(@_) }; } # copy this in case _save_session overwrites $@ my $e = $@; $self->_save_session; die $e if $e; return wantarray ? @result : defined wantarray ? $result[0] : undef; }
Making a new session for subrequests is probably incorrect behavior, so we
simply reuse our parent's session object if a subrequest is exec()
'd:
sub _make_session { my $self = shift; if ( $self->is_subrequest ) { $self->{session} = $self->parent_request->session; return; }
This code is pulled almost verbatim from Chapter 11:
my %c = Apache::Cookie->fetch; my $session_id = exists $c{masonbook_session} ? $c{masonbook_session}->value : undef; $self->{session_cache} = Cache::FileCache->new( { cache_root => '/tmp', namespace => 'Mason-Book-Session', default_expires_in => 60 * 60 * 24, # 1 day auto_purge_interval => 60 * 60 * 24, # 1 day auto_purge_on_set => 1 } ); my $session; if ($session_id) { $session = $self->{session_cache}->get($session_id); } unless ($session) { $session = { _session_id => Digest::SHA1::sha1_hex( time, rand, $$ ) }; } Apache::Cookie->new( $self->apache_req, name => 'masonbook_session', value => $session->{_session_id}, path => '/', expires => '+1d', )->bake; $self->{session} = $session; }
Also just like Chapter 11:
sub _save_session { my $self = shift; $self->{session_cache}->set ( $self->{session}{_session_id} => $self->{session} ); }
And to finish it off, a simple accessor method:
sub session { $_[0]->{session} }
Wow, nice and simple. Of course, this would need to be customized for your
environment, or you can use the previously mentioned HTML::Mason::Request::WithApacheSession
module available from CPAN.
Once again, you have two options to use this new subclass. If you are configuring Mason via your httpd.conf file, do this:
PerlSetVar MasonRequestClass HTML::Mason::Request::WithSession
or in your handler.pl you can load the module and then pass a request_class
parameter to the HTML::Mason::ApacheHandler
class's constructor.
Argument Munging: ApacheHandler
One of the main reasons that you might consider creating your own ApacheHandler class is to change the way arguments are processed. For example, we might want to create objects based on certain objects.
Our subclass starts like many others:
package HTML::Mason::ApacheHandler::AddObjects; $VERSION = '0.01'; use strict; use HTML::Mason::ApacheHandler; use base qw(HTML::Mason::ApacheHandler);
This should look pretty familiar. Now we'll load a few more classes, which we'll be using to create objects:
use Date::ICal; # date object use MyApp::User; # user object
And now we override the argument-processing subroutine, request_args()
:
sub request_args { my $self = shift; my ($args, $r, $cgi_object) = $self->SUPER::request_args(@_);
ApacheHandler's request_args()
method returns three items. The first is a hash reference containing the
arguments that will be passed to the component. The second is the Apache
or Apache::Request
object for the current request, and the third is a CGI.pm
object. The CGI.pm
object is created only when the ApacheHandler's args_method
attribute is set to CGI
.
if ( exists $args->{epoch} ) { $args->{date} = Date::ICal->new( epoch => $args->{epoch} ); } if ( exists $args->{user_id} ) { $args->{user} = MyApp::User->new( user_id => $args->{user_id} ); }
This bit of code simply creates some useful objects if certain incoming
arguments exist. Finally we return the munged $args
hash reference, along with the other return values from our superclass's request_args()
method:
return ($args, $r, $cgi_object); }
Now, whenever the client submits an argument called epoch
, there will be an additional argument, date
, a Date::ICal
object, available in components. Similarly, a request with a user_id
parameter means that there will be an argument user
containing a MyApp::User
object passed to components.
Using this class is a little different from what we've seen previously. In our httpd.conf file, we'd have something like this:
<Location /mason> SetHandler perl-script PerlHandler HTML::Mason::ApacheHandler::AddObjects </Location>
Similarly, in our handler.pl file, we'd simply create this object instead of an HTML::Mason::ApacheHandler
object.
More Reader Exercises
Consider the following possibilities for Mason subclasses:
- A resolver subclass that allows you to have two files per component. One file could be primarily HTML and the other would be code. The resolver subclass would simply concatenate the two together.
- A lexer subclass enabling ASP-style syntax.
-
An ApacheHandler subclass that munges incoming strings into Unicode, using the
Encode
module from CPAN. - A CGIHandler subclass that performs the same argument-to-object transformation seen in our example ApacheHandler subclass.
Footnotes
1. The equivalent of <% %>
in the sane world where people don't use XML for everything! -- Return.
2. See Section 2.10 of the W3C XML 1.0 Recommendation for the definition of "ignorable whitespace." -- Return.
3. This is left as an exercise... Actually, this was left to the one of the
authors. Dave Rolsky recently created MasonX::Request::WithApacheSession
, which is a highly configurable module that expands on the example shown in
this section. This module is available from a CPAN mirror near you. -- Return.
These HTML pages were created by running this script against the pseudo-POD source.