Chapter 11: Recipes
No, we are not going to teach you how to make a delicious tofu and soybean stew. But this is almost as good. This chapter shows how to do some common Mason tasks, some of them with more than one implementation.
Sessions
For many of our
session examples, we will be using the Apache::Session
module. Despite its name, this module doesn't actually require mod_perl
or Apache, though that is the context in which it was born and in which
it's most often used. It implements a simple tied hash interface to a
persistent object.1 It has one major gotcha: you must make sure that the session object gets
cleaned up properly (usually by letting it go out of scope), so that it will be
written to disk after each access.
Nowadays, you might want to take a look at MasonX::Request::WithApacheSession on CPAN, if you really want to integrate sessions into your Mason code. Again, I'd prefer Catalyst for session management.
Dave, 2007
Without Touching httpd.conf
Here is an example that doesn't involve changing any of your Apache configuration settings. The following code should be placed in a top-level autohandler. Any component that needs to use the session will have to inherit from this component, either directly or via a longer inheritance chain.
It uses cookies to store the session.
<%once> use Apache::Cookie; use Apache::Session::File; </%once> <%init> my %c = Apache::Cookie->fetch; my $session_id = exists $c{masonbook_session} ? $c{masonbook_session}->value : undef;
First, it loads the necessary modules. Normally we recommend that you do this at server startup, via a PerlModule directive in your httpd.conf file or in your handler.pl file, to save memory, but we load them here just to show you which ones we are using. The component uses the Apache::Cookie module to fetch any cookies that might have been sent by the browser. Then we check for the existence of a cookie called masonbook_session, which, if it exists, should contain a valid session ID.
local *MasonBook::Session;

eval {
    tie %MasonBook::Session, 'Apache::Session::File', $session_id,
        {
            Directory     => '/tmp/sessions',
            LockDirectory => '/tmp/sessions',
        };
};

if ($@) {
    die $@ unless $@ =~ /Object does not exist/;  # Re-throw

    $m->redirect('/bad_session.html');
}
The first line ensures that when this component ends, the session variable will go out of scope, which triggers Apache::Session's cleanup mechanisms. This is quite important, as otherwise the data will never be written to disk. Even worse, Apache::Session may still be maintaining various locks internally, leading to deadlock. We use local() to localize the symbol table entry *MasonBook::Session; it's not enough to localize just the hash %MasonBook::Session, because the tie() magic is attached to the symbol table entry. It's also worth mentioning that we use a global variable rather than a lexical one, because we want this variable to be available to all components.
If the value in the $session_id variable is undef, that is not a problem. The Apache::Session module simply creates a new session ID. However, if $session_id is defined but does not represent a valid session, an exception will be thrown. This means either that the user's session has expired or that she's trying to feed us a bogus ID. Either way, we want to tell her what's happened, so we redirect to another page that will explain things.
To trap the exception, we wrap the tie() in an eval {} block.
If an exception is thrown, we check $@ to see whether the message indicates that the session isn't valid. Any other error is fatal. If the session isn't valid, we use the redirect() method provided by the request object.
Finally, we send the user a cookie:
Apache::Cookie->new( $r, name => 'masonbook_session', value => $MasonBook::Session{_session_id}, path => '/', expires => '+1d', )->bake;
This simply uses the Apache::Cookie module to ensure that a cookie will be sent to the client with the response headers. This cookie is called 'masonbook_session' and is the one we checked for earlier. It doesn't hurt to send the cookie every time a page is viewed, though this will reset the expiration time of the cookie each time it is set. If you want the cookie to persist for only a certain fixed length of time after the session is created, don't resend the cookie.
$m->call_next; </%init>
This line simply calls the next component in the inheritance chain. Presumably, other components down the line may change the contents of %MasonBook::Session, and those modifications will be written to disk at the end of the request.
Example 11-1 shows the entire component.
<%once>
use Apache::Cookie;
use Apache::Session::File;
</%once>

<%init>
my %c = Apache::Cookie->fetch;
my $session_id =
    exists $c{masonbook_session} ? $c{masonbook_session}->value : undef;

local *MasonBook::Session;

eval {
    tie %MasonBook::Session, 'Apache::Session::File', $session_id,
        {
            Directory     => '/tmp/sessions',
            LockDirectory => '/tmp/sessions',
        };
};

if ($@) {
    die $@ unless $@ =~ /Object does not exist/;  # Re-throw

    $m->redirect('/bad_session.html');
}

Apache::Cookie->new( $r,
                     name    => 'masonbook_session',
                     value   => $MasonBook::Session{_session_id},
                     path    => '/',
                     expires => '+1d',
                   )->bake;

$m->call_next;
</%init>
Predeclaring the Global via an httpd.conf File
It'd be nice to be able to simply use the global session variable without having to type the fully qualified name, %MasonBook::Session, in every component. That can be done by adding this line to your httpd.conf file:
PerlSetVar MasonAllowGlobals %session
Of course, if you're running more than one Mason-based site that uses sessions, you may need to come up with a unique variable name.
Adding this to your httpd.conf means you can simply reference the %session variable in all of your components, without a qualifying package name. The %session variable would actually end up in the HTML::Mason::Commands package, rather than MasonBook.
Predeclaring the Global via a handler.pl Script
If you have a handler.pl script, you could also use the session-making code we just saw. If you wanted to declare a %session global for all your components, you'd simply pass the allow_globals parameter to your interpreter when you make it, like this:
my $ah = HTML::Mason::ApacheHandler->new( comp_root => ..., data_dir => ..., allow_globals => [ '%session' ] );
You might also choose to incorporate the session-making code into your handler subroutine rather than placing it in a component. This would eliminate the need to make sure that all components inherit from the session-making component.
Using Cache::Cache for Sessions
Just to show you that you don't have to use Apache::Session, here is a simple alternative using Cache::Cache, which is integrated into Mason via the request object's cache() method.
This version also sets up the session in a top-level autohandler just like our first session example. It looks remarkably similar.
<%once> use Apache::Cookie; use Cache::FileCache; use Digest::SHA1; </%once>
Again, for memory savings, you should load these modules at server startup.
<%init>
my $cache =
    Cache::FileCache->new( { namespace           => 'Mason-Book-Session',
                             cache_root          => '/tmp/sessions',
                             default_expires_in  => 60 * 60 * 24,  # 1 day
                             auto_purge_interval => 60 * 60 * 24,  # 1 day
                             auto_purge_on_set   => 1 } );
This creates a new cache object that will be used to store sessions. Without going into too much detail, this creates a new caching object that will store data on the filesystem under /tmp/sessions.2 The namespace is basically equivalent to a subdirectory in this case, and the remaining options tell the cache that, by default, stored data should be purged after one day and that it should check for purgeable items once per day.
my %c = Apache::Cookie->fetch; if (exists $c{masonbook_session}) { my $session_id = $c{masonbook_session}->value; $MasonBook::Session = $cache->get($session_id); } $MasonBook::Session ||= { _session_id => Digest::SHA1::sha1_hex( time, rand, $$ ) };
These lines simply retrieve an existing session based on the session ID from the cookie, if such a cookie exists. If this fails or if there was no session ID in the cookie, we make a new one with a randomly generated session ID. The algorithm used above for generating the session ID is more or less the same as the one provided by Apache::Session's Apache::Session::Generate::MD5 module, except that it uses the SHA1 digest module. This algorithm should provide more than enough randomness to make two identical session IDs extremely unlikely. It may not be enough to keep people from guessing possible session IDs, though, so if you want to make sure that a session cannot be hijacked, you should incorporate a secret into the digest algorithm input.
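One way to fold a secret into the session ID is sketched below. This is only an illustration under our own assumptions, not code from the component above: it uses the core Digest::SHA module rather than Digest::SHA1, and both the $secret value and the new_session_id() name are placeholders we made up.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex);

# Hypothetical server-side secret; a real one should be long, random,
# and kept out of the component root.
my $secret = 'replace this with a long random string';

# Same inputs as in the autohandler (time, rand, process ID), plus the secret.
sub new_session_id {
    return sha1_hex( time(), rand(), $$, $secret );
}

my $first  = new_session_id();
my $second = new_session_id();
```

Because the secret never leaves the server, someone who knows the rest of the recipe still cannot reproduce the digest input.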
Apache::Cookie->new( $r, name => 'masonbook_session', value => $MasonBook::Session->{_session_id}, path => '/', expires => '+1d', )->bake;
We then set a cookie in the browser that contains the session ID. This cookie will expire in one day. Again, this piece is identical to what we saw when using Apache::Session.
eval { $m->call_next }; $cache->set( $MasonBook::Session->{_session_id} => $MasonBook::Session );
Unlike with Apache::Session, we need to explicitly tell our cache object to save the data. This means we need to wrap the call to $m->call_next() in an eval {} block in order to catch any exceptions thrown in other components. Otherwise, this part looks almost exactly like our example using Apache::Session.
die $@ if $@; </%init>
After saving the session, we rethrow any exception we may have gotten.
The entire component is shown in Example 11-2.
<%once>
use Apache::Cookie;
use Cache::FileCache;
use Digest::SHA1;
</%once>

<%init>
my $cache =
    Cache::FileCache->new( { namespace           => 'Mason-Book-Session',
                             cache_root          => '/tmp/sessions',
                             default_expires_in  => 60 * 60 * 24,  # 1 day
                             auto_purge_interval => 60 * 60 * 24,  # 1 day
                             auto_purge_on_set   => 1 } );

my %c = Apache::Cookie->fetch;

if (exists $c{masonbook_session}) {
    my $session_id = $c{masonbook_session}->value;
    $MasonBook::Session = $cache->get($session_id);
}

$MasonBook::Session ||=
    { _session_id => Digest::SHA1::sha1_hex( time, rand, $$ ) };

Apache::Cookie->new( $r,
                     name    => 'masonbook_session',
                     value   => $MasonBook::Session->{_session_id},
                     path    => '/',
                     expires => '+1d',
                   )->bake;

eval { $m->call_next };

$cache->set( $MasonBook::Session->{_session_id} => $MasonBook::Session );

die $@ if $@;
</%init>
Sessions with Cache::Cache have these major differences from those with Apache::Session:
- The session itself is not a tied hash. Objects are faster than tied hashes but not as transparent.
- No attempt is made to track whether or not the session has changed. It is always written to disk at the end of a request. This trades the performance boost of Apache::Session's behavior for the assurance that the data is always written to disk.
When using Apache::Session, many programmers are surprised that changes to a nested data structure in the session hash, like:
$session{user}{name} = 'Bob';
are not seen as changes to the top-level %session hash. If no changes to this hash are seen, Apache::Session will not write the hash out to storage.
As a workaround, some programmers may end up doing something like:
$session{force_a_write}++;
or:
$session{last_accessed} = time( );
after the session is created. Using Cache::Cache and explicitly saving the session every time incurs the same penalty as always changing a member of an Apache::Session hash.
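The nested-structure surprise is easy to reproduce without Apache::Session. The sketch below uses a toy tied hash of our own (ToySession, built on the core Tie::ExtraHash class) that sets a flag whenever a top-level key is stored. It is not how Apache::Session is implemented internally; it only mimics the observable behavior described above.

```perl
use strict;
use warnings;

# A toy tied hash that, like Apache::Session, notices only top-level stores.
package ToySession;
use Tie::Hash;                      # also defines Tie::ExtraHash
our @ISA = ('Tie::ExtraHash');      # object is [ \%data, $modified_flag ]

sub STORE {
    my ( $self, $key, $value ) = @_;
    $self->[1] = 1;                 # remember that something changed
    $self->[0]{$key} = $value;
}
sub is_modified { return $_[0][1] ? 1 : 0 }
sub mark_clean  { $_[0][1] = 0 }

package main;

tie my %session, 'ToySession';
$session{user} = { name => 'Alice' };     # top-level store: seen
( tied %session )->mark_clean;

$session{user}{name} = 'Bob';             # nested change: only a FETCH happens
my $after_nested = ( tied %session )->is_modified;

$session{last_accessed} = time();         # top-level store: seen again
my $after_top_level = ( tied %session )->is_modified;
```

The nested assignment fetches the inner hash reference and modifies it directly, so the tied object's STORE method is never called and the flag stays clear until a top-level key is assigned.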
Putting the Session ID in the URL
If you don't want to, or cannot, use cookies, you can store the session ID in the URL. This can be somewhat of a hassle because it means that you have to somehow process all the URLs you generate. Using Mason, this isn't as bad as it could be. There are two ways to do this:
One would be to put a filter in your top-level autohandler that looks something like this:
<%filter>
s/href="([^"]+)"/'href="' . add_session_id($1) . '"'/eg;
s/action="([^"]+)"/'action="' . add_session_id($1) . '"'/eg;
</%filter>
The add_session_id() subroutine, which should be defined in a module, would look something like this:
sub add_session_id {
    my $url = shift;

    return $url if $url =~ m{^\w+://};  # Don't alter external URLs

    if ($url =~ /\?/) {
        $url =~ s/\?/?session_id=$MasonBook::Session{_session_id}&/;
    } else {
        $url .= "?session_id=$MasonBook::Session{_session_id}";
    }

    return $url;
}
This routine accounts for external links as well as links with or without an existing query string. However, it doesn't handle links with fragments properly.
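If fragments matter for your site, one possible approach is to set the fragment aside, add the session ID, and then put the fragment back. The sketch below is a standalone variation of our own: the name add_session_id_with_fragment() is hypothetical, and it takes the session ID as an argument instead of reading %MasonBook::Session so that it can be run outside of Mason.

```perl
use strict;
use warnings;

sub add_session_id_with_fragment {
    my ( $url, $session_id ) = @_;

    return $url if $url =~ m{^\w+://};    # Don't alter external URLs

    # Set any "#fragment" aside so the query string lands before it.
    my $fragment = '';
    $fragment = $1 if $url =~ s/(#.*)\z//s;

    if ( $url =~ /\?/ ) {
        $url =~ s/\?/?session_id=$session_id&/;
    }
    else {
        $url .= "?session_id=$session_id";
    }

    return $url . $fragment;
}

my $plain = add_session_id_with_fragment( '/a.html#top',     'abc' );
my $query = add_session_id_with_fragment( '/a.html?x=1#top', 'abc' );
```

A real implementation would probably also URL-escape the session ID, which we skip here because the IDs generated earlier are plain hex strings.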
The drawback to putting this in the <%filter> is that it filters URLs only in the content body, not in headers. Therefore you'll need to handle those cases separately.
The other solution would be to create all URLs (including those intended for redirects) via a dedicated component or subroutine that would add the session ID. This latter solution is probably a better idea, as it handles redirects properly. The drawback with this strategy is that you'll have a Mason component call for every link, instead of just regular HTML.
We'll add a single line (bolded in Example 11-3) to the /lib/url.mas component we saw in Chapter 8. Now this component expects there to be a variable named %UserSession.
<%args>
$scheme   => 'http'
$username => undef
$password => ''
$host     => undef
$port     => undef
$path
%query    => ( )
$fragment => undef
</%args>

<%init>
my $uri = URI->new;

if ($host) {
    $uri->scheme($scheme);

    if (defined $username) {
        $uri->authority( "$username:$password" );
    }

    $uri->host($host);
    $uri->port($port) if $port;
}

# Sometimes we may want to pass in a query string as part of the
# path but the URI module will escape the question mark.
my $q;

if ( $path =~ s/\?(.*)$// ) {
    $q = $1;
}

$uri->path($path);

# If there was a query string, we integrate it into the query
# parameter.
if ($q) {
    %query = ( %query, split /[&=]/, $q );
}

$query{session_id} = $UserSession{session_id};

# $uri->query_form doesn't handle hash ref values properly
while ( my ( $key, $value ) = each %query ) {
    $query{$key} = ref $value eq 'HASH' ? [ %$value ] : $value;
}

$uri->query_form(%query) if %query;

$uri->fragment($fragment) if $fragment;
</%init>
<% $uri->canonical | n %>\
Making Use of Autoflush
Every once in a while you may have to output a very large component or file to the client. Simply letting this accumulate in the buffer could use up a lot of memory. Furthermore, the slow response time may make the user think that the site has stalled.
Example 11-4 sends out the contents of a potentially large file without sucking up lots of memory.
<%args> $filename </%args> <%init> local *FILE; open FILE, "< $filename" or die "Cannot open $filename: $!"; $m->autoflush(1); while (<FILE>) { $m->print($_); } $m->autoflush(0); </%init>
If each line wasn't too huge, you might just flush the buffer every once in a while, as in Example 11-5.
<%args> $filename </%args> <%init> local *FILE; open FILE, "< $filename" or die "Cannot open $filename: $!"; while (<FILE>) { $m->print($_); $m->flush_buffer unless $. % 10; } $m->flush_buffer; </%init>
The unless $. % 10 bit makes use of the special Perl variable $., which is the current line number of the file being read. If this number modulo 10 is equal to zero, we flush the buffer. This means that we flush the buffer every 10 lines. Replace the number 10 with any desired value.
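To see the $. idiom in isolation, here is a small standalone sketch that reads a 25-line in-memory "file" and records the line numbers at which a flush would happen. The in-memory filehandle and the @flushed_at array are our own stand-ins for the real file and for $m->flush_buffer.

```perl
use strict;
use warnings;

# Build a 25-line in-memory "file" to stand in for the real one.
my $text = join '', map { "line $_\n" } 1 .. 25;
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my @flushed_at;
while ( my $line = <$fh> ) {
    # $. is the line number of the filehandle read most recently.
    push @flushed_at, $. unless $. % 10;
}
close $fh;
```

With 25 lines of input, the condition is true on lines 10 and 20, which is why the component still needs its final $m->flush_buffer call to send the remaining lines.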
User Authentication and Authorization
One problem that web sites have to solve over and over is user authentication and authorization. These two topics are related but not the same, as some might think. Authentication is the process of figuring out if someone is who he says he is, and usually involves checking passwords or keys of some sort. Authorization comes after this, when we want to determine whether or not a particular person is allowed to perform a certain action.
There are a number of modules on CPAN intended to help do these things under mod_perl. In fact, Apache has separate request-handling phases for both authentication and authorization that mod_perl can handle. It is certainly possible to use these modules with Mason.
You can also do authentication and authorization using Mason components (as seen in Chapter 8). Authentication will usually involve some sort of request for a login and password, after which you give the user some sort of token (either in a cookie or a session) that indicates that he has been authenticated. You can then check the validity of this token for each request.
If you have such a token, authorization simply consists of checking that the user to whom the token belongs is allowed to perform a given action.
Using Apache::AuthCookie
The Apache::AuthCookie module, available from CPAN, handles both authentication and authorization via mod_perl and can be easily hooked into Mason. Let's just skip all the details of configuring Apache::AuthCookie, which requires various settings in your server config file, and show how to make the interface to Mason.
Apache::AuthCookie requires that you create a "login script" that will be executed the first time a browser tries to access a protected area. Calling this a script is actually somewhat misleading since it is really a page rather than a script (though it could be a script that generates a page). Regardless, using a Mason component for your login script merely requires that you specify the path to your Mason component for the login script parameter.
We'll call this script AuthCookieLoginForm-login.comp, as shown in Example 11-6.
<html> <head> <title>Mason Book AuthCookie Login Form</title> </head> <body> <p> Your attempt to access this document was denied (<% $r->prev->subprocess_env("AuthCookieReason") %>). Please enter your username and password. </p> <form action="/AuthCookieLoginSubmit"> <input type="hidden" name="destination" value="<% $r->prev->uri %>"> <table align="left"> <tr> <td align="right"><b>Username:</b></td> <td><input type="text" name="credential_0" size="10" maxlength="10"></td> </tr> <tr> <td align="right"><b>Password:</b></td> <td><input type="password" name="credential_1" size="8" maxlength="8"></td> </tr> <tr> <td colspan="2" align="center"><input type="submit" value="Continue"></td> </tr> </table> </form> </body> </html>
This component is a modified version of the example login script included with the Apache::AuthCookie distribution.
The action used for this form, /AuthCookieLoginSubmit, is configured as part of your AuthCookie configuration in your httpd.conf file.
That is about it for interfacing this module with Mason. The rest of authentication and authorization is handled by configuring mod_perl to use Apache::AuthCookie to protect anything on your site that needs authorization. A very simple configuration might include the following directives:
PerlSetVar MasonBookLoginScript /AuthCookieLoginForm.comp <Location /AuthCookieLoginSubmit> AuthType MasonBook::AuthCookieHandler AuthName MasonBook SetHandler perl-script PerlHandler MasonBook::AuthCookieHandler->login </Location> <Location /protected> AuthType MasonBook::AuthCookieHandler AuthName MasonBook PerlAuthenHandler MasonBook::AuthCookieHandler->authenticate PerlAuthzHandler MasonBook::AuthCookieHandler->authorize require valid-user </Location>
The MasonBook::AuthCookieHandler module would look like this:
package MasonBook::AuthCookieHandler;

use strict;

use base qw(Apache::AuthCookie);

use Digest::SHA1;

my $secret = "You think I'd tell you? Hah!";

sub authen_cred {
    my $self = shift;
    my $r = shift;
    my ($username, $password) = @_;

    # implementing _is_valid_user() is out of the scope of this chapter
    if ( _is_valid_user($username, $password) ) {
        my $session_key =
            $username . '::' . Digest::SHA1::sha1_hex( $username, $secret );
        return $session_key;
    }
}

sub authen_ses_key {
    my $self = shift;
    my $r = shift;
    my $session_key = shift;

    my ($username, $mac) = split /::/, $session_key;

    if ( Digest::SHA1::sha1_hex( $username, $secret ) eq $mac ) {
        return $session_key;
    }
}

1;
This provides the minimal interface an Apache::AuthCookie subclass needs to provide to get authentication working.
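The make-a-key and check-a-key logic can be tried outside of Apache. The sketch below is a standalone variation under our own assumptions, not the handler above: it swaps the plain digest of username and secret for an HMAC (hmac_sha1_hex from the core Digest::SHA module), and the make_session_key() and check_session_key() names are hypothetical.

```perl
use strict;
use warnings;
use Digest::SHA qw(hmac_sha1_hex);

# Placeholder secret; a real one should be long, random, and private.
my $secret = 'replace this with a real secret';

# Build "username::mac", the same shape as the handler's session key.
sub make_session_key {
    my ($username) = @_;
    return $username . '::' . hmac_sha1_hex( $username, $secret );
}

# Return the username if the MAC matches, or undef if the key was altered.
sub check_session_key {
    my ($session_key) = @_;
    my ( $username, $mac ) = split /::/, $session_key, 2;
    return undef unless defined $username && defined $mac;
    return $mac eq hmac_sha1_hex( $username, $secret ) ? $username : undef;
}

my $key    = make_session_key('faye');
my $forged = $key;
$forged =~ s/^faye/admin/;    # swap the username but keep the old MAC
```

Changing the username without knowing the secret leaves the MAC mismatched, so check_session_key() rejects the forged key while still accepting the original.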
Authentication Without Cookies
But what if you don't want to use Apache::AuthCookie? Your site may need to work without using cookies.
First, we will show an example authentication system that uses only Mason and passes the authentication token around via the URL (actually, via a session).
This example assumes that we already have some sort of session system that passes the session ID around as part of the URL, as discussed previously.
We start with a quick login form. We will call this component login_form.html, as shown in Example 11-7.
<%args> $username => '' $password => '' $redirect_to => '' @errors => ( ) </%args> <html> <head> <title>Mason Book Login</title> </head> <body> % if (@errors) { <h2>Errors</h2> % foreach (@errors) { <b><% $_ | h %></b><br> % } % } <form action="login_submit.html"> <input type="hidden" name="redirect_to" value="<% $redirect_to %>"> <table align="left"> <tr> <td align="right"><b>Login:</b></td> <td><input type="text" name="username" value="<% $username %>"></td> </tr> <tr> <td align="right"><b>Password:</b></td> <td><input type="password" name="password" value="<% $password %>"></td> </tr> <tr> <td colspan="2" align="center"><input type="submit" value="Login"></td> </tr> </table> </form> </body> </html>
This form uses some of the same techniques we saw in Chapter 8 to prepopulate the form and handle errors.
Now let's make the component that handles the form submission. This component, called login_submit.html and shown in Example 11-8, will check the username and password and, if they are valid, place an authentication token into the user's session.
<%args>
$username
$password
$redirect_to
</%args>

<%init>
if (my @errors = check_login($username, $password)) {
    $m->comp( 'redirect.mas',
              path  => 'login_form.html',
              query => { errors      => \@errors,
                         username    => $username,
                         password    => $password,
                         redirect_to => $redirect_to } );
}

$MasonBook::Session{username} = $username;
$MasonBook::Session{token} =
    Digest::SHA1::sha1_hex( 'My secret phrase', $username );

$m->comp( 'redirect.mas', path => $redirect_to );
</%init>
This component simply checks (via magic hand waving) whether the username and password are valid and, if so, generates an authentication token that is added to the user's session. To generate this token, we take the username, which is also in the session, and combine it with a secret phrase. We then generate a MAC from those two things.
The authentication and authorization check looks like this:
if ( $MasonBook::Session{token} ) {
    if ( $MasonBook::Session{token} eq
         Digest::SHA1::sha1_hex( 'My secret phrase',
                                 $MasonBook::Session{username} ) ) {
        # ... valid login, do something here
    } else {
        # ... someone is trying to be sneaky!
    }
} else {  # no token
    my $wanted_page = $r->uri;

    # Append query string if we have one.
    $wanted_page .= '?' . $r->args if $r->args;

    $m->comp( 'redirect.mas',
              path  => '/login/login_form.html',
              query => { redirect_to => $wanted_page } );
}
We could put all the pages that require authorization in a single directory tree and have a top-level autohandler in that tree do the check. If there is no token to check, we redirect the browser to the login page, and after a successful login the user will return, assuming she submitted valid login credentials.
Access Controls with Attributes
The components we saw previously assumed that there are only two access levels, unauthenticated and authenticated. A more complicated version of this code might involve checking that the user has a certain access level or role.
In that case, we'd first check that we had a valid authentication token and then go on to check that the user actually had the appropriate access rights. This is simply an extra step in the authorization process.
Using attributes, we can easily define access controls for different portions of our site. Let's assume that we have four access levels, Guest, User, Editor, and Admin. Most of the site is public and viewable by anyone. Some parts of the site require a valid login, while some require a higher level of privilege.
We implement our access check in our top-level autohandler, /autohandler, from which all other components must inherit in order for the access control code to be effective.
<%init>
my $user = get_user( );  # again, hand waving

my $required_access = $m->base_comp->attr('required_access');

unless ( $user->has_access_level($required_access) ) {
    # ... do something like send them to another page
}

$m->call_next;
</%init>

<%attr>
required_access => 'Guest'
</%attr>
It is crucial that we set a default access level in this autohandler. By doing this, we are saying that, by default, all components are accessible by all people, since every visitor will have at least Guest access.
We can override this default elsewhere. For example, in a component called /admin/autohandler, we might have:
<%attr> required_access => 'Admin' </%attr>
As long as all the components in the /admin/ directory inherit from the /admin/autohandler component and don't override the required_access attribute, we have effectively limited that directory (and its subdirectories) to admin users only. If we for some reason had an individual component in the /admin/ directory that we wanted editors to be able to see, we could simply set the required_access attribute for that component to 'Editor'.
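The has_access_level() method was part of the hand waving above. One way it might work is sketched below, assuming the four levels form a simple ranking in which each level includes all the ones below it. The ToyUser class and its constructor are hypothetical stand-ins for whatever get_user() really returns.

```perl
use strict;
use warnings;

package ToyUser;

# Assumed ordering: each level implies all the levels before it.
my %rank = ( Guest => 0, User => 1, Editor => 2, Admin => 3 );

sub new {
    my ( $class, %args ) = @_;
    return bless { access_level => $args{access_level} || 'Guest' }, $class;
}

# True if this user's level is at least the required level.
sub has_access_level {
    my ( $self, $required ) = @_;
    die "Unknown access level: $required" unless exists $rank{$required};
    return $rank{ $self->{access_level} } >= $rank{$required} ? 1 : 0;
}

package main;

my $editor = ToyUser->new( access_level => 'Editor' );
my $guest  = ToyUser->new;
```

Dying on an unknown level means a typo in a required_access attribute fails loudly instead of silently granting or denying access.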
Co-Branding Color Schemes
One common business practice these days is to take a useful site and offer "co-branded" versions of it to other businesses. A co-branded site might display different graphics and text for each client while retaining the same basic layout and functionality across all clients.
Mason is extremely well-suited to this task. Let's look at how we might apply a new color scheme to each co-brand.
For the purpose of these examples, we're going to assume that the name of the co-brand has already been determined and is being passed to our components as a variable called $cobrand. This variable could be set up by including the co-brand in the query string, in a session, or as part of a hostname.
With Stylesheets
One way to do this is to use stylesheets for all of your pages. Each co-brand will then have a different stylesheet. However, since most of the stylesheets will be the same for each client, you'll probably want to have a parent stylesheet that all the others inherit from.
Of course, while it is supposed to be possible to inherit stylesheets, some older browsers like Netscape 4.x don't support that at all, so we will generate the stylesheet on the fly using Mason instead. This gives you all the flexibility of inheritance without the compatibility headaches.
The stylesheet will be called via:
<link rel="stylesheet" href="/styles.css?cobrand=<% $cobrand %>">
Presumably, this snippet would go in a top-level autohandler.3 The styles.css component might look something like Example 11-9.
% while (my ($name, $def) = each %styles) {
<% $name %> <% $def %>
% }

<%args>
$cobrand
</%args>

<%init>
my %styles;

die "Security violation, cobrand=$cobrand" unless $cobrand =~ /^\w+$/;

foreach my $file ('default.css', "$cobrand.css") {
    local *FILE;
    open FILE, "< /var/styles/$file"
        or die "Cannot read /var/styles/$file: $!";
    while (<FILE>) {
        next unless /(\S+) \s+ (\S.*)/x;
        $styles{$1} = $2;
    }
    close FILE;
}

$r->content_type('text/css');
</%init>
Of course, this assumes that each line of the stylesheet represents a single style definition, something like:
.foo_class { color: blue }
This isn't that hard to enforce for a project, but it limits you to just a subset of CSS functionality. If this is not desirable, check out the CSS and CSS::SAC modules on CPAN.
This component first grabs all the default styles from the default.css file and then overwrites any styles that are defined in the co-brand-specific file.
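The override behavior comes from nothing more than assigning into the same hash twice. The standalone sketch below shows it using strings in place of the files under /var/styles; the merge_styles() name and the sample style rules are our own inventions for illustration.

```perl
use strict;
use warnings;

# Parse one-rule-per-line stylesheets in order; later sources win.
sub merge_styles {
    my @sources = @_;
    my %styles;
    foreach my $css (@sources) {
        foreach my $line ( split /\n/, $css ) {
            next unless $line =~ /(\S+) \s+ (\S.*)/x;
            $styles{$1} = $2;    # a later definition replaces an earlier one
        }
    }
    return %styles;
}

my $default = ".foo_class { color: blue }\n.bar_class { color: red }\n";
my $cobrand = ".foo_class { color: green }\n";

my %styles = merge_styles( $default, $cobrand );
```

Here .foo_class ends up with the co-brand's definition, while .bar_class keeps its default because the co-brand sheet never mentions it.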
One nice aspect of this method is that if the site designers are not programmers, they can just work with plain old stylesheets, which should make them more comfortable.
With Code
Another way to do this is to store the color preferences for each co-brand in a component or perhaps in the database. At the beginning of each request, you could fetch these colors and pass them to each component.
For example, in your top-level autohandler you might have:
<%init>
my $cobrand = determine_cobrand( );  # magic hand waving again
my %colors = cobrand_colors($cobrand);
$m->call_next(%ARGS, colors => \%colors);
</%init>
The cobrand_colors() subroutine could be made to use defaults whenever they were not overridden for a given co-brand.
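A minimal sketch of such a subroutine follows. The color names, the values, and the "acme" co-brand are made up, and a real version might well read them from a database rather than from hashes in the code.

```perl
use strict;
use warnings;

# Hypothetical defaults and per-co-brand overrides.
my %default_colors = (
    body_bgcolor => '#ffffff',
    footer_text  => '#333333',
);
my %cobrand_overrides = (
    acme => { body_bgcolor => '#ffeecc' },
);

# Later pairs in a list win, so overrides replace the defaults.
sub cobrand_colors {
    my ($cobrand) = @_;
    my $overrides = $cobrand_overrides{$cobrand} || {};
    return ( %default_colors, %$overrides );
}

my %acme_colors    = cobrand_colors('acme');
my %unknown_colors = cobrand_colors('nobody');
```

A co-brand with no entry of its own simply gets the full set of defaults.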
Then the components might do something like this:
<%args> %colors </%args> <html> <head> <title>Title</title> </head> <body bgcolor="<% $colors{body_bgcolor} %>"> ...
This technique is a bit more awkward, as it requires that you have a color set for every possibility ($colors{left_menu_table_cell}, $colors{footer_text}, ad nauseam). It also works only for colors, whereas stylesheets allow you to customize fonts and layouts. But if you're targeting browsers that don't support stylesheets or you don't know CSS, this is a possible alternative.
Developer Environments
Having a development environment is a good thing for many reasons. Testing potential changes on your production server is likely to get you fired, for one thing.
Ideally, you want each developer to have his own playground where changes he makes don't affect others. Then, when something is working, it can be checked into source control and everyone else can use the updated version.
Multiple Component Roots
A fairly simple way to achieve this goal is by giving each developer his own component root, which will be checked before the main root.
Developers can work on components in their own private roots without fear of breaking anything for anyone else. Once changes are made, the altered component can be checked into source control and moved into the shared root, where everyone will see it.
This means that one HTML::Mason::ApacheHandler object needs to be created for each developer. This can be done solely by changing your server configuration file, but it is easiest to do this using an external handler.
The determination of which object to use can be made either by looking at the URL path or by using a different hostname for each developer.
By path
This example checks the URL to determine which developer's private root to use:
use Apache::Constants qw(DECLINED);

my %ah;

sub handler {
    my $r = shift;

    my $uri = $r->uri;
    $uri =~ s,^/(\w+),,;  # remove the developer name from the path
    my $developer = $1 or return DECLINED;

    $r->uri($uri);  # set the uri to the new path

    $ah{$developer} ||=
        HTML::Mason::ApacheHandler->new
            ( comp_root => [ [ dev  => "/home/$developer/mason" ],
                             [ main => '/var/www' ] ],
              data_dir  => "/home/$developer/data" );

    return $ah{$developer}->handle_request($r);
}
We first examine the URL of the request to find the developer name, which we assume will always be the first part of the path, like /faye/index.html. We use a regex to remove this from the URL, which we then change to be the altered path.
If there is no developer name we simply decline the request.
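The path-splitting step can be tested on its own, outside of Apache. The sketch below is a hypothetical helper of our own, split_developer_path(), that returns the developer name and the remaining path, or an empty list when the path has no leading name (the case in which the handler declines).

```perl
use strict;
use warnings;

# Split "/faye/index.html" into ( "faye", "/index.html" ).
# Returns an empty list if there is no leading developer name.
sub split_developer_path {
    my ($uri) = @_;
    my ( $developer, $rest ) = $uri =~ m{^/(\w+)(.*)\z}s
        or return;
    return ( $developer, $rest );
}

my ( $developer, $path ) = split_developer_path('/faye/index.html');
my @no_developer = split_developer_path('/');
```

Capturing both pieces in one match keeps the decision to decline tied to this particular match rather than to whatever $1 happened to hold beforehand.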
The main problem with this approach is that it would then require that all URLs on the site be relative in order to preserve the developer's name in the path. In addition, some Apache features like index files and aliases won't work properly either. Fortunately, there is an even better way.
By hostname
This example lets you give each developer their own hostname:
my %ah; sub handler { my $r = shift; my ($developer) = $r->hostname =~ /^(\w+)\./; $ah{$developer} ||= HTML::Mason::ApacheHandler->new ( comp_root => [ [ dev => "/home/$developer/mason" ], [ main => '/var/www' ] ], data_dir => "/home/$developer/data" ); return $ah{$developer}->handle_request($r); }
This example assumes that for each developer there is a DNS entry like dave.dev.masonbook.com. You could also insert a CNAME wildcard entry in your DNS. The important part is that the first piece is the developer name.
Of course, with either method, developers will have to actively manage their development directories. Any component in their directories will block their access to a component of the same name in the main directory.
Multiple Server Configurations
The multiple component root method has several downsides:
- Modules are shared by all the developers. If a change is made to a module, everybody will see it. This means that API changes are forced out to everyone at once, and a runtime error will affect all the developers. Additionally, you may need to stop and start the server every time a module is changed, interrupting everyone (although you could use Apache::Reload from CPAN to avoid this).
- You can't test different server configurations without all the developers being affected.
- Truly catastrophic errors that bring down the web server affect everyone.
- The logs are shared, so if you like to send messages to the error log for debugging you'd better hope that no one else is doing the same thing or you'll have a mess.
The alternative is to run a separate daemon for each developer, each on its own port. This means maintaining either one fairly complicated configuration file with a lot of <IfDefine> directives, or a separate configuration file for each developer.
The latter is probably preferable as it gives each developer total freedom to experiment. The configuration files can be generated from a template (possibly using Mason) or a script. Then each developer's server can listen on a different hostname or port for requests.
You can have each server's component root be the developer's working directory, which should mirror the layout of the real site. This means that there is no need to tweak any paths in the components.
This method's downside is that it will inevitably use up more memory than having a single server. It also requires a greater initial time investment in order to generate the configuration file templates. But the freedom it gives to individual developers is very nice, and the time investment is fixed.
Of course, since each developer has a computer, there is nothing to stop a developer from simply setting up Apache and mod_perl locally. And the automation would be even easier since there's no need to worry about dealing with unique port numbers or shared system resources. Even better (or worse, depending on your point of view), a developer can check out the entire system onto a laptop and work on the code without needing to be on the office network.
Managing DBI Connections
Not infrequently, we see people on the Mason users list asking questions about how to handle caching DBI connections.
Our recipe for this is really simple:
use Apache::DBI
Rather than reinventing the wheel, use Apache::DBI, which provides the following features:
- It is completely transparent to use. Once you've loaded it, you simply call DBI->connect() as always and Apache::DBI gives you an existing handle if one is available.
- It makes sure that the handle is live, so that if your RDBMS goes down and then back up, your connections still work just fine.
- It does not cache handles made before Apache forks, as many DBI drivers do not support using a handle after a fork.
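As a rough sketch of what this looks like in practice, the module is typically loaded once at server startup, for instance in a startup.pl or handler.pl file. The DSN, user name, and password below are placeholders, and the connect_on_init() call is optional: it asks each child process to open its connection when it starts rather than on the first request.

```perl
# startup.pl (or handler.pl), loaded once when Apache starts.
use strict;

use Apache::DBI;   # load before DBI so it can take over DBI->connect
use DBI;

# Optional: open a connection as each child process starts.
# All connection details here are placeholders.
Apache::DBI->connect_on_init(
    'dbi:mysql:database=reviews', 'mason_user', 'secret',
    { RaiseError => 1, AutoCommit => 1 },
);

1;
```

Components then keep calling DBI->connect() with their usual arguments. Handles are cached per set of connection arguments, so a component must pass exactly the same values as connect_on_init() to receive the handle that was opened at startup.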
Using Mason Outside of Dynamic Web Sites
So far we've spent a lot of time telling you how to use Mason to generate spiffy web stuff on the fly, whether that be HTML, WML, or even dynamic SVG files.
But Mason can be used in lots of other contexts. For example, you could write a Mason app that recursively descends a directory tree and calls each component in turn to generate a set of static pages.
How about using Mason to generate configuration files from templates? This could be quite useful if you had to configure a lot of machines similarly but with each one slightly different (for example, a web server farm).
Generating a Static Site from Components
Many sites might be best implemented as a set of static files instead of as a set of dynamically created responses to requests. For example, if a site's content changes only once or twice a week, generating each page dynamically upon request is probably overkill. In addition, you can often find much cheaper web hosting if you don't need a mechanism for generating pages dynamically.
But we'd still like some of the advantages a Mason site can give us. We'd like to build the site based on a database of content. We'd also like to have a nice consistent set of headers and footers, as well as automatically generate some bits for each page from the database. And maybe, just maybe, we also want to be able to make look-and-feel changes to the site without resorting to a multi-file find-and-replace. These requirements suggest that Mason is a good choice for site implementation.
For our example in this section, we'll consider a site of film reviews. It is similar to a site that one of the authors actually created for Hong Kong film reviews. Our example site will essentially be a set of pages that show information about films, including the film's title, year of release, director, cast, and of course a review. We'll generate the site from the Mason components on our home GNU/Linux box and then upload the site to the host.
First, we need a directory layout. Assuming that we're starting in the directory /home/dave/review-site, here's the layout:
/home/dave/review-site (top level)
    /htdocs
        - index.html
        /reviews
            - autohandler
            - Anna_Magdalena.html
            - Lost_and_Found.html
            - ... (one file for each review)
    /lib
        - header.mas
        - footer.mas
        - film_header_table.mas
The index page will be quite simple. It will look like Example 11-10.
<& /lib/header.mas, title => 'review list' &>

<h1>Pick a review</h1>

<ul>
% foreach my $title (sort keys %pages) {
 <li><a href="<% $pages{$title} | h %>"><% $title | h %></a></li>
% }
</ul>

<%init>
 my %pages;

 local *DIR;
 my $dir = File::Spec->catfile( File::Spec->curdir, 'reviews' );
 opendir DIR, $dir
     or die "Cannot open $dir dir: $!";

 foreach my $file ( grep { /\.html$/ } readdir DIR ) {
     next if $file =~ /index\.html$/;

     my $comp = $m->fetch_comp("reviews/$file")
         or die "Cannot find reviews/$file component";

     my $title = $comp->attr('film_title');

     $pages{$title} = "reviews/$file";
 }

 closedir DIR
     or die "Cannot close $dir dir: $!";
</%init>
This component simply makes a list of the available reviews, based on the files ending in .html in the /home/dave/review-site/htdocs/reviews subdirectory. We assume that the actual film title is kept as an attribute (via an <%attr> section) of the component, so we load the component and ask it for the film_title attribute. If it doesn't have one, Mason will throw an exception, which we think is better than having an empty link. If this were a dynamic web site, we might want to instead simply skip that review and go on to the next one, but here we're assuming that this script is being executed by a human being capable of fixing the error.
We make sure to HTML-escape both the filename in the <a> tag's href attribute and the film title in the link text. It's not unlikely that a film title could contain an ampersand character (&), and we want to generate proper HTML.
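To see what the escaping buys us, here is a small standalone sketch. The escape_html function is a hand-rolled stand-in written for illustration, not Mason's own implementation of the h flag, and the film title and filename are made up.

```perl
use strict;
use warnings;

# Hand-rolled stand-in for HTML escaping (illustrative only).
sub escape_html {
    my ($text) = @_;

    my %entity = (
        '&' => '&amp;',
        '<' => '&lt;',
        '>' => '&gt;',
        '"' => '&quot;',
    );
    $text =~ s/([&<>"])/$entity{$1}/g;

    return $text;
}

# Hypothetical title and file containing an ampersand
my $title = 'Cops & Robbers';
my $href  = 'reviews/Cops_&_Robbers.html';

printf qq{<li><a href="%s">%s</a></li>\n},
    escape_html($href), escape_html($title);
```

Without the escaping, the bare ampersand would end up in the generated page and the markup would no longer be valid HTML.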
Next, let's make our autohandler for the reviews subdirectory (Example 11-11), which will take care of all the repeated work that goes into displaying a review.
<& /lib/header.mas, title => $film_title &>

<& /lib/film_header_table.mas, comp => $m->base_comp &>

% $m->call_next;

<& /lib/footer.mas &>

<%init>
 my $film_title = $m->base_comp->attr('film_title');
</%init>
Again, a very simple page. We grab the film title so we can pass it to the header component. Then we call the film_header_table.mas component, which will use attributes from the component it is passed to generate a table containing the film's title, year of release, cast, and director.
Then we call the review component itself via call_next() and finish up with the footer.
Our header (Example 11-12) is quite straightforward.
<html>
<head>
<title><% $real_title | h %></title>
</head>

<body>

<%args>
 $title
</%args>

<%init>
 my $real_title = "Dave's Reviews - $title";
</%init>
This is a nice, simple header that generates the basic HTML pieces every page needs. Its only special feature is that it will make sure to incorporate a unique title, based on what is passed in the $title argument.
The footer (Example 11-13) is the simplest of all.
<p>
<em>Copyright © David Rolsky, 1996-2002</em>.
</p>

<p>
<em>All rights reserved. No part of the review may be reproduced or
transmitted in any form or by any means, electronic or mechanical,
including photocopying, recording, or by any information storage and
retrieval system, without permission in writing from the copyright
owner.</em>
</p>

</body>
</html>
There's one last building block left before we get to the reviews: the /lib/film_header_table.mas component (Example 11-14).
<table width="100%">
 <tr>
  <td colspan="2" align="center"><h1><% $film_title | h %></h1></td>
 </tr>
% foreach my $field ( grep { exists $data{$_} } @optional ) {
 <tr>
  <td><strong><% ucfirst $field %></strong>:</td>
  <td><% $data{$field} | h %></td>
 </tr>
% }
</table>

<%args>
 $comp
</%args>

<%init>
 my %data;

 my $film_title = $comp->attr('film_title');

 my @optional = qw( year director cast );

 foreach my $field (@optional) {
     my $data = $comp->attr_if_exists($field);
     next unless defined $data;

     $data{$field} = ref $data ? join ', ', @$data : $data;
 }
</%init>
This component just builds a table based on the attributes of the component passed to it. The required attribute is the film's title, but we can accommodate the year, director(s), and cast.
There are only two slightly complex lines.
The first is:
% foreach my $field ( grep { exists $data{$_} } @optional ) {
Here we are iterating through the fields in @optional that have matching keys in %data. We could have simply called keys %data, but we want to display things in a specific order while still skipping nonexistent keys.
The other line that bears some explaining is:
$data{$field} = ref $data ? join ', ', @$data : $data;
We check whether the value is a reference so that the attribute can contain an array reference, which is useful for something like the cast, which is probably going to have more than one person in it. If it is an array, we join all its elements together into a comma-separated list. Otherwise, we simply use it as-is.
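The two lines can be tried outside Mason with a plain hash standing in for the component's attributes. This is only a sketch: display_rows is a hypothetical name, and the hash lookup replaces the call to attr_if_exists().

```perl
use strict;
use warnings;

my @optional = qw( year director cast );

# Hypothetical stand-in for a component's attributes; no director is given
my %attr = (
    cast => [ 'Kelly Chan Wai-Lan', 'Takeshi Kaneshiro' ],
    year => 1996,
);

# Returns [ label, value ] pairs in the order given by @fields,
# skipping any field that has no value.
sub display_rows {
    my ($attr, @fields) = @_;

    my %data;
    for my $field (@fields) {
        my $value = $attr->{$field};
        next unless defined $value;

        # An array reference becomes a comma-separated list
        $data{$field} = ref $value ? join(', ', @$value) : $value;
    }

    return map  { [ ucfirst($_), $data{$_} ] }
           grep { exists $data{$_} } @fields;
}

print "$_->[0]: $_->[1]\n" for display_rows( \%attr, @optional );
```

The year is listed before the cast even though the hash was written in the opposite order, and the missing director is skipped without leaving an empty row.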
Let's take a look at what one of the review components might look like:
<%attr>
 film_title => 'Lost and Found'
 year => 1996
 director => 'Lee Chi-Ngai'
 cast => [ 'Kelly Chan Wai-Lan', 'Takeshi Kaneshiro', 'Michael Wong Man-Tak' ]
</%attr>

<p>
Takeshi Kaneshiro plays a man who runs a business called Lost and
Found, which specializes in searching for lost things and people. In
the subtitles, his name is shown as That Worm, though that seems like
a fairly odd name, cultural barriers notwithstanding. Near the
beginning of the film, he runs into Kelly Chan. During their first
conversation, she says that she has lost something. What she says she
has lost is hope. We soon find out that she has leukemia and that the
hope she seeks seems to be Michael Wong, a sailor who works for her
father's shipping company.
</p>

<p>
blah blah blah...
</p>
This makes writing new reviews really easy. All we do is type in the review and a small number of attributes, and the rest of the framework is built automatically.
A more complex version of this site might store some or all of the data, including the reviews, in a database, which would make it easier to reuse the information in another context. But this is certainly good enough for a first pass.
All that's left is the script that will generate the static HTML files. See Example 11-15.
#!/usr/bin/perl -w

use strict; # Always use strict!

use Cwd;
use File::Basename;
use File::Find;
use File::Path;
use File::Spec;
use HTML::Mason;

# These are directories. The canonpath method removes any cruft
# like doubled slashes.
my ($source, $target) = map { File::Spec->canonpath($_) } @ARGV;

die "Need a source and target\n"
    unless defined $source && defined $target;

# Make target absolute because File::Find changes the current working
# directory as it runs.
$target = File::Spec->rel2abs($target);

my $interp =
    HTML::Mason::Interp->new( comp_root => File::Spec->rel2abs(cwd) );

find( \&convert, $source );

sub convert {
    # We don't want to try to convert our autohandler or .mas
    # components. $_ contains the filename
    return unless /\.html$/;

    my $buffer;

    # This will save the component's output in $buffer
    $interp->out_method(\$buffer);

    # We want to split the path to the file into its components and
    # join them back together with a forward slash in order to make
    # a component path for Mason
    #
    # $File::Find::name has the path to the file we are looking at,
    # beginning with the source directory as given on the command line
    my $comp_path = join '/', File::Spec->splitdir($File::Find::name);

    $interp->exec("/$comp_path");

    # Strip off leading part of path that matches source directory.
    # \Q...\E keeps characters like "." in $source from being treated
    # as regex metacharacters.
    my $name = $File::Find::name;
    $name =~ s/^\Q$source\E//;

    # Generate absolute path to output file
    my $out_file = File::Spec->catfile( $target, $name );

    # In case the directory doesn't exist, we make it
    mkpath(dirname($out_file));

    local *RESULT;
    open RESULT, "> $out_file" or die "Cannot write to $out_file: $!";
    print RESULT $buffer or die "Cannot write to $out_file: $!";
    close RESULT or die "Cannot close $out_file: $!";
}
We take advantage of the File::Find module included with Perl, which can recursively descend a directory structure and invoke a callback for each file found. We simply have our callback (the convert() subroutine) call the HTML::Mason::Interp object's exec() method for each file ending in .html. We then write the results of the component call out to disk in the target directory.
We also use a number of other modules, including Cwd, File::Basename, File::Path, and File::Spec. These modules are distributed as part of the Perl core and provide useful functions for dealing with the filesystem in a cross-platform-compatible manner.
You may have noticed in Example 9-1 that when we invoked the Interpreter's exec() method directly, it didn't attempt to handle any of the web-specific elements of the request.
The same method is employed again here in our HTML generation script, and this same methodology could be applied in other situations that have little or nothing to do with the web.
Generating Config Files
Config files are a good candidate for Mason. For example, your production and staging web server config files might differ in only a few areas. Changes to one usually will need to be propagated to another. This is especially true with mod_perl, where web server configuration can basically be part of a web-based application.
And if you adopt the per-developer server solution discussed earlier, a template-driven config file generator becomes even more appealing.
Example 11-16 is a simple script to drive this generation.
#!/usr/bin/perl -w

use strict;

use Cwd;
use File::Spec;
use HTML::Mason;
use User::pwent;

my $comp_root =
    File::Spec->rel2abs( File::Spec->catfile( cwd( ), 'config' ) );

my $output;
my $interp =
    HTML::Mason::Interp->new( comp_root  => $comp_root,
                              out_method => \$output,
                            );

my $user = getpwuid($<);

$interp->exec( '/httpd.conf.mas', user => $user );

my $file = File::Spec->catfile( $user->dir, 'etc', 'httpd.conf' );
open FILE, ">$file" or die "Cannot open $file: $!";
print FILE $output;
close FILE;
An httpd.conf.mas from the component root might look like Example 11-17.
ServerRoot <% $user->dir %>

PidFile <% File::Spec->catfile( $user->dir, 'logs', 'httpd.pid' ) %>

LockFile <% File::Spec->catfile( $user->dir, 'logs', 'httpd.lock' ) %>

Port <% $user->uid + 5000 %>

# loads Apache modules, defines content type handling, etc.
<& standard_apache_config.mas &>

<Perl>
 use lib '<% File::Spec->catfile( $user->dir, 'project', 'lib' ) %>';
</Perl>

DocumentRoot <% File::Spec->catfile( $user->dir, 'project', 'htdocs' ) %>

PerlSetVar MasonCompRoot <% File::Spec->catfile( $user->dir, 'project', 'htdocs' ) %>
PerlSetVar MasonDataDir <% File::Spec->catfile( $user->dir, 'mason' ) %>
PerlModule HTML::Mason::ApacheHandler

<LocationMatch "\.html$">
 SetHandler perl-script
 PerlHandler HTML::Mason::ApacheHandler
</LocationMatch>

<%args>
 $user
</%args>
This points the server's document root to the developer's working directory. Similarly, it adds the project/lib directory to Perl's @INC via use lib so that the user's working copy of the project's modules is seen first. The server will listen on a port equal to the user's user ID plus 5,000.
Obviously, this is an incomplete example. It doesn't specify where logs, or other necessary config items, will go. It also doesn't handle generating the config file for a server intended to be run by the root user on a standard port.
Footnotes
1. If you are not familiar with Perl's tied variable feature, we suggest reading the perltie manpage (perldoc perltie).
2. See the documentation accompanying the Cache::Cache modules for more detail.
3. The overachieving reader may want to imagine a dhandler-based solution with URLs like /styles/<cobrand>.css.