
Apache::Util::escape_html() Doesn't Like Perl UTF-8 Strings

I got bitten by a bug in Apache::Util’s escape_html() function in mod_perl 1. It seems that it doesn’t handle strings in Perl’s internal Unicode representation! This patch demonstrates the issue (be sure that your editor understands UTF-8):

--- modperl/t/net/perl/util.pl.~1.18.~	Sun May 25 03:54:08 2003
+++ modperl/t/net/perl/util.pl	Thu Sep  9 19:38:40 2004
@@ -74,6 +74,25 @@
 
 #print $esc_2;
 test ++$i, $esc eq $esc_2;
+
+# Make sure that escape_html() understands multibyte characters.
+my $utf8 = '<專輯>';
+my $esc_utf8 = '&lt;專輯&gt;';
+my $test_esc_utf8 = Apache::Util::escape_html($utf8);
+test ++$i, $test_esc_utf8 eq $esc_utf8;
+#print STDERR "Compare '$test_esc_utf8'\n     to '$esc_utf8'\n";
+
+eval { require Encode };
+unless ($@) {
+    # Make sure escape_html() properly handles strings with Perl's
+    # Unicode encoding.
+    $utf8 = Encode::decode_utf8($utf8);
+    $esc_utf8 = Encode::decode_utf8($esc_utf8);
+    $test_esc_utf8 = Apache::Util::escape_html($utf8);
+    test ++$i, $test_esc_utf8 eq $esc_utf8;
+    #print STDERR "Compare '$test_esc_utf8'\n     to '$esc_utf8'\n";
+}
+
 use Benchmark;
 
 =pod

If I enable the print statements and look at the log, I see this:

Compare '&lt;專輯&gt;'
     to '&lt;專輯&gt;'
Compare '&lt;å°è¼¯&gt;'
     to '&lt;專輯&gt;'

The first escape appears to work correctly, but when I decode the string to Perl’s Unicode representation, you can see how badly escape_html() munges the text!

Curiously, both tests fail, even though the first conversion appears to be correct. This could be due to the behavior of eq, though I’m not sure why. The second test is the more interesting one, though, since its output is thoroughly mangled.
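
Since the byte-string case at least looks right in the log, one possible workaround is to hand the escaping routine UTF-8 bytes and decode the result afterward. Here is a minimal sketch of that idea, under some loud assumptions: escape_html_bytes() is a hypothetical pure-Perl stand-in for Apache::Util::escape_html() (so the example runs outside mod_perl), and escape_html_chars() is a hypothetical wrapper name. Neither comes from mod_perl, and I haven’t verified this against the real function.

```perl
use strict;
use warnings;
use utf8;    # this source file itself is UTF-8
use Encode qw(encode_utf8 decode_utf8);

# Hypothetical stand-in for Apache::Util::escape_html(), so that the
# sketch runs outside mod_perl. It escapes only &, <, >, and ".
sub escape_html_bytes {
    my ($bytes) = @_;
    my %entity = (
        '&' => '&amp;',
        '<' => '&lt;',
        '>' => '&gt;',
        '"' => '&quot;',
    );
    $bytes =~ s/([&<>"])/$entity{$1}/g;
    return $bytes;
}

# Hypothetical wrapper: encode the character string to UTF-8 bytes,
# escape the bytes, then decode the result back to characters.
sub escape_html_chars {
    my ($chars) = @_;
    return decode_utf8( escape_html_bytes( encode_utf8($chars) ) );
}

my $escaped = escape_html_chars('<專輯>');

binmode STDOUT, ':encoding(UTF-8)';
print "$escaped\n";
```

The escaping only touches ASCII characters, so the multibyte sequences pass through the byte-level step untouched and decode cleanly afterward.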

Apache::TestMB Released!

As I mentioned last week, I’ve been working on a subclass of Module::Build that supports testing with Apache::Test. Today, Geoff announced the release of Apache::Test 1.12. This release includes the new Module::Build subclass, Apache::TestMB. Now anyone using Apache::Test to test their module can convert the build system to Module::Build.

To set an example, I’ve just released MasonX::Interp::WithCallbacks using the new build module. The conversion was simple; in fact, I think that Apache::TestMB is easier to use than Apache::TestMM (which integrates Apache::Test with ExtUtils::MakeMaker). My Makefile.PL had looked like this:

#!perl -w

use strict;
use ExtUtils::MakeMaker;
use File::Spec::Functions qw(catfile catdir);
use constant HAS_APACHE_TEST => eval {require Apache::Test};

# Set up the test suite.
if (HAS_APACHE_TEST) {
    require Apache::TestMM;
    require Apache::TestRunPerl;
    Apache::TestMM->import(qw(test clean));
    Apache::TestMM::filter_args();
    Apache::TestRunPerl->generate_script();
} else {
    print "Skipping Apache test setup.\n";
}

my $clean = join ' ', map { catfile('t', $_) }
  qw(mason TEST logs);

WriteMakefile(
    NAME                => 'MasonX::Interp::WithCallbacks',
    VERSION_FROM        => 'lib/MasonX/Interp/WithCallbacks.pm',
    PREREQ_PM           => { 'HTML::Mason'             => '1.23',
                             'Test::Simple'            => '0.17',
                             'Class::Container'        => '0.09',
                             'Params::CallbackRequest' => '1.11',
                           },
    clean               => { FILES => $clean },
    ($] >= 5.005 ?    ## Add these new keywords supported since 5.005
      (ABSTRACT_FROM    => 'lib/MasonX/Interp/WithCallbacks.pm',
       AUTHOR           => 'David Wheeler <david@kineticode.com>') : ()),
);

The new Build.PL simplifies things quite a bit. It looks like this:

use Module::Build;

my $build_pkg = eval { require Apache::TestMB }
  ? 'Apache::TestMB' : 'Module::Build';

$build_pkg->new(
    module_name        => 'MasonX::Interp::WithCallbacks',
    license            => 'perl',
    requires           => { 'HTML::Mason'             => '1.23',
                            'Test::Simple'            => '0.17',
                            'Class::Container'        => '0.09',
                            'Params::CallbackRequest' => '1.11'
                          },
    build_requires     => { 'Test::Simple' => '0.17' },
    create_makefile_pl => 'passthrough',
    add_to_cleanup     => ['t/mason'],
)->create_build_script;

Much nicer, eh?

Module::Build + Apache::Test is Nearly Here

Over the last couple of days, I whipped up a new class to be added to the Apache HTTP Test Project. The new class, Apache::TestMB, is actually a subclass of Module::Build, and finally provides support for using Apache::Test with Module::Build. You use it just like Module::Build; however, since a lot of modules choose to install themselves even if Apache isn’t installed (because they can be used both inside and outside of mod_perl, e.g., HTML::Mason), I’m suggesting that Build.PL files look like this:

use Module::Build;

my $build_pkg = eval { require Apache::TestMB }
  ? "Apache::TestMB" : "Module::Build";

my $build = $build_pkg->new(
  module_name => "My::Module",
)->create_build_script;

Pretty simple, huh? To judge by the discussion, it will soon be committed to the Apache::Test repository and released to CPAN. My MasonX::Interp::WithCallbacks module will debut with a new Apache::TestMB-powered Build.PL soon afterward.
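
If you want to see the fallback idiom at work without Apache::Test or Module::Build installed, here is a small sketch. The helper name pick_class(), the placeholder Fallback::Class, and the module name No::Such::Module::Hopefully (assumed not to be installed) are all hypothetical and exist only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return $preferred if it can be loaded,
# otherwise return $fallback. Mirrors the eval { require ... } idiom.
sub pick_class {
    my ($preferred, $fallback) = @_;
    (my $file = "$preferred.pm") =~ s{::}{/}g;    # Foo::Bar -> Foo/Bar.pm
    return eval { require $file; 1 } ? $preferred : $fallback;
}

# File::Spec ships with Perl, so it should load and be chosen.
print pick_class('File::Spec', 'Fallback::Class'), "\n";

# This module is assumed not to be installed, so the fallback is chosen.
print pick_class('No::Such::Module::Hopefully', 'Fallback::Class'), "\n";
```

The eval block traps the exception that require throws for a missing module, so the script keeps running and simply picks the second class name.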
