Just a Theory

By David E. Wheeler

Posts about Perl

Wanted: New SVN::Notify Maintainer

I’ve used Subversion only very occasionally since 2009, and SVN::Notify not at all. Over the years, I’ve fixed minor issues with it now and then, and made a couple of releases to address issues fixed by others. But it’s past the point where I feel qualified to maintain it. Hell, the repository for SVN::Notify has been hosted on GitHub ever since 2011. I don’t have an instance of Subversion against which to test it; nor do I have any SMTP servers to throw test messages at.

In short, it’s past time I relinquished maintenance of this module to someone with a vested interest in its continued use. Is that you? Do you need to keep SVN::Notify running for your projects, and have a few TUITs to fix the occasional bug or security issue? If so, drop me a line (david @ this domain). I’d be happy to transfer the repository.

Please Test Pod::Simple 3.29_3


I’ve just pushed Pod-Simple 3.29_3 to CPAN. Karl Williamson did a lot of hacking on this release, finally adding support for EBCDIC. But as part of that work, and in coordination with Pod::Simple’s original author, Sean Burke, as well as pod-people, we have switched the default encoding from Latin-1 to CP-1252.

On the surface, that might sound like a big change, but in truth, it’s pretty straight-forward. CP-1252 is effectively a superset of Latin-1, repurposing 30 or so unused control characters from Latin-1. Those characters are pretty common on Windows (the home of the CP family of encodings), especially in pastes from Word. It’s nice to be able to pick those up essentially for free.

Still, Karl’s done more than that. He also updated the encoding detection to do a better job at detecting UTF-8. This is the real default. Pod::Simple only falls back on CP1252 if there are no obvious UTF-8 byte sequences in your Pod.
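
As a rough illustration of that fallback order, here is a minimal sketch using only the core Encode module. It is not Pod::Simple’s actual detection code, which is more nuanced; the guess_decode helper is hypothetical and simply tries a strict UTF-8 decode of the whole input before falling back to CP-1252:

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper, not part of Pod::Simple: prefer UTF-8, fall back to CP-1252.
sub guess_decode {
    my ($bytes) = @_;

    # A strict decode dies on malformed UTF-8; LEAVE_SRC keeps $bytes intact.
    my $text = eval {
        Encode::decode('UTF-8', $bytes, Encode::FB_CROAK() | Encode::LEAVE_SRC());
    };
    return ('UTF-8', $text) if defined $text;

    # No valid UTF-8 here, so assume CP-1252.
    return ('CP1252', Encode::decode('cp1252', $bytes));
}

my ($encoding, $text) = guess_decode("caf\xE9");   # a lone 0xE9 is not valid UTF-8
print "$encoding\n";
```

Pure ASCII input decodes the same way under either encoding, so the guess only matters once high-bit bytes show up.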

Overall these changes should be a great improvement. Better encoding support is always a good idea. But it is a pretty significant change, including a change to the Pod spec. Hence the test release. Please make sure it works well with your code by installing it today:

cpan D/DW/DWHEELER/Pod-Simple-3.29_3.tar.gz
cpanm DWHEELER/Pod-Simple-3.29_3.tar.gz

Oh, and one last thing: If Pod::Simple fails to properly recognize the encoding in your Pod file, you can always use the =encoding command early in your Pod file to make it explicit:

=encoding CP1254

Build Modern Perl RPMs with rpmcpan


We’ve been using the CentOS Perl RPMs at iovation to run all of our Perl applications. This has been somewhat painful, because the version of Perl, 5.10.1, is quite old — it shipped in August 2009. In fact, it consists mostly of bug fixes against Perl 5.10.0, which shipped in December 2007! Many of the modules provided by CentOS core and EPEL are quite old, as well, and we had built up quite the collection of customized module RPMs managed by a massive spaghetti-coded Jenkins job. When we recently ran into a Unicode issue that would best have been addressed by running a more modern Perl — rather than a hinky workaround — I finally sat down and knocked out a way to get a solid set of Modern Perl and related CPAN RPMs.

I gave it the rather boring name rpmcpan, and now you can use it, too. Turns out, DevOps doesn’t myopically insist on using core RPMs in the name of some abstract idea about stability. Rather, we just need a way to easily deploy our stuff as RPMs. If the same applies to your organization, you can get Modern Perl RPMs, too.

Here’s how we do it. We have a new Jenkins job that runs both nightly and whenever the rpmcpan Git repository updates. It uses the MetaCPAN API to build the latest versions of everything we need. Here’s how to get it to build the latest version of Perl, 5.20.1:

./bin/rpmcpan --version 5.20.1

That will get you a nice, modern Perl RPM, named perl520, completely encapsulated in /usr/local/perl520. Want 5.18 instead? Just change the version:

./bin/rpmcpan --version 5.18.2

That will give you perl518. But that’s not all. You want to build CPAN distributions against that version. Easy. Just edit the dists.json file. Its contents are a JSON object where the keys name CPAN distributions (not modules), and the values are objects that customize how our RPMs get built. Most of the time, the objects can be empty:

    "Try-Tiny": {}

This results in an RPM named perl520-Try-Tiny (or perl518-Try-Tiny, etc.). Sometimes you might need additional information to customize the CPAN spec file generated to build the distribution. For example, since this is Linux, we need to exclude a Win32 dependency in the Encode-Locale distribution:

    "Encode-Locale": { "exclude_requires": ["Win32::Console"] }

Other distributions might require additional RPMs or environment variables, like DBD-Pg, which requires the PostgreSQL RPMs:

    "DBD-Pg": {
        "build_requires": ["postgresql93-devel", "postgresql93"],
        "environment": { "POSTGRES_HOME": "/usr/pgsql-9.3" }
    }

See the README for a complete list of customization options. Or just get started with our dists.json file, which so far builds the bare minimum we need for one of our Perl apps. Add new distributions? Send a pull request! We’ll be doing so as we integrate more of our Perl apps with a Modern Perl and leave the sad RPM past behind.
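
To make the naming convention concrete, here is a small sketch in Perl that derives RPM names from a Perl version and a dists.json-style document. This is not rpmcpan’s own code; rpm_prefix and rpm_names are hypothetical helpers, and the rule (major and minor version joined after “perl”) is inferred from the perl520 and perl518 examples above.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Hypothetical helper: "5.20.1" becomes "perl520".
sub rpm_prefix {
    my ($version) = @_;
    my ($major, $minor) = split /\./, $version;
    return "perl$major$minor";
}

# Hypothetical helper: list the RPM name for every distribution key.
sub rpm_names {
    my ($version, $json) = @_;
    my $dists  = decode_json($json);
    my $prefix = rpm_prefix($version);
    return map { "$prefix-$_" } sort keys %$dists;
}

my $json = '{ "Try-Tiny": {}, "Encode-Locale": { "exclude_requires": ["Win32::Console"] } }';
print "$_\n" for rpm_names('5.20.1', $json);
```

Note that the keys are distribution names, so they already use hyphens rather than the double colons of module names.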


Localize Your Perl Apps with this One Weird Trick

Nota Bene: This is a republication of a post that originally appeared in the 2013 Perl Advent Calendar.

These days, gettext is far and away the most widely-used localization (l10n) and internationalization (i18n) library for open-source software. So far, it has not been widely used in the Perl community, even though it’s the most flexible, capable, and easy-to-use solution, thanks to Locale::TextDomain.[1] How easy? Let’s get started!

Module Internationale

First, just use Locale::TextDomain. Say you’re creating an awesome new module, Awesome::Module. The CPAN distribution will be named Awesome-Module, so that’s the “domain” to use for its localizations. Just let Locale::TextDomain know:

use Locale::TextDomain 'Awesome-Module';

Locale::TextDomain will later use this string to look for the appropriate translation catalogs. But don’t worry about that just yet. Instead, start using it to translate user-visible strings in your code. With the assistance of Locale::TextDomain’s comprehensive documentation, you’ll find it second nature to internationalize your modules in no time. For example, simple strings are denoted with __:

say __ 'Greetings puny human!';

If you need to specify variables, use __x:

say __x(
   'Thank you {sir}, may I have another?',
   sir => $username,
);

Need to manage plurals? Use __n:

say __n(
    'I will not buy this record, it is scratched.',
    'I will not buy these records, they are scratched.',
    $num_records,
);

If $num_records is 1, the first phrase will be used. Otherwise the second.

Sometimes you gotta do both, mix variables and plurals. __nx has got you covered there:

say __nx(
    'One item has been grokked.',
    '{count} items have been grokked.',
    $num_items,
    count => $num_items,
);

Congratulations! Your module is now internationalized. Wasn’t that easy? Make a habit of using these functions in all the modules in your distribution, always with the Awesome-Module domain, and you’ll be set.
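
If it helps to picture what the placeholder and plural functions are doing, here is a minimal pure-Perl sketch. It is not Locale::TextDomain’s implementation: expand_placeholders and choose_and_expand are hypothetical stand-ins, there is no catalog lookup, and the plural choice hard-codes the English rule.

```perl
use strict;
use warnings;

# Hypothetical stand-in for __x: replace {name} placeholders with values.
sub expand_placeholders {
    my ($msg, %vars) = @_;
    $msg =~ s/\{(\w+)\}/exists $vars{$1} ? $vars{$1} : '{' . $1 . '}'/ge;
    return $msg;
}

# Hypothetical stand-in for __nx, using the English plural rule only.
sub choose_and_expand {
    my ($singular, $plural, $count, %vars) = @_;
    return expand_placeholders($count == 1 ? $singular : $plural, %vars);
}

print choose_and_expand(
    'One item has been grokked.',
    '{count} items have been grokked.',
    3,
    count => 3,
), "\n";
```

The real functions first look the message up in the translation catalog for the current locale, and the catalog decides how many plural forms a language has. That is why the count is passed separately from the named variables.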

Encode da Code

Locale::TextDomain is great, but it dates from a time when Perl character encoding was, shall we say, sub-optimal. It therefore took it upon itself to try to do the right thing, which is to detect the locale from the runtime environment and automatically encode as appropriate. Which might work okay if all you ever do is print localized messages, and never anything else.

If, on the other hand, you will be manipulating localized strings in your code, or emitting unlocalized text (such as that provided by the user or read from a database), then it’s probably best to coerce Locale::TextDomain to return Perl strings, rather than encoded bytes. There’s no formal interface for this in Locale::TextDomain, so we have to hack it a bit: set the $OUTPUT_CHARSET environment variable to “UTF-8” and then bind a filter. Don’t know what that means? Me neither. Just put this code somewhere in your distribution where it will always run early, before anything gets localized:

use Locale::Messages qw(bind_textdomain_filter);
use Encode;
BEGIN {
    $ENV{OUTPUT_CHARSET} = 'UTF-8';
    bind_textdomain_filter 'Awesome-Module' => \&Encode::decode_utf8;
}

You only have to do this once per domain. So even if you use Locale::TextDomain with the Awesome-Module domain in a bunch of your modules, the presence of this code in a single early-loading module ensures that strings will always be returned as Perl strings by the localization functions.
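
To see why the distinction between encoded bytes and Perl strings matters, here is a tiny sketch using only the core Encode module. It does not touch Locale::TextDomain at all; it just shows the same text in both forms:

```perl
use strict;
use warnings;
use Encode qw(encode_utf8 decode_utf8);

# Encoded bytes, as a lookup might hand them back without the filter.
my $bytes = encode_utf8("na\x{EF}ve");

# A Perl character string, as you get once the bytes have been decoded.
my $chars = decode_utf8($bytes);

printf "bytes: %d, characters: %d\n", length($bytes), length($chars);
# prints "bytes: 6, characters: 5"
```

Anything that counts, truncates, or pattern-matches characters will misbehave on the byte form, which is the reason to decode once, early, and encode again only on output.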

Environmental Safety

So what about output? There’s one more bit of boilerplate you’ll need to throw in. Or rather, put this into the main package that uses your modules to begin with, such as the command-line script the user invokes to run an application.

First, on the shebang line, follow Tom Christiansen’s advice and put -CAS in it (or set the $PERL_UNICODE environment variable to AS). Then use the POSIX setlocale function to set the appropriate locale for the runtime environment. How? Like this:

#!/usr/bin/perl -CAS

use v5.12;
use warnings;
use utf8;
use POSIX qw(setlocale);

# Set the locale at compile time, before Awesome::Module loads.
BEGIN {
    if ($^O eq 'MSWin32') {
        require Win32::Locale;
        setlocale POSIX::LC_ALL, Win32::Locale::get_locale();
    } else {
        setlocale POSIX::LC_ALL, '';
    }
}

use Awesome::Module;

Locale::TextDomain will notice the locale and select the appropriate translation catalog at runtime.

Is that All There Is?

Now what? Well, you could do nothing. Ship your code and those internationalized phrases will be handled just like any other string in your code.

But what’s the point of that? The real goal is to get these things translated. There are two parts to that process:

  1. Parsing the internationalized strings from your modules and creating language-specific translation catalogs, or “PO files”, for translators to edit. These catalogs should be maintained in your source code repository.

  2. Compiling the PO files into binary files, or “MO files”, and distributing them with your modules. These files should not be maintained in your source code repository.

Until a year ago, there was no Perl-native way to manage these processes. Locale::TextDomain ships with a sample Makefile demonstrating the appropriate use of the GNU gettext command-line tools, but that seemed a steep price for a Perl hacker to pay.

A better fit for the Perl hacker’s brain, I thought, is Dist::Zilla. So I wrote Dist::Zilla::LocaleTextDomain to encapsulate the use of the gettext utilities. Here’s how it works.

First, configure Dist::Zilla to compile localization catalogs for distribution by adding these lines to your dist.ini file:

[ShareDir]
[LocaleTextDomain]

There are configuration attributes for the LocaleTextDomain plugin, such as where to find the PO files and where to put the compiled MO files. In case you didn’t use your distribution name as your localization domain in your modules, for example:

use Locale::TextDomain 'com.example.perl-libawesome';

Then you’d set the textdomain attribute so that the LocaleTextDomain plugin can find the translation catalogs:

textdomain = com.example.perl-libawesome

Check out the configuration docs for details on all available attributes.

At this point, the plugin doesn’t do much, because there are no translation catalogs yet. You might see this line from dzil build, though:

[LocaleTextDomain] Skipping language compilation: directory po does not exist

Let’s give it something to do!

Locale Motion

To add a French translation file, use the msg-init command[2]:

% dzil msg-init fr
Created po/fr.po.

The msg-init command uses the GNU gettext utilities to scan your Perl source code and initialize the French catalog, po/fr.po. This file is now ready for translation! Commit it into your source code repository so your agile-minded French-speaking friends can find it. Use msg-init to create as many language files as you like:

% dzil msg-init de ja.JIS en_US.UTF-8 en_UK.UTF-8
Created po/de.po.
Created po/ja.po.
Created po/en_US.po.
Created po/en_UK.po.

Each language has its own PO file. You can even have region-specific catalogs, such as the en_US and en_UK variants here. Each time a catalog is updated, the changes should be committed to the repository, like code. This allows the latest translations to always be available for compilation and distribution. The output from dzil build now looks something like:

po/fr.po: 10 translated messages, 1 fuzzy translation, 0 untranslated messages.
po/ja.po: 10 translated messages, 1 fuzzy translation, 0 untranslated messages.
po/en_US.po: 10 translated messages, 1 fuzzy translation, 0 untranslated messages.
po/en_UK.po: 10 translated messages, 1 fuzzy translation, 0 untranslated messages.

The resulting MO files will be in the shared directory of your distribution:

% find Awesome-Module-0.01/share -type f

From here Module::Build or ExtUtils::MakeMaker will install these MO files with the rest of your distribution, right where Locale::TextDomain can find them at runtime. The PO files, on the other hand, won’t be used at all, so you might as well exclude them from the distribution. Add this line to your MANIFEST.SKIP to prevent the po directory and its contents from being included in the distribution:

^po/

Mergers and Acquisitions

Of course no code base is static. In all likelihood, you’ll change your code and end up adding, editing, and removing localizable strings as a result. You’ll need to periodically merge these changes into all of your translation catalogs so that your translators can make the corresponding updates. That’s what the msg-merge command is for:

% dzil msg-merge
extracting gettext strings
Merging gettext strings into po/de.po
Merging gettext strings into po/en_UK.po
Merging gettext strings into po/en_US.po
Merging gettext strings into po/ja.po

This command re-scans your Perl code and updates all of the language files. Old messages will be commented-out and new ones added. Commit the changes and give your translators a holler so they can keep the awesome going.

Template Scan

The msg-init and msg-merge commands don’t actually scan your source code. Sort of lied about that. Sorry. What they actually do is merge a template file into the appropriate catalog files. If this template file does not already exist, a temporary one will be created and discarded when the initialization or merging is done.

But projects commonly maintain a permanent template file, stored in the source code repository along with the translation catalogs. For this purpose, we have the msg-scan command. Use it to create or update the template, or POT file:

% dzil msg-scan
extracting gettext strings into po/Awesome-Module.pot

From here on in, the resulting .pot file will be used by msg-init and msg-merge instead of scanning your code all over again. But keep in mind that, if you do maintain a POT file, future merges will be a two-step process: First run msg-scan to update the POT file, then msg-merge to merge its changes into the PO files:

% dzil msg-scan
extracting gettext strings into po/Awesome-Module.pot
% dzil msg-merge
Merging gettext strings into po/de.po
Merging gettext strings into po/en_UK.po
Merging gettext strings into po/en_US.po
Merging gettext strings into po/ja.po

Lost in Translation

One more thing, a note for translators. They can, of course, also use msg-scan and msg-merge to update the catalogs they’re working on. But how do they test their translations? Easy: use the msg-compile command to compile a single catalog:

% dzil msg-compile po/fr.po
[LocaleTextDomain] po/fr.po: 195 translated messages.

The resulting compiled catalog will be saved to the LocaleData subdirectory of the current directory, so it’s easily available to your app for testing. Just be sure to tell Perl to include the current directory in the search path, and set the $LANGUAGE environment variable for your language. For example, here’s how I test the [Sqitch] French catalog:

% dzil msg-compile po/fr.po
[LocaleTextDomain] po/fr.po: 148 translated messages, 36 fuzzy translations, 27 untranslated messages.
% LANGUAGE=fr perl -Ilib -CAS -I. bin/sqitch foo
"foo" n'est pas une commande valide

Just be sure to delete the LocaleData directory when you’re done — or at least don’t commit it to the repository.


This may seem like a lot of steps, and it is. But once you have the basics in place (configuring the Dist::Zilla::LocaleTextDomain plugin, setting up the “textdomain filter”, and setting the locale in the application), there are just a few habits to get into:

  • Use the functions __, __x, __n, and __nx to internationalize user-visible strings
  • Run msg-scan and msg-merge to keep the catalogs up-to-date
  • Keep your translators in the loop.

The Dist::Zilla::LocaleTextDomain plugin will do the rest.

  1. What about Locale::Maketext, you ask? It has not, alas, withstood the test of time. For details, see Nikolai Prokoschenko’s epic 2009 polemic, “On the state of i18n in Perl.” See also Steffen Winkler’s presentation, Internationalisierungs-Framework auswählen (and the English translation by Aristotle Pagaltzis), from German Perl Workshop 2010.

  2. The msg-init function — like all of the dzil msg-* commands — uses the GNU gettext utilities under the hood. You’ll need a reasonably modern version in your path, or else it won’t work.

Lexical Subroutines

Ricardo Signes:

One of the big new experimental features in Perl 5.18.0 is lexical subroutines. In other words, you can write this:

my sub quickly { ... }
my @sorted = sort quickly @list;

my sub greppy (&@) { ... }
my @grepped = greppy { ... } @input;

These two examples show cases where lexical references to anonymous subroutines would not have worked. The first argument to sort must be a block or a subroutine name, which leads to awful code like this:

sort { $subref->($a, $b) } @list

With our greppy, above, we get to benefit from the parser-affecting behaviors of subroutine prototypes.

My favorite tidbit about this feature? Because lexical subs are lexical, and method-dispatch is package-based, lexical subs are not subject to method lookup and dispatch! This just might alleviate the confusion of methods and subs, as chromatic complained about just yesterday. Probably doesn’t solve the problem for imported subs, though.
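
Here is a small sketch of that last point, assuming Perl 5.18 or later. The Counter class and its bump helper are made up for illustration. The lexical sub is callable by name inside the block that declares it, but method lookup never finds it:

```perl
use v5.18;
use warnings;
use feature 'lexical_subs';
no warnings 'experimental::lexical_subs';

package Counter {
    # Lexical sub: visible only inside this block, never via method lookup.
    my sub bump { my ($self) = @_; return ++$self->{count} }

    sub new       { return bless { count => 0 }, shift }
    sub increment { my ($self) = @_; return bump($self) }
}

my $counter = Counter->new;
say $counter->increment;                         # calls the lexical sub internally
say Counter->can('bump') ? 'method' : 'hidden';  # lexical subs are not methods
```

The feature and warnings lines are only needed on the Perl releases where lexical subs were still experimental; they are harmless on later ones.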


TPF To Revamp Grants

Alberto Simões:

Nevertheless, this lack of “lower than $3000” grant proposals, and the fact that lot of people have been discussing (and complaining) about this value being too low, the Grants Committee is starting a discussion on rewriting and reorganizing the way it works. Namely, in my personal blog I opened a discussion about the Grants Committee some time ago, and had plenty of feedback, that will be helpful for our internal discussion.

This is great news. I would love to see more and more ambitious grant proposals, as well as awards people can subsist on. I look forward to seeing the new rules.


Mopping the Moose

Stevan Little:

I spent much of last week on vacation with the family so very little actual coding got done on the p5-mop, but instead I did a lot of thinking. My next major goal for the p5-mop is to port a module written in Moose, in particular, one that uses many different Moose features. The module I have chosen to port is Bread::Board and I chose it for two reasons; first, it was the first real module that I wrote using Moose and second, it makes heavy use of a lot of Moose’s features.

I’m so happy to see Stevan making progress on the Perl 5 MOP again.


A Perl Blog

I have been unsatisfied with Just a Theory for some time. I started that blog in 2004 more or less for fun, thinking it would be my permanent home on the internet. And it has been. But the design, while okay in 2004, is just awful by today’s standards. A redesign is something I have planned to do for quite some time.

I had also been thinking about my audience. Or rather, audiences. I’ve blogged about many things, but while a few dear family members might want to read everything I ever post, most folks, I think, are interested in only a subset of topics. Readers of Just a Theory came for posts about Perl, or PostgreSQL, or culture, travel, or politics. But few came for all those topics, in my estimation.

More recently, a whole bunch of top-level domains have opened up, often with the opportunity for anyone to register them. I was lucky enough to snag theory.pm and theory.pl, thinking that perhaps I would create a site just for blogging about Perl. I also nabbed theory.so, which I might dedicate to database-related blogging, and theory.me, which would be my personal blog (travel, photography, cultural essays, etc.).

And then there is Octopress. A blogging engine for hackers. Perfect for me. Hard to imagine something more appropriate (unless it was written in Perl). It seemed like a good opportunity to partition my online blogging.

So here we are with my first partition. theory.pm is a Perl blog. Seemed like the perfect name. I fiddled with it off and on for a few months, often following Matt Gemmell’s Advice, and I’m really happy with it. The open-source fonts Source Sans Pro and Source Code Pro, from Adobe, look great. The source code examples are beautifully marked up and displayed using the Solarized color scheme (though presentation varies in feed readers). Better still, it’s equally attractive and readable on computers, tablets and phones, thanks to the foundation laid by Aron Cedercrantz’s BlogTheme.

I expect to fork this code to create a database blog soon, and then perhaps put together a personal blog. Maybe the personal blog will provide link posts for posts on the other sites, so that if anyone really wants to read everything, they can. I haven’t decided yet.

In the meantime, now that I have a dedicated Perl blog, I guess I’ll have to start writing more Perl-related stuff. I’m starting with some posts about the state of exception handling in Perl 5, the first of which is already up. Stay tuned for more.

Trying Times

Exception handling is a bit of a pain in Perl. Traditionally, we use eval {}:

eval {
    # Code that might die…
};
if (my $err = $@) {
    # Inspect $err…
}

The use of the if block is a bit unfortunate; worse is the use of the global $@ variable, which has inflicted unwarranted pain on developers over the years[1]. Many Perl hackers put Try::Tiny to work to circumvent these shortcomings:

try {
    # Code that might die…
} catch {
    # Inspect $_…
};

Alas, Try::Tiny introduces its own idiosyncrasies, particularly its use of subroutine references rather than blocks. While a necessity of a pure-Perl implementation, it prevents returning from the calling context. One must work around this deficiency by checking return values:

my $rv = try {
    # Code that might die; return a true value on success…
} catch {
   # …
};

if (!$rv) {
    # Handle the failure…
}

I can’t tell you how often this quirk burns me.
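
The quirk is easy to reproduce without Try::Tiny at all, because it comes from Perl’s sub semantics. In this sketch, run_block is a hypothetical stand-in for try: it just calls a code reference. The return inside the anonymous sub leaves only that sub, so the enclosing loop keeps going:

```perl
use strict;
use warnings;

# Hypothetical stand-in for try: call the code reference and pass back its value.
sub run_block {
    my ($code) = @_;
    return $code->();
}

sub first_even_naive {
    for my $n (@_) {
        # This return exits the anonymous sub, not first_even_naive.
        run_block(sub { return $n if $n % 2 == 0 });
    }
    return 'none';
}

sub first_even_checked {
    for my $n (@_) {
        # Work around it by inspecting the value the block handed back.
        my $found = run_block(sub { return $n % 2 == 0 });
        return $n if $found;
    }
    return 'none';
}

print first_even_naive(1, 2, 3), "\n";     # prints "none"
print first_even_checked(1, 2, 3), "\n";   # prints "2"
```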

Sadly, there is a deeper problem than syntax: Just what, exactly, is an exception? How does one determine the exceptional condition, and what can be done about it? It might be a string. The string might be localized. It might be an Exception::Class object, or a Throwable object, or a simple array reference. Or any other value a Perl scalar can hold. This lack of specificity requires careful handling of exceptions:

if (my $err = $@) {
    if (ref $err) {
        if (eval { $err->isa('Exception::Class') }) {
            if ( $err->isa('SomeException') ) {
                # …
            } elsif ( $err->isa('SomeOtherException') ) {
                # …
            } else {
                # …
            }
        } elsif (eval { $err->DOES('Throwable') }) {
            # …
        } elsif ( ref $err eq 'ARRAY') {
            # …
        }
    } else {
        if ( $err =~ /DBI/ ) {
            # …
        } elsif ( $err =~ /cannot open '([^']+)'/ ) {
            # …
        }
    }
}

Not every exception handler requires so many conditions, but I have certainly exercised all these approaches. Usually my exception handlers accrete conditions as users report new, unexpected errors.
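
One way to keep that accretion in a single place is a small classifier. This is only a sketch using the core Scalar::Util module; classify_error, its labels, and the My::Exception class are hypothetical:

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical helper: reduce "any scalar at all" to a short label.
sub classify_error {
    my ($err) = @_;
    return 'object:' . ref $err if blessed $err;
    return 'array'              if ref $err eq 'ARRAY';
    return 'reference'          if ref $err;
    return 'dbi'                if $err =~ /DBI/;
    return 'string';
}

for my $thrower (
    sub { die "cannot open 'foo.txt'\n" },
    sub { die [ 404, 'Not Found' ] },
    sub { die bless { message => 'oops' }, 'My::Exception' },
) {
    eval { $thrower->(); 1 } or print classify_error($@), "\n";
}
```

It does not make the underlying problem go away, since the caller still has to know what each label means, but it keeps the guessing out of every individual handler.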

That’s not all. My code frequently requires parsing information out of a string error. Here’s an example from PGXN::Manager:

try {
    $self->distmeta(decode_json scalar $member->contents );
} catch {
    my $f = quotemeta __FILE__;
    (my $err = $_) =~ s/\s+at\s+$f.+//ms;
        'Cannot parse JSON from “[_1]”: [_2]',
} or return;

return $self;

When JSON throws an exception on invalid JSON, the code must catch that exception to show the user. The user cares not at all what file threw the exception, nor the line number. The code must strip that stuff out before passing the original message off to a localizing error method.
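
The stripping technique on its own can be sketched with the core JSON::PP module. This is not the PGXN::Manager code; parse_json_for_user is a hypothetical helper that returns either the decoded data or a message with the file and line number removed:

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Hypothetical helper: return (data, undef) on success or (undef, message) on failure.
sub parse_json_for_user {
    my ($json) = @_;
    my $data;
    return ($data, undef) if eval { $data = decode_json($json); 1 };

    # Drop the trailing " at FILE line N." that Perl appends to the message.
    my $file = quotemeta __FILE__;
    (my $err = $@) =~ s/\s+at\s+$file.+//ms;
    return (undef, $err);
}

my ($data, $err) = parse_json_for_user('{"name": ');
print "Cannot parse JSON: $err\n" if defined $err;
```

Matching on the current file name, as the original snippet does, avoids clipping other uses of “at” inside the message itself, such as the character offset JSON parsers tend to report.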


It’s time to end this. A forthcoming post will propose a plan for adding proper exception handling to the core Perl language, including exception objects and an official try/catch syntax.

  1. In fairness much of the $@ pain has been addressed in Perl 5.14.

Sqitch on Windows (and Linux, Solaris, and OS X)

Thanks to the hard-working hamsters at the ActiveState PPM Index, Sqitch is available for installation on Windows. According to the Sqitch PPM Build Status, the latest version is now available for installation. All you have to do is:

  1. Download and install ActivePerl
  2. Open the Command Prompt
  3. Type ppm install App-Sqitch

As of this writing, only PostgreSQL is supported, so you will need to install PostgreSQL.

But otherwise, that’s it. In fact, this incantation works for any OS that ActivePerl supports. Here’s where you can find the sqitch executable on each:

  • Windows: C:\perl\site\bin\sqitch.bat
  • Mac OS X: ~/Library/ActivePerl-5.16/site/bin/sqitch (Or /usr/local/ActivePerl-5.16/site/bin if you run sudo ppm)
  • Linux: /opt/ActivePerl-5.16/site/bin/sqitch
  • Solaris/SPARC (Business edition-only): /opt/ActivePerl-5.16/site/bin/sqitch

This makes it easy to get started with Sqitch on any of those platforms without having to become a Perl expert. So go for it, and then get started with the tutorial!


Dist::Zilla::LocaleTextDomain for Translators

Here’s a followup on my post about localizing Perl modules with Locale::TextDomain. Dist::Zilla::LocaleTextDomain was great for developers, less so for translators. A Sqitch translator asked how to test the translation file he was working on. My only reply was to compile the whole module, then install it and test it. Ugh.

Today, I released Dist::Zilla::LocaleTextDomain v0.85 with a new command, msg-compile. This command allows translators to easily compile just the file they’re working on and nothing else. For pure Perl modules in particular, it’s pretty easy to test then. By default, the compiled catalog goes into ./LocaleData, where convincing the module to find it is simple. For example, I updated the test sqitch app to take advantage of this. Now, to test, say, the French translation file, all the translator has to do is:

> dzil msg-compile po/fr.po
[LocaleTextDomain] po/fr.po: 155 translated messages, 24 fuzzy translations, 16 untranslated messages.

> LANGUAGE=fr ./t/sqitch foo
"foo" n'est pas une commande valide

I hope this simplifies things for translators. See the notes for translators for a few more words on the subject.


Localize Your Perl modules with Locale::TextDomain and Dist::Zilla

I’ve just released Dist::Zilla::LocaleTextDomain v0.80 to the CPAN. This module adds support for managing Locale::TextDomain-based localization and internationalization in your CPAN libraries. I wanted to make it as simple as possible for CPAN developers to do localization and to support translators in their projects, and Dist::Zilla seemed like the perfect place to do it, since it has hooks to generate the necessary binary files for distribution.

Starting out with Locale::TextDomain was decidedly non-intuitive for me, as a Perl hacker, likely because of its gettext underpinnings. Now that I’ve got a grip on it and created the Dist::Zilla support, I think it’s pretty straight-forward. To demonstrate, I wrote the following brief tutorial, which constitutes the main documentation for the Dist::Zilla::LocaleTextDomain distribution. I hope it makes it easier for you to get started localizing your Perl libraries.

Localize Your Perl modules with Locale::TextDomain and Dist::Zilla

Locale::TextDomain provides a nice interface for localizing your Perl applications. The tools for managing translations, however, are a bit arcane. Fortunately, you can just use this plugin and get all the tools you need to scan your Perl libraries for localizable strings, create a language template, and initialize translation files and keep them up-to-date. All this is assuming that your system has the gettext utilities installed.

The Details

I put off learning how to use Locale::TextDomain for quite a while because, while the gettext tools are great for translators, the tools for the developer were a little more opaque, especially for Perlers used to Locale::Maketext. But I put in the effort while hacking Sqitch. As I had hoped, using it in my code was easy. Using it for my distribution was harder, so I decided to write Dist::Zilla::LocaleTextDomain to make life simpler for developers who manage their distributions with Dist::Zilla.

What follows is a quick tutorial on using Locale::TextDomain in your code and managing it with Dist::Zilla::LocaleTextDomain.

This is my domain

First thing to do is to start using Locale::TextDomain in your code. Load it into each module with the name of your distribution, as set by the name attribute in your dist.ini file. For example, if your dist.ini looks something like this:

name    = My-GreatApp
author  = Homer Simpson <homer@example.com>
license = Perl_5

Then, in your Perl libraries, load Locale::TextDomain like this:

use Locale::TextDomain qw(My-GreatApp);

Locale::TextDomain uses this value to find localization catalogs, so naturally Dist::Zilla::LocaleTextDomain will use it to put those catalogs in the right place.

Okay, so it’s loaded, how do you use it? The documentation of the Locale::TextDomain exported functions is quite comprehensive, and I think you’ll find it pretty simple once you get used to it. For example, simple strings are denoted with __:

say __ 'Hello';

If you need to specify variables, use __x:

say __x(
    'You selected the color {color}',
    color => $color
);

Need to deal with plurals? Use __n:

say __n(
    'One file has been deleted',
    'All files have been deleted',
    $num_files,
);

And then you can mix variables with plurals with __nx:

say __nx(
    'One file has been deleted.',
    '{count} files have been deleted.',
    $num_files,
    count => $num_files,
);

Pretty simple, right? Get to know these functions, and just make it a habit to use them in user-visible messages in your code. Even if you never expect to translate those messages, just by doing this you make it easier for someone else to come along and start translating for you.

The setup

Now you’re localizing your code. Great! What’s next? Officially, nothing. If you never do anything else, your code will always emit the messages as written. You can ship it and things will work just as if you had never done any localization.

But what’s the fun in that? Let’s set things up so that translation catalogs will be built and distributed once they’re written. Add these lines to your dist.ini:


There are actually quite a few attributes you can set here to tell the plugin where to find language files and where to put them. For example, if you used a domain different from your distribution name, e.g.,

use Locale::TextDomain 'com.example.My-GreatApp';

Then you would need to set the textdomain attribute so that the LocaleTextDomain plugin does the right thing with the language files:

textdomain = com.example.My-GreatApp

Consult the LocaleTextDomain configuration docs for details on all available attributes.

(Special note until this Locale::TextDomain patch is merged: set the share_dir attribute to lib instead of the default value, share. If you use Module::Build, you will also need a subclass to do the right thing with the catalog files; see “Installation” in Dist::Zilla::Plugin::LocaleTextDomain for details.)

What does including the plugin do? Mostly nothing. You might see this line from dzil build, though:

[LocaleTextDomain] Skipping language compilation: directory po does not exist

Now at least you know it was looking for something to compile for distribution. Let’s give it something to find.

Initialize languages [Initialize-languages]

To add translation files, use the msg-init command:

> dzil msg-init de
Created po/de.po.

At this point, the gettext utilities will need to be installed and visible in your path, or else you’ll get errors.

This command scans all of the Perl modules gathered by Dist::Zilla and initializes a German translation file, named po/de.po. This file is now ready for your German-speaking public to start translating. Check it into your source code repository so they can find it. Create as many language files as you like:

> dzil msg-init fr ja.JIS en_US.UTF-8
Created po/fr.po.
Created po/ja.po.
Created po/en_US.po.

As you can see, each language results in the generation of the appropriate file in the po directory, sans encoding (i.e., no .UTF-8 in the en_US file name).
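The naming rule is simple enough to sketch in a few lines of Perl. This is only an illustration of the mapping shown above, not the plugin’s actual code, and po_file_for is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: map a locale such as "en_US.UTF-8" to "po/en_US.po"
# by dropping the encoding suffix, if there is one.
sub po_file_for {
    my ($locale, $dir) = @_;
    $dir = 'po' unless defined $dir;
    (my $lang = $locale) =~ s/\..*\z//;   # strip ".UTF-8", ".JIS", etc.
    return "$dir/$lang.po";
}

print po_file_for($_), "\n" for qw(fr ja.JIS en_US.UTF-8);
# prints:
# po/fr.po
# po/ja.po
# po/en_US.po
```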

Now let your translators go wild with all the languages they speak, as well as the regional dialects. (Don’t forget to colour your code with en_GB translations!)

Once you have translations and they’re committed to your repository, when you build your distribution, the language files will automatically be compiled into binary catalogs. You’ll see this line output from dzil build:

[LocaleTextDomain] Compiling language files in po
po/fr.po: 10 translated messages, 1 fuzzy translation, 0 untranslated messages.
po/ja.po: 10 translated messages, 1 fuzzy translation, 0 untranslated messages.
po/en_US.po: 10 translated messages, 1 fuzzy translation, 0 untranslated messages.

You’ll then find the catalogs in the shared directory of your distribution:

> find My-GreatApp-0.01/share -type f

These binary catalogs will be installed as part of the distribution just where Locale::TextDomain can find them.

Here’s an optional tweak: add this line to your MANIFEST.SKIP:

^po/

This prevents the po directory and its contents from being included in the distribution. Sure, you can include them if you like, but they’re not required for the running of your app; the generated binary catalog files are all you need. Might as well leave out the translation files.

Mergers and acquisitions [Mergers-and-acquisitions]

You’ve got translation files, and helpful translators have given them a workover. What happens when you change your code, add new messages, or modify existing ones? The translation files need to be updated periodically with those changes, so that your translators can deal with them. We’ve got you covered with the msg-merge command:

> dzil msg-merge
extracting gettext strings
Merging gettext strings into po/de.po
Merging gettext strings into po/en_US.po
Merging gettext strings into po/ja.po

This will scan your module files again and update all of the translation files with any changes. Old messages will be commented-out and new ones added. Just commit the changes to your repository and notify the translation army that they’ve got more work to do.

If for some reason you need to update only a subset of language files, you can simply list them on the command-line:

> dzil msg-merge po/de.po po/en_US.po
Merging gettext strings into po/de.po
Merging gettext strings into po/en_US.po

What’s the scan, man [Whats-the-scan-man]

Both the msg-init and msg-merge commands depend on a translation template file to create and merge language files. Thus far, this has been invisible: they will create a temporary template file to do their work, and then delete it when they’re done.

However, it’s common to also store the template file in your repository and to manage it directly, rather than implicitly. If you’d like to do this, the msg-scan command will scan the Perl module files gathered by Dist::Zilla and make it for you:

> dzil msg-scan
extracting gettext strings into po/My-GreatApp.pot

The resulting .pot file will then be used by msg-init and msg-merge rather than scanning your code all over again. This actually then makes msg-merge a two-step process: You need to update the template before merging. Updating the template is done by exactly the same command, msg-scan:

> dzil msg-scan
extracting gettext strings into po/My-GreatApp.pot
> dzil msg-merge
Merging gettext strings into po/de.po
Merging gettext strings into po/en_US.po
Merging gettext strings into po/ja.po

Ship It! [Ship-It-]

And that’s all there is to it. Go forth and localize and internationalize your Perl apps!


My thanks to Ricardo Signes for invaluable help plugging in to Dist::Zilla, to Guido Flohr for providing feedback on this tutorial and being open to my pull requests, to David Golden for I/O capturing help, and to Jérôme Quelin for his patience as I wrote code to do the same thing as Dist::Zilla::Plugin::LocaleMsgfmt without ever noticing that it already existed.


Use of DBI in Sqitch

Sqitch uses the native database client applications (psql, sqlite3, mysql, etc.). So for tracking metadata about the state of deployments, I have been trying to stick to using them. I’m first targeting PostgreSQL, and as a result need to open a connection to psql, start a transaction, and be able to read and write stuff to it as migrations go along. The IPC is a huge PITA. Furthermore, getting things properly quoted is also pretty annoying — and it will be worse for SQLite and MySQL, I expect (psql’s --set support is pretty slick).

If, on the other hand, I used the DBI, all this would be very easy. There is no IPC, just a direct connection to the database. It would save me a ton of development time, and be more robust and safer to use (e.g., exception handling rather than platform-dependent signal handling (or not, in the case of Windows)). I am quite tempted to just do that.

However, I have been trying to be sensitive to dependencies. I had planned to make Sqitch simple to install on any system, and if you had the command-line client for your preferred database, it would just work. If I used the DBI instead, then Sqitch would not work at all unless you installed the appropriate DBI driver for your database of choice. This is no big deal for Perl people, of course, but I don’t want this to be a Perl people tool. I want it to be dead simple for anyone to use for any database. Ideally, there will be RPMs and Ubuntu packages, so one can just install it and go, and not have to worry about figuring out what additional Perl DBD to install for your database of choice. It should be transparent.

That is still my goal, but at this point the IPC requirements for controlling the clients are driving me a little crazy. Should I just give up and use the DBI (at least for now)? Or persevere with the IPC stuff and get it to work? Opinions wanted!


Today on the Perl Advent Calendar

Hey look everybody, I wrote today’s Perl Advent Calendar post, Less Tedium, More Transactions. Go read it!


DBIx::Connector and Serializable Snapshot Isolation

I was at Postgres Open week before last. This was a great conference, with a very welcoming atmosphere and lots of great talks. One of the more significant, for me, was the session on serializable transactions by Kevin Grittner, who developed SSI for PostgreSQL 9.1. I hadn’t paid much attention to this feature before now, but it became clear to me, during the talk, that it’s time.

So what is SSI? Well, serializable transactions are almost certainly how you think of transactions already. Here’s how Kevin describes them:

True serializable transactions can simplify software development. Because any transaction which will do the right thing if it is the only transaction running will also do the right thing in any mix of serializable transactions, the programmer need not understand and guard against all possible conflicts. If this feature is used consistently, there is no need to ever take an explicit lock or SELECT FOR UPDATE/SHARE.

This is, in fact, generally how I’ve thought about transactions. But I’ve certainly run into cases where it wasn’t true. Back in 2006, I wrote an article on managing many-to-many relationships with PL/pgSQL which demonstrated a race condition one might commonly find when using an ORM. The solution I offered was to always use a PL/pgSQL function that does the work, and that function executes a SELECT...FOR UPDATE statement to overcome the race condition. This creates a lock that forces conflicting transactions to be performed serially.

Naturally, this is something one would rather not have to think about. Hence SSI. When you identify a transaction as serializable, it will be executed in a truly serializable fashion. So I could actually do away with the SELECT...FOR UPDATE workaround — not to mention any other race conditions I might have missed — simply by telling PostgreSQL to enforce transaction isolation. This essentially eliminates the possibility of unexpected side-effects.

This comes at a cost, however. Not in terms of performance so much, since the SSI implementation uses some fancy, recently-developed algorithms to keep things efficient. (Kevin tells me via IRC: “Usually the rollback and retry work is the bulk of the additional cost in an SSI load, in my testing so far. A synthetic load to really stress the LW locking, with a fully-cached database doing short read-only transactions will have no serialization failures, but can run up some CPU time in LW lock contention.”) No, the cost is actually in increased chance of transaction rollback. Because SSI will catch more transaction conflicts than the traditional “read committed” isolation level, frameworks that expect to work with SSI need to be prepared to handle more transaction failures. From the fine manual:

The Serializable isolation level provides the strictest transaction isolation. This level emulates serial transaction execution, as if transactions had been executed one after another, serially, rather than concurrently. However, like the Repeatable Read level, applications using this level must be prepared to retry transactions due to serialization failures.

And that brings me to DBIx::Connector, my Perl module for safe connection and transaction management. It currently has no such retry smarts built into it. The feature closest to that is the “fixup” connection mode, wherein if execution of a code block fails due to a connection failure, DBIx::Connector will re-connect to the database and execute the code reference again.

I think I should extend DBIx::Connector to take isolation failures and deadlocks into account. That is, fixup mode would retry a code block not only on connection failure but also on serialization failure (SQLSTATE 40001) and deadlocks (SQLSTATE 40P01). I would also add a new attribute, retries, to specify the number of times to retry such execution, with a default of three (which likely will cover the vast majority of cases). This has actually been an oft-requested feature, and I’m glad to have a new reason to add it.
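To make the idea concrete, here is a minimal sketch of that retry loop in plain Perl. It is not DBIx::Connector code and not a proposed API: run_with_retries is a hypothetical helper, and the failures are simulated by dying with a hash reference that carries a state key. In real code the SQLSTATE would more likely come from the DBI handle’s state method.

```perl
use strict;
use warnings;

# SQLSTATE codes named in the text: serialization failure and deadlock.
my %RETRYABLE = map { $_ => 1 } qw(40001 40P01);

# Hypothetical helper, not part of DBIx::Connector: run $code and retry it
# up to $retries times when it dies with a retryable SQLSTATE.
sub run_with_retries {
    my ($code, $retries) = @_;
    $retries = 3 unless defined $retries;   # default suggested in the text
    my $attempt = 0;
    while (1) {
        my $ret;
        my $ok = eval { $ret = $code->(); 1 };
        return $ret if $ok;
        my $err   = $@;
        my $state = ref($err) eq 'HASH' ? $err->{state} : undef;
        # Rethrow anything that is not retryable, or once retries run out.
        die $err
            unless defined $state && $RETRYABLE{$state} && $attempt++ < $retries;
    }
}

# Simulated transaction: fails twice with a serialization failure, then works.
my $calls  = 0;
my $result = run_with_retries(sub {
    $calls++;
    die +{ state => '40001' } if $calls < 3;
    return "committed after $calls attempts";
});
print "$result\n";
# prints "committed after 3 attempts"
```

With the default of three retries, the code reference runs at most four times: the initial attempt plus three retries. Any other exception propagates immediately.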

There are a few design issues to overcome, however:

  • Fixup mode is supported not just by txn(), which scopes the execution of a code reference to a single transaction, but also run(), which does no transaction handling. Should the new retry support be added there, too? I could see it either way (a single SQL statement executed in run() is implicitly transaction-scoped).
  • Fixup mode is also supported by svp(), which scopes the execution of a code reference to a savepoint (a.k.a. a subtransaction). Should the rollback and retry be supported there, too, or would the whole transaction have to be retried? I’m thinking the latter, since that’s currently the behavior for connection failures.
  • Given these issues, will it make more sense to perhaps create a new mode? Maybe it would be supported only by txn().

This is doable; it will likely just take some experimentation to figure it out and settle on the appropriate API. I’ll need to find the tuits for that soon.

In the meantime, given currently in-progress changes, I’ve just released a new version of DBIx::Connector with a single change: All uses of the deprecated catch syntax now throw warnings. The previous version threw warnings only the first time the syntax was used in a particular context, to keep error logs from getting clogged up. Hopefully most folks have changed their code in the two months since the previous release and switched to Try::Tiny or some other model for exception handling. The catch syntax will be completely removed in the next release of DBIx::Connector, likely around the end of the year. Hopefully the new SSI-aware retry functionality will have been integrated by then, too.

In a future post I’ll likely chew over whether or not to add an API to set the transaction isolation level within a call to txn() and friends.


Up for Adoption: SVN::Notify

I’ve kept my various Perl modules in a Subversion server run by my Bricolage support company, Kineticode, for many years. However, I’m having to shut down the server I’ve used for all my services, including Subversion, so I’ve moved them all to GitHub. As such, I no longer use Subversion in my day-to-day work.

It no longer seems appropriate that I maintain SVN::Notify. This has probably been my most popular module, and I know that it’s used a lot. It’s also relatively stable, with few bug reports or complaints. Nevertheless, there certainly could be some things that folks want to add, like TLS support, I18N, and inline CSS.

Therefore, SVN::Notify is formally up for adoption. If you’re a Subversion user, it’s a great tool. Just look at this sample output. If you’d like to take over maintenance and make it even better, please get in touch. Leave a comment on this post, or @theory me on Twitter, or send an email.

PS: Would love it if someone also could take over activitymail, the CVS notification script from which SVN::Notify was derived — and which I have even less right to maintain, given that I haven’t used CVS in years.
