Just a Theory

By David E. Wheeler

Posts about CPAN

Wanted: New SVN::Notify Maintainer

I’ve used Subversion very occasionally since 2009, and SVN::Notify not at all. Over the years, I’ve fixed minor issues with it now and then, and made a couple of releases to address issues fixed by others. But it’s past the point where I feel qualified to maintain it. Hell, the repository for SVN::Notify has been hosted on GitHub ever since 2011. I don’t have an instance of Subversion against which to test it; nor do I have any SMTP servers to throw test messages at.

In short, it’s past time I relinquished maintenance of this module to someone with a vested interest in its continued use. Is that you? Do you need to keep SVN::Notify running for your projects, and have a few TUITs to fix the occasional bug or security issue? If so, drop me a line (david @ this domain). I’d be happy to transfer the repository.

Build Modern Perl RPMs with rpmcpan

iovation + Perl = Love

We’ve been using the CentOS Perl RPMs at iovation to run all of our Perl applications. This has been somewhat painful, because the version of Perl, 5.10.1, is quite old — it shipped in August 2009. In fact, it consists mostly of bug fixes against Perl 5.10.0, which shipped in December 2007! Many of the modules provided by CentOS core and EPEL are quite old, as well, and we had built up quite the collection of customized module RPMs managed by a massive spaghetti-coded Jenkins job. When we recently ran into a Unicode issue that would best have been addressed by running a more modern Perl — rather than a hinky workaround — I finally sat down and knocked out a way to get a solid set of Modern Perl and related CPAN RPMs.

I gave it the rather boring name rpmcpan, and now you can use it, too. Turns out, DevOps doesn’t myopically insist on using core RPMs in the name of some abstract idea about stability. Rather, we just need a way to easily deploy our stuff as RPMs. If the same applies to your organization, you can get Modern Perl RPMs, too.

Here’s how we do it. We have a new Jenkins job that runs both nightly and whenever the rpmcpan Git repository updates. It uses the MetaCPAN API to build the latest versions of everything we need. Here’s how to get it to build the latest version of Perl, 5.20.1:

./bin/rpmcpan --version 5.20.1

That will get you a nice, modern Perl RPM, named perl520, completely encapsulated in /usr/local/perl520. Want 5.18 instead? Just change the version:

./bin/rpmcpan --version 5.18.2

That will give you perl518. But that’s not all. You want to build CPAN distributions against that version. Easy. Just edit the dists.json file. Its contents are a JSON object where the keys name CPAN distributions (not modules), and the values are objects that customize how our RPMs get built. Most of the time, the objects can be empty:

{
    "Try-Tiny": {}
}

This results in an RPM named perl520-Try-Tiny (or perl518-Try-Tiny, etc.). Sometimes you might need additional information to customize the CPAN spec file generated to build the distribution. For example, since this is Linux, we need to exclude a Win32 dependency in the Encode-Locale distribution:

{
    "Encode-Locale": { "exclude_requires": ["Win32::Console"] }
}

Other distributions might require additional RPMs or environment variables, like DBD-Pg, which requires the PostgreSQL RPMs:

{
    "DBD-Pg": {
        "build_requires": ["postgresql93-devel", "postgresql93"],
        "environment": { "POSTGRES_HOME": "/usr/pgsql-9.3" }
    }
}
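Taken together, the three entries above can live in a single dists.json object. As a rough sketch of how the keys map to RPM names, the following script lists the names such a file would yield for a given Perl version. The rpm_prefix and rpm_names helpers are hypothetical and not part of rpmcpan; the prefix rule (concatenate the first two version components) is an assumption inferred from the perl520 and perl518 examples.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Hypothetical helper: derive the RPM prefix from a Perl version string,
# assuming the perl520/perl518 pattern shown above.
sub rpm_prefix {
    my ($version) = @_;
    my ($major, $minor) = split /\./, $version;
    return "perl$major$minor";
}

# Hypothetical helper: list the RPM names a dists.json document would yield.
sub rpm_names {
    my ($version, $json) = @_;
    my $dists  = decode_json($json);
    my $prefix = rpm_prefix($version);
    return map { "$prefix-$_" } sort keys %$dists;
}

# The three example entries combined into one object.
my $json = <<'JSON';
{
    "Try-Tiny": {},
    "Encode-Locale": { "exclude_requires": ["Win32::Console"] },
    "DBD-Pg": {
        "build_requires": ["postgresql93-devel", "postgresql93"],
        "environment": { "POSTGRES_HOME": "/usr/pgsql-9.3" }
    }
}
JSON

print "$_\n" for rpm_names('5.20.1', $json);
```

Run as written, this prints perl520-DBD-Pg, perl520-Encode-Locale, and perl520-Try-Tiny, one per line.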

See the README for a complete list of customization options. Or just get started with our dists.json file, which so far builds the bare minimum we need for one of our Perl apps. Add new distributions? Send a pull request! We’ll be doing so as we integrate more of our Perl apps with a Modern Perl and leave the sad RPM past behind.

Localize Your Perl modules with Locale::TextDomain and Dist::Zilla

I’ve just released Dist::Zilla::LocaleTextDomain v0.80 to the CPAN. This module adds support for managing Locale::TextDomain-based localization and internationalization in your CPAN libraries. I wanted to make it as simple as possible for CPAN developers to do localization and to support translators in their projects, and Dist::Zilla seemed like the perfect place to do it, since it has hooks to generate the necessary binary files for distribution.

Starting out with Locale::TextDomain was decidedly non-intuitive for me, as a Perl hacker, likely because of its gettext underpinnings. Now that I’ve got a grip on it and created the Dist::Zilla support, I think it’s pretty straight-forward. To demonstrate, I wrote the following brief tutorial, which constitutes the main documentation for the Dist::Zilla::LocaleTextDomain distribution. I hope it makes it easier for you to get started localizing your Perl libraries.

Localize Your Perl modules with Locale::TextDomain and Dist::Zilla

Locale::TextDomain provides a nice interface for localizing your Perl applications. The tools for managing translations, however, are a bit arcane. Fortunately, you can just use this plugin and get all the tools you need to scan your Perl libraries for localizable strings, create a language template, and initialize translation files and keep them up-to-date. All this is assuming that your system has the gettext utilities installed.

The Details

I put off learning how to use Locale::TextDomain for quite a while because, while the gettext tools are great for translators, the tools for the developer were a little more opaque, especially for Perlers used to Locale::Maketext. But I put in the effort while hacking Sqitch. As I had hoped, using it in my code was easy. Using it for my distribution was harder, so I decided to write Dist::Zilla::LocaleTextDomain to make life simpler for developers who manage their distributions with Dist::Zilla.

What follows is a quick tutorial on using Locale::TextDomain in your code and managing it with Dist::Zilla::LocaleTextDomain.

This is my domain

First thing to do is to start using Locale::TextDomain in your code. Load it into each module with the name of your distribution, as set by the name attribute in your dist.ini file. For example, if your dist.ini looks something like this:

name    = My-GreatApp
author  = Homer Simpson <homer@example.com>
license = Perl_5

Then, in your Perl libraries, load Locale::TextDomain like this:

use Locale::TextDomain qw(My-GreatApp);

Locale::TextDomain uses this value to find localization catalogs, so naturally Dist::Zilla::LocaleTextDomain will use it to put those catalogs in the right place.

Okay, so it’s loaded, how do you use it? The documentation of the Locale::TextDomain exported functions is quite comprehensive, and I think you’ll find it pretty simple once you get used to it. For example, simple strings are denoted with __:

say __ 'Hello';

If you need to specify variables, use __x:

say __x(
    'You selected the color {color}',
    color => $color
);

Need to deal with plurals? Use __n:

say __n(
    'One file has been deleted',
    'All files have been deleted',
    $num_files,
);

And then you can mix variables with plurals with __nx:

say __nx(
    'One file has been deleted.',
    '{count} files have been deleted.',
    $num_files,
    count => $num_files,
);

Pretty simple, right? Get to know these functions, and just make it a habit to use them in user-visible messages in your code. Even if you never expect to translate those messages, just by doing this you make it easier for someone else to come along and start translating for you.
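To make the {placeholder} and plural conventions concrete, here is a toy stand-in written in plain Perl. It is only a sketch: expand and expand_plural are hypothetical names, they perform no catalog lookup, and the plural choice hard-codes the English rule rather than using whatever rule a translation catalog declares. It is not Locale::TextDomain's implementation.

```perl
use strict;
use warnings;

# Toy stand-in for the {placeholder} expansion used with __x.
# Not Locale::TextDomain's code: there is no translation lookup here.
sub expand {
    my ($msg, %vars) = @_;
    # Replace {name} only when a value was supplied; leave others untouched.
    $msg =~ s/\{(\w+)\}/exists $vars{$1} ? $vars{$1} : "{$1}"/ge;
    return $msg;
}

# Toy stand-in for __nx, assuming the English rule:
# use the singular form only when the count is exactly 1.
sub expand_plural {
    my ($singular, $plural, $count, %vars) = @_;
    return expand($count == 1 ? $singular : $plural, %vars);
}

print expand_plural(
    'One file has been deleted.',
    '{count} files have been deleted.',
    3,
    count => 3,
), "\n";
```

With a count of 3 this prints "3 files have been deleted."; with a count of 1 it would print the singular message unchanged.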

The setup

Now you’re localizing your code. Great! What’s next? Officially, nothing. If you never do anything else, your code will always emit the messages as written. You can ship it and things will work just as if you had never done any localization.

But what’s the fun in that? Let’s set things up so that translation catalogs will be built and distributed once they’re written. Add these lines to your dist.ini:

[ShareDir]
[LocaleTextDomain]

There are actually quite a few attributes you can set here to tell the plugin where to find language files and where to put them. For example, if you used a domain different from your distribution name, e.g.,

use Locale::TextDomain 'com.example.My-GreatApp';

Then you would need to set the textdomain attribute so that the LocaleTextDomain plugin does the right thing with the language files:

[LocaleTextDomain]
textdomain = com.example.My-GreatApp

Consult the LocaleTextDomain configuration docs for details on all available attributes.

(Special note until this Locale::TextDomain patch is merged: set the share_dir attribute to lib instead of the default value, share. If you use Module::Build, you will also need a subclass to do the right thing with the catalog files; see “Installation” in Dist::Zilla::Plugin::LocaleTextDomain for details.)

What does including the plugin do? Mostly nothing. You might see this line from dzil build, though:

[LocaleTextDomain] Skipping language compilation: directory po does not exist

Now at least you know it was looking for something to compile for distribution. Let’s give it something to find.

Initialize languages

To add translation files, use the msg-init command:

> dzil msg-init de
Created po/de.po.

At this point, the gettext utilities will need to be installed and visible in your path, or else you’ll get errors.

This command scans all of the Perl modules gathered by Dist::Zilla and initializes a German translation file, named po/de.po. This file is now ready for your German-speaking public to start translating. Check it into your source code repository so they can find it. Create as many language files as you like:

> dzil msg-init fr ja.JIS en_US.UTF-8
Created po/fr.po.
Created po/ja.po.
Created po/en_US.po.

As you can see, each language results in the generation of the appropriate file in the po directory, sans encoding (i.e., no .UTF-8 in the en_US file name).
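The naming rule is easy to sketch in a few lines of Perl. The po_file helper below is hypothetical (it is not the plugin's code) and assumes the encoding is everything from the first dot onward, as in the ja.JIS and en_US.UTF-8 examples.

```perl
use strict;
use warnings;

# Hypothetical helper: map a locale argument to the .po path shown above.
sub po_file {
    my ($locale) = @_;
    (my $lang = $locale) =~ s/\..*\z//s;   # drop ".UTF-8", ".JIS", etc.
    return "po/$lang.po";
}

print po_file($_), "\n" for qw(fr ja.JIS en_US.UTF-8);
```

This prints po/fr.po, po/ja.po, and po/en_US.po, matching the files reported by msg-init.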

Now let your translators go wild with all the languages they speak, as well as the regional dialects. (Don’t forget to colour your code with en_GB translations!)

Once you have translations and they’re committed to your repository, when you build your distribution, the language files will automatically be compiled into binary catalogs. You’ll see this line output from dzil build:

[LocaleTextDomain] Compiling language files in po
po/fr.po: 10 translated messages, 1 fuzzy translation, 0 untranslated messages.
po/ja.po: 10 translated messages, 1 fuzzy translation, 0 untranslated messages.
po/en_US.po: 10 translated messages, 1 fuzzy translation, 0 untranslated messages.

You’ll then find the catalogs in the shared directory of your distribution:

> find My-GreatApp-0.01/share -type f
My-GreatApp-0.01/share/LocaleData/de/LC_MESSAGES/My-GreatApp.mo
My-GreatApp-0.01/share/LocaleData/en_US/LC_MESSAGES/My-GreatApp.mo
My-GreatApp-0.01/share/LocaleData/ja/LC_MESSAGES/My-GreatApp.mo

These binary catalogs will be installed as part of the distribution just where Locale::TextDomain can find them.

Here’s an optional tweak: add this line to your MANIFEST.SKIP:

^po/

This prevents the po directory and its contents from being included in the distribution. Sure, you can include them if you like, but they’re not required for the running of your app; the generated binary catalog files are all you need. Might as well leave out the translation files.

Mergers and acquisitions

You’ve got translation files, and helpful translators have given them a workover. What happens when you change your code, add new messages, or modify existing ones? The translation files need to be updated periodically with those changes, so that your translators can deal with them. We’ve got you covered with the msg-merge command:

> dzil msg-merge
extracting gettext strings
Merging gettext strings into po/de.po
Merging gettext strings into po/en_US.po
Merging gettext strings into po/ja.po

This will scan your module files again and update all of the translation files with any changes. Old messages will be commented-out and new ones added. Just commit the changes to your repository and notify the translation army that they’ve got more work to do.

If for some reason you need to update only a subset of language files, you can simply list them on the command-line:

> dzil msg-merge po/de.po po/en_US.po
Merging gettext strings into po/de.po
Merging gettext strings into po/en_US.po

What’s the scan, man

Both the msg-init and msg-merge commands depend on a translation template file to create and merge language files. Thus far, this has been invisible: they will create a temporary template file to do their work, and then delete it when they’re done.

However, it’s common to also store the template file in your repository and to manage it directly, rather than implicitly. If you’d like to do this, the msg-scan command will scan the Perl module files gathered by Dist::Zilla and make it for you:

> dzil msg-scan
extracting gettext strings into po/My-GreatApp.pot

The resulting .pot file will then be used by msg-init and msg-merge rather than scanning your code all over again. This actually then makes msg-merge a two-step process: You need to update the template before merging. Updating the template is done by exactly the same command, msg-scan:

> dzil msg-scan
extracting gettext strings into po/My-GreatApp.pot
> dzil msg-merge
Merging gettext strings into po/de.po
Merging gettext strings into po/en_US.po
Merging gettext strings into po/ja.po

Ship It!

And that’s all there is to it. Go forth and localize and internationalize your Perl apps!

Acknowledgements

My thanks to Ricardo Signes for invaluable help plugging in to Dist::Zilla, to Guido Flohr for providing feedback on this tutorial and being open to my pull requests, to David Golden for I/O capturing help, and to Jérôme Quelin for his patience as I wrote code to do the same thing as Dist::Zilla::Plugin::LocaleMsgfmt without ever noticing that it already existed.

PGXN Development Project

I’m pleased to announce the launch of the PGXN development project. I’ve written a detailed specification and pushed it through general approval on pgsql-hackers. I’ve written up a detailed project plan and estimated things at a highly reduced PostgreSQL Experts rate to come up with a fundraising goal: $25,000. And now, thanks to founding contributions from myYearbook.com and PostgreSQL Experts, we have started the fundraising phase of the project.

So what’s this all about? PGXN, the PostgreSQL Extension Network, is modeled on CPAN, the Perl community’s archive of “all things Perl.” PGXN will provide four major pieces of infrastructure to the PostgreSQL community:

I’ve been wanting to start this project for a long time, but given my need to pay the bills, it didn’t seem like I’d ever be able to find the time for it. Then Josh Berkus suggested that we try to get community interest and raise money for me to have the time to work on it. So I jumped on that, putting in the hours needed to get general approval from the core PostgreSQL developers and to create a reasonable project plan and web site. And thanks to MyYearbook’s and PGX’s backing, I’m really excited about it. I hope to start on it in August.

If you’d like to contribute, first: Thank You! The PGXN site has a Google Checkout widget that makes it easy to make a donation. If you’d rather pay by some other means (checks are great for us!), drop me a line and we’ll work something out. We have a few levels of contribution as well, including permanent linkage on the PGXN site for your organization, as well as the usual t-shirts and launch party invitations.

PGAN Bikeshedding

I’ve put together a description of PGAN, the PostgreSQL extension distribution system I plan to develop later this year, based on the Comprehensive Perl Archive Network, or CPAN. Its primary features will be:

  • Extension distribution
  • Search site with extension documentation
  • Client for downloading, building, testing, and installing extensions.

I’ve never been thrilled with the name, though, so I’m asking for suggestions for a better one. I’ve used the term “extension” here because it seems to be the term that the PostgreSQL community has settled on, but other terms might work, since things other than extensions might be distributed.

What I’ve come up with so far is:

Name   | Long Name                                         | Pronunciation         | Advantages                 | Disadvantages
-------+---------------------------------------------------+-----------------------+----------------------------+----------------------------------------------------
PGAN   | PostgreSQL Add-on Network                         | pee-gan               | Short, similar to CPAN     | Ugly
PGEX   | PostgreSQL Extensions                             | pee-gee-ex or pee-gex | Short, easier to pronounce | Too similar to PGX
PGCAN  | PostgreSQL Comprehensive Archive Network          | pee-gee-can           | Similar to CPAN            | Similar to CPAN
PGDAN  | PostgreSQL Distribution Archive Network           | pee-gee-dan           | Short, easy to pronounce   | Who’s “Dan”? Doesn’t distribute PostgreSQL itself.
PGEDAN | PostgreSQL Extension Distribution Archive Network | pee-gee-ee-dan        | References extensions      | Long, sounds stupid

Of these, I think I like “PGEX” best, but none are really great. So I’m opening up the bike shed to all. What’s a better name? Or if you can’t think of one, which of the above do you like best? Just leave a comment on this post. The only requirements for suggestions are that a .org domain be available and that it suck less than the alternatives.

Comments close in 2 weeks. Thanks!

RFC: PostgreSQL Add-on Network

I’ve posted a plan to implement PGAN, a CPAN for PostgreSQL extensions. I’ve tried to closely follow the CPAN philosophy to come up with a plan that requires a minimum-work implementation that builds on the existing PostgreSQL tools and the examples of the CPAN and JSAN. My hope is that it’s full of JFDI! I would be very grateful for feedback and suggestions.

Quest for PostgreSQL Project Hosting

The pgTAP project is currently hosted by pgFoundry. This is an old version of GForge, and from what I understand, highly modified for the PostgreSQL project. That’s fine, except that it apparently makes it impossible for anyone to find the tuits to upgrade it to newer versions.

And it needs upgrading. One annoying thing I noticed is that the URLs for release files include an integer in them. For example, the URL to download pgTAP 0.23 is http://pgfoundry.org/frs/download.php/2511/pgtap-0.23.tar.bz2. See the “2511” there? It appears to be a primary key value or something, but is completely irrelevant for a release URL. I would much prefer that the URL be something like http://pgfoundry.org/frs/download.php/pgtap-0.23.tar.bz2 or, even better, http://pgfoundry.org/projects/pgtap/frs/pgtap-0.23.tar.bz2. But such is not the case now.

Another issue is hosting. I’ve registered pgtap.org to use for hosting the pgTAP Web site, but there is no support for pointing a hostname at a pgFoundry/GForge site.

These issues could of course be worked out if someone had the tuits to take them on, but apparently there is no one. So I’m looking to move.

The question is, where to? I could get a paid GitHub account (the pgTAP source is already on GitHub) and be able to have a pgTAP site on pgtap.org from there, so that’s a plus. And I can do file releases, too, in which case the URL format would be something like http://cloud.github.com/downloads/theory/pgtap/pgtap-0.23.tar.bz2, which isn’t ideal, but is a hell of a lot better than a URL with a sequence number in it. I could put them on the hosted site, too, in which case they’d have whatever URL I wanted them to have.

There are only two downsides I can think of to moving to GitHub:

  1. No mail list support. The pgTAP mail list has next to no traffic so far, so I’m not sure this is a big deal. I could also set up a list elsewhere, like Librelist, if I really needed one. I’d prefer to have @pgtap.org mail lists, but it’s not a big deal.

  2. I would lose whatever community presence I gain from hosting on pgFoundry. I know that when I release a Perl module to CPAN that it will be visible to lots of people in the Perl community, and automatically searchable via search.cpan.org and other tools. A CPAN release is a release to the Perl community.

    There is nothing like this for PostgreSQL. pgFoundry is the closest thing, and, frankly, nowhere near as good (pgFoundry’s search rankings have always stunk). So if I were to remove my projects from pgFoundry, how could I make them visible to the community? Is there any other central repository or searchable list of third-party PostgreSQL offerings?

So I’m looking for advice. Does having an email list matter? If I can get pgTAP announcements included in the PostgreSQL Weekly News, is that enough community visibility? Do you know of a nice project hosting site that offers hosting, mail lists, download mirroring and custom domain handling?

I’ll follow up with a summary of what I’ve found in a later post.

What's With These CPAN-Testers Failures?

So I just learned about and subscribed to the CPAN-Testers feed for my modules. There appear to be a number of odd failures. Take this one. It says, “Can’t locate Algorithm/Diff.pm,” despite the fact that I have properly specified the requirement for Text::Diff, which itself properly requires Algorithm::Diff. Is this an instance of CPAN.pm or CPANPLUS not following all prerequisites, or what?

Or take this failure. It says, “[CP_ERROR] [Mon Sep 5 09:32:08 2005] No such module ‘mod_perl’ found on CPAN”. Yet here it is. Maybe the CPANPLUS indexer has a bug? Or are people’s configurations just horked? Or am I just doing something braindead?

Opinions welcomed.

How I Increment Module Version Numbers

Here’s how I quickly increment version numbers in my modules. I call this script reversion:

#!/usr/bin/perl -w

use strict;

unless (@ARGV) {
    print "  Usage: $0 old_version [directory]\n\n";
    exit;
}

my $old = shift;
my $new = $old + .01;
my $dir = shift || '.';

system qq{grep -lr '\Q$old\E' $dir }
  . '| grep -v \\.svn '
  . '| grep -v Changes '
  . '| grep -v META\\.yml '
  . "| xargs $^X -i -pe \""
  . qq{print STDERR \\\$ARGV[0], \\\$/ unless \\\$::seen{\\\$ARGV[0]}++;}
  . qq{s/(\\\$VERSION\\s*=?\\s*'?)\Q$old\E('?)/\\\${1}$new\\\$2/g"};

__END__
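One caveat with computing the new version as $old + .01: Perl drops trailing zeros when it stringifies a number, so 1.09 becomes 1.1 rather than 1.10. A minimal sketch of a workaround follows, assuming plain decimal versions such as 0.23. The next_version helper is hypothetical and not part of the script above; it bumps the last decimal digit, which matches + .01 for two-decimal versions, and keeps the original number of decimal places.

```perl
use strict;
use warnings;

# Hypothetical helper: bump the last decimal digit of a version string
# while preserving the number of decimal places.
sub next_version {
    my ($old) = @_;
    my ($decimals) = $old =~ /\.(\d+)\z/;
    my $places = defined $decimals ? length $decimals : 0;
    # Format with a fixed number of places so trailing zeros survive.
    return sprintf "%.${places}f", $old + 10 ** -$places;
}

print next_version('1.09'), "\n";
```

Here next_version('1.09') yields 1.10, where the plain addition would have produced 1.1.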

Enjoy!
