Keep DBI’s connect_cached From Horking Transactions
I’ve been on a bit of a Perl hacking tear lately. In addition to knocking out Test::XPath last week, I’ve been experimenting with TAP::Harness sources, Template::Declare, Catalyst views, a new Module::Build subclass for building database-backed applications, and, last but not least, an IRC logging bot. Oh, and that application I’m working on for PGX with Quinn Weaver. So much is crowding my mind these days that I’m having trouble sleeping. Tonight I’m up late hacking to try to get some of this stuff out of my head.
But before I crash, I wanted to share a useful DBI hack. I use connect_cached in a lot of my applications, because it’s a nice way to reuse database handles without having to figure out my own caching algorithm. The code I have to use it looks like this:
sub dbh {
my $self = shift;
DBI->connect_cached( @{ $self->_dbi }{qw(dsn username password)}, {
PrintError => 0,
RaiseError => 0,
HandleError => Exception::Class::DBI->handler,
AutoCommit => 1,
});
}
Very simple. I just call the dbh()
method whenever I need to talk to the
database, and I’m set. Except for one problem: transactions.
Say I have a method that grabs the handle, starts a transaction with the
begin_work
method, and then inserts a row. Then another method grabs the
handle from dbh()
on the assumption that it’s in the same transaction, and
does its own work. Only, it’s not the same transaction, because, unfortunately,
DBI sets the attributes passed to connect_cached
every single time it’s
called!. So even though that second method may think it’s in the middle of a
transaction, it’s really not, because when connect_cached
sets AutoCommit
back to 1, the transaction gets committed.
Oops.
This really fucks with my tests, where I’m often fetching the database handle to start a transaction, running some tests, and then wanting to rollback the transaction when I’m done. It’s irritating as all hell to discover that data has been inserted into the database. And the DBI, alas, just gives me this warning:
rollback ineffective with AutoCommit enabled at t/botinst.t line 67.
I’m likely not to notice until I get a duplicate key error the next time I run the tests.
As I was dealing with this today, my memory started poking at me, telling me that I’ve dealt with this before. And sure enough, a quick Google shows that Tim Bunce and I had an extensive conversation on this very topic – over four years ago. If you’re patient enough to dig through that thread, you’ll note that this issue is due to some architectural difficulties in the DBI, to be worked out in DBI 2.
Over the last four years, I’ve implemented a couple of solutions to this
problem, all involving my code tracking the transaction state and modifying the
AutoCommit
attribute in the appropriate places. It’s all rather fragile. But
as I dug through the thread, I discovered a much cleaner fix, using a
little-known and so-far undocumented feature of the DBI: callbacks. This is
actually I feature I half-way implemented in the DBI years ago, getting it just
far enough that Tim was willing to finish it. And it’s just sitting there,
waiting to be used.
So here’s the trick: Specify a callback for the connect_cached()
method that’s
used only when an existing file handle is retrieved from the cache. A bunch of
stuff is passed to the callback, but the important one is the fifth argument,
the attributes. All it has to do is delete the AutoCommit
attribute. Since
this callback is called before the DBI looks at the attributes to set them on
the handle, the callback effectively prevents the DBI from horking up your
transactions.
Here’s the modified code:
my $cb = {
'connect_cached.reused' => sub { delete $_[4]->{AutoCommit} },
};
sub dbh {
my $self = shift;
DBI->connect_cached( @{ $self->_dbi }{qw(dsn username password)}, {
PrintError => 0,
RaiseError => 0,
HandleError => Exception::Class::DBI->handler,
AutoCommit => 1,
Callbacks => $cb,
});
}
Callbacks are passed as a hash reference, with the keys being the names of the
DBI methods that should trigger the callbacks, such as ping
, data_sources
,
or connect_cached
. The values are, of course, code references. When the DBI
calls the code callbacks, it passes in stuff relevant to the method.
In the case of connect_cached
, there are two additional special-case
callbacks, connect_cached.new
and connect_cached.reused
, so that you can
have different callbacks execute depending on whether connect_cached
used a
cached database handle or had to create a new one. Here, I’ve used
connect_cached.reused
, of course, and all I do is kill off the AutoCommit
attribute before the DBI gets its greedy hands on it. Et voilà, problem solved!
And before you ask, no, you can’t simply omit AutoCommit
from the attributes
passed to connect_cached
, because the DBI helpfully adds it for you.
So now this is here so I can find it again when it bites me next year, and I hope it helps you, too. Meanwhile, perhaps someone could take it upon themselves to document DBI’s callbacks? At this point, the closest thing to documentation is in the tests. (Hrm. I think I might have had a hand in writing them.) Check ’em out.
Looking for the comments? Try the old layout.