I was facing a challenge recently at work: migrating a bunch of blog entries stored in a Postgres database of a home-made Rails application to a WordPress blog.
This challenge was trickier than you may first think because of the following reasons:
- I don’t have manual access to the Postgres DB, I can only run a Rails console that connects to it
- The Postgres DB and the WordPress Blog are installed on two different servers that are not in the same LAN.
Here is how I did the job, of course, using Perl as my weapon was the only option I considered, and I’m glad of it.
First step : exporting the data, JSON FTW
As the only way I had to handle the data source was a Rails console, I chose first to use YAML for exporting the data into a file, but I wasn’t able to parse it with Perl then, because of badly-written multiline scalars (I don’t know whose fault it is, either Ruby or Perl YAML modules, but it didn’t work out of the box).
Then I tried JSON:
# dump_posts.rb (to be run with ./script/runner -e production)
posts = MyRailsPost.find(:all)
f = File.new("/tmp/posts.json", 'w')
f.write(posts.to_json)
f.close
I then uploaded the json file to my second server, where I have access to the WordPress DB and started writing an importer.
From a JSON file to a WordPress DB
Now I have my data in JSON format, I can write a Perl script that will parse it and insert each post item found in the WordPress “wp_posts” table.
First, I’ll write a Coat::Persistent class to handle WordPress blog posts:
package WPPost;
use Coat;
use Coat::Persistent
table_name => "wp_posts",
primary_key => "ID";
has_p post_author => (isa => 'Int', required => 1);
has_p post_date => (isa => 'DateTime', coerce => 1);
has_p post_excerpt => (isa => 'Str');
has_p post_content => (isa => 'Str');
has_p post_title => (isa => 'Str');
has_p post_name => (isa => 'Str');
sub BUILD {
my $self = shift;
my $class = ref($self);
$class->dbh->do('SET NAMES utf8') or die $!;
}
Coat::Persistent->disable_internal_sequence_engine();
__PACKAGE__->map_to_dbi(mysql => "MYDATABASE", "DBUSER", "DBPASS")
Please note that we tells DBD::mysql to send data in utf8 (SET NAMES utf8), this is very important, if we don’t, we’ll endup in WordPress with a mess of utf8/latin1 crap.
The post_date field will be set with Rails’ created_at attribute, and it might be in a different format than YYYY-MM-DD HH:MM:SS, so I’ll write subtypes and coercions to handle that cleanly:
use Coat::Types;
use Coat::Persistent::Types;
subtype 'DateTimeSec'
=> as 'Str'
=> where { /\d{4}-\d\d-\d\d \d+:\d+:\d+\.\d+/ };
subtype 'DateTimeWithTZ'
=> as 'Str'
=> where { /\d{4}-\d\d-\d\dT\d\d:\d\d:\d\d\+\d\d:\d\d/ };
subtype 'DateTimeWithT'
=> as 'Str'
=> where { /\d{4}-\d\d-\d\dT\d\d:\d\d:\d\d/ };
coerce 'DateTime'
=> from 'DateTimeWithTZ'
=> where { s/\+.+$//; s/T/ /; return $_ };
coerce 'DateTime'
=> from 'DateTimeWithT'
=> via { s/T/ /; return $_ };
coerce 'DateTime'
=> from 'DateTimeSec'
=> via { s/\.\d+$//; return $_ };
Now, I have all I need, I can write the scrit:
#!/usr/bin/perl use strict; use warnings; # slurping my JSON file open FIC, '< ', $ARGV[0] or die $!; my @lines =; close FIC; my $content = join '', @lines; # processing posts my $data = from_json($content); for my $post (@$data) { my $p = $post->{rails_post_class}; # JSON output in Rails works this way my $wp = WPPost->create( post_author => 1, # admin post_date => $p->{created_at}, # coercion will occur here post_excerpt => $p->{excerpt}, post_content => $p->{body}, post_title => $p->{title}, post_name => $p->{permalink}, ); print "+ ".$wp->post_title." -> #".$wp->ID."\n"; }
Note: you can consider this post as the first-and-last item of my 2009 Perl Advent Calendar ;)
