Migrating homebrew Rails blog-entries to Wordpress with Perl

I was facing a challenge recently at work: migrating a bunch of blog entries stored in a Postgres database of a home-made Rails application to a Wordpress blog.

This challenge was trickier than you may first think because of the following reasons:

  1. I don’t have manual access to the Postgres DB, I can only run a Rails console that connects to it
  2. The Postgres DB and the Wordpress Blog are installed on two different servers that are not in the same LAN.

Here is how I did the job, of course, using Perl as my weapon was the only option I considered, and I’m glad of it.

First step : exporting the data, JSON FTW

As the only way I had to handle the data source was a Rails console, I chose first to use YAML for exporting the data into a file, but I wasn’t able to parse it with Perl then, because of badly-written multiline scalars (I don’t know whose fault it is, either Ruby or Perl YAML modules, but it didn’t work out of the box).

Then I tried JSON:

# dump_posts.rb (to be run with ./script/runner -e production)
posts = MyRailsPost.find(:all)
f = File.new("/tmp/posts.json", 'w')
f.write(posts.to_json)
f.close

I then uploaded the json file to my second server, where I have access to the Wordpress DB and started writing an importer.

From a JSON file to a Wordpress DB

Now I have my data in JSON format, I can write a Perl script that will parse it and insert each post item found in the Wordpress “wp_posts” table.

First, I’ll write a Coat::Persistent class to handle Wordpress blog posts:

package WPPost;
use Coat;
use Coat::Persistent
    table_name => "wp_posts",
    primary_key => "ID";

has_p post_author => (isa => 'Int', required => 1);
has_p post_date => (isa => 'DateTime', coerce => 1);
has_p post_excerpt => (isa => 'Str');
has_p post_content => (isa => 'Str');
has_p post_title => (isa => 'Str');
has_p post_name => (isa => 'Str');

sub BUILD {
    my $self = shift;
    my $class = ref($self);
    $class->dbh->do('SET NAMES utf8') or die $!;
}

Coat::Persistent->disable_internal_sequence_engine();
__PACKAGE__->map_to_dbi(mysql => "MYDATABASE", "DBUSER", "DBPASS")

Please note that we tells DBD::mysql to send data in utf8 (SET NAMES utf8), this is very important, if we don’t, we’ll endup in Wordpress with a mess of utf8/latin1 crap.

The post_date field will be set with Rails’ created_at attribute, and it might be in a different format than YYYY-MM-DD HH:MM:SS, so I’ll write subtypes and coercions to handle that cleanly:

use Coat::Types;
use Coat::Persistent::Types;

subtype 'DateTimeSec'
    => as 'Str'
    => where { /\d{4}-\d\d-\d\d \d+:\d+:\d+\.\d+/ };

subtype 'DateTimeWithTZ'
    => as 'Str'
    => where { /\d{4}-\d\d-\d\dT\d\d:\d\d:\d\d\+\d\d:\d\d/ };

subtype 'DateTimeWithT'
    => as 'Str'
    => where { /\d{4}-\d\d-\d\dT\d\d:\d\d:\d\d/ };

coerce 'DateTime'
    => from 'DateTimeWithTZ'
    => where { s/\+.+$//; s/T/ /; return $_ };

coerce 'DateTime'
    => from 'DateTimeWithT'
    => via { s/T/ /; return $_ };

coerce 'DateTime'
    => from 'DateTimeSec'
    => via { s/\.\d+$//; return $_ };

Now, I have all I need, I can write the scrit:

#!/usr/bin/perl
use strict;
use warnings;

# slurping my JSON file
open FIC, '< ', $ARGV[0] or die $!;
my @lines = ;
close FIC;
my $content = join '', @lines;

# processing posts
my $data = from_json($content);
for my $post (@$data) {

    my $p = $post->{rails_post_class}; # JSON output in Rails works this way

    my $wp = WPPost->create(
        post_author => 1,               # admin
        post_date => $p->{created_at},  # coercion will occur here
        post_excerpt => $p->{excerpt},
        post_content => $p->{body},
        post_title => $p->{title},
        post_name => $p->{permalink},
    );
    print "+ ".$wp->post_title." -> #".$wp->ID."\n";
}

Note: you can consider this post as the first-and-last item of my 2009 Perl Advent Calendar ;)

560 views · Tags: , , , · Both comments and pings are currently closed.

Comments are closed.

Get Adobe Flash playerPlugin by wpburn.com wordpress themes