Drupal 7 Migrate

Every now and then we come across projects that require a migration. The old website could be WordPress, CodeIgniter or some other database-driven application. Usually we do such migrations with a custom module and batch processing. On my last Drupal 7 project I decided to give the Migrate module a shot. After a bit of trial and error I found it to be the ultimate solution for migration. I tried it with CSV data and with a MySQL table; both took 25% of the time my old custom-module approach needed. I have also seen a couple of posts that explain how to migrate raw HTML files.

You can learn more about the Migrate module from its documentation; the bundled migrate_example module also has tons of examples. Do look at migrate_extras, which provides support for additional contributed modules.

Here I am collecting some obstacles I faced and their solutions.

I am migrating content from the old website but want to keep the old URL aliases for SEO reasons. I have the pathauto module set up but don't want it to be triggered.

// Block pathauto from interfering. This requires migrate_extras.
$this->addFieldMapping('pathauto')->defaultValue(0);
// Let's put in the old path.
$this->addFieldMapping('path', 'old_path');

After I migrated the body content, it seemed the full_html text format was not applied: Drupal was displaying the HTML code as plain text. Solution:

$this->addFieldMapping('body', 'content');  
$this->addFieldMapping('body:format')->defaultValue('full_html'); // You can set other formats

In the old system there was a field that was a select option with the values On and Off. In D7 I set this up as a Term Reference field, so I had to use prepareRow to map the select value to a tid.

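class MyNodeMigration extends Migration {

  public function __construct($arguments) {
    parent::__construct($arguments);
    // ... other code
    $this->addFieldMapping('field_status', 'status');
    // On recent Migrate 2.x releases, say that the value is a tid, not a term name.
    $this->addFieldMapping('field_status:source_type')->defaultValue('tid');
    // ... other code
  }

  public function prepareRow($row) {
    // Map the old select value to the term ID (12 and 13 are the tids on my site).
    $row->status = $row->status === 'On' ? 12 : 13;
  }
}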
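
If the old field has more than two values, a lookup table is easier to maintain than a ternary. Here is a minimal sketch, assuming a hypothetical helper named my_status_to_tid() and the same term IDs as above (that 'Off' maps to 13 is my assumption from the ternary). Unknown values come back as NULL instead of silently turning into the second term.

```php
<?php
// Hypothetical helper, not part of the Migrate API.
// Maps an old select value to a term ID on the new site.
function my_status_to_tid($status) {
  // Old value => tid. These IDs are site-specific; look up your own.
  $map = array('On' => 12, 'Off' => 13);
  // Return NULL for anything unexpected so bad data is easy to spot.
  return isset($map[$status]) ? $map[$status] : NULL;
}
```

Inside prepareRow you would then write $row->status = my_status_to_tid($row->status);.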

I am migrating over 65,000 pieces of content. All my trial runs had fewer than 100 entries, and I was happy to see throughput of over 800/min. Then I started a full run and things began to look horrific. At some point throughput dropped to 2/min, which means the full migration would take 541 hours. Impossible! After some tinkering I re-ran the migration and moved 11,000 posts within 15 minutes. Awesome! This is the setup that worked for me:

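$query = db_select('tbl_pages', 'page');
// ... other code
$this->highwaterField = array(
  'name' => 'id',     // Column used as the highwater mark.
  'alias' => 'page',  // Table alias used in the query above.
  'type' => 'int',    // Compare as integers rather than strings.
);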
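
As a sanity check on the 541-hour figure, the estimate is just items divided by throughput, converted from minutes to hours. A small sketch in plain PHP; estimate_hours() is a name I made up for illustration.

```php
<?php
// Hypothetical helper: hours needed to migrate $items rows
// at a throughput of $per_minute rows per minute.
function estimate_hours($items, $per_minute) {
  return $items / $per_minute / 60;
}

// The slow full run: 65,000 rows at 2/min.
printf("%.1f hours\n", estimate_hours(65000, 2));   // prints "541.7 hours"
// The rate seen in the small trial runs: 800/min.
printf("%.1f hours\n", estimate_hours(65000, 800)); // prints "1.4 hours"
```

For comparison, 11,000 posts in 15 minutes works out to roughly 733/min, close to the rate from the small trial runs.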

In my case, I had to skip particular rows based on some logic that I could not express in MySQL. You can put such logic inside the prepareRow function.

public function prepareRow($row) {
  // Returning FALSE skips this row.
  if ($row->type == 'blah' && $row->some == 'foo') {
    return FALSE;
  }
}

