Handling Drupal migration edge cases, Part I: Scald to media

Many organizations are moving their sites to Drupal 8+ (D8) as Drupal 7 (D7) slowly comes to end-of-life, and discovering that migrating content to a new system can be one of the biggest headaches an organization experiences. Thankfully, we have the Migrate API, a suite of tools that allows us to customize and extend migrations, or create them from the ground up if need be.

First things first: the Migrate Upgrade module can take care of a good number of use cases for most Drupal-to-Drupal migrations. Migrate Upgrade gives you a starting point for key tasks like recreating your content types and moving your content, taxonomy terms and users, and attempts to configure your Drupal 8 site’s settings to match what you had in Drupal 7. This migration follows the Pareto principle: a small amount of setup satisfies most of the basic demands of the migration process. The remaining use cases are where the Migrate API comes into play.

In this post, we’ll take a closer technical look at a use case that, while familiar to many Drupal developers, was specific enough to require customization. (For the non-technical, look for links that are in black; hover over them and a definition will appear.)

For a soon-to-be-launched client site, we had a migration problem revolving around media. The client’s Drupal 7 site used the old Scald module. Scald is a robust media module that allows creating media items of many types, fielding them, and embedding them inside your content in Drupal 7. This client’s implementation had upwards of 15,000 media items (called “atoms” in Scald) across several types, all with field metadata, and embedded using a variety of presentations on thousands of nodes. So, the problem? Scald has no Drupal 8 destination. All that information needed to be migrated to something else.

To solve this problem, we needed to do several things:

  • Identify a Drupal 8 destination for Scald content.
  • Write Migrate API implementations to move Scald atoms to that destination.
  • Parse all the site’s content to remove old Scald embeds and replace them with new, D8-friendly embeds.

Problem #1: Where does Scald content go?

The logical choice is Media entities. The Media module has been moved into core in Drupal 8, with significant improvements made to the suite in 2019 and 2020. Entity Embed would work to replace the old Scald embeds. Media entities can be bundled and fielded in the same manner as Scald types. While there’s no Migrate source for Scald atoms, there is a Migrate destination for Media entities, meaning at least a chunk of the work is already done.

Problem #2: How do we move Scald atoms to their new home on the Drupal 8 site?

We wrote a Migrate source plugin for Scald atoms, extending the FieldableEntity class from the core migrate_drupal module. Under the hood, a Scald atom is just like any other fieldable entity, so the code is pretty straightforward.

<?php

namespace Drupal\cbpp_migrate\Plugin\migrate\source\d7;

use Drupal\migrate\Row;
use Drupal\migrate_drupal\Plugin\migrate\source\d7\FieldableEntity;

/**
 * Drupal 7 Scald atom source from database.
 *
 * @MigrateSource(
 *   id = "d7_scald_atom",
 *   source_module = "scald",
 *   source_provider = "scald"
 * )
 */
class ScaldAtom extends FieldableEntity {

  /**
   * {@inheritdoc}
   */
  public function query() {
    $fields = $this->fields();
    $query = $this->select('scald_atoms', 's')
      ->fields('s', array_keys($fields));

    if (isset($this->configuration['atom_type'])) {
      $query->condition('s.type', $this->configuration['atom_type']);
    }

    return $query;
  }

  /**
   * {@inheritdoc}
   */
  public function prepareRow(Row $row) {
    // Get Field API field values.
    foreach (array_keys($this->getFields('scald_atom', $row->getSourceProperty('type'))) as $field) {
      $sid = $row->getSourceProperty('sid');
      $row->setSourceProperty($field, $this->getFieldValues('scald_atom', $field, $sid));
    }
    return parent::prepareRow($row);
  }

  /**
   * {@inheritdoc}
   */
  public function fields() {
    $fields = array(
      'sid' => $this->t('Scald Atom ID'),
      'provider' => $this->t('Provider module name'),
      'type' => $this->t('Scald Atom type'),
      'base_id' => $this->t('Scald Atom base ID'),
      'language' => $this->t('Scald Atom language'),
      'publisher' => $this->t('Scald Atom publisher (User ID)'),
      'actions' => $this->t('Available Scald actions'),
      'title' => $this->t('Scald Atom title'),
      'data' => $this->t('Scald Atom data'),
      'created' => $this->t('Created timestamp'),
      'changed' => $this->t('Modified timestamp'),
    );
    return $fields;
  }

  /**
   * {@inheritdoc}
   */
  public function getIds() {
    $ids['sid']['type'] = 'integer';
    $ids['sid']['alias'] = 's';
    return $ids;
  }

}

query tells the migration how to generate a SQL query that can be used to get all of the source data from the Migrate source database. It allows an optional “atom_type” variable in the Migrate configuration, which will limit this query to one type of atom. More on that later.

prepareRow is the same as it is for most fieldable entities, and just tells the migration how to get all the field data for this entity. It uses the inherited getFields method and passes through the “scald_atom” entity type as well as the bundle (since different bundles often have different fields).

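fields tells the migration what fields are available for use in Migrate configuration, as well as giving them human names for readability.

getIds tells the migration which source field uniquely identifies each row, here the atom’s sid, so that Migrate can track which atoms have already been imported.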
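
To show how a migration might use this source plugin, here is a minimal sketch of a migration definition written as a PHP array. It mirrors the keys a migration YAML file would carry. The migration ID, the “image” atom type, the “image” media bundle and the field mapping are all assumptions for illustration, not the configuration used on the client site.

```php
<?php

// Hypothetical migration definition for a single atom type. The same keys
// would normally live in a migration YAML file.
$definition = [
  // Assumed ID; use whatever fits your module's naming.
  'id' => 'scald_atom_image',
  'source' => [
    'plugin' => 'd7_scald_atom',
    // Optional: read by query() to limit rows to one atom type.
    'atom_type' => 'image',
  ],
  'process' => [
    // Destination property => source property exposed by fields().
    'name' => 'title',
    // Assumes user IDs are unchanged between the D7 and D8 sites.
    'uid' => 'publisher',
    'created' => 'created',
    'changed' => 'changed',
    // Assumed media bundle on the Drupal 8 site.
    'bundle' => [
      'plugin' => 'default_value',
      'default_value' => 'image',
    ],
  ],
  'destination' => [
    'plugin' => 'entity:media',
  ],
];
```

Keeping one definition per atom type lets each type map to its own media bundle and field set, which is what the optional atom_type setting is for.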

Problem #3: How do we replace the old Scald embeds with new, fancy Drupal 8 entity embeds?

Scald atoms embedded in HTML look like this in the database (along with wrapping <div> tags with various metadata):

<!-- scald=24069:report_371 {"share":true,"link":"node/24504","linkTarget":"","figure":"1"} -->

Unfortunately, this code will do nothing on a Drupal 8 site, and will just show up in your content as text. That means we needed to replace each instance of these embeds with Entity Embed. (Note: This Lullabot article is a good jumping-off point for how to configure that module.) The eventual embed code looks like this:

<drupal-entity data-align="right" data-caption="Figure 1" data-embed-button="media" data-entity-embed-display="view_mode:media.report_371" data-entity-embed-link-url="/families-spend-far-less-time-preparing-food-than-thrifty-food-plan-tfp-assumes" data-entity-type="media" data-entity-uuid="38f62de5-3f35-44fa-979e-2de29e664f06"></drupal-entity>
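
Getting from the first form to the second starts with pulling the atom ID, the context and the options out of the Scald comment. The sketch below is one way that could be done, assuming every embed follows the pattern of the example above; the function name is hypothetical and this is not the client module’s code.

```php
<?php

/**
 * Parses the text of a Scald embed comment.
 *
 * Hypothetical helper: returns the atom ID, the Scald context and the
 * decoded JSON options, or NULL if the text is not a Scald embed.
 */
function parse_scald_comment(string $comment): ?array {
  // Expected shape: scald=<sid>:<context> <optional JSON options>
  if (!preg_match('/^\s*scald=(\d+):([\w-]+)\s*(\{.*\})?\s*$/s', $comment, $m)) {
    return NULL;
  }
  $options = isset($m[3]) ? json_decode($m[3], TRUE) : [];
  return [
    'sid' => (int) $m[1],
    'context' => $m[2],
    // Fall back to an empty array if the JSON is malformed.
    'options' => is_array($options) ? $options : [],
  ];
}

$embed = parse_scald_comment(' scald=24069:report_371 {"share":true,"link":"node/24504","linkTarget":"","figure":"1"} ');
```

For the example embed, the parsed result carries the atom ID 24069, the context report_371, and the decoded options such as the link target and figure number.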

To solve this problem, we worked closely with our client to develop a parsing module that could be run separately from migrations. The parser used DOMDocument to hunt down Scald atoms, locate their new corresponding media entity in the database, and build the new HTML5 drupal-entity tag and its attributes. Notice, for instance, the view mode in each tag: 

<!-- scald=24069:report_371 {"share":true,"link":"node/24504","linkTarget":"","figure":"1"} -->

... data-entity-embed-display="view_mode:media.report_371" ...
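
The DOMDocument approach can be sketched roughly as follows. This is a simplified illustration rather than the parser we built: it assumes the Scald context maps directly onto a media view mode of the same name, it sets only a few of the attributes, and the $uuid_for_atom callback is a hypothetical stand-in for looking up the migrated media entity.

```php
<?php

/**
 * Replaces Scald embed comments in an HTML fragment with <drupal-entity> tags.
 *
 * $uuid_for_atom is a hypothetical callback that maps a Scald atom ID to the
 * UUID of its migrated media entity, or NULL if none exists.
 */
function replace_scald_embeds(string $html, callable $uuid_for_atom): string {
  $doc = new DOMDocument();
  // Wrap the fragment in a full document so libxml reads it as UTF-8.
  $previous = libxml_use_internal_errors(TRUE);
  $doc->loadHTML('<!DOCTYPE html><html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body>' . $html . '</body></html>');
  libxml_clear_errors();
  libxml_use_internal_errors($previous);

  $xpath = new DOMXPath($doc);
  foreach (iterator_to_array($xpath->query('//comment()')) as $comment) {
    // Only touch comments that look like "scald=<sid>:<context> ...".
    if (!preg_match('/^\s*scald=(\d+):([\w-]+)/', $comment->nodeValue, $m)) {
      continue;
    }
    $uuid = $uuid_for_atom((int) $m[1]);
    if ($uuid === NULL) {
      // No migrated media entity: leave the comment in place for review.
      continue;
    }
    $tag = $doc->createElement('drupal-entity');
    $tag->setAttribute('data-embed-button', 'media');
    $tag->setAttribute('data-entity-embed-display', 'view_mode:media.' . $m[2]);
    $tag->setAttribute('data-entity-type', 'media');
    $tag->setAttribute('data-entity-uuid', $uuid);
    $comment->parentNode->replaceChild($tag, $comment);
  }

  // Serialize only the children of <body>, i.e. the original fragment.
  $out = '';
  foreach ($doc->getElementsByTagName('body')->item(0)->childNodes as $child) {
    $out .= $doc->saveHTML($child);
  }
  return $out;
}

$html = '<p>Intro</p><!-- scald=24069:report_371 {"figure":"1"} --><p>Outro</p>';
$result = replace_scald_embeds($html, fn (int $sid): ?string => $sid === 24069 ? '38f62de5-3f35-44fa-979e-2de29e664f06' : NULL);
```

Embeds whose atom has no migrated media entity are left untouched here, so they can be logged and handled as edge cases instead of being silently dropped.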

This module also did a number of other custodial tasks like managing the Scald atoms’ wrapping divs, replacing deprecated view modes with their new counterparts, and thoroughly logging the process to detect edge cases that needed to be handled specially.

Once built, the new tag replaces the old Scald tag in the content, and the content is saved. For us, this was implemented as a batch operation that could be filtered by content type or node ID. Running it as a batch performed better and was faster to test with Xdebug. This code could also be abstracted into a Migrate process plugin. If you’re interested in the details on that, stay tuned: we’ll talk about building custom Migrate processes in Part II of this post.
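
As a rough idea of the shape of such a batch, the sketch below builds a Batch API definition that splits a list of node IDs into chunks. The callback names and the chunk size are hypothetical, and the list of node IDs is assumed to come from whatever content type or node ID filter is applied beforehand.

```php
<?php

/**
 * Builds a Batch API definition that processes node IDs in chunks.
 *
 * The callback names are hypothetical placeholders for functions that
 * would load each node, rewrite its embeds and save it.
 */
function build_embed_batch(array $nids, int $chunk_size = 50): array {
  $operations = [];
  foreach (array_chunk($nids, $chunk_size) as $chunk) {
    // Each operation is [callable, [arguments]].
    $operations[] = ['example_replace_embeds_in_nodes', [$chunk]];
  }
  return [
    'title' => 'Replacing Scald embeds',
    'operations' => $operations,
    'finished' => 'example_replace_embeds_finished',
  ];
}

// On a Drupal site the result would be handed to batch_set().
$batch = build_embed_batch([101, 102, 103, 104, 105], 2);
```

Small chunks keep each request short, and passing a single node ID makes it easy to step through one problem node in a debugger.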

Summary

While many use cases can be handled with the Migrate Upgrade module, most migrations require at least a little bit of custom code. The Migrate API is a highly customizable tool that can be used to bridge gaps where necessary. In this case, we used it to migrate Scald atoms to Media entities. We also used the Batch API to fix up the places where that media had been embedded directly in the content. This particular migration edge case has been solved!

Are you running into a Drupal migration edge case, or struggling with an update to Drupal 8+? Reach out to Fíonta if you’d like to learn more about our web development process or customizable Drupal support packages.