09 June 2014

SeedMigrations. Like schema migrations, but for your data

by Pierre Jambet

tl;dr SeedMigration is a Rails gem that works like schema migrations, but for your data.

The problem

Many projects rely on some kind of initial data: a list of products for an e-commerce shop, a list of post categories for a blog, or a set of user roles, for instance. A classic pattern is to keep that data in Rails’ db/seeds.rb file.

This approach has several issues. You have to update the seeds file by hand for every single change. It has to be written in a really verbose way to be idempotent. And it still requires you to update your production data manually for each change, either by running a few updates directly in the production console (yuk!) or by running a hand-written one-off script against your production database (yuk again!).
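To make the verbosity concrete, here is a minimal sketch of what idempotent seeding involves. It uses a plain Hash as a hypothetical stand-in for a database table so that it runs on its own; the names CATEGORIES and seed_category are invented for illustration. In a Rails app the guard would typically be a call such as Rails 4's find_or_create_by on the model.

```ruby
# Hypothetical in-memory stand-in for a `categories` table, keyed by name.
CATEGORIES = {}

# Every seeded record needs a "create it, or update it if it already exists"
# guard, so that the seeds file can be run any number of times.
def seed_category(name, position)
  record = (CATEGORIES[name] ||= { name: name })
  record[:position] = position
  record
end

seed_category("News", 1)
seed_category("Tutorials", 2)
seed_category("News", 1) # running the same line again adds nothing
```

Each record carries its own guard, and the file still has to be edited and re-run by hand whenever the data changes.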

Enter Seed Migrations

The SeedMigration gem solves these issues by allowing you to define seed migrations. At their core they are extremely similar to Rails’ built-in schema migrations. Just as schema migrations apply incremental changes to your database, seed migrations apply incremental data updates.

A few examples of real world seed migrations:

  • Add new products to the database
  • Capitalize product titles
  • Update TaxRates
  • Update asset file paths

Applying those migrations will automatically update the seeds file with instructions to recreate all the objects.

You don’t have to make your changes idempotent (e.g. “Create this new object only if it doesn’t already exist”) as the migrations will only be applied once.

Why would I want this?

  • You use Rails seeds and update the seeds file manually
  • You don’t use Rails seeds but have production data that occasionally needs to be updated
  • You like Harry’s and want to show your support

If you meet any of those conditions, Seed Migration can help you.

Workflow integration

We use seed migration on a daily basis at Harry’s. Here is how it fits into our workflow.

Bootstrap a new environment

Seed migrations make it trivial to bootstrap a development or test environment. Just run

rake db:create && rake db:schema:load && rake db:seed

or simply rake db:reset.

Production updates

To apply your latest changes to a production environment, simply run rake seed:migrate. A list of all previously run seed migrations lives in a table called data_migrations, which prevents the same migration from running twice on the same system. The table also records the date and runtime of each migration, in case you need them for historical purposes. And, because everybody makes mistakes, it’s easy to roll back migrations with the rake seed:rollback command.
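The "apply only once" idea can be pictured with a small standalone sketch. This is a conceptual model, not the gem's actual implementation: the AppliedMigrations class, its method names and the version string are all hypothetical, and a Set stands in for the tracking table.

```ruby
require "set"

# Hypothetical stand-in for the tracking table: it remembers which
# migration versions have already been applied.
class AppliedMigrations
  def initialize
    @versions = Set.new
  end

  # Runs the block only if `version` has not been recorded yet.
  # Returns true when the migration ran, false when it was skipped.
  def run_once(version)
    return false if @versions.include?(version)

    yield
    @versions << version
    true
  end

  # Runs the block to revert the change and forgets the version,
  # so the migration can be applied again later.
  def rollback(version)
    return false unless @versions.delete?(version)

    yield
    true
  end
end

applied = AppliedMigrations.new
titles = ["razor", "shave cream"]

applied.run_once("20140609120000") { titles.map!(&:capitalize) } # runs
applied.run_once("20140609120000") { titles.map!(&:upcase) }     # skipped
```

Because the second call is skipped, the migration body itself does not need an "only if it doesn't already exist" guard.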

How it works

If you’ve ever run rails g migration AddFooToBar and rake db:migrate, you should feel comfortable.

Create a migration with rails g seed_migration NameOfMigration. That will create a new file under db/data. Write your changes in the up and down methods, exactly like you would do for a regular migration.
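As an illustration, a migration that adds a few blog categories might look like the sketch below. The class name AddCategories, the category names and the destroy_named helper are hypothetical. The stand-in definitions at the top exist only so the snippet runs outside a Rails app; in a real project the gem supplies SeedMigration::Migration and Category would be an ActiveRecord model.

```ruby
# Stand-ins so this sketch runs on its own (assumptions, not the gem's code).
module SeedMigration
  class Migration; end
end

# Minimal in-memory substitute for an ActiveRecord model.
class Category
  attr_reader :name

  def initialize(name)
    @name = name
  end

  def self.all
    @all ||= []
  end

  def self.create!(name:)
    record = new(name)
    all << record
    record
  end

  # With ActiveRecord this would be `where(name: names).destroy_all`.
  def self.destroy_named(names)
    all.reject! { |record| names.include?(record.name) }
  end
end

# The part you would actually write, in the generated file under db/data.
class AddCategories < SeedMigration::Migration
  NAMES = ["News", "Tutorials", "Releases"].freeze

  # Applied by `rake seed:migrate`.
  def up
    NAMES.each { |name| Category.create!(name: name) }
  end

  # Reverted by `rake seed:rollback`.
  def down
    Category.destroy_named(NAMES)
  end
end
```

Note that up creates the records unconditionally: since the migration is only applied once, no existence check is needed.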

You have to manually register models as seed models to have them automatically included in seeds.rb after each migration. In an initializer file, do this:

```ruby
# in config/initializers/seed_migration.rb
SeedMigration.register Category
```

For example, we have marked our Product, TaxRate and ShippingType models as seed data. We do not add new records of these types at runtime, only through seed migrations.

Apply the migration by running rake seed:migrate.

Voila!

The db/seeds.rb file now contains instructions to create all the needed Categories.

The Code

The code and more detailed documentation are on GitHub.

Disclaimer

Seed migration emerged from our internal needs at Harry’s. After we started working on it, we found other libraries that have similar yet different behavior.

Feedback

Welcome via email (pierre@h*****.com), Twitter (@pierre_jambet), or pull requests on GitHub.

It’s been used and tested with Rails 3.2 and 4.x.

We’d love for you to use it, fix bugs, and open some pull requests. Enjoy!

License

This code is released under the MIT License.