How to export WordPress posts to Gatsby
If you're switching from WordPress to Gatsby, you'd probably like to transfer your existing content.
The gatsby-source-wordpress plugin can pull in data from a live WordPress install when Gatsby builds your static site. That's a good solution if you want to run WordPress as a headless CMS. But what if you'd like to export your content from WordPress to Gatsby once and get rid of WordPress entirely?
One convenient way to add pages to Gatsby is to write your content in Markdown and to let Gatsby transform your Markdown content into pages. We can do this using the gatsby-source-filesystem and gatsby-transformer-remark plugins. Could we somehow export our WordPress content to Markdown?
There is a tool that can convert your WordPress content to Markdown files: wp2md. But wp2md does not create "front matter", the block of YAML at the top of each Markdown file from which the gatsby-transformer-remark plugin reads metadata about the content, such as a post's title, slug, and category.
If you have several hundred posts on your website, like I did, it would be tedious to manually add the front matter to each Markdown file.
Fortunately, a better tool exists. There is a WordPress-to-Hugo exporter; it can convert your WordPress content to Markdown files that contain the front matter we need to create pages in Gatsby.
In the remainder of this article, I'll walk you through using the WordPress-to-Hugo exporter to create Markdown files from which we'll generate pages in Gatsby.
(Note: If your WordPress site does not have many posts and pages, you can probably transfer your content more quickly by using wp2md and manually adding the front matter to the Markdown files. If you have a lot of content, or if you'd like to automate the process as much as possible, read on!)
Running the exporter
We'll start by running the WordPress-to-Hugo exporter. You can run it from the WordPress admin area or from the command line. The exporter's instructions explain how to do this.
After you run the exporter, you'll find a new folder under wp-content/uploads that contains your WordPress posts and pages. Its "posts" subfolder holds a Markdown file for each of your posts.
(The exporter also produces a Hugo configuration file, config.yaml, which we won't need.)
Open one of the Markdown files under posts and you'll see that it has the front matter we need. Whoo! No need to manually add front matter to each post. 😅
If you have Gatsby set up to transform Markdown files to pages, you can now drop these files into the appropriate source folder (for me, that's src/posts) and just like that, your posts will appear in your Gatsby-powered site!
Fixing the slugs
Chances are that you'll need to do a little more work, though.
When you're moving from WordPress to Gatsby, you probably want to maintain your site's URL structure. For example, if a post lived at https://example.com/vegan-chocolate-cake, you'd like it to stay there.
By default, the WordPress-to-Hugo exporter creates Markdown files with the filename format YYYY-MM-DD-slug, because that's the format that Hugo expects. If you use the Gatsby blog starter or you followed the Gatsby tutorial on adding Markdown pages, you'll have Gatsby set up to use the Markdown files' filenames as slugs.
That means your post might now live at https://example.com/2018-07-15-vegan-chocolate-cake instead. This will break existing incoming links and confuse search engines.
Fortunately, there are two ways to get around this problem.
Option 1: use the "url" field as the slug
In gatsby-node.js, you can modify createPages() to set the path of your posts based on the URL in the Markdown files' front matter, rather than on the filename, like so:
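createPage({
  // Use the "url" value from the front matter instead of the filename.
  // This requires "url" to be among the frontmatter fields in your GraphQL query.
  path: post.frontmatter.url,
  component: blogPost,
  context: {
    slug: post.frontmatter.url,
    previous,
    next,
  },
});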
If you do this, you'll have to add a "url" field to the front matter when you create new Markdown files, too.
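If some of your Markdown files end up without a "url" field, a small fallback can keep the build from creating pages with an undefined path. The helper below is a hypothetical sketch, not part of Gatsby or the exporter: pagePathFor is a name I made up, and it assumes the "url" value is either a site-relative path or a full URL.

```javascript
const path = require("path");

// Hypothetical helper: pick a page path for one Markdown post.
// Prefers the front matter "url"; otherwise falls back to the filename.
function pagePathFor(frontmatter, fileAbsolutePath) {
  const url = frontmatter && frontmatter.url;
  if (typeof url === "string" && url.trim() !== "") {
    const trimmed = url.trim();
    // Assumption: the value may be a full URL, in which case only its path is kept.
    const pathname = /^https?:\/\//.test(trimmed)
      ? new URL(trimmed).pathname
      : trimmed;
    return pathname.startsWith("/") ? pathname : `/${pathname}`;
  }
  // No usable "url": derive the path from the filename without its extension.
  const base = path.basename(fileAbsolutePath, path.extname(fileAbsolutePath));
  return `/${base}/`;
}

module.exports = { pagePathFor };
```

In createPages() you could then pass something like pagePathFor(post.frontmatter, post.fileAbsolutePath) as the path, provided your GraphQL query also selects fileAbsolutePath.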
Option 2: remove dates from filenames
An alternative is to keep using the filenames of your Markdown files as slugs, but to remove the dates from the filenames.
To rename the files, you can use a regular expression to match the date portion of each filename and then replace it with nothing. Here's a regular expression that matches the date portion of the filenames:
[0-9]{4}\-[0-9]{2}\-[0-9]{2}\-
(I uploaded this regex to RegExr, where you can play around with it.)
To batch rename files using a regular expression, I used a neat tool called Transnomino, which runs on macOS. If you use a different operating system, you'll have to find a different rename tool that supports regular expressions.
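If you'd rather not install a rename tool, a few lines of Node.js can do the same job on any operating system. This is a minimal sketch of my own, not something the exporter ships: it uses the same regular expression as above, anchored to the start of the filename, and the function names are made up for illustration. Try it on a copy of your posts folder first.

```javascript
const fs = require("fs");
const path = require("path");

// The date pattern from above, anchored so it only matches a leading date.
const DATE_PREFIX = /^[0-9]{4}-[0-9]{2}-[0-9]{2}-/;

// "2018-07-15-vegan-chocolate-cake.md" becomes "vegan-chocolate-cake.md".
function stripDatePrefix(filename) {
  return filename.replace(DATE_PREFIX, "");
}

// Rename every dated file in `dir`; returns the [oldName, newName] pairs.
function renamePosts(dir) {
  const renamed = [];
  for (const name of fs.readdirSync(dir)) {
    const newName = stripDatePrefix(name);
    if (newName === name) continue; // no date prefix, leave the file alone
    const target = path.join(dir, newName);
    if (fs.existsSync(target)) {
      // Two posts with the same slug but different dates would collide.
      console.warn(`Skipping ${name}: ${newName} already exists`);
      continue;
    }
    fs.renameSync(path.join(dir, name), target);
    renamed.push([name, newName]);
  }
  return renamed;
}

module.exports = { stripDatePrefix, renamePosts };

// Example call, assuming your posts live in src/posts:
// renamePosts("src/posts");
```

The collision check matters if you ever published two posts with the same slug on different dates, since they would otherwise overwrite each other.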
Once you rename the files, your posts will maintain their existing URLs. Whoo! 🎉
Handling draft posts
If you have any draft posts in WordPress, you might run into an error when running gatsby develop after you place the Markdown files in src/posts.
Any draft post you export will have a nonsensical date that the gatsby-transformer-remark plugin can't parse. Unless you have a very large number of drafts, the quickest fix is to open each draft's Markdown file and replace the date in its front matter with a valid one.
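To find the affected files without opening each one, you could run a quick check over the front matter. The sketch below is an assumption-laden illustration: it assumes the front matter is delimited by "---" lines and has a "date:" field, and it applies my own strict rule (the value must start with YYYY-MM-DD and be parseable by JavaScript), which is not necessarily the rule gatsby-transformer-remark applies. The function names are hypothetical.

```javascript
// Hypothetical helper: return the raw "date:" value from a Markdown file's
// YAML front matter, or null if there is no front matter or no date field.
function frontMatterDate(markdown) {
  const block = /^---\r?\n([\s\S]*?)\r?\n---/.exec(markdown);
  if (!block) return null;
  const line = /^date:[ \t]*(.*)$/m.exec(block[1]);
  if (!line) return null;
  // Strip optional surrounding quotes.
  return line[1].trim().replace(/^['"]|['"]$/g, "");
}

// True only if the date starts with YYYY-MM-DD and JavaScript can parse it.
function hasUsableDate(markdown) {
  const value = frontMatterDate(markdown);
  if (value === null) return false;
  return /^\d{4}-\d{2}-\d{2}/.test(value) && !Number.isNaN(Date.parse(value));
}

module.exports = { frontMatterDate, hasUsableDate };
```

Reading each file under your posts folder and printing the names for which hasUsableDate returns false would give you the list of drafts to fix by hand.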
Next steps
That was the core of it!
From here, you might want to:
- Move your images from your WordPress wp-content/uploads folder to your Gatsby src folder and then import them into your Gatsby posts.
- Transfer not only your WordPress posts, but also your pages.
- Read the WordPress category and/or tags from the front matter and display them on your Gatsby site.
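As a starting point for the last item, here is a rough sketch of grouping posts by category once you have queried their front matter. It assumes each post object has a frontmatter.title and a frontmatter.categories array of strings; check your exported Markdown files to see which field names your front matter actually uses. groupByCategory is a hypothetical helper name.

```javascript
// Hypothetical helper: build a Map from category name to post titles.
// Assumed input shape: [{ frontmatter: { title, categories: [...] } }, ...]
function groupByCategory(posts) {
  const groups = new Map();
  for (const post of posts) {
    const frontmatter = post.frontmatter || {};
    // Posts without a categories array are simply skipped.
    const categories = Array.isArray(frontmatter.categories)
      ? frontmatter.categories
      : [];
    for (const category of categories) {
      if (!groups.has(category)) groups.set(category, []);
      groups.get(category).push(frontmatter.title);
    }
  }
  return groups;
}

module.exports = { groupByCategory };
```

The resulting Map could feed a category index page or a list of links in a sidebar.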
For now, I hope this guide was clear. Please let me know if you run into obstacles. Happy developing!