How I ported my Jekyll blog to Gatsby

There comes a point in every developer’s life where they sit back and look at a piece of old work and decide that it could be better. I recently did that with my blog, mostly because I had heard so many great things about Gatsby. And since I actively worked with both React and GraphQL on a regular basis, I felt comfortable diving into the tech stack. However, I didn’t want to lose all of my posts. So, I did what any developer would do, and solved my problem with code.

As I had started with the Gatsby theme Lumen (v2), I took a look at its post format and how it differed from what I had been doing. Lumen also rendered posts from Markdown, a workflow I am very comfortable using, so I just needed to know the key differences between the two post formats. The biggest difference was the file layout. Jekyll used a one-file-per-post format: I would create a directory full of .md files, and Jekyll would convert them to rendered pages. Gatsby and Lumen v2, however, used a directory for each post, with a very clear naming convention.

In Jekyll, a post filename would be _posts/2019-07-23-this-is-a-post.md.

However, Gatsby wanted a structure such as this:

./src
  - pages/
    - articles/
      - 2019-07-23---this-is-a-post/
        - index.md

Which, since I now knew which directory to put them in, I could simplify to this:

2019-07-23---this-is-a-post/
  - index.md
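
The renaming rule can be sketched as a small function. This is only an illustration: the name toGatsbyDirName is hypothetical and not part of the migration script, and it assumes every filename follows the YYYY-MM-DD-slug.md pattern shown above.

```javascript
// Hypothetical helper: maps a Jekyll post filename to the Gatsby directory name.
// Assumes the filename looks like YYYY-MM-DD-some-slug.md.
const toGatsbyDirName = (filename) => {
  const match = /^(\d{4}-\d{2}-\d{2})-(.+)\.md$/.exec(filename);
  if (!match) {
    // Fail loudly rather than guess at a date or slug.
    throw new Error(`Unexpected Jekyll filename: ${filename}`);
  }
  const [, date, slug] = match;
  // Gatsby/Lumen convention from above: date, three dashes, then the slug.
  return `${date}---${slug}`;
};

console.log(toGatsbyDirName('2019-07-23-this-is-a-post.md'));
// 2019-07-23---this-is-a-post
```

Throwing on an unexpected filename is a design choice for a one-off migration: a stray draft or README in the posts directory gets flagged instead of silently producing a malformed directory name.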

From there, I needed to compare the differences in the metadata between one post format and the other. Fortunately, they both used the standard of having YAML-formatted frontmatter at the top of the file. The term frontmatter comes from the days when books were made out of dead trees and contained the ideas worthy enough to print on paper. Frontmatter depends on the type of publication, but it often contains metadata about the publication. Titles, subtitles, publishing information, copyright, disclaimers, warranties, even forewords can be considered frontmatter.

Blogs have adopted this standard, using frontmatter to provide the content management system some context on how to render the content correctly. For example, which layout to use, what the URL for that content piece should be, tags, categories, and other associated metadata.

Here was the sample frontmatter from an old blog post on my Jekyll blog:

---
title: The Dialog Tag
author: Don B
layout: post
permalink: the-dialog-tag/
robotsmeta:
  - index,follow
categories:
  - HTML5
  - Javascript
tags:
  - html5
---

It’s in YAML syntax, which means it’s a standard format that surely someone has written a library to read and write. Enter the npm packages yaml and front-matter. The front-matter package takes a string and parses the frontmatter out into a JS object that can be manipulated, and the yaml package lets me write it back out in the correct format.

And for comparison, the frontmatter from a more recent blog post on my Gatsby blog:

---
date: 2019-07-12T12:47:44.006Z
path: /posts/graphql-security-talk-at-fullstack-london-2019
draft: false
title: GraphQL Security Talk at Fullstack London 2019
layout: post
tags: 
  - conference
category: GraphQL
description: I recently gave a talk at Fullstack London 2019 about GraphQL Security. Here is a link to the slides and some packages I referenced in my talk. 
---

Now that I could see the differences, all I needed was to write a script that would do the following:

  1. Load the list of files from the previous blog, and do the following on each file:
  2. Read the file into memory
  3. Parse the frontmatter
  4. Rewrite the frontmatter to match the new standard
  5. Make a new directory for the post in the Gatsby installation
  6. Make an index.md file in the new directory
  7. Write out the frontmatter to the new file
  8. Write out the original content to the new file

Most of the work was fs-based, so I made my pilgrimage to the Node documentation to double-check that I knew which functions to use. fs.readdir, fs.readFile, fs.mkdirSync, and fs.writeFileSync were all that I needed. Sure, I could have done the file writing asynchronously, but it wasn’t necessary for this operation.

It nicely turned out that the front-matter module would separate out attributes and body when parsing the string, so it was a one-stop shop for everything I needed for the content of the old blog post files.
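
To make that separation concrete, here is a rough, dependency-free sketch of the splitting step alone. It is not how the front-matter package is implemented, and it skips the YAML parsing entirely (front-matter goes further and turns the YAML into an attributes object). The function name splitFrontmatter and the sample text are hypothetical.

```javascript
// Rough illustration only: splits a post into its raw frontmatter text and body.
// Unlike the front-matter package, this does NOT parse the YAML into an object.
const splitFrontmatter = (text) => {
  // Opening '---' line, lazily captured frontmatter, closing '---' line, then the body.
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/.exec(text);
  if (!match) {
    // No frontmatter block found: treat the whole file as body.
    return { frontmatter: '', body: text };
  }
  return { frontmatter: match[1], body: match[2] };
};

const sample = '---\ntitle: The Dialog Tag\nlayout: post\n---\nPost content here.\n';
const { frontmatter, body } = splitFrontmatter(sample);
console.log(frontmatter); // the two raw YAML lines, still unparsed
console.log(body);        // everything after the closing ---
```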

Here is the resulting script:

const fs = require('fs');
const fm = require('front-matter');
const YAML = require('yaml');

fs.readdir('./jekyll', (err, files) => {
  if (err) {
    console.error("The following error occurred: ", err);
    return false;
  }

  files.forEach(file => {
    const date = file.split('-', 3).join('-');
    const slug = file.replace('.md', '').split('-').slice(3).join('-');

    fs.readFile(`./jekyll/${file}`, 'utf8', (err, data) => {
      if (err) throw err;

      const content = fm(data);

      content.attributes.category = content.attributes.categories;
      
      //Don't care about these fields
      delete content.attributes.categories;
      delete content.frontmatter;
      delete content.attributes.permalink;
      delete content.attributes.robotsmeta;
      delete content.attributes.author;
      
      //As time post was written isn't important, default it all to 12:34:56. It's a good time.
      content.attributes.date = `${date}T12:34:56.789Z`;
      content.attributes.draft = false;
      content.attributes.path = `/posts/${slug}`;

      const dirPath = `./gatsby/${date}---${slug}`;
      console.log("Making directory ", dirPath);
      fs.mkdirSync(dirPath);
      content.frontmatter = YAML.stringify(content.attributes);
      console.log("Writing index.md to ", dirPath);
      //YAML.stringify output already ends with a newline, so no extra one before the closing ---
      fs.writeFileSync(`${dirPath}/index.md`, `---\n${content.frontmatter}---\n${content.body}`);
    });
  });
});

I probably could have been cleaner about removing unnecessary front-matter fields from previous posts, and used the path module to build paths more cleanly. In order to do this in a way that didn’t accidentally damage data, I copied all my Jekyll posts from the _posts directory in which they lived to a new directory, aptly named jekyll. There was an empty gatsby subdirectory as a sibling, to put the revised versions in. And if everything went well, I’d be able to take the contents of the gatsby directory and copy it wholesale into my waiting Gatsby installation.
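
As a rough sketch of what that cleaner version might look like, the field handling and path building could be pulled into two small functions. Everything named here (KEEP, toGatsbyAttributes, planPost, and the choice of which fields to keep) is hypothetical and not what I actually ran. It swaps the delete calls for an allowlist and the template strings for path.join and path.basename.

```javascript
const path = require('path');

// Hypothetical allowlist of Jekyll fields worth carrying over unchanged.
const KEEP = ['title', 'layout', 'tags'];

// Builds the Gatsby attributes from the parsed Jekyll attributes.
// Anything not on the allowlist (author, permalink, robotsmeta, ...) is dropped.
const toGatsbyAttributes = (attributes, date, slug) => {
  const kept = {};
  for (const key of KEEP) {
    if (key in attributes) kept[key] = attributes[key];
  }
  return {
    ...kept,
    category: attributes.categories, // Jekyll's plural becomes Gatsby's singular key
    date: `${date}T12:34:56.789Z`,   // same fixed time of day as the original script
    draft: false,
    path: `/posts/${slug}`,
  };
};

// Derives the date and slug from a Jekyll filename and builds the output paths.
// Assumes the YYYY-MM-DD-slug.md naming convention.
const planPost = (file, outRoot = 'gatsby') => {
  const name = path.basename(file, '.md');
  const parts = name.split('-');
  const date = parts.slice(0, 3).join('-');
  const slug = parts.slice(3).join('-');
  const dirPath = path.join(outRoot, `${date}---${slug}`);
  return { date, slug, dirPath, indexPath: path.join(dirPath, 'index.md') };
};

console.log(planPost('2019-07-23-this-is-a-post.md').slug);
```

Keeping both functions free of file system calls means they can be checked against a handful of filenames before anything is written to disk.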

I’m pleased to say that the only “gotcha” I encountered was the pluralization of ‘categories’. Otherwise, it went smoothly. It worked in two runs. The first one was where I found the gotcha, and the second was a clean run that I could then work from and add some of the extra fields. This also allowed me to clean up some older posts that were no longer relevant (radio shows, where the archives/recordings are no longer available. C’est la vie!).

Of course, with this out of the way, I was primed to replicate something I had for my previous Jekyll blog: a rake task that would auto-generate the scaffolding for a new post for me. I’d found it quite convenient, so I decided to use my newfound knowledge of writing out Gatsby post content in the right format to build myself a CLI script to generate a new post.

const fs = require('fs');
const YAML = require('yaml');

const title = process.argv.slice(2).join(' ');
const baseDir = './src/pages/articles';
const slug = title.replace(/[^a-zA-Z0-9_ ]/g, '').toLowerCase().replace(/\s+/g, '-');
const now = new Date();
const content = ` 
---

If you have any comments or questions about this post, please feel free to shoot me an e-mail at don (at) donburks (dot) com. I would love to hear from you and continue the conversation. 
`;

const buildDate = () => {
  const month = `${now.getMonth()+1}`.padStart(2, '0');
  const day = `${now.getDate()}`.padStart(2, '0');
  return `${now.getFullYear()}-${month}-${day}`;
};

const date = buildDate();
const dirPath = `${baseDir}/${date}---${slug}`;
const frontMatter = {
  date: now.toISOString(),
  path: `/posts/${slug}`,
  draft: false,
  title,
  layout: 'post',
  tags: '', //Default
  category: 'JavaScript', //Default
  description: 'This is a post about JavaScript' //Default
};

try {
  fs.mkdirSync(dirPath);
  fs.writeFileSync(`${dirPath}/index.md`, `---\n${YAML.stringify(frontMatter)}---\n${content}\n`);
  console.log(`Done. You can now issue the following command and edit your post:\nvim ${dirPath}/index.md`);
} catch(err) {
  console.error("Something STB'ed:", err);
}

There are some sane defaults in there, plus some error handling in case things don’t go so well. Adding it as a script in my package.json file with the job name new-post means that I can now run yarn new-post Some title and get the appropriate directory and index.md file generated for me, with pre-generated frontmatter and my standard footer below, soliciting feedback through e-mail. These scripts aren’t robust tools the way npm or webpack are. They’re utility scripts that allow me to swiftly overcome a problem. Sometimes, that’s the kind of dev work you do. Sometimes, you build beautiful frameworks that are highly optimized for performance. And sometimes you just brute-force your way through some content to wrangle it into the shape you want.
