My blog workflow

Alex Muscar

February 6, 2023

I like to keep things simple. I will use shell scripts and standard Unix tools whenever I can get away with it.

The publish “pipeline”

I use Vim to write my blog posts.1 I currently use Pandoc to convert my posts to HTML, so the source files are in Pandoc’s flavour of Markdown. Since I don’t use most of Pandoc’s features, I’m thinking of migrating to something more lightweight like Lowdown.

Pandoc accepts metadata in its Markdown input. The metadata section is delimited by --- marker lines. For example, the metadata block for this post looks like:

---
title: My blog workflow
author: Alex Muscar
description: How I publish this blog
date: February 6, 2023
publishDate: Mon, 06 Feb 2023 17:41:20 +0000
---

I don’t particularly enjoy writing RFC822 dates by hand, so I have a little script that does it for me. I just read its output in Vim using :r!.2 And for uniformity (definitely not because I’m lazy), the script also outputs a more readable date for Pandoc.

#!/bin/sh

case $1 in
rfc822)
    date '+%a, %d %b %Y %H:%M:%S %z'
    ;;
post)
    date '+%B %d, %Y'
    ;;
*)
    date
esac
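
For example, assuming the script is saved as date.sh (the name the other scripts use below), a quick session looks like this; the timestamps are just illustrative:

$ ./date.sh rfc822
Mon, 06 Feb 2023 17:41:20 +0000
$ ./date.sh post
February 06, 2023

From Vim, :r !./date.sh rfc822 reads the RFC822 date straight into the buffer at the cursor.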

The build script uses the publishDate field to decide which posts are published. Drafts don’t have a publish date.

#!/bin/sh

FILES=$(grep -Fl publishDate posts/*.md)
FILES="$FILES posts/index.md"

PANDOCOPTS='-f markdown+smart+tex_math_dollars+raw_tex -t html -s -H include/header.html -B include/header-link.html -V monofont:CascadiaCode,monospace'

for f in $FILES;
do
    pandoc $PANDOCOPTS "$f" -o "out/$(basename "$f" .md).html"
done
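
Since grep -l prints only the names of the files that contain the literal string publishDate, a run over a hypothetical posts directory (the file names here are made up) would give something like:

$ grep -Fl publishDate posts/*.md
posts/my-blog-workflow.md
posts/an-older-post.md

Drafts without a publishDate line simply don’t show up in the list, so they never reach Pandoc.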

I also use the publish date to generate the RSS feed. That’s why it’s in RFC822 format.

I used to have a Go program that generated the RSS feed, but for my needs I can get away with a shell script and two awk helpers. The shell script, rssgen.sh, is a glorified template. It uses heredocs to generate the feed XML, and a few utilities to fill in the post-specific details.

#!/bin/sh

cat << TPL
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
<channel>
<title>Alex&#39;s website</title>
<link>https://muscar.eu/</link>
<description>I try to make computers do cool things. Sometimes I succeed.</description>
<managingEditor>[email protected] (Alex Muscar)</managingEditor>
<pubDate>$(./date.sh rfc822)</pubDate>
TPL

grep -Frl publishDate posts/*.md | cut -d. -f1 | while read -r f;
do
    cat <<- TPL
    <item>
    <title>$(./meta.awk -v field=title "$f".md)</title>
    <link>https://muscar.eu/$f.html</link>
    <description>$(./meta.awk -v field=description "$f".md)</description>
    <content:encoded><![CDATA[$(./body.awk "out/$(basename "$f").html")]]></content:encoded>
    <author>$(./meta.awk -v field=author "$f".md)</author>
    <pubDate>$(./meta.awk -v field=publishDate "$f".md)</pubDate>
    </item>
    TPL
done

cat << TPL
</channel>
</rss>
TPL

The awk helpers are really simple. meta.awk extracts a metadata field given its name.

#!/usr/bin/awk -f

# The second --- line closes the metadata block; nothing left to do.
/---/ && ismeta { exit }

# The first --- line opens the block; fields are "name: value" pairs.
/---/ { ismeta = 1; FS=": "; next }

# Inside the block, print the value of the requested field.
ismeta && $1 == field { print $2 }
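
Run against this post’s source file (posts/my-blog-workflow.md is a guess at the actual file name, based on the link in the index), it behaves like this:

$ ./meta.awk -v field=title posts/my-blog-workflow.md
My blog workflow
$ ./meta.awk -v field=publishDate posts/my-blog-workflow.md
Mon, 06 Feb 2023 17:41:20 +0000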

body.awk extracts the body from the HTML version of the post.

#!/usr/bin/awk -f

# Flip the flag on <body> and </body>, and skip the tag lines themselves.
/<\/?body>/ { isbody = !isbody; next }

# While the flag is set, print every line.
isbody

Both scripts use a common awk idiom for extracting the lines between a pair of delimiters: a flag records whether we’re inside the delimited region, and the other rules only fire while it’s set.
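
Stripped of the specifics, the idiom looks something like this (START and STOP stand for whatever delimiter patterns you care about):

/^START$/ { inside = 1; next }
/^STOP$/ { inside = 0; next }
inside

The flag is switched on at the opening delimiter, off at the closing one, and the bare pattern on the last line prints everything in between.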

As with rssgen.sh, I have a script that generates the index page automatically. Very imaginatively, it’s called indexgen.sh:

#!/bin/sh

cat <<TPL
---
title: Alex's website
---

Hey there! Welcome to my site.

## Writing ([RSS feed](/feed.xml))

TPL

grep -Fl publishDate posts/*.md | while read -r f;
do
    title=$(./meta.awk -v field=title "$f")
    date=$(./meta.awk -v field=date "$f")

    printf "%s\0%s\0/%s\n" "$date" "$title" "$(basename "$f" .md).html"
done |
sort -k3.1,3.5r -k1Mr -k2nr |
tr '\0' '$' |
awk -F'$' '{ print "**" $1 "**<br/>[" $2 "](" $3 ")\n"}'

cat <<TPL
## Blogroll

[Laurențiu](https://blog.dend.ro)

[Andrei](https://www.andreinc.net/)

## About

I use [Vim](https://www.vim.org/) to edit the Markdown posts on computers
running [OpenBSD](https://www.openbsd.org/) and macOS. You can find my .vimrc
file <a href="_vimrc" type="text/plain">here</a> if you're curious.

I use [Pandoc](https://pandoc.org/) to convert the posts to HTML.

The publishing "pipeline" is a [collection of shell scripts](my-blog-workflow.html) that I run manually.

The monospaced font is [Cascadia Code](https://github.com/microsoft/cascadia-code).

The site is hosted on [openbsd.amsterdam](https://openbsd.amsterdam/).
TPL

indexgen.sh is definitely pushing it. Note the \0 field delimiter for the lines printed in the while loop, and the complicated sort keys.3
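
To decode the sort keys it helps to look at one of the lines the loop prints; with \0 standing in for the NUL byte (and this post’s file name used as the example), a record looks like:

February 6, 2023\0My blog workflow\0/my-blog-workflow.html

sort splits on blanks by default, so -k1Mr sorts the month name, -k2nr sorts the day numerically, and -k3.1,3.5r covers the start of the third field, which is where the year lands; all three keys are reversed, putting the newest posts first. The awk at the end then formats each record into the **date**<br/>[title](link) lines that make up the post list.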

And to tie it all up, I have a Makefile:

gen:
    ./build.sh
    ./rssgen.sh > out/feed.xml

upload:
    ./upload.sh

clean:
    rm -f out/*.{html,xml}

Publishing my blog is then just make clean gen upload.4
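
upload.sh itself isn’t shown here; per footnote 4 it’s just a wrapper around scp, so a minimal sketch, with made-up user, host, and remote path, would be:

#!/bin/sh

# Copy the generated pages and the feed to the web server.
# user, host, and /var/www/htdocs are placeholders, not the real destination.
scp out/*.html out/*.xml user@host:/var/www/htdocs/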

I use shellcheck to check the scripts.
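
In practice that’s a one-liner, something like:

shellcheck ./*.sh

The awk helpers fall outside shellcheck’s remit, but they’re short enough to read by eye.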

Conclusion

You can publish a blog using minimal dependencies. You don’t even need a static site generator. You can get away with shell scripts and standard Unix tools.

There are still a few shortcomings to this approach.

First, I don’t have a post tagging system yet, or an archive. I think that’d be relatively straightforward to implement though, and it’s on my TODO list.

Second, regenerating the index isn’t wired into the Makefile yet, so I still have to refresh the index file by hand. This can also be automated, but it’s not a massive pain point right now.


  1. You can find my .vimrc file here if you’re curious.

  2. I actually have some key bindings to do that for me :^)

  3. I’m not very happy with using $ as a delimiter for the fields I pass to awk, but awk can’t handle \0 as a field delimiter in a portable way.

  4. upload.sh is just a wrapper around scp.