05 Feb 2017, 03:05

The markup language known as Markdown

What it Is

Markdown is a lightweight markup language (as opposed to a heavyweight one like HTML or LaTeX). If you’ve ever taken notes in a plain-text file and used an asterisk to represent a bullet point, or a line of dashes like an underline for a heading, then you’ve basically already written Markdown. Markdown is a natural-looking “syntax” that lets you turn text like this:

## What it is
[Markdown](https://en.wikipedia.org/wiki/Markdown) is a *lightweight markup language* (as opposed to a **heavyweight** one like HTML or LaTeX).

into the HTML+CSS page that you’re looking at right now. We’ve all known that annoying nerd who refuses to send HTML-formatted emails and insists on sending plain text peppered with slashes and asterisks instead of italics and bold. Yeah, basically a guy like that turned it into a semi-official standard, and now instead of being imaginary italics and bold, it actually renders that way.

Why Not Just Write in HTML Directly?

I think this is best explained by Brett Terpstra here. Basically, HTML sucks to work in, if you’re just trying to write some content. HTML is tedious, hard on the eyes, and error prone. If you want a site to look good at all, it also needs CSS, and that sucks even worse. Real people don’t want to work in either one, they just want to write prose and have it look good. For this, you can use Markdown as a standard format for your thoughts, and then let a static site generator (like Hugo) turn it into a web page. Be a writer, not a website developer, is the thinking.

Editors

I went looking for the ideal text editor in which to edit Markdown files. Ideally I could find something that ran on both MacOS and Linux, just for consistency since I use both. But Markdown is a standard format so I would also settle for the best on each respective platform, even if I had to use two different editors.

If there is one thing that gets reinvented most often, it’s text editors. Programmers love to re-solve their own nerd problems rather than tackle real-world problems, and one problem every programmer has is editing text. So there are about 500 choices of text editor at this point, but I narrowed it down to these 7 just on the basis of using them to edit Markdown.

[Image: MarkdownEditors]

If the job is programming, a text editor should have code-completion and syntax highlighting. But if the job is editing a markup language, WYSIWYG is the number one thing I care about. So you notice I am not even considering the Vims and Emacs of the world. The whole point of Markdown was to be easy: easy for a human to read in its raw state, easy to edit. But I guess I would take this sentiment a little bit further: why should I have to look at markup language at all? It’s 2017, shouldn’t I have an editor at least as good as WordPad from Windows 95? Of course, I want it to be able to flip back to the raw “source” markup when necessary, but most of the time I just want to edit it the way it’s going to look when I’m done.

This is a surprisingly uncommon feature. It seems that most editors have adopted the two-view or split-window paradigm seen in editors for more complicated markup and typesetting languages like LaTeX. They present the raw Markdown source on the left, and the rendered version on the right. Booooo. The concept of Auto-save, on the other hand, is a ubiquitous feature nowadays. That’s great to see.

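| Feature | TextNut | Typora | HarooPad | Quiver | Atom | Sublime | Texts |
|---|---|---|---|---|---|---|---|
| WYSIWYG | Yes | Yes | No | No | No | No | Yes |
| Easy to Use | Yes | Yes | Yes | Yes | No | No | Yes |
| Both Mac & Linux | No | Yes | Yes | No | Yes | Yes | No |
| Free / OSS | No ($25) | No (Beta) | Yes | No ($10) | Yes | No ($70) | No ($20) |
| Auto-save | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Work Directly in .md | No* | Yes | Yes | No* | Yes | Yes | Yes |
| Leaves TOML Intact | Yes | Yes | Yes | Unknown | Yes | Yes | No |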

* = TextNut can open and edit a .md file, but the WYSIWYG aspect only works when the file is “imported” to the proprietary format and edited there, then exported back to Markdown. Quiver has a similar focus on notes and a similar weakness in working in .md directly: basically it’s a less capable TextNut.

I like Texts, but it breaks the “front matter” on a Hugo post, so that’s a deal-breaker. HarooPad, besides having a lack of documentation in English and development that has been dead for a couple of years, is pretty robust. If it could only offer WYSIWYG editing I think it would have been my choice.

So the overall winner for me is Typora. Eventually it will leave beta, and they’ll charge for it, but hopefully it’s something reasonable. Until then, it’s free!

QuickLook for .md Files (MacOS)

I am all about using QuickLook in MacOS. You just hit space bar on a file in Finder and you get a perfectly good read-only peek at the file. But it doesn’t handle Markdown (as plain text, let alone as a rendered view). Fortunately someone made a QuickLook generator you can install using Homebrew: brew install Caskroom/cask/qlmarkdown.

Now you’re ready to work with Markdown files!

It’s Not All Roses Though: Behind the Scenes on This Post

The rendering of your content doesn’t always turn out the way you envisioned. When this happens, it’s either your Hugo theme, or the Markdown renderer that is to blame. Unfortunately, this might mean rolling up your sleeves and fixing CSS, as I had to do in order to get the table above to look decent. Hopefully this is a one-time thing. I had to go into the purehugo theme subdirectory, and edit static/all.min.css within that.

Secondly, when editing Markdown to embed an image from a local directory (as I’ve done above), Hugo requires you to put the files in /static and then in the Markdown you specify a relative path without a leading slash, such as ![MarkdownEditors](img/markdownEditors.png) (actual path of the image in the source tree is /blog/static/img/markdownEditors.png) and Hugo will copy it to the publishdir during rendering. Because of this, you can’t actually see the image in your Markdown editor, which sucks, and your source tree will have two copies of the file, which also sucks.
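Since the editor can’t show you the image, a quick sanity check before publishing can save you from a broken link. Here is a minimal sketch, assuming images are referenced with the plain ![alt](path) syntax and live under static/. The name missing_images is made up for this example; it is not a Hugo command.

```shell
# missing_images POST STATIC_DIR
# Hypothetical helper (not part of Hugo): prints every image path that POST
# references as ![alt](path) but that does not exist under STATIC_DIR.
missing_images() {
  post="$1"
  static_dir="$2"
  # Pull out each ![alt](path) occurrence, then keep only the path part.
  grep -o '!\[[^]]*]([^)]*)' "$post" |
    sed 's/^.*(\(.*\))$/\1/' |
    while IFS= read -r path; do
      [ -f "$static_dir/$path" ] || printf '%s\n' "$path"
    done
}
```

From the site root, something like missing_images content/post/myPost.md static would list the references that have no matching file. It is deliberately naive: remote URLs and image links with a quoted title will be reported as missing too.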

03 Feb 2017, 19:07

Hugo, the static site generator

Hugo, a Static Site Generator

In my last post, I covered the rationale behind using a static site generator. Static site generators are not just for creating blogs. They can also be used to create online resumes, company sites, online documentation, etc.

The default choice for static site generator is Jekyll, which has the most support, but it’s troublesome to install and use. Hugo is a popular alternative that is easier to install, and faster to work with. It’s implemented in Golang, a.k.a. Go. This means it is written in a statically compiled language (The Best Kind) and is completely dependency free. Dependency hell is the bane of my existence. It’s like work that you have to do before you can start working. Anyway, let’s look at how to get started.

Hugo Install Process (MacOS)

This is so simple, and its simplicity is the reason why I went with Hugo after trying the more popular Jekyll, which was a mess.

brew update && brew install hugo
hugo new site myBlog
cd myBlog
git clone https://github.com/dplesca/purehugo.git themes/purehugo
echo 'theme = "purehugo"' >> config.toml  # TOML string values must be quoted

Creating or customizing themes is beyond the scope of this post, but what we are doing here is “installing” a pre-baked Hugo theme, and then setting it as our default.
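Appending to config.toml with echo works once, but run it twice and you end up with the same key in the file twice. Here is a rough sketch of a safer way to set a string value, assuming one key per line at the top level of the file. The name set_toml_string is hypothetical, and this is plain text substitution, not a real TOML parser.

```shell
# set_toml_string KEY VALUE FILE
# Hypothetical helper: replace an existing top-level `KEY = ...` line,
# or append one if the key is not present yet.
# Limitations: the match is case-sensitive, and VALUE must not contain
# the characters | or & (they are special to sed here).
set_toml_string() {
  key="$1"
  value="$2"
  file="$3"
  if grep -q "^$key[[:space:]]*=" "$file"; then
    # Write to a temp file instead of using sed -i, whose syntax differs
    # between the BSD sed on MacOS and GNU sed on Linux.
    sed "s|^$key[[:space:]]*=.*|$key = \"$value\"|" "$file" > "$file.tmp" &&
      mv "$file.tmp" "$file"
  else
    printf '%s = "%s"\n' "$key" "$value" >> "$file"
  fi
}
```

For example, set_toml_string theme purehugo config.toml can be run as many times as you like and still leaves exactly one theme line.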

Hugo Workflow: Drafting & Publishing a Post (MacOS)

In order to create a new post for your blog:

cd myBlog
hugo new post/myReviewOfHugo.md
open content/post/myReviewOfHugo.md # write the post in your text editor

# Optional: launch a local webserver, give it a sec, and preview the blog
hugo server & sleep 2 && open http://localhost:1313/blog/
killall hugo # because we left hugo running in the background there

While the server is running, you can actually continue to edit the post in your editor. The server will live update the view in your browser. This is optional, but it will verify that everything will look correct when you publish.

When you’re satisfied, you can generate the actual web content to disk, and publish it. The following steps assume you are using GitHub Pages, so the publish is made using a git push.

# You must already have a GitHub project, and in its settings page you must have set GitHub Pages to "master branch / docs". In this example, the project name is "blog".

# These are the one-time Hugo steps:
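# TOML string values must be quoted. If config.toml already sets baseURL, edit that line instead of appending a duplicate.
echo 'publishDir = "docs"' >> config.toml
echo 'baseURL = "https://myname.github.io/blog"' >> config.toml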

# These are the one-time Git steps:
rm -rf themes/purehugo/.git # delete the theme's own git files so they don't interfere
git init  # turn this directory into a git repo
git remote add origin https://github.com/myname/blog.git

# These are the only steps needed every time you publish new content:
hugo  # this generates HTML + JS + CSS under the publishdir (blog/docs/)
git add -A
git commit -m "Add a blog post about whatever."
git push origin master  # name the remote and branch, since no upstream was configured

That’s all there is to it, although you can always use a different Git client if you don’t like the command line. I sure as hell don’t like it (I use Atlassian Sourcetree) but it’s up to you.
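If you do stay on the command line, the every-time steps are easy to wrap up. This is only a sketch: publish and run are names I made up, and it assumes the remote is called origin and the branch is master, as in the setup above. Setting DRY_RUN prints the commands instead of executing them.

```shell
# run CMD ARGS...: execute the command, or just print it when DRY_RUN is set.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# publish MESSAGE: hypothetical wrapper around the every-time publish steps.
# Each step runs only if the previous one succeeded.
publish() {
  run hugo &&
    run git add -A &&
    run git commit -m "$1" &&
    run git push origin master
}
```

Try DRY_RUN=1 publish "Add a blog post about whatever." first to see what would happen, then drop the DRY_RUN=1 to do it for real.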

Post Metadata: WTF is “Front Matter” ?

In each post (each Markdown file), there is some metadata in a header at the top of the file, called “front matter.” Jekyll was the first to introduce this concept (in name, at least), but it is common across other static generators now. Hugo lets you write front matter in YAML, JSON or TOML (the default). If you’ve worked in web development surely you’ve heard of JSON, but now you may be asking WTF is YAML and TOML?

These are syntaxes for structured configuration data that static site generators have adopted for their settings. It seems to be a case of “reinventing the wheel” of INI files, which have been around for decades. Basically, a config file. Key-value pairs. Associative array. Hash table (please don’t shorten it to just “hash,” words have meanings, know the difference). Dictionary. They’re all basically the same thing. YAML started in 2001 as a minimalist-syntax alternative to XML, and since version 1.2 in 2009 it has been a superset of JSON, which itself was a minimalist alternative to XML. We’ll get this right some day.

The CEO of GitHub and inventor of Jekyll, probably high on the smell of his own farts, in 2013 decided that YAML needed to be even more minimal, and renamed this idea after himself (“TOM”), and thus was born TOML, which primarily because of the fame of the creator has now spread to a few other projects. Thus, we have minimalized almost all the way back to INI files (except now it has been “standardized”). Progress.

Oh and by the way, none of these are actually markup languages at all. They just aren’t. The insistence on propagating the use of the acronym letters -ML for config file formats is basically an inside joke at this point.

The takeaway for me is that in the mid-2000s it became fashionable to ditch braces and brackets in all syntax for everything, in favor of careful indentation. Thus returning to the fashion of the 1970s and FORTRAN. You know what’s popular today, though? Look at Go, Rust, and Swift. Yea that’s right, compiled languages with curly braces are back again. Urge to kill risinnnnnnng. All right, deep breaths.

Anyway, within this “front matter,” you can define tags and categories, timestamps, and titles for every post. For example, the front matter for this post was defined as such:

+++
Tags = ["web","blogging","Hugo", "Jekyll", "YAML", "TOML"]
Description = "Initial impressions on the static site generator, Hugo"
date = "2017-02-03T19:07:12-05:00"
title = "Hugo, the static site generator"
Categories = ["web","blogging","Hugo"]
+++

You can also set optional variables like a publish date in the future (Hugo will not render the post to the publish directory until this date), or an alias (if you want to forward visitors from another URL to this post instead).

The configuration file for your Hugo site, config.toml, is also in this syntax.
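Because some editors mangle the front matter, it can be handy to pull it out of a post and compare it before and after an editing session. A small sketch, assuming TOML front matter delimited by +++ lines as in the example above; front_matter is a hypothetical helper name, not a Hugo command.

```shell
# front_matter FILE: print the lines between the first pair of +++ delimiters.
front_matter() {
  awk '
    /^[+][+][+][ \t]*$/ { fences++; next }  # delimiter line: count it, skip it
    fences == 1         { print }            # inside the front matter
    fences >= 2         { exit }             # past the closing delimiter
  ' "$1"
}
```

Save the output of front_matter on a post to a file, edit the post in the editor you are testing, then diff the saved copy against a fresh run. Any difference means the editor touched your metadata.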

That more or less covers the basics of Hugo, and static site generators like it. My next post will be about Markdown (an actual markup language).

03 Feb 2017, 15:43

How to Blog in 2017

My first blog, back in the early 2000s, was on a hosted blogging platform known as Blogger. It was simple and convenient: as the admin you just logged into the Blogger service, edited posts in your browser, and hit publish. This is basically how Tumblr still works today, although Tumblr’s innovation was to include media file hosting and allow everyone to repost each other’s content.

But Blogger content was static, and textual. You could post a few paragraphs of text, and embed images if they were hosted elsewhere. Only later did Google buy out the service and integrate it with their photo-hosting service. In the mid-2000s, many geeks wanted more flexibility, like the ability to limit access to members only, integrate their own photo/video/audio collections, and – most importantly – control the appearance of their blog.

So my second blog was generated with a Web Content Management System (CMS) and self-hosted on a home Windows XP PC running the “WAMP” software stack, with a DNS record from a free dynamic DNS service. If you’re a system admin or security expert you’re probably cringing. I am too. In hindsight, it’s a miracle if that PC was not 0wned by a hacker at some point, but at least I have no evidence to believe it was. But I thought my blog was pretty cool: it had a custom look, custom domain name, its own forums, file storage, a weather widget on the sidebar. I believe it was using the Drupal CMS. The 2000s saw the rise of the “web app,” the concept that an application was something that ran in a scripting language on a web server and presented you with a web page as the user interface. As a system programmer who thinks an application is a single self-contained compiled binary, I thought this was anathema. But the rest of the tech world decided otherwise: websites that were not database-backed and server-side-scripted were totally 90s! That meant lame. 90s wasn’t cool again yet.

The reason why the self-hosted CMS approach to blogging is cringey is that it is notoriously difficult to secure a CMS, especially one written in PHP. PHP is now known to be prone to recurring security issues because of flaws in its design (unvalidated input, access control problems, command injection issues, etc.), and the use of a SQL database means fighting a war against SQL injection attacks from anyone who uses your site. Spammers will leave spam comments. You just want to run a blog, but now you’re a system admin for a web server, a database admin for a database, and you have to understand the PHP (or Java, or whatever) that generates your site on the fly every time a visitor loads a page. If you ever want to use a web hosting service for your CMS-based site instead of hosting it at home, you have to pay real money, because supporting and securing Apache, PHP, and MySQL is a full-time job! On top of all of that, all of this script and database stuff makes the site slower to load, and prone to Denial of Service attacks.

This is no way to live. And so, as is typical, the tech community decided that what is old is new again, and that static sites were actually a good idea that should never have been abandoned. Rolling my eyes so hard I went temporarily blind, I actually resisted even caring about the cool way to blog in the 2010s. I used LiveJournal for a bit. I tried a hosted WordPress (WordPress.com) account to blog about game console emulators. I got into using Tumblr, even though (or maybe because) the tech community is not on there. But now I’ve decided to take a fresh look at what’s fresh, and give it a chance.

Here are some things I noticed about the current Preferred Way for Cool Kids to Blog.

  • If you write any kind of code for a living, you host it on a free hosting service in the .io TLD. This is just what is fashionable, and like all fashion choices, it can’t really be explained. “Everyone is doing it”, including this blog. We are not all hosting sites in the British Indian Ocean Territory, but yes, this TLD exists because the UK stole the Chagos Islanders’ land in the Indian Ocean during the Cold War, and its only other claim to fame might be its black site CIA torture prison. How’s that for oblivious Silicon Valley tech privilege!
  • Because HTML, JS, and CSS are nearly impossible to work in directly anymore (much like assembly code), people write their web page content in a highly simplified markup language, and then run that through a compiler (oh, sorry, static site generator) to produce a web site in actual HTML, JS, and CSS. The output is then posted to a web hosting service. There are some 450 static site generators to choose from. This site uses Hugo, which I’ll talk about in a future post. An even more popular choice is Jekyll, which is fine…for me to poop on.
  • The simplified markup language of choice currently is Markdown, which will also be the subject of a future post because it is pretty neat.
  • Because supporting the ability for visitors to post comments would require a dynamic site, static sites have outsourced this responsibility to third-party services. That is, comments are implemented with an embedded JavaScript element that is loaded from a remote service. The dominant choice of service at the moment is Disqus. This and any other user-account-based service that embeds its content on your blog is a privacy problem: it means Disqus is basically assigning you an identifier and following you around to all of the Disqus-enabled sites you visit. Ghostery blocks Disqus by default, for this reason. I suggest using Twitter to reach me if you have a comment.
  • Because static sites cannot track how many visitors they get and where they visited from, that too has been outsourced. Google Analytics is now more prevalent than HPV and herpes combined. I have had to delete it out of every web-related code repository that I have borrowed to make anything. Even if I’m the last one on Earth who cares about privacy, I will not be including that here. The same goes for social media sharing links. You’re a big boy and/or girl, I bet you’ll figure out how to share a URL yourself!
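If you want to check whether a borrowed theme or your generated output still phones home, a recursive grep gets you most of the way. This is a rough sketch: find_trackers is a name I made up, and the list of domains is my own assumption about what to look for, not an exhaustive blocklist.

```shell
# find_trackers PATH...
# Hypothetical helper: list files under the given paths that mention a few
# well-known analytics or comment-embed domains. The domain list is an
# assumption for illustration; extend it to taste.
find_trackers() {
  grep -rlE 'google-analytics\.com|googletagmanager\.com|disqus\.com' "$@"
}
```

Running find_trackers themes docs from the site root prints the offending files, or nothing (with a non-zero exit status) if the tree is clean.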

So there you have it, my take on the Way to Blog in the 2010s for Cool Kids. Thanks for reading. – MM