Pelican Static Web Generator Parsing Problems

Pelican is a static website generator that can produce static HTML from a variety of sources: I mostly use reStructuredText (RST) files, although it can also ingest HTML and, I think, Markdown. I've used it extensively with almost zero problems for over two and a half years now. This blog is built with it.

I'm using Fedora Core's bundled version of pelican-3, version 3.6.3, and have found what appears to be a parsing bug. I was writing a review of the movie "Hidden Figures", but when I regenerated the site I found that the blog entry had no title on either the index page or the content page. In Pelican RST a title is of the form :title: Your Title Here. Initially the problem appeared to be related to the word "hidden" in the title.

Title is hidden:

:title: "Hidden Figures" - Movie Review
:title:    "Hidden Figures" - Movie Review
:title:  "Hidden Figures" - Movie Review
:title: "\Hidden Figures" - Movie Review
:title: "hidden Figures" - Movie Review
:title: "Not Hidden Figures" - Movie Review
:title: Movie Review - "Hidden Figures"
:title: "The multiple hidden fallacies of Ansible Development"

Title is NOT hidden:

:title: "Hidden" Figures - Movie Review
:title: Hidden Figures - Movie Review
:title: 'Hidden Figures' - Movie Review
:title: "Hidden_Figures" - Movie Review

From these tests, it appears that the title vanishes only when "hidden" is inside double quotes together with other words. If "hidden" is the only word in the quotes, or if the title uses single quotes or no quotes at all, the title displays normally.

Generated HTML for a "normal" blog title including double quotes:

<h1><a href="/blog/the-trouble-with-harry.html" rel="bookmark" title="Permalink to "The Trouble With Harry" - Movie Review">"The Trouble With Harry" - Movie Review</a></h1>

Note the double quotes inside double quotes in the title attribute!

Seen in Firefox's "Inspect Element":

<a href="/blog/the-trouble-with-harry.html" rel="bookmark" title="Permalink to " the="" trouble="" with="" harry"="" -="" movie="" review"="">"The Trouble With Harry" - Movie Review</a>

That's made a hash of things. But it's always worked ... until the inclusion of the word "hidden". If we use "Inspect Element" on the "Hidden Figures" entry, we see this:

<h1><a href="/blog/hidden-figures.html" rel="bookmark" title="Permalink to " figures"="" -="" movie="" review"="" hidden="">"Hidden Figures" - Movie Review</a></h1>

Very similar, but note that hidden="".

"hidden" is an HTML attribute. It's normally used like this:

<h1 hidden>This is a Hidden title</h1>

so that the h1 tag will never show up on the page. Turns out that hidden="" has the same effect.
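The way the unescaped quotes get split into stray attributes can be reproduced outside the browser. The following is a minimal sketch using Python's standard html.parser module, which is not Firefox's parser but tokenizes these particular tags in a comparable way. The attribute_names helper and the example URL are hypothetical stand-ins for whatever the theme template actually emits.

```python
from html.parser import HTMLParser


class AnchorAttrs(HTMLParser):
    """Collect the attribute names of the first <a> start tag seen."""

    def __init__(self):
        super().__init__()
        self.names = None

    def handle_starttag(self, tag, attrs):
        if tag == "a" and self.names is None:
            self.names = [name for name, _value in attrs]


def attribute_names(title):
    # Hypothetical stand-in for the generated markup: the title is pasted
    # into the title="..." attribute without any escaping.
    markup = (
        '<a href="/blog/example.html" rel="bookmark" '
        'title="Permalink to ' + title + '">' + title + "</a>"
    )
    parser = AnchorAttrs()
    parser.feed(markup)
    parser.close()
    return parser.names


for title in [
    '"Hidden Figures" - Movie Review',
    '"Hidden" Figures - Movie Review',
    "'Hidden Figures' - Movie Review",
    '"Hidden_Figures" - Movie Review',
]:
    names = attribute_names(title)
    # A bare attribute named exactly "hidden" is what makes the link vanish.
    print("hidden" in names, names)
```

If this sketch matches what the browser does, it would also explain the test results above: in "Hidden" Figures the closing quote sticks to the word, so the stray attribute is named hidden" rather than hidden, and in "Hidden_Figures" it becomes hidden_figures". Neither of those is a real HTML attribute, so the title stays visible.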

Trying one more time, this time with 'Hidden Figures' in single quotes, we get:

<h1><a href="/blog/hidden-figures.html" rel="bookmark" title="Permalink to 'Hidden Figures' - Movie Review">'Hidden Figures' - Movie Review</a></h1>

This looks totally normal and as it should be, which suggests that the primary problem is double quotes in :title: fields - and the fact that Pelican isn't escaping them when they land inside an HTML attribute (they should be turned into &quot;). There are undoubtedly other words that would cause problems, since any word that matches a real HTML attribute name could be picked up the same way, and I'd be concerned about other RST header fields too. The workaround for now would seem to be to use single quotes rather than double quotes in all RST headers ...
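For reference, this is roughly what the escaping should look like. It is a minimal sketch using Python's standard html.escape, not Pelican's actual code, and the permalink_anchor helper is a hypothetical name for illustration only. Pelican themes are Jinja2 templates, so the real fix would presumably be to apply Jinja2's escape filter to the title wherever it is placed inside an attribute; the exact template line in the bundled theme isn't shown here, so treat that as an assumption.

```python
import html


def permalink_anchor(url, title):
    # Hypothetical helper: escape the title before it is placed inside
    # the double-quoted title="..." attribute (and in the link text).
    safe_title = html.escape(title, quote=True)
    safe_url = html.escape(url, quote=True)
    return (
        '<a href="' + safe_url + '" rel="bookmark" '
        'title="Permalink to ' + safe_title + '">' + safe_title + "</a>"
    )


anchor = permalink_anchor(
    "/blog/hidden-figures.html", '"Hidden Figures" - Movie Review'
)
print(anchor)
```

With the double quotes turned into &quot;, the title attribute stays in one piece and the word "hidden" never gets the chance to become an attribute of its own.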