How does incompatibility emerge? The case of markdown for HTML and PDF production.

abstract: Incompatibility emerges often and is usually noticed only when it is too late. Today I observed a case that I want to analyse in order to understand the forces behind such developments.

The use case

In my blog I collect contributions that are typically the size of a blog entry (fewer than 5 printed pages), as well as collections of several entries that together form a small book. In both cases, HTML files for the homepage and printed output are desired.

The processes

HTML for the homepage

All the files for the homepage are produced by bake, a makefile-like tool written in Haskell. It starts from the desired result files, identified by the rule that each md file (i.e. text written in markdown) results in an HTML file included in the homepage. The markdown files are converted using pandoc (also a Haskell tool!), including the references from the bib files etc. All the supporting files that are linked from the files produced from the markdown files are copied to the site.
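The rule "each md file results in an HTML file" can be sketched as follows. This is only an illustration in Python, not the code of bake: the function names and the directory names `src` and `site` are hypothetical, and the sketch assumes pandoc is called through its command line (with the options `--standalone`, `--citeproc` and `--bibliography`) rather than as a library. It only builds the commands; it does not run them.

```python
from pathlib import PurePosixPath


def html_target(md_path, source_root, site_root):
    """Map a markdown source path to the HTML file it should produce."""
    rel = PurePosixPath(md_path).relative_to(source_root)
    return str(PurePosixPath(site_root) / rel.with_suffix(".html"))


def pandoc_command(md_path, html_path, bib_path=None):
    """Build (but do not run) a pandoc command line for one file."""
    cmd = ["pandoc", md_path, "--standalone", "-o", html_path]
    if bib_path is not None:
        # Resolve citations against the given bib file.
        cmd += ["--citeproc", "--bibliography", bib_path]
    return cmd


def plan(md_files, source_root, site_root, bib_path=None):
    """One conversion command per markdown file, as the rule demands."""
    return [
        pandoc_command(md, html_target(md, source_root, site_root), bib_path)
        for md in md_files
    ]
```

For example, `plan(["src/blog/post.md"], "src", "site")` yields a single command that writes `site/blog/post.html`.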

The markdown files include metadata in a YAML header: title, author, date etc.

Producing the booklets

The goal is to produce a booklet, i.e. a PDF file containing all the text in an organized format. It is also produced using pandoc, together with LaTeX.

Why no simple solution

The text with fonts, footnotes and layout in lines and paragraphs is the same for both outputs; what differs is the interpretation of the title and other front matter. I do not see a simple solution for the necessary conversions:

Solution in parallel with blog conversion

The pandoc reading step is the same for both processes. Could the tex file for the printed output result from a template (similar to the blog production in bake)?

Putting the tex files together requires a main.tex file with the preamble, the includes for the files, and the back matter. The same information is necessary for the construction of the index file for this topic (which I assume to always be a directory). This information could go into the YAML metadata of the index.md file, which is then transformed into a booklet.
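A minimal sketch of this idea, assuming the metadata of index.md has already been read into a dictionary. The keys `title`, `author` and `chapters` are hypothetical, as are the function names; the LaTeX commands used (`\include`, `\frontmatter`, `\mainmatter`, `\backmatter`) are those of the standard book class. The point is only that one list of entries can drive both the booklet and the index page.

```python
def main_tex(meta):
    """Build the text of a main.tex file from booklet metadata.

    No LaTeX escaping is done; titles with special characters
    such as & or % would need it.
    """
    lines = [
        r"\documentclass{book}",
        r"\title{" + meta["title"] + "}",
        r"\author{" + meta["author"] + "}",
        r"\begin{document}",
        r"\frontmatter",
        r"\maketitle",
        r"\tableofcontents",
        r"\mainmatter",
    ]
    # One \include per entry; \include takes the name without ".tex".
    lines += [r"\include{" + name + "}" for name in meta["chapters"]]
    lines += [r"\backmatter", r"\end{document}"]
    return "\n".join(lines) + "\n"


def index_links(meta):
    """The same chapter list, read as links for the HTML index page."""
    return [name + ".html" for name in meta["chapters"]]


# Hypothetical metadata, as it might appear in the header of index.md.
booklet = {
    "title": "Collected Notes",
    "author": "A. Author",
    "chapters": ["intro", "incompatibility"],
}
```

Here `main_tex(booklet)` gives the skeleton for the printed output, while `index_links(booklet)` gives the entries for the blog index of the same directory.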

Particulars of the conversion to blog

The issues in the abstract

If I distance myself from the particulars of the use case, I could try to see the generic reasons for the incompatibility between the two use cases and why a solution with a few switches will not work.

Aside: solutions with a few switches are difficult with Pandoc. There are too many switches and extensions, and the descriptions are not always easy to understand. It takes a long time to find the appropriate ones.

The solution would be straightforward if the structure of elements were a simple (flat) list, but it is not: the title is not another header level (say 0), the abstract is not a.

Pandoc is relatively close to a structure which is very restricted. It is a tree with a few data types as nodes. The transformation can be approached as a transformation of the structure.

The structure is not orthogonal enough to allow two simple output transformations, but I expect that the JSON structure, into which the input can be translated, will allow two transformations.
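To make this concrete, here is a small sketch of two transformations over the JSON form of a document (the form produced by `pandoc -t json`, with the top-level keys `pandoc-api-version`, `meta` and `blocks`). The sample document is hand-written and shortened, and the function names are hypothetical. How the title should actually appear in each output is an assumption made for illustration: a top-level heading for the blog and a chapter heading for the booklet.

```python
def inlines_to_text(inlines):
    """Flatten a list of inline nodes to plain text.

    Only Str, Space and SoftBreak are handled in this sketch;
    other inline types (Emph, Link, ...) are skipped.
    """
    parts = []
    for node in inlines:
        if node["t"] == "Str":
            parts.append(node["c"])
        elif node["t"] in ("Space", "SoftBreak"):
            parts.append(" ")
    return "".join(parts)


def meta_text(doc, key):
    """Read one metadata field as plain text ("" if absent)."""
    value = doc["meta"].get(key)
    if value is None:
        return ""
    if value["t"] == "MetaInlines":
        return inlines_to_text(value["c"])
    if value["t"] == "MetaString":
        return value["c"]
    return ""


def blog_title(doc):
    """Blog interpretation (assumed): the title heads the HTML page."""
    return "<h1>" + meta_text(doc, "title") + "</h1>"


def booklet_title(doc):
    """Booklet interpretation (assumed): the title opens a chapter."""
    return "\\chapter{" + meta_text(doc, "title") + "}"


# Hand-written sample in the shape of pandoc's JSON output.
sample = {
    "pandoc-api-version": [1, 23],
    "meta": {
        "title": {
            "t": "MetaInlines",
            "c": [{"t": "Str", "c": "My"}, {"t": "Space"},
                  {"t": "Str", "c": "Post"}],
        }
    },
    "blocks": [{"t": "Para", "c": [{"t": "Str", "c": "Text."}]}],
}
```

The `blocks` part is left untouched by both functions, which matches the observation that the body text is the same for both outputs and only the front matter is interpreted differently.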

Another case

I saw an important project fail: the Leksah IDE for Haskell. Initially I embraced the effort and supported it by writing documentation (probably 2013); Leksah was the IDE I used, probably from 2012 till 2020. Leksah was the fastest, most comfortable and best integrated IDE. When it worked, it was a pleasure to use and it disappeared, as any good tool should.

I gave up because it became more and more difficult to install. Installations usually failed, even after days of effort. Leksah became too complicated:

The idea to develop applications and connect them to the browser with JavaScript code was attractive (I experimented in 2013 or 2016; for a retrospective ). The question is still open today. Hamish (one of the main movers in the Leksah development) is working on GHCJS, a Haskell to JavaScript compiler, which obviously influenced the Leksah development a lot.

Analysis 1: misunderstood versioning

Leksah tried to support old GHC versions for too long (it seems to have given up since). I think I understand the GHC major-minor versioning method (TODO add link to GHC doc), which, at least for me, means giving up improvements for old versions. With Hackage preserving old versions indefinitely, old versions can still be built, but improvements are not backported. Backports in Debian are a special concept and a special effort to include new services in old versions; they are not in the mainline of development.

Generalization: Perhaps misunderstanding a fundamental concept of one of the foundations of a development creates a tension which will, over time, bring failure.

Analysis 2: separate undocumented development from stable

Again, something to be learned from Debian: there is a stable version, evolving in steps, separated from the development. The stable branch can always be installed and run and is not affected by development.

I understand that maintaining the stable branch is not as much fun as adding new features!

Analysis 3: Avoid opinionated foundations

The choice of NixOS as a base (which rests on Windows, Linux and macOS) is an interesting idea, but its concepts deviate from the mainstream ideas of installation management. It does not seem to have a large group of developers and users (compared to Debian).

Summary

Now that I use VScode (after trying Atom; they seem about equal in usability), I am amazed at how smooth, snappy and well integrated Leksah was. The base idea, to build an IDE in Haskell for Haskell, was definitely a good one.

Could the good parts be brought over to the VScode/Atom foundation? The editor was not the main part of Leksah, but it accounted for a large part of the complication in the foundation (at least for me, when trying to understand the code).
