loccount designer’s notes

Motivation and history

This program started out with two purposes: (1) to be my first, language-learning project in Go, and (2) to do a faster job than David Wheeler’s sloccount tool. At the time I was regularly taking statistical snapshots of NTPsec so we could talk about how much code we’d removed. I was dissatisfied with sloccount’s slowness, its large set of prerequisites, and its unnecessary verbosity.

The program grew into more than that because I enjoy writing parsers. What was originally a translation of ad-hoc code evolved into a collection of generic language parsers driven by tables of syntax elements. Over time the number of distinct parsers decreased; they all got folded into what had originally been the parser for the C language family, the only one with quirk flags.

Eventually the goal became: I should be able to add the next language just by adding a row to the table driving the generic parser.

It became a game to see, at each stage, how many more languages I could express the relevant syntactic parts for before having to add capabilities (gated by quirk flags) to the framework.

I like learning new languages, and this was wonderful motivation to taste a lot of them and to make comparisons from which I might learn.

I ended up trawling the following sources to find languages to include:

Comparison of programming languages (syntax): https://en.wikipedia.org/wiki/Comparison_of_programming_languages_(syntax)
Comparison of programming languages (strings): https://en.wikipedia.org/wiki/Comparison_of_programming_languages_(strings)
Baylor Computer Science Programming Archive: http://cs.ecs.baylor.edu/~maurer/SieveE/
List of programming languages: https://en.wikipedia.org/wiki/List_of_programming_languages
Comments: http://www.gavilan.edu/csis/languages/comments.html
ACM "Hello World!" project: http://www2.latech.edu/~acm/HelloWorld.shtml

Why particular languages are supported, or not

I haven’t pulled in everything I could have. Factors that motivated me to include a language:

Commonly used on Unix systems
Of some particular historical interest
Easily fits in the framework

Factors that motivated me not to support languages I theoretically could have included:

It’s extinct.
It’s proprietary.
It’s impossible or extremely difficult to handle with the generic framework.
It’s restricted in scope to a narrow functional niche.
It’s an academic toy intended to generate papers rather than for production use.

See the Hacking Guide for how to add a language, or what to do to get me to add one. In general, if someone asks for language X to get support, I’m not going to turn down the request unless it would be an unreasonable amount of work.

Some languages I turned down follow. In most cases the reason should be obvious. If you have an actual need for one of these, speak up…

ABAP, Agda, Alex, Scratch.