Name

loccount — count lines of code in a source tree and perform cost estimation

Synopsis

loccount [-c] [-e] [-i] [-l] [-u] [-x pathlist] [-V] [-?] file-or-dir

DESCRIPTION

This program counts source lines of code (SLOC) in one or more files or directories given on the command line. A line of code is counted if it (a) includes characters other than whitespace and a terminating newline, and (b) is not composed solely of a comment or part of a comment. Comment leaders and trailers in string literals (including multiline string literals) in languages that have them) are ignored.

Optionally, this program can perform a cost estimation using the COCOMO I model. It uses the "organic" profile of COCOMO I, which is generally appropriate for open-source projects.

SLOC figures should be used with caution. While they do predict project costs reasonably well, they are not appropriate for use as productivity measures; good code is often less bulky than bad code. Comparing SLOC across languages is also dubious, as differing languages can have very different complexity per line.

With these qualifications, SLOC does have some other uses. It is quite effective for tracking changes in complexity and attack surface as a codebase evolves over time.

All languages in common use on Unix-like operating systems are supported. For a full list of supported languages, run "loccount -l". Note that (1) "shell" includes bash, dash, ksh, and other similar variants descended from the Bourne shell, and (2) the language "c-header" is a marker for C-style include (.h) files which will be assigned to the dominant C-family language in a report (if there is one).

The program also emits counts for build recipes - Makefiles, autoconf specifications, scons recipes, and waf scripts. Generated Makefiles are recognized and ignored.

Languages are recognized by file extension or filename pattern; executable filenames without an extension are mined for #! lines identifying an interpreter. Files that cannot be classified in this way are skipped, but a list of files skipped in this way is available with the -u option.

Some file types are identified and silently skipped without being reported by -u; these include symlinks, .o, .a, and .so object files, various kinds of image and audio files, and the .pyc/.pyo files produced by the Python interpreter. All files and directories named with a leading dot are also silently skipped (in particular, this ignores metadata associated with version-control systems).

OPTIONS

-?
Display usage summary and quit.
-c
Report a COCOMO I cost estimate. Use the coefficients for the "organic" project type, which is the best for for most open-source projects.
-d n
Set debug level. At > 0, displays various progress messages. Mainly of interest to developers.
-e
Show the association between languages and file extensions.
-i
Report file path, line count, and type for each individual path.
-j
Dump the results as self-describing JSON records for for postprocessing.
-l
List supported languages and exit.
-u
List paths of files that could not be classified into a type.
-V
Show program version and exit.

HISTORY AND COMPATIBILITY

The algorithms in this code originated with David A. Wheeler’s sloccount utility, version 2.26 of 2004. It is, however, faster than sloccount, and handles many languages that sloccount does not.

Generally it will produce identical figures to sloccount for a language supported by both tools; the differences in whole-tree reports will mainly be due to better detection of some files sloccount left unclassified. Notably, for individual C and Perl files you can expect both tools to produce identical counts. However, Python counts are different, because sloccount does not recognize and ignore single-quote multiline literals.

A few of sloccount’s tests have been simplified in cases where the complexity came from a rare or edge case that the author judges to have become extinct since 2004.

The base salary used for cost estimation will differ between these tools depending on time of last release.

BUGS

The sloccount logic for treating multiple argument directories as different projects has not been reproduced. This may change in a future release.

PHP #-comments taking up an entire line or following only whitespace on a line will be counted, not recognized as comments and skipped.

Eiffel indexing comments are counted as code, not text. (This is arguably a feature.)

The asm counter assumes ";" winged comments. This is correct for Intel assemblers and some others, but not all.

In lex, flex, yacc, and bison files, block comments beginning within string literals will confuse this program and throw a warning.

Literate Haskell (.lhs) is not supported. (This is a regression from sloccount).

REPORTING BUGS

Report bugs to Eric S. Raymond <esr@thyrsus.com>.