I maintain a man-page-to-DocBook converter, doclifter. A side effect of this program is that it serves as a validator for the correctness and portability of the markup used on Unix manual pages. I test it by running it against all the manual pages in a full Xubuntu 14.10 installation with some extras; there are 11394 of these on my development machine, of which 962 already have DocBook masters. It converts 10063 (96.46%) of the remaining 10432 into valid XML-DocBook.

Most of the remaining 3.54% fail because groff(1) and its kin have weak-to-nonexistent validity checking. Often, doclifter fails because of outright errors in macro usage that groff does not catch. Sometimes it fails on constructions that are legal but perverse. Very occasionally it throws an error because a man page is correct but has a structure that cannot be translated to DocBook. I keep a database of patches for such problems, and periodically try to push fix patches out to the manual-page maintainers.

Even if you do not care about DocBook, this cleanup work benefits all third-party manual page viewers, including the GNOME and KDE documentation browsers; groff constructions that confuse doclifter are very likely to produce visible problems on these.

The table below is a listing of the 357 (3.42%) pages on which doclifter fails, but the failure can be prevented with a fix patch to the manual page source. 12 pages (0.12%) remain intractable, generally due to markup problems more severe than a point patch can address. I am working with the individual projects responsible to get those cleaned up.
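
The percentages quoted above can be re-derived from the raw counts. The following is a minimal Python sketch of that arithmetic; the variable names are mine and are not part of doclifter.

```python
# Raw counts as stated in the text above.
total_pages = 11394          # manual pages on the development machine
with_docbook_masters = 962   # pages that already have DocBook masters
converted = 10063            # pages doclifter lifts to valid XML-DocBook
fixable = 357                # failures preventable with a fix patch
intractable = 12             # failures a point patch cannot address

# Pages that doclifter actually has to convert.
remaining = total_pages - with_docbook_masters
# Pages on which conversion fails.
failures = remaining - converted


def pct(part, whole):
    """Return part/whole as a percentage rounded to two decimal places."""
    return round(100 * part / whole, 2)


converted_pct = pct(converted, remaining)
failure_pct = pct(failures, remaining)
fixable_pct = pct(fixable, remaining)
intractable_pct = pct(intractable, remaining)

print(remaining, failures)
print(converted_pct, failure_pct, fixable_pct, intractable_pct)
```

The fixable and intractable pages together account for every failure, which is why the two smaller percentages sum (up to rounding) to the overall failure rate.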

It is likely that you are reading this because you have received email telling you that patches are associated with your name or list address. Please consider incorporating them, or equivalents, in your next release. Also, please write back and tell me what you plan to do so I can keep my database up-to-date.

If you are not already considering it, please think about moving the documentation masters of your project to DocBook (or some format from which you can generate DocBook). If everybody moved to using DocBook as a common exchange format, it would become much easier to support unified browsing of all system documentation with Web-like hypertext capabilities, automatic indexing, and rich search facilities.

Tools to generate man pages, HTML, and PostScript from DocBook files are open-source and generally available. My program, doclifter, should make moving your manual-page masters to DocBook a fairly painless process.

Many major open source projects (including the Linux kernel, the Linux Documentation Project, X.org, GNOME, KDE, and FreeBSD) have already moved to DocBook or are in the process of doing so.

(Individual entries for accepted patches are no longer shown.)

Summary: 288 patches pending, 548 accepted, 0 rejected.

Status codes are as follows:


n No response yet.
p Maintainer has informed me that this is fixed in the masters, but I have not seen the fix yet.
y Accepted
r Rejected
s Superseded (page lifts correctly without the patch)
[0-9]+ number of mailings sent
b Address is blocked

Problem codes are explained after the table.


Patch:Problem code:Status:
_build_buildd_libcaca-0.99.beta18_ruby_.3caca
I F W Mn
_build_buildd_libcaca-0.99.beta18_caca_.3caca
I F W Mn
_build_buildd_libcaca-0.99.beta18_caca_codec_.3caca
I F W Mn
_build_buildd_libcaca-0.99.beta18_caca_driver_.3caca
I F W Mn
acl.5
Ib
admin.1posix
C1n
american.5
english.5
I 9p
analog.1
C Z1n
AnyEvent::FAQ.3pm
Wn
arp.7
p7n
as.1
Z y1n
bash.1
Ln
bc.1
J1n
bitmap.1
atobm.1
bmtoa.1
In
bootparam.7
I u 7b
btcflash.8
J1n
bzfs.6
ob
bzr.1
J X1n
cdparanoia.1
L1n
chat.8
Jp
chmoddic.1
B CbA
chroot.2
E L1n
co.1
ident.1
o2n
codepage.1
CbA
compose.1
edit.1
* y1n
corosync.conf.5
L Ip
curl_formadd.3
Jp
libcurl-tutorial.3
Jp
cvs.1
L3n
cxpm.1
Wp
dash.1
sh.1
J1n
dcut.1
R1n
Parse::DebControl::Error.3pm
W y1n
devnag.1
J1n
dh_install.1
i y1n
dhcp-eval.5
Js
directomatic.1
o G1n
dkms.8
X J1n
dmcs.1
mcs.1
gmcs.1
L An
dosbox.1
L1n
dump-acct.8
U1n
dv2dt.1
C1n
dvipdf.1
font2c.1
R2n
dvitodvi.1
R1n
editres.1
Ip
e2fsck.8
o2n
e2image.8
J1n
efax.1
J u g2n
ethtool.8
21n
exiv2.1
L1n
extractres.1
R2n
f2py.1
f2py2.7.1
C1n
faked-sysv.1
faked-tcp.1
faked.1
fakeroot-sysv.1
fakeroot-tcp.1
fakeroot.1
r1n
fence_drac.8
J1n
fence_na.8
W y1n
fence_drac5.8
J1n
fig2ps2tex.1
R2n
foomatic-rip.1
lpdomatic.8
o G2n
formail.1
lockfile.1
procmail.1
procmailex.5
procmailrc.5
procmailsc.5
K1n
fsck.8
fsck.ext2.8
fsck.ext3.8
fsck.ext4.8
fsck.ext4dev.8
op
ftm.7
D1n
fuser.1
J1n
fuzzyflakes.6x
C1n
gacutil.1
cli-gacutil.1
N1n
genhomedircon.8
H1n
genisoimage.1
o1n
getafm.1
Rn
getty.8
Ip
gftodvi.1
IbA
gpm-types.7
J C1n
grap.1
Qp
groff_out.5
1n
groff_mom.7
sp
grodvi.1
C 7p
gs.1
ghostscript.1
CbA
gthumb.1
L1n
gvcolor.1
C1n
gvpack.1
C *n
hfsutils.1
H J7n
hosts_access.5
hosts.allow.5
hosts.deny.5
hosts_options.5
I1n
html2text.1
C1n
html2textrc.5
X1n
hypertorus.6x
C1n
icclink.1
E2n
icctrans.1
L1n
tifficc.1
E2n
icedax.1
A I1n
ilbmtoppm.1
L1n
includeres.1
R2n
imake.1
Ip
intel_panel_fitter.1
E1n
IO::WrapTie.3pm
W C1n
ipcrm.1
Cp
ipppd.8
L2n
iptables-save.8
Un
ip6tables-save.8
UbA
ipv6calc.8
L o2n
irda.7
01n
ispell.1
buildhash.1
munchlist.1
findaffix.1
tryaffix.1
icombine.1
ijoin.1
C1n
ispell-wrapper.1
C1n
lamd.1
1n
lam.7
LAM.7
L1n
lam-helpfile.5
I1n
lastcomm.1
Ib
less.1
pager.1
J7n
lftp.1
I1n
libcaca-authors.3caca
W1n
libcaca-canvas.3caca
W J1n
libcaca-env.3caca
W L1n
libcaca-font.3caca
W J1n
libcaca-ruby.3caca
W1n
libcaca-tutorial.3caca
W1n
libpng.3
S Jp
libtiff.3tiff
I1n
list_audio_tracks.1
W1n
ln.1
j2n
locate.findutils.1
U1n
logger.1
Os
lkbib.1
Cp
lpr.1
U7n
makeindex.1
J1n
mathspic.1
J W tn
mawk.1
R1n
mdoc.7
Xn
mke2fs.8
mkfs.ext2.8
mkfs.ext3.8
mkfs.ext4.8
mkfs.ext4dev.8
Cn
mkjobtexmf.1
L y1n
mlocate.db.5
Jp
mmcli.8
Xn
mono.1
cli.1
J X1n
mono-config.5
X1n
more.1
Op
mp3-decoder.1
mpg123-alsa.1
mpg123-jack.1
mpg123-nas.1
mpg123-openal.1
mpg123-oss.1
mpg123-portaudio.1
mpg123.1
mpg123.bin.1
C J Ln
mpirun.1
mpirun.lam.1
L1n
mtools.5
mtools.conf.5
X1n
mtr.8
Jb
mutt.1
J Q1p
muttrc.5
J X u1p
nautilus.1
Ln
nautilus-connect-server.1
L1n
netpbm.1
J1n
netstat.8
C z1n
nfsmount.conf.5
C Y1n
nmcli.1
Jn
nsgmls.1
C I1n
ntfs-3g.secaudit.8
C1n
ntfs-3g.usermap.8
C1n
nvidia-settings.1
I x Y1n
nvidia-smi.1
I 6 Y1n
ode.1
e1n
oldfind.1
find.1
Jb
omfonts.1
W1n
openvt.1
open.1
L2n
orbd.1
W y Y1n
orca.1
s1n
pam_systemd.8
systemd-logind.8
systemd-logind.service.8
In
pax.1posix
W J L1n
pbmclean.1
pnmcomp.1
pnmnorm.1
pnmpad.1
pnmquant.1
pnmremap.1
pnmtotiff.1
pgmnorm.1
ppmcolors.1
ppmnorm.1
ppmntsc.1
ppmquant.1
ppmrainbow.1
ppmtogif.1
ppmtoxpm.1
tifftopnm.1
C1n
pbget.1
pbput.1
pbputs.1
W1n
pbmtextps.1
C1n
pcap-filter.7
I1n
pcreapi.3
IpA
pcreposix.3
HpA
pidgin.1
T1n
pkg-config.1
q1n
plot.1
plotfont.1
W1n
pnmhisteq.1
ppmcie.1
ppmlabel.1
sbigtopgm.1
R1n
pnmpaste.1
X1n
pnmtotiffcmyk.1
C1n
pnmtofiasco.1
e1n
policytool.1
W y1n
proc.5
o h1n
pstree.1
pstree.x11.1
C1n
pstops.1
RbA
ps2epsi.1
j7n
ps2pdfwr.1
R1n
psnup.1
J2n
ptx.1
j7n
pytest.1
C1n
qsub.1posix
I1n
rcsfile.5
d1n
regulatory.bin.5
w1n
renice.1
Os
rev.1
O Ls
rhythmbox-client.1
L1n
rlog.1
L1n
rlwrap.1
readline-editor.1
J1n
rmid.1
W y1n
rmiregistry.1
W y1n
ruby.1
ruby1.9.1.1
Lb
s3.4
Ip
sane-apple.5
Lp
sane-lexmark.5
L op
sane-mustek_pp.5
L op
sane-pixma.5
Wp
scons-time.1
L Z1n
script.1
Os
SDL_Init.3
L2n
SDL_CDPlayTracks.3
81n
see.1
run-mailcap.1
print.1
C1n
semanage-user.8
semanage-boolean.8
semanage-module.8
semanage-permissive.8
Bn
semanage-fcontext.8
B Un
setcap.8
C1n
sg_sat_phy_event.8
C1n
sgmlspl.1
L1n
slapd.conf.5
L IbA
slapd-config.5
L IbA
slapo-constraint.5
LbA
software-properties-gtk.1
WbA
spam.1
C1n
sudo.8
sudoedit.8
C1n
synctex.1
51n
rb.1
rx.1
rz.1
sb.1
sx.1
sz.1
e7n
tar.1
C Vp
tc-prio.8
tc-htb.8
tc-cbq.8
tc-cbq-details.8
C1n
tcpd.8
I2n
tcpdmatch.8
I1n
tek2plot.1
W1n
telnet.1
telnet.netkit.1
X1n
test.1
[.1
1n
TIFFGetField.3tiff
I1n
TIFFmemory.3tiff
41n
tnameserv.1
W y1n
tgatoppm.1
A1n
tidy.1
W m1bA
tree.1
b1n
ttf2tfm.1
I o1n
tune2fs.8
C7n
unrar.1
unrar-nonfree.1
Cn
upstart-events.7
I1p
usb-creator-gtk.8
W1n
xz.1
xzcat.1
unxz.1
unlzma.1
lzcat.1
lzma.1
C1n
unshare.1
Lp
updatedb.conf.5
Jp
uuencode.1posix
I1n
wall.1
L Os
whereis.1
L2n
winedbg.1
msiexec.1
Jn
winemaker.1
Un
X.7
I L op
XF86VM.3
XF86VidModeDeleteModeLine.3
XF86VidModeGetAllModeLines.3
XF86VidModeGetDotClocks.3
XF86VidModeGetGamma.3
XF86VidModeGetGammaRamp.3
XF86VidModeGetGammaRampSize.3
XF86VidModeGetModeLine.3
XF86VidModeGetMonitor.3
XF86VidModeGetPermissions.3
XF86VidModeGetViewPort.3
XF86VidModeLockModeSwitch.3
XF86VidModeQueryExtension.3
XF86VidModeModModeLine.3
XF86VidModeQueryVersion.3
XF86VidModeSetClientVersion.3
XF86VidModeSetGamma.3
XF86VidModeSetGammaRamp.3
XF86VidModeSetViewPort.3
XF86VidModeSwitchMode.3
XF86VidModeSwitchToMode.3
XF86VidModeValidateModeLine.3
Ip
xkbevd.1
Jp
xlogo.1
Ip
XML::LibXML::Pattern.3pm
Wn
XML::LibXML::Reader.3pm
Wn
XML::LibXML::RegExp.3pm
Wn
XML::LibXML::XPathExpression.3pm
Wn
xorg.conf.5
xorg.conf.d.5
L u2p
Xsecurity.7
Wp
Xserver.1
I1n
XStandards.7
Hp
xterm.1
L I2n
zic.8
I1n
zip.1
Jp
zipinfo.1
*1n
zipcloak.1
zipnote.1
zipsplit.1
Ip
zlib.3
C1n

Problem codes:

0
Function declarations had to be modified in order to fit into the DocBook DTD. This is not an error in troff usage, but it reduces the quality of the HTML that can be generated from this page through the DocBook toolchain.
1
.MT was not properly closed by .ME.
2
Removed unnecessary \c that confused the doclifter parser.
3
Use of .RS/.RE or man/mandoc list markup to produce indentation in examples and screenshots makes structural translation impossible. This bug is also likely to confuse third-party man-page browsers.
4
\c is an obscure feature; third-party viewers sometimes don't interpret it. Plain \ is safer.
5
Two-digit year in .Dd macro.
6
Presentation-level use of .SS could not be structurally translated. I changed lower-level instances to .TP.
7
This page wins an award for exceptionally creative and perverse abuse of list syntax.
8
C function syntax has extra paren.
9
I replaced '-->' with a troff right arrow, which doclifter will translate properly to an XML/HTML arrow glyph.
A
Dot or single-quote at start of line turns it into a garbage command. This is a serious error; some lines of your page get silently lost when it is formatted.
B
( ) notation for mandatory parts of command syntax should be { }.
C
Broken command synopsis syntax. This may mean you're using a construction in the command synopsis other than the standard [ ] | { }, or it may mean you have running text in the command synopsis section (the latter is not technically an error, but most cases of it are impossible to translate into DocBook markup), or it may mean the command syntax fails to match the description.
D
Non-break space prevents doclifter from incorrectly interpreting "Feature Test" as end of function synopsis.
E
My translator trips over a useless command in list markup.
F
This looks like a build intermediate that was included in the shipped manual pages by mistake.
G
Spurious trailing .CE
H
Renaming SYNOPSIS because either (a) third-party viewers and translators will try to interpret it as a command synopsis and become confused, or (b) it actually needs to be named "SYNOPSIS" with no modifier for function prototypes to be properly recognized.
I
Use of low-level troff hackery to set special indents or breaks can't be translated. The page will have rendering faults in HTML, and probably also under third-party man page browsers such as Xman, Rosetta, and the KDE help browser. This patch eliminates .br, .ta, .ti, .ce, .in, and \h in favor of requests like .RS/.RE that have structural translations.
J
Ambiguous or invalid backslash. This doesn't cause groff a problem, but it confuses doclifter and may confuse older troff implementations.
K
Renaming stock man macros throws warnings in doclifter and is likely to cause failures on third-party manual browsers. Please redo this page so it uses distinct names for the custom macros.
L
List syntax error. This means .IP, .TP or .RS/.RE markup is garbled. Common causes include .TP just before a section header, .TP entries with tags but no bodies, and mandoc lists with no trailing .El. These confuse doclifter, and may also mess up stricter man-page browsers like Xman and Rosetta.
M
Synopsis section name changed to avoid triggering command-synopsis parsing.
N
Extraneous . at start of line.
O
Wrong order of arguments in .Dd macro.
Q
Spelling error or typo.
R
.ce markup can't be structurally translated, and is likely to cause rendering flaws in generated HTML.
S
DEPRECATED: in function syntax cannot be translated. Also, the code and examples need to be marked up better.
T
Junk at the beginning of the manual page.
U
Unbalanced group in command synopsis. You probably forgot to open or close a [ ] or { } group properly.
V
Missing body content in list trips up doclifter and is likely to cause rendering problems in other viewers. I have been able to fill in what was missing except for what should be under TAR_LONGLINK_100.
W
Missing or garbled name section. The most common form of garbling is a missing - or extra -. Or your manual page may have been generated by a tool that doesn't emit a NAME section as it should. Or your page may add running text such as a version or authorship banner. These problems make it impossible to lift the page to DocBook. They can also confuse third-party manpage browsers and some implementations of man -k.
X
Unknown or invalid macro. That is, one that does not fit in the macro set that the man page seems to be using. This is a serious error; it often means part of your text is being lost or rendered incorrectly.
Y
I have been unable to identify an upstream maintainer for this Ubuntu/Debian package, and am notifying the generic "Maintainer" address in the package. Please forward appropriately. Also fix the package metadata so it identifies the upstream maintainers.
Z
Your Synopsis is exceptionally creative. Unfortunately, that means it cannot be translated to structural markup even when things like running-text inclusions have been moved elsewhere.
b
Attempt to interpolate unknown string.
d
.eo/.ec and complex tab-stop hackery can't be translated to XML/HTML and are almost certain to confuse third-party readers such as Rosetta and Xman.
e
Macro definitions in the NAME section confuse doclifter and are likely to screw up third-party man viewers with their own parsers.
g
Use of a double quote for inch measurements often confuses people who aren't from the Anglosphere.
h
.in arguments were swapped.
i
Non-ASCII character in document synopsis can't be parsed.
j
Parenthesized comments in command synopsis. This is impossible to translate to DocBook.
m
Contains a request or escape that is outside the portable subset that can be rendered by non-groff viewers such as the KDE and GNOME help browsers.
o
TBL markup not used where it should be. Tables stitched together with .ta or list requests can't be lifted to DocBook and will often choke third-party viewers such as TkMan, Xman, Rosetta, etc.
p
Synopsis was incomplete and somewhat garbled.
q
Unused macro causes parsing problems.
r
I supplied a missing mail address. Without it, the .TP at the end of the authors list was ill-formed.
s
Changed page to use the .URL macro now preferred in man(7).
t
Synopsis has to be immediately after NAME section for DocBook translation to work.
u
Use local definitions of .EX/.EE or .DS/.DE to avoid low-level troff requests in the page body. There are plans to add these to groff man; in the interim, this patch adds a compatible definition to your page.
w
.SS markup in name section seriously confuses parsing, and sections don't follow standard naming conventions.
x
Syntax had to be rearranged because of an options callout. This is still excessively complicated; third-party man-page viewers are likely to choke on it.
y
This page was generated from some sort of non-man markup. Please fix the upstream markup so that it generates a well-formed manual page with the indicated corrections.
z
Garbled or missing text near .SS tags. It's not clear to me what's going on here, but .SS tags on adjacent lines defeat any attempt to parse the markup. I have inserted text lines indicating that something needs to be written here.
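
Two of the causes listed under problem code L (a .TP directly before a section header, and a mandoc list with no trailing .El) are simple enough to check mechanically. The following Python sketch is only an illustration of that idea. It is not how doclifter itself detects these errors, and the function name find_list_problems is hypothetical.

```python
import re


def find_list_problems(source):
    """Report two of the list-markup mistakes described under problem code L.

    Illustrative only: this checks for a .TP whose next line is a
    section header (.SH or .SS), and for .Bl lists never closed by .El.
    """
    problems = []
    lines = source.splitlines()
    open_lists = 0  # .Bl blocks still waiting for their .El
    for number, line in enumerate(lines, start=1):
        # A control line starts with '.' or "'" followed by a macro name.
        macro = re.match(r"^[.']\s*([A-Za-z]+)", line)
        name = macro.group(1) if macro else None
        if name == "TP":
            # lines[number] is the line after this one (numbering is 1-based).
            following = lines[number] if number < len(lines) else ""
            if re.match(r"^[.']\s*S[HS]\b", following):
                problems.append(
                    f"line {number}: .TP directly before a section header"
                )
        elif name == "Bl":
            open_lists += 1
        elif name == "El":
            open_lists = max(0, open_lists - 1)
    if open_lists:
        problems.append(f"{open_lists} .Bl list(s) with no trailing .El")
    return problems


if __name__ == "__main__":
    sample = ".SH OPTIONS\n.TP\n.SH FILES\n.Bl -tag\n.It foo\n"
    for problem in find_list_problems(sample):
        print(problem)
```

A check like this only catches the two simplest cases; the other causes named under L, such as .TP entries with tags but no bodies, need real knowledge of the macro set in use.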