I maintain a man-page-to-DocBook converter, doclifter. A side effect of this program is that it serves as a validator for the correctness and portability of the markup used on Unix manual pages. I test it by running it against all the manual pages in a full Xubuntu 13.10 installation with some extras; there are 10412 of these on my development machine, of which 893 already have DocBook masters. It converts 9085 (95.44%) of the remaining 9519 into valid XML-DocBook.

Most of the failures on the remaining 4.56% happen because groff(1) and its kin have weak-to-nonexistent validity checking. Often, doclifter fails because of outright errors in macro usage that groff does not catch. Sometimes it fails on constructions that are legal but perverse. Very occasionally it throws an error because a man page is correct but has a structure that cannot be translated to DocBook. I keep a database of patches for such problems, and periodically try to push fix patches out to the manual-page maintainers.

Even if you do not care about DocBook, this cleanup work benefits all third-party manual page viewers, including the GNOME and KDE documentation browsers; groff constructions that confuse doclifter are very likely to produce visible problems on these.

The table below is a listing of the 375 (3.94%) pages on which doclifter fails, but the failure can be prevented with a fix patch to the manual page source. 59 pages (0.62%) remain intractable, generally due to markup problems more severe than a point patch can address. I am working with the individual projects responsible to get those cleaned up.
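The counts quoted above can be cross-checked with a few lines of Python. This is only an illustrative sketch of the arithmetic; the variable names are invented here and are not part of doclifter.

```python
# Figures taken from the text above; names are illustrative only.
total_pages = 10412        # manual pages on the test machine
already_docbook = 893      # pages that already have DocBook masters
converted = 9085           # pages lifted to valid XML-DocBook
patchable = 375            # failures preventable with a fix patch
intractable = 59           # failures a point patch cannot address

candidates = total_pages - already_docbook   # pages doclifter must handle
failures = candidates - converted            # pages that did not convert


def pct(count: int) -> float:
    """Percentage of the candidate pages, rounded to two places."""
    return round(100 * count / candidates, 2)


# Every failure is either patchable or intractable.
assert failures == patchable + intractable

print(f"candidates:  {candidates}")
print(f"converted:   {pct(converted)}%")
print(f"failures:    {pct(failures)}%")
print(f"patchable:   {pct(patchable)}%")
print(f"intractable: {pct(intractable)}%")
```

The two failure categories sum to the pages left unconverted, so the percentages in the text are mutually consistent.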

It is likely that you are reading this because you have received email telling you that patches are associated with your name or list address. Please consider incorporating them, or equivalents, in your next release. Also, please write back and tell me what you plan to do so I can keep my database up-to-date.

If you are not already considering it, please think about moving the documentation masters of your project to DocBook (or some format from which you can generate DocBook). If everybody moved to using DocBook as a common exchange format, it would become much easier to support unified browsing of all system documentation with Web-like hypertext capabilities, automatic indexing, and rich search facilities.

Tools to generate man pages, HTML, and PostScript from DocBook files are open-source and generally available. My program, doclifter, should make moving your manual-page masters to DocBook a fairly painless process.

Many major open source projects (including the Linux kernel, the Linux Documentation Project, X.org, GNOME, KDE, and FreeBSD) have already moved to DocBook or are in the process of doing so.

(Individual entries for accepted patches are no longer shown.)

Summary: 283 patches pending, 534 accepted, 0 rejected.

Status codes are as follows:


n No response yet.
p Maintainer has informed me that this is fixed in the masters, but I have not seen the fix yet.
y Accepted
r Rejected
s Superseded (page lifts correctly without the patch)
[0-9]+ number of mailings sent
b Address is blocked

Problem codes are explained after the table.


Patch:Problem code:Status:
acl.5
Ib
admin.1posix
C1n
american.5
english.5
I 9p
amf.conf.5
3p
analog.1
C Z1n
arp.7
p7n
as.1
Z y1n
asn1_der_coding.3
L y1n
asn1_write_value.3
J y1n
audit.rules.7
ap
awk.1
Rn
barchart.3blt
stripchart.3blt
J G1n
graph.3blt
G1n
bc.1
J1n
bitmap.1
In
header_checks.5
m2n
bootparam.7
I u 7b
brltty.1
JbA
btcflash.8
J1n
bzfs.6
ob
bzr.1
J X1n
cdparanoia.1
L1n
chat.8
Jp
chmoddic.1
CbA
chroot.2
E L1n
co.1
ident.1
o2n
codepage.1
CbA
compose.1
edit.1
* y1n
console_codes.4
I s1n
corosync.conf.5
L Ip
curl_formadd.3
Jp
libcurl-tutorial.3
Jp
cvs.1
L3n
cxpm.1
Wp
dash.1
sh.1
J1n
DBD::Gofer.3pm
J y1n
dcut.1
R1n
Parse::DebControl::Error.3pm
W y1n
devnag.1
J1n
dh_install.1
i y1n
dhcp-eval.5
Js
directomatic.1
o G1n
dkms.8
X J1n
dpkg.1
dpkg-source.1
L1p
dosbox.1
L1n
dragdrop.3blt
f1n
dump-acct.8
U1n
duplicity.1
t1n
dv2dt.1
C1n
dvipdf.1
font2c.1
R2n
dvitodvi.1
R1n
editres.1
Ip
e2fsck.8
o2n
e2image.8
J1n
efax.1
J u g2n
ethtool.8
21n
expire.ctl.5
o1n
openais_overview.8
W2n
exiv2.1
L1n
extractres.1
R2n
f2py.1
f2py2.7.1
C1n
faked-sysv.1
faked-tcp.1
faked.1
fakeroot-sysv.1
fakeroot-tcp.1
fakeroot.1
r1n
fence_drac.8
J1n
fence_na.8
W y1n
fence_drac5.8
J1n
fig2ps2tex.1
R2n
foomatic-rip.1
lpdomatic.8
o G2n
formail.1
lockfile.1
procmail.1
procmailex.5
procmailrc.5
procmailsc.5
K1n
fsck.8
fsck.ext2.8
fsck.ext3.8
fsck.ext4.8
fsck.ext4dev.8
op
fsck.msdos.8
fsck.vfat.8
dosfsck.8
C1n
ftm.7
D1n
fuser.1
J1n
fuzzyflakes.6x
C1n
gacutil.1
cli-gacutil.1
gacutil2.1
N1n
gdb.1
c Js
genhomedircon.8
H1n
genisoimage.1
o1n
getafm.1
Rn
getpass.3
Lp
getty.8
Ip
gftodvi.1
IbA
gmcs.1
L1n
gnumeric.1
L1p
gpm-types.7
J C1n
grap.1
Qp
groff_mom.7
sp
grodvi.1
C 7p
gs.1
ghostscript.1
CbA
gthumb.1
L1n
gvcolor.1
C1n
gvpr.1
W I1p
hfsutils.1
H J7n
hosts_access.5
hosts.allow.5
hosts.deny.5
hosts_options.5
I1n
htext.3blt
tree.3blt
X1n
html2text.1
C1n
html2textrc.5
X1n
hypertorus.6x
C1n
icclink.1
E2n
icctrans.1
L1n
tifficc.1
E2n
icedax.1
A I1n
ilbmtoppm.1
L1n
includeres.1
R2n
imake.1
Ip
innfeed.8
B1n
intel_panel_fitter.1
E1n
IO::WrapTie.3pm
W C1n
ipcrm.1
Cp
ipppd.8
L2n
ip6tables-save.8
UbA
ipv6calc.8
L o2n
irda.7
01n
ispell.1
buildhash.1
munchlist.1
findaffix.1
tryaffix.1
icombine.1
ijoin.1
C1n
ispell-wrapper.1
C1n
kioclient.1
kbA
lamd.1
1n
lam.7
LAM.7
L1n
lam-helpfile.5
I1n
lastcomm.1
Ib
less.1
pager.1
J7n
lftp.1
I1n
libcaca-authors.3caca
W1n
libcaca-canvas.3caca
W J1n
libcaca-env.3caca
W L1n
libcaca-font.3caca
W J1n
libcaca-ruby.3caca
W1n
libcaca-tutorial.3caca
W1n
libpng.3
S Jp
libreoffice.1
loffice.1
lofromtemplate.1
Jp
libtiff.3tiff
I1n
list_audio_tracks.1
W1n
ln.1
j2n
locate.findutils.1
U1n
logger.1
Os
logsys_overview.8
J1n
lkbib.1
Cp
lpr.1
U7n
lynx.1
www-browser.1
C1n
makeindex.1
J1n
mawk.1
R1n
mkdosfs.8
mkfs.msdos.8
mkfs.vfat.8
C1n
mkjobtexmf.1
L y1n
mlocate.db.5
Jp
mono.1
cli.1
J X1n
mono-config.5
X1n
more.1
Op
mpirun.1
mpirun.lam.1
L1n
mtools.5
mtools.conf.5
X1n
mtr.8
Jb
mutt.1
J Q1p
muttrc.5
J X u1p
nautilus.1
Ln
nautilus-connect-server.1
L1n
netpbm.1
J1n
netstat.8
C z1n
nfsmount.conf.5
C Y1n
nmcli.1
nm-connection-editor.1
W Xp
nsgmls.1
C I1n
ntfs-3g.secaudit.8
C1n
ntfs-3g.usermap.8
C1n
nvidia-settings.1
I x Y1n
nvidia-smi.1
I 6 Y1n
ode.1
e1n
oldfind.1
find.1
Jb
omfonts.1
W1n
openvt.1
open.1
L2n
orbd.1
W y Y1n
orca.1
s1n
osage.1
mm2gv.1
J1p
patch.1
I t1n
pax.1posix
W J L1n
pbmclean.1
pnmcomp.1
pnmnorm.1
pnmpad.1
pnmquant.1
pnmremap.1
pnmtotiff.1
pgmnorm.1
ppmcolors.1
ppmnorm.1
ppmntsc.1
ppmquant.1
ppmrainbow.1
ppmtogif.1
ppmtoxpm.1
tifftopnm.1
C1n
pbget.1
pbput.1
pbputs.1
W1n
pbmtextps.1
C1n
pcap-filter.7
I1n
pcreapi.3
IpA
pcreposix.3
HpA
pidgin.1
T1n
pkg-config.1
q1n
plot.1
plotfont.1
W1n
pnmhisteq.1
ppmcie.1
ppmlabel.1
sbigtopgm.1
R1n
pnmpaste.1
X1n
pnmtotiffcmyk.1
C1n
pnmtofiasco.1
e1n
policytool.1
W y1n
proc.5
o h1n
pstree.1
pstree.x11.1
C1n
pstops.1
RbA
ps2epsi.1
j7n
ps2pdfwr.1
R1n
psnup.1
J2n
ptx.1
j7n
pytest.1
C1n
qos.7
L2n
qsub.1posix
I1n
rcsfile.5
d1n
rc-alert.1
ubA
regulatory.bin.5
w1n
renice.1
Os
rev.1
O Ls
rhythmbox-client.1
L1n
rlog.1
L1n
rlogin.1
n1n
rlwrap.1
readline-editor.1
J1n
rmid.1
W y1n
rmiregistry.1
W y1n
rotatelogs.8
*bA
rsh.1
ssh.1
authorized_keys.5
sshd.8
n Y1n
rstartd.1
Ip
rsyslog.conf.5
JbA
ruby.1
ruby1.9.1.1
Lb
s3.4
Ip
sane-apple.5
Lp
sane-lexmark.5
L op
sane-mustek_pp.5
L op
sane-pixma.5
Wp
scons-time.1
L Z1n
script.1
Os
SDL_Init.3
L2n
SDL_CDPlayTracks.3
81n
see.1
run-mailcap.1
print.1
C1n
setcap.8
C1n
sg_sat_phy_event.8
C1n
sgmlspl.1
L1n
slapd.conf.5
L IbA
slapd-config.5
L IbA
slapo-constraint.5
LbA
slogin.1
n1n
snmpd.examples.5snmp
Jp
software-properties-gtk.1
WbA
spam.1
C1n
sshd_config.5
ssh_config.5
n4pA
ssh-keygen.1
n1n
sudo.8
sudoedit.8
*1n
sudoers.5
n1n
synctex.1
51n
rb.1
rx.1
rz.1
sb.1
sx.1
sz.1
e7n
tar.1
C Vp
tc-prio.8
tc-htb.8
tc-cbq.8
tc-cbq-details.8
C1n
tcpd.8
I2n
tcpdmatch.8
I1n
tek2plot.1
W1n
telnet.1
telnet.netkit.1
X1n
test.1
[.1
1n
terminfo.5
I abA
TIFFGetField.3tiff
I1n
TIFFmemory.3tiff
41n
Tk::Internals.3pm
W1n
tnameserv.1
W y1n
tgatoppm.1
A1n
tidy.1
W m1bA
top.1
X o Q1n
tree.1
b1n
ttf2tfm.1
I o1n
tty_ioctl.4
Pn
tune2fs.8
C7n
unity-2d-shell.1
C J1n
unity-2d-spread.1
C1n
upstart-events.7
I1p
uscan.1
J1n
usb-creator-gtk.8
W1n
xz.1
xzcat.1
unxz.1
unlzma.1
lzcat.1
lzma.1
C1n
unshare.1
Lp
updatedb.conf.5
Jp
uuencode.1posix
I1n
vector.3blt
L I G 3n
wall.1
L Os
weechat-curses.1
sp
whereis.1
L2n
whois.1
Lp
X.7
I L op
XF86VM.3
XF86VidModeDeleteModeLine.3
XF86VidModeGetAllModeLines.3
XF86VidModeGetDotClocks.3
XF86VidModeGetGamma.3
XF86VidModeGetGammaRamp.3
XF86VidModeGetGammaRampSize.3
XF86VidModeGetModeLine.3
XF86VidModeGetMonitor.3
XF86VidModeGetPermissions.3
XF86VidModeGetViewPort.3
XF86VidModeLockModeSwitch.3
XF86VidModeQueryExtension.3
XF86VidModeModModeLine.3
XF86VidModeQueryVersion.3
XF86VidModeSetClientVersion.3
XF86VidModeSetGamma.3
XF86VidModeSetGammaRamp.3
XF86VidModeSetViewPort.3
XF86VidModeSwitchMode.3
XF86VidModeSwitchToMode.3
XF86VidModeValidateModeLine.3
Ip
xclipboard.1
Ip
xclock.1
Ip
xkbevd.1
Jp
xlogo.1
Ip
xman.1
I op
xorg.conf.5
xorg.conf.d.5
L u2p
Xsecurity.7
Wp
Xserver.1
I1n
XStandards.7
Hp
xterm.1
L I2n
zic.8
I1n
zip.1
Jp
zipinfo.1
*1n
zipcloak.1
zipnote.1
zipsplit.1
Ip
zlib.3
C1n

Problem codes:

0
Function declarations had to be modified in order to fit into the DocBook DTD. This is not an error in troff usage, but it reduces the quality of the HTML that can be generated from this page through the DocBook toolchain.
2
Removed unnecessary \c that confused the doclifter parser.
3
Use of .RS/.RE or man/mandoc list markup to produce indentation in examples and screenshots makes structural translation impossible. This bug is also likely to confuse third-party man-page browsers.
4
\c is an obscure feature; third-party viewers sometimes don't interpret it. Plain \ is safer.
5
Two-digit year in .Dd macro.
6
Presentation-level use of .SS could not be structurally translated. I changed lower-level instances to .TP.
7
This page wins an award for exceptionally creative and perverse abuse of list syntax.
8
C function syntax has extra paren.
9
I replaced '-->' with a troff right arrow, which doclifter will translate properly to an XML/HTML arrow glyph.
A
Dot or single-quote at start of line turns it into a garbage command. This is a serious error; some lines of your page get silently lost when it is formatted.
B
Bogus macro definition.
C
Broken command synopsis syntax. This may mean you're using a construction in the command synopsis other than the standard [ ] | { }, or it may mean you have running text in the command synopsis section (the latter is not technically an error, but most cases of it are impossible to translate into DocBook markup), or it may mean the command syntax fails to match the description.
D
Non-break space prevents doclifter from incorrectly interpreting "Feature Test" as end of function synopsis.
E
My translator trips over a useless command in list markup.
G
Spurious trailing .CE
H
Renaming SYNOPSIS because either (a) third-party viewers and translators will try to interpret it as a command synopsis and become confused, or (b) it actually needs to be named "SYNOPSIS" with no modifier for function prototypes to be properly recognized.
I
Use of low-level troff hackery to set special indents or breaks can't be translated. The page will have rendering faults in HTML, and probably also under third-party man page browsers such as Xman, Rosetta, and the KDE help browser. This patch eliminates .br, .ta, .ti, .ce, .in, and \h in favor of requests like .RS/.RE that have structural translations.
J
Ambiguous or invalid backslash. This doesn't cause groff a problem, but it confuses doclifter and may confuse older troff implementations.
K
Renaming stock man macros throws warnings in doclifter and is likely to cause failures on third-party manual browsers. Please redo this page so it uses distinct names for the custom macros.
L
List syntax error. This means .IP, .TP or .RS/.RE markup is garbled. Common causes include .TP just before a section header, .TP entries with tags but no bodies, and mandoc lists with no trailing .El. These confuse doclifter, and may also mess up stricter man-page browsers like Xman and Rosetta.
N
Extraneous . at start of line.
O
Wrong order of arguments in .Dd macro.
P
A .TP with no body defeats attempts to parse this page.
Q
Spelling error or typo.
R
.ce markup can't be structurally translated, and is likely to cause rendering flaws in generated HTML.
S
DEPRECATED: in function syntax cannot be translated. Also, the code and examples need to be marked up better.
T
Junk at the beginning of the manual page.
U
Unbalanced group in command synopsis. You probably forgot to open or close a [ ] or { } group properly.
V
Missing body content in list trips up doclifter and is likely to cause rendering problems in other viewers. I have been able to fill in what was missing except for what should be under TAR_LONGLINK_100.
W
Missing or garbled name section. The most common form of garbling is a missing - or extra -. Or your manual page may have been generated by a tool that doesn't emit a NAME section as it should. Or your page may add running text such as a version or authorship banner. These problems make it impossible to lift the page to DocBook. They can also confuse third-party manpage browsers and some implementations of man -k.
X
Unknown or invalid macro. That is, one that does not fit in the macro set that the man page seems to be using. This is a serious error; it often means part of your text is being lost or rendered incorrectly.
Y
I have been unable to identify an upstream maintainer for this Ubuntu/Debian package, and am notifying the generic "Maintainer" address in the package. Please forward appropriately. Also fix the package metadata so it identifies the upstream maintainers.
Z
Your Synopsis is exceptionally creative. Unfortunately, that means it cannot be translated to structural markup even when things like running-text inclusions have been moved elsewhere.
a
".fi" request was omitted or typoed as ".if".
b
Attempt to interpolate unknown string.
c
The composer of this man page misunderstood and seriously overused the \c escape. Some uses were broken; others (notably the sequence "\\c\n\\&") are bad style.
d
.eo/.ec and complex tab-stop hackery can't be translated to XML/HTML and are almost certain to confuse third-party readers such as Rosetta and Xman.
e
Macro definitions in the NAME section confuse doclifter and are likely to screw up third-party man viewers with their own parsers.
f
Absence of trailing \fRs makes synopsis unparseable.
g
Use of a double quote for inch measurements often confuses people who aren't from the Anglosphere.
h
.in arguments were swapped.
i
Non-ASCII character in document synopsis can't be parsed.
j
Parenthesized comments in command synopsis. This is impossible to translate to DocBook.
k
kdemangen.pl stuttered two copies of a page. Also, .SS markup is garbled.
m
Contains a request or escape that is outside the portable subset that can be rendered by non-groff viewers such as the KDE and GNOME help browsers.
n
Invalid Sx reference - not a section on this page.
o
TBL markup not used where it should be. Tables stitched together with .ta or list requests can't be lifted to DocBook and will often choke third-party viewers such as TkMan, Xman, Rosetta, etc.
p
Synopsis was incomplete and somewhat garbled.
q
Unused macro causes parsing problems.
r
I supplied a missing mail address. Without it, the .TP at the end of the authors list was ill-formed.
s
Changed page to use the .URL macro now preferred in man(7).
t
Synopsis has to be immediately after NAME section for DocBook translation to work.
u
Use local definitions of .EX/.EE or .DS/.DE to avoid low-level troff requests in the page body. There are plans to add these to groff man; in the interim, this patch adds a compatible definition to your page.
w
.SS markup in name section seriously confuses parsing, and sections don't follow standard naming conventions.
x
Syntax had to be rearranged because of an options callout. This is still excessively complicated; third-party man-page viewers are likely to choke on it.
y
This page was generated from some sort of non-man markup. Please fix the upstream markup so that it generates a well-formed manual page with the indicated corrections.
z
Garbled or missing text near .SS tags. It's not clear to me what's going on here, but .SS tags on adjacent lines defeat any attempt to parse the markup. I have inserted text lines indicating that something needs to be written here.
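
Some of the list problems described under code L are mechanical enough to check with a short script. The following is a minimal sketch, not doclifter's actual logic: the function name lint_lists and its warning strings are hypothetical, and it looks only for a .TP directly followed by a section header (or by end of file) and for mandoc .Bl lists with no matching .El.

```python
import re
from typing import List

# Hypothetical helper; these patterns are assumptions for illustration only.
SECTION = re.compile(r"\.(SH|SS)\b")
TP = re.compile(r"\.TP\b")
BL = re.compile(r"\.Bl\b")
EL = re.compile(r"\.El\b")


def lint_lists(source: str) -> List[str]:
    """Return warnings for two common list-markup slips in man source."""
    warnings = []
    lines = source.splitlines()
    open_lists = 0  # mandoc .Bl lists not yet closed by .El

    for index, line in enumerate(lines):
        if BL.match(line):
            open_lists += 1
        elif EL.match(line):
            open_lists = max(open_lists - 1, 0)
        elif TP.match(line):
            # A .TP should be followed by a tag line, then a body.
            at_end = index + 1 >= len(lines)
            if at_end or SECTION.match(lines[index + 1]):
                warnings.append(
                    f"line {index + 1}: .TP just before a section header"
                )

    if open_lists:
        warnings.append(f"{open_lists} .Bl list(s) with no trailing .El")
    return warnings


sample = ".SH OPTIONS\n.TP\n.SH FILES\ntext\n.Bl -tag\n.It foo\n"
for warning in lint_lists(sample):
    print(warning)
```

A check this simple will miss many cases (for example, .TP entries that have a tag but no body), so it is a pre-flight aid rather than a substitute for running the page through a real parser.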