CVS is a wretched hive of scum and villainy. Lifting a CVS repository into a clean revision history is difficult and has rebarbative edge cases. If you are reading this, you have probably tripped over one of them. Here is what to do about it.

Build and port errors

These are relatively simple to fix. Specify your build environment: OS, compiler version, and platform. Include a transcript of the compiler errors.

Stack-smashing errors

If you get a crash message that says

*** stack smashing detected ***: terminated

you probably have revision identifiers with more branch elements than a stock build of cvs-fast-export can handle. The limit is set by the symbol CVS_MAX_BRANCHWIDTH in cvs.h and is normally 10. Try increasing it.

The reason the default isn’t higher is that increasing this limit blows up the program’s working-set size. You’ll know you’ve pushed it too far if the program OOMs.

Core dumps

If cvs-fast-export core dumps on you, try the same invocation with -t 0 to force it to run single-threaded. It is fairly likely you will get a message telling you you have overrun one of the hard limits in cvs.h. If this happens, build from source with the limit raised.

Branch cycle errors

Sorry, these are a hard fail. These errors are rare and mysterious. If you stumble over one, none of your options are good.

The branch-cycle error is a symptom of some problem deep in the core clique-resolution logic in cvs-fast-export, and unfortunately nobody actually understands that code. Yes, even the person who originally wrote it doesn’t. Until that changes, there are some limitations that can’t be fixed. Whatever causes the "branch cycle error" is the worst of these.

Here are the possibilities:

  1. Remove one of the masters involved in the cycle. This is probably the right thing to do if one of them is an Attic file.

    A master goes to an Attic directory when the file it controls is deleted. CVS’s data representation for deletions is brittle and its implementation has a history of buggy behavior in this area. It is possible that damaged or ambiguous metadata in the Attic file is the cause of error.

    By definition, deleting an Attic master cannot affect the content at the head revision. But deleting it removes the entire past history of whatever file it controls.

  2. You may be able to convert with cvs2git.

    cvs-fast-export has many advantages over cvs2git. It converts .cvsignore files, it sanity-checks tag names, it doesn’t generate superfluous TAG.FIXUP branches, it delivers an entire well-formed git stream that can be fed to git-fast-import directly (rather than two disconnected chunks that can’t) and it is orders of magnitude faster. If you have a choice between these tools, cvs-fast-export will almost always do a better and faster job.

    But, it has been reported that cv2git can sometimes break branch cycles.

Branch commits landing on trunk

This is probably caused by past imports to a vendor branch.

Vendor branches are an obscure feature of CVS that can badly confuse cvs-fast-export. Some cases are handled correctly, but others are not - for example, if your repository has more than one vendor import. We’re not sure what correct behavior should look like in the general case and don’t have test pairs to verify it.

There is one known workaround, which is to ensure that the last commit in your CVS repository was on trunk rather than a branch. Do this in your working directory:

$ for file in $(find . -type d -name CVS -prune -o -type f -print); do echo >>${file}; done
$ cvs commit -m "extra commit in trunk"

This might cause the errant branch commit to be landed correctly. The extra commit can be removed with reposurgeon(1) after conversion.

If this workaround doesn’t make the problem go away, you might get better results from cv2git. More likely it will merely fail in a different way than cvs-fast-export does. If you have fix patches for our vendor-branch code, we’ll take them. (Please include a test load and documentation as well.)

Other translation errors

Reduction

First, reduce the test case to a minimal set of CVS files that will reproduce the misbehavior. The cvsconvert wrapper script should be useful for getting a concise summary of errors visible at tagged locations and branches.

There is a tool called 'cvsstrip" in the cvs-fast-export distribution. It makes a skeletonized copy of a CVS repository, dropping out all the content but leaving the metadata in place. Revision comments are replaced with their MD5 hashes, very short but still unique.

The skeletonization process has the additional benefit of removing any sort of data that might be sensitive; only filenames, revision dates and committer IDs are left in place.

A conversion attempt on the skeletonized repo should raise the same errors as the original did (except for ignorable ones involving CVS keyword expansion, which will all go away when the file is skeletonized). If it does not, try skeletonizing with -t instead to preserve non-sticky tags

If you have a core dump that persists after a master has had cvsstrip applied, there is probably a garbled diff in the revision sequence somewhere. Here’s an example:

@
text
@d7 1
a7 1
......
d9 2
a13 1
@

The last two lines are incorrect. If they were replaced by three lines that are parallel to the earlier append command…​

a13 1
humpty dumpty
@

…​that would work. CVS occasionally drops these broken diff sections in masters, but has an unknown interpretation rule (or possibly an outright bug) that masks them.

Now try to make the skeletonized repository as small as possible by removing swathes of files from it, checking each time to make sure the error continues to reproduce. It is best if you can reduce the fileset to a single file or pair of files.

You should find you can simplify the directory structure by moving files from subdirectories to the root, doing file renames to avoid name collisions. Neither moves nor renames should change the errors reported except in the obvious way of changing pathnames in the messages. Again, if the errors do change in any other way this is interesting and should be reported.

At the end of this process you should have a handful of skeletonized master files.

Transmission

Make a session transcript showing the error replicating on the reduced masters. Make a tarball including the reduced masters and the session transcript. Mail this to the maintainer with any other information you think might be relevant. Better yet, open an issue at the project home on GitLab and attach this archive.

Warning

If you don’t pass me the ability to reproduce your error, and the fix isn’t instantly obvious, your issue may very well never be fixed at all. I didn’t write the core of cvs-fast-export, and debugging that core from the outside without the ability to reproduce errors is not merely brutally hard but verging on impossible.

Compensation

Wrestling with CVS repository malformations is ugly, difficult work.

If you are requesting help on behalf of an open-source software project, you will get help for free as the maintainer’s schedule permits.

The maintainer is available on a consulting basis to all others and will expect to be paid for his pain.