The reposurgeon codebase is moving from Python to Go because the Python implementation is too slow on really large repositories. The GCC history at over 259K commits is the straw that broke the camel’s back; using PyPy with every optimization on semi-custom hardware tuned for this job still yielded test conversion times of over 9 hours, which is death on one’s recipe-debugging cycle. A test translation of the auxiliary repocutter tool suggested that we could expect a 40x speedup, which is in line with published Python vs. Go comparative benchmarks.

The problem directed the choice of Go, not the other way around. I seriously considered OCaml or a compiled Lisp as alternatives.

I did examine the automated tools for Python to Go, but rejected them because the generated Go code would have been a maintainability disaster. Thus, translation by hand. Though at about 22% in I wrote a fast, crude, incomplete Python-to-Go translator to assist the process.

This is an experience report on that translation, written in attempted conformance with the Go community’s practice for grounding language enhancement requests not in it-would-be-nice-to-have abstractions but rather in a description of real-world problems. The scale driver of this real-world problem is that, while at 14KLOC reposurgeon is not expecially large, it is very dense. It’s a DSL that’s a tructure editor for attributed DAGs - algorithmically complex, bristling with graph theory, FSMs, parsing, tricky data structures, two robot harnesses driving other tools, and three different operator-composition algebras.

Accordingly, reposurgeon would be a challenge to the expressiveness of any language it were implemented in. It makes a good test of the relative expressiveness of Python and Go, and an effective way to audit for places where moving from Python to Go hinders concision and readability.

The skillset I approached this problem with is: Go novice, Python and C expert; old Unix hand; Lisp hacker of even more ancient vintage; ex-mathematician with strength in graph theory, group theory and combinatorics; lots of experience as a systems programmer, lots of exposure to odd languages, and lots of domain knowledge about version-control systems. Adjust bias compensators accordingly.

Expected problems that weren’t

Semantic distance. In general, the translation gap between Python and Go is narrower than I initially expected, especially considering the dynamic-vs-static type-system difference. On reflection, I think it turns out that GC and the existence of maps as first-class types do more to narrow that gap than the static/dynamic divergence in type systems does to widen it.

Polymorphic lists. The principal data structure of a repository’s representation in reposurgeon is a list of events - data structures representing modification operations on sets of files. The list is necessarily polymorphic because (for example) a change commit and a tag creation are different kinds of things. Translating this to static typing using interfaces proved less arduous than I had feared, though the process revealed a documentation issue and a problem with absence of sum types; I will return to both points.

Operator overloading. Good style in Python, but surprisingly easy to relinquish in Go. I went in thinking that they’d be on my want list for Go before I was done, but no - not even at reposurgeon’s complexity scale.

The positive part of my summation is that hand-translation of Python to Go even at this scale and complexity is not horrible. It’s not easy, exactly, but quite doable. It is however time-intensive; counting time to build the required unit tests in Go, I managed about 150-200 lines a day before writing pytogo and 500-600 lines per day afterwards.

The pain points are mainly around a handful of advanced Python features: iterators, generators, comprehensions, and class-valued exceptions. Fortunately, even in quite advanced Python code like reposurgeon these turn out to not be very common on a per-KLOC basis.

Problems that are

Keyword arguments

The problem that obtruded on me first was quite unexpected: absence of keyword arguments. In Python one can write a function signature like this

    def EuclideanDistance(x, y):

and then call it like this:

    d = EuclideanDistance(x=3.0, y=9.6)

I used keyword arguments extensively in the Python, especially in object-creation functions where it is often required to pass in multiple parameters that won’t fit neatly into a single data structure.

Go presently has no such feature. This is probably the single most serious readability hit my translation took; it got significantly more difficult to grasp what was going on at all those callsites.

Absence of "with" and try/finally context-manager idioms

This is a real expressiveness problem, not just a readability issue like the preceding. In Python I can do this:

class CriticalRegion:
    "Encapsulate operations to try and make us un-interruptible."
    NSIG = 32
    def __enter__(self):
        "Begin critical region."
        if debug_enable(DEBUG_COMMANDS):
            complain("critical region begins...")
        # Alas that we lack sigblock support
        self.handlers = [None]*(CriticalRegion.NSIG+1)
        for sig in range(1, CriticalRegion.NSIG):
            if sig not in (signal.SIGKILL, signal.SIGSTOP):
                self.handlers[sig] = signal.signal(sig, signal.SIG_IGN)
    def __exit__(self, extype_unused, value_unused, traceback_unused):
        "End critical region."
        for sig in range(1, CriticalRegion.NSIG):
            if sig not in (signal.SIGKILL, signal.SIGSTOP):
                signal.signal(sig, self.handlers[sig])
        if debug_enable(DEBUG_COMMANDS):
            complain("critical region ends.")
        return False

I can then say

            with CriticalRegion():
                doUninterruptibleStuff()

having a guarantee that enter() will run at the start of the with block scope and exit() will run at the end.

This ability to create a bounded context with enter and exit hooks that are guaranteed to fire even if there’s a return or keyboard interrupt signal inside the context is a really valuable idiom and I miss it extremely in Go.

Without it, I have to do spooky-action-at-a-distance things. That’s a readability/maintainability hit which I think is against the spirit of Go, insofar as I understand it.

The Go defer statement is a valiant effort to do something interesting in this direction but it’s not good enough by itself. One reason is that it enforces a cohesion between critical-region boundaries and function boundaries, with attendent consequences in name scoping. This is an orthogonality violation in the language design.

The problem is easily illustrated via Python-like pseudocode:

def exampleFunction():
    a = "Hello world"
    with CriticalRegion():
        a = doUninterruptibleStuff(a)
    print(a)

If I try to translate this to Go I’m almost forced to end up with something like this:

func exampleFunction() {
    a := "Hello world"
    a = innerFunction(a)
    fmt.Print(a)
}

func innerFunction(a string) string {
    c := newCriticalRegion()
    c.__enter__()
    defer c.__exit__()
    return doUninterruptibleStuff(a)
}

About the latter I can only say "Readability and expressiveness FAIL!". I say "almost" forced because it is possible to improve on this slightly using a closure:

func exampleFunction() {
    a := "Hello world"
    innerFunction := func(a string) {
        c := newCriticalRegion()
        c.__enter__()
        defer c.__exit__()
        return doUninterruptibleStuff(a)
    }
    a = innerFunction(a)
    fmt.Print(a)
}

I hope nobody is so foolish as to try to tell me this isn’t a substantial maintainability hit relative to the Python. The clutter is irritating in this toy example, and going to be worse at scale; I have a particular nasty example in mind from around line 7737 of reposurgeon.

A try/finally syntax on the Python and Java model would be some improvement:

func exampleFunction() {
    a := "Hello world"
    c := newCriticalRegion()
    try {
        c.enter()
        a = doUninterruptibleStuff(a)
    } finally {
        c.exit()
    }
    a = innerFunction(a)
    fmt.Print(a)
}

Or, in parallel with condition setup in Go if statements:

func exampleFunction() {
    a := "Hello world"
    try c := newCriticalRegion() {
        c.enter()
        a = doUninterruptibleStuff(a)
    } finally {
        c.exit()
    }
    a = innerFunction(a)
    fmt.Print(a)
}

This is still a bit grubby, though bit less so if it’s treated like a conditional and c is not be exposed in the outer scope. I don’t see any way to get what I really want - the Python context-manager protocol - without introducing a very un-Go-like form of magic structure-member naming, so I’d settle for try/finally.

No map over slices

Translating Python map() calls and comprehensions produces code that is ugly and bulky, forcing the declaration of dummy variables that don’t need to exist.

If one graded Go point extensions by a figure of merit in which the numerator is "how much Python expressiveness this keeps" and the denominator is "how simple and self-contained the Go feature would be" I think this one would be top of list.

So: map as a functional builtin takes two arguments, one x = []T and a second f = func(T)T. The expression map(x, f) yields a new slice in which for each element of x, f(x) is appended.

This proposal can be discarded if generics are implemented, as any reasonable implementation of generics would make it trivial to implement in Go itself.

Limitations on const

Inability to apply const to variables with structure, map, or slice initializers is annoying in these ways:

  1. Compiler can’t enforce noli mi tangere

  2. const functions as a declaration of programmer intent that is valuable at scale.

In Python one can often get a similar effect by using tuples. I used this as a form of internal documentation hint in the original Python. I want it back in Go.

Any extension in the scope of const, even a relatively conservartive one like only allowing const structures with compile-time constant members, would have significant benefits.

Absence of lookbehind in Go regexps

This is a small point problem, easily fixed, that was far more annoying in practice than it should have been in theory.

Python regexps have both positive and negative lookbehind clauses. The following expression looks for possible Subversion revision designators in comments, excluding bug references:

"(?<!bug )[0-9]+"

Go translation reveals that it is remarkably unpleasant, verging on "too painful to be worth it" to do that filtering without lookbehinds.

This is the only real problem I have identified in moving from Python regexps to Go ones. Take that "only" seriously, because regexps are a Swiss-army knife I use heavily; Go regexps are doing well to have no limits that are more annoying.

Absence of sum/discriminated union types

I have read issue #19412 and am aware of the objections to adding sum types to Go.

Nevertheless, I found their absence was something of a pain point in my translation. Because reposurgeon events can have any one of a set of types (Blob, Tag, Commit, Callout, Passthrough, Reset) I found myself writing a lot of stupid boilerplate code like this:

    for _, child := range commit.children() {
            switch child.(type) {
            case *Commit:
                    successorBranches.Add(child.(Commit).branch)
            case *Callout:
                    complain("internal error: callouts do not have branches: %s",
                            child.idMe())
            default:
                    panic("in tags method, unexpected type in child list")
            }
    }

Besides being inelegant, the requirement for a runtime check to exhaust all cases is a defect attractor. It’s way too easy to forget to write the default case and wind up with silent errors.

Thus, absence of discriminated-sum types is an actual hole in the language that compromises its goal of enforcing strong invariants through type safety checked at compile time.

This will especially tend to become an issue when translating from a language like Python with fully dynamic typing.

I don’t have a concrete proposal to fix this yet. If these notes are well received I may write one.

Catchable exceptions require silly contortions

Most of this report was written at about the 12% point of the translation. By twice that far in, 23%, another problem about which I had not originally been intending to complain became obtrusive. That is absence of a general facility for structured exceptions.

Yes, I’m familiar with all the reasons throw/catch wasn’t included in Go 1. Including the laudable goal of forcing programmers to be explicit about error handling and how they propagate errors up their call stack. And I understand that defer/recover was an attempt to provide a tractable subset of catchable exceptions that would minimize the temptation to sin.

Because I broadly agree with this set of goals, I was actively intending when I started this translation not to complain about the lack of general catchable exceptions, or ship any related RFEs, in spite of having a presentiment that they would be a problem. That is, until I hit a wall in the real world and had to rethink.

Here’s my use case. Reposurgeon is an interpreter for a DSL. Situations in which I can tolerate panic-out and die are rare and mostly occur at initialization time. Usually what I want to do instead of panicking on error is throw control back to the read/eval loop, executing some kind of local cleanup hook on the way out. Analogous situations will frequently occur in, for example, network servers.

In a language with labeled throw/catch, or class-valued exceptions, I can address this by explicitly target an exception to some level of the call stack above the point it’s raised. In reposurgeon, for example, there are usually two levels of interest. One is the read-eval loop. The other is the outermost scope; if an exception gets there I want to call hooks to gracefully remove working directories (blob storage associated with the repository-history structures being edited) before exiting the program.

In Go, I didn’t seem to have a clean option for this. Which was a problem on two levels….

  1. Reposurgeon is 14 KLOC of dense code. At that scale, any prudent person in a situation like this will perform as linear and literal a translation as possible; to do otherwise is to risk a complexity explosion as you try to cross the semantic gap and rethink the design at the same time. Absence of class-valued exceptions was far and away the biggest technical blocker. "First make it work, then make it right"; the least risky path seemed to be to shim in exceptions with the intention of removing them later.

Eventually, after beating on the panic/recover feature for a while, I found this kludge:

package main

import "fmt"

type exception struct {
        class string
        message string
}

func (e exception) Error() string {
        return e.message
}

func throw(class string, msg string, args ...interface{}) *exception {
        // We could call panic() in here but we leave it at the callsite
        // to clue the compiler in that no return after is required.
        e := new(exception)
        e.class = class
        e.message = fmt.Sprintf(msg, args...)
        return e
}

func catch(accept string, x interface{}) *exception {
        // Because recover() returns interface{}.
        // Return us to the world of type safety.
        if x == nil {
                return nil
        }
        err := x.(*exception)
        if err.class == accept {
                return err
        }
        panic(x)
}

func main() {
        defer println("Defer 1")
        defer println("Defer 2")
        defer println("Defer 3")

        defer func() {
                fmt.Println("Recover:", catch("recoverable", recover()))
        }()
        panic(throw("recoverable", "Don't Panic!!!"))

        fmt.Println("Unreachable.")
}

This works, and it works if you change the class to something other than "recoverable"; you get the expected rethrow and panic. But it is unreasonably ugly. So why am I bringing it forward? Because…

  1. The translation experience reduced my disposition to think that Go is right to be narrow and prescriptive on this issue. Two kinds of doubts grew on me:

    • Pragmatic doubt. Trying to be a good citizen, I kept looking at places where existing nonlocal control transfers in Python could be replaced by explicit Go-style passing upwards of an error status. I noticed that there were a significant percentage of cases in which doing this made the code more difficult to follow rather than easier.

A simple representative example is a call chain of several data transformations in which each stage has its own failure condition and any failure aborts the transformation. If we there were no error cases we might write, in a Pythonoid sort of notation:

 sink = transform3(transform2(transform1(source)))

If a stage can error out, we might have these structural alternatives to consider. One is Go style:

(fail1, result1) = transform1(source)
if fail1 == true:
     status = Exception1
else:
     (fail2, result2) = transform2(result1)
     if fail2 == true:
         status = Exception2
     else:
         (fail3, result3) = transform3(result1)
         if fail3 == true:
             status = Exception3
         else:
             sink = result3
             status = OK

The other style is with a catchable exception:

status = OK
try:
    sink = transform3(transform2(transform1(source)))
except (Exception1, Exception2, Exception3) as err:
    status = err

I don’t think there’s even a colorable argument that the Go structure is better in a case like this. Look at all those extra variables, that eye-confusing ladder structure, the defect-prone near-but-not-quite repetition of code.

An early reviewer pointed out that if the Go code were an entire function it could be expressed something like this:

func pipeline(source T)  {
{
        result1, err1 := transform1(source)
        if err1 != nil {
          return err
        }

        result2, err2 := transform2(result1)
        if err2 != nil {
          return err
        }

        result3, err3 := transform3(result2)
        if err3 != nil {
          return err

        return nil
}

That’s still a lot of eyeball friction compared to functional-style with exceptions. And it gets worse faster as the number of stages rises.

My problem was that I kept finding analogous situations in my translation. The specific one that motivated the above pseudocode was in a feature called "extractor classes". There are little bots that run the client tools of a VCS to mine the output for its metadata. It’s actually a five- or six-stage process wherein any command failure requires an abort.

In these cases moving to Go style produced a serious loss of clarity. And a rising feeling that I wanted my exceptions back (and in fact the extractor-class code now contains the one real instance of my exceptions kludge). Which leads to this:

  • Aesthetic doubt. I’ve never written a general-purpose language, but I have designed way more than my share of DSLs and declarative markups, and from this I have learned a heuristic for doing engineering that I won’t regret. For any given capability X:

Being able to express X elegantly is a good place to be. Leaving out X entirely for safety and verifiability can be a good choice, and is at least defensible on those grounds. But if you implement X in a half-hearted, weak way that requires ugly code to use and fails to actually foreclose the conceptual problems you were trying to dodge, that’s a bad place to be.

That bad place is where Go is right now with respect to nonlocal control transfers, and why I had to write my kludge.

Interestingly, I was also able to come up with a very minimalist solution. No new syntax, two minor new compilation rules.

To motivate it, let’s set the goal of being able to rewrite my example like this:

package main

import "fmt"

type exception struct {
        class string
        message string
}

func (e exception) Error() string {
        return e.message
}

func throw(class string, msg string, args ...interface{}) {
        e := new(exception)
        e.class = class
        e.message = fmt.Sprintf(msg, args...)
        panic(e)
}

func catch(accept string) *exception {
        if x := recover(); x == nil {
                return nil
        }
        err := x.(*exception)
        if err.class == accept {
                return err
        }
        panic(x)
}

func main() {
        defer println("Defer 1")
        defer println("Defer 2")
        defer println("Defer 3")

        defer func() {
                fmt.Println("Recover:", catch("recoverable"))
        }()
        throw("recoverable", "Don't Panic!!!")

        fmt.Println("Unreachable.")
}

That is rather less ugly, actually pretty reasonable if the implementations of throw and catch aren’t staring you in the face. And all it would take to get there is two minor loosenings of restrictions.

  1. The panic function has a new property, "terminating". If the compiler can prove that all exit paths from a function invoke terminating functions, it is marked "terminating". The effect of this property is to suppress "missing return" errors on any code path from call of a terminating function to exit of its caller, but not on other paths to exit.

  2. A recover() call is no longer required to be within the lexical frame of a defer(). It can be in a helper called by the defer clause (but still within the call scope of a defer). For safety we’d need an additional rule that a go clause in the helper puts the code it runs out of scope for purposes of this check.

Absence of iterators

Having Python iterators go missing is really annoying for reposurgeon, in which lazy evaluation of very long lists is a frequent requirement.

Here’s the type example. I have in my repository representation a list of possibly hundreds of thousands of events. A subset of these events is Commit objects. I would like to be able to write

        for i, commit := range repo.commits() {
                do_stuff_to(commit)
        }

In Python it is easy and natural to write commits() as an iterator which lazily walks the repository event list looking for Commit objects. Each time it is called it either returns with "yield", handing back the next commit, or actually returns - which is a signal that the for loop should terminate.

I can’t do this in Go; I have to write commits() to return an entire constructed slice made by filtering the event list. Which is annoying for long lists, especially when it might well terminate early.

Sure, there’s an alternative. It looks like this…

        for i, event := range self.events {
                switch.event.(type) {
                case *Commit:
                        do_stuff_to(event.(*Commit))
        }

…and about which what I have to say is "Ugh!". That code does not say "walk through all commits", it says "walk through all events and do something to the ones that happen to be commits". I don’t want to wander into event-land here; that type-assertion/cast pair looks altogether too much like a defect attractor. Also, unnecessary eyeball friction.

I had no good idea what could be done about this. I read Ewen Cheslack-Postava’s excellent discussion of iterator patterns in Go and agreed with him that none of them are really satisfactory.

Then, on my second reading, I had a brainstorm. I found the trivial Go extension that would give iterators with no new syntax, no hidden magic, and no yield/return distinction:

New evaluation rule on how to interpret for loops when the range operand is a callable: the loop runs until the callable returns the zero value of its type.

So, with that I could write a Repository method like this:

func (repo *Repository) commits() func()*Commit {
        idx := -1
        return func() *Commit {
                for {
                        if idx++; idx >= len(self.events) {
                               return nil
                        }
                        if _, ok = self.events[idx].(*Commit); ok {
                                return self.events[idx]
                        }
                 }
        }
}

…and there I have it. An iterator, with exactly the same lifetime as the for loop.

I suggest that this be adopted for Go 2. Small, easy new evaluation rule, big gain in expressiveness.

Hieratic documentation

Figuring out how to do type-safe polymorphism in the event list was more difficult than it should have been. The problem here wasn’t the Go language, it was the official (and unofficial) documentation.

There are two problems here, one of organization and one of style.

The organization problem is that there isn’t one. The official Go documentation seems to center on the library API docs, the specification, the Tour, and a couple of "official" essays written for it. It also includes a corona of white papers and blog posts. Often these are valuable deep dives into specific aspects of the language even when they are notionally obsolete. Some of them are outside the boundaries of the official documentation site.

For example, I got substantial help understanding interfaces from an old blog post by Ian Lance Taylor (one of the Go devs) that was offsite, dated from 2009, and contained obsolete implementation details.

The high-level problem is that while the devs have done a praiseworthy and unusually effective job of documenting their language considering the usual limitations of documentation-by-developers, finding things in the corona is hard. And knowing what’s current is hard.

The documentation is (dis)organized in such a way that it’s difficult to know what you still don’t know after reading a Tour page or blog entry or white paper. There should be more "But see here for a dangerous detail" links, in particular to the language specification.

Style. Go has a problem that is common to new languages with opinionated developers (this is part of "the usual limitations" above). There are one or two exceptions, but the documentation is predominantly written in a terse, hieratic style that implicitly assumes the reader already inhabits the mindset of a Go developer.

It is not very good at providing an entry path into that mindset. Not even for me, and I’m an extreme case of the sort of person for whom it should do an effective job if it can do that for anyone.

There is a fix for both problems. It is not magic, but it is doable.

The Go dev team should bring in a documentation specialist with no initial knowledge of Go and a directive to try to maintain an outside-in view of the language as he or she learns. That specialist needs to be full-time on the following tasks:

(1) Edit for accessibility - a less hieratic style

(2) Maintain a documentation portal that attempts to provide a reasonable map of where everything is and how to find it.

(3) Curate links to third-party documents (for example notable Stack Overflow postings), with dates and attached notes on what parts might be obsolete and when the document was last reviewed for correctness.

(4) Bring the very best third-party stuff inside, onto https://golang.org/doc/.

Note: After writing this, I had an even worse time digging up and fixing in my mind all the details of how defer/panic/recover works. It’s almost all documented somewhere, though Peter Seebach and I ended up writing a FAQ on how to set local variables from a defer clause to clear up minor confusion. There’s a very helpful blog post on the general topic. But the blog post leaves out the crucial detail that recover returns interface {}, not error; this tripped me up when I was writing my kludge, and I ended up on IRC getting referred to the formal Go specification.

This is all too typical. Everything makes sense once you know it, but before you know it critical details are often lurking in places you have no way of knowing you should look.

Attention to the problem and a good techical writer/editor can fix this.

Accentuating the positive

I think the Go translation of reposurgeon is going to be better - more maintainable - code than the Python original, not just faster. And this is not because I’m rewriting or refactoring as I go; I’ve explained that I’m trying very hard to avoid that. It’s that Go’s minimalistic approch actually…works.

I see a maintainability benefit from the static typing. The Go type system does what a type system is supposed to do, which is express program invariants and assist understanding of its operational semantics.

I also see a maintainability benefit from how easy Go makes it to write unit tests in parallel with code. I am fully exploiting this, and expect it to make life much less painful when I get to end-to-end testing of the translation and have to debug it in the large.

I have to call out the Go time library as a particularly good piece of work. Having the basic timestamp property be location-aware with its presentation modified by the implied zone offset simplified a lot of cruft out of the handling of commiter/author dates in Python.

Now that I’ve seen Go strings…holy hell, Python 3 unicode strings sure look like a nasty botch in retrospect. Good work not falling into that trap.

Envoi: Actionable recommendations, in priority order

  1. Keyword arguments should be added to the language.

  2. A technical writer with an outside-in view of the language should be hired on to do an edit pass and reorganization of the documents.

  3. try/finally should be added to the language.

  4. Yes, throw()/catch() needs to be writeable in the language. Two minimal relaxations of compilation rules make writing it possible.

  5. Lookbehinds should be added to the regexp library.

  6. If generics don’t fly, a map-over-slice intrinsic should be added.

Not quite actionable yet:

  • Absence of sum types creates an actual hole in the type-safety of the language.