Structs are pretty cool

timClicks · 3 March 2020 13:17

Part of https://discourse.jujucharms.com/t/read-before-contributing/47, an opinionated guide by William Reade

Structs Are Pretty Cool

Because structs are the underlying bones of an implementation, there are different sets of advice that apply more or less strongly in different circumstances – it’s hard to identify many practices that are inherently either bad or good – but I’ve still got a few heuristics.

Either Export Your Fields Or Don’t

…but try to avoid mixing exported and unexported fields in the same struct. Any exported field could be written by anyone; any unexported field that’s meant to be consistent with any exported one is thus at risk.

Mixed exportedness also sends mixed messages to your collaborators – they can create and change instances via direct access, but they don’t have full control, and that’s a bit worrying. Better to just decide whether your type is raw data (export it all) or a token representing internal state (hide fields, expose methods).

If You Export Fields, Expose A `Validate() error` Method

By exporting fields, you’re indicating that collaborators can usefully use this type. If you’re going to do that, it’s only polite to let them validate what they’re planning to do without forcing them to actually do it – if, say, some complex operation spec requires a heavyweight renderer to run it, it’s just rude to force your client to create that renderer before telling them the job never had a chance in the first place.

And give it a value receiver to signal that it’s not going to change the instance; and make a point of not changing anything, even if some of the fields are themselves references. In fact, more generally:

If You Export Fields, You’re A Value Type

If you have methods, accept value receivers. If you’re returning the type, return a value not a pointer. If you’re accepting the type, accept a value not a pointer.

If your type has exported maps or slices, beware, because they will not be copied nicely by default; make a point of copying them (including any nested reference types…) whenever your type crosses a boundary you control.

Channels Are Awesome But Deadly

They’re fantastic, powerful, and low-level. Within a single narrow context, orchestrating requests and responses and updates and changes, there is no finer tool: but they only work right when both sides agree on how they’re meant to be used, so they should not be exposed without great paranoid care.

Channels Should Be Private

In fact, almost the only channels you should ever see exported are <-chan struct{}s used to broadcast some single-shot event. They’re pretty safe – neither sender nor receiver has much opportunity to mess with the other.

The other places you see them are basically all related to watchers of one class or another; we’ll talk about them in the Juju-specific bits further down. For now just be aware that they come with a range of interesting drawbacks.

Channels Live In Select Statements

It is vanishingly rare for it to be a good idea to unconditionally block on a channel, whether sending or receiving or ranging. That means you absolutely depend on your collaborators to act as you expect, and that’s a bad idea – mismatched expectations can leave you blocked forever, and you never want that.

Buffers Are A Trap

It is depressingly common for people to design subtly-flawed channel protocols… and respond to observed flakiness by buffering the channel.

This leaves the algorithm exactly as flaky as before, but explodes the space of possible states and makes it orders of magnitude harder to debug. Don’t do that.

Naked Sends Into Buffers Are OK

…as long as they’re locally verifiable by inspection. It’s pretty rare to have a good reason to do this, but it can be tremendously useful in tests.

Nil Channels Are Powerful

…because they go so nicely with select statements. There’s a very common select-loop pattern in juju that looks something like:

var out outChan
var results []resultType
for {
    select {
    case <-w.abort:
        return errAborted
    case value := <- w.in:
        results = append(results, value)
        out = w.out
    case out <- results:
        results = nil
        out = nil
    }
}

…or, more concretely, for a goroutine-safe public method that invokes onto an internal goroutine:

func (w *Worker) GetThing(name string) (Thing, error) {
    response := make(chan thingResponse)
    requests := w.thingRequests
    for {
        select {
        case <-w.catacomb.Dying():
            return errors.New("worker stopping")
        case requests <- thingRequest{name, response}:
            requests = nil
        case result := <-response:
            if result.err != nil {
                return nil, errors.Trace(result.err)
            }
            return result.thing, nil
        }
    }
}

Understand what it’s doing, observe instances of it as you read the code, reach for channel-nilling as a tool of first resort when orchestrating communication between goroutines.

It’s interesting to compare these with lock-based approaches. A simple:

w.mu.Lock()
defer w.mu.Unlock()

…at the top of (most) exported methods is an effective way of synchronising access to a resource, at least to begin with; but it’s not always easy to extend as the type evolves. Complex operations that don’t need shared memory all the time end up with tighter locking; and then all internal methods require extra care to ensure they’re always called with the same lock state (because any of them might be called from more than one place).

And then maybe you’re tempted into an attempt to orchestrate complex logic with multiple locks, which is very very hard to get right – in large part because locks take away your ability to verify individual methods’ correctness independent of their calling context. And that’s dangerous for the same reason that globals are poisonous: the implicit factors affecting your code are much harder to track and reason about than the explicit ones. It’s worth learning to think with channels.

Channels Transfer Responsibility

It’s perfectly fine to send mutable things down channels – so long as you immediately render yourself incapable of mutating that state. (Usually, you just set something to nil.) The result-aggregation example above (with var results []resultType) does exactly that: on the branch where results are successfully handed over to whatever’s reading from w.out, we set results = nil not just to reset local state but to render it visibly impossible for the transfer to contribute to a new race condition.

Similarly, once you get a value out of a channel you are solely
responsible for it. Treat it as though you created it, and either tidy it up or hand responsibility on to something else.

(Note: this is another reason not to buffer channels. Values sitting in channel buffers are in limbo; you’re immediately depending on non-local context to determine whether those values ever get delivered, and that renders debugging and analysis needlessly brain-melting. Merely knowing that a value has been delivered does not imply it’s been handled, of course: the analysis remains non-trivial, but at least it’s tractable.)

Channels Will Mess You Up

This is just by way of a reminder: misused channels can and will panic or block forever. Neither outcome is acceptable. Only if you treat them with actual thought and care will you see the benefits.

Next: https://discourse.jujucharms.com/t/know-what-references-cost/2738

Structs are pretty cool