Making it easier to merge-through

barrettj12 · 20 July 2023 08:41

Currently, the Juju repo has five active branches: 2.9, 3.1, 3.2, 3.3, main. Patches are merged into the earliest applicable branch, then merged through to later branches. Because the branches have often diverged significantly, merging through can be difficult and painful:

- There may be merge conflicts to resolve.
- Even if a commit doesn’t create merge conflicts, it may not be correct when applied to the later branch, so the commit might have to be modified when merging through.

To make our lives easier as Juju developers, this document sets out some guidelines that help merges go through correctly. They are:

1. Don’t repeat yourself
2. Keep commits and PRs small
3. Keep merges small
4. Use git-imerge

Don’t repeat yourself (DRY)

Yes, this classic coding principle also helps to make merging easier. Let’s illustrate why with an example.

Suppose we have a test file with a variety of similar tests. (Let’s illustrate with only two tests - you can imagine there are many more). We might decide to copy-paste code between the tests, so we have something like:

func (s *suite) TestFoo() {
  thing := MyThing{
    name: "mything",
    number: 1,
  }
  // test stuff here...
}

func (s *suite) TestBar() {
  thing := MyThing{
    name: "mything",
    number: 2,
  }
  // test stuff here...
}

Imagine we now make a small change, such as exporting a field of MyThing by renaming name to Name. We’ll have to make this change in every test that copies the struct literal:

commit X -> branch A
---------------------------------------
  func (s *suite) TestFoo() {
    thing := MyThing{
-     name: "mything",
+     Name: "mything",
      number: 1,
    }
    // test stuff here...
  }

  func (s *suite) TestBar() {
    thing := MyThing{
-     name: "mything",
+     Name: "mything",
      number: 2,
    }
    // test stuff here...
  }

How can this cause issues with merging? There are two scenarios in which it might:

1. Imagine that in branch B, we removed TestBar. Now, when we try to merge through commit X, we get a conflict, because commit X modifies lines that branch B has deleted.
2. Imagine that in branch B, we added a new test TestBaz by copy-pasting one of the other tests again:
   ```
   func (s *suite) TestBaz() {
     thing := MyThing{
       name: "mything",
       number: 3,
     }
     // test stuff here...
   }
   ```
   Commit X doesn’t touch TestBaz, because TestBaz doesn’t exist on branch A. The merge applies without conflicts and picks up the rename in MyThing, but TestBaz still sets the old name field, so the tests no longer compile.

To avoid these problems, we should instead structure our tests like so:

func (s *suite) TestFoo() {
  thing := newMyThing()
  // test stuff here...
}

func (s *suite) TestBar() {
  thing := newMyThing()
  thing.number = 2
  // test stuff here...
}

func (s *suite) TestBaz() {
  thing := newMyThing()
  thing.number = 3
  // test stuff here...
}

func newMyThing() MyThing {
  return MyThing{
    name: "mything",
    number: 1,
  }
}

Then, all we need to do is change newMyThing:

  func newMyThing() MyThing {
    return MyThing{
-     name: "mything",
+     Name: "mything",
      number: 1,
    }
  }

and it will merge through to branch B correctly and without conflicts.
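One possible way to take this a step further, sketched here as an assumption rather than as something taken from the Juju codebase, is to let the helper accept small option functions. Tests then never name the struct fields directly, so renaming any field touches only the helper file. The names thingOption and withNumber are hypothetical and exist only for this illustration.

```go
package main

import "fmt"

// MyThing mirrors the struct from the examples above, after the
// name field has been exported as Name.
type MyThing struct {
	Name   string
	number int
}

// thingOption is a hypothetical helper type (not from the original
// post) that adjusts a MyThing after the defaults have been set.
type thingOption func(*MyThing)

// withNumber is a hypothetical option that overrides the default number.
func withNumber(n int) thingOption {
	return func(t *MyThing) { t.number = n }
}

// newMyThing is the only place that spells out the struct literal,
// so a field rename only needs to be made (and merged through) here.
func newMyThing(opts ...thingOption) MyThing {
	t := MyThing{
		Name:   "mything",
		number: 1,
	}
	for _, opt := range opts {
		opt(&t)
	}
	return t
}

func main() {
	foo := newMyThing()              // defaults, as in TestFoo
	bar := newMyThing(withNumber(2)) // override, as in TestBar
	fmt.Println(foo.Name, foo.number)
	fmt.Println(bar.Name, bar.number)
}
```

Whether this is worth the extra indirection depends on how many tests share the fixture; for a handful of tests, the plain newMyThing helper shown above is probably enough.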

Keep commits and PRs small

Again, this is good general development advice that particularly helps when merging: smaller changes produce smaller conflict sets, which are much easier to resolve.

Keep merges small

Especially when there are a lot of commits backed up, it can be tempting to merge them all through in one huge PR. It seems intuitive that doing them all at once will be less work. However, this is usually not true, because Git merges are all-at-once, not incremental. When you merge through a set of commits, you have to resolve all conflicts for all commits at the same time. This can be very overwhelming compared to merging in smaller chunks. It is much easier to make a mistake and merge incorrectly when there are 50 conflicts vs 5 conflicts.

A big merge can also create more problems down the road, when merging through to even later branches. The initial merge creates a single merge commit that encompasses many commits, so we lose the ability to consider those commits separately.

As an example, say we have three development branches A, B, C, with merges A -> B -> C. Say we have three commits X, Y, Z on branch A that need to be merged through:

A |--o---X--Y--Z------>
  |   \
B |----o-------------->
  |     \
C |------o------------>

If we merge them all to B in one PR, this creates a merge commit M containing changes from X, Y, Z:

A |--o---X--Y--Z------>
  |   \         \
B |----o---------M---->
  |     \
C |------o------------>

Now, if we want to merge through to branch C, we have no choice but to merge them all at once:

A |--o---X--Y--Z------>
  |   \         \
B |----o---------M---->
  |     \         \
C |------o---------x-->

which could be difficult if it creates a lot of conflicts.

Conversely, if we do a separate merge for each of X, Y, Z:

A |--o---X--Y--Z------>
  |   \   \  \  \
B |----o---o--o--o---->
  |     \
C |------o------------>

then when merging through to branch C, we have the choice to merge each commit separately if we want:

A |--o---X--Y--Z------>
  |   \   \  \  \
B |----o---o--o--o---->
  |     \   \  \  \
C |------o---x--x--x-->

Do small and frequent merges to make your life (and others’ lives) easier. You can merge up to a particular commit with git merge <SHA>, which brings in that commit together with any of its ancestors that have not been merged yet.

Use git-imerge

To assist with “keeping merges small”, there is a tool called git-imerge that allows you to incrementally merge one branch into another. It can be used as a drop-in replacement for the standard git merge.