Stacked PRs: Code changes as narrative

A brief write-up on how stacked PRs work and why they work well at Aviator.
Travis is a software engineer at Aviator.

Pull request review time does not scale linearly with the size of the change. A pull request that’s twice as large takes more than twice as long to review, or at least to review thoroughly. As a result, big PRs tend to either languish and grow stale or simply get rubber-stamped.

Why? I’m not entirely sure, but one of the biggest reasons is that with bigger PRs, it’s harder to untangle the story that the pull request is telling. Consequently, reviewers have to spend longer trying to understand the narrative of the pull request: why each bit of code is changing the way it is.

Often, big PRs start out as little PRs. Developers just want to fix one little bug. But, as you likely already know, little bugs aren’t always as little as they seem. What should have been a ten-line change turns into a three-hundred-line refactor. It’s easy to see how the story gets lost: why is a function so far away from where the bug is (or at least, where we think the bug is) changing so much?

Alternatively, big PRs are created in an attempt to keep hacking on a feature. In order to implement a button on the frontend, we have to write the frontend code. And the backend code. And the API layer glue. And the database schema migration. And now our PR for a single button is massive.

Enter stacked PRs

Stacked PRs work by… well… stacking your PRs. A stack is essentially a sequence of PRs: the first PR to add the database migration is created from main, the second is stacked on top of the first PR, the third is stacked on top of the second, and so on. Each stacked PR is created with the assumption that its ancestors will be merged (though, as we’ll see below, we can handle the case where those PRs need to change as well!).

Stacked PRs work so well — in this humble author’s opinion — because they allow developers to construct a narrative with their PRs. The refactor of the library function can be done in a separate PR from the PR that needs the new functionality introduced. The backend code can build upon the database schema definitions that were done in the previous PR.

And, critically, stacked PRs mean developers can start working on their second, third, and fourth PRs while the first PR is being reviewed and iterated upon.

That’s great from the reviewer’s perspective, but stacking PRs also allows the developer writing the code in the first place to move faster. They can now work on multiple parts of a feature without waiting for reviewers to approve previous PRs. This couldn’t be more important in a world where devs are working remotely (and across who knows how many time zones). Being blocked and waiting for your coworkers is, at best, frustrating. Sometimes it’s genuinely heartbreaking. Either way, the emotional and practical costs make for a bad time.

Technical deep dive: How stacked PRs work with Aviator

We’ve taken a “git native” approach to creating stacked PRs. Stacked branches are just normal Git branches that were branched from their stack parent.

If that sounds confusing, let’s look at a quick (and realistic!) example. We want to add a like button to the website we’re building, and it makes sense to do it in three phases:

  1. Add backend service code (e.g., database schema changes and model logic)
  2. Add a REST API interface
  3. Implement the frontend

Without stacked PRs, we’d have to implement this in one big PR or wait for each PR to be approved and merged before we could start on the next piece of the feature. Instead, we can open three separate PRs and have them reviewed both independently and in parallel (possibly even by different people!).

The stacked branching strategy

👉 In Git, a branch is like a linked list of commits. The branch name (such as main or like-button-rest-api) is essentially just a pointer to a particular node in the linked list.

Technically, a Git branch is closer to a directed acyclic graph due to merge commits, but for this discussion, we’ll assume there are no merge commits on the stack.

Since we want to open three different PRs, we need to create three different branches. To do this, we’ll simply create each new branch off of the previous branch.
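To make the "linked list plus pointers" picture concrete, here is a minimal Python sketch of the three stacked branches. It is a toy model, not how Git stores anything: the Commit class, the history helper, and the root commit label M1 are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Commit:
    label: str                   # e.g. "1A"; labels are illustrative only
    parent: Optional["Commit"]   # None for the root commit


def history(tip: Commit) -> list[str]:
    """Walk parent pointers from a branch tip back to the root (oldest first)."""
    labels = []
    node: Optional[Commit] = tip
    while node is not None:
        labels.append(node.label)
        node = node.parent
    return labels[::-1]


# A branch is just a name that points at one commit.
branches = {"main": Commit("M1", None)}  # "M1" is a hypothetical root commit

# like-button-backend starts at main and gets two commits.
c1b = Commit("1B", Commit("1A", branches["main"]))
branches["like-button-backend"] = c1b

# like-button-rest-api starts at like-button-backend.
c2a = Commit("2A", branches["like-button-backend"])
branches["like-button-rest-api"] = c2a

# like-button-frontend starts at like-button-rest-api.
c3b = Commit("3B", Commit("3A", branches["like-button-rest-api"]))
branches["like-button-frontend"] = c3b

print(history(branches["like-button-frontend"]))
```

Each branch's history contains everything below it in the stack, which is exactly what "branched from their stack parent" means here.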

Updating and committing to stacked branches

One of the primary reasons to use stacked PRs is to enable easier code review — which means that it’s actually pretty likely that you’ll have to modify branches that have other branches stacked on top of them.

Modifying a stacked branch is just like modifying any other branch: perform your modifications and create a new commit!

However, if you commit to a branch that has children, you end up in a scenario where the parent and child branches have diverged!

At this point, like-button-frontend doesn’t contain commit 2B, which we’ve added to like-button-rest-api. That means all of the CI and checks we’re running for like-button-frontend are now out of date!

To fix this, we can rebase like-button-frontend on top of like-button-rest-api. This effectively “replays” commits 3A and 3B on top of 2B.

👉 Here we’ve notated the rebased commits as 3A' and 3B' to illustrate the fact that, according to Git, these are actually new commits that are different from the old 3A and 3B. This is because Git considers the parents of a commit as part of the identity (i.e., hash) of that commit. Since 3A' has parent 2B and 3A has parent 2A, they are different commits. This is why rebasing requires a force push: Git thinks we’re losing 3A and 3B and wants to make sure we mean to erase those commits from the branch.
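The point about commit identity can be illustrated with a small Python sketch. It assumes a heavily simplified commit hash that covers only the parent ID and a description of the change (real Git also hashes the tree, author, timestamps, and message), and the change descriptions are made up for the example.

```python
import hashlib


def commit_id(parent_id: str, change: str) -> str:
    """Toy commit hash: depends on the parent's ID and on the change itself.

    This is a simplification for illustration, not Git's actual object format.
    """
    payload = f"parent={parent_id}\nchange={change}".encode()
    return hashlib.sha1(payload).hexdigest()[:8]


def replay(changes: list[str], onto: str) -> list[str]:
    """Rebase in miniature: re-create each change on top of a new base."""
    ids = []
    parent = onto
    for change in changes:
        parent = commit_id(parent, change)
        ids.append(parent)
    return ids


# like-button-rest-api: 2A, then the new commit 2B added during review.
id_2a = commit_id("1B", "2A: add REST endpoint")            # hypothetical change
id_2b = commit_id(id_2a, "2B: address review feedback")     # hypothetical change

frontend_changes = ["3A: add button component", "3B: wire button to API"]

before = replay(frontend_changes, onto=id_2a)  # 3A and 3B, based on 2A
after = replay(frontend_changes, onto=id_2b)   # 3A' and 3B', based on 2B

print("before rebase:", before)
print("after rebase: ", after)
```

The changes themselves are identical in both runs, yet every ID differs, because each ID depends on its parent's ID all the way down the chain.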

Stacked branches become stacked PRs

Since GitHub doesn’t have native support for Stacked PRs, we have to be clever when it comes time to open PRs. If we opened a PR from like-button-frontend into main, GitHub would show the diff from all three branches, which defeats the whole point of using Stacked PRs!

Instead, we open PRs in a linked-list-like fashion: each PR targets its parent branch in the stack, and only the first PR targets main.

This works out really well! The diff for like-button-frontend only contains the changes from commits 3A and 3B, as intended!
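As a rough sketch of why the choice of base branch matters, the Python below picks a base for each PR in the stack and lists the commits each PR would show. Histories are modeled as plain lists of commit labels, and the helper names (pr_bases, pr_commits) and the root commit M1 are invented for the example; this is not a GitHub or Aviator API.

```python
stack = ["like-button-backend", "like-button-rest-api", "like-button-frontend"]

# Full history of each branch, oldest commit first ("M1" is a hypothetical root).
history = {
    "main": ["M1"],
    "like-button-backend": ["M1", "1A", "1B"],
    "like-button-rest-api": ["M1", "1A", "1B", "2A", "2B"],
    "like-button-frontend": ["M1", "1A", "1B", "2A", "2B", "3A", "3B"],
}


def pr_bases(stack: list[str], trunk: str = "main") -> dict[str, str]:
    """Each PR targets the branch below it; the bottom PR targets the trunk."""
    bases = {}
    parent = trunk
    for branch in stack:
        bases[branch] = parent
        parent = branch
    return bases


def pr_commits(head: str, base: str) -> list[str]:
    """Commits a PR shows: those in the head's history but not in the base's."""
    base_commits = set(history[base])
    return [c for c in history[head] if c not in base_commits]


for head, base in pr_bases(stack).items():
    print(f"{head} -> {base}: {pr_commits(head, base)}")

# For comparison, a PR from the top of the stack straight into main.
print("frontend -> main:", pr_commits("like-button-frontend", "main"))
```

With stacked bases, each PR shows only its own two commits; pointing the top branch at main instead pulls in all six.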

Merge time! (oh wait turns out it’s hard 🤧)

Merging is a very complicated subject, and to talk about it, we need to take a quick digression into the “squash merge” strategy.

A “squash merge” isn’t technically a merge in the Git sense. Instead, a squash merge generates a new commit that contains the diff from the commits on a branch and adds that commit to main. GitHub (and many other Git hosts) will consider the branch “merged” (and close the associated pull request), but Git doesn’t actually consider the commits from the original branch as merged into main since the squash-merge-commit isn’t technically related to any of the original commits.

👉 An actual Git “merge commit” is a commit that has two (or more!) parents: the mainline branch (i.e., the branch that is being merged into) and the branch(es) that are being merged into the mainline. Git then considers the history of the mainline branch to contain the commits that were merged in. Many codebases use squash commits instead of merge commits because the history of individual feature branches is often messy and uninteresting.

Ultimately, this means we can’t merge each PR in order (at least while using squash commits). Let’s consider the state of the branches after we merge the first branch in the stack.

Here, we’ve added the commit 1S to main which represents the “squash merge” of the like-button-backend branch. But, critically, Git doesn’t actually consider the like-button-backend branch merged, even though GitHub does.

Now when we want to merge the second branch, like-button-rest-api, into main, Git tries to calculate all of the commits that it needs to merge into main. Since 1A and 1B are part of the history of like-button-rest-api, and not main, ultimately the squash merge will consist of the diff of 1A, 1B, 2A, and 2B. Since 1A and 1B have already been applied, this almost always results in a merge conflict.
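Here is a small Python sketch of that calculation. It models each branch as the list of commits reachable from it and compares by commit identity only, which is the key assumption; the merge commit label 1M and the root commit M1 are hypothetical, added just for contrast.

```python
def commits_to_merge(head: list[str], base: list[str]) -> list[str]:
    """Commits reachable from head that are not reachable from base.

    Membership is decided by commit identity, not by what the diffs contain.
    """
    seen = set(base)
    return [c for c in head if c not in seen]


backend = ["M1", "1A", "1B"]
rest_api = ["M1", "1A", "1B", "2A", "2B"]

# Squash merge: main gains one brand-new commit, 1S, carrying the combined diff.
main_after_squash = ["M1", "1S"]

# True merge commit (hypothetical label 1M): 1A and 1B become reachable from main.
main_after_merge_commit = ["M1", "1A", "1B", "1M"]

print("after squash merge:", commits_to_merge(rest_api, main_after_squash))
print("after merge commit:", commits_to_merge(rest_api, main_after_merge_commit))
```

After the squash, 1A and 1B still count as unmerged even though their changes already live in 1S, so they get applied a second time.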

Merge time, for real!

There are a few ways to get around the merge issues presented above.

One option is to only merge the top branch of the stack (like-button-frontend in the example above). This is what Aviator does today (unless using our fast-forward mode — see below for details!).

Another option is to rebase each stacked PR after its parent is merged. This essentially means that we replay (git cherry-pick) the commits from a stacked branch on top of the squash commit generated by merging its parent branch.
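A minimal Python sketch of that replay, using the same list-of-labels model of branch history. The rebase_onto helper is hypothetical, and the trailing ' on a label simply marks a replayed commit as a new commit; none of this reflects Aviator's actual implementation.

```python
def commits_to_merge(head: list[str], base: list[str]) -> list[str]:
    """Commits reachable from head that are not reachable from base."""
    seen = set(base)
    return [c for c in head if c not in seen]


def rebase_onto(branch: list[str], old_parent: list[str], new_base: list[str]) -> list[str]:
    """Replay the commits unique to `branch` on top of `new_base`.

    Replayed commits get a trailing ' because they are new commits.
    """
    parent_commits = set(old_parent)
    own = [c for c in branch if c not in parent_commits]
    return new_base + [c + "'" for c in own]


main = ["M1", "1S"]                      # main after squash-merging the backend PR
backend = ["M1", "1A", "1B"]
rest_api = backend + ["2A", "2B"]

rest_api_rebased = rebase_onto(rest_api, old_parent=backend, new_base=main)

print(rest_api_rebased)
print(commits_to_merge(rest_api_rebased, main))
```

Once the branch sits on top of 1S, only its own two commits remain to be merged, and 1A and 1B drop out of the picture.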

This approach can cause some issues in high-throughput repositories where Aviator’s parallel merge queue mode is enabled. Since we can’t start rebasing the next PR until the previous one is merged, the merge process becomes serialized on running CI for each branch in the stack (since CI is re-triggered after rebasing a branch).

We do have plans to support this mode in the near future for repositories that have sequential mode enabled.

Bonus: Aviator’s fast-forward mode

Aviator’s fast-forward mode is a subset of parallel mode that works by only fast-forwarding your mainline branch to commits with known-good CI states.

👉 Above, we talked about how Git branches are just pointers to commits. In fast-forward mode, we simply move this pointer forward to a commit that we’ve already validated (instead of merging a PR, which generates a new commit that hasn’t been validated).

This means we run the validation in a new branch where we’re able to squash each branch into a single commit (1S, 2S, and 3S in our example).

Once the checks pass for the validation branch, we fast-forward main to 3S. This means that the commit that ends up in main is exactly the same (i.e., same commit hash) as the one we validated.
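The following Python sketch shows the idea in miniature. It is a toy model under stated assumptions: commit IDs come from a simplified hash of the parent ID plus a description, the CI result is a stand-in boolean, and the function names are made up; it is not how Aviator is implemented.

```python
import hashlib

parents: dict[str, str] = {}  # commit ID -> parent commit ID ("" for the root)


def commit(parent_id: str, change: str) -> str:
    """Create a toy commit whose ID depends on its parent and its change."""
    cid = hashlib.sha1(f"{parent_id}\n{change}".encode()).hexdigest()[:8]
    parents[cid] = parent_id
    return cid


def is_ancestor(ancestor: str, descendant: str) -> bool:
    """True if `ancestor` is reachable by walking parents from `descendant`."""
    node = descendant
    while node:
        if node == ancestor:
            return True
        node = parents[node]
    return False


def fast_forward(branches: dict[str, str], name: str, target: str) -> None:
    """Move a branch pointer forward; refuse if that would drop commits."""
    if not is_ancestor(branches[name], target):
        raise ValueError(f"{name} cannot be fast-forwarded to {target}")
    branches[name] = target


branches = {"main": commit("", "M1")}  # "M1" is a hypothetical root commit
old_main = branches["main"]

# Validation branch: one squash commit per stacked branch (1S -> 2S -> 3S).
tip = branches["main"]
for stacked in ["like-button-backend", "like-button-rest-api", "like-button-frontend"]:
    tip = commit(tip, f"squash of {stacked}")
branches["validation"] = tip

validated = branches["validation"]
ci_passed = True  # stand-in for the real CI result on the validation branch
if ci_passed:
    fast_forward(branches, "main", validated)

print(branches["main"] == validated)
```

No new commit is created when main moves, so the ID that CI validated is the ID main ends up pointing at.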

Aviator will automatically mark each of the individual PRs we created earlier as closed.

Aviator: Automate your cumbersome merge processes

Aviator automates tedious developer workflows by managing Git pull requests (PRs) and continuous integration (CI) test runs to help your team avoid broken builds, streamline cumbersome merge processes, manage cross-PR dependencies, and handle flaky tests while maintaining security compliance.

There are 4 key components to Aviator:

  1. MergeQueue – an automated queue that manages the merging workflow for your GitHub repository to help protect important branches from broken builds. The Aviator bot uses GitHub Labels to identify Pull Requests (PRs) that are ready to be merged, validates CI checks, processes semantic conflicts, and merges the PRs automatically.
  2. ChangeSets – workflows to synchronize validating and merging multiple PRs within the same repository or multiple repositories. Useful when your team often sees groups of related PRs that need to be merged together, or otherwise treated as a single broader unit of change.
  3. FlakyBot – a tool to automatically detect, take action on, and process results from flaky tests in your CI infrastructure.
  4. Stacked PRs CLI – a command line tool that helps developers manage cross-PR dependencies. This tool also automates syncing and merging of stacked PRs. Useful when your team wants to promote a culture of smaller, incremental PRs instead of large changes, or when your workflows involve keeping multiple, dependent PRs in sync.
