Monorepos make inner-source come to life
Part 1 of this blog series made large organizations agile and efficient by developing business capabilities through empowered, vertically integrated teams and tech stacks of manageable size. Part 2 used inner-source to boost collaboration and the collective IQ of the organization. Here in part 3, we use a mono-repository to make sharing and collaboration the economical and rational choice, enable long-term health of the company’s codebase, and improve the company culture along the way.
what is a monorepo
Many organizations have multiple code repositories, each one containing one codebase. We call this a multi-repository layout, or multirepo for short. But a code repository can store more than one logical codebase. Taking this idea to its conclusion, the organization has just a single code repository (a mono-repository or monorepo) that contains all the code, tests, test harnesses, documentation, DevOps scripting, and infrastructure definitions: everything minus the business secrets. Some monorepos also contain the source code of open-source dependencies as well as all other tools you need to compile, test, and run your applications. Monorepos are different from monolithic software architectures: a monorepo can contain many independent codebases, for example, microservices.
Many technology leaders like Google, Microsoft, Uber, Facebook, Twitter, and prolific open-source projects like Chromium and Android use monorepos and attribute a substantial amount of their engineering success to them. Others like Amazon, Apple, Intel, IBM, and Oracle use multiple repositories.
operating a monorepo
Mono-repositories contain a lot of content, often many gigabytes or more. So much code requires a scalable source code management system. Fortunately, modern Git scales pretty well. As a data point, an out-of-the-box installation of Git can clone the Linux kernel, a codebase with tens of millions of lines of code, decades of history, and a million commits by many thousands of people, in a couple of minutes and manage code changes without noticeable slowdowns. Simple performance tuning like sparse checkouts, partial clones, the untracked cache, and the manyFiles feature allows Git to handle codebases that are orders of magnitude larger. For example, Microsoft manages the entire MS Office code base with hundreds of GB of content in a Git monorepo enhanced with Scalar.
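As a rough illustration of the tuning mentioned above, the sketch below assembles the Git commands for a partial, sparse clone with the untracked cache and the manyFiles feature enabled. The repository URL and folder names are hypothetical, and the function only builds the command lines; it does not run them.

```python
from typing import List


def tuned_clone_commands(repo_url: str, target_dir: str, folders: List[str]) -> List[List[str]]:
    """Build (but do not run) Git commands for a lightweight monorepo checkout."""
    git_in_repo = ["git", "-C", target_dir]
    return [
        # Partial clone: fetch file contents on demand; start with a sparse working tree.
        ["git", "clone", "--filter=blob:none", "--sparse", repo_url, target_dir],
        # Sparse checkout: materialize only the folders this developer works on.
        git_in_repo + ["sparse-checkout", "set", *folders],
        # Opt in to Git's defaults for repositories with very many files.
        git_in_repo + ["config", "feature.manyFiles", "true"],
        # Cache the results of untracked-file scans between "git status" runs.
        git_in_repo + ["config", "core.untrackedCache", "true"],
    ]


if __name__ == "__main__":
    # Hypothetical URL and folder names, for illustration only.
    for command in tuned_clone_commands(
        "https://git.example.com/acme/monorepo.git",
        "monorepo",
        ["services/checkout", "libs/timestamps"],
    ):
        print(" ".join(command))
```

Which folders to include is a per-developer decision; the point is that nobody has to download or check out the whole repository to work on a small part of it.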
A monorepo contains many frontend and backend technologies, typically written in a variety of programming languages. Each language requires dedicated tooling. Scalable, cross-language build systems like Bazel, Buck, or Please can build, test, and package such large amounts of heterogeneous code all at once. When they use cloud compute for the heavy lifting and caching of artifacts, they can finish even large amounts of work in seconds or minutes.
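To see why cached artifacts make such builds fast, here is a simplified sketch of the content-addressed caching idea. It is not how Bazel, Buck, or Please are actually implemented, and the `Target` type and target names are hypothetical. A target's cache key is derived from its own sources plus the keys of its dependencies, so an unchanged target maps to an artifact that already exists in the shared cache.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Target:
    """A hypothetical build target: source contents plus names of its dependencies."""
    sources: Dict[str, str]
    deps: List[str] = field(default_factory=list)


def cache_key(name: str, targets: Dict[str, Target]) -> str:
    """Hash a target's sources together with the cache keys of its dependencies.

    Assumes the dependency graph has no cycles.
    """
    target = targets[name]
    digest = hashlib.sha256()
    for path in sorted(target.sources):
        digest.update(path.encode())
        digest.update(target.sources[path].encode())
    for dep in sorted(target.deps):
        digest.update(cache_key(dep, targets).encode())
    return digest.hexdigest()


def targets_to_rebuild(targets: Dict[str, Target], remote_cache: Set[str]) -> List[str]:
    """Return the targets whose key is missing from the (simulated) remote cache."""
    return sorted(name for name in targets if cache_key(name, targets) not in remote_cache)
```

Changing one library invalidates only that library and the targets that depend on it; everything else is served from the cache.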
A mono-repository receives all changes from all developers in the company. Such an amount of traffic requires minimizing long-lived code branches and leaning on trunk-based development and feature flags, copious amounts of automated testing, and continuous integration (on scalable cloud infrastructure to keep up with demand). Additional DevOps infrastructure like a merge queue concurrently tests and merges approved pull requests. Auto-rollback undoes problematic commits to un-break the build for everybody. Git automation like Git Town keeps feature branches in sync with the main code line. Code review management automatically tags code reviewers, organizes pull requests with labels and other metadata, and re-surfaces forgotten pull requests.
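The automatic tagging of reviewers can be pictured as a lookup from changed paths to owning teams. The sketch below is a simplified model, not the matching rules of any particular code-hosting platform; the folder layout and team names are made up. Each changed file is assigned to the owner of the longest matching folder prefix.

```python
from typing import Dict, Iterable, Set


def reviewers_for(changed_files: Iterable[str], owners: Dict[str, str]) -> Set[str]:
    """Pick the owner of the most specific (longest) matching folder for each file."""
    reviewers = set()
    for path in changed_files:
        matches = [folder for folder in owners if path.startswith(folder.rstrip("/") + "/")]
        if matches:
            reviewers.add(owners[max(matches, key=len)])
    return reviewers


if __name__ == "__main__":
    # Hypothetical ownership map: folder -> owning team.
    owners = {
        "services": "platform-team",
        "services/payments": "team-c",
        "libs/timestamps": "team-a",
    }
    changed = ["services/payments/api.py", "libs/timestamps/format.py"]
    print(sorted(reviewers_for(changed, owners)))
```

Files that match no folder are simply left without a reviewer in this sketch; a real setup would likely fall back to a default owner.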
Deployment can still happen individually for each codebase by executing CI jobs that are — of course — also stored in the monorepo.
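One common way to keep deployments independent inside a single repository is to trigger only the pipelines whose folders changed. A minimal sketch, assuming a hypothetical layout in which each deployable codebase lives under `services/<name>/`:

```python
from typing import Iterable, List


def services_to_deploy(changed_files: Iterable[str], root: str = "services") -> List[str]:
    """Return the services (direct subfolders of `root`) touched by a change."""
    touched = set()
    for path in changed_files:
        parts = path.split("/")
        # Expect at least root/<service>/<file>; ignore everything else.
        if len(parts) >= 3 and parts[0] == root:
            touched.add(parts[1])
    return sorted(touched)
```

In practice, the list of changed files could come from `git diff --name-only` between the last deployed commit and the new one. A real setup would also need to account for changes in shared code that a service depends on.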
benefits
Investments to develop the capabilities mentioned above are well worth it. To start, all of them are DevOps best practices. Most organizations that develop software should use them, with or without monorepos. Monorepos just force us to implement a higher level of DevOps discipline sooner rather than later, very much to our long-term benefit. At the same time, monorepos drastically simplify the overall DevOps infrastructure. With a monorepo, you don’t need package managers (npm, bundler, Maven, etc.) that much anymore. You can get rid of private artifact repositories (Artifactory, Nexus, etc.) and internal version numbers of libraries. All code is versioned together, and all dependencies are always in the correct version.
A monorepo removes the need to think about repository boundaries: should the frontend and the backend be in different repositories? If yes, what about end-to-end tests, DevOps/CI scripts, or documentation? They belong to both codebases but are also kind of separate from them. Codebase boundaries are fuzzy and fluid, causing debates around how to structure multirepos. Monorepo users don’t have these debates. They also don’t need tools to coordinate changes across multiple code repositories, nor do they have to manage multiple versions of compilers, libraries, and frameworks on developer and CI machines.
Even more important is that monorepos allow large-scale code maintenance to keep the codebase, the most critical business asset in the 21st century, healthy. Monorepos are more discoverable and auditable than multirepos: they allow end-to-end tracking of changes from developer to production. This is more challenging in a multirepo setup where code updates happen in many code repositories and are shared through opaque build artifacts without machine-readable build history. Mono-repositories allow more flexible team boundaries, provide more metrics for engineering activity and code quality, make upstream and downstream dependencies visible, and facilitate more collective ownership of the codebase. Most importantly, mono-repositories make collaboration, sharing, and inner-source the easy and rational choice and bring out friendly collaboration between your teams.
These are bold claims. Let’s investigate them by looking at typical situations in a software developer’s everyday life and see how things play out in a monorepo versus a multi-repo setup.
example 1: developing a feature
Jane Doe, a software developer, joins a company that develops an application consisting of 100 micro-services. 20 services are required to run, test, and debug the application on a developer machine. Jane’s goal for her first day is to get her local development environment ready and build and ship a small feature. Here are the steps she takes in the monorepo case:
1. clone the repository
The progress indicator while cloning the monorepo indicates that this will take a while, so Jane takes a break and gets a coffee. Her network connection flakes out in the process, so she has to restart the clone. When it goes through, her computer contains all the code needed, including the source code of dependencies. Jane is ready to write code and run tests.
2. compile dependencies and run tests
There is a lot of code to compile and test. Fortunately, the company has set up a scalable build system backed by a cloud-based compile and test farm. Most of the compilation artifacts and test results Jane’s computer needs exist there, so building and testing the entire application takes no more than a few minutes.
3. create a new branch
4. make changes to service 1
5. make changes to service 8
6. make changes to service 12
So far, so good. All tests pass. Time to get these changes reviewed!
7. create a pull request
The review system tags the code owners of all folders Jane has modified as reviewers for her pull request. The code reviewers see all changes Jane made. This helps them understand what she is trying to accomplish and provide more specific feedback. The review takes a while. Better keep her branch up to date by merging changes other people have made to the main code line.
8. run “git fetch && git merge origin/master” on her branch
Her change gets approved. Ready to ship the feature!
9. merge and delete the feature branch
Nine simple and straightforward steps. Let’s see how Jane’s first day would play out in the multi-repo scenario.
1–20. clone repository 1–20.
Jane has to clone 20 individual code repositories onto her machine. While one could automate this with additional tooling (that Jane has to install and learn to use), it takes longer than cloning a single monorepo. When the network connection flakes out, Jane has to investigate which of the repositories cloned successfully and which need to be cloned again. The next step is installing all the dependencies from the company’s artifact repository.
21–40. run “maven install”, “gradle install”, “bundle install”, or “npm install” in each repository
This takes a while as well.
41–60. compile and run tests in all codebases
Each microservice codebase compiles and tests in a reasonable amount of time, but times 20 this is getting lengthy. There is no scalable build system, so all the compilation and testing has to happen on Jane’s computer. Finally, Jane is ready to develop her first feature.
61. create a new branch in repository 1
62. create a new branch in repository 8
63. create a new branch in repository 12
64. make changes to service 1
65. make changes to service 8
66. make changes to service 12
67. commit changes in repo 1
68. commit changes in repo 8
69. commit changes in repo 12
So far, so good. All tests pass. Time to get these changes reviewed!
70. submit a pull request for repo 1
71. submit a pull request for repo 8
72. submit a pull request for repo 12
Each reviewer sees only the subset of changes in their codebase, so Jane has to explain the overarching feature she is trying to implement to each one of them. The review takes a while. Better merge changes other people have made into her branch.
73. run “git fetch && git merge origin/master” on the feature branch in repo 1
74. run “git fetch && git merge origin/master” on the feature branch in repo 8
75. run “git fetch && git merge origin/master” on the feature branch in repo 12
Her change gets approved. Ready to ship the feature! This must happen in a particular order to avoid breaking the build. A senior engineer has to guide Jane through this the first few times.
76. merge and delete the feature branch in repo 12
77. merge and delete the feature branch in repo 1
78. merge and delete the feature branch in repo 8
Setting up a multi-repo environment and building a small feature requires 78 steps, lots of repetition, and additional tooling, and it leaves plenty of room for mistakes and inconsistencies. Jane lost 20 minutes debugging a failing integration test only to find that codebase 8 had the wrong branch checked out.
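The step counts above follow a simple pattern, which the sketch below makes explicit. The formulas are just a tally of the two walkthroughs (20 repositories needed locally, 3 services touched), not a general law.

```python
def monorepo_steps(services_touched: int) -> int:
    """Clone, build/test, branch, pull request, sync, merge: once each, plus one edit per service."""
    one_time_steps = 6
    return one_time_steps + services_touched


def multirepo_steps(repos_needed: int, services_touched: int) -> int:
    """Clone, install, and build every repo; then six steps in each repo that changes."""
    setup_per_repo = 3   # clone, install dependencies, compile and test
    change_per_repo = 6  # branch, edit, commit, pull request, sync, merge
    return setup_per_repo * repos_needed + change_per_repo * services_touched


print(monorepo_steps(3))       # 9
print(multirepo_steps(20, 3))  # 78
```

The monorepo count grows only with the number of services edited, while the multirepo count grows with both the number of repositories needed locally and the number of repositories changed.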
example 2: sharing reusable code
Jane in Team A has developed a library for creating timestamps that she wants to share with the rest of the company. Hopefully, this gets the company onto one standardized timestamp format. Here are the steps in the monorepo scenario.
1. move the file/folder containing the library and its tests into the “shared libraries” folder of the monorepo.
The IDE updates all usages of the library throughout the company’s entire code base.
Sharing her library took 10 minutes and was so easy that Jane wants to share a few more things while she is at it! Now, let’s look at the steps in a multi-repository setup:
1. create a new repository for the shared library
2. copy code and tests into the new repository
3. create a name and version number for the library — how stable is it currently?
4. set up CI and deployment pipelines for this new repository
5. publish the first version of the library to the company’s artifact repository
6. delete the old code and manually update all references to the new artifact
Phew, that took half a day! If management wants us to share code, they must give us time for that. Jane postpones sharing more code until she has time. We know that never happens because there will always be more things to do than there is time.
example 3: contributing to shared code
Team B is using Jane’s shared timestamps library. They need to fix a bug they found for timestamps made in leap years. Steps in a monorepo setup:
1. find the library source code
This is simple: Ctrl-click the place where your code calls the library in your IDE.
2. fix the bug in the library source code
3. test that your code still works with the bugfix
4. submit a pull request to the code owner
5. wait until the pull request gets merged
That took just 1–2 hours. While at it, team B also adds JSON formatting and fixes a few typos in the library’s documentation. Now the steps in a multi-repo setup:
1. find the source code of the library and get access
It might not be shared at all.
2. clone the library source code repo
3. find the branch to edit in that repo
Is it the master, main, production, or development branch?
4. temporarily change your code to use the library source code instead of the compiled artifact
5. fix the bug in the library source code
6. test that your code works correctly with the bug fix
7. submit a pull request to the library maintainer
8. wait until the pull request gets merged
9. wait until a new version of the library gets published
Hopefully, this happens earlier than team A’s normal release cadence, which would be end of next week.
10. revert your app code to use the published artifact again
11. update to use the new version of the artifact
12. test that your app still works with the new artifact version
Overall, this process will take 1–2 weeks. Team B cannot wait this long. They copy the library code and fix the bug in their copy to unblock their release. The actual library still contains the bug. Team B postpones pushing their bugfix upstream until they have time.
example 4: automated large-scale code repair
Jane gets repeated feedback that the “import” method in her timestamp library is confusing. It expects data in CSV format, but some people give JSON-formatted data. She decides to rename this method to “importCSV” to clarify the desired input format. In the monorepo setup:
1. rename the function
Jane right-clicks the function name in her IDE, chooses “Rename”, and enters the new name. The IDE renames the function in all usages of the library across the company. Even though she just made a backward-incompatible change to a popular API, nothing breaks and everything still works. Most people won’t even notice this change.
2. submit a pull request and get the change approved by global code owners; affected users get notified
3. the API and all its usages get updated in one atomic operation
A quick and easy change, leaving no technical debt. Automation does all the hard work. The CI server runs the tests of all codebases that use the new “importCSV” function, giving Jane confidence that she didn’t break anybody. Now the multi-repo workflow:
1. rename the function
Jane right-clicks the function name in her IDE, chooses “Rename”, and enters the new name. The IDE renames the function only in that codebase.
2. publish a new version of the library
The new version breaks all users of the library.
3. deal with the angry teams that show up at her door asking why she broke their build for no good reason and without coordinating with them first
4. deal with her boss, who asks why he is getting calls from his boss
5. promise herself and her boss that she will never do something like this again
It is hard to do such code cleanup in a multi-repo setup without technical debt (maintaining the old and the new version of the function for some time) or drama. It’s probably less painful to leave things the way they are next time and deal with the bug reports than try another improvement of this widely used API.
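The transition period mentioned above, during which the old and new names coexist, typically looks like the following sketch. The function names are stand-ins for Jane's library (`import` is a reserved word in Python, so the old method is called `import_data` here), and the CSV parsing is only a placeholder.

```python
import csv
import io
import warnings
from typing import List


def import_csv(data: str) -> List[List[str]]:
    """New, explicit name: parse CSV-formatted text into rows."""
    return list(csv.reader(io.StringIO(data)))


def import_data(data: str) -> List[List[str]]:
    """Old name, kept temporarily so existing callers keep working."""
    warnings.warn(
        "import_data() is deprecated; use import_csv() instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return import_csv(data)
```

The wrapper has to stay until the last caller has migrated, which is exactly the technical debt the atomic monorepo rename avoids.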
example 5: manual code repair
Jane cleans up the implementation of her timestamp library. The changes look okay to her but break an edge case in how team B uses it. Let’s see what happens in a monorepo setup:
1. Jane makes the change to her shared library and submits a pull request
2. the CI server rejects the change because it breaks team B’s tests
Now Jane has five possible courses of action:
3a. change her implementation so that she doesn’t break team B
Jane adds the edge case to her new implementation. A day when we don’t break our users is a good day!
3b. fix the breakage in team B’s code
Since it’s her library, Jane knows how team B’s codebase should use it. Team B reviews and thanks her for saving them from this headache. Jane made new friends today.
3c. collaborate with team B
Jane gets in touch to see when she can pair up with somebody from team B for an hour to update their code to accommodate the API change she wants to make. They make an appointment for Friday and get it done in an hour. Jane made new friends today and got a ton of helpful feedback on how to improve her library’s API design even more.
3d. implement the breaking change in a non-breaking way and let her users do the migration
Jane leaves the old code as-is, deprecates that API, and exposes her cleaned up implementation as a new API. She ships this change, waits until all teams migrate to the new API (the monorepo tells her who is using which of her APIs), then deletes the old API.
3e. decide not to do the change
Jane now realizes that this change breaks 27 teams in ways that require person-months of manual cleanup. This change does not make sense because, overall, it costs more than the benefits it creates. Jane documents this insight in the ticket for this change and discards the code branch. Jane learned something today.
A monorepo enforces great attitudes, communication, collaboration, empathy, and consideration of company-wide impacts. Now let’s look at what happens in a multi-repo setup:
1. Jane makes the change and submits a pull request
2. the CI build for her library is green. Jane publishes a new version of the library.
When team B updates to the new version of the library, it breaks their build. Jane realizes her mistake, but now it’s too late to revert the library change because other teams have already upgraded to the new version. Team B has three possible courses of action:
3a. fix the breakage
Team B has to drop an item from their current sprint to make the time to get their build green again. Jane has lost a few friends today.
3b. pin team B’s code to the old version of the library
Team B now doesn’t get other updates and security fixes and accumulates even more breaking changes that they have to resolve at some point. There are now two versions of Jane’s library in use across the company, leading to possible dependency hell, and more technical debt in the codebase.
3c. stop using the library and calculate timestamps themselves
The stronger boundaries in multirepos make changes within a single codebase easier but bring out self-centered attitudes, communication breakdowns, preventable breakage, and more technical debt. It takes longer to apply changes consistently across all codebases, making such changes more expensive than if they were done all at once.
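The monorepo CI behavior in this example, where a library change is tested against everyone who uses it, rests on knowing the reverse dependencies. A minimal sketch, assuming a hypothetical dependency map (real build systems derive this information from their build files):

```python
from collections import deque
from typing import Dict, List, Set


def affected_by(changed: str, depends_on: Dict[str, List[str]]) -> Set[str]:
    """Return every codebase that directly or transitively depends on `changed`."""
    # Invert the graph: library -> codebases that use it.
    used_by: Dict[str, Set[str]] = {}
    for codebase, deps in depends_on.items():
        for dep in deps:
            used_by.setdefault(dep, set()).add(codebase)

    # Breadth-first walk over the users of the changed codebase.
    affected: Set[str] = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for user in used_by.get(current, ()):
            if user not in affected:
                affected.add(user)
                queue.append(user)
    return affected
```

Running the tests of every codebase in the returned set is what lets the CI server reject a breaking change before it ships, and the same set tells a library author who is using the library.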
example 6: cross-team development of a large feature in a micro-service architecture
Teams A (backend), B (frontend), and C (payment logic) collaborate to add payments via coupons to the product. Here is how they work together in a monorepo:
1. team A adds a coupons microservice and deploys it
2. team B adds coupons to the payment UI
Team B runs into a few bugs while using the new coupons microservice. Team A can use Team B’s branch to reproduce the broken behavior and develop a quick fix in it (which they then extract into its own branch and polish). Team B is blocked for only a few hours and ships their feature on time.
3. team C adds coupons to their payment microservice
While doing so, they break the checkout workflow. The end-to-end tests catch that. Team C fixes the problems.
4. QA finds no major issues.
All problems were found and fixed quickly during active development. The feature ships on time and within budget.
Here is the collaboration in a multi-repo setup:
1. team A adds a coupons microservice and deploys it
2. team B adds coupons to the payment UI.
Team B runs into a few bugs while using the new coupons microservice. Team B is blocked for days while Team A reproduces the issue, implements, tests, and releases a new version of the micro-service (which might only happen at the end of their current sprint).
3. team C adds coupons to their payment microservice
While doing so, they break the payment workflow. Team C’s CI server doesn’t run the end-to-end tests, so they ship.
4. QA finds UI breakage and stops the release
The issues are caught and fixed later in the process, causing longer delays, which lead to time and cost overruns.
example 7: Yak shaving
A new version of the programming language that Jane’s team uses is available, with faster compile times and lower memory use. They can’t wait to upgrade.
In the monorepo setup, Jane doesn’t have to do anything. The company’s compiler team will update the entire monorepo to the new compiler version. Since they upgrade something (compiler or other tools) every couple of weeks, they have this down to a science and automated to a high degree. They have been tracking the beta releases of the new compiler version, performed and tested the necessary update steps, and are in contact with the vendor. They have reached out to various application teams to prepare everybody for the coming update and get the needed code updates in place.
A nice side effect is that the entire company, even older codebases in maintenance mode, always uses the same modern compiler version, which reduces complexity and compatibility issues. Even though there is a dedicated compiler team, and more code gets updated regularly, the company spends a lot less overall effort and budget on compiler upgrades and security hotfixes thanks to economies of scale.
In the multi-repo setup, teams have a lot more independence, and every team updates tooling like compilers, linters, or code formatters on their own. Updates are done inconsistently across the company. Some older codebases in maintenance mode still run on language versions that are many years old. Jane will update the compiler for her team. She reads through the release notes and update instructions and tries out a tool that automates the upgrade, but cannot make it work within an hour. She decides that it will be easier to fix all issues manually. Most teams go through the same experience. Overall, the company spends more time on this update than if one person had figured out the automatic update.
Monorepos make it natural to develop centers of excellence for enabling functions like compilers, hosting, and even external open-source dependencies and frameworks. When implemented well, these enabling functions keep the entire company consistently up to date, more efficiently than individual teams can. When done poorly, these enabling functions hold the company back. This is another example of how monorepos enforce excellence.
example 8: releasing
When releasing code, Jane’s team needs to gather all code to be released: business logic and external dependencies. They need to scan all code with security scanners. They need to certify that all code changes that go into this release come from trusted sources or have been reviewed. Then they run the end-to-end tests and check for inappropriate licenses. Finally, they compile and package all the code they use into a release.
A monorepo contains everything needed for deployment: the code, tests, dependencies, license information, and change history. Everything is ready to be scanned, compiled, tested, audited, packaged, and deployed. Deploys can happen even if the artifact repository or internet connection is slow or unavailable.
In a multi-repo setup, code, tests, and dependencies are spread over many repositories and build artifacts in various locations. The build script must download artifacts and unpack them for security and license scanning. The change history for these artifacts is lost and, with it, easy auditability. Running end-to-end tests requires cloning several repos onto the CI server. When the artifact repository or internet connection is slow or unavailable, deployments cannot happen at all.
challenges
Monorepos have a few disadvantages. They have less granular access control: everybody has full access to all code in the repo, at least when using Git. We discussed in the inner-source post that this is often less of a problem than people think. Another challenge with monorepos is silent updates: the implementation of dependencies can change at any time without their users being aware of it. The antidote for this is automated testing to prevent others from breaking your code.
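The testing antidote can be as small as a test in the consuming team's folder that pins the behavior they rely on. A minimal sketch, in which `format_timestamp` is a hypothetical stand-in for a shared library function and the expected format is an assumption:

```python
import unittest
from datetime import datetime, timezone


def format_timestamp(moment: datetime) -> str:
    """Stand-in for the shared library function a team depends on."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


class TimestampContractTest(unittest.TestCase):
    """Lives with the consuming team's code and pins the behavior they rely on."""

    def test_leap_day_is_preserved(self):
        leap_day = datetime(2024, 2, 29, 12, 0, 0, tzinfo=timezone.utc)
        self.assertEqual(format_timestamp(leap_day), "2024-02-29T12:00:00Z")


if __name__ == "__main__":
    unittest.main()
```

If CI runs such tests whenever the library changes, a silent change in the implementation turns into a visible test failure for the person making it.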
Another concern around monorepos is that the large amount of code found in monorepos can slow down tooling and IDEs. Technology and engineering improvements address this better than splitting up a monorepo. If Google split its 86 TB of code into thousands of repositories, most of them would still be too large, especially considering that the pieces wouldn’t be equal in size.
While monorepos reduce the need for extremely stable APIs somewhat, they can lead to tighter coupling than multirepos. Fortunately, they also make it easier to refactor this coupling away.
Making parts of a monorepo open-source requires copying the code from the monorepo to a public location and decoupling it from the rest of the monorepo, whereas in a multirepo setup you can simply make a repository public. But sanitizing confidential elements and build history is often also needed when open-sourcing codebases from a multirepo setup.
takeaway
Mono-repositories have surprisingly superior ergonomics compared to multi-repositories in everyday development work within large organizations. Monorepos make sharing and collaboration easy and economical. They accelerate the adoption of DevOps best practices and lead to a healthier codebase that enables long-term business agility by making large-scale code maintenance straightforward and cost-effective. A monorepo is easier to audit and allows tracking code changes end-to-end from development to production. Most importantly, mono-repositories bring out good attitudes and behaviors in your organization by facilitating collaboration and empathy between teams.
For further perspective and data points, check out Google’s paper and journal publication, or see Uber’s presentation on the topic. For additional tooling, check out the awesome monorepo list.
Choose a monorepo if you want to:
- create a culture of sharing and collaboration (inner-source),
- increase visibility to facilitate learning and reuse,
- centralize and standardize enabling functions like DevOps, security, and compliance,
- mitigate your technical debt by cleaning it up via large-scale code maintenance.
Choose multirepos if you want to:
- give teams complete control with zero company interference (the Netflix/Spotify model),
- limit visibility in a highly confidential organization (the CIA/drug cartel/Apple model),
- decentralize enabling functions like DevOps, security, and compliance,
- mitigate your technical debt by moving it out of the way into separate repositories.
If you are unsure, start new projects as a monorepo to allow rapid progress and efficient discovery of the right abstractions and functional boundaries in your brand-new codebase. When the repo becomes unwieldy, you can decide whether you want to invest in scaling the monorepo or if you want to invest in tooling to manage the massive multirepo setup you will need at this point.