Split large Git repositories

Recently I had to split a large application into several smaller applications. The source code of that application lived in a single Git repository and consisted of well-encapsulated modules, each in its own subdirectory, such as module-a.

large-application
├── module-a
├── module-b
└── module-c
Directory structure of the original application

Since the modules were already encapsulated, the main goal was to extract them into new, dedicated Git repositories. Here is how I did it.

Mission: Extraction

First, let's go into the Git repository of the large application.

cd ~/large-application

For now we concentrate on extracting module-a. We do this by telling Git to split the history of that directory into a subtree and store that subtree in a new branch called feature/split-module-a.

git subtree split -P module-a -b feature/split-module-a
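Before moving on, it can be reassuring to look at what the split produced. The split rewrites the history so that the content of module-a sits at the root of the tree, so the listed paths should no longer start with module-a/. The following is only a minimal sketch: inspect_split is a hypothetical helper name, not a Git command.

```shell
# Hypothetical helper (not a Git command): summarise a split branch.
#   $1 = path to the repository
#   $2 = name of the branch created by "git subtree split"
# The body runs in a subshell, so the caller's working directory is untouched.
inspect_split() (
  cd "$1" || exit 1
  # Number of commits that ended up on the split branch.
  echo "commits: $(git rev-list --count "$2")"
  # Files at the tip of the branch, listed relative to the new root.
  git ls-tree -r --name-only "$2"
)
```

In the walk-through this would be called as inspect_split ~/large-application feature/split-module-a.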

After this we create a new empty Git repository for module-a.

mkdir ~/new-repo
cd ~/new-repo
git init

Alright, we are almost done. All that is left is to move the extracted branch from the source Git repository into the target Git repository.

git pull ~/large-application feature/split-module-a

That's it. The new Git repository now contains only the commits related to module-a.
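The same three steps apply to module-b and module-c, so they can be wrapped in a small function. This is a sketch under assumptions, not a definitive script: extract_module is a hypothetical name, the branch naming simply follows the feature/split-module-a pattern used above, and the target path is whatever you choose.

```shell
# Hypothetical helper (not part of Git): extract one module directory
# into a new repository, following the three steps of this post.
#   $1 = path to the source repository
#   $2 = module directory inside the source repository
#   $3 = path of the new repository to create
# The body runs in a subshell, so "exit" only leaves the function.
extract_module() (
  # Resolve the source to an absolute path so it stays valid for the pull.
  source_repo="$(cd "$1" && pwd)" || exit 1
  module="$2"
  target_repo="$3"
  branch="feature/split-${module}"

  # Step 1: split the module's history into a dedicated branch.
  git -C "$source_repo" subtree split -P "$module" -b "$branch" || exit 1

  # Step 2: create the new, empty target repository.
  git init "$target_repo" || exit 1

  # Step 3: pull the split branch into the target repository.
  git -C "$target_repo" pull "$source_repo" "$branch" || exit 1
)
```

For example, extract_module ~/large-application module-b ~/module-b-repo would repeat the walk-through for module-b; the target path here is only a placeholder.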

This post is inspired by an answer on Stack Overflow.

Silvio Wangler

Embrach, Switzerland