Merging Git Repositories

By Brad Lazaruk, Thu 27 June 2024, modified Fri 05 July 2024, in category Development

Git, GitHub

I had a bunch of separate git repositories that I wanted to merge into one, while also maintaining the existing git history as much as possible.

There were three cases: one where the original repository was in a directory which could simply be renamed to fit into the new structure, one where the repository was in a subfolder which needed to be moved up a level or two, and one where there were multiple repositories in a directory which all needed to be folded together into one directory in the new structure. The new repository should be reformatted so that it looks like all the commits were originally made in the new directory structure. In all cases there were loose files which had been added here and there over time and never added to any repository. Overall, a big mess that I just wanted to get cleaned up, all added to a single repository, and then archived.

Turns out this is time-consuming but not technically challenging, thanks to this post from Andresch Serj, reinforced by this article on SlingAcademy.

All three cases require the use of git-filter-repo. To get this to work I simply downloaded the script, placed it into a new ~/opt directory, and set the file to executable. I then added this ~/opt directory to my path with export PATH=/home//opt:$PATH. I know this is ephemeral, lasting only for the current shell session, but it suits my purposes right now as I’m not sure I’m going to need to add more things to ~/opt in the future.
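A minimal sketch of that path step, for anyone who wants to avoid stacking duplicate entries when the export is repeated in one session. The helper name add_to_path is my own illustration and is not part of git or git-filter-repo; it assumes the script already sits in ~/opt and is executable.

```shell
# Hypothetical helper: prepend a directory to PATH for this shell session only,
# skipping the change if the directory is already listed.
add_to_path() {
  case ":$PATH:" in
    *":$1:"*) ;;              # already present, leave PATH alone
    *) PATH="$1:$PATH" ;;
  esac
}

add_to_path "$HOME/opt"
export PATH

# git finds subcommands by looking for git-<name> on PATH, so this is a quick sanity check
command -v git-filter-repo || echo "git-filter-repo not found on PATH"
```

Like the plain export, this disappears when the terminal is closed.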

Case 1

In this case, the source directory contains the development material. But it should be bundled together into a single subfolder so that it fits in the target directory as a sub-project, rather than hanging directly off the root, to make room for further sub-projects to be added.

root
|- target new directory
|-- target new repository and tracked files
|- existing source directory
|-- existing source repository and tracked files
|-- loose files

to

root
|- new target directory
|-- subdirectory - exact copy of the source directory
|-- new repository blended with original repository

On the terminal, cd into the source directory. Use the following command to bundle that material into a new sub-sub-project directory:

git filter-repo --to-subdirectory-filter <sub-sub-project> --force

So, if I have a folder structure like this:

root
|- 2024_update
|-- source_code

and I want to have it look like this:

root
|- projectAlpha
|-- 2024_update
|--- source_code

then I first have to create the desired directory structure in place, with this command issued from the /2024_update folder:

git filter-repo --to-subdirectory-filter 2024_update --force

This will leave us with the intermediate structure of

root
|- 2024_update
|-- 2024_update
|--- source_code

Now you can go to the target repository and merge the modified repository in.

Note: Be sure of your spelling in the git remote add command. It will not give any error message if you get the path wrong; the mistake only shows up later, when git fetch fails.

cd <into the folder with the target repository>
git remote add <name> /path/to/the/modified_repository
git fetch <name> --tags
git merge --allow-unrelated-histories <name>/<branch name>
git remote remove <name>

The result will be the desired target structure, and a git history that keeps the original timestamps and makes the commits appear to have been made in the current directory structure. Gather up the loose files, add them to the new directory, and commit them.
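The fetch-and-merge sequence can be rehearsed in a throwaway directory before touching real repositories. This is only a sketch under assumptions: the names modified_repository, projectAlpha, source, main, source_code.py and the Demo identity are stand-ins, and the source repository is created with its files already under 2024_update/ to mimic what git filter-repo leaves behind, rather than running filter-repo itself.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"

# Commit with a throwaway identity so the demo does not depend on global git config
commit() { git -c user.name=Demo -c user.email=demo@example.com commit -q "$@"; }

# Stand-in for the source repository as it looks after the filter-repo step
git init -q modified_repository
cd modified_repository
git symbolic-ref HEAD refs/heads/main
mkdir 2024_update
echo "print('hello')" > 2024_update/source_code.py
git add .
commit -m "Original work"
cd ..

# Stand-in for the target repository
git init -q projectAlpha
cd projectAlpha
git symbolic-ref HEAD refs/heads/main
echo "Project Alpha" > README
git add .
commit -m "Start target repository"

# The same four steps as above, with the placeholders filled in
git remote add source "$work/modified_repository"
git fetch -q source --tags
git -c user.name=Demo -c user.email=demo@example.com \
  merge -q --allow-unrelated-histories -m "Merge 2024_update" source/main
git remote remove source

# Both histories should now be visible, joined by a merge commit
git --no-pager log --oneline --name-only
```

Everything lives under the temporary directory, so it can simply be deleted afterwards.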

Case 2

Here, the source repository is in a subfolder which needs to be moved up.

root
|- target new directory
|-- target new repository and tracked files
|- existing source directory
|-- loose files
|-- source subfolder
|--- existing source repository
|--- loose files

to

root
|- new target directory
|-- subdirectory - exact copy of the source directory
|-- new repository blended with original repository

The process here is mostly the same as in case 1. You just need to use the git filter-repo command to create the target directory structure from the bottom up.

So, if I have a folder structure like this:

root
|- 2024_update
|-- 2023_proposal
|--- july_presentation
|---- source_code

and I want to have it look like this:

root
|- projectAlpha
|-- 2024_update
|--- source_code

then I first have to create the desired directory structure in place, with this command issued from the july_presentation folder:

git filter-repo --to-subdirectory-filter 2024_update --force

This will leave us with the intermediate structure of

root
|- 2024_update
|-- 2023_proposal
|--- july_presentation
|---- 2024_update
|----- source_code

and the git history will now show every tracked file under a 2024_update directory, as if it had always been committed there. Now you can merge this repository into the target repository with the above commands to get the desired structure.

Case 3

This case is a touch more complex as there are multiple repositories that need to be merged together to give the appearance that they were always in one repository and under the same parent directory.

root
|- target new directory
|-- target new repository and tracked files
|- existing source directory
|-- loose files
|-- subdirectory project 1
|--- original repository 1
|--- loose files
|-- subdirectory project 2
|--- original repository 2
|--- loose files
|-- subdirectory project 3
|--- original repository 3
|--- loose files

to

root
|- target directory
|-- subdirectory1 - exact copy of the source subdirectory 1
|-- subdirectory2 - exact copy of the source subdirectory 2
|-- subdirectory3 - exact copy of the source subdirectory 3
|-- new repository blended with all the original repositories

For this I found it easiest to go through the subfolders and use the above git filter-repo command to build the desired directory structure for each of the repositories. Then I used the fetch and merge commands above to merge each of those repositories into a new temporary repository, named for the desired new parent folder. Once all the smaller repositories were folded together, I ran git filter-repo on the temporary repository to bundle it all into a single subfolder, and finally merged that repository into the target repository.
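As a rough sketch of that sequence, the steps could be wrapped in a shell function. The function name fold_repositories, the remote name incoming, and the branch name main are my own assumptions for illustration; it also assumes git-filter-repo is on the path, that it is run from the existing source directory, and that the parent name does not clash with any project directory.

```shell
# Hypothetical helper: fold_repositories <parent name> <project directory>...
# Each project directory is expected to hold its own git repository.
fold_repositories() {
  parent=$1
  shift
  start=$PWD

  # Temporary repository, named for the desired new parent folder
  git init -q "$start/$parent"

  for project in "$@"; do
    # Rewrite this project's history so its files live under <project>/
    (cd "$start/$project" &&
      git filter-repo --to-subdirectory-filter "$project" --force)

    # Fold the rewritten repository into the temporary one
    (cd "$start/$parent" &&
      git remote add incoming "$start/$project" &&
      git fetch incoming --tags &&
      git merge --allow-unrelated-histories --no-edit incoming/main &&
      git remote remove incoming)
  done

  # Bundle everything in the temporary repository under <parent>/
  (cd "$start/$parent" &&
    git filter-repo --to-subdirectory-filter "$parent" --force)
}
```

A call such as fold_repositories 2024_update project1 project2 project3 would leave a temporary 2024_update repository ready to be merged into the target repository with the same fetch and merge commands.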

All Cases

Done. Not the end of the world, really. Now you can review the folder structure of the new repository to make sure things moved as they should have, and check the git history to see whether the merge kept the timestamps and respects the new folder structure. If it didn’t, stop now, re-clone the new repository, and do it all over again. If the git commands worked properly but some files are still sitting in the original directories, well, those were added to the directories but never to any repository. Just add them to the new target directory now so they are captured.
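One way to do that review from the terminal is sketched below. The helper name review_merge is hypothetical; it only strings together standard git commands, taking the repository path and the subdirectory to inspect as arguments.

```shell
# Hypothetical helper: review_merge <repository path> <subdirectory>
review_merge() {
  repo=$1
  subdir=$2

  echo "== Tracked files under $subdir =="
  git -C "$repo" ls-tree -r --name-only HEAD "$subdir"

  echo "== History touching $subdir, with author dates =="
  git -C "$repo" --no-pager log --date=iso --format='%ad %h %s' -- "$subdir"

  # Lines starting with ?? are files that exist on disk but were never committed
  echo "== Untracked or modified files =="
  git -C "$repo" status --short --untracked-files=all
}
```

For example, review_merge /path/to/projectAlpha 2024_update lists what landed in the subdirectory, the dates on its commits, and any loose files still waiting to be added.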

When everything looks as it should, sync up to GitHub, or archive it however you like. Then you can delete all the old directories so you don’t have to wonder later whether you ever merged them! Clean as you go!