Git Submodules and Subtrees: When to Use Each

A practical comparison of Git submodules and subtrees for sharing code between repositories, covering setup, workflows, CI/CD integration, and decision criteria for choosing the right approach.

Every team that grows beyond a single repository eventually hits the same problem: how do you share code between repos without copy-pasting it? Git gives you two built-in mechanisms for this -- submodules and subtrees -- and most developers pick one without understanding the tradeoffs. That leads to pain. I have seen teams abandon submodules after weeks of broken CI builds, and I have seen subtrees turn into merge nightmares when the shared code diverges.

This article is the guide I wish I had five years ago. We will walk through exactly how each approach works, when each one is the right choice, and when you should skip both and use a package manager instead.

Prerequisites

  • Git 2.20 or later installed (subtree is bundled with Git since 1.7.11, but newer versions have important fixes)
  • A terminal you are comfortable with (bash, zsh, PowerShell, or Git Bash on Windows)
  • Solid understanding of Git fundamentals: commits, branches, remotes, merge, rebase
  • Two or more Git repositories (or the ability to create them for the examples)
  • For CI/CD sections: familiarity with either GitHub Actions or Azure Pipelines YAML

The Problem Both Solve

Suppose you have a utility library -- validation functions, shared configuration schemas, or a common logging module. You use it in three different services. The naive approach is to copy the files into each repo. That works until you fix a bug in the library and need to remember to update it everywhere. You forget. You always forget.

The slightly less naive approach is to publish it as an npm package. That is often the right answer, and we will talk about when. But sometimes you do not want a package registry in the loop. Maybe the shared code changes frequently. Maybe you want to develop it in lockstep with the consuming project. Maybe you need it to work in a language that does not have a convenient package manager.

Git submodules and subtrees both solve this by embedding one repository inside another. They just do it in fundamentally different ways.

Git Submodules Explained

A submodule is a pointer. That is the key mental model. When you add a submodule, Git does not copy the other repository's files into yours. Instead, it records two things:

  1. The URL of the external repository
  2. The exact commit SHA that your project depends on

This information is stored in two places:

  • .gitmodules -- a checked-in file that maps the submodule path to its remote URL
  • The Git index -- which records the specific commit SHA the submodule is pinned to

When someone clones your repository, they get the .gitmodules file and the recorded commit pointer. The actual submodule contents are not cloned automatically unless they explicitly initialize and update the submodules.

Here is what a .gitmodules file looks like:

[submodule "libs/shared-utils"]
    path = libs/shared-utils
    url = https://github.com/your-org/shared-utils.git
    branch = main

And when you look at the submodule entry in git ls-tree, you see something like this:

$ git ls-tree HEAD libs/shared-utils
160000 commit a1b2c3d4e5f6789012345678abcdef0123456789  libs/shared-utils

That 160000 mode is special -- it tells Git this entry is a submodule, not a regular file or directory. The SHA is the exact commit in the shared-utils repository that your project is pinned to.
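To make the "two records" idea concrete, here is a minimal sketch that creates both by hand in a throwaway repository, without cloning anything. It uses the plumbing command git update-index, which is not how you would normally add a submodule (that is git submodule add, covered next); the path, URL, and SHA are the illustrative values used in this article, not a real repository.

```shell
#!/usr/bin/env sh
set -eu

# Throwaway repository so nothing touches a real project.
demo_dir="$(mktemp -d)"
git init --quiet "$demo_dir"
cd "$demo_dir"

# Illustrative values only; nothing is fetched from this URL.
sub_path="libs/shared-utils"
sub_url="https://github.com/your-org/shared-utils.git"
sub_sha="a1b2c3d4e5f6789012345678abcdef0123456789"

# Record 1: the checked-in mapping from path to URL in .gitmodules.
git config -f .gitmodules "submodule.${sub_path}.path" "$sub_path"
git config -f .gitmodules "submodule.${sub_path}.url" "$sub_url"

# Record 2: the index entry with mode 160000 and the pinned commit SHA.
git update-index --add --cacheinfo "160000,${sub_sha},${sub_path}"

# Read both records back.
gitlink_entry="$(git ls-files --stage -- "$sub_path")"
recorded_url="$(git config -f .gitmodules --get "submodule.${sub_path}.url")"

echo "$gitlink_entry"
echo "$recorded_url"
```

The index entry exists even though the directory is empty and the commit was never downloaded, which is exactly the state a fresh clone without --recurse-submodules leaves you in.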

Adding a Submodule

# Add a submodule at a specific path
$ git submodule add https://github.com/your-org/shared-utils.git libs/shared-utils
Cloning into 'libs/shared-utils'...
remote: Enumerating objects: 47, done.
remote: Counting objects: 100% (47/47), done.
remote: Compressing objects: 100% (31/31), done.
remote: Total 47 (delta 12), reused 42 (delta 9), pack-reused 0
Receiving objects: 100% (47/47), 8.12 KiB | 2.71 MiB/s, done.
Resolving deltas: 100% (12/12), done.

$ git status
On branch main
Changes to be committed:
  new file:   .gitmodules
  new file:   libs/shared-utils

Notice that Git stages two things: the .gitmodules file and the submodule directory itself (which is really just the commit pointer). You commit these normally:

$ git commit -m "add shared-utils submodule"

Cloning a Project with Submodules

This is where the first pain point hits. A regular git clone does not populate submodules:

$ git clone https://github.com/your-org/my-service.git
$ ls my-service/libs/shared-utils/
# Empty directory

You need an extra step:

# Option 1: Initialize and update after cloning
$ git clone https://github.com/your-org/my-service.git
$ cd my-service
$ git submodule init
$ git submodule update

# Option 2: Do it all in one command
$ git clone --recurse-submodules https://github.com/your-org/my-service.git

# Option 3: If you already cloned without submodules
$ git submodule update --init --recursive

The --recursive flag is important if your submodules themselves have submodules (nested submodules). It happens more often than you would think, especially with C/C++ projects.

Updating a Submodule

There are two different operations here, and confusing them is a common source of bugs.

Pulling the latest from the submodule's remote:

$ cd libs/shared-utils
$ git fetch origin
$ git checkout main
$ git pull origin main
$ cd ../..
$ git add libs/shared-utils
$ git commit -m "update shared-utils to latest main"

Or use the shorthand:

$ git submodule update --remote libs/shared-utils
$ git add libs/shared-utils
$ git commit -m "update shared-utils to latest main"

Pinning to a specific commit:

$ cd libs/shared-utils
$ git checkout a1b2c3d
$ cd ../..
$ git add libs/shared-utils
$ git commit -m "pin shared-utils to a1b2c3d (before breaking change)"

This pinning behavior is the submodule's biggest strength. You explicitly control which version of the shared code your project uses. No surprises.

The Submodule Gotchas

I have hit every single one of these in production. Here they are, in order of how much time they waste.

1. Detached HEAD state. When you run git submodule update, Git checks out the pinned commit -- not a branch. The submodule is in detached HEAD state. If you start making changes in the submodule directory without checking out a branch first, the resulting commits belong to no branch. The next git submodule update or checkout moves HEAD away from them, after which they are reachable only through the reflog and are eventually garbage collected.

$ cd libs/shared-utils
$ git status
HEAD detached at a1b2c3d
nothing to commit, working tree clean

# WRONG: making changes in detached HEAD
$ echo "fix" >> utils.js
$ git commit -am "quick fix"
# This commit is not on any branch!

# RIGHT: check out a branch first
$ git checkout main
$ echo "fix" >> utils.js
$ git commit -am "quick fix"
$ git push origin main

2. Forgetting to init. New team members clone the repo, run npm install, start the server, and get cryptic errors because the submodule directory is empty. Every onboarding doc needs to say "clone with --recurse-submodules". Every CI pipeline needs the submodule checkout step. You will forget at least once.

3. Stale references. Developer A updates the submodule pointer to commit xyz. Developer B pulls the main repo but forgets to run git submodule update. Now Developer B has the old submodule code but the new parent code that expects the updated submodule. The errors are confusing because the submodule directory exists and has files in it -- they are just the wrong files.

# After pulling, always update submodules
$ git pull origin main
$ git submodule update --init --recursive

4. Removing a submodule is ugly. There is no git submodule remove command. On old Git versions you have to do it manually:

# Remove the submodule entry from .gitmodules
$ git config -f .gitmodules --remove-section submodule.libs/shared-utils

# Remove the submodule entry from .git/config
$ git config -f .git/config --remove-section submodule.libs/shared-utils

# Remove the submodule directory from the index and working tree
$ git rm --cached libs/shared-utils
$ rm -rf libs/shared-utils
$ rm -rf .git/modules/libs/shared-utils

$ git commit -m "remove shared-utils submodule"

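On current Git versions, git rm libs/shared-utils handles most of this: it removes the working tree, the index entry, and the matching .gitmodules section (as far as I can tell this has worked since the 1.8 series, not only since 2.35). The .git/config entry and the cached clone under .git/modules still need separate cleanup; git submodule deinit takes care of the former. If you are on an older version, it is a multi-step process that is easy to get wrong.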
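Gotchas 2 and 3 can be caught mechanically, because git submodule status prefixes each line with a one-character state: "-" for a submodule that was never initialized, "+" for one whose checked-out commit differs from the recorded pointer, and "U" for a merge conflict. The following is a minimal sketch of a checker built on that output. The function name check_submodule_status is my own, and it assumes the default output format of git submodule status.

```shell
#!/usr/bin/env sh

# Reads `git submodule status` output on stdin and prints one line for every
# submodule that needs attention. Returns 0 if all are in sync, 1 otherwise.
check_submodule_status() {
  problems=0
  while IFS= read -r line; do
    case "$line" in
      -*) state="not initialized" ;;
      +*) state="out of sync with recorded commit" ;;
      U*) state="merge conflict" ;;
      *)  continue ;;                 # leading space: submodule is in sync
    esac
    rest="${line#?}"                  # drop the one-character state prefix
    rest="${rest#* }"                 # drop the SHA
    sub_path="${rest%% (*}"           # drop the trailing " (describe)" part
    echo "$state: $sub_path"
    problems=1
  done
  return "$problems"
}
```

Typical use is git submodule status --recursive | check_submodule_status, for example from a post-merge hook or as the first step of a build script, so that an empty or stale submodule is reported before the confusing downstream errors appear.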

Git Subtrees Explained

A subtree takes the opposite approach from submodules. Instead of storing a pointer, it copies the entire content of the external repository into a subdirectory of your project and merges it into your history. There is no .gitmodules file, no special initialization step, no detached HEAD. The files are just there.

The key difference: cloners do not need to know or care that a subtree exists. They clone your repo and get all the files. The subtree is invisible to anyone who does not need to update it.

Under the hood, git subtree uses a merge strategy. When you add a subtree, Git takes the external repository's history and merges it into your project's history at the specified path. This means your repository's commit log includes the subtree's commits (squashed or unsquashed, your choice).

Adding a Subtree

# First, add the remote (optional but makes commands shorter)
$ git remote add shared-utils https://github.com/your-org/shared-utils.git

# Add the subtree with squashed history
$ git subtree add --prefix=libs/shared-utils shared-utils main --squash
git fetch shared-utils main
remote: Enumerating objects: 47, done.
remote: Counting objects: 100% (47/47), done.
remote: Compressing objects: 100% (31/31), done.
remote: Total 47 (delta 12), reused 42 (delta 9), pack-reused 0
Receiving objects: 100% (47/47), 8.12 KiB | 2.71 MiB/s, done.
Resolving deltas: 100% (12/12), done.
From https://github.com/your-org/shared-utils
 * branch            main       -> FETCH_HEAD
 * [new branch]      main       -> shared-utils/main
Added dir 'libs/shared-utils'

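The --squash flag is important. Without it, every commit from the subtree's entire history is merged into your project's log. With --squash, the upstream history is collapsed into a single squash commit, which is then merged into your project, so each add or pull contributes only two commits to your log. I always use --squash.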

After adding the subtree, your project looks like this:

my-service/
  libs/
    shared-utils/
      package.json
      index.js
      utils/
        validate.js
        format.js
  src/
  package.json

The files are real files in your repository. No pointers, no special modes.

Updating a Subtree

When the upstream shared-utils repo has new changes, you pull them in:

$ git subtree pull --prefix=libs/shared-utils shared-utils main --squash
remote: Enumerating objects: 5, done.
remote: Counting objects: 100% (5/5), done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 3 (delta 1), reused 2 (delta 0), pack-reused 0
Receiving objects: 100% (3/3), 312 bytes | 312.00 KiB/s, done.
Resolving deltas: 100% (1/1), completed with 1 local object.
From https://github.com/your-org/shared-utils
 * branch            main       -> FETCH_HEAD
Merge made by the 'recursive' strategy.
 libs/shared-utils/utils/validate.js | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

This creates a merge commit in your project. The subtree files are updated in place.
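Because a subtree leaves no .gitmodules behind, the only record of which upstream commit you last pulled lives in the commit messages: each squash commit that git subtree writes carries a git-subtree-dir: line and a git-subtree-split: line. Here is a minimal sketch that extracts those pairs from git log output. The function name list_subtree_squashes is my own, and it assumes prefixes without spaces.

```shell
#!/usr/bin/env sh

# Reads `git log` output on stdin and prints "<prefix> <upstream-sha>" for
# every squash commit created by git subtree, in the order git log lists them.
list_subtree_squashes() {
  awk '
    /^ *git-subtree-dir: /   { dir = $2 }
    /^ *git-subtree-split: / { if (dir != "") print dir, $2; dir = "" }
  '
}
```

Run it as git log --grep='^git-subtree-dir: ' | list_subtree_squashes. The first line of output for a given prefix is the upstream commit your copy currently reflects, which is handy when deciding whether a pull is overdue.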

Pushing Changes Back Upstream (Subtree Split)

One of the subtree's underappreciated features is pushing changes back to the upstream repository. Suppose you fix a bug in libs/shared-utils/utils/validate.js while working in your main project. You can push that fix back to the shared-utils repo:

$ git subtree push --prefix=libs/shared-utils shared-utils main
git push using:  shared-utils main
Enumerating objects: 7, done.
Counting objects: 100% (7/7), done.
Delta compression using up to 8 threads
Compressing objects: 100% (4/4), done.
Writing objects: 100% (4/4), 389 bytes | 389.00 KiB/s, done.
Total 4 (delta 2), reused 0 (delta 0), pack-reused 0
To https://github.com/your-org/shared-utils.git
   a1b2c3d..e4f5a6b  e4f5a6b -> main

Behind the scenes, git subtree push runs a git subtree split to extract only the commits that touch the subtree prefix, rewrites them to remove the prefix from file paths, and pushes them to the remote.

You can also split explicitly if you want to create a new repository from a subdirectory:

# Extract libs/shared-utils into a new branch with rewritten history
$ git subtree split --prefix=libs/shared-utils -b shared-utils-split
Created branch 'shared-utils-split' (a1b2c3d4e5f)

# Push that branch to a new remote
$ git remote add shared-utils-new https://github.com/your-org/shared-utils-extracted.git
$ git push shared-utils-new shared-utils-split:main

This is extremely useful when you realize mid-project that a piece of your codebase should be its own repository.

Head-to-Head Comparison

Here is the comparison table I keep in my notes. These are the factors that actually matter when choosing.

| Factor | Submodules | Subtrees | npm Packages | Monorepo (Workspaces) |
|---|---|---|---|---|
| Setup complexity | Medium | Low | Medium | High initial, low ongoing |
| Clone behavior | Extra init step required | Just works | npm install | Just works |
| Version pinning | Exact commit SHA | By pull/merge | semver in package.json | Workspace links |
| Contributor friction | High -- must understand submodules | Low -- files are just there | Low -- standard workflow | Low once set up |
| Repo size impact | Minimal (pointer only) | Full copy in history | None (node_modules) | Everything in one repo |
| Bidirectional changes | Natural (each repo is independent) | Possible via subtree push | Publish new version | Edit directly |
| CI/CD complexity | Must configure submodule checkout | No extra steps | No extra steps | Needs workspace-aware tooling |
| Offline development | Need submodule populated | Works (files are local) | Need node_modules populated | Works |
| History isolation | Complete (separate repos) | Merged (can squash) | Complete | Shared |
| Works without Node.js | Yes | Yes | No | Yes (with other tools) |

When to Use Submodules

Submodules are the right choice in these specific scenarios:

Strict version pinning across multiple consumers. If you have a shared library used by ten services, and each service needs to independently decide when to upgrade, submodules give you that control. Service A can pin to commit abc123 while Service B pins to def456. Each team upgrades on their own schedule.

Large external repositories you do not want to embed. If the shared code is a large repository -- say, a vendor SDK, a dataset, or a binary tools collection -- embedding it as a subtree bloats your repo permanently. A submodule keeps your repo lean because it only stores the pointer.

Vendor or third-party code you never modify. If you are pulling in an open source library that you will never contribute back to, submodules work well. You pin to a tag, update periodically, and never touch the code.

# Pin a vendor library to a specific release tag
$ git submodule add https://github.com/vendor/their-sdk.git vendor/sdk
$ cd vendor/sdk
$ git checkout v2.4.1
$ cd ../..
$ git add vendor/sdk
$ git commit -m "add vendor SDK pinned to v2.4.1"

Cross-language projects. In a Node.js service that also includes a Go microservice and a Python ML model, npm packages only cover the Node.js part. Submodules work across any language.

When to Use Subtrees

Subtrees are the right choice in these scenarios:

Simpler workflow for a small team. If your team is small (1-5 people) and you do not want to train everyone on submodule commands, subtrees reduce friction. Contributors clone, pull, push. The subtree is invisible to them unless they need to update it.

Stable shared code that rarely changes. If the shared library is mature and changes infrequently, a subtree is ideal. You pull updates every few weeks or months. The rest of the time, the code sits there and works.

You want to make local modifications. With subtrees, you can edit the shared code directly in your project. Changes are regular commits. You can push them back upstream when you are ready, or keep them as project-specific patches.

Archiving a dependency. If you are taking a dependency on a library that might disappear (abandoned open source, a contractor's repo, etc.), a subtree preserves a full copy. Even if the upstream goes away, you have the code.

When to Use Neither

Here is the advice most Git articles will not give you: for Node.js projects, npm packages are usually the better answer.

If the shared code is JavaScript, if you have a private npm registry (or can use GitHub Packages, Artifactory, or even a Git URL in package.json), and if you do not need to develop the shared code in lockstep with the consumer -- just publish a package.

{
  "name": "my-service",
  "dependencies": {
    "shared-utils": "git+https://github.com/your-org/shared-utils.git#v2.4.1",
    "@your-org/shared-utils": "^2.4.1"
  }
}

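The Git URL syntax (git+https://...#tag) even gives you version pinning without a registry. It is not ideal for production, but it works for internal tools. The example above lists the Git URL form and the registry form side by side for comparison; in a real package.json you would pick one.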

Monorepos with npm workspaces are another strong option. If you are already using a monorepo, workspaces give you symlinked local packages with zero overhead:

{
  "name": "my-monorepo",
  "workspaces": [
    "packages/*",
    "services/*"
  ]
}

$ npm install
# packages/shared-utils is now linked into every service's node_modules

I use this pattern for most of my active projects. It eliminates the versioning problem entirely because every service always uses the latest code. The tradeoff is that all your code lives in one repository.

CI/CD Considerations

This is where submodules create the most pain. Your CI pipeline must be explicitly configured to check out submodules, and the default behavior in most CI systems is to not do this.

GitHub Actions with Submodules

name: Build and Test
on: [push, pull_request]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout with submodules
        uses: actions/checkout@v4
        with:
          submodules: recursive
          # If submodules are in private repos, you need a token
          token: ${{ secrets.GH_PAT }}

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'

      - name: Install and test
        run: |
          npm install
          npm test

The critical line is submodules: recursive. Without it, the submodule directory is empty and your build fails with missing module errors.

If your submodules are in private repositories, you also need a Personal Access Token (PAT) with repo access. The default GITHUB_TOKEN only has access to the current repository.

Azure Pipelines with Submodules

trigger:
  - main

pool:
  vmImage: 'ubuntu-latest'

steps:
  - checkout: self
    submodules: recursive
    persistCredentials: true

  - task: NodeTool@0
    inputs:
      versionSpec: '20.x'

  - script: |
      npm install
      npm test
    displayName: 'Install and test'

Azure Pipelines uses submodules: recursive on the checkout step. The persistCredentials: true is needed if you want subsequent Git operations to work with authentication.

For Azure DevOps repos, if the submodule is in the same organization, the build service account usually has access. If it is in a different organization or on GitHub, you need to configure a service connection.
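Whichever CI system you use, a fail-fast guard right after checkout turns "missing module" errors into a clear message. The sketch below reads the submodule paths from .gitmodules and reports any whose directory is missing or empty. The function name find_empty_submodules is my own, and the sketch assumes it is pointed at the repository root.

```shell
#!/usr/bin/env sh

# Prints "empty submodule: <path>" for every path listed in .gitmodules whose
# directory is missing or empty. Returns 1 if any were found, 0 otherwise.
find_empty_submodules() {
  repo_root="${1:-.}"
  missing=0
  # No .gitmodules means no submodules to check.
  [ -f "$repo_root/.gitmodules" ] || return 0

  # Each output line looks like: submodule.<name>.path <path>
  paths="$(git config -f "$repo_root/.gitmodules" \
             --get-regexp '^submodule\..*\.path$' | cut -d' ' -f2-)"

  while IFS= read -r sub_path; do
    [ -n "$sub_path" ] || continue
    if [ -z "$(ls -A "$repo_root/$sub_path" 2>/dev/null)" ]; then
      echo "empty submodule: $sub_path"
      missing=1
    fi
  done <<EOF
$paths
EOF
  return "$missing"
}
```

In a pipeline you would call find_empty_submodules . as a script step before npm install; a non-zero exit stops the job with the offending path in the log.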

Subtrees in CI/CD

Subtrees need nothing extra in CI/CD. The files are part of your repository. Your pipeline clones the repo and gets everything. This alone is a compelling reason to prefer subtrees in teams that struggle with CI configuration.

# No special configuration needed for subtrees
name: Build and Test
on: [push, pull_request]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
      - run: npm install && npm test

Common Workflows

Submodule Workflow: Daily Development

# Morning: start work, make sure submodules are current
$ git pull origin main
$ git submodule update --init --recursive

# During the day: if you need to update a submodule
$ git submodule update --remote libs/shared-utils
$ git diff libs/shared-utils  # see what changed
$ git add libs/shared-utils
$ git commit -m "update shared-utils to latest"

# If you need to work on the submodule itself
$ cd libs/shared-utils
$ git checkout main
$ git pull origin main
# make changes, commit, push
$ git push origin main
$ cd ../..
$ git add libs/shared-utils
$ git commit -m "update shared-utils pointer after fix"

Subtree Workflow: Daily Development

# Morning: start work (no extra steps, just pull)
$ git pull origin main

# If you need to update the subtree from upstream
$ git subtree pull --prefix=libs/shared-utils shared-utils main --squash

# If you fix a bug in the subtree and want to push upstream
$ git subtree push --prefix=libs/shared-utils shared-utils main

# If you just want to edit subtree files locally, just edit them
# They are regular files. No special commands needed.

Complete Working Example

Let us set up both approaches from scratch so you can see the full lifecycle.

Setting Up the Shared Library

First, create the shared library repository:

$ mkdir shared-utils && cd shared-utils
$ git init
Initialized empty Git repository in /home/dev/shared-utils/.git/

$ cat > package.json << 'EOF'
{
  "name": "@myorg/shared-utils",
  "version": "1.0.0",
  "main": "index.js"
}
EOF

$ cat > index.js << 'JSEOF'
var validator = require("./lib/validate");
var formatter = require("./lib/format");

module.exports = {
  validate: validator,
  format: formatter
};
JSEOF

$ mkdir lib

$ cat > lib/validate.js << 'JSEOF'
function isEmail(value) {
  var pattern = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
  return pattern.test(value);
}

function isNonEmpty(value) {
  return typeof value === "string" && value.trim().length > 0;
}

function isPositiveInt(value) {
  var num = parseInt(value, 10);
  return !isNaN(num) && num > 0 && String(num) === String(value);
}

module.exports = {
  isEmail: isEmail,
  isNonEmpty: isNonEmpty,
  isPositiveInt: isPositiveInt
};
JSEOF

$ cat > lib/format.js << 'JSEOF'
function slugify(text) {
  return text
    .toString()
    .toLowerCase()
    .trim()
    .replace(/\s+/g, "-")
    .replace(/[^\w\-]+/g, "")
    .replace(/\-\-+/g, "-");
}

function truncate(text, maxLength) {
  if (text.length <= maxLength) {
    return text;
  }
  return text.substring(0, maxLength - 3) + "...";
}

module.exports = {
  slugify: slugify,
  truncate: truncate
};
JSEOF

$ git add -A
$ git commit -m "initial shared-utils library"
$ git branch -M main   # make sure the branch is named main, whatever git init defaulted to
$ git remote add origin https://github.com/your-org/shared-utils.git
$ git push -u origin main

Example A: Submodule-Based Project

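# Create the consuming project next to shared-utils, not inside it
$ cd ..
$ mkdir service-alpha && cd service-alpha
$ git init
$ npm init -y
$ npm install express            # app.js below requires it
$ echo "node_modules/" > .gitignore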

# Add the shared library as a submodule
$ git submodule add https://github.com/your-org/shared-utils.git libs/shared-utils
Cloning into 'libs/shared-utils'...
done.

# Use it in your code
$ cat > app.js << 'JSEOF'
var express = require("express");
var utils = require("./libs/shared-utils");

var app = express();
app.use(express.json());

app.post("/users", function(req, res) {
  var email = req.body.email;
  var name = req.body.name;

  if (!utils.validate.isEmail(email)) {
    return res.status(400).json({ error: "Invalid email" });
  }

  if (!utils.validate.isNonEmpty(name)) {
    return res.status(400).json({ error: "Name is required" });
  }

  var slug = utils.format.slugify(name);

  res.json({
    message: "User created",
    slug: slug,
    email: email
  });
});

var port = process.env.PORT || 3000;
app.listen(port, function() {
  console.log("service-alpha listening on port " + port);
});
JSEOF

# Commit everything
$ git add -A
$ git commit -m "initial service with shared-utils submodule"

Here is the GitHub Actions CI configuration for this project:

# .github/workflows/ci.yml
name: CI

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          submodules: recursive

      - uses: actions/setup-node@v4
        with:
          node-version: '20'

      - run: npm install
      - run: npm test

And the Azure Pipelines equivalent:

# azure-pipelines.yml
trigger:
  branches:
    include:
      - main

pool:
  vmImage: 'ubuntu-latest'

steps:
  - checkout: self
    submodules: recursive
    persistCredentials: true

  - task: NodeTool@0
    inputs:
      versionSpec: '20.x'

  - script: npm install
    displayName: 'Install dependencies'

  - script: npm test
    displayName: 'Run tests'

Example B: Subtree-Based Project

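# Create the consuming project next to the other repositories
$ cd ..
$ mkdir service-beta && cd service-beta
$ git init
$ npm init -y
$ npm install express            # app.js below requires it
$ echo "node_modules/" > .gitignore

# git subtree add needs an existing commit and a clean working tree
$ git add -A
$ git commit -m "initial commit"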

# Add the remote for convenience
$ git remote add shared-utils https://github.com/your-org/shared-utils.git

# Add as a subtree
$ git subtree add --prefix=libs/shared-utils shared-utils main --squash
Added dir 'libs/shared-utils'

# Use it in your code (identical to the submodule version)
$ cat > app.js << 'JSEOF'
var express = require("express");
var utils = require("./libs/shared-utils");

var app = express();
app.use(express.json());

app.post("/users", function(req, res) {
  var email = req.body.email;
  var name = req.body.name;

  if (!utils.validate.isEmail(email)) {
    return res.status(400).json({ error: "Invalid email" });
  }

  if (!utils.validate.isNonEmpty(name)) {
    return res.status(400).json({ error: "Name is required" });
  }

  var slug = utils.format.slugify(name);

  res.json({
    message: "User created",
    slug: slug,
    email: email
  });
});

var port = process.env.PORT || 3000;
app.listen(port, function() {
  console.log("service-beta listening on port " + port);
});
JSEOF

# Commit
$ git add -A
$ git commit -m "initial service with shared-utils subtree"

The CI configuration for the subtree project is standard -- no special configuration needed:

# .github/workflows/ci.yml
name: CI

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
      - run: npm install
      - run: npm test

Updating the Shared Library in Each Approach

Suppose a bug is found in validate.js and the fix is pushed to shared-utils upstream.

Updating the submodule-based project:

$ cd service-alpha
$ git submodule update --remote libs/shared-utils
Submodule path 'libs/shared-utils': checked out 'f7a8b9c0d1e2f3a4b5c6d7e8f9a0b1c2d3e4f5a6'
$ git add libs/shared-utils
$ git commit -m "update shared-utils with validation fix"
$ git push origin main

Updating the subtree-based project:

$ cd service-beta
$ git subtree pull --prefix=libs/shared-utils shared-utils main --squash
Merge made by the 'recursive' strategy.
 libs/shared-utils/lib/validate.js | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
$ git push origin main

Common Issues and Troubleshooting

Issue 1: "fatal: no submodule mapping found in .gitmodules for path"

fatal: no submodule mapping found in .gitmodules for path 'libs/shared-utils'

This happens when the Git index has a submodule entry but .gitmodules does not, or vice versa. It is usually caused by a bad merge or manually deleting the submodule directory without cleaning up properly.

Fix:

# Check what Git thinks the submodule state is
$ git ls-files --stage | grep 160000
160000 a1b2c3d4... 0	libs/shared-utils

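# If .gitmodules is missing the entry, write the mapping back
# (git submodule add would refuse here because the path is already in the index)
$ git config -f .gitmodules submodule.libs/shared-utils.path libs/shared-utils
$ git config -f .gitmodules submodule.libs/shared-utils.url https://github.com/your-org/shared-utils.git
$ git add .gitmodules

# If the entry exists in .gitmodules but the index is wrong
$ git rm --cached libs/shared-utils
$ git submodule add https://github.com/your-org/shared-utils.git libs/shared-utils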

Issue 2: "fatal: refusing to merge unrelated histories" (Subtree Pull)

fatal: refusing to merge unrelated histories

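This occurs on git subtree pull when Git cannot find a common ancestor between your project and the subtree. In my experience the usual trigger is mixing modes: the subtree was added with --squash but a later pull left the flag out (or the other way around), so the merge base that git subtree expects is not there.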

Fix:

# Re-fetch and retry the pull, using the same --squash setting as the original add
$ git fetch shared-utils main
$ git subtree pull --prefix=libs/shared-utils shared-utils main --squash

# If that still fails, remove and re-add the subtree
$ git rm -r libs/shared-utils
$ git commit -m "remove subtree for re-add"
$ git subtree add --prefix=libs/shared-utils shared-utils main --squash

Issue 3: "Server does not allow request for unadvertised object" in CI

fatal: Server does not allow request for unadvertised object a1b2c3d4e5f6...
Fetched in submodule path 'libs/shared-utils', but it did not contain a1b2c3d4e5f6.
Direct fetching of that commit failed.

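This happens in CI when the submodule is pinned to a commit that the submodule's remote cannot serve. There are two usual causes. Either the commit was made locally inside the submodule and the parent's pointer was pushed before the submodule itself, or someone force-pushed or rebased the submodule repo so that the pinned commit is no longer on any branch and was eventually garbage collected.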

Fix:

# On your local machine, update the submodule to a commit that exists
$ cd libs/shared-utils
$ git fetch origin
$ git log --oneline origin/main -5  # find a valid commit
$ git checkout <valid-commit>
$ cd ../..
$ git add libs/shared-utils
$ git commit -m "fix submodule pointer to valid commit"
$ git push

Prevent this by never force-pushing the submodule's main branch. Use --force-with-lease if you must force push, and never rewrite published history.
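You can also check for this before pushing the parent repository. The sketch below asks whether the commit recorded for a submodule is reachable from any remote-tracking branch of that submodule. The function name pinned_commit_is_published is my own, and it assumes the submodule's remote is called origin and that you run it from the parent repository's root.

```shell
#!/usr/bin/env sh

# Succeeds (0) if the commit the parent records for the submodule at $1 is
# contained in at least one remote-tracking branch of that submodule.
# Returns 1 if it is not published, 2 if the check itself could not run.
pinned_commit_is_published() {
  sub_path="$1"

  # HEAD:<path> resolves to the commit SHA stored in the gitlink entry.
  pinned="$(git rev-parse --verify --quiet "HEAD:$sub_path")" || return 2

  # Refresh the submodule's view of its remote before asking.
  git -C "$sub_path" fetch --quiet origin || return 2

  # An empty list means no remote branch contains the pinned commit.
  [ -n "$(git -C "$sub_path" branch -r --contains "$pinned" 2>/dev/null)" ]
}
```

A call such as pinned_commit_is_published libs/shared-utils || echo "push the submodule first" fits well in a pre-push hook. Git's own push.recurseSubmodules=check setting covers the unpushed-commit case natively; this sketch additionally catches pointers that went stale after a rewrite upstream.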

Issue 4: Subtree Merge Conflicts on Pull

Auto-merging libs/shared-utils/lib/validate.js
CONFLICT (content): Merge conflict in libs/shared-utils/lib/validate.js
Automatic merge failed; fix conflicts and then commit the result.

This happens when you have made local changes to the subtree files and the upstream also changed the same lines. It is a normal merge conflict, but it can be confusing because you might not realize the subtree files are expected to stay in sync with an upstream.

Fix:

# Resolve the conflict like any merge conflict
$ code libs/shared-utils/lib/validate.js  # fix conflicts manually
$ git add libs/shared-utils/lib/validate.js
$ git commit -m "resolve subtree merge conflict in validate.js"

To avoid this, establish a convention: either always edit shared code upstream and pull into consumers, or always edit locally and push upstream. Bidirectional changes to the same files will eventually cause conflicts.

Issue 5: Submodule Shows Modified When It Should Not

$ git status
On branch main
Changes not staged for commit:
  modified:   libs/shared-utils (new commits)

This appears when someone ran commands inside the submodule directory that changed the checked-out commit, or when git submodule update --remote was run without committing the result.

Fix:

# Reset the submodule to the committed pointer
$ git submodule update --init libs/shared-utils

# Or, if you intentionally updated it, commit the change
$ git add libs/shared-utils
$ git commit -m "update shared-utils submodule"

Best Practices

  • Document your choice. Add a section to your README explaining which approach you use and why. Include the exact commands for cloning, updating, and (for subtrees) pushing changes upstream. Future you and your teammates will thank you.

  • Use --squash with subtrees. Unless you have a specific reason to merge the full upstream history, always use --squash. It keeps your commit log clean and avoids confusing interleaved history from the subtree.

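  • Pin submodules to tags, not branches. When possible, pin to the commit of a tagged release (v2.4.1) rather than to wherever a branch tip happens to be. The parent records a commit SHA either way, but a release tag is by convention never moved, while a branch tip moves with every push. Pinning to a tagged release gives you reproducible builds and a pointer that is easy to audit.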

  • Add a Git alias for submodule updates. Save your team from remembering the full command:

    $ git config --global alias.pullall '!git pull && git submodule update --init --recursive'
    
    # Now just run:
    $ git pullall
    
  • Test submodule CI configuration before merging. Open a draft PR that adds the submodule and verify the CI pipeline checks out the submodule correctly. Do not discover the problem on a Friday afternoon production deploy.

  • Keep subtree remotes in a script. Since subtree remotes are not stored in .gitmodules, new developers will not know the upstream URL. Add a scripts/update-subtrees.sh that documents and automates the process:

    #!/bin/bash
    # update-subtrees.sh -- Pull latest changes from upstream subtree repos
    
    git remote add shared-utils https://github.com/your-org/shared-utils.git 2>/dev/null || true
    git subtree pull --prefix=libs/shared-utils shared-utils main --squash
    
  • Never nest submodules more than one level deep. Nested submodules (submodules within submodules) work technically, but the update and init commands become fragile. If you find yourself nesting, flatten the structure or switch to a monorepo.

  • Evaluate regularly. If your submodule is causing more CI failures than it prevents dependency drift, switch to a subtree or a package. If your subtree is growing so large that it bloats your repository, switch to a submodule. The right tool changes as your project evolves.

  • Use .gitmodules shallow clone for large submodules. If the submodule repository is large and you only need the pinned commit (not the full history), configure shallow cloning:

    [submodule "vendor/large-sdk"]
        path = vendor/large-sdk
        url = https://github.com/vendor/large-sdk.git
        shallow = true
    

    This dramatically reduces clone time in CI pipelines.
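Several of the practices above can be backed by per-repository Git settings rather than team discipline. Here is a minimal sketch that applies four of them; the function name configure_submodule_defaults is my own, and whether you want all four is a judgment call for your team.

```shell
#!/usr/bin/env sh

# Applies submodule-friendly defaults to the repository at $1 (default: .).
configure_submodule_defaults() {
  repo="${1:-.}"

  # Let pull, checkout, switch and similar commands update submodules too.
  git -C "$repo" config submodule.recurse true

  # Show submodule changes in diffs as a list of commits, not two bare SHAs.
  git -C "$repo" config diff.submodule log

  # Include a summary of submodule changes in `git status`.
  git -C "$repo" config status.submoduleSummary true

  # Refuse to push the parent while it references unpushed submodule commits.
  git -C "$repo" config push.recurseSubmodules check
}
```

These settings live in .git/config, so they are not shared through the repository. Running the function from a setup script that new contributors execute once is one way to distribute them.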
