
Filter branch (git filter-branch)

Author: Chuan Chen | Views: 2396 | Category: Development Tools

git filter-branch is Git's built-in history-rewriting tool. It can batch-modify file contents, author information, commit messages, and more across a repository's commits. Use it with caution: every rewritten commit gets a new hash, which disrupts anyone who has already cloned or branched from the old history.

Basic Concepts and Uses

The core functionality of git filter-branch is to traverse all commits and rewrite history according to specified rules. Common scenarios include:

  • Permanently deleting sensitive files (e.g., passwords, keys) from history
  • Batch-modifying author/committer information
  • Extracting a subdirectory as the root of a new repository
  • Relocating a repository's files (for example, into a subdirectory) so that its history can be merged into another repository without path conflicts

# Basic syntax structure
git filter-branch [<filter options>] [--] [<rev-list options>]

Typical Use Cases

Deleting Files from History

To delete a specific file (e.g., credentials.txt) from all commits:

git filter-branch --force --index-filter \
  'git rm --cached --ignore-unmatch credentials.txt' \
  --prune-empty --tag-name-filter cat -- --all
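
  • --index-filter is faster than --tree-filter because it edits the index directly and does not check out files
  • --prune-empty automatically removes commits that become empty after the rewrite
  • --tag-name-filter cat keeps tag names unchanged while moving the tags to the rewritten commits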
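
Before running this on a real repository, it can help to try the command in a throwaway one. The following is a minimal sketch, assuming a POSIX shell and a Git installation that still ships filter-branch; the temporary directory, file contents, and commit messages are invented for the demo, and FILTER_BRANCH_SQUELCH_WARNING=1 only silences filter-branch's startup warning.

```shell
# Throwaway demo: every name below is illustrative, not taken from a real project.
set -eu
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's warning and pause

demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo User"
git config user.email "demo@example.com"

# Commit 1 adds the secret next to a normal file; commit 2 only touches the normal file.
echo "hunter2" > credentials.txt
echo "hello" > README.txt
git add . && git commit -q -m "Add README and credentials"
echo "more" >> README.txt
git commit -q -am "Update README"

git filter-branch --force --index-filter \
  'git rm --cached --ignore-unmatch credentials.txt' \
  --prune-empty --tag-name-filter cat -- --all >/dev/null 2>&1

# Count commits on the rewritten branches that still touch the file.
remaining=$(git log --branches --oneline -- credentials.txt | wc -l | tr -d ' ')
echo "commits touching credentials.txt: $remaining"
```

The check uses --branches rather than --all on purpose: until the refs/original backups are deleted, the old commits (and the secret) remain reachable through them.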

Modifying Commit Information

Batch-modifying author email addresses:

git filter-branch --env-filter '
  if [ "$GIT_AUTHOR_EMAIL" = "old@example.com" ]; then
    GIT_AUTHOR_EMAIL="new@example.com";
  fi
' --tag-name-filter cat -- --all
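
The filter above only touches the author field. A commit also records a committer, so a full address change usually needs both. The sketch below is an illustration under assumptions: it builds a temporary repository with made-up identities, applies the same pattern to GIT_COMMITTER_EMAIL, and prints both fields afterwards.

```shell
# Throwaway demo: identities and file names are illustrative.
set -eu
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's warning and pause

demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo User"
git config user.email "old@example.com"   # commits are created with the old address

echo one > file.txt
git add file.txt && git commit -q -m "First commit"

git filter-branch --force --env-filter '
  if [ "$GIT_AUTHOR_EMAIL" = "old@example.com" ]; then
    GIT_AUTHOR_EMAIL="new@example.com"
  fi
  if [ "$GIT_COMMITTER_EMAIL" = "old@example.com" ]; then
    GIT_COMMITTER_EMAIL="new@example.com"
  fi
' --tag-name-filter cat -- --all >/dev/null 2>&1

# %ae is the author email, %ce the committer email.
emails=$(git log -1 --format='%ae %ce')
echo "$emails"
```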

Extracting a Subdirectory

Promoting subdir/ to the repository root:

git filter-branch --subdirectory-filter subdir -- --all
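
A small throwaway repository makes the effect visible. This is a sketch under assumptions: the directory names other/ and the file names are hypothetical, and only subdir matches the command above.

```shell
# Throwaway demo: directory and file names are illustrative.
set -eu
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's warning and pause

demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo User"
git config user.email "demo@example.com"

mkdir subdir other
echo lib > subdir/lib.txt
echo misc > other/misc.txt
git add . && git commit -q -m "Add subdir and other"

git filter-branch --subdirectory-filter subdir -- --all >/dev/null 2>&1

# After the rewrite, the contents of subdir/ sit at the repository root.
files=$(git ls-tree -r --name-only HEAD)
echo "$files"
```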

Advanced Usage Examples

Conditionally Modifying File Content

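Using --tree-filter to rewrite text in every .js file across the last ten commits of the current branch (the \b word boundaries and the in-place -i form assume GNU sed; without the boundaries, names such as "variable" would also be changed):

git filter-branch --tree-filter "
  find . -name '*.js' -exec sed -i 's/\bvar\b/const/g' {} +
" HEAD~10..HEAD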

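Because sed -i behaves differently on GNU and BSD/macOS systems, a tree filter can also write to a temporary file instead. The sketch below is one possible portable variant, not the only way to do it: it runs in a throwaway repository, the file app.js and its contents are made up, and the pattern only rewrites var at the start of a declaration.

```shell
# Throwaway demo: app.js and its contents are illustrative.
set -eu
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's warning and pause

demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo User"
git config user.email "demo@example.com"

printf 'var x = 1;\nlet variable = 2;\n' > app.js
git add app.js && git commit -q -m "Add app.js"

# POSIX sed only: write to a temp file, then copy back so the file mode is kept.
git filter-branch --tree-filter '
  find . -name "*.js" -type f | while read -r f; do
    sed "s/^\([[:space:]]*\)var /\1const /" "$f" > "$f.tmp" &&
      cat "$f.tmp" > "$f" && rm "$f.tmp"
  done
' HEAD >/dev/null 2>&1

first_line=$(git show HEAD:app.js | head -n 1)
echo "$first_line"
```
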
Combining Multiple Conditions

Modifying author information and deleting files simultaneously:

git filter-branch --env-filter '
  # Modify author
  export GIT_AUTHOR_NAME="New Name"
  export GIT_AUTHOR_EMAIL="new@email.com"
' --index-filter '
  # Delete file
  git rm --cached --ignore-unmatch secret.txt
' --prune-empty -- --all

Performance Optimization Tips

  1. Limit scope: Add a commit range (e.g., HEAD~20..HEAD)
  2. Use index filtering: --index-filter is usually much faster than --tree-filter because it edits the index directly instead of checking out every commit's files
  3. Disable garbage collection: Temporarily disable GC
    git -c gc.auto=0 filter-branch ...
    

Risks and Considerations

  1. Hash changes: Every rewritten commit and all of its descendants get new hashes, so collaborators must re-clone or rebase their work onto the new history
  2. Backup required: Always create a repository backup before proceeding
    git clone --mirror original.git backup.git
    
  3. Clean up backup refs: After the operation, delete the refs/original backups and expire the reflog so the old objects can be garbage-collected
    git for-each-ref --format="delete %(refname)" refs/original | git update-ref --stdin
    git reflog expire --expire=now --all
    git gc --prune=now
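
The hash change and the refs/original backup can both be observed in a throwaway repository. The following is a minimal sketch under assumptions: the repository, file, and commit message are invented, and --msg-filter is used only because it is a simple way to trigger a rewrite.

```shell
# Throwaway demo: repository contents are illustrative.
set -eu
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's warning and pause

demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo User"
git config user.email "demo@example.com"

echo one > file.txt
git add file.txt && git commit -q -m "wip: first draft"
before=$(git rev-parse HEAD)

# Changing anything in a commit (here: its message) yields a new hash.
git filter-branch --msg-filter 'sed "s/wip/WIP/"' -- --all >/dev/null 2>&1
after=$(git rev-parse HEAD)

# filter-branch keeps the old branch tips under refs/original/ as a backup.
backups=$(git for-each-ref refs/original | wc -l | tr -d ' ')

# Cleanup as described in item 3.
git for-each-ref --format="delete %(refname)" refs/original | git update-ref --stdin
git reflog expire --expire=now --all
git gc -q --prune=now
left=$(git for-each-ref refs/original | wc -l | tr -d ' ')

echo "before=$before after=$after backups=$backups left=$left"
```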
    

Alternative Tool Comparison

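The Git documentation itself warns about the pitfalls of git filter-branch and recommends git filter-repo instead. Alternatives to consider, especially for large repositories: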

  • git-filter-repo (Python implementation, faster)
  • BFG Repo-Cleaner (Java implementation, optimized for deleting large files)

# Example of using git-filter-repo to delete files
git filter-repo --path credentials.txt --invert-paths

Practical Example: Project Migration

Assume you need to extract the project-v1/ directory as a new repository and modify all committer information:

# Step 1: Clone the original repository
git clone https://example.com/original.git
cd original

# Step 2: Extract the subdirectory
git filter-branch --subdirectory-filter project-v1 -- --all

# Step 3: Modify commit information
# (--force is needed because step 2 already left a backup in refs/original)
git filter-branch --force --env-filter '
  export GIT_COMMITTER_NAME="Team"
  export GIT_COMMITTER_EMAIL="team@company.com"
' -- --all

# Step 4: Push to the new repository
git remote set-url origin https://example.com/new-repo.git
git push --force --all
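
Steps 2 and 3 can also be combined into a single pass, since filter-branch accepts several filters in one invocation. The sketch below illustrates this in a throwaway repository rather than a clone of the real one; the project-v2 directory, the file names, and the "Original Dev" identity are assumptions made for the demo.

```shell
# Throwaway demo: stands in for the cloned repository from step 1.
set -eu
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's warning and pause

demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Original Dev"
git config user.email "dev@example.com"

mkdir project-v1 project-v2
echo v1 > project-v1/app.txt
echo v2 > project-v2/app.txt
git add . && git commit -q -m "Add both projects"

# One pass: extract the subdirectory and rewrite the committer together.
git filter-branch --subdirectory-filter project-v1 --env-filter '
  export GIT_COMMITTER_NAME="Team"
  export GIT_COMMITTER_EMAIL="team@company.com"
' -- --all >/dev/null 2>&1

committer=$(git log -1 --format='%cn <%ce>')
files=$(git ls-tree -r --name-only HEAD)
echo "$committer: $files"
```

Only the committer fields change here; the author recorded in each commit stays as it was.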
