Git Large File Storage


In this post, I will introduce a common way to handle large files in Git with Git LFS (Large File Storage), together with a history-cleaning tool called bfg (BFG Repo-Cleaner).


Say that one day you find that an engineer on your team has accidentally committed a binary file (for example, a shared library) to the remote repository. He argues that the file must be accessible from the repository, so you need to find a better way to keep it there.


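Since it is bad practice to keep large binaries alongside code, the first thing we want to do is remove the file from every commit that contains it. A plain revert is not enough, because the file would still live in the history and every clone would still download it. After that, we need another method to store large files.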

The conclusion is that we can use bfg and lfs to do the job. bfg is a simpler, faster alternative to git filter-branch for removing unwanted files from a repository's history, and lfs is a Git extension for large file storage.
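
Before cleaning anything, it can help to see which files are actually bloating the history. The following is a minimal sketch that uses only stock Git commands; the function name list_large_blobs is my own and not part of Git or bfg.

```shell
# List the largest blobs reachable from any ref, biggest first.
# Usage: list_large_blobs [count]   (count defaults to 10)
# Output lines look like: <size in bytes> <path>
list_large_blobs() {
    count="${1:-10}"
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        awk '$1 == "blob" { sub(/^blob /, ""); print }' |
        sort -rn |
        head -n "$count"
}
```

Run it from inside the repository, for example as list_large_blobs 5. A committed shared library will usually show up at the top of the list.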

1. Install bfg

Visit the official website of bfg and download the jar file. bfg runs on the JVM, so you also need Java installed. Then you are all set with bfg.

2. Clean the Remote Branch

We will use the --delete-files option, which removes every file whose name matches a given pattern from the history.

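## delete the files in a normal commit first; bfg protects the latest commit by default
cd /path/to/your/repo
git rm -r lib/
git commit -m "rm shared lib"
## use bfg; --delete-files takes a file name pattern (here the .so files), followed by the repo path
java -jar bfg.jar --delete-files '*.so' /path/to/your/repo
## drop the old objects that bfg left unreferenced
git reflog expire --expire=now --all && git gc --prune=now --aggressive
## since the commit hashes have changed, you must force the remote to update
git push origin your-branch -f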
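
Before force-pushing, it is worth checking that the files are really gone from the history. Here is a small sketch that uses plain Git; history_contains is a hypothetical helper name of mine, and the '\.so$' pattern assumes the shared libraries from this example.

```shell
# Succeeds (exit status 0) if any object reachable from any ref has a path
# matching the given extended regular expression, e.g. '\.so$'.
history_contains() {
    git rev-list --objects --all |
        cut -s -d' ' -f2- |
        grep -E -- "$1" >/dev/null
}

# Example (run inside the repository):
#   if history_contains '\.so$'; then echo "still in history"; else echo "clean"; fi
```

Note that this only looks at objects reachable from refs. Objects kept alive by the reflog are not listed, which is why expiring the reflog and running git gc is still needed to actually free the space.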

3. Install lfs

See the official Git LFS installation instructions for details. The commands below are for Debian or Ubuntu.

curl -s | sudo bash
sudo apt-get install git-lfs
git lfs install

4. Track Large Files

cd /path/to/your/repo
git lfs install
# stay in the repository root so that the paths below resolve correctly
git lfs track '*.so'
# you will see a .gitattributes file in the current directory
git add .gitattributes
git add lib/
git commit -m "add libs using lfs"
# after this, the `.so` files in lib are tracked by lfs rather than by plain Git
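
To double-check that the tracking rule applies to a given path, you can ask Git which filter it would use. The sketch below relies on git check-attr, which is part of stock Git; the helper name uses_lfs_filter and the example paths are my own.

```shell
# Succeeds if Git's attribute rules route the given path through the "lfs" filter.
# git check-attr prints lines of the form "<path>: filter: <value>".
uses_lfs_filter() {
    [ "$(git check-attr filter -- "$1" | sed 's/^.*: filter: //')" = "lfs" ]
}

# Example (run inside the repository):
#   uses_lfs_filter lib/libfoo.so && echo "handled by lfs"
```

This works because git lfs track '*.so' writes a line like `*.so filter=lfs diff=lfs merge=lfs -text` into .gitattributes, and check-attr simply evaluates those rules. It does not tell you whether a file that was already committed has been converted.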

Make sure your co-workers have also installed lfs correctly. Then just push and pull as you normally would. After these steps, instead of storing the large binary files themselves, the repository stores small pointer files, and lfs keeps the actual file contents on a separate server.
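
If you are curious what Git itself now stores, a pointer is a tiny text file whose first line names the LFS spec version, followed by an oid line and a size line. Below is a rough sketch for checking whether a committed blob is such a pointer; is_lfs_pointer is a hypothetical helper of mine, and it only inspects the first line rather than validating the whole pointer.

```shell
# Reads a blob on stdin and succeeds if its first line is the Git LFS
# pointer header. This is a quick heuristic, not a full validation.
is_lfs_pointer() {
    head -n 1 | grep -qx 'version https://git-lfs\.github\.com/spec/v1'
}

# Example (run inside the repository; the path is only an illustration):
#   git cat-file -p HEAD:lib/libfoo.so | is_lfs_pointer && echo "stored as a pointer"
```

The check goes through git cat-file rather than reading the working tree, because on checkout lfs replaces each pointer with the real file contents.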
