In this post, I will introduce a common use of Git large file storage (lfs), together with a tool called bfg.
Say one day you find that an engineer on your team accidentally committed a binary file (for example, a shared library) to the remote repository. He argues that the file must be accessible in the repository, so you look for a better way to provide it.
Since it is bad practice to keep large files alongside code, the first thing we want to do is remove the file from all related commits in the history. After that, we need another method to store large files.
The conclusion is that we can use bfg and lfs to do the job. bfg (the BFG Repo-Cleaner) is a faster alternative to git filter-branch for cleaning history, and lfs is a Git extension for large file storage.
1. Install bfg
Visit the official website of bfg and download the jar file. It only needs a Java runtime, so you are then all set to run it with java -jar bfg.jar.
2. Clean the Remote Branch
We will use the --delete-files option of bfg to strip the file from the branch history.
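    ## delete the corresponding files
    cd /path/to/your/repo
    rm lib/libxxx.so
    git add .
    git commit -m "rm shared lib"

    ## use bfg (it protects the latest commit, which is why the file was removed above)
    java -jar bfg.jar --delete-files libxxx.so /path/to/your/repo

    ## expire the reflog and prune the objects that are no longer referenced
    git reflog expire --expire=now --all
    git gc --prune=now --aggressive

    ## commit hashes have been rewritten, so force the remote to update
    git push origin your-branch -f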
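Before or after the forced push, it can be worth confirming that the file is really gone from every commit. The sketch below is one way to check, assuming it is run inside the cleaned repository. The helper name in_history is hypothetical and is not part of Git or bfg.

```shell
# Hypothetical helper: succeeds if any object reachable from any ref
# is stored under a path whose last component equals the given file name.
in_history() {
    git rev-list --objects --all |
        awk -v name="$1" '
            {
                # Each line is "<hash> <path>"; commit lines carry no path.
                path = substr($0, index($0, " ") + 1)
                n = split(path, parts, "/")
            }
            n > 0 && parts[n] == name { found = 1 }
            END { exit found ? 0 : 1 }'
}
```

Calling `in_history libxxx.so` returns a non-zero status once no reachable commit contains the file. It only inspects objects reachable from refs, so it does not tell you whether the old data has already been pruned from disk.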
3. Install lfs
Visit the Git LFS website for detailed instructions. On Debian or Ubuntu, the steps are:
    curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
    sudo apt-get install git-lfs
    git lfs install
4. Track Large Files
    cd /path/to/your/repo
    git lfs install
    cd lib
    git lfs track '*.so'
    # you will see a .gitattributes file in the current directory
    git add .gitattributes
    # put libxxx.so back into lib first if you removed it in step 2;
    # we are already inside lib, so the path is relative to it
    git add libxxx.so
    git commit -m "add libs using lfs"
    # after this, the `.so` files in lib are tracked by lfs rather than by plain Git
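To check that a given path is actually covered by the tracking rule, you can query Git's own attribute lookup. This is a minimal sketch, assuming it is run inside the repository. The helper name is_lfs_tracked is hypothetical.

```shell
# Hypothetical helper: succeeds if Git resolves the "filter" attribute of
# the given path to "lfs", which is the attribute `git lfs track` writes
# into .gitattributes.
is_lfs_tracked() {
    # `git check-attr` prints "<path>: filter: <value>"; keep only the value.
    [ "$(git check-attr filter -- "$1" | sed 's/.*: filter: //')" = "lfs" ]
}
```

From the repository root, `is_lfs_tracked lib/libxxx.so` should succeed once the rule is in place. A path outside the pattern reports the attribute as unspecified, so the helper fails for it.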
Make sure your co-workers have also installed lfs correctly. Then just push and pull as you normally would. After these steps, instead of storing large binary files directly, lfs stores small pointer files in the repo and keeps the actual files on a separate server.
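To make the pointer idea concrete: in the Git LFS pointer format, the file committed to the repository is a small text file holding a spec version line, the SHA-256 of the real content, and its size in bytes. The sketch below rebuilds such a pointer by hand, for illustration only. make_pointer is a hypothetical helper, and it assumes sha256sum (GNU coreutils) is available. In normal use, git lfs generates the pointer for you.

```shell
# Hypothetical helper: print what an lfs pointer for the given file
# would look like (version line, content hash, size in bytes).
make_pointer() {
    # sha256sum prints "<hash>  -" when reading stdin; keep only the hash.
    oid=$(sha256sum < "$1" | cut -d ' ' -f 1)
    # Some wc implementations pad the count with spaces; strip them.
    size=$(wc -c < "$1" | tr -d ' ')
    printf 'version https://git-lfs.github.com/spec/v1\noid sha256:%s\nsize %s\n' "$oid" "$size"
}
```

Running `make_pointer lib/libxxx.so` prints three short lines regardless of how large the library is, which is why the repository stays small while the binary itself lives on the lfs server.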