167

I have a commit with the ID 56f06019, for example. In that commit I accidentally committed a large file (50 MB). In a later commit I added the same file again, this time at the right (small) size. Now my repo is too heavy to clone. How do I remove that large file from the repo history to reduce the size of my repo?

3

5 Answers

220

Chapter 9 of the Pro Git book has a section on Removing Objects.

Let me outline the steps briefly here:

git filter-branch --index-filter \
    'git rm --cached --ignore-unmatch path/to/mylarge_50mb_file' \
    --tag-name-filter cat -- --all

Like the rebase approach in the other answer, filter-branch is a history-rewriting operation. If you have already published the history, you'll have to push the new refs with --force.

The filter-branch approach is considerably more powerful than the rebase approach, since it

  • allows you to work on all branches/refs at once,
  • rewrites any tags on the fly so that they point at the new commits (this is what --tag-name-filter cat does)
  • operates cleanly even if there have been several merge commits since the addition of the file
  • operates cleanly even if the file was (re)added/removed several times in the history of (a) branch(es)
  • doesn't create new, unrelated commits, but rather copies them while modifying the trees associated with them. This means that commit metadata such as author, date, and message is preserved (signatures cannot remain valid, since every rewritten commit gets a new hash)
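
The filter-branch command above needs the exact path of the offending file. If you are not sure which path is responsible for the bloat, a sketch like the following lists the largest blobs anywhere in history. It assumes a POSIX shell, and largest_blobs is a hypothetical helper name, not a git command:

```shell
# Print the N largest blobs reachable from any ref as "<size-in-bytes> <path>".
# largest_blobs is a hypothetical helper name; N defaults to 10.
largest_blobs() {
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        sed -n 's/^blob //p' |   # keep blobs only, drop the type column
        sort -n -r |             # largest first
        head -n "${1:-10}"
}
```

Running largest_blobs 5 inside the repository prints the five biggest files ever committed, including ones that were deleted later, which is usually enough to spot the path to pass to git rm --cached.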

filter-branch keeps backups too, so the size of the repo won't decrease immediately unless you expire the reflogs and garbage collect:

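rm -Rf .git/refs/original             # careful: deletes filter-branch's backup refs
git reflog expire --expire=now --all  # drop reflog entries that still reference the old commits
git gc --aggressive --prune=now       # danger: permanently deletes the unreachable objects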
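
To check that the rewrite did what you expected, you can ask git whether any ref still leads to a commit that touches the path. This is a minimal sketch, assuming a POSIX shell; history_has_path is a hypothetical helper name:

```shell
# Exit status 0 if any commit reachable from any ref still touches the path,
# non-zero otherwise. history_has_path is a hypothetical helper name.
history_has_path() {
    [ -n "$(git log --all --format=%H -- "$1")" ]
}
```

For example, history_has_path path/to/mylarge_50mb_file || echo "gone from history". Because --all includes the backup refs under refs/original, the check keeps succeeding until those refs have been deleted. git count-objects -vH then shows whether the on-disk size actually dropped.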
15
  • 1
    It's worth noting that this doesn't seem to work under windows cmd.exe. Seems to work under cygwin fine, though.
    – Fake Name
    Commented Nov 16, 2013 at 5:29
  • 2
    I got the above git filter-branch to work by using double-quotes instead of single-quotes (on Windows Server 2012 cmd.exe)
    – JCii
    Commented Dec 19, 2013 at 7:09
  • 2
    What worked for me was this filter-branch command line: git filter-branch --force --index-filter 'git rm --ignore-unmatch --cached PathTo/MyFile/ToRemove.dll' -- fbf28b005^.. followed by rm --recursive --force .git/refs/original and rm --recursive --force .git/logs. Then I ran git prune --expire now and git gc --aggressive. This worked better for me than the exact steps listed above. Thank you for including the link to the Pro Git book, as it was invaluable.
    – dacke.geo
    Commented Nov 16, 2015 at 16:53
  • After the filter-branch command, the only way I could get the size of the .git folder down was to follow the command found here: stackoverflow.com/questions/1904860/… git -c gc.reflogExpire=0 -c gc.reflogExpireUnreachable=0 -c gc.rerereresolved=0 -c gc.rerereunresolved=0 -c gc.pruneExpire=now gc "$@" Commented Mar 14, 2016 at 17:39
  • 1
    @AlexanderMyasnikov because usually files are being removed for important reasons (like, they're big or contain sensitive information). Unless you process all branches, the file will still be in the repository. Also, good thing that you still have the backup after filter-branch.
    – sehe
    Commented Jun 3, 2023 at 1:30
24

You can use the git-extras tool. Its obliterate command completely removes a file from the repository, including from past commits and tags.

https://github.com/tj/git-extras/blob/master/Commands.md

4
  • 2
    Awesome! This did the job. So beautifully simple. Commented Jul 1, 2021 at 2:27
  • Thanks a lot for such a wonderful and simplest solution. Commented Dec 28, 2021 at 12:08
  • This had to rewrite the entire history which was about 30000 commits, even though the files I want to remove were just 5 commits old.
    – thanos.a
    Commented Sep 20, 2022 at 12:33
  • git obliterate does basically the same as the accepted answer.
    – Y. E.
    Commented May 7, 2023 at 12:42
11

I tried using the following answer on Windows: https://stackoverflow.com/a/8741530/8461756

Single quotes do not work on Windows; you need double quotes.

The following worked for me:

git filter-branch --force --index-filter "git rm --cached --ignore-unmatch PathRelativeRepositoryRoot/bigfile.csv" -- --all

After removing the big file, I was able to push my changes to GitHub master.

3
  • somehow .\relative\path\to\file* doesn't work for me. I need to use *file* instead
    – Ooker
    Commented Dec 16, 2022 at 16:04
  • @Ooker the path you used, was it relative to repo root? Commented Mar 31 at 9:43
  • Two years have passed and I have no idea what the situation was back then, but I suppose yes.
    – Ooker
    Commented Mar 31 at 10:21
1

You will need to use git rebase in interactive mode; see the examples in How can I remove a commit on GitHub? and how to remove old commits.

If your commit is at HEAD minus 10 commits:

$ git rebase -i HEAD~10

After editing your history, you need to push the "new" history. Prefix the branch name with + to force the push (see the refspec description in the push options):

$ git push origin +master

If other people have already cloned your repository, you will need to inform them, because you have just changed the history.
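
Dropping a single commit in the interactive rebase amounts to deleting its line from the todo list. The same result can be sketched non-interactively with git rebase --onto. This assumes the commit is on the current branch and is not the root commit; drop_commit is a hypothetical helper name:

```shell
# Drop one commit from the current branch by replaying everything after it
# onto its parent. drop_commit is a hypothetical helper name.
# Assumes the commit is an ancestor of HEAD and has a parent.
drop_commit() {
    git rebase --onto "$1^" "$1"
}
```

With the commit from the question this would be drop_commit 56f06019. If a later commit modifies the file that the dropped commit introduced, the rebase stops with a conflict that has to be resolved by hand, and only the current branch is rewritten.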

6
  • 4
    That does not remove the large file from history. Also, the canonical way to force push is git push --force or git push -f (which doesn't require people to know the branch push target)
    – sehe
    Commented Jan 5, 2012 at 11:13
  • Based on the question, the new file is exactly the same as the old file, that is, the same path. This is why you cannot directly use git rm on the path. Commented Jan 5, 2012 at 12:22
  • 2
    @sehe, if you do a rebase eliminating the commit with the huge file, it is gone for good.
    – vonbrand
    Commented Feb 7, 2013 at 1:33
  • @vonbrand only from that branch that you rebased. I'm not assuming the 'from' branch gets deleted. But yeah, if you delete a revision tree branch, that will help :_
    – sehe
    Commented Feb 7, 2013 at 12:38
  • @sehe, sure, you have to chase down all branches containing the offending commit. If it is before some bushiness in the repo, you'll have a lot of reorganizing to do. But rebase is the tool for this.
    – vonbrand
    Commented Feb 7, 2013 at 13:02
-1

You can use a simple command to delete the file:

 git rm -r -f app/unused.txt 
 git rm -r -f yourfilepath
1
  • This will leave the file in the history. The question is to remove the file from the history as well. So it was like the file was never added in the first place. Commented Jan 20, 2023 at 14:15
