Hi there, when using `git-filter-repo` recently (git version `2.49.0`, git-filter-repo version `a40bce548d2c`), I learned "the hard way" that the glob syntax does not match what you'd expect from a glob pattern containing a `*`. Specifically, the path segment boundary is not respected.
For example, given the following directory structure:
a/
b/
c/
d.sql
e.sql
not_deleted.txt
y.sql
The following glob pattern in a paths file (`paths.txt`):
glob:a/b/*.sql
And the following command:
git filter-repo --sensitive-data-removal --invert-paths --paths-from-file paths.txt
With typical glob syntax, one would expect that `y.sql` would be deleted and nothing else (because `*` does not match path separators). However, the entire `a/b/c` directory is also deleted.
On further RTFM-ing of the man page, I discovered a note in the examples describing this as an expected behavior.
It appears that `git-filter-repo` may actually be using Unix `fnmatch` syntax rather than glob syntax: in `fnmatch`, `*` matches any character (including path separators).
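To illustrate the difference, here is a minimal sketch using Python's standard `fnmatch` module. It assumes (this is inferred from the behavior described above, not checked against the source here) that `git-filter-repo` matches globs in a comparable way, and that `d.sql` and `e.sql` live under `a/b/c/` while `y.sql` lives directly under `a/b/`. The regular expression is only a stand-in for what a segment-aware glob would do.

```python
import fnmatch
import re

# Assumed layout, inferred from the example in this issue.
paths = ["a/b/c/d.sql", "a/b/c/e.sql", "a/b/y.sql"]
pattern = "a/b/*.sql"

# fnmatch-style matching: "*" matches any run of characters, including "/".
fnmatch_hits = [p for p in paths if fnmatch.fnmatchcase(p, pattern)]

# Segment-aware matching (what many people expect from a glob):
# "*" is limited to a single path segment, i.e. it cannot cross "/".
segment_re = re.compile(r"a/b/[^/]*\.sql")
segment_hits = [p for p in paths if segment_re.fullmatch(p)]

print("fnmatch-style:", fnmatch_hits)
print("segment-aware:", segment_hits)
```

With the fnmatch-style matcher all three paths match, which is consistent with `a/b/c` being removed; with the segment-aware matcher only `a/b/y.sql` matches.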
Short of changing the `git-filter-repo` API by renaming `path-glob` to `path-fnmatch`, I would suggest making this fact far more prominent in the docs (for example, by mentioning it in the options section of the man page and in any corresponding `--help` output) to avoid confusing people who assume that they are dealing with a standard glob matcher.
Finally, thanks so much for creating this tool -- it's the best at what it does!