Commit Graph

464 Commits

Author SHA1 Message Date
Elijah Newren
47c5a29fd4 Merge branch 'sb/callback-from-file'
Signed-off-by: Elijah Newren <newren@gmail.com>
2021-06-07 08:36:10 -07:00
Shezan Baig
5256c99e49 Allow callback body to be loaded from a file
For anything more complicated than a few lines, it's easier to write the
callback body in a file and let filter-repo load the file as a string.

Signed-off-by: Shezan Baig <sbaig1@bloomberg.net>
[en: added a testcase for code coverage]
Signed-off-by: Elijah Newren <newren@gmail.com>
2021-06-07 08:35:23 -07:00
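
As an illustration of the idea in 5256c99e49 above, here is a minimal sketch of accepting a callback body either inline or from a file. The helper names `load_callback_body` and `make_callback` are hypothetical, and this is not necessarily how filter-repo implements it; the sketch simply treats the argument as a filename when such a file exists.

```python
import os
import tempfile

def load_callback_body(arg):
    """Return file contents if arg names an existing file, else arg itself."""
    if os.path.isfile(arg):
        with open(arg, "r", encoding="utf-8") as f:
            return f.read()
    return arg

def make_callback(argname, body):
    """Wrap a body string in a function definition and return that function."""
    source = "def callback({}):\n".format(argname)
    source += "".join("  " + line + "\n" for line in body.splitlines())
    namespace = {}
    exec(source, namespace)  # assumption: the body comes from a trusted user
    return namespace["callback"]

# A multi-line body is easier to keep in a file than on the command line.
with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
    f.write("if filename.endswith(b'.tmp'):\n  return None\nreturn filename\n")
    body_path = f.name

from_file = make_callback("filename", load_callback_body(body_path))
inline = make_callback("filename", load_callback_body("return filename.upper()"))
os.unlink(body_path)
```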
Elijah Newren
a10fa46010 Merge branch 'sr/reusable-test-runner-script'
Signed-off-by: Elijah Newren <newren@gmail.com>
2021-06-07 08:21:46 -07:00
Stefano Rivera
24f09bd016 Share implementation with github workflow
Signed-off-by: Stefano Rivera <stefano@rivera.za.net>
2021-06-06 14:45:34 -04:00
Stefano Rivera
26e3f8c52e Exit non-zero if the tests fail
Signed-off-by: Stefano Rivera <stefano@rivera.za.net>
2021-06-06 14:45:34 -04:00
Stefano Rivera
34b26f4026 Break the actual test runner into its own script
So that we don't have to run with coverage if we don't want to.

Additionally, don't require being in the t directory to run tests

Signed-off-by: Stefano Rivera <stefano@rivera.za.net>
2021-06-06 14:45:09 -04:00
Elijah Newren
e5d8938d48 lint-history: explain how TMPDIR can be used
Some users may want to take advantage of setting TMPDIR to another
location that might be faster for the linting process.

Reported-by: @ruv on GitHub
Signed-off-by: Elijah Newren <newren@gmail.com>
2021-06-05 11:57:44 -07:00
Elijah Newren
ccc37d3423 lint-history: explain filename paths
It was not clear for some users that the filenames would be relative
paths from the toplevel of the repository.  Add some text to explain
this.

Reported-by: @ruv on GitHub
Signed-off-by: Elijah Newren <newren@gmail.com>
2021-06-05 11:57:44 -07:00
Elijah Newren
dc012d277b bfg-ish: add some sanity checks on the specified repo
Signed-off-by: Elijah Newren <newren@gmail.com>
2021-06-05 11:39:03 -07:00
Elijah Newren
06fa059744 Merge branch 'bl/bfg-ish-relative-paths'
Signed-off-by: Elijah Newren <newren@gmail.com>
2021-06-05 11:33:47 -07:00
林博仁(Buo-ren Lin)
e732141363 Fix relative path compatibility for --replace-text and bfg_args.repo
Users could specify relative paths on the command line, and then also
provide a directory other than '.' for the repo.  Since we did an
unconditional os.chdir() to move into the repo, that would invalidate
the original relative paths.  Fix that by changing the relative paths
into absolute paths.

Signed-off-by: 林博仁(Buo-ren Lin) <Buo.Ren.Lin@gmail.com>
[en: tweaked commit message to explain the problem]
Signed-off-by: Elijah Newren <newren@gmail.com>
2021-06-05 10:44:34 -07:00
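
The fix described in e732141363 above can be sketched as follows: resolve user-supplied relative paths before the `os.chdir()` call, so they stay valid afterwards. The helper `enter_repo` and the file names are hypothetical and only illustrate the ordering.

```python
import os
import tempfile

def enter_repo(repo, replace_text_file):
    """Hypothetical helper: pin a user-supplied path, then change directory."""
    # Resolve while the cwd is still the directory the command was run from.
    pinned = os.path.abspath(replace_text_file)
    os.chdir(repo)
    return pinned

# Demo in a throwaway tree: top/expressions.txt and top/repo/
start = os.getcwd()
with tempfile.TemporaryDirectory() as top:
    os.mkdir(os.path.join(top, "repo"))
    with open(os.path.join(top, "expressions.txt"), "w") as f:
        f.write("example\n")
    os.chdir(top)
    try:
        pinned = enter_repo("repo", "expressions.txt")
        # The original relative path no longer resolves from inside repo/ ...
        relative_still_valid = os.path.exists("expressions.txt")
        # ... but the absolute path captured before chdir does.
        pinned_still_valid = os.path.exists(pinned)
    finally:
        os.chdir(start)  # leave the temp tree before it is deleted
```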
Elijah Newren
75e67bcd44 git-filter-repo.txt: link to GitHub docs on purging old history
Signed-off-by: Elijah Newren <newren@gmail.com>
2021-04-07 11:15:16 -07:00
Elijah Newren
12743def48 git-filter-repo.txt: add some clarifications around replace refs
Signed-off-by: Elijah Newren <newren@gmail.com>
2021-04-07 11:13:55 -07:00
Elijah Newren
8683d6fe48 Merge branch 'js/windows-fixes'
Signed-off-by: Elijah Newren <newren@gmail.com>
2021-04-01 12:28:37 -07:00
Johannes Schindelin
fbaab1704c lint-history: do decode bytes
This fixes the "TypeError: a bytes-like object is required, not 'str'"
problem on Windows, letting t9391 pass.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2021-04-01 12:28:17 -07:00
Johannes Schindelin
e0a3df8c62 Fix the Python path on Windows
On Windows, we want to run with a native Python, i.e. the separator is a
semicolon, and the paths should be Windows paths (although they're
allowed to have forward slashes instead of backslashes).

Since we're most likely running this in an MSYS2 Bash, allow for
`$TEST_DIRECTORY` to pretend to be a Unix path, and translate it via
`cygpath` into a Windows path.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2021-04-01 12:28:17 -07:00
Elijah Newren
3f181531df README.md: link to external formatting of user manual
Some people don't like htmlpreview.github.io.  I once or twice saw a
case where it appeared to be affected by load limits.  Since external
sites are making the manual available, and it's unlikely there are too
many changes between the last release and the current manual, just link
to it as an alternative for folks.

Signed-off-by: Elijah Newren <newren@gmail.com>
2021-04-01 12:15:11 -07:00
Elijah Newren
d2fdc89ff3 filter-repo: avoid depending on wc binary being present
rev-list already has --count option anyway, so piping output to wc -l to
count the number of lines was a total waste of time.  Plus, it might
cause failures for the testsuite on some Windows boxes.

Signed-off-by: Elijah Newren <newren@gmail.com>
2021-04-01 12:15:11 -07:00
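
To illustrate d2fdc89ff3 above: `git rev-list --count` prints the number of commits directly, so no external `wc` binary is needed. The function below is a sketch with a hypothetical name; the `run` parameter exists only so the command can be exercised without a repository.

```python
import subprocess

def count_commits(rev_args, run=subprocess.check_output):
    """Count commits with `git rev-list --count` instead of piping to `wc -l`."""
    out = run(["git", "rev-list", "--count"] + list(rev_args))
    return int(out.strip())  # int() accepts the bytes that check_output returns

# Stand-in for a real git invocation, used here for demonstration only.
seen = []
def fake_run(cmd):
    seen.append(cmd)
    return b"42\n"

demo_count = count_commits(["--all"], run=fake_run)
```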
Elijah Newren
cf67ccd978 filter-repo: improve invalid repository error message
Even though the repository is encoded as a bytestring, we want error
messages to be UTF-8.

Signed-off-by: Elijah Newren <newren@gmail.com>
2021-04-01 12:14:17 -07:00
Elijah Newren
7500fb7c5a t9390: add a testcase for --path-rename with no colon
Commit 28b479b7 (Fix bug in --path-rename argument without colon,
2021-03-12) added a new conditional error message, with no corresponding
testcase to ensure the line was covered.  I forgot to check the coverage
before merging the change.  Add a relevant test now.

Signed-off-by: Elijah Newren <newren@gmail.com>
2021-03-30 01:01:23 -07:00
Elijah Newren
97a1613f81 lint-history: fix binary blob detection
We had a lingering issue in the conversion from python2 to python3; as
reported by @thebrandre on GitHub:

    any(x==b'1' for x in b"123")
    # returns True in Python2 and False in Python3 because different
    # types are returned on iteration:
    [type(x) for x in b"123"]
    # Python2: [<type 'str'>, <type 'str'>, <type 'str'>]
    # Python3: [<class 'int'>, <class 'int'>, <class 'int'>]

Replace the
    any(x==b"0" for x in blob.data[0:8192])
construct with
    b"\0" in blob.data[0:8192]
to fix this.

Suggested-by: @thebrandre on GitHub
Signed-off-by: Elijah Newren <newren@gmail.com>
2021-03-29 23:23:39 -07:00
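
The Python 3 behavior behind 97a1613f81 above can be reproduced in a few lines. This is a sketch of a NUL-byte check over the first 8192 bytes; the function name `looks_binary` is hypothetical.

```python
data = b"12\x003"

# Iterating over bytes yields ints in Python 3, so comparing each element
# to a bytes literal never matches.
element_types = {type(x) for x in data}
buggy = any(x == b"\0" for x in data)

# The containment test works on the bytes object as a whole.
fixed = b"\0" in data

def looks_binary(blob_data, window=8192):
    """Treat a blob as binary if a NUL byte occurs within the first window bytes."""
    return b"\0" in blob_data[:window]
```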
Elijah Newren
cf84943982 Merge branch 'lk/path-rename-colon-count'
Signed-off-by: Elijah Newren <newren@gmail.com>
2021-03-12 07:39:23 -08:00
Lassi Kortela
28b479b79d Fix bug in --path-rename argument without colon
The --path-rename flag expected an argument with a colon
character (':') in it, which it assumed without checking. If the user
gave an argument with no colon in it, this backtrace would be shown:

  File "/usr/local/bin/git-filter-repo", line 1626, in __call__
    if values[0] and values[1] and not (
IndexError: list index out of range

Add a real error message in place of the backtrace.

Also check that there's exactly one colon; show an error message if
there's more than one, as that syntax has no interpretation that is
obviously the right one.

Signed-off-by: Lassi Kortela <lassi@lassi.io>
2021-03-12 15:30:18 +02:00
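
A minimal sketch of the validation described in 28b479b79d above: reject the argument unless it contains exactly one colon, rather than indexing into the split result unchecked. The function name and the error text are hypothetical, not filter-repo's own.

```python
def parse_path_rename(value):
    """Split an OLD:NEW rename spec, insisting on exactly one colon."""
    if value.count(b":") != 1:
        raise ValueError(
            "--path-rename expects exactly one colon, as in OLD_NAME:NEW_NAME")
    old, new = value.split(b":")
    return old, new

def is_rejected(value):
    """Report whether parse_path_rename refuses the given spec."""
    try:
        parse_path_rename(value)
    except ValueError:
        return True
    return False
```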
Elijah Newren
4987e0f6e3 filter-repo: fix --use-mailmap
--use-mailmap was defined as `--mailmap .mailmap` except that it would
set args.mailmap to ".mailmap" rather than b".mailmap" (in other words,
it accidentally set it to a string rather than a bytestring).  Since
the --mailmap parameter is always passed as a bytestring, we ran into
errors with calling unknown functions due to the type mismatch.

Signed-off-by: Elijah Newren <newren@gmail.com>
2021-03-12 01:07:35 -08:00
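
The type mismatch in 4987e0f6e3 above is easy to reproduce with `argparse`, because a `type=` converter applies to values given on the command line but not to a `const`. The parser below is a sketch under that assumption and does not claim to match filter-repo's actual option definitions.

```python
import argparse
import os

parser = argparse.ArgumentParser()
# Values typed by the user are converted to bytes by the type= callable.
parser.add_argument("--mailmap", type=os.fsencode)
# const is stored as-is, so it has to be written as bytes explicitly.
parser.add_argument("--use-mailmap", dest="mailmap",
                    action="store_const", const=b".mailmap")

explicit = parser.parse_args(["--mailmap", "my-mailmap"]).mailmap
shortcut = parser.parse_args(["--use-mailmap"]).mailmap
```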
Elijah Newren
407d15dd29 Merge pull request #167 from dscho/meaow
Add a GitHub workflow for continuous testing

Signed-off-by: Elijah Newren <newren@gmail.com>
2020-10-21 16:10:30 -07:00
Johannes Schindelin
d28b2a7346 Add a GitHub workflow to test this thing
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2020-10-22 00:29:26 +02:00
Johannes Schindelin
d0dcece202 t9391: guard dos2unix use behind a prereq
Not all setups have `dos2unix`. Most notably, the Ubuntu and macOS
agents of GitHub Actions don't.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2020-10-22 00:29:26 +02:00
Johannes Schindelin
85afdf9da9 t9391: don't rely on the system gitconfig defining core.autocrlf=false
The test case t9391.12 specifically wants to test LF vs CR/LF line
ending issues, expecting `core.autoCRLF` to default to `false`. This is
true on Linux and macOS and pretty much everywhere else, except on
Windows.

Let's make sure that the test operates with the `core.autoCRLF` value it
assumes to operate under.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2020-10-22 00:29:26 +02:00
Johannes Schindelin
fe79ec9912 t9390: work around yet another Unix<->Win32 path issue
On Windows, there is no absolute path `/fake/path`, but MSYS2 (which Git
for Windows uses e.g. for running Bash scripts) pretends that it exists.
This only works within MSYS2 applications, of course, so... when MSYS2
sees that we hand a parameter to a non-MSYS2 application in a shell
script, it helpfully converts it to the full path (prepending MSYS2's
pseudo root directory).

Let's work around that by using a Win32-compatible path to begin with:
`$(pwd)` produces that on Windows. On other platforms, it still works.

As a bonus, this safeguards our test against a setup where `/fake/path`
_actually exists_. Stranger things have been seen in the wild, after
all.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2020-10-22 00:29:26 +02:00
Johannes Schindelin
848cd652f0 t9390: work around clash with MSYS2's Unix<->Win32 path conversion
MSYS2 tries to be very helpful, and in most cases it even works, by
converting parameters passed from inside an MSYS2 Bash to a non-MSYS2
application (such as `git.exe`) if they look like Unix-style paths or
path lists.

Sometimes, however, this automatic path conversion is unhelpful, e.g.
when passing the parameter `foo:.` to Git, which MSYS2 will readily
convert to a Windows-style path list: `foo;bar` (i.e. using a semicolon
instead of a colon).

Happily, there is a way to avoid that: the `MSYS_NO_PATHCONV` variable.
Let's use it.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2020-10-22 00:29:25 +02:00
Johannes Schindelin
6967fad156 t9390: avoid using colrm
While it is true that `colrm` is available on macOS by default, and even
in Ubuntu (thanks to the `bsdmainutils` package), it is not available on
Windows.

Let's use `cut` instead.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2020-10-22 00:29:25 +02:00
Johannes Schindelin
e6ffeded2e t9390: avoid using Bash-ism <(...)
The problem with this is that on Windows, we use the MSYS2 Bash which
uses the POSIX emulation layer called "MSYS2 runtime" that pretends that
there _is_ something like the `/dev/fd/` namespace, and tells `git.exe`
about it, but `git.exe` does not use the POSIX emulation layer, and
hence has no idea what Bash is talking about.

Besides, we should avoid pipes, just as we do in the Git project.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2020-10-22 00:29:25 +02:00
Johannes Schindelin
8bc195673c t9390: close link of broken &&-chain
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2020-10-22 00:29:25 +02:00
Johannes Schindelin
f1ee28d78f t9390: expect the correct line count in --strip-blobs-with-ids
In that test case, we expect the line count to be 5, but it is actually
6 lines that we should expect:

	numbers/medium.num
	numbers/small.num
	sequence/know
	whatever
	words/know

Note the empty line at the top: this list is generated via `git log
--format=%n`, and that `%n` stands for "newline", meaning that we _must_
expect an empty line.

This expectation seems to have been broken already in the commit that
added the test case: b6a35f8 (filter-repo: implement
--strip-blobs-with-ids, 2019-05-30). It was hidden for such a long time
by a broken &&-chain, which we will fix next.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2020-10-22 00:29:25 +02:00
Johannes Schindelin
6c475a7e09 t9390: use the correct prereq when using "funny" file names
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2020-10-22 00:29:25 +02:00
Johannes Schindelin
580e0f0395 Test data and scripts must have Unix line endings
The tests will otherwise fail.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2020-10-22 00:29:14 +02:00
Johannes Schindelin
453128fff7 Ignore the generated Python cache
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2020-10-21 22:33:38 +02:00
Elijah Newren
74ea810872 INSTALL.md: add notes about common installation issues
Signed-off-by: Elijah Newren <newren@gmail.com>
2020-10-19 08:20:53 -07:00
Elijah Newren
9282a33a02 git-filter-repo.txt: regexes & globs apply to entire file, not to lines
Signed-off-by: Elijah Newren <newren@gmail.com>
2020-10-19 08:10:08 -07:00
Elijah Newren
93ee4ae907 Merge branch 'mw/empty-author-name' into main
Signed-off-by: Elijah Newren <newren@gmail.com>
2020-10-17 17:07:27 -07:00
Martin Wilck
282f8ddb9b filter-repo: only set author from committer if author email not set
Some commits may have a valid author email, but no valid author name.
Old versions of git didn't enforce a non-empty name.
Setting the author data from the committer is wrong in this case.

Also add a test case for this to t9390.

Example: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c6295cdf656de63d6d1123def71daba6cd91939c

(en: replaced with a dedicated test instead of tweaking existing ones)

Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
2020-10-17 17:06:53 -07:00
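
The rule from 282f8ddb9b above can be sketched as a small function: fall back to the committer only when the author email is missing, and leave an empty author name alone. The function name and signature are hypothetical.

```python
def resolve_author(author_name, author_email, committer_name, committer_email):
    """Use committer data only when the commit has no author email at all."""
    if not author_email:
        return committer_name, committer_email
    # An empty name with a valid email is kept as-is.
    return author_name, author_email
```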
Elijah Newren
7eaaf191de filter-repo: correctly prune nested tags not matching filtering criteria
When the user specifies some kind of criteria to filter commits by (e.g.
--subdirectory-filter mysubdir), we rewrite parent commits that are
entirely filtered out to the most recent ancestor that still exists, or
just prune the parent if there isn't one.  That works great when the
parent is a commit, but nested tags have parents that are tags.  If we
only prune the first tag (i.e. the tag of a commit), then letting any
tags through that had that tag as a parent will result in a fast-import
crash with a message of the form

   fatal: mark :35390 not declared

Ensure that when a tag gets pruned, the pruning is recorded as such...so
that any children tags will get pruned as well.

Signed-off-by: Elijah Newren <newren@gmail.com>
2020-10-17 12:14:18 -07:00
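
The propagation described in 7eaaf191de above can be modeled with a set of pruned marks. This is a simplified sketch, not filter-repo's data model: tags are hypothetical `(mark, target_mark)` pairs, and it assumes each tag appears in the stream after the object it points to.

```python
def prune_tags(tags, pruned_marks):
    """Drop tags whose target was pruned, recording each dropped tag as pruned.

    tags: iterable of (mark, target_mark) in stream order.
    pruned_marks: marks of objects already filtered out.
    Returns (kept_marks, all_pruned_marks).
    """
    pruned = set(pruned_marks)
    kept = []
    for mark, target in tags:
        if target in pruned:
            # Recording the pruning is what lets tags of this tag be pruned too.
            pruned.add(mark)
        else:
            kept.append(mark)
    return kept, pruned

# Commit 1 was filtered out; tag 10 points at it, and tag 11 points at tag 10.
kept, pruned = prune_tags([(10, 1), (11, 10), (12, 2)], {1})
```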
Elijah Newren
b1606ba8ac Merge branch 'mr/fix-filter-lamely-name-error' into main
Signed-off-by: Elijah Newren <newren@gmail.com>
2020-10-07 10:52:54 -07:00
Elijah Newren
f9a54f36d9 Merge branch 'tm/fix-typo' into main
Signed-off-by: Elijah Newren <newren@gmail.com>
2020-10-07 10:51:58 -07:00
Marius Renner
70f83c2526 filter-lamely: fix NameError because of forgotten fr module prefix
In repositories with annotated tags filter-lamely crashes with the
message: "NameError: name 'Reset' is not defined".

This is because of a missing "fr" module prefix in the code, which this
commit adds.

Signed-off-by: Marius Renner <marius@mariusrenner.de>
2020-10-06 16:27:39 +02:00
Tom Matthews
96959d1174 converting-from-bfg-repo-cleaner.md: fix typo
Signed-off-by: Tom Matthews <trcm@pm.me>
2020-10-06 11:04:47 +01:00
Elijah Newren
7b3e714b94 filter-repo (README): remove outdated 2.28.0-not-yet-released comment
Signed-off-by: Elijah Newren <newren@gmail.com>
2020-07-27 11:34:33 -07:00
Elijah Newren
d79ea709b7 filter-repo: fix crash from assuming parent is an int
When filtering with --refs, parents can be a hash rather than an
integer.  There was a code path in RepoFilter._prunable() that was
written assuming the first parent would always be an integer; fix it to
handle a hash as well.

Reported-by: Niklas Hambüchen <mail@nh2.me>
Signed-off-by: Elijah Newren <newren@gmail.com>
2020-07-27 10:52:59 -07:00
Elijah Newren
4b452da4ef Merge branch 'jb/ignore-generated-docs' into main
Signed-off-by: Elijah Newren <newren@gmail.com>
2020-07-27 10:03:01 -07:00
Elijah Newren
e4960a53f8 Fix undefined variable names
Reported-by: Christian Clauss <cclauss@me.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
2020-07-27 09:49:43 -07:00