Commit Graph

56 Commits

Author SHA1 Message Date
Elijah Newren
cbacb6cd82 filter-repo: simplify import in lib-usage examples
Python wants filenames with underscores instead of hyphens and with a
.py extension.  We really want the main file named git-filter-repo, but
we can add a git_filter_repo.py symlink.  Doing so dramatically
simplifies the steps needed to import it as a library in external python
scripts.

Signed-off-by: Elijah Newren <newren@gmail.com>
2019-04-26 07:56:03 -07:00
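
The symlink trick from cbacb6cd82 can be illustrated with a small self-contained sketch. The script name `my-tool` and its `greet()` function are hypothetical stand-ins, not part of filter-repo. The example assumes a platform where `os.symlink` is permitted.

```python
import importlib
import os
import sys
import tempfile

# Create a throwaway directory holding a script with a hyphenated name
# and no .py extension (hypothetical stand-in for git-filter-repo).
workdir = tempfile.mkdtemp()
script = os.path.join(workdir, "my-tool")
with open(script, "w") as handle:
    handle.write("def greet():\n    return 'hello from my-tool'\n")

# "import my-tool" is a syntax error, and the file lacks a .py suffix.
# A symlink with an importable name sidesteps both problems.
os.symlink("my-tool", os.path.join(workdir, "my_tool.py"))

# With the directory on sys.path, an ordinary import now works.
sys.path.insert(0, workdir)
my_tool = importlib.import_module("my_tool")
print(my_tool.greet())
```

Without the symlink, a caller would have to load the file by explicit path through `importlib` machinery, which is presumably the kind of extra step the commit message says is being removed.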
Elijah Newren
6dba1f200c filter-repo: avoid string->datetime->string round trips
Most filtering operations are not interested in the time that commits
were authored or committed, or when tags were tagged.  As such,
translating the string representation of the date into a datetime object
is wasted effort, and causes us to waste more time later as we have to
translate it back into a string.

Instead, provide string_to_date() and date_to_string() functions so that
callers can perform the translation if wanted, and let the normal case
be fast.

Provides a small but noticeable speedup when just filtering based on
paths; about a 3.5% improvement in execution time for writing the new
history.

Signed-off-by: Elijah Newren <newren@gmail.com>
2019-04-26 07:56:03 -07:00
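
As a rough sketch of the kind of conversion 6dba1f200c leaves to callers, the pair below translates between a date string and a datetime object. The names `parse_raw_date` and `format_raw_date` are hypothetical, and the `<epoch seconds> <+|-HHMM>` layout is an assumption based on git's raw date format. The signatures of the real string_to_date() and date_to_string() are not shown in the commit message.

```python
from datetime import datetime, timedelta, timezone


def parse_raw_date(raw):
    """Parse '<epoch seconds> <+|-HHMM>' into a timezone-aware datetime."""
    seconds, offset = raw.split()
    sign = -1 if offset[0] == "-" else 1
    delta = sign * timedelta(hours=int(offset[1:3]), minutes=int(offset[3:5]))
    return datetime.fromtimestamp(int(seconds), timezone(delta))


def format_raw_date(when):
    """Inverse of parse_raw_date; expects a timezone-aware datetime."""
    total_minutes = int(when.utcoffset().total_seconds()) // 60
    sign = "-" if total_minutes < 0 else "+"
    hours, minutes = divmod(abs(total_minutes), 60)
    return "%d %s%02d%02d" % (int(when.timestamp()), sign, hours, minutes)


raw = "1556290563 -0700"
when = parse_raw_date(raw)
print(when.isoformat())
print(format_raw_date(when) == raw)
```

A filter that never inspects dates can pass the string through untouched and skip both calls, which is the fast path the commit describes.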
Elijah Newren
73e91edecc filter-repo: add text removal (or replacement) via file of expressions
Make it easy for users to search and replace text throughout the
repository history.  Instead of inventing some new syntax, reuse the
same syntax used by BFG repo filter's --replace-text option, namely,
a file with one expression per line of the form

    [regex:|glob:|literal:]$MATCH_EXPR[==>$REPLACEMENT_EXPR]

Where "$MATCH_EXPR" is by default considered to be literal text, but
could be a regex or a glob if the appropriate prefix is used.  Also,
$REPLACEMENT_EXPR defaults to '***REMOVED***' if not specified.  If
you want a literal '==>' to be part of your $MATCH_EXPR, then you
must also manually specify a replacement expression instead of taking
the default.  Some examples:

    sup3rs3kr3t
    (replaces 'sup3rs3kr3t' with '***REMOVED***')

    HeWhoShallNotBeNamed==>Voldemort
    (replaces 'HeWhoShallNotBeNamed' with 'Voldemort')

    very==>
    (replaces 'very' with the empty string)

    regex:(\d{2})/(\d{2})/(\d{4})==>\2/\1/\3
    (replaces '05/17/2012' with '17/05/2012', and vice-versa)

    The format for regex is as from
    re.sub(<pattern>, <repl>, <string>) from
    https://docs.python.org/2/library/re.html
    The <string> comes from file contents of the repo, and you specify
    the <pattern> and <repl>.

    glob:Copy*t==>Cartel
    (replaces 'Copyright' or 'Copyleft' or 'Copy my st' with 'Cartel')

Signed-off-by: Elijah Newren <newren@gmail.com>
2019-04-26 07:56:03 -07:00
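
The expression syntax described in 73e91edecc can be sketched as a small parser. This is an illustration under assumptions, not filter-repo's code: it works on `str` rather than the raw bytes of repository files, it splits on the last '==>' (one reading of the rule about a literal '==>' in $MATCH_EXPR), and its glob support covers only `*`. The helper names are hypothetical.

```python
import re

DEFAULT_REPLACEMENT = "***REMOVED***"


def parse_expression(line):
    """Return (compiled_pattern, replacement) for one expression line."""
    kind = "literal"
    for prefix in ("literal", "regex", "glob"):
        if line.startswith(prefix + ":"):
            kind, line = prefix, line[len(prefix) + 1:]
            break
    # Split on the last '==>' so an earlier '==>' stays in the match text.
    match, sep, replacement = line.rpartition("==>")
    if not sep:
        match, replacement = line, DEFAULT_REPLACEMENT
    if kind == "regex":
        # Replacement keeps backreferences such as \1, as in re.sub.
        return re.compile(match), replacement
    # For literal and glob, backslashes in the replacement are plain text.
    replacement = replacement.replace("\\", "\\\\")
    if kind == "glob":
        pattern = ".*".join(re.escape(part) for part in match.split("*"))
        return re.compile(pattern), replacement
    return re.compile(re.escape(match)), replacement


def apply_expressions(text, lines):
    """Apply each expression line to text, in order."""
    for line in lines:
        pattern, replacement = parse_expression(line)
        text = pattern.sub(replacement, text)
    return text


print(apply_expressions("password = sup3rs3kr3t", ["sup3rs3kr3t"]))
print(apply_expressions("05/17/2012",
                        [r"regex:(\d{2})/(\d{2})/(\d{4})==>\2/\1/\3"]))
```

Because the glob is translated to a greedy `.*`, a pattern such as `Copy*t` will swallow everything up to the last `t` on a line. A stricter tool might want a narrower translation.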
Elijah Newren
dd438dc455 filter-repo: add mailmap handling
Signed-off-by: Elijah Newren <newren@gmail.com>
2019-04-26 07:56:03 -07:00
Elijah Newren
a5d4d70876 filter-repo: add some testcases making use of filter-repo as a library
Signed-off-by: Elijah Newren <newren@gmail.com>
2019-04-26 07:56:03 -07:00
Elijah Newren
17a2f7102d filter-repo: add some basic tests, with git-style test-lib.sh
Signed-off-by: Elijah Newren <newren@gmail.com>
2019-03-12 14:19:38 -07:00