523 Commits

Author SHA1 Message Date
Elijah Newren
2c769de150 filter-repo: work around git-fast-export bug
Explicitly specify --topo-order; git-fast-export fails on some topologies
unless it traverses in topological order.

Signed-off-by: Elijah Newren <newren@gmail.com>
2019-02-09 14:42:23 -08:00
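
The workaround described in this commit amounts to adding one flag to the export command line. A minimal sketch follows, assuming the command is built as an argv list; the function name `build_fast_export_cmd` and the default `--all` ref selection are hypothetical and are not taken from the commit itself.

```python
def build_fast_export_cmd(refs=("--all",)):
    """Return an argv list for git fast-export that requests topological order.

    Hypothetical helper for illustration; the real script may build its
    command differently.
    """
    # --topo-order is a revision-walk ordering option; it makes sure no
    # commit is emitted before all of its children have been shown, and
    # it is the flag this commit says it now specifies explicitly.
    return ["git", "fast-export", "--topo-order", *refs]


cmd = build_fast_export_cmd()
```

The resulting list can be handed to `subprocess.Popen(cmd, stdout=subprocess.PIPE)` when run inside a repository.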
Elijah Newren
bf5e92d02a filter-repo: portability fixes
Signed-off-by: Elijah Newren <newren@gmail.com>
2019-02-09 14:42:23 -08:00
Elijah Newren
471e9d8684 filter-repo: rewrite to not use pyparsing in order to avoid memory madness
pyparsing reads a whole file into memory and only then parses it, which
is really bad in this case since the output from git-fast-export is huge.
I hit heavy disk swapping pretty easily.  So, now I just do my own
manual parsing.

Signed-off-by: Elijah Newren <newren@gmail.com>
2019-02-09 14:42:21 -08:00
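
To illustrate the kind of manual parsing this commit describes, here is a rough sketch of a reader that walks a fast-export style stream one line at a time instead of loading the whole thing. It is not the code from this commit: the name `iter_records` is hypothetical, only the exact byte count form `data <count>` is handled (the delimited `data <<EOF` form is not), and a single payload is still read into memory in one piece.

```python
import io


def iter_records(stream):
    """Yield (command_line, payload) pairs from a binary stream.

    Only the current line, or the current data payload, is held in
    memory at any time. Hypothetical helper for illustration.
    """
    while True:
        line = stream.readline()
        if not line:
            return  # end of stream
        if line.startswith(b"data "):
            # Exact byte count format: read precisely that many bytes.
            size = int(line[len(b"data "):])
            payload = stream.read(size)
            if len(payload) != size:
                raise ValueError("truncated data payload")
            yield line.rstrip(b"\n"), payload
        elif line != b"\n":
            # Blank separator lines are skipped; everything else is a command.
            yield line.rstrip(b"\n"), None


sample = io.BytesIO(b"blob\nmark :1\ndata 5\nhello\nreset refs/heads/master\n")
records = list(iter_records(sample))
```

In real use the stream would be the stdout pipe of a `git fast-export` process rather than an in-memory buffer, and records would be processed as they arrive instead of being collected into a list.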
Elijah Newren
ae486e85b8 filter-repo: small restructurings for the big sierra import
* Allow hooking up (and filtering) multiple git fast-export processes to
  one import
* Allow user callbacks to force dumping of an object in order to reference
  it from subsequently inserted objects
* Put the separate callbacks and global vars in the calling program into a
  combined class

Signed-off-by: Elijah Newren <newren@gmail.com>
2019-02-09 14:41:13 -08:00
Elijah Newren
69497ac6e6 filter-repo: add get_total_commits() function, finish transition to a module
Signed-off-by: Elijah Newren <newren@gmail.com>
2019-02-09 14:40:37 -08:00
Elijah Newren
28cc91054e filter-repo: fix handling of ids of blobs and commits
My prior handling of marks only worked if nothing was added to or
removed from the fast-export stream.  Further, I referred to these as
marks even though I really only accept idnum values, not sha1s or anything
else.  So, now I refer to these as ids everywhere, and I am much more
careful in my handling of ids.

Signed-off-by: Elijah Newren <newren@gmail.com>
2019-02-09 14:39:57 -08:00
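
The problem this commit describes is that ids stop lining up once objects are inserted into or dropped from the stream. One possible way to handle that, sketched here purely as an assumption about the approach, is to allocate output ids independently and keep a translation table from input ids. The class `IdMap` and its methods are hypothetical names and do not come from the commit.

```python
class IdMap:
    """Hand out fresh output ids and translate input ids to them.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        self._next = 1
        self._old_to_new = {}

    def new(self):
        """Allocate an id for an object that was not in the input stream."""
        new_id = self._next
        self._next += 1
        return new_id

    def record(self, old_id):
        """Allocate an output id for an input object and remember the mapping."""
        new_id = self.new()
        self._old_to_new[old_id] = new_id
        return new_id

    def translate(self, old_id):
        """Return the output id for an input id, or None if it was dropped."""
        return self._old_to_new.get(old_id)


ids = IdMap()
inserted = ids.new()     # object inserted by a callback, no input id
kept_first = ids.record(1)
# input objects 2 and 3 are filtered out, so they are never recorded
kept_last = ids.record(4)
```

With this arrangement a reference to input id 4 is rewritten to `ids.translate(4)`, and a reference to a dropped object shows up as `None` rather than silently pointing at the wrong object.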
Elijah Newren
94f0ccfd80 filter-repo: call everything_callback as necessary, fix commit_callback
The commit_callback call was trying to pass a Reset object, which was
not defined.  Copy-n-paste-n-forget-to-replace isn't good.  Now it passes
a Commit object.

Signed-off-by: Elijah Newren <newren@gmail.com>
2019-02-09 14:39:57 -08:00
Elijah Newren
9cd296655a filter-repo: rename functions a bit, make filter object creation explicit
Signed-off-by: Elijah Newren <newren@gmail.com>
2019-02-09 14:39:55 -08:00
Elijah Newren
207c6d0c16 filter-repo: pipe output to git-fast-import now to create a new repository
Signed-off-by: Elijah Newren <newren@gmail.com>
2019-02-09 14:39:13 -08:00
Elijah Newren
0d9568684c filter-repo: match git-fast-export spacing after reset commands
Signed-off-by: Elijah Newren <newren@gmail.com>
2019-02-09 14:38:51 -08:00
Elijah Newren
003dd21714 filter-repo: add ability to handle deleted files in commits
Signed-off-by: Elijah Newren <newren@gmail.com>
2019-02-09 14:38:51 -08:00
Elijah Newren
c92a4e471e filter-repo: fix parsing bug in Reset object creation
Signed-off-by: Elijah Newren <newren@gmail.com>
2019-02-09 14:38:51 -08:00
Elijah Newren
b029443a6f filter-repo: fix indexing bug in Commit object creation
Signed-off-by: Elijah Newren <newren@gmail.com>
2019-02-09 14:38:51 -08:00
Elijah Newren
392d09d084 filter-repo: don't hardcode sys.stdout, I'll eventually want to pipe elsewhere
Signed-off-by: Elijah Newren <newren@gmail.com>
2019-02-09 14:38:51 -08:00
Elijah Newren
11057e874e filter-repo: add a FileChanges object, for changes that are part of a commit
Signed-off-by: Elijah Newren <newren@gmail.com>
2019-02-09 14:38:51 -08:00
Elijah Newren
586d65270b filter-repo: add parsing of commits
Signed-off-by: Elijah Newren <newren@gmail.com>
2019-02-09 14:38:51 -08:00
Elijah Newren
f6f4e5fbbf filter-repo: match fast-import grammar slightly better
Signed-off-by: Elijah Newren <newren@gmail.com>
2019-02-09 14:38:51 -08:00
Elijah Newren
ff95c771d8 filter-repo: prevent pyparsing from expanding tabs to spaces
We are not parsing simple text; we're parsing data and need to be able to
print that data unmunged.

Signed-off-by: Elijah Newren <newren@gmail.com>
2019-02-09 14:38:51 -08:00
Elijah Newren
de7aeb64bc filter-repo: add parsing of branch resets
Signed-off-by: Elijah Newren <newren@gmail.com>
2019-02-09 14:38:51 -08:00
Elijah Newren
f990dda9ad filter-repo: allow random blob insertion and creation without specifying marks
Signed-off-by: Elijah Newren <newren@gmail.com>
2019-02-09 14:38:51 -08:00
Elijah Newren
163e299ed7 filter-repo: handle multiple blobs, require all input to be parsed, nice errors
Signed-off-by: Elijah Newren <newren@gmail.com>
2019-02-09 14:38:51 -08:00
Elijah Newren
eb4afc4e78 filter-repo: add GitElement and Blob classes, and a FastExport Parser class
We still only parse a single blob, but this should put the infrastructure
in place for parsing more output from git-fast-export.

Signed-off-by: Elijah Newren <newren@gmail.com>
2019-02-09 14:38:47 -08:00
Elijah Newren
2b34e5c25d filter-repo: initial import
This initial version can parse git-fast-export blobs in exact-data format,
but not much else yet.

Signed-off-by: Elijah Newren <newren@gmail.com>
2019-01-29 15:17:24 -08:00
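
For readers unfamiliar with the exact-data format mentioned in the initial import, a blob in the stream is a `blob` line, an optional `mark :<id>` line, and a `data <count>` line followed by exactly `<count>` raw bytes. The sketch below parses one such record. The `Blob` dataclass and `parse_blob` function are hypothetical names, not the classes from this repository, and other optional lines that newer fast-export versions can emit (such as `original-oid`) are not handled.

```python
import io
from dataclasses import dataclass
from typing import Optional


@dataclass
class Blob:
    """Hypothetical container for one parsed blob record."""
    mark: Optional[int]
    data: bytes


def parse_blob(stream):
    """Parse a single blob record in exact byte count format."""
    if stream.readline() != b"blob\n":
        raise ValueError("expected 'blob' command")
    line = stream.readline()
    mark = None
    if line.startswith(b"mark :"):
        mark = int(line[len(b"mark :"):])
        line = stream.readline()
    if not line.startswith(b"data "):
        raise ValueError("expected 'data <count>' line")
    size = int(line[len(b"data "):])
    # The payload may contain newlines, so it is read by length, not by line.
    data = stream.read(size)
    if len(data) != size:
        raise ValueError("truncated blob data")
    return Blob(mark, data)


blob = parse_blob(io.BytesIO(b"blob\nmark :7\ndata 12\nhello\nworld\n\n"))
```

Reading by byte count rather than by line is what lets the payload pass through unmodified, including tabs, newlines, and binary content.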