filter-repo: modify parse_optional_parent_ref to return original parent too

Commits may not have any parents at all.  As such,
parse_optional_parent_ref() is used expecting that it will sometimes
return None.

Now, when commits are skipped, we have a scheme to translate anything
that depends on such commits to instead depend on the nearest surviving
ancestor of such commits.  If a commit was skipped along with its entire
ancestry, then that commit will be translated to None, which is
indistinguishable from there having been no parent to begin with.
Sometimes our scheme needs to distinguish between a commit that started
with no parents and one which ended up with no parents, so we need a way
to tell these apart.
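
As a rough illustration of the ambiguity (a minimal sketch, not
filter-repo's actual code: SkipTable, record_skip, and parent_pair are
hypothetical names standing in for the real mark-translation table):

```python
# Hypothetical stand-in for the mark-translation table; this is NOT
# filter-repo's real _IDS object, only a sketch of the idea.
class SkipTable:
    def __init__(self):
        # skipped mark -> nearest ancestor (which may itself be skipped, or None)
        self._skipped = {}

    def record_skip(self, mark, nearest_ancestor):
        self._skipped[mark] = nearest_ancestor

    def translate(self, mark):
        # Follow the chain of skipped commits until we reach a kept one or None.
        while mark in self._skipped:
            mark = self._skipped[mark]
        return mark


def parent_pair(table, orig_parent):
    """Return (original parent, translated parent); either may be None."""
    if orig_parent is None:
        return None, None
    return orig_parent, table.translate(orig_parent)


table = SkipTable()
table.record_skip(2, 1)     # commit 2 was skipped; its parent was commit 1
table.record_skip(1, None)  # commit 1, a root commit, was skipped as well

root = parent_pair(table, None)  # a commit that never had a parent
pruned = parent_pair(table, 2)   # a commit whose whole ancestry was pruned
print(root, pruned)
```

The translated halves of both pairs are None, so only the original half
tells the two cases apart.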

Also, not knowing the original parent makes it hard for us to
determine if the original had the same weird topology that the current
commit does.  For example, it is possible for a merge commit to have
one parent be the ancestor of another (particularly when --no-ff is
passed to git merge), or even for a merge commit to have the same
commit used as both parents (if you use low-level commands to create
a crazy commit).  There are cases where the pruning of some commits
could cause either of these situations to arise, and it's useful to be
able to distinguish between intentionally "weird" history and history
that has been made weird due to other pruning, because we may have
reason to do additional pruning on the latter.
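
One way such a check could look, covering only the "same commit used as
both parents" case (a sketch under assumptions: became_degenerate is a
hypothetical helper, and detecting the ancestor-of-another-parent case
would additionally need access to the commit graph):

```python
def became_degenerate(orig_parents, new_parents):
    """Return True if pruning, rather than the original history, caused
    two parents of a commit to coincide."""
    orig = [p for p in orig_parents if p is not None]
    new = [p for p in new_parents if p is not None]
    had_dupes = len(set(orig)) < len(orig)  # duplicates in the original
    has_dupes = len(set(new)) < len(new)    # duplicates after translation
    return has_dupes and not had_dupes


# Parents 5 and 7 were both pruned and both translate to commit 3.
made_weird = became_degenerate((5, 7), (3, 3))
# The original commit already listed commit 3 twice.
always_weird = became_degenerate((3, 3), (3, 3))
# Nothing was pruned.
untouched = became_degenerate((5, 7), (5, 7))
```

Only the first case is a candidate for additional pruning; the second
was intentionally "weird" and the third is an ordinary merge.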

Signed-off-by: Elijah Newren <newren@gmail.com>
Elijah Newren 2018-12-15 09:20:40 -08:00
parent ab1b43f480
commit 70e6f848ed

@@ -824,19 +824,21 @@ class FastExportFilter(object):
     that the name of the reference ('from', 'merge') must match the
     refname arg.
     """
-    baseref = None
+    orig_baseref, baseref = None, None
     matches = re.match('%s :(\d+)\n' % refname, self._currentline)
     if matches:
+      orig_baseref = int(matches.group(1)) + self._id_offset
       # We translate the parent commit mark to what it needs to be in
       # our mark namespace
-      baseref = _IDS.translate( int(matches.group(1))+self._id_offset )
+      baseref = _IDS.translate(orig_baseref)
       self._advance_currentline()
     else:
       matches = re.match('%s ([0-9a-f]{40})\n' % refname, self._currentline)
       if matches:
-        baseref = matches.group(1)
+        orig_baseref = matches.group(1)
+        baseref = orig_baseref
         self._advance_currentline()
-    return baseref
+    return orig_baseref, baseref
   def _parse_optional_filechange(self):
     """
@@ -980,7 +982,7 @@ class FastExportFilter(object):
     """
     # Parse the Reset
     ref = self._parse_ref_line('reset')
-    from_ref = self._parse_optional_parent_ref('from')
+    ignoreme, from_ref = self._parse_optional_parent_ref('from')
     if self._currentline == '\n':
       self._advance_currentline()
@@ -1054,12 +1056,13 @@ class FastExportFilter(object):
         self._translate_commit_hash,
         commit_msg)
-    parents = [self._parse_optional_parent_ref('from')]
+    pinfo = [self._parse_optional_parent_ref('from')]
     # Due to empty pruning, we can have real 'from' and 'merge' lines that
     # due to commit rewriting map to a parent of None. We need to record
     # 'from' if its non-None, and we need to parse all 'merge' lines.
     while self._currentline.startswith('merge '):
-      parents.append(self._parse_optional_parent_ref('merge'))
+      pinfo.append(self._parse_optional_parent_ref('merge'))
+    orig_parents, parents = zip(*pinfo)
     # Since we may have added several 'None' parents due to empty pruning,
     # get rid of all the non-existent parents
     parents = [x for x in parents if x is not None]
@@ -1068,7 +1071,7 @@ class FastExportFilter(object):
     if not parents:
       parents.append(None)
-    was_merge = len(parents) > 1
+    was_merge = len(orig_parents) > 1
     # Remove redundant parents (if both sides of history are empty commits,
     # the most recent ancestor on both sides may be the same commit).
     parents = collections.OrderedDict.fromkeys(parents).keys()
@@ -1218,7 +1221,7 @@ class FastExportFilter(object):
     """
     # Parse the Tag
     tag = self._parse_ref_line('tag')
-    from_ref = self._parse_optional_parent_ref('from')
+    ignoreme, from_ref = self._parse_optional_parent_ref('from')
     original_id = None
     if self._currentline.startswith('original-oid'):
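
For readers unfamiliar with the zip(*pinfo) idiom used in the commit-parsing
hunk, the following standalone sketch shows how a list of (original,
translated) pairs is split into two sequences, and why measuring was_merge
on the original parents gives a different answer once a parent has been
pruned. The mark values are made up for illustration; only the names pinfo,
orig_parents, parents, and was_merge come from the diff.

```python
# Each entry mimics what _parse_optional_parent_ref() now returns:
# (original parent, translated parent).  The values here are invented.
pinfo = [
    (10, None),  # 'from' line: parent 10 and all its ancestry were pruned
    (12, 4),     # 'merge' line: parent 12 was pruned, nearest ancestor is 4
]

# Unzip the pairs into two parallel tuples.
orig_parents, parents = zip(*pinfo)

# Drop parents that no longer exist after pruning.
parents = [p for p in parents if p is not None]

# Old behavior: counted the surviving parents.
was_merge_old = len(parents) > 1
# New behavior: counts the parents the commit originally had.
was_merge_new = len(orig_parents) > 1

print(orig_parents, parents, was_merge_old, was_merge_new)
```

Here the commit started out as a merge, yet only the count over orig_parents
records that fact.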