filter-repo: modify parse_optional_parent_ref to return original parent too

Commits may not have any parents at all. As such, parse_optional_parent_ref() is used with the expectation that it will sometimes return None.

When commits are skipped, we have a scheme that translates anything depending on such a commit to instead depend on the nearest ancestor of that commit. If the entire ancestry of a commit was skipped along with the commit, then that commit is translated to None, which is indistinguishable from there having been no parent to begin with. Sometimes our scheme needs to distinguish between a commit that started with no parents and one that ended up with no parents, so we need a way to tell these apart.

Also, not knowing the original parent makes it hard to determine whether the original commit had the same weird topology that the current commit does. For example, a merge commit can have one parent that is an ancestor of another (particularly when --no-ff is passed to git merge), or can even have the same commit as both parents (if you use low-level commands to create a crazy commit). Pruning some commits can cause either of these situations to arise, and it is useful to distinguish intentionally "weird" history from history that was made weird by other pruning, because we may have reason to do additional pruning on the latter.

Signed-off-by: Elijah Newren <newren@gmail.com>
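
The ambiguity described in the message can be illustrated with a minimal sketch. The TRANSLATION table, translate() and describe_parent() below are hypothetical names invented for this illustration; they are not filter-repo's _IDS object or any of its methods. The sketch only assumes that a skipped commit maps to its nearest surviving ancestor, or to None when its whole ancestry was skipped.

```python
# Hypothetical stand-in for a mark translation table (not filter-repo's _IDS).
# Marks 1 and 2 were skipped together with their entire ancestry, so they
# translate to None; mark 3 survived and maps to itself.
TRANSLATION = {1: None, 2: None, 3: 3}


def translate(mark):
    """Return the rewritten mark; marks missing from the table are unchanged."""
    return TRANSLATION.get(mark, mark)


def describe_parent(orig_parent):
    """Classify a parent using both the original and the translated value."""
    if orig_parent is None:
        # There was never a 'from' line: the commit started as a root.
        return "root from the start"
    if translate(orig_parent) is None:
        # A parent existed originally, but pruning removed all of its ancestry.
        return "became a root through pruning"
    return "still has a parent"


print(describe_parent(None))
print(describe_parent(2))
print(describe_parent(3))
```

With only the translated value available, the first two cases would both look like None; keeping the original value alongside it is what separates them.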
2024-07-04 01:15:41 +02:00 · 2018-12-15 09:20:40 -08:00 · 2018-12-15 09:20:40 -08:00 · 70e6f848ed
commit 70e6f848ed
parent ab1b43f480
1 changed file with 12 additions and 9 deletions
--- a/21
+++ b/21
@@ -824,19 +824,21 @@ class FastExportFilter(object):
    that the name of the reference ('from', 'merge') must match the
    refname arg.
    """
-    baseref = None
+    orig_baseref, baseref = None, None
    matches = re.match('%s :(\d+)\n' % refname, self._currentline)
    if matches:
+      orig_baseref = int(matches.group(1)) + self._id_offset
      # We translate the parent commit mark to what it needs to be in
      # our mark namespace
-      baseref = _IDS.translate( int(matches.group(1))+self._id_offset )
+      baseref = _IDS.translate(orig_baseref)
      self._advance_currentline()
    else:
      matches = re.match('%s ([0-9a-f]{40})\n' % refname, self._currentline)
      if matches:
-        baseref = matches.group(1)
+        orig_baseref = matches.group(1)
+        baseref = orig_baseref
        self._advance_currentline()
-    return baseref
+    return orig_baseref, baseref

  def _parse_optional_filechange(self):
    """
@@ -980,7 +982,7 @@ class FastExportFilter(object):
    """
    # Parse the Reset
    ref = self._parse_ref_line('reset')
-    from_ref = self._parse_optional_parent_ref('from')
+    ignoreme, from_ref = self._parse_optional_parent_ref('from')
    if self._currentline == '\n':
      self._advance_currentline()

@@ -1054,12 +1056,13 @@ class FastExportFilter(object):
                        self._translate_commit_hash,
                        commit_msg)

-    parents = [self._parse_optional_parent_ref('from')]
+    pinfo = [self._parse_optional_parent_ref('from')]
    # Due to empty pruning, we can have real 'from' and 'merge' lines that
    # due to commit rewriting map to a parent of None.  We need to record
    # 'from' if its non-None, and we need to parse all 'merge' lines.
    while self._currentline.startswith('merge '):
-      parents.append(self._parse_optional_parent_ref('merge'))
+      pinfo.append(self._parse_optional_parent_ref('merge'))
+    orig_parents, parents = zip(*pinfo)
    # Since we may have added several 'None' parents due to empty pruning,
    # get rid of all the non-existent parents
    parents = [x for x in parents if x is not None]
@@ -1068,7 +1071,7 @@ class FastExportFilter(object):
    if not parents:
      parents.append(None)

-    was_merge = len(parents) > 1
+    was_merge = len(orig_parents) > 1
    # Remove redundant parents (if both sides of history are empty commits,
    # the most recent ancestor on both sides may be the same commit).
    parents = collections.OrderedDict.fromkeys(parents).keys()
@@ -1218,7 +1221,7 @@ class FastExportFilter(object):
    """
    # Parse the Tag
    tag = self._parse_ref_line('tag')
-    from_ref = self._parse_optional_parent_ref('from')
+    ignoreme, from_ref = self._parse_optional_parent_ref('from')

    original_id = None
    if self._currentline.startswith('original-oid'):
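
The pattern introduced by this change can be tried outside the class with a standalone sketch. It is a simplified approximation, not the method from the patch: the free function takes the current line, a translate callable and an id_offset as explicit parameters (all hypothetical stand-ins for self._currentline, _IDS.translate and self._id_offset), and it does not advance any parser state. The pruned table is made-up data.

```python
import re


def parse_optional_parent_ref(line, refname, translate, id_offset=0):
    """Return (orig_ref, ref) for a 'from'/'merge' line, else (None, None)."""
    orig_ref, ref = None, None
    matches = re.match(r'%s :(\d+)\n' % refname, line)
    if matches:
        # Mark reference: keep the original mark and its translated form.
        orig_ref = int(matches.group(1)) + id_offset
        ref = translate(orig_ref)
    else:
        matches = re.match(r'%s ([0-9a-f]{40})\n' % refname, line)
        if matches:
            # Full hash reference: no translation, both values are the same.
            orig_ref = ref = matches.group(1)
    return orig_ref, ref


# Made-up pruning result: mark 2 collapses onto its ancestor 1, while
# mark 3 lost its entire ancestry and translates to None.
pruned = {2: 1, 3: None}


def translate(mark):
    return pruned.get(mark, mark)


lines = ['from :1\n', 'merge :2\n', 'merge :3\n']
pinfo = [parse_optional_parent_ref(lines[0], 'from', translate)]
pinfo += [parse_optional_parent_ref(l, 'merge', translate) for l in lines[1:]]

# Split the (orig, translated) pairs into two parallel tuples.
orig_parents, parents = zip(*pinfo)
parents = [p for p in parents if p is not None]

# The merge test uses the original parents, so it is unaffected by pruning.
was_merge = len(orig_parents) > 1
# Drop duplicate parents while preserving order.
unique_parents = list(dict.fromkeys(parents))
```

Here the commit originally had three parents, so was_merge stays true even though only one distinct parent remains after translation. That difference is the signal a caller could use to decide whether further pruning is warranted.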