cachepc-linux

Fork of AMDESE/linux with modifications for CachePC side-channel attack
git clone https://git.sinitax.com/sinitax/cachepc-linux
messy-diffstat.rst (4816B)


.. SPDX-License-Identifier: GPL-2.0

=====================================
Handling messy pull-request diffstats
=====================================

Subsystem maintainers routinely use ``git request-pull`` as part of the
process of sending work upstream.  Normally, the result includes a nice
diffstat that shows which files will be touched and how much of each will
be changed.  Occasionally, though, a repository with a relatively
complicated development history will yield a massive diffstat containing a
great deal of unrelated work.  The result looks ugly and obscures what the
pull request is actually doing.  This document describes what is happening
and how to fix things up; it is derived from The Wisdom of Linus Torvalds,
found in Linus1_ and Linus2_.

.. _Linus1: https://lore.kernel.org/lkml/CAHk-=wg3wXH2JNxkQi+eLZkpuxqV+wPiHhw_Jf7ViH33Sw7PHA@mail.gmail.com/
.. _Linus2: https://lore.kernel.org/lkml/CAHk-=wgXbSa8yq8Dht8at+gxb_idnJ7X5qWZQWRBN4_CUPr=eQ@mail.gmail.com/

A Git development history proceeds as a series of commits.  In a simplified
manner, mainline kernel development looks like this::

  ... vM --- vN-rc1 --- vN-rc2 --- vN-rc3 --- ... --- vN-rc7 --- vN

If one wants to see what has changed between two points, a command like
this will do the job::

  $ git diff --stat --summary vN-rc2..vN-rc3

Here, there are two clear points in the history; Git will essentially
"subtract" the beginning point from the end point and display the resulting
differences.  The requested operation is unambiguous and easy enough to
understand.
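
As a rough illustration of the two-endpoint case, the sketch below builds a
throwaway repository and compares two tags.  The repository, the file names
and the tags ``v1-rc2`` and ``v1-rc3`` are made up for the example; the only
point is that the range form and the two-argument form of ``git diff`` name
the same pair of end points:

```shell
#!/bin/sh
set -e

# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Maintainer"
git config user.email "maintainer@example.org"

# First end point.
echo "first" > a.c
git add a.c
git commit -q -m "state at v1-rc2"
git tag v1-rc2

# Second end point: one file modified, one file created.
echo "second" >> a.c
echo "new" > b.c
git add a.c b.c
git commit -q -m "state at v1-rc3"
git tag v1-rc3

# For "git diff", A..B is just another spelling of "A B".
range_form=$(git diff --stat --summary v1-rc2..v1-rc3)
pair_form=$(git diff --stat --summary v1-rc2 v1-rc3)

printf '%s\n' "$range_form"
```

Both variables should hold the same diffstat, since both commands compare
the same two trees.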

When a subsystem maintainer creates a branch and commits changes to it, the
result in the simplest case is a history that looks like::

  ... vM --- vN-rc1 --- vN-rc2 --- vN-rc3 --- ... --- vN-rc7 --- vN
                          |
                          +-- c1 --- c2 --- ... --- cN

If that maintainer now uses ``git diff`` to see what has changed between
the mainline branch (let's call it "linus") and cN, there are still two
clear endpoints, and the result is as expected.  So a pull request
generated with ``git request-pull`` will also be as expected.  But now
consider a slightly more complex development history::

  ... vM --- vN-rc1 --- vN-rc2 --- vN-rc3 --- ... --- vN-rc7 --- vN
                |         |
                |         +-- c1 --- c2 --- ... --- cN
                |                   /
                +-- x1 --- x2 --- x3

Our maintainer has created one branch at vN-rc1 and another at vN-rc2; the
two were subsequently merged into c2.  Now a pull request generated for cN
may end up being messy indeed, and developers often end up wondering why.

What is happening here is that there are no longer two clear end points for
the ``git diff`` operation to use.  The development culminating in cN
started in two different places; to generate the diffstat, ``git diff``
ends up having to pick one of them and hope for the best.  If the diffstat
starts at vN-rc1, it may end up including all of the changes between there
and the second starting point (vN-rc2), which is certainly not what our
maintainer had in mind.  With all of that extra junk in the diffstat, it
may be impossible to tell what actually happened in the changes leading up
to cN.
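
One way to check whether a branch is in this situation is to ask Git for
every merge base it shares with the mainline branch.  The sketch below is
only an illustration under stated assumptions: it builds a throwaway
repository with a hypothetical criss-cross history (the branches ``shared``
and ``topic`` and all file names are invented, and the history is not the one
in the diagram above), chosen because it reliably yields more than one merge
base, and then counts them with ``git merge-base --all``:

```shell
#!/bin/sh
set -e

# Throwaway repository; branch and file names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/linus
git config user.name "Example Maintainer"
git config user.email "maintainer@example.org"

# commit <file> <message>: append a line to <file> and commit it.
commit() {
    echo "$2" >> "$1"
    git add "$1"
    git commit -q -m "$2"
}

commit base.c "A"
git branch shared
git branch topic
commit mainline.c "L1"            # mainline moves ahead

git checkout -q shared
commit shared.c "S1"              # work that both sides will pull

git checkout -q topic
commit topic.c "T1"
git merge -q --no-edit linus      # back-merge of mainline at L1
git merge -q --no-edit shared     # topic pulls the shared branch

git checkout -q linus
git merge -q --no-edit shared     # mainline pulls the shared branch too
commit mainline.c "L2"

# L1 and S1 are both common ancestors, and neither contains the other.
bases=$(git merge-base --all linus topic)
base_count=$(printf '%s\n' "$bases" | wc -l | tr -d ' ')
echo "merge bases between linus and topic: $base_count"
```

A count greater than one means there is no single starting point for the
diffstat, which is a hint that the pull request output deserves a careful
look before it is sent.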

Maintainers often try to resolve this problem by, for example, rebasing the
branch or performing another merge with the linus branch, then recreating
the pull request.  This approach tends not to lead to joy at the receiving
end of that pull request; rebasing and/or merging just before pushing
upstream is a well-known way to get a grumpy response.

So what is to be done?  The best response when confronted with this
situation is indeed to do a merge with the branch you intend your work
to be pulled into, but to do it privately, as if it were a source of
shame.  Create a new, throwaway branch and do the merge there::

  ... vM --- vN-rc1 --- vN-rc2 --- vN-rc3 --- ... --- vN-rc7 --- vN
                |         |                                      |
                |         +-- c1 --- c2 --- ... --- cN           |
                |                   /               |            |
                +-- x1 --- x2 --- x3                +------------+-- TEMP

The merge operation resolves all of the complications resulting from the
multiple beginning points, yielding a coherent result that contains only
the differences from the mainline branch.  Now it will be possible to
generate a diffstat with the desired information::

  $ git diff -C --stat --summary linus..TEMP

Save the output from this command, then simply delete the TEMP branch;
definitely do not expose it to the outside world.  Take the saved diffstat
output and edit it into the messy pull request, yielding a result that
shows what is really going on.  That request can then be sent upstream.
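
The whole procedure can be sketched as a short shell session.  This is a
minimal illustration rather than a canonical recipe: it recreates roughly the
history from the diagram in a throwaway repository, and the branch names
``for-next`` and ``x-work`` and all file names are hypothetical.  In a real
tree, only the final block of commands (from the creation of TEMP onward)
would be needed:

```shell
#!/bin/sh
set -e

# Throwaway repository standing in for the maintainer's tree.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/linus
git config user.name "Example Maintainer"
git config user.email "maintainer@example.org"

# commit <file> <message>: append a line to <file> and commit it.
commit() {
    echo "$2" >> "$1"
    git add "$1"
    git commit -q -m "$2"
}

# Mainline history, with the two branch points from the diagram.
commit mainline.c "vN-rc1"
git branch x-work                 # x1 ... x3 start here
commit mainline.c "vN-rc2"
git branch for-next               # c1 ... cN start here
commit mainline.c "vN"

# The maintainer's work, including the merge that becomes c2.
git checkout -q x-work
commit driver-x.c "x1"
git checkout -q for-next
commit driver-c.c "c1"
git merge -q --no-edit x-work
commit driver-c.c "cN"

# Private merge on a throwaway branch, then save the diffstat.
git checkout -q -b TEMP for-next
git merge -q --no-edit linus
diffstat=$(git diff -C --stat --summary linus..TEMP)

# Discard the throwaway branch; it is never pushed anywhere.
git checkout -q for-next
git branch -q -D TEMP

printf '%s\n' "$diffstat"
```

The saved ``diffstat`` text lists only the files touched on the maintainer's
side and is what would be pasted into the pull request; the ``for-next``
branch itself is left exactly as it was.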