The open source Git project just released Git 2.19, with features and bug-fixes from over 60 contributors. Here’s a look at some of the most interesting features introduced in the latest versions of Git.
git range-diff
You might have used git rebase, which is a powerful tool for rewriting history by altering commits, commit order, or branch bases, to name a few. Many people do this to “polish” a series of commits before proposing to merge them into a project. But how can we visualize the differences between two sets of commits, before and after a rebase?
We can use git diff to show the difference between the two end states, but that doesn’t provide information about the individual commits. And if the base on which the commits were built has changed, the resulting state might be quite different, even if the changes in the commits are largely the same.
Git 2.19 introduces git range-diff, a tool for comparing two sequences of commits, including changes to their order, commit messages, and the actual content changes they introduce.
In this example, we rewrote a series of three commits, and compared the tips of each version using git range-diff. git range-diff shows that we moved the commit introducing README.md to be first instead of second, amended both the commit message and body of the typo fix, and introduced a new commit to add a missing newline.
[source]
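The example referred to above is not reproduced in this text, so here is a minimal sketch of the same idea on a throwaway repository. The branch names (base, topic-v1, topic-v2), file contents, and commit messages are made up for illustration, and the series is shorter than the one described.

```shell
# Build a throwaway repository holding two versions of the same series.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example Author"
git config user.email "author@example.com"

git commit -q --allow-empty -m "initial commit"
git branch base                      # common starting point of both versions

# Version 1: two commits, the second with a typo in its message.
git checkout -q -b topic-v1 base
echo "hello" > README.md
git add README.md
git commit -q -m "add README.md"
echo "world" >> README.md
git commit -q -am "fix tyop"

# Version 2: identical content, but the last commit message is amended.
git checkout -q -b topic-v2 topic-v1
git commit -q --amend -m "fix typo"

# Compare base..topic-v1 against base..topic-v2, commit by commit.
range_diff=$(git range-diff base topic-v1 topic-v2)
printf '%s\n' "$range_diff"
```

In the output, a pair of commits joined by "=" is unchanged between the two versions, while "!" marks a pair whose message or content differs.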
git grep’s new tricks
When you search for a phrase using git grep, it’s often helpful to have additional information pertaining to each match, such as its line number and function context.
In Git 2.19 you can now locate the first matching column of your query with git grep --column.
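As a rough illustration, the sketch below searches a tiny scratch repository; the file main.c and its contents are invented for the example.

```shell
# Create a scratch repository with one tracked file.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
printf 'int main(void)\n{\n    return answer();\n}\n' > main.c
git add main.c

# -n prints the line number of each match; --column adds the column
# at which the first match on that line starts.
match=$(git grep -n --column 'answer' -- main.c)
echo "$match"
```

Each result line carries the file name, the line number, and the column ahead of the matching line itself, which is the shape editors usually expect for jump locations.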
If you’re using Vim, you can also try out git-jump, a Git add-on that converts useful locations in your code to jump locations in your text editor. git-jump can take you to merge conflicts, diff hunks, and now, exact grep locations with git grep --column.
git grep also learned the new -o option (meaning --only-matching). This is useful if you have a non-trivial regular expression and want to gather only the matching parts of your search.
For example, if you want to count all of the various ways that the Git source code spells “SHA-1” (e.g., “sha1”, “SHA1”, and so on):
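The original command is not preserved in this text, so what follows is only a sketch of such a search. It runs against a small scratch repository rather than the Git source, and the pattern (the letters "sha", an optional single character, then "1") is an assumption.

```shell
# Scratch repository with a few different spellings.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
printf 'uses sha1 here\nand SHA1 there\nSHA-1 in prose\nsha1 again\n' > notes.txt
git add notes.txt

# -o prints only the matched text, one match per line; sort and uniq -c
# then count how often each spelling occurs, most frequent first.
counts=$(git grep -hiIo 'sha.\{0,1\}1' | sort | uniq -c | sort -rn)
echo "$counts"
```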
(The other options, -hiI, omit the filename, search case-insensitively, and ignore matches in binary files, respectively.)
The git branch command, like git tag (and their scriptable counterpart, git for-each-ref), takes a --sort option to let you order the results by a number of properties. For example, to show branches in the order of most recent update, you could use git branch --sort=-authordate. But if you always prefer that order, typing that sort option can get tiresome.
Now, you can use the branch.sort config to set the default ordering of git branch:
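The original listing is missing from this text; a minimal sketch on a scratch repository might look like the following. The dates and commit messages are invented, and the branch names are chosen to match the discussion that follows.

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example Author"
git config user.email "author@example.com"
git symbolic-ref HEAD refs/heads/master   # pin the initial branch name for the demo

# Give the two branches clearly different author dates.
GIT_AUTHOR_DATE="2018-01-01 00:00:00 +0000" GIT_COMMITTER_DATE="2018-01-01 00:00:00 +0000" \
    git commit -q --allow-empty -m "old work"
git checkout -q -b newest
GIT_AUTHOR_DATE="2018-09-01 00:00:00 +0000" GIT_COMMITTER_DATE="2018-09-01 00:00:00 +0000" \
    git commit -q --allow-empty -m "recent work"

# Default ordering: by refname.
default_order=$(git branch --format='%(refname:short)')
echo "$default_order"

# Make "most recent author date first" the default for this repository.
git config branch.sort -authordate
sorted_order=$(git branch --format='%(refname:short)')
echo "$sorted_order"
```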
Note that by default, git branch sorts by refname, hence master is first and newest is last. In the above example, we tell Git that we would instead prefer the most recently updated branch first, and the rest in descending order. Hence, newest is first and master is last.
You might also want to try these other sorting options:
--sort=numparent shows merges by how awesome they are
--sort=refname sorts branches alphabetically by their name (this is the default, but may be useful to override in your configuration)
--sort=upstream sorts branches by the remote from which they originate
[source]
Git has always detected renamed files as part of merges. For example, if one branch moves a file from A to B and another modifies content in A, then the resulting merge will apply that modification to the content’s new location in B.
The same thing can happen with files in a directory. If one branch moves a directory from A to B but another adds a new file A/file, we can infer that the file should become B/file when the two are merged. In Git 2.18, git merge does this whenever rename detection is enabled (which is by default).
[source]
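The directory case can be tried out on a scratch repository, as in the sketch below. The branch names and file contents are invented. Later Git versions control this behavior through the merge.directoryRenames setting, so the sketch sets it explicitly; versions that predate the setting simply ignore it.

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example Author"
git config user.email "author@example.com"

# Base commit: a directory A with some (non-empty) files.
mkdir A
echo "first file" > A/one.txt
echo "second file" > A/two.txt
git add A
git commit -q -m "add directory A"
git branch base

# One branch renames the whole directory from A to B.
git checkout -q -b move-dir base
git mv A B
git commit -q -m "rename A to B"

# Another branch adds a new file under the old directory name.
git checkout -q -b add-file base
echo "new file" > A/file
git add A/file
git commit -q -m "add A/file"

# Merge the two: the new file is expected to follow the directory rename.
git checkout -q move-dir
git -c merge.directoryRenames=true merge -q --no-edit add-file
ls B
```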
In Git v2.18, a remote code execution vulnerability in .gitmodules was fixed, where an attacker could execute scripts when the victim cloned with --recurse-submodules. If you haven’t upgraded, please do! The fix was also backported to v2.17.1, v2.16.4, v2.15.2, v2.14.4, and v2.13.7, so you’re safe if you’re running one of those.
[source]
Have you ever run into a Git command line option that should have tab-completed but didn’t? Keeping these up to date has long been an annoying source of manual work for the project, but now the completion of options for most commands is generated automatically (along with the list of commands itself, the names of config options, and more). [source, source, source, source]
gpg signing and verification of commits and tags has been extended to work with gpgsm, which uses X.509 certificates instead of OpenPGP keys. These certificates may be easier to manage for centralized groups (e.g., developers working for a large enterprise).
[source]
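Switching a repository over is a matter of configuration. The sketch below only sets and reads back the relevant options on a scratch repository; it does not sign anything, since that would need a real certificate in your gpgsm keyring.

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Use X.509 (gpgsm) rather than the default OpenPGP format for signing.
git config gpg.format x509

# Optional: name the program to run for X.509 signatures. "gpgsm" is
# the usual default, so this line matters only for a non-standard path.
git config gpg.x509.program gpgsm

signing_format=$(git config --get gpg.format)
echo "$signing_format"
```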
To fetch a configuration variable with a “fallback” value, it’s common for scripts to say git config core.myFoo || echo <default>. But that doesn’t give Git the opportunity to interpret <default> for you. When it comes to colors, this is especially important for instances where you ultimately need the ANSI color code for, say, “bold red”, but don’t want to type \033[1;31m.
git config has long supported this with a special --get-color option, but now there are options that can be applied uniformly to all types of config. For instance, git config --type=int --default=2M core.myInt will expand the default to 2097152, and git config --type=expiry-date --default=2.weeks.ago gc.pruneExpire consistently returns a timestamp expressed as a number of seconds.
[source, source]
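A short sketch of these lookups follows. The keys core.myInt and core.myFoo are the placeholder names used above, not real Git settings, and the explicit --get is spelled out only for clarity.

```shell
# Fall back to "2M" when core.myInt is unset; --type=int expands the suffix.
int_value=$(git config --get --type=int --default=2M core.myInt)
echo "$int_value"

# Fall back to "2.weeks.ago"; --type=expiry-date turns it into a timestamp.
expiry=$(git config --get --type=expiry-date --default=2.weeks.ago gc.pruneExpire)
echo "$expiry"

# Fall back to "bold red"; --type=color turns it into an ANSI escape sequence.
color=$(git config --get --type=color --default="bold red" core.myFoo)
printf '%sthis text uses the configured color\033[0m\n' "$color"
```

Because the fallback passes through the same conversion as a configured value, a script can treat both cases identically.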
Quick quiz: if git tag -l is shorthand for git tag --list, then what does git branch -l do? If you thought, “surely it doesn’t list all branches”, then congratulations: you’re a veteran Git user!
In fact, git branch -l has been used since 2006 to establish a reflog for a newly created branch, something that you probably didn’t care about since it became the default shortly after being introduced.
That usage has been deprecated (you will receive a warning if you use git branch -l), thus clearing the way for git branch -l to mean git branch --list.
[source]
In our last post, we discussed the new --color-moved option, which (unsurprisingly) colors lines moved in a diff. The lines that were moved must be identical, meaning that the feature would miss re-indented code unless you specified a diff option such as --ignore-space-change. Keep in mind that this option would affect the whole diff, potentially missing space changes that you do care about. In Git 2.19, the whitespace for move detection can be configured independently with the new --color-moved-ws option.
[source]
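To see the option in action, the sketch below re-indents a few lines in a scratch repository and asks for move detection that tolerates the indentation change. The file code.c and its contents are invented, and allow-indentation-change is just one of the modes the option accepts.

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example Author"
git config user.email "author@example.com"

cat > code.c <<'EOF'
void helper(void)
{
    first_long_statement();
    second_long_statement();
    third_long_statement();
}
EOF
git add code.c
git commit -q -m "add helper"

# Wrap the body in a conditional, which re-indents every statement.
cat > code.c <<'EOF'
void helper(void)
{
    if (enabled) {
        first_long_statement();
        second_long_statement();
        third_long_statement();
    }
}
EOF

# Whitespace still counts in the diff itself; it is relaxed only when
# deciding which lines were moved.
moved_diff=$(git diff --color=always --color-moved --color-moved-ws=allow-indentation-change)
moved_status=$?
printf '%s\n' "$moved_diff"
```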
Many of Git’s commands are colorized, like git diff, git status, and so on. Since 2.17, a few more commands improved their support for colorization, too. git blame learned to colorize lines based on age or by group. Messages sent from a remote server are now colorized based on their keyword (e.g., “error”, “warning”, etc.). Finally, push errors are now painted red for increased visibility.
[source, source, source]
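For git blame, the two modes correspond to the --color-by-age and --color-lines options. A minimal sketch on a scratch repository, with an invented file and dates:

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example Author"
git config user.email "author@example.com"

# Two commits far apart in time, so the lines have different ages.
printf 'line one\nline two\n' > notes.txt
git add notes.txt
GIT_AUTHOR_DATE="2017-01-01 00:00:00 +0000" GIT_COMMITTER_DATE="2017-01-01 00:00:00 +0000" \
    git commit -q -m "old lines"
echo "line three" >> notes.txt
git commit -q -am "new line"

# Color each annotation according to the age of its line.
age_out=$(git blame --color-by-age notes.txt)
printf '%s\n' "$age_out"

# Color annotations differently when a line comes from the same commit
# as the line before it, which groups consecutive lines visually.
git blame --color-lines notes.txt
lines_status=$?
```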
If you’ve ever run git checkout with the name of a remote branch, you might know that Git will automatically create a local branch that tracks the remote one. However, if that branch name is found in more than one remote, Git does not know which to use, and simply gives up.
In 2.19, Git learned the checkout.defaultRemote configuration, which specifies a remote to default to when resolving such an ambiguity.
[source]
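The sketch below reproduces the ambiguity with two local repositories standing in for remotes, then resolves it. The remote name fork, the branch name topic, and the directory layout are all hypothetical.

```shell
tmp=$(mktemp -d)

# Two stand-in remotes that both carry a branch called "topic".
git init -q "$tmp/upstream"
git -C "$tmp/upstream" -c user.name="Example Author" -c user.email="author@example.com" \
    commit -q --allow-empty -m "initial commit"
git -C "$tmp/upstream" branch topic
git clone -q --bare "$tmp/upstream" "$tmp/fork.git"

# A working clone that knows about both of them.
git clone -q "$tmp/upstream" "$tmp/work"
cd "$tmp/work"
git remote add fork "$tmp/fork.git"
git fetch -q fork

# "topic" exists on both origin and fork, so a plain checkout is ambiguous.
if git checkout -q topic 2>/dev/null; then ambiguous=no; else ambiguous=yes; fi
echo "ambiguous: $ambiguous"

# Name the remote to prefer, then try again.
git config checkout.defaultRemote origin
git checkout -q topic
tracking=$(git rev-parse --abbrev-ref 'topic@{upstream}')
echo "$tracking"
```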
Git interprets certain text encodings (e.g., UTF-16) as binary, meaning that tools like git diff will not show a textual diff. Normally it’s recommended to store your text files as UTF-8, but this isn’t always possible if other tools generate or expect another encoding.
You can now tell Git which encoding you prefer in your working tree on a per-file basis by setting the working-tree-encoding attribute. This will cause Git to store the files as UTF-8 internally, and convert them back to your preferred encoding on checkout. The result looks good in git diff, as well as on hosting sites.
[source]
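A sketch of the attribute at work on a scratch repository follows. The file name, the *.txt pattern, and the choice of UTF-16LE are assumptions for illustration, and the example relies on the iconv command-line tool being installed.

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Declare that *.txt files are UTF-16LE in the working tree.
echo '*.txt text working-tree-encoding=UTF-16LE' > .gitattributes

# Write a UTF-16LE file: two bytes per character, so 12 bytes in total.
printf 'hello\n' | iconv -f UTF-8 -t UTF-16LE > greeting.txt
on_disk_bytes=$(wc -c < greeting.txt)

git add .gitattributes greeting.txt

# The staged blob is UTF-8, even though the file on disk is UTF-16LE.
stored=$(git cat-file -p :greeting.txt)
echo "$stored"
```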
Some features are so big that they’re developed over the course of several releases. We have historically avoided reporting on works in progress in these posts, since the features are often still experimental, or there’s nothing you can directly start using.
That said, some of the topics upstream around this release are too exciting to ignore! So, here’s an incomplete summary of what’s happening upstream:
An important part of Git’s decentralized design is that all clones receive the full history of the project, making all clones true peers of one another. When there aren’t a large number of objects in your repository, things go quickly, but at a certain size clones can become frustratingly slow.
There’s ongoing work to allow “partial” clones which omit some blob and tree objects, in favor of requesting objects from the server as-needed. You can see a design overview of the feature, or even start experimenting yourself. Note that most public servers do not yet support the feature, but you can play with git clone --filter=blob:none against your local Git 2.19 install.
[source, source, source, source, source, source]
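One way to experiment locally is sketched below: a scratch repository plays the server, and a second directory receives the partial clone. The directory names are invented. The serving repository has to opt in to filtering through uploadpack.allowFilter, and the clone uses a file:// URL so that the normal transfer machinery is exercised.

```shell
tmp=$(mktemp -d)

# A small repository to act as the server.
git init -q "$tmp/server"
cd "$tmp/server"
git config user.name "Example Author"
git config user.email "author@example.com"
echo "some content" > file.txt
git add file.txt
git commit -q -m "add file.txt"
git config uploadpack.allowFilter true   # let clients request filtered packs

# Clone without any blobs; --no-checkout avoids fetching them right away.
git clone -q --no-checkout --filter=blob:none "file://$tmp/server" "$tmp/partial"
cd "$tmp/partial"

# Objects printed with a leading "?" are known about but not present locally.
missing=$(git rev-list --objects --missing=print HEAD | grep -c '^?')
echo "missing objects: $missing"
```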
Git has a very simple data model: everything is an object named after the hash of its contents, and objects point to each other by those names. Many operations walk the graph formed by those pointers. For example, asking “which releases contain this bug-fix” is really “which tag objects have a path to walk back to commit X” (where X is the commit fixing the aforementioned bug).
Those walks have traditionally required loading each object from disk to find its pointers. But now Git can compute and store properties of each commit in a more efficient format, leading to significantly faster traversals. You can read more about it in a series of blog posts from the feature’s author.
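If you want to try this yourself, the sketch below writes the precomputed file with git commit-graph and then runs the kind of walk described above. The commit messages and tag name are invented. The core.commitGraph setting is included on the assumption that your Git version does not already read the file by default.

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example Author"
git config user.email "author@example.com"

git commit -q --allow-empty -m "first"
git commit -q --allow-empty -m "fix the bug"
fix=$(git rev-parse HEAD)
git commit -q --allow-empty -m "later work"
git tag v1.0

# Precompute commit data for everything reachable from the refs.
git commit-graph write --reachable

# Allow Git to consult the file during traversals.
git config core.commitGraph true

# A graph walk: which tags can reach the commit that fixed the bug?
containing=$(git tag --contains "$fix")
echo "$containing"
```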
Git still uses roughly the same protocol for fetching that was developed in 2005: after a client connects, the server dumps the current state of all branches and tags (called the “ref advertisement”), and then the client asks for the parts it needs to update. As repositories have grown, the cost of this advertisement has become a source of inefficiency.
The protocol has added new features over the years in a backwards-compatible way by negotiating capabilities between the server and client. But one thing that couldn’t be changed is the ref advertisement itself, because it happens before there’s a chance to negotiate.
Now there’s a new protocol which addresses this (and more), providing a way to transfer the advertisement more efficiently. Only a few servers support the new protocol so far, but you can read more about it in this blog post from its designer.
[source, source, source, source]
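You can opt in to the new protocol per command through the protocol.version setting, as in this sketch against a scratch repository reached over a file:// URL (the directory name is invented). A server that does not understand version 2 is expected to fall back to the original protocol.

```shell
tmp=$(mktemp -d)

# A small repository to talk to.
git init -q "$tmp/server"
git -C "$tmp/server" -c user.name="Example Author" -c user.email="author@example.com" \
    commit -q --allow-empty -m "initial commit"

# Request protocol version 2 for this one invocation only.
refs=$(git -c protocol.version=2 ls-remote "file://$tmp/server")
printf '%s\n' "$refs"
```

Setting GIT_TRACE_PACKET=1 for the same command prints the raw exchange, which is one way to check which version was actually negotiated.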
We mentioned earlier that all Git objects are named according to a hash of their contents. You might know that the algorithm that determines the value of that hash is SHA-1, which has not been considered safe for some time. In fact, a collision attack was discovered and published last year, which we wrote about in our post on its remediation.
Though SHA-1 collisions in Git are unlikely in practice, the Git project has decided to pick a new hashing algorithm and has made significant progress towards implementing it. Git has chosen SHA-256 as the successor to SHA-1, and is working through the transition plan to convert to it.
[source]
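To make the naming scheme concrete, the sketch below computes a blob's name twice: once with git hash-object, and once by hand as the SHA-1 of a short header followed by the content. It assumes the sha1sum tool is available and that Git is using its SHA-1 object format.

```shell
# Run outside any repository so no repository-specific settings apply.
cd "$(mktemp -d)"

content='hello'

# Ask Git for the object name of a blob containing "hello\n".
by_git=$(printf '%s\n' "$content" | git hash-object --stdin)

# Recompute it by hand: SHA-1 over "blob <size>", a NUL byte, then the content.
# "hello\n" is 6 bytes long.
by_hand=$(printf 'blob 6\0%s\n' "$content" | sha1sum | cut -d' ' -f1)

echo "$by_git"
echo "$by_hand"
```

Both lines should print the same 40-character name; under a different hash algorithm, only the hash function in this recipe would change.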
That’s just a sampling of changes from the last few versions. Read the full release notes for 2.19, or find the release notes for previous versions in the Git repository.