Reading this article, I realize that I'm old now. I still remember wrestling with cvs, svn. Merge, branch were slow and even more challenging. It was much easier to mess up and so difficult to rewind.
When I first learned git, I thought it's pretty neat. It solves merge, branch, rewind problems. Git is one of the things in life that doesn't work like the way we think. But it turns out to be a better way.
I’m torn. Practically Mercurial feels like it should be the winner. The commands are more uniform and predictable.
That’s not all it has going for it either. Mercurial has a concept of commit stages to make history rewriting safer. It has a commit model that enables you to work on and manipulate branches of commits seamlessly, without needing named branches. It has not just a tree of commits but also each commit tracks its own history through rewrites. You get cool commands like absorb and evolve. It’s easier to extend than Git.
The only downside to modern Mercurial I can think of is it’s still slower than Git, by at least a bit. But it can scale incredibly far with its extensibility. For example, what Facebook did:
So why does it never seem to get consideration? I guess it’s because of the insane proliferation of Github and Linux, which is definitely more a blessing than a curse. But it’s weird. Back in the CVS and SVN days, it didn’t seem like there was ever going to be a ‘network effect’ for source control like there is today.
I kind of wish Gitlab would implement Mercurial support. I bet it would help Mercurial gain more adoption within teams working on closed source projects. I know Bitbucket does, but to be honest that doesn’t really appeal to me much.
Hg and git have near feature parity, so I don't really lament Hg's loss so much. Sure Hg's CLI is a little bit better, but beyond that it never really offered any really compelling features over git. Mercurial and git is like Honda and Toyota; maybe one is a bit nicer than the other, but they're both offering you more or less the same thing.
Fossil is another matter. Fossil defies pithy car analogies. Integrating the bug tracker into the version control alone is a game changer, and that's not even all that Fossil does.
Hg actually does go beyond Git in many areas; I did outline a few but my favorites are the improved commit model (stages, natural branching, rewrite tracking,) the commands Git doesn’t have (absorb, evolve,) and the extensibility (see Facebook extensions.)
Sure, the model is similar and Fossil is different. But that is kind of an important note. If Fossil can’t be compared on the same level, maybe that’s a sign it solves fundamentally different problems.
In most setups, the bug tracker and source control are separate, but that doesn’t mean you can’t get bugtracking alongside code either, GitLab provides everything from bugtracking to CI to deploying stuff to Kubernetes.
Not to say Fossil isn’t cool or doesn’t have its place, but if I disagree with the philosophy (and I do, fundamentally,) then I don’t feel like I lose much using alternative software suites.
> If Fossil can’t be compared on the same level, maybe that’s a sign it solves fundamentally different problems.
Or Fossil provides a superset of the others. Like comparing a corkscrew, which only opens bottles of wine, to a swiss army knife that has a corkscrew. They both solve the same problem, but one of them also solves other problems and is generally a more useful tool to keep around in your pocket.
The world didn't lose much with Hg losing to git. But with Fossil losing, we lost a great deal. As a consequence we have a world where people feel locked into the proprietary bug trackers their git host provides. Had Hg prevailed, that situation would be no different. The only way the world would be different if Hg had prevailed would be fewer posts on HN whining about git's interface being obtuse. Not really a substantially different reality, is it?
Git is ugly and daunting when you start with it, but as you get to know it you start to appreciate it's beauty and elegance and it will work for you. Actually I find this with many programming concepts.
> I committed and immediately realized I need to make one small change!
I think it might be nice to add a disclaimer saying that this is not advisable if you've already pushed the code. Suck it up and make a new commit–don't rewrite public Git history.
I think changing history is fine on feature branches that only you are working on, though. IMO the benefits of keeping commit history clean outweigh the cost of having to push with "--force".
This then discourages ad hoc teamwork because you can't touch feature branches other than the ones you own. And because the results of getting it wrong are hairy, people will tend to stay away just in case. It's a chilling effect.
Or, someone might have created another branch off your feature branch, because they depend on your work. Now you've creaed a time bomb for them when they try to merge their work after you've merged your alternative-history version of it. (and the failure mode is just weird, it takes experience to identify that all those seemingly nonsensical merge conflicts are result of this situation)
Etc. It just breaks a lot of things. The Git model and bad UI are already taxing enough to work with in your head, concurrently with your actual programming and domain cognitive load, that adding the uncertainty and multiplied complexity from having history rewritten around you is just a bad tradeoff.
(This may be different if the scenario is not a team, of course...)
> This then discourages ad hoc teamwork because you can't touch feature branches other than the ones you own. And because the results of getting it wrong are hairy, people will tend to stay away just in case. It's a chilling effect.
Do you frequently go messing with a branch assigned to single developers on your team without any sort of prior heads-up? And then surprise them with new commits next time they push/pull? I guess if you do that then none of you can ever force-push, and it's great if that works for your team, but I feel like generally people try not to interject and instead give a heads-up before messing with others' branches, after which the original dev knows others are involved and can then avoid force-pushing (or sync up when needed).
Not as often as we'd like to[1], but it happens regularly in reviews, or when casually working together on the same thing in, or when basing branches on other people's unfinalized work, etc. If a team member is used to a force push / rebase type of workflow, it invariably means that stuff will blow up becuause they forgot they had rebased, or they do a force push out of habit, etc. If you include rebase / force push in your daily workflow, you will invariably foul up with it and forget to make an exception for the teamwork.
Your proposed "heads up" way can work in theory, but it introduces too much friction and risk of error, and still leaves the situation in shambles when they have a rebased branch on their laptop that they haven't pushed yet. So you need to have a multi phase protocol that requires many interactions and even then the other guy may well forget about it and do a rebase & force push out of habit in the end.
[1] Well, "branch assigned to a single developer" isn't a thing, but most feature branches are finished by 1-2 people.
But then you already know someone is reviewing that branch! So both of you avoid force-pushing.
> or when casually working together on the same thing in
Again, you both know multiple people are involved here... "casually working together" is the heads-up! You don't need another one.
> or when basing branches on other people's unfinalized work
Without any sort of hint to the guy working on that branch? How do you know it's even in a stable state to build on if you have no communication?
> or they do a force push out of habit
Yes it breaks if you screw up, but I mean then you just deal with it right? It's a dumb mistake just like any other silly mistake that can happen during committing, it's infrequent, it's on an ephemeral branch, and it's completely reversible. I don't see why someone occasionally mistakenly pushing the wrong thing (forced or otherwise) on a branch that isn't even going to exist for much longer is such a catastrophic event that you have to formulate your whole team's entire development process around avoiding that event 100% at all costs?
> Your proposed "heads up" way can work in theory, but it introduces too much friction and risk of error, and still leaves the situation in shambles when they have a rebased branch on their laptop that they haven't pushed yet. So you need to have a multi phase protocol that requires many interactions and even then the other guy may well forget about it and do a rebase & force push out of habit in the end.
Again, you don't need an explicit heads-up when you already know multiple people involved, and you can just deal with the occasional errors, as I mentioned. See above.
But you can't safely rebase that branch! Having rebased it, you would have now broken it for others working on it. So this is a great example how the damage spreads and taints other branches around the history rewrite.
(And even arriving at the "ok i could fix this with rebase" diagnosis will have been painful and frustrating and eaten time & energy, and you can't be sure you got away with it before actually doing it and waiting if your teammates will come kick you in the nuts. or worse, silently spend a day untangling their work.).
Huh? You don't rebase their branch. You rebase your own changes on top of their branch, which they happened to recently rewrite. Just like you might rebase your changes on top of master after master has undergone changes. I think the parent's point was that it doesn't matter if the branch was rewritten or just extended; either way you rebase the same commits on it the same way. (If you're one of those people who's against the notion of rebasing entirely then that's a separate debate we can have another time, but you need to separate that from the force-push issue.)
So then anyone building on it would rebase the same way you just did? Which was the same way they would have done so if you had just pushed a new commit?
Like a few other replies, I force push on my branches all the time.
I push my code up to my origin as soon as I can and as I go I’ll fix up my commits and force push.
There are some advantages for me anyway. Pushing to origin kicks off some smoke tests and end to end tests that are fairy slow and cumbersome to run on my dev machine. That helps me catch bugs earlier, especially since I’m working on a Microservices architecture. Also it acts as a backup for if my dev machine dies on me. I prefer to fix up and force push to create a clean logical story from my code rather than leave in spurious commits which exist only to fix linting for example.
That's not public git history though, those are just patches on mailing lists. Even before it goes into maintainer's lists, in my experience it's extremely rare that a maintainer will force push to their own public -next branch.
I see about 600 pull requests on github[1]. My understanding was that Linus moved away from mailing lists some time ago. But then I'm not involved in kernel hacking.
The Linux kernel has a bot that reminds people that kernel development happens on the mailing list: https://github.com/KernelPRBot. Since GitHub does not allow for disabling the pull requests tab, the pull requests invariably get closed.
I force-push all the time in my feature branches. It's expected. If I'm using `git commit --fixup` and `git rebase -i origin/master --autosquash` that feature branch is going to come out clean and nice in the merge commit if I force push to my feature branch.
Git is pretty nice, but I'm sure there is something much better waiting to he invented. The CLI in particular could use a ton of improvements.
And I feel it in my bones that there is a revolutionary GUI waiting to be invented. Why can't I drag a commit or set of commits from one branch to another? With safe, easy undo (reflog doesn't count) and super smooth conflict resolution? Etc etc.
And of course there is the interesting rabbit hole of semanitc / language aware diff. Line diffs suck in many ways.
It's one of the hundreds of of problems that I'd love to work on one day, but probably won't get a chance to. Sigh... :)
swarmdb has CRDT-based merge and causal branching which is "super smooth". Lets you recombine changesets rather freely, or at least supposed to. https://github.com/gritzko/ron-cxx/tree/master/db (the project is in its early stages)
Regarding language-aware AST-based diffs, I know of one serious full-time effort. Even for Java, it turned way too complex, so guys gave up.
I don't get this "afraid of losing something" mindset at all. In fifteen years, I've "lost" some minor changes maybe 3 or 4 times, and this was mostly with SVN, which does not have the safeguards that Git has. The only thing that I am moderately afraid of is pushing to the wrong remote branch.
> I don't get this "afraid of losing something" mindset at all. In fifteen years, I've "lost" some minor changes maybe 3 or 4 times, and this was mostly with SVN, which does not have the safeguards that Git has. The only thing that I am moderately afraid of is pushing to the wrong remote branch.
I can lose something for you in 2 seconds in git. Have fun e.g. recovering from this:
Honest to god, I don't know how people who find `git` hard to use manage to write code. Everyone on the Internet acts like the concepts are impossible to grasp and it's like really easy to grok.
Honestly, it faded into the background of code from the beginning. I mean, I know "Forward-port local commits to the updated upstream head" means nothing to anyone not already familiar with `git rebase` but a practical mastery of the tool is very easy to achieve.
I honestly think this is a pedagogical lack. We tell everyone it's this complex thing and that they should be scared of rebase and the reflog and they believe it. Maybe if we didn't, it'd be easier.
I was thinking if I would post what you did it would get downvoted but yeah, if you find git hard how or why are you writing code? That is surely a lot harder. Not sure why it is downvoted as sure it might not be a popular opinion but it seems to hold...
> if you find git hard how or why are you writing code? That is surely a lot harder.
So many things wrong with your premises...
How do you know every git user is writing code? Is git only for code now?
And in the code case... were you born knowing how to use git? Or were you forced to learn it before you wrote a single line of code? Or were you forced not to learn it until you were a pro coder? Is it difficult to fathom something landing between these extremes?
It's common for people teaching others how to use git to start off the lesson with some sort of disclaimer about git being really hard. I think this is a huge mistake. I suppose it's meant to prevent the student from feeling discouraged if they happen to struggle, but the student struggling with git is not a foregone conclusion. Such statements can demoralize the student before they even dive into it though. When you tell your student that git is hard, you are doing them a disservice.
OK hot shot. Tell everyone here how you've never fucked up commands and had to blow out a repo and start over and how were all idiots for having done that.
It's a tool. Any tool can be confusing if the person isn't taught how to use it. Git requires teaching so there's a lot of room for misunderstanding.
I'm not the hot shot. The hot shots are all the kids from Berkeley and Stanford who figure this shit out as interns while supposedly fully-trained engineers with all the knowledge of data structures that should come with that think this shit is too hard.
I think I'd be completely unsurprised to see an intern successfully use `rerere` on a longer project of theirs.
Git is not hard. It's very simple. But people learn it the wrong way. You have to learn it from the DAG up. If you cannot grasp how the DAG works you'll forever be reading and writing articles like this one which do not help you to learn.
This is a horrible article. You should not bookmark it or use it. If you're not a programmer, you shouldn't use git. If you are a programmer, do yourself a favour and spend a day going through something like this: https://wyag.thb.lt/
It will make you better at git and better at programming. Git is a powerful tool and you need to learn how to use it. Imagine if people read articles like this one instead of learning how to drive.
Don't gatekeep. Git isn't just for programmers.. it's for people that are learning.. people using it in non programming capacities and tons more. Telling people to "git gud" is not helpful. Not everyone knows, or indeed cares to know what a Directed Acyclic Graph is, and sites like this help people's anxiety who are just learning, or who have already screwed up and just want a solution.
Git is just for programmers. It was made by and for kernel developers, no less. There are better tools for other people.
If you don't care what a DAG is, you will never understand what git is or what it's for. No arguments. Git is a tool for building a DAG. If you don't need a DAG you don't need git.
You can use Git for versioning all kinds of assets...3D models, fonts, textures, music. I know someone who stored his book on Github and took pull requests from editors.
Yes but those are hacks, no one should immediately reach for git as a tool to version EVERYTHING just because you can. If you treat git as a convenient hammer for your screw, don't be surprised when the screw breaks at an inopportune time
Are you imagining every use of git is either (1) a Computer Scientist writing Code, or (2) a hack? You can't imagine anything in the spectrum in between? LaTeX papers by academics in various fields, scientific coders (MATLAB etc.), people writing stuff in Markdown, students who are still learning even CS, etc. are all doing the wrong thing by using git?
so the people who maintain our site content using markdown and hugo, commit/push git, and trigger a CI build and deploy automatically MUST be developers?
answer: they are not, but they can handle basic git just fine. we aren’t some special class of super human: git is a tool, and you absolutely don’t need to know what a DAG is to use it
I agree that the DAG and Git storage model are at least not particularly complicated. The problem is that the Git user interface (the CLI, plus various concepts that are used in other interfaces as well, like refspecs, that are not fundamental to Git) is not very simple, and the correspondence between the DAG and mutations you may wish to perform, and Git commands, is often fairly obtuse and opaque.
I think you are right about the DAG. Once I understood what the high-level data structure of git is, many branch related commands immediately made sense and it was suddenly very easy to use. Many of my colleagues haven't taken the time to learn that and continuously struggle with basic commands.
I care about the internals of git about as much as I care about the internals of my filesystem.
It's probably helpful to know some basics, but do I need to know intimate details of my filesystem to use cp, mv, shell redirection? No. For most basic actions it Just Works™.
The problems in git are purely user-interface based. Other distributed systems have proven you can make a dcvs with a reasonably friendly UI.
That's because a file system is something you already understand, even if you've never actually used an old fashioned paper filing system. The software is providing you with something you understand.
Git is not providing you with something you understand. It is providing you with a DAG and you neither understand what that is or why you need it. The DAG is not the "internals of git". This is the big mistake. It is git. Everything about git is about building that DAG.
Graph = graph, a structure composed of a set of objects (nodes or vertices) with links between them (edges).
Directed = the edges have an orientation / a direction.
Acyclic = there's no cycle, you can't come back to a node (in a directed graph you have to follow edge direction).
In Git, the commit objects are nodes, the "parent" link(s) is a directed edge, and because commit objects are immutable you can't create a commit which refers to one of its "descendants" and thus the graph is acyclic.
Unfortunately this article, like almost all others, is still wrong because it looks like commits get mutated when you rebase and the old commits disappear.
It is very important to understand that commits (in fact, all blobs) are immutable in git. You can only make new things. You can't modify old things. Git doesn't delete anything for a while either.
Directed acyclic graph. Basically each git commit points to zero or more parent commits (usually one, zero for root commits, more than one for merge commits) and that forms a DAG.
Git isn't hard but it's usually taught like absolute shit. From the get go you're told to use 4 commands, 2 of which are usually not explained and when they are it's often hand wavey. From there on its usually people pontificating about high level philosophy while failing to give concrete working examples. At least that was my experience.
I'm decent at got now for the work I do so I'm cool with it. It really is an awesome tool. But for some reason its just collectively taught like shit.
When I first learned git, I thought it's pretty neat. It solves merge, branch, rewind problems. Git is one of the things in life that doesn't work like the way we think. But it turns out to be a better way.
reply