Re: Git hosting

From: Olav Vitters <olav bkor dhs org>
To: Elijah Newren <newren gmail com>
Cc: gnome-sysadmin <gnome-sysadmin gnome org>, GNOME Infrastructure <gnome-infrastructure gnome org>
Subject: Re: Git hosting
Date: Thu, 6 Mar 2008 10:13:20 +0100

On Wed, Mar 05, 2008 at 09:05:01PM -0700, Elijah Newren wrote:
> On Wed, Mar 5, 2008 at 7:37 AM, Olav Vitters <olav bkor dhs org> wrote:
> >  Yeah, I read that, but it wasn't clear *at all*. I now understand you
> >  want multiple branches within one directory.. or something. However, I
> >  understood it as using sha1 everywhere.. which e.g. bzr has now too (so
> >  I considered it as the same). Same for hg.
> 
> I don't see how cryptographic checksums and branches within a directory relate.

I thought he was talking about the way the files had been stored into
the repository and how that guarantees security. Bzr has 'sha1', so I
thought they covered the objection.
I completely missed the branches within directories stuff.

> >  > I crave the wonderful usability of bzr.  In my mind, it's a notch
> >  > above the rest.  svn and hg aren't too far off, but bzr has some extra
> >  > polish.  However, it has parallels to svn here in that the format is
> >  > going to limit its utility to many people.  I was talking to Ian
> >  > Clatworthy (bzr dev) about all kinds of bzr and DVCS issues; cool
> >  > things in bzr, stuff to put in his writeups, etc.  He did his best to
> >  > answer questions and sell bzr, but when I pointed out the most obvious
> >  > and painful missing feature in bzr that I'd miss the most from using
> >  > git -- being a branch container -- there was simply no response.  The
> >
> >  You mean multiple branches with one directory right? Why is that so
> >  useful?
> 
> For something like release scripts or bugzilla, I don't think it would
> be.  But metacity, for example, is a completely different beast.  I've
> often felt forced to keep many different checkouts gnome-2-20,
> gnome-2-18, etc, plus a few other feature checkouts.  I also often
> tend to have dozens of patches sitting around in each that I have to
> constantly apply and unapply.  It's a pain.  This is something that
> having easy branching and merging would help out a lot with, but in
> particular that many branches within a directory help with.

Does it really matter if that would be done as either magic 'git' or a
few local directories with e.g. hardlinks? I'd rather have hardlinks.
Maybe crazy, but when using CVS I was never sure about branches. You
checkout a branch, but would 'cvs add' add to the branch or to HEAD? I
had no idea. Same for understanding whether you're currently working on a
branch or not.
I understand having the ability to combine this all is wanted. Just not
needed for me. Perhaps indeed various people will use branches more. For
the modules I commit to, the whole branch thing is moot; think of e.g.
the gnomeweb scripts. These didn't support branching (for years).
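
To make the comparison concrete, here is a rough sketch of both approaches:
several branches inside one directory, and one directory per branch made
with a local clone (git hardlinks the object files for local clones when
it can). The repository name, file and commit message are made up; only
the branch names come from the discussion above.

```shell
set -e
work=$(mktemp -d)

# Throwaway repository standing in for a module such as metacity (hypothetical).
git init -q "$work/metacity-demo"
cd "$work/metacity-demo"
echo "v1" > NEWS
git add NEWS
git -c user.name="Example" -c user.email="example@example.org" \
    commit -q -m "Initial commit"
trunk=$(git symbolic-ref --short HEAD)   # whatever the default branch is called

# Approach A: several branches inside this one directory.
git branch gnome-2-18
git checkout -q -b gnome-2-20
current=$(git symbolic-ref --short HEAD) # answers "which branch am I on?"
echo "now on: $current"
git checkout -q "$trunk"

# Approach B: one directory per branch, created by a local clone.
git clone -q --branch gnome-2-18 "$work/metacity-demo" "$work/metacity-gnome-2-18"
other=$(git -C "$work/metacity-gnome-2-18" symbolic-ref --short HEAD)
echo "second directory is on: $other"
```

In both cases 'git symbolic-ref --short HEAD' says which branch a given
directory is currently on, which is the part I was never sure about with
CVS.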

I've checked out the Git Tutorial. IMO it still needs a rewrite. I'd
start off small and focus on practical things. E.g. no /home/foo/stuff. I
have this git repository at $SOMEWHERE and I want to
1. Apply some changes to the code
- involves somehow getting that code from $SOMEWHERE (git://, no
  /home/foo stuff)
- making a patch
- committing it
--> I don't need no branches for this if I'm just a casual user. People
sometimes just diff from the source code of a tarball. Just keep it
simple.
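
Something along these lines is what I have in mind for that first step.
This is only a sketch: to keep it self-contained, a temporary local
repository stands in for $SOMEWHERE (in a real tutorial that would be a
git:// URL), and the file name, identity and commit messages are made up.

```shell
set -e
work=$(mktemp -d)

# Stand-in for the upstream repository at $SOMEWHERE (assumption: in real
# use this would be a git:// URL, not a local directory).
git init -q "$work/upstream"
cd "$work/upstream"
echo "hello" > README
git add README
git -c user.name="Example" -c user.email="example@example.org" \
    commit -q -m "Initial import"
SOMEWHERE="$work/upstream"

# Get the code from $SOMEWHERE.
git clone -q "$SOMEWHERE" "$work/checkout"
cd "$work/checkout"

# Apply some changes to the code and make a patch out of them.
echo "a small fix" >> README
git diff > "$work/fix.patch"

# Commit the change locally; no extra branches involved.
git -c user.name="Example" -c user.email="example@example.org" \
    commit -q -a -m "Small fix"
```

The resulting fix.patch is an ordinary unified diff, so it can be mailed
or attached to a bug like any other patch.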

Then later on the different concepts can be introduced one at a time.
Each one self contained where all the first interactions are explained
(doesn't have to go into detail). Not saying the tutorial is bad, just
that some concepts are skipped or need to be explained differently (with
commands I can copy and paste). E.g. I'd add a 'wget $PATCHURL && patch <
$FILE'.
Not saying I am good at explaining things, but I do give various
trainings. From what I've seen, only a few people try things out.
Generally, best to first show an entire example where the commands are
spelled out and a basic explanation is done. Then later on you explain
the various concepts that were quickly introduced at the first stage.
Still ensuring to explain everything again as people often miss out on
stuff (basic trainer stuff -- explaining things 3 times.. will be more
difficult in an online tutorial though).

> Also, from another point of view...bzr, hg, and svn make branching
> more common by virtue of making it easier to handle them, in
> particular having good merge functionality.  svn inhibits it greatly,
> but we still see a fair number of branches.  That number is going to
> get a _lot_ bigger.  Just take a look at how often DVCS fans talk
> about branching.  That means to get all the metacity data I need, I'd
> have to do half a dozen separate clones.  Ouch.  It means that to get
> people's work, if we use bzr, I'd have to potentially do dozens of
> clones.  Painful.  To manage my own work, I'd have dozens of
> directories.  That's a no go to me.  (It was one of the reasons I
> harped on this issue at
> http://blogs.gnome.org/newren/2007/11/24/local-caching-a-major-distinguishing-difference-between-vcses/;
> I was hoping other VCSes would see the light and try to gain this
> feature)

I didn't get that. (ATM it is a 'crazy weirdo feature' ;)

> >  > this missing functionality.  Also, if you read through the bzr layout
> >  > model and read over their documentation, you find that their
> >  > assumption of branch==directory goes all the way to their very core,
> >
> >  IMO that is what makes bzr so good in usability. Easy to understand what
> >  it does. I sort of get Git a bit more now, and I understand you can link
> >  one repos/checkout/whatever to multiple upstream thingies.
> 
> I disagree.  It's *one* thing that I agree lends to bzr's usability,
> yes, but it isn't the reason for bzr's superb usability.

Yeah, that is more the whole design method. Everything is discussed with
usability in mind (I've been reading the mailing list archives).

> But, it's also fundamentally limiting.  Severely so, for many of my
> use cases.  One thing bzr could do would be to split off documentation
> about multiple branches per directory in a more advanced section, or
> even potentially have plugins to provide the additional capabilities
> if their plugin system was good enough.  That is, they could do it if
> their structure supported it.

Agreed. This is IMO what Git should focus upon. Yes, swiss army knife,
etc. I don't care. Give me the basics as that is all (or more than) I
need ATM. With that I don't mean only explaining a few commands. I mean
more 'layering'. Make sure I only have to understand basic concepts.

> >  > I think you'd even find people starting to want it with bzr or hg,
> >  > really.  Sure, not everyone would want it, but the modules with many
> >  > developers will at some point realize these capabilities and start
> >  > clamoring for this kind of support...or else roll their own solution.
> >  > Even among modules with only a few maintainers you'll see some
> >  > adoptions of this functionality.  I've already seen these kinds of
> >  > things spring up all over the place, and not just for git.
> >
> >  I don't think such a thing is usable. What is the benefit if you have to
> >  be an insider to know about such things? E.g. if someone
> >  branches/clones/whatever from gtk+, then I want a page about Gtk+ to
> >  list the branches/clones.
> >  Yes, you could have that in some personal space.. that is just an
> >  implementation detail. If I'm some interested person in Gtk+, I wouldn't
> >  want to find out 6 months later that e.g. behdad did a lot of nice work
> >  in his personal space located *somewhere*.
> 
> The point, at least at first, is really more about enabling better
> workflow than pushing patches to bugzilla or over email, i.e. it's a
> review mechanism.  Thus, discovery of such repos is only really needed
> by code reviewers and it quite naturally happens.

By code reviewers yes. But it should be easily discoverable by outsiders
as well. Perhaps that is just 'push upstream', while DVCS just makes the
local stuff visible on a server.
If people have a lot of time to keep track of a project it wouldn't
matter. However, it should be tracked together.

> Those who prefer centralized development workflows and putting patches
> in email or bugzilla are perfectly welcome to do so, but making it
> easy for developers who know about newer workflows to directly pull
> changes from each other makes code review and even development much
> nicer in certain cases.

I mean a 'Google' (not really true). Sort of various stages:
1. On your own pc: Nobody can access even if they know you're working on
stuff.
2. Stored on some server: You tell people within the project. Sort of
like a site. Telling other people is nice, but ehr.. not as effective as
can be.
3. Cataloged: Truly allows distributed development, while allowing
distributed development. I do think gitweb is a good start. Finally an
overview with descriptions. Just think it should be enhanced. AFAIK git
has enough info on what things are tracked.
Assume e.g. that user files don't store the real upstream stuff. Then
gitweb could show the 'clones' (?) done on gnome. Note that I understand
that with distributed you'll never have all the clones nicely on a
git.gnome.org. I'm just talking about what GNOME provides as information
to outsiders.
E.g. comparing to google: Having a site indexed by google is actually a
step between #2 and #3. There is some 'directory' stuff that is more
related to #3.

> Notice that some of the same arguments you make could be made about
> open source development in general with public repositories...any one
> can get their own little svn (or whatever) checkout and do their own
> little development.  Why are there thousands of people checking out
> the code with their own little working copies?  How does anyone know
> that the work is being done on their personal computer located
> *somewhere*?

I understand that. It doesn't matter. However, infrastructure provided
by GNOME should be clear and speed up development. Meaning: if stuff is
stored on GNOME servers we should make it as clear/understandable as
possible.

> Yes, there are communication issues if you want your code used.  In
> both your scenario and the modified more general open source one.  But
> it happens naturally...if you want your code changes used, you have to
> communicate.

Same as above. Talking about GNOME servers.

> >  > git-submodule?  It's a joke IMO.  Nearly no useful functionality yet
> >  > and already it has some nasty UI issues and a couple gotchas to boot.
> >  >
> >  > (Great idea which would be superior to svn:externals, IMO, but the
> >  > implementation is extremely lacking at best.)
> >
> >  Can you confirm the rest doesn't have this (Hg/Bzr)?
> 
> No, I can't.  All I can say is that I've seen very little on this
> issue, and what I have seen made it sound like people were considering
> developing such capabilities.  But I wasn't that interested in this

So someone needs to find a solution... hrm.

> issue so I never went looking for it.  I suspect no one has such
> abilities, though the KDE->git conversions being talked about heavily
> on the kde-scm-interest mailing list (which is interestingly heavily
> biased towards git, despite calls to try to get proponents of other
> systems involved) seems to suggest they're going to make git-submodule
> work one way or another.

I followed that list a bit. But ehr, glossed over stuff as I hoped the
svn:externals could be avoided somehow. Within GNOME we already have
repositories for everything.

> *shrug*  Sorry, that's all I know.

But I like asking on gnome-infra and getting clear answers!!

[..]
> >  > Luckily, bzr, hg, and git all have tools for svn migrations.
> >
> >  You know the quality?
> 
> Haven't played with them for several months, and hgsvn didn't exist at
> the time.  Been meaning to get a more updated look at them.
> 
> bzr-svn was slick, but had a list of published drawbacks whose length
> surprised me a bit; I haven't seen any noise on those being fixed.
> However, if you didn't mind manually recompiling subversion to use
> some patch out there (will be in some future subversion release
> itself) and manually killing bzr-svn and restarting it when it makes
> your machine swap like mad (my machine only has 2GB ram; not nearly
> enough for bzr-svn except with small modules), the imports it made
> seemed to work fine.  There are apparently people out there with
> scripts to monitor bzr-svn for when it gets too large, which will kill
> it and restart it where it left off.  Sounds scary, I know, but it
> actually does work well.

Ah, yes. I know about that bug in SVN/bindings. Causes the memory not to
be released. IIRC there has been a workaround at this point. I don't
consider this a deal breaker though. Pretty minor compared to the things
Ross Golder had to do for CVS->SVN.

> git-svn was clunky and difficult to figure out (surprised?) but worked
> like a charm and imported great.  Additionally, Chris Lee and Thiago
> Maciera (sp?), IIRC, have built some kind of separate svn fastimport
> script and been able to do many conversions of the kde repository with
> all kinds of detailed mucking and rearranging with some nice clean
> driving input files.  Looked really slick, but I didn't look much
> further since I was more interested in incremental svn importing
> tools.
> 
> >  There has to be a *known working proven* solution for translators or
> >  we'll have a big problem. Transifex is one idea. However, it has to be
> >  accepted and *working*.
> 
> Noted.

Nice, where do you keep the action points? On Live.gnome.org? ;-)
pretty please?

> >  I'll try that tutorial again. However, Git is severely lacking in
> >  explaining the multiple branches thing, the 'index' and linking other
> >  repositories. Further, there is a lot of usage of things you might
> >  understand if you're used to Git, but I don't understand at all.
> >  'master' ?!? Lots of those btw. Never sure if something is local or not.
> >  How it all interacts. Plus you can't ignore it.
> 
> Yeah, I feel your pain, and I'm not going to discount it.  It's still
> very much an issue.

I'm hoping that by writing down the frustrations in enough detail that
when I finally understand it, I perhaps can more clearly read back and
figure out why Git sucks and hopefully what needs to change.

[..]
> >  > >  > - All DVCS can do something like svn export
> >  > >  >   $vcs://$vcs.gnome.org/repos/path/$FILE right?
> >  >
> >  > Yes, they all have it, though they don't idiotically require a URL
> >  > like svn (another example of stupid UI that needs a script): svn
> >  > export, bzr export, hg archive, git archive.
> >
> >  So you can only export a file right? No need for the whole history? Plus
> >  it is *fast* even when done externally?
> 
> Oops, I missed the $FILE bit.  No idea whether they have it or not.

Although I don't work on that anymore, it is important for e.g. fetching
'/trunk/MAINTAINERS'.
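
For what it's worth, a minimal sketch of pulling a single file out of a
git repository without a checkout of everything else. It runs against a
throwaway local repository with made-up contents; whether the same is
possible (and fast) against a remote server depends on how that server is
set up, and is not shown here.

```shell
set -e
work=$(mktemp -d)

# Throwaway module with two files (names and contents are placeholders).
git init -q "$work/module"
cd "$work/module"
echo "maintainer: example" > MAINTAINERS
echo "readme text" > README
git add MAINTAINERS README
git -c user.name="Example" -c user.email="example@example.org" \
    commit -q -m "Initial commit"

# Print one file as of the latest commit, without using the working tree.
content=$(git show HEAD:MAINTAINERS)
echo "$content"

# Or export just that one path as a tar stream and unpack it elsewhere.
mkdir "$work/out"
git archive HEAD MAINTAINERS | tar -x -C "$work/out"
```

Only MAINTAINERS ends up in the output directory; README is not exported
because 'git archive' is limited to the path given.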

Getting the feeling of how big of a project this is... urghhh!

-- 
Regards,
Olav