Subversion migration recap (cut-off Friday July 14th)



Hi guys,

I hope you have all recovered from GUADEC. As you have all probably long
forgotten by now, we are scheduled to migrate the GNOME CVS repository
to Subversion next weekend. Friday night at 23:59UTC to be precise. That
means that the migration will probably be under way this time next week.

Here's an update on the status of the main issues.


Archive history issues
----------------------

Unfortunately, during its lifetime the GNOME CVS server has suffered
several accidental clock resets (to a point in 1997). As a result, the
timestamps listed in some of the CVS ',v' files are not always correct
or in chronological order. This led to some problems highlighted by
James Henstridge:

http://mail.gnome.org/archives/gnome-infrastructure/2006-February/msg00059.html
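To make the symptom concrete, here is a minimal sketch of how
out-of-order timestamps could be spotted. It assumes the revisions of a
file have already been extracted into a list of (revision, timestamp)
pairs in commit order; the function name and the input format are made
up for illustration and are not part of cvs2svn or of our patches.

```python
from datetime import datetime

def find_clock_skew(revisions):
    """Return pairs of consecutive revisions whose timestamps go backwards.

    `revisions` is a hypothetical list of (revision_id, datetime) tuples,
    ordered oldest commit first.
    """
    suspicious = []
    for (prev_rev, prev_time), (rev, time) in zip(revisions, revisions[1:]):
        # A later revision should never be older than the one before it.
        if time < prev_time:
            suspicious.append((prev_rev, rev))
    return suspicious

history = [
    ("1.1", datetime(2001, 3, 1, 12, 0)),
    ("1.2", datetime(1997, 1, 1, 0, 5)),   # committed after a clock reset
    ("1.3", datetime(2001, 3, 2, 9, 30)),
]
skewed = find_clock_skew(history)
```

This only flags the point where the clock jumps backwards; it cannot
say what the correct timestamp should have been.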

Despite lengthy discussions with Michael Haggerty, the current
maintainer of cvs2svn, and several patches attempting to work around the
problem, it seems unlikely that we'll be able to eliminate the problem
entirely. We can only use the patches we've developed to alleviate it
in many cases. The most recent few mail exchanges on the subject are
archived here:

http://cvs2svn.tigris.org/servlets/BrowseList?listName=dev&by=date&from=2006-07-01&to=2006-07-31&first=1&count=4

At any rate, we will not be able to rely on the Subversion archive to
recreate historical versions. To a certain extent, we can't really rely
on the CVS archive either, as various 'CVS surgery' operations will have
taken place over the years, in addition to the clock-skew problems.
However, I will make sure the CVS archives stay on-line in
read-only/anonymous mode indefinitely in order to keep that history
publicly available as it was at the time of the cut-off.

The thing to remember is that this issue will only affect a small
minority of the GNOME modules. The majority, especially newer modules,
won't suffer this problem and you should still be able to retrieve
reasonably faithful historical checkouts from either system in most
cases.


Migration order
---------------

One downside of migrating the entire CVS repository is that we have a
*lot* of history. Current indications are that, in the worst-case
scenario, it could take the best part of a month to complete.
Obviously, lots of hackers aren't going to want to wait this long before
they can continue making commits to their module. So this is how I'm
thinking of determining the migration order.

We have three 'priority' lists, and then the rest. The first priority
list will contain any modules developers specifically request be
treated as priority, as they expect to be working with them within
hours/days of the migration cut-off (send your requests to me off the
list now). The second priority list will contain other modules we
consider relatively urgent, such as the 'www.gnome.org' website in case
we decide to change anything about our homepage. The third priority
list contains the most active modules from the last year, in activity
order (compiled from a query on 'bonsai-svn' from a fairly recent test
migration). Then, the whole list of modules is processed for any not
handled above. This can be seen in the migration script
'create-svnrepos.sh', which is linked to and explained here:

http://live.gnome.org/Subversion
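
For illustration only, the ordering described above boils down to
something like the following sketch. This is not the actual
'create-svnrepos.sh' script; the function name and the example module
lists are assumptions made up for the example.

```python
def migration_order(requested, urgent, most_active, all_modules):
    """Merge the three priority lists and the full module list.

    Each module is migrated once, at the position of its first
    appearance across the lists.
    """
    seen = set()
    order = []
    for group in (requested, urgent, most_active, all_modules):
        for module in group:
            if module not in seen:
                seen.add(module)
                order.append(module)
    return order

# Hypothetical example lists, not the real priority lists.
order = migration_order(
    requested=["gtk+"],
    urgent=["web"],
    most_active=["gtk+", "glib"],
    all_modules=["atk", "glib", "gtk+", "web"],
)
```

A module that appears on more than one list is simply handled the first
time it comes up, so nobody needs to keep the lists free of duplicates.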

Obviously, at 23:59 UTC the whole CVS archive will become read-only.
While each CVS module is being migrated to a Subversion repository, it
will be available read-only (for checkout and in viewcvs). Once a
module's Subversion repository has been successfully migrated, the
repository permissions will be reset so GNOME developers can proceed to
make commits.

As each module gets migrated, an '.out' file will be generated
containing the output of the cvs2svn run (for the curious). These can be
quite big sometimes, so the last few lines (containing the migration
statistics) will be directed into a '.tail' file and the whole output
file is bzip2'ed down. You should be able to see this happening for the
running test migration here:

http://svn.gnome.org/migration/
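
As a rough sketch of that post-processing step (not the code the
migration script actually runs), something like the following would do
the job. The function name and the default of 20 tail lines are
assumptions for the example; only the '.out', '.tail' and bzip2
behaviour comes from the description above.

```python
import bz2
import shutil
from collections import deque
from pathlib import Path

def archive_output(out_path, tail_lines=20):
    """Save the last lines of an '.out' file to '.tail', then bzip2 it."""
    out_path = Path(out_path)

    # Keep only the last `tail_lines` lines (the migration statistics).
    with out_path.open("r", errors="replace") as f:
        tail = deque(f, maxlen=tail_lines)
    tail_path = out_path.with_suffix(".tail")
    tail_path.write_text("".join(tail))

    # Compress the full output and remove the uncompressed copy.
    bz2_path = out_path.with_name(out_path.name + ".bz2")
    with out_path.open("rb") as src, bz2.open(bz2_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    out_path.unlink()

    return tail_path, bz2_path
```

Calling archive_output("gtk+.out") would leave 'gtk+.tail' and
'gtk+.out.bz2' behind in the same directory.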

As you can see from the date-stamps, the recent test migration is
currently only migrating 25-30 reasonably sized modules a day. We
currently have over 860 modules, so it could be about a month before all
the modules have been migrated and are ready. I think I can get this
time down with a few changes to the script to make better use of SMP
etc, but the bottleneck will then probably become I/O. Good thing we
weren't thinking about creating a monolithic repository, eh?! :)
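
The "about a month" figure is just back-of-the-envelope arithmetic,
which can be sketched as below. It assumes the migration rate stays
constant and that modules are of roughly similar size, neither of which
will be exactly true; the function name is made up for the example.

```python
import math

def days_to_migrate(total_modules, modules_per_day):
    """Whole days needed at a constant migration rate."""
    return math.ceil(total_modules / modules_per_day)

slowest = days_to_migrate(860, 25)  # 35 days
fastest = days_to_migrate(860, 30)  # 29 days
```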

Now would be an ideal opportunity to let us know about any old CVS
modules you know about that are redundant, so we can put them out of the
way in the Attic before the migration.


Shouldn't we be cancelling or postponing the migration again?
-------------------------------------------------------------

I don't believe the history problems warrant us postponing or cancelling
the migration (again). If we do, it is unlikely that these problems will
ever be solved and we will probably be stuck with CVS until the cows
come home. We must move forward. However, I expect there will be some
who may feel a bit edgy about the migration.

So, unless I hear otherwise from the board, or a significant number of
people raise objections in the next week (or both), the migration will
begin as scheduled next weekend, using our current best-effort
clock-skew alleviation patches.

I will obviously continue to do everything I can to ensure that all the
angles are covered and that the actual migration goes as smoothly and
efficiently as possible.


More information
----------------

Anything else I have thought of and/or covered should be written up
here:

http://live.gnome.org/Subversion

Please send questions or comments to 'gnome-infrastructure gnome org',
or drop by on irc.gnome.org/#sysadmin. Thanks.

--
Ross




