Re: gvfs archive backend



> ... (Feel free to hire me to work on gvfs or libarchive.)

;-)  If I ever find a way to get paid for this, I'll keep
you in mind.  This is all completely volunteer for me, too.
(My day job is completely unrelated to my volunteer work
on FreeBSD and the libarchive suite.)

libarchive is starting to see contributions from people other
than me; I've incorporated a lot of patches from various people
already and look forward to more.  Certainly there are a lot
of small improvements that could be made pretty easily by
someone who was interested:
  * ISO support could use more complete Rock Ridge parsing
    and support for Joliet extensions
  * I've looked at the zisofs compression extensions; those
    would also be easy to implement in libarchive
  * I have patches for bzip2 compression support for Zip archives
    that I hope to integrate soon.
  * I've had a lot of requests for lzma support.  It would
    be quite simple for someone to implement.  If there were
    BSD-licensed lzma libraries, I'd do it myself.  If not,
    I'm still happy to incorporate patches from someone else.
    (It should be less than a day's work for someone to copy
    and modify the existing gzip support plus another day to
    build some test cases.)

Supporting true random/direct access would be a somewhat
larger project.  A few people have expressed interest in
working on this and I think it's feasible, though I have
pretty strong feelings about what the result should look
like and would like to be closely involved in designing
such support.

Read/write support might be possible, but it is definitely
a larger project than I have time for.

Thanks a lot for your interest and feedback.  I'm always
interested in hearing requests and try to respond to bug
reports and small feature requests quickly.

Cheers,

Tim Kientzle


Benjamin Otte wrote:
Hey,

Sorry that it took a while to answer, but I wanted to get a definite
answer on what will happen with the archive stuff.
So here's the whole history of what happened:
- I managed to finish a proof-of-concept of an archive backend in a
day or two. Libarchive has a great API after all :) The result can be
seen at http://people.freedesktop.org/~company/stuff/nautilus-archive.png
- Alex was so impressed by it that he just merged it into the
stable 2.22.0 branch and decided to ship it in Fedora 9.
- A long discussion ensued about adding this feature in a stable release cycle.
- Lots of people (mostly distro QA) started testing this backend.
- Finally, it was decided not to introduce this feature in the stable release.
- Distros had mixed reactions about it. Some will ship the new archive
support (like Fedora 9) and some won't (like Ubuntu Hardy). The main
criticism was incomplete integration into the Nautilus file manager
and buggy support for users' common archive formats ISO and ZIP.

And this is where we stand today.

The next Gnome release coming in September will definitely see the
archive support becoming mainstream. And with that, the rest of the
distributions will include libarchive in their main trees. And with
that, in the shorter term we (and in turn you) will see lots of
requests for supporting weird features and formats (mostly ZIP and
ISO), hopefully including patches from us. :)

Mid-term I'd like to improve the performance when extracting
information from archives, to make accessing remote archives
faster. This includes the need for random access to archives and a
better way to access only selected metadata of files - an example
would be doing an initial parse of the archive, where we are only
interested in the root directory and its contents. ISO for example
would nicely support that. Another thing is of course extracting a
file, which currently requires opening the archive, skipping entries
until we arrive at the desired one and then doing the extraction;
I'd like a way to jump directly to the file if possible.
And yes, I'm aware that some formats (like gzip) don't allow random
access, but for the formats that do (which would be ZIP and ISO again)
it should be possible to implement it.
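
To make the ZIP case concrete, here is a rough sketch of the two
operations described above (list only the root directory, then pull
out one file without walking the entries before it). It uses Python's
standard zipfile module purely as a stand-in; nothing here is
libarchive or gvfs API, and the helper names and demo archive are
made up for illustration.

```python
import io
import zipfile


def build_demo_zip():
    """Create a small in-memory ZIP archive (demo data only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("readme.txt", "top level")
        zf.writestr("docs/guide.txt", "nested file")
        zf.writestr("docs/img/logo.txt", "deeply nested")
    return buf.getvalue()


def root_listing(raw):
    """Return the names directly under the archive root.

    Only the central directory at the end of the file is consulted;
    no entry data is read or decompressed.
    """
    with zipfile.ZipFile(io.BytesIO(raw)) as zf:
        roots = set()
        for info in zf.infolist():
            first, sep, _rest = info.filename.partition("/")
            roots.add(first + sep)  # keep the trailing "/" for directories
        return sorted(roots)


def read_one(raw, name):
    """Read a single entry; zipfile seeks to its recorded header offset."""
    with zipfile.ZipFile(io.BytesIO(raw)) as zf:
        return zf.read(name)


raw = build_demo_zip()
listing = root_listing(raw)
guide = read_one(raw, "docs/guide.txt")
```

The point of the sketch is only that a format with a central index
lets both operations avoid a sequential scan; a streaming reader has
to pass over every earlier entry to get the same answers.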

In the long term there is the question of adding support for modifying
archives. But that is so far away, that I'd rather not think about it
yet. For the foreseeable future, I see the archive backend as
read-only, unless someone suddenly implements an archive_open_edit()
call. :)

I'm also not sure how much time I am going to spend on hacking this
stuff, as I usually prefer to spend my unpaid time improving Swfdec
instead of hacking on GVfs. (Feel free to hire me to work on gvfs or
libarchive.)

Cheers,
Benjamin


On Sat, Mar 8, 2008 at 11:35 AM, Tim Kientzle <kientzle freebsd org> wrote:

Benjamin Otte wrote:


I've recently been investigating how hard it would be to provide a way
for Gnome's new gvfs virtual filesystem to "mount" archives [1].


Yours would not be the first such project I've
heard of.

For read-only access, I think that libarchive isn't a bad
way to go, although its focus on streaming may make it
a bit less efficient for the application you have in
mind.  As you observed, update is a much trickier proposition.

To answer a couple of questions you asked specifically about
libarchive:
 * It already reads both Zip and ISO9660, though it
  does use a streaming strategy that does not give it
  access to all features of such files.
 * Libarchive does use forward seeks to optimize reading
  listings from large uncompressed archives and does
  advertise the current file offset within the archive.
  Using these two features, you should be able to rapidly
  skim any archive to generate a complete directory listing
  with offsets and then seek to read any single entry.
  (This won't work for ISO9660, but should work fine
  for tar, cpio, zip.)
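
The skim-then-seek idea in the last point can be sketched roughly as
follows. This is only an illustration using Python's standard tarfile
module as a stand-in for libarchive (an assumption made so the example
runs without extra libraries); the helper names are hypothetical, and
the offset_data attribute plays the role of the file offset that
libarchive advertises.

```python
import io
import tarfile


def build_demo_tar():
    """Build a small uncompressed tar archive in memory (demo data only)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, payload in [("a.txt", b"alpha"), ("dir/b.txt", b"bravo!")]:
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def skim_index(raw):
    """One forward pass over the headers: map name -> (data offset, size)."""
    index = {}
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r:") as tar:
        for member in tar:
            if member.isfile():
                index[member.name] = (member.offset_data, member.size)
    return index


def read_entry(raw, index, name):
    """Seek straight to one entry's data using the offset saved earlier."""
    offset, size = index[name]
    stream = io.BytesIO(raw)
    stream.seek(offset)
    return stream.read(size)


raw = build_demo_tar()
index = skim_index(raw)
data = read_entry(raw, index, "dir/b.txt")
```

This only works because the archive is uncompressed, so an offset into
the file maps directly onto entry data; with a compressed stream the
saved offsets would not be seekable positions.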

I think the key question you have to figure out is whether
you want support for "generic archives" (including tar, cpio,
etc) or whether you want to support one or two very specific
formats.

The advantage of implementing specific formats is that you
can take advantage of those formats in deep ways.  The Zip
file format in particular was designed with incremental
update in mind and should work well for this type of
application.
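
As a small illustration of what incremental update means for Zip, the
sketch below appends an entry to an existing archive and checks that
the bytes of the earlier entry are left where they were; only the
central directory at the end is rewritten. It uses Python's standard
zipfile module as a stand-in (an assumption for the sake of a runnable
example, not libarchive API), and the entry names are made up.

```python
import io
import zipfile

# Create an archive holding a single entry.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("first.txt", "original entry")
before = buf.getvalue()

# Reopen the same bytes in append mode and add a second entry.
with zipfile.ZipFile(buf, "a") as zf:
    zf.writestr("second.txt", "appended later")
after = buf.getvalue()

# Inspect the updated archive.
with zipfile.ZipFile(io.BytesIO(after)) as zf:
    names = zf.namelist()
    first_offset = zf.getinfo("first.txt").header_offset
    second_offset = zf.getinfo("second.txt").header_offset
    first_data = zf.read("first.txt")
    second_data = zf.read("second.txt")

# Everything before the new entry is byte-for-byte what it was.
prefix_unchanged = after[:second_offset] == before[:second_offset]
```

A tar or cpio stream has no comparable index to rewrite, which is part
of why a format-specific approach can go further than a generic one.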

Supporting archive mounting in a very generic fashion will
probably require you to make some sacrifices:  Either
you'll have a pretty complex infrastructure or have to
sacrifice some functionality.  Libarchive chose to sacrifice
some functionality largely in order to limit complexity.

As for adding random-access support to libarchive, I can only
say that I'm ambivalent.  It could be a nice addition, but it's
not a priority for me.  I'll certainly consider any code that
someone might send my way that implements such features, but
whether I include it into my sources will depend on how well
it integrates with the other things I'm trying to do.  (In
particular, I won't sacrifice streaming to support random
access.)  Anyone wishing to implement such an ambitious new
feature should definitely familiarize themselves with the
libarchive_test suite first.  ;-)

Of course, the code is openly licensed and you're welcome to
copy any parts of it that you find useful or even fork the
whole project if you like.  (I do, of course, ask that you
honor and preserve the existing source licenses.)

Let me know what you decide to do, even if you decide
not to use libarchive.




One last question: Are you on IRC?


No, but I generally respond to email pretty quickly.

Cheers,

Tim Kientzle







