On Wed, 2009-10-14 at 15:54 +0100, Alan Cox wrote:

Sorry if I did not make it clear - I'm against putting everything in binary, which does not mean that a binary format is ultimately evil. XML is probably not the easiest format to parse. I am still a bit 'scared' by the idea of a binary format unless it is needed (I have no problem with a binary cache, although I understand why it has no place here).

> > - FS are usually implemented very carefully. They tend to be part of
> > kernel. On the other hand desktop applications are designed much more
> > 'speedy'. Sometimes application hangs (much more frequent then kernel
> > locks IMHO), sometimes it crashes.
>
> Desktop application software mostly sucks. I wouldn't argue with that,
> but the libraries used at the low level are mostly good clean code.
>

I agree that many libraries are well tested, but I was concentrating on why filesystems are binary while desktop formats are text-based.

> > - FS have much better support of tools which recover the data. Well -
> > you cannot edit by XML editor but both FAT and EXT2/3/4 have numerous
> > tools that recovers data - even less popular systems like reiserfs have
> > them. I haven't seen them for many application binary format.
>
> So write one. The format is visible, the code is open source. You should
> just need xml2vomit and vomit2xml or whatever format data you use.
>

Writing such conversion tools is no problem as long as the file is intact (no corruption). Writing recovery tools, on the other hand, requires many more man-hours.

> > - The more common code the more profitable optimalization is. If the
> > format is read once at startup it makes much more sense to have it more
> > readable then fast. On the other hand if it is used constantly...
>
> That argument is garbage. It's one reason why Gnome takes forever to start
> up. As a normal use you regularly start applications and desktops, you
> almost never go and emacs the XML file.
>
> Take 20,000 distro Gnome users, what percentage of them do you think have
> ever hand edited their configuration, what percentage do you think have
> ever used things like gconftool. For that matter what percentage of
> normal users do you think even understand the question "Have you ever
> hand edited your gconf database"
>
> They all start the desktop up, they all bitch about it taking forever.
>

1. Most of the (GNU/)Linux startup system and configuration is text. Why are there no performance problems there? The only binary files in /etc I have heard of are the terminfo files.
2. I have a rather old computer. Gnome starts in less than a minute (although I have not measured it).
3. I'm not speaking about users; I'm speaking about programmers. It is much easier to debug text formats than binary formats. As far as power users are concerned, they will recover data from a text-based format after some time more easily than they will hunt down the documentation of an old format. With the current gconf, if they don't know where a value is, they may not emacs the file (as gconf won't notice the change - at least it didn't in the past), but they can find the value using standard tools like find/grep etc.

> > looking through it is cheap as it is already in random access memory.
> > Since even cache operates on virtual memory as long as block is
> > continuous it makes practically no difference in speed.
>
> Reality check. On a modern hard disk a good rule of thumb is that reading
> 512Kbytes of data costs as much as a single sector read. So you want all
> your metadata in a single linear file loaded once, in order. Now whether
> that is something looking like rot13 encoded vomit, beautifully spaced
> and formatted XML or a database format is less important because the
> rotational latency and seek latency of the disk dominate any processing
> time unless your data format is extremely bogus.
>

Point taken.

> > - FS are rarely compressed.
> > Text-based formats are much compressable and
> > backup of them would take much less space.
>
> The same *information* should compress to the same size irrespective of
> the input. Thats a mathematical theoretical case but reality isn't far off
> providing any padding is consistent.

Hmm. AFAIK in some cases compressed XML files did better than purpose-designed binary formats in terms of disk-space efficiency.

> You don't want to compress critical
> backup data anyway because it means a bit error on the backup media costs
> you vastly more data.
>

Well. Settings are usually important but not critical. Home backups usually rely on compressed data.

> The only sense by which text compresses is better is the "because it was
> larger and more wasteful to begin with" sense.
>

From a mathematical standpoint you are right. From a practical one - well, at least sometimes compressed text is smaller than the binary.

> Shrug.. pluggable back ends would be nice anyway. I'd rather have my base
> preferences in hesiod

BTW - what's hesiod?

Regards
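
P.S. The compression question above is easy to try rather than argue about. Below is a minimal sketch in Python, assuming a made-up set of integer settings and two made-up encodings (an XML-like text form and a length-prefixed binary form). Neither is gconf's actual format, and the key names and helper functions are hypothetical. It only prints the raw and zlib-compressed sizes of the same information so the two can be compared:

```python
import struct
import zlib


def make_settings(n=1000):
    """Build hypothetical (key, int value) settings; purely illustrative."""
    return [(f"/apps/example/key{i}", (i * 37) % 1000) for i in range(n)]


def as_xml(settings):
    """Encode the settings as a simple XML-like text document (assumed layout)."""
    rows = "".join(
        f'  <entry name="{key}" type="int" value="{value}"/>\n'
        for key, value in settings
    )
    return ("<settings>\n" + rows + "</settings>\n").encode("utf-8")


def as_binary(settings):
    """Encode each setting as: 16-bit key length, key bytes, 32-bit value."""
    out = bytearray()
    for key, value in settings:
        key_bytes = key.encode("utf-8")
        out += struct.pack("<H", len(key_bytes))
        out += key_bytes
        out += struct.pack("<i", value)
    return bytes(out)


def sizes(blob):
    """Return (raw size, zlib-compressed size) in bytes."""
    return len(blob), len(zlib.compress(blob, 9))


if __name__ == "__main__":
    settings = make_settings()
    for name, blob in (("xml", as_xml(settings)), ("binary", as_binary(settings))):
        raw, packed = sizes(blob)
        print(f"{name}: raw={raw} compressed={packed}")
```

Which form ends up smaller after compression depends on the data and on how carefully the binary layout was designed, so the numbers from this toy case should not be generalised.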