Re: Why is mc since 4.6 shipped with a stripped down html.syntax?



Hi!

Let me introduce my humble opinion =)
I have used mc for several years, and over the last several months I got deep
into the jungle of modern HTML. What you said about HTML seems wrong to me, so
I felt an urge to join your discussion (please see below).

On Wed, 2007-05-16 at 16:44 +0200, Michelle Konzack wrote:
while I have been using mc for 8 years and it was working fine for editing
HTML files, I would like to know WHY mc since 4.6 is shipped with a stripped
down html-syntax.

The reason for the removal of the uppercase tags was because the new
XHTML standard required lowercase tags.  

Well, XHTML brings order, yet plain old HTML is still in general use.

Since mc still had no support 
for case-insensitive match, I decided that mc should do the highlighting
based on the syntax alone without checking the spelling of the tags.

Yes, simplicity is great! Less functionality is much better than incorrect
behaviour. But why not use the XML hiliter then? It renders nicely and it has
supported single-quoted attr values for a long time.

Restoring uppercase HTML tags today would be ridiculous.  

yes

Maybe using 
lowercase tags would be better, 

no

but I would prefer a fix that would 
allow case-insensitive match in the syntax rules.

Yes, I believe this is the only way to get back to matching names; both tag
and attr names should be case-insensitive.
Well, let's postpone tag/attr names for a future mc. The simple way, which you
mentioned earlier, is
based on the syntax alone without checking the spelling of the tags.

I prefer to call it the 'lexical layer' (leaving 'syntax' for basic 
interpretation). 

So, real-life HTML is in fact very different from what the W3C specs say.
It is actually shaped by the major browser vendors; all of them now process
broken HTML constructs in much the same way. It would be consistent for us to
support the same model in mc.
Well, I have never seen a 100% correct html hiliter; that's impossible within
mc's hiliting framework. But we can cover a number of frequent cases with
little effort.

Let me provide a rather raw html syntax file. It is based on yours, but
handles a number of additional cases:

- a tag starts with the sequence '<[[:alpha:]]', not just '<'
- strings can also be quoted with backticks
- a quoted string (inside a tag) can only start after a space or '=';
otherwise it's just a meaningless quote, not a string
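
To make the three rules above a bit more concrete, here is a rough Python
sketch of the same lexical idea. It is only an illustration: it is not the
attached html.syntax, it does not use mc's syntax engine, and the names
tokenize, TAG_START and QUOTES are made up for this example. It also treats
closing tags such as '</p>' as plain text, which is a simplification.

```python
import re

# Hypothetical sketch of the three lexical rules listed above; it is not
# the attached html.syntax file and does not use mc's syntax engine.
TAG_START = re.compile(r"<[A-Za-z]")  # ASCII stand-in for '<[[:alpha:]]'
QUOTES = "\"'`"                       # double, single and backtick quotes


def _emit(tokens, kind, value):
    """Append a token, skipping empty values and merging adjacent text."""
    if not value:
        return
    if kind == "text" and tokens and tokens[-1][0] == "text":
        tokens[-1] = ("text", tokens[-1][1] + value)
    else:
        tokens.append((kind, value))


def tokenize(text):
    """Split text into (kind, value) pairs: 'text', 'tag' or 'string'."""
    tokens = []
    i, n = 0, len(text)
    while i < n:
        if not TAG_START.match(text, i):
            # Rule 1: a bare '<' does not open a tag, so this is plain
            # text up to the next '<' (or the end of the input).
            j = text.find("<", i + 1)
            j = n if j == -1 else j
            _emit(tokens, "text", text[i:j])
            i = j
            continue
        # Inside a tag: scan to the closing '>', splitting out strings.
        start = i
        j = i + 1
        while j < n and text[j] != ">":
            ch = text[j]
            # Rules 2 and 3: any of the three quote characters opens a
            # string, but only right after whitespace or '='.
            if ch in QUOTES and (text[j - 1].isspace() or text[j - 1] == "="):
                end = text.find(ch, j + 1)
                if end != -1:
                    _emit(tokens, "tag", text[start:j])
                    _emit(tokens, "string", text[j:end + 1])
                    j = start = end + 1
                    continue
            # Otherwise the quote is meaningless and stays part of the tag.
            j += 1
        j = min(j + 1, n)  # include the closing '>' if there is one
        _emit(tokens, "tag", text[start:j])
        i = j
    return tokens


if __name__ == "__main__":
    for kind, value in tokenize("<a href='x.html' title=it's>"):
        print(kind, repr(value))
```

In this sketch the apostrophe in "it's" is not treated as the start of a
string, because it follows a letter rather than a space or '='. An
unterminated quote is likewise left as part of the tag.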

This hiliter is quite raw; I actually failed to hilite '>' and '=' correctly
in some cases, and there are HTML issues it does not address at all.
I really don't understand this hiliting technology very well; I just like mc
and know a bit of HTML. I hope you will find this approach useful and hack
this file, inspired by my ideas =)

--
Peter A. Kerzum

Attachment: html.syntax
Description: Text document


