Syntax highlighting README

On Wed, May 14, 2003 at 06:36:16PM -0400, Pavel Roskin wrote:
> On Wed, 14 May 2003, Marco Ciampa wrote:
> > file ..\*\\.(rb|RB)$ Ruby\sProgram ^#!\s\*/.\*/ruby
> > include ruby.syntax
> Thanks, applied!
> > Please note that this is a preliminary version and any comment and
> > suggestions are welcome!
> Some other rules highlight numbers, but I don't really care.  I leave it
> to those who actually program in Ruby.
> You may want to consider brighter highlighting for function names after
> "def" because there are too few other syntax elements around to make
> function definitions easy to distinguish.
I'll investigate how to do it.

I thought it would be useful to put this 'cut & paste' from the cooledit
manual, for syntax highlighting hacking, in the /syntax dir as a
README file.


Marco Ciampa

Syntax highlighting README (derived from cooledit manual)

Syntax highlighting means that keywords and contexts (like
C comments, string constants, etc.) are highlighted in
different colours. The following section explains the
format of the file ~/.mc/cedit/Syntax.

The file ~/.mc/cedit/Syntax is rescanned on opening of any
new  editor file. It contains a list of file types and how
to identify what rule set the text you are editing belongs
to.  The file token dictates how to match up your text. On
the same line as  a  file  token  must  appear  a  regular
expression to match the filename, a string to be displayed
on the left of the editor window for description purposes,
and  a  regular  expression to match the first line of the
file. If either of the regular expressions match, the file
is deemed to have the particular type. For example

file ..\*\\.(py|PY)$ Python\sProgram ^#!\s\*/.\*/python

This will cause a file to be labelled as Python Program if
it contains, say, #!/usr/bin/python on the first line OR if
its name ends in, say, .py.

Note that *, + and \ have to be escaped with a \, and a
space must be represented with a \s.
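
For instance, a line for Perl scripts could be written by
analogy with the Python line above (the Perl entry is only
an illustration and is not taken from any shipped Syntax
file):

file ..\*\\.(pl|PL)$ Perl\sProgram ^#!\s\*/.\*/perl

Once the escapes are undone, the filename expression reads
..*\.(pl|PL)$ and the first-line expression reads
^#! */.*/perl, while the description shown in the editor is
Perl Program.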

After the file keyword may come the include keyword. The
include keyword says to load a rule set from a separate
file, and is the preferred way of adding new rule sets.
The path from where it loads defaults to cooledit/syntax/
under the lib/ directory where you installed Cooledit. See
the examples in your own Syntax file and in this directory.
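
A file line followed by an include line might look like
this (python.syntax is a hypothetical file name, chosen by
analogy with ruby.syntax):

file ..\*\\.(py|PY)$ Python\sProgram ^#!\s\*/.\*/python
include python.syntax

The rule set itself, that is the context and keyword lines
described below, would then live in python.syntax rather
than in the Syntax file.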

Each rule set is divided into contexts, and  each  context
contains  keyword definitions. A context is a scope within
the text that a particular set of keywords applies to. For
instance,  the region within a C style quote (i.e. between
" quotations) has its own separate colour  and  hence  its
own separate context. Within it, the normal C tokens, like
if and while, will not apply, but %d should be highlighted
in  a  different colour. Contexts are usually for when you
have something  that  must  be  coloured  across  multiple
lines.  The  default context contains the list of keywords
to fall back on should there be no other  applicable  con-
text. This is usually normal programming code.

A trivial C programming rule set might look like this:

file .\*\\.c C\sProgram\sFile (#include|/\\\*)

wholechars abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_

# default colors
context default
  keyword  whole  if       yellow/24
  keyword  whole  else     yellow/24
  keyword  whole  for      yellow/24
  keyword  whole  while    yellow/24
  keyword  whole  do       yellow/24
  keyword  whole  switch   yellow/24
  keyword  whole  case     yellow/24
  keyword  whole  static   yellow/24
  keyword  whole  extern   yellow/24
  keyword         {        brightcyan/14
  keyword         }        brightcyan/14
  keyword         '*'      green/6

# C comments
context /\* \*/ brown/22

# C preprocessor directives
context linestart # \n brightred/18
  keyword  \\\n  yellow/24

# C string constants
context " " green/6
  keyword  %d    yellow/24
  keyword  %s    yellow/24
  keyword  %c    yellow/24
  keyword  \\"   yellow/24

Each context starts with a line of the form:
context [exclusive] [whole|wholeright|wholeleft] [linestart]
        delim [linestart] delim [foreground] [background]

One exception is the first context. It must start with the line
context default [foreground] [background]
or else cooledit will return an error.

The linestart option dictates that delim must start at the
beginning of a line.

The whole option tells that delim must be a whole word.
What constitutes a whole word is a set of characters that
can be changed at any point in the file with the
wholechars command. The wholechars command at the top of
the example just sets the set exactly to its default and
could therefore have been omitted. To specify that a word
must be whole on the left only, you can use the wholeleft
option, and similarly wholeright on the right. The left and
right sets of characters can be set separately with
wholechars [left|right] characters

The exclusive option causes the text  between  the  delim-
iters to be colourised, but not the delimiters themselves.
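
For instance, the C comment context of the example could be
declared with the exclusive option (a variation for
illustration, not part of the original rule set):

context exclusive /\* \*/ brown/22

Going by the description above, the comment text would then
be brown, while the /* and */ delimiters keep the colour of
the surrounding context.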

Each rule is a line of the form:
keyword  [whole|wholeright|wholeleft]  [linestart]  string
foreground [background]

Important to note is the line
  keyword  \\\n  yellow/24
This line defines a keyword containing the \ and newline
characters. Because keywords have a higher precedence
than context delimiters, this keyword prevents the context
from ending at the end of a line if the line ends in a \,
thus allowing C preprocessor directives to continue across
multiple lines.

The colours themselves need to apply to the Midnight
Commander internal editor as well as to Cooledit. Therefore
the form seen in the examples above, such as yellow/24,
is used. See some of the many rule sets given for examples
of using this. Usually the background colour is omitted,
thus defaulting to the usual background colour.

Context or keyword strings are interpreted so that you can
include tabs and spaces with the sequences \t and \s.
Newlines and the \ are specified with \n and \\ respectively.
Since whitespace is used as a separator, it may not be
used explicitly. Also, \* must be used to specify a *,
and \+ to specify a +. The * itself is a wildcard that
matches any length of characters. The + is like the * but
matches a length of non-whitespace characters only. For
example,
  keyword         '+'      green/6
  keyword         '\s'      green/6
colours all C single character constants green. You could
also have used
  keyword         "*"      green/6
to colour string constants, except that the matched string
may not cross newlines.
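
The preprocessor context from the example could, for
instance, be extended with two wildcard keywords (the
colours are simply reused from the example above and are an
arbitrary choice):

context linestart # \n brightred/18
  keyword  \\\n     yellow/24
  keyword  /\**\*/  brown/22
  keyword  <+>      green/6

In /\**\*/ the two \* sequences are literal asterisks and
the bare * between them is the wildcard, so a comment that
opens and closes on the directive's own line is coloured.
<+> colours a header name such as <stdio.h>, since + only
matches non-whitespace characters.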

The \{ wild card matches any character that exists
between it and its matching \}. For example, the following
matches C style octals:
  keyword '\\\{0123\}\{01234567\}\{01234567\}' brightgreen/16
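
By the same pattern, a two-digit hexadecimal escape such as
'\x41' could be matched as follows (this rule is an
illustration and is not part of the original example):
  keyword '\\x\{0123456789abcdefABCDEF\}\{0123456789abcdefABCDEF\}' brightgreen/16
Here \\ is the literal backslash and, as in the octal rule,
each \{ \} group stands for one of the listed characters.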

The \[ \] wild card is similar and matches any number of
characters.
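
Assuming, as the description suggests, that \[ \] matches a
run of the listed characters, a hexadecimal constant such
as 0x1F could be sketched as:
  keyword  0x\{0123456789abcdefABCDEF\}\[0123456789abcdefABCDEF\]  brightgreen/16
The \{ \} group requires a first hex digit and the \[ \]
group takes up whatever hex digits follow. This rule is an
illustration only and has not been taken from a shipped
rule set.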

All wild cards may be used within context delimiters as
well, but you cannot have a wildcard as the first character
of a context delimiter. In addition, using a wildcard
as the first character of a keyword impacts hugely on
performance.

Comments may be included on a line of their own and begin
with a #.

Because of the simplicity of the implementation, there are
a few intricacies that will not be coped with correctly,
but these are a minor irritation. On the whole, a broad
spectrum of quite complicated situations is handled with
these simple rules. It is a good idea to take a look at
the syntax file to see some of the nifty tricks you can do
with a little imagination. If you can't get by with the
rules I have coded, and you think you have a rule that
would be useful, please email me with your request. However,
do not ask for regular expression support, because
this is flatly impossible.

A  useful  hint  is  to  work as much as possible with the
things you can do rather than try to do things  that  this
implementation can't cope with. Also remember that the aim
of syntax highlighting is to make programming  less  prone
to error, not to make code look pretty.
