Re: Concerning the hyphen confusion in BiDi (was: Re: Bidirectional Bugs in Hebrew)



> On 9 Nov 2000, Owen Taylor wrote:
> > Omer Zak <omerz actcom co il> writes:
> > >
> > > I believe that it is OK to store such a sequence as letter-number-hyphen
> > > in a Logical Hebrew ordered text file.  Such a sequence would be displayed
> > > (by Pango and other applications using the BiDi algorithms) as
> > > letter-hyphen-number (from right to left).
> >
> > But this is almost certainly the _wrong_ sequence of characters to
> > store in the file. Most likely the right sequence is:
> >
> >  HEBREW LETTER + HYPHEN + RIGHT TO LEFT MARK + NUMBER
> >
> > (Other possibilities are present as well)

I agree.
> 
> I consider the problem to be one of man-machine interface.  I personally
> feel annoyed when I have to type the unnatural-feeling sequence of HEBREW
> LETTER + DIGITS + MINUS to get the effect that I want (HEBREW LETTER +
> HYPHEN + DIGITS), whereas if I typed LTR text, I'd get directly the
> sequence which I want.
> 
The editor should be smart enough to understand that it's a hyphen and not a
minus. I am not too familiar with the Unicode and BiDi standards, but I expect
the Hebrew character code pages, as opposed to keyboards, to have a separate
code for the Hebrew hyphen (as someone on this list mentioned before, a Hebrew
hyphen should be rendered at the top of the line, unlike a minus sign or an
English hyphen). I would then expect the BiDi algorithm to treat the Hebrew
hyphen as an RTL character.

There is a very simple way, I think, to tell whether a typed sign is a
minus/dash or a hyphen: If the preceding character is a letter, then it's a
hyphen. Otherwise, it's a minus/dash (there may be issues about telling these
two apart too, but I can't think of any).
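
As a rough sketch of the rule above (not taken from any existing editor): the
function name classify_dash is hypothetical, and Python's str.isalpha() is
assumed here as a stand-in for "is a letter". It returns a label only, and
which character code the editor would then store is left open.

```python
def classify_dash(text_before: str) -> str:
    """Classify a '-' keystroke from the text that precedes the cursor.

    Returns 'hyphen' if the character just before the cursor is a letter
    (in any script, Hebrew included), and 'minus' otherwise.
    """
    if text_before and text_before[-1].isalpha():
        return "hyphen"
    # Start of text, a space, a digit or punctuation: treat as minus/dash.
    return "minus"
```

So classify_dash("kmo") gives 'hyphen', while classify_dash("kmo ") gives
'minus' because the preceding character is a space.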

Now, doing it this way will also have problems. For example, suppose I type
the following (I hope this transliteration is clear enough; it is supposed to
read like 'logical Hebrew'):

kmo-4
[i.e. kmo<hyphen>4]

intending

kmo -4

Then stepping back and inserting the space will not fix it, generating instead

kmo <hyphen>4

But I think this behavior will be much less surprising and unintuitive than the
behavior we see today, which Omer rightly complains about.
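
The limitation described above can be reproduced with a small simulation. This
is only a sketch under assumptions: the buffer is a plain list of tokens, the
names type_key, HYPHEN and MINUS are hypothetical, and the '-' key is
classified once, at the moment it is typed.

```python
HYPHEN = "<hyphen>"  # placeholder token, not a real character code
MINUS = "<minus>"    # placeholder token, not a real character code

def type_key(buffer, pos, key):
    """Insert key at index pos of buffer (a list of tokens).

    A '-' is classified from the token just before the cursor and is never
    re-examined afterwards.
    """
    if key == "-":
        prev = buffer[pos - 1] if pos > 0 else ""
        key = HYPHEN if prev.isalpha() else MINUS
    buffer.insert(pos, key)

# Type "kmo-4" from left to right (logical order).
buf = []
for i, k in enumerate("kmo-4"):
    type_key(buf, i, k)

# Step back and insert the forgotten space before the dash.
type_key(buf, 3, " ")
result = "".join(buf)  # the dash is still stored as a hyphen
```

Here result is "kmo <hyphen>4": the space arrives after the dash has already
been classified, so the stored hyphen is not turned back into a minus.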

Just my 2 pence,
	Shai.




