Re: [orca-list] Copying Math Expressions from Web pages



Hello,
heh, it seems we are in a somewhat similar position, although we come
from different backgrounds. I have been using Linux as my main operating
system for about 3 months now, and I have problems with reading math too.

I may be wrong, and I hope someone corrects me if so, but I don't see a
way to do this without some kind of speech logger either.
But if you really want to do it this way, there is an option. Some time
ago I made a project called Chinfusor:
https://rastisoftslabs.com/2020/07/22/chinfusor-a-universal-solution-for-reading-texts-in-foreign-alphabets-on-linux/

It's a speech module for Speech Dispatcher that acts as a middle layer
between clients and speech modules, routing parts of text written in
various alphabets to specified engines with a specified configuration.
But never mind that now; the important thing is that it's a speech
module that receives all text spoken by Orca. If you want, I can make a
special build for you that allows recording incoming speech requests
with a keyboard shortcut and copying the recorded text to the
clipboard.
Thus, you would be able to copy an equation like this:
* Press the shortcut.
* Make Orca speak the equation you want to get.
* Press the shortcut again; the collected text will appear in your
  clipboard.
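
The record-then-copy steps above can be sketched roughly as follows.
This is only an illustration of the toggle logic, not Chinfusor's actual
code: the class SpeechRecorder and its methods are hypothetical names,
speak() stands in for an incoming speech request, and the clipboard step
is left out.

```python
class SpeechRecorder:
    """Collects text passed to speak() while recording is switched on."""

    def __init__(self):
        self.recording = False
        self._chunks = []

    def toggle(self):
        """Start recording, or stop and return everything collected."""
        if not self.recording:
            self._chunks = []
            self.recording = True
            return None
        self.recording = False
        return " ".join(self._chunks)

    def speak(self, text):
        """Hypothetical stand-in for a speech request from the screen reader."""
        if self.recording:
            self._chunks.append(text)


recorder = SpeechRecorder()
recorder.speak("ignored, recording is off")
recorder.toggle()                    # first shortcut press
recorder.speak("p 1 plus p 2")
recorder.speak("over n 1 plus n 2")
captured = recorder.toggle()         # second shortcut press
print(captured)
```

In a real build, the string returned by the second toggle() would be
handed to the clipboard instead of being printed.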

Personally, though, I would most likely go with some kind of translator
to LaTeX or a similar format, such as:
https://sourceforge.net/projects/xsltml/

Can you program in JavaScript? If so, plugins like TamperMonkey can
make copying MathML notation much more flexible. Technically, if you
combine one with the stylesheets above, you should be able to get the
LaTeX form directly. I can't vouch for the quality of the stylesheets,
as I haven't used them myself; I have just seen some people using them,
so they'd be my first try if I wanted functionality like this.

Personally, I find Orca's current MathML interpretation quite weak. Not
only does it lack structural navigation of math, but even the basic
reading does quite strange things, such as on this page:
https://openstax.org/books/calculus-volume-2/pages/3-6-numerical-integration

Combined with the long pauses between individual equation elements,
this makes reading more complex math formulas quite hard.
So I'm working on my own MathML parser to address this. I originally
planned to create a presentation markup parser along with a content
markup parser, but considering that even Firefox doesn't support
content markup yet, I'm now thinking about another plan: splitting the
project into two parts, which would allow me to release it much sooner
than originally expected.

It's not ready yet, but once I finish implementing the tags, I will
need beta testers to check it with various formulas. I want a
rock-solid parser. I had quite bad experiences with JAWS on Windows,
where many formulas simply got the interpreter stuck, driving CPU usage
up to 80 or 90%. I don't want anything similar on Linux, but that means
the program must be tested first. Would you be interested in this?
If so, I would also like to ask which area of mathematics you use
most. For example, I study artificial intelligence and deep learning,
which is built almost completely on multidimensional calculus; in
theory that could be covered quite easily, as it consists of equations
rather than mathematical statements.
Abstract algebra, however, is a different area: it deals with
mathematical proofs and makes rich use of mathematical language, which
means a lot of Unicode symbols unknown to speech synthesizers. I
haven't checked yet exactly how many of them are in the Unicode table,
but I can imagine there are a lot. Mapping them all will surely be a
drag and will take somewhat more time than the basic presentation
markup implementation, so if you work in this area, it may require
somewhat more intensive preparation.
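
As a possible starting point for the symbol mapping, here is a minimal
Python sketch that falls back on the official Unicode character names
from the standard unicodedata module. The helper names spoken_name and
speakable are made up for this illustration, and the official names are
often clumsier than what a mathematician would actually say, so a
hand-made table would still be needed on top.

```python
import unicodedata


def spoken_name(symbol):
    """Return the lower-cased Unicode name of a character, if it has one."""
    name = unicodedata.name(symbol, None)
    if name is None:
        return "unknown symbol U+%04X" % ord(symbol)
    return name.lower()


def speakable(text):
    """Replace every non-ASCII character with its Unicode name."""
    words = []
    for ch in text:
        if ch.isspace():
            continue
        words.append(ch if ch.isascii() else spoken_name(ch))
    return " ".join(words)


# "\u2200" is the FOR ALL sign and "\u2208" is the ELEMENT OF sign.
print(speakable("\u2200 x \u2208 G"))  # prints: for all x element of G
```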

Best regards

Rastislav
On Saturday, 8 August 2020 at 13:03 +0200, Ishe Chinyoka wrote:

Hi Rastislav,

Thanks for your helpful suggestions regarding the various technologies
used for mathematical presentation. Actually, I was trying to copy what
Orca was saying. As an example, the formula for the pooled estimate is
presented as:
(p1 + p2)/(n1 + n2)
When I move my cursor to where the expression is, I do not have the
option to copy the formula. I tried opening the context menu to bring
up any other options for that formula, but failed. So what I ended up
doing was literally typing into my text editor what Orca was saying.
But I feel that this is an inefficient way to deal with lots of
formulas.

However, from what you are saying, I think I am getting a better idea
of what I have to do: work with the HTML source of the page. The only
setback I see in that is when I am participating in MOOC courses such
as those at Coursera, where I have to answer a question after making
some calculations. In that situation, I may have no other way to review
the formula in time.

I will have to explore those other options like MathML and LaTeX. The
latter is what I often use for my daily work in preparing some
material. I find that LaTeX is the most accessible format out there
when it comes to maths. My issue right now is reading maths on Linux,
which has been my operating system of choice for the past two months.

Thanks once again.

Ishe




On Saturday, 8 August 2020, at 04:05, Rastislav Kiss via orca-list
<orca-list gnome org> wrote:
Hello,
it depends heavily on the particular formula you're viewing.

The common HTML format for math expressions is called MathML. It allows
web pages to contain math formulas in an XML-like form, which is
supported by all major browsers and can be rendered quite easily.
There are two branches of MathML:
* Presentation markup
* Content markup

The first one describes math expressions in a rather visual form. For
example, v^2 is represented as v with superscript 2, and it's not
specified whether the 2 is an index of a vector or a power.
The second one focuses instead on the meaning of expressions and their
individual parts; the previous example would in this case be specified
explicitly as either a power or an index.
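
To make the difference concrete, here is a small Python sketch that
holds the v^2 example in both branches and lists the element names of
each. The markup strings are hand-written illustrations (msup for
presentation; apply with power or selector for content), not taken from
any particular page, and local_names is a helper invented for this
example.

```python
import xml.etree.ElementTree as ET

NS = 'xmlns="http://www.w3.org/1998/Math/MathML"'

# Presentation markup: just "v with a superscript 2".
presentation = "<math " + NS + "><msup><mi>v</mi><mn>2</mn></msup></math>"

# Content markup: explicitly "v raised to the power 2" ...
as_power = "<math " + NS + "><apply><power/><ci>v</ci><cn>2</cn></apply></math>"

# ... or explicitly "element 2 of the vector v".
as_index = ("<math " + NS + "><apply><selector/>"
            "<ci type=\"vector\">v</ci><cn>2</cn></apply></math>")


def local_names(markup):
    """List element names in document order, without the namespace part."""
    root = ET.fromstring(markup)
    return [element.tag.split("}")[1] for element in root.iter()]


print(local_names(presentation))  # ['math', 'msup', 'mi', 'mn']
print(local_names(as_power))      # ['math', 'apply', 'power', 'ci', 'cn']
print(local_names(as_index))      # ['math', 'apply', 'selector', 'ci', 'cn']
```

The presentation form is identical for both meanings, while the content
forms differ in the operator element, which is what makes content markup
easier to convert back.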

In practice, both branches are mixed together, so an author can make
use of both the visual and the semantic expressiveness of MathML.
Exactly how this is done is... kind of more complicated; you can read
more about it here if you want:
https://www.w3.org/TR/MathML2/chapter5.html

Spoiler: after the first few sections, it gets super boring. :)

While presentation markup can be translated to correct math notation
for very simple equations, it gets significantly harder with more
complex ones; that's the reason why there are not many MathML to LaTeX
converters around. Such a task needs heuristics, which may or may not
work, depending on the quality of the algorithm and the formula being
processed.
Content markup is ideal for back conversion, as it contains all the
necessary information without distracting elements.
Which markup is used in a particular formula depends heavily on the
software used, and the fact that both can be used together doesn't help
much.

Thus, for copying these equations, I would most likely use mathematical
software that can deal with both markups to give you the best possible
results. I can't recommend anything specific, as I haven't needed this
myself, but there should be a few such programs around on the net.

And... if you're lucky, there is a third option. MathML contains a
semantics tag, which can be used to annotate parts of a math
expression, for example a presentation markup with content markup. But
it can also hold non-MathML content, like the LaTeX form of the viewed
equation.
If this is your case, you have won. One program that commonly does this
is Pandoc; it annotates all equations with their LaTeX form, so they
can be copied very easily.
Selecting the equation should do the job; if not, saving the page as
txt could, or in the worst case, examining the HTML code and searching
for the annotation tag.
Sadly, I haven't seen a common place where equations have this kind of
attachment, but I wasn't really looking for it, so you may find a few
where it is available.
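
Searching for the annotation tag can also be automated. The following
is a minimal Python sketch under a few assumptions: the MathML has
already been cut out of the page as well-formed XML, the LaTeX
annotation is labelled with encoding="application/x-tex", and the sample
markup is hand-written for illustration (it uses the pooled estimate
formula from this thread). The function name latex_annotations is
invented here.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Hand-written sample: presentation markup plus a LaTeX annotation.
SAMPLE = r"""
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <semantics>
    <mfrac>
      <mrow><msub><mi>p</mi><mn>1</mn></msub><mo>+</mo>
            <msub><mi>p</mi><mn>2</mn></msub></mrow>
      <mrow><msub><mi>n</mi><mn>1</mn></msub><mo>+</mo>
            <msub><mi>n</mi><mn>2</mn></msub></mrow>
    </mfrac>
    <annotation encoding="application/x-tex">\frac{p_1 + p_2}{n_1 + n_2}</annotation>
  </semantics>
</math>
""".strip()


def latex_annotations(markup):
    """Return the text of every annotation element that is marked as TeX."""
    root = ET.fromstring(markup)
    found = []
    for annotation in root.iter("{" + MATHML_NS + "}annotation"):
        if annotation.get("encoding") == "application/x-tex":
            found.append((annotation.text or "").strip())
    return found


for latex in latex_annotations(SAMPLE):
    print(latex)
```

If a page uses a different encoding label, the comparison in
latex_annotations would have to be adjusted accordingly.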

Best regards

Rastislav
On Thursday, 6 August 2020 at 15:19 +0200, Ishe Chinyoka via orca-list
wrote:
Hi,


I am finding Orca's handling of maths to my liking, as it is able to
say out all the maths expressions I come across. But I am failing to
copy those formulas from Firefox into another application. How do I go
about copying a maths expression?

Currently, copying any expression yields the following string at the
place where the expression should be: "[Math Processing Error]".

TIA,







