[evince] PDF annotation support (tentative design)



Hi *.

tl;dr: I want to work on proper annotations in evince and do this "ask
first" thingy.


Recently I had to read a PDF document that contained annotations other
than text annotations (i.e. highlight, strikethrough and underline) with
some markup attached to them. While poppler renders these annotations,
there was no way to access the associated markup text and I had to
resort to okular in order to read them.

To help make evince the best document viewing application, I took a
look at the evince and poppler source code over the weekend to see what
needs to be done in order to support additional annotation types.

Currently, evince annotations only allow one (as in "exactly one")
rectangular area to be interactive. The problem with things like
highlight annotations is that if they span multiple lines, their area is
not rectangular anymore. If it were, highlighting the last word in a
line and the first word of the next line would result in the two
complete lines being clickable, not just those two words. For hyperlink
annotations that span multiple lines, poppler creates a separate link
area for each piece of the link (I don't know whether this stems from
the PDF standard itself or can be considered a quirk in poppler), so
this is not a problem here.
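
To make the bounding-box problem concrete, here is a minimal sketch in
plain C. The Rect type and all function names are made up for this
illustration and are not evince or poppler API; the two pieces are
assumed to be axis-aligned for simplicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an axis-aligned area (not the evince type). */
typedef struct { double x1, y1, x2, y2; } Rect;

static bool rect_contains(const Rect *r, double x, double y)
{
    return x >= r->x1 && x <= r->x2 && y >= r->y1 && y <= r->y2;
}

/* Smallest rectangle enclosing all pieces; this is what a single
 * bounding box per annotation amounts to. */
static Rect rect_union(const Rect *pieces, size_t n)
{
    assert(pieces != NULL && n >= 1);
    Rect u = pieces[0];
    for (size_t i = 1; i < n; i++) {
        if (pieces[i].x1 < u.x1) u.x1 = pieces[i].x1;
        if (pieces[i].y1 < u.y1) u.y1 = pieces[i].y1;
        if (pieces[i].x2 > u.x2) u.x2 = pieces[i].x2;
        if (pieces[i].y2 > u.y2) u.y2 = pieces[i].y2;
    }
    return u;
}

/* Hit test against the individual pieces instead of their union. */
static bool pieces_contain(const Rect *pieces, size_t n, double x, double y)
{
    for (size_t i = 0; i < n; i++)
        if (rect_contains(&pieces[i], x, y))
            return true;
    return false;
}
```

With one piece for the last word of a line and one for the first word
of the next line, a point in the middle of the first line lies inside
the union rectangle but inside neither piece, which is exactly the
over-large clickable area described above.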

The current code structure uses EvMapping to map from interactive area
to dynamic elements (like annotations) and this EvMapping is created
from PopplerAnnotMapping which defines a bounding rectangle that
completely contains the annotation.

poppler recently added some missing pieces for properly supporting
annotations of this kind, which use a list of quadrilaterals to define
their interactive area (<https://bugs.freedesktop.org/show_bug.cgi?id=51487>).

Long story short, I want to work on annotation support in evince and see
two possible solutions:

Instead of using PopplerAnnotMapping to define the annotation's area, I
create one EvMapping for each quadrilateral I find for the annotation
and reference the exact same EvAnnotation object in each of these
mappings.
This leaves all the libview parts untouched and still working (I hope)
but may have undesirable side-effects: AFAIUI the bounding box is
intended merely for focussing and displaying purposes and is the area
that should be displayed if a certain annotation is selected in the UI.
If the annotation now has several such areas, focussing annotations may
not behave as expected or we have to build a union area first.
I would probably implement this by not returning an EvAnnotation from
ev_annot_from_poppler_annot() but an EvMappingList and then appending
that list to the return value of pdf_document_annotations_get_annotations().
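
A rough sketch of this first option, in self-contained C rather than
against the real evince types: Area, Annot and Mapping are simplified
stand-ins (Mapping is only loosely modelled on the area-plus-pointer
idea of EvMapping), and all function names are hypothetical. It also
shows the "build a union area first" step that focussing would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct { double x1, y1, x2, y2; } Area;     /* stand-in rectangle  */
typedef struct { const char *contents; } Annot;     /* stand-in annotation */
typedef struct { Area area; void *data; } Mapping;  /* stand-in mapping    */

/* One mapping per piece, all pointing at the same annotation.
 * Returns a malloc'ed array of n mappings (caller frees), or NULL. */
static Mapping *mappings_for_annot(Annot *annot, const Area *pieces, size_t n)
{
    assert(annot != NULL && pieces != NULL && n > 0);
    Mapping *m = malloc(n * sizeof *m);
    if (m == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++) {
        m[i].area = pieces[i];
        m[i].data = annot;
    }
    return m;
}

/* Return the data of the first mapping containing (x, y), or NULL. */
static void *mappings_find(const Mapping *m, size_t n, double x, double y)
{
    for (size_t i = 0; i < n; i++)
        if (x >= m[i].area.x1 && x <= m[i].area.x2 &&
            y >= m[i].area.y1 && y <= m[i].area.y2)
            return m[i].data;
    return NULL;
}

/* Union of all areas that reference `data`, e.g. for focussing.
 * Returns false if no mapping references `data`. */
static bool mappings_union_for(const Mapping *m, size_t n,
                               const void *data, Area *out)
{
    bool found = false;
    for (size_t i = 0; i < n; i++) {
        if (m[i].data != data)
            continue;
        if (!found) {
            *out = m[i].area;
            found = true;
            continue;
        }
        if (m[i].area.x1 < out->x1) out->x1 = m[i].area.x1;
        if (m[i].area.y1 < out->y1) out->y1 = m[i].area.y1;
        if (m[i].area.x2 > out->x2) out->x2 = m[i].area.x2;
        if (m[i].area.y2 > out->y2) out->y2 = m[i].area.y2;
    }
    return found;
}
```

The lookup code stays as simple as it is today, but anything that wants
"the" area of an annotation has to go through something like
mappings_union_for() instead of reading a single mapping.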


A more elaborate but not so quick fix would require a change in how
annotations are handled in evince in general. To accommodate the
requirements for interaction as well as displaying, an EvAnnotation must
have both quadrilaterals (which may also be slanted, for that matter)
and the bounding box rectangle (which is always aligned to page borders)
defined by EvMapping. libview would then use the quadrilaterals of an
annotation to handle interactions and e.g. the mouse pointer shape, and
resort to the bounding box for focussing, refreshing the cairo context
or similar tasks.
Depending on whether links are really supposed to be split into many
pieces, this could also be used to unify the code for annotations and
links, maybe providing a list of all links in the sidebar (which is
impossible now).
Also, I'm not sure whether EvMappingList is the right structure for this
sort of thing. Maybe EvMappings should not have a gpointer member but
some sort of EvInteractiveArea interface, and EvAnnotation, EvImage and
other users of EvMapping should implement this interface. This interface
would then expose the implementation's definition of what is considered
interactive via some
ev_interactive_area_is_interactive_at_location() function. This would
make it easier to use the same codepaths for all interactive "things" in
a document (including forms and such).
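
The interface idea could look roughly like the following. This is only
a sketch in plain C with a function pointer instead of a real GObject
interface, and every name in it (InteractiveArea, QuadAnnot, ...) is
hypothetical. It assumes each quadrilateral is convex and that its four
corners are listed in order around the outline; real PDF data may need
reordering first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct { double x, y; } Point;
typedef struct { Point p[4]; } Quad;  /* corners in order around outline */

/* Minimal "interface": one virtual function. */
typedef struct InteractiveArea InteractiveArea;
struct InteractiveArea {
    bool (*is_interactive_at)(const InteractiveArea *self, double x, double y);
};

/* An annotation whose interactive area is a list of quadrilaterals. */
typedef struct {
    InteractiveArea base;  /* first member, so a base pointer can be cast */
    const Quad *quads;
    size_t n_quads;
} QuadAnnot;

/* z component of (b - a) x (c - a); its sign says on which side of the
 * edge a->b the point c lies. */
static double cross(Point a, Point b, Point c)
{
    return (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
}

/* Inside (or on the border of) a convex quad, for either winding:
 * the point must not lie on different sides of different edges. */
static bool quad_contains(const Quad *q, Point pt)
{
    bool pos = false, neg = false;
    for (int i = 0; i < 4; i++) {
        double c = cross(q->p[i], q->p[(i + 1) % 4], pt);
        if (c > 0) pos = true;
        if (c < 0) neg = true;
    }
    return !(pos && neg);
}

static bool quad_annot_is_interactive_at(const InteractiveArea *self,
                                         double x, double y)
{
    const QuadAnnot *a = (const QuadAnnot *) self;
    Point pt = { x, y };
    for (size_t i = 0; i < a->n_quads; i++)
        if (quad_contains(&a->quads[i], pt))
            return true;
    return false;
}

static void quad_annot_init(QuadAnnot *a, const Quad *quads, size_t n)
{
    assert(a != NULL && (quads != NULL || n == 0));
    a->base.is_interactive_at = quad_annot_is_interactive_at;
    a->quads = quads;
    a->n_quads = n;
}

/* Generic entry point: the caller does not need to know the concrete type. */
static bool interactive_area_is_interactive_at_location(
    const InteractiveArea *area, double x, double y)
{
    return area->is_interactive_at(area, x, y);
}
```

An image or form field would provide its own is_interactive_at
implementation (probably a plain rectangle test), and the view code
would only ever call the generic function. The slanted case works too:
a point can be inside the bounding box of a slanted quadrilateral and
still be reported as not interactive.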


From a design point of view, I think the second option is a much cleaner
approach, even though it still requires some polishing and I'm far too
unfamiliar with the code to think of all the corner-cases. Any help in
this regard is very much appreciated.

Any comments on this?


Regards,
  Thomas

P.S.: Sorry if evince-list is not for discussing -devel things; from the
archive I'm not quite sure whether evince-list is only a user/support
list or also for development discussions. Feel free to move this
discussion to bgo if that's more appropriate.



