Re: beagle problem with indexing pdf files

From: "D Bera" <dbera web gmail com>
To: "Jan Falkenhagen" <spam to f lkenhagen de>
Cc: dashboard-hackers gnome org
Subject: Re: beagle problem with indexing pdf files
Date: Tue, 25 Apr 2006 14:12:29 -0400

> On Mon, 2006-04-24 at 21:11 +0200, Jan Falkenhagen wrote:
> > I've got a problem with beagle indexing PDF documents. Sometimes PDF
> > indexing causes very high CPU and memory consumption by the pdftotext
> > process (>1.5 GB of RAM). However, I guess that this occurs only with
> > some broken PDF files. Is there any simple way to trace this
> > behaviour back to the files that are causing the problem?
>
> This is a bug in the xpdf software (from which pdftotext comes), and I'd
> suggest you report a bug to their developers.
>
> If you'd like, you can attach a broken PDF to a beagle bug and we can
> see what we can do to work around it.

Or if someone wants to write a managed PDF parser (in C#), that would
be cool too :). There are some C# PDF libraries out there that could be
used, e.g. iTextSharp (it's under the LGPL; is that compatible with
beagle's licensing?)
http://itextsharp.sourceforge.net/
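
As for tracking down which files trigger the problem: one rough way,
outside of beagle itself, is to run pdftotext by hand on every PDF under
a directory with a time and memory cap, and list the files that fail.
The sketch below is only an illustration. The function names, the
60-second timeout and the 512 MB cap are assumptions, not anything
beagle does, and the memory cap relies on the Unix-only `resource`
module. It assumes pdftotext is on the PATH and that "-" as the output
file sends the text to stdout.

```python
import resource
import subprocess
import sys
from pathlib import Path


def _limit_memory(max_bytes):
    """Return a hook that caps the child's address space (Unix only)."""
    def apply():
        resource.setrlimit(resource.RLIMIT_AS, (max_bytes, max_bytes))
    return apply


def check_file(path, command=("pdftotext",), timeout=60,
               max_bytes=512 * 1024 * 1024):
    """Run the extractor on one file.

    Returns None if it finished cleanly, otherwise a short reason string.
    All defaults here are illustrative assumptions.
    """
    args = list(command) + [str(path), "-"]
    try:
        result = subprocess.run(
            args,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
            preexec_fn=_limit_memory(max_bytes) if max_bytes else None,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this.
        return "timeout"
    if result.returncode != 0:
        return f"exit code {result.returncode}"
    return None


def find_problem_files(root, **kwargs):
    """Check every *.pdf under root; map failing paths to a reason."""
    problems = {}
    for path in sorted(Path(root).rglob("*.pdf")):
        reason = check_file(path, **kwargs)
        if reason:
            problems[str(path)] = reason
    return problems


if __name__ == "__main__" and len(sys.argv) > 1:
    for name, reason in find_problem_files(sys.argv[1]).items():
        print(f"{reason}\t{name}")
```

Saved under a name of your choice and given a directory as its only
argument, it prints one line per suspicious file. A file that hits the
memory cap would most likely show up as a non-zero exit code rather
than a timeout, so both kinds of result are worth a look.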

--
-----------------------------------------------------
Debajyoti Bera @ http://dbera.blogspot.com
beagle / KDE fan
Mandriva / Inspiron-1100 user

Follow-Ups:
- Re: beagle problem with indexing pdf files
  - From: Kevin Kubasik
- Re: beagle problem with indexing pdf files
  - From: Tomasz Torcz

References:
- beagle problem with indexing pdf files
  - From: Jan Falkenhagen
- Re: beagle problem with indexing pdf files
  - From: Joe Shaw
