Re: [Tracker] tracker-extract crashes in GLib: gmem.c:168: failed to allocate 2330172 bytes



Hi Martyn

On 19.05.2014 at 14:32, Martyn Russell <martyn lanedo com> wrote:
Something is not right there, what function is trying to allocate that memory and for what variable?

I don't have a debug build, so I only have the stack backtrace (SBT) I posted initially:

fe3cfed2 g_logv   (fe4669d9, 4, fe46d604, fdb3ec3c, 238e3c) + 1d2
fe3d00c2 g_log    (fe4669d9, 4, fe46d604, fe46830f, 238e3c, 1494f725) + 32
fe3ce730 g_realloc (14833008, 238e3c, fdb3ec90, 14213008, fdb3ecec, 14213008) + 80
fe4191ba _g_gnulib_vasnprintf (0, fdb3ed4c, fd0bc35f, fdb3edf8, 14833008, 4) + 66a
fe41a333 _g_gnulib_vasprintf (fdb3edac, fd0bc35f, fdb3edf8, fe481a3c, fe481a3c, 8b15e68) + 33
fe413bd4 g_vasprintf (fdb3edac, fd0bc35f, fdb3edf8, 14213008, 8b24120, ffffffff) + 34
fe3ee0d6 g_string_append_vprintf (8168260, fd0bc35f, fdb3edf8, fd0cc588, 8b15e68) + 46
fe3ee2fb g_string_append_printf (8168260, fd0bc35f, 14213008, 1d, 8b15e68, 0) + 2b
fd0ac385 tracker_sparql_builder_object_string (8b15e68, 14632008, fdb3ee5c, fd0abcee, 8168260, fcad11bd) + b5
fd0ac55c tracker_sparql_builder_object_unvalidated (8b15e68, 14632008, fcad11f0, 86c26c8, 100000, 50) + 10c
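
One thing that can be read off the trace even without a debug build: assuming the stack dump prints the raw argument words in hex (an assumption about the tool's output format), the second argument of the g_realloc frame, 238e3c, should be the requested size. A quick check in Python that it matches the size in the GLib error message:

```python
# Second argument of the g_realloc frame in the trace above, as printed (hex).
realloc_size_hex = "238e3c"
# Size reported in the GLib error message (decimal).
reported_size = 2330172

realloc_size = int(realloc_size_hex, 16)
print(realloc_size, realloc_size == reported_size)
```

So the failing allocation is the g_realloc issued from g_string_append_printf, i.e. while the SPARQL builder formats the object string, and the same value shows up again as the trailing argument of the g_log and g_logv frames.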

Yeah, you sent this before. I meant more which extractor, but I think I have some idea here because of the 
text extraction in your previous emails :)

:) Didn't I mention that before? It's libextract-text.so of course.

This is the full log from another tracker-control -r, forcing reindexing:

<http://pastebin.com/3mD9gEz8>


The log you originally sent:

 ...

 12 May 2014, 14:46:19: Tracker:   Read 65535 bytes from file, 16 bytes remaining until configured threshold is reached
 12 May 2014, 14:46:19: Tracker:   Read 16 bytes from file, 0 bytes remaining until configured threshold is reached
 12 May 2014, 14:46:19: GLib: gmem.c:168: failed to allocate 2330172 bytes

I wonder if the text extractor is broken in some way and/or the GIO code is too on Solaris?

I was wondering that too and therefore decided to defer digging into this issue until I could check whether I 
can reproduce it on Linux as well. Unfortunately I don't have the time at the moment to set up an environment 
on Linux where I can test with a Tracker version compiled from source.

Tracker is linked with glib2 from OpenCSW which is at the newest version 2.40.0:
<http://www.opencsw.org/packages/CSWglib2/>

That's the newest version. I'm currently responsible for the glib2 package in OpenCSW and packaged 2.40.0 
just a week ago.

Can you put the file somewhere to test so I can try on my local box here?

Attached.

Attachment: file1.txt.zip
Description: Zip archive



I'm using a directory with ~485 copies of this file:
$ ls -1 /tank/dbd/dir/ | wc -l
     487
$

Reading that many bytes from a 1.2 MB file isn't right either.

I wonder if you have some upper limit on memory allocations on the system?

AFAIR tracker-extract itself sets a memory limit at startup, ~256 MB:

root beast:~# /opt/csw/libexec/tracker-extract -f /tank/dbd/
Locale 'TRACKER_LOCALE_LANGUAGE' was set to 'de_DE.UTF-8'
Locale 'TRACKER_LOCALE_TIME' was set to 'de_DE.UTF-8'
Locale 'TRACKER_LOCALE_COLLATE' was set to 'de_DE.UTF-8'
Locale 'TRACKER_LOCALE_NUMERIC' was set to 'de_DE.UTF-8'
Locale 'TRACKER_LOCALE_MONETARY' was set to 'de_DE.UTF-8'
Setting priority nice level to 19
Loading extractor rules... (/opt/csw/share/tracker/extract-rules)
  Loaded rule '10-abw.rule'
  Loaded rule '10-dvi.rule'
  Loaded rule '10-epub.rule'
  Loaded rule '10-gif.rule'
  Loaded rule '10-html.rule'
  Loaded rule '10-ico.rule'
  Loaded rule '10-jpeg.rule'
  Loaded rule '10-mp3.rule'
  Loaded rule '10-msoffice.rule'
  Loaded rule '10-oasis.rule'
  Loaded rule '10-pdf.rule'
  Loaded rule '10-png.rule'
  Loaded rule '10-ps.rule'
  Loaded rule '10-tiff.rule'
  Loaded rule '11-msoffice-xml.rule'
  Loaded rule '90-text-generic.rule'
  Loaded rule '93-mplayer-generic.rule'
Extractor rules loaded
Setting memory limitations: total is 18,4 EB, minimum is 256 MB, recommended is ~1 GB
  Virtual/Heap set to 268,4 MB (50% of total or MAXLONG)
MIME type guessed as 'inode/directory' (from GIO)
tracker_mimetype_info_get_module: assertion 'info != NULL' failed
No metadata was found, or no extractors that can handle this file

Total memory is a little less than 18,4 EB of course, 64 GB AFAIR. :)
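
For reference, the two figures in the startup log above can be reproduced with simple arithmetic. This is only a sketch: that "18,4 EB" corresponds to a value of about 2^64 bytes is my guess, not something the log states, and decimal_units is a helper made up for this example. The comma in the log is just the de_DE decimal separator.

```python
MIB = 1024 * 1024


def decimal_units(n_bytes):
    """Format a byte count with decimal (SI) prefixes and one fractional digit."""
    for unit, factor in (("EB", 10**18), ("GB", 10**9), ("MB", 10**6)):
        if n_bytes >= factor:
            return f"{n_bytes / factor:.1f} {unit}"
    return f"{n_bytes} B"


limit = 256 * MIB   # the "minimum is 256 MB" from the log, read as 256 MiB
total = 2**64       # assumption: the "total" was reported as roughly 2^64 bytes

print(limit, decimal_units(limit))
print(total, decimal_units(total))
```

If that reading is right, the Virtual/Heap limit is exactly 256 MiB (268435456 bytes), and the failed 2330172-byte request is under 1% of it, so the limit would only be hit if a lot of memory was already in use at that point.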

The other thing is, the default maximum bytes to read from a text file is 1048576. So we shouldn't even be 
reaching 2330172 bytes.

See below, it's as if it's reading up to the configured maximum of 1 MB from each file into an allocated 
buffer and then fails to free the buffer. I checked that code path in the module: it uses g_content_from_file() 
(or whatever it was named) and seems to properly free the buffer afterwards.

12 Mai 2014, 14:33:31: Tracker: Extracting metadata for 'file:///tank/dbd/dir11/file185.txt'
12 Mai 2014, 14:33:31: Tracker: MIME type passed to us as 'text/plain'
12 Mai 2014, 14:33:31: Tracker: Using /opt/csw/lib/tracker-1.0/extract-modules/libextract-text.so...
12 Mai 2014, 14:33:31: Tracker:   Starting to read 'file:///tank/dbd/dir11/file185.txt' up to 1048576 bytes...
12 Mai 2014, 14:33:31: Tracker:   Read 65535 bytes from file, 983041 bytes remaining until configured threshold is reached
12 Mai 2014, 14:33:31: Tracker:   Read 65535 bytes from file, 917506 bytes remaining until configured threshold is reached
12 Mai 2014, 14:33:31: Tracker:   Read 65535 bytes from file, 851971 bytes remaining until configured threshold is reached
12 Mai 2014, 14:33:31: Tracker:   Read 65535 bytes from file, 786436 bytes remaining until configured threshold is reached
12 Mai 2014, 14:33:31: Tracker:   Read 65535 bytes from file, 720901 bytes remaining until configured threshold is reached
12 Mai 2014, 14:33:31: Tracker:   Read 65535 bytes from file, 655366 bytes remaining until configured threshold is reached
12 Mai 2014, 14:33:31: Tracker:   Read 65535 bytes from file, 589831 bytes remaining until configured threshold is reached
12 Mai 2014, 14:33:31: Tracker:   Read 65535 bytes from file, 524296 bytes remaining until configured threshold is reached
12 Mai 2014, 14:33:31: Tracker:   Read 65535 bytes from file, 458761 bytes remaining until configured threshold is reached
12 Mai 2014, 14:33:31: Tracker:   Read 65535 bytes from file, 393226 bytes remaining until configured threshold is reached
12 Mai 2014, 14:33:31: Tracker:   Read 65535 bytes from file, 327691 bytes remaining until configured threshold is reached
12 Mai 2014, 14:33:31: Tracker:   Read 65535 bytes from file, 262156 bytes remaining until configured threshold is reached
12 Mai 2014, 14:33:31: Tracker:   Read 65535 bytes from file, 196621 bytes remaining until configured threshold is reached
12 Mai 2014, 14:33:31: Tracker:   Read 65535 bytes from file, 131086 bytes remaining until configured threshold is reached
12 Mai 2014, 14:33:31: Tracker:   Read 65535 bytes from file, 65551 bytes remaining until configured threshold is reached
12 Mai 2014, 14:33:31: Tracker:   Read 65535 bytes from file, 16 bytes remaining until configured threshold is reached
12 Mai 2014, 14:33:31: Tracker:   Read 16 bytes from file, 0 bytes remaining until configured threshold is reached
12 Mai 2014, 14:33:31: Tracker: Done (2 objects added)
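
The numbers in this log are consistent with a plain chunked read that stops at the configured threshold. Here is a minimal simulation, not the actual extractor code: the 65535-byte chunk size and the 1048576-byte limit are taken from the log, simulate_reads is a hypothetical helper, and it assumes the file is large enough that every read returns a full chunk.

```python
def simulate_reads(limit, chunk):
    """Return (bytes_read, bytes_remaining) pairs for reading `limit` bytes in `chunk`-sized pieces."""
    steps = []
    remaining = limit
    while remaining > 0:
        n = min(chunk, remaining)
        remaining -= n
        steps.append((n, remaining))
    return steps


steps = simulate_reads(limit=1048576, chunk=65535)
for n, remaining in steps:
    print(f"Read {n} bytes from file, {remaining} bytes remaining")
print(len(steps), "reads,", sum(n for n, _ in steps), "bytes in total")
```

That gives 16 reads of 65535 bytes plus a final read of 16 bytes, i.e. exactly 1048576 bytes per file, which matches the log. The read itself therefore stays within the 1 MB threshold, and the 2330172-byte request must come from somewhere after it.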

Cheerio!
-Ralph


