[xml] [IGNORE!] C++ SAX interface, high memory usage.



Hi again. Please ignore my last message, copied below. The memory leak was in the (...) part.

I'm sorry for taking your time.

Best regards,

Eduardo.

--------------------- ORIGINAL MESSAGE -------------

Hi.

I'm using libxml2 to read a large (250MB) XML file with the SAX interface. The code works, but memory usage grows by about 100 bytes per record read. The file has around 485000 <item xx="yy" zz="pp" ..> tags, and for every 10000 tags read the application's memory usage grows by 1GB. Eventually the OS kills the application before it reaches the end of the XML file.

The most relevant part of the problem seems to be here:

 void StartElementCallback(void * pData,
                          const xmlChar * name,
                          const xmlChar ** attrs) {

  * ((Data *) pData) = Data(); // Data is a struct with many members and a Data() constructor that zeroes all of them.

  while (NULL != attrs && NULL != attrs[0]) {
    printf("attribute: %s=%s\n",attrs[0],attrs[1]);

    std::ostringstream strStream;

    strStream.str("");
    strStream << attrs[0];
    std::string strAttribute = strStream.str();

    strStream.str("");
    strStream << attrs[1];
    std::string strValue = strStream.str();
   
    ...

    attrs = &attrs[2];
  }
}

My first thought was that strStream was holding a reference to attrs[n], so I tried copying attrs[n] like this:

 void StartElementCallback(void * pData,
                          const xmlChar * name,
                          const xmlChar ** attrs) {

  * ((Data *) pData) = Data(); // Data is a struct with many members and a Data() constructor that zeroes all of them.

  while (NULL != attrs && NULL != attrs[0]) {
    printf("attribute: %s=%s\n",attrs[0],attrs[1]);

    char * pKey = strdup(reinterpret_cast<const char*>(attrs[0]));

    std::string strAttribute = std::string(pKey);
    free(pKey);

    char * pValue = strdup(reinterpret_cast<const char*>(attrs[1]));
    std::string strValue = std::string(pValue);
    free(pValue);

    ...

    attrs = &attrs[2];
  }
}

But the problem persisted. So I tried this:

 void StartElementCallback(void * pData,
                          const xmlChar * name,
                          const xmlChar ** attrs) {

  * ((Data *) pData) = Data(); // Data is a struct with many members and a Data() constructor that zeroes all of them.

  while (NULL != attrs && NULL != attrs[0]) {
    printf("attribute: %s=%s\n",attrs[0],attrs[1]);

    std::string strAttribute = "1";
    std::string strValue = "2";

    ...

    attrs = &attrs[2];
  }
}

With this version, the program's memory footprint stayed constant at 2.2MB.
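
For reference, neither the ostringstream nor the strdup step is needed to get an owned copy of an attribute: constructing a std::string from a const char * already copies the characters. Below is a minimal sketch of that direct conversion, assuming attrs is the NULL-terminated array of name/value pairs that the callback receives. CopyAttributes and AttributeList are hypothetical names introduced only for illustration, and xmlChar is redefined locally as a stand-in so the sketch compiles without the libxml2 headers.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Stand-in for libxml2's xmlChar (an unsigned char typedef); assumed here
// only so that the sketch builds without the libxml2 headers.
typedef unsigned char xmlChar;

// Hypothetical container for the copied attributes.
typedef std::vector<std::pair<std::string, std::string> > AttributeList;

// Hypothetical helper: copies each name/value pair into std::string
// objects that own their storage, with no intermediate stream or strdup.
AttributeList CopyAttributes(const xmlChar ** attrs) {
  AttributeList result;
  while (NULL != attrs && NULL != attrs[0]) {
    // std::string(const char *) copies the bytes, so nothing here keeps
    // pointing into the parser's buffers after the callback returns.
    std::string strAttribute(reinterpret_cast<const char *>(attrs[0]));
    std::string strValue(reinterpret_cast<const char *>(attrs[1]));
    result.push_back(std::make_pair(strAttribute, strValue));
    attrs += 2;  // advance to the next name/value pair
  }
  return result;
}
```

Inside the callback this would be called as CopyAttributes(attrs). The returned strings are released when the list goes out of scope, so any growth would have to come from whatever stores them afterwards.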

Am I doing something wrong? Can you please help me find what the problem is?

Thanks a lot for your time.

Best regards,

Eduardo.

