• bleistift2@sopuli.xyzOP
    2 days ago

    I have been able to outsource low level parsing to third party libraries

    Hahaha!!!

    Today I watched a Java server crash because a library decided it needed more than 3 GB of heap space to read a 10 MB file. That was after I had manually removed background colors from around 100,000 cells, which apparently caused the parser to create even more objects in its internal representation of the sheet.

    • folekaule@lemmy.world
      2 days ago

      Yeah, I get it. I’ve had many libraries fail me in as many ways, which is why I consider myself lucky not to have to implement my own. I work in .NET these days, but there have been times when I had to just dig into the XML inside the xlsx and use XML tools. Those were mostly one-offs, thankfully.
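For anyone wondering what "digging into the XML inside the xlsx" can look like: an .xlsx file is a ZIP archive of XML parts, so a rough sketch in Python, using only the standard library, might stream one worksheet part instead of building the whole sheet in memory. The function name `iter_raw_cells` and the default part path `xl/worksheets/sheet1.xml` are assumptions for illustration (the real path depends on the workbook), and the sketch returns raw `<v>` text only: cells with `t="s"` hold an index into `xl/sharedStrings.xml`, which is not resolved here.

```python
import zipfile
import xml.etree.ElementTree as ET

# SpreadsheetML main namespace used by worksheet parts in .xlsx files.
NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def iter_raw_cells(xlsx, sheet_part="xl/worksheets/sheet1.xml"):
    """Yield (cell_ref, raw_value) pairs from one worksheet part.

    Hypothetical helper: `sheet_part` is an assumed default, and raw_value
    is the unresolved text of the cell's <v> element (None if it has none).
    """
    with zipfile.ZipFile(xlsx) as archive:
        with archive.open(sheet_part) as part:
            # iterparse reads the part incrementally rather than all at once.
            for _event, elem in ET.iterparse(part, events=("end",)):
                if elem.tag == NS + "c":
                    value = elem.find(NS + "v")
                    yield elem.get("r"), (value.text if value is not None else None)
                elif elem.tag == NS + "row":
                    # Drop the finished row's cells so they can be freed.
                    # The emptied <row> elements themselves stay attached.
                    elem.clear()
```

Usage would be along the lines of `for ref, raw in iter_raw_cells("report.xlsx"): ...`, where `report.xlsx` is a placeholder name. Styled-but-empty cells (such as ones that only carry a background color) still show up, just with a value of `None`.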

      Back when I did Java, I had a frustrating experience with IBM’s libxml causing our app to crash after several days due to a memory leak. I didn’t have access to the production environment, so it took me probably 3 weeks to find the cause, and only after digging through a crash dump provided by the sysadmin. Not related, but you triggered my traumatic memory :)