Ask YC: Favorite email parsing library?

Any recommendations on a good library for parsing emails in mbox format? I'm pretty agnostic as to language (I don't know Python, but if the library is that good, I will learn it). I mostly care that it takes care of all the character encoding business for me. I've been using the Perl Mail::Box suite, and handling encodings with it is just a mess.

Thanks everyone!
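
For reference, here is a minimal sketch of what the task looks like with Python's standard library, assuming Python 3.6 or later. It is one possible approach, not a recommendation from the thread. The function name `read_mbox` is made up for illustration; `mailbox.mbox`, `email.message_from_bytes` and `email.policy.default` are standard-library names. With `policy.default`, encoded headers and text bodies come back as ordinary strings.

```python
import mailbox
from email import message_from_bytes, policy


def read_mbox(path):
    """Yield (subject, body_text) pairs as str for each message in an mbox file.

    Hypothetical helper for illustration only.
    """
    box = mailbox.mbox(path, create=False)
    try:
        for key in box.iterkeys():
            # Re-parse the raw bytes with the modern policy so that
            # RFC 2047 headers and charsets are decoded for us.
            msg = message_from_bytes(box.get_bytes(key), policy=policy.default)
            subject = msg["subject"]
            # Prefer a text/plain part, fall back to text/html.
            part = msg.get_body(preferencelist=("plain", "html"))
            # get_content() undoes base64/quoted-printable and the charset.
            # It can raise LookupError for charsets Python does not know.
            body = part.get_content() if part is not None else ""
            yield ("" if subject is None else str(subject)), body
    finally:
        box.close()
```

Something like `for subject, body in read_mbox("inbox.mbox"): print(subject)` would then list the decoded subjects (the file name is a placeholder). Real-world mail often declares the wrong charset or none at all, so expect to add error handling around `get_content()`.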

  • Leave the parsing to someone else and use Plan 9's upas mail file system

    http://doc.cat-v.org/bell_labs/upas_mail_system/

    http://www.quanstro.net/plan9/nupas.pdf

    now available for plan9port on Unix

    we don't need no stinking APIs and Libs, give us grep awk sed and cat

  • The best parser I've found for PHP is the PEAR Mail_mimeDecode library. It takes a bit of time to figure out the header parsing, but it's pretty decent at handling the UW torture test.

  • RMail for Ruby totally rocks for parsing emails. I haven't used it for the mbox format, but I think it can handle it.

  • Hmm, I've parsed email before. Is there something specific to email encoding, compared with general files, such as parsing the "encoding" header thingy? Anything else?

  • EPS (Email Parsing System) is intended to give people the ability to write their own email processing tools. Whether you want to process incoming and outgoing emails, or just analyze a message, this package is intended to aid in that endeavor.

    http://www.inter7.com/index.php?page=eps

  • upasfs(4) http://man.cat-v.org/plan_9/4/upasfs

    Edit: Damn, skwiddor beat me to it :)
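
On the question above about what is specific to email encoding: unlike a plain file, a message usually stacks several layers. Headers can carry RFC 2047 "encoded words", the body has a Content-Transfer-Encoding (base64 or quoted-printable), and the decoded bytes are in whatever charset the Content-Type header declares. The sketch below peels those layers by hand with Python's standard `email` package. The sample message and the name `decode_layers` are invented for illustration.

```python
from email import message_from_bytes
from email.header import decode_header, make_header

# Invented sample message: Latin-1 text, quoted-printable body,
# RFC 2047 encoded-word in the Subject header.
RAW = (
    b"Subject: =?iso-8859-1?q?Caf=E9?=\r\n"
    b"MIME-Version: 1.0\r\n"
    b"Content-Type: text/plain; charset=iso-8859-1\r\n"
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b"Caf=E9 au lait\r\n"
)


def decode_layers(raw):
    """Return (raw_subject, subject, charset, body) for a single-part message."""
    # The default (legacy) policy leaves headers undecoded, which makes
    # each layer visible.
    msg = message_from_bytes(raw)

    # Layer 1: RFC 2047 encoded words in headers.
    raw_subject = msg["Subject"]
    subject = str(make_header(decode_header(raw_subject)))

    # Layer 2: Content-Transfer-Encoding (quoted-printable here) -> bytes.
    payload = msg.get_payload(decode=True)

    # Layer 3: charset declared in Content-Type -> str.
    # Falling back to ASCII is an assumption; real mail may need guessing.
    charset = msg.get_content_charset() or "ascii"
    body = payload.decode(charset, errors="replace")
    return raw_subject, subject, charset, body
```

Calling `decode_layers(RAW)` returns the undecoded subject next to its decoded form, which shows why a naive line-by-line read of an mbox file produces mojibake. Multipart messages add one more step: walking the parts (for example with `msg.walk()`) and repeating layers 2 and 3 for each text part.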