Ask HN: How would you classify an email signature using NLP?
This is a challenge I have been working on for a few days now.
I have 150 GB of mail that I want to work with from an NLP perspective.
I want to run several different classifications on the emails.
But the first fun problem I ran into was how to identify the signature in an email.
I ended up manually picking around 500 email signatures and training a model to recognize them. The model performs horribly.
How would you do that?
Use an LLM to classify a large representative sample, then train a model on that.
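To make that a bit more concrete, here is a rough sketch of the "train a model on that" half, treating the problem as per-line classification (is this line part of the signature or not). Everything in it is an assumption for illustration: the features, the tiny perceptron, and the toy emails are made up, and the True/False labels are hand-written stand-ins for whatever an LLM would return on a real sample. It uses only the Python standard library, so treat it as a starting point rather than a tuned solution.

```python
import re
from collections import defaultdict

# Hypothetical hand-picked patterns; a real corpus would need more.
SIGNOFF = re.compile(
    r"^\s*(best|regards|kind regards|cheers|thanks|thank you|sincerely)\b", re.I
)
PHONE = re.compile(r"\+?\d[\d\s().-]{6,}\d")
EMAIL_OR_URL = re.compile(r"\S+@\S+|https?://|www\.", re.I)


def line_features(lines, i):
    """Return binary feature names for line i of an email."""
    line = lines[i].strip()
    feats = ["bias"]
    if 2 * i >= len(lines):
        feats.append("second_half")      # signatures sit near the end
    if SIGNOFF.match(line):
        feats.append("signoff")
    if PHONE.search(line):
        feats.append("phone")
    if EMAIL_OR_URL.search(line):
        feats.append("contact")
    if 0 < len(line.split()) <= 4:
        feats.append("short")
    if line.endswith((".", "?", "!")):
        feats.append("sentence_end")     # body text tends to end sentences
    return feats


def train_perceptron(examples, epochs=10):
    """Plain perceptron over binary features; examples are (feats, label)."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for feats, label in examples:
            predicted = sum(weights[f] for f in feats) > 0
            if predicted != label:
                delta = 1.0 if label else -1.0
                for f in feats:
                    weights[f] += delta
    return weights


def label_lines(weights, lines):
    """Predict True (signature) or False (body) for every line."""
    return [
        sum(weights.get(f, 0.0) for f in line_features(lines, i)) > 0
        for i in range(len(lines))
    ]


# Toy training set. The labels stand in for LLM output (assumption).
labeled_emails = [
    (["Hi Tom,",
      "Can you send me the report by Friday?",
      "I need it for the board meeting.",
      "Thanks,",
      "Alice Smith",
      "Phone: +1 555 123 4567"],
     [False, False, False, True, True, True]),
    (["Hello team,",
      "The deploy is scheduled for tonight.",
      "Please avoid merging after 6pm.",
      "Best regards,",
      "Bob Jones",
      "bob@example.com"],
     [False, False, False, True, True, True]),
    (["Hi,",
      "Sounds good to me.",
      "Let us talk tomorrow.",
      "See you then!"],
     [False, False, False, False]),
]

examples = [
    (line_features(lines, i), labels[i])
    for lines, labels in labeled_emails
    for i in range(len(lines))
]
weights = train_perceptron(examples)

new_email = [
    "Hi Sam,",
    "The invoice is attached.",
    "Kind regards,",
    "Carol White",
    "www.example.com",
]
predicted = label_lines(weights, new_email)
for line, is_signature in zip(new_email, predicted):
    print("SIG " if is_signature else "BODY", line)
```

Classifying line by line instead of whole emails means each labeled email yields several training examples, and positional features (how close to the end a line is) come for free. With a large LLM-labeled sample you would probably swap the perceptron for something stronger, but the feature and label layout can stay the same.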