Discussion:
[Tutor] How to parse a mailing list thread?
chandan kumar
2015-09-19 16:16:12 UTC
Hello,

I am looking for a Python module which I can use to parse a mailing list
thread and extract some information from it.

Any pointer regarding that would be helpful.

Thanks,

Chandan Kumar
_______________________________________________
Tutor maillist - ***@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
Cameron Simpson
2015-09-20 05:41:31 UTC
Post by chandan kumar
I am looking for a python module which i can use to parse mailing thread
and extract some information from it.
Any pointer regarding that would be helpful.
You should describe where the email messages are stored. I'll presume you have
obtained the messages.

Construct a Message object from each message text. See the email.message
module:

https://docs.python.org/3/library/email.message.html#module-email.message
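
As a minimal sketch of that step, assuming each message is already available
as a string: the standard library's email.message_from_string() returns a
Message object whose headers can be read like dictionary keys. The addresses
and message text below are made up for illustration.

```python
import email

# Hypothetical raw message text; in practice this would come from your
# mailbox file or list archive.
raw = """\
From: Alice <alice@example.org>
To: tutor@example.org
Subject: How to parse a thread?
Message-ID: <1@example.org>

Hello list.
"""

msg = email.message_from_string(raw)

# Header lookup is case-insensitive and returns None if the header is absent.
print(msg["Subject"])
print(msg["Message-ID"])
# For a simple non-multipart message, get_payload() returns the body text.
print(msg.get_payload())
```

If the messages are stored as bytes (for example read from a file opened in
binary mode), email.message_from_bytes() is the equivalent call.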

Every message has a Message-ID: header which uniquely identifies it. Replies to
that message have that id in the In-Reply-To: header. (If you're parsing usenet
newsgroup messages, you want the References: header - personally I consult
both.)
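
One way to consult both headers might look like the sketch below. The helper
name parent_id and the regular expression are my own choices, not part of the
email package; the idea is to prefer In-Reply-To: and fall back to the last id
listed in References:, which by convention is the direct parent.

```python
import email
import re

# Message ids are written inside angle brackets, e.g. <abc@example.org>.
MSGID_RE = re.compile(r"<[^<>\s]+>")

def parent_id(msg):
    """Return the id of the message that msg replies to, or None.

    Hypothetical helper: tries In-Reply-To first, then References.
    """
    in_reply_to = MSGID_RE.findall(msg.get("In-Reply-To", ""))
    if in_reply_to:
        return in_reply_to[0]
    references = MSGID_RE.findall(msg.get("References", ""))
    # References lists ancestors oldest first, so the last one is the parent.
    return references[-1] if references else None

reply = email.message_from_string(
    "Message-ID: <2@example.org>\n"
    "In-Reply-To: <1@example.org>\n"
    "References: <1@example.org>\n"
    "\n"
    "A reply.\n"
)
print(parent_id(reply))
```

A message with neither header gives None, which is how the start of a thread
shows up.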

The complete specification of an email message is here:

http://tools.ietf.org/html/rfc2822

and the email.message module (and the other email.* modules) makes most of it
easily available. If you need to parse email addresses import the
"getaddresses" function from the "email.utils" module.
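
For example, a short sketch of pulling every recipient out of the To: and Cc:
headers (the message and addresses here are invented): getaddresses() takes a
list of header values and returns (display name, address) pairs.

```python
import email
from email.utils import getaddresses

msg = email.message_from_string(
    "From: Dave <dave@example.org>\n"
    "To: Alice Example <alice@example.org>, bob@example.org\n"
    "Cc: carol@example.org\n"
    "Subject: test\n"
    "\n"
    "body\n"
)

# get_all() returns every occurrence of a header as a list; the second
# argument is the default used when the header is missing.
recipients = getaddresses(msg.get_all("To", []) + msg.get_all("Cc", []))

for name, address in recipients:
    # name is an empty string when the header had a bare address
    print(repr(name), address)
```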

Construct a graph connecting messages with their replies. You're done!
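
A rough sketch of that last step, under the assumption that the raw messages
are held in a list of strings. The function names build_thread and
format_thread and the sample messages are hypothetical; the graph is just a
dict mapping each message id to the ids of its direct replies.

```python
import email
import re
from collections import defaultdict

MSGID_RE = re.compile(r"<[^<>\s]+>")

def build_thread(raw_messages):
    """Return (messages, children, roots) for a list of raw message strings."""
    messages = {}
    for raw in raw_messages:
        msg = email.message_from_string(raw)
        msg_id = msg["Message-ID"]
        if msg_id is not None:          # skip messages we cannot identify
            messages[msg_id.strip()] = msg

    children = defaultdict(list)        # parent id -> list of reply ids
    roots = []                          # messages whose parent we don't have
    for msg_id, msg in messages.items():
        # Prefer In-Reply-To, else the last id in References.
        candidates = (MSGID_RE.findall(msg.get("In-Reply-To", ""))
                      or MSGID_RE.findall(msg.get("References", ""))[-1:])
        parent = candidates[0] if candidates else None
        if parent in messages and parent != msg_id:
            children[parent].append(msg_id)
        else:
            roots.append(msg_id)
    return messages, children, roots

def format_thread(messages, children, roots):
    """Return the thread as indented subject lines, one per message."""
    lines = []
    def walk(msg_id, depth):
        subject = messages[msg_id].get("Subject", "(no subject)")
        lines.append("  " * depth + subject)
        for child in children.get(msg_id, []):
            walk(child, depth + 1)
    for root in roots:
        walk(root, 0)
    return lines

raw_messages = [
    "Message-ID: <1@example.org>\nSubject: Question\n\nHello\n",
    "Message-ID: <2@example.org>\nIn-Reply-To: <1@example.org>\n"
    "Subject: Re: Question\n\nAn answer\n",
    "Message-ID: <3@example.org>\nReferences: <1@example.org> <2@example.org>\n"
    "Subject: Re: Question\n\nThanks\n",
]

messages, children, roots = build_thread(raw_messages)
print("\n".join(format_thread(messages, children, roots)))
```

Messages whose parent is not in the collection are treated as thread starts
here; this sketch does not try to detect reply loops in malformed archives.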

Cheers,
Cameron Simpson <***@zip.com.au>
