I received several machine-generated e-mails which are all mostly the same: a notification. They are HTML emails with no plaintext MIME part. Yikes! And to complicate matters further, the messages traversed my anonaddy forwarding account, which PGP-encrypts every message to me before forwarding it to my normal email account.
The gov wants me to give them an “unaltered copy” of these e-mails. This gov office actually blocks my mail server so I am generally unwilling to send them email. This means I will be giving them the emails on paper hardcopy.
So wtf, this is tricky. They want an “unaltered copy”. If I were to print the MBOX files, it would be useless to them because it’s a base64 blob that only I can decrypt. My mail client is mutt so the HTML is detected and piped through w3m to give me a text version that is readable enough.
But in general, how do you give unaltered copies of an HTML email on paper form? This is not necessarily for a court but it could go down that path. Would a court want to see raw HTML tags? Or do courts prefer the HTML to be rendered for readability?
Normally I copy the w3m-rendered text of email into LaTeX and typeset it to look pretty and copy-paste the useful headers into a well-styled header in a monospaced font. And I omit the useless headers. But I get the impression my way of working would not pass for “unaltered”.
I could perhaps try to feed the HTML into wkhtmltopdf. In the end, HTML rendering always varies depending on the rendering tool. Normies use MS Outlook, and I have to figure that the gov is normally dealing with normies. So maybe I should install Evolution or Thunderbird. Any suggestions for a tool that is particularly good at making HTML email presentable on paper without looking too custom?
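Whichever renderer ends up being used, the HTML part has to be pulled out of the message first. A minimal Python sketch of that step, assuming the message has already been decrypted and saved as one raw message file (the name `notice.eml` is made up for illustration). It uses only the standard library `email` package and does not touch the markup, so what comes out is the HTML as sent, after transfer decoding.

```python
from email import message_from_bytes, policy
from typing import Optional


def extract_html_body(raw: bytes) -> Optional[str]:
    """Return the HTML body of an already-decrypted message, or None if it has none."""
    # policy.default gives the modern EmailMessage API (get_body, get_content).
    msg = message_from_bytes(raw, policy=policy.default)
    part = msg.get_body(preferencelist=("html",))
    if part is None:
        return None
    # get_content() undoes base64/quoted-printable and decodes the charset;
    # the HTML itself is left exactly as the sender wrote it.
    return part.get_content()


if __name__ == "__main__":
    # "notice.eml" is a hypothetical file name, not something from this thread.
    with open("notice.eml", "rb") as fh:
        html = extract_html_body(fh.read())
    if html is not None:
        with open("notice.html", "w", encoding="utf-8") as out:
            out.write(html)
```

The resulting `notice.html` can then be opened in a browser or handed to an HTML-to-PDF tool and printed from there.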
#askFedi
I'm on my phone right now. When I get home I'll dig them up.
I might be able to get by without the script. I just found that I can render the body in Firefox well enough (that often fails but it works with the particular emails I’m dealing with), fiddle with the paper format and scale to exactly fit a page, and then import it into LaTeX, rescale, and attach a header. If you’ve already got the script ready then I would be happy to take it anyway and compare the script output to what I’m manually rigging up. But if you’ve not started then no worries. Thanks!
(edit)
fwiw to anyone with the same need, I found this project: https://github.com/nickrussler/email-to-pdf-converter It looks a bit messy to install on my distro and I’m not sure of EML / Mbox differences, so I’m not planning to use it myself.
I'd kind of forgotten how I'd done it.
This script searches an MBOX file for emails from or to lawyer1 and lawyer2 that contain the names or email addresses in target_names. It exports each email to a txt file, and saves any attachments in their original format.
https://pastebin.com/i0xq4fP9
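In case the paste goes away, here is a rough Python sketch of that kind of search. It is not the pastebin script, just an illustration of the same idea using the standard library `mailbox` and `email` modules. The function names, the numbered output file names and the choice of headers written to the txt file are all assumptions.

```python
import mailbox
from email import message_from_bytes, policy
from pathlib import Path


def body_text(msg):
    """Return the plain-text body if there is one, else the HTML body, else ''."""
    part = msg.get_body(preferencelist=("plain", "html"))
    return part.get_content() if part is not None else ""


def matches(msg, lawyers, target_names):
    """True if a lawyer is a sender/recipient and any target name appears in the message."""
    parties = " ".join(str(msg.get(h, "")) for h in ("From", "To", "Cc")).lower()
    if not any(lawyer.lower() in parties for lawyer in lawyers):
        return False
    haystack = " ".join((parties, str(msg.get("Subject", "")), body_text(msg))).lower()
    return any(name.lower() in haystack for name in target_names)


def export_matching(mbox_path, out_dir, lawyers, target_names):
    """Write each matching message to NNNNN.txt and its attachments beside it."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    box = mailbox.mbox(mbox_path, create=False)
    exported = 0
    try:
        for key in box.iterkeys():
            # Parse one message at a time so a huge mbox is never held in memory.
            msg = message_from_bytes(box.get_bytes(key), policy=policy.default)
            if not matches(msg, lawyers, target_names):
                continue
            exported += 1
            stem = f"{exported:05d}"
            headers = "".join(
                f"{h}: {msg.get(h, '')}\n" for h in ("Date", "From", "To", "Cc", "Subject")
            )
            (out / f"{stem}.txt").write_text(headers + "\n" + body_text(msg), encoding="utf-8")
            for n, att in enumerate(msg.iter_attachments(), 1):
                # Keep only the base name so a crafted filename cannot escape out_dir.
                name = Path(att.get_filename() or f"attachment{n}").name
                payload = att.get_payload(decode=True) or b""
                (out / f"{stem}_{name}").write_bytes(payload)
    finally:
        box.close()
    return exported
```

Something like `export_matching("all.mbox", "out", ["lawyer1@example.org", "lawyer2@example.org"], ["jane doe"])` would then fill `out/` with the txt files and attachments (those addresses and names are placeholders). Messages with broken charsets or odd MIME structure may need extra error handling that this sketch leaves out.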
Then I used this bash script to export the txt files to PDF, using pandoc (https://pandoc.org/)
https://pastebin.com/17FPXPr5
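The conversion step can also be driven from Python instead of bash. This is only a sketch of the idea, not the pastebin script. It assumes pandoc is on the PATH and that a PDF engine pandoc can use (normally a LaTeX install) is present. The helper names are made up.

```python
import subprocess
from pathlib import Path


def pandoc_command(txt_path):
    """Build the pandoc call that turns foo.txt into foo.pdf in the same directory."""
    src = Path(txt_path)
    return ["pandoc", str(src), "-o", str(src.with_suffix(".pdf"))]


def convert_directory(directory):
    """Run pandoc on every .txt file in the directory, stopping at the first failure."""
    for txt in sorted(Path(directory).glob("*.txt")):
        subprocess.run(pandoc_command(txt), check=True)
```

As far as I know pandoc reads a .txt input as Markdown unless told otherwise, so characters like `*` or `#` in an email body can change how the page looks. That is worth checking against the original if the copy has to count as unaltered.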
In my case, we needed to export like 7,000 emails from over a 6 year period from like a 45 GB GMail MBOX export. The lawyers seemed happy with the result, but it was a lot of data.
Thanks! I grabbed it in case it comes in handy. I wonder if the first script which searches for messages might have been simplified by using grepmail. Grepmail is slow but powerful.
This is slow too.