Opened 16 months ago

Last modified 16 months ago

#18205 new bug

Emails get identified with MIME text/plain

Reported by: humdinger
Owned by: nobody
Priority: normal
Milestone: Unscheduled
Component: Kits/Storage Kit
Version: R1/beta4
Keywords:
Cc:
Blocked By:
Blocking:
Platform: All

Description

There's no sniffer rule for text/x-email. Newly downloaded emails are assigned the correct MIME type, but older emails (that for some reason lost their file type...) are identified as text/plain.

If you "Force identify" (hold SHIFT when opening the File menu) a formerly correctly typed email, it gets assigned "text/plain".

I think we need a sniffer rule. Wonder why it worked in the past (AFAIK, it may never have worked and nobody noticed...)

Change History (7)

comment:1 by pulkomandy, 16 months ago

There can't be a sniffer rule for this since the email file's content is just text and there are no headers to detect it by. The magic is all in the file attributes, which a sniffing rule doesn't have access to.

How did the MIME type end up being lost? This shouldn't happen. In this case it's up to the mailing applications to set and preserve the file type, I'd say.

comment:2 by humdinger, 16 months ago

How the MIME type got lost, I don't know. As I wrote, you can make an email lose its MIME type by "Force identify". Then it's a text/plain.

Emails do have headers... Just open one in a text editor, or use Mail's "View | Show header".

comment:3 by nephele, 16 months ago

pulkomandy meant a header in the sense of a line preceding the plaintext content, not the mail headers as such. If you open the email in Koder do you see the headers before the plaintext content?

The problem here really is that our sniffer is incapable, in general, of detecting email. The structure of a multipart MIME email is detectable, but not with the text matching we have. At some point we may want to think about adding a way to run more complex sniffer add-ons that give the final verdict, kind of like a tie breaker (for example to detect source code vs plain text).

In this case it sounds like Force identify is available in the GUI but does not do what users expect: it throws away any a priori knowledge we have of the MIME type (for example from the mail app or a web server) and instead guesses what it is based on the file stream, disregarding any extended attributes and any file name hints (like the extension).
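
The precedence described here can be sketched in a few lines. This is only an illustration of the behaviour as described in this comment, not Haiku's actual code: `resolve_mime_type`, `sniff_plain` and the `force` flag are hypothetical names, and the stored type stands in for whatever the mail app wrote into the file's type attribute.

```python
from typing import Callable, Optional


def resolve_mime_type(
    stored_type: Optional[str],
    sniff: Callable[[bytes], str],
    content: bytes,
    force: bool = False,
) -> str:
    """Return the stored type unless it is missing or force is set."""
    # Normal case: trust what the mail app (or a web server) told us.
    if stored_type and not force:
        return stored_type
    # "Force identify" or a lost type: guess from the file content only.
    return sniff(content)


def sniff_plain(content: bytes) -> str:
    # Stand-in for a sniffer that has no rule for emails (assumption).
    return "text/plain"
```

With a sniffer that knows no email rule, forcing identification on a file typed text/x-email yields text/plain, which matches what the ticket reports.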

comment:4 by humdinger, 16 months ago

If you open the email in Koder do you see the headers before the plaintext content?

Yes. First line is always: Delivered-To: humdingerb@gmail.com
Second line starts with: Received: by
Then come some lines that may differ by mail server or sender's application (X-Google-Smtp-Source:, ARC-Seal:, ARC-Message-Signature:, Return-Path:). All emails should contain: Received: from , Content-Type: , MIME-Version: , From: , To: , Date: , Message-ID: , Subject: etc.

Maybe in sum, those make for an OK sniffer rule?
All these headers may have been inserted by the mail_daemon when it originally fetched the mail, I don't know, but that's OK IMO. Where else can you get single email files from?
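
As a rough illustration of the "in sum" idea, the check could count how many distinct well-known header names appear in the leading header block. This is a sketch in Python, not Haiku sniffer rule syntax; the function name `looks_like_email` and the threshold of four hits are assumptions, and the header names are the ones listed above.

```python
# Header names taken from the list above; the set is not exhaustive.
KNOWN_HEADERS = {
    b"received", b"content-type", b"mime-version", b"from", b"to",
    b"date", b"message-id", b"subject", b"delivered-to", b"return-path",
}


def looks_like_email(data: bytes, min_hits: int = 4) -> bool:
    """Guess whether data starts with an email-style header block."""
    # The header block ends at the first empty line.
    head = data.replace(b"\r\n", b"\n").split(b"\n\n", 1)[0]
    seen = set()
    for line in head.split(b"\n"):
        # Folded continuation lines start with whitespace; skip them.
        if not line or line[:1] in (b" ", b"\t"):
            continue
        name, sep, _ = line.partition(b":")
        if not sep:
            # A line without "Name:" inside the header block: give up.
            return False
        name = name.strip().lower()
        if name in KNOWN_HEADERS:
            seen.add(name)
    # Require several distinct known headers (threshold is an assumption).
    return len(seen) >= min_hits
```

Requiring several distinct headers rather than one fixed first line avoids depending on what a particular mail server or mail_daemon happens to write first.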

comment:5 by nephele, 16 months ago

It depends on how you store them. Unix, for example, used to store emails in the mbox format, which keeps all emails in one file.

Anyhow, yes, if the email headers are stored in the file itself and not in the extended attributes, we can sniff for them fine. But we still can't easily distinguish multipart from single-part emails :) (that is, emails that include images, a second HTML version, etc.), but we probably don't need that yet either.
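
For comparison, telling multipart from single-part mails is straightforward once a real MIME parser is available instead of plain text matching. The sketch below uses Python's standard `email` package purely as an illustration; it is not Haiku's Mail Kit, and `is_multipart_email` and the sample messages are made up for the example.

```python
from email import message_from_bytes


def is_multipart_email(raw: bytes) -> bool:
    """Parse the raw message and report whether it has several MIME parts."""
    return message_from_bytes(raw).is_multipart()


# A single-part mail: one text/plain body.
single = b"From: a@example.org\r\nContent-Type: text/plain\r\n\r\nHello\r\n"

# A multipart mail: a plain text part plus an HTML version.
multi = (
    b"From: a@example.org\r\n"
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/alternative; boundary="XYZ"\r\n'
    b"\r\n"
    b"--XYZ\r\nContent-Type: text/plain\r\n\r\nHello\r\n"
    b"--XYZ\r\nContent-Type: text/html\r\n\r\n<p>Hello</p>\r\n"
    b"--XYZ--\r\n"
)
```

Here `is_multipart_email(multi)` is true and `is_multipart_email(single)` is false; the parser follows the boundary declared in the Content-Type header, which is the part that simple pattern matching can't do.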

comment:6 by humdinger, 16 months ago

I don't think I have any multipart emails, or I don't know how to distinguish them. I do have some partially downloaded mails (text/x-partial-email) - for whatever reason, I certainly haven't limited the download size in the mail_daemon settings.
When opening such a partial mail with Mail, it gets downloaded. If I first change its MIME type to text/x-email, Mail shows an empty body. Maybe Mail can learn to change the MIME type of empty-body mails to text/x-partial-email and pass it to mail_daemon as usual to download the rest.
Then the sniffer rule wouldn't have to distinguish partially downloaded mails either.

comment:7 by nephele, 16 months ago

Multipart emails are very common nowadays.

I speculate mail_daemon might create empty mails when viewing an IMAP directory and download them later? I don't know though. I don't think Mail can distinguish empty emails from half-downloaded ones though. Probably better to not change the MIME type if mail_daemon has set it.
