Opened 4 years ago

Closed 4 years ago

#12163 closed enhancement (duplicate)

MIME sniffing can't reliably detect html vs xhtml

Reported by: haiqu
Owned by: pulkomandy
Priority: normal
Milestone: Unscheduled
Component: Applications/WebPositive
Version: R1/Development
Keywords:
Cc:
Blocked By: #11800
Blocking:
Has a Patch: no
Platform: All


The following snippet of code shows an error in WebPositive:

<div id="project-header">
  <a href="/"><img src="/home/furius-logo-w.png" id="logo"></a>
  <div id="project-home"><a href="..">Project Home</a></div>

It will fail to display and throw an error at the second line. Replacing it with the following code works:

<div id="project-header">
  <a href="/"><img src="/home/furius-logo-w.png" id="logo"></img></a>
  <div id="project-home"><a href="..">Project Home</a></div>

but since the img tag is asymmetrical (it takes no closing tag in HTML), the code itself is wrong.

Change History (6)

comment:1 Changed 4 years ago by pulkomandy

"an error" isn't very descriptive of what you get. Let me grab my crystal ball...

I think you get a parse error because Web+ detected your file or page as XHTML while you are writing HTML. For remote files, Web+ uses the MIME type given in the HTTP headers. For local files, it uses the MIME type attribute of the file. That attribute is set by the MIME sniffing rules (you can see them in preferences/filetypes), but the detection is not perfect and will sometimes mistake HTML for XHTML. You can manually adjust the attribute (use listattr, catattr, and addattr in Terminal to experiment with attributes) and set the correct MIME type.

Correct code in xhtml could be (notice the / at the end to close the img):

<img src="/home/furius-logo-w.png" id="logo"/>
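To illustrate the difference, here is a minimal Python sketch (not WebPositive's actual parser) that feeds both variants of the img tag to a strict XML parser and to a lenient HTML parser. The snippets are shortened from the report and given closing tags so that only the img element differs; the helper names are made up for this example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Shortened from the report; closing tags added so only <img> differs.
SNIPPET_HTML = (
    '<div id="project-header">'
    '<a href="/"><img src="/home/furius-logo-w.png" id="logo"></a>'
    '</div>'
)
SNIPPET_XHTML = (
    '<div id="project-header">'
    '<a href="/"><img src="/home/furius-logo-w.png" id="logo"/></a>'
    '</div>'
)


def parses_as_xml(markup: str) -> bool:
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


class TagCollector(HTMLParser):
    """Lenient HTML parser that only records the start tags it sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def html_start_tags(markup: str) -> list:
    """Return the start tags found when the markup is read as HTML."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


if __name__ == "__main__":
    print("HTML variant as XML: ", parses_as_xml(SNIPPET_HTML))
    print("XHTML variant as XML:", parses_as_xml(SNIPPET_XHTML))
    print("HTML variant as HTML:", html_start_tags(SNIPPET_HTML))
```

The strict parser rejects the unclosed img with a tag mismatch, much like the error in the report, while the lenient parser reads both variants without complaint.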

comment:2 Changed 4 years ago by haiqu

It wasn't a very descriptive error message.

error on line 13 at column 64: Opening and ending tag mismatch: img line 0 and a

Here's the head code. The file was machine generated:

<?xml version="1.0" encoding="iso-8859-1" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "">
<html xmlns="" xml:lang="en" lang="en">
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<meta name="generator" content="Docutils 0.8.1:" />
<title>Version Control Tools and Other Utilities</title>
<link rel="stylesheet" href="../style.css" type="text/css" />

It was opened locally. Hope all that helps to understand if it's a bug.

comment:3 Changed 4 years ago by waddlesplash

Resolution: invalid
Status: new → closed

PulkoMandy is correct, that's invalid XHTML. You have to close the <img> tag for it to be valid XHTML.

comment:4 Changed 4 years ago by haiqu

Since it's html that's irrelevant.

Note: Content="text/html"


comment:5 Changed 4 years ago by pulkomandy

Resolution: invalid (removed)
Status: closed → reopened
Summary: WebPositive error on img tag between <a></a> tags → MIME sniffing can't reliably detect html vs xhtml
Type: bug → enhancement

But there is an XML header and a doctype saying XHTML Transitional, so this file presents itself as XML. Only once you start parsing it and reach the meta tag would you know to switch to HTML (and probably restart parsing from the beginning?).

So, you can force the MIME type of the file to "text/html" using the addattr command.

I'm switching the ticket to enhancement and changing the description. I'm not sure what we can do to improve our sniffing system. It is based on simple rules so that it stays reasonably fast, and XHTML and HTML may simply be too similar for us to reasonably hope to tell them apart, especially in cases like your example.
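As a rough illustration of the idea above (letting the meta tag override the XML header), here is a hypothetical Python sketch. It is not how Haiku's sniffing rules or WebKit work; the function name, the regular expressions, and the choice of "application/xhtml+xml" as the XHTML type are all assumptions made for this example.

```python
import re

# Hypothetical patterns for this sketch only; not Haiku's sniffer rules.
XML_DECL = re.compile(r'^\s*<\?xml\b', re.IGNORECASE)
XHTML_DOCTYPE = re.compile(
    r'<!DOCTYPE\s+html\s+PUBLIC\s+"-//W3C//DTD XHTML', re.IGNORECASE
)
META_CONTENT_TYPE = re.compile(
    r'<meta\s+[^>]*http-equiv\s*=\s*["\']Content-Type["\'][^>]*'
    r'content\s*=\s*["\']\s*([^;"\'\s]+)',
    re.IGNORECASE,
)


def guess_markup_type(head: str) -> str:
    """Guess a MIME type from the first part of a document.

    A Content-Type declared in a meta tag wins; otherwise an XML
    declaration or an XHTML doctype means XHTML; otherwise plain HTML.
    """
    meta = META_CONTENT_TYPE.search(head)
    if meta:
        return meta.group(1).lower()
    if XML_DECL.search(head) or XHTML_DOCTYPE.search(head):
        return "application/xhtml+xml"
    return "text/html"


if __name__ == "__main__":
    # Head of the file from comment:2, with the elided URLs left empty.
    head = (
        '<?xml version="1.0" encoding="iso-8859-1" ?>\n'
        '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "">\n'
        '<html xmlns="" xml:lang="en" lang="en">\n'
        '<meta http-equiv="Content-Type" '
        'content="text/html; charset=iso-8859-1" />\n'
    )
    print(guess_markup_type(head))
```

Even this small sketch has to scan past the XML declaration and doctype before it can decide, which is the cost concern raised above for a sniffer built on simple, fast rules.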

comment:6 Changed 4 years ago by pulkomandy

Blocked By: 11800 added
Resolution: duplicate
Status: reopened → closed