Opened 13 years ago
Closed 11 years ago
#7670 closed bug (fixed)
Sniffing rule for text/html is wrong
Reported by: | pulkomandy | Owned by: | axeld |
---|---|---|---|
Priority: | normal | Milestone: | R1 |
Component: | Preferences/FileTypes | Version: | R1/Development |
Keywords: | | Cc: | |
Blocked By: | | Blocking: | |
Platform: | All | | |
Description
The current sniffing rule for HTML is:
0.40 [0:64]( -i "<HTML" | "<HEAD" | "<TITLE" | "<BODY" | "<TABLE" | "<!--" | "<META" | "<CENTER")
This looks for fragments of HTML within the first 64 bytes of the file.
However, a valid HTML document starts with a doctype declaration, which typically takes more than 64 bytes, so detection fails on most HTML files. Checking for title, body, table, meta and center therefore seems barely useful. The doctype check must be done carefully so that other XML files are not mistakenly accepted. Looking for <!DOCTYPE HTML may work.
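The failure can be reproduced outside the sniffer with a small sketch. It is not the actual sniffer code: it assumes that the range `[0:64]` bounds the offsets at which a pattern may begin (inclusive) and that `-i` means ASCII case-insensitive matching. The `sniff` helper and the sample documents are made up for illustration.

```python
# Patterns copied from the rule quoted in this ticket.
PATTERNS = [b"<HTML", b"<HEAD", b"<TITLE", b"<BODY",
            b"<TABLE", b"<!--", b"<META", b"<CENTER"]


def sniff(data: bytes, patterns, start: int = 0, end: int = 64) -> bool:
    """Return True if any pattern begins at an offset in [start, end].

    Assumption (not taken from the sniffer source): the range limits where
    a match may begin, and matching ignores ASCII case.
    """
    upper = data.upper()
    for pattern in patterns:
        p = pattern.upper()
        # Widen the slice by len(p) so a match may begin exactly at `end`.
        if upper.find(p, start, end + len(p)) != -1:
            return True
    return False


# A document that opens with the HTML 4.01 Strict doctype.
HTML4_DOC = (
    b'<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
    b'"http://www.w3.org/TR/html4/strict.dtd">\n'
    b"<html><head><title>Test</title></head><body></body></html>\n"
)

# An XML document with a doctype that must not be taken for HTML.
SVG_DOC = (
    b'<?xml version="1.0"?>\n'
    b'<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" '
    b'"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">\n'
    b"<svg/>\n"
)

# The current rule misses the document: the doctype pushes <html> past byte 64.
print(sniff(HTML4_DOC, PATTERNS))

# The suggested doctype check, anchored at offset 0.
print(sniff(HTML4_DOC, [b"<!DOCTYPE HTML"], 0, 0))
print(sniff(SVG_DOC, [b"<!DOCTYPE HTML"], 0, 64))
```

Anchoring the doctype pattern at offset 0 is only one possible choice; a real rule would probably need a small range to tolerate leading whitespace or a byte order mark.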
The rule now looks in the first 512 bytes, with better results.
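To illustrate why the wider window helps, the sketch below measures where the first of the rule's patterns appears in a sample document. The helper `first_match_offset` and the sample are hypothetical, and the patterns are assumed to be unchanged from the rule quoted in the description; only the window size comes from this ticket.

```python
# Patterns from the rule quoted in the ticket description.
PATTERNS = [b"<HTML", b"<HEAD", b"<TITLE", b"<BODY",
            b"<TABLE", b"<!--", b"<META", b"<CENTER"]


def first_match_offset(data: bytes, patterns):
    """Return the smallest offset at which any pattern begins, or None.

    Matching ignores ASCII case, mirroring the rule's -i flag.
    """
    upper = data.upper()
    hits = [upper.find(p.upper()) for p in patterns]
    hits = [offset for offset in hits if offset != -1]
    return min(hits) if hits else None


# Sample document opening with the HTML 4.01 Strict doctype.
SAMPLE = (
    b'<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
    b'"http://www.w3.org/TR/html4/strict.dtd">\n'
    b"<html><head><title>Test</title></head><body></body></html>\n"
)

offset = first_match_offset(SAMPLE, PATTERNS)
print("first pattern begins at offset", offset)
print("inside a 64-byte window: ", offset is not None and offset <= 64)
print("inside a 512-byte window:", offset is not None and offset <= 512)
```

A file with a long comment or licence header before its first tag could still fall outside 512 bytes, so the wider window improves detection without guaranteeing it.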