Opened 10 years ago
Last modified 10 years ago
#11591 new bug
C++ source file is identified as HTML
Reported by: | waddlesplash | Owned by: | bonefish |
---|---|---|---|
Priority: | normal | Milestone: | R1 |
Component: | Servers/registrar | Version: | R1/Development |
Keywords: | | Cc: | |
Blocked By: | | Blocking: | |
Platform: | All | | |
Description
As in title. The source file that has this problem is attached.
This probably happens because it has some HTML tag names in it...
Attachments (1)
Change History (5)
by , 10 years ago
Attachment: HTSearchParser.cpp added
comment:1 by , 10 years ago
Yes, any file with "<title" in the first 512 chars is identified as HTML, with a priority of 0.4. The source code rule matches files starting with "" or "/*", or having a #include or #ifdef in the first 32 chars, but with a priority of only 0.20, so the HTML rule wins whenever both match.
MIME sniffing can't always make a perfect guess, and these two rules are probably the fuzzier ones. You can try to move the "<title" tag further down in your file so it isn't in the first 512 chars anymore.
I'm not sure if fixing the sniffing rules is possible. Do you have an idea what could be done?
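To make the interaction between the two rules concrete, here is a minimal sketch of priority-based sniffing. It is not the registrar's actual code: the `Pattern` and `SniffRule` structs, the `Matches`, `Sniff` and `DescribedRules` functions, and the MIME type strings are assumptions made for illustration. Only the patterns that are legible in the comment above are included, with the windows and priorities quoted there.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical pattern: the text must start at an offset below `window`.
// A window of 1 therefore means "the file starts with this text".
struct Pattern {
    std::string text;
    std::size_t window;
};

// Hypothetical rule: a MIME type, a priority, and alternative patterns.
struct SniffRule {
    std::string mimeType;
    double priority;
    std::vector<Pattern> patterns;
};

// True if at least one pattern of the rule starts inside its window.
bool Matches(const SniffRule& rule, const std::string& data)
{
    for (const Pattern& pattern : rule.patterns) {
        // find() returns the first occurrence, which is the only one
        // that can lie inside a window anchored at offset 0.
        std::size_t pos = data.find(pattern.text);
        if (pos != std::string::npos && pos < pattern.window)
            return true;
    }
    return false;
}

// Returns the MIME type of the highest-priority matching rule,
// or an empty string if no rule matches.
std::string Sniff(const std::vector<SniffRule>& rules, const std::string& data)
{
    const SniffRule* best = nullptr;
    for (const SniffRule& rule : rules) {
        if (Matches(rule, data)
            && (best == nullptr || rule.priority > best->priority)) {
            best = &rule;
        }
    }
    return best != nullptr ? best->mimeType : std::string();
}

// The two rules as described in this ticket. The source-code priority is a
// parameter so that the effect of raising it can be tried out.
std::vector<SniffRule> DescribedRules(double sourcePriority = 0.20)
{
    return {
        SniffRule{"text/html", 0.40, {Pattern{"<title", 512}}},
        SniffRule{"text/x-source-code", sourcePriority,
            {Pattern{"/*", 1}, Pattern{"#include", 32}, Pattern{"#ifdef", 32}}},
    };
}
```

With the default priorities, a file that starts with "/*" and also mentions "<title" within its first 512 chars comes out as HTML. Calling `DescribedRules(0.5)` instead makes the source-code rule win for the same input, at the cost of misidentifying HTML files that happen to start with a C-style comment.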
comment:3 by , 10 years ago
Can we give more weight to anything that has "/*" and "#ifdef" in the first 64 chars? Because the file obviously has far more C/C++ keywords than HTML ones.
comment:4 by , 10 years ago
Is there no weight at all given to the file extension? To me that feels like a much more likely indicator of the type than having "<title" in the first 512 chars.
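One way to picture the suggestion is to treat the extension as an additional score on top of the content-based priorities. The sketch below is hypothetical throughout: `TypeForExtension`, `GuessType`, the small extension table, and the `extensionWeight` parameter are invented for illustration and say nothing about how the registrar currently handles file names.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical lookup from a file name's extension to a MIME type.
// Returns an empty string for unknown or missing extensions.
std::string TypeForExtension(const std::string& fileName)
{
    static const std::map<std::string, std::string> kTable = {
        {"cpp", "text/x-source-code"},
        {"h", "text/x-source-code"},
        {"html", "text/html"},
    };
    std::size_t dot = fileName.rfind('.');
    if (dot == std::string::npos)
        return std::string();
    std::map<std::string, std::string>::const_iterator it
        = kTable.find(fileName.substr(dot + 1));
    return it != kTable.end() ? it->second : std::string();
}

// Combines content-based priorities (MIME type -> priority) with an
// assumed bonus for the type suggested by the extension, and returns
// the type with the highest total score.
std::string GuessType(const std::map<std::string, double>& contentScores,
    const std::string& fileName, double extensionWeight)
{
    std::map<std::string, double> scores = contentScores;
    std::string byExtension = TypeForExtension(fileName);
    if (!byExtension.empty())
        scores[byExtension] += extensionWeight;

    std::string best;
    double bestScore = 0.0;
    for (const auto& entry : scores) {
        if (entry.second > bestScore) {
            best = entry.first;
            bestScore = entry.second;
        }
    }
    return best;
}
```

Using the priorities quoted in comment:1 (0.4 for HTML, 0.20 for source code), an extension weight of 0.3 is enough for HTSearchParser.cpp to be guessed as source code, while a weight of 0 reproduces the behaviour reported here. Which weight would be appropriate is an open question.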
File that causes the issue.