Opened 14 years ago
Closed 12 years ago
#6227 closed bug (fixed)
Terminal TermParse::EscParse() 2 issues.
Reported by: | h.z | Owned by: | siarzhuk |
---|---|---|---|
Priority: | normal | Milestone: | R1 |
Component: | Applications/Terminal | Version: | R1/alpha2 |
Keywords: | Cc: | siarzhuk | |
Blocked By: | Blocking: | ||
Platform: | All |
Description
1.Insert UTF-8 3 bytes char with width = 1 cause wrong buffer attributes cause BView::DrawString abnormally.
2.The convert_to_utf8() always return with zero dstLen, then the following line of InsertChar() with do nothing.
The changes on TermParse::EscParse() need more testing.
Attachments (1)
Change History (14)
comment:1 by , 14 years ago
comment:2 by , 14 years ago
I notice new class methods are used now, so I think need some time to do some debugging to see what happens after these changes.
One quick fix for issue 1 just like this, working good at UTF-8 conversion but its not a good way, I think. The problems should come from TermParse::EscParse().
---BasicTerminalBuffer.cpp--- void BasicTerminalBuffer::InsertChar(UTF8Char c, uint32 width, uint32 attributes) { debug_printf("BasicTerminalBuffer::InsertChar('%.*s' (%d), %#lx)\n", (int)c.ByteCount(), c.bytes, c.bytes[0], attributes);
if ((int32)width == FULL_WIDTH)
attributes |= A_WIDTH;
+ if (c.IsFullWidth()) +{ + attributes |= A_WIDTH; + width *= 2; +}
by , 14 years ago
Attachment: | terminal-6227.patch added |
---|
fix for these 2 issues, tested under UTF-8/GBK conversions
comment:3 by , 14 years ago
patch: | 0 → 1 |
---|
comment:5 by , 14 years ago
issue 1: utf-8 fullwidth with width=1 fix: add if IsFullWidth then width=2.
issue 2: convert_to_utf8 always with dstLen=0 fix: add dstLen=4 and.
follow-up: 7 comment:6 by , 14 years ago
First of all: sorry to reply this late, and thanks for working on this! I tried to apply the patch but doesn't applies cleanly. I'm applying it manually. Only, I can't understand some of the comparison in TermParse.cpp:
if ((c > 0xA1 && c < 0xA9)
(c>0xB0 && c< 0xF7) (c>0x81 && c< 0xA0) (c>0xAA && c< 0xFE) (c>0xA8 && c< 0xA9))
Besides that the comparisons would be more readable if they were done in order (first 0x81, then 0xA1, etc), the last case (c>0xA8 && c< 0xA9) seems to be covered already by the first check (c > 0xA1 && c < 0xA9), and there are other checks which overlap (c>0xB0 && c< 0xF7) vs (c>0xAA && c< 0xFE) so I'm not sure if I understand this. Then there is some coding style violations (parenthesis go to the end of the line, missing spaces between "c" and ">", etc.)
comment:7 by , 14 years ago
Replying to jackburton:
the last case (c>0xA8 && c< 0xA9) seems to be covered already by the first check (c > 0xA1 && c < 0xA9)
In fact, since c
is an integer, the last check is always false.
comment:8 by , 13 years ago
patch: | 1 → 0 |
---|
comment:9 by , 13 years ago
The patch doesn't apply cleanly and more feedback is needed. Can you please update the patch to a current revision and fix the mentioned coding style problems.
comment:10 by , 13 years ago
Cc: | added |
---|
comment:11 by , 13 years ago
First issue (invalid destLen was sent to convert_to_utf8) fixed in hrev43252.
Some comments about the second one:
1) UTF8.h says that:
41 B_GBK_CONVERSION, // Chinese GB18030
But GB 18030 is an extension of GBK. They are different encodings! Are there anybody flyent in iconv's architecture to check which encoding is used behind the scenes of our convert_xx_utf8?
2) Those ranges in the patch looks like come from the following info (root/src/libs/iconv/gb18030.h):
* GB18030, as specified in the GB18030 standard, is an extension of GBK. * In what follows, page numbers refer to the GB18030 standard (second * printing). * It consists of the following parts: * * One-byte range: * ASCII p. 2 0x{00..7F} * * Two-byte range: * GBK part 1 p. 10..12 0x{A1..A9}{A1..FE} * GBK part 2 p. 13..36 0x{B0..F7}{A1..FE} * GBK part 3 p. 37..52 0x{81..A0}{40..7E,80..FE} * GBK part 4 p. 53..81 0x{AA..FE}{40..7E,80..A0} * GBK part 5 p. 82 0x{A8..A9}{40..7E,80..A0} * UDA part 1 p. 83..84 0x{AA..AF}{A1..FE} U+E000..U+E233 * UDA part 2 p. 85..87 0x{F8..FE}{A1..FE} U+E234..U+E4C5 * UDA part 3 p. 88..90 0x{A1..A7}{40..7E,80..A0} U+E4C6..U+E765
So things are looking more complicate that a simple fallback to CASE_PRINT_GR anyway. ;-)
comment:12 by , 12 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:13 by , 12 years ago
JFYI: East Asian Full-Width characters support fixed in hrev45462. Issue with GBK encoding should be investigated additionally.
comment:14 by , 12 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
Should be finally resolved in hrev45533.
Have you been meaning to attach a patch? The wording of your last sentence suggests so, but I am not sure. :-)