Opened 14 years ago
Closed 13 years ago
#6276 closed bug (fixed)
Console backspace doesn't properly handle non-ascii unicode
Reported by: | Adek336 | Owned by: | zooey |
---|---|---|---|
Priority: | normal | Milestone: | R1 |
Component: | System/POSIX | Version: | R1/Development |
Keywords: | terminal bash readline | Cc: | siarzhuk, henrimoi@… |
Blocked By: | Blocking: | #5775, #6094, #6836, #7244 | |
Platform: | All |
Description
A testcase
mkdir xyz
cd xyz # fine
cd ..
cd Ź<backspace>xyz # sh: cd: yz: No such file or directory
The example character 'Ź' does not have to be typed in from the keyboard; it may just as well be copy-pasted.
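For illustration only: 'Ź' (U+0179) is two bytes in UTF-8 (0xC5 0xB9), so an editor that removes one byte per backspace leaves half a character behind. The sketch below contrasts the two behaviours. The helper names are made up for this example and this is not the actual bash/readline code.

```c
#include <stddef.h>

/* Byte-oriented editing: backspace simply drops the last byte. */
size_t erase_last_byte(size_t len)
{
    return len > 0 ? len - 1 : 0;
}

/* Character-oriented editing for UTF-8: skip back over any continuation
   bytes (bit pattern 10xxxxxx), then drop the lead byte as well.
   Returns the new length of the line buffer. */
size_t erase_last_utf8_char(const char *buf, size_t len)
{
    if (len == 0)
        return 0;
    len--;
    while (len > 0 && ((unsigned char)buf[len] & 0xC0) == 0x80)
        len--;
    return len;
}
```

With the buffer "cd Ź" (5 bytes), the first helper leaves the stray lead byte 0xC5 in the line, while the second one removes the whole character and leaves "cd ".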
hrev37286, gcc4+2 hybrid, VBox 3.2
Change History (17)
comment:1 by , 14 years ago
comment:2 by , 14 years ago
This is a duplicate of #5775, but it has a better summary and description. So closing that one.
comment:3 by , 14 years ago
Blocking: | 5775 added |
---|
comment:4 by , 14 years ago
Cc: | added |
---|
follow-up: 6 comment:5 by , 14 years ago
Cc: | added |
---|
If you try to set LC_CTYPE or any of the other locale variables, bash will warn that setlocale fails. This is probably because Haiku doesn’t ship with any glibc locale files, so applications won’t even know what character encoding is associated with a locale. On Linux (Debian squeeze), these files seem to be installed under /usr/share/i18n.
This also means that any application that relies on C/POSIX locale support for finding out the character encoding will fail, including applications such as svn. And because the locale files are not installed, setting LC_CTYPE will not mean anything.
comment:6 by , 14 years ago
Replying to heto:
If you try to set LC_CTYPE or any of the other locale variables, bash will warn that setlocale fails.
Actually, it's because Haiku doesn't currently implement any of the POSIX locale stuff properly. There is a work-in-progress branch to rectify this, but until it's complete, none of the LC_* stuff will work correctly, regardless of the presence of those files.
comment:7 by , 14 years ago
Blocking: | 6094 added |
---|
comment:8 by , 14 years ago
Blocking: | 6836 added |
---|
comment:9 by , 14 years ago
Component: | - General → Applications/Command Line Tools |
---|---|
Version: | R1/alpha2 → R1/Development |
comment:10 by , 14 years ago
Owner: | changed from | to
---|---|
Status: | new → in-progress |
After a first glance it looks very much like a bash/readline problem. Looking a bit closer...
comment:11 by , 14 years ago
Component: | Applications/Command Line Tools → System/POSIX |
---|---|
Owner: | changed from | to
Status: | in-progress → assigned |
Passing on to Oliver. This is a POSIX locale related issue. Haiku's <stdlib.h> defines MB_CUR_MAX to 1. It is also noteworthy that <limits.h> doesn't define MB_LEN_MAX, so in gcc's fixed header it is defined to 1 as well.
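To show why a MB_CUR_MAX of 1 matters, here is a rough sketch of how a line editor might decide how many bytes make up one character. It assumes the editor consults MB_CUR_MAX and mblen(); the function name is hypothetical and this is not taken from the readline sources.

```c
#include <stdlib.h>

/* Hypothetical helper: number of bytes an editor should treat as one
   character at s, according to the current LC_CTYPE locale. */
int bytes_per_step(const char *s)
{
    int n;

    if (MB_CUR_MAX <= 1)
        return 1;               /* byte-oriented: every byte is a "character" */

    n = mblen(s, MB_CUR_MAX);
    return n > 0 ? n : 1;       /* invalid or empty input: fall back to one byte */
}
```

As long as MB_CUR_MAX is 1, the multibyte branch is never reached, so a two-byte character such as 'Ź' is handled as two separate one-byte characters.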
comment:12 by , 14 years ago
Keywords: | terminal bash readline added |
---|
follow-up: 15 comment:14 by , 13 years ago
Is setting both MB_LEN_MAX and MB_CUR_MAX to 6 a correct fix (seems to be the value needed for utf-8)? Or is there something more involved?
MB_LEN_MAX must be a constant, so we should use 6 for it. MB_CUR_MAX may vary depending on the current locale, but I'm not sure that's of any use.
comment:15 by , 13 years ago
Replying to pulkomandy:
Is setting both MB_LEN_MAX and MB_CUR_MAX to 6 a correct fix (seems to be the value needed for utf-8)? Or is there something more involved?
It is probably more involved. Possibly also including changes to the compiler.
To represent any Unicode code point, 4-byte UTF-8 suffices, BTW. glibc seems to define MB_LEN_MAX to 16 (sixteen).
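To illustrate the 4-byte claim: the highest Unicode code point is U+10FFFF, and it encodes to four bytes (the figure of 6 comes from the older UTF-8 definition, which covered values beyond the Unicode range). Below is a small stand-alone encoder as a sketch; the function name is made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode code point as UTF-8 into out (room for 4 bytes).
   Returns the number of bytes written, or 0 if cp is not a valid
   scalar value (a surrogate, or anything above U+10FFFF). */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;           /* surrogates are not encodable */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

For the 'Ź' from the description (U+0179) this yields the two bytes 0xC5 0xB9, and for U+10FFFF the four bytes 0xF4 0x8F 0xBF 0xBF.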
comment:16 by , 13 years ago
With hrev43310, the behaviour has improved with respect to the handling of backspace, but entering multibyte characters manually still causes problems (the character shows up only after pressing space, for instance). Pasting multibyte characters and deleting them via backspace works OK, though.
I'll investigate why editing doesn't work...
Yes. I get the same thing when I forget to switch the keymap from Russian, type something in the terminal, and then try to use the backspace key: I get something like what Adek336 has.