Opened 18 years ago
Last modified 10 years ago
#976 assigned bug
convert_from_utf8() broken — at Version 11
Reported by: | axeld | Owned by: | pulkomandy |
---|---|---|---|
Priority: | normal | Milestone: | Unscheduled |
Component: | Kits/libtextencoding.so | Version: | R1/Development |
Keywords: | Cc: | ||
Blocked By: | Blocking: | ||
Platform: | All |
Description (last modified by )
convert_from_utf8() uses iconv() in the background. However, it opens/closes the iconv context with every invocation, which is just wrong.
It becomes obvious with the UTF-16 encoding, where a marker is added at the beginning of the file, or, in our case, for every call to convert_from_utf8().
It is not possible to open/close the context properly without changing the API. Since this conflicts with the goal of Haiku R1, this bug is delayed to post-R1 schedule. Likely, a new API will need to be designed as part of the locale kit. It may be done before R2, and coexist with the legacy one.
Change History (11)
comment:1 by , 18 years ago
Component: | - General → Kits/libtextencoding.so |
---|
comment:2 by , 18 years ago
Well, guess what, it didn't exist when I wrote that bug report :-) And I needed to search it quite a while until I really knew that, I must have forgotten updating the bug afterwards ;-)
follow-up: 4 comment:3 by , 18 years ago
Would it work if we used a static variable (TLS ?) to keep the initialized state of iconv ? Thus we'd just call iconv_open() the first time convert_encoding() is called...
comment:4 by , 18 years ago
Replying to jackburton:
Would it work if we used a static variable (TLS ?) to keep the initialized state of iconv ? Thus we'd just call iconv_open() the first time convert_encoding() is called...
Hmmm problem would be, thought, that we wouldn't know when to call iconv_close(). Ok, let go what I just said.
comment:5 by , 18 years ago
An API change seems to be unavoidable. convert_encoding_open() convert_encoding_close() More or less the way iconv work :/
comment:6 by , 18 years ago
We can't change the API in such a way if we want to stay binary compatible, though.
I think we could work out something with TLS and the "state" parameter, even though it wouldn't be that clean: we could only lazily close the iconv context in case a) a new translation is started (with "state" = 0), or if b) the thread exits.
Creating a new iconv context for every call to convert_{from|to}_utf8() is also very slow.
comment:7 by , 18 years ago
Sounds good. But IMO if it is too hacky for us, we could (should?) add some functions or a class which clearly open and close the conversion stream.
comment:8 by , 18 years ago
Hmm and what about apps that do more than 1 convertion at the same time in the same thread, with different state vars ? like, it's converting to UTF-8, and from it to something else... interleaving calls. I know it's lame but possible. What about a (thread-safe = locked) list (hash table ?) of state pointers ? Each time a new pointer is seen we allocate an iconv context. We wouldn't really know when to close the context though, but it would be more correct.
comment:9 by , 15 years ago
Owner: | changed from | to
---|---|
Version: | R1/pre-alpha1 → R1/Development |
comment:10 by , 15 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:11 by , 14 years ago
Description: | modified (diff) |
---|---|
Milestone: | R1 → Unscheduled |
So, you complain about components, but then don't even use them? :)