Opened 17 years ago

Last modified 9 years ago

#976 assigned bug

convert_from_utf8() broken — at Version 11

Reported by: axeld Owned by: pulkomandy
Priority: normal Milestone: Unscheduled
Component: Kits/libtextencoding.so Version: R1/Development
Keywords: Cc:
Blocked By: Blocking:
Platform: All

Description (last modified by pulkomandy)

convert_from_utf8() uses iconv() in the background. However, it opens/closes the iconv context with every invocation, which is just wrong.

It becomes obvious with the UTF-16 encoding, where a marker is added at the beginning of the file, or, in our case, for every call to convert_from_utf8().

It is not possible to open/close the context properly without changing the API. Since this conflicts with the goal of Haiku R1, this bug is delayed to post-R1 schedule. Likely, a new API will need to be designed as part of the locale kit. It may be done before R2, and coexist with the legacy one.

Change History (11)

comment:1 by wkornewald, 17 years ago

Component: - GeneralKits/libtextencoding.so

So, you complain about components, but then don't even use them? :)

comment:2 by axeld, 17 years ago

Well, guess what, it didn't exist when I wrote that bug report :-) And I needed to search it quite a while until I really knew that, I must have forgotten updating the bug afterwards ;-)

comment:3 by jackburton, 17 years ago

Would it work if we used a static variable (TLS ?) to keep the initialized state of iconv ? Thus we'd just call iconv_open() the first time convert_encoding() is called...

in reply to:  3 comment:4 by jackburton, 17 years ago

Replying to jackburton:

Would it work if we used a static variable (TLS ?) to keep the initialized state of iconv ? Thus we'd just call iconv_open() the first time convert_encoding() is called...

Hmmm problem would be, thought, that we wouldn't know when to call iconv_close(). Ok, let go what I just said.

comment:5 by korli, 17 years ago

An API change seems to be unavoidable. convert_encoding_open() convert_encoding_close() More or less the way iconv work :/

comment:6 by axeld, 17 years ago

We can't change the API in such a way if we want to stay binary compatible, though.

I think we could work out something with TLS and the "state" parameter, even though it wouldn't be that clean: we could only lazily close the iconv context in case a) a new translation is started (with "state" = 0), or if b) the thread exits.

Creating a new iconv context for every call to convert_{from|to}_utf8() is also very slow.

comment:7 by korli, 17 years ago

Sounds good. But IMO if it is too hacky for us, we could (should?) add some functions or a class which clearly open and close the conversion stream.

comment:8 by mmu_man, 17 years ago

Hmm and what about apps that do more than 1 convertion at the same time in the same thread, with different state vars ? like, it's converting to UTF-8, and from it to something else... interleaving calls. I know it's lame but possible. What about a (thread-safe = locked) list (hash table ?) of state pointers ? Each time a new pointer is seen we allocate an iconv context. We wouldn't really know when to close the context though, but it would be more correct.

comment:9 by axeld, 14 years ago

Owner: changed from axeld to nobody
Version: R1/pre-alpha1R1/Development

comment:10 by pulkomandy, 14 years ago

Owner: changed from nobody to pulkomandy
Status: newassigned

comment:11 by pulkomandy, 13 years ago

Description: modified (diff)
Milestone: R1Unscheduled
Note: See TracTickets for help on using tickets.