#11606 closed bug (invalid)
An issue after new scheduler merging
Reported by: | Giova84 | Owned by: | nobody |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | System/Kernel | Version: | R1/Development |
Keywords: | | Cc: | |
Blocked By: | | Blocking: | |
Platform: | All | | |
Description
First of all: I started to use TOR (the onion router) on Haiku about two years ago and have used it continuously, without issue.
Since the changes to Haiku's scheduler I have noticed an issue: at random, tor starts to peg one CPU core (in ProcessController I can see tor occupying 100% of one core of my CPU). The issue is very unpredictable: sometimes it occurs continuously, and sometimes only after several days or more. I also tried recompiling a fresh version of tor, but the issue still occurs. Anyway, it seems that I have found a workaround: I start tor through a shell script that invokes "renice -b 1" and "ulimit", to try to reduce the CPU usage of the process. Using this script, I have not seen the issue for ten days now. I also reported this issue on the tor bug tracker: https://trac.torproject.org/projects/tor/ticket/13513 . As suggested by the tor team, I am opening this ticket on Haiku's bug tracker. I am also currently running a machine with Haiku Alpha 4, and this problem has never appeared there.
Currently I don't have a syslog showing this issue: I will provide one if and when I see the issue again.
Attachments (2)
Change History (15)
comment:1 by , 10 years ago
comment:2 by , 10 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
follow-up: 5 comment:3 by , 10 years ago
Owner: | changed from | to
---|
I fail to see how this is related to the scheduler. If a thread requests as much CPU time as possible and there are no competing threads, the scheduler allows that thread to use 100% of the CPU time.
The real issue might actually be present in Alpha 4 as well, but since the old scheduler tended to distribute load evenly across CPUs even when there was only one active thread, it might have been harder to notice on CPU usage graphs.
This still looks like a Haiku bug, though. As was said in the linked Tor bug report, it is probably best to start investigating this issue by taking a look at libevent, which Tor apparently uses.
comment:4 by , 10 years ago
Hi pdziepak,
I can assure you that on Alpha 4 (and on other older nightlies) I have never seen this behaviour. Anyway, I'm not saying that the scheduler is the culprit; I was just referring to the fact that this behaviour, at least for me, appeared on newer nightlies (in the past I didn't update Haiku for a long time, due to hardware issues). I mentioned the scheduler because of the CPU consumption. However, on the Tor bug tracker https://trac.torproject.org/projects/tor/ticket/13513, a user said that "#2963 is probably related". So, if #2963 is the culprit, the issue is network related, and maybe the cause lies in the different network card that I currently use: an rtl8111x family card (on Alpha 4 I have an ipro 1000 family card).
follow-up: 6 comment:5 by , 10 years ago
Replying to pdziepak:
As was said in the linked Tor bug report, it is probably best to start investigating this issue by taking a look at libevent, which Tor apparently uses.
This sounds a lot like the compatibility issue that Transmission exposed with libevent. It depends on positive POSIX error codes and therefore needs to be built with the posix_error_mapper. Otherwise failed connections will result in misinterpreting error codes as FD numbers and cause libevent to end up in an endless loop when attempting to resize its FD array.
Debugging the thread and getting a stack trace would be the obvious way to confirm/deny.
How did you build the tor binary? And did you build your own libevent? For reference on using the posix_error_mapper see the recipes for libevent and transmission in HaikuPorts.
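To illustrate the failure mode described in this comment, here is a minimal Python model. It is not libevent's or Tor's actual code: the helper names (`try_connect`, `looks_like_fd`, `grow_fd_table`), the `-errno` return convention, the doubling strategy, and the negative error value are all assumptions made for illustration. It only shows how a negative error constant can slip past a "negative result means failure" check and then keep a 32-bit size-doubling loop from ever finishing.

```python
INT32_MIN = -(2 ** 31)


def wrap_int32(value):
    """Wrap an integer into the signed 32-bit range, mimicking C int overflow
    as it behaves on typical two's-complement hardware."""
    return (value + 2 ** 31) % 2 ** 32 - 2 ** 31


# Illustrative error values only.
POSITIVE_ECONNREFUSED = 111                 # positive errno, as on Linux
NEGATIVE_ECONNREFUSED = INT32_MIN + 0x7020  # placeholder for a large negative code


def try_connect(errno_value):
    """Hypothetical helper: the connection always fails, and the failure is
    reported as -errno, which assumes errno values are positive."""
    return -errno_value


def looks_like_fd(result):
    """The caller's check: anything non-negative is treated as a valid FD."""
    return result >= 0


def grow_fd_table(size, fd, max_steps=100):
    """Double a 32-bit table size until it can hold fd.

    Returns the new size, or None if the loop has not finished after
    max_steps doublings (the real loop would spin forever)."""
    steps = 0
    while size < fd + 1:
        if steps >= max_steps:
            return None
        size = wrap_int32(size * 2)
        steps += 1
    return size


if __name__ == "__main__":
    ok = try_connect(POSITIVE_ECONNREFUSED)
    bad = try_connect(NEGATIVE_ECONNREFUSED)
    print("positive errno -> result", ok, "treated as fd:", looks_like_fd(ok))
    print("negative errno -> result", bad, "treated as fd:", looks_like_fd(bad))
    print("resize for fd 100:", grow_fd_table(32, 100))
    print("resize for bogus fd:", grow_fd_table(32, bad))
```

With positive error codes the failed call yields a negative result and is rejected. With the negative placeholder code, the result is a huge positive number that passes the check, and the doubling loop wraps around to zero and never reaches it. This is the kind of busy loop a stack trace of the spinning thread would confirm or rule out.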
follow-up: 7 comment:6 by , 10 years ago
Replying to mmlr:
How did you build the tor binary? And did you build your own libevent? For reference on using the posix_error_mapper see the recipes for libevent and transmission in HaikuPorts.
I built the Tor binary myself as usual, using the gcc4 compiler, without any special flags during compilation. I tried both a libevent alpha that I built myself and the official libevent from Haikuporter. The libevent package (hpkg) from Haikuporter, as far as I can see, is built using posix_error_mapper: https://bitbucket.org/haikuports/haikuports/commits/53f9c5255cb65bdc182cdc6340479239d3e5cb52
Should I also use these flags when I compile Tor?
comment:7 by , 10 years ago
Replying to Giova84:
Should I also use these flags when I compile Tor?
Yes. Libevent definitely requires it to work correctly in error cases. You could use the pre-built package to get a libevent built with the correct defines. But the final binary that is built needs to be linked to libposix_error_mapper.a as well (tor in that case). Using the same define and library as in the HaikuPorts commit should do the trick.
comment:8 by , 10 years ago
Ok, I've built Tor as you suggested.
But now I am totally unable to use Tor. I will try to explain the new behaviour:
When I launch this newly compiled Tor (built with the libposix_error_mapper flags), I get:
Dec 13 21:16:51.502 [warn] Directory /boot/home/.tor cannot be read: No such file or directory
Dec 13 21:16:51.502 [warn] Failed to parse/validate config: Couldn't access/create private data directory "/boot/home/.tor"
But in the torrc config file the data directory is set with "DataDirectory /boot/home/.tor". So I created the .tor directory, and now I get:
Dec 13 21:17:15.419 [warn] Fixing permissions on directory /boot/home/.tor
Dec 13 21:17:15.000 [warn] State file "/boot/home/.tor/state" is not a file? Failing.
Dec 13 21:17:15.000 [err] set_options(): Bug: Acting on config options left us in a broken state. Dying.
But there is no "state" file inside that directory. So I tried putting the "state" file from my backup inside the .tor directory, and now tor starts correctly, but it gets stuck at
Bootstrapped 5%: Connecting to directory server
Note: I also tried building Tor using --with-tor-user=username --with-tor-group=groupname, but these issues remain. I also tried launching tor using the "--user" flag, but this doesn't help.
Everything works fine (as before) if I use the old Tor binary (without the libposix_error_mapper flags).
Anyway, I've saved a debug report, which I am attaching to this ticket; I hope it helps you figure out the issue.
by , 10 years ago
Attachment: | tor-13692-debug-13-12-2014-20-36-44.report added |
---|
tor (built using libposix_error_mapper flags) debug report
follow-up: 10 comment:9 by , 10 years ago
I ran the old tor binary again (without the libposix_error_mapper flags) and "fortunately" it hung again as described in the ticket: I saved another debug report when tor started to peg the CPU.
by , 10 years ago
Attachment: | tor-19764-debug-13-12-2014-20-46-08.report added |
---|
tor (built without libposix_error_mapper flags) debug report (taken when it starts to peg the CPU)
comment:10 by , 10 years ago
Replying to Giova84:
I ran the old tor binary again (without the libposix_error_mapper flags) and "fortunately" it hung again as described in the ticket: I saved another debug report when tor started to peg the CPU.
Note: here I started Tor without the script that calls ulimit and renice, but I can't say whether this makes any difference: as I've said, this issue occurs very randomly.
comment:11 by , 10 years ago
Sorry for the noise. I just want to say that I've found a similar issue using rdesktop https://bitbucket.org/haikuports/haikuports/src/b446faf1f49d578283cfa022f367c36e1557f987/net-misc/rdesktop/rdesktop-1.8.0.recipe: after a while (though with rdesktop the issue occurs more often) it starts to peg the CPU, as happens with Tor. It could be unrelated; in that case I will open a dedicated ticket on Haikuports.
comment:12 by , 6 years ago
Resolution: | → invalid |
---|---|
Status: | assigned → closed |
It seems pretty definitive that this is not a scheduler bug, so closing.
comment:13 by , 5 years ago
Milestone: | R1 |
---|
Remove milestone for tickets with status = closed and resolution != fixed
Replying to Giova84:
Sorry, I meant to say two years ago.