Opened 5 years ago

Closed 8 months ago

#11606 closed bug (invalid)

An issue after new scheduler merging

Reported by: Giova84
Owned by: nobody
Priority: normal
Milestone: R1
Component: System/Kernel
Version: R1/Development
Keywords:
Cc:
Blocked By:
Blocking:
Has a Patch: no
Platform: All

Description

First of all: I started to use TOR (the onion router) on Haiku about one two years ago, continuously, without issue.

Since the changes to Haiku's scheduler I have noticed an issue: at random, tor starts to peg one CPU core (in ProcessController I can see tor occupying 100% of one core of my CPU). This happens very unpredictably: sometimes repeatedly, and sometimes only after several days or more. I also tried recompiling a fresh version of tor, but the issue still occurs. I seem to have found a workaround, though: I start tor through a shell script that invokes "renice -b 1" and "ulimit" to try to reduce the CPU usage of the process. With this script I have not seen the issue for ten days now. I also reported this issue on the Tor bug tracker: https://trac.torproject.org/projects/tor/ticket/13513 . As suggested by the Tor team, I am opening this ticket on Haiku's bug tracker. I am also running a machine with Haiku Alpha 4, and this problem has never appeared there.

Currently I don't have a syslog showing this issue: I will provide one if and when I see the issue again.

Attachments (2)

tor-13692-debug-13-12-2014-20-36-44.report (8.3 KB ) - added by Giova84 5 years ago.
tor (built using libposix_error_mapper flags) debug report
tor-19764-debug-13-12-2014-20-46-08.report (11.7 KB ) - added by Giova84 5 years ago.
tor (built without libposix_error_mapper flags) debug report (when it starts to peg the CPU)


Change History (14)

in reply to:  description comment:1 by Giova84, 5 years ago

Replying to Giova84:

First of all: I started to use TOR (the onion router) on Haiku about one two years ago

Sorry, I meant to say two years ago.

comment:2 by anevilyak, 5 years ago

Owner: changed from axeld to pdziepak
Status: new → assigned

comment:3 by pdziepak, 5 years ago

Owner: changed from pdziepak to nobody

I fail to see how this is related to the scheduler. If the thread requests as much CPU time as possible and there are no competing threads, the scheduler allows that thread to use 100% of the CPU time.

The real issue might actually be present in Alpha 4 as well, but since the old scheduler tended to distribute load evenly across CPUs even if there was only one active thread, it might have been harder to notice on CPU usage graphs.

This still looks like a Haiku bug, though. As was said in the linked Tor bug report, it is probably best to start investigating this issue by taking a look at libevent, which Tor apparently uses.

comment:4 by Giova84, 5 years ago

Hi pdziepak,

I can assure you that on Alpha 4 (and on other older nightlies) I have never seen this behaviour. I'm not saying that the scheduler is the culprit; I was just referring to the fact that this behaviour, at least for me, appeared on newer nightlies (in the past I didn't update Haiku for a long time, due to hardware issues). I mentioned the scheduler because of the CPU consumption. However, on the Tor bug tracker https://trac.torproject.org/projects/tor/ticket/13513 a user said that "#2963 is probably related". So, if #2963 is the culprit, the issue is network related, and maybe the cause is the different network card that I currently use: an rtl8111x family card (on Alpha 4 I have an ipro 1000 family card).

in reply to:  3 ; comment:5 by mmlr, 5 years ago

Replying to pdziepak:

As was said in the linked Tor bug report, it is probably best to start investigating this issue by taking a look at libevent, which Tor apparently uses.

This sounds a lot like the compatibility issue that Transmission exposed with libevent. libevent depends on positive POSIX error codes and therefore needs to be built with the posix_error_mapper. Otherwise failed connections will result in error codes being misinterpreted as FD numbers, causing libevent to end up in an endless loop when it attempts to resize its FD array.
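The failure mode described above can be sketched in C. This is only an illustration under stated assumptions, not libevent's actual source: the constant `SKETCH_NEGATIVE_ERRNO` and the functions `result_from_errno`, `grow_steps` and `sketch_self_check` are hypothetical names, and the error value is made up to resemble a negative, Haiku-style error code. The sketch shows how code written for positive errno values can turn a negative error code into something that looks like a very large descriptor, and how a doubling loop sized by that descriptor may then never terminate.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a negative, Haiku-style error constant.
 * The exact value is illustrative, not copied from Haiku's headers. */
#define SKETCH_NEGATIVE_ERRNO (INT32_MIN + 0x7000 + 32)

/* Idiom written with positive errno values in mind: report failure as
 * -err, so that any result >= 0 can be treated as a descriptor. */
int32_t result_from_errno(int32_t err)
{
    return -err;
}

/* Doubles `capacity` until it exceeds `fd`, the way a table indexed by
 * descriptor might grow. The shift is done in uint32_t so that the
 * wraparound is explicit (converting back to int32_t wraps on common
 * compilers). Returns the number of doublings, or -1 if the loop is
 * still running after `max_steps` iterations. */
int grow_steps(int32_t capacity, int32_t fd, int max_steps)
{
    int steps = 0;

    while (capacity <= fd) {
        if (steps == max_steps)
            return -1;
        capacity = (int32_t)((uint32_t)capacity << 1);
        steps++;
    }
    return steps;
}

/* Usage: a real descriptor terminates after a few doublings, while a
 * negated negative error code looks like a huge descriptor and the
 * capacity wraps to 0 before it can ever exceed it. */
void sketch_self_check(void)
{
    int32_t bogus_fd = result_from_errno(SKETCH_NEGATIVE_ERRNO);

    assert(bogus_fd > 0);
    assert(grow_steps(32, 100, 64) == 2);
    assert(grow_steps(32, bogus_fd, 64) == -1);
}
```

The `max_steps` guard exists only so the sketch can report the non-terminating case; without it the loop would spin forever, which would look like a thread pegging one CPU core.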

Debugging the thread and getting a stack trace would be the obvious way to confirm/deny.

How did you build the tor binary? And did you build your own libevent? For reference on using the posix_error_mapper see the recipes for libevent and transmission in HaikuPorts.

in reply to:  5 ; comment:6 by Giova84, 5 years ago

Replying to mmlr:

How did you build the tor binary? And did you build your own libevent? For reference on using the posix_error_mapper see the recipes for libevent and transmission in HaikuPorts.

I built the Tor binary myself as usual, using the gcc4 compiler, without any special flags during compilation. I've tried with a libevent alpha that I built myself and also with the official libevent from Haikuporter. The libevent (hpkg) from Haikuporter, as far as I can see, is built using posix_error_mapper: https://bitbucket.org/haikuports/haikuports/commits/53f9c5255cb65bdc182cdc6340479239d3e5cb52

Should I also use these flags when I compile Tor?

in reply to:  6 comment:7 by mmlr, 5 years ago

Replying to Giova84:

Should I also use these flags when I compile Tor?

Yes. Libevent definitely requires it to work correctly in error cases. You could use the pre-built package to get a libevent built with the correct defines. But the final binary that is built needs to be linked to libposix_error_mapper.a as well (tor in that case). Using the same define and library as in the HaikuPorts commit should do the trick.
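One way to check whether a build actually picked up the positive error codes is to look at the sign of a few errno constants from inside the program. The following is a minimal sketch, not part of Tor or libevent; the function names are hypothetical. To my understanding the HaikuPorts recipes enable the mapping with the define `B_USE_POSITIVE_POSIX_ERRORS` and by linking `-lposix_error_mapper`, but treat both names as assumptions and check them against the linked HaikuPorts commit.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Returns 1 when the errno constants seen by this translation unit are
 * positive, which is what code such as libevent expects. On Haiku this
 * is assumed to require building with the posix_error_mapper define
 * (assumed name: B_USE_POSITIVE_POSIX_ERRORS). */
int errno_constants_are_positive(void)
{
    return ENOENT > 0 && EAGAIN > 0 && ECONNREFUSED > 0;
}

/* Writes a one-line summary of the values into `buf`, for example for
 * a startup log message. Returns the snprintf() result. */
int describe_errno_signs(char *buf, size_t len)
{
    return snprintf(buf, len,
        "ENOENT=%d EAGAIN=%d ECONNREFUSED=%d (%s)",
        ENOENT, EAGAIN, ECONNREFUSED,
        errno_constants_are_positive() ? "positive" : "not all positive");
}
```

Calling `describe_errno_signs()` once at startup and printing the result would show at a glance whether the binary and the libevent it links against agree on the sign convention.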

comment:8 by Giova84, 5 years ago

Ok, I've built Tor as you suggested.

But now I am totally unable to use Tor. I'll try to explain the new behaviour:

When I launch this newly compiled Tor (with the libposix_error_mapper flags) I get:

Dec 13 21:16:51.502 [warn] Directory /boot/home/.tor cannot be read: No such file or directory
Dec 13 21:16:51.502 [warn] Failed to parse/validate config: Couldn't access/create private data directory "/boot/home/.tor"

But in the torrc config file the data directory is set as "DataDirectory /boot/home/.tor". So I created the .tor directory, and now:

Dec 13 21:17:15.419 [warn] Fixing permissions on directory /boot/home/.tor
Dec 13 21:17:15.000 [warn] State file "/boot/home/.tor/state" is not a file? Failing.
Dec 13 21:17:15.000 [err] set_options(): Bug: Acting on config options left us in a broken state. Dying.

But there is no "state" file inside that directory. So I tried putting the "state" file from my backup inside the .tor directory, and now tor starts correctly, but it gets stuck at

Bootstrapped 5%: Connecting to directory server

Note: I've also attempted to build Tor using --with-tor-user=username --with-tor-group=groupname, but these issues remain. I've also tried launching tor with the "--user" flag, but this doesn't help.

Everything works fine (as before) if I use the old Tor binary (without the libposix_error_mapper flags).

In any case, I've saved a debug report, which I am attaching to this ticket: I hope this helps you figure out the issue.

by Giova84, 5 years ago

tor (built using libposix_error_mapper flags) debug report

comment:9 by Giova84, 5 years ago

I ran the old tor binary again (without the libposix_error_mapper flags) and "fortunately" it hung again as described in the ticket: I saved another debug report when tor started to peg the CPU.

by Giova84, 5 years ago

tor (built without libposix_error_mapper flags) debug report (when it starts to peg the CPU)

in reply to:  9 comment:10 by Giova84, 5 years ago

Replying to Giova84:

I ran the old tor binary again (without the libposix_error_mapper flags) and "fortunately" it hung again as described in the ticket: I saved another debug report when tor started to peg the CPU.

Note: here I started Tor without the script that calls ulimit and renice, but I can't say whether this makes any difference: as I've said, this issue occurs very randomly.

comment:11 by Giova84, 5 years ago

Sorry for the noise. I just want to say that I've found a similar issue using rdesktop ( https://bitbucket.org/haikuports/haikuports/src/b446faf1f49d578283cfa022f367c36e1557f987/net-misc/rdesktop/rdesktop-1.8.0.recipe ): after a while (this happens more often with rdesktop) it starts to peg the CPU, as happens with Tor. It could be unrelated; in that case I will open a dedicated ticket on HaikuPorts.

comment:12 by waddlesplash, 8 months ago

Resolution: invalid
Status: assigned → closed

Seems pretty definitive this is not a scheduler bug, so, closing.
