Opened 7 days ago

Last modified 7 days ago

#14738 assigned bug

[pc_serial] Sometimes gets stuck in Write() (even in async mode)

Reported by: ttcoder
Owned by: mmu_man
Priority: normal
Milestone: Unscheduled
Component: Drivers/TTY
Version: R1/Development
Keywords:
Cc:
Blocked By:
Blocking:
Has a Patch: no
Platform: All

Description

I have a report from KEZM (they have an RS-232 switcher piloted by CC on Haiku): CC tends to get stuck on the serial write a few times per week, and the app freezes.

My code configures the BSerialPort like this:

serialInput.SetBlocking( false );

Yet it still seems to end up in a 'blocked' state?

Change History (4)

comment:1 Changed 7 days ago by ttcoder

Running ps, he gets this:

/boot/home/config/apps/TuneTrackerSystems/CommandCenter   589        7    0    0 
  Thread                                   Id    State Prio    UTime    KTime
CommandCenter                           589     wait   10  1294940   340328 
event_server                            607     wait   10    85438    30826 SwitcherHandler_looper(3464)
logging_server                          608     wait   10    17937     8280 
render                                  610      zzz   15  1514434  1373240 
w>Command Center                        618     wait   15   953315   243675 
SwitcherHandler_looper                  624     wait   10   222743   178693 pc_serial:done_write(3480)

Looking for "pc_serial:done_write" yields this: http://xref.plausible.coop/source/xref/haiku/src/add-ons/kernel/drivers/ports/pc_serial/SerialDevice.cpp

From my limited understanding, it seems the semaphore is acquired in Write() and released in WriteCallbackFunction(). So the fact that my thread remains stuck means the callback was not called; does that sound correct? How do I deal with that in my app? Can I "unblock" my thread from another thread?

EDIT: oddly, looking for other references to the callback yields nothing: http://xref.plausible.coop/source/search?q=WriteCallbackFunction&defs=&refs=&path=&hist=&type=&project=haiku Is it not called by anyone?
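One application-side pattern that relates to the "unblock from another thread" question is sketched below. It does not unblock anything: it only runs the write on a helper thread and lets the caller give up waiting after a timeout, so the calling looper stays responsive and can log the hang. The name WriteWithWatchdog and the use of portable std::thread are assumptions for illustration; in a Haiku app the callable would wrap the BSerialPort write, and a helper thread stuck in the driver would stay stuck.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <future>
#include <memory>
#include <thread>
#include <utility>

// Hypothetical helper: runs `write` on a detached helper thread and waits at
// most `timeout` for it to return.
// Returns true if the write finished in time, false if it is still running
// (for example because it is stuck inside the driver).
// Note: `write` must not throw, since an exception escaping the helper thread
// would terminate the process.
bool WriteWithWatchdog(std::function<void()> write,
                       std::chrono::milliseconds timeout)
{
    assert(write);  // an empty callable would be a programming error

    // The promise is shared so it outlives this function if we time out.
    auto done = std::make_shared<std::promise<void>>();
    std::future<void> finished = done->get_future();

    std::thread([write = std::move(write), done]() {
        write();            // may block for a long time
        done->set_value();  // signal completion to the waiting caller
    }).detach();

    return finished.wait_for(timeout) == std::future_status::ready;
}
```

If this returns false, the helper thread is leaked until the write returns, so this only contains the symptom; the caller would still need to decide whether to retry, reopen the port, or alert the operator.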

Last edited 7 days ago by ttcoder

comment:2 Changed 7 days ago by pulkomandy

Component: changed from Kits/Device Kit to Drivers/TTY
Owner: changed from pulkomandy to mmu_man

It seems the pc_serial driver handles neither non-blocking writes nor timeouts. So I don't see a way to fix this from the application side.

I don't see where that write callback is called; however, I see the semaphore is also released in the interrupt handler.

I suspect a race condition, where we manage to call write twice, quickly enough, and the interrupt triggers only once, resulting in the second write call blocking for no reason.
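To make the suspected scenario concrete, here is a small user-space model of it. It is only a sketch of the counting argument, not the pc_serial code: DoneWriteSem and CompletedWrites are hypothetical names, and the 50 ms timeout exists only so the model can report the write that would otherwise wait forever.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for the driver's "done_write" semaphore.
class DoneWriteSem {
public:
    // Modelled interrupt handler: signals one completed write.
    void Release()
    {
        {
            std::lock_guard<std::mutex> lock(fMutex);
            ++fCount;
        }
        fCond.notify_one();
    }

    // Waits up to `timeout` for a completion signal. Returns false on
    // timeout, which is the case where an acquire without a timeout
    // would block indefinitely.
    bool AcquireFor(std::chrono::milliseconds timeout)
    {
        std::unique_lock<std::mutex> lock(fMutex);
        if (!fCond.wait_for(lock, timeout, [this] { return fCount > 0; }))
            return false;
        --fCount;
        return true;
    }

private:
    std::mutex fMutex;
    std::condition_variable fCond;
    int fCount = 0;
};

// Models `writes` write calls that are answered by only `interrupts`
// completion signals, and returns how many writes saw their signal.
int CompletedWrites(int writes, int interrupts)
{
    assert(writes >= 0 && interrupts >= 0);

    DoneWriteSem sem;
    for (int i = 0; i < interrupts; ++i)
        sem.Release();

    int completed = 0;
    for (int i = 0; i < writes; ++i) {
        if (sem.AcquireFor(std::chrono::milliseconds(50)))
            ++completed;
    }
    return completed;
}
```

With two writes and one signal, CompletedWrites(2, 1) returns 1: the second acquire never gets its signal, which would match a thread parked on pc_serial:done_write.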

Reassigning to mmu_man, as he wrote this driver.

comment:3 in reply to:  2 Changed 7 days ago by ttcoder

Replying to pulkomandy:

I suspect a race condition, where we manage to call write twice, quickly enough, and the interrupt triggers only once, resulting in the second write call blocking for no reason.

Very interesting. If that is indeed the cause, I guess it would be mitigated by doing a snooze(9000) (or whatever the delay between interrupts is) after each write? Remember, a perfect fix later does not have to exclude an immediate workaround; the station would be happy if I could provide a timely improvement :-)
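A minimal sketch of that workaround follows, assuming the race really is triggered by back-to-back writes. PacedWriter is a hypothetical wrapper, the write itself is passed in as a callable (in the app it would wrap the BSerialPort write), and std::this_thread::sleep_until stands in for snooze() so the sketch stays portable. The 9000 microsecond gap is the guess from this comment, not a measured value.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <thread>
#include <utility>

// Hypothetical wrapper that enforces a minimum gap between the end of one
// write and the start of the next.
class PacedWriter {
public:
    using Clock = std::chrono::steady_clock;
    using WriteFn = std::function<long(const void*, std::size_t)>;

    PacedWriter(WriteFn write, std::chrono::microseconds minGap)
        : fWrite(std::move(write)), fMinGap(minGap)
    {
        assert(fWrite);  // an empty callable would be a programming error
    }

    // Sleeps until the minimum gap since the previous write has elapsed,
    // then forwards to the wrapped write and returns its result.
    long Write(const void* data, std::size_t size)
    {
        if (fHasWritten) {
            // Returns immediately if the deadline has already passed.
            std::this_thread::sleep_until(fLastWrite + fMinGap);
        }
        long result = fWrite(data, size);
        fLastWrite = Clock::now();
        fHasWritten = true;
        return result;
    }

private:
    WriteFn fWrite;
    std::chrono::microseconds fMinGap;
    Clock::time_point fLastWrite;
    bool fHasWritten = false;
};
```

At best this makes the suspected race less likely; it cannot help if a single write's completion signal is lost, and the right gap would have to be found by experiment.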

comment:4 Changed 7 days ago by mmu_man

WriteCallbackFunction seems to be unused; it's a leftover from the usb_serial driver code, which was used as a model.

Looking back at the code, it's amazing it actually works :)
