Opened 10 years ago

Closed 6 years ago

#10415 closed bug (fixed)

System freeze when using showimage on NTFS drive with lots of jpeg files

Reported by: bbjimmy
Owned by: nobody
Priority: high
Milestone: R1/beta1
Component: System/Kernel
Version: R1/Development
Keywords:
Cc:
Blocked By:
Blocking:
Platform: All

Description

This is 100% repeatable.

To reproduce:

Place a large number of large .jpg image files on an NTFS volume.

Boot hrev46670 and mount the NTFS volume.

Double-click a .jpg image to launch ShowImage.

Start a slideshow.

Wait 5 to 15 minutes. The slideshow stops, as if the last image in the directory had been reached. The mouse can still move, but ShowImage is locked up. Next the mouse stops, and there is no way to regain control of the system. The system must be powered off to escape the lock-up.

My guess is that ShowImage is at fault, as this is the only time I see the lock-up, but nothing seems to be added to the system log and no debug information can be obtained.

Attachments (4)

Bitmap Clip (1.4 MB) - added by bbjimmy 10 years ago.
backtrace of lock-up
previous_syslog.txt (248.8 KB) - added by bbjimmy 10 years ago.
system log
syslogold.txt (512.0 KB) - added by bbjimmy 10 years ago.
syslog after aspace 1
previous_syslog.2.txt (246.7 KB) - added by bbjimmy 10 years ago.
Doesn't look like the kernel memory information was added to the file.

Change History (32)

comment:1 by ttcoder, 10 years ago

Does memory usage in ShowImage etc. keep rising until the freeze (as seen in ProcessController)? Are you able to drop to KDL (Alt-PrtScr-D) after it freezes?

by bbjimmy, 10 years ago

Attachment: Bitmap Clip added

backtrace of lock-up

comment:2 by bbjimmy, 10 years ago

Yes, I could drop to KDL, see the attachment.

comment:3 by ttcoder, 10 years ago

The attachment (which cannot be read online, BTW, as it lacks an extension such as .jpg) mentions low_resources_monitor(), which heavily hints at memory exhaustion and, indeed, a memory leak...

comment:4 by bbjimmy, 10 years ago

It appears that the scheduler improvements have solved the issue. Or maybe the issue has been masked? I have not been able to reproduce the error running hrev46720.

comment:5 by bbjimmy, 10 years ago

My mistake: I connected to my wifi router, and the issue popped up again. The backtrace only has information relating to invoking the debugger, nothing that gives any clue to the error. I have the image, but no time to upload it. I will if anybody feels it may help.

by bbjimmy, 10 years ago

Attachment: previous_syslog.txt added

system log

comment:6 by bbjimmy, 10 years ago

Atheros wifi new media issue? It only seems to show up while running a slideshow with ShowImage and being connected to my wireless router.

comment:7 by bonefish, 10 years ago

Component: changed from Applications/ShowImage to System/Kernel
Owner: changed from leavengood to axeld

The following things are interesting in the syslog:

low resource address space: warning -> critical
...
vnode 7:4631286608493535112 is not becoming unbusy!
vnode 7:4631286608493535603 is not becoming unbusy!

If you aren't doing anything special, running out of kernel address space is certainly something that isn't supposed to happen. The busy vnode issues might be a separate issue or a side effect. So the first important thing is to determine what happens with the kernel address space. If you can enter KDL, please enter aspace 1. It will print several pages of area listings. You can skip the output (with 's'); it should still end up in the "previous_syslog".
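For anyone going through a saved syslog by hand, the two kinds of lines quoted above can be pulled out with a small script. This is only a rough sketch in Python: the function name `summarize_syslog` is made up for illustration, and the regular expressions assume the messages appear exactly as quoted in this comment, with any timestamp or prefix before them ignored.

```python
import re
from collections import Counter

# Patterns modelled on the two syslog excerpts quoted in this ticket.
# Assumption: the rest of each log line (timestamps, prefixes) can be ignored.
LOW_RESOURCE = re.compile(
    r"low resource (?P<resource>.+?): (?P<old>\w+) -> (?P<new>\w+)"
)
BUSY_VNODE = re.compile(
    r"vnode (?P<dev>\d+):(?P<node>\d+) is not becoming unbusy!"
)


def summarize_syslog(lines):
    """Return (list of low-resource transitions, Counter of busy vnodes)."""
    transitions = []
    busy = Counter()
    for line in lines:
        match = LOW_RESOURCE.search(line)
        if match:
            transitions.append(
                (match["resource"], match["old"], match["new"])
            )
            continue
        match = BUSY_VNODE.search(line)
        if match:
            # Key each vnode by its (device, node) pair.
            busy[(int(match["dev"]), int(match["node"]))] += 1
    return transitions, busy
```

It accepts any iterable of lines, so an open file object for a saved copy of the log can be passed in directly.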

comment:8 by bbjimmy, 10 years ago

I think I followed your directions correctly.

by bbjimmy, 10 years ago

Attachment: syslogold.txt added

syslog after aspace 1

comment:9 by bonefish, 10 years ago

The syslog of interest would have been /var/log/previous_syslog which is saved after reboot.

by bbjimmy, 10 years ago

Attachment: previous_syslog.2.txt added

Doesn't look like the kernel memory information was added to the file.

comment:10 by bonefish, 10 years ago

Just to make sure: you rebooted immediately into Haiku afterward? Because that's when the file is written.

comment:11 by bbjimmy, 10 years ago

That is exactly what I did. I had to type reboot from KDL to reboot Haiku.

comment:12 by bbjimmy, 10 years ago

I repeated the steps, and still didn't get any information from aspace 1. Looks like I will need to photograph the information, as it doesn't end up in the previous_syslog file.

comment:13 by bbjimmy, 10 years ago

I uploaded photos of the output of aspace 1 to http://fatelk.com/haiku/Archive.zip. Caution: it is a large file, 88.9 MiB.

comment:14 by bbjimmy, 10 years ago

Apparently the information from the "aspace 1" KDL command was not interesting enough for me to waste my time supplying it. There has not even been one download of the file that contains the output.

comment:15 by bonefish, 10 years ago

Yeah, we should totally fire the horde of full-time Haiku kernel developers who have been slacking off over the past four days.

BTW, the download is so slow for me (19 KB/s), that I can't wait for it ATM.

comment:16 by bonefish, 10 years ago

The address space is filled with "physical page pool" and "physical page pool space" areas, so this points toward someone leaking physical page mapping slots. Is the partition in question on a USB disk?

comment:17 by bbjimmy, 10 years ago

Haiku is running on a USB stick, but the images I am viewing are on the hard disk. I always run nightlies on a 2 GiB USB stick to test. I install the nightly on a 2 GiB image file in QEMU, then dd it to the USB stick. Then I boot from the USB stick and test the OS. This way I have enough room to install stuff to test.

comment:18 by ttcoder, 10 years ago

I'm kinda curious... I guess this probably happens even faster if you skip the slideshow and instead right-arrow quickly through the list. But does the freeze also occur if ShowImage is not involved at all, and you instead open a Terminal, cd to the NTFS folder with the images, and run e.g. cksum * (or even hd *) in it?
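The same kind of read-everything test could also be scripted, which makes it easy to repeat. What follows is a minimal sketch in Python, not something that was run for this ticket: `checksum_files` is a hypothetical helper, and it uses zlib's CRC-32, which is not the same checksum that cksum prints. The point is only to force every file to be read in full.

```python
import zlib
from pathlib import Path


def checksum_files(directory, pattern="*", chunk_size=1 << 20):
    """Read every matching regular file in full; return {name: crc32}."""
    results = {}
    for path in sorted(Path(directory).glob(pattern)):
        if not path.is_file():
            continue  # skip subdirectories and other non-regular entries
        crc = 0
        with path.open("rb") as handle:
            # Read in chunks so large images are not held in memory at once.
            while chunk := handle.read(chunk_size):
                crc = zlib.crc32(chunk, crc)
        results[path.name] = crc
    return results
```

Calling it with the mounted NTFS folder and a pattern such as "*.jpg" would read all the images once; putting that call in a loop would approximate a long-running slideshow without any decoding or drawing.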

comment:19 by bbjimmy, 10 years ago

Neither of those tests causes the failure.

comment:20 by bonefish, 10 years ago

I saw similar symptoms at the last BeGeistert (cf. ticket:5777#comment:10). I suspect that the issue is actually USB related. That just checksumming/reading the files doesn't have the same effect would support that theory. Assuming the machine doesn't have enough memory to keep all the files in the cache, I suspect that little used files/code from the USB partition get evicted over time and then reloaded (possibly multiple times), which exercises the leak repeatedly.

A test without USB involvement could verify the theory.

comment:21 by scottmc, 9 years ago

Is this still 100% repeatable? Even with latest builds?

comment:22 by kallisti5, 9 years ago

bbjimmy, is this still an issue with the latest nightlies? If we don't hear back on this one it'll be bumped to unscheduled.

comment:23 by bbjimmy, 9 years ago

The issue is still present in hrev48595. The mouse seems to work a little longer than before, but the kernel backtrace still refers to low_resources_monitor().

comment:24 by pulkomandy, 9 years ago

Summary: changed from hrev46670 locks up to System freeze when using showimage on NTFS drive with lots of jpeg files

comment:25 by bbjimmy, 8 years ago

I just checked in hrev50387 and the issue is still there.

comment:26 by axeld, 7 years ago

Owner: changed from axeld to nobody
Status: changed from new to assigned

comment:27 by bbjimmy, 6 years ago

As of hrev51792 I can no longer reproduce this issue.

comment:28 by pulkomandy, 6 years ago

Resolution: set to fixed
Status: changed from assigned to closed

Thanks for the update!
