Opened 14 years ago

Closed 6 years ago

Last modified 4 weeks ago

#7740 closed bug (fixed)

High resolution JPEG images crash ShowImage due to failure to clone area from app_server

Reported by: leavengood
Owned by: axeld
Priority: normal
Milestone: R1/beta2
Component: Servers/app_server
Version: R1/alpha3
Keywords:
Cc:
Blocked By:
Blocking:
Platform: All

Description (last modified by leavengood)

I have spent hours trying to debug this but I think it is outside my skillset at this point.

Opening the attached photo (a picture of my yard, if you are wondering), which is more than 8000 pixels wide, will always crash ShowImage from hrev42239 (and earlier; I've seen this for a while on big images).

Edit: WARNING: this image also causes very bad behavior in WebPositive, maybe due to the same bug!!! I would try downloading it with wget!

After much printf debugging I narrowed it down to BBitmap::_InitObject failing to clone an area from app_server. BBitmapStream::WriteAt just doesn't check the bitmap's InitCheck like it should (that is another bug, which I can fix) and then tries to access the bitmap's Bits(), which are null, causing a segfault. My backtrace is also attached, but the line numbers are probably wrong due to my added printfs. This should be reproducible with the attached image.

I used the area command in KDL to look at the area_id being returned from the app_server, and it seems valid and is owned by the app_server. The result from clone_area is B_BAD_VALUE, so I assume it is failing at the lookup_area call on line 1961 of vm.cpp. Maybe for some reason the source address space returned from the MultiAddressSpaceLocker is wrong?

If I fix BBitmapStream::WriteAt to actually check the InitCheck of the bitmap it creates, the segfault is fixed, but the image still won't load of course. It seems to be a deeper problem in either the app_server or kernel.
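The guard described above can be sketched roughly as follows. This is not the actual BBitmapStream code: FakeBitmap and WriteAtChecked are hypothetical stand-ins that only mirror the InitCheck()/Bits() pattern, and the real BBitmap::InitCheck() returns a status_t rather than a bool.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a bitmap whose pixel buffer may fail to
// initialize (for example because cloning the server area failed).
struct FakeBitmap {
	std::vector<std::uint8_t> storage;
	bool initialized;

	FakeBitmap(std::size_t size, bool initOk)
		: storage(initOk ? size : 0), initialized(initOk) {}

	bool InitCheck() const { return initialized; }
	std::uint8_t* Bits() { return initialized ? storage.data() : nullptr; }
	std::size_t BitsLength() const { return storage.size(); }
};

// Copies `size` bytes to `offset` in the bitmap. Returns the number of bytes
// written, or -1 if the bitmap is unusable or the range is out of bounds.
long WriteAtChecked(FakeBitmap& bitmap, std::size_t offset,
	const void* data, std::size_t size)
{
	if (!bitmap.InitCheck() || bitmap.Bits() == nullptr)
		return -1;	// refuse instead of dereferencing a null buffer
	if (offset > bitmap.BitsLength() || size > bitmap.BitsLength() - offset)
		return -1;	// would write past the end of the buffer
	std::memcpy(bitmap.Bits() + offset, data, size);
	return static_cast<long>(size);
}
```

With a check like this the caller gets an error instead of a segfault, which matches the behavior described above: the crash goes away, but the image still cannot load until the underlying clone failure is fixed.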

I've also included my debug output, and the result of the KDL command area for the area from the debug output (as well as the teams output to see it is indeed owned by app_server, team_id 70 = 0x46.)

Attachments (8)

big_yard_image.jpg (4.6 MB ) - added by leavengood 14 years ago.
A high resolution image guaranteed to crash ShowImage
showimage_crash_high_res_jpg.txt (7.4 KB ) - added by leavengood 14 years ago.
Backtrace
printf_debug_output_showimage_crash.txt (1.6 KB ) - added by leavengood 14 years ago.
printf debug output
kdl_areas_teams_showimage_crash.txt (1.9 KB ) - added by leavengood 14 years ago.
KDL output
screenshot1.jpeg (499.2 KB ) - added by cocobean 7 years ago.
Test JPEG oddly not filling ShowImage display window
ImageOnAMac.png (703.2 KB ) - added by Janus 7 years ago.
Gallery-38724-debug-22-12-2024-19-53-41.report (67.7 KB ) - added by smallstepforman 4 weeks ago.
Crash report
Gallery-43299-debug-22-12-2024-19-59-26.report (38.1 KB ) - added by smallstepforman 4 weeks ago.
A version with a single thread for thumbnails

Change History (31)

by leavengood, 14 years ago

Attachment: big_yard_image.jpg added

A high resolution image guaranteed to crash ShowImage

by leavengood, 14 years ago

Backtrace

by leavengood, 14 years ago

printf debug output

by leavengood, 14 years ago

KDL output

comment:1 by leavengood, 14 years ago

Description: modified (diff)

comment:2 by mmlr, 14 years ago

I've started investigating this but ran out of time. What I could gather was that the B_BAD_VALUE actually comes from browser:haiku/trunk/src/system/kernel/vm/VMUserAddressSpace.cpp#L675 which is rather curious, as it indicates the addressSpec to be B_EXACT_ADDRESS which isn't what the BBitmap function supplies. As mentioned I ran out of time then. I'll continue investigating tonight/tomorrow. I wanted to share the finding in case someone else wants to take a look sooner than that.

comment:3 by axeld, 14 years ago

Fixed the BitmapStream part in hrev42297. I'll look into the other problem as well, but I don't have much time, so feel free to continue to investigate it yourself.

comment:4 by axeld, 14 years ago

Looking at http://dev.haiku-os.org/browser/haiku/trunk/src/kits/app/ServerMemoryAllocator.cpp#L79 already reveals the problem: a 128 MB area is successfully reserved to contain the area. However, the area is larger than this, so the cloning (at the exact reserved address) is likely to fail.
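As a rough model of why the clone fails, consider a fixed reservation and an area that has to be placed at an exact address inside it. The sketch below is not the ServerMemoryAllocator code; ReservedRange and FitsInReservation are hypothetical names, and the only figure taken from the ticket is the 128 MB reservation.

```cpp
#include <cstdint>

// Hypothetical model of a reserved address range.
struct ReservedRange {
	std::uint64_t base;
	std::uint64_t size;
};

constexpr std::uint64_t kMiB = 1024ull * 1024ull;

// True if an area of `areaSize` bytes placed at `address` lies entirely
// inside the reservation. Written to avoid overflow in the comparisons.
constexpr bool FitsInReservation(const ReservedRange& range,
	std::uint64_t address, std::uint64_t areaSize)
{
	return address >= range.base
		&& areaSize <= range.size
		&& address - range.base <= range.size - areaSize;
}
```

Under this model, an area larger than the 128 MB reservation can never fit at the reserved address, no matter where in the range it starts.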

comment:5 by axeld, 14 years ago

Status: new → in-progress

comment:6 by leavengood, 14 years ago

Thanks Axel.

That indeed sounds like the problem. Wow, an image this big requires more than 128 MB! I wouldn't have thought so, which is why I glossed over that code.

comment:7 by leavengood, 14 years ago

Actually doing the math the image should only take around 46 MB but as you said I guess it is the area that is bigger than 128 MB.

comment:8 by axeld, 14 years ago

Resolution: fixed
Status: in-progress → closed

Fixed in hrev42298 (just came back and noticed that Trac refused to add the comment, as Ryan got in between -- really annoying feature).

Not sure what you have calculated there, but 8000*5000*4 (4 bytes per pixel) is more than 128 MB :-)
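The arithmetic can be checked at compile time. This is a minimal sketch using the 8000 x 5000 figures from the comment above; BitmapBytes is a hypothetical helper, and row padding is ignored (an assumption; 8000 pixels at 4 bytes each is already a multiple of 4 bytes per row).

```cpp
#include <cstdint>

// Uncompressed size of a bitmap, ignoring any per-row padding.
constexpr std::uint64_t BitmapBytes(std::uint64_t width, std::uint64_t height,
	std::uint64_t bytesPerPixel)
{
	return width * height * bytesPerPixel;
}

// The 128 MB reservation, taken here as 128 * 1024 * 1024 bytes.
constexpr std::uint64_t kReservationBytes = 128ull * 1024 * 1024;

// 4 bytes per pixel: 160,000,000 bytes, which exceeds the reservation.
static_assert(BitmapBytes(8000, 5000, 4) == 160000000ull, "32-bit size");
static_assert(BitmapBytes(8000, 5000, 4) > kReservationBytes, "too large");

// 1 byte per pixel: 40,000,000 bytes, which would have fit.
static_assert(BitmapBytes(8000, 5000, 1) < kReservationBytes, "would fit");
```

The same pixel count at 1 byte per pixel stays well under the limit, which is why the size looks harmless until the bytes-per-pixel factor is included.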

comment:9 by leavengood, 14 years ago

Yeah that Trac "feature" bit me a few times on this bug too.

Oops, my silly math was for 1 byte per pixel :-D

Thanks for fixing this!

comment:10 by cocobean, 7 years ago

Tested on hrev51875 x86_64. This bug is still valid. Loading the same picture in ShowImage gives "Can't load image. Either file or an image translator does not exist". WebPositive shows the image initially while loading, then shows a grey image of it after completion. I tested a few other, smaller high-res JPEGs, which worked. On retesting the test picture, ShowImage will eventually display it, but as a small picture not filling the display window (see screenshot). I had to zoom in to enlarge it, and displaying the picture worked properly afterwards. I can also display it in WebPositive. NOTE: Getting it to work with the test images is not consistent. If you close the image in WebPositive/ShowImage, ShowImage will eventually start displaying the same "Can't load image..." error again.

NOTE: I am observing resource/memory usage spikes. It is possible that resource-intensive apps like WebPositive, or some other system activity, were starving other parts of the system (i.e. app_server, system memory, etc.) in the background, causing this intermittent problem.

You can close this ticket at your discretion. If a user keeps the resource usage of other apps and the system low, there are no major issues. Using high-res JPEGs is not the issue.

Last edited 7 years ago by cocobean

comment:11 by cocobean, 7 years ago

Resolution: fixed
Status: closed → reopened

by cocobean, 7 years ago

Attachment: screenshot1.jpeg added

Test JPEG oddly not filling ShowImage display window

in reply to:  11 comment:12 by Janus, 7 years ago

Replying to cocobean:

Test JPEG oddly not filling ShowImage display window

I think the JPEG is corrupted; I have the same problem on macOS.

Last edited 7 years ago by Janus

by Janus, 7 years ago

Attachment: ImageOnAMac.png added

comment:13 by cocobean, 7 years ago

We can do View->Full screen in ShowImage. Also, I retested other JPEG images up to 50.3 Megapixels with ShowImage - no major issues on hrev51877 x86_64.

However, if we extensively view medium-to-large files from USB devices, we can still hit a kernel paging issue (a known issue), so I don't think it is specific to ShowImage itself.

Last edited 7 years ago by cocobean

comment:14 by waddlesplash, 6 years ago

Resolution: fixed
Status: reopened → closed

That is not an app server crash, and if it is actually an issue, deserves a separate ticket.

comment:15 by nielx, 5 years ago

Milestone: R1 → R1/beta2

Assign tickets with status=closed and resolution=fixed within the R1/beta2 development window to the R1/beta2 Milestone

comment:16 by smallstepforman, 4 weeks ago

I am working on a HEIC thumbnail viewer for myself, which also has to handle old-fashioned JPEG/PNG images, and I regularly hit this issue when generating thumbnails from folders with more than 100 images. It always crashes with the following stack trace, at random intervals:

Frame IP Function Name
-----------------------------------------------
0x7ff7a4410dd0 0x1c0679e1ab6 memcpy + 0x26

Disassembly:

memcpy:
0x000001c0679e1a90: 55 push %rbp
0x000001c0679e1a91: 4889d1 mov %rdx, %rcx
0x000001c0679e1a94: 4889e5 mov %rsp, %rbp
0x000001c0679e1a97: 4156 push %r14
0x000001c0679e1a99: 4155 push %r13
0x000001c0679e1a9b: 4989fd mov %rdi, %r13
0x000001c0679e1a9e: 4154 push %r12
0x000001c0679e1aa0: 4989f4 mov %rsi, %r12
0x000001c0679e1aa3: 4883ec18 sub $0x18, %rsp
0x000001c0679e1aa7: 4883fa10 cmp $0x10, %rdx
0x000001c0679e1aab: 7623 jbe 0x1c0679e1ad0
0x000001c0679e1aad: 4881faff070000 cmp $0x7ff, %rdx
0x000001c0679e1ab4: 765a jbe 0x1c0679e1b10
0x000001c0679e1ab6: f3a4 rep movsb <--

Frame memory:

[0x7ff7a4410d90] .MV.S....?V.S... f0 4d 56 b6 53 10 00 00 c0 3f 56 b6 53 10 00 00
[0x7ff7a4410da0] .:.#&.....A..... e0 3a 83 23 26 01 00 00 b0 16 41 a4 f7 7f 00 00
[0x7ff7a4410db0] ........ .B.S... 00 00 00 00 00 00 00 00 20 f0 42 b5 53 10 00 00
[0x7ff7a4410dc0] @.A......8.P.... 40 0e 41 a4 f7 7f 00 00 a3 38 07 50 08 01 00 00

0x7ff7a4410e50 0x1085007389e BBitmapStream::WriteAt(long, void const*, unsigned long) + 0x7e

0x7ff7a4410e80 0x10ac09cf34e BPositionIO::Write(void const*, unsigned long) + 0x2e

0x7ff7a4411310 0x1d7523c1b20 JPEGTranslator::Decompress(BPositionIO*, BPositionIO*, BMessage*, jmp_buf_tag const[1]*) + 0x4d0

0x7ff7a44113b0 0x1d7523c1db3 JPEGTranslator::DerivedTranslate(BPositionIO*, translator_info const*, BMessage*, unsigned int, BPositionIO*, int) + 0x83

0x7ff7a4411410 0x1d7523c31fc BaseTranslator::BitsTranslate(BPositionIO*, translator_info const*, BMessage*, unsigned int, BPositionIO*) + 0x9c

0x7ff7a4411690 0x10850079b87 BTranslatorRoster::Translate(BPositionIO*, translator_info const*, BMessage*, BPositionIO*, unsigned int, unsigned int, char const*) + 0xc7

0x7ff7a4411770 0x1085007469f BTranslationUtils::GetBitmap(BPositionIO*, BTranslatorRoster*) + 0x4f

0x7ff7a4411940 0x10850074977 BTranslationUtils::GetBitmapFile(char const*, BTranslatorRoster*) + 0xd7

0x7ff7a4411960 0x10850074a5d BTranslationUtils::GetBitmap(char const*, BTranslatorRoster*) + 0xd

It's always in memcpy called from BBitmapStream::WriteAt, and always on different images, so there is no repeatable way to reproduce it.

comment:17 by smallstepforman, 4 weeks ago

I forgot to mention that this is hrev58436, from 19 Dec 2024, so relatively recent (and any rev in the last couple of years)

comment:18 by waddlesplash, 4 weeks ago

Please attach the full debug report.

Does running with the guarded heap make any difference?

comment:19 by smallstepforman, 4 weeks ago

I've tried debugging with LD_PRELOAD=libroot_debug.so, however that never lets me exceed 2.4 GiB of RAM (I have 32 GB here, running the x86_64 version of Haiku). The app always freezes with LD_PRELOAD=libroot_debug.so once I hit 2.4 GiB, and if I don't set that environment variable, then I can grow much higher.

During my tests this evening, I never triggered the BBitmapStream issue. If libroot_debug would allow allocating more than 2.4 GiB of RAM, then maybe I would have had more luck. I will keep trying tonight, and report if I find something.

by smallstepforman, 4 weeks ago

Crash report

by smallstepforman, 4 weeks ago

A version with a single thread for thumbnails

comment:20 by smallstepforman, 4 weeks ago

I finally got it to crash with the guarded heap. I don't know how much it can help. I do have a 2 GB core file, which may be too much to share here ...

comment:21 by smallstepforman, 4 weeks ago

KERN: debug_server: Thread 43300 entered the debugger: Segment violation
KERN: stack trace, current PC 0x1b77e915816 </boot/system/lib/libroot_debug.so> memcpy + 0x26:
KERN: (0x7f82692826c0) 0x1ac863338a3 </boot/system/lib/libtranslation.so> _ZN13BBitmapStream7WriteAtElPKvm + 0x83
KERN: (0x7f8269282740) 0xb21d582354 </boot/system/lib/libbe.so> _ZN11BPositionIO5WriteEPKvm + 0x34
KERN: (0x7f8269282770) 0x1d9da2d2b23 </boot/system/add-ons/Translators/JPEGTranslator> _ZN14JPEGTranslator10DecompressEP11BPositionIOS1_P8BMessagePA1_K13jmp_buf_tag + 0x4d3
KERN: (0x7f8269282c00) 0x1d9da2d2db8 </boot/system/add-ons/Translators/JPEGTranslator> _ZN14JPEGTranslator16DerivedTranslateEP11BPositionIOPK15translator_infoP8BMessagejS1_i + 0x88
KERN: (0x7f8269282ca0) 0x1d9da2d41ff </boot/system/add-ons/Translators/JPEGTranslator> _ZN14BaseTranslator13BitsTranslateEP11BPositionIOPK15translator_infoP8BMessagejS1_ + 0x9f
KERN: (0x7f8269282d00) 0x1ac86339b8a </boot/system/lib/libtranslation.so> _ZN17BTranslatorRoster9TranslateEP11BPositionIOPK15translator_infoP8BMessageS1_jjPKc + 0xca
KERN: (0x7f8269282f80) 0x1ac863346a2 </boot/system/lib/libtranslation.so> _ZN17BTranslationUtils9GetBitmapEP11BPositionIOP17BTranslatorRoster + 0x52
KERN: (0x7f8269283060) 0x1ac8633497c </boot/system/lib/libtranslation.so> _ZN17BTranslationUtils13GetBitmapFileEPKcP17BTranslatorRoster + 0xdc
KERN: (0x7f8269283230) 0x1ac86334a62 </boot/system/lib/libtranslation.so> _ZN17BTranslationUtils9GetBitmapEPKcP17BTranslatorRoster + 0x12
KERN: (0x7f8269283250) 0x1818de55c40 </boot/home/Development/Gallery/Gallery> _ZN10ImageCache10BitmapItemC2ERK7BStringP5Image + 0x78
KERN: (0x7f8269283290) 0x1818de55fdc </boot/home/Development/Gallery/Gallery> _ZN10ImageCache9GetBitmapERK7BStringP5Image + 0x174
KERN: (0x7f8269283350) 0x1818de55709 </boot/home/Development/Gallery/Gallery> _ZN5Image14AsyncLoadImageEffP7BWindow + 0xcb
KERN: (0x7f82692833b0) 0x1818de66f87 </boot/home/Development/Gallery/Gallery> _ZSt13invoke_implIvRM5ImageFvffP7BWindowERPS0_JRfS8_RP13GalleryWindowEET_St21invoke_memfun_derefOT0_OT1_DpOT2_ + 0xb6
KERN: (0x7f8269283410) 0x1818de66e4d </boot/home/Development/Gallery/Gallery> _ZSt8invokeIRM5ImageFvffP7BWindowEJRPS0_RfS8_RP13GalleryWindowEENSt15invoke_resultIT_JDpT0_EE4typeEOSD_DpOSE_ + 0x7f
KERN: (0x7f8269283470) 0x1818de66cf4 </boot/home/Development/Gallery/Gallery> _ZNSt5_BindIFM5ImageFvffP7BWindowEPS0_ffP13GalleryWindowEE6callIvJEJLm0ELm1ELm2ELm3EEEET_OSt5tupleIJDpT0_EESt12_Index_tupleIJXspT1_EEE + 0xce
KERN: (0x7f82692834c0) 0x1818de66b60 </boot/home/Development/Gallery/Gallery> _ZNSt5_BindIFM5ImageFvffP7BWindowEPS0_ffP13GalleryWindowEEclIJEvEET0_DpOT_ + 0x24
KERN: (0x7f82692834f0) 0x1818de66a2e </boot/home/Development/Gallery/Gallery> _ZSt13invoke_implIvRSt5_BindIFM5ImageFvffP7BWindowEPS1_ffP13GalleryWindowEEJEET_St14invoke_otherOT0_DpOT1_ + 0x20
KERN: (0x7f8269283510) 0x1818de667a8 </boot/home/Development/Gallery/Gallery> _ZSt10invoke_rIvRSt5_BindIFM5ImageFvffP7BWindowEPS1_ffP13GalleryWindowEEJEENSt9enable_ifIX16is_invocable_r_vIT_T0_DpT1_EESD_E4typeEOSE_DpOSF_ + 0x20
KERN: (0x7f8269283530) 0x1818de66223 </boot/home/Development/Gallery/Gallery> _ZNSt17_Function_handlerIFvvESt5_BindIFM5ImageFvffP7BWindowEPS2_ffP13GalleryWindowEEE9_M_invokeERKSt9_Any_data + 0x20
KERN: (0x7f8269283550) 0x1818de52190 </boot/home/Development/Gallery/Gallery> _ZNKSt8functionIFvvEEclEv + 0x32
KERN: (0x7f8269283570) 0x1818de52ba4 </boot/home/Development/Gallery/Gallery> _ZN9yplatform10WorkThread11work_threadEPv + 0x34c
KERN: (0x7f8269283660) 0x1b77e895ee9 </boot/system/lib/libroot_debug.so> thread_entry + 0x19
KERN: vm_page_fault: vm_soft_fault returned error 'Bad address' on fault at 0x7f80d3c6e000, ip 0xffffffff80179436, write 0, kernel, exec 0, thread 0xa9ce
KERN: vm_page_fault: vm_soft_fault returned error 'Bad address' on fault at 0x7f80d3c6f000, ip 0xffffffff80179436, write 0, kernel, exec 0, thread 0xa9ce
KERN: vm_page_fault: vm_soft_fault returned error 'Bad address' on fault at 0x7f80d3c70000, ip 0xffffffff80179436, write 0, kernel, exec 0, thread 0xa9ce
KERN: vm_page_fault: vm_soft_fault returned error 'Bad address' on fault at 0x7f80d3c71000, ip 0xffffffff80179436, write 0, kernel, exec 0, thread 0xa9ce
KERN: vm_page_fault: vm_soft_fault returned error 'Bad address' on fault at 0x7f8269240000, ip 0xffffffff80179436, write 0, kernel, exec 0, thread 0xa9ce
KERN: vm_page_fault: vm_soft_fault returned error 'Bad address' on fault at 0x7f8269241000, ip 0xffffffff80179436, write 0, kernel, exec 0, thread 0xa9ce
KERN: vm_page_fault: vm_soft_fault returned error 'Bad address' on fault at 0x7f8269242000, ip 0xffffffff80179436, write 0, kernel, exec 0, thread 0xa9ce
KERN: vm_page_fault: vm_soft_fault returned error 'Bad address' on fault at 0x7f8269243000, ip 0xffffffff80179436, write 0, kernel, exec 0, thread 0xa9ce
KERN: vm_page_fault: vm_soft_fault returned error 'Bad address' on fault at 0x7fa9887d2000, ip 0xffffffff80179436, write 0, kernel, exec 0, thread 0xa9ce
KERN: vm_page_fault: vm_soft_fault returned error 'Bad address' on fault at 0x7fa9887d3000, ip 0xffffffff80179436, write 0, kernel, exec 0, thread 0xa9ce
KERN: vm_page_fault: vm_soft_fault returned error 'Bad address' on fault at 0x7fa9887d4000, ip 0xffffffff80179436, write 0, kernel, exec 0, thread 0xa9ce
KERN: vm_page_fault: vm_soft_fault returned error 'Bad address' on fault at 0x7fa9887d5000, ip 0xffffffff80179436, write 0, kernel, exec 0, thread 0xa9ce
KERN: vm_page_fault: vm_soft_fault returned error 'Bad address' on fault at 0x7ffc12029000, ip 0xffffffff80179436, write 0, kernel, exec 0, thread 0xa9ce
KERN: vm_page_fault: vm_soft_fault returned error 'Bad address' on fault at 0x7ffc1202a000, ip 0xffffffff80179436, write 0, kernel, exec 0, thread 0xa9ce
KERN: vm_page_fault: vm_soft_fault returned error 'Bad address' on fault at 0x7ffc1202b000, ip 0xffffffff80179436, write 0, kernel, exec 0, thread 0xa9ce
KERN: vm_page_fault: vm_soft_fault returned error 'Bad address' on fault at 0x7ffc1202c000, ip 0xffffffff80179436, write 0, kernel, exec 0, thread 0xa9ce

comment:22 by smallstepforman, 4 weeks ago

The above is the output in syslog

comment:23 by smallstepforman, 4 weeks ago

Possibly unrelated, but when I run the app through gdb, it never crashes. Without gdb, I have found a directory with images which crashes every 2nd time. With gdb, never. Obviously the memory mapping and timing is different ...
