Opened 11 years ago

Closed 11 years ago

Last modified 11 years ago

#9140 closed bug (fixed)

KDL when configuring bison-2.6.4

Reported by: diger Owned by: axeld
Priority: critical Milestone: R1
Component: System/Kernel Version: R1/alpha4
Keywords: Cc:
Blocked By: Blocking: #10111
Platform: All

Description

hrev44732 gcc4, gcc4.6.3

When trying to configure bison, I get a KDL.

Attachments (4)

diger.png (97.1 KB ) - added by diger 11 years ago.
bison-kdl.txt (26.7 KB ) - added by siarzhuk 11 years ago.
cutout of crash log
kdl-groff.png (660.6 KB ) - added by tidux 11 years ago.
bison-kdl-on-nightly-hrev46038.png (62.4 KB ) - added by siarzhuk 11 years ago.
The KDL reproduced on newest available hrev46038 nightly.


Change History (28)

by diger, 11 years ago

Attachment: diger.png added

comment:1 by diger, 11 years ago

The KDL is in the VFS code. See the screenshot for more details.

comment:2 by diver, 11 years ago

Component: - General → System/Kernel
Owner: changed from nobody to axeld

comment:3 by diver, 11 years ago

Might be a dupe of #1988.

comment:4 by siarzhuk, 11 years ago

This test reproduces the behaviour:

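#include <fcntl.h>
#include <unistd.h>	/* symlink(), close() */

int
main (void)
{
	int result = 0;
	static char const sym[] = "conftest.sym";
	/* Create a symlink that points at /dev/null. */
	if (symlink ("/dev/null", sym) != 0)
		result |= 2;	/* could not create the symlink */
	else
	{
		/* Try to create a file where the symlink already exists,
		   without following the link. */
		int fd = open (sym, O_WRONLY | O_NOFOLLOW | O_CREAT, 0);
		if (fd >= 0)
		{
			/* open() succeeded even though the path is a symlink. */
			close (fd);
			result |= 4;
		}
	}
	return result;
}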

comment:5 by diger, 11 years ago

This bug also reproduces when configuring gettext-runtime 0.18.2 and gettext-tools 0.18.2.

in reply to:  4 ; comment:6 by anevilyak, 11 years ago

Replying to siarzhuk:

This test reproduces the behaviour:

Hi Siarzhuk,

I don't suppose there's anything special about your and/or diger's system configuration? Thus far neither the above test nor any of the configure scripts mentioned are reproducing the panic over here. As a first hunch I tried switching to a Cyrillic locale but that made no difference. hrev45350, gcc4, 8GB of RAM and 8 CPU cores over here for reference.

in reply to:  6 comment:7 by siarzhuk, 11 years ago

Replying to anevilyak:

I don't suppose there's anything special about your and/or diger's system configuration? Thus far neither the above test nor any of the configure scripts mentioned are reproducing the panic over here. As a first hunch I tried switching to a Cyrillic locale but that made no difference. hrev45350, gcc4, 8GB of RAM and 8 CPU cores over here for reference.

Strange, it is reproducible both in VirtualBox and on real hardware on my home system. Maybe our /Sources partitions, which were created years ago, affect this. I'll check more widely then.

comment:8 by siarzhuk, 11 years ago

I have checked the bison.c test on the following systems:

1. hrev42604 GCC4/Hybrid (in VirtualBox)
2. hrevalpha4-44594
3. hrev45141 x86_64
4. hrev43037 GCC4/Hybrid
5. hrev45223 GCC4/Hybrid
6. hrev44869 GCC4/Hybrid

The test fails with the same error on all of these systems. :-|

comment:9 by anevilyak, 11 years ago

Always with the same set of partitions? Or does e.g. a completely clean virtualbox image with no other partitions mounted exhibit the same issue?

in reply to:  9 ; comment:10 by siarzhuk, 11 years ago

Replying to anevilyak:

Always with the same set of partitions? Or does e.g. a completely clean virtualbox image with no other partitions mounted exhibit the same issue?

By the way, the VirtualBox case above (#1) is exactly such a "completely clean" image. 2, 3, 4 are different partitions of the same PC. 5, 6 are different partitions of the other PC. BTW, diger.png was captured in VirtualBox on my PC at work. I tested all cases by copying bison.c to the home directory, running "gcc bison.c", and running the resulting a.out file.

in reply to:  10 ; comment:11 by anevilyak, 11 years ago

Replying to siarzhuk:

By the way, the VirtualBox case above (#1) is exactly such a "completely clean" image. 2, 3, 4 are different partitions of the same PC. 5, 6 are different partitions of the other PC. BTW, diger.png was captured in VirtualBox on my PC at work. I tested all cases by copying bison.c to the home directory, running "gcc bison.c", and running the resulting a.out file.

Tried exactly those steps, still no luck. Could you by any chance try enabling VFS tracing (http://cgit.haiku-os.org/haiku/tree/src/system/kernel/fs/vfs.cpp#n66 ), and then paste the resulting serial output from vbox here?

by siarzhuk, 11 years ago

Attachment: bison-kdl.txt added

cutout of crash log

in reply to:  11 comment:12 by siarzhuk, 11 years ago

Replying to anevilyak:

Could you by any chance try enabling VFS tracing (http://cgit.haiku-os.org/haiku/tree/src/system/kernel/fs/vfs.cpp#n66 ), and then paste the resulting serial output from vbox here?

It was a bit tricky. First I had to disable syslog because, I suspect, it never stops tracing its own writes to the system log. Then I waited about 3 hours, without success, for app_server and the other bells and whistles to finish loading. Finally I just hardcoded "launch /bin/consoled" into the boot script, which let me run a.out and get the KDL. :-) I hope it helps.

comment:13 by tidux, 11 years ago

I was able to reproduce this on both hrev45480 and hrev45526 when attempting to configure groff 1.22.2, on a physical machine and a virtual machine. Here's a screenshot of the VM crashing.

by tidux, 11 years ago

Attachment: kdl-groff.png added

comment:14 by diger, 11 years ago

hrev45703 gcc4.7.3

reproduced when configuring gettext-runtime 0.18.2 & gettext-tools 0.18.2 & bison & groff

in reply to:  11 ; comment:15 by siarzhuk, 11 years ago

Replying to anevilyak:

Tried exactly those steps, still no luck. Could you by any chance try enabling VFS tracing (http://cgit.haiku-os.org/haiku/tree/src/system/kernel/fs/vfs.cpp#n66 ), and then paste the resulting serial output from vbox here?

Any news here? Diger tells me that more and more software packages are affected by this problem. He is the maintainer of the Haiku port of the PKGSRC system and can watch this problem grow in real time. ;-)

This looks like an issue with the newest autoconf versions (>= 2.69) and, IMO, may become a serious problem as soon as we try to recompile the optional packages in preparation for the next Haiku release.

in reply to:  15 comment:16 by anevilyak, 11 years ago

Replying to siarzhuk:

Any news here? Diger tells me that more and more software packages are affected by this problem. He is the maintainer of the Haiku port of the PKGSRC system and can watch this problem grow in real time. ;-)

Speaking for myself only, it's still completely impossible to reproduce on my own hardware, and my knowledge of the VFS is otherwise too limited to go off the log output alone. I'd hoped one of the other kernel developers who had more exposure/experience with that code would comment. There's a possibility it could in some way be related to some of the other races involving get_vnode() i.e. #5262 or #9839 though.

comment:17 by diger, 11 years ago

Priority: normal → critical

comment:18 by diger, 11 years ago

hrev46032 gcc4.7.3

reproduced when configuring gtexinfo & libidn

BTW, in my 10 months' experience, about 2-3 such KDLs are enough to damage the FS unrecoverably.

by siarzhuk, 11 years ago

Attachment: bison-kdl-on-nightly-hrev46038.png added

The KDL reproduced on newest available hrev46038 nightly.

comment:19 by siarzhuk, 11 years ago

The KDL reproduced on newest available hrev46038 nightly.

Reproduced in VirtualBox 4.2.18.hrev88780 using bison.c on the latest available GCC2 nightly hrev46038

comment:20 by siarzhuk, 11 years ago

Hm... Just a quick look: create_vnode's openMode parameter is 524801, which corresponds to O_CREAT | O_WRONLY | O_NOFOLLOW (0x80201). So the traverse variable in the code below should be set to false.

The only call of VNodePutter::Put is "protected" by if (... && traverse), so it should not be issued when traverse is false. But it was.

Was O_NOFOLLOW defined to a value other than 0x00080000 when vfs.cpp was compiled? Or have I missed something? ;-)

static int
create_vnode(struct vnode* directory, const char* name, int openMode,
	int perms, bool kernel)
{
	bool traverse = ((openMode & (O_NOTRAVERSE | O_NOFOLLOW)) == 0);

[...]
			// If the node is a symlink, we have to follow it, unless
			// O_NOTRAVERSE is set.
			if (S_ISLNK(vnode->Type()) && traverse) {
				putter.Put();

in reply to:  20 ; comment:21 by bonefish, 11 years ago

Replying to siarzhuk:

The only call of VNodePutter::Put is "protected" by if (... && traverse), so it should not be issued when traverse is false. But it was.

VNodePutter is a RAII style class. Put() is also called in the destructor.

There's an erroneous put_vnode() (probably overlooked when changing the code to use VNodePutter) in an error case. So I suppose there already exists a symlink where the file shall be created.

in reply to:  21 comment:22 by siarzhuk, 11 years ago

Replying to bonefish:

VNodePutter is a RAII style class. Put() is also called in the destructor.

Ah... That is what I missed. :)

There's an erroneous put_vnode() (probably overlooked when changing the code to use VNodePutter) in an error case. So I suppose there already exists a symlink where the file shall be created.

Yes, that is exactly what this configure test does: it attempts to create a file in place of an existing symlink. Thank you for pointing it out!

comment:23 by siarzhuk, 11 years ago

Resolution: fixed
Status: new → closed

Fixed in hrev46039. Thanks again!

comment:24 by anevilyak, 11 years ago

Blocking: 10111 added

(In #10111) Duplicate indeed.
