Opened 10 years ago

Closed 10 years ago

#11104 closed enhancement (fixed)

[PATCH] JSON parser

Reported by: waddlesplash Owned by: stippi
Priority: normal Milestone: R1
Component: Kits Version: R1/Development
Keywords: Cc:
Blocked By: Blocking:
Platform: All

Description

Will be used by HaikuDepot, but it's going in the shared kit so that other apps can use it in the future.

To parse, just call Json::Parse with a BMessage and a BString or char* that has JSON data in it. It should be self-explanatory after that :)

Attached is the patch and a small program that I used to test it (it loads a test.json file from the working directory and then calls ->PrintToStream on the BMessage).

Tested against various cases found on http://json.org including a few that used \uXXXX constructs.

Attachments (2)

JsonTest.cpp (435 bytes ) - added by waddlesplash 10 years ago.
JSON tester, compile with g++ -g JsonTest.cpp haiku/src/kits/shared/Json.cpp haiku/src/kits/shared/MessageBuilder.cpp -Ihaiku/headers/private/shared/ -o jsontest -lbe
0001-libshared-Introduce-JSON-parser-and-MessageBuilder.patch (15.1 KB ) - added by waddlesplash 10 years ago.
The parser.

Download all attachments as: .zip

Change History (15)

comment:1 by waddlesplash, 10 years ago

patch: 01

comment:2 by pulkomandy, 10 years ago

Owner: changed from nobody to stippi
Status: newassigned

On the patch:

  • Class names should start with B like everything in the BeAPI to avoid conflicts with user application classes. Or, the BPrivate namespace could be used.
  • On a first quick read of the code, I think the logic to handle array vs object could be moved to the BMessageBuilder. So we could call AddString with the data and let the message builder put it at the right place. I could be wrong, I didn't analyze the code too deeply.

On the tests:

  • A good test has an easy way to decide on pass/fail. Yours doesn't as it requires someone to manually analyze the results. This makes the test much less useful, because only you can know what the results are supposed to be. Moreover, if we ask you to look at bugs again in a few months/years, maybe you will have forgotten what the results were supposed to be
  • Having the tests integrated in src/tests, and ideally using the UnitTester framework, would be appreciated. This is documented in "Unit Testing, part deux" and "Unit Testing for fun and profit" on the website.

comment:3 by waddlesplash, 10 years ago

OK, I'll fix the class naming, but I'd rather do everything at once so I'll wait for now.

Array/object can't be handled inside MessageBuilder because the MessageBuilder is generic and knows nothing about what types things are. It blindly adds what it's told. This way, we can exchange MessageBuilder for some other data format in the future that does not support all the things BMessage does.

As for tests: I wasn't sure where to put them or what the best way to do it was. Placing them in src/tests/shared sounds OK, I'll see if I can use the UnitTester framework. Is there some way to get an equivalent of PrintToStream() into a BString?

comment:4 by waddlesplash, 10 years ago

mmu_man just pointed out to me on IRC that the JsonValueType needs commenting and proper 4-character-codes.

comment:5 by pulkomandy, 10 years ago

You can use BMessage::HasSameData to compare BMessages. So you can compare the parser output with an archived (or program-generated) message. It even supports recursive comparison of nested messages.

comment:6 by axeld, 10 years ago

A good test would be a unit test, anyway.

About the parser: I think the BMessage output should be implemented via a callback mechanism, as you can have very large JSON data in real life. For that same reason, it should internally work on a BDataIO, not just a BString/char*.

comment:7 by waddlesplash, 10 years ago

But if you have something so large it wouldn't fit in a BString, the BMessage tree would exceed your memory, no?

Also, what do you mean about a callback? The BMessage is passed in for a reason -- whatever calls Json::Parse should own it.

in reply to:  7 comment:8 by axeld, 10 years ago

Replying to waddlesplash:

But if you have something so large it wouldn't fit in a BString, the BMessage tree would exceed your memory, no?

Exactly. This is why both should be optional. You may want to pass a BDataIO to the parser instead, and use a DOM parser style handler to handle the contents. The BMessage handler would just be an optional handler with a convenience method just like already exists.

Also, what do you mean about a callback? The BMessage is passed in for a reason -- whatever calls Json::Parse should own it.

That would not change, at all. I'm not sure you understood what I'm talking about, though :-)

Version 0, edited 10 years ago by axeld (next)

comment:9 by waddlesplash, 10 years ago

Now I understand. You want a SAX-style parser for JSON. We *could* do that, and the API is set up so we can remove MessageBuilder and replace it with GenericBuilder (of which one implementation would be MessageBuilder, and a callback-style interface could be another).

However, I see no reason to do that *right now* as Stephan does not need it for his work. We can discuss doing that later if/when this API goes public.

comment:10 by axeld, 10 years ago

Sorry, I mixed up DOM and SAX style. The BDataIO input change would already make sense, I suppose, but other than that, yes, that could be done later. In any case, it should be done before this API gets public or sees widespread use.

by waddlesplash, 10 years ago

Attachment: JsonTest.cpp added

JSON tester, compile with g++ -g JsonTest.cpp haiku/src/kits/shared/Json.cpp haiku/src/kits/shared/MessageBuilder.cpp -Ihaiku/headers/private/shared/ -o jsontest -lbe

comment:11 by stippi, 10 years ago

Thanks for the code! It looks like it gets the job done. There is some code with regards to handling null, true, false which looks like it could somehow be refactored to eliminate the repeating code. The code like this:

case 't':
{
	if (builder.What() != JSON_TYPE_ARRAY && key.Length() == 0)
		return B_BAD_DATA;
		// 'true' cannot be a key, it can only be a value
	
	BString value;
	JSON.CopyInto(value, (int32)pos, 4);
	if (value == "true") {
		if (builder.What() == JSON_TYPE_ARRAY)
			key.SetToFormat("%zu", builder.CountNames());
		builder.AddBool(key.String(), true);
		key = "";
				pos += 3;
	} else
		return B_BAD_DATA;
		// "t" out in the middle of nowhere!?

	break;
}

should end up looking like this:

case 't':
	if (_ParseConstant("true", JSON, &pos, builder, key)) {
		builder.AddBool(key.String(), true);
		key = "";
	} else {
		return B_BAD_DATA;
	}
	break;

That you constantly have to reset "key" is also not so nice, but I have no better idea right now except to move the key adding into a templated method and change the MessageBuilder methods to all have the same name:

template<typename DataType>
void
BJson::_AddData(MessageBuilder& builder, BString& key, const DataType& data) {
    builder.AddData(key.String(), data);
    key = "";
}

And I am a bit sad you did not use my suggestion of the generic and abstract ObjectBuilder with MessageBuilder being the only concrete implementation of it for now.

Last edited 10 years ago by stippi (previous) (diff)

comment:12 by waddlesplash, 10 years ago

I wasn't sure what exactly we'd want to do with it as I don't know what can be split from ObjectBuilder from MessageBuilder without knowing what usecases we'd actually have (I had to modify MessageBuilder a few times while writing the JSON parser). If anyone comes along with an actual use case for that I'll do the work necessary to split and implement.

As for your _ParseConstant stuff, great suggestion. I'll get around to it tomorrow or the day after; things are a bit hazy with repairing my main workstation ATM (and not having a way to build Haiku because of that).

comment:13 by stippi, 10 years ago

Resolution: fixed
Status: assignedclosed

Patch applied, and subsequently some fixes. Thanks a lot, it's already working.

Note: See TracTickets for help on using tickets.