Tuesday, December 06, 2011

On JSON Comments

Another active exchange with @getify about JSON comments, and here is my take, because tweets are cool but sometimes it is hard to say everything you think in 140 bytes ...

Kyle Facts

JSON is used on a daily basis for billions of things, and configuration files are one, surely common, way to use JSON ( just think about npm packages for node.js ).
His frustration about the fact that JSON does not allow comments is understandable, and he even created an online petition about allowing comments in the JSON spec ... but are comments really what we need?

Just A Side Effect

JSON is extremely attractive as a standard, first of all because it's available in, and widely adopted by, basically any programming language in this world, even those that never had to deal with a single JavaScript interlocutor, and secondly because it's both simple to parse and easy for humans to read.
After all, what can be so wrong about comments inside such a common serialization standard?
Aren't comments just as easy to parse as white spaces?
The problem is, in my opinion, that we are mixing up an easy process to serialize data, much easier compared to what the PHP serialize and unserialize functions do, with the possibility to describe that data.
I have always seen JSON as a protocol rather than a YAML substitute, and as a protocol I expect it to be as compact as possible and as cross-platform as possible.
Especially about the latter point:
that's annoying to port to every single language
Precisely. What we should understand is that JSON became so popular without needing comments, so maybe the fact that today we would like to use it as a "descriptive markup" no longer reflects the success, the adoption, and the possibilities this standard brought to all these languages?
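To make the "as easy as white spaces" question a bit more concrete, here is a minimal sketch of what accepting comments would ask from a parser. The helper name `stripJsonComments` is hypothetical and not part of any standard or library, and the `//` and `/* */` comment styles are an assumption too, since no JSON spec defines them. The point it tries to show is that, unlike white space, comments force the parser to know whether it is inside a string.

```javascript
// Hypothetical helper, not part of any standard: removes // and /* */
// comments that appear outside JSON strings, so JSON.parse can read the rest.
function stripJsonComments(text) {
  let out = "";
  let inString = false;
  let i = 0;
  while (i < text.length) {
    const ch = text[i];
    const next = text[i + 1];
    if (inString) {
      out += ch;
      if (ch === "\\") {
        // copy the escaped character as it is, so \" does not end the string
        if (i + 1 < text.length) out += text[i + 1];
        i += 2;
        continue;
      }
      if (ch === '"') inString = false;
      i++;
    } else if (ch === '"') {
      inString = true;
      out += ch;
      i++;
    } else if (ch === "/" && next === "/") {
      // line comment: skip up to, but not including, the end of the line
      while (i < text.length && text[i] !== "\n") i++;
    } else if (ch === "/" && next === "*") {
      // block comment: skip past the closing */ (or to the end if unclosed)
      const end = text.indexOf("*/", i + 2);
      i = end === -1 ? text.length : end + 2;
    } else {
      out += ch;
      i++;
    }
  }
  return out;
}

const sample = '{"url": "http://example.com/a", // where to connect\n "retries": 3 /* attempts */}';
const cleaned = JSON.parse(stripJsonComments(sample));
console.log(cleaned.url, cleaned.retries); // http://example.com/a 3
```

Note that the double slash inside the URL survives only because the sketch tracks strings and escapes; something this small would still have to be ported, and agreed upon, in every language that speaks JSON.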

Improve The Standard

Thank gosh software is not always stuck behind immutable standards or patents ... and neither is JSON.
If the need for comments is such a big topic, define another standard able to combine the good old one with a new one.
If this new standard is truly what developers need, every JSON implementor will spend a few hours testing and optimizing for the well-defined standard in order to accept comments, won't they?
I mean ... RFC 4627 was not meant to be the final solution; it was a Crockford proposal that got universally adopted.
Creating a new standard able to extend RFC 4627 should not be a big problem ... or maybe ...

JSON Is Not A JavaScript Thing

The JSON serialization looks just like JavaScript ... but not like JavaScript only.
Other programming languages use curly brackets and square brackets to define lists and objects ( Python and others ) ... the fact that JSON has been accepted so well is probably because the design of the format was indeed already widely adopted: it was not a JS thing, and it never should be.
The real deal here is to define a standard for JSON comments.
Let me better explain this point ...

I am a JS developer and I edit my JSON files via my JS editor ... fair enough ... I want to communicate my data to a server-side service, let's say Python.
Python would like to be able to parse my data and produce a file compatible with ... Python, of course.
Does it mean that Python at that point should keep comments in a JavaScript standard? And why should it, since the format used to exchange data was already somehow evaluable via Python, and now it is not anymore due to some double slash in the file?

# this is PYTHOOOOOOOOOON
o = {"test": "data"}
o.get("test") # data

Will the renewed JSON work as well?


# still python
o = {"test": "data"} // and some WTF
o.get("test")

>> SyntaxError: invalid syntax

Well done ... by making JSON with comments a JS thing, Python, like other languages, needs extra effort to parse the data.
What will happen once Python uses the json library?

import json

json.loads(theReceivedData)

Should it print data with Python-compatible comments, or with JS-compatible comments so that the data is not directly usable anymore by any Python application that stores it, as an example, in a file.py or a database?
And once the renewed JSON has been transformed into valid Python syntax, wouldn't this file be unusable from all other programming languages due to possible syntax errors?
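The same strictness is easy to observe on the JavaScript side too. A minimal sketch, where `isStrictJson` is a hypothetical helper name used only for illustration: the native `JSON.parse` throws a `SyntaxError` as soon as a double slash shows up outside a string.

```javascript
// Hypothetical helper: tells whether the native parser accepts the text as is.
function isStrictJson(text) {
  try {
    JSON.parse(text);
    return true;
  } catch (e) {
    // JSON.parse throws a SyntaxError on anything outside the grammar
    return false;
  }
}

console.log(isStrictJson('{"test": "data"}'));                 // true
console.log(isStrictJson('{"test": "data"} // and some WTF')); // false
```

So a commented file stops being something every existing parser can read on the fly, whatever language that parser is written in.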

Not That Easy

I like JSON as it is, except for a few broken implementations that are quite common even in browsers, because it's about creating a bloody piece of text many other languages can understand basically on the fly.
No need to remember which comment style has been saved with that version of JSON, no need to parse back and, finally, no need to ask every single programming language that is using JSON as a protocol to update its legacy ... it just worked, and it will always work for what it was meant for: transferring data, not transferring JS-like code without functions and/or function calls only ...
What we are doing, we as JavaScript developers, is abusing JSON as if it were a piece of our JS code, polluting it today with comments, and who knows what else tomorrow.
The right way to go, still in my opinion, would be, once again, to enrich, propose, create a new standard that allows comments and, why not, other features.
As an example, what I have always found annoying is that in PHP we can unserialize preserving the class, so that we can serialize object states ... where is this in JSON?
Nowhere. Indeed, in the past I have spent a few hours trying to enrich this protocol ... did it work? Was it interesting? Probably not, except for my last attempt, which is 100% based on current JSON and is about optimizing bandwidth and performance ... that worked better, according to the JSONH github status. Still, I was not expecting everyone who never had this problem to adopt that approach ... you know what I mean?

Still Valid JSON

If it's about writing comments, the next snippet is a perfectly valid JSON string. All we need to do is use the reviver, the second argument of JSON.parse, in a proper way:

{
  "@description": "something meaningful",
  "property": "value",

  "@description": "something else",
  "other": "other value"
}

// parse the above text via JSON
console.log(JSON.parse(aboveText, function (key, value) {
  // returning nothing (undefined) drops the key from the result
  if (key.charAt(0) !== "@")
    return value;
}));

Here we are with our object, comments manually written in the original JSON, and every language able to parse them.
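For the sake of completeness, here is the same idea as a small runnable sketch. The helper name `parseWithoutAnnotations` and the "@" prefix are just a convention assumed for this example, nothing defined by the JSON spec. It also shows what a consumer that ignores the convention gets: the text still parses, and with repeated "@description" keys the last one simply wins in JavaScript.

```javascript
const aboveText = `{
  "@description": "something meaningful",
  "property": "value",

  "@description": "something else",
  "other": "other value"
}`;

// Hypothetical helper: drops every key that starts with "@", at any depth.
function parseWithoutAnnotations(text) {
  return JSON.parse(text, function (key, value) {
    // the root value is visited with an empty key, so it is always kept
    if (key.charAt(0) !== "@") return value;
  });
}

const config = parseWithoutAnnotations(aboveText);
console.log(Object.keys(config).join(", ")); // property, other

// no reviver: still valid for JSON.parse, the last duplicate overwrites the first
console.log(JSON.parse(aboveText)["@description"]); // something else
```

Other parsers may treat duplicate names differently, so one unique key per comment, e.g. "@property" next to "property", is probably the safer convention.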

5 comments:

Toby Osbourn said...

I really like the idea of doing something like @comment:'This is my comment'.

My only concern would be that if we started doing this and it only half took off, people would have to make the mental switch from JSON they have to clean to JSON they don't.

Andrea Giammarchi said...

Cleaning is not mandatory, it's just a nice-to-have.
The reason anyone would like to put comments in is configuration files ... those we write manually, those always parsed with specific property checks.

If the software that relies on a JSON config file does a for loop, it's not a big deal to ignore keys that start with a common convention, e.g. the @

For all other cases, like data transfer, the reason we all use JSON, this comments thing is a non-problem, imho

Bil Simser said...

I too like the @comment metatag. If parsers would just ignore those as part of data and you could check for them and pluck them out that would be fine. I really don't see the need for comments in JSON. If someone needs to describe it with verbiage the file isn't the place for that. Providers and consumers should know what the nodes are for and if they don't, sticking comments in isn't the way to solve that problem. It's a process issue, not a technology one.

Unknown said...

If you post some multibyte characters then you exceed 140 bytes. ;)

Unknown said...

The purists are going to hate commenting in this way because it means a larger memory footprint when the corresponding structure is built.

For those purists, there's nothing stopping an enterprising developer from using a SAX style parser to build the structure and ignore the comments while doing so.

All that to say, I like this approach.