Show HN: Homogenius – Packer/unpacker to reduce the size of JSON

  • Out of interest, do you have any size comparisons compared to gzipping the raw homogenous JSON text?

    Edit: I did a quick test by myself, using the first example but repeated for 10,000 objects:

      Raw:                550111
      Homogenius:         150183
      
      Raw gzipped:        1688
      Homogenius gzipped: 419
    
    Interesting to see a ~4x difference between raw JSON and Homogenius JSON, both compressed and uncompressed.

  • I think [Transit][1] by Cognitect addresses this issue as well.

    [1]: https://github.com/cognitect/transit-format#caching

  • I think this needs some sort of marker for the compressed form to make it clear it's not to be digested as is. Maybe `$HMGNS$` or something as the first key of the array?
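
    A guard like that would be cheap to check on the consumer side. A minimal sketch, assuming the commenter's suggested `$HMGNS$` marker (which is not part of the actual format):

```javascript
// Hypothetical marker check. '$HMGNS$' is the commenter's suggestion,
// not something the real format defines.
const MARKER = '$HMGNS$';

function isPacked(payload) {
  return Array.isArray(payload) && payload[0] === MARKER;
}

console.log(isPacked([MARKER, ['id', 'name'], [1, 'a']])); // true
console.log(isPacked([{ id: 1 }]));                        // false
```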

  • Why not just use MessagePack (aka MsgPack)?

    http://msgpack.org/

  • I see two great things about this tool, and not exactly for what it was designed:

    1) Fix bandwidth problems caused by API designer laziness.

    2) Detect API designer laziness and programmer laziness.

    The API designer laziness merits description. Essentially, if you as an API host are sending back tons of redundant data, you are doing a disservice to your users by not passing references to that data in the first place. The results in your API should either include an entities section or links to retrieve more data. (Note: the scale of repeated data I am talking about here is subtrees, not just key/value pairs.)
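
    The "entities section" idea can be illustrated with a small sketch; all the field names here (`posts`, `authorId`, etc.) are made up for the example, not from any particular API:

```javascript
// Instead of embedding the same author subtree in every post,
// store it once in an entities section and reference it by id.
const denormalized = [
  { id: 1, author: { id: 9, name: 'Ada', bio: '...' } },
  { id: 2, author: { id: 9, name: 'Ada', bio: '...' } },
];

const entities = { authors: {} };
const posts = denormalized.map(p => {
  entities.authors[p.author.id] = p.author;   // store the subtree once
  return { id: p.id, authorId: p.author.id }; // keep only a reference
});

const response = { posts, entities };
```

    The client rejoins posts to authors by id, so the redundancy never crosses the wire.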

    The programmer laziness is when someone just sends back JSON.stringify(someVeryLargeObject) instead of making sure only a minimal object is sent back.

    So we can detect this laziness by running this tool and looking for 4x compression gains!

  • If size is a concern, wouldn't a binary message format make more sense?

  • You lose human readability but you win on size. An interesting project, and I can think of a lot of uses for it (saving bandwidth, avoiding timeouts, etc.).

    Thanks for making it; I will use it (in Go, I think).

  • So it's a schema plus interned values? Do you think it might be better on real-world data to use just one mapping of value id to value instead of one per column/key?
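
    That "schema plus interned values" reading can be sketched like this, with one value table per key; this is my interpretation of the question, not Homogenius's actual layout:

```javascript
// Sketch: pack rows as a key header plus one interning table per
// column, replacing each value with an index into its column's table.
// Hypothetical layout, not the tool's real format.
function packInterned(rows) {
  const keys = Object.keys(rows[0]);
  const tables = keys.map(() => []);            // one value table per column
  const body = rows.map(row =>
    keys.map((k, c) => {
      const col = tables[c];
      let idx = col.indexOf(row[k]);
      if (idx === -1) idx = col.push(row[k]) - 1; // intern a new value
      return idx;
    })
  );
  return { keys, tables, body };
}

function unpackInterned({ keys, tables, body }) {
  return body.map(indices =>
    Object.fromEntries(keys.map((k, c) => [k, tables[c][indices[c]]]))
  );
}
```

    A single global table would additionally merge duplicates that appear under different keys, at the cost of every column sharing one larger index space; which wins depends on how values are distributed in real-world data.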

  • I really don't see too many use cases here. Both parties have to buy into it, then at that point you might as well use a binary protocol.

  • Doing this with JSON is a neat idea. I recently learned that Excel workbooks do this using a sharedStrings.xml file in the .xlsx package.

  • No benchmarking or comparisons with gzip.

  • Any plans to add this to npm?

  • like Alan Turing lolol