
MessagePack Is A More Efficient JSON

It is an age-old problem: you have some data you want to store somewhere and later bring back. How do you format the data? Custom file formats are not that hard, but if you use an existing format you can probably steal code from a library to help you. Common choices include XML or the simpler JSON. However, neither of these is very concise. That’s where MessagePack comes in.

For example, consider this simple JSON stanza:

{"compact":true,"schema":0}

This is easy to understand and weighs in at 27 bytes. Using MessagePack, you’d signal some special binary fields by using bytes of 0x80 and above. Here’s the same thing using the MessagePack format:

 
0x82 0xA7 c o m p a c t 0xC3 0xA6 s c h e m a 0x00

Of course, the spaces are there for readability; they would not be in the actual data stream, which is now 18 bytes. The 0x82 indicates a two-element map. The 0xA7 introduces a 7-byte string. The “true” value of the first entry is the 0xC3. Then there’s a six-byte string (0xA6). Finally, there’s a zero byte indicating a zero.
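If you want to check the byte counts yourself, here is a minimal sketch in Python that builds the example by hand rather than through a MessagePack library. The variable names are just for illustration, and the JSON is written without optional whitespace.

```python
import json

# The JSON text from the example, serialized with no optional whitespace.
json_bytes = json.dumps(
    {"compact": True, "schema": 0}, separators=(",", ":")
).encode("utf-8")

# The same map assembled by hand, one field at a time.
packed = bytes([0x82])                 # fixmap header: 0x80 | 2 entries
packed += bytes([0xA7]) + b"compact"   # fixstr header: 0xA0 | 7, then the key
packed += bytes([0xC3])                # the value true
packed += bytes([0xA6]) + b"schema"    # fixstr header: 0xA0 | 6, then the key
packed += bytes([0x00])                # positive fixint 0

print(len(json_bytes), len(packed))    # prints: 27 18
```

The keys themselves are stored as plain bytes in both formats; the savings come from dropping the quotes, colons, commas, and braces in favor of one-byte headers.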

You can probably puzzle it out for the most part. Any byte with the top bit clear (0x00 through 0x7F) is a positive fixed integer. Bytes from 0x80 through 0x8F encode a small map, so 0x84 is a four-element map. For small arrays, the prefix is 9 instead of 8, and short strings use 0xA0 through 0xBF, so you can have up to 31 bytes of string easily encoded.
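The prefix rules above can be sketched as a small classifier. This is only an illustration of the compact "fix" formats mentioned here; the function name describe_prefix is made up for the example, and everything else in the spec is lumped into a catch-all.

```python
def describe_prefix(byte: int) -> str:
    """Describe the compact MessagePack format a leading byte selects."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a single byte value")
    if byte <= 0x7F:
        # Top bit clear: the byte is the integer itself.
        return f"positive fixint {byte}"
    if byte <= 0x8F:
        # 0x80..0x8F: the low four bits are the number of map entries.
        return f"fixmap with {byte & 0x0F} entries"
    if byte <= 0x9F:
        # 0x90..0x9F: the low four bits are the number of array elements.
        return f"fixarray with {byte & 0x0F} elements"
    if byte <= 0xBF:
        # 0xA0..0xBF: the low five bits are the string length in bytes.
        return f"fixstr of {byte & 0x1F} bytes"
    return "other format (see the spec)"


for b in (0x00, 0x82, 0x93, 0xA7):
    print(hex(b), "->", describe_prefix(b))
```

Because the string length lives in five bits, the longest string this short form can carry is 31 bytes.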

Of course, you might need an integer bigger than 0x7F, right? So there are other integer formats such as 0xCC for 8-bit unsigned or 0xD3 which is a 64-bit signed big-endian number. Prefixes of 0xCA and 0xCB store 32- or 64-bit IEEE 754 floating point numbers.
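As a rough sketch of those wider number formats, Python's standard struct module can lay out the prefix byte followed by a big-endian payload. The helper names below are hypothetical, and this covers only the four prefixes mentioned above, not the full set of integer widths.

```python
import struct

def pack_uint8(n: int) -> bytes:
    # 0xCC, then one unsigned byte.
    return struct.pack(">BB", 0xCC, n)

def pack_int64(n: int) -> bytes:
    # 0xD3, then eight bytes of big-endian signed integer.
    return struct.pack(">Bq", 0xD3, n)

def pack_float32(x: float) -> bytes:
    # 0xCA, then a big-endian IEEE 754 single.
    return struct.pack(">Bf", 0xCA, x)

def pack_float64(x: float) -> bytes:
    # 0xCB, then a big-endian IEEE 754 double.
    return struct.pack(">Bd", 0xCB, x)


print(pack_uint8(200).hex())    # cc followed by c8
print(pack_int64(-1).hex())     # d3 followed by eight ff bytes
print(pack_float64(1.0).hex())  # cb followed by 3ff0000000000000
```

The ">" in each format string asks for big-endian byte order with no padding, so each result is exactly one prefix byte plus the payload.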

For larger strings there is str8 (0xd9), str16 (0xda), and str32 (0xdb). In each case, the number is the width in bits of the length field that follows the prefix. So 0xd9 gets a single-byte count and 0xdb gets four bytes for the count. There are other formats, of course, and you can see them in the spec.
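To make the string headers concrete, here is a small sketch that picks the shortest header that fits. The function name pack_str is an assumption for the example, and it assumes the text is stored as UTF-8 with the length counted in bytes, not characters.

```python
import struct

def pack_str(text: str) -> bytes:
    """Encode a string using the smallest MessagePack string header that fits."""
    data = text.encode("utf-8")
    n = len(data)
    if n <= 31:
        header = bytes([0xA0 | n])             # fixstr: length in the low 5 bits
    elif n <= 0xFF:
        header = struct.pack(">BB", 0xD9, n)   # str8: one-byte length
    elif n <= 0xFFFF:
        header = struct.pack(">BH", 0xDA, n)   # str16: two-byte big-endian length
    elif n <= 0xFFFFFFFF:
        header = struct.pack(">BI", 0xDB, n)   # str32: four-byte big-endian length
    else:
        raise ValueError("string too long for a str32 header")
    return header + data


print(pack_str("schema").hex())         # a6 followed by the six key bytes
print(pack_str("x" * 300)[:3].hex())    # da012c, since 300 is 0x012C
```

Short strings pay a single byte of overhead, and the header only grows when the length no longer fits in the smaller field.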

The real trick, of course, is the availability of library code. The project claims over 50 languages on their web page. So if you are writing in C, C++, Haskell, Dart, Kotlin, or Matlab, you can find code to help you.

We’ve seen a lot of JSON out there, and it will probably remain since most applications don’t care about the efficiency of representing data. While XML has fallen out of favor because of its complexity, there are still places you run into it.



from Hackaday https://ift.tt/2U1Gl4P
