Do you need to serialize and deserialize huge amounts of data quickly and simply? Are XML, JSON or other serialization mechanisms too slow or too heavy for you? There is a nice solution to these problems: Google Protocol Buffers. If you don’t know what it is and how it works, read on.
Protocol Buffers in a nutshell
The official Protocol Buffers page says: “Protocol buffers are Google’s language-neutral, platform-neutral, extensible mechanism for serializing structured data – think XML, but smaller, faster, and simpler. You define how you want your data to be structured once, then you can use special generated source code to easily write and read your structured data to and from a variety of data streams and using a variety of languages – Java, C++, or Python.”
The reasons for releasing this API are also interesting. Google gives the following:
Protocol buffers are used by practically everyone inside Google. They have many other projects they would like to release as open source that use protocol buffers, so to do this, Google needed to release protocol buffers first. In fact, bits of the technology have already found their way into the open – if you dig into the code for Google AppEngine, you might find some of it.
Google would like to provide public APIs that accept protocol buffers as well as XML, both because it is more efficient and because they’re going to convert that XML to protocol buffers at the end anyway.
People outside Google might find protocol buffers useful.
Getting protocol buffers into a form Google were happy to release was a fun 20% project.
Why I chose Protocol Buffers is simple: I need something like “XML, but smaller, faster, and simpler” :).
Why “think XML, but smaller, faster, and simpler”?
Before I started using Protocol Buffers, I wanted to know how they differ from XML. I found a description of the main differences in this overview. Protocol Buffers:
are simpler and generate data access classes that are easier to use programmatically
Working with Protocol Buffers is much easier than working with XML. You only need to create a proto file (similar to an XML Schema) and generate a class file with the Protocol Buffers compiler. From then on you can use Protocol Buffers very simply, by creating or reading the appropriate objects. See the example below.
Proto file that defines the structure of the data:
You can see that Protocol Buffers is simpler and more intuitive to use than XML.
are 3 to 10 times smaller and 20 to 100 times faster
As far as I am concerned, performance is the most important advantage of Protocol Buffers. Besides the Google overview, I found a very interesting site with performance tests. The conclusion is rather simple: Protocol Buffers serialization is smaller and faster than XML, JSON, Binary and other serialization mechanisms.
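To get a feel for where the size difference comes from, here is a minimal sketch in Python that hand-encodes one small message in the Protocol Buffers wire format and compares its length with equivalent XML and JSON. The Person message, its field names and its field numbers are hypothetical, chosen only for illustration, and the encoder below is a hand-written stand-in for what the generated classes would normally do for you. It handles only non-negative integers and strings.

```python
import json

# Hypothetical schema (illustrative names and field numbers):
#
#   message Person {
#     required int32  id   = 1;
#     required string name = 2;
#   }


def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint (7 payload bits per byte)."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: last byte
            return bytes(out)


def decode_varint(data, pos):
    """Decode a varint starting at pos; return (value, position after it)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_person(person_id, name):
    """Serialize the hypothetical Person message to wire-format bytes."""
    raw_name = name.encode("utf-8")
    return (
        encode_varint((1 << 3) | 0)      # key: field 1, wire type 0 (varint)
        + encode_varint(person_id)
        + encode_varint((2 << 3) | 2)    # key: field 2, wire type 2 (length-delimited)
        + encode_varint(len(raw_name))
        + raw_name
    )


def decode_person(data):
    """Parse wire-format bytes back into a dict with 'id' and 'name'."""
    fields = {}
    pos = 0
    while pos < len(data):
        key, pos = decode_varint(data, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if field_number == 1 and wire_type == 0:
            fields["id"], pos = decode_varint(data, pos)
        elif field_number == 2 and wire_type == 2:
            length, pos = decode_varint(data, pos)
            fields["name"] = data[pos:pos + length].decode("utf-8")
            pos += length
        else:
            raise ValueError("unexpected field in this sketch")
    return fields


wire = encode_person(150, "Ann")
as_xml = "<person><id>150</id><name>Ann</name></person>"
as_json = json.dumps({"id": 150, "name": "Ann"}, separators=(",", ":"))

print("protobuf:", len(wire), "bytes", wire.hex())
print("xml:     ", len(as_xml.encode("utf-8")), "bytes")
print("json:    ", len(as_json.encode("utf-8")), "bytes")
print("decoded: ", decode_person(wire))
```

The binary form carries no field names, only small numeric keys, and integers take as few bytes as their value needs. That is the main reason the output is compact, and also why it is not human-readable. The exact ratio depends on the data, so treat this single message as an illustration rather than a benchmark.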
Something for .NET fans
I currently work in C#, so I needed a .NET version of Protocol Buffers. I was worried when I read that there are only Java, C++ and Python APIs. But after a few minutes I found the Third-Party Add-ons for Protocol Buffers site, with links to three C# implementations:
I didn’t have time to test them all, but after reading this blog post I chose dotnet-protobufs by Jon Skeet, because it is “close to the original Google PB spirit, in the sense that you have to explicitly call methods to serialize/deserialize”. This feature is very important to me because it means I can learn Protocol Buffers from an official tutorial, in my case the Java tutorial.
Summary: If you need a simple and fast data serialization mechanism, Google Protocol Buffers should serve you well. Please note that the serialized output is less readable than XML or JSON.
Share your opinion and experience with us below or meet us on Twitter: @GOYELLO.
Thanks to Karol Świder for helping me write this blog post.