A month ago I had an incident in production that, as I found out later, was caused by the poor performance of the JSON parser library I used. I optimized the code and managed to solve it, but decided to look for another library with better performance characteristics. I searched for existing benchmarks and found two: one covers JSON manipulation on Android and the other is a thorough serialization test focused on different use-cases than mine. So I decided to write my own microbenchmark that copies the use-case I had in production.
There are many differences among JSON libraries in their features and the resulting performance. So if you want to know my findings, continue reading …
The tested use case is serialization of a rich POJO with a collection of inner POJOs that reference another POJO (see the PhotoAlbum model object). A JSON parser must support direct POJO to JSON serialization and deserialization in order to get into the suite. I used all libraries with the same naive approach: grab it and use it in the simplest way possible. No tweaking, no optimizing whatsoever (that’s the way they are usually used). The average size of the generated JSON data is 160KB, so it is rather big. The problem in production was caused by data of 33KB in size.
Disclaimer: microbenchmarking results may be misleading. If you want to be really sure, write your own tests with your own data on your own hardware.
I’ve added new tests for Staxon with three different serialization factories: internal, GSon and Jackson. A new section was added, and I’ve re-run all the tests and updated the numbers.
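To make the shape of the test data concrete, here is a minimal sketch of such an object graph: an album holding a collection of photos, each of which references an author. The class and field names below are hypothetical and chosen only for illustration; the real PhotoAlbum model in the repository may look quite different.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

// Hypothetical model, NOT the actual PhotoAlbum class from the test suite.
// Public fields keep the sketch short; a real POJO would usually expose
// getters and setters, which several of the tested libraries rely on.
final class PhotoAlbumSketch {

    // The POJO referenced from every inner POJO.
    static final class Author {
        public String name;
    }

    // The inner POJO stored in the album's collection.
    static final class Photo {
        public String title;
        public Date taken;
        public Author author;
    }

    // The "rich" outer POJO.
    static final class Album {
        public String name;
        public Date created;
        public List<Photo> photos = new ArrayList<>();
    }

    // Builds an album with the given number of photos sharing one author.
    static Album sampleAlbum(int photoCount) {
        Author author = new Author();
        author.name = "Jane Doe";

        Album album = new Album();
        album.name = "Holiday";
        album.created = new Date();
        for (int i = 0; i < photoCount; i++) {
            Photo photo = new Photo();
            photo.title = "Photo " + i;
            photo.taken = new Date();
            photo.author = author;
            album.photos.add(photo);
        }
        return album;
    }
}
```

The Date fields are included on purpose: as the per-library comments show, date handling is one of the places where the libraries differ most.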
Hardware: Intel® Core™2 Duo CPU T7500 @ 2.20GHz × 2, SSD disk (but File IO time is not counted)
Software: Java(TM) SE Runtime Environment (build 1.6.0_23-b05), Java HotSpot(TM) Server VM on Ubuntu Linux 11.10 32-bit
JVM options: -Xmx32m -server
Data value being tested: PhotoAlbum.
Before taking measurements I warmed things up by running the task 250 times; the next 3000 loops are counted. Nothing except the serialization or deserialization itself is measured. Finally, the average execution time is taken for each operation and library. I ran the tests both as a full suite and separately for each library, but the results didn’t differ much.
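The measuring loop described above can be sketched roughly as follows. This is only an illustration of the warm-up and averaging idea, not the code from the repository: the class name, the method name and the dummy task in main are made up, and the lambda syntax assumes a newer Java than the 1.6 runtime used for the actual measurements.

```java
// Illustrative sketch of a warm-up + measured-loop micro-benchmark.
final class BenchmarkSketch {

    // Runs the task warmupRuns times without timing it, then measuredRuns
    // times with timing, and returns the mean time per run in milliseconds.
    static double averageMillis(Runnable task, int warmupRuns, int measuredRuns) {
        // Warm-up: give the JIT compiler a chance to optimize the hot path.
        for (int i = 0; i < warmupRuns; i++) {
            task.run();
        }
        long totalNanos = 0L;
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            task.run(); // only the task itself is inside the timed region
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / (double) measuredRuns / 1_000_000.0;
    }

    public static void main(String[] args) {
        // Dummy workload standing in for a serialize or deserialize call.
        StringBuilder sink = new StringBuilder();
        double avg = averageMillis(() -> {
            sink.setLength(0);
            for (int i = 0; i < 10_000; i++) {
                sink.append(i).append(',');
            }
        }, 250, 3000);
        System.out.printf("average: %.4f ms per run%n", avg);
    }
}
```

In a real test the Runnable would wrap a single call such as serializing one PhotoAlbum, with any file reading done outside the timed region.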
Tested libraries and their versions:
- FlexJson (2.1)
- GSon (2.1)
- Jackson (2.0.4)
- JsonLib (2.4)
- JsonMarshaller (0.21)
- JsonSmart (2.0-beta2)
- Protostuff JSON (1.0.4)
- XStream (1.4.2)
- Staxon (1.0)
Testing code is placed on GitHub: https://github.com/novoj/JavaJsonPerformanceTest
Results and comments
FlexJson
There are a few catches when using this library. In order to serialize a POJO correctly you have to exclude the class property from serialization at every depth and explicitly include collections, as they are not serialized by default. The library has no problems with Date handling or with generated Groovy classes. I haven’t noticed any support for resolution of circular references. The library is easily reachable in Maven repos. According to its authors, the main advantage of this library is the ability to pick and choose specific properties and structures that should be converted to/from JSON.
Serialization: 5.978ms / 1 PhotoAlbum, JSON file size 165KB
Deserialization: 14.314ms / 1 PhotoAlbum
GSon
It’s a very sophisticated and configurable library: it supports versioning, custom handlers and instantiation factories, exclusion of certain properties, and custom naming. The library is super easy to use and has good documentation. I haven’t noticed any support for resolution of circular references, handling of Groovy classes or Date objects. The library is easily reachable in Maven repos.
Serialization: 5.11ms / 1 PhotoAlbum, JSON file size 169KB
Deserialization: 9.396667ms / 1 PhotoAlbum
Jackson
It claims to be the fastest Java JSON library of all, and according to my statistics they are quite right. Even so, it has a very rich feature set that exceeds what is provided by GSon (but no versioning support out of the box). It is super easy to use and has native support for Dates (and JodaTime too!).
Serialization: 1.291ms / 1 PhotoAlbum, JSON file size 165KB
Deserialization: 1.588ms / 1 PhotoAlbum
JsonLib
This library is very configurable but has some glitches. For example, deserialization of Date objects doesn’t work out of the box and you have to provide type handlers. It has several strategies for coping with circular references. Very good documentation is provided and it has integration with Groovy. It is easily reachable from Maven repos (but beware, you have to provide classifier=jdk15). This is the library that burned me in production, and as you can see it has really bad performance stats.
Serialization: 17.592ms / 1 PhotoAlbum, JSON file size 168KB
Deserialization: 92.248ms / 1 PhotoAlbum
JsonMarshaller
It has problems with serialization / deserialization of Groovy objects (it throws an exception regarding ASM). It has no built-in Date support. It requires you to place annotations into your model (or DTO) classes, which might be rather uncomfortable (and maybe unacceptable) in some cases. Documentation is quite poor. It is not available in Maven repos.
Serialization: 3.8146667ms / 1 PhotoAlbum, JSON file size 165KB
Deserialization: 6.125ms / 1 PhotoAlbum
JsonSmart
This library is very simplistic and small. POJO deserialization first appears in version 2, which is currently in beta. Almost nothing is configurable and the documentation is poor. The library is not reachable in Maven repos. It’s currently not possible to deserialize Date objects, and since there is no option to add a custom type handler, an object containing a date cannot be deserialized at all.
Serialization: 4.026ms / 1 PhotoAlbum, JSON file size 172KB
Protostuff JSON
A powerful library that requires a rather complicated setup when not using the RuntimeSchema generator. I believe that in a standard setup the library is used for much more than I’ve used it for. JSON transformation is just one of the things it can do (it can convert to YAML, XML and more). It had no problems with Date objects and Groovy classes. The library is accessible in Maven repos.
Serialization: 1.9116666ms / 1 PhotoAlbum, JSON file size 165KB
Deserialization: 1.2213334ms / 1 PhotoAlbum
XStream
A library originally built to serialize and deserialize to / from XML; it internally uses Jettison to transform data to / from JSON. It is easy to use, highly customizable and supports resolution of circular references. The library handles Date objects out of the box, has no problem with Groovy classes and is available in Maven repos.
Serialization: 76.84967ms / 1 PhotoAlbum, JSON file size 171KB
Deserialization: 26.361ms / 1 PhotoAlbum
Staxon
Staxon aims primarily at the streaming API, and the presented use-case is not its primary target. Nevertheless, I was asked in the comments section to add some tests for this library, so I did. Staxon seems very easy to use and is very well documented. It handles Date objects out of the box and is available in Maven repos. Its performance is not among the best and seems rather average for the tested use-case, but remember that with streaming-style reading / writing the results might be different.
Staxon over GSon
Serialization: 7.510667ms / 1 PhotoAlbum, JSON file size 176KB
Deserialization: 12.171ms / 1 PhotoAlbum
Staxon over Jackson
Serialization: 4.8446665ms / 1 PhotoAlbum, JSON file size 176KB
Deserialization: 11.099ms / 1 PhotoAlbum
Staxon default implementation
Serialization: 6.3436666ms / 1 PhotoAlbum, JSON file size 176KB
Deserialization: 15.087ms / 1 PhotoAlbum
Fastjson
A young library (since mid 2011) that performs really fast. It dynamically creates a specific serializer / deserializer class using ASM for each custom POJO type. It has a considerable number of processing features, though they are not very well documented. Documentation as a whole is very poor, but as long as it works you are good to go. The library handles Date objects out of the box, has no problem with Groovy classes and is available in Maven repos.
Serialization: 1.88ms / 1 PhotoAlbum, JSON file size 165KB
Deserialization: 1.02ms / 1 PhotoAlbum
All stats are clearly comparable on the following graph:
My conclusion is that when you need to easily serialize / deserialize Java POJOs without sacrificing performance, and want some backup in terms of extensibility and configurability, you should choose Jackson, Fastjson or GSon. Jackson is my winner and I am going to migrate all of my code to it. Any comments and thoughts are appreciated (and remember this is only a microbenchmark, so write tests for your own use-cases if you want to be 100% sure)!
Update 8/2012: Fastjson tests were added. It placed itself among the best performing libraries. Its serialization is somewhat slower than Jackson’s (1.88ms vs. 1.291ms), but in deserialization it is the clear winner (1.02ms).