Serialization by various means

kaffiene · September 5, 2012, 8:31am

oooooohhhhh…

Fight! Fight!

princec · September 5, 2012, 9:03am

Thread derailment detected!

I use Serializable the way I do because then I know exactly what’s being serialized and what’s not. I don’t depend on any external libraries for it to work, there are no known bugs, and the documentation is thorough. It’s just tried and tested and works - very very well. Why would I switch to anything else?

Also the binary format is stable and well-specified and has been for years. Nobody’s going to change it unexpectedly and even if they do, it has trivially available reference implementations in the form of older JDKs. My data is safe. And so on.

Cas

Cero · September 5, 2012, 11:26am

If this is really all you need to serialize:

Kryo kryo = new Kryo();
kryo.register(SomeClass.class);
// ...
SomeClass someObject = new SomeClass(...);
kryo.writeObject(buffer, someObject);
// ...
SomeClass someObject = kryo.readObject(buffer, SomeClass.class);

then it’s really easy.
I would like to see what happens when the classes change, i.e. the incompatible save game problem.

princec · September 5, 2012, 11:44am

Persistent versioning problems won’t be solved by any layer automatically, so I wouldn’t worry about that in as far as the differences go. With Java serialization, provided you’ve used the same serialVersionUID, you can read different versioned incoming data and it will both ignore data that is present but no longer in the class and ignore missing data; it will however moan about incompatible data. And even then whether that is semantically correct for your class is highly dependent on what the class does.

By way of the traditional method of persistence / transmission in Java as a counterexample:


class SomeClass implements Serializable { ... }

// Writing
SomeClass someObject = new SomeClass();
ObjectOutputStream oos = new ObjectOutputStream(somewhere);
oos.writeObject(someObject);
oos.close();
// ...
// Reading
ObjectInputStream ois = new ObjectInputStream(somewhere);
SomeClass restored = (SomeClass) ois.readObject();
ois.close();

Barely any different, setup-wise. The key style difference: we don’t explicitly register data to serialize with some subsystem; instead we explicitly mark the classes and control serialization by various in-class means. This pollutes a class with its method of serialization, but it does mean that wherever the class is used in serialized form it is guaranteed to have the same binary format. So there are merits and demerits either way on that front.

Cas

jezek2 · September 5, 2012, 11:47am

I’ve used Java serialization for a few years as my map file format. Contrary to what some say, it can be used for long-term serialization with upgrading, though it needs some extra code, including subclassing ObjectInputStream (which could, e.g., change class/package names that way).
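One way the ObjectInputStream subclassing mentioned above can look, as a minimal sketch rather than the poster's actual code: override readClassDescriptor() and substitute the descriptor of the renamed class. OldTile, NewTile and RenamingInputStream are hypothetical names, and the sketch assumes both classes keep identical fields and the same serialVersionUID.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class RenameDemo {
    /** Hypothetical class name that old map files were written with. */
    static class OldTile implements Serializable {
        private static final long serialVersionUID = 1L;
        int x;
        int y;
    }

    /** The same class after a rename: identical fields and serialVersionUID (assumed). */
    static class NewTile implements Serializable {
        private static final long serialVersionUID = 1L;
        int x;
        int y;
    }

    /** Substitutes the descriptor of NewTile whenever the stream names OldTile. */
    static class RenamingInputStream extends ObjectInputStream {
        RenamingInputStream(InputStream in) throws IOException {
            super(in);
        }

        @Override
        protected ObjectStreamClass readClassDescriptor()
                throws IOException, ClassNotFoundException {
            ObjectStreamClass desc = super.readClassDescriptor();
            if (desc.getName().equals(OldTile.class.getName())) {
                return ObjectStreamClass.lookup(NewTile.class);
            }
            return desc;
        }
    }

    /** Serializes an OldTile, standing in for a file saved by an older build. */
    static byte[] writeOld(int x, int y) {
        OldTile tile = new OldTile();
        tile.x = x;
        tile.y = y;
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(tile);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads those bytes back as a NewTile. */
    static NewTile readAsNew(byte[] data) {
        try (ObjectInputStream in = new RenamingInputStream(new ByteArrayInputStream(data))) {
            return (NewTile) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        NewTile tile = readAsNew(writeOld(3, 7));
        System.out.println(tile.x + "," + tile.y);
    }
}
```

This only covers a pure rename. If the fields changed as well, the usual serialVersionUID rules still apply on top of it.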

However, 2 years ago I decided to ditch it because all the serialization/deserialization code (you sometimes need extra code, either for more optimized storage or for properly initializing some fields in readResolve, as serialization bypasses constructors) was too scattered around the classes, especially when it dealt with multiple versions of the file format.

So I redid it with separate serializer and deserializer classes instead, where I have all the code in one place, and for me it makes much more sense that way.
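A rough sketch of what such a separate serializer/deserializer could look like, assuming a format that starts with a version number. The GameMap class, its fields, the version numbers and the default width are all made up for illustration; the poster's real format is not shown in the thread.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class MapFormatDemo {
    static final int CURRENT_VERSION = 2;
    static final int DEFAULT_WIDTH = 64; // assumed default for files that predate the width field

    /** Plain class with no serialization code inside it. */
    static final class GameMap {
        final String name;
        final int width; // hypothetically added in format version 2

        GameMap(String name, int width) {
            this.name = name;
            this.width = width;
        }
    }

    /** Always writes the newest format. */
    static byte[] write(GameMap map) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(CURRENT_VERSION);
            out.writeUTF(map.name);
            out.writeInt(map.width);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Produces a version 1 file (name only), standing in for an old save. */
    static byte[] writeLegacyV1(String name) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(1);
            out.writeUTF(name);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** All version handling lives here, and objects are built through the normal constructor. */
    static GameMap read(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int version = in.readInt();
            String name = in.readUTF();
            switch (version) {
                case 1:
                    return new GameMap(name, DEFAULT_WIDTH);
                case 2:
                    return new GameMap(name, in.readInt());
                default:
                    throw new IOException("Unsupported map format version " + version);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        GameMap old = read(writeLegacyV1("cave"));
        GameMap current = read(write(new GameMap("castle", 128)));
        System.out.println(old.name + " " + old.width + ", " + current.name + " " + current.width);
    }
}
```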

ReBirth · September 5, 2012, 12:29pm

I agree with princec. My home-made notepad alternative uses Serialization and works like a charm. The main reason I use it is that it doesn’t need an additional library.

matheus23 · September 5, 2012, 12:32pm

Oh man… more of these guys being afraid or too lazy to use libraries… That kind of programmer will become extinct…

delt0r · September 5, 2012, 12:34pm

That kind of programmer avoids dependency hell. Using a library to solve something that is already solved is, quite frankly, silly.

ReBirth · September 5, 2012, 12:57pm

Really? Ask Nate and CodeBunny how annoying I am.

Programmers of “my kind” only use a lib for something the JDK can’t provide, such as performance, shorter code, or utilities.

princec · September 5, 2012, 12:59pm

“Horses for courses” as we say in England. You must evaluate what you need from serialization to make an informed choice about what to use.

Factors one might consider:

Dependency on potentially fragile external library, where fragility is a product of various factors such as project dormancy, support, ownership, hosting, code complexity
Performance characteristics; Java’s built-in serialization is not exactly lightning fast or memory efficient, so it is generally unsuitable for realtime game-speed interactions
Simplicity and ease of use
Code pollution versus “transportability” (being mingled with the classes to be serialized, Java Serialization makes your serializable objects highly transportable to other projects; and as a counterexample KryoNet requires you to set all serialization up for objects before you can use them)
Comprehensiveness; the built-in Java serialization is complete and manages every edge case I know of, including cyclic references, forward and backward versions, replacement, customisation, and even high-performance cases using Externalizable, blah.
and so on.

Choose the best for your needs. Caveat: no serialization solution can solve the versioning problem for you.
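On the cyclic references point in the list above, here is a small sketch of what "handles cycles" means in practice: two objects that point at each other come back as two objects that still point at each other. Node and roundTrip are illustrative names, not from any project in this thread.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class CycleDemo {
    static class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        Node next;

        Node(String name) {
            this.name = name;
        }
    }

    /** Serializes a node graph to memory and reads it straight back. */
    static Node roundTrip(Node node) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(node);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Node) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        a.next = b;
        b.next = a; // cycle: a -> b -> a

        Node copy = roundTrip(a);
        // The copy is a new graph, but the cycle inside it is intact.
        System.out.println(copy != a && copy.next.next == copy);
    }
}
```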

Cas

ra4king · September 5, 2012, 3:26pm

Nate is now most likely lying on his bed, crying his eyes out.

Nate · September 5, 2012, 9:31pm

I missed the start of the thread but the important thing is that people are getting stuff done, whatever tool they choose to use. However, there is a difference between a tool that can possibly get a job done and a good tool for the same job. That is what I see we are discussing here. We could all go write everything in COBOL, but we don’t.

Avoiding a dependency is fine if the built-in alternative is sufficient. The built-in Java serialization is certainly not the end-all solution. Not even close. It has many problems, more than I could list here. Josh Bloch’s Effective Java goes into detail. If you understand all the issues and it still meets your needs, by all means, use it if you want. I’ll mention a couple issues.

Java serialization uses an extralinguistic mechanism to create objects without invoking constructors. Objects not designed to be constructed this way can easily have inconsistent state. This is a nasty gotcha and (among other reasons) causes serialization code to permeate classes.
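To make the gotcha concrete, here is a minimal sketch with made-up classes. Player sets up a transient list in its constructor, and after deserialization that list is null because the constructor never ran. SafePlayer shows the usual in-class workaround, a private readObject method, which is exactly the kind of serialization code that ends up inside the class.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class ConstructorBypassDemo {
    static class Player implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        transient List<String> log; // not serialized; only the constructor sets it up

        Player(String name) {
            this.name = name;
            this.log = new ArrayList<>();
        }
    }

    static class SafePlayer implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        transient List<String> log;

        SafePlayer(String name) {
            this.name = name;
            this.log = new ArrayList<>();
        }

        // Serialization hook: redo by hand what the constructor would have done.
        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            log = new ArrayList<>();
        }
    }

    /** Serializes a value to memory and reads it straight back. */
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (T) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new Player("cas")).log == null);     // constructor was skipped
        System.out.println(roundTrip(new SafePlayer("cas")).log != null); // readObject repaired it
    }
}
```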

Java serialization is slow and the output is large. Externalizable is the usual rebuttal, but then you have to hand write serialization (time consuming, error prone, difficult to support forward/backward compatibility), and that code has to go inside each of your classes. At that point, all Java serialization is really doing is giving you a stream to write to.
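For readers who have not used Externalizable, a minimal sketch of the hand-written style being described. The Score class and its one-byte format version are assumptions for illustration. Note that the class writes and reads every field itself, and that it needs a public no-arg constructor, which (unlike plain Serializable) is actually called on deserialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

final class ExternalizableDemo {
    public static class Score implements Externalizable {
        private static final long serialVersionUID = 1L;
        private static final int FORMAT_VERSION = 1; // hypothetical hand-rolled version marker

        String player;
        int points;

        /** Required: Externalizable instances are created through this constructor. */
        public Score() {
        }

        Score(String player, int points) {
            this.player = player;
            this.points = points;
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeByte(FORMAT_VERSION);
            out.writeUTF(player);
            out.writeInt(points);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            int version = in.readByte();
            if (version != FORMAT_VERSION) {
                throw new IOException("Unknown Score format version " + version);
            }
            player = in.readUTF();
            points = in.readInt();
        }
    }

    static byte[] serialize(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Score copy = (Score) deserialize(serialize(new Score("cas", 42)));
        System.out.println(copy.player + " " + copy.points);
    }
}
```

Any compatibility logic beyond rejecting unknown versions would have to be written by hand inside readExternal, which is the maintenance cost in question.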

Kryo’s class registration is an optional optimization to greatly reduce output size. It is not required. Note that Kryo is the serialization library and KryoNet is Kryo + networking. Class registration is required for KryoNet, but not for Kryo. Other setup, such as customizing serialization for a particular class, can be done inside or outside the class when using Kryo.

It is that easy, though the example you gave is old; the docs have an up-to-date example. Forward and backward compatibility aren’t always important, but when they are, you must think about it before choosing how to do your serialization. Kryo can do it multiple ways. You can have none, which is most efficient. You can use TaggedFieldSerializer (where you can add fields but can’t remove them, only deprecate them), which has very minimal overhead. You can use CompatibleFieldSerializer (where you can add and remove fields), which has a bit of overhead. I use TaggedFieldSerializer for my save games and other binary files that may be loaded across versions of my app.
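To illustrate the general idea behind tagging fields, here is a hand-rolled sketch using only the JDK. This is not Kryo's API or its wire format; the tags, the length-prefixed layout and the SaveGame class are all assumptions made up for the example. Each field is written with a numeric tag, and a reader skips tags it does not know, so older code can still load files that have extra fields.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class TaggedFormatDemo {
    static final int TAG_NAME = 1;
    static final int TAG_LEVEL = 2;
    static final int TAG_GOLD = 3; // pretend this field was added in a later release

    static final class SaveGame {
        String name = "unnamed";
        int level = 1;
    }

    /** Writes one field as: tag, payload length, payload. */
    static void writeField(DataOutputStream out, int tag, byte[] payload) throws IOException {
        out.writeInt(tag);
        out.writeInt(payload.length);
        out.write(payload);
    }

    static byte[] intBytes(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    /** What a newer build would write: it knows about the gold field. */
    static byte[] writeNewerSave(String name, int level, int gold) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            writeField(out, TAG_NAME, name.getBytes(StandardCharsets.UTF_8));
            writeField(out, TAG_LEVEL, intBytes(level));
            writeField(out, TAG_GOLD, intBytes(gold));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** What an older build would do: read the tags it knows and skip the rest. */
    static SaveGame readWithOlderCode(byte[] data) {
        SaveGame save = new SaveGame();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            while (in.available() > 0) {
                int tag = in.readInt();
                byte[] payload = new byte[in.readInt()];
                in.readFully(payload);
                switch (tag) {
                    case TAG_NAME:
                        save.name = new String(payload, StandardCharsets.UTF_8);
                        break;
                    case TAG_LEVEL:
                        save.level = ByteBuffer.wrap(payload).getInt();
                        break;
                    default:
                        break; // unknown tag: payload already consumed, ignore it
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return save;
    }

    public static void main(String[] args) {
        SaveGame save = readWithOlderCode(writeNewerSave("hero", 5, 100));
        System.out.println(save.name + " " + save.level);
    }
}
```

The trade-off matches the description above: a tag can be retired but never reused, and the per-field tag and length are the overhead paid for compatibility.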

Agreed. There are many factors to consider, even more than you listed. I suggest reviewing the jvm-serializers project. Even then, your own real data should be used to have meaningful comparisons. It is still very difficult to compare individual features of the different libraries.

For my own usage, there are two libraries I use for serialization, one for binary and one for human readable. I’m the author of both, so I may be biased. I’ll go ahead and ramble on about them.

Many years ago I wanted to make a quick multiplayer networked Rampart clone. I found the JGN (Java Game Networking) lib by darkfrog and used that. I contributed code to do POJO serialization using reflection. This code is now used by jMonkeyEngine. Eventually I decided to write my own networking lib for my personal use. Later I rewrote the serialization a third time and separated it from the networking, and this is how Kryo and KryoNet were born. About 4 months ago I rewrote Kryo again and I’m quite happy with the current incarnation. The code is very nice, the API easy to use, and it compares well against the alternatives. It has a nice little community (143 members on the mailing list) with active members that provide quality contributions (this is unheard of! ;)) and is used by a number of projects.

In a nutshell, Kryo does what you would do if you were hand writing serialization code, only it does it automatically. POJOs need no interface or annotation and can be serialized automatically. When a class does need serialization code, it can go in the class or can be kept separate. Kryo output is very small. Kryo is very fast. Various features, such as object references, can be turned on and off to customize serialization for your needs, increasing performance and reducing output size. Bytecode generation is used for speed when possible. Soon Kryo will be able to take advantage of Unsafe when possible, which can be up to 20 times faster at the cost of increased output size (details). This will be pretty sweet, considering Kryo is already one of the fastest libs.

The other serialization I’d like to mention is related to JSON, but first some background. I had previously toyed with making a YAML serialization library, called YamlBeans. I wanted human friendly serialization, and I wanted it to be as easy to use as Kryo. It turns out YAML is totally fucked to parse. YamlBeans is fine for POJOs, but does not have the flexibility for more complex usage. Doing that would have required exposing the YAML parsing/emitting, which is just too complicated. I can only recommend YamlBeans for the most simple object graphs, or where you just have to parse YAML into maps/lists/strings.

About a year ago, I wanted to use JSON as my human readable serialization format. Jackson seems to be the leading library in this area, but it didn’t have the simple API and the output flexibility that I wanted. I had been playing with Ragel, a sweet FSM compiler, so I felt like taking a crack at parsing JSON myself. Here is the result. I also created a class to emit JSON using a super simple, stack based API (here). With those, I wrote a class similar to YamlBeans and Kryo that uses reflection to serialize and deserialize. Unlike YamlBeans, it exposes the JSON parsing/emitting so the output is very customizable.

The result is the Json class in libgdx. I also ripped it out of libgdx for use in other apps, calling it JsonBeans. Usage, as shown on that page, is very simple. It has three output formats: JSON (quotes around everything), JavaScript (like JSON but names are only quoted if needed), and a “minimal” format I invented (names and values are only quoted if needed). It supports compatibility by optionally ignoring unknown fields in the JSON, which makes it useful for save games, preferences, etc.

While easy to use for basic operations, the Json class is also very flexible which allows for some neat tricks. JSON parsing can be hooked to do fancy stuff. Here I intercept reading an object and if a string was found in the JSON where an object was expected, I look up the object by name. This is used for scene2d skin file parsing, so objects previously defined in the JSON can be referenced later by name. Another example, it can deserialize into an existing object. For the libgdx texture packer, here I have a JSON file in each directory, and a child directory inherits the settings from the parent directory. Values set in the child directory JSON override those in the parent directory JSON.

In closing :p, Java serialization can be fine. When it is not fine, it is horrible. There are other worthwhile alternatives, probably even some not written by yours truly.

Cero · September 6, 2012, 2:02am

Seeing that KryoNet is the best game networking Java library, it may be a very good idea to check out, use and familiarize myself with Kryo, because KryoNet should be really easy to use then.

Advo · September 6, 2012, 5:54am

best != easiest.

Also, if it boiled down to using a relatively high level networking API like KryoNet I would personally use Netty instead.

Nate · September 6, 2012, 7:28am

I wouldn’t say it is the best, but I do think it is easy. What did you not find easy about KryoNet? Also, have you ever tried to use Netty? It is significantly more complicated to get something done (eg, what thread runs what code). I have a Netty implementation of KryoNet and will likely update KryoNet with it some day when I find time. It works but is missing some features KryoNet has now. Users hate when you take away features they use. The KryoNet API stays mostly the same. The real value is in the serialization, that the networking is handled for you, and a nice API.

kaffiene · September 6, 2012, 7:54am

I really like JSON as a serialisation option - I’ll check out that JsonBeans class.

Nate · September 6, 2012, 10:14am

Cool! I just updated JsonBeans with the latest from libgdx, so it is now v0.3. If you grabbed 0.2, you’ll want to grab 0.3 (very similar, just some bug fixes).

Advo · September 7, 2012, 5:18pm

To answer your questions, I have used Netty in the past for personal projects. It was a toss up for me between Netty and KryoNet at the time.

Netty has exceptional resources and documentation, which is something KryoNet lacks.

Additionally, back then Netty was under JBoss, so I could rely on active development with plenty of bug fixes and support.

Netty was more actively developed.

Netty had a usage that was very logical.

I chose to use Netty at the time, and it worked out great. Apart from the time I was torn between the two, I have never looked at KryoNet, so my reply was purely my personal opinion.

I didn’t consider Mina at the time because I had heard that the API was going through serious changes from Mina 2.x to Mina 3.

Demonpants · September 7, 2012, 9:50pm

I was going to ask if Kryo calls constructors as expected, and looks like Nate posted above that it does. That’s my main issue with Serializable - the objects are not created through “normal” paths so anything that has been expected to happen in the constructor won’t. If you design around this you’re fine, but it’s annoying. I’ve used Externalizable (a pain in the ass, as Nate said) to solve this, and also I have serialized a “saved” version of each object that is then used to construct the real object, but that is also a pain in the ass.

Next time I need serialization I might give Kryo a look, specifically to avoid this weakness of Serializable.

matheus23 · September 8, 2012, 9:42am

Remember to add the default constructor…
I use Kryo (and KryoNet) and I really have to say: Good job Nate! Awesome work! Kryo and KryoNet are simple and simply awesome.