Fwd: Re: Towards better serialization
michal at kleczek.org
Wed Jun 12 21:31:41 UTC 2019
---------- Forwarded message ---------
From: Kłeczek, Michał <michal at kleczek.org>
Date: Wed, Jun 12, 2019 at 11:03 PM
Subject: Re: Towards better serialization
To: <amber-spec-experts at openjdk.java.net>
Seems like this is the only amber mailing list I can post to.
Sorry for double posting - got lost in all other mailing lists rejection
------ Original Message ------
From: "Remi Forax" <forax at univ-mlv.fr>
To: "Kłeczek, Michał" <michal at kleczek.org>
Cc: "amber-dev" <amber-dev at openjdk.java.net>
Sent: 12/06/2019 14:51:16
Subject: Re: Towards better serialization
>wrong list btw, amber-dev is for the problem with the implementation, not
with a spec proposal.
>----- Mail original -----
>> De: "Kłeczek, Michał" <michal at kleczek.org>
>> À: "amber-dev" <amber-dev at openjdk.java.net>
>> Envoyé: Mercredi 12 Juin 2019 14:23:30
>> Objet: Towards better serialization
>> To be honest I fail to see anything that would make it a better
>> serialization. This proposal is simply enforcement of implementing
>> writeReplace()/readResolve() pair (aka classic memento pattern) with
>> some syntactic sugar on top.
>nope, the idea is to decouple the data you need to write/read from the way
you want to serialize those data.
You cannot be more decoupled than current serialization as it is purely
And there is the whole lot of opt-in possibilities to control what
should be serialized.
What exactly is current proposal bringing to the table comparing to what
is already there?
>> And while it might make serialization more secure (which I doubt - see
>> below) it is not better for sure as it forces developers to do more
>> work. The whole premise of Java serialization was that it is supposed to
>> be transparent and cheap to implement. It is rooted in Smalltalk/Self
>> and the idea of a program image and transparent state migration. Why are
>> we giving this up? A better serialization should be simply a better
>> implementation of this idea - what we have here is retraction instead.
>It's more "secure" because the class writer is in control of the data
exported and because it relies on constructor/factory, so the same checks
are done if you construct an instance classically or by reflection.
The point is that it gives you only a false sense of security - which in
turn makes it _less_ secure (see below).
>> What's more - it does not really address security concerns! Even the
>> example of non-serializable ServerConnection being recreated based on
>> serverName upon deserialization illustrates it - serverName is not
>> sanitized/validated and as such is a security hole (as may lead to
>> information leak) - the only real defense is SecurityManager and proper
>> security policy in place.
>If you serialize something by definition you are leaking it.
I haven't been clear with my comment:
If an attacker provides a hostile serverName in the serialization stream
(ie. tries to trick the program to connect to the wrong server) - who's
responsibility it is to make sure it does not happen?
This example illustrates the fact that forcing reconstruction of the
object graph via constructor calls gives you only a false sense of
security - you *must* have some other security policy enforcement in
To make the point stronger:
Most deserialization gadgets invoke methods using reflection based on
the data from serialization stream in readObject() - which obviously is
a security hole. But there is *nothing* preventing such invocation in
the deserialization constructor.
The proposal simply does not really address security concerns - reason
being it can't as security problems cannot be solved at the level of
serialization/deserialization - this is simply the wrong place!
>That's said i agree that the example is not the best one because it
doesn't show the validation that should be done.
See above - input validation is not enough if we are talking about
>> The issue here is that we try to fix security problems in the wrong
>> place. Almost all security issues with serialization are not really
>> caused by serialization itself but by:
>> - huge classpath with all libraries accessible to each other (ie.
>> deserialization gadgets availability in classpath)
>> - running applications with no SecurityManager (starting a JVM with no
>> SecurityManager by default was the single biggest mistake Java designers
>> made in the past IMHO)
>huge classpath is a real issue, we have modules exactly for that, it's
just that given the giant size of the ecosystem, things move slowly.
>> There are issues with current serialization but IMHO they can be fixed
>> with small adjustments to the spec/API:
>> - Add an @Unshared annotation on a member field - that would signal
>> requirement for the deserialization framework to make sure the instance
>> is unshared
>> - Use ObjectInput/ObjectOutput interface everywhere instead of
>> ObjectInputStream/ObjectOutputStream classes (which would allow
>> different easier provision of different serialization formats)
>> - Provide easy way to register invariants validation (right now
>> registering ObjectInputValidation is a PITA) - it can be done by
>> introducing @InvariantCheck annotation on a method.
>> - Provide an easy way to designate a constructor or a static method as a
>> deserialization facility (either similar to the one in the proposal or
>> taking ObjectInput as an argument)
>> - Make use of ObjectInputStream.GetField/ObjectOutputStream.PutField
>> interfaces easier/more obvious
>nope, more coupling and peppering the current implementation with more
annotations are not a solution, they are the root of problems of the future
as the diverse fixes to the serialization has shown.
And yet the proposal introduces some more annotations while not
addressing any of the problems :)
>> My point here is - current serialization offers a lot and getting rid of
>> it instead of making it better is a huge step back. What's more -
>> getting rid of it is in reality only moving the problem around as the
>> need for transparent serialization is there and is witnessed by
>> existence of all Json/XML transparent serialization solutions. They will
>> not go away - quite the contrary - there will be more of them as there
>> will be no default in the standard library. And they will be worse than
>> the default one.
>serialization of records is transparent.
Why only records?
More information about the amber-dev