Bypassing Kotlin’s Null-Safety

In this short article, we will have a look at how we can bypass Kotlin’s native null-safety with sun.misc.Unsafe, and see why it can be dangerous even if we are not messing up with it directly.

Mythical sun.misc.Unsafe

The sun.misc.Unsafe class is an internal JVM tool for executing low-level operations like off-heap memory allocation, thread parking, CAS, and much more.

This class is like one of those scary computer game creatures that are there only to intimidate us, in theory, we can’t get close because they are part of the environment, but it’s often possible by exploiting glitches or holes.

If we try to access the Unsafe instance, we encounter a private constructor and a static getUnsafe() method that raises a SecurityException practically every time we call it:

public final class Unsafe {
    private static final Unsafe theUnsafe;
    // ...

    private Unsafe() {}

    @CallerSensitive
    public static Unsafe getUnsafe() {
        Class var0 = Reflection.getCallerClass();
        if (!VM.isSystemDomainLoader(var0.getClassLoader())) {
            throw new SecurityException("Unsafe");
        } else {
            return theUnsafe;
        }
    }
}

 

So, in theory, it’s guarded by a strong encapsulation, and an exception being thrown on every getUnsafe() call… but we do have the Reflection mechanism, and we can easily bypass those:

private fun getUnsafe(): Unsafe {
    return Unsafe::class.java.getDeclaredField("theUnsafe")
            .apply { isAccessible = true }
            .let { it.get(null) as Unsafe }
}

Mighty Unsafe.allocateInstance()

This method allocates an empty instance of a given class directly on the heap ignoring field initialization and constructors.

And this allows us, indeed, to effectively bypass Kotlin’s safety checks:

A cool thing to do on Friday’s evening, but what about just not using Unsafe and staying (null)safe?

Problem: Unsafe in External Libraries

The problem is that most Java libraries were written with Java in mind, where using Unsafe for certain scenarios is slightly less unsafe than it is e.g., for Kotlin.

This is especially the case with serialization/deserialization libraries – one of such is Google’s Gson which internally uses Unsafe for instantiating objects in certain situations – which is totally acceptable for Java.

If we start using it in Kotlin, we indeed might end up with an undesired behaviour observed above:

@Test
fun unsafe_2() {
    val foo = Gson().fromJson("{}", Foo::class.java)

    assertThat(foo.nonNullable).isNull()
}

In this case, we simply need to perform checks manually after instantiation, which is not super problematic – what’s problematic is the lack of consciousness that this happens, which can cost much.

Are you sure the library you are using is not doing that internally?

Code snippets can be found on GitHub.

Key Takeaways

  • Kotlin’s null-safety does not go beyond objects’ initialization phase and is bypassable
  • External libraries that use Unsafe internally can do that too – it’s important to be aware of this
You May Also Like

Recently at storm-users

I've been reading through storm-users Google Group recently. This resolution was heavily inspired by Adam Kawa's post "Football zero, Apache Pig hero". Since I've encountered a lot of insightful and very interesting information I've decided to describe some of those in this post.

  • nimbus will work in HA mode - There's a pull request open for it already... but some recent work (distributing topology files via Bittorrent) will greatly simplify the implementation. Once the Bittorrent work is done we'll look at reworking the HA pull request. (storm’s pull request)

  • pig on storm - Pig on Trident would be a cool and welcome project. Join and groupBy have very clear semantics there, as those concepts exist directly in Trident. The extensions needed to Pig are the concept of incremental, persistent state across batches (mirroring those concepts in Trident). You can read a complete proposal.

  • implementing topologies in pure python with petrel looks like this:

class Bolt(storm.BasicBolt):
    def initialize(self, conf, context):
       ''' This method executed only once '''
        storm.log('initializing bolt')

    def process(self, tup):
       ''' This method executed every time a new tuple arrived '''       
       msg = tup.values[0]
       storm.log('Got tuple %s' %msg)

if __name__ == "__main__":
    Bolt().run()
  • Fliptop is happy with storm - see their presentation here

  • topology metrics in 0.9.0: The new metrics feature allows you to collect arbitrarily custom metrics over fixed windows. Those metrics are exported to a metrics stream that you can consume by implementing IMetricsConsumer and configure with Config.java#L473. Use TopologyContext#registerMetric to register new metrics.

  • storm vs flume - some users' point of view: I use Storm and Flume and find that they are better at different things - it really depends on your use case as to which one is better suited. First and foremost, they were originally designed to do different things: Flume is a reliable service for collecting, aggregating, and moving large amounts of data from source to destination (e.g. log data from many web servers to HDFS). Storm is more for real-time computation (e.g. streaming analytics) where you analyse data in flight and don't necessarily land it anywhere. Having said that, Storm is also fault-tolerant and can write to external data stores (e.g. HBase) and you can do real-time computation in Flume (using interceptors)

That's all for this day - however, I'll keep on reading through storm-users, so watch this space for more info on storm development.

I've been reading through storm-users Google Group recently. This resolution was heavily inspired by Adam Kawa's post "Football zero, Apache Pig hero". Since I've encountered a lot of insightful and very interesting information I've decided to describe some of those in this post.

  • nimbus will work in HA mode - There's a pull request open for it already... but some recent work (distributing topology files via Bittorrent) will greatly simplify the implementation. Once the Bittorrent work is done we'll look at reworking the HA pull request. (storm’s pull request)

  • pig on storm - Pig on Trident would be a cool and welcome project. Join and groupBy have very clear semantics there, as those concepts exist directly in Trident. The extensions needed to Pig are the concept of incremental, persistent state across batches (mirroring those concepts in Trident). You can read a complete proposal.

  • implementing topologies in pure python with petrel looks like this:

class Bolt(storm.BasicBolt):
    def initialize(self, conf, context):
       ''' This method executed only once '''
        storm.log('initializing bolt')

    def process(self, tup):
       ''' This method executed every time a new tuple arrived '''       
       msg = tup.values[0]
       storm.log('Got tuple %s' %msg)

if __name__ == "__main__":
    Bolt().run()
  • Fliptop is happy with storm - see their presentation here

  • topology metrics in 0.9.0: The new metrics feature allows you to collect arbitrarily custom metrics over fixed windows. Those metrics are exported to a metrics stream that you can consume by implementing IMetricsConsumer and configure with Config.java#L473. Use TopologyContext#registerMetric to register new metrics.

  • storm vs flume - some users' point of view: I use Storm and Flume and find that they are better at different things - it really depends on your use case as to which one is better suited. First and foremost, they were originally designed to do different things: Flume is a reliable service for collecting, aggregating, and moving large amounts of data from source to destination (e.g. log data from many web servers to HDFS). Storm is more for real-time computation (e.g. streaming analytics) where you analyse data in flight and don't necessarily land it anywhere. Having said that, Storm is also fault-tolerant and can write to external data stores (e.g. HBase) and you can do real-time computation in Flume (using interceptors)

That's all for this day - however, I'll keep on reading through storm-users, so watch this space for more info on storm development.