How we use Kotlin with Exposed at TouK

Why Kotlin? At TouK, we try to early adopt technologies. We don’t have a starter project skeleton that is reused in every new project, we want to try something that fits the project needs, even if it’s not that popular yet. We tried Kotlin first it mid 2016, right after reaching 1.0.2 version

Why Kotlin?

At TouK, we try to early adopt technologies. We don’t have a starter project skeleton reused in every new project; we want to try something that fits the project’s needs, even if it’s not that popular yet. We tried Kotlin first in mid-2016, right after reaching the 1.0.2 version. It was getting really popular in Android development, but almost nobody used it on the backend, especially — with production deployment. After reading some “hello world” examples, including this great article by Sebastien Deleuze we decided to try Kotlin as main language for a new MVNO project. The project was mainly a backend for a mobile app, with some integrations with external services (chat, sms, payments) and background tasks regarding customer subscriptions. We felt that Kotlin would be something fresh and more pleasant for developers, but we also liked the “not reinventing the wheel” approach — reusing large parts of the Java/JVM ecosystem we were happy with for existing projects (Spring, Gradle, JUnit, Mockito).

Why Exposed?

We initially felt that Kotlin + JPA/Hibernate is not a perfect match. Kotlin’s functional nature with first-class immutability support was not something that could seemly integrate with full-blown ORM started in the pre-Java8 era. But Sebastien’s article led us to try Exposed — a SQL access library maintained by JetBrains. From the beginning, we really liked the main assumptions of Exposed:

  • not trying to be full ORM framework
  • two flavors — typesafe SQL DSL and DAO/ActiveRecord style
  • lightweight, no reflection
  • no code generation
  • Spring integration
  • no annotations on your domain classes (in SQL DSL flavor)
  • open for extension (e.g. PostGIS and new DB dialects)

TL;DR

If you want to see how we use Kotlin + Exposed duo in our projects, check out this Github repo. It’s a Spring Boot app exposing REST API with the implementation of Medium clone as specified in http://realworld.io (“The mother of all demo apps”).

Another nice example is this repo by Seb Schmidt.

SQL DSL

In our projects we decided to try the “typesafe SQL DSL” flavor of Exposed. In this approach you don’t have to add anything to your domain classes, just need to write a simple schema mapping using Kotlin in configuration-as-code manner:

data class User(
  val username: Username,
  val password: String,
  val email: String
)

object UserTable : Table("users") {
    val username = text("username")
    val email = text("email")
    val password = text("password")
}

And then you can write type/null-safe queries with direct mapping to your domain classes:

UserTable.select { UserTable.username eq username }?.toUser()

// or 

UserTable.select { UserTable.username like username }.map { it.toUser() }

fun ResultRow.toUser() = User(
       username = this[UserTable.username],
       email = this[UserTable.email],
       password = this[UserTable.password]
)

RefIds

We like type-safe RefIds in our domain code. This is particularly useful in DDD-ish architectures, where you can keep those RefIds in a shared domain and use them to communicate between contexts.

So we wrap plan ids (longs, strings) into simple wrapper classes (e.g. UserId, ArticleId, Username, Slug). Exposed allows to easily register your own column types or even generic WrapperColumnType implementation that you can find in our repo.

Using this technique you can rewrite this mapping to something like this:

sealed class UserId : RefId<Long>() {
  object New : UserId() {
    override val value: Long by IdNotPersistedDelegate<Long>()
  }
  data class Persisted(override val value: Long) : UserId() {
    override fun toString() = "UserId(value=$value)"
  }
}

data class User(
   val id: UserId = UserId.New,
   //...
)

fun Table.userId(name: String) = longWrapper<UserId>(name, UserId::Persisted, UserId::value)

object UserTable : Table("users") {
  val id = userId("id").primaryKey().autoIncrement()
//...
}

And now we can query by type-safe RefIds:

override fun findBy(userId: UserId) =
  UserTable.select { UserTable.id eq userId }?.toUser()

Relationship mapping

One of the most significant selling points of ORMs is how easy it is to deal with relations. You just annotate the related field/collection with OneToOne or OneToMany and then can fetch the whole graph of objects at once. In theory — quite a nice idea, but in practice, things often go wrong. I’m not going to dig into details, instead, I recommend you read e.g. these fragments of “Opinionated JPA with Querydsl” book:

In Exposed SQL DSL approach you have to do relationship mapping by yourself — if you need to. Let’s consider Article and Tag case from our project’s domain. We have a many-to-many relation here, so we need additional “article_tags” table:

object ArticleTagTable : Table("article_tags") {
   val tagId = tagId("tag_id").references(TagTable.id)
   val articleId = articleId("article_id").references(ArticleTable.id)
}

When creating an Article, we have to attach all the associated Tags by populating Article’s generated id into ArticleTabTable entries:

override fun create(article: Article): Article {
   val savedArticle = ArticleTable.insert { it.from(article) }
           .getOrThrow(ArticleTable.id)
           .let { article.copy(id = it) }
   savedArticle.tags.forEach { tag ->
       ArticleTagTable.insert {
           it[ArticleTagTable.tagId] = tag.id
           it[ArticleTagTable.articleId] = savedArticle.id
       }
   }
   return savedArticle
}

The funny part is the mapping of Article with Tags in query methods — in API specification Tags are always returned with the Article — so we need to eagerly fetch tags by using leftJoin:

val ArticleWithTags = (ArticleTable leftJoin ArticleTagTable leftJoin TagTable)

override fun findBy(articleId: ArticleId) =
  ArticleWithTags
    .select { ArticleTable.id eq articleId }
    .toArticles()
    .singleOrNull()

After joining, we have then one ResultRow per one Article-Tag pair, so we have to group them by ArticleId, and build the correct Article object by adding Tags for each matching resultRow:

fun Iterable<ResultRow>.toArticles(): List<ResultRow> {
   return fold(mutableMapOf<ArticleId, Article>()) { map, resultRow ->
       val article = resultRow.toArticle()
       val tagId = resultRow.tryGet(ArticleTagTable.tagId)
       val tag = tagId?.let { resultRow.toTag() }
       val current = map.getOrDefault(article.id, article)
       map[article.id] = current.copy(tags = current.tags + listOfNotNull(tag))
       map
   }.values.toList()
}

This implementation allows us to solve all the possible cases:

  • no articles (fold just returns empty map)
  • articles with no tags (tag is null, so listOfNotNull(tag) is empty)
  • articles with many tags (an article with a single tag is inserted into the map, then other tags are added in copy method)

However, consider when you need to fetch the dependent structure with the root object? For tags it makes sense since you always want the tags with the article, and the count of tags for any article should not be that huge. What about the comments? You definitely don’t want all the comments each time you fetch the article, instead you’ll need some kind of paging or even making a parent-child hierarchy for comments for the article. That’s why we recommend having this relationship mapped indirectly — every Comment should have ArticleId property, and the CommentRepository could have methods like:

fun findAllBy(articleId: ArticleId): List<Comment>
// or
fun findAllByPaged(articleId: ArticleId, pageRequest: PageRequest): Page<Comment>

Extendibility

Exposed is by design open for extension, making it even easier with Kotlin’s support for extension methods. You can define your own column type or expressions, e.g. for PostGIS point type support as Sebastian showed in his article. We used similar PostGIS extension in our project too. We were also able to implement a simple support for Java8 DateTime column type — for now, Exposed has Joda-time support, a generic approach for various date/time libraries is planned in the roadmap.

The bigger thing was Oracle DB dialect — we were forced to migrate to Oracle at some time in our project. We submitted a pull-request with foundations of Oracle 12 support, being tested in production for a while (then, we moved back to PostgreSQL…). The implementation was rather straightforward, with DataType- and FunctionProvider interfaces to provide and just a few tweaks in batch insert support.

Final thoughts

Our developer experience with Kotlin+Exposed duo was really pleasant. If you don’t plan to map many relations directly, just use simple data classes, connected by RefIds, it works really well. The Exposed library itself may need more exhaustive documentation and removing some annoying details (e.g. transaction management via thread-local — which is already on the roadmap), but we definitely recommend you give it a try in your project!

You May Also Like

Zookeeper + Curator = Distributed sync

An application developed for one of my recent projects at TouK involved multiple servers. There was a requirement to ensure failover for the system’s components. Since I had already a few separate components I didn’t want to add more of that, and since there already was a Zookeeper ensemble running - required by one of the services, I’ve decided to go that way with my solution.

What is Zookeeper?

Just a crude distributed synchronization framework. However, it implements Paxos-style algorithms (http://en.wikipedia.org/wiki/Paxos_(computer_science)) to ensure no split-brain scenarios would occur. This is quite an important feature, since I don’t have to care about that kind of problems while using this app. You just need to create an ensemble of a couple of its instances - to ensure high availability. It is basically a virtual filesystem, with files, directories and stuff. One could ask why another filesystem? Well this one is a rather special one, especially for distributed systems. The reason why creating all the locking algorithms on top of Zookeeper is easy is its Ephemeral Nodes - which are just files that exist as long as connection for them exists. After it disconnects - such file disappears.

With such paradigms in place it’s fairly easy to create some high level algorithms for synchronization.

Having that in place, it can safely integrate multiple services ensuring loose coupling in a distributed way.

Zookeeper from developer’s POV

With all the base services for Zookeeper started, it seems there is nothing else, than just connect to it and start implementing necessary algorithms. Unfortunately, the API is quite basic and offers files and directories abstractions with the addition of different node type (file types) - ephemeral and sequence. It is also possible to watch a node for changes.

Using bare Zookeeper is hard!

Creating connections is tedious - and there is lots of things to take care of. Handling an established connection is hard - when establishing connection to ensemble, it’s necessary to negotiate a session also. During the whole process a number of exceptions can occur - these are “recoverable” exceptions, that can be gracefully handled and not break the connection.

    class="c8"><span>So, Zookeeper API is hard.</span></p><p class="c1"><span></span></p><p class="c8"><span>Even if one is proficient with that API, then there come recipes. The reason for using Zookeeper is to be able to implement some more sophisticated algorithms on top of it. Unfortunately those aren&rsquo;t trivial and it is again quite hard to implement them without bugs.</span>

And since distributed systems are hard, why would anyone want another difficult to handle tool?

Enter Curator

<p
    class="c8"><span>Happily, guys from Netflix implemented a nice abstraction for dealing with Zookeeper internals. They called it Curator and use it extensively in the company&rsquo;s environment. Curator offers consistent API for Zookeeper&rsquo;s functionality. It even implements a couple of recipes for distributed systems.</span>

File read/write

<p
    class="c8"><span>The basic use of Zookeeper is as a distributed configuration repository. For this scenario I only need read/write capabilities, to be able to write and read files from the Zookeeper filesystem. This code snippet writes a sample json to a file on ZK filesystem.</span>

<a href="#"
                                                                                                  name="0"></a>

EnsurePath ensurePath = new EnsurePath(markerPath);
ensurePath.ensure(client.getZookeeperClient());
String json = “...”;
if (client.checkExists().forPath(statusFile(core)) != null)
     client.setData().forPath(statusFile(core), json.getBytes());
else
     client.create().forPath(statusFile(core), json.getBytes());


Distributed locking

Having multiple systems there may be a need of using an exclusive lock for some resource, or perhaps some big system requires it’s components to synchronize based on locks. This “recipe” is an ideal match for those situations.

ref="#"
                                                                                    name="b0329bbbf14b79ffaba1139881914aea887ef6a3"></a>



lock = new InterProcessSemaphoreMutex(client, lockPath);
lock.acquire(5, TimeUnit.MINUTES);
… do sth …
lock.release();


 (from https://github.com/zygm0nt/curator-playground/blob/master/src/main/java/pl/touk/curator/LockingRemotely.java)

Sevice Advertisement

<p

    class="c8"><span>This is quite an interesting use case. With many small services on different servers it is not wise to exchange ip addresses and ports between them. When some of those services may go down, while other will try to replace them - the task gets even harder. </span>

That’s why, with Zookeeper in place, it can be utilised as a registry of existing services.

If a service starts, it registers into the ServiceRegistry, offering basic information, like it’s purpose, role, address, and port.

Services that want to use a specific kind of service request an access to some instance. This way of configuring easily decouples services from their configuration.

Basically this scenario needs ? steps:

<span>1. Service starts and registers its presence (</span><span class="c5"><a class="c0"
                                                                               href="https://github.com/zygm0nt/curator-playground/blob/master/src/main/java/pl/touk/curator/WorkerAdvertiser.java#L44">https://github.com/zygm0nt/curator-playground/blob/master/src/main/java/pl/touk/curator/WorkerAdvertiser.java#L44</a></span><span>)</span><span>:</span>



ServiceDiscovery discovery = getDiscovery();
            discovery.start();
            ServiceInstance si = getInstance();
            log.info(si);
            discovery.registerService(si);



2. Another service - on another host or in another JVM on the same machine tries to discover who is implementing the service (https://github.com/zygm0nt/curator-playground/blob/master/src/main/java/pl/touk/curator/WorkerFinder.java#L50):

<a href="#"

                                                                                                  name="3"></a>

instances = discovery.queryForInstances(serviceName);

The whole concept here is ridiculously simple - the service advertising its presence just stores a file with its whereabouts. The service that is looking for service providers just look into specific directory and read stored definitions.

In my example, the structure advertised by services looks like this (+ some getters and constructor - the rest is here: https://github.com/zygm0nt/curator-playground/blob/master/src/main/java/pl/touk/model/WorkerMetadata.java):



public final class WorkerMetadata {
    private final UUID workerId;
    private final String listenAddress;
    private final int listenPort;
}


Source code

<p

    class="c8"><span>The above recipes are available in Curator library (</span><span class="c5"><a class="c0"
                                                                                                    href="http://curator.incubator.apache.org/">http://curator.incubator.apache.org/</a></span><span>). Recipes&rsquo;
usage examples are in my github repo at </span><span class="c5"><a class="c0"
                                                                   href="https://github.com/zygm0nt/curator-playground">https://github.com/zygm0nt/curator-playground</a></span>

Conclusion

<p
    class="c8"><span>If you&rsquo;re in need of a reliable platform for exchanging data and managing synchronization, and you need to do it in a distributed fashion - just choose Zookeeper. Then add Curator for the ease of using it. Enjoy!</span>


  1. image comes from: http://www.flickr.com/photos/jfgallery/2993361148
  2. all source code fragments taken from this repo: https://github.com/zygm0nt/curator-playground

An application developed for one of my recent projects at TouK involved multiple servers. There was a requirement to ensure failover for the system’s components. Since I had already a few separate components I didn’t want to add more of that, and since there already was a Zookeeper ensemble running - required by one of the services, I’ve decided to go that way with my solution.

What is Zookeeper?

Just a crude distributed synchronization framework. However, it implements Paxos-style algorithms (http://en.wikipedia.org/wiki/Paxos_(computer_science)) to ensure no split-brain scenarios would occur. This is quite an important feature, since I don’t have to care about that kind of problems while using this app. You just need to create an ensemble of a couple of its instances - to ensure high availability. It is basically a virtual filesystem, with files, directories and stuff. One could ask why another filesystem? Well this one is a rather special one, especially for distributed systems. The reason why creating all the locking algorithms on top of Zookeeper is easy is its Ephemeral Nodes - which are just files that exist as long as connection for them exists. After it disconnects - such file disappears.

With such paradigms in place it’s fairly easy to create some high level algorithms for synchronization.

Having that in place, it can safely integrate multiple services ensuring loose coupling in a distributed way.

Zookeeper from developer’s POV

With all the base services for Zookeeper started, it seems there is nothing else, than just connect to it and start implementing necessary algorithms. Unfortunately, the API is quite basic and offers files and directories abstractions with the addition of different node type (file types) - ephemeral and sequence. It is also possible to watch a node for changes.

Using bare Zookeeper is hard!

Creating connections is tedious - and there is lots of things to take care of. Handling an established connection is hard - when establishing connection to ensemble, it’s necessary to negotiate a session also. During the whole process a number of exceptions can occur - these are “recoverable” exceptions, that can be gracefully handled and not break the connection.

    class="c8"><span>So, Zookeeper API is hard.</span></p><p class="c1"><span></span></p><p class="c8"><span>Even if one is proficient with that API, then there come recipes. The reason for using Zookeeper is to be able to implement some more sophisticated algorithms on top of it. Unfortunately those aren&rsquo;t trivial and it is again quite hard to implement them without bugs.</span>

And since distributed systems are hard, why would anyone want another difficult to handle tool?

Enter Curator

<p
    class="c8"><span>Happily, guys from Netflix implemented a nice abstraction for dealing with Zookeeper internals. They called it Curator and use it extensively in the company&rsquo;s environment. Curator offers consistent API for Zookeeper&rsquo;s functionality. It even implements a couple of recipes for distributed systems.</span>

File read/write

<p
    class="c8"><span>The basic use of Zookeeper is as a distributed configuration repository. For this scenario I only need read/write capabilities, to be able to write and read files from the Zookeeper filesystem. This code snippet writes a sample json to a file on ZK filesystem.</span>

<a href="#"
                                                                                                  name="0"></a>

EnsurePath ensurePath = new EnsurePath(markerPath);
ensurePath.ensure(client.getZookeeperClient());
String json = “...”;
if (client.checkExists().forPath(statusFile(core)) != null)
     client.setData().forPath(statusFile(core), json.getBytes());
else
     client.create().forPath(statusFile(core), json.getBytes());


Distributed locking

Having multiple systems there may be a need of using an exclusive lock for some resource, or perhaps some big system requires it’s components to synchronize based on locks. This “recipe” is an ideal match for those situations.

ref="#"
                                                                                    name="b0329bbbf14b79ffaba1139881914aea887ef6a3"></a>



lock = new InterProcessSemaphoreMutex(client, lockPath);
lock.acquire(5, TimeUnit.MINUTES);
… do sth …
lock.release();


 (from https://github.com/zygm0nt/curator-playground/blob/master/src/main/java/pl/touk/curator/LockingRemotely.java)

Sevice Advertisement

<p

    class="c8"><span>This is quite an interesting use case. With many small services on different servers it is not wise to exchange ip addresses and ports between them. When some of those services may go down, while other will try to replace them - the task gets even harder. </span>

That’s why, with Zookeeper in place, it can be utilised as a registry of existing services.

If a service starts, it registers into the ServiceRegistry, offering basic information, like it’s purpose, role, address, and port.

Services that want to use a specific kind of service request an access to some instance. This way of configuring easily decouples services from their configuration.

Basically this scenario needs ? steps:

<span>1. Service starts and registers its presence (</span><span class="c5"><a class="c0"
                                                                               href="https://github.com/zygm0nt/curator-playground/blob/master/src/main/java/pl/touk/curator/WorkerAdvertiser.java#L44">https://github.com/zygm0nt/curator-playground/blob/master/src/main/java/pl/touk/curator/WorkerAdvertiser.java#L44</a></span><span>)</span><span>:</span>



ServiceDiscovery discovery = getDiscovery();
            discovery.start();
            ServiceInstance si = getInstance();
            log.info(si);
            discovery.registerService(si);



2. Another service - on another host or in another JVM on the same machine tries to discover who is implementing the service (https://github.com/zygm0nt/curator-playground/blob/master/src/main/java/pl/touk/curator/WorkerFinder.java#L50):

<a href="#"

                                                                                                  name="3"></a>

instances = discovery.queryForInstances(serviceName);

The whole concept here is ridiculously simple - the service advertising its presence just stores a file with its whereabouts. The service that is looking for service providers just look into specific directory and read stored definitions.

In my example, the structure advertised by services looks like this (+ some getters and constructor - the rest is here: https://github.com/zygm0nt/curator-playground/blob/master/src/main/java/pl/touk/model/WorkerMetadata.java):



public final class WorkerMetadata {
    private final UUID workerId;
    private final String listenAddress;
    private final int listenPort;
}


Source code

<p

    class="c8"><span>The above recipes are available in Curator library (</span><span class="c5"><a class="c0"
                                                                                                    href="http://curator.incubator.apache.org/">http://curator.incubator.apache.org/</a></span><span>). Recipes&rsquo;
usage examples are in my github repo at </span><span class="c5"><a class="c0"
                                                                   href="https://github.com/zygm0nt/curator-playground">https://github.com/zygm0nt/curator-playground</a></span>

Conclusion

<p
    class="c8"><span>If you&rsquo;re in need of a reliable platform for exchanging data and managing synchronization, and you need to do it in a distributed fashion - just choose Zookeeper. Then add Curator for the ease of using it. Enjoy!</span>


  1. image comes from: http://www.flickr.com/photos/jfgallery/2993361148
  2. all source code fragments taken from this repo: https://github.com/zygm0nt/curator-playground