The road to Kotlin Symbol Processing

There’s a long back story to Java annotations. Introduced in 2004 for Java 5 and supported by the javac compiler since Java 6 in 2006, they may be thought of as an industry standard approach to metaprogramming. However, while annotation processing is heavily used in the Android ecosystem, it is not that popular a technique in backend development. I had an opportunity to dig into the subject while developing Krush, which is based on Kotlin annotation processing (KAPT). In this article I’ll try to show you the different approaches to annotation processing, starting from pure Java solutions, then moving to KAPT and finally to its successor – Kotlin Symbol Processing.

Java annotation processors

Lombok

Lombok is one of the first projects that comes to a Java dev’s mind when thinking about annotation processing. You just add a single dependency to your pom.xml / add a plugin to IntelliJ IDEA, and some kind of magic turns your classes annotated with @Value into functional immutable data structures. But in fact, Lombok is not a 100% pure example of an annotation processor… If you run your debugger and step into a generated toString / equals method, you’ll see no generated code:

[screenshot: debugging Lombok-generated code – no source to step into]

Lombok starts as a usual annotation processor, but during its run it modifies the compiler’s abstract syntax tree to insert the desired methods into your classes, which is not an intended use case for annotation processors. In some references this technique is even called a hack:

Lombok … uses annotation processing as a bootstrapping mechanism to include itself into the compilation process and modify the AST via some internal compiler APIs. This hacky technique has nothing to do with the intended purpose of annotation processing

[source]

Apart from not being the cleanest solution, Lombok for some time broke other annotation-processing-based libraries configured to run on the same source code.

AutoValue

A similar library, which uses annotation processing in a clean way, is AutoValue. It processes the @AutoValue and @AutoValue.Builder annotations to generate immutable classes whose code you can safely step into using a debugger. Consider the following example:

@AutoValue
abstract class Book {

   static Builder builder() {
       return new AutoValue_Book.Builder();
   }

   @AutoValue.Builder
   interface Builder {
       Builder title(String title);
       Builder author(String author);
       Book build();
   }

   abstract String title();
   abstract String author();
}

Then, if you create an instance of the class using the Builder, you can see the generated toString / equals / hashCode methods in the debugger:

[screenshot: debugging AutoValue-generated code]
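For illustration, creating such an instance could look like this – a quick sketch (written in Kotlin, to match the rest of this article; it assumes the Book class above sits in the same package):

val book = Book.builder()
    .title("The Trial")
    .author("Franz Kafka")
    .build()

// the generated value semantics in action:
println(book) // Book{title=The Trial, author=Franz Kafka}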

By quickly looking at the code above, you can see some characteristics of using an annotation processor in your project:

  • there is a minimal framework/convention that must be built into your classes (abstract getters, a static builder method)
  • you are using “third-party” code (the generated classes in this example), either by referring to it directly in your source code or at runtime (like in the @AutoValue example)
  • there is a need to run partial compilation before being able to use the generated code
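For completeness, this is roughly how such a processor gets wired into a build – a sketch in Gradle’s Kotlin DSL, with the artifact versions being assumptions:

// build.gradle.kts (sketch)
dependencies {
    // the annotations stay on the compile classpath...
    compileOnly("com.google.auto.value:auto-value-annotations:1.10.1")
    // ...while the processor itself runs during compilation
    annotationProcessor("com.google.auto.value:auto-value:1.10.1")
}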

You can find many other examples of Java annotation processors on this annotation processing list.

KAPT

I wasn’t aware of how much annotation processing is used by libraries from the Android ecosystem. ButterKnife, Room, Moshi, Hilt… these names are not quite familiar if you’re a backend developer working on the JVM. As Kotlin started gaining popularity in the Android community, it was crucial for it to support the existing ecosystem of libraries made for Java. That’s why KAPT – the Kotlin Annotation Processing Tool – was introduced. The idea behind it was very simple:

  • generate minimal Java stubs from the Kotlin sources
  • run annotation processing on those stubs, just as in any Java project
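In practice, enabling KAPT boils down to applying the plugin and routing processor dependencies through the kapt configuration – a minimal sketch (plugin and library versions are assumptions), here with Dagger as the processor:

// build.gradle.kts (sketch)
plugins {
    kotlin("jvm") version "1.8.22"
    kotlin("kapt") version "1.8.22"
}

dependencies {
    implementation("com.google.dagger:dagger:2.46")
    // the processor runs on the Java stubs generated from the Kotlin sources
    kapt("com.google.dagger:dagger-compiler:2.46")
}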

Krush

One example of using KAPT is Krush – our lightweight persistence layer for Kotlin, based on the Exposed SQL DSL. Krush interprets standard JPA annotations on entity classes to generate both Exposed DSL mappings and convenient methods to transfer data from/to entity classes.

Consider the following entity:

@Entity
data class Reservation(

   @Id
   val uid: UUID = UUID.randomUUID(),

   @Enumerated(EnumType.STRING)
   val status: Status = Status.FREE
)

By adding Krush, we will have the following mapping generated:

// generated
public object ReservationTable : Table("reservation") {
 public val uid: Column<UUID> = uuid("uid")
 public override val primaryKey: Table.PrimaryKey = PrimaryKey(uid)
 public val status: Column<Status> =
    enumerationByName("status", 255, pl.touk.krush.Status::class)
}
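Along with the table object, Krush generates mapping extensions, so persisting and reading entities is a matter of plain Exposed calls – a usage sketch (transaction comes from Exposed; the insert / toReservationList extensions follow Krush’s naming convention):

// sketch: assumes a configured Exposed Database connection
val saved = transaction {
    ReservationTable.insert(Reservation(status = Status.FREE))
}

val reservations = transaction {
    ReservationTable.selectAll().toReservationList() // generated ResultRow mapping
}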

However, during Krush implementation we observed some drawbacks of using annotation processing in a 100% Kotlin codebase:

  • no direct support for code generation – the Java annotation API contains only limited support for code generation via the Filer interface, which just lets you generate new files, with no distinction between source code, metadata, docs etc. A partial solution is to use a third-party code generation library like KotlinPoet, which contains a feature-rich DSL for generating Kotlin classes, properties etc. (see the sketch below)
  • missing Kotlin-specific information in the API – the annotation processing API exposes the structure of the code during compilation through the javax.lang.model package. However, this is Java-specific, so you don’t have access to top-level functions, file annotations etc. Some additional Kotlin metadata can be retrieved by using the kotlinpoet-metadata integration.
  • the need to generate Java stubs before running the annotation processing itself reduces the overall performance of the project build; some resources estimate that stub generation takes ⅓ of the whole kotlinc run time, which can be painful for large codebases.
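To give a feel for the KotlinPoet DSL mentioned above, here is a minimal, self-contained sketch (the names are made up for illustration) that builds a Kotlin file with a single top-level property:

import com.squareup.kotlinpoet.FileSpec
import com.squareup.kotlinpoet.PropertySpec

// builds `val answer: Int = 42` in Generated.kt, package pl.touk.example
val file = FileSpec.builder("pl.touk.example", "Generated")
    .addProperty(
        PropertySpec.builder("answer", Int::class)
            .initializer("%L", 42)
            .build()
    )
    .build()

fun main() {
    file.writeTo(System.out) // print the source instead of writing a file
}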

KSP

The downsides of using KAPT in pure Kotlin projects (especially Android-based ones) triggered Google to develop a Kotlin-specific approach, called Kotlin Symbol Processing. It is now the preferred way to implement annotation processing in Kotlin – since its release, KAPT has been put into maintenance mode. KSP provides a Kotlin-specific API representing your source code (separate from javax.lang.model), removes the need to generate Java stubs, and integrates better with code generation libraries. Let’s look at the basic features of KSP in a hands-on example, starting with the build setup sketched below.
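KSP ships its own Gradle plugin and a ksp dependency configuration – a minimal wiring sketch, where the plugin versions and module names are assumptions:

// build.gradle.kts (sketch)
plugins {
    kotlin("jvm") version "1.8.22"
    id("com.google.devtools.ksp") version "1.8.22-1.0.11"
}

dependencies {
    implementation(project(":slf4j-annotations")) // hypothetical module with the annotation
    ksp(project(":slf4j-processor"))              // hypothetical module with the processor
}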

Implementing @Slf4j

Let’s look at how to use KSP by writing a simple processor – an equivalent of Lombok’s @Slf4j.
We should start with the implementation of our annotation, a SymbolProcessor and a provider for it:

import com.google.devtools.ksp.processing.Resolver
import com.google.devtools.ksp.processing.SymbolProcessor
import com.google.devtools.ksp.processing.SymbolProcessorEnvironment
import com.google.devtools.ksp.processing.SymbolProcessorProvider
import com.google.devtools.ksp.symbol.KSAnnotated
import com.google.devtools.ksp.symbol.KSClassDeclaration
import com.google.devtools.ksp.symbol.KSVisitorVoid
import com.google.devtools.ksp.validate

annotation class Slf4j

class Slf4jProcessor(val env: SymbolProcessorEnvironment) : SymbolProcessor {

    override fun process(resolver: Resolver): List<KSAnnotated> {
        val symbols = resolver.getSymbolsWithAnnotation(Slf4j::class.java.name)
        // symbols that can't be validated yet are returned for the next processing round
        val ret = symbols.filter { !it.validate() }.toList()
        symbols
            .filter { it is KSClassDeclaration && it.validate() }
            .forEach { it.accept(Slf4jProcessorVisitor(), Unit) }
        return ret
    }

    inner class Slf4jProcessorVisitor : KSVisitorVoid() // visitClassDeclaration shown below

}

class Slf4jProcessorProvider : SymbolProcessorProvider {
    override fun create(environment: SymbolProcessorEnvironment) = Slf4jProcessor(environment)
}
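One more piece is needed for KSP to pick the provider up – a standard ServiceLoader registration file on the processor’s classpath (the fully qualified name below assumes our example classes live in a hypothetical pl.touk.example package):

# src/main/resources/META-INF/services/com.google.devtools.ksp.processing.SymbolProcessorProvider
pl.touk.example.Slf4jProcessorProvider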

Apart from a bunch of bootstrapping code, you may notice that the main part of the implementation is Slf4jProcessorVisitor – KSP traverses your source code using the visitor pattern. So what you have to do is write the appropriate visitXXX method(s) to implement your processor’s functionality:

// needs: org.slf4j.Logger / LoggerFactory plus KotlinPoet imports (com.squareup.kotlinpoet.*
// and com.squareup.kotlinpoet.ksp.toClassName / com.squareup.kotlinpoet.ksp.writeTo)
override fun visitClassDeclaration(classDeclaration: KSClassDeclaration, data: Unit) {
    val packageName = classDeclaration.packageName.asString()
    val ksType = classDeclaration.asType(emptyList())

    val fileSpec = FileSpec.builder(
        packageName = packageName,
        fileName = classDeclaration.simpleName.asString() + "Ext"
    ).apply {
        val className = ksType.toClassName()
        val loggerName = "_${className.simpleName.replaceFirstChar { it.lowercase() }}Logger"
        // private top-level property holding the actual logger instance
        addProperty(
            PropertySpec.builder(loggerName, Logger::class.java)
                .addModifiers(KModifier.PRIVATE)
                .initializer("%T.getLogger(%T::class.java)", LoggerFactory::class.java, className)
                .build()
        )
        // public `logger` extension property on the annotated class, delegating to the private one
        addProperty(
            PropertySpec.builder("logger", Logger::class.java)
                .receiver(className)
                .getter(
                    FunSpec.getterBuilder()
                        .addStatement("return $loggerName")
                        .build()
                )
                .build()
        )
    }.build()

    fileSpec.writeTo(codeGenerator = env.codeGenerator, aggregating = false)
}

So, I’m using KotlinPoet here, which is quite nicely integrated with KSP via the writeTo method – if you want to generate a new file, you just build a FileSpec and then write it to the appropriate folder configured by the KSP plugin by calling writeTo.

In short, for each class annotated with @Slf4j we generate a file with a private logger property (e.g. _serviceLogger for a Service class), which is initialized with a standard LoggerFactory.getLogger call. KotlinPoet comes with a nice templating system which allows you to just pass class declarations from the KSP model instead of resolving them (and their imports) manually. The second property is an extension of our annotated class, which we express using a receiver block; we also add a custom getter by using another KotlinPoet call.

If we did everything right, this is how the generated code should look for an annotated class:

import org.slf4j.Logger
import org.slf4j.LoggerFactory

private val _serviceLogger: Logger = LoggerFactory.getLogger(Service::class.java)

public val Service.logger: Logger
  get() = _serviceLogger

Which should allow us to use the logger in our class:

@Slf4j
class Service {
   fun test() {
       logger.info("Hello from KSP!")
   }
}
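One caveat: depending on the KSP plugin version, the IDE may not see the generated sources out of the box. A common fix, taken from the KSP documentation, is to add the KSP output directory to the main source set:

// build.gradle.kts (sketch): make the IDE aware of KSP-generated code
kotlin {
    sourceSets.main {
        kotlin.srcDir("build/generated/ksp/main/kotlin")
    }
}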

Summary

I hope this article helped you learn the history of annotation processing and the motivation behind the KSP project. KAPT is now in maintenance mode, so KSP should be the default choice for new, pure Kotlin projects. However, if you are thinking of migrating an existing KAPT-based library, you should be aware of some complications when javax.lang.model is too tightly coupled to your model. For example, the Dagger project has had a rough road to supporting KSP – it first had to introduce a common model, based on XProcessing, to support both KSP and traditional annotation processing.

The code for the example @Slf4j processor can be found here.

You May Also Like

Apache HISE + Apache Camel

Check out this SlideShare Presentation: Apache HISE + Apache Camel. View more presentations from Rafal Rusin.

Multi module Gradle project with IDE support

This article is a short how-to about multi-module project setup with usage of the Gradle automation build tool.

Here's how Rich Seller, a StackOverflow user, describes Gradle:
Gradle promises to hit the sweet spot between Ant and Maven. It uses Ivy's approach for dependency resolution. It allows for convention over configuration but also includes Ant tasks as first class citizens. It also wisely allows you to use existing Maven/Ivy repositories.
So why would one use yet another JVM build tool such as Gradle? The answer is simple: to avoid the frustration caused by Ant or Maven.

Short story

I was fooling around with some fresh proof of concept and needed a build tool. I'm pretty familiar with Maven, so I created a project from an archetype and opened the build file, pom.xml, for further tuning.
I had been using Grails, with its own build system (similar to Gradle, btw), for some time by then, so after quite a while away from Maven I looked at the pom.xml and found it really repulsive.

Once again I felt clearly: XML is not for humans.

After some quick googling I found Gradle. It was still in beta (version 0.8) back then, but it's configured with a Groovy DSL and that's what a human likes :)

Where are we

These days Ant can be met only among IT guerrillas, Maven is still on top, and a couple of others, like Ivy, compete for the best position, while Gradle has smoothly entered its mature age. It's now available in version 1.3, released on the 20th of November 2012. I'm glad to recommend it to anyone looking for relief from XML-configured tools, or to anyone just looking for a simple, elastic and powerful build tool.

Let's build

I have already written about the basic project structure, so I'll skip that part, reminding you only of the layout:
<project root>
├── build.gradle
└── src
    ├── main
    │   ├── java
    │   └── groovy
    └── test
        ├── java
        └── groovy
Have I just referred to myself for the 1st time? Achievement unlocked! ;)

Gradle, like most build tools, is run from the command line with parameters. The main parameter for Gradle is a 'task name'; for example, we can run the command: gradle build.
There is no 'create project' task, so the directory structure has to be created by hand, as shown below. This isn't a hassle though.
The java and groovy sub-folders aren't always mandatory – they depend on which compile plugin is used.
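For example, the whole skeleton can be created with a single command:

mkdir -p src/main/java src/main/groovy src/test/java src/test/groovy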

Parent project

Consider an example project, 'the-app', consisting of three modules, let's say:
  1. a database communication layer
  2. a domain model and services layer
  3. a web presentation layer
Our project directory tree will look like:
the-app
├── dao-layer
│   └── src
├── domain-model
│   └── src
├── web-frontend
│   └── src
├── build.gradle
└── settings.gradle
the-app itself has no src sub-folder, as its purpose is only to contain sub-projects and build configuration. If needed, it could've been provided with its own src though.

To glue the modules together, we need to fill the settings.gradle file under the-app directory with a single line specifying the module names:
include 'dao-layer', 'domain-model', 'web-frontend'
Now the gradle projects command can be executed, giving the following result:
:projects

------------------------------------------------------------
Root project
------------------------------------------------------------

Root project 'the-app'
+--- Project ':dao-layer'
+--- Project ':domain-model'
\--- Project ':web-frontend'
...so we know that Gradle noticed the modules. However, the gradle build command won't run successfully yet, because the build.gradle file is still empty.

Sub project

As in Maven, we can create a separate build config file for each module. Let's say we start with the DAO layer.
Thus we create a new file, the-app/dao-layer/build.gradle, with a line of basic build info (notice the new build.gradle was created under the sub-project directory):
apply plugin: 'java'
This single line of config for any of the modules is enough to execute the gradle build command under the-app directory with the following result:
:dao-layer:compileJava
:dao-layer:processResources UP-TO-DATE
:dao-layer:classes
:dao-layer:jar
:dao-layer:assemble
:dao-layer:compileTestJava UP-TO-DATE
:dao-layer:processTestResources UP-TO-DATE
:dao-layer:testClasses UP-TO-DATE
:dao-layer:test
:dao-layer:check
:dao-layer:build

BUILD SUCCESSFUL

Total time: 3.256 secs
To use the Groovy plugin, slightly more configuration is needed:
apply plugin: 'groovy'

repositories {
    mavenLocal()
    mavenCentral()
}

dependencies {
    groovy 'org.codehaus.groovy:groovy-all:2.0.5'
}
In lines 3 to 6 the Maven repositories are set. In line 9 a dependency on a specific Groovy version is declared. Of course plugins such as 'java', 'groovy' and many more can be mixed with each other.

If we have a settings.gradle file and a build.gradle file for each module, there is no need for a parent the-app/build.gradle file at all. Sure, that's true, but we can go another, better way.

One file to rule them all

Instead of creating many build.gradle config files, one per module, we can use only the parent's one and make it a bit more juicy. So let us move the-app/dao-layer/build.gradle a level up to the-app/build.gradle and fill it with new statements to achieve full project configuration:
def langLevel = 1.7

allprojects {

    apply plugin: 'idea'

    group = 'com.tamashumi'
    version = '0.1'
}

subprojects {

    apply plugin: 'groovy'

    sourceCompatibility = langLevel
    targetCompatibility = langLevel

    repositories {
        mavenLocal()
        mavenCentral()
    }

    dependencies {
        groovy 'org.codehaus.groovy:groovy-all:2.0.5'
        testCompile 'org.spockframework:spock-core:0.7-groovy-2.0'
    }
}

project(':dao-layer') {

    dependencies {
        compile 'org.hibernate:hibernate-core:4.1.7.Final'
    }
}

project(':domain-model') {

    dependencies {
        compile project(':dao-layer')
    }
}

project(':web-frontend') {

    apply plugin: 'war'

    dependencies {
        compile project(':domain-model')
        compile 'org.springframework:spring-webmvc:3.1.2.RELEASE'
    }
}

idea {
    project {
        jdkName = langLevel
        languageLevel = langLevel
    }
}
At the beginning, a simple variable, langLevel, is declared. It's worth knowing that we can use almost any Groovy code inside a build.gradle file – statements like if conditions, for/while loops, closures, switch-case etc. Quite an advantage over inflexible XML, isn't it?

Next comes the allprojects block. Any configuration placed in it will influence – what a surprise – all projects, so the parent itself and the sub-projects (modules). Inside the block we have the IDE (IntelliJ IDEA) plugin applied, which I wrote more about in a previous article (look under the "IDE Integration" heading). Enough to say that with this plugin applied here, the command gradle idea will generate IDEA's project files with the module structure and dependencies. This works really well, and plugins for other IDEs are available too.
The remaining two lines in this block define the group and version for the project, similar to how it's done in Maven.

After that, the subprojects block appears. It relates to all the modules, but not to the parent project. So here the Groovy language plugin is applied, as all modules are assumed to be written in Groovy.
Below it, the source and target language levels are set.
After that come references to the standard Maven repositories.
At the end of the block are the dependencies on the Groovy version and on a test library – the Spock framework.

The following blocks, project(':module-name'), are responsible for per-module configuration. They may be omitted if the allprojects or subprojects blocks already configure what's necessary for a specific module. In the example, the per-module configuration goes as follows (a single module can also be built on its own, as shown after this list):
  • the dao-layer module has a dependency on an ORM library – Hibernate
  • the domain-model module relies on dao-layer as a dependency; the project keyword is used here again, as a reference to another module
  • web-frontend applies the 'war' plugin, which builds this module into a Java web archive; besides that, it refers to the domain-model module and also uses the Spring MVC framework dependency
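Worth knowing: the tasks of a single module (and its dependencies) can be run from the root directory using the colon-qualified task name, e.g.:

gradle :web-frontend:build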

At the end, in the idea block, there is basic info for the IDE plugin. These are parameters corresponding to IDEA's general project settings, visible in the following screenshot.

[screenshot: IDEA project settings – project SDK and language level]

jdkName should match the IDE's SDK name; otherwise it has to be set manually in the IDE on each (re)generation of IDEA's project files with the gradle idea command.

Is that it?

In the matter of simplicity – yes. That's enough to automate a modular application build with custom configuration per module. Not rocket science, huh? Think about Maven's XML: it would take more effort to set up the same thing, and you'd still end up with a less expressive configuration, quite far from user-friendly.

Check the online user guide for the many configuration possibilities, or better, download Gradle and see the sample projects.
As a tasty bait, take a look at this short selection of available plugins:
  • java
  • groovy
  • scala
  • cpp
  • eclipse
  • netbeans
  • idea
  • maven
  • osgi
  • war
  • ear
  • sonar
  • project-report
  • signing
and more, plus 3rd-party plugins...