TouK Nussknacker – using Apache Flink made easier for analysts and business

Few weeks ago we (TouK) revealed on Github our latest open source project – Nussknacker. What is it and why should you care?

Why?

First, some history: more than year ago one of our clients decided it’s high time for Real Time Marketing. They had pretty large data streams and lots of ideas for marketing campaigns, so one of the key success factors was the ability to process the data fast. We prepared a POC based on Apache Flink and it turned out that it’s really great piece of technology – fast and accurate.

There was just one problem – how to write and customize processes? Flink, as many modern stream processing engines, provides rich and friendly DSL, both in Java and Scala. But for our client that was not really enough. See, they are enterprise that do not employ developers – they have many decent, competent analysts who know the business and SQL – but not Java or (Heaven forbid…) Scala.

Nussknacker to the rescue!

So we decided to create a simple process designer for them. It looks more or less like this:

You draw a diagram, fill in the details (like filter expressions – “#input.usageData > 0” or SMS/mail content), then you press the Deploy button and voilà – your brand new process in running on Flink cluster!

Of course first somebody (that is, developer) has to prepare data sources (most probably Kafka topics), design data model (POJOs or case classes) and implement actions – like sending emails or sending some events to another Kafka topic. But once a model of data and external services are defined, an analyst can define and deploy processes all by him/herself.

More features

Sounds a bit scary to let your users run stream processes with GUI? Bad filter conditions can have serious performance implications if you deal with streams of tens of thousands of events per second… That’s why we let user test their diagrams first – each test case can be generated by sample of real data coming from e.g. Kafka and then run in Flink sandbox mini-cluster.

We have also many more features that make working with Nussknacker and Flink easier: subprocesses, versioning, generating PDF documentation, migration between environments and last but certainly not least – integration with InfluxDB/Grafana to provide detailed insight into how process is doing:

Where can I use it?

What are Nussknacker use cases? Our main deployments deal with RTM (Real Time Marketing). One of our clients started with RTM and then found out that Nussknacker is also great choice for fraud detection in real time. Industries we are working with include telcos, banks and media companies. We are also thinking about other possibilities – for example IoT.

Sounds cool?

If you are interested in easy access for semi-technical analysts to streaming data – give Nussknacker a try!

You can find the code at Github: https://github.com/touk/nussknacker, we also have a nice, Docker-based quickstart: https://touk.github.io/nussknacker/Quickstart.html.

And if you are coming to Flink Forward next week in Berlin – join me on Wednesday afternoon to hear more – <a href="https://berlin.flink-forward.org/kb_sessions/touk-nussknacker-creating-flink-jobs-with-gui/"https://berlin.flink-forward.org/kb_sessions/touk-nussknacker-creating-flink-jobs-with-gui/.

In next days/weeks we’ll post more information on TouK blog both on Nussknacker architecture and internals and on interesting use cases – stay tuned :)

You May Also Like

Private fields and methods are not private in groovy

I used to code in Java before I met groovy. Like most of you, groovy attracted me with many enhancements. This was to my surprise to discover that method visibility in groovy is handled different than Java!

Consider this example:

class Person {
private String name
public String surname

private Person() {}

private String signature() { "${name?.substring(0, 1)}. $surname" }

public String toString() { "I am $name $surname" }
}

How is this class interpreted with Java?

  1. Person has private constructor that cannot be accessed
  2. Field "name" is private and cannot be accessed
  3. Method signature() is private and cannot be accessed

Let's see how groovy interpretes Person:

public static void main(String[] args) {
def person = new Person() // constructor is private - compilation error in Java
println(person.toString())

person.@name = 'Mike' // access name field directly - compilation error in Java
println(person.toString())

person.name = 'John' // there is a setter generated by groovy
println(person.toString())

person.@surname = 'Foo' // access surname field directly
println(person.toString())

person.surname = 'Bar' // access auto-generated setter
println(person.toString())

println(person.signature()) // call private method - compilation error in Java
}

I was really astonished by its output:

I am null null
I am Mike null
I am John null
I am John Foo
I am John Bar
J. Bar

As you can see, groovy does not follow visibility directives at all! It treats them as non-existing. Code compiles and executes fine. It's contrary to Java. In Java this code has several errors, pointed out in comments.

I've searched a bit on this topic and it seems that this behaviour is known since version 1.1 and there is a bug report on that: http://jira.codehaus.org/browse/GROOVY-1875. It is not resolved even with groovy 2 release. As Tim Yates mentioned in this Stackoverflow question: "It's not clear if it is a bug or by design". Groovy treats visibility keywords as a hint for a programmer.

I need to keep that lesson in mind next time I want to make some field or method private!

Spring Security by example: OpenID (login via gmail)

This is a part of a simple Spring Security tutorial: 1. Set up and form authentication 2. User in the backend (getting logged user, authentication, testing) 3. Securing web resources 4. Securing methods 5. OpenID (login via gmail) 6. OAuth2 (login v...This is a part of a simple Spring Security tutorial: 1. Set up and form authentication 2. User in the backend (getting logged user, authentication, testing) 3. Securing web resources 4. Securing methods 5. OpenID (login via gmail) 6. OAuth2 (login v...