Typeclasses in Swift, Haskell and Scala

What is a typeclass?

A typeclass is Haskell’s way of composing types in a world without inheritance. It lets us define a desired behavior as a set of function signatures. The concrete implementation is provided separately for each type that needs it. In other words, it splits the familiar object-oriented encapsulation (data and functionality together) into two separate parts: data and functionality. At the same time, a typeclass defines a contract that we can build upon. It’s like defining the same function multiple times, where the only thing that differs each time is the type in the signature, and then being able to call this function elsewhere without specifying which variant should be used. The compiler picks the right one for us based on the types involved.

Doesn’t it sound like Swift or Objective-C protocols? Well, it does. It’s no surprise, because they’re all fueled by the same basic idea. This is the first and arguably the most important thing to know about typeclasses: although they have the word class in their name, they are more similar to protocols. Actually, they are far enough from classes that the two can coexist independently, orthogonal to each other.

Typeclasses, being protocol cousins, are used in similar fashion: to express a feature that spreads across multiple types. In Haskell there are typeclasses like Eq, expressing that things are equatable, or Ord, expressing that they are sortable, or Functor, expressing that they can be mapped over.

If you’ve seen the WWDC 2015 session “Protocol-Oriented Programming in Swift”, you’re gonna feel at home (or run away screaming, depending on how you liked it). One thing to notice: while in a stricter functional language, namely Haskell, the typeclass is part of the type semantics, in Swift or Scala a typeclass is more of a design pattern. We’re using their native type semantics to achieve similar effects.

Enough with the introduction. Let’s define a typeclass so we can more easily grasp what’s going on.

Things that can be encoded in the Caesar cipher

Have you heard of the Caesar cipher? It is a very basic cryptographic method: we express anything as a string and then we shift each letter by a fixed number of places in the alphabet. So, for a Caesar cipher with a shift of 3, we write D instead of A, E instead of B, F instead of C and so on.
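To make the shift concrete, here is a minimal sketch in current Swift syntax. The helper name caesarShift is made up for illustration, and it assumes the simplest possible scheme: every code point is moved by a fixed offset, with no wrap-around at the end of the alphabet.

```swift
// Hypothetical helper: shift every Unicode scalar of the input by a fixed offset.
// There is no wrap-around, so letters near the end of the alphabet turn into symbols.
func caesarShift(_ text: String, by offset: UInt32 = 3) -> String {
    var result = ""
    for scalar in text.unicodeScalars {
        // Unicode.Scalar(UInt32) is failable; keep the original scalar
        // if the shifted value is not a valid code point.
        let shifted = Unicode.Scalar(scalar.value + offset) ?? scalar
        result.unicodeScalars.append(shifted)
    }
    return result
}

print(caesarShift("ABC"))  // DEF
```

With the default shift of 3, "xyz" comes out as "{|}", which is exactly the kind of %$# surprise that a real implementation would avoid by wrapping around.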

Our typeclass is gonna describe the ability to be expressed in Caesar cipher form. It’s gonna be based on the position of each character in the ASCII table. For the sake of simplicity I’ll ignore the fact that in the ASCII table there are some special characters after the last letters of the alphabet. No one is actually sending messages to Roman legions anymore, so no one is gonna get surprised by some %$#.

Here is the Ceasarable typeclass defined in Haskell:

class Ceasarable c where 
    toCeasar :: c -> String

Just to mess with object-oriented minds, it uses the keyword class to kick off its definition. Then it declares one function signature: toCeasar. The function takes one argument of any type c and returns a String, presumably with the cipher shift applied. This is our desired behavior. It must be implemented (with the actual type instead of c) by the typeclass instances.

What is it gonna look like in Scala? We’re gonna use Scala’s type semantics. The most obvious way is to use a trait:

trait Ceasarable[T] {
    def toCeasar(input: T): String
}

The translation is straightforward. The type variable c from Haskell becomes the generic parameter T in Scala. The signature is the same.

In Swift the closest thing to traits/interfaces are protocols, so let’s search no further:

protocol Ceasarable {
    typealias T
    static func toCeasar(input: T) -> String
}

Apart from minor syntax differences, like an associated type instead of a generic parameter, it’s the same as in Scala. Not surprising, as those two languages share a lot of similarities (and I mean like, a lot). One thing to notice is the use of a static method. Why is it static? Because we want to emulate the split between data and behavior. If the method is not static, then it can use the instance data and there is (in our simple case) no need to pass input at all. An instance method declaration would make the typeclass a little more object-oriented, and there’s nothing wrong with that, but for now let’s stick to the original idea. We’re providing the implementation for a type, not for an instance.
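For comparison, the instance-method flavour might look roughly like the sketch below. It is written in current Swift syntax rather than the Swift 2.0 of the listings here, and the names InstanceCeasarable and Message are invented for illustration.

```swift
// Hypothetical instance-method variant: the value encodes itself,
// so no separate input parameter is needed.
protocol InstanceCeasarable {
    func toCeasar() -> String
}

// Illustrative type that carries its own data.
struct Message: InstanceCeasarable {
    let text: String

    func toCeasar() -> String {
        var result = ""
        for scalar in text.unicodeScalars {
            // Shift the code point by 3; keep the original scalar if the result is invalid.
            result.unicodeScalars.append(Unicode.Scalar(scalar.value + 3) ?? scalar)
        }
        return result
    }
}

print(Message(text: "ABC").toCeasar())  // DEF
```

Here data and behavior live together again, which is the familiar object-oriented arrangement that the static version deliberately avoids.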

When I grow up I wanna be a typeclass!

Once we’ve defined what we expect, it’d be nice to provide some actual implementations for chosen types. This way we’d be able to use the behavior. Let’s choose two simple types to work with: strings and integers. In Haskell, the implementation is provided by defining the concrete typeclass instance:

{-# LANGUAGE TypeSynonymInstances, FlexibleInstances #-}
    
instance Ceasarable String where
    toCeasar x = [toEnum $ fromEnum c + 3 | c <- x]
    
instance Ceasarable Integer where
    toCeasar x = [toEnum $ fromEnum c + 3 | c <- show x]

The keyword instance means that an implementation is coming. For String, we map over the characters, get their ASCII codes with fromEnum, add three and then turn them back into characters with toEnum. For Integer we just express it as a String using show and do exactly the same.

In Scala things get a little weird, so feel free to skip over the details. The Ceasarable behavior is enclosed in the object and marked as implicit. This way it can be implicitly passed to the place we want to use it:

object Ceasarable {
    implicit object CeasarableInt extends Ceasarable[Int] {
        override def toCeasar(input: Int): String = {
            s"$input".map(_ + 3).map(_.toChar).mkString
        }
    }
   
    implicit object CeasarableString extends Ceasarable[String] {
        override def toCeasar(input: String): String = {
            input.map(_ + 3).map(_.toChar).mkString
        }
    }
}

The object scope and implicit passing are Scala peculiarities, so there’s no need to dive deeper into them. If you really want to, look here. What matters is that we’ve created separate objects with Ceasarable implementations for the String and Int types and we enclosed them in static-like objects (object is as close as you can get to static in Scala). Those types know nothing about their ability to be expressed in the Caesar cipher.

Let’s try the same approach in Swift:

struct CeasarableInt : Ceasarable {
    typealias T = Int
    static func toCeasar(input: Int) -> String {
        return "\(input)".unicodeScalars.reduce("") {
            (acc, char) in
            return acc + String(UnicodeScalar(char.value + 3))
        }
    }
}

struct CeasarableString : Ceasarable {
    typealias T = String
    static func toCeasar(input: String) -> String {
        return input.unicodeScalars.reduce("") { 
            (acc, char) in
            return acc + String(UnicodeScalar(char.value + 3))
        }
    }
}

Looks valid. This way we’ve defined the ability to be encoded in the Caesar cipher for strings and integers in each language we consider.

Now we can work with Ceasarable objects just as with any other group of objects sharing common characteristics, i.e. type. We can declare that we expect it as function parameter, we can return it in a function result and so on. Let’s see the example usages. In Haskell:

encodeInCeasar :: (Ceasarable c) => c -> String
encodeInCeasar = toCeasar
    
encodeInCeasar 1234 -- "4567"
    
encodeInCeasar "ABCabc" -- "DEFdef"

We are using the typeclass just like we’d use a protocol: to define a contract without explicitly stating which types are gonna conform to it.

How about Scala?

def encodeInCeasar[T: Ceasarable](c: T) = {
    val encoded = implicitly[Ceasarable[T]].toCeasar(c)
    println(encoded)
}

encodeInCeasar(1234) // "4567"

encodeInCeasar("ABCabc") // "DEFdef"

The sky, once again, gets a little bit cloudy. Instead of requiring protocol conformance, we’re explicitly asking for the proper implementation using implicitly. For implicitly to find it, the implementation has to be passed into the function, and that happens via a mechanism called a context bound. T: Ceasarable is the syntax for a context bound: it makes the compiler hand the matching Ceasarable[T] to the function as a hidden implicit parameter. It might sound confusing, but it’s fine, actually. This way we can easily see that we’re using a typeclass. In Swift, however, we encounter a problem:

func encodeInCeasar<C : Ceasarable>(c: C.T) -> String {
    return C.toCeasar(c)
}

encodeInCeasar(1234) // Compiler error: Cannot invoke 'encodeInCeasar' with an argument list of type '(Int)'

The Swift compiler cannot infer the generic parameter. There is a struct that does exactly what we want: it conforms to Ceasarable and defines Int as its associated type. However, it cannot be found automatically, because Swift doesn’t have semantics for a Scala-like context bound. Still, we’ve got the second best thing… Wait! It’s actually the first best thing, now that Swift is at 2.0: protocol extensions.
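As an aside, the inference problem can be sidestepped by naming the implementation by hand, which is roughly what Scala’s implicit machinery does for us behind the scenes. The sketch below is one possible workaround, not something from the original listings: it uses current Swift syntax (associatedtype instead of the Swift 2.0 typealias), and the extra using parameter is an assumption made for illustration.

```swift
protocol Ceasarable {
    associatedtype T
    static func toCeasar(input: T) -> String
}

struct CeasarableInt: Ceasarable {
    typealias T = Int

    static func toCeasar(input: Int) -> String {
        var result = ""
        for scalar in String(input).unicodeScalars {
            // Shift the code point by 3; keep the original scalar if the result is invalid.
            result.unicodeScalars.append(Unicode.Scalar(scalar.value + 3) ?? scalar)
        }
        return result
    }
}

// Hypothetical signature: the `using` argument tells the compiler which
// implementation to pick, so C no longer has to be guessed from C.T alone.
func encodeInCeasar<C: Ceasarable>(_ value: C.T, using implementation: C.Type) -> String {
    return C.toCeasar(input: value)
}

print(encodeInCeasar(1234, using: CeasarableInt.self))  // 4567
```

It works, but every call site now has to know the name of the implementing struct, which is exactly the bookkeeping a typeclass is supposed to hide.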

Swift typeclasses defined with protocol extensions

In Swift we can use the extension keyword to provide implementations for already existing types. The beauty of extensions lies in two properties: universality and the ability to be constrained. By universality I mean that you can extend all kinds of Swift types: protocols, classes, structs and enums. The ability to be constrained lets us express what we want to extend in great detail, greater than allowed by protocol conformance or class inheritance alone.
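Both properties can be seen in one small sketch. It extends a standard-library protocol (universality) but only for sequences whose elements are strings (constraint). The method name caesarEncoded is hypothetical, and the syntax is that of current Swift.

```swift
// A constrained protocol extension: every Sequence of String gets the method,
// while sequences of other element types are left untouched.
extension Sequence where Element == String {
    // Hypothetical method: shift each string's code points by 3.
    func caesarEncoded() -> [String] {
        return map { text in
            var result = ""
            for scalar in text.unicodeScalars {
                // Keep the original scalar if the shifted value is not a valid code point.
                result.unicodeScalars.append(Unicode.Scalar(scalar.value + 3) ?? scalar)
            }
            return result
        }
    }
}

print(["ABC", "abc"].caesarEncoded())  // ["DEF", "def"]
```

An array of integers such as [1, 2, 3] does not get caesarEncoded() at all, because its element type does not satisfy the where clause.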

Did I mention that if you’ve watched “Protocol-Oriented Programming in Swift” you’ll feel at home? Our better implementation of typeclasses starts with a slight change to the Ceasarable definition:

protocol Ceasarable {
    static func toCeasar(input: Self) -> String
}

Instead of requiring the associated type in protocol, we can add a Self requirement. This way we’re expressing that for whatever type we’re providing the typeclass implementation, it requires the value of that type as the parameter. It stays closer to the original Haskell definition, because the typeclass doesn’t need to be generic. It is just like a template for multiple function definitions that differ only by the type in signature. Self expresses exactly that. There is also another way of expressing the same idea: see this article on how to do it using Swift 1.2 (spoiler alert: <C: Ceasarable where C.T == C>), but for Swift 2.0 the most straightforward way is with Self. The actual implementations become easier to write and more readable:

extension Int : Ceasarable {
    static func toCeasar(input: Int) -> String {
        return "\(input)".unicodeScalars.reduce("") {
            (acc, char) in
            return acc + String(UnicodeScalar(char.value + 3))
        }
    }
}

extension String : Ceasarable {
    static func toCeasar(input: String) -> String {
        return input.unicodeScalars.reduce("") { 
            (acc, char) in
            return acc + String(UnicodeScalar(char.value + 3))
        }
    }
}

It looks like straightforward protocol conformance and it’s just what we need. With that in place, the usage gets simpler as well:

func encodeInCeasar<T : Ceasarable>(c: T) -> String {
    return T.toCeasar(c)
}

encodeInCeasar(1234) // "4567"

encodeInCeasar("ABCabc") // "DEFdef"

This is what we tried to achieve. At the same time we’re providing behavior separate from data (since it’s static method of T) and expressing the common functionality (since T must be Ceasarable). By using protocol extensions, we’ve enabled the second dimension, somewhat orthogonal to inheritance, in which we can compose our functionalities.
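For readers on a current Swift toolchain, the same pattern might be sketched as follows. The listings in this article target Swift 2.0; in later versions the input: label is required at the call site and creating a scalar from a number returns an optional. The shared helper shiftScalars is an addition made here to avoid repeating the loop, not part of the original code.

```swift
protocol Ceasarable {
    static func toCeasar(input: Self) -> String
}

// Hypothetical shared helper: shift every code point by 3, without wrap-around.
func shiftScalars(_ text: String) -> String {
    var result = ""
    for scalar in text.unicodeScalars {
        // Keep the original scalar if the shifted value is not a valid code point.
        result.unicodeScalars.append(Unicode.Scalar(scalar.value + 3) ?? scalar)
    }
    return result
}

// Typeclass "instances": existing types gain the behavior via extensions.
extension Int: Ceasarable {
    static func toCeasar(input: Int) -> String {
        return shiftScalars(String(input))
    }
}

extension String: Ceasarable {
    static func toCeasar(input: String) -> String {
        return shiftScalars(input)
    }
}

// The generic function relies only on the contract, not on concrete types.
func encodeInCeasar<T: Ceasarable>(_ c: T) -> String {
    return T.toCeasar(input: c)
}

print(encodeInCeasar(1234))      // 4567
print(encodeInCeasar("ABCabc"))  // DEFdef
```

Adding another type to the typeclass is then just one more extension; encodeInCeasar itself stays untouched.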

What are Swift typeclasses, then?

A typeclass in Swift is a pattern built using protocols and extensions. It’s simple and there’s nothing new, really, as we’ve already been using those concepts extensively. As a side note, the process of learning functional programming is very often like that: concepts we’ve used for a long time, but differently named, generalized and ready to build upon.

Typeclasses are a way of providing behavior for a type separately from the type itself while at the same time defining a contract that the type conforms to. They can be used to add functionality and build composition without inheritance.
