Hamming Error Correction with Kotlin – part 2

In this article, we continue where we left off and focus solely on error detection for Hamming codes.

https://touk.pl/blog/2017/10/17/hamming-error-correction-with-kotlin-part-1/

Error Correction

Utilizing Hamming(7,4) encoding allows us to detect double-bit errors and even correct single-bit ones!

During the encoding, we only add parity bits, so the happy path decoding scenario involves stripping the message from the parity bits which reside at known indexes (1,2,4…n, 2n):

fun stripHammingMetadata(input: EncodedString): BinaryString {
    return input.value.asSequence()
      .filterIndexed { i, _ -> (i + 1).isPowerOfTwo().not() }
      .joinToString("")
      .let(::BinaryString)
}

This is rarely the case because since we made effort to calculate parity bits, we want to leverage them first.

The codeword validation is quite intuitive if you already understand the encoding process. We simply need to recalculate all parity bits and do the parity check (check if those values match what’s in the message):

private fun indexesOfInvalidParityBits(input: EncodedString): List<Int> {
    fun toValidationResult(it: Int, input: EncodedString): Pair<Int, Boolean> =
      helper.parityIndicesSequence(it - 1, input.length)
        .map { v -> input[v].toBinaryInt() }
        .fold(input[it - 1].toBinaryInt()) { a, b -> a xor b }
        .let { r -> it to (r == 0) }

    return generateSequence(1) { it * 2 }
      .takeWhile { it < input.length }
      .map { toValidationResult(it, input) }
      .filter { !it.second }
      .map { it.first }
      .toList()
}

If they all match, then the codeword does not contain any errors:

override fun isValid(codeWord: EncodedString) =
  indexesOfInvalidParityBits(input).isEmpty()

Now, when we already know if the message was transmitted incorrectly, we can request the sender to retransmit the message… or try to correct it ourselves.

Finding the distorted bit is as easy as summing the indexes of invalid parity bits – the result is the index of the faulty one. In order to correct the message, we can simply flip the bit:

override fun decode(codeWord: EncodedString): BinaryString =
  indexesOfInvalidParityBits(codeWord).let { result ->
      when (result.isEmpty()) {
          true -> codeWord
          false -> codeWord.withBitFlippedAt(result.sum() - 1)
      }.let { extractor.stripHammingMetadata(it) }
  }

We flip the bit using an extension:

private fun EncodedString.withBitFlippedAt(index: Int) = this[index].toString().toInt()
  .let { this.value.replaceRange(index, index + 1, ((it + 1) % 2).toString()) }
  .let(::EncodedString)

We can see that it works by writing a home-made property test:

@Test
fun shouldEncodeAndDecodeWithSingleBitErrors() = repeat(10000) {
    randomMessage().let {
        assertThat(it).isEqualTo(decoder.decode(encoder.encode(it)
          .withBitFlippedAt(rand.nextInt(it.length))))
    }
}

Unfortunately, the Hamming (7,4) does not distinguish between codewords containing one or two distorted bits. If you try to correct the two-bit error, the result will be incorrect.

Disappointing, right? This is what drove the decision to make use of an additional parity bit and create the Hamming (8,4).

Conclusion

We’ve seen how the error correction for Hamming codes look like and went through the extensive off-by-one-error workout.

Code snippets can be found on GitHub.

You May Also Like

Spock basics

Spock (homepage) is like its authors say 'testing and specification framework'. Spock combines very elegant and natural syntax with the powerful capabilities. And what is most important it is easy to use.

One note at the very beginning: I assume that you are already familiar with principles of Test Driven Development and you know how to use testing framework like for example JUnit.

So how can I start?


Writing spock specifications is very easy. We need basic configuration of Spock and Groovy dependencies (if you are using mavenized project with Eclipse look to my previous post: Spock, Java and Maven). Once we have everything set up and running smooth we can write our first specs (spec or specification is equivalent for test class in other frameworks like JUnit of TestNG).

What is great with Spock is fact that we can use it to test both Groovy projects and pure Java projects or even mixed projects.


Let's go!


Every spec class must inherit from spock.lang.Specification class. Only then test runner will recognize it as test class and start tests. We will write few specs for this simple class: User class and few tests not connected with this particular class.

We start with defining our class:
import spock.lang.*

class UserSpec extends Specification {

}
Now we can proceed to defining test fixtures and test methods.

All activites we want to perform before each test method, are to be put in def setup() {...} method and everything we want to be run after each test should be put in def cleanup() {...} method (they are equivalents for JUnit methods with @Before and @After annotations).

It can look like this:
class UserSpec extends Specification {
User user
Document document

def setup() {
user = new User()
document = DocumentTestFactory.createDocumentWithTitle("doc1")
}

def cleanup() {

}
}
Of course we can use field initialization for instantiating test objects:
class UserSpec extends Specification {
User user = new User()
Document document = DocumentTestFactory.createDocumentWithTitle("doc1")

def setup() {

}

def cleanup() {

}
}

What is more readable or preferred? It is just a matter of taste because according to Spock docs behaviour is the same in these two cases.

It is worth mentioning that JUnit @BeforeClass/@AfterClass are also present in Spock as def setupSpec() {...} and def cleanupSpec() {...}. They will be runned before first test and after last test method.


First tests


In Spock every method in specification class, expect setup/cleanup, is treated by runner as a test method (unless you annotate it with @Ignore).

Very interesting feature of Spock and Groovy is ability to name methods with full sentences just like regular strings:
class UserSpec extends Specification {
// ...

def "should assign coment to user"() {
// ...
}
}
With such naming convention we can write real specification and include details about specified behaviour in method name, what is very convenient when reading test reports and analyzing errors.

Test method (also called feature method) is logically divided into few blocks, each with its own purpose. Blocks are defined like labels in Java (but they are transformed with Groovy AST transform features) and some of them must be put in code in specific order.

Most basic and common schema for Spock test is:
class UserSpec extends Specification {
// ...

def "should assign coment to user"() {
given:
// do initialization of test objects
when:
// perform actions to be tested
then:
// collect and analyze results
}
}

But there are more blocks like:
  • setup
  • expect
  • where
  • cleanup
In next section I am going to describe each block shortly with little examples.

given block

This block is used to setup test objects and their state. It has to be first block in test and cannot be repeated. Below is little example how can it be used:
class UserSpec extends Specification {
// ...

def "should add project to user and mark user as project's owner"() {
given:
User user = new User()
Project project = ProjectTestFactory.createProjectWithName("simple project")
// ...
}
}

In this code given block contains initialization of test objects and nothing more. We create simple user without any specified attributes and project with given name. In case when some of these objects could be reused in more feature methods, it could be worth putting initialization in setup method.

when and then blocks

When block contains action we want to test (Spock documentation calls it 'stimulus'). This block always occurs in pair with then block, where we are verifying response for satisfying certain conditions. Assume we have this simple test case:
class UserSpec extends Specification {
// ...

def "should assign user to comment when adding comment to user"() {
given:
User user = new User()
Comment comment = new Comment()
when:
user.addComment(comment)
then:
comment.getUserWhoCreatedComment().equals(user)
}

// ...
}

In when block there is a call of tested method and nothing more. After we are sure our action was performed, we can check for desired conditions in then block.

Then block is very well structured and its every line is treated by Spock as boolean statement. That means, Spock expects that we write instructions containing comparisons and expressions returning true or false, so we can create then block with such statements:
user.getName() == "John"
user.getAge() == 40
!user.isEnabled()
Each of lines will be treated as single assertion and will be evaluated by Spock.

Sometimes we expect that our method throws an exception under given circumstances. We can write test for it with use of thrown method:
class CommentSpec extends Specification {
def "should throw exception when adding null document to comment"() {
given:
Comment comment = new Comment()
when:
comment.setCommentedDocument(null)
then:
thrown(RuntimeException)
}
}

In this test we want to make sure that passing incorrect parameters is correctly handled by tested method and that method throws an exception in response. In case you want to be certain that method does not throw particular exception, simply use notThrown method.


expect block

Expect block is primarily used when we do not want to separate when and then blocks because it is unnatural. It is especially useful for simple test (and according to TDD rules all test should be simple and short) with only one condition to check, like in this example (it is simple but should show the idea):
def "should create user with given name"() {
given:
User user = UserTestFactory.createUser("john doe")
expect:
user.getName() == "john doe"
}



More blocks!


That were very simple tests with standard Spock test layout and canonical divide into given/when/then parts. But Spock offers more possibilities in writing tests and provides more blocks.


setup/cleanup blocks

These two blocks have the very same functionality as the def setup and def cleanup methods in specification. They allow to perform some actions before test and after test. But unlike these methods (which are shared between all tests) blocks work only in methods they are defined in. 


where - easy way to create readable parameterized tests

Very often when we create unit tests there is a need to "feed" them with sample data to test various cases and border values. With Spock this task is very easy and straighforward. To provide test data to feature method, we need to use where block. Let's take a look at little the piece of code:

def "should successfully validate emails with valid syntax"() {
expect:
emailValidator.validate(email) == true
where:
email }

In this example, Spock creates variable called email which is used when calling method being tested. Internally feature method is called once, but framework iterates over given values and calls expect/when block as many times as there are values (however, if we use @Unroll annotation Spock can create separate run for each of given values, more about it in one of next examples).

Now, lets assume that we want our feature method to test both successful and failure validations. To achieve that goal we can create few 
parameterized variables for both input parameter and expected result. Here is a little example:

def "should perform validation of email addresses"() {
expect:
emailValidator.validate(email) == result
where:
email result }
Well, it looks nice, but Spock can do much better. It offers tabular format of defining parameters for test what is much more readable and natural. Lets take a look:
def "should perform validation of email addresses"() {
expect:
emailValidator.validate(email) == result
where:
email | result
"WTF" | false
"@domain" | false
"foo@bar.com" | true
"a@test" | false
}
In this code, each column of our "table" is treated as a separate variable and rows are values for subsequent test iterations.

Another useful feature of Spock during parameterizing test is its ability to "unroll" each parameterized test. Feature method from previous example could be defined as (the body stays the same, so I do not repeat it):
@Unroll("should validate email #email")
def "should perform validation of email addresses"() {
// ...
}
With that annotation, Spock generate few methods each with its own name and run them separately. We can use symbols from where blocks in @Unroll argument by preceding it with '#' sign what is a signal to Spock to use it in generated method name.


What next?


Well, that was just quick and short journey  through Spock and its capabilities. However, with that basic tutorial you are ready to write many unit tests. In one of my future posts I am going to describe more features of Spock focusing especially on its mocking abilities.

[:en] Operational problems with Zookeeper

This post is a summary of what has been presented by Kathleen Ting on StrangeLoop conference. You can watch the original here: http://www.infoq.com/presentations/Misconfiguration-ZooKeeper I've decided to put this selection here for quick reference. ...This post is a summary of what has been presented by Kathleen Ting on StrangeLoop conference. You can watch the original here: http://www.infoq.com/presentations/Misconfiguration-ZooKeeper I've decided to put this selection here for quick reference. ...

33rd Degree day 1 review

33rd Degree is over. After the one last year, my expectations were very high, but Grzegorz Duda once again proved he's more than able to deliver. With up to five tracks (most of the time: four presentations + one workshop), and ~650 attendees,  there was a lot to see and a lot to do, thus everyone will probably have a little bit different story to tell. Here is mine.

Twitter: From Ruby on Rails to the JVM

Raffi Krikorian talking about Twitter and JVM
The conference started with  Raffi Krikorian from Twitter, talking about their use for JVM. Twitter was build with Ruby but with their performance management a lot of the backend was moved to Scala, Java and Closure. Raffi noted, that for Ruby programmers Scala was easier to grasp than Java, more natural, which is quite interesting considering how many PHP guys move to Ruby these days because of the same reasons. Perhaps the path of learning Jacek Laskowski once described (Java -> Groovy -> Scala/Closure) may be on par with PHP -> Ruby -> Scala. It definitely feels like Scala is the holy grail of languages these days.

Raffi also noted, that while JVM delivered speed and a concurrency model to Twitter stack, it wasn't enough, and they've build/customized their own Garbage Collector. My guess is that Scala/Closure could also be used because of a nice concurrency solutions (STM, immutables and so on).

Raffi pointed out, that with the scale of Twitter, you easily get 3 million hits per second, and that means you probably have 3 edge cases every second. I'd love to learn listen to lessons they've learned from this.

 

Complexity of Complexity


The second keynote of the first day, was Ken Sipe talking about complexity. He made a good point that there is a difference between complex and complicated, and that we often recognize things as complex only because we are less familiar with them. This goes more interesting the moment you realize that the shift in last 20 years of computer languages, from the "Less is more" paradigm (think Java, ASM) to "More is better" (Groovy/Scala/Closure), where you have more complex language, with more powerful and less verbose syntax, that is actually not more complicated, it just looks less familiar.

So while 10 years ago, I really liked Java as a general purpose language for it's small set of rules that could get you everywhere, it turned out that to do most of the real world stuff, a lot of code had to be written. The situation got better thanks to libraries/frameworks and so on, but it's just patching. New languages have a lot of stuff build into, which makes their set of rules and syntax much more complex, but once you get familiar, the real world usage is simple, faster, better, with less traps laying around, waiting for you to fall.

Ken also pointed out, that while Entity Service Bus looks really simple on diagrams, it's usually very difficult and complicated to use from the perspective of the programmer. And that's probably why it gets chosen so often - the guys selling/buying it, look no deeper than on the diagram.

 

Pointy haired bosses and pragmatic programmers: Facts and Fallacies of Software Development

Venkat Subramaniam with Dima
Dima got lucky. Or maybe not.

Venkat Subramaniam is the kind of a speaker that talk about very simple things in a way, which makes everyone either laugh or reflect. Yes, he is a showman, but hey, that's actually good, because even if you know the subject quite well, his talks are still very entertaining.
This talk was very generic (here's my thesis: the longer the title, the more generic the talk will be), interesting and fun, but at the end I'm unable to see anything new I'd have learned, apart from the distinction between Dynamic vs Static and Strong vs Weak typing, which I've seen the last year, but managed to forgot. This may be a very interesting argument for all those who are afraid of Groovy/Ruby, after bad experience with PHP or Perl.

Build Trust in Your Build to Deployment Flow!


Frederic Simon talked about DevOps and deployment, and that was a miss in my  schedule, because of two reasons. First, the talk was aimed at DevOps specifically, and while the subject is trendy lately, without big-scale problems, deployment is a process I usually set up and forget about. It just works, mostly because I only have to deal with one (current) project at a time. 
Not much love for Dart.
Second, while Frederic has a fabulous accent and a nice, loud voice, he tends to start each sentence loud and fade the sound at the end. This, together with mics failing him badly, made half of the presentation hard to grasp unless you were sitting in the first row.
I'm not saying the presentation was bad, far from it, it just clearly wasn't for me.
I've left a few minutes before the end, to see how many people came to Dart presentation by Mike West. I was kind of interested, since I'm following Warsaw Google Technology User Group and heard a few voices about why I should pay attentions to that new Google language. As you can see from the picture on the right, the majority tends to disagree with that opinion.

 

Non blocking, composable reactive web programming with Iteratees

Sadek Drobi's talk about Iteratees in Play 2.0 was very refreshing. Perhaps because I've never used Play before, but the presentation was flawless, with well explained problems, concepts and solutions.
Sadek started with a reflection on how much CPU we waste waiting for IO in web development, then moved to Play's Iteratees, to explain the concept and implementation, which while very different from the that overused Request/Servlet model, looked really nice and simple. I'm not sure though, how much the problem is present when you have a simple service, serving static content before your app server. Think apache (and faster) before tomcat. That won't fix the upload/download issue though, which is beautifully solved in Play 2.0

The Future of the Java Platform: Java SE 8 & Beyond


Simon Ritter is an intriguing fellow. If you take a glance at his work history (AT&T UNIX System Labs -> Novell -> Sun -> Oracle), you can easily see, he's a heavy weight player.
His presentation was rich in content, no corpo-bullshit. He started with a bit of history of JCP and how it looks like right now, then moved to the most interesting stuff, changes. Now I could give you a summary here, but there is really no point: you'd be much better taking look at the slides. There are only 48 of them, but everything is self-explanatory.
While I'm very disappointed with the speed of changes, especially when compared to the C# world, I'm glad with the direction and the fact that they finally want to BREAK the compatibility with the broken stuff (generics, etc.).  Moving to other languages I guess I won't be the one to scream "My god, finally!" somewhere in 2017, though. All the changes together look very promising, it's just that I'd like to have them like... now? Next year max, not near the heat death of the universe.

Simon also revealed one of the great mysteries of Java, to me:
The original idea behind JNI was to make it hard to write, to discourage people form using it.
On a side note, did you know Tegra3 has actually 5 cores? You use 4 of them, and then switch to the other one, when you battery gets low.

BOF: Spring and CloudFoundry


Having most of my folks moved to see "Typesafe stack 2.0" fabulously organized by Rafał Wasilewski and  Wojtek Erbetowski (with both of whom I had a pleasure to travel to the conference) and knowing it will be recorded, I've decided to see what Josh Long has to say about CloudFoundry, a subject I find very intriguing after the de facto fiasco of Google App Engine.

The audience was small but vibrant, mostly users of Amazon EC2, and while it turned out that Josh didn't have much, with pricing and details not yet public, the fact that Spring Source has already created their own competition (Could Foundry is both an Open Source app and a service), takes a lot from my anxiety.

For the review of the second day of the conference, go here.