Hamming Error Correction with Kotlin – part 2

In this article, we continue where we left off and focus on detecting and correcting errors in Hamming codes.

https://touk.pl/blog/2017/10/17/hamming-error-correction-with-kotlin-part-1/

Error Correction

Hamming(7,4) encoding lets us detect two-bit errors or correct single-bit ones (although, as we will see at the end, not both at the same time).

During encoding we only add parity bits, so the happy-path decoding scenario boils down to stripping the parity bits from the message. They reside at known 1-based positions, the powers of two (1, 2, 4, …, n, 2n):

fun stripHammingMetadata(input: EncodedString): BinaryString {
    return input.value.asSequence()
      .filterIndexed { i, _ -> (i + 1).isPowerOfTwo().not() }
      .joinToString("")
      .let(::BinaryString)
}
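To see the stripping step in isolation, here is a minimal sketch that works on a plain String instead of the article's EncodedString and BinaryString wrappers. The isPowerOfTwo() helper below is an assumption about what the one from part 1 does, and stripParityBits is a hypothetical name used only for this illustration.

```kotlin
// Assumed implementation: a positive Int is a power of two
// when it has exactly one bit set.
fun Int.isPowerOfTwo(): Boolean = this > 0 && (this and (this - 1)) == 0

// Plain-String stand-in for stripHammingMetadata (hypothetical name).
// Indexes are 0-based, Hamming positions are 1-based, hence the i + 1.
fun stripParityBits(codeword: String): String =
    codeword.filterIndexed { i, _ -> !(i + 1).isPowerOfTwo() }
```

For the codeword "0110011", positions 1, 2 and 4 hold parity bits, so stripParityBits returns the data bits at positions 3, 5, 6 and 7, which is "1011".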

Stripping alone is rarely enough, though: since we made the effort to calculate the parity bits, we want to make use of them first.

The codeword validation is quite intuitive if you already understand the encoding process. We simply need to recalculate all parity bits and do the parity check (check if those values match what’s in the message):

private fun indexesOfInvalidParityBits(input: EncodedString): List<Int> {
    fun toValidationResult(it: Int, input: EncodedString): Pair<Int, Boolean> =
      helper.parityIndicesSequence(it - 1, input.length)
        .map { v -> input[v].toBinaryInt() }
        .fold(input[it - 1].toBinaryInt()) { a, b -> a xor b }
        .let { r -> it to (r == 0) }

    return generateSequence(1) { it * 2 }
      .takeWhile { it < input.length }
      .map { toValidationResult(it, input) }
      .filter { !it.second }
      .map { it.first }
      .toList()
}
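The same check can be sketched without the helper from part 1. The version below assumes even parity and the standard layout, in which the parity bit at 1-based position p covers every position whose binary representation has bit p set. It works on a plain String, and invalidParityPositions is a hypothetical name for this illustration.

```kotlin
// Returns the 1-based positions of the parity bits that fail the check.
// Assumes even parity and the standard Hamming layout (see the lead-in).
fun invalidParityPositions(codeword: String): List<Int> =
    generateSequence(1) { it * 2 }
        .takeWhile { it <= codeword.length }
        .filter { p ->
            // XOR every covered bit, the parity bit itself included;
            // a non-zero result means the parity check failed.
            (1..codeword.length)
                .filter { pos -> (pos and p) != 0 }
                .fold(0) { acc, pos -> acc xor (codeword[pos - 1] - '0') } != 0
        }
        .toList()
```

The valid codeword "0110011" yields an empty list. Flipping the bit at position 5 gives "0110111", for which the parity bits at positions 1 and 4 fail, because 5 is 101 in binary.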

If all parity bits match, no error was detected in the codeword:

override fun isValid(codeWord: EncodedString) =
  indexesOfInvalidParityBits(codeWord).isEmpty()

Now that we know whether the message was transmitted incorrectly, we can ask the sender to retransmit it… or try to correct it ourselves.

Finding the distorted bit is as easy as summing the indexes of the invalid parity bits: the result is the 1-based index of the faulty one. In order to correct the message, we can simply flip that bit:

override fun decode(codeWord: EncodedString): BinaryString =
  indexesOfInvalidParityBits(codeWord).let { result ->
      when (result.isEmpty()) {
          true -> codeWord
          false -> codeWord.withBitFlippedAt(result.sum() - 1)
      }.let { extractor.stripHammingMetadata(it) }
  }
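Summing the invalid parity positions gives the same number as XOR-ing the 1-based positions of all bits set to 1, which makes for a compact way to try the correction on plain Strings. This is a sketch under the same assumptions as before (even parity, standard layout, at most one flipped bit); errorPosition and correctSingleBitError are hypothetical names, not part of the article's code.

```kotlin
// XOR of the 1-based positions of all '1' bits.
// 0 means every parity check passes; otherwise it is the position to flip.
fun errorPosition(codeword: String): Int =
    codeword.withIndex()
        .filter { it.value == '1' }
        .fold(0) { acc, bit -> acc xor (bit.index + 1) }

// Flips the bit the syndrome points at, assuming at most one bit is wrong.
fun correctSingleBitError(codeword: String): String {
    val position = errorPosition(codeword)
    if (position == 0) return codeword
    require(position <= codeword.length) { "Syndrome points outside the codeword" }
    val index = position - 1 // back to a 0-based index
    val flipped = if (codeword[index] == '0') "1" else "0"
    return codeword.replaceRange(index, index + 1, flipped)
}
```

For "0110111" (the valid codeword "0110011" with position 5 flipped), errorPosition returns 5 and correctSingleBitError restores "0110011". A flipped parity bit is handled the same way: "0010011" points at position 2.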

We flip the bit using an extension:

private fun EncodedString.withBitFlippedAt(index: Int) = this[index].toString().toInt()
  .let { this.value.replaceRange(index, index + 1, ((it + 1) % 2).toString()) }
  .let(::EncodedString)

We can see that it works by writing a home-made property test:

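@Test
fun shouldEncodeAndDecodeWithSingleBitErrors() = repeat(10000) {
    randomMessage().let { message ->
        // Flip one random bit anywhere in the codeword, parity bits included
        val encoded = encoder.encode(message)
        val corrupted = encoded.withBitFlippedAt(rand.nextInt(encoded.length))
        assertThat(decoder.decode(corrupted)).isEqualTo(message)
    }
}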

Unfortunately, Hamming(7,4) cannot distinguish between codewords containing one and two distorted bits. If you try to correct a two-bit error, the decoder flips a third, unrelated bit and the result is incorrect.
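A short sketch makes the failure concrete. It assumes a 7-bit codeword with even parity and the standard layout, and computes the syndrome by XOR-ing the 1-based positions of the set bits; syndrome, flip and naiveDecode are hypothetical names for this illustration.

```kotlin
// XOR of the 1-based positions of all '1' bits; 0 means "looks valid".
fun syndrome(codeword: String): Int =
    codeword.withIndex()
        .filter { it.value == '1' }
        .fold(0) { acc, bit -> acc xor (bit.index + 1) }

// Flips the bit at the given 1-based position.
fun flip(codeword: String, position: Int): String {
    val i = position - 1
    return codeword.replaceRange(i, i + 1, if (codeword[i] == '0') "1" else "0")
}

// Corrects as if at most one bit were wrong, then drops the parity bits
// (1-based positions that are powers of two). Assumes a 7-bit codeword.
fun naiveDecode(codeword: String): String {
    val s = syndrome(codeword)
    val corrected = if (s == 0) codeword else flip(codeword, s)
    return corrected.filterIndexed { i, _ -> ((i + 1) and i) != 0 }
}
```

Starting from "0110011", which carries the data "1011", a single flip at position 5 still decodes to "1011". Flipping positions 1 and 2 produces "1010011", whose syndrome is 3, so the decoder flips the healthy bit at position 3 and returns "0011".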

Disappointing, right? This is what drove the decision to add one more parity bit and create Hamming(8,4).
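How the extra bit helps can be sketched as follows. The layout is an assumption, since the article does not describe it: seven Hamming bits followed by one overall even-parity bit covering the whole codeword. Diagnosis and diagnose are hypothetical names, and the classification is only meaningful for up to two flipped bits.

```kotlin
enum class Diagnosis { NO_ERROR, SINGLE_BIT_ERROR, DOUBLE_BIT_ERROR }

// Assumed layout: 7 Hamming(7,4) bits + 1 overall even-parity bit at the end.
fun diagnose(extended: String): Diagnosis {
    require(extended.length == 8 && extended.all { it == '0' || it == '1' })
    // Syndrome of the inner Hamming(7,4) part: XOR of positions of set bits.
    val syndrome = extended.take(7).withIndex()
        .filter { it.value == '1' }
        .fold(0) { acc, bit -> acc xor (bit.index + 1) }
    // With even parity, the total number of ones must be even.
    val overallParityOk = extended.count { it == '1' } % 2 == 0
    return when {
        syndrome == 0 && overallParityOk -> Diagnosis.NO_ERROR
        // An odd number of flips: one bit is wrong (possibly the extra bit).
        !overallParityOk -> Diagnosis.SINGLE_BIT_ERROR
        // Parity looks fine but the syndrome disagrees: two bits are wrong.
        else -> Diagnosis.DOUBLE_BIT_ERROR
    }
}
```

With the codeword "01100110", one flipped bit breaks the overall parity and is reported as correctable, while two flipped bits leave the overall parity intact but produce a non-zero syndrome, so the decoder knows not to attempt a correction.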

Conclusion

We’ve seen what error correction for Hamming codes looks like and went through an extensive off-by-one-error workout.

Code snippets can be found on GitHub.
