Turing completeness

sed is a very powerful tool. A simple sed statement may turn a cat into cement. Observe:

echo cat | sed 's/a/emen/'

I was asked by mariom@ircnet whether it is possible to implement the Euclidean algorithm (the one that computes the greatest common divisor) in awk. awk is a Turing-complete language, so the answer is yes, it is possible. Here is the snippet proposed by mariom:

{
    a = $1;
    b = $2;
    while (b != 0) {
        c = a % b;
        a = b;
        b = c;
    }
    print a
}
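
You can give the snippet a quick sanity check from the shell (a one-liner version of the same code; gcd(20, 25) is 5):

echo 20 25 | awk '{a = $1; b = $2; while (b != 0) {c = a % b; a = b; b = c}; print a}'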

Anyway, my response was that it is possible even in sed. Is it? Of course! sed is a Turing-complete language, though I had no idea how to actually write it in sed. I use sed for simple substitutions only, but I could not admit that! I started googling for arithmetic operations in sed and found a dc implementation in pure sed. So the next question is how to implement the Euclidean algorithm in dc. It turned out to be quite simple, see:

[lalb%sclbsalcsblb0<F]sFsasblFxlap

This code assumes that there are two integers on the stack, so you can test it with something like this:

echo '20 25 [lalb%sclbsalcsblb0<F]sFsasblFxlap' | dc

Let’s analyze what is happening there. There is an F macro that is the equivalent of the “while” loop in the awk code. It loads the values of the a and b registers onto the stack, then replaces them with their remainder and saves it in the c register. The lbsalcsb statement just copies the value of b to a and of c to b. Finally, it compares the value of b with 0: if the former is greater, it executes F (note the recursion; it is the only way to implement a loop in dc), otherwise it quits.

The sasblFxlap statement is the entry point. It saves the values from the stack in the a and b registers, then executes F and prints the content of the a register. So this dc script is the equivalent of the awk script that executes the Euclidean algorithm. Now it is enough to embed the dc script into dc.sed, et voilà! Enjoy: the Euclidean algorithm implemented in pure sed.
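
To double-check the dc code, here is a hand trace for the input 20 25 (the entry point stores 25 in a and 20 in b):

a=25 b=20   c = 25 % 20 = 5   ->  a=20 b=5
a=20 b=5    c = 20 % 5  = 0   ->  a=5  b=0
b is 0, so F is not executed again and lap prints 5

And assuming you have dc.sed saved next to the script (it reads its dc program from standard input, like the real dc), the same one-liner can be fed to it instead:

echo '20 25 [lalb%sclbsalcsblb0<F]sFsasblFxlap' | sed -f dc.sed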


Recently at storm-users

I've been reading through the storm-users Google Group recently. This resolution was heavily inspired by Adam Kawa's post "Football zero, Apache Pig hero". Since I've encountered a lot of insightful and very interesting information, I've decided to describe some of it in this post.

  • nimbus will work in HA mode - There's a pull request open for it already... but some recent work (distributing topology files via BitTorrent) will greatly simplify the implementation. Once the BitTorrent work is done we'll look at reworking the HA pull request. (storm’s pull request)

  • pig on storm - Pig on Trident would be a cool and welcome project. Join and groupBy have very clear semantics there, as those concepts exist directly in Trident. The extensions needed to Pig are the concept of incremental, persistent state across batches (mirroring those concepts in Trident). You can read a complete proposal.

  • implementing topologies in pure python with petrel looks like this:

import storm  # the storm.py multilang module that petrel builds on

class Bolt(storm.BasicBolt):
    def initialize(self, conf, context):
        '''This method is executed only once, when the bolt starts.'''
        storm.log('initializing bolt')

    def process(self, tup):
        '''This method is executed every time a new tuple arrives.'''
        msg = tup.values[0]
        storm.log('Got tuple %s' % msg)

if __name__ == "__main__":
    Bolt().run()

  • Fliptop is happy with storm - see their presentation here

  • topology metrics in 0.9.0: The new metrics feature allows you to collect arbitrary custom metrics over fixed time windows. Those metrics are exported to a metrics stream that you can consume by implementing IMetricsConsumer and configure with Config.java#L473. Use TopologyContext#registerMetric to register new metrics (see the sketch after this list).

  • storm vs flume - one user's point of view: I use Storm and Flume and find that they are better at different things - it really depends on your use case as to which one is better suited. First and foremost, they were originally designed to do different things: Flume is a reliable service for collecting, aggregating, and moving large amounts of data from source to destination (e.g. log data from many web servers to HDFS). Storm is more for real-time computation (e.g. streaming analytics) where you analyse data in flight and don't necessarily land it anywhere. Having said that, Storm is also fault-tolerant and can write to external data stores (e.g. HBase), and you can do real-time computation in Flume (using interceptors).
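
To make the metrics bullet above more concrete, here is a rough sketch of registering and updating a custom metric inside a bolt. It is based on the 0.9.0 API; the bolt itself, the metric name "tuple-count" and the 60-second time bucket are made up for illustration:

import java.util.Map;

import backtype.storm.metric.api.CountMetric;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.BasicOutputCollector;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseBasicBolt;
import backtype.storm.tuple.Tuple;

public class CountingBolt extends BaseBasicBolt {
    private transient CountMetric tupleCount;

    @Override
    public void prepare(Map stormConf, TopologyContext context) {
        // the registered metric is published to the metrics stream
        // every 60 seconds (the time-bucket size)
        tupleCount = context.registerMetric("tuple-count", new CountMetric(), 60);
    }

    @Override
    public void execute(Tuple tuple, BasicOutputCollector collector) {
        tupleCount.incr(); // count every tuple this bolt processes
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // this bolt emits nothing; it only counts
    }
}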

That's all for today - however, I'll keep on reading through storm-users, so watch this space for more info on storm development.


Private fields and methods are not private in groovy

I used to code in Java before I met groovy. Like most of you, I was attracted to groovy by its many enhancements. It was quite a surprise to discover that member visibility in groovy is handled differently than in Java!

Consider this example:

class Person {
    private String name
    public String surname

    private Person() {}

    private String signature() { "${name?.substring(0, 1)}. $surname" }

    public String toString() { "I am $name $surname" }
}

How is this class interpreted in Java?

  1. Person has a private constructor that cannot be accessed
  2. The field "name" is private and cannot be accessed
  3. The method signature() is private and cannot be accessed

Let's see how groovy interprets Person:

public static void main(String[] args) {
    def person = new Person() // constructor is private - compilation error in Java
    println(person.toString())

    person.@name = 'Mike' // access the name field directly - compilation error in Java
    println(person.toString())

    person.name = 'John' // there is a setter generated by groovy
    println(person.toString())

    person.@surname = 'Foo' // access the surname field directly
    println(person.toString())

    person.surname = 'Bar' // use the auto-generated setter
    println(person.toString())

    println(person.signature()) // call a private method - compilation error in Java
}

I was really astonished by its output:

I am null null
I am Mike null
I am John null
I am John Foo
I am John Bar
J. Bar

As you can see, groovy does not follow the visibility directives at all! It treats them as non-existent. The code compiles and executes fine, contrary to Java, where this code has several compilation errors, pointed out in the comments.
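
For comparison, if Person were a plain Java class, a caller like the one below would not even compile (the javac messages are paraphrased):

Person person = new Person(); // error: Person() has private access in Person
person.name = "Mike";         // error: name has private access in Person
person.signature();           // error: signature() has private access in Person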

I've searched a bit on this topic and it seems that this behaviour has been known since version 1.1; there is a bug report on it: http://jira.codehaus.org/browse/GROOVY-1875. It is not resolved even with the groovy 2 release. As Tim Yates mentioned in this Stackoverflow question: "It's not clear if it is a bug or by design". Groovy treats visibility keywords as a mere hint for the programmer.

I need to keep that lesson in mind next time I want to make some field or method private!

JCE keystore and untrusted sites

Recently at work I was in need of connecting to a web service exposed via HTTPS. I've been doing this from inside Servicemix 3.3.1, which may seem a bit inhibiting, but that was a requirement. Nevertheless I've been trying my luck with the included ser...