Turing completeness

sed is a very powerful tool. A simple sed statement may turn a cat into cement. Observe: echo cat | sed statement I was asked by mariom@ircnet whether is it possible to implement the Euklidean algorithm (the one that computes the greatest common divisor) in awk. awk is a Turing complete language, so the answer is yes, it is possible. There is a snippet proposed by mariom:

{
    a = $1;
    b = $2;
    while (b != 0) {
        c = a % b;
        a = b;
        b = c;
    }
    print a
}

Anyway, my response was that it is possible even in sed. Is it? Of course! Sed is a Turing complete language. Though I had no idea how to write it in sed. I use sed for simple substitutions only, but I could not admit that! I started googling for arithmetic operations in sed and I found

a dc implementation in pure sed. So the next question is how to implement the Euklidean algorithm in dc. It turned out to be quite simple, see:

[lalb%sclbsalcsblb0

This code assumes that there are two integers on the stack. So you can test it with something like that:

echo '20 25 [lalb%sclbsalcsblb0

Let’s analyze what is happening there. There is F macro that is equivalent of “while” loop in awk code. It

loads the values of a and b registers on the stack. Then replaces them with their reminder and saves it in the c registers.

lbsalcsb

statement just copies the value of

b to a and c to b. Finaly it compares the value of b with 0. If former is greater it executes F (note: recurrence. It is the only way to implement a loop in dc). Otherwise it quits.

sasblFxlap

statement is an entry point. It saves values from stack in the 

a and b registers, then executes F and prints the content of the a register. So this sed script is an equivalent of awk script that executes Euklidean algothm. Now it is enough to embed the dc script into dc.sed, et voila! Enjoy: Euclidean algorithm implemented in pure sed.

You May Also Like

Distributed scans with HBase

HBase is by design a columnar store, that is optimized for random reads. You just ask for a row using rowId as an identifier and you get your data instantaneously. Performing a scan on part or whole table is a completely different thing. First of all, it is sequential. Meaning it is rather slow, because it doesn't use all the RegionServers at the same time. It is implemented that way to realize the contract of Scan command - which has to return results sorted by key. So, how to do this efficiently?HBase is by design a columnar store, that is optimized for random reads. You just ask for a row using rowId as an identifier and you get your data instantaneously. Performing a scan on part or whole table is a completely different thing. First of all, it is sequential. Meaning it is rather slow, because it doesn't use all the RegionServers at the same time. It is implemented that way to realize the contract of Scan command - which has to return results sorted by key. So, how to do this efficiently?