Multi-phased processing in Scala

Recently in our project we had to add a progress bar to visualize a long-running process. The process consisted of a few phases, and we had to display which phase was currently running. Our first idea was to create a Progress class that would be passed as an implicit parameter to our service. We would then wrap method calls in an inProgress method, which would notify, for example, an Akka actor about the beginning and the end of each phase.
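A minimal sketch of this first approach, assuming a hypothetical Progress class and a hypothetical service method process. The names, the message format and the report callback (which stands in for the actor) are illustrative and not taken from the project:

```scala
object ProgressSketch {
  // Hypothetical progress tracker: it has to know the total number of phases up front.
  final class Progress(totalPhases: Int, report: String => Unit) {
    private var finished = 0

    // Wraps a block of work and reports the beginning and the end of the phase.
    def inProgress[T](phaseName: String)(body: => T): T = {
      report(s"begin: $phaseName ($finished/$totalPhases)")
      val result = body
      finished += 1
      report(s"end: $phaseName ($finished/$totalPhases)")
      result
    }
  }

  // Hypothetical service method; the progress object arrives as an implicit parameter.
  def process(input: String)(implicit progress: Progress): Int = {
    val trimmed = progress.inProgress("trimming")(input.trim)
    progress.inProgress("counting")(trimmed.length)
  }

  // Runs the service and collects the notifications in a Vector instead of sending them to an actor.
  def run(): (Int, Vector[String]) = {
    var log = Vector.empty[String]
    // The caller has to know that process() consists of exactly two phases.
    implicit val progress: Progress = new Progress(2, msg => log :+= msg)
    (process("  scala  "), log)
  }
}
```

Note that the number 2 passed to Progress is maintained by hand and is not derived from the inProgress calls themselves.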

But this approach has some disadvantages. First, before the service's operation starts, we need to initialize the progress with the total number of phases so that the completion ratio can be computed. This meant adding some extra counting before the operation started.

If we wanted to keep the progress notifications accurate, the declared number of phases had to match the number of inPhase blocks. Some of the phases were computed dynamically, and some were skipped when validation failed. The code became unmaintainable.

We found that we needed to tie the counting of phases to the actual phase processing. To do that, we changed the approach from building a process to building a chain of phases that will run the process. Each phase takes the result of the previous phase and transforms it into a new output. So an example process will look like this:

The code that provides this chain functionality looks like this:

We used the right-associative operator :: to build the chain of phases. The “bodies” of the phases are piped with andThen: processPrevWrapped andThen processNext. For the Nil-like tail we need a factory that creates an empty chain with the identity function as its “body”.
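A minimal sketch of such a chain, written under assumptions: Phase, PhasesChain, Progress, wrap and run are hypothetical names chosen for illustration, and the real scala-phases-chain project may be organized differently. It shows the three ingredients mentioned above: a right-associative ::, piping with andThen, and an empty chain built around the identity function.

```scala
object PhasesChainSketch {
  // Hypothetical single phase: a named transformation from In to Out.
  final case class Phase[In, Out](name: String, run: In => Out)

  // Hypothetical notifier; the total is taken from the chain, not counted by hand.
  final class Progress(total: Int, report: String => Unit) {
    private var done = 0
    // Surrounds a phase function with "begin" and "end" notifications.
    def wrap[A, B](name: String, f: A => B): A => B = { a =>
      report(s"begin: $name")
      val b = f(a)
      done += 1
      report(s"end: $name ($done/$total)")
      b
    }
  }

  final class PhasesChain[In, Out](val phaseNames: List[String], build: Progress => In => Out) {
    // The name ends with ':', so `phase :: chain` is evaluated as `chain.::(phase)`.
    def ::[Prev](phase: Phase[Prev, In]): PhasesChain[Prev, Out] =
      new PhasesChain[Prev, Out](
        phase.name :: phaseNames,
        // Wrapped previous phase, piped into the rest of the chain.
        progress => progress.wrap(phase.name, phase.run) andThen build(progress))

    // The number of phases is known here because the chain was built from them.
    def run(in: In)(report: String => Unit): Out =
      build(new Progress(phaseNames.size, report))(in)
  }

  object PhasesChain {
    // Empty tail of the chain: no phases and an identity "body".
    def empty[T]: PhasesChain[T, T] = new PhasesChain[T, T](Nil, _ => (t: T) => t)
  }

  val parse  = Phase[String, Int]("parse", _.trim.toInt)
  val double = Phase[Int, Int]("double", _ * 2)
  val render = Phase[Int, String]("render", n => s"result=$n")

  val chain = parse :: double :: render :: PhasesChain.empty[String]

  // Runs the chain and collects the notifications in a Vector.
  def runExample(): (String, Vector[String]) = {
    var log = Vector.empty[String]
    val out = chain.run(" 21 ")(msg => log :+= msg)
    (out, log)
  }
}
```

Because the list of phase names grows together with the composed function, the total used for the progress ratio cannot drift away from the phases that actually run.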

With this kind of tool we can also adapt the piping code to the nature of our flow. For example, if we use scalaz.Validation, we can build a validating chain that extracts the success from the output of step n and passes it as the input of the next step (like flatMap). On the other hand, if step n returns a Failure, all remaining phases of the validating chain are skipped.
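The short-circuiting behaviour can be sketched as follows. To keep the example free of dependencies it uses the standard library's Either (right-biased since Scala 2.12) in place of scalaz.Validation, and all names (Phase, ValidatingChain, renderCalls) are hypothetical:

```scala
import scala.util.Try

object ValidatingChainSketch {
  // Hypothetical phase that may fail with an error of type E.
  final case class Phase[E, In, Out](name: String, run: In => Either[E, Out])

  final class ValidatingChain[E, In, Out](
      val phaseNames: List[String],
      val body: In => Either[E, Out]) {
    // Prepends a phase; the rest of the chain runs only if the phase returned Right.
    def ::[Prev](phase: Phase[E, Prev, In]): ValidatingChain[E, Prev, Out] =
      new ValidatingChain[E, Prev, Out](
        phase.name :: phaseNames,
        prev => phase.run(prev).flatMap(body))
  }

  object ValidatingChain {
    // Empty tail: always succeeds and returns its input unchanged.
    def empty[E, T]: ValidatingChain[E, T, T] =
      new ValidatingChain[E, T, T](Nil, t => Right(t))
  }

  // Counts how often the last phase really ran, to make the skipping visible.
  var renderCalls = 0

  val parse = Phase[String, String, Int](
    "parse", s => Try(s.trim.toInt).toOption.toRight(s"not a number: $s"))
  val positive = Phase[String, Int, Int](
    "positive", n => if (n > 0) Right(n) else Left(s"not positive: $n"))
  val render = Phase[String, Int, String](
    "render", n => { renderCalls += 1; Right(s"ok: $n") })

  val chain = parse :: positive :: render :: ValidatingChain.empty[String, String]
}
```

Calling chain.body("7") runs all three phases, while chain.body("-3") stops after the positive phase and never reaches render.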

To make chain building more production-ready, we added some extra features:

  • Chaining of chains (something like ::: for Scala Lists)
  • Transforming of input/output – for adding some “glue” code for simpler phases chaining
  • Wrapping of chains – also some “glue” code doing both input and output transformations
  • Sequencing of chains – sequenced processing of multiple phases with the same input
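
The features listed above could be sketched roughly like this. The Chain class below is a simplified, notification-free stand-in, and the method names mapInput, mapOutput, wrap and zipWith are assumptions made for illustration; they are not the names used in scala-phases-chain.

```scala
object ChainExtrasSketch {
  // Hypothetical chain: the phase names plus the composed "body".
  final case class Chain[In, Out](phaseNames: List[String], body: In => Out) {
    // Chaining of chains, analogous to ::: on List: `first ::: second` runs first, then second.
    def :::[Prev](first: Chain[Prev, In]): Chain[Prev, Out] =
      Chain(first.phaseNames ::: phaseNames, first.body andThen body)

    // Transforming of input/output: "glue" functions that are not counted as phases.
    def mapInput[NewIn](f: NewIn => In): Chain[NewIn, Out] =
      Chain(phaseNames, f andThen body)
    def mapOutput[NewOut](f: Out => NewOut): Chain[In, NewOut] =
      Chain(phaseNames, body andThen f)

    // Wrapping: input and output transformations applied together.
    def wrap[NewIn, NewOut](in: NewIn => In, out: Out => NewOut): Chain[NewIn, NewOut] =
      mapInput(in).mapOutput(out)

    // Sequencing: both chains receive the same input; the results are paired.
    def zipWith[Out2](other: Chain[In, Out2]): Chain[In, (Out, Out2)] =
      Chain(phaseNames ::: other.phaseNames, in => (body(in), other.body(in)))
  }

  val parse  = Chain[String, Int](List("parse"), _.trim.toInt)
  val double = Chain[Int, Int](List("double"), _ * 2)
  val square = Chain[Int, Int](List("square"), n => n * n)

  // parse runs first, then double and square both receive the parsed number.
  val both = parse ::: double.zipWith(square)

  // double adapted to take and return strings without adding phases.
  val wrapped = double.wrap[String, String](_.toInt, n => s"n=$n")
}
```

In this sketch the glue functions do not change phaseNames, so they do not affect the progress ratio.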

If you are interested in using a similar approach, take a look at my GitHub project: scala-phases-chain. If you want to integrate this tool with Akka actors, simply change the MultiPhasedProgress.notifyAboutStatus method to look like this:

You May Also Like

Need to make quick JSON fixes – JSONPath to the rescue

From time to time I need to fix something in my JSON data. In the world of flat files I do this with the grep/sed/awk tool chain. How to handle it for JSON? While searching for a solution I came across JSONPath. It is quite a mature tool (from 2007), but I hadn't heard about it, so I decided to share my experience with others.

First of all, you can try it painlessly online: http://jsonpath.curiousconcept.com/. The full syntax is described at http://goessner.net/articles/JsonPath/



You can also download the Python binding and run it from the command line:
$ sudo apt-get install python-jsonpath-rw
$ sudo apt-get install python-setuptools
$ sudo easy_install -U jsonpath

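After that you can use it inside Python or with a simple CLI wrapper:
#!/usr/bin/python
# Minimal CLI wrapper: reads JSON from stdin and prints the nodes matching a JSONPath expression.
import sys, json, jsonpath

# The JSONPath expression is the first command-line argument.
path = sys.argv[1]

result = jsonpath.jsonpath(json.load(sys.stdin), path)
print(json.dumps(result, indent=2))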

Then you can use it in your shell, e.g. for this JSON:
{
  "store": {
    "book": [
      {
        "category": "reference",
        "author": "Nigel Rees",
        "title": "Sayings of the Century",
        "price": 8.95
      },
      {
        "category": "fiction",
        "author": "Evelyn Waugh",
        "title": "Sword of Honour",
        "price": 12.99
      },
      {
        "category": "fiction",
        "author": "Herman Melville",
        "title": "Moby Dick",
        "isbn": "0-553-21311-3",
        "price": 8.99
      },
      {
        "category": "fiction",
        "author": "J. R. R. Tolkien",
        "title": "The Lord of the Rings",
        "isbn": "0-395-19395-8",
        "price": 22.99
      }
    ],
    "bicycle": {
      "color": "red",
      "price": 19.95
    }
  }
}

You can print only the book nodes with a price lower than 10 by passing the JSON above to the wrapper on standard input:
$ jsonpath '$..book[?(@.price<10)]'

Result:
[
  {
    "category": "reference",
    "price": 8.95,
    "title": "Sayings of the Century",
    "author": "Nigel Rees"
  },
  {
    "category": "fiction",
    "price": 8.99,
    "title": "Moby Dick",
    "isbn": "0-553-21311-3",
    "author": "Herman Melville"
  }
]

Happy JSON hacking!

Use asInstanceOf[T] carefully!

Background: Scala has a nice static type checking engine, but from time to time there are situations when we must downcast some general object. If the cast is not possible, we expect that the virtual machine will throw a ClassCastException as fast as possible. ...