Validation with warnings in scala with scalaz

Monad are containers with ‘special powers’, when it comes to applying function over its content.Validation special power is propagating Failure over validation process.If you are not familiar with scalaz.Validation I urge you to read this example,…

Monad are containers with ‘special powers’, when it comes to applying function over its content.
Validation special power is propagating Failure over validation process.

If you are not familiar with scalaz.Validation I urge you to read this example, which shows how to use Validation: A Tale of 3 Nightclubs

Basically validation looks like this:

Scalaz.Validation uses idiomatic scala way to compose monads by For Comprehension.

Concrete validation method, returning scalaz.Validation instances looks like this:

Scalaz provide helper methods for wrapping values into Failure or Success.

To sum it up. Validation is a an elegant way to handle application validation logic.

However it’s not enough.

Our business rules require application logic’s to perform validation with warnings, which should not propagate as failures, but rather propagate independently of Success/Failure types.

We liked monad approach to data validation so we wanted to keep it that way.

Let me introduce Validation with warnings

What it does is basically wrapping scalaz.Validation into another type responsible for carrying warnings over validation process

Thank to scala type inference our validation code look’s just the same, but now for expression operates on ValidationWithWarnings type rather than Validation.

OK, but what about validation code? We created similar helper methods for wrapping validation into ValidationWithWarnings and wrapping values directly into warnings.

One could inline warning in for loop:

Or use it in validation method:

And of course chain it in for-loop:

Applicative

We support scalaz.Applicative, so it’s possible to take few validations and apply them to function if all elements are successes, collecting any errors and warnings if present.

Summing

Similarly to scalaz.Validation, we also support summing values, if value type has Semigroup typeclass:

Repository

Code with examples in test files can be found at https://github.com/Ajk4/ValidationWithWarnings

Q&A

Why not use Writer Monad?
– Same reason why we prefer Validation over Either with left/right projection. It’s more direct and descriptive.

Why validation nel underhood?
– It suited our business needs best.

Validation is not a Monad!
– True. 

You May Also Like

Recently at storm-users

I've been reading through storm-users Google Group recently. This resolution was heavily inspired by Adam Kawa's post "Football zero, Apache Pig hero". Since I've encountered a lot of insightful and very interesting information I've decided to describe some of those in this post.

  • nimbus will work in HA mode - There's a pull request open for it already... but some recent work (distributing topology files via Bittorrent) will greatly simplify the implementation. Once the Bittorrent work is done we'll look at reworking the HA pull request. (storm’s pull request)

  • pig on storm - Pig on Trident would be a cool and welcome project. Join and groupBy have very clear semantics there, as those concepts exist directly in Trident. The extensions needed to Pig are the concept of incremental, persistent state across batches (mirroring those concepts in Trident). You can read a complete proposal.

  • implementing topologies in pure python with petrel looks like this:

class Bolt(storm.BasicBolt):
    def initialize(self, conf, context):
       ''' This method executed only once '''
        storm.log('initializing bolt')

    def process(self, tup):
       ''' This method executed every time a new tuple arrived '''       
       msg = tup.values[0]
       storm.log('Got tuple %s' %msg)

if __name__ == "__main__":
    Bolt().run()
  • Fliptop is happy with storm - see their presentation here

  • topology metrics in 0.9.0: The new metrics feature allows you to collect arbitrarily custom metrics over fixed windows. Those metrics are exported to a metrics stream that you can consume by implementing IMetricsConsumer and configure with Config.java#L473. Use TopologyContext#registerMetric to register new metrics.

  • storm vs flume - some users' point of view: I use Storm and Flume and find that they are better at different things - it really depends on your use case as to which one is better suited. First and foremost, they were originally designed to do different things: Flume is a reliable service for collecting, aggregating, and moving large amounts of data from source to destination (e.g. log data from many web servers to HDFS). Storm is more for real-time computation (e.g. streaming analytics) where you analyse data in flight and don't necessarily land it anywhere. Having said that, Storm is also fault-tolerant and can write to external data stores (e.g. HBase) and you can do real-time computation in Flume (using interceptors)

That's all for this day - however, I'll keep on reading through storm-users, so watch this space for more info on storm development.

I've been reading through storm-users Google Group recently. This resolution was heavily inspired by Adam Kawa's post "Football zero, Apache Pig hero". Since I've encountered a lot of insightful and very interesting information I've decided to describe some of those in this post.

  • nimbus will work in HA mode - There's a pull request open for it already... but some recent work (distributing topology files via Bittorrent) will greatly simplify the implementation. Once the Bittorrent work is done we'll look at reworking the HA pull request. (storm’s pull request)

  • pig on storm - Pig on Trident would be a cool and welcome project. Join and groupBy have very clear semantics there, as those concepts exist directly in Trident. The extensions needed to Pig are the concept of incremental, persistent state across batches (mirroring those concepts in Trident). You can read a complete proposal.

  • implementing topologies in pure python with petrel looks like this:

class Bolt(storm.BasicBolt):
    def initialize(self, conf, context):
       ''' This method executed only once '''
        storm.log('initializing bolt')

    def process(self, tup):
       ''' This method executed every time a new tuple arrived '''       
       msg = tup.values[0]
       storm.log('Got tuple %s' %msg)

if __name__ == "__main__":
    Bolt().run()
  • Fliptop is happy with storm - see their presentation here

  • topology metrics in 0.9.0: The new metrics feature allows you to collect arbitrarily custom metrics over fixed windows. Those metrics are exported to a metrics stream that you can consume by implementing IMetricsConsumer and configure with Config.java#L473. Use TopologyContext#registerMetric to register new metrics.

  • storm vs flume - some users' point of view: I use Storm and Flume and find that they are better at different things - it really depends on your use case as to which one is better suited. First and foremost, they were originally designed to do different things: Flume is a reliable service for collecting, aggregating, and moving large amounts of data from source to destination (e.g. log data from many web servers to HDFS). Storm is more for real-time computation (e.g. streaming analytics) where you analyse data in flight and don't necessarily land it anywhere. Having said that, Storm is also fault-tolerant and can write to external data stores (e.g. HBase) and you can do real-time computation in Flume (using interceptors)

That's all for this day - however, I'll keep on reading through storm-users, so watch this space for more info on storm development.