All you need is docker (and fig)

Introduction

Suppose you want to run the Scala REPL, the Groovy shell, or any other REPL-like executable. Normally you have to download the distribution, unpack it, set the PATH environment variable, and only then can you use it. Can it be simpler? Yes: dockerize everything.

Prepare containers

I assume that you already have docker and fig installed on your machine.
Check out this project from GitHub and build the containers:
$ cd docker-with-fig
$ fig build
This may take several minutes, depending on your internet connection.
You can also build only some of the available containers, for example Scala and Groovy:
$ fig build scala groovy
The available containers are:
  • haskell
  • scala
  • groovy
  • python27
  • python34
  • clojure
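Each container is simply a service defined in the project's fig.yml. As a minimal sketch of what such an entry might look like (the build path here is an assumption about the repository layout; the /project mapping mirrors the behaviour described below):
scala:
  build: scala
  volumes:
    - .:/project
  working_dir: /project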

Run containers

Now you can start, for example, Scala:
$ fig run scala
Welcome to Scala version 2.11.6 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_40).
Type in expressions to have them evaluated.
Type :help for more information.

scala>
Or Groovy:
$ fig run groovy
Mar 29, 2015 7:48:22 AM java.util.prefs.FileSystemPreferences$1 run
INFO: Created user preferences directory.
Groovy Shell (2.4.2, JVM: 1.8.0_40)
Type ':help' or ':h' for help.
-------------------------------------------------------------------------------
groovy:000>
Or Clojure with lein:
$ fig run clojure
nREPL server started on port 52730 on host 127.0.0.1 - nrepl://127.0.0.1:52730
REPL-y 0.3.5, nREPL 0.2.6
Clojure 1.6.0
Java HotSpot(TM) 64-Bit Server VM 1.8.0_40-b25
    Docs: (doc function-name-here)
          (find-doc "part-of-name-here")
  Source: (source function-name-here)
 Javadoc: (javadoc java-object-or-class-here)
    Exit: Control+D or (exit) or (quit)
 Results: Stored in vars *1, *2, *3, an exception in *e

user=>
Or Python 3.4:
$ fig run python34
Python 3.4.2 (default, Oct  8 2014, 13:08:17) 
[GCC 4.9.1] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
Or Haskell's GHCi:
$ fig run haskell
GHCi, version 7.6.3: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Prelude>
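Each of these drops you into an ordinary interactive session, so you can evaluate expressions right away; for example, in the Scala REPL:
scala> 1 + 1
res0: Int = 2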
You can also run a command in a container using files from your current directory, because that directory is mapped to /project inside the container, and the container starts with /project as its working directory.
For example, say you have a simple file in your current directory:
$ cat simplePrint.scala
println("Hello World")

You can run this file through the Scala container:

$ fig -f PATH_TO_DOCKER_WITH_FIG_PROJECT/fig.yml run scala scala simplePrint.scala

Hello World

The first ‘scala’ in this command names the container you want to run; ‘scala simplePrint.scala’ is the command executed inside that container.
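The same pattern works for any of the containers. For example, with a hypothetical hello.py in your current directory (the file name here is just an illustration, and I'm assuming the python34 image has python3 on its PATH), you could run:
$ fig -f PATH_TO_DOCKER_WITH_FIG_PROJECT/fig.yml run python34 python3 hello.py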

Conclusion

It is simple, isn’t it? All you need is docker and fig, and you can use a REPL or run commands with Scala, Groovy, Clojure, and any other language that is currently supported…
You May Also Like

Recently at storm-users

I've been reading through the storm-users Google Group recently, a resolution heavily inspired by Adam Kawa's post "Football zero, Apache Pig hero". Since I've encountered a lot of insightful and very interesting information, I've decided to describe some of it in this post.

  • nimbus will work in HA mode - There's a pull request open for it already... but some recent work (distributing topology files via Bittorrent) will greatly simplify the implementation. Once the Bittorrent work is done we'll look at reworking the HA pull request. (storm’s pull request)

  • pig on storm - Pig on Trident would be a cool and welcome project. Join and groupBy have very clear semantics there, as those concepts exist directly in Trident. The extensions needed to Pig are the concept of incremental, persistent state across batches (mirroring those concepts in Trident). You can read a complete proposal.

  • implementing topologies in pure python with petrel looks like this:

import storm  # multilang helper module; the exact import path depends on how petrel is set up

class Bolt(storm.BasicBolt):
    def initialize(self, conf, context):
        '''This method is executed only once, when the bolt starts up.'''
        storm.log('initializing bolt')

    def process(self, tup):
        '''This method is executed every time a new tuple arrives.'''
        msg = tup.values[0]
        storm.log('Got tuple %s' % msg)

if __name__ == "__main__":
    Bolt().run()
  • Fliptop is happy with storm - see their presentation here

  • topology metrics in 0.9.0: The new metrics feature allows you to collect arbitrary custom metrics over fixed windows. Those metrics are exported to a metrics stream that you can consume by implementing IMetricsConsumer and configuring it via Config.java#L473. Use TopologyContext#registerMetric to register new metrics, as in the sketch below.
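A minimal sketch of how that looks against the Storm 0.9 Java API (this example is mine, not from the mailing list; CountMetric and registerMetric are real Storm 0.9 APIs, while the bolt and metric names are made up):

import backtype.storm.metric.api.CountMetric;
import backtype.storm.task.OutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseRichBolt;
import backtype.storm.tuple.Tuple;
import java.util.Map;

public class CountingBolt extends BaseRichBolt {
    private transient CountMetric tupleCounter;
    private OutputCollector collector;

    @Override
    public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
        // register the metric so it is exported to the metrics stream in 60-second windows
        tupleCounter = new CountMetric();
        context.registerMetric("processed-tuples", tupleCounter, 60);
    }

    @Override
    public void execute(Tuple tuple) {
        tupleCounter.incr();  // picked up and reported by the configured IMetricsConsumer
        collector.ack(tuple);
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // this bolt only counts tuples, it emits nothing
    }
}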

  • storm vs flume - some users' point of view: I use Storm and Flume and find that they are better at different things - it really depends on your use case as to which one is better suited. First and foremost, they were originally designed to do different things: Flume is a reliable service for collecting, aggregating, and moving large amounts of data from source to destination (e.g. log data from many web servers to HDFS). Storm is more for real-time computation (e.g. streaming analytics) where you analyse data in flight and don't necessarily land it anywhere. Having said that, Storm is also fault-tolerant and can write to external data stores (e.g. HBase), and you can do real-time computation in Flume (using interceptors).

That's all for this day - however, I'll keep on reading through storm-users, so watch this space for more info on storm development.
