{"id":2357,"date":"2011-09-21T23:12:34","date_gmt":"2011-09-21T21:12:34","guid":{"rendered":"http:\/\/mcl.jogger.pl\/2011\/09\/21\/what-is-nosql-good-for\/"},"modified":"2023-03-22T12:04:29","modified_gmt":"2023-03-22T11:04:29","slug":"what-is-nosql-good-for","status":"publish","type":"post","link":"https:\/\/touk.pl\/blog\/2011\/09\/21\/what-is-nosql-good-for\/","title":{"rendered":"What is NoSQL good for?"},"content":{"rendered":"<p><strong>\u2026 or how I ended up writing a CouchDB proof of concept app?<\/strong><\/p>\n<p>Once upon a time I set out on a journey to discover the NoSQL land. I decided that running simple queries wouldn\u2019t be interesting enough, so I chose to build an app based on a NoSQL database. The main idea was an app that would dynamically update itself as geographic data flowed in. Since a myriad of geo-data sets are available on the internet, you can pick your favorite one and load it into your SQL database of choice. In my case the primary source of data was a proprietary database, or more specifically one table in it, continuously updated with new data. To make that data visible on my map I needed to:<\/p>\n<ul>\n<li>buffer the huge number of records \u2013 so as not to overwhelm other services with heavy traffic, and not to flood the frontend<\/li>\n<li>convert them to my representation<\/li>\n<li>display them \u2013 with the presentation layer in a browser, since a browser-based frontend was the easiest and fastest to develop<\/li>\n<\/ul>\n<p>The idea of the front-end HTML page was to show new points on the map. From the moment the page is opened, records that appear in the database table should be shown interactively on the screen.<\/p>\n<h2 id=\"toys-used\">Toys used<\/h2>\n<p>For the first step I chose the RabbitMQ broker. A queue on the broker would receive messages \u2013 one message per database table row. 
Then I\u2019d use some simple Groovy middleware to convert the data to the appropriate format and put it into another database \u2013 this time one specific to my app. You may ask why incorporate another database. It is good for separating environments \u2013 assuming the original data contains some sensitive content that should be anonymised, or we just don\u2019t feel comfortable exposing the whole database of some XYZ-system just to have access to one of its tables. Since for my presentation layer I chose HTML+JS without any application-server-based back-end, I decided on CouchDB. This seemed like a perfect match for this scenario. Why? Ease of use and a REST API with JSON responses \u2013 just great for interacting with my simple front-end. The flow of things was as shown in the image below:<\/p>\n<p><img decoding=\"async\" src=\"http:\/\/blog.innovative-labs.com\/blog\/gmapper.png\" alt=\"diagram\" \/><\/p>\n<h2 id=\"avro-for-the-beginning\">Avro \u2013 for the beginning<\/h2>\n<p>As you can see, I\u2019ve chosen JSON as my data format. I considered <a href=\"http:\/\/avro.apache.org\">Apache Avro<\/a> at first, but using it was a real pain in the ass. Avro itself is used in <a href=\"http:\/\/hadoop.apache.org\">Apache Hadoop<\/a> as a serialization layer, so it would seem OK, but it has virtually <em>no documentation<\/em>. But once you tear through the unintuitive interface and manage to handle all those unthinkable exceptions, you find a few pros to this library. It\u2019s great in that it does not require code generation \u2013 I like it being done on the fly. It also offers sending data in binary format, which was not necessary, but nevertheless is a nice feature. What I certainly didn\u2019t like about it was its orientation towards files rather than chunks of data \u2013 so it was not obvious how I should send data over the wire. 
Then I found out it can produce JSON output, which would work for me, except the output could not be parsed by other JSON libraries :) (<a href=\"http:\/\/stackoverflow.com\/questions\/5375243\/jcouchdb-svenson-unable-to-parse-json-string\">I asked about that on Stack Overflow, but with no luck<\/a>). If my whining hasn\u2019t put you off and you would still like to see how to use Avro, try this unit test in the project\u2019s GitHub repo: <a href=\"https:\/\/github.com\/zygm0nt\/gmapped\/blob\/master\/feeder\/src\/test\/groovy\/pl\/ftang\/example\/feeder\/avro\/AvroSimpleTest.groovy\">AvroSimpleTest.groovy<\/a><\/p>\n<h2 id=\"svenson\">Svenson<\/h2>\n<p>I dropped Avro in favour of a simple JSON lib called <a href=\"http:\/\/code.google.com\/p\/svenson\/\">Svenson<\/a>, and that was painless. The only thing I was forced to do was create my model class in Java \u2013 the rest of the project is written in Groovy. I\u2019ve no idea why that was necessary, and didn\u2019t want to look into it.<\/p>\n<h2 id=\"rabbitmq\">RabbitMQ<\/h2>\n<p>Further down the line is <a href=\"http:\/\/www.rabbitmq.com\/\">RabbitMQ<\/a>, which is fed with records by middleware written in Groovy. Since I use <a href=\"http:\/\/activemq.apache.org\">ActiveMQ<\/a> on a day-to-day basis, I decided to try something new. This broker is a really nice piece of software. Being written in Erlang makes it really fast. What\u2019s more, it has extensive capabilities and is easy to approach for anyone familiar with messaging (JMS and friends). For such a lightweight product it is really powerful \u2013 it implements AMQP!<\/p>\n<h2 id=\"couchdb\">CouchDB<\/h2>\n<p> From the broker\u2019s queue messages are again fetched by middleware, just to be put into <a href=\"http:\/\/couchdb.apache.org\/\">CouchDB<\/a>. This database is also written in Erlang. 
It\u2019s very reliable; however, the way it handles refreshing views isn\u2019t the most pleasant one, performance-wise. Word of advice \u2013 if you\u2019re on a Debian derivative, be cautious with the version in the apt repository. It\u2019s rather <em>ancient<\/em>. Also remember to add <strong>allow_jsonp = true<\/strong> to your config file <em>\/opt\/couchbase\/etc\/couchdb\/local.ini<\/em>. It\u2019s not enabled by default, and not having it set would result in empty responses from the CouchDB server. The problem here is that the browser doesn\u2019t allow querying a web server with a hostname other than the one the script originates from. More on this case <a href=\"http:\/\/stackoverflow.com\/questions\/3386679\/connection-ajax-couchdb-and-javascript\">here<\/a>. It seems my problem could be overcome by changing the URL in index.html and the hostname CouchDB listens on to the same address. I\u2019ve also created a view that exposes events by key: <a href=\"https:\/\/github.com\/zygm0nt\/gmapped\/blame\/master\/couchdb\/by_date_view.js\">view code<\/a><\/p>\n<h2 id=\"presenting-the-dots\">Presenting the dots<\/h2>\n<p> As a back-end I\u2019ve done some jQuery-based AJAX calls \u2013 nothing too fancy. All things necessary for the presentation layer are in <a href=\"https:\/\/github.com\/zygm0nt\/gmapped\/blob\/master\/index.html\">this file<\/a>.<\/p>\n<h2 id=\"things-to-consider\">Things to consider<\/h2>\n<p> Please bear in mind that this whole application is rather a playground, not a full-fledged project! After creating all the parts I have some doubts about some of the architectural decisions I made. I don\u2019t think security has been taken into account seriously enough. 
Also, scalability was never an issue ;-) If you have some thoughts about any of the aspects mentioned in this post, please feel free to comment or contact me directly :) You may also try the application yourself \u2013 it\u2019s on <a href=\"https:\/\/github.com\/zygm0nt\/gmapped\">GitHub<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"\u2026 or how I ended up writing a CouchDB proof of concept app? Once upon a time I&hellip;\n","protected":false},"author":11,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[11],"tags":[687,186],"class_list":["post-2357","post","type-post","status-publish","format-standard","category-development-design","tag-db","tag-message-queue"],"_links":{"self":[{"href":"https:\/\/touk.pl\/blog\/wp-json\/wp\/v2\/posts\/2357","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/touk.pl\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/touk.pl\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/touk.pl\/blog\/wp-json\/wp\/v2\/users\/11"}],"replies":[{"embeddable":true,"href":"https:\/\/touk.pl\/blog\/wp-json\/wp\/v2\/comments?post=2357"}],"version-history":[{"count":8,"href":"https:\/\/touk.pl\/blog\/wp-json\/wp\/v2\/posts\/2357\/revisions"}],"predecessor-version":[{"id":9339,"href":"https:\/\/touk.pl\/blog\/wp-json\/wp\/v2\/posts\/2357\/revisions\/9339"}],"wp:attachment":[{"href":"https:\/\/touk.pl\/blog\/wp-json\/wp\/v2\/media?parent=2357"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/touk.pl\/blog\/wp-json\/wp\/v2\/categories?post=2357"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/touk.pl\/blog\/wp-json\/wp\/v2\/tags?post=2357"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}