Subtelny feature konstrukcji CONNECT BY

Jeśli ktoś używa konstrukcji CONNECT BY i chce uniknąć subtelnych błędów to powinienen zapoznać się z opisaną poniżej własnością. Jeśli w kwerendzie hierarchicznej (czyli używającej konstrukcji CONNECT BY … START WITH …) używa się równocześnie złączeń, czy to ANSI czy Oracle’owych to wynik jest nieintuicyjny. Właśnie wykryłem istotny błąd w moim kodzie z tym związany. Kwerenda hierarchiczna to takie coś, co pozwala uzyskać w jednym zapytaniu np. listę wszystkich przełożonych danego pracownika, ale nie tylko tych bezpośrednich, ale również przełożonych tych przełożonych, itd. Chodzi o uzyskanie informacji z drzewa. Dobry opis działania kwerend hierarchicznych znajduje się tutaj:

http://download.oracle.com/docs/cd/B10501_01/server.920/a96540/queries4a.htm#2053937 Są tam 2 ważne punkty:
* If the|WHERE| predicate contains a join, Oracle applies the joinpredicates /before/ doing the |CONNECT| |BY| processing.
* If the |WHERE| clause does not contain a join, Oracle applies allpredicates other than the|CONNECT| |BY| predicates _/after/_doing the |CONNECT| |BY|processing without affecting the other rows of the hierarchy. Jeśli klauzura where zawiera i złączenie i nie-złączenie (warunek nie będący złączeniem) to złączenie dokonywane jest przed przetwarzaniem connect by a nie-złączenie po. Jako przykład opiszę w uproszczeniu wykryty błąd: Kwerenda miała zwracać dla danego klienta listę jego przełożonych z jego firmy (w uproszczeniu)

SELECT c.* FROM customer_all c
CONNECT BY PRIOR c.customer_id_high = c.customer_id
START WITH c.customer_id = 123456

Potem zaszła potrzeba, aby pobierać też adresy billingowe klientów. Każdy klient może mieć wiele adresów, ale tylko jeden billingowy czyli z flagą CCBILL=’X’. Kwerenda została zmieniona na

SELECT level, c.*, cc.*
  FROM customer_all c,
       ccontact_all cc
 WHERE c.customer_id = cc.customer_id
   AND cc.ccbill = 'X'
CONNECT BY PRIOR c.customer_id_high = c.customer_id
START WITH c.customer_id = 123456

I tu był błąd. Warunek c.customer_id = cc.customer_id nastąpił przed wykonaniem connect by a cc.ccbill = ‘X’ po, co spowodowało że błąd wystąpił, gdy klient 123456 miał dwa adresy a jego przełożony i przełożony przełożonego mieli po jednym. Liczenie zapytania wyglądało następująco: Etap 1: w wyniku zapytania występują 2 rekordy dla klienta 123456, jeden dla adresu billingowego, drugi dla innego adresu Etap 2: dla obu rekordów powstałych w poprzednim etapie wykonywany jest warunek connect by. Do wyników 2 razy zostaje wrzucony przełożony 123456 Etap 3: dla obu rekordów powstałych w poprzednim etapie wykonywany jest warunek connect by. Do wyników 2 razy zostaje wrzucony przełożony przełożonego 123456 Etap 4: Usunięty został rekord z adresem niebillingowym dla klienta 123456, ale rekordy jego przełożonych dodane z powodu tego rekordu nie zostały usunięte ponieważ

*ich* adres był billingowy (nie mieli żadnego nie-billingowego). Skutek był taki, że kwerenda zwróciła 5 rekordów, a powinna była zwrócić 3: rekordy przełożonego i przełożonego przełożonego zostały zdublikowane z powodu drugiego adresu klienta 123456. Rekord z jego drugim adresem został usunięty, ale gałąź, która z niego powstała – nie. Rozwiązaniem było dodanie warunku na poprzedni rekord w CONNECT BY, koniecznie z PRIOR: PRIOR cc.ccbill = ‘X’

SELECT level, c.*, cc.*
  FROM customer_all c, ccontact_all cc
 WHERE c.customer_id = cc.customer_id
   AND cc.ccbill = 'X'
CONNECT BY PRIOR c.customer_id_high = c.customer_id
       AND PRIOR cc.ccbill = 'X'
 START WITH c.customer_id = 123456

Bezpiecznym rozwiązaniem byłoby też robienie joina z ccontract_all dopiero po obliczeniu wyniku kwerendy hierarchicznej.

SELECT *
  FROM (SELECT c. *
          FROM customer_all c,
        CONNECT BY PRIOR c.customer_id_high = c.customer_id
         START WITH c.customer_id = 123456) c,
       ccontract_all cc
 WHERE c.customer_id = cc.customer_id
   AND cc.ccbill = 'X'

Jeszcze taką ciekawą rzecz zauważyłem: gdy w zapytaniu posiadającym i złączenia i kwerendę hierarchiczną zrobi się złączenie poprzez konstrukcję JOIN … ON a nie warunek w WHERE to czas wykonania wzrasta kilkaset tysięcy razy! Być może są przypadki, gdy jest odwrotnie, więc warto to zawsze sprawdzić.

DDevelopment & Design

33rd Degree day 1 review

byJakub Nabrdalik
March 24, 2012

33rd Degree is over. After the one last year, my expectations were very high, but Grzegorz Duda once again proved he's more than able to deliver. With up to five tracks (most of the time: four presentations + one workshop), and ~650 attendees, there was a lot to see and a lot to do, thus everyone will probably have a little bit different story to tell. Here is mine.

Twitter: From Ruby on Rails to the JVM

Raffi Krikorian talking about Twitter and JVM

The conference started with Raffi Krikorian from Twitter, talking about their use for JVM. Twitter was build with Ruby but with their performance management a lot of the backend was moved to Scala, Java and Closure. Raffi noted, that for Ruby programmers Scala was easier to grasp than Java, more natural, which is quite interesting considering how many PHP guys move to Ruby these days because of the same reasons. Perhaps the path of learning Jacek Laskowski once described (Java -> Groovy -> Scala/Closure) may be on par with PHP -> Ruby -> Scala. It definitely feels like Scala is the holy grail of languages these days.

Raffi also noted, that while JVM delivered speed and a concurrency model to Twitter stack, it wasn't enough, and they've build/customized their own Garbage Collector. My guess is that Scala/Closure could also be used because of a nice concurrency solutions (STM, immutables and so on).

Raffi pointed out, that with the scale of Twitter, you easily get 3 million hits per second, and that means you probably have 3 edge cases every second. I'd love to learn listen to lessons they've learned from this.

Complexity of Complexity

The second keynote of the first day, was Ken Sipe talking about complexity. He made a good point that there is a difference between complex and complicated, and that we often recognize things as complex only because we are less familiar with them. This goes more interesting the moment you realize that the shift in last 20 years of computer languages, from the "Less is more" paradigm (think Java, ASM) to "More is better" (Groovy/Scala/Closure), where you have more complex language, with more powerful and less verbose syntax, that is actually not more complicated, it just looks less familiar.

So while 10 years ago, I really liked Java as a general purpose language for it's small set of rules that could get you everywhere, it turned out that to do most of the real world stuff, a lot of code had to be written. The situation got better thanks to libraries/frameworks and so on, but it's just patching. New languages have a lot of stuff build into, which makes their set of rules and syntax much more complex, but once you get familiar, the real world usage is simple, faster, better, with less traps laying around, waiting for you to fall.

Ken also pointed out, that while Entity Service Bus looks really simple on diagrams, it's usually very difficult and complicated to use from the perspective of the programmer. And that's probably why it gets chosen so often - the guys selling/buying it, look no deeper than on the diagram.

Pointy haired bosses and pragmatic programmers: Facts and Fallacies of Software Development

Dima got lucky. Or maybe not.

Venkat Subramaniam is the kind of a speaker that talk about very simple things in a way, which makes everyone either laugh or reflect. Yes, he is a showman, but hey, that's actually good, because even if you know the subject quite well, his talks are still very entertaining.

This talk was very generic (here's my thesis: the longer the title, the more generic the talk will be), interesting and fun, but at the end I'm unable to see anything new I'd have learned, apart from the distinction between Dynamic vs Static and Strong vs Weak typing, which I've seen the last year, but managed to forgot. This may be a very interesting argument for all those who are afraid of Groovy/Ruby, after bad experience with PHP or Perl.

Build Trust in Your Build to Deployment Flow!

Frederic Simon talked about DevOps and deployment, and that was a miss in my schedule, because of two reasons. First, the talk was aimed at DevOps specifically, and while the subject is trendy lately, without big-scale problems, deployment is a process I usually set up and forget about. It just works, mostly because I only have to deal with one (current) project at a time.

Not much love for Dart.

Second, while Frederic has a fabulous accent and a nice, loud voice, he tends to start each sentence loud and fade the sound at the end. This, together with mics failing him badly, made half of the presentation hard to grasp unless you were sitting in the first row.

I'm not saying the presentation was bad, far from it, it just clearly wasn't for me.

I've left a few minutes before the end, to see how many people came to Dart presentation by Mike West. I was kind of interested, since I'm following Warsaw Google Technology User Group and heard a few voices about why I should pay attentions to that new Google language. As you can see from the picture on the right, the majority tends to disagree with that opinion.

Non blocking, composable reactive web programming with Iteratees

Sadek Drobi's talk about Iteratees in Play 2.0 was very refreshing. Perhaps because I've never used Play before, but the presentation was flawless, with well explained problems, concepts and solutions.

Sadek started with a reflection on how much CPU we waste waiting for IO in web development, then moved to Play's Iteratees, to explain the concept and implementation, which while very different from the that overused Request/Servlet model, looked really nice and simple. I'm not sure though, how much the problem is present when you have a simple service, serving static content before your app server. Think apache (and faster) before tomcat. That won't fix the upload/download issue though, which is beautifully solved in Play 2.0

The Future of the Java Platform: Java SE 8 & Beyond

Simon Ritter is an intriguing fellow. If you take a glance at his work history (AT&T UNIX System Labs -> Novell -> Sun -> Oracle), you can easily see, he's a heavy weight player.

His presentation was rich in content, no corpo-bullshit. He started with a bit of history of JCP and how it looks like right now, then moved to the most interesting stuff, changes. Now I could give you a summary here, but there is really no point: you'd be much better taking look at the slides. There are only 48 of them, but everything is self-explanatory.

To Java SE 8 and Beyond! | Simon Ritter

While I'm very disappointed with the speed of changes, especially when compared to the C# world, I'm glad with the direction and the fact that they finally want to BREAK the compatibility with the broken stuff (generics, etc.). Moving to other languages I guess I won't be the one to scream "My god, finally!" somewhere in 2017, though. All the changes together look very promising, it's just that I'd like to have them like... now? Next year max, not near the heat death of the universe.

Simon also revealed one of the great mysteries of Java, to me:

The original idea behind JNI was to make it hard to write, to discourage people form using it.

On a side note, did you know Tegra3 has actually 5 cores? You use 4 of them, and then switch to the other one, when you battery gets low.

BOF: Spring and CloudFoundry

Having most of my folks moved to see "Typesafe stack 2.0" fabulously organized by Rafał Wasilewski and Wojtek Erbetowski (with both of whom I had a pleasure to travel to the conference) and knowing it will be recorded, I've decided to see what Josh Long has to say about CloudFoundry, a subject I find very intriguing after the de facto fiasco of Google App Engine.

The audience was small but vibrant, mostly users of Amazon EC2, and while it turned out that Josh didn't have much, with pricing and details not yet public, the fact that Spring Source has already created their own competition (Could Foundry is both an Open Source app and a service), takes a lot from my anxiety.

For the review of the second day of the conference, go here.

Subtelny feature konstrukcji CONNECT BY

Tags:

Jan Matusiewicz

RTMP - Real Time Messaging Protocol

Migracja strony kodowej

Agile UX – Confitura 2013

33rd Degree day 1 review

Twitter: From Ruby on Rails to the JVM

Complexity of Complexity

Pointy haired bosses and pragmatic programmers: Facts and Fallacies of Software Development

Build Trust in Your Build to Deployment Flow!

Non blocking, composable reactive web programming with Iteratees

The Future of the Java Platform: Java SE 8 & Beyond

BOF: Spring and CloudFoundry

Using Kotlin extensions in Groovy

Subtelny feature konstrukcji CONNECT BY

Tags:

Jan Matusiewicz

RTMP - Real Time Messaging Protocol

Migracja strony kodowej

You May Also Like

Twitter: From Ruby on Rails to the JVM

Complexity of Complexity

Pointy haired bosses and pragmatic programmers: Facts and Fallacies of Software Development

Build Trust in Your Build to Deployment Flow!

Non blocking, composable reactive web programming with Iteratees

The Future of the Java Platform: Java SE 8 & Beyond

BOF: Spring and CloudFoundry