A subtle feature of the CONNECT BY construct

If you use the CONNECT BY construct and want to avoid subtle bugs, you should get familiar with the property described below. When a hierarchical query (one that uses CONNECT BY … START WITH …) also contains joins, whether ANSI or Oracle-style, the result is unintuitive. I have just found a significant bug in my code related to this. A hierarchical query lets you retrieve in a single statement, for example, the list of all superiors of a given employee: not only the direct ones, but also the superiors of those superiors, and so on. In other words, it extracts information from a tree. A good description of how hierarchical queries work can be found here:

http://download.oracle.com/docs/cd/B10501_01/server.920/a96540/queries4a.htm#2053937

Two points there are important:
* If the WHERE predicate contains a join, Oracle applies the join predicates before doing the CONNECT BY processing.
* If the WHERE clause does not contain a join, Oracle applies all predicates other than the CONNECT BY predicates after doing the CONNECT BY processing without affecting the other rows of the hierarchy.

If the WHERE clause contains both a join and a non-join (a condition that is not a join), the join is applied before the CONNECT BY processing and the non-join after it. As an example, here is a simplified description of the bug I found. The query was supposed to return, for a given customer, the list of its superiors within its company (simplified):

SELECT c.* FROM customer_all c
CONNECT BY PRIOR c.customer_id_high = c.customer_id
START WITH c.customer_id = 123456

Later we also needed to fetch the customers' billing addresses. Each customer can have many addresses, but only one billing address, i.e. the one with the flag CCBILL='X'. The query was changed to:

SELECT level, c.*, cc.*
  FROM customer_all c,
       ccontact_all cc
 WHERE c.customer_id = cc.customer_id
   AND cc.ccbill = 'X'
CONNECT BY PRIOR c.customer_id_high = c.customer_id
START WITH c.customer_id = 123456

And that was the bug. The condition c.customer_id = cc.customer_id was applied before the CONNECT BY processing and cc.ccbill = 'X' after it, so the error showed up when customer 123456 had two addresses while its superior and the superior's superior had one each. The query was evaluated as follows:

Stage 1: the result contains 2 rows for customer 123456, one for the billing address and one for the other address.
Stage 2: the CONNECT BY condition is evaluated for both rows from the previous stage. The superior of 123456 is added to the result twice.
Stage 3: the CONNECT BY condition is evaluated for both rows from the previous stage. The superior of the superior of 123456 is added to the result twice.
Stage 4: the row with the non-billing address of customer 123456 is removed, but the rows of its superiors that were added because of that row are not removed, because *their* address is a billing one (they have no non-billing address).

As a result, the query returned 5 rows where it should have returned 3: the rows of the superior and of the superior's superior were duplicated because of the second address of customer 123456. The row with that second address was removed, but the branch that grew out of it was not. The fix was to add a condition on the previous row to CONNECT BY, necessarily with PRIOR: PRIOR cc.ccbill = 'X'

SELECT level, c.*, cc.*
  FROM customer_all c, ccontact_all cc
 WHERE c.customer_id = cc.customer_id
   AND cc.ccbill = 'X'
CONNECT BY PRIOR c.customer_id_high = c.customer_id
       AND PRIOR cc.ccbill = 'X'
 START WITH c.customer_id = 123456
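The four stages described above can be imitated with a small in-memory sketch. It is not how Oracle implements CONNECT BY; it only mimics the order of operations (join first, hierarchy walk second, remaining WHERE filter last). The customer ids 200 and 300 for the two superiors and the '-' marker for the non-billing address are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// In-memory imitation of the evaluation order; not Oracle's actual implementation.
final class ConnectByDemo {

    // One row of customer_all already joined with ccontact_all.
    static final class Row {
        final long customerId;
        final Long customerIdHigh; // null at the top of the tree
        final char ccbill;         // 'X' marks the billing address

        Row(long customerId, Long customerIdHigh, char ccbill) {
            this.customerId = customerId;
            this.customerIdHigh = customerIdHigh;
            this.ccbill = ccbill;
        }
    }

    // Hypothetical data: 123456 has two addresses, superiors 200 and 300 have one each.
    static List<Row> joinedRows() {
        return Arrays.asList(
            new Row(123456L, 200L, 'X'),
            new Row(123456L, 200L, '-'),
            new Row(200L, 300L, 'X'),
            new Row(300L, null, 'X'));
    }

    // CONNECT BY PRIOR customer_id_high = customer_id [AND PRIOR ccbill = 'X'].
    static void walk(Row prior, List<Row> all, boolean priorMustBeBilling, List<Row> out) {
        out.add(prior);
        if (prior.customerIdHigh == null) return;
        if (priorMustBeBilling && prior.ccbill != 'X') return;
        for (Row next : all) {
            if (prior.customerIdHigh.longValue() == next.customerId) {
                walk(next, all, priorMustBeBilling, out);
            }
        }
    }

    static List<Row> query(boolean priorMustBeBilling) {
        List<Row> all = joinedRows();                 // stage 1: the join
        List<Row> hierarchy = new ArrayList<>();
        for (Row r : all) {                           // START WITH customer_id = 123456
            if (r.customerId == 123456L) walk(r, all, priorMustBeBilling, hierarchy);
        }
        List<Row> result = new ArrayList<>();
        for (Row r : hierarchy) {                     // last stage: AND ccbill = 'X'
            if (r.ccbill == 'X') result.add(r);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(query(false).size()); // original query: prints 5
        System.out.println(query(true).size());  // with AND PRIOR cc.ccbill = 'X': prints 3
    }
}
```

With the extra PRIOR condition the non-billing row of 123456 still enters the hierarchy, but it no longer spawns a branch, so the final filter leaves exactly one row per customer.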

Another safe solution would be to join with ccontact_all only after the result of the hierarchical query has been computed.

SELECT *
  FROM (SELECT c.*
          FROM customer_all c
        CONNECT BY PRIOR c.customer_id_high = c.customer_id
         START WITH c.customer_id = 123456) c,
       ccontact_all cc
 WHERE c.customer_id = cc.customer_id
   AND cc.ccbill = 'X'
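The same idea, walking the hierarchy first and joining afterwards, can be sketched in memory. Again this is only an illustration under assumed data: the ids 200 and 300 and the 1/0 billing flag are hypothetical stand-ins for the real tables.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of "walk the hierarchy first, join afterwards"; all data is hypothetical.
final class JoinAfterHierarchyDemo {

    // customer_id -> customer_id_high (null at the top of the tree).
    static Map<Long, Long> customers() {
        Map<Long, Long> high = new HashMap<>();
        high.put(123456L, 200L);
        high.put(200L, 300L);
        high.put(300L, null);
        return high;
    }

    // Each entry is {customer_id, 1 for a billing address or 0 otherwise}.
    static List<long[]> contacts() {
        return Arrays.asList(
            new long[] {123456L, 1}, new long[] {123456L, 0},
            new long[] {200L, 1}, new long[] {300L, 1});
    }

    // Inner query: the customer and all its superiors, one entry per customer.
    static List<Long> hierarchy(long start) {
        Map<Long, Long> high = customers();
        List<Long> chain = new ArrayList<>();
        Long current = start;
        while (current != null && high.containsKey(current)) {
            chain.add(current);
            current = high.get(current);
        }
        return chain;
    }

    // Outer query: join every hierarchy row with its billing contact only.
    static List<Long> withBillingAddress(long start) {
        List<Long> result = new ArrayList<>();
        for (Long id : hierarchy(start)) {
            for (long[] contact : contacts()) {
                if (contact[0] == id && contact[1] == 1) result.add(id);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(withBillingAddress(123456L)); // prints [123456, 200, 300]
    }
}
```

Because the hierarchy is built from customers alone, the second address of 123456 never gets a chance to duplicate a branch.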

One more interesting thing I noticed: when a query that has both joins and a hierarchical part expresses the join with the JOIN … ON construct instead of a condition in WHERE, the execution time increases several hundred thousand times! There may be cases where the opposite is true, so it is always worth checking.

You May Also Like

Integration testing custom validation constraints in Jersey 2

I recently joined a team trying to turn a monolithic legacy system into a set of RESTful services in Java. They decided to use the latest 2.x version of Jersey as the REST container, which would not have been my first choice, since I’m not a big fan of JSR-* specs. But now I must admit that JAX-RS 2.x is doing things right: it requires almost zero boilerplate code, supports auto-discovery of features and prefers convention over configuration, like other modern frameworks. Since the spec is still young, it’s hard to find good tutorials and kick-off projects with working code. I created the jersey2-starter project on GitHub, which can be used as a starting point for your own production-ready RESTful service. In this post I’d like to cover how to implement and integration-test your own validation constraints for REST resources.

Custom constraints

One of the things that bothers me when coding REST in Java is littering your class model with annotations. Suppose you want to build a simple Todo list REST service. With Jackson, validation and Spring Data, you can easily end up with this as your entity class:

@Document
public class Todo {
    private Long id;
    @NotNull
    private String description;
    @NotNull
    private Boolean completed;
    @NotNull
    private DateTime dueDate;

    @JsonCreator
    public Todo(@JsonProperty("description") String description, @JsonProperty("dueDate") DateTime dueDate) {
        this.description = description;
        this.dueDate = dueDate;
        this.completed = false;
    }
    // getters and setters
}

Your domain model is now effectively blurred by messy annotations almost everywhere. Let’s see what we can do about the validation constraints (the @NotNulls). Some may say that you could introduce a DTO layer with its own validation rules, but for me that conflicts with pure REST API design, which says that you operate on resources that should map to your domain classes. On the other hand, what does it mean for a Todo object to be valid? When you create a Todo you should provide a description and a due date, but what about updates? You should be able to change any of the description, the due date (postponing) and the completion flag (marking as done), but a valid modification must provide at least one of them. So my idea is to introduce custom validation constraints, different ones for creation and modification:

@Target({TYPE, PARAMETER})
@Retention(RUNTIME)
@Constraint(validatedBy = ValidForCreation.Validator.class)
public @interface ValidForCreation {
    //...
    class Validator implements ConstraintValidator<ValidForCreation, Todo> {
        //...
        @Override
        public boolean isValid(Todo todo, ConstraintValidatorContext constraintValidatorContext) {
            return todo != null
                && todo.getId() == null
                && todo.getDescription() != null
                && todo.getDueDate() != null;
        }
    }
}

@Target({TYPE, PARAMETER})
@Retention(RUNTIME)
@Constraint(validatedBy = ValidForModification.Validator.class)
public @interface ValidForModification {
    //...
    class Validator implements ConstraintValidator<ValidForModification, Todo> {
        //...
        @Override
        public boolean isValid(Todo todo, ConstraintValidatorContext constraintValidatorContext) {
            return todo != null
                && todo.getId() == null
                && (todo.getDescription() != null || todo.getDueDate() != null || todo.isCompleted() != null);
        }
    }
}

And now you can move validation annotations to the definition of a REST endpoint:

@POST
@Consumes(APPLICATION_JSON)
public Response create(@ValidForCreation Todo todo) {...}

@PUT
@Consumes(APPLICATION_JSON)
public Response update(@ValidForModification Todo todo) {...}

And now you can remove those NotNulls from your model.
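To see the two rules side by side without a container, here is a minimal plain-Java sketch of the checks the validators perform. It is a simplification, not the project’s code: the TodoRules class and its nested Todo are hypothetical stand-ins, java.time.LocalDate replaces the DateTime type used above, and all fields are nullable so that a partial update can be represented.

```java
import java.time.LocalDate;

// Plain-Java stand-in for the two validators; names here are hypothetical.
final class TodoRules {

    // Simplified Todo with nullable fields (LocalDate instead of DateTime).
    static final class Todo {
        final Long id;
        final String description;
        final Boolean completed;
        final LocalDate dueDate;

        Todo(Long id, String description, Boolean completed, LocalDate dueDate) {
            this.id = id;
            this.description = description;
            this.completed = completed;
            this.dueDate = dueDate;
        }
    }

    // Creation: no id yet, description and due date are both required.
    static boolean validForCreation(Todo todo) {
        return todo != null
            && todo.id == null
            && todo.description != null
            && todo.dueDate != null;
    }

    // Modification: no id in the body, at least one field to change.
    static boolean validForModification(Todo todo) {
        return todo != null
            && todo.id == null
            && (todo.description != null || todo.dueDate != null || todo.completed != null);
    }

    public static void main(String[] args) {
        Todo onlyDescription = new Todo(null, "test", null, null);
        System.out.println(validForCreation(onlyDescription));     // prints false
        System.out.println(validForModification(onlyDescription)); // prints true
    }
}
```

The same object is rejected for creation and accepted for modification, which is exactly why a single set of field-level @NotNull annotations cannot express both cases.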

Integration testing

There are in general two approaches to integration testing:

  • the test runs in a separate JVM from the app, which is deployed in some other integration environment
  • the test deploys the application programmatically in its setup block.

Both have their pros and cons, but for small enough services I personally prefer the second approach. It’s much easier to set up, and you have only one JVM running, which makes debugging really easy. You can use a generic framework like Arquillian to start your application in a container environment, but I prefer simple solutions and just use embedded Jetty. To make the test setup 100% production equivalent, I create a full Jetty WebAppContext and have to resolve all runtime dependencies for Jersey auto-discovery to work. This can be achieved simply with the Maven resolver from ShrinkWrap, an Arquillian subproject:

    WebAppContext webAppContext = new WebAppContext();
    webAppContext.setResourceBase("src/main/webapp");
    webAppContext.setContextPath("/");
    File[] mavenLibs = Maven.resolver().loadPomFromFile("pom.xml")
                .importCompileAndRuntimeDependencies()
                .resolve().withTransitivity().asFile();
    for (File file: mavenLibs) {
        webAppContext.getMetaData().addWebInfJar(new FileResource(file.toURI()));
    }
    webAppContext.getMetaData().addContainerResource(new FileResource(new File("./target/classes").toURI()));

    webAppContext.setConfigurations(new Configuration[] {
        new AnnotationConfiguration(),
        new WebXmlConfiguration(),
        new WebInfConfiguration()
    });
    server.setHandler(webAppContext);

(this Stackoverflow thread inspired me a lot here)

Now it’s time for the last part of the post: parametrizing our integration tests. Since we want to test validation constraints, there are many edge paths to check (and to bring your code coverage close to 100%). Writing one test per case could be a bad idea. Among the many solutions for JUnit, the one that convinces me most is JUnitParams by the Pragmatists team. It’s really simple and has a nice concept of a jQuery-like helper for creating providers. Here is my test code (I’m also using the builder pattern here to create various kinds of Todos):

@Test
@Parameters(method = "provideInvalidTodosForCreation")
public void shouldRejectInvalidTodoWhenCreate(Todo todo) {
    Response response = createTarget().request().post(Entity.json(todo));

    assertThat(response.getStatus()).isEqualTo(BAD_REQUEST.getStatusCode());
}

private static Object[] provideInvalidTodosForCreation() {
    return $(
        new TodoBuilder().withDescription("test").build(),
        new TodoBuilder().withDueDate(DateTime.now()).build(),
        new TodoBuilder().withId(123L).build(),
        new TodoBuilder().build()
    );
}

OK, enough of reading, feel free to clone the project and start writing your REST services!
