Idiomatic Peeking with Java Stream API

The peek() method from the Java Stream API is often misunderstood. Let’s try to clarify how it works and when should be used.

Stream

 

Introduction

Let’s start responsibly by RTFM inspecting peek()’s user’s manual.

Returns a stream consisting of the elements of this stream, additionally performing the provided action on each element as elements are consumed from the resulting stream.
Sounds pretty straightforward. We can use it for applying side-effects for every consumed Stream element:

Stream.of("one", "two", "three")
  .peek(e -> System.out.println(e + " was consumed by peek()."))
  .collect(Collectors.toList());

The result of such operation is not surprising:

one was consumed by peek().
two was consumed by peek().
three was consumed by peek().

Stream/peek Lazy Evaluation

but… what will happen if we replace the terminal collect() operation with a forEach()?

Stream.of("one", "two", "three")
  .peek(e -> System.out.println(e + " was consumed by peek()."))
  .forEach(System.out::println);

It might be tempting to think that we’ll see a series of peek() logs followed by a series of for-each logs but this is not the case:

one was consumed by peek().
one
two was consumed by peek().
two
three was consumed by peek().
three

It gets even more interesting if we get rid of the forEach call:

Stream.of("one", "two", "three")
  .peek(e -> System.out.println(e + " was consumed by peek()."));

the result would be nothing:


As JavaDoc states:

Intermediate operations return a new stream. They are always lazy; executing an intermediate operation such as filter() does not actually perform any filtering, but instead creates a new stream that, when traversed, contains the elements of the initial stream that match the given predicate. Traversal of the pipeline source does not begin until the terminal operation of the pipeline is executed.
So, because of the lazy evaluation Stream pipelines are always being traversed “vertically” and not “horizontally” which allows to avoid doing unnecessary calculations. Additionally, the traversal is triggered only when a terminal method is present. Hence, the observed behaviour.

This is why it’s possible to represent and manipulate infinite sequences using Streams. (Because where do you store an infinite sequence? In the cloud?):

Stream.iterate(0, i -> i + 1)
  .peek(System.out::println)
  .findFirst();

The above operation completes almost immediately because of the lazy character of Stream traversal. Of course, if you try to collect the whole infinite sequence to some data structure, even laziness will not save you.

So, we can see that peek() can’t be treated as an intermediate for-each replacement because it invokes the passed Consumer only on elements that are visited by the Stream.

Unfortunately, Streams do not always behave entirely lazily.

Proper Usage

Further inspection of the official docs reveals a note:

This method exists mainly to support debugging, where you want to see the elements as they flow past a certain point in a pipeline
The point of confusion is the “mainly” word. Let’s have a peek(pun intended) at the English dictionary:

We can see that non-debugging usages are not forbidden nor discouraged.

So, technically, we should be able to, e.g., modify the Stream elements on the fly which would not be possible in the immutable, functional world. Let’s use the infamous mutable java.util.Date:

Stream.of(Date.from(Instant.EPOCH))
  .peek(d -> d.setTime(Long.MAX_VALUE))
  .forEach(System.out::println);

// Sun Aug 17 08:12:55 CET 292278994

and we can observe that the result is far away from the standard epoch, which means the mutating operation was indeed applied.

The problem is that this behaviour is highly deceiving because certain Stream implementations can optimize out peek() calls.

This gets clarified in the early draft of JDK 9’s docs eventually:

In cases where the stream implementation is able to optimize away the production of some or all the elements (such as with short-circuiting operations like findFirst, or in the example described in count()), the action will not be invoked for those elements. (…) An implementation may choose to not execute the stream pipeline (either sequentially or in parallel) if it is capable of computing the count directly from the stream source.
So, now it’s clear that it might not be the best choice if we want to perform some side effects deterministically.

According to this, peek() might not be even that reliable debugging tool after all.

Key Takeaways

  • The peek() method works fine as a debugging tool when we want to see what is being consumed by a Stream
  • It seems to work fine when applying mutating operations but should not be used this way because this behaviour is non-deterministic due to the possibility of certain peek() calls being omitted due to internal optimization
  • The discussion, whether mutation operations should be allowed or not, would never take place if we were restricted to operate only on immutable values
  • We have around 292276977 years before we run out of java.util.Date range
You May Also Like

Integration testing custom validation constraints in Jersey 2

I recently joined a team trying to switch a monolithic legacy system into set of RESTful services in Java. They decided to use latest 2.x version of Jersey as a REST container which was not a first choice for me, since I’m not a big fan of JSR-* specs. But now I must admit that JAX-RS 2.x is doing things right: requires almost zero boilerplate code, support auto-discovery of features and prefers convention over configuration like other modern frameworks. Since the spec is still young, it’s hard to find good tutorials and kick-off projects with some working code. I created jersey2-starter project on GitHub which can be used as starting point for your own production-ready RESTful service. In this post I’d like to cover how to implement and integration test your own validation constraints of REST resources.

Custom constraints

One of the issues which bothers me when coding REST in Java is littering your class model with annotations. Suppose you want to build a simple Todo list REST service, when using Jackson, validation and Spring Data, you can easily end up with this as your entity class:

@Document
public class Todo {
    private Long id;
    @NotNull
    private String description;
    @NotNull
    private Boolean completed;
    @NotNull
    private DateTime dueDate;

    @JsonCreator
    public Todo(@JsonProperty("description") String description, @JsonProperty("dueDate") DateTime dueDate) {
        this.description = description;
        this.dueDate = dueDate;
        this.completed = false;
    }
    // getters and setters
}

Your domain model is now effectively blured by messy annotations almost everywhere. Let’s see what we can do with validation constraints (@NotNulls). Some may say that you could introduce some DTO layer with own validation rules, but it conflicts for me with pure REST API design, which stands that you operate on resources which should map to your domain classes. On the other hand - what does it mean that Todo object is valid? When you create a Todo you should provide a description and due date, but what when you’re updating? You should be able to change any of description, due date (postponing) and completion flag (marking as done) - but you should provide at least one of these as valid modification. So my idea is to introduce custom validation constraints, different ones for creation and modification:

@Target({TYPE, PARAMETER})
@Retention(RUNTIME)
@Constraint(validatedBy = ValidForCreation.Validator.class)
public @interface ValidForCreation {
    //...
    class Validator implements ConstraintValidator<ValidForCreation, Todo> {
    /...
        @Override
        public boolean isValid(Todo todo, ConstraintValidatorContext constraintValidatorContext) {
            return todo != null
                && todo.getId() == null
                && todo.getDescription() != null
                && todo.getDueDate() != null;
        }
    }
}

@Target({TYPE, PARAMETER})
@Retention(RUNTIME)
@Constraint(validatedBy = ValidForModification.Validator.class)
public @interface ValidForModification {
    //...
    class Validator implements ConstraintValidator<ValidForModification, Todo> {
    /...
        @Override
        public boolean isValid(Todo todo, ConstraintValidatorContext constraintValidatorContext) {
            return todo != null
                && todo.getId() == null
                && (todo.getDescription() != null || todo.getDueDate() != null || todo.isCompleted() != null);
        }
    }
}

And now you can move validation annotations to the definition of a REST endpoint:

@POST
@Consumes(APPLICATION_JSON)
public Response create(@ValidForCreation Todo todo) {...}

@PUT
@Consumes(APPLICATION_JSON)
public Response update(@ValidForModification Todo todo) {...}

And now you can remove those NotNulls from your model.

Integration testing

There are in general two approaches to integration testing:

  • test is being run on separate JVM than the app, which is deployed on some other integration environment
  • test deploys the application programmatically in the setup block.

Both of these have their pros and cons, but for small enough servoces, I personally prefer the second approach. It’s much easier to setup and you have only one JVM started, which makes debugging really easy. You can use a generic framework like Arquillian for starting your application in a container environment, but I prefer simple solutions and just use emdedded Jetty. To make test setup 100% production equivalent, I’m creating full Jetty’s WebAppContext and have to resolve all runtime dependencies for Jersey auto-discovery to work. This can be simply achieved with Maven resolved from Shrinkwrap - an Arquillian subproject:

    WebAppContext webAppContext = new WebAppContext();
    webAppContext.setResourceBase("src/main/webapp");
    webAppContext.setContextPath("/");
    File[] mavenLibs = Maven.resolver().loadPomFromFile("pom.xml")
                .importCompileAndRuntimeDependencies()
                .resolve().withTransitivity().asFile();
    for (File file: mavenLibs) {
        webAppContext.getMetaData().addWebInfJar(new FileResource(file.toURI()));
    }
    webAppContext.getMetaData().addContainerResource(new FileResource(new File("./target/classes").toURI()));

    webAppContext.setConfigurations(new Configuration[] {
        new AnnotationConfiguration(),
        new WebXmlConfiguration(),
        new WebInfConfiguration()
    });
    server.setHandler(webAppContext);

(this Stackoverflow thread inspired me a lot here)

Now it’s time for the last part of the post: parametrizing our integration tests. Since we want to test validation constraints, there are many edge paths to check (and make your code coverage close to 100%). Writing one test per each case could be a bad idea. Among the many solutions for JUnit I’m most convinced to the Junit Params by Pragmatists team. It’s really simple and have nice concept of JQuery-like helper for creating providers. Here is my tests code (I’m also using builder pattern here to create various kinds of Todos):

@Test
@Parameters(method = "provideInvalidTodosForCreation")
public void shouldRejectInvalidTodoWhenCreate(Todo todo) {
    Response response = createTarget().request().post(Entity.json(todo));

    assertThat(response.getStatus()).isEqualTo(BAD_REQUEST.getStatusCode());
}

private static Object[] provideInvalidTodosForCreation() {
    return $(
        new TodoBuilder().withDescription("test").build(),
        new TodoBuilder().withDueDate(DateTime.now()).build(),
        new TodoBuilder().withId(123L).build(),
        new TodoBuilder().build()
    );
}

OK, enough of reading, feel free to clone the project and start writing your REST services!

I recently joined a team trying to switch a monolithic legacy system into set of RESTful services in Java. They decided to use latest 2.x version of Jersey as a REST container which was not a first choice for me, since I’m not a big fan of JSR-* specs. But now I must admit that JAX-RS 2.x is doing things right: requires almost zero boilerplate code, support auto-discovery of features and prefers convention over configuration like other modern frameworks. Since the spec is still young, it’s hard to find good tutorials and kick-off projects with some working code. I created jersey2-starter project on GitHub which can be used as starting point for your own production-ready RESTful service. In this post I’d like to cover how to implement and integration test your own validation constraints of REST resources.

Custom constraints

One of the issues which bothers me when coding REST in Java is littering your class model with annotations. Suppose you want to build a simple Todo list REST service, when using Jackson, validation and Spring Data, you can easily end up with this as your entity class:

@Document
public class Todo {
    private Long id;
    @NotNull
    private String description;
    @NotNull
    private Boolean completed;
    @NotNull
    private DateTime dueDate;

    @JsonCreator
    public Todo(@JsonProperty("description") String description, @JsonProperty("dueDate") DateTime dueDate) {
        this.description = description;
        this.dueDate = dueDate;
        this.completed = false;
    }
    // getters and setters
}

Your domain model is now effectively blured by messy annotations almost everywhere. Let’s see what we can do with validation constraints (@NotNulls). Some may say that you could introduce some DTO layer with own validation rules, but it conflicts for me with pure REST API design, which stands that you operate on resources which should map to your domain classes. On the other hand - what does it mean that Todo object is valid? When you create a Todo you should provide a description and due date, but what when you’re updating? You should be able to change any of description, due date (postponing) and completion flag (marking as done) - but you should provide at least one of these as valid modification. So my idea is to introduce custom validation constraints, different ones for creation and modification:

@Target({TYPE, PARAMETER})
@Retention(RUNTIME)
@Constraint(validatedBy = ValidForCreation.Validator.class)
public @interface ValidForCreation {
    //...
    class Validator implements ConstraintValidator<ValidForCreation, Todo> {
    /...
        @Override
        public boolean isValid(Todo todo, ConstraintValidatorContext constraintValidatorContext) {
            return todo != null
                && todo.getId() == null
                && todo.getDescription() != null
                && todo.getDueDate() != null;
        }
    }
}

@Target({TYPE, PARAMETER})
@Retention(RUNTIME)
@Constraint(validatedBy = ValidForModification.Validator.class)
public @interface ValidForModification {
    //...
    class Validator implements ConstraintValidator<ValidForModification, Todo> {
    /...
        @Override
        public boolean isValid(Todo todo, ConstraintValidatorContext constraintValidatorContext) {
            return todo != null
                && todo.getId() == null
                && (todo.getDescription() != null || todo.getDueDate() != null || todo.isCompleted() != null);
        }
    }
}

And now you can move validation annotations to the definition of a REST endpoint:

@POST
@Consumes(APPLICATION_JSON)
public Response create(@ValidForCreation Todo todo) {...}

@PUT
@Consumes(APPLICATION_JSON)
public Response update(@ValidForModification Todo todo) {...}

And now you can remove those NotNulls from your model.

Integration testing

There are in general two approaches to integration testing:

  • test is being run on separate JVM than the app, which is deployed on some other integration environment
  • test deploys the application programmatically in the setup block.

Both of these have their pros and cons, but for small enough servoces, I personally prefer the second approach. It’s much easier to setup and you have only one JVM started, which makes debugging really easy. You can use a generic framework like Arquillian for starting your application in a container environment, but I prefer simple solutions and just use emdedded Jetty. To make test setup 100% production equivalent, I’m creating full Jetty’s WebAppContext and have to resolve all runtime dependencies for Jersey auto-discovery to work. This can be simply achieved with Maven resolved from Shrinkwrap - an Arquillian subproject:

    WebAppContext webAppContext = new WebAppContext();
    webAppContext.setResourceBase("src/main/webapp");
    webAppContext.setContextPath("/");
    File[] mavenLibs = Maven.resolver().loadPomFromFile("pom.xml")
                .importCompileAndRuntimeDependencies()
                .resolve().withTransitivity().asFile();
    for (File file: mavenLibs) {
        webAppContext.getMetaData().addWebInfJar(new FileResource(file.toURI()));
    }
    webAppContext.getMetaData().addContainerResource(new FileResource(new File("./target/classes").toURI()));

    webAppContext.setConfigurations(new Configuration[] {
        new AnnotationConfiguration(),
        new WebXmlConfiguration(),
        new WebInfConfiguration()
    });
    server.setHandler(webAppContext);

(this Stackoverflow thread inspired me a lot here)

Now it’s time for the last part of the post: parametrizing our integration tests. Since we want to test validation constraints, there are many edge paths to check (and make your code coverage close to 100%). Writing one test per each case could be a bad idea. Among the many solutions for JUnit I’m most convinced to the Junit Params by Pragmatists team. It’s really simple and have nice concept of JQuery-like helper for creating providers. Here is my tests code (I’m also using builder pattern here to create various kinds of Todos):

@Test
@Parameters(method = "provideInvalidTodosForCreation")
public void shouldRejectInvalidTodoWhenCreate(Todo todo) {
    Response response = createTarget().request().post(Entity.json(todo));

    assertThat(response.getStatus()).isEqualTo(BAD_REQUEST.getStatusCode());
}

private static Object[] provideInvalidTodosForCreation() {
    return $(
        new TodoBuilder().withDescription("test").build(),
        new TodoBuilder().withDueDate(DateTime.now()).build(),
        new TodoBuilder().withId(123L).build(),
        new TodoBuilder().build()
    );
}

OK, enough of reading, feel free to clone the project and start writing your REST services!

Mentoring in Software Craftsmanship

Maria Diaconu and  Alexandru Bolboaca are both strong supporters of Software Craftsmanship and Agile movements in Romania, with years of experience as developers, leaders, architects, managers and instructors. On their lecture at Agile Central Eur...Maria Diaconu and  Alexandru Bolboaca are both strong supporters of Software Craftsmanship and Agile movements in Romania, with years of experience as developers, leaders, architects, managers and instructors. On their lecture at Agile Central Eur...