Maintaining PriorityQueue Order with Java Streams

The tricky thing about working with PriorityQueues is that, ironically, they don’t always behave according to the PriorityQueue semantics.

PriorityQueue Traversal

If we have a look at PriorityQueue.iterator() documentation, we’ll see that, unintuitively, iterator() is not traversing the queue according to its priority order:

Returns an iterator over the elements in this queue. The iterator does not return the elements in any particular order.

The same behaviour can be noticed when trying to use Java Stream API for processing PriorityQueue’s elements using the instance obtained by the stream() method – the Stream instance depends on the Spliterator instance which doesn’t guarantee the desired traversal order.

PriorityQueue<String> queue = new PriorityQueue<>(comparing(String::length));
List<String> content = Arrays.asList("1", "333", "22", "55555", "4444");
queue.addAll(content);

assertThat(queue.stream())
  .containsExactlyElementsOf(content);

We can see that the insertion order gets preserved regardless of the fact that our queue was expected to be providing Strings according to their length.

Solution #1

We can use the poll() method to fetch elements from the queue according to the priority order.

So – let’s generate a Stream from consecutive elements, returned by the poll() method, using Stream.generate() method:

List<String> result = Stream.generate(queue::poll)
  .limit(queue.size())
  .collect(Collectors.toList());

assertThat(result)
  .containsExactly("1", "22", "333", "4444", "55555");
assertThat(queue)
  .isEmpty();

The problem with this implementation is that it’s not concurrent-modification-friendly. Until Java 9 gets released, we can’t terminate the generated Stream dynamically so we need to rely on limiting the Stream size to the queue size – which is not perfect because this can change during an actual processing.

The crucial part of this implementation is that after consuming a Stream instance, we end up with a modified queue – the poll() method removes the polled element from the queue.

Eventually, this approach can be extracted to the separate utility method:

static <T> Stream<T> drainToStream(PriorityQueue<T> queue) {
    Objects.requireNonNull(queue);
    return Stream.generate(queue::poll)
      .limit(queue.size());
}

Java 9

Since Java 9, it’ll be possible to rewrite it in a concurrent-friendly manner:

static <T> Stream<T> drainToStream(PriorityQueue<T> queue) {
    Objects.requireNonNull(queue);
    return Stream.generate(queue::poll)
      .takeWhile(Objects::nonNull)
}

Solution #2

The second approach involves simply sorting the Stream instance using the same comparator that the queue uses. We need to remember that this will work as long as the queue was initialized using a custom comparator:

List<String> result = queue.stream()
  .sorted(queue.comparator())
  .collect(Collectors.toList());

assertThat(result)
  .containsExactly("1", "22", "333", "4444", "55555");
assertThat(queue)
  .isNotEmpty();

This approach can be used with any Collection type (as long as we can get ahold of the right Comparator instance).

If we store Comparable objects in the queue and depend on their natural order, this becomes even simpler because we do not need to reach for the Comparator instance:

PriorityQueue<String> queue = new PriorityQueue<>();
queue.addAll(Arrays.asList("1", "333", "22", "55555", "4444"));

List<String> result = queue.stream()
  .sorted()
  .collect(Collectors.toList());

assertThat(result)
  .containsExactly("1", "22", "333", "4444", "55555");
assertThat(queue)
  .isNotEmpty();

In this case, after consuming a Stream instance, our original queue remains intact.

Eventually, this approach can be extracted to the separate utility method:

static <T> Stream<T> asStream(PriorityQueue<T> queue) {
    Objects.requireNonNull(queue);
    Comparator<? super T> comparator = queue.comparator();
    return comparator != null
      ? queue.stream().sorted(comparator)
      : queue.stream().sorted();
}

Conclusion

The priority order of the PriorityQueue is not preserved when iterating/traversing so, essentially, we need to create our Stream instances ourselves.

The working code snippets can be found on GitHub.

You May Also Like

JBoss Envers and Spring transaction managers

I've stumbled upon a bug with my configuration for JBoss Envers today, despite having integration tests all over the application. I have to admit, it casted a dark shadow of doubt about the value of all the tests for a moment. I've been practicing TDD since 2005, and frankly speaking, I should have been smarter than that.

My fault was simple. I've started using Envers the right way, with exploratory tests and a prototype. Then I've deleted the prototype and created some integration tests using in-memory H2 that looked more or less like this example:

@Test
public void savingAndUpdatingPersonShouldCreateTwoHistoricalVersions() {
    //given
    Person person = createAndSavePerson();
    String oldFirstName = person.getFirstName();
    String newFirstName = oldFirstName + "NEW";

    //when
    updatePersonWithNewName(person, newFirstName);

    //then
    verifyTwoHistoricalVersionsWereSaved(oldFirstName, newFirstName);
}

private Person createAndSavePerson() {
    Transaction transaction = session.beginTransaction();
    Person person = PersonFactory.createPerson();
    session.save(person);
    transaction.commit();
    return person;
}    

private void updatePersonWithNewName(Person person, String newName) {
    Transaction transaction = session.beginTransaction();
    person.setFirstName(newName);
    session.update(person);
    transaction.commit();
}

private void verifyTwoHistoricalVersionsWereSaved(String oldFirstName, String newFirstName) {
    List<Object[]> personRevisions = getPersonRevisions();
    assertEquals(2, personRevisions.size());
    assertEquals(oldFirstName, ((Person)personRevisions.get(0)[0]).getFirstName());
    assertEquals(newFirstName, ((Person)personRevisions.get(1)[0]).getFirstName());
}

private List<Object[]> getPersonRevisions() {
    Transaction transaction = session.beginTransaction();
    AuditReader auditReader = AuditReaderFactory.get(session);
    List<Object[]> personRevisions = auditReader.createQuery()
            .forRevisionsOfEntity(Person.class, false, true)
            .getResultList();
    transaction.commit();
    return personRevisions;
}

Because Envers inserts audit data when the transaction is commited (in a new temporary session), I thought I have to create and commit the transaction manually. And that is true to some point.

My fault was that I didn't have an end-to-end integration/acceptance test, that would call to entry point of the application (in this case a service which is called by GWT via RPC), because then I'd notice, that the Spring @Transactional annotation, and calling transaction.commit() are two, very different things.

Spring @Transactional annotation will use a transaction manager configured for the application. Envers on the other hand is used by subscribing a listener to hibernate's SessionFactory like this:

<bean id="sessionFactory" class="org.springframework.orm.hibernate3.annotation.AnnotationSessionFactoryBean" >        
...
 <property name="eventListeners">
     <map key-type="java.lang.String" value-type="org.hibernate.event.EventListeners">
         <entry key="post-insert" value-ref="auditEventListener"/>
         <entry key="post-update" value-ref="auditEventListener"/>
         <entry key="post-delete" value-ref="auditEventListener"/>
         <entry key="pre-collection-update" value-ref="auditEventListener"/>
         <entry key="pre-collection-remove" value-ref="auditEventListener"/>
         <entry key="post-collection-recreate" value-ref="auditEventListener"/>
     </map>
 </property>
</bean>

<bean id="auditEventListener" class="org.hibernate.envers.event.AuditEventListener" />

Envers creates and collects something called AuditWorkUnits whenever you update/delete/insert audited entities, but audit tables are not populated until something calls AuditProcess.beforeCompletion, which makes sense. If you are using org.hibernate.transaction.JDBCTransaction manually, this is called on commit() when notifying all subscribed javax.transaction.Synchronization objects (and enver's AuditProcess is one of them).

The problem was, that I used a wrong transaction manager.

<bean id="transactionManager" class="org.springframework.jdbc.datasource.DataSourceTransactionManager" >
    <property name="dataSource" ref="dataSource"/>
</bean>

This transaction manager doesn't know anything about hibernate and doesn't use org.hibernate.transaction.JDBCTransaction. While Synchronization is an interface from javax.transaction package, DataSourceTransactionManager doesn't use it (maybe because of simplicity, I didn't dig deep enough in org.springframework.jdbc.datasource), and thus Envers works fine except not pushing the data to the database.

Which is the whole point of using Envers.

Use right tools for the task, they say. The whole problem is solved by using a transaction manager that is well aware of hibernate underneath.

<bean id="transactionManager" class="org.springframework.orm.hibernate3.HibernateTransactionManager" >
    <property name="sessionFactory" ref="sessionFactory"/>
</bean>

Lesson learned: always make sure your acceptance tests are testing the right thing. If there is a doubt about the value of your tests, you just don't have enough of them,

Devoxx 2012 review


I'm sitting in a train to Charleroi, looking through a window at the Denmark landscape, street lights flashing by, people comming home from work, getting out for a Friday night party, or having a family dinner. To my left, guys from SoftwareMill are playing cards.
I don't really see them. My mind is busy elsewhere, sorting out and processing last two days in Antwerp, where 3400 developers, from 41 different countries, listened to 200 different sessions at the Devoxx, AFAIK the biggest Java conference this year.