Iterating two collections simultaneously

In this article, I will discuss iterating two collections simultaneously. This is not a very common need in Java, but it does come up occasionally. The main goal of this article is to show some code designs that solve the problem while improving code reuse.

Introduction

When you need to iterate over a single collection, you generally have several choices:

  • You can use the Iterator object explicitly.
  • You can use the enhanced for (the so-called “for-each”) since Java 5.
  • You can use the forEach method available on the Iterable interface since JDK 8.
  • You can use a stream, from the Stream API since JDK 8, for any “advanced” usage.

Furthermore, if you have specifically an ArrayList or a Vector, you can also use the classic indexed for loop. However, this approach is inefficient for LinkedList, where each get(index) call has to walk the list, and it does not apply to collections that are not accessible by index.

In general, if you have an arbitrary collection, the best way to iterate it is through the Iterator object, either explicitly or implicitly (using the enhanced for or the forEach method).
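The four options listed above can be sketched side by side as follows. This is only an illustration: the class name SingleIterationDemo and its helper methods are hypothetical and are not part of the designs discussed later in the article.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class: each method copies the items using one iteration style.
public class SingleIterationDemo {
    // 1. Explicit Iterator
    public static List<String> withIterator(Iterable<String> items) {
        List<String> out = new ArrayList<>();
        Iterator<String> itr = items.iterator();
        while (itr.hasNext()) {
            out.add(itr.next());
        }
        return out;
    }

    // 2. Enhanced for ("for-each"), which uses an Iterator behind the scenes
    public static List<String> withEnhancedFor(Iterable<String> items) {
        List<String> out = new ArrayList<>();
        for (String item : items) {
            out.add(item);
        }
        return out;
    }

    // 3. forEach method of Iterable (JDK 8+)
    public static List<String> withForEach(Iterable<String> items) {
        List<String> out = new ArrayList<>();
        items.forEach(out::add);
        return out;
    }

    // 4. Stream API (JDK 8+); stream() is defined on Collection, not on Iterable
    public static List<String> withStream(Collection<String> items) {
        return items.stream().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> letters = Arrays.asList("A", "B", "C");
        System.out.println(withIterator(letters));    // [A, B, C]
        System.out.println(withEnhancedFor(letters)); // [A, B, C]
        System.out.println(withForEach(letters));     // [A, B, C]
        System.out.println(withStream(letters));      // [A, B, C]
    }
}
```

Note that only the last method needs a Collection; the first three work with any Iterable.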

What if you have two collections and you want to iterate them simultaneously, in pairs? In this situation, you must first establish what should happen when the two collections have different sizes. There are various possibilities. The simple solution presented in this article is to stop at the size of the shorter collection, so that every iteration yields a complete pair (and no iteration happens at all if either collection is empty).

The non-reusable approach

At the practical level, the simplest way to iterate over two collections simultaneously is to use an explicit Iterator object for each, as in the following code snippet:

List<String> strings = Arrays.asList("A", "B", "C");
List<Integer> numbers = Arrays.asList(100, 200, 300);

Iterator<String> stringsItr = strings.iterator();         // \
Iterator<Integer> numbersItr = numbers.iterator();        //  |
                                                          //  | Can we encapsulate
while (stringsItr.hasNext() && numbersItr.hasNext()) {    //  | this part of code?
    String stringItem = stringsItr.next();                //  |
    Integer numberItem = numbersItr.next();               // /

    // use stringItem and numberItem ...
}

The above code is not reusable: you have to replicate it everywhere it is needed, changing the types as necessary. So the central question of this article is: can we properly encapsulate this logic in a class/method to achieve better code reuse?

Absolutely yes! There are in fact several possible code designs, which I will describe in detail in the following sections.

Solution 1: the functional approach

The first code design I am going to discuss uses a functional approach. Starting from Java 8, you can easily pass a behavior to a method using lambda expressions and functional interfaces. To be precise, you could pass a behavior to a method even before Java 8. However, it was quite cumbersome because you typically had to use inner classes (e.g., anonymous inner classes) or standard top-level classes.
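To make the difference concrete, here is a small sketch that passes the same behavior twice: once with an anonymous inner class and once with a lambda expression. The class BehaviorPassingDemo and the helper applyTo are hypothetical names used only for this illustration. BiConsumer itself was introduced in Java 8, so before that release you would also have had to declare a comparable interface yourself.

```java
import java.util.function.BiConsumer;

// Hypothetical demo class comparing the two ways of passing a behavior.
public class BehaviorPassingDemo {
    // Hypothetical helper: applies the given behavior to a single pair of values.
    public static <T, U> void applyTo(T first, U second,
            BiConsumer<? super T, ? super U> action) {
        action.accept(first, second);
    }

    // Verbose style: the behavior is an anonymous inner class.
    public static String describeWithAnonymousClass(String s, Integer n) {
        final StringBuilder sb = new StringBuilder();
        applyTo(s, n, new BiConsumer<String, Integer>() {
            @Override
            public void accept(String str, Integer num) {
                sb.append(str).append(" / ").append(num);
            }
        });
        return sb.toString();
    }

    // Java 8 style: the same behavior as a lambda expression.
    public static String describeWithLambda(String s, Integer n) {
        final StringBuilder sb = new StringBuilder();
        applyTo(s, n, (str, num) -> sb.append(str).append(" / ").append(num));
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describeWithAnonymousClass("A", 100)); // A / 100
        System.out.println(describeWithLambda("A", 100));         // A / 100
    }
}
```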

Using the functional approach, you can write a simple “utility” method that receives three arguments:

  • The first sequence of elements
  • The second sequence of elements
  • A function that accepts two values for each pair from the two sequences

At the practical level, the following is a straightforward yet widely applicable implementation:

import java.util.Iterator;
import java.util.function.BiConsumer;

public class IterationUtils {
    public static <T, U> void forEachPair(Iterable<? extends T> iterable1,
            Iterable<? extends U> iterable2, BiConsumer<? super T, ? super U> consumerFunc) {
        Iterator<? extends T> iterator1 = iterable1.iterator();
        Iterator<? extends U> iterator2 = iterable2.iterator();

        while (iterator1.hasNext() && iterator2.hasNext()) {
            consumerFunc.accept(iterator1.next(), iterator2.next());   // calls the function
        }
    }
}

Notes

Various things should be noted in the above code. First, BiConsumer is one of the new functional interfaces introduced in JDK 8. This interface represents a “function” that accepts two object arguments and returns nothing.

Then, I used Iterable, not Collection, to receive the sequence of elements. Remember that Iterable is a super-interface of Collection. This is generally the best choice when you only need to iterate over any iterable object. Furthermore, there exist some types that implement Iterable but that are not “collections”. One notable example is the Path interface (from the NIO2 API since JDK 7), which implements Iterable<Path> but is clearly not a general-purpose collection.
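As a quick illustration of why accepting Iterable is useful, the sketch below iterates the name elements of a Path with a method that knows nothing about files. The class PathIterationDemo and the method elementNames are hypothetical names introduced for this example only.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: a Path can be passed wherever an Iterable is accepted.
public class PathIterationDemo {
    // Accepts any Iterable of Path elements, not just a Collection.
    public static List<String> elementNames(Iterable<? extends Path> elements) {
        List<String> names = new ArrayList<>();
        for (Path element : elements) {
            names.add(element.toString());
        }
        return names;
    }

    public static void main(String[] args) {
        // A relative path made of three name elements.
        Path path = Paths.get("usr", "local", "bin");

        // Path implements Iterable<Path>, so it can be passed directly.
        System.out.println(elementNames(path)); // [usr, local, bin]
    }
}
```

A method declared with a Collection parameter could not accept the Path object at all.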

Finally, I heavily used generics, particularly the so-called bounded wildcards (<? extends X> and <? super X>). The difference between extends and super derives from the well-known “Get and Put Principle” described extensively in the great O’Reilly book “Java Generics and Collections” (note: I highly recommend buying this book if you are very interested in Java generics!).

The principle is pretty simple. You use the extends wildcard when you only need to extract values from an object. This is the case for Iterable, since we only get a value from the next method of the provided Iterator. Instead, you use a super wildcard when you only need to insert a value into an object (that is, pass it as an argument). This is the case for BiConsumer, since we pass two arguments to its accept method. And to be complete, you do not use a wildcard when you need to both get and put a value.
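The principle can be seen at work in a small copy method, sketched below. The class GetPutDemo and the method copyAll are hypothetical names used only for this illustration: the source is read from, so it uses extends, while the destination is written to, so it uses super.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical demo class for the "Get and Put Principle".
public class GetPutDemo {
    // We only GET values from source (extends) and only PUT values into destination (super).
    public static <T> void copyAll(Iterable<? extends T> source,
            Collection<? super T> destination) {
        for (T item : source) {
            destination.add(item);
        }
    }

    public static void main(String[] args) {
        List<Integer> ints = Arrays.asList(1, 2, 3);

        // Integer values can be put into a list of Number ...
        List<Number> numbers = new ArrayList<>();
        copyAll(ints, numbers);
        System.out.println(numbers); // [1, 2, 3]

        // ... and also into a list of Object.
        List<Object> objects = new ArrayList<>();
        copyAll(ints, objects);
        System.out.println(objects); // [1, 2, 3]
    }
}
```

Without the wildcards, that is with Iterable<T> and Collection<T>, copying from a List<Integer> into a List<Number> would not compile.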

Usage example

The forEachPair method can be used in the following way, for example:

List<String> strings = Arrays.asList("A", "B", "C");
List<Integer> numbers = Arrays.asList(100, 200, 300);

IterationUtils.forEachPair(strings, numbers, (s, n) -> System.out.println(s + " / " + n));
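To check the minimum-size rule mentioned in the introduction, here is a small sketch with sequences of different lengths. The forEachPair method is repeated so that the example compiles on its own; the class UnequalSizesDemo and the helper pairLabels are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.BiConsumer;

// Hypothetical demo class: what happens when the two sequences differ in size.
public class UnequalSizesDemo {
    // Same logic as the forEachPair method shown earlier, repeated for self-containment.
    public static <T, U> void forEachPair(Iterable<? extends T> iterable1,
            Iterable<? extends U> iterable2, BiConsumer<? super T, ? super U> consumerFunc) {
        Iterator<? extends T> iterator1 = iterable1.iterator();
        Iterator<? extends U> iterator2 = iterable2.iterator();

        while (iterator1.hasNext() && iterator2.hasNext()) {
            consumerFunc.accept(iterator1.next(), iterator2.next());
        }
    }

    // Hypothetical helper: collects a label for every complete pair.
    public static List<String> pairLabels(List<String> strings, List<Integer> numbers) {
        List<String> labels = new ArrayList<>();
        forEachPair(strings, numbers, (s, n) -> labels.add(s + " / " + n));
        return labels;
    }

    public static void main(String[] args) {
        List<String> strings = Arrays.asList("A", "B", "C");
        List<Integer> numbers = Arrays.asList(100, 200);   // one element shorter

        // "C" has no partner, so it is never passed to the lambda.
        System.out.println(pairLabels(strings, numbers)); // [A / 100, B / 200]
    }
}
```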

Although the forEachPair method works perfectly and is easily reusable, it has a significant limitation you should be aware of. Since the iteration is entirely performed by the forEachPair method, the lambda function has no control over it. There is no way to use a break or continue to change the flow of the loop. This is a well-known general limitation of the functional approach.

One way to mitigate this problem is by changing the design slightly. Instead of BiConsumer, you could use a BiPredicate, which accepts two object arguments and returns a boolean. You may use a true value to continue the loop and a false to terminate it. I leave this as an exercise for the reader.
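One possible answer to that exercise is sketched below. It is just one way to do it, not a definitive implementation: the class name StoppableIterationUtils and the method name forEachPairWhile are my own hypothetical choices.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.BiPredicate;

// Hypothetical variant of the utility class that lets the function stop the loop.
public class StoppableIterationUtils {
    // Calls the predicate for each pair; stops as soon as the predicate returns false.
    public static <T, U> void forEachPairWhile(Iterable<? extends T> iterable1,
            Iterable<? extends U> iterable2, BiPredicate<? super T, ? super U> predicate) {
        Iterator<? extends T> iterator1 = iterable1.iterator();
        Iterator<? extends U> iterator2 = iterable2.iterator();

        while (iterator1.hasNext() && iterator2.hasNext()) {
            if (!predicate.test(iterator1.next(), iterator2.next())) {
                break;   // the function asked to terminate the iteration
            }
        }
    }

    public static void main(String[] args) {
        List<String> strings = Arrays.asList("A", "B", "C");
        List<Integer> numbers = Arrays.asList(100, 200, 300);
        List<String> seen = new ArrayList<>();

        forEachPairWhile(strings, numbers, (s, n) -> {
            seen.add(s + " / " + n);
            return n < 200;      // true = continue, false = stop
        });

        System.out.println(seen); // [A / 100, B / 200]
    }
}
```

The pair that causes the stop is still processed, because the predicate decides only after it has run. This covers the equivalent of break; a continue is simply an early return true inside the lambda.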

Solution 2: the Iterable approach

The second code design I will discuss employs a custom implementation of the Iterable interface. This approach allows you to use the enhanced for (“for-each”) loop, where you have complete flow control.

To use the for-each, you need two things: an object that implements Iterable (the “target” of the for-each) and a type representing each item of the iteration.

In our case, we need a type representing a “pair” of values. You can define your own Pair type or use a Pair type provided by some external library. The following is a trivial implementation of a pair type:

The Pair type

public final class Pair<T, U> {
    private final T item1;
    private final U item2;

    public Pair(T item1, U item2) {
        this.item1 = item1;
        this.item2 = item2;
    }

    public T getItem1() {
        return item1;
    }

    public U getItem2() {
        return item2;
    }

    // ... other methods (e.g., equals, hashCode, toString)
}
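If you write your own type, the methods elided above might look like the following sketch. It is only one reasonable way to fill them in: the equality semantics (both items compared with Objects.equals) and the toString format are my assumptions.

```java
import java.util.Objects;

// Sketch of a complete Pair type; equals, hashCode and toString are illustrative choices.
public final class Pair<T, U> {
    private final T item1;
    private final U item2;

    public Pair(T item1, U item2) {
        this.item1 = item1;
        this.item2 = item2;
    }

    public T getItem1() {
        return item1;
    }

    public U getItem2() {
        return item2;
    }

    // Two pairs are equal when both items are equal (null-safe).
    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof Pair)) {
            return false;
        }
        Pair<?, ?> other = (Pair<?, ?>) obj;
        return Objects.equals(item1, other.item1) && Objects.equals(item2, other.item2);
    }

    // Consistent with equals: computed from the same two fields.
    @Override
    public int hashCode() {
        return Objects.hash(item1, item2);
    }

    @Override
    public String toString() {
        return "(" + item1 + ", " + item2 + ")";
    }

    public static void main(String[] args) {
        Pair<String, Integer> a = new Pair<>("A", 100);
        Pair<String, Integer> b = new Pair<>("A", 100);
        System.out.println(a.equals(b)); // true
        System.out.println(a);           // (A, 100)
    }
}
```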

Alternatively, you can use one of the following types (non-exhaustive list!):

  • The org.apache.commons.lang3.tuple.Pair from the Apache Commons Lang library.
  • The javafx.util.Pair from the JavaFX API (however, it was mainly intended for key/value pairs).
  • The org.springframework.data.util.Pair from Spring Data Commons.

Once you have decided which Pair type to use, you must define a class that implements Iterable. Since Iterable only provides an Iterator, you must also implement a custom one. The following is my personal solution:

The Iterable implementation

import java.util.Iterator;

public class PairIterable<T, U> implements Iterable<Pair<T, U>> {
    private final Iterable<? extends T> iterable1;
    private final Iterable<? extends U> iterable2;

    public PairIterable(Iterable<? extends T> iterable1, Iterable<? extends U> iterable2) {
        this.iterable1 = iterable1;
        this.iterable2 = iterable2;
    }

    @Override
    public Iterator<Pair<T, U>> iterator() {
        return new Itr();
    }

    private class Itr implements Iterator<Pair<T, U>> {
        private final Iterator<? extends T> iterator1 = iterable1.iterator();
        private final Iterator<? extends U> iterator2 = iterable2.iterator();

        @Override
        public boolean hasNext() {
            return iterator1.hasNext() && iterator2.hasNext();
        }

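        @Override
        public Pair<T, U> next() {
            // Fail before consuming anything when either side is exhausted,
            // so a stray next() call cannot drop an element from iterator1.
            if (!hasNext()) {
                throw new java.util.NoSuchElementException();
            }
            return new Pair<>(iterator1.next(), iterator2.next());
        }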
    }
}

I decided to implement Iterator using a “regular” inner class, which is also private and thus not visible outside. This is only a matter of personal preference. You could use an anonymous inner class defined in the iterator method or a standard top-level class (less preferable but technically possible).

Note that bounded wildcards still apply in this case. Since we only extract values from the next method of Iterator, the extends wildcard is appropriate.

Usage example

We can use the PairIterable, for example, in the following way:

List<String> strings = Arrays.asList("A", "B", "C");
List<Integer> numbers = Arrays.asList(100, 200, 300);

for (Pair<String, Integer> pair : new PairIterable<>(strings, numbers)) {
    System.out.println(pair.getItem1() + " / " + pair.getItem2());
}

The code is clearly longer than the one using the functional approach. However, the loop is now entirely under our control, so we can use break or continue (and even return!) whenever we need to change the flow of the loop.
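The following sketch shows that flow control in action. To keep the example compilable on its own, it contains compact stand-ins for Pair and PairIterable (nested classes, with an anonymous Iterator); the class FlowControlDemo and the method firstLabelAbove are hypothetical names.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical demo class: continue and return inside a for-each over pairs.
public class FlowControlDemo {
    // Compact stand-in for the Pair type.
    static final class Pair<T, U> {
        final T item1;
        final U item2;

        Pair(T item1, U item2) {
            this.item1 = item1;
            this.item2 = item2;
        }
    }

    // Compact stand-in for PairIterable, using an anonymous Iterator.
    static final class PairIterable<T, U> implements Iterable<Pair<T, U>> {
        private final Iterable<? extends T> iterable1;
        private final Iterable<? extends U> iterable2;

        PairIterable(Iterable<? extends T> iterable1, Iterable<? extends U> iterable2) {
            this.iterable1 = iterable1;
            this.iterable2 = iterable2;
        }

        @Override
        public Iterator<Pair<T, U>> iterator() {
            final Iterator<? extends T> iterator1 = iterable1.iterator();
            final Iterator<? extends U> iterator2 = iterable2.iterator();
            return new Iterator<Pair<T, U>>() {
                @Override
                public boolean hasNext() {
                    return iterator1.hasNext() && iterator2.hasNext();
                }

                @Override
                public Pair<T, U> next() {
                    if (!hasNext()) {
                        throw new NoSuchElementException();
                    }
                    return new Pair<>(iterator1.next(), iterator2.next());
                }
            };
        }
    }

    // Returns the first non-null label whose number exceeds the threshold, or null if none.
    public static String firstLabelAbove(List<String> labels, List<Integer> numbers,
            int threshold) {
        for (Pair<String, Integer> pair : new PairIterable<>(labels, numbers)) {
            if (pair.item1 == null) {
                continue;             // skip pairs without a label
            }
            if (pair.item2 > threshold) {
                return pair.item1;    // leaves the loop and the method at once
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> strings = Arrays.asList("A", "B", "C");
        List<Integer> numbers = Arrays.asList(100, 200, 300);
        System.out.println(firstLabelAbove(strings, numbers, 150)); // B
    }
}
```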

If you use Java 10 or higher, you can abbreviate the loop slightly using the new “var” syntax (see my previous article, “The “var” syntax for local variables since Java 10”) like the following:

for (var pair : new PairIterable<>(strings, numbers)) {
    System.out.println(pair.getItem1() + " / " + pair.getItem2());
}

On the other hand, the Iterable approach is slightly less efficient because a new Pair object needs to be created for each pair of values. This is generally not a big issue unless you have millions of elements.

Conclusions

In this article, I provided two ways to solve the problem of simultaneously iterating two collections. Most importantly, the article’s main point was to show how to create reusable code and avoid needless repetition.

Both approaches are generally satisfactory, but the right choice depends on your specific scenario. There are mainly two aspects to consider:

  • What you need to do with each pair
  • How much control you want to have over the iteration

For example, the functional approach is preferable if you only need to print each pair (like the above examples). However, if you must perform complex logic with conditional tests and break/continue/return to change the flow of the loop, the Iterable approach is the one that gives you full control.
