Iterating two collections simultaneously
In this article, I will discuss iterating two collections simultaneously. This is not a very common need in Java, but it does come up from time to time. The main goal of this article is to show some good code designs that allow greater code reuse.
Introduction
When you need to iterate on a single collection, you generally have several choices, as listed in the following points:
- You can use the Iterator object explicitly.
- You can use the enhanced for (the so-called “for-each”) since Java 5.
- You can use the forEach method available on the Iterable interface since JDK 8.
- You can use a stream, from the Stream API since JDK 8, for any “advanced” usage.
Furthermore, if you have precisely an ArrayList or Vector type, you can also use the classic indexed for loop. However, this approach is inefficient for LinkedList, where each access by index must walk the list, and clearly does not apply to other collections that are not accessible by index.
In general, if you have an arbitrary collection, the best way to iterate it is through the Iterator object, either explicitly or implicitly (using the enhanced for or the forEach method).
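As a quick illustration, the four options listed above can be sketched side by side. The class and method names below (SingleIterationDemo, withIterator, and so on) are hypothetical and exist only for this example; each method simply collects the visited elements into a new list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class: four ways to visit every element of one collection.
class SingleIterationDemo {

    // 1. Explicit Iterator object.
    static List<String> withIterator(List<String> items) {
        List<String> visited = new ArrayList<>();
        Iterator<String> itr = items.iterator();
        while (itr.hasNext()) {
            visited.add(itr.next());
        }
        return visited;
    }

    // 2. Enhanced for ("for-each"), which uses an Iterator implicitly.
    static List<String> withEnhancedFor(List<String> items) {
        List<String> visited = new ArrayList<>();
        for (String item : items) {
            visited.add(item);
        }
        return visited;
    }

    // 3. The forEach method of Iterable (JDK 8+).
    static List<String> withForEach(List<String> items) {
        List<String> visited = new ArrayList<>();
        items.forEach(visited::add);
        return visited;
    }

    // 4. A stream (JDK 8+), usually reserved for more advanced pipelines.
    static List<String> withStream(List<String> items) {
        return items.stream().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("A", "B", "C");
        System.out.println(withIterator(items));    // prints [A, B, C]
        System.out.println(withEnhancedFor(items)); // prints [A, B, C]
        System.out.println(withForEach(items));     // prints [A, B, C]
        System.out.println(withStream(items));      // prints [A, B, C]
    }
}
```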
What if you have two collections and you want to iterate them simultaneously, in pairs? In this situation, you must first establish what should happen when the two collections have different sizes. There are various possibilities. The simple solution presented in this article is to stop at the shorter of the two collections, so that every iteration always has a complete pair of values (and no iteration happens at all if either collection is empty).
The non-reusable approach
At the practical level, the best way to iterate on two collections simultaneously is to use an explicit Iterator object for both, like in the following code snippet:
List<String> strings = Arrays.asList("A", "B", "C");
List<Integer> numbers = Arrays.asList(100, 200, 300);
Iterator<String> stringsItr = strings.iterator();       // \
Iterator<Integer> numbersItr = numbers.iterator();      // |
                                                        // | Can we encapsulate
while (stringsItr.hasNext() && numbersItr.hasNext()) {  // | this part of code?
    String stringItem = stringsItr.next();              // |
    Integer numberItem = numbersItr.next();             // /
    // use stringItem and numberItem ...
}
The above code is not reusable: you would have to replicate it everywhere you need it and change the types as necessary. So the central question of this article is: can we properly encapsulate this logic in a class/method to achieve better code reuse?
Absolutely yes! There are even various code designs that I will describe in detail in the following sections.
Solution 1: the functional approach
The first code design I am going to discuss uses a functional approach. Starting from Java 8, you can easily pass a behavior to a method using lambda expressions and functional interfaces. To be precise, you could pass a behavior to a method even before Java 8. However, it was quite cumbersome because you typically had to use inner classes (e.g., anonymous inner classes) or standard top-level classes.
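To make the difference concrete, here is a small sketch that passes the same behavior (compare two strings by length) first with an anonymous inner class and then with a lambda. Comparator is used only because it existed long before Java 8; the class name BehaviorPassingDemo and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical demo class: the same behavior passed in two different styles.
class BehaviorPassingDemo {

    // Pre-Java 8 style: the behavior is wrapped in an anonymous inner class.
    static List<String> sortWithAnonymousClass(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        Collections.sort(copy, new Comparator<String>() {
            @Override
            public int compare(String a, String b) {
                return Integer.compare(a.length(), b.length());
            }
        });
        return copy;
    }

    // Java 8+ style: the same behavior expressed as a lambda expression.
    static List<String> sortWithLambda(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        copy.sort((a, b) -> Integer.compare(a.length(), b.length()));
        return copy;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("ccc", "a", "bb");
        System.out.println(sortWithAnonymousClass(words)); // prints [a, bb, ccc]
        System.out.println(sortWithLambda(words));         // prints [a, bb, ccc]
    }
}
```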
Using the functional approach, you can write a simple “utility” method that receives three arguments:
- The first sequence of elements
- The second sequence of elements
- A function that accepts two values for each pair from the two sequences
At the practical level, the following is the most straightforward but widely applicable implementation:
import java.util.Iterator;
import java.util.function.BiConsumer;
public class IterationUtils {
    public static <T, U> void forEachPair(Iterable<? extends T> iterable1,
            Iterable<? extends U> iterable2, BiConsumer<? super T, ? super U> consumerFunc) {
        Iterator<? extends T> iterator1 = iterable1.iterator();
        Iterator<? extends U> iterator2 = iterable2.iterator();
        while (iterator1.hasNext() && iterator2.hasNext()) {
            consumerFunc.accept(iterator1.next(), iterator2.next()); // calls the function
        }
    }
}
Notes
Various things should be noted in the above code. First, BiConsumer is one of the new functional interfaces introduced in JDK 8. This interface represents a “function” that accepts two object arguments and returns nothing.
Then, I used Iterable, not Collection, to receive the sequence of elements. Remember that Iterable is a super-interface of Collection. This is generally the best choice when you only need to iterate over any iterable object. Furthermore, there exist some types that implement Iterable but that are not “collections”. One notable example is the Path interface (from the NIO2 API since JDK 7), which extends Iterable<Path> but is clearly not a general-purpose collection.
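As a brief illustration of the Path case, the following sketch iterates over the name elements of a path with a plain for-each. The class name PathIterationDemo and the method nameElements are hypothetical, and the relative path is just sample data; no file needs to exist on disk.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: Path is Iterable<Path> although it is not a collection.
class PathIterationDemo {

    // Collects the name elements of a path, from the one closest to the root onward.
    static List<String> nameElements(Path path) {
        List<String> names = new ArrayList<>();
        for (Path element : path) { // works because Path extends Iterable<Path>
            names.add(element.toString());
        }
        return names;
    }

    public static void main(String[] args) {
        Path path = Paths.get("usr", "local", "bin"); // sample relative path
        System.out.println(nameElements(path));       // prints [usr, local, bin]
    }
}
```

Because forEachPair accepts any Iterable, a Path could be passed to it directly, which would not be possible if the parameters were declared as Collection.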
Finally, I heavily used generics, particularly the so-called bounded wildcards (<? extends X> and <? super X>). The difference between extends and super derives from the well-known “Get and Put Principle” described extensively in the great O’Reilly book “Java Generics and Collections” (note: I highly recommend buying this book if you are very interested in Java generics!).
The principle is pretty simple. You use the extends wildcard when you only need to extract values from an object. This is the case of Iterable since we only get a value from the next method of the provided Iterator. Instead, you use a super wildcard when you only need to insert a value into an object (that is, pass it as an argument). This is the case of BiConsumer since we pass two arguments to its accept method. And to be complete, you do not use a wildcard when you need to both get and put a value.
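The principle can be seen in isolation in a small sketch that copies elements from one list to another. The class GetPutDemo and the method copyAll are hypothetical names used only here: the source list is only read from, so it takes an extends wildcard, while the destination list is only written to, so it takes a super wildcard.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class illustrating the "Get and Put Principle".
class GetPutDemo {

    // src: values are only extracted ("get") -> ? extends T
    // dst: values are only inserted  ("put") -> ? super T
    static <T> void copyAll(List<? extends T> src, List<? super T> dst) {
        for (T item : src) {
            dst.add(item);
        }
    }

    public static void main(String[] args) {
        List<Integer> ints = Arrays.asList(1, 2, 3);
        List<Number> numbers = new ArrayList<>();
        List<Object> objects = new ArrayList<>();

        copyAll(ints, numbers); // Integer values go into a List<Number>
        copyAll(ints, objects); // ... and into a List<Object>

        System.out.println(numbers); // prints [1, 2, 3]
        System.out.println(objects); // prints [1, 2, 3]
    }
}
```

Without the wildcards (that is, with List<T> for both parameters), neither of the two calls in main would compile, because List<Integer> is not a List<Number> or a List<Object>.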
Usage example
The forEachPair method can be used in the following way, for example:
List<String> strings = Arrays.asList("A", "B", "C");
List<Integer> numbers = Arrays.asList(100, 200, 300);
IterationUtils.forEachPair(strings, numbers, (s, n) -> System.out.println(s + " / " + n));
Although the forEachPair method works perfectly and is easily reusable, it has a significant limitation you should be aware of. Since the iteration is entirely performed by the forEachPair method, the lambda function has no control over it. There is no way to use a break or continue to change the flow of the loop. This is a well-known general limitation of the functional approach.
One way to mitigate this problem is by changing the design slightly. Instead of BiConsumer, you could use a BiPredicate, which accepts two object arguments and returns a boolean. You may use a true value to continue the loop and a false to terminate it. I leave this as an exercise for the reader.
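For readers who want to compare their answer, one possible shape of the BiPredicate variant is sketched below. The class name IterationUtilsSketch and the method name forEachPairWhile are hypothetical; the only change from forEachPair is that the loop stops as soon as the predicate returns false.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.BiPredicate;

// Hypothetical sketch: a forEachPair variant that can be stopped by the callback.
class IterationUtilsSketch {

    static <T, U> void forEachPairWhile(Iterable<? extends T> iterable1,
            Iterable<? extends U> iterable2, BiPredicate<? super T, ? super U> predicate) {
        Iterator<? extends T> iterator1 = iterable1.iterator();
        Iterator<? extends U> iterator2 = iterable2.iterator();
        while (iterator1.hasNext() && iterator2.hasNext()) {
            // true -> continue with the next pair, false -> terminate the loop
            if (!predicate.test(iterator1.next(), iterator2.next())) {
                break;
            }
        }
    }

    public static void main(String[] args) {
        List<String> strings = Arrays.asList("A", "B", "C");
        List<Integer> numbers = Arrays.asList(100, 200, 300);
        List<String> seen = new ArrayList<>();

        forEachPairWhile(strings, numbers, (s, n) -> {
            seen.add(s + " / " + n);
            return n < 200; // stop after the first pair whose number reaches 200
        });

        System.out.println(seen); // prints [A / 100, B / 200]
    }
}
```

This covers the equivalent of break. A continue is simply an early "return true" inside the lambda, while returning a value from the enclosing method is still not possible with this design.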
Solution 2: the Iterable approach
The second code design I will discuss employs a custom implementation of the Iterable interface. This approach allows you to use the enhanced for (“for-each”) loop, where you have complete flow control.
To use the for-each, you need two things: an object that implements Iterable (the “target” of the for-each) and a type representing each item of the iteration.
In our case, we need a type representing a “pair” of values. You can define your own Pair type or use a Pair type provided by some external library. The following is a trivial implementation of a pair type:
The Pair type
public final class Pair<T, U> {
    private final T item1;
    private final U item2;
    public Pair(T item1, U item2) {
        this.item1 = item1;
        this.item2 = item2;
    }
    public T getItem1() {
        return item1;
    }
    public U getItem2() {
        return item2;
    }
    // ... other methods (e.g., equals, hashCode, toString)
}
Otherwise, you can use one of the following types (non-exhaustive list!):
- The org.apache.commons.lang3.tuple.Pair from the Apache Commons Lang library.
- The javafx.util.Pair from the JavaFX API (however, it was mainly intended for key/value pairs).
- The org.springframework.data.util.Pair from Spring Data Commons.
Once you have decided which Pair type to use, you must define a class that implements Iterable. Since Iterable only provides an Iterator, you must also implement a custom one. The following is my personal solution:
The Iterable implementation
import java.util.Iterator;
public class PairIterable<T, U> implements Iterable<Pair<T, U>> {
    private final Iterable<? extends T> iterable1;
    private final Iterable<? extends U> iterable2;
    public PairIterable(Iterable<? extends T> iterable1, Iterable<? extends U> iterable2) {
        this.iterable1 = iterable1;
        this.iterable2 = iterable2;
    }
    @Override
    public Iterator<Pair<T, U>> iterator() {
        return new Itr();
    }
    private class Itr implements Iterator<Pair<T, U>> {
        private final Iterator<? extends T> iterator1 = iterable1.iterator();
        private final Iterator<? extends U> iterator2 = iterable2.iterator();
        @Override
        public boolean hasNext() {
            return iterator1.hasNext() && iterator2.hasNext();
        }
        @Override
        public Pair<T, U> next() {
            return new Pair<>(iterator1.next(), iterator2.next());
        }
    }
}
I decided to implement Iterator using a “regular” inner class, which is also private and thus not visible outside. This is only a matter of personal preference. You could use an anonymous inner class defined in the iterator method or a standard top-level class (less preferable but technically possible).
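For comparison, the anonymous inner class alternative mentioned above might look like the following sketch. The class name AnonymousPairIterable is hypothetical, and a minimal Pair is repeated (without equals, hashCode, and toString) only so that the example compiles on its own.

```java
import java.util.Arrays;
import java.util.Iterator;

// Minimal stand-in for the Pair type shown earlier, repeated to keep the sketch self-contained.
final class Pair<T, U> {
    private final T item1;
    private final U item2;

    public Pair(T item1, U item2) {
        this.item1 = item1;
        this.item2 = item2;
    }

    public T getItem1() { return item1; }
    public U getItem2() { return item2; }
}

// Hypothetical variant of PairIterable whose Iterator is an anonymous inner class.
class AnonymousPairIterable<T, U> implements Iterable<Pair<T, U>> {
    private final Iterable<? extends T> iterable1;
    private final Iterable<? extends U> iterable2;

    public AnonymousPairIterable(Iterable<? extends T> iterable1, Iterable<? extends U> iterable2) {
        this.iterable1 = iterable1;
        this.iterable2 = iterable2;
    }

    @Override
    public Iterator<Pair<T, U>> iterator() {
        // The iterator is defined in place instead of in a named inner class.
        return new Iterator<Pair<T, U>>() {
            private final Iterator<? extends T> iterator1 = iterable1.iterator();
            private final Iterator<? extends U> iterator2 = iterable2.iterator();

            @Override
            public boolean hasNext() {
                return iterator1.hasNext() && iterator2.hasNext();
            }

            @Override
            public Pair<T, U> next() {
                return new Pair<>(iterator1.next(), iterator2.next());
            }
        };
    }

    public static void main(String[] args) {
        // The second list is shorter, so only two pairs are produced.
        for (Pair<String, Integer> pair
                : new AnonymousPairIterable<>(Arrays.asList("A", "B", "C"), Arrays.asList(1, 2))) {
            System.out.println(pair.getItem1() + " / " + pair.getItem2());
        }
    }
}
```

The behavior is the same as with the named inner class; the choice mostly affects readability and how easy the iterator is to find in a stack trace or debugger.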
Note that bounded wildcards still pertain to this case. Since we only extract values from the next method of Iterator, the extends wildcard is appropriate.
Usage example
We can use the PairIterable, for example, in the following way:
List<String> strings = Arrays.asList("A", "B", "C");
List<Integer> numbers = Arrays.asList(100, 200, 300);
for (Pair<String, Integer> pair : new PairIterable<>(strings, numbers)) {
    System.out.println(pair.getItem1() + " / " + pair.getItem2());
}
The code is clearly longer than the one using the functional approach. However, the loop is now entirely under our control, so we can use break or continue (and even return!) to change the flow of the loop whenever needed.
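To show this flow control in action, the sketch below skips some pairs with continue and stops early with break. Pair and PairIterable are condensed copies of the classes shown earlier, repeated only so that the example compiles on its own; FlowControlDemo, its collect method, and the sample data are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Condensed copy of the Pair type shown earlier.
final class Pair<T, U> {
    private final T item1;
    private final U item2;
    public Pair(T item1, U item2) { this.item1 = item1; this.item2 = item2; }
    public T getItem1() { return item1; }
    public U getItem2() { return item2; }
}

// Condensed copy of the PairIterable class shown earlier.
class PairIterable<T, U> implements Iterable<Pair<T, U>> {
    private final Iterable<? extends T> iterable1;
    private final Iterable<? extends U> iterable2;

    public PairIterable(Iterable<? extends T> iterable1, Iterable<? extends U> iterable2) {
        this.iterable1 = iterable1;
        this.iterable2 = iterable2;
    }

    @Override
    public Iterator<Pair<T, U>> iterator() { return new Itr(); }

    private class Itr implements Iterator<Pair<T, U>> {
        private final Iterator<? extends T> iterator1 = iterable1.iterator();
        private final Iterator<? extends U> iterator2 = iterable2.iterator();
        @Override
        public boolean hasNext() { return iterator1.hasNext() && iterator2.hasNext(); }
        @Override
        public Pair<T, U> next() { return new Pair<>(iterator1.next(), iterator2.next()); }
    }
}

// Hypothetical demo class: continue and break inside the for-each.
class FlowControlDemo {

    static List<String> collect(List<String> strings, List<Integer> numbers) {
        List<String> result = new ArrayList<>();
        for (Pair<String, Integer> pair : new PairIterable<>(strings, numbers)) {
            if (pair.getItem1().isEmpty()) {
                continue; // skip pairs whose string is empty
            }
            if (pair.getItem2() < 0) {
                break;    // stop at the first negative number
            }
            result.add(pair.getItem1() + " / " + pair.getItem2());
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> strings = Arrays.asList("A", "", "C", "D");
        List<Integer> numbers = Arrays.asList(100, 200, 300, -1);
        System.out.println(collect(strings, numbers)); // prints [A / 100, C / 300]
    }
}
```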
If you use Java 10 or higher, you can abbreviate the loop slightly using the new “var” syntax (see my previous article, “The “var” syntax for local variables since Java 10”) like the following:
for (var pair : new PairIterable<>(strings, numbers)) {
    System.out.println(pair.getItem1() + " / " + pair.getItem2());
}
The Iterable approach, for its part, is slightly less efficient because a new Pair object must be created for each pair of values. This is generally not a big issue unless you have millions of elements.
Conclusions
In this article, I provided two ways to solve the problem of simultaneously iterating two collections. Most importantly, the article’s main point was to show how to create reusable code and avoid needless repetition.
Both approaches are generally satisfying, but it all depends on your specific scenario. There are mainly two aspects to consider:
- What you need to do with each pair
- How much control you want to have over the iteration
For example, the functional approach is preferable if you only need to print each pair (like the above examples). However, suppose you must perform complex logic with conditional tests and break/continue/return to change the flow of the loop. In that case, the Iterable approach is the only one applicable.