A hypothetical mapStream for the Java Stream API

On my blog, I generally write about Java features that already exist. This time, however, I want to write about a hypothetical mapStream method that doesn’t currently exist in the Java Stream API but could be an interesting addition.

Introduction

The Java Stream API was initially introduced in JDK 8, with several enhancements released in later versions. I often use the Stream API in my daily work with Java, and I have always found it helpful and pleasant. However, I think a little feature could be added to improve the overall usage of the Stream API.

Consider, for example, the following simple usage of the Stream API:

List<Integer> numbers = // ......

List<String> strings = numbers.stream()
        .filter(num -> num != null && num >= 0)
        .map(num -> "[" + num + "]")
        .collect(Collectors.toList());

The above code shows the typical way of using the Stream API with the so-called “method chaining” technique. For a moment, let’s split the invocations and assign the result of each operation to a separate variable.

Stream<Integer> stream1 = numbers.stream();
Stream<Integer> stream2 = stream1.filter(num -> num != null && num >= 0);
Stream<String> stream3 = stream2.map(num -> "[" + num + "]");
List<String> strings = stream3.collect(Collectors.toList());

This second snippet behaves exactly like the first one. The only difference is that the result of each invocation is saved to a separate variable. Whenever you invoke one of the “intermediate” operations (like filter, map, flatMap, or sorted), you get back a new Stream object. At runtime, stream2 is a distinct object from stream1, and stream3 is distinct from both stream1 and stream2.
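A quick way to check this claim is the following sketch. The class and method names are mine, purely for illustration. It also shows a related consequence: once a stream has been linked to a downstream operation, the OpenJDK implementation rejects a second operation on it with an IllegalStateException (the Javadoc only says that an implementation may throw).

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

class StreamIdentityDemo {

    // Returns true when each intermediate operation handed back a distinct Stream object.
    static boolean intermediateOpsReturnNewObjects(List<Integer> numbers) {
        Stream<Integer> stream1 = numbers.stream();
        Stream<Integer> stream2 = stream1.filter(num -> num != null && num >= 0);
        Stream<String> stream3 = stream2.map(num -> "[" + num + "]");
        // The casts are needed only because Stream<Integer> and Stream<String>
        // cannot be compared directly with != at compile time.
        return stream1 != stream2 && (Object) stream2 != (Object) stream3;
    }

    // Returns true when a second operation on an already linked stream is rejected.
    static boolean reuseIsRejected(List<Integer> numbers) {
        Stream<Integer> stream1 = numbers.stream();
        stream1.filter(num -> num != null);        // stream1 is now linked downstream
        try {
            stream1.map(num -> "[" + num + "]");   // second operation on the same stage
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, -2, 3);
        System.out.println(intermediateOpsReturnNewObjects(numbers));
        System.out.println(reuseIsRejected(numbers));
    }
}
```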

The point I want to highlight is that the creation of each new Stream object is entirely under the control of the Stream API. Within a method chain, you cannot “bypass” the creation of that new Stream, nor can you substitute a different one.

The hypothetical mapStream

What I would like to have is a new method that could be called mapStream. Adding a simple default method in the Stream interface, like the following, would be sufficient:

// The Stream from JDK
public interface Stream<T> extends BaseStream<T, Stream<T>> {

    // ... existing methods

    // The new hypothetical “mapStream”
    default <R> Stream<R> mapStream(Function<? super Stream<T>, ? extends Stream<R>> streamMapper) {
        return streamMapper.apply(this);
    }
}

The hypothetical mapStream is very easy to explain. It accepts a function that receives the “current” (this) Stream object and returns a Stream object that may be the same … or a different one. Note that the streamMapper function is invoked immediately “on the fly”. Thus, it happens during the stream chaining, not during the processing of stream elements!
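The timing can be observed with a small sketch. Since the real Stream interface cannot be modified, I am assuming a static helper that stands in for the proposed default method. The helper, the class name, and the event strings are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class MapStreamTimingDemo {

    // Hypothetical static stand-in for the proposed mapStream default method.
    static <T, R> Stream<R> mapStream(Stream<T> stream,
            Function<? super Stream<T>, ? extends Stream<R>> streamMapper) {
        return streamMapper.apply(stream);
    }

    // Records, in order, when the mapper runs and when elements are processed.
    static List<String> recordEvents() {
        List<String> events = new ArrayList<>();
        Stream<Integer> source = Arrays.asList(1, 2).stream();

        Stream<Integer> mapped = mapStream(source, s -> {
            events.add("streamMapper invoked");              // runs while the pipeline is built
            return s.peek(e -> events.add("element " + e));  // runs later, once per element
        });
        events.add("pipeline built");

        mapped.collect(Collectors.toList());                 // terminal operation starts the processing
        return events;
    }

    public static void main(String[] args) {
        recordEvents().forEach(System.out::println);
    }
}
```

The "streamMapper invoked" entry is recorded before "pipeline built", while the per-element entries appear only after the terminal operation has started.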

What could be done with this? Consider, for example, the following simple static method:

public static <T> Function<Stream<T>, Stream<T>> debugLogging(Logger logger, String format) {
    if (logger.isDebugEnabled()) {
        return stream -> stream.peek(e -> logger.debug(format, e));
    } else {
        return stream -> stream;   // returns the SAME stream
    }
}

This debugLogging method returns a function that “maps” a Stream of any generic type. If the DEBUG level is disabled for that Logger (note: I am assuming SLF4J for convenience), the same Stream object is returned, so the pipeline gains no extra stage and there is no added cost at runtime. If the DEBUG level is enabled, however, a new “peeked” version of the Stream is returned, and the peek operation will log each stream element. Note that the level is checked once, when debugLogging is called, not for every element.

You could use such a debugLogging method in the following (hypothetical) way:

List<String> strings = numbers.stream()
        .filter(num -> num != null && num >= 0)
        .map(num -> "[" + num + "]")
        .mapStream(debugLogging(logger, "{}"))     // <--- currently hypothetical!
        .collect(Collectors.toList());

With the current state of the Java Stream API, this type of “bypass” logic is impossible without breaking the method chaining. Breaking the chain is always possible, of course, but it is clumsy and uncomfortable, as you saw in the second code snippet.
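To make the trade-off concrete, here is a sketch of what can be written today with a static helper instead of a Stream method. To keep it self-contained, a boolean flag and a Consumer replace the SLF4J Logger. These, together with the helper and the class name, are assumptions of mine and not part of any real API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class DebugLoggingWorkaround {

    // Hypothetical static stand-in for the proposed mapStream default method.
    static <T, R> Stream<R> mapStream(Stream<T> stream,
            Function<? super Stream<T>, ? extends Stream<R>> streamMapper) {
        return streamMapper.apply(stream);
    }

    // Simplified debugLogging: the flag and the sink stand in for an SLF4J Logger.
    static <T> Function<Stream<T>, Stream<T>> debugLogging(boolean debugEnabled, Consumer<String> sink) {
        if (debugEnabled) {
            return stream -> stream.peek(e -> sink.accept(String.valueOf(e)));
        } else {
            return stream -> stream;   // returns the SAME stream
        }
    }

    static List<String> format(List<Integer> numbers, boolean debugEnabled, Consumer<String> sink) {
        // The helper must wrap the whole upstream pipeline,
        // so the code no longer reads from top to bottom.
        return mapStream(
                numbers.stream()
                        .filter(num -> num != null && num >= 0)
                        .map(num -> "[" + num + "]"),
                debugLogging(debugEnabled, sink))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> logged = new ArrayList<>();
        System.out.println(format(Arrays.asList(1, null, -2, 3), true, logged::add));
        System.out.println(logged);
    }
}
```

It works, but the upstream operations end up nested inside the helper call while the downstream ones follow it, which is exactly the readability cost a real mapStream method would avoid.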

Logging is undoubtedly one possible use case, but I am sure many others exist. Introducing a mapStream method like the one proposed above could also “open the door” to more code reuse. A developer could pass a (reusable) function that combines non-null filtering with the sorted() operation. Something like this:

public static <T> Function<Stream<T>, Stream<T>> nonNullsSorted() {
    return stream -> stream.filter(Objects::nonNull).sorted();
}

It could then be used in the following way:

List<Xyz> list2 = list1.stream()
        .mapStream(nonNullsSorted())     // <--- currently hypothetical!
        .collect(Collectors.toList());
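One detail is worth keeping in mind: sorted() without arguments relies on natural ordering, so elements that are not Comparable may cause a ClassCastException when the terminal operation runs. The sketch below exercises nonNullsSorted() through a hypothetical static helper that stands in for mapStream, and adds a Comparator-based overload, which is my own assumption and not part of the proposal.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Objects;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class NonNullsSortedDemo {

    // Hypothetical static stand-in for the proposed mapStream default method.
    static <T, R> Stream<R> mapStream(Stream<T> stream,
            Function<? super Stream<T>, ? extends Stream<R>> streamMapper) {
        return streamMapper.apply(stream);
    }

    // Same function as in the article: drops nulls, then sorts by natural ordering.
    static <T> Function<Stream<T>, Stream<T>> nonNullsSorted() {
        return stream -> stream.filter(Objects::nonNull).sorted();
    }

    // Hypothetical overload for element types without a natural ordering.
    static <T> Function<Stream<T>, Stream<T>> nonNullsSorted(Comparator<? super T> comparator) {
        return stream -> stream.filter(Objects::nonNull).sorted(comparator);
    }

    static List<String> naturalOrder(List<String> words) {
        return mapStream(words.stream(), nonNullsSorted())
                .collect(Collectors.toList());
    }

    static List<String> byLength(List<String> words) {
        Comparator<String> lengthOrder = Comparator.comparingInt(String::length);
        return mapStream(words.stream(), nonNullsSorted(lengthOrder))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pear", null, "fig", "apple");
        System.out.println(naturalOrder(words));
        System.out.println(byLength(words));
    }
}
```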

I hope I have provided a comprehensive description of this hypothetical feature. There is, however, one more little thing to add.

The final hypothetical mapStream

In theory, the result of the streamMapper function could be any object, not just a Stream. Thus, the mapStream method could be modified and improved in the following way:

    default <R> R mapStream(Function<? super Stream<T>, ? extends R> streamMapper) {
        return streamMapper.apply(this);
    }

This could “open the door” to even more code reuse. A developer could encapsulate, for example, the combination of a sorted() operation and a collect(Collectors.toList()) operation with ease.
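As a sketch of that idea, the following code packages sorted() and collect(Collectors.toList()) into one reusable function. The static mapStream helper again stands in for the hypothetical default method, and toSortedList and the class name are names I made up for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class MapStreamToResultDemo {

    // Hypothetical static stand-in for the generalized mapStream: the result can be any type R.
    static <T, R> R mapStream(Stream<T> stream,
            Function<? super Stream<T>, ? extends R> streamMapper) {
        return streamMapper.apply(stream);
    }

    // Hypothetical reusable "tail" of a pipeline: sort by natural ordering, then collect to a List.
    static <T> Function<Stream<T>, List<T>> toSortedList() {
        return stream -> stream.sorted().collect(Collectors.toList());
    }

    static List<Integer> sortedCopy(List<Integer> numbers) {
        return mapStream(numbers.stream().filter(Objects::nonNull), toSortedList());
    }

    public static void main(String[] args) {
        System.out.println(sortedCopy(Arrays.asList(3, null, 1, 2)));
    }
}
```

Because R is unconstrained, the generalized signature still covers the earlier Stream-to-Stream cases: R simply becomes a Stream type, and the chain can continue from the returned object.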

Conclusions

In this article, I wrote about the hypothetical mapStream method for the Java Stream API. It is not a significant or essential feature, but I believe it could help developers in many ways. While developing in Java, I have sometimes found myself in situations where I wished I had this hypothetical mapStream method.

Note that I mainly talked about the Stream interface, but the same concepts can also be applied to the other “primitive” streams, IntStream, LongStream, and DoubleStream.

My hope (and little dream) is that someone at Oracle will read and evaluate this one day …

Do you find this feature interesting? Do you see possible disadvantages or flaws? Please feel free to contact me with any feedback!
