
Iterable and Sequence. This is understandable since even their definitions are nearly identical:interface Iterable<out T> { operator fun iterator(): Iterator<T> } interface Sequence<out T> { operator fun iterator(): Iterator<T> }
Iterable and Sequence are associated with totally different usages (have different contracts), so nearly all their processing functions work differently. Sequences are lazy, therefore intermediate functions for Sequence processing don't do any calculations. Instead, they return a new Sequence that decorates the previous one with the new operation. All these computations are evaluated during a terminal operation like toList() or count(). Iterable processing, on the other hand, returns a collection like List at every step.public inline fun <T> Iterable<T>.filter( predicate: (T) -> Boolean ): List<T> { return filterTo(ArrayList<T>(), predicate) } public fun <T> Sequence<T>.filter( predicate: (T) -> Boolean ): Sequence<T> { return FilteringSequence(this, true, predicate) }
Sequence processing functions are not invoked until the terminal operation (an operation that returns something different than Sequence). For instance, for Sequence, filter is an intermediate operation, so it doesn’t do any calculations; instead, it decorates the sequence with the new processing step. Calculations are done in a terminal operation like toList or sum.
val list = listOf(1, 2, 3) val listFiltered = list .filter { print("f$it "); it % 2 == 1 } // f1 f2 f3 println(listFiltered) // [1, 3] val seq = sequenceOf(1, 2, 3) val filtered = seq.filter { print("f$it "); it % 2 == 1 } println(filtered) // FilteringSequence@... val asList = filtered.toList() // f1 f2 f3 println(asList) // [1, 3]
- They keep the natural order of operations.
- They do a minimal number of operations.
- They can be infinite.
- They do not need to create collections at every step.
listOf(1, 2, 3) .filter { print("F$it, "); it % 2 == 1 } .map { print("M$it, "); it * 2 } .forEach { print("E$it, ") } // Prints: F1, F2, F3, M1, M3, E2, E6, sequenceOf(1, 2, 3) .filter { print("F$it, "); it % 2 == 1 } .map { print("M$it, "); it * 2 } .forEach { print("E$it, ") } // Prints: F1, M1, E2, F2, F3, M3, E6,

for (e in listOf(1, 2, 3)) { print("F$e, ") if (e % 2 == 1) { print("M$e, ") val mapped = e * 2 print("E$mapped, ") } } // Prints: F1, M1, E2, F2, F3, M3, E6,

find:(1..10) .filter { print("F$it, "); it % 2 == 1 } .map { print("M$it, "); it * 2 } .find { it > 5 } // Prints: F1, F2, F3, F4, F5, F6, F7, F8, F9, F10, // M1, M3, M5, M7, M9, (1..10).asSequence() .filter { print("F$it, "); it % 2 == 1 } .map { print("M$it, "); it * 2 } .find { it > 5 } // Prints: F1, M1, F2, F3, M3,
first, find, take, any, all, none, or indexOf.generateSequence or sequence.generateSequence needs the first element and a function specifying how to calculate the next one:generateSequence(1) { it + 1 } .map { it * 2 } .take(10) .forEach { print("$it, ") } // Prints: 2, 4, 6, 8, 10, 12, 14, 16, 18, 20,
sequence uses a suspending function (coroutine[^51_1]) that generates the next number on demand. Whenever we ask for the next number, the sequence builder runs until a value is yielded using yield. The execution then stops until we ask for another number. Here is an infinite list of Fibonacci numbers:import java.math.BigDecimal val fibonacci: Sequence<BigDecimal> = sequence { var current = 1.toBigDecimal() var prev = 1.toBigDecimal() yield(prev) while (true) { yield(current) val temp = prev prev = current current += temp } } fun main() { print(fibonacci.take(10).toList()) // [1, 1, 2, 3, 5, 8, 13, 21, 34, 55] }
print(fibonacci.toList()) // Runs forever
take, or we need to use a terminal operation that will not need all elements, like first, find or indexOf. Basically, these are also the operations for which sequences are more efficient because they do not need to process all elements.any, all, and none should not be used without being limited first. any can only return true or run forever. Similarly, all and none can only return false.List. This could be an advantage as we have something ready to be used or stored after every step, but this comes at a cost. Such collections need to be created and filled with data at every step.numbers .filter { it % 10 == 0 } // 1 collection here .map { it * 2 } // 1 collection here .sum() // In total, 2 collections created under the hood numbers .asSequence() .filter { it % 10 == 0 } .map { it * 2 } .sum() // No collections created
readLines returns List<String>):// BAD SOLUTION, DO NOT USE COLLECTIONS FOR // POSSIBLY BIG FILES File("ChicagoCrimes.csv").readLines() .drop(1) // Drop descriptions of the columns .mapNotNull { it.split(",").getOrNull(6) } // Find description .filter { "CANNABIS" in it } .count() .let(::println)
OutOfMemoryError.Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
useLines function, which always operates on a single line:File("ChicagoCrimes.csv").useLines { lines -> // The type of `lines` is Sequence<String> lines.drop(1) // Drop descriptions of the columns .mapNotNull { it.split(",").getOrNull(6) } // Find description .filter { "CANNABIS" in it } .count() .let { println(it) } // 318185 }
CrimeData.csv file with the same crimes but with a size of only 728 MB. Then, I did the same processing. The first implementation, which uses collection processing, took around 13s; the second one, which uses sequences, took around 4.5s. As you can see, using sequences for big files is good not only for memory but also for performance.map, or the cost is bigger when it is not, like when we use filter. Knowing the size of a new list is important because Kotlin generally uses ArrayList as the default list, which needs to copy its internal array when its number of elements is greater than internal capacity (whenever this copy operation is growing, internal capacity grows by half). Sequence processing only creates a collection when it ends with toList, toSet, etc. However, notice that toList on a sequence does not know the size of the new collection in advance.fun singleStepListProcessing(): List<Product> { return productsList.filter { it.bought } } fun singleStepSequenceProcessing(): List<Product> { return productsList.asSequence() .filter { it.bought } .toList() }
filter function is inline). However, when you compare functions with more than one processing step, like the functions below which use filter and then map, the difference will be visible for bigger collections. To see the difference, let's compare typical processing with two and three processing steps for 5,000 products:fun twoStepListProcessing(): List<Double> { return productsList .filter { it.bought } .map { it.price } } fun twoStepSequenceProcessing(): List<Double> { return productsList.asSequence() .filter { it.bought } .map { it.price } .toList() } fun threeStepListProcessing(): Double { return productsList .filter { it.bought } .map { it.price } .average() } fun threeStepSequenceProcessing(): Double { return productsList.asSequence() .filter { it.bought } .map { it.price } .average() }
productsList:twoStepListProcessing 81 095 ns
twoStepSequenceProcessing 55 685 ns
twoStepListProcessingAndAcumulate 83 307 ns
twoStepSequenceProcessingAndAcumulate 6 928 ns
sorted is an example from Kotlin stdlib (currently, this is the only example). sorted uses an optimal implementation: it accumulates the Sequence into List and then uses sort from Java stdlib. The disadvantage is that this accumulation process takes longer than the same processing on a Collection (however, if Iterable is not a Collection or array, then the difference is not significant because it also needs to be accumulated).Sequence has methods like sorted is controversial because sequences that contain a method that requires all elements to calculate the next one are only partially lazy (evaluated when we need to get the first element) and they don’t work on infinite sequences. It was added because it is a popular function and is much easier to use in this way; however, Kotlin developers should remember its flaws, especially the fact that it cannot be used on infinite sequences.generateSequence(0) { it + 1 }.take(10).sorted().toList() // [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] generateSequence(0) { it + 1 }.sorted().take(10).toList() // Infinite time. Does not return.
sorted is a rare example of a processing step that is faster on Collection than on Sequence. Still, when we do a few processing steps and a single sorted function (or another function that needs to work on the whole collection), we might expect some performance improvement using sequence processing.productsList.asSequence() .filter { it.bought } .map { it.price } .sorted() .take(10) .sum()
productsList.asSequence() .filter { it.bought } .map { it.price } .average() productsList.stream() .filter { it.bought } .mapToDouble { it.price } .average() .orElse(0.0)
-
Kotlin sequences have many more processing functions (because they are defined as extension functions), and they are generally easier to use (this is because Kotlin sequences were designed when Java streams were already in use; for instance, we can collect using
toList()instead ofcollect(Collectors.toList())) - Java stream processing can be started in parallel mode using a parallel function. This can give us a huge performance improvement in contexts in which we have a machine with multiple cores that are often unused (which is common nowadays). However, use this with caution as this feature has known pitfalls[^51_2].
- Kotlin sequences can be used in common modules, Kotlin/JVM, Kotlin/JS, and Kotlin/Native modules. Java streams can be only used in Kotlin/JVM, and only when the JVM version is at least 8.

- They keep the natural order of operations.
- They do a minimal number of operations.
- They can be infinite.
- They do not need to create collections at every step.
[^51_2]: The problems come from the common join-fork thread pool they use, which means that one process can block another. There’s also a problem with the fact that single element processing can block other elements. Read more about it here: kt.academy/l/java8-streams-problem
[^51_3]: You can find this database at data.cityofchicago.org
[^51_4]: Processor 2.6 GHz Intel Core i7, Memory 16 GB 1600 MHz DDR3
