A binary search tree in Kotlin pt. 2: Generic node
Posted by Simon Larsén in Programming
Welcome to part 2 of my series on the idiomatic Kotlin binary tree! In this part, we're gonna have a look at how to make the node representation from part 1 capable of carrying any kind of data (i.e. generic).
Series index
- Representing a node
- Generic node (this part!)
- Generic BST with insert, contains and traversal (coming soon!)
Attribution and reading recommendations
In this part, we'll start working a little bit with binary tree algorithms. More
specifically, we'll complete the contains
function from part 1. All of the
algorithms I implement in this series are based on
this article by Stefan Nilsson.
If you are unfamiliar with the concepts of binary trees, I highly recommend that
you sift through that article before continuing with this one.
Improving contains
Recall the contains
function that we started working on in part 1. Before we
start working on generics, I want us to complete this function. It will make
some of the decisions about generics much more apparent. Anyway, here's
contains
as we wrote it in part 1.
// check if data is contained in node
fun contains(node: ANode, data: Int): Boolean = when (node) {
is Empty -> false
is Node -> data == node.data // note implicit cast
}
It's not a particularly useful function, as it only checks the current node.
What we'd rather have it do is check the entire subtree, in which node
is the
root. This can quite easily be performed with recursion. The Empty
case still
stands, if we hit an empty node, the data is not contained in the tree. If
node
is a Node
, on the other hand, there are three possibilities:
data < node.data
, in which case we keep searching in the left subtree.data > node.data
, in which case we keep searching in the right subtree.data
is neither smaller or larger thannode.data
, so they are comparably equal. If this actually means that they are equal or not is implementation specific, but it is highly recommended that ordering is consistent withequals
. We will assume that this is the case.
As we have three distinct cases, we can again use a when
expression.
// check if data is contained in the tree rooted in node
fun contains(node: ANode, data: Int): Boolean = when (node) {
is Empty -> false
is Node -> when { // no-argument when so we can do arbitrary comparisons
data < node.data -> contains(node.left, data)
data > node.data -> contains(node.right, data)
else -> true
}
}
Note that the nested when
expression has no argument in parentheses, allowing
us to perform more complex operations in the matchings. And that's it for the
contains
operation. It's really quite elegant. We can quickly ammend the
main
function to try it out:
fun main(args: Array<String>) {
// create the tree
// 6
// / \
// 3 9
// \
// 4
val root = Node(6,
left = Node(3,
right = Node(4)),
right = Node(9))
println("Search for elements in the tree")
for (data in listOf(6, 3, 4, 9)) {
println(contains(root, data))
}
println("Search for elements not in the tree")
for (data in listOf(10, 2, 1, -12)) {
println(contains(root, data))
}
}
With that out of the way, let's dive into making the whole thing generic!
Getting generic
Now, how do we make the node classes generic? A first attempt might be to just
change the Node
class, and do something like this:
data class Node<T>(val data: T, var left: ANode = Empty, var right: ANode = Empty) : ANode()
fun <T : Comparable<T>> contains(node: ANode, data: T): Boolean = when (node) {
is Empty -> false
is Node<T> -> when {
data < node.data -> contains(node.left, data)
data > node.data -> contains(node.right, data)
else -> true
}
}
This is a reasonable attempt. Before we dive into the problems, let's analyze
what we just did. After fun
we write <T : Comparable<T>>
, which makes this
function generic. The addition states that the type parameter T
can be
substituted by any type that implements Comparable<T>
(which we need to be
able to use <
and >
). Node<T>
simply defines a type parameter T
that
can be substituted with any type. It is a bit inconvenient that we could put
non-comparable types in a Node
object, but we'll see that it sorts itself out
when we create the Tree
class in part 3. For now, just ignore that detail.
Now, the above code won't compile, for multiple reasons. The first problem is
that we can't ask at runtime if node is Node<T>
(the compiler will say Cannot
check for erased type: Nodeis Node<*>
, but then we run into the real
showstopper: we don't know whether data
and node.data
are actually
comparable, as they might not have the same type. With the current class
hierarchy, there is no reasonable way around this. An ANode
is not
parameterized, and therefore the dynamic type of an ANode
can be any Node
type (e.g. Node<Int>
, Node<String>
etc) or Empty
. We have to put the type
parameter T
higher up in the inheritance chain.
Inheriting from a generic class
Since ANode
is the only class higher up in the inheritance chain (apart from
Any
), this is where we need to put our type parameter. For ANode
and Node
,
it is straightforward.
sealed class ANode<T>
data class Node<T>(val data: T, var left: ANode<T> = Empty, var right: ANode<T> = Empty) :
ANode<T>()
Note that Node<T>
is inheriting from ANode<T>
. We cannot (and wouldn't want
to, anyway) leave it as ANode
, because unlike Java, Kotlin does not support
raw types. Now, since we cannot inherit from ANode
, but must specify the type
parameter with a concrete type, what do we put there for Empty
? In the case of
Node<T>
, we simply inherit from ANode<T>
, because T
is declared as a
parameter to Node<T>
and is therefore concrete for for ANode
. We can't
however just do something like
object Empty : ANode<T>()
because T
is not declared in that scope. We also can't give Empty
a type
parameter, because Empty
is a singleton object, and type parameters simply
don't work with singletons (it wouldn't make much sense, if you stop to think
about it for a while). What we actually want to put as the type parameter, is
nothing. Literally. We wan't Nothing
, a concrete type in Kotlin which is a
subtype of every non-nullable type.
object Empty : ANode<Nothing>()
Note that we don't want to put Any
(which is the supertype of every
non-nullable type), because the we wouldn't be able to assign Empty
to any
concrete type of ANode<T>
. Now you may be angrily shouting that you
still can't assign Empty
to any type of ANode<T>
. Unfortunately,
ANode<T>
(for any substition of T
) is invariant. Let's fix that.
Generics are invariant by default, but Kotlin can stretch the rules
Any generc class is invariant by default. What does this mean? In short, it
means that a generic type (e.g. ANode<Int>
) is not a supertype, nor subtype,
of any other type. Formally, it means that if we have two types A
and B
such that B
is a subtype of A
, ANode<B>
is not a subtype of ANode<A>
.
Take for example Empty
, which subtypes ANode<Nothing>
. It is not a subtype
of ANode<Int>
, even though Nothing
is a subtype of Int
. Inconvenient, we
want Empty
to be a subtype of any concrete ANode
. We can achieve this by
using the out
modifier, and declaring Anode<out T>
. Formally, we make
ANode
covariant on the type parameter T
. We can only do this if every use
of T
is in an out position (i.e. return values). Note that this
restriction applies only to the body of ANode
, the T
in Node<T>
is not
the same type parameter as in ANode<out T>
. If you found all of that
confusing (I sure did the first time I read about it), you can read more about
variance in the
Kotlin docs on generics.
Here is what the working class hierarchy looks like:
sealed class ANode<out T>
object Empty : ANode<Nothing>()
data class Node<T>(
val data: T,
var left: ANode<T> = Empty,
var right: ANode<T> = Empty) : ANode<T>()
contains
now looks like this:
// check if data is contained in node
fun <T : Comparable<T>> contains(node: ANode<T>, data: T): Boolean = when (node) {
is Empty -> false
is Node<T> -> when {
data < node.data -> contains(node.left, data)
data > node.data -> contains(node.right, data)
else -> true
}
}
Now, you may be asking yourself why in the world we can do the is Node<T>
check now, while we could not before? Well, we're not actually checking at
runtime whether node
is Node<T>
, because the compiler knows any variable
with the static type ANode<T>
is either Empty
, or Node<T>
. So, for
example, ANode<Int>
must have dynamic type Empty
or Node<Int>
, there are
no other possibilities. As the compiler knows this, we can in fact skip the
<T>
and just write is Node
. That's all we need of our node classes, so we
can move on to implement the Tree
class in part 3!
Final code listing
This is the final state of the code that we'll be using for part 3.
sealed class ANode<out T>
object Empty : ANode<Nothing>()
data class Node<T>(
val data: T,
var left: ANode<T> = Empty,
var right: ANode<T> = Empty) : ANode<T>()
// check if data is contained in node
fun <T : Comparable<T>> contains(node: ANode<T>, data: T): Boolean = when (node) {
is Empty -> false
is Node<T> -> when {
data < node.data -> contains(node.left, data)
data > node.data -> contains(node.right, data)
else -> true
}
}
fun main(args: Array<String>) {
// create the tree
// 6
// / \
// 3 9
// \
// 4
val root = Node(6,
left = Node(3,
right = Node(4)),
right = Node(9))
println("Search for elements in the tree")
for (data in listOf(6, 3, 4, 9)) {
println(contains(root, data))
}
println("Search for elements not in the tree")
for (data in listOf(10, 2, 1, -12)) {
println(contains(root, data))
}
}