A binary search tree in Kotlin pt. 1: Representing a node
Posted by Simon Larsén in Programming
In my journey to become a somewhat competent Kotlin developer, I've decided to implement a few of the basic data structures that I've picked up during my three years of computer science studies. First up, we have a generic binary tree. This is an interesting case, because it lets us both delve into generics in Kotlin, and some aspects of inheritance that differ from inheritance in Java (in a good way!). As I want to cover some topics in depth, this will be a three-part series with the following content:
- In the first part, we'll develop a basic class hierarchy for representing
nodes. It covers object types, data classes and sealed classes,
as well as some other related topics. The node is however restricted to only
carry
Int
data. - In the second part, we'll expand upon the class hierarchy from part 1 to create a generic node class that can hold any type of data.
- Finally, we'll use the results of part 2 to develop a rudimentary binary tree in (what is in my opinion) idiomatic Kotlin.
If you already know about object types, data classes and sealed classes, I recommend that you skip directly to part 2. If you are already comfortable with generics, including generic inheritance, you may skip directly to part 3.
Series index
- Representing a node (this part!)
- Generic node
- Generic BST with insert, contains and traversal (coming soon!)
Goals and intended audience
I write articles mostly for myself, and as such, this article series is intended for developers with some experience with Java looking to get into Kotlin. Let's get at it then, shall we?
Representing a node: A Java-like attempt
As I see it, a tree node can be one of two things: existent, or non-existent.
In other words, it can be a node or an empty node. As Kotlin is, thankfully,
quite adverse to using null
, I will refrain from doing so as well. So what we
want is an abstract node class ANode
and sub-classes Node
and Empty
. Let's
give it a first try in a pretty Java-like manner, and then improve upon it with
some neat Kotlin language constructs.
abstract class ANode
class Empty : ANode()
class Node : ANode {
val data: Int
var left: ANode
var right: ANode
constructor(data: Int) : super() {
this.data = data;
right = Empty()
left = Empty()
}
}
If you've had any experience with any remotely Java-looking language, you can
probably guess what's going on here. There's the abstract ANode
class, the
Empty
class representing the absence of a node and the Node
class
representing an actual node. Note also that we have not delved into generics
yet, this is a node that can only hold Int
data. That's fine for now, we'll
expand upon this implementation with generics in part 2. When we later implement
the binary tree, we will often want to distinguish between a Node
and Empty
.
One such case is when we search the tree for a given value, to see if it is
contained in the tree. This operation can be succinctly expressed using
recursion, but let us leave that for part 2. For now, let's just check the first
node (the root), without exploring its children.
// check if data is contained in node
fun contains(node: ANode, data: Int): Boolean = when (node) {
is Empty -> false
is Node -> data == node.data // note implicit cast
else -> throw IllegalArgumentException("node argument was neither Empty nor Node!")
}
This is of course a pretty stupid function at this point, but we'll make it much
more worthwhile in part 2. Note the
expression body
used here, in combination with a
when
expression.
If you are unfamiliar with those concepts, follow the links and read up on them,
they will be crucial when implementing the tree algorithms in parts 2 and 3.
Also note the
implicit cast occurring
on the second line of the function. Since we used is Node
to match node
, the
compiler can infer that node
is in fact a Node
object, and we can safely
dereference it with node.data
! Finally, note that the else
case is needed as
the compiler does not know that there are only two subclasses of ANode
(even
though we currently do, in this very small project). We'll see how to resolve
that shortly. Let's try this function out:
>>> contains(Empty(), 2)
false
>>> contains(Node(2), 2)
true
>>> contains(Node(2), 3)
false
It seems to work just fine. We could leave the class hierarchy like this and jump straight into generics. There are, however, three notable problems with the node classes.
- For each empty node we need, a new instance of
Empty
is created. This is wasteful. - The body of
Node
is a whole lot of code for very little functionality. - The compiler can't tell that
Node
andEmpty
are the only subtypes ofANode
, forcing us to use anelse
in thewhen
expression.
As it turns out, all of these problems are easy to solve in Kotlin!
Problem 1 solution: Singleton objects
Problem number 1 can be solved very easily, as Kotlin has language support for the singleton pattern. We simply swap this declaration
class Empty : ANode()
for this declaration
object Empty : ANode()
Empty
is now a singleton object, so we can assign it without instantiating
Empty
s all over the place. The constructor for Node
now looks like this:
constructor(data: Int) : super() {
this.data = data;
right = Empty // note the lack of parentheses!
left = Empty
}
One problem solved, two to go!
Problem 2 solution: Primary constructors and data classes
We can solve problem number 2 with Kotlin's syntax for
primary constructors.
Instead of defining Node
the Java way, we do it the Kotlin way:
class Node(val data: Int, var right: ANode = Empty, var left: ANode = Empty) : ANode()
This is almost equivalent to the previous declaration, with the exception that
right
and left
are assigned default values in the header such that they can
be replaced by explicit arguments when calling the constructor. Note that
ANode
must be instantiated right there in the header as well. However, since
we know that Node
will always be a simple container, we can do one better here
by prepending data
to the declaration.
data class Node(val data: Int, var right: ANode = Empty, var left: ANode = Empty) : ANode()
This makes Node
a
data class, which
among other things come with implementations of equals
and toString
. A
fortunate accident here is that the toString
of Node
will actually let us
view the whole tree with very little effort, as toString
will be called on
both left
and right
, recursively (this will be demonstrated in part 3). Do
be careful not to create a cycle, though, as this will cause a stack overflow,
endlessly calling toString
(a tree, by definition, has no cycles, so we are
good in this case).
Problem 3 solution: Sealed classes
To reiterate, the problem was that the compiler can't tell that Node
and
Empty
are the only subtypes of ANode
. Therefore, we needed the else
in the
when
expression to cover up the non-existent case of the argument to
contains
being anything else.
// check if data is contained in node
fun contains(node: ANode, data: Int): Boolean = when (node) {
is Empty -> false
is Node -> data == node.data
else -> throw IllegalArgumentException("node argument was neither Empty nor Node!")
}
We can, however, tell the compiler that Node
and Empty
are the only
subtypes by making ANode
a
sealed
class.
Any subclass of a sealed class must be declared inside the same file, which lets
the compiler know precisely which subtypes can exist. To accomplish this, we
simply replace the abstract
modifier with sealed
(because sealed
implies
abstract
, we don't need the latter).
sealed class ANode
We can now drop the else
from contains
, because the compiler knows that a
variable with static type ANode
is either Empty
, or a Node
, there are no
other possibilities.
// check if data is contained in node
fun contains(node: ANode, data: Int): Boolean = when (node) {
is Empty -> false
is Node -> data == node.data
}
Let's give it a try
>>> contains(Node(2), 2)
true
>>> contains(Empty, 2)
false
Neat, now we have a good base for venturing into the fraught land of generics in part 2.
Final code listing
The final version of the code, that we'll use in part 2, can be found below. I've also included a main function such that you can run the code in your preferred way, right off the bat!
sealed class ANode
object Empty : ANode()
data class Node(val data: Int, var left: ANode = Empty, var right: ANode = Empty) : ANode()
// check if data is contained in node
fun contains(node: ANode, data: Int): Boolean = when (node) {
is Empty -> false
is Node -> data == node.data // note implicit cast
}
fun main(args: Array<String>) {
println(contains(Empty, 2))
println(contains(Node(2), 2))
println(contains(Node(2), 3))
}