## Description

## 2-3 Trees: Definition

Suppose that E is an ordered type, that is, a nonempty set of values that have a total order. A

2-3-tree, for type E, is a finite rooted tree T (like a binary search tree or a red-black tree) that

satisfies the following 2-3 Tree Properties:

(a) Every leaf in T stores an element of E. and the elements stored at the leaves of T are

distinct. Thus T represents a finite subset of E, namely, the set of values stored in the

leaves of T.

(b) Every internal node of T has (exactly) either two or three children — which are each either

leaves or internal nodes of T.

(c) If an internal node x has exactly two children — a first child, which is the root of a first

subtree of the subtree with root x, and a second child, which is the root of a second

subtree of the subtree with root x — then each of the values stored at the leaves of the

first subtree is less than each of the values stored at the leaves of the second subtree.

(d) If an internal node x has exactly three children — a first child, second child and third

child, which are the roots of the first subtree, second subtree and third subtree of the

subtree with root x, respectively — then each of the values stored at the leaves of the first

subtree is less than each of the values stored at the leaves of the second subtree, and

1

(3, 9, 20)

1 3

(1, 3)

10 16 20

(5, 9) (10, 16, 20)

5 9

Figure 1: A 2-3 Tree

each of the values stored at the leaves of the second subtree is less than each of the

values stores at the leaves of the third subtree.

(e) If an internal node x of T has exactly two children then x stores a 2-tuple (m1, m2) —

where m1 is the largest value stored at any leaf of the first subtree, and m2 is the largest

value stored at any leaf of the second subtree of the subtree with root x.

(f) If an internal node x of T has exactly three children then x stores a 3-tuple (m1, m2, m3)

— where m1 is the largest value stored at any leaf of the first subtree, m2 is the largest

value stored at any leaf of the second subtree, and m3 is the largest value stored at any

leaf of the third subtree of the subtree with root x.

(g) Every leaf of T is at the same level of the tree — that is, the number of edges from the root

down to each leaf is the same.

(h) Each node x of T is the root of a 2-3 tree as well. That is, the subtree with root x also

satisfies properties (a)–(g), for every node x in T.

For example, the tree shown in Figure 1 is a 2-3 tree representing the set of integers

{1, 3, 5, 9, 10, 16, 20}.

Exercise: Confirm that this tree satisfies the “2-3 Tree Properties” that have now been given.

Note: While a variety of references (that are available online) describe “2-3 Trees”, they do not

all describe the same kind of tree and, in particular, they do not necessarily describe the kind

of tree that is to be considered in this assignment!

It will be useful to measure costs of operations on 2-3 trees in terms of the sizes of the sets

of elements of E that they represent, so that it will be helpful to bound the number of internal

nodes in this way too.

2

1. Let T be a nonempty 2-3 tree, so that it includes at least one node. Prove that if T

represents a subset S ⊆ E such that |S| = n ∈ N, then T has at most n − 1 internal

nodes.

It follows that (if the uniform cost criterion is used to define this) the amount of storage space

needed to represent a non-empty 2-3 tree is at most linear in the size of the subset of E that it

represents.

The depth of an empty 2-3 tree will be defined to be −1, just like the depth of an empty binary

search tree. Otherwise it is defined to be the number of edges in a longest path from the root

to a leaf in T — which is the number of edges in any path from the root down to any leaf in T,

by 2-3 Tree Property (g). Useful relationships between the depth of a 2-3 tree and the size of

the subset of E it represents can now be established.

2. Let T be a non-empty 2-3 tree with depth d ∈ N and let n be the size of the subset of E

represented by T. Prove that 2

d ≤ n ≤ 3

d

.

3. Let d ∈ N and suppose that |E| ≥ 2

d

. Prove that there exists a non-empty 2-3 tree with

depth d, representing a subset of E with size exactly 2

d

.

The proof used to complete Problem #3 can be modified to prove that if d ∈ N and |E| ≥ 3

d

then

there exists a non-empty 2-3 tree with depth d, representing a subset of E with size exactly 3

d

,

as well. Thus the bounds on sizes of a 2-3 tree, given in Problem #1, are tight. That is, they

cannot be improved.

## 2-3 Trees: Searches

If a search algorithm is to be implemented as a method, provided by some 2-3 tree T, then it

can be considered as an algorithm that solves the following problem1

.

Searching in This 2-3 Tree

Precondition:

(a) This 2-3 tree, T, satisfies the 2-3 Tree Properties given above.

(b) A non-null key, of type E, is given as input.

1

It is assumed here, and in the rest of the assignment, that null elements of E are never stored in a 2-3 tree —

and that no attempts to search for, insert or delete null elements are ever made.

3

Postcondition:

(a) If the key is stored at a leaf in T then the leaf storing this key is returned as

output. A NoSuchElementException is thrown, otherwise.

(b) This 2-3 tree has not been changed, so it still satisfies the 2-3 Tree Properties.

Consider the following problem (which might also be solved using a method provided by this

2-3 tree) as well.

Searching in a Subtree of This 2-3 Tree

Precondition:

(a) This 2-3 tree, T, satisfies the 2-3 Tree Properties given above.

(b) key is a non-null input of type E.

(c) x is a non-null in T that is also given as input.

Postcondition:

(a) If the key is stored in the subtree of T with root x, then the leaf (in this subtree) storing x is returned as output. A NoSuchElementException is thrown

otherwise.

(b) This 2-3 tree has not changed, so it still satisfies the 2-3 Tree Properties.

Suppose that get is an algorithm that correctly solves the second of these problems. Then the

algorithm shown in Figure 2 correctly solves the first of these problems, assuming that T.root

is a reference to the root of this 2-3 tree (so that T.root is null if and only if this is an empty

tree).

It is possible to modify a corresponding algorithm, for searching in a subtree of a binary tree,

to obtain a simple recursive algorithm that solves the second problem. Since this is a 2-3 tree

and the input code x is not null, such an algorithm is as shown in Figure 3. It is assumed

here that if x is not a leaf then x.firstChild is its first child and x.firstMax is the largest

element stored in the first subtree.

Similarly, if x is not a leaf then x.secondChild is its

second child and x.secondMax is the largest element stored in its second subtree — and

if x has three children then x.thirdChild is its third child and x.thirdMax is the largest

element stored in its third subtree.

4. Prove that the depth of the subtree with root x is a bound function for the recursive

algorithm in Figure 3.

4

node search(E key) {

1. if (T.root == null) {

2. Throw a NoSuchElementException

} else {

3. return get(key, root)

}

}

Figure 2: An algorithm to Search in this 2-3 Tree

5. Prove that this algorithm correctly solves the “Searching in a Subtree of This 2-3 Tree”

problem that is given above.

6. Now let Tget(k) be an upper bound for the number of steps used by the algorithm when

the depth of the subtree with the input node x as root is k, for k ∈ N. Use the uniform

cost criterion to define this.

Write a recurrence for Tget(k) as a function of k. Explain why your answer (which you

should make as precise as possible) is correct.

Note: You should be able to use your recurrence to show that Tget(0) ≤ 2, Tget(1) ≤ 7,

and Tget(2) ≤ 12.

7. Use your recurrence to give an upper bound in closed form (that is, not involving a

summation or recurrence) for Tget(k) and prove that your bound is correct. Once again,

this should be a function of k and you should try to make it as precise as possible.

## 2-3 Trees: Insertions

Consider, now, the problem of inserting an element into the set stored by a 2-3 tree. If this

is to be implemented as a method supplied by this tree then it can be considered to solve the

following computational problem.

Insertion into This 2-3 Tree

Precondition:

(a) This 2-3 tree, T, satisfies the 2-3 Tree properties given above.

(b) key is a non-null input of type E.

5

node get(E key, node x) {

1. if (x is a leaf) {

2. if (key is equal to the element stored at x) {

3. return x

} else {

4. Throw a NoSuchElementException

}

5. } else if (x has two children) {

6. if (key ≤ x.firstMax) {

7. return get(key, x.firstChild)

} else {

8. return get(key, x.secondChild)

}

} else {

9. if (key ≤ x.firstMax) {

10. return get(key, x.firstChild)

11. else if (key ≤ x.secondMax) }

12. return get(key, x.secondChild)

} else {

13. return get(key, x.thirdChild)

}

}

}

Figure 3: An Incomplete Algorithm to Search in a Subtree of this 2-3 Tree

Postcondition:

(a) If the input key is not already in the subset of E represented by T then

it is added to this set (with this set being otherwise unchanged). An

ElementFoundException is thrown and the set is not changed, if the key

already belongs to this subset.

(b) T satisfies the 2-3 Tree properties given above.

There are a few cases where this problem is easy, even though a node has to be added:

6

8. Briefly describe how to solve the above problem in the following cases.

(a) T is an empty tree, so that its root is a null node.

(b) T is not empty, but the set it represents has size one (so that its root is a leaf),

Suppose, for the rest of this assignment, that addFirstElement is a simple algorithm (based

on your answer for part (a)) that takes an input key and can be used to complete the insertion

operation when T is empty, and that addSecondElement is another algorithm (based on your

answer for part (b)) that also takes an input key and can be used to complete the insertion

operation when the root of T is a leaf.

A problem that arises, in other cases when 2-3 trees are modified, is that internal nodes temporarily have illegal numbers of children.

With that noted, consider the following Modified Tree Properties, which might be satisfied by

a rooted tree T when an operation is in progress.2

(a) T satisfies 2-3 Tree Properties (a), (c), (d), (e), (f), and (g) — but not, necessarily, 2-3 Tree

Properties (b) or (h).

(b) Every internal node of T has (exactly) either one, two, three or four children — which are

each either leaves or internal nodes of T. There is at most one internal node of T that

has either (exactly) one or four children — all other internal nodes of T have either exactly

two children or exactly three children.

(c) If an internal node x of T has exactly one child, then this is called a first child, which is

the root of a first subtree of the subtree with root x. In this case, x stores a 1-tuple (m1)

— where m1 is the largest value stored at any leaf of the first subtree of the subtree with

root x.

(d) If an internal node x of T has exactly four children — a first child, second child, third

child and fourth child, which are the roots of the first subtree, second subtree, third

subtree and fourth subtree of the subtree with root x, respectively — then each of the

values stored at the leaves of the first subtree is less than each of the values stored at the

leaves of the second subtree, each of the values stored at the leaves of the second subtree

is less than each of the values stored at the leaves of the third subtree, and each of the

values stored at the leaves of the third subtree is less than each of the values stored at

the leaves of the fourth subtree.

(e) If an internal node x of T has exactly four children then x stores a 4-tuple (m1, m2, m3, m4)

— where m1 is the largest value stored at any leaf of the first subtree, m2 is the largest

2As for the “2-3 Tree Properties”, some of these introduce terminology that will be used in the rest of the

assignment.

7

value stored at any leaf of the second subtree, m3 is the largest value stored at any leaf of

the third subtree, and m4 is the largest value stored at any leaf of the fourth subtree of the

subtree with root x.

(f) Each node x of T is the root of a rooted tree satisfying these properties as well. That is,

the subtree with root x also satisfies the above properties (a)–(e), for every node x in T.

Consider the following computational problem. It might not be clear why this is useful as

described here (but it will be).

## Insertion into Subtree of This 2-3 Tree

Precondition:

(a) This 2-3 tree, T, satisfies the 2-3 Tree properties given above.

(b) key is a non-null input of type E.

(c) x is a non-null node in T.

(d) Either key is not stored in any leaf of T at all, or it is stored in a leaf of the subtree

of T with root x. If it is not stored at any of these leaves then it is possible to add

a new leaf storing this key, as a new child of one of the internal nodes in the

subtree with root x (without making other changes) in such a way that T satisfies

the “Modified Tree” properties.

Postcondition:

(a) If the input key already belongs to the subset of E stored at the leaves in the

subtree of T with root x, then an ElementFoundException is thrown and T is

not changed.

(b) If x is a leaf that stores an element of E that is not equal to the input key then a

NoSuchElementException is thrown and T is not changed.

(c) If x is an internal node and the input key does not (initially) belong to the subset

of E stored at the leaves of the subtree of T with root x, then

• the input key is added to the subset of E stored at the leaves of T — which

is otherwise unchanged;

• either T satisfies the 2-3 Tree properties, given above, or T satisfies the

“Modified Tree” properties, given above, and x is now an internal node with

four children.

Suppose, now, that insertIntoSubtree is a method that solves this computational problem.

8

void insert(E key) {

1. if (T.root == null) {

2. addFirstElement(key)

3. } else if (T.root is a leaf) {

4. addSecondElement(key)

} else {

5. insertIntoSubtree(key, T.root)

6. if (T.root has four children) {

7. fixRoot()

}

}

}

Figure 4: An Algorithm for the “Insertion into This 2-3 Tree” Problem

9. Suppose that the root of T is an internal node and the insertIntoSubtree method

is called with an input key ∈ E and with the root of T as input. Suppose, as well, that

the input key was not initially stored at any leaf of T — and that the root of T has four

children when the execution of this method ends.

Briefly describe how to modify T so that the 2-3 Tree properties are restored (that is,

T is a 2-3 tree, once again) — without changing the subset of E represented by T, and

using only a constant number of steps.

Suppose, now, that fixRoot is a method (with no inputs) that can be used to carry out the

computation described in the above question — and that this method is only applied when T

satisfies the above “Modified Tree” properties and its root is an internal node with four children.

If the addFirstElement, addSecondElement, insertIntoSubtree and fixRoot methods are as described above, then an insert method that solves the “Insertion into This 2-3

Tree” problem is as shown in Figure 4 — where “ T.root” is a reference to the root of T.

The only method that has not been described is the insertIntoSubtree method that solves

the above “Insertion into Subtree of This 2-3 Tree” problem, and which is used at line 5. It will

be useful to describe another pair of methods that will be used by this one.

• Suppose that T satisfies the above “Modified Tree” properties and x is an internal node

of T, whose children are also internal nodes, such that either

– one of the children of x has four children, or

9

– there is no internal node of T with either one or four children at all, so that T is a

2-3 tree.

Suppose, as well, that if it called with input x, then the raiseSurplus method modifies T, without changing the subset of E represented by T or changing the position of x

in T (that is, without changing the distance or path from x to the root), so that T still

satisfies the above “Modified Tree” properties and either

– x has four children, or

– there is no internal node of T with either one or four children, so that T is a 2-3 tree.

One way to perform this computation is to create an ArrayList of nodes, and another

ArrayList of the largest elements of E stored at the subtrees with these nodes as

roots. If none of the children of x has four children then the first ArrayList will simply

store these children.

On the other hand, if some child of x has four children then this

should be replaced by two internal nodes, that each have two children, namely, two of

the four children of this child. In effect this child of x should be split into a pair of internal

nodes, increasing the number of children of x by one.

With a bit of work, it can be confirmed that the number of steps needed for an execution

of this method is bounded by a constant.

• Suppose, instead, that T is a 2-3 tree, x is an internal node of T whose children are

leaves, key is a non-null element of E that is not stored in any of the leaves of T, and

the the “Modified Tree” properties would still be satisfied if x was given another child,

namely, a new leaf storing the key.

Suppose that, if called with the above key and internal node x as inputs, the method

addLeaf adds another leaf, storing the input key, as another child of x — so that T still

satisfies the “Modified Tree” properties, the subset of E represented has been changed

by including key (and making no other changes), and either x has four children or T is

a 2-3 tree.

Once again, an ArrayList — of elements of E, to be stored at the children of x —

can be useful as part of an implementation this method, and this method can also be

implemented so that the number of steps used by an execution of the method is at most

a constant.

Suppose, now, that the insertIntoSubtree method is as shown in Figure 5 on page 11. It

is assumed that x.firstChild, x.secondChild and x.thirdChild are the first, second

and third children of x, respectively, and that x.firstMax, x.secondMax and x.thirdMax

are the largest elements of E stored that the leaves of the subtrees of T with these children

of x as roots (when these children exist).

10

void insertIntoSubtree(E key, node x) {

1. if (x is a leaf) {

2. Set e to the be the element of E stored at x

3. if (e is equal to the input key) {

4. Throw an ElementFoundException

} else {

5. Throw a NoSuchElementException

}

6, } else {

7. try {

8. if (x has two children) {

9. if (key ≤ x.firstMax) {

10. insertIntoSubtree(key, x.firstChild)

} else {

11. insertIntoSubtree(key, x.secondChild)

}

} else {

12. if (key ≤ x.firstMax) }

13. insertIntoSubtree(key, x.firstChild)

14. } else if (key ≤ x.secondMax) {

15. insertIntoSubtree(key, x.secondChild)

} else {

16. insertIntoSubtree(key, x.thirdChild)

}

};

17. raiseSurplus(x);

18. } catch (NoSuchElementException ex) {

19. addLeaf(key, x)

}

}

}

Figure 5: The insertIntoSubtree Method

10. Suppose that the precondition for the “Insertion into Subtree of This 2-3 Tree” problem

11

is satisfied and the insertIntoSubtree method is executed with the input key and a

node x that is a leaf in T. Tracing through the execution of the pseudocode shown in

Figure 5, as needed, explain why this method would terminate with the postcondition for

the “Insertion into Subtree of This 2-3 Tree” problem satisfied.

11. Suppose, again, that the precondition for the “Insertion into Subtree of This 2-3 Tree”

problem is satisfied and the insertIntoSubtree method is executed with the key

and with an internal node x whose children are leaves, as inputs. Tracing through the

execution of the pseudocode shown in Figure 5, as needed, explain why this method

would terminate with the postcondition for the “Insertion into Subtree of This 2-3 Tree”

problem satisfied.

Note: It might be helpful to compare the algorithms shown in Figures 3 and 5, and make

use of similarities in their structure and analysis.

12. Describe how it can be proved that the insertIntoSubtree algorithm correctly solves

the “Insertion into Subtree of This 2-3 Tree” problem. Your answer does not need to be a

complete proof.

However, you should identify the proof technique that is used, describe

what you are inducting on and give both an “Inductive Hypothesis” and “Inductive Claim”

whenever induction is to be used, and briefly describe any cases that must be handled

(including, but not limited to, situations considered when solving the above problems).

13. Briefly describe how to prove that the number of steps executed by this method (when

it starts with its problem’s precondition satisfied) is in O(depth(x)) where x is the node

given as input — and where the Uniform Cost criterion is used to define this number of

steps.

It is not necessary to give a complete proof. However, a bound function for this method

should be identified. and you should explain how the upper bound for the number of

steps, given above, is established.

Once again, an examination of the insert method shown in Figure 4 should confirm that this

correctly solves the “Insertion into This 2-3 Tree” problem. Furthermore, the number of steps

used by an execution of this method is at most linear in the depth of this tree T.

2-3 Trees: Deletions

Now consider the problem of deleting a value from the subset of E represented by a 2-3 tree.

This problem can be defined as follows.

12

Deletion from This 2-3 Tree

Precondition:

(a) This 2-3 tree, T, satisfies the 2-3 Tree properties given above.

(b) key is a non-null input of type E.

Postcondition:

(a) If the input key belongs to the subset of E represented by T then

the key is removed from this set, which is otherwise unchanged. A

NoSuchElementException is thrown and T is not changed if the key does

not belong to this subset of E.

(b) T satisfies the 2-3 Tree properties given above.

14. Describe related problems to be solved, algorithms that solve these problems, and

sketches of proofs of correctness, as needed to describe a solution for this computational problem too.

It will probably be helpful to consider the development of the solution for the “Insertion

into This 2-3 Tree” problem — that you have completed if you have solved the problems

before this — and describe significant differences between these two problems and their

solutions.

• In both case some sort of “problem node” has probably been identified and then

moved from the area near a leaf up to the root of the tree. Say what kind of “problem

node” is involved for the “Deletion” problem instead of the “Insertion” one.

• You may find that lots of the related problems are similar to the ones needed for

“Insertion” problem. Make sure that you clearly identify which problems are similar, as you go, and describe the differences in enough detail so that a reader can

understand this.

One difference might be that grandchildren of a node should be considered, at a

place where the children a node got considered by the corresponding part of the

“Insertion” algorithm.

Please allow lots of time to answer this question! It will probably required more

work (and a different kind of activity) than the questions before it.

13

## Implementation as a Java Program

15. An incomplete Java program, TwoThreeTree.java, is now available. Details of the

implementation of a node for a 2-3 tree are consistent with the pseudocode given in this

assignment.

Please complete this, by supplying code for the methods whose content currently consists of the phrase

// FOR YOU TO COMPLETE

(or something very similar, if there are typographical errors causing similar comments to

be used too).

Do not change the code that has already been supplied: This may cause tests used

to evaluate your submission to fail. A complete solution has been prepared and tested,

so it has already been confirmed that no such changes are needed: If you feel that a

change is needed then this is probably evidence that you do not understand the problem

to be solved, or that you need to change the code you are including.

However, you will need to introduce additional methods. It might be helpful to add one or

two more methods to reduce repetitiveness in the code that you would otherwise need to

provide to complete an implementation of an insert algorithm.

You should also include

methods solving the additional computational problems you discussed when answering

the previous question, in order to complete the implementation of a delete algorithm.

Additional code for testing — with information about how to use it — will be made available online very soon (and, in plenty of time for you to use it).

14