Description
GOAL
Trees are invaluable data structures. In this assignment, you will write the code that will create a binary search tree
(BST) of unique words from a given input text file, and also keep track of the frequency of the words in the text. For
example:
Example input text file “example01.txt” Resulting BST
This is a test. This is a test.
This is a test. This is a test.
This is a test. This is a test.
This is a test. This is a test.
This is a test.
Example Input text file “example02.txt”
The first second was alright, but the second second was tough.
Resulting BST
CPSC 319, Summer 2020 Asgmt #2 – Binary Search Trees (BST) Page 2 of 7
2
The nodes from the BST store the word itself and its frequency in the input text file. Using this structure for a tree, it
is possible to find if a word occurs in the text in O(log2n) (i.e., if the tree is of minimum height) as well as the number
of times the word occurs in the text.
Specifications for reading and processing the input text file:
• All text will be considered to be lower-case, even if the word in the file has an upper-case character as the first
character, or even if it is all upper-case, or a mixture of upper- and lower-case.
• Words are delimited with spaces.
• All punctuation will be ignored.
• Hyphenated words will be considered as two words (or state another way, the hyphen is a delimiter).
INSTRUCTIONS
Write the Java program that will do the following:
1. Request the user for the name
of a text file.
This should be an ASCII text file
(i.e., like the format from
assignment #1)
Example (display screen) for “example01.txt” file:
2. Use the words from 2 input files (available at the assignment’s D2L page) to create the BST specified in the
previous page. You will have to do the conversions to lower-case and take care of all of the other characters
that do not make up the word as defined above.
Two Input files: text1 (i.e., merged example01 and example02 from this assignment specs), text2 (i.e., an
excerpt from “The Rime of the Ancient Mariner in Seven Parts” by Samuel Taylor Coleridge)
3. When the tree has been created, it should supply the following information as a display output:
(3.1) The total number of words in the file.
Use either IN-ORDER, PRE-ORDER, or POST-ORDER traversal, counting the number of times you visit each
node in the BST.
(3.2) The number of unique words in the file.
Use either IN-ORDER, PRE-ORDER, or POST-ORDER traversal, counting the number of nodes visited
containing the word with frequency = 1
(3.3) The word which occurs most often and the number of times that it occurs.
Use either IN-ORDER, PRE-ORDER, or POST-ORDER traversal, looking for the word(s) with the largest
frequency.
CPSC 319, Summer 2020 Asgmt #2 – Binary Search Trees (BST) Page 3 of 7
3
Example (display screen for question #3) for example01 using POST-ORDER traversal (i.e., left, right, node):
Example (display screen for question #3) for example02 using POST-ORDER traversal (i.e., left, right, node):
4. The user should also be able to
request information about any
word that is input when requested.
The result should be whether
the word exists, and the
number of times it appears (i.e.,
its frequency). Use either INORDER, PRE-ORDER, or POSTORDER traversal.
Example (display screen) for example01:
CPSC 319, Summer 2020 Asgmt #2 – Binary Search Trees (BST) Page 4 of 7
4
5. BONUS – The user should also be able to request to display the entire tree using any of the 3 traversal methods.
The result should be each word in the tree separated by a single space.
Example (display screen for question #5) for example01:
Example (display screen for question #5) for example02:
CPSC 319, Summer 2020 Asgmt #2 – Binary Search Trees (BST) Page 5 of 7
5
So, here is the sequence of what should be displayed by running your program for example 01
For this next display, you should keep prompting the user using any key to finish it
Finally, the last display
CPSC 319, Summer 2020 Asgmt #2 – Binary Search Trees (BST) Page 6 of 7
6
HAND-IN
1. Your code + README (w/ instructions on how to run and use/interact with your program)
2. Include all of the above items in a zipped folder titled (asgmt2-your LAST NAME-ID).zip and submit it to D2L
dropbox.
3. LATE ASSIGNMENTS WILL NOT BE ACCEPTED
MARKING
Source code that does not compile or produces run-time errors will receive a grade of 0%.
Item Points
1 Request the user for the name of a text file
(item #2, page 2 of this document) 1 point
2
When the tree has been created, execute, and
display the information
(items 3.1, 3.2, 3.3, page 2 of this document)
6 points
1 point for each of the 3 items (3.1, 3.2, 3.3)
for each of the 2 input files (text1, text2) – i.e., 3 x 2 = 6 points
2
The user should also be able to request
information about any word that is input when
requested
(item #4, page 3 of this document)
4 points
2 points (1 for the user prompt interface + 1 point for the
correct answer from your program) for each of the 2 input
files (text1, text2) – i.e., 2 x 2 = 4 points
3 Output for each of the 3 traversal modes
(page 4 of this document)
6 BONUS points
1 point for the correct output of each of the 3 traversal modes
for each of the 2 input files (text1, text2) – i.e., 3 x 2 = 6 bonus
points
TOTAL 11 points (10% of the assignment grade) +
6 BONUS points
Note: I will distribute the bonus points as follows:
1 bonus point to quiz #1,
1 bonus point to quiz #2 and
4 bonus points to the final exam
CPSC 319, Summer 2020 Asgmt #2 – Binary Search Trees (BST) Page 7 of 7
7
INDIVIDUAL WORK
• The assignment must be done individually, so you must write up the solutions on your own in your own words.
• Everything that you hand in must be your original work, except for the code copied from the textbook, lecture
material (i.e., slides, notes), web, or that supplied by your TA. When someone else’s code is used like this, you
must acknowledge the source explicitly, citing the sources in a scientific way (i.e., including author(s), title, page
numbers, URLs, etc.).
• Copying another student’s work constitutes academic misconduct, a severe offence that will be dealt with
rigorously in all cases. Please read the sections of the University Calendar under the heading “Student
Misconduct.”
• If you are in doubt whether a specific form of aid is allowed, ask your instructor!
• Contact your TA if you have problems getting your code to work.
• Note that your code may be checked thoroughly for plagiarism by computer.
END OF THE ASSIGNMENT