CS 539-001 Natural Language Processing EX 1: Finite State Transducers

1 An FST for Pluralization
You can run pluralize.fst using the command below (don’t worry about all of the crazy flags just yet; they are explained at the top of HW1):
> echo "a p p l e" | carmel -sliOEQk 5 pluralize.fst
Input line 1: a p p l e
(12 states / 11 arcs reduce-> 7/6)
a p p l e s 1
You should see that the FST added an -s to the end of “apple”.
[Figure 1 diagram; node labels: start, s0, s1; arc labels: x:x, :s]
Figure 1: An FST that tries to add an -s to the end of a word. Note that x is a variable that matches any
letter. The self arrow means that we read in any letter and output that same letter back. Check out the
pluralize.fst file to see how this FST is defined in Carmel.
Note: You can also download Carmel yourself from http://www.isi.edu/licensed-sw/carmel/, but it is rather
challenging to compile from source, so we strongly recommend that you use our provided binaries.
This is great, but it will also add an -s to words like “bus” that already have an s at the end:
> echo "b u s" | carmel -sliOEQk 5 pluralize.fst
Input line 1: b u s
(8 states / 7 arcs reduce-> 5/4)
b u s s 1
This FST turns “bus” → “buss”, which is not correct. We have provided another FST file,
pluralize2.fst, that tries to add -es instead of -s to the end of “bus”.
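To see why the FST of Figure 1 produces “buss”, it can help to simulate it outside Carmel. The following is a minimal Python sketch, not the Carmel definition in pluralize.fst: the arc layout (an x:x self-loop on s0 and an :s arc from s0 to a final state s1) is an assumed reading of the figure, and the names `transduce`, `ARCS`, `ANY`, and `EPS` are hypothetical.

```python
# Sketch of the Figure 1 FST. The topology is an ASSUMPTION read off the figure.
EPS = ""        # empty input label: the arc consumes no input symbol
ANY = "<any>"   # stands for the figure's variable x: match any letter, echo it back

# Each arc is (source state, input label, output label, destination state).
ARCS = [
    ("s0", ANY, ANY, "s0"),  # x:x self-loop copies every letter
    ("s0", EPS, "s", "s1"),  # :s emits an "s" without reading any input
]
START, FINALS = "s0", {"s1"}


def transduce(symbols, arcs=ARCS, start=START, finals=FINALS):
    """Return every output string the FST can produce for the input symbols.

    Assumes the FST has no cycle of empty-input arcs (true here); otherwise
    this search would not terminate.
    """
    results = set()
    stack = [(start, 0, ())]  # (current state, input position, output so far)
    while stack:
        state, pos, out = stack.pop()
        # Accept only if all input is consumed and we are in a final state.
        if pos == len(symbols) and state in finals:
            results.add(" ".join(out))
        for src, inp, outp, dst in arcs:
            if src != state:
                continue
            if inp == EPS:
                stack.append((dst, pos, out + (outp,)))
            elif pos < len(symbols) and inp in (ANY, symbols[pos]):
                emitted = symbols[pos] if outp == ANY else outp
                stack.append((dst, pos + 1, out + (emitted,)))
    return sorted(results)


for word in ("a p p l e", "b u s"):
    print(word, "->", transduce(word.split()))
```

Because the only way to reach the final state is the :s arc, and nothing inspects the last letter, every input gets exactly one extra s, including words that already end in s.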
[Figure 2 diagram; node labels: start, s0, s1, s2, s3; arc labels: x:x, s:s, :e, :s, :s]
Figure 2: An FST that tries to add an -es or an -s to the end of a word.
This FST now almost does the right thing with “bus”:
> echo "b u s" | carmel -sliOEQk 5 pluralize2.fst
Input line 1: b u s
(10 states / 10 arcs reduce-> 7/7)
b u s s 1
b u s e s 1
How do you fix it so that it only outputs the correct “buses”? And what happens if you change the input to
“sus”? Or “fuss”? Why? Figure out a way to fix this, so that pluralize2.fst can properly handle words
like “bus”, “bass”, “sass”, or “rise”. Submit your modified pluralize2.fst file. Describe in your report
the modifications made to the FST.
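The two outputs for “bus” come from nondeterminism: the final s can be read either by the x:x loop or by the s:s arc. The Python sketch below simulates the unmodified FST of Figure 2 to make both paths visible. It is not the Carmel file itself, and the topology is an assumption reconstructed from the figure's labels (x:x loop on s0, s:s from s0 to s1, :e from s1 to s2, :s from s2 to s3, and :s from s0 to s3, with s3 final). All Python names are hypothetical.

```python
# Sketch of the ORIGINAL (unfixed) Figure 2 FST. Topology is an ASSUMPTION.
EPS = ""        # empty input label: the arc consumes no input symbol
ANY = "<any>"   # stands for the figure's variable x: match any letter, echo it back

# Each arc is (source state, input label, output label, destination state).
ARCS = [
    ("s0", ANY, ANY, "s0"),  # x:x copies any letter, including "s"
    ("s0", "s", "s", "s1"),  # s:s copies an "s" and moves toward the -es branch
    ("s1", EPS, "e", "s2"),  # :e emits "e"
    ("s2", EPS, "s", "s3"),  # :s emits "s", finishing the -es branch
    ("s0", EPS, "s", "s3"),  # :s emits a plain "s" (the -s branch)
]
START, FINALS = "s0", {"s3"}


def transduce(symbols, arcs=ARCS, start=START, finals=FINALS):
    """Return every output the FST can produce (assumes no empty-input cycles)."""
    results = set()
    stack = [(start, 0, ())]  # (current state, input position, output so far)
    while stack:
        state, pos, out = stack.pop()
        if pos == len(symbols) and state in finals:
            results.add(" ".join(out))
        for src, inp, outp, dst in arcs:
            if src != state:
                continue
            if inp == EPS:
                stack.append((dst, pos, out + (outp,)))
            elif pos < len(symbols) and inp in (ANY, symbols[pos]):
                emitted = symbols[pos] if outp == ANY else outp
                stack.append((dst, pos + 1, out + (emitted,)))
    return sorted(results)


print(transduce("b u s".split()))      # ['b u s e s', 'b u s s']
print(transduce("a p p l e".split()))  # ['a p p l e s']
```

Under this reading, the unwanted “buss” survives because the x:x loop is allowed to consume a word-final s. Passing other inputs to `transduce` is one way to explore the remaining questions before editing the Carmel file.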
2 Optional: Other Pluralizations
Modify your FST again to handle some other pluralization rules. Some examples might include:
• “cherry” → “cherries”
• “leaf” → “leaves”
• “matrix” → “matrices”
• “automaton” → “automata”
Save this as a new file pluralize3.fst and submit that along with pluralize2.fst. Describe in your
report what you tried, the examples that you handled, examples that you cannot handle, and how you
modified your FST to accomplish this.
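Before encoding such rules as FST arcs, it may help to state them as ordered suffix rewrites and check them against a few words. The sketch below is plain Python string rewriting, not an FST and not a Carmel file. The rule table `RULES` and the function `pluralize` are hypothetical, and the rules cover only the four example words listed above.

```python
# Illustrative suffix-rewrite table; an ASSUMPTION covering only the listed examples.
# Each rule is (suffix to strip, replacement); the first matching rule wins.
RULES = [
    ("y", "ies"),    # cherry    -> cherries
    ("f", "ves"),    # leaf      -> leaves
    ("ix", "ices"),  # matrix    -> matrices
    ("on", "a"),     # automaton -> automata
]


def pluralize(word):
    """Apply the first matching suffix rule, else fall back to adding -s."""
    for suffix, replacement in RULES:
        if word.endswith(suffix):
            return word[: len(word) - len(suffix)] + replacement
    return word + "s"


for w in ("cherry", "leaf", "matrix", "automaton", "apple"):
    print(w, "->", pluralize(w))
```

Rules this coarse over-apply: for instance, `pluralize("boy")` returns “boies”. Words like that are candidates for the “examples that you cannot handle” part of the report, or a prompt to condition a rule on more of the preceding context.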