ECS140A Project 1

ECS140A FQ20, October 7, 2020
This content is protected and may not be shared, uploaded, or distributed.

For this project you will write several Java classes that implement a programming language scanner and a CSV parser. The classes developed in this project will be used in subsequent projects. A Token class has been provided for you, as has the interface the Scanner relies on for input. The Scanner class must take in a PeekableCharacterStream and a List of keywords and must produce a stream of Tokens that are queried through the peekNextToken and getNextToken methods. The rules for the tokens are defined below. You will also develop the CSVParser class, which parses CSV files into Maps, one for each row.

Your first task is to develop a class that implements the PeekableCharacterStream interface for a FileInputStream.

    public interface PeekableCharacterStream{
        // Returns true if more characters are available, false otherwise
        public boolean moreAvailable();

        // Returns the next character that would be returned without consuming
        // the character. If no more characters are available -1 is returned.
        public int peekNextChar();

        // Returns the character ahead in the stream without consuming the
        // character. peekAheadChar(0) returns the same character as
        // peekNextChar(). If no more characters are available at that position
        // -1 is returned.
        public int peekAheadChar(int ahead);

        // Returns the next character and consumes it. If no more characters are
        // available -1 is returned.
        public int getNextChar();

        // Closes the stream.
        public void close();
    }

Your second task is to develop the Scanner class that will rely on the PeekableCharacterStream interface and the Token class. The Scanner must have the following minimal interface.

    public class Scanner{
        // Constructor that takes in a stream and a list of keywords.
        public Scanner(PeekableCharacterStream stream, List keywordlist);

        // Returns the next token without consuming it. If no more tokens are
        // available a None token is returned.
        public Token peekNextToken();

        // Returns the next token and consumes it. If no more tokens are
        // available a None token is returned.
        public Token getNextToken();
    }

Token Rules:

    Identifier       := ( _ | Alpha ) { ( _ | Digit | Alpha ) }
    Operator         := ( | , | ) | { | } | = | == | < | > | <= | >= | != | + | - | * | / | ;
    IntConstant      := [ - ] Digit { Digit }
    FloatConstant    := [ - ] Digit { Digit } [ . Digit { Digit } ]
    StringConstant   := " { ( CharacterLiteral | EscapedCharacter ) } "
    Digit            := 0 - 9
    Alpha            := A - Z | a - z
    WhiteSpace       := Space | Tab | CarriageReturn | NewLine
    CharacterLiteral := Space - ! | # - [ | ] - ~
    EscapedCharacter := \b | \n | \r | \t | \\ | \' | \"

Additional Tokenizing Rules:
• Keywords are identifiers that are in the List provided to the Scanner.
• Whitespace must be skipped during tokenizing.
• A negative sign immediately preceding an integer or float constant must be tokenized as an operator if the previous token was a constant or identifier. For example, "A -5" must be tokenized as Identifier "A", Operator "-", and IntConstant "5", not as Identifier "A" and IntConstant "-5".
• If an underscore or Alpha character immediately follows a constant, the token is considered Invalid. All Alpha, Digit, and underscore characters that follow will be part of the Invalid token.
• If an invalid character is in a string constant, characters are consumed until the next non-escaped " or the end of the stream is reached. The token type is considered Invalid.
• Any invalid character beginning a token will be considered an Invalid token by itself. For example, "@#4" must be tokenized as Invalid "@", Invalid "#", and IntConstant "4".

Implementation Requirements:
• You may use java.io.FileInputStream.
• You may use java.util.Set, java.util.HashSet, java.util.Arrays and similar containers.
• You may not use java.util.regex or similar packages.
• You may not use java.util.StringTokenizer or similar library classes.

Your final task is to develop the CSVParser class that will rely on the PeekableCharacterStream interface. The CSVParser must have the following minimal interface.

    public class CSVParser{
        // Constructor that takes in a stream.
        public CSVParser(PeekableCharacterStream stream);

        // Returns the next row without consuming it. If no more rows are
        // available null is returned.
        public Map<String,String> peekNextRow();

        // Returns the next row and consumes it. If no more rows are
        // available null is returned.
        public Map<String,String> getNextRow();
    }

CSV Format Rules:
• CSV files must have a header row, and no column in the header may be repeated or empty.
• Each row is terminated by a newline character.
• Each column is terminated by a comma character.
• Any column that contains a whitespace character (space, tab, carriage return, or newline) must be double quoted ("). The escape sequence for a double quote in a double-quoted column is two double quotes in a row.
• Any empty or missing column will have a value of null in the returned Map.
• Valid CSV files may not have more columns in a data row than in the header row, but may have fewer.

Your Scanner and CSVParser classes must each have a main function that takes in a filename as an argument and outputs the contents similar to the examples in /home/cjnitta/ecs140a/proj1 on the CSIF. You can run the provided solutions by running the shell scripts Scanner.sh or CSVParser.sh and providing a filename to open. Your code will be tested on the CSIF and is expected to compile and run on the CSIF.

You must submit the source file(s), a Makefile, and a README.txt file in a tgz archive. Do a make clean prior to archiving your files so the archive will be smaller.
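As an illustration of the CSV quoting rules listed above, here is a minimal sketch that splits a single row into column values. It is not the provided solution: the class name CsvRowSketch and the helper splitRow are hypothetical, and for brevity it works on a String that already holds one row instead of reading from a PeekableCharacterStream. It also assumes that commas act as separators between columns and that a quoted but empty column counts as empty (null).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, for illustration only.
class CsvRowSketch {
    // Splits one row (without its terminating newline) into column values.
    // Empty columns are returned as null, following the format rules.
    static List<String> splitRow(String row) {
        List<String> columns = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < row.length(); i++) {
            char c = row.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < row.length() && row.charAt(i + 1) == '"') {
                        current.append('"'); // "" inside quotes is one literal quote
                        i++;                 // skip the second quote of the pair
                    } else {
                        inQuotes = false;    // closing quote of the column
                    }
                } else {
                    current.append(c);       // whitespace and commas are kept as-is
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote of the column
            } else if (c == ',') {
                columns.add(current.length() == 0 ? null : current.toString());
                current.setLength(0);        // start the next column
            } else {
                current.append(c);
            }
        }
        // Assumption: the text after the last comma is also a column.
        columns.add(current.length() == 0 ? null : current.toString());
        return columns;
    }
}
```

For example, the row a,"b c",,"say ""hi""" would split into four values: a, b c, null, and say "hi". A full CSVParser would pair these values with the header names to build the Map for each row.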
You will want to be in the parent directory of the project directory when creating the tgz archive. You can tar gzip a directory with the command:

    tar -zcvf archive-name.tgz directory-name

You should avoid using existing source code that is currently available on the Internet as a primer. You must specify in your README file any sources of code that you have viewed to help you complete this project. You must also provide the URL of any code sources in the comments of your source code. All class projects will be submitted to MOSS to determine if students have excessively collaborated. Excessive collaboration, or failure to list external code sources, will result in the matter being referred to Student Judicial Affairs.
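To make the PeekableCharacterStream contract from the first task concrete, the following is a minimal sketch of one possible approach: characters that have been read but not yet consumed are kept in a small lookahead buffer. The class name LookaheadStream is hypothetical. The sketch wraps any java.io.InputStream (a FileInputStream is one), treats each byte as one character, and treats a read error as end of stream; all three choices are assumptions made for illustration, not requirements from the assignment.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

// Interface as given in the assignment text (package-private here so the
// sketch fits in a single file).
interface PeekableCharacterStream {
    public boolean moreAvailable();
    public int peekNextChar();
    public int peekAheadChar(int ahead);
    public int getNextChar();
    public void close();
}

// Hypothetical class name; a sketch, not the reference solution.
class LookaheadStream implements PeekableCharacterStream {
    private final InputStream in;
    // Characters already read from the stream but not yet consumed.
    private final List<Integer> buffer = new ArrayList<>();
    private boolean eof = false;

    LookaheadStream(InputStream in) {
        this.in = in;
    }

    // Reads until the buffer holds at least `count` characters or input ends.
    private void fill(int count) {
        while (!eof && buffer.size() < count) {
            try {
                int c = in.read();
                if (c < 0) {
                    eof = true;
                } else {
                    buffer.add(c);
                }
            } catch (IOException e) {
                eof = true; // assumption: a read error ends the stream
            }
        }
    }

    public boolean moreAvailable() {
        return peekNextChar() != -1;
    }

    public int peekNextChar() {
        return peekAheadChar(0);
    }

    public int peekAheadChar(int ahead) {
        if (ahead < 0) {
            return -1; // assumption: negative offsets are treated as unavailable
        }
        fill(ahead + 1);
        return ahead < buffer.size() ? buffer.get(ahead) : -1;
    }

    public int getNextChar() {
        fill(1);
        return buffer.isEmpty() ? -1 : buffer.remove(0);
    }

    public void close() {
        try {
            in.close();
        } catch (IOException e) {
            // Nothing useful to do in this sketch.
        }
    }
}
```

Constructing it as new LookaheadStream(new FileInputStream(filename)) would match the first task's setting. Because peekAheadChar only fills the buffer on demand, the sketch never reads further into the file than the caller has asked to look.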