Description
Objectives
- Build and analyze simple regression algorithms based on KNN and linear models
- Identify cases of underfitting and overfitting
- Select parameters that optimize performance (generalization)
Problem #1
For this problem, you will use the Wine Quality database (posted on Blackboard). Train your model on the provided training subset, then use the testing subset to generate predictions and analyze your results.
- Build and train a KNN regression model. Vary the parameter K and analyze the results by identifying cases of overfitting and underfitting. Select the optimal value of K and justify your choice.
- Build and train an OLS regression model. Analyze the results and indicate whether the learned model is a good choice for this data. Justify your conclusions.
- Build and train a Ridge regression model. Vary the constraint parameter α and analyze the results by identifying cases of overfitting and underfitting. Select the optimal value of α and justify your choice.
- Build and train a LASSO regression model. Vary the constraint parameter α and analyze the results by identifying cases of overfitting and underfitting. Select the optimal value of α and justify your choice.
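The parameter sweeps described above could be organized roughly as follows. This is only a minimal sketch, assuming scikit-learn is available; it uses synthetic stand-in data rather than the Wine Quality subsets, and the helper `sweep`, the value grids, and the choice of mean squared error are illustrative assumptions, not requirements of the assignment. A training error far below the testing error hints at overfitting, while high error on both hints at underfitting.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): replace with the provided
# training and testing subsets of the Wine Quality database.
X, y = make_regression(n_samples=400, n_features=11, noise=15.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)


def sweep(make_model, values):
    """Hypothetical helper: fit one model per hyperparameter value and
    return (value, train MSE, test MSE) tuples."""
    rows = []
    for value in values:
        # Standardize features so distances (KNN) and penalties (Ridge/LASSO)
        # are not dominated by features with large scales.
        model = make_pipeline(StandardScaler(), make_model(value))
        model.fit(X_train, y_train)
        train_mse = mean_squared_error(y_train, model.predict(X_train))
        test_mse = mean_squared_error(y_test, model.predict(X_test))
        rows.append((value, train_mse, test_mse))
    return rows


# Illustrative grids (assumption); widen or refine them as needed.
knn_rows = sweep(lambda k: KNeighborsRegressor(n_neighbors=k),
                 [1, 3, 5, 10, 25, 50, 100])
ols_rows = sweep(lambda _: LinearRegression(), [None])
ridge_rows = sweep(lambda a: Ridge(alpha=a),
                   [0.01, 0.1, 1.0, 10.0, 100.0, 1000.0])
lasso_rows = sweep(lambda a: Lasso(alpha=a, max_iter=10000),
                   [0.01, 0.1, 1.0, 10.0, 100.0])

for name, rows in [("KNN", knn_rows), ("OLS", ols_rows),
                   ("Ridge", ridge_rows), ("LASSO", lasso_rows)]:
    for value, train_mse, test_mse in rows:
        print(f"{name:5s} param={value!s:>7} "
              f"train MSE={train_mse:10.2f} test MSE={test_mse:10.2f}")
```

Plotting the training and testing error from each list against K or α on the same axes is one way to make the overfitting and underfitting regions visible in the report.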
What to submit?
- A report that
  - Describes your experiments
  - Summarizes, explains (using concepts covered in lectures), and compares the results (using plots, tables, and figures)
  - Identifies the best method for each dataset
- Do not submit your source code
- Do not submit raw output generated by your code!
- Your report needs to be a single file (MS Word or PDF)
- Your report cannot exceed 10 pages using a 12-point font
- Assign numbers to all your figures/tables/plots and use these numbers to reference them in your discussion