Data Mining

Classification Tree Analysis

This assignment gives you hands-on experience using R to build classification tree models on a real-world data set. Please refer to Chapter 9 of the reference textbook (through the link at the bottom under “Lessons”) for details on how to generate classification tree models and evaluate model performance. Then open A Complete Guide On Decision Tree Algorithm (or the attached Week 5 A Complete Guide On Decision Tree Algorithm.docx), go over the mushrooms.csv example, and use the same R code to reproduce the results step by step. Study how the model is explained and how the results are evaluated:

Step 1: Install and load libraries

Step 2: Import the data set

Step 3: Data Cleaning

Step 4: Data Exploration and Analysis

Step 5: Data Splicing (splitting the data into training and testing sets)

Step 6: Building a model

Step 7: Visualizing the tree

Step 8: Testing the model

Step 9: Calculating accuracy
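
As a rough orientation, the steps above could be sketched in R as shown here. This is not the code from the guide: it uses only the rpart package and base R, and the helper name run_tree_analysis, the target column name "class", the 80/20 split, and the seed are all assumptions made for illustration. The toy data frame at the end exists only so the sketch runs without the course files.

```r
# Step 1: load the library (rpart ships with standard R distributions)
library(rpart)

# Hypothetical helper, not taken from the guide. The column name "class",
# the 80/20 split, and the seed are assumptions.
run_tree_analysis <- function(df, target = "class", train_frac = 0.8, seed = 42) {
  # Step 3: drop incomplete rows, convert text columns to factors,
  # and remove columns with a single value (they cannot inform a split)
  df <- df[complete.cases(df), , drop = FALSE]
  is_chr <- vapply(df, is.character, logical(1))
  df[is_chr] <- lapply(df[is_chr], factor)
  keep <- vapply(df, function(col) length(unique(col)) > 1, logical(1))
  keep[target] <- TRUE
  df <- df[, keep, drop = FALSE]
  df[[target]] <- factor(df[[target]])

  # Step 5: random split into training and testing sets
  set.seed(seed)
  train_idx <- sample(nrow(df), size = floor(train_frac * nrow(df)))
  train <- df[train_idx, , drop = FALSE]
  test <- df[-train_idx, , drop = FALSE]

  # Step 6: fit a classification tree on all remaining predictors
  tree_formula <- reformulate(".", response = target)
  model <- rpart(tree_formula, data = train, method = "class")

  # Step 8: predict class labels for the held-out rows
  predicted <- predict(model, newdata = test, type = "class")

  # Step 9: confusion matrix and accuracy (correct predictions / all predictions)
  confusion <- table(actual = test[[target]], predicted = predicted)
  accuracy <- sum(diag(confusion)) / sum(confusion)

  list(model = model, confusion = confusion, accuracy = accuracy)
}

# Toy stand-in data so the sketch runs on its own (not the mushroom data)
toy <- data.frame(
  class = rep(c("e", "p"), each = 50),
  odor = rep(c("n", "f"), each = 50),
  cap = rep(c("x", "b"), times = 50),
  stringsAsFactors = FALSE
)
result <- run_tree_analysis(toy)
print(result$confusion)
print(result$accuracy)

# With the course data (file location and column name are assumptions):
# mushrooms <- read.csv("mushrooms2.csv", stringsAsFactors = TRUE)  # Step 2
# str(mushrooms); table(mushrooms$class)                            # Step 4
# result <- run_tree_analysis(mushrooms, target = "class")
# plot(result$model, margin = 0.1); text(result$model, use.n = TRUE) # Step 7
```

The confusion matrix compares actual and predicted classes on the testing set only, so the reported accuracy reflects rows the tree did not see during fitting. Your own write-up should still follow the guide's code and explain each output in your own words.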

Now open the file mushrooms2.csv (slightly different from the sample data set) and repeat the analysis from the website, following the steps above to conduct a classification tree analysis. Copy and paste screen images of your work in R into a Word document for submission. Be sure to provide a narrative of your answers (i.e., do not just copy and paste your results without explaining what you did or what you found). Please include an Introduction, R code with outputs, and figures with explanations, along with cover and reference pages. A good conclusion to wrap up the assignment is also expected. You also need to follow APA format.

Reference

https://www.edureka.co/blog/decision-tree-algorithm/