Attached is the midterm project; now I need the final project done.
- No more than 2 to 3 pages
- Present the findings using the skillset acquired (topics covered) in class.
- Also include the dataset with the analysis (in Excel or any statistical package). You should provide details of the analysis in an Appendix.
How the 40 points are awarded: 10 for each module you choose to apply. (For example, if you choose regression to test an association or predict an outcome, you get 10 points for that analysis.)
Data Analytics is a subject that can be best appreciated only when applied to a dataset you are familiar with. The aim of this project is to achieve that. Do not view this project as a hurdle in the course, but rather as a bridge connecting the topics you learnt to your work or subject domain. There are five main modules in this course:
- Module 1: Normal Distribution (percentiles, distribution of means, and chance of occurrence if we assume a normal distribution)
- Module 2: Confidence Interval Estimation (including sample size determination)
- Module 3: Inferences from data (hypothesis testing, i.e., checking whether a claim made about the data holds. In this module, we dealt with only one sample)
- Module 4: More Inferences from data (multiple samples)
- Module 5: Regression analysis (both simple and multiple, apart from basic ANOVA)
You can obtain data in any of the following ways:
- Bring your own data from work (you can remove any private or confidential information; for example, if you bring sales or cost data for an item, product, or service, its name can be masked)
- Use data from your previous work or a company you have access to (again, you can remove any private or confidential information)
- Use data from the public domain. In today's world, there is no dearth of structured data. Here are some places where you can get data:
- Any data source you have access to, such as the Hawkes Learning Resources
- Datasets (1) from Hawkes
- Datasets (2) from Hawkes (look at the additional datasets, not the chapter datasets)
- U.S. Bureau of Labor Statistics
- U.S. Government's open data
- Centers for Medicare & Medicaid Services
- Kaggle datasets
- WHO Data repository
- World Bank Data
- Google Public Data Explorer
- Amazing visualizations and graphics
- But remember, we need the data to do the analysis. If you look at the bottom of any figure, Google provides the source name, and you can retrieve the data from there.
- Any sports data (from the appropriate website; getting data in a structured format for several years might be a challenge, but you can do it in a few minutes to an hour)
- For example, cricket data can be obtained from espncricinfo.