You will perform the calculations for three problems using data in the attached spreadsheet containing Data Set 6. Please perform the calculations in the spreadsheet.

Upload all of your answers, including your analysis and interpretation, in a Word document. Please also upload your Excel spreadsheet to show your calculations.

16. For this problem, you will construct and interpret a scatter plot and fit a linear regression to describe the relationship between two continuous variables. The data on tab 16. (Under 5 mortality) of DS6 represent mortality among children under 5 years of age (U5mort) and a measure of literacy among women (Femlit) for a number of countries. Assuming that mortality and literacy are measured using a consistent measuring system, do the following:

- Create a scatter plot displaying the relation between female literacy and childhood mortality. Be sure to plot female literacy on the horizontal (i.e., x) axis. Describe the scatter plot and include a copy in your written report. (2 points)
- Compute the correlation coefficient for the relationship between female literacy and childhood mortality using the =CORREL() function in Excel. Describe the correlation coefficient and offer an interpretation. (2 points)
- Using the data analysis tools in Excel, create a linear regression model to test the hypothesis that childhood mortality is related to female literacy in these data. (3 points)
- Interpret the results of the output from the regression analysis, and write out the equation that describes these data. (3 points)
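
As a cross-check on the spreadsheet work, the correlation coefficient and the fitted line can be computed by hand. The sketch below is only an illustration, not a substitute for the required Excel output: the helper names `pearson_r` and `fit_line` are hypothetical, and the numbers are made-up placeholders, not the DS6 data. It computes the same quantity that =CORREL() returns, plus the least-squares intercept and slope of U5mort on Femlit.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def fit_line(x, y):
    """Least-squares (intercept, slope) for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Placeholder values for illustration only (NOT the DS6 data).
femlit = [20.0, 40.0, 60.0, 80.0, 100.0]
u5mort = [200.0, 160.0, 110.0, 70.0, 10.0]

r = pearson_r(femlit, u5mort)
intercept, slope = fit_line(femlit, u5mort)
print(f"r = {r:.3f}")
print(f"U5mort = {intercept:.2f} + ({slope:.3f}) * Femlit")
```

A negative slope here would mean that predicted mortality falls as female literacy rises. The sketch does not produce the standard errors or p-values needed for the hypothesis test; those come from the Excel regression output.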

17. In this problem, you will create and use a dummy variable and the regression procedure to test the hypothesis of independence of two variables. The data in tab 17 (Waiting time) in DS6 represent the time spent waiting in the ER prior to being seen for two groups of patients. One group of patients had true emergencies; the second group had conditions requiring urgent medical attention. Using these data, do the following:

- Rearrange the data so they can be analyzed using the regression tool in the data analysis add-in in Excel. That is, place the waiting times into a single column, next to which you will add the dummy variable described in the next step. (2 points)
- Add an independent variable that has the values 1 for Emergency and 0 otherwise, i.e., create a dummy variable for type of visit. (2 points)
- Test the hypothesis that waiting time is independent of the reason for the emergency room admission (i.e., that wait time does not depend on the reason for the visit) using regression in Excel. Report the results of your test and include a copy of the ANOVA table and the table with the regression coefficients from the regression analysis in your report. (6 points)
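
The rearrangement and the dummy-variable regression described above can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical and the waiting times are made-up placeholders, not the DS6 data. With a single 0/1 dummy as the predictor, the intercept equals the mean of the group coded 0 and the slope equals the difference between the two group means, so testing whether the slope is zero is the same as testing whether mean waiting time differs by visit type.

```python
def stack_with_dummy(emergency, urgent):
    """Stack two columns of waits into one column plus a 0/1 dummy (1 = Emergency)."""
    waits = list(emergency) + list(urgent)
    dummy = [1] * len(emergency) + [0] * len(urgent)
    return waits, dummy


def fit_line(x, y):
    """Least-squares (intercept, slope) for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def regression_f(x, y):
    """F statistic of the ANOVA table for a one-predictor regression."""
    intercept, slope = fit_line(x, y)
    mean_y = sum(y) / len(y)
    fitted = [intercept + slope * xi for xi in x]
    ss_regression = sum((f - mean_y) ** 2 for f in fitted)
    ss_residual = sum((yi - f) ** 2 for yi, f in zip(y, fitted))
    # 1 regression degree of freedom, n - 2 residual degrees of freedom
    return (ss_regression / 1) / (ss_residual / (len(y) - 2))


# Placeholder waiting times for illustration only (NOT the DS6 data).
emergency_waits = [10.0, 12.0, 14.0]
urgent_waits = [30.0, 34.0, 38.0]

waits, dummy = stack_with_dummy(emergency_waits, urgent_waits)
intercept, slope = fit_line(dummy, waits)
print(f"Wait = {intercept:.2f} + ({slope:.2f}) * Emergency")
print(f"F = {regression_f(dummy, waits):.2f}")
```

The sketch stops at the F statistic; the Significance F and the coefficient p-values in the Excel output are what the written report should cite.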

18. Use the data in tab 18. (Charges) in DS6 to perform a multiple regression analysis of predictors of hospital charges (Charges in the dataset). Include in your regression model only the following independent variables: Sex1, LOS, and age. Note that Sex1 is a dummy variable indicating the sex of the patient.

- Create a multiple regression equation that describes the relation between these three independent variables and hospital charges and provide a written interpretation of your results. (5 points)
- Describe the statistical significance of each regression coefficient and write out the final regression equation. Include a copy of the relevant portions of the output from the regression procedure in your report. (5 points)
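
To see what the multiple regression procedure estimates, the coefficients can be reproduced by solving the normal equations directly. The sketch below is a minimal illustration, not the required Excel analysis: the function name `ols_coefficients` is hypothetical, and the rows are made-up placeholders, not the DS6 data. It returns the intercept followed by one coefficient per predictor, in the order Sex1, LOS, age.

```python
def ols_coefficients(rows, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    X = [[1.0] + [float(v) for v in row] for row in rows]  # prepend intercept column
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(xi[i] * xi[j] for xi in X) for j in range(k)]
        + [sum(xi[i] * yi for xi, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Placeholder rows (Sex1, LOS, age) and charges for illustration only (NOT the DS6 data).
predictors = [
    (0, 1, 30),
    (1, 2, 40),
    (0, 3, 50),
    (1, 4, 35),
    (0, 5, 60),
    (1, 1, 45),
]
charges = [3300.0, 5900.0, 7500.0, 9850.0, 11600.0, 3950.0]

b0, b_sex, b_los, b_age = ols_coefficients(predictors, charges)
print(f"Charges = {b0:.2f} + {b_sex:.2f}*Sex1 + {b_los:.2f}*LOS + {b_age:.2f}*age")
```

Each coefficient is read as the expected change in charges for a one-unit change in that predictor with the other two held fixed. The sketch does not compute standard errors, t statistics, or p-values, so the significance of each coefficient still has to be taken from the Excel regression output.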