Please use the data, StatCrunch (or Excel), and your knowledge of statistics to answer the questions below. *Download the provided CDC data (located under this assignment on the class site) into StatCrunch or Excel.
1a. Calculate the correlation coefficient (the r value) between each of the independent (quantitative) variables and the variable called diabetes.
Fill in this TABLE that gives the r value (correlation) between each variable in the dataset and diabetes:
r values TABLE | Obesity Rates | Physical Activity | Poverty Rate | Smoking
DIABETES | r value here | r value here | r value here | r value here
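If you want to double-check the r values you get from StatCrunch or Excel (in Excel, the CORREL function gives the same number), the calculation can be sketched in a few lines of Python. This is only an optional cross-check, not a required part of the assignment. The variable names and all of the numbers below are made up for illustration; they are NOT the CDC data, so swap in the real columns before reading anything into the output.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient (r) between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative numbers, NOT the CDC data.
# The dictionary keys are hypothetical column names.
data = {
    "diabetes": [7.0, 8.5, 9.1, 10.2, 11.0, 12.4],
    "obesity": [24.0, 27.5, 28.0, 31.0, 33.5, 35.0],
    "physical_activity": [58.0, 52.0, 50.5, 47.0, 44.0, 40.0],
    "poverty": [10.0, 12.5, 11.0, 15.0, 17.5, 19.0],
    "smoking": [14.0, 17.0, 16.5, 20.0, 21.5, 24.0],
}

# One r value per independent variable, each paired with diabetes.
r_values = {
    name: pearson_r(values, data["diabetes"])
    for name, values in data.items()
    if name != "diabetes"
}

for name, r in r_values.items():
    print(f"{name:>18}: r = {r:+.3f}")
```

Each printed value corresponds to one cell of the r values table.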
1b. What do these values tell us about the relationship between each of the independent variables and diabetes?
Hint: In other words, look at each r value you have calculated. Each r value will be either positive or negative (or 0). Each r value will be strong, medium, or weak. Describe each r value in terms of the relationship it represents.
FILL IN THIS TABLE TO ANSWER #1b:
r values TABLE | Obesity Rates | Physical Activity | Poverty Rate | Smoking
DIABETES | Describe the r value: positive or negative? Strong, medium, or weak? What does it tell about the relationship? | Describe the r value: positive or negative? Strong, medium, or weak? What does it tell about the relationship? | Describe the r value: positive or negative? Strong, medium, or weak? What does it tell about the relationship? | Describe the r value: positive or negative? Strong, medium, or weak? What does it tell about the relationship?
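The description asked for here (sign plus strength) can be written as a small rule. The sketch below assumes a common rule of thumb: an absolute r of 0.7 or more is "strong", 0.3 up to 0.7 is "medium", and anything smaller is "weak". Those cutoffs are an assumption, not something this assignment specifies, so use whatever cutoffs your textbook or instructor gives. The function name `describe_r` is hypothetical.

```python
def describe_r(r, strong=0.7, medium=0.3):
    """Return (direction, strength) for a correlation coefficient r.

    The default cutoffs are an assumed rule of thumb, not an official standard.
    """
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "zero"

    size = abs(r)
    if size >= strong:
        strength = "strong"
    elif size >= medium:
        strength = "medium"
    else:
        strength = "weak"
    return direction, strength

# Example r values chosen only to show each branch; they are not results.
for r in (0.85, -0.45, 0.10):
    direction, strength = describe_r(r)
    print(f"r = {r:+.2f}: {strength} {direction} relationship")
```

A positive r means the two variables tend to rise together; a negative r means one tends to fall as the other rises.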
2a. Run a regression using diabetes as the dependent variable (y), and smoking as the independent variable (x).
HINT: The variable called Diabetes is your dependent (y) variable, and the variable called smoking is your independent (x) variable. If you create a scatterplot (with the x variable on the horizontal axis and the y variable on the vertical axis), you can see the relationship.
PASTE THE SCATTERPLOT WITH THE TRENDLINE and EQUATION HERE
Make sure you have the regression equation included.
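The trendline equation that StatCrunch or Excel prints on the scatterplot is the least-squares line. As an optional check, here is a minimal Python sketch of how its slope and intercept are computed. The function name `fit_line` is hypothetical, and the smoking and diabetes numbers are made up for illustration; they are NOT the CDC data.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up illustrative numbers, NOT the CDC data.
smoking = [14.0, 17.0, 16.5, 20.0, 21.5, 24.0]   # x, independent variable
diabetes = [7.0, 8.5, 9.1, 10.2, 11.0, 12.4]     # y, dependent variable

slope, intercept = fit_line(smoking, diabetes)
print(f"diabetes = {intercept:.3f} + {slope:.3f} * smoking")
```

The printed line has the same form as the equation shown on the trendline, with diabetes as y and smoking as x.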
3a. Is there a statistically significant relationship between poverty rates and diabetes? Explain.
HINTS: This question is not related to the question above it, which asks you to run a regression for Diabetes and Smoking.
For this question, you are looking at Poverty and Diabetes. You are asked to determine if the relationship between Poverty and Diabetes is significant.
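One way to see where the p-value in the regression output comes from: with one x variable, the test of the slope is equivalent to a t test on r, using t = r * sqrt((n - 2) / (1 - r^2)) with n - 2 degrees of freedom. The Python sketch below computes that t statistic and a two-sided p-value by numerically integrating the t distribution (in Excel, T.DIST.2T returns the same two-sided p-value). The function names, the 0.05 cutoff, and the numbers are assumptions for illustration; the data are made up and are NOT the CDC data.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log(1 + t * t / df))

def two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value, using Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3          # probability between 0 and |t|
    return max(0.0, 1.0 - 2.0 * area)

def correlation_test(x, y):
    """Return (r, t statistic, two-sided p-value). Assumes |r| < 1."""
    df = len(x) - 2
    r = pearson_r(x, y)
    t_stat = r * math.sqrt(df / (1 - r * r))
    return r, t_stat, two_sided_p(t_stat, df)

# Made-up illustrative numbers, NOT the CDC data.
poverty = [10.0, 12.5, 11.0, 15.0, 17.5, 19.0]
diabetes = [7.0, 8.5, 9.1, 10.2, 11.0, 12.4]

r, t_stat, p_value = correlation_test(poverty, diabetes)
alpha = 0.05  # assumed significance level; use the one your course specifies
print(f"r = {r:.3f}, t = {t_stat:.3f}, p = {p_value:.4f}")
print("significant" if p_value < alpha else "not significant", f"at alpha = {alpha}")
```

If the p-value is below the chosen significance level, the relationship is called statistically significant; otherwise the data do not give enough evidence of a linear relationship.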
Here are two great YouTube videos on regression and on the p-values for the correlation in Excel. Even if you use StatCrunch, the concepts are the same.
Write out the regression equation calculated using the data. You can use Excel to get this.
Interpret the slope coefficient (the value in front of the x).
HINTS: You can do this in Excel (or StatCrunch). As a note, I always use Excel because it is more common, better for a resume, and publicly available (StatCrunch is not publicly available).
Linear Regression Equation in Excel
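To illustrate what "interpret the slope" means, here is a small Python sketch that turns a slope into a plain-language sentence. The function name `interpret_slope` and the coefficient values are hypothetical, chosen only for illustration; they are not computed from the CDC data.

```python
def interpret_slope(slope, x_name, y_name):
    """Plain-language reading of the slope in y = intercept + slope * x."""
    if slope == 0:
        return f"The predicted {y_name} does not change as {x_name} changes."
    direction = "increases" if slope > 0 else "decreases"
    return (f"For each 1-unit increase in {x_name}, the predicted {y_name} "
            f"{direction} by {abs(slope):.3f} units.")

# Hypothetical coefficients for illustration only, NOT from the CDC data.
intercept, slope = 4.2, 0.35

print(f"diabetes = {intercept} + {slope} * poverty")
print(interpret_slope(slope, "poverty rate", "diabetes rate"))
```

The slope describes the change in the predicted y value per one-unit change in x; the intercept is the predicted y value when x is 0.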