STT 422 EXAM 1

Due Monday Apr 08, 11:59 pm, 25% of Final Grade = 80 points

There are 8 questions in total.

Each question has subparts.

1 Problem 1

Consider the data set bank wage.csv. Using R or otherwise, answer the following questions:

1. (2 points) Plot wages versus LOS and circle the outlier with the highest value of wages. (Drop this observation for the remaining parts.)
2. (1 point) Find the least squares regression line for the regression of wages on LOS.
3. (4 points) Give the significance test for the slope of LOS. (Clearly state the hypotheses, test statistic, p-value, and conclusion.)
4. (3 points) Give a 95% prediction interval for wages at LOS = 55.

2 Problem 2

Consider the data set student gpa.csv. Consider a regression model for predicting GPA using IQ, gender, and self-concept. Using R or otherwise, answer the following questions:

1. (4 points) Give the F-statistic for testing

H0: β_IQ = β_gender = β_selfconcept = 0

Also provide the degrees of freedom for this F-statistic.
2. (4 points) Run correlation tests to check whether GPA is correlated with

(a) IQ

(b) gender

3 Problem 3

Consider the data set biomarkers.csv. Consider a regression model for predicting VO+ using OC, TRAP, and VO-. Using R or otherwise, answer the following questions:

1. (2 points) Give the statistical model for this regression, including all assumptions.
2. (2 points) Give the multiple regression equation for predicting VO+ from OC, TRAP, and VO-.
3. (4 points) Make a table with t-statistics and p-values for all the explanatory variables. Which is the least significant variable among OC, TRAP, and VO-?
4. (4 points) Consider the full model and the model without the least significant variable. Give the ANOVA table comparing these two models.

4 Problem 4

Do people from different cultures experience emotions differently? Here is a summary of the data:

Are the means the same across the different cultures?

1. (2 points) Should you use a pooled standard deviation? If yes, what is its value?
2. (4 points) Construct an ANOVA table for this problem.
3. (2 points) State the hypothesis test for this problem.
4. (2 points) Provide the p-value for the hypothesis test in part 3.

5 Problem 5

Consider the data set price promotion.csv. Using R or otherwise, answer the following questions.

1. (2 points) Construct a contrast that compares the average of promotions 1 and 7 with the average of promotions 3 and 5.
2. (3 points) Give a 95% confidence interval for the contrast in part 1.
3. (4 points) Use the Bonferroni or another multiple-comparisons procedure to compare the different price promotion groups.

6 Problem 6

Consider the data set intervene program.csv. Using R or otherwise, answer the following questions.

1. (3 points) Plot the means. Do you think there is an interaction between Group and Time?
2. (2 points) Give an estimate for the main effect of group 1.
3. (4 points) Construct the two-way ANOVA model for this problem with group and time as the factors.
4. (2 points) Is there a significant main effect of time?

7 Problem 7

Consider the data set plants1.csv. Using R or otherwise, answer the following questions.

1. (4 points) Find the means for each species-by-water combination. Plot these means versus water for the four species, connecting the means for each species by lines.
2. (2 points) Give the interaction effect between species level 1 and water level 6.
3. (4 points) Give the two-way analysis of variance with species and water as factors.

8 Problem 8

A study of 170 franchise firms classified each firm as to whether it was successful or not. The data are attached.

1. (2 points) What proportion of exclusive territory firms are successful?
2. (2 points) Find the log odds corresponding to the proportion in part 1.
3. (6 points) Let x = 1 for exclusive territories and x = 0 for other territories. Using R or otherwise:

(a) (3 points) Fit the logistic regression model.

(b) (3 points) Give the odds ratio for exclusive territory versus no exclusive territory.
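
As a reminder of the definitions used in Problem 8, the following is a minimal Python sketch (one possible "otherwise" route in place of R). The counts are hypothetical placeholders chosen for easy arithmetic, not the attached franchise data, and the names `counts`, `proportion`, and `log_odds` are illustrative only. It relies on the standard fact that, with a single 0/1 predictor, the fitted logistic regression has intercept equal to the log odds in the x = 0 group and slope equal to the log of the odds ratio.

```python
import math

# HYPOTHETICAL counts (assumption, not the exam's data): successes and
# failures for firms with (x = 1) and without (x = 0) an exclusive territory.
counts = {
    1: {"success": 30, "failure": 10},
    0: {"success": 20, "failure": 20},
}


def proportion(group):
    """Proportion of successful firms in the given group."""
    c = counts[group]
    return c["success"] / (c["success"] + c["failure"])


def log_odds(p):
    """Log odds, log(p / (1 - p)), of a proportion p strictly between 0 and 1."""
    return math.log(p / (1 - p))


p1 = proportion(1)  # proportion successful among exclusive-territory firms
p0 = proportion(0)  # proportion successful among the other firms

# With one binary predictor, the logistic regression fit is determined by
# the two group proportions:
b0 = log_odds(p0)        # intercept: log odds when x = 0
b1 = log_odds(p1) - b0   # slope: log of the odds ratio
odds_ratio = math.exp(b1)

print(f"p1 = {p1:.3f}, log odds = {log_odds(p1):.3f}")
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, odds ratio = {odds_ratio:.3f}")
```

Replacing the placeholder counts with the observed two-way table gives the same coefficients that a logistic regression routine would report for this model, up to numerical tolerance.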