Data Analysis: Correlation, Regression and Hypothesis Testing

   

Added on  2023-06-04

Data Analysis
Task 1
1.
(a) A side-by-side box plot of GPA was constructed for students of both genders.
Figure 1: Side-by-side Box Plot for GPA Scores
The median GPA scores for the two genders were almost the same. In both distributions the median lay
close to the middle of the box, suggesting that both distributions were roughly symmetric, which is
consistent with approximate normality. The interquartile range (IQR) of GPA scores for males was
slightly larger than that for females, indicating that the GPA scores of the middle 50% of males
varied somewhat more than those of the middle 50% of females. The lower 25% of males also obtained
lower GPA scores than the lower 25% of females.
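The quantities read off a box plot (quartiles, median and IQR) can be computed as in the following minimal sketch. The function name and the sample scores are hypothetical and are not taken from the report's data set; the "inclusive" quartile method is also an assumption, since the report does not say how its quartiles were calculated.

```python
from statistics import quantiles


def box_plot_summary(scores):
    """Return the five-number summary and IQR that a box plot displays."""
    # method="inclusive" treats the data as the whole population of interest;
    # other quartile conventions give slightly different values.
    q1, median, q3 = quantiles(scores, n=4, method="inclusive")
    return {
        "min": min(scores),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(scores),
        "iqr": q3 - q1,  # spread of the middle 50% of the scores
    }


# Hypothetical GPA scores, for illustration only (not the report's data).
example_scores = [3.0, 4.0, 4.5, 5.0, 6.0]
summary = box_plot_summary(example_scores)
print(summary)
```

Running the same function on the male and female GPA columns separately would give the two IQRs compared above.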
(b) The difference in average GPA scores between male and female students was tested with an
independent-samples t-test at the 5% level of significance. The null hypothesis assumed that there was
no difference in average GPA scores between male and female students. The average GPA score for
females (M = 4.75, SD = 1.18) was greater than that for males (M = 4.52, SD = 1.40). A one-tailed
t-test of the claim that females score higher found no statistically significant difference between
the average GPA scores (t = 1.294, p = 0.099). Since p > 0.05, the null hypothesis could not be
rejected at the 5% level, and the apparent claim that the average GPA score for females is greater
than that for males was not supported.
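As a rough illustration of how a two-sample t statistic is built from group summaries, the sketch below uses the reported means and standard deviations. The group sizes are not given in this section, so 112 students per group is a purely hypothetical assumption, and the unpooled (Welch) standard error is also an assumption about the method; with equal group sizes it coincides with the pooled form. The result is therefore not expected to reproduce t = 1.294 exactly.

```python
import math


def two_sample_t(mean1, sd1, n1, mean2, sd2, n2):
    """t statistic for the difference of two independent group means."""
    # Unpooled (Welch) standard error of the difference in means.
    standard_error = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (mean1 - mean2) / standard_error


# HYPOTHETICAL group size: the report does not state how many students
# are in each gender group.
HYPOTHETICAL_N = 112

# Means and SDs as reported: females first, males second.
t_stat = two_sample_t(4.75, 1.18, HYPOTHETICAL_N, 4.52, 1.40, HYPOTHETICAL_N)
print(round(t_stat, 3))
```

Under these assumed group sizes the statistic comes out near 1.33, the same order as the reported value; the gap reflects the unknown true group sizes and rounding of the summaries.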
2.
(a) The claim that students with higher socio-economic status (SES) tend to have stronger academic
achievement was tested at the 5% level of significance. The null hypothesis was that the average GPA
scores for the postgraduate (PG_SES) and undergraduate (UG_SES) groups were equal. A one-tailed
(right-tailed) t-test found no statistically significant difference (t = 0.94, p = 0.176) between the
average GPA of PG_SES (M = 5.10) and that of UG_SES (M = 4.89). The null hypothesis therefore could
not be rejected at the 5% level, and the two groups were found to have similar GPA scores.
(b) The claim that GPA scores of students whose parents hold an undergraduate qualification
(M = 4.89) are higher than those of students whose parents hold a secondary or lower qualification
(M = 4.08) was tested at the 5% level of significance. The null hypothesis assumed that there was no
difference in GPA scores between the undergraduate and secondary-level parent groups. The null
hypothesis was rejected (t = -4.291, p < 0.05) at the 5% level, indicating that GPA scores of students
whose parents hold an undergraduate qualification are significantly higher than those of students
whose parents hold a secondary or lower qualification.
Task 2
3. The correlation matrix is provided in Table 1. GPA score was positively associated with all four
independent quantitative variables, and each linear association met the threshold (r >= 0.3) used
here to treat a correlation as meaningful.
Table 1: Correlation Matrix
           GPA      HS_SCI   HS_ENG   HS_MATH  ATAR
GPA        1
HS_SCI     0.344    1
HS_ENG     0.304    0.579    1
HS_MATH    0.444    0.576    0.447    1
ATAR       0.424    0.852    0.764    0.797    1
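The r >= 0.3 screen applied to Table 1 can be sketched as follows. The correlations are copied from the GPA column of the table; the dictionary and function names are hypothetical, chosen only for this illustration.

```python
# Correlation of each predictor with GPA, copied from Table 1.
GPA_CORRELATIONS = {
    "HS_SCI": 0.344,
    "HS_ENG": 0.304,
    "HS_MATH": 0.444,
    "ATAR": 0.424,
}


def predictors_meeting_threshold(correlations, threshold=0.3):
    """Return predictor names with r >= threshold, strongest first."""
    kept = [name for name, r in correlations.items() if r >= threshold]
    return sorted(kept, key=lambda name: correlations[name], reverse=True)


selected = predictors_meeting_threshold(GPA_CORRELATIONS)
print(selected)
```

All four predictors pass the screen, with HS_MATH showing the strongest association with GPA and HS_ENG the weakest.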
4. To identify the significant predictors of GPA scores, an ordinary least squares regression model
was constructed, as shown below.
Table 2: Regression Model with All the Independent Variables
Regression Statistics
Multiple R           0.464
R Square             0.216
Adjusted R Square    0.201
Standard Error       1.190
Observations         224

ANOVA
              df     SS        MS       F        Significance F
Regression    4      85.236    21.309   15.045   0.000
Residual      219    310.191   1.416
Total         223    395.427

             Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%
Intercept    1.031          0.526            1.961    0.051     -0.005      2.068
HS_SCI       0.089          0.104            0.852    0.395     -0.116      0.294
HS_ENG       0.106          0.099            1.069    0.286     -0.089      0.300
HS_MATH      0.307          0.101            3.054    0.003     0.109       0.505
ATAR         -0.007         0.027            -0.265   0.791     -0.060      0.046
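The summary statistics in Table 2 follow from the ANOVA sums of squares, and the short check below recomputes them using the standard textbook formulas. Only the sums of squares, the number of observations and the number of predictors are taken from the table; the variable names are illustrative.

```python
import math

# Inputs copied from Table 2.
n_obs = 224            # Observations
n_predictors = 4       # HS_SCI, HS_ENG, HS_MATH, ATAR
ss_regression = 85.236
ss_residual = 310.191

ss_total = ss_regression + ss_residual
df_regression = n_predictors
df_residual = n_obs - n_predictors - 1
df_total = n_obs - 1

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

r_squared = ss_regression / ss_total
adjusted_r_squared = 1 - ms_residual / (ss_total / df_total)
f_statistic = ms_regression / ms_residual
standard_error = math.sqrt(ms_residual)  # residual standard error

print(round(r_squared, 3), round(adjusted_r_squared, 3))
print(round(f_statistic, 3), round(standard_error, 3))
```

The recomputed values agree with the table to three decimal places, which confirms that the flattened output was transcribed consistently.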