This report analyzes the GPA of male and female students, the effect of parent qualification on GPA, and the predictors of GPA, using correlation and regression analysis. It also assesses the significance of the variables and gives recommendations for improving the regression model.
DATA ANALYSIS
Task 1

(a) Box plot to represent the GPA of male and female students

The box plot is not symmetric but skewed, and outliers are located at both ends. The outliers indicate that some students scored well above the bulk of the group (and, at the lower end, well below it). Because the data are skewed and contain outliers, the median and the interquartile range (IQR) are the appropriate measures of central tendency and dispersion for the GPA scores.

(b) Hypothesis test of the claim that the GPA of male students differs from the GPA of female students

H0: μGPA(Female) − μGPA(Male) = 0
H1: μGPA(Female) − μGPA(Male) ≠ 0

A two-sample t test assuming unequal variances is the appropriate test for this claim.
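The unequal-variance (Welch) two-sample t test named above can be sketched as follows. This is a minimal illustration that computes only the test statistic and the Welch-Satterthwaite degrees of freedom; the GPA values and the function name `welch_t` are made up for the example and are not the report's data or method of computation.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean (sample variance / n)
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Degrees of freedom are estimated from the data, not fixed at n_a + n_b - 2
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t_stat, df


# Hypothetical GPA values, for illustration only
female_gpa = [3.1, 2.8, 3.5, 3.0, 2.6, 3.3]
male_gpa = [2.9, 2.7, 3.4, 2.5, 3.0]

t_stat, df = welch_t(female_gpa, male_gpa)
print(f"t = {t_stat:.3f}, df = {df:.2f}")
```

The p value is then read from the t distribution with `df` degrees of freedom, which is what a spreadsheet tool for a two-sample t test assuming unequal variances reports alongside the statistic.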
The level of significance is assumed to be 5%. Because the alternative hypothesis contains a "≠" sign, the two-tailed p value is used. The p value (0.1972) is greater than the significance level (0.05), so there is insufficient statistical evidence to reject the null hypothesis. The null hypothesis is therefore not rejected, and it cannot be concluded that the GPA of male students differs from the GPA of female students.

2(a) Hypothesis test of the claim that the GPA of students whose parents hold a postgraduate qualification is higher than the GPA of students whose parents hold an undergraduate qualification

H0: μGPA(Postgraduate) − μGPA(Undergraduate) = 0
H1: μGPA(Postgraduate) > μGPA(Undergraduate)

A two-sample t test assuming unequal variances is the appropriate test for this claim.
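Because the alternative hypothesis here is directional, a one-tailed p value is needed. For a symmetric test statistic such as t, it can be derived from the two-tailed value, as in the sketch below. The function name `one_tailed_p` and the numbers are hypothetical and chosen only to show the conversion; they are not taken from the report's output.

```python
def one_tailed_p(p_two_tailed, t_stat, alternative="greater"):
    """Convert a two-tailed p value of a symmetric statistic to one tail.

    alternative="greater" means H1 claims the first mean is larger,
    alternative="less" means H1 claims it is smaller.
    """
    if alternative == "greater":
        in_h1_direction = t_stat > 0
    else:
        in_h1_direction = t_stat < 0
    half = p_two_tailed / 2
    # If the statistic points away from H1, the one-tailed p value is large
    return half if in_h1_direction else 1 - half


ALPHA = 0.05  # 5% significance level, as assumed in the report

# Hypothetical values, for illustration only
p_one = one_tailed_p(p_two_tailed=0.08, t_stat=1.76, alternative="greater")
decision = "reject H0" if p_one < ALPHA else "do not reject H0"
print(p_one, decision)
```

The same two-tailed value can therefore lead to different decisions depending on whether H1 is directional, which is why the direction of the alternative hypothesis is checked before reading the p value.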
The level of significance is assumed to be 5%. Because the alternative hypothesis is directional (">"), the one-tailed p value is used. The p value (0.1757) is greater than the significance level (0.05), so there is insufficient statistical evidence to reject the null hypothesis. The null hypothesis is therefore not rejected, and it cannot be concluded that the GPA of students whose parents hold a postgraduate qualification is higher than the GPA of students whose parents hold an undergraduate qualification.

(b) Hypothesis test of the claim that the GPA of students whose parents hold an undergraduate qualification is higher than the GPA of students whose parents hold a secondary or lower qualification

H0: μGPA(Undergraduate) − μGPA(Secondary or below) = 0
H1: μGPA(Undergraduate) > μGPA(Secondary or below)

A two-sample t test assuming unequal variances is the appropriate test for this claim.
The level of significance is assumed to be 5%. Because the alternative hypothesis is directional (">"), the one-tailed p value is used. The p value (0.00) is lower than the significance level (0.05), so there is sufficient statistical evidence to reject the null hypothesis in favour of the alternative hypothesis. Hence, the GPA of students whose parents hold an undergraduate qualification is higher than the GPA of students whose parents hold a secondary or lower qualification.

Task 2

(3) The correlation matrix computed using Excel is shown below.
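A correlation matrix of this kind can also be computed outside Excel. The sketch below uses the Pearson correlation coefficient on a few made-up columns. The column names follow the report's variables, but every value, and the helper names `pearson_r` and `correlation_matrix`, are assumptions for illustration only.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Return a nested dict: matrix[row_name][col_name] = r."""
    names = list(columns)
    return {
        r: {c: pearson_r(columns[r], columns[c]) for c in names}
        for r in names
    }


# Hypothetical data, for illustration only
data = {
    "GPA": [2.4, 3.1, 2.9, 3.6, 2.2, 3.3],
    "HS_MATH": [5.0, 7.0, 6.5, 8.5, 5.5, 7.5],
    "ATAR": [70.0, 82.0, 85.0, 90.0, 74.0, 80.0],
}

matrix = correlation_matrix(data)
for row_name, row in matrix.items():
    print(row_name, {name: round(r, 2) for name, r in row.items()})
```

Each variable's row can then be scanned for the entry in the GPA column to rank the predictors by the strength of their linear association with GPA.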
The correlation matrix indicates that HS_MATH has the highest correlation with GPA, followed by ATAR. The lowest correlation is observed between HS_ENG and GPA.

(4)(i) HS_SCI is one of the predictors of GPA, as reflected in both the correlation analysis and the regression analysis. The relevant regression analysis is shown below.

(ii) HS_ENG is one of the predictors of GPA, as reflected in both the correlation analysis and the regression analysis. The relevant regression analysis is shown below.

(iii) HS_MATH is one of the predictors of GPA, as reflected in both the correlation analysis and the regression analysis. The relevant regression analysis is shown below.
(iv) ATAR is one of the predictors of GPA, as reflected in both the correlation analysis and the regression analysis. The relevant regression analysis is shown below.

5) Regression model
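As a rough sketch of how a regression model with several predictors is fitted, the code below solves the ordinary least squares normal equations directly. The data are made up and built from a known linear rule, so the recovered coefficients can be checked. The helper names `solve_linear_system` and `fit_ols`, the two predictors used, and all values are assumptions for illustration, not the report's Excel output.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, copied so the inputs are not modified
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Return [intercept, b1, ..., bk] from the normal equations X'X b = X'y."""
    design = [[1.0] + list(r) for r in rows]  # leading 1.0 is the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical predictors: (HS_MATH score, female dummy: 1 = female, 0 = male)
rows = [(2, 0), (3, 1), (4, 0), (5, 1), (3, 0), (4, 1)]
# GPA built from a known rule so the fit can be verified (no noise added)
gpa = [1.0 + 0.3 * hs_math + 0.1 * female for hs_math, female in rows]

coefficients = fit_ols(rows, gpa)
print([round(c, 4) for c in coefficients])
```

With real data the fit is not exact, and each slope comes with a standard error and p value, which is what the significance discussion of the regression output relies on.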
6) Regression output (for step 5)

Interpretation

• HS_SCI: The slope indicates that GPA is expected to change by 0.08 when the student scores one unit higher in science at high school, holding the other variables constant.
• HS_ENG: The slope indicates that GPA is expected to change by 0.04 when the student scores one unit higher in English at high school, holding the other variables constant.
• HS_MATH: The slope indicates that GPA is expected to change by 0.35 when the student scores one unit higher in Math at high school, holding the other variables constant.
• PARENT_EDUC: The slope indicates that GPA is expected to change by 0.88 when the highest qualification of the student's parent is one level higher, holding the other variables constant.
• GENDER: Compared with a male student, a female student would tend to score on average 0.11 GPA points higher, assuming that the other variables are the same for both.

The relevant hypotheses for the significance test of each slope are given below. The decision rule is that an independent variable is significant when its p value does not exceed the significance level. On this basis, HS_MATH and PARENT_EDUC are the significant variables here.

6) The ATAR coefficient is negative in Step 6, which is unexpected, as a higher ATAR would be likely to be associated with a higher GPA. However, the ATAR slope is not statistically significant, so ATAR is not a significant predictor in this model.

Task 3

Summary Report
SES has a sizable influence on average student GPA, as reflected in the inferential testing (Figure 1). This is confirmed by the differences in performance observed between students whose parents have differing academic qualifications. However, no significant influence is observed for gender as an independent variable, since the average GPA of male and female students does not differ significantly (Figure 2).

With regard to regression, the key observation is that the HS_SCI slope coefficient remains insignificant from Step 1 to Step 3, and its magnitude falls. This makes a strong case for adding further independent variables to increase the predictive power of the regression model. Also, when ATAR is used as a standalone independent variable, it is significant (i.e. p < 0.05). However, as the other models show, ATAR loses its significance once other independent variables are added (Figure 3).

To choose the best model from the available options, the adjusted R² is a suitable indicator. On this basis, Step 4 is the most suitable model of those available. If a better model is required from the given variables, it makes sense to fit a new regression model comprising only the two independent variables that have proved significant (i.e. HS_MATH and PARENT_EDUC).

Taking Step 4 as the optimum model, the fit is quite poor, considering that the independent variables jointly explain only 23.22% of the variation in GPA (Figure 4). Given the poor fit of the existing best-fit regression model, it is important to consider new predictor variables that can be added to the model to enhance its predictive power and lower the standard error. Potential options include study hours, class attendance, engagement in online modules and the general IQ level of students. Since some of these variables could improve the GPA estimate, they should be seriously considered for inclusion.

Figure 1
Figure 2
Figure 3
Figure 4
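The model comparison in the summary report relies on adjusted R², which penalises R² for the number of predictors used. The sketch below applies the standard formula to two hypothetical models. The sample size, R² values and model labels are invented for illustration and are not the report's Step 1 to Step 6 results.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R-squared: 1 - (1 - R2) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


N_OBS = 100  # hypothetical sample size

# Hypothetical models: label -> (R-squared, number of predictors)
models = {
    "Model A": (0.20, 1),
    "Model B": (0.21, 5),
}

scores = {
    label: adjusted_r2(r2, N_OBS, k) for label, (r2, k) in models.items()
}
best = max(scores, key=scores.get)

for label, score in scores.items():
    print(f"{label}: adjusted R2 = {score:.4f}")
print("Preferred on adjusted R2:", best)
```

In this made-up case the model with the slightly higher raw R² is not the preferred one, because its extra predictors add too little explanatory power to offset the penalty. That is the reason for comparing models on adjusted R² rather than on R² alone.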