CP2403 Project Part 2: Regression Analysis of Oceanographic Data

Added on  2023/03/21

AI Summary
This CP2403 project performs a regression analysis on the CalCOFI dataset, examining the relationship between water temperature and two factors: salinity and oxygen concentration. The analysis includes scatter plots, variable selection, regression results, and a regression equation. A Q-Q plot is used to assess the normality of the data, and the percentages of observations exceeding 2 and 2.5 standard deviations are calculated and compared. The project concludes that the data is normally distributed and that the model explains the response variable.
CP2403 - Project - Part 2 - ANOVA
First Name:
Last Name:
1: Scatter plots between each explanatory variable and response variable
2: List all the explanatory variables selected for regression analysis. Justify your selection
The explanatory variables selected for the regression analysis are the salinity of the water
(Salnty) and the oxygen concentration of the water (O2Sat). We chose these two variables
because we want to determine whether the salinity and the oxygen concentration of the
water affect its temperature. Hence, the temperature of the water (T_degC) is the
response variable.
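The variable selection above corresponds to a model with T_degC as the response and Salnty and O2Sat as the two explanatory variables. A minimal sketch of how such a fit could be computed is given here, in plain Python. It assumes a fit through the origin, because the regression equation reported in section 4 shows no intercept term; whether the original analysis included an intercept is not known. The function name and the sample numbers are hypothetical and are not CalCOFI measurements.

```python
def fit_two_predictors(x1, x2, y):
    """Least-squares fit of y = b1*x1 + b2*x2 (assumption: no intercept).

    Solves the 2x2 normal equations directly.
    """
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("explanatory variables are perfectly collinear")
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    return b1, b2


# Made-up illustration data generated from y = 2*x1 + 3*x2 exactly.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [2.0, 1.0, 4.0, 3.0]
y = [8.0, 7.0, 18.0, 17.0]
b1, b2 = fit_two_predictors(x1, x2, y)
print(b1, b2)
```

With real data, x1 and x2 would hold the Salnty and O2Sat columns and y the T_degC column, after dropping rows with missing values.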
3: Regression analysis results
4: Regression equation/line
Temperature of Water (T_degC) = 0.1575 × Salinity (Salnty) + 0.098 × Oxygen Concentration (O2Sat)
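As a worked illustration of how the equation is applied, the sketch below evaluates it for a single observation. The coefficients are the ones reported in the equation; the function name and the input values are hypothetical and chosen only to show the arithmetic.

```python
def predict_t_degc(salinity, oxygen):
    """Evaluate the reported regression equation for one observation."""
    return 0.1575 * salinity + 0.098 * oxygen


# Hypothetical inputs, not taken from the dataset.
example_salinity = 33.4
example_oxygen = 90.0
print(predict_t_degc(example_salinity, example_oxygen))
```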
5: qqplot
6: Conclusion from qqplot
It can be noticed that the data points in the Q-Q plot are compact. This suggests that the
data is normally distributed. Thus, we conclude that the model explains the response variable.
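For reference, a Q-Q plot pairs the sorted, standardized sample values with the quantiles a standard normal distribution would give at the same ranks; points lying close to a straight line indicate approximate normality. The sketch below computes those pairs with the Python standard library. It assumes the plotting positions (i - 0.5) / n, which is one common convention and not necessarily the one used for the plot in this report; the function name and sample values are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def qq_points(values):
    """Return (theoretical quantile, standardized sample value) pairs."""
    xs = sorted(values)
    n = len(xs)
    m, s = mean(xs), stdev(xs)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    points = []
    for i, x in enumerate(xs, start=1):
        p = (i - 0.5) / n  # assumed plotting position
        points.append((std_normal.inv_cdf(p), (x - m) / s))
    return points


# Made-up sample; in practice this could be the regression residuals.
for theoretical, observed in qq_points([2.0, 4.0, 6.0, 8.0, 10.0]):
    print(round(theoretical, 3), round(observed, 3))
```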
7: Percentage of observations over 2 standard deviations
2 Standard Deviations
Depthm   T_degC   Salnty   O2ml_L
4.16%    1.27%    5.48%    19.50%
8: Percentage of observations over 2.5 standard deviations
2.5 Standard Deviations
Depthm   T_degC   Salnty   O2ml_L
2.20%    1.27%    5.48%    19.50%
9: Conclusion from observations over 2 std and 2.5 std
For Depthm, more observations lie beyond 2 standard deviations (4.16%) than beyond 2.5
standard deviations (2.20%).
For the variables T_degC, Salnty, and O2ml_L, the percentages of observations beyond 2 and
2.5 standard deviations are the same, as indicated above.
In conclusion, there does not seem to be a big difference between the two standard
deviation thresholds when the variables are normally distributed.
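The percentages in sections 7 and 8 count the observations whose distance from the mean exceeds k standard deviations, for k = 2 and k = 2.5. A minimal sketch of that calculation follows. It assumes the sample standard deviation and a two-sided test on the absolute deviation; the report does not state which definition was used, and the function name and data are hypothetical.

```python
from statistics import mean, stdev


def pct_beyond(values, k):
    """Percentage of values more than k standard deviations from the mean."""
    m = mean(values)
    s = stdev(values)  # assumption: sample standard deviation
    count = sum(1 for v in values if abs(v - m) > k * s)
    return 100.0 * count / len(values)


# Made-up data: eighteen zeros and two extreme values.
data = [0.0] * 18 + [10.0, -10.0]
print(pct_beyond(data, 2))
print(pct_beyond(data, 2.5))
```

Because every observation beyond 2.5 standard deviations is also beyond 2, the percentage for k = 2.5 can never exceed the percentage for k = 2.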