
Regression Analysis Project Assignment

   

Added on  2022-05-31

REGRESSION ANALYSIS
Assignment 2018/19 (Part 2)
Student Number: 17068115

Contents
1. Introduction
2. Methods
3. Results
4. Model evaluation
5. Assumptions
   1. Independence of residuals
   2. Residuals should be normally distributed
   3. Homoskedasticity of the residuals
6. Conclusions
Appendix
   R commands

1. Introduction
The objective of this report is to build prediction models for water usage using the data
discussed in a previous report.
2. Methods
This study is dedicated to predicting the water usage of a plant, measured in gallons. The data
used in this report comprise 100 random samples of each of the following variables: monthly
water usage, supervisor in charge, persons on the monthly payroll, operating days, amount of
production, and average monthly temperature. The table below shows the first three samples for
all six variables.
Table 1: Water Usage study data
Average monthly  Amount of production  Plant operating    Persons on monthly  Supervisor in     Monthly water
temperature (F)  (million pounds)      days in the month  plant payroll       charge (A, B, C)  usage (gallons)
80.4             14948                 23                 187                 C                 3815
67.5             14643                 20                 190                 B                 2917
71.3             6579                  21                 153                 C                 2891
The data have five independent variables (supervisor in charge, persons on the monthly payroll,
operating days, amount of production, and average monthly temperature) and one dependent
variable (monthly water usage). All variables are continuous except for supervisor in charge,
which is categorical and takes one of three values: A, B or C. Multiple regression is employed
as a feasible means of predicting water usage from known values of the five independent
variables. A stepwise approach was adopted to determine the most appropriate independent
variables to include in the regression model.
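To make the treatment of the categorical supervisor variable concrete, the sketch below shows one common way to dummy-code it before fitting a multiple regression. It is written in Python for illustration only and is not taken from the report's R commands. The function names `encode_row` and `build_design`, the choice of supervisor A as the reference level, and the column order are all assumptions. The rows are the three samples shown in Table 1.

```python
# Illustrative sketch: dummy (indicator) coding of the categorical
# "supervisor in charge" variable for a multiple regression design matrix.
# Assumption: supervisor A is the reference level, so it gets no column.

SUPERVISOR_LEVELS = ("A", "B", "C")

# The three sample rows from Table 1:
# (temperature F, production, operating days, payroll, supervisor, water usage)
TABLE_1_ROWS = [
    (80.4, 14948, 23, 187, "C", 3815),
    (67.5, 14643, 20, 190, "B", 2917),
    (71.3, 6579, 21, 153, "C", 2891),
]


def encode_row(temperature, production, days, payroll, supervisor):
    """Return one design-matrix row: intercept, four numeric predictors,
    then indicator columns for supervisors B and C."""
    if supervisor not in SUPERVISOR_LEVELS:
        raise ValueError(f"unknown supervisor: {supervisor!r}")
    is_b = 1.0 if supervisor == "B" else 0.0
    is_c = 1.0 if supervisor == "C" else 0.0
    return [1.0, float(temperature), float(production), float(days),
            float(payroll), is_b, is_c]


def build_design(rows):
    """Split raw rows into a design matrix X and a response vector y."""
    X = [encode_row(*row[:5]) for row in rows]
    y = [float(row[5]) for row in rows]
    return X, y


X, y = build_design(TABLE_1_ROWS)
for x_row, usage in zip(X, y):
    print(x_row, "->", usage)
```

With this coding, the coefficients on the B and C columns are read as differences in expected water usage relative to supervisor A, holding the other predictors fixed.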
3. Results
Exploratory Analysis
We start by examining the data with scatter plots of “Water usage” against each of the four
continuous explanatory variables.

Figure 1: Scatterplot of Water use dataset
Looking at the scatterplots presented in Figure 1, there is a clear positive
relationship between monthly water usage and production amount. The upward trendline indicates
that water usage increases as production increases, so a linear relationship can be used to
describe the two variables. The same type of relationship holds for average monthly temperature
and monthly water usage, although here the trendline is less steep, indicating a weaker
positive relationship. The scatterplot of operating days in the month against monthly water
usage has a fairly flat trendline, which indicates the lack of any discernible relationship
between the two variables. As such, this variable would be unnecessary in a multiple regression
model. Lastly, the scatterplot of persons on the monthly payroll against monthly water usage
has an upward trendline. This means that the two variables have a positive relationship, where
an increase in the explanatory variable is associated with an increase in the response variable.
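The strength of each relationship read off the scatterplots can also be quantified. The sketch below computes a Pearson correlation and a least-squares trendline in plain Python. It is an illustration rather than the report's own R code, and the function names `pearson_r` and `trendline` are assumptions. It uses only the three production and water-usage values shown in Table 1, which is far too few to support any conclusion; the full 100-sample data set would be needed to reproduce Figure 1.

```python
import math


def _centered_sums(xs, ys):
    """Return (Sxx, Syy, Sxy, mean_x, mean_y) for two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxx, syy, sxy, mean_x, mean_y


def pearson_r(xs, ys):
    """Pearson correlation coefficient, between -1 and 1."""
    sxx, syy, sxy, _, _ = _centered_sums(xs, ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


def trendline(xs, ys):
    """Least-squares line y = intercept + slope * x; returns (slope, intercept)."""
    sxx, _, sxy, mean_x, mean_y = _centered_sums(xs, ys)
    if sxx == 0:
        raise ValueError("slope is undefined when x is constant")
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Production and monthly water usage from the three rows of Table 1.
production = [14948, 14643, 6579]
water_usage = [3815, 2917, 2891]

r = pearson_r(production, water_usage)
slope, intercept = trendline(production, water_usage)
print(f"r = {r:.3f}, slope = {slope:.4f}, intercept = {intercept:.1f}")
```

A correlation near zero corresponds to the flat trendline described for operating days, while values closer to 1 correspond to the steeper upward trendlines.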
The figure below shows a boxplot of monthly water usage by supervisor in charge.
Judging from the boxplot, the variation in water usage is lowest when supervisor C was in
charge and highest when supervisor B was in charge. The median water usage was highest under
supervisor C and lowest under supervisor B.
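The two quantities read off a boxplot, the median and the height of the box (the interquartile range), can be computed per supervisor as sketched below. This is plain Python for illustration, not the report's R code. The water-usage values are hypothetical placeholders chosen only to make the example runnable: they are not the study data and are not meant to reproduce the pattern described above. The function name `summarise_by_group` is likewise an assumption.

```python
from collections import defaultdict
from statistics import median, quantiles


def summarise_by_group(records):
    """Group (label, value) pairs and return the median and interquartile
    range of each group. Each group needs at least two values."""
    groups = defaultdict(list)
    for label, value in records:
        groups[label].append(value)

    summary = {}
    for label, values in sorted(groups.items()):
        # quantiles(..., n=4) returns the three quartile cut points.
        q1, _, q3 = quantiles(values, n=4)
        summary[label] = {"median": median(values), "iqr": q3 - q1}
    return summary


# HYPOTHETICAL water-usage values per supervisor (not the study data).
HYPOTHETICAL_USAGE = [
    ("A", 3000), ("A", 3100), ("A", 3250), ("A", 3400),
    ("B", 2500), ("B", 2900), ("B", 3300), ("B", 3900),
    ("C", 3350), ("C", 3400), ("C", 3450), ("C", 3500),
]

for supervisor, stats in summarise_by_group(HYPOTHETICAL_USAGE).items():
    print(supervisor, stats)
```

Comparing the `iqr` values across supervisors corresponds to comparing box heights, and comparing the `median` values corresponds to comparing the centre lines of the boxes.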
