
Regression Analysis and Interpretation


Added on  2020/05/28

AI Summary
The assignment focuses on analyzing a provided dataset using regression techniques. It involves interpreting the meaning of regression coefficients, particularly highlighting positive and negative linear relations. Additionally, students need to understand and explain the coefficient of determination (R-squared), which represents the proportion of variation in one variable explained by others.

Running head: BUSINESS STATISTICS
BUSINESS STATISTICS
Name of the Student
Name of the Author
Author Note

Table of Contents
Assignment Part I
1. b.
Assignment Part II
2. a.
2. b.
2. c.
3. a.
3. b.
3. b. (i)
3. b. (ii)
3. c.
3. d.
4. a.
4. b.
4. c. (i)
4. c. (ii)
4. d.
5. a.
5. b.
5. c.
7. a.
7. b.
7. c.
7. d.
7. e.
7. f.
References
Assignment Part I
1. b.
The student ID number selected for this assignment is MIT17122. Following the guideline for selecting the relevant random numbers, the last three digits of the student ID, 122, are used. Consequently, the random number selection starts from row 22, column 1. As the random numbers are provided in blocks of six digits, each block yields two random numbers (Hamman et al., 2016): the first three and the last three digits of each block form two distinct three-digit random numbers. In the corresponding Excel sheet, the first column gives the random number selected and the second column gives its value. For instance, the first selected random number is 937, and so on (Chatterjee & Hadi, 2015). The third column records whether the selected random number is “Good” or “Not-Good”. “Good” means the number can be used as a sample number (Wilson, Bhatnagar & Townsend, 2017); “Not-Good” means it has to be rejected. Random numbers from 001 to 300 are accepted; all others, including 000, are rejected.
The selected samples are laid out in the file named “SampleSmartPhoneData”, which contains 50 samples from the provided list of 300.
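The selection rule described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and apart from 937 (mentioned in the text) the six-digit blocks used in the usage note are made-up examples, not values from the actual random number table. Repeated numbers are kept, on the assumption that the sample may contain the same CN more than once.

```python
def split_block(block):
    """Split a six-digit block into its first and last three-digit numbers."""
    if len(block) != 6 or not block.isdigit():
        raise ValueError("expected a six-digit block, got %r" % block)
    return int(block[:3]), int(block[3:])


def is_good(number, population_size=300):
    """A number is 'Good' if it lies between 001 and the population size."""
    return 1 <= number <= population_size


def select_sample(blocks, sample_size, population_size=300):
    """Walk through the blocks in order and keep 'Good' numbers until enough are found."""
    selected = []
    for block in blocks:
        for number in split_block(block):
            if is_good(number, population_size):
                selected.append(number)
                if len(selected) == sample_size:
                    return selected
    return selected
```

For example, `select_sample(["937122", "000300", "045277"], 3)` rejects 937 and 000 and keeps 122, 300 and 45.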
Assignment Part II
As requested, a Frequency Column Chart and a Relative Frequency Pie-Chart have been constructed to show the number and proportion of each entertainment type (Wun et al., 2016).

[Figure: Frequency column chart of entertainment type. Categories: Music, Video and movies; News and Weather Apps; IM and Social Network Apps; Games; eBooks; Maps and Navigation Apps; Other. Vertical axis: Frequency (0 to 25).]
[Figure: Relative Frequency Pie-Chart. Music, Video and movies (0.42); News and Weather Apps (0.04); IM and Social Network Apps (0.14); Games (0.16); eBooks (0.18); Maps and Navigation Apps (0.02); Other (0.04).]
2. a.
As per the frequency column chart, 21 of the samples have Music, Video and movies as their form of entertainment (Weaver et al., 2018).
2. b.
It is evident from the frequency column chart that music, videos and movies are the most commonly downloaded form of entertainment.
2. c.
The sample proportion of eBooks among the entertainment types is 0.18.
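As a rough check on the answers to 2. a. to 2. c., the relative frequencies can be recomputed from category counts. Only the count of 21 is stated in the text; the other counts below are an assumption, back-calculated from the proportions shown on the pie chart and n = 50.

```python
# Category counts: 21 is stated in the text, the rest are inferred
# from the reported relative frequencies multiplied by n = 50.
counts = {
    "Music, Video and movies": 21,
    "News and Weather Apps": 2,
    "IM and Social Network Apps": 7,
    "Games": 8,
    "eBooks": 9,
    "Maps and Navigation Apps": 1,
    "Other": 2,
}

n = sum(counts.values())

# Relative frequency = category count divided by the sample size.
relative_frequency = {name: count / n for name, count in counts.items()}

# The most commonly downloaded type is the category with the largest count.
most_common = max(counts, key=counts.get)
```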
3. a.
The table below lists the incomes in descending order, with the corresponding CN numbers attached for reference.
CN V1 Rank
60 $250,000 1.5
193 $250,000 1.5
140 $180,000 3.5
225 $180,000 3.5
72 $160,000 5
113 $155,000 6
114 $102,983 7
300 $101,262 8
137 $100,267 9
243 $100,200 10
252 $99,742 11
237 $99,398 14.5
242 $99,398 14.5
223 $99,398 14.5
248 $99,398 14.5
57 $99,398 14.5
165 $99,398 14.5
273 $99,374 18
46 $99,336 19
202 $98,955 20
241 $98,678 21
180 $98,673 22.5
205 $98,673 22.5
102 $98,645 24
249 $98,191 25
146 $97,756 26
277 $97,338 27.5
277 $97,338 27.5
134 $97,000 29
293 $96,286 30
98 $95,957 31
88 $95,931 32
49 $95,877 33
131 $95,297 34

62 $95,000 35
153 $93,250 36
153 $93,250 37
91 $90,164 38
221 $90,025 39
169 $88,887 40
123 $72,000 41
268 $70,000 43
176 $70,000 43
176 $70,000 43
2 $62,500 45.5
251 $62,500 45.5
4 $55,000 47
234 $45,000 48
234 $45,000 49
25 $40,000 50
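Most tied incomes in the table above share the mean of the positions they occupy (for example 1.5 for the two incomes of $250,000). A minimal sketch of that tie-handling rule, with a hypothetical function name, might look like this:

```python
def average_ranks(values):
    """Rank values from largest (rank 1) downwards; ties share the mean of their positions."""
    ordered = sorted(values, reverse=True)
    positions = {}
    for position, value in enumerate(ordered, start=1):
        positions.setdefault(value, []).append(position)
    return [sum(positions[value]) / len(positions[value]) for value in ordered]


top_six = [250000, 250000, 180000, 180000, 160000, 155000]
ranks = average_ranks(top_six)  # [1.5, 1.5, 3.5, 3.5, 5.0, 6.0]
```

Under the same rule, six values tied across positions 12 to 17 would each receive rank 14.5, which matches the rank shown for the six incomes of $99,398.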
3. b.
The location of a percentile, that is, the position in the ranked data at which the value of the desired percentile is found, is given by
\[ L_P = \frac{(n+1)\,P}{100} \]
where n is the total number of observations and P is the desired percentile.
3. b. (i)
Here, the desired percentile is the 70th, so P = 70. Substituting P = 70 and n = 50, the location is
\[ L_P = \frac{(50+1) \times 70}{100} = 35.7 \]
This can be written as IR + FR = 35 + 0.7 = 35.7, where IR is the integer part and FR the fractional part of the location.
The value with rank 35 is $95,000 and the value with rank 36 is $93,250. The value corresponding to the 70th percentile is then determined as
0.7 × (95000 − 93250) + 93250
= 0.7 × 1750 + 93250
= 1225 + 93250
= $94,475
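The 70th-percentile calculation above can be reproduced with a short sketch. The function names are hypothetical, and the last function simply mirrors the arithmetic as it is written in this document; other textbooks interpolate between neighbouring ranks in a different way, so it should not be read as the only convention.

```python
def percentile_location(n, p):
    """Location L_P = (n + 1) * P / 100 of the P-th percentile in ranked data."""
    return (n + 1) * p / 100


def split_location(location):
    """Split a location into its integer part (IR) and fractional part (FR)."""
    integer_part = int(location)
    return integer_part, location - integer_part


def interpolate_as_written(value_at_ir, value_at_next, fraction):
    """Mirror the document's arithmetic: FR * (v_IR - v_next) + v_next."""
    return fraction * (value_at_ir - value_at_next) + value_at_next


location = percentile_location(50, 70)           # 35.7
ir, fr = split_location(location)                # 35 and roughly 0.7
p70 = interpolate_as_written(95000, 93250, fr)   # roughly 94475
```

The same three functions can be reused with P = 25 and P = 75 to obtain the locations 12.75 and 38.25.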
3. b. (ii)
CN V1 Rank
60 $250,000 1
193 $250,000 1
140 $180,000 2
225 $180,000 2
72 $160,000 3
113 $155,000 4
114 $102,983 5
300 $101,262 6
137 $100,267 7
243 $100,200 8
252 $99,742 9
237 $99,398 10
242 $99,398 10
223 $99,398 10
248 $99,398 10
57 $99,398 10
165 $99,398 10
273 $99,374 11
46 $99,336 12
202 $98,955 13
241 $98,678 14
180 $98,673 15
205 $98,673 15
102 $98,645 16
249 $98,191 17
146 $97,756 18
277 $97,338 19
277 $97,338 19
134 $97,000 20
293 $96,286 21
98 $95,957 22

88 $95,931 23
49 $95,877 24
131 $95,297 25
62 $95,000 26
153 $93,250 27
153 $93,250 28
91 $90,164 29
221 $90,025 30
169 $88,887 31
123 $72,000 32
268 $70,000 33
176 $70,000 33
176 $70,000 33
2 $62,500 34
251 $62,500 34
4 $55,000 35
234 $45,000 36
234 $45,000 36
25 $40,000 37
The first and third quartiles are the 25th and 75th percentiles, and the calculations are carried out in the same way. For the 25th percentile,
\[ L_P = \frac{(50+1) \times 25}{100} = 12.75 \]
This can be expressed as IR + FR = 12 + 0.75 = 12.75.
The value with rank 12 is $99,336 and the value with rank 13 is $98,955. The value corresponding to the 25th percentile is then determined as
0.75 × (99336 − 98955) + 98995
= 0.75 × 381 + 98995
= 285.75 + 98995
= $99,280.75
To find the 75th percentile, proceeding in the same way,
\[ L_P = \frac{(50+1) \times 75}{100} = 38.25 \]
This can be expressed as IR + FR = 38 + 0.25 = 38.25.
The value with rank 38 is $90,164 and the value with rank 39 is $90,025. The value corresponding to the 75th percentile is then determined as
0.25 × (90164 − 90025) + 90025
= 0.25 × 139 + 90025
= 34.75 + 90025
= $90,059.75
3. c.
Before answering this question, it is important to clarify the idea of a percentile as it is used here. With the incomes ranked from highest to lowest, the percentile refers to the percentage of the sample lying above a certain value. For instance, the 70th percentile is the value above which 70 percent of the observations lie. In this case that value is $94,475, which implies that, among the 50 selected samples, 70 percent have an annual income above $94,475.
3. d.
The interquartile range is defined as the difference between the third quartile and the first quartile. Because the incomes are ranked from highest to lowest, the value labelled Q1 here is the larger of the two, so the range is taken as the magnitude of the difference:
\[ \mathrm{IQR} = |Q_3 - Q_1| = |P_{75} - P_{25}| = |90059.75 - 99280.75| = 9221 \]
The interquartile range measures the variation within a data set: it is the width of the interval containing the middle 50% of the values, which lie around the median. In this case the interquartile range is $9,221, which implies that the annual incomes of the middle 50% of the sample are spread over a range of $9,221.
4. a.
The following descriptive statistics table has been constructed in excel and then pasted
here.
Column1
Mean 101554.46
Standard Error 5815.206028
Median 97973.5
Mode 99398
Standard Deviation 41119.71617
Sample Variance 1690831058
Kurtosis 5.801565461
Skewness 2.114964151
Range 210000
Minimum 40000
Maximum 250000
Sum 5077723
Count 50
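Several rows of the Excel output above can be cross-checked with Python's standard `statistics` module. The income list below is transcribed from the ranked table in 3. a.; the variable names are illustrative, and the sketch assumes that Excel's "Standard Deviation" row is the sample (n − 1) version, which is what `statistics.stdev` computes.

```python
import statistics

# Annual incomes (V1) transcribed from the ranked table in 3. a.
incomes = (
    [250000] * 2 + [180000] * 2
    + [160000, 155000, 102983, 101262, 100267, 100200, 99742]
    + [99398] * 6
    + [99374, 99336, 98955, 98678]
    + [98673] * 2
    + [98645, 98191, 97756]
    + [97338] * 2
    + [97000, 96286, 95957, 95931, 95877, 95297, 95000]
    + [93250] * 2
    + [90164, 90025, 88887, 72000]
    + [70000] * 3
    + [62500] * 2
    + [55000]
    + [45000] * 2
    + [40000]
)

summary = {
    "count": len(incomes),
    "sum": sum(incomes),
    "mean": statistics.mean(incomes),
    "median": statistics.median(incomes),
    "mode": statistics.mode(incomes),
    "minimum": min(incomes),
    "maximum": max(incomes),
    "range": max(incomes) - min(incomes),
    "sample_sd": statistics.stdev(incomes),  # n - 1 in the denominator
}

# Standard error of the mean = sample SD / sqrt(n).
summary["standard_error"] = summary["sample_sd"] / summary["count"] ** 0.5
```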
4. b.
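The upper and lower inner fences are calculated with the provided formulae.
\[ \mathrm{IF}_{UL} = Q_3 + 1.5\,\mathrm{IQR} = 90059.75 + 1.5 \times 9221 \approx 103891.3 \]
\[ \mathrm{IF}_{LL} = Q_1 - 1.5\,\mathrm{IQR} = 99280.75 - 1.5 \times 9221 = 85449.25 \]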
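The fence arithmetic can be checked with a few lines of Python. The quartile values are the ones reported in this document, with Q1 and Q3 labelled as they are here; the variable names are illustrative only.

```python
# Quartile values as reported in this document.
q1 = 99280.75
q3 = 90059.75

# Interquartile range taken as the magnitude of the difference.
iqr = abs(q3 - q1)

# Inner fences, following the formulae as written above.
upper_inner_fence = q3 + 1.5 * iqr
lower_inner_fence = q1 - 1.5 * iqr
```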
4. c. (i).
The measure of central tendency chosen is the mean, or average. Among the measures of central tendency, viz. the mean, median and mode, the mean is generally regarded as the best measure, so it is chosen primarily for this purpose. Also, since the interquartile range and the range show that the data are well spread, the median and mode would not be the best choice. The mean is defined as
\[ \bar{X} = \frac{1}{N} \sum_{i=1}^{N} X_i \]
4. c. (ii).
The measure of dispersion chosen for this data set is the standard deviation (SD). The SD is defined as the square root of the mean of the squared deviations from the mean. Since the mean is chosen as the measure of central tendency, it is convenient from a practical perspective to use the standard deviation to measure the level of dispersion. The SD is defined as
\[ \sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (X_i - \bar{X})^2} \]
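The two definitions above translate directly into code. The sketch below uses a small made-up data set, not the income data, and also shows the sample version with n − 1 in the denominator, since that is normally what spreadsheet descriptive statistics report; the function names are hypothetical.

```python
import math


def mean(values):
    """Arithmetic mean: the sum of the values divided by their number."""
    return sum(values) / len(values)


def population_sd(values):
    """Square root of the mean squared deviation from the mean (divides by N)."""
    centre = mean(values)
    return math.sqrt(sum((x - centre) ** 2 for x in values) / len(values))


def sample_sd(values):
    """Same idea, but dividing by N - 1 (the sample standard deviation)."""
    centre = mean(values)
    return math.sqrt(sum((x - centre) ** 2 for x in values) / (len(values) - 1))


# Illustrative data only.
data = [2, 4, 4, 4, 5, 5, 7, 9]
```

For this data set the mean is 5 and the population SD is 2, while the sample SD is slightly larger because of the smaller denominator.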
4. d.
The V1 variable is the annual income of the samples under consideration. The mean, as mentioned above, is $101,554.46, which implies that the average annual income of the 50 samples is this amount. It may not seem like a middle or central value, as the incomes range from $40,000 to $250,000.
The median, or middle-most value of the data set, is $97,973.5. This means that the incomes of half of the observations lie above this value and the rest lie below it. The median also shows that the majority of the people have incomes in the vicinity of this value.
Quartiles are the values that divide the ranked data set into four equal groups. All of the quartile values have been calculated above:
\[ Q_1 = 99280.75, \quad Q_2 = \tilde{X} = 97973.5, \quad Q_3 = 90059.75 \]
The first quartile is the value above which 25% of the observations lie, and the third quartile is the value above which 75% of the observations lie. The median, which is the 50th percentile or second quartile, is the middle value of the data: 50% of the observations are above this value and the rest are below.
The measures of variation include the range, the sample SD and the sample variance. All the values were calculated in Excel and are reported above:
Standard Deviation 41119.71617
Sample Variance 1690831058
Range 210000
Clearly the SD and the variance are very high, and the range also indicates how widely the data set is dispersed.
5. a.
The three measures that help in recognizing whether the data follow a normal distribution are the mean, the median and the skewness. For a normal distribution the mean, median and mode are all equal, which is not the case for this data set (Leamer, 2016). The skewness is also high (2.11), whereas the skewness of a normal distribution is zero. Thus the data do not follow a normal distribution.
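To illustrate how skewness separates symmetric from lopsided data, here is a minimal sketch of a sample skewness calculation. It assumes the adjusted formula n / ((n − 1)(n − 2)) · Σ((x − mean) / s)³, which is the form documented for Excel's SKEW function; the function name and the two small data sets are made up for illustration.

```python
import math


def sample_skewness(values):
    """Adjusted sample skewness: n / ((n-1)(n-2)) * sum(((x - mean) / s) ** 3)."""
    n = len(values)
    if n < 3:
        raise ValueError("at least three observations are needed")
    centre = sum(values) / n
    # Sample standard deviation (n - 1 in the denominator).
    s = math.sqrt(sum((x - centre) ** 2 for x in values) / (n - 1))
    return n / ((n - 1) * (n - 2)) * sum(((x - centre) / s) ** 3 for x in values)


symmetric = [1, 2, 3, 4, 5]       # evenly spread around its mean
right_skewed = [1, 1, 1, 2, 10]   # one large value pulls the tail to the right
```

A symmetric data set gives a skewness of zero, while a data set with a long right tail, like the income data here, gives a positive value.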
5. b.
Here, the following table is drawn to determine the number of observations within the required range.
CN V1 Z
25 $40,000 -1.49696
234 $45,000 -1.37536
234 $45,000 -1.37536
4 $55,000 -1.13217
2 $62,500 -0.94977
251 $62,500 -0.94977
268 $70,000 -0.76738
176 $70,000 -0.76738
176 $70,000 -0.76738
123 $72,000 -0.71874
169 $88,887 -0.30806
221 $90,025 -0.28039
91 $90,164 -0.27701
153 $93,250 -0.20196
153 $93,250 -0.20196
62 $95,000 -0.1594
131 $95,297 -0.15218
49 $95,877 -0.13807
88 $95,931 -0.13676

98 $95,957 -0.13613
293 $96,286 -0.12812
134 $97,000 -0.11076
277 $97,338 -0.10254
277 $97,338 -0.10254
146 $97,756 -0.09238
249 $98,191 -0.0818
102 $98,645 -0.07076
180 $98,673 -0.07007
205 $98,673 -0.07007
241 $98,678 -0.06995
202 $98,955 -0.06322
46 $99,336 -0.05395
273 $99,374 -0.05303
237 $99,398 -0.05244
242 $99,398 -0.05244
223 $99,398 -0.05244
248 $99,398 -0.05244
57 $99,398 -0.05244
165 $99,398 -0.05244
252 $99,742 -0.04408
243 $100,200 -0.03294
137 $100,267 -0.03131
300 $101,262 -0.00711
114 $102,983 0.034741
113 $155,000 1.299755
72 $160,000 1.421351
140 $180,000 1.907736
225 $180,000 1.907736
60 $250,000 3.610082
193 $250,000 3.610082
The z-scores bounding the region are 1.5 and −1.5. From the standard normal table, the area between z = 0 and z = 1.5 is 0.43319. Taking both sides together, 86.638% of observations are expected to lie in this region (Wan et al., 2014). This means about 43 of the 50 observations are expected to lie within the specified region.
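The z-scores in the table and the normal-table lookup can be reproduced with the standard library. The mean and standard deviation are the values reported in 4. a.; the names below are illustrative, and `NormalDist` stands in for the printed standard normal table.

```python
from statistics import NormalDist

# Values reported in the descriptive statistics table.
MEAN = 101554.46
SD = 41119.71617
N = 50


def z_score(income):
    """Standardise an income: distance from the mean in standard deviations."""
    return (income - MEAN) / SD


standard_normal = NormalDist()  # mean 0, standard deviation 1

# Area between z = 0 and z = 1.5 (the figure read from the table).
half_area = standard_normal.cdf(1.5) - 0.5

# Area between z = -1.5 and z = 1.5, and the expected number of observations.
central_area = standard_normal.cdf(1.5) - standard_normal.cdf(-1.5)
expected_count = N * central_area
```

This gives the count expected under a normal distribution; it need not equal the number of rows in the table whose z-score actually falls between −1.5 and 1.5.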
5. c.
The following table has been constructed to provide an idea of the region asked for.
CN V1 TRUE/FALSE
25 $40,000 TRUE
234 $45,000 TRUE
234 $45,000 TRUE
4 $55,000 TRUE
2 $62,500 TRUE
251 $62,500 TRUE
268 $70,000 TRUE
176 $70,000 TRUE
176 $70,000 TRUE
123 $72,000 TRUE
169 $88,887 TRUE
221 $90,025 TRUE
91 $90,164 TRUE
153 $93,250 TRUE
153 $93,250 TRUE
62 $95,000 TRUE
131 $95,297 TRUE
49 $95,877 TRUE
88 $95,931 TRUE
98 $95,957 TRUE
293 $96,286 TRUE
134 $97,000 TRUE
277 $97,338 TRUE
277 $97,338 TRUE
146 $97,756 TRUE
249 $98,191 TRUE
102 $98,645 TRUE
180 $98,673 TRUE
205 $98,673 TRUE
241 $98,678 TRUE
202 $98,955 TRUE
46 $99,336 TRUE
273 $99,374 TRUE

237 $99,398 TRUE
242 $99,398 TRUE
223 $99,398 TRUE
248 $99,398 TRUE
57 $99,398 TRUE
165 $99,398 TRUE
252 $99,742 TRUE
243 $100,200 TRUE
137 $100,267 TRUE
300 $101,262 TRUE
114 $102,983 TRUE
113 $155,000 TRUE
72 $160,000 TRUE
140 $180,000 FALSE
225 $180,000 FALSE
60 $250,000 FALSE
193 $250,000 FALSE
It is evident that 46 of the 50 observations fall in the given region.
7. a.
The regression equation is \( \hat{Y} = 6.855 + 0.608X \).
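A fitted line of this form can be used to produce predictions, as in the brief sketch below. The coefficients are the ones stated above; the function name is hypothetical, and the meaning and units of X and Y are not given in this part of the document, so the inputs are arbitrary.

```python
# Coefficients of the fitted line as stated in the text.
INTERCEPT = 6.855
SLOPE = 0.608


def predict(x):
    """Predicted Y for a given X under the fitted regression line."""
    return INTERCEPT + SLOPE * x


# The slope is the change in predicted Y for a one-unit increase in X.
change_per_unit = predict(11) - predict(10)
```

Because the slope is positive, the predicted Y rises by about 0.608 for each additional unit of X, and the intercept 6.855 is the predicted Y when X is zero.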