Statistical Analysis of Passenger Data and Sales Trends

Added on  2025/05/03

Applied Quantitative Methods
TABLE OF CONTENTS
Question 1
a. Construct a frequency distribution using 10 classes, stating the Frequency, Relative Frequency, Cumulative Relative Frequency and Class Midpoint
b. Using (a), construct a histogram
c. Based upon the raw data (NOT the Frequency Distribution), what is the mean, median and mode?
Question 2
a. Is above a population or a sample? Explain the difference
b. Calculate the standard deviation of the weekly attendance
c. Calculate the Inter Quartile Range (IQR) of the chocolate bars sold. When is the IQR more useful than the standard deviation?
d. Calculate the correlation coefficient. Using the problem we started with, interpret the correlation coefficient
Question 3
a. Calculate AND interpret the Regression Equation
b. Calculate AND interpret the Coefficient of Determination
Question 4
a. What is the probability that a randomly chosen player will be from Holmes OR receiving Grassroots training?
b. What is the probability that a randomly selected player will be External AND be in scientific training?
c. Given that a player is from Holmes, what is the probability that he is in scientific training?
d. Is training independent from recruitment? Show your calculations and then explain in your own words what it means
Question 5
A. The company would like to know the probability that a consumer comes from segment A if it is known that this consumer prefers Product X over Product Y and Product Z
B. What is the probability that a random consumer's first preference is product X?
Question 6
A. During a 1 minute period you counted 8 people entering the store. What is the probability that only 2 or less of those 8 people will buy anything?
B. On average you have 4 people entering your store every minute during the quiet 10-11am slot
Question 7
A. Assuming a normal distribution, what is the probability that apartment will sell for over $2 million?
B. What is the probability that the apartment will sell for over $1 million but less than $1.1 million
Question 8
A. Since the apartments on Surfers Paradise are a mix of cheap older and more expensive new apartments, you know the distribution is NOT normal. Can you still use a Z-distribution to test your assistant's research findings against yours? Why, or why not?
B. You have over 2 000 investors in your fund. You and your assistant phone 45 of them to ask if they are willing to invest more than $1 million (each) to the proposed new fund. Only 11 say that they would, but you need at least 30% of your investors to participate to make the fund profitable. Based on your sample of 45 investors, what is the probability that 30% of the investors would be willing to commit $1 million or more to the fund?
References
LIST OF TABLES
Table 1: Given data table
Table 2: Frequency distribution table
Table 3: Mean, median and mode
Table 4: Weekly attendance and number of chocolate bars sold
Table 5: correlation coefficient table
Table 6: Weekly attendance and number of chocolate bars sold
Table 7: Interpretation of Regression Equation
Table 8: already given data
LIST OF FIGURES
Figure 1: Histogram
Question 1
Data were collected on the number of passengers at each train station in Melbourne. The
numbers for the weekday peak time, 7am to 9:29am, are given below.
Table 1: Given data table
456 1189 410 318 648 2300 382 248 379 1240 2048 272
267 1134 733 262 682 906 338 1750 530 1584 3045 323
1311 1536 1606 982 878 169 583 548 429 658 344 2450
538 494 1946 268 435 862 866 579 1348 1022 1618 1021
401 1181 1178 637 2745 1000 2900 962 697 401 1442 1115
a. Construct a frequency distribution using 10 classes, stating the Frequency, Relative
Frequency, Cumulative Relative Frequency and Class Midpoint
Table 2: Frequency distribution table
Class interval  Frequency  Relative frequency  Cumulative relative frequency  Lower point  Upper point  Midpoint
0-100           0          0.0000              0.0000                         0            100          50
100-450         17         0.2833              0.2833                         100          450          275
450-800         13         0.2167              0.5000                         450          800          625
800-1150        11         0.1833              0.6833                         800          1150         975
1150-1500       7          0.1167              0.8000                         1150         1500         1325
1500-1850       5          0.0833              0.8833                         1500         1850         1675
1850-2200       2          0.0333              0.9167                         1850         2200         2025
2200-2550       2          0.0333              0.9500                         2200         2550         2375
2550-2900       2          0.0333              0.9833                         2550         2900         2725
2900-3250       1          0.0167              1.0000                         2900         3250         3075
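The tabulation in Table 2 can be reproduced with a short script. This is a minimal sketch, not the author's method: the function name `frequency_distribution` and the variable names are illustrative, and it assumes each class is open on the left and closed on the right, (lower, upper], which is the convention that reproduces the counts in Table 2 (the value 2900 falls in 2550-2900).

```python
# Passenger counts copied from Table 1 (60 stations, weekday peak time).
passengers = [
    456, 1189, 410, 318, 648, 2300, 382, 248, 379, 1240, 2048, 272,
    267, 1134, 733, 262, 682, 906, 338, 1750, 530, 1584, 3045, 323,
    1311, 1536, 1606, 982, 878, 169, 583, 548, 429, 658, 344, 2450,
    538, 494, 1946, 268, 435, 862, 866, 579, 1348, 1022, 1618, 1021,
    401, 1181, 1178, 637, 2745, 1000, 2900, 962, 697, 401, 1442, 1115,
]

# Class boundaries copied from Table 2.
edges = [0, 100, 450, 800, 1150, 1500, 1850, 2200, 2550, 2900, 3250]


def frequency_distribution(data, edges):
    """Tabulate each class, assuming intervals of the form (lower, upper]."""
    n = len(data)
    rows = []
    running_count = 0
    for lower, upper in zip(edges, edges[1:]):
        freq = sum(1 for x in data if lower < x <= upper)
        running_count += freq
        rows.append({
            "class": f"{lower}-{upper}",
            "frequency": freq,
            "relative": freq / n,
            "cumulative": running_count / n,
            "midpoint": (lower + upper) / 2,
        })
    return rows


rows = frequency_distribution(passengers, edges)
for row in rows:
    print(f"{row['class']:>10} {row['frequency']:>3} "
          f"{row['relative']:.4f} {row['cumulative']:.4f} {row['midpoint']:g}")
```

The cumulative column is computed from a running count rather than by summing the rounded relative frequencies, so the last class ends at exactly 1.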
b. Using (a), construct a histogram
[Figure: frequency histogram; horizontal axis: Class Interval (the ten classes of Table 2), vertical axis: Frequency]
Figure 1: Histogram
c. Based upon the raw data (NOT the Frequency Distribution), what is the mean, median and
mode?
Table 3: Mean, median and mode
S. No Statistics Value
1 Mean 976.5667
2 Median 797.5
3 Mode 401
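The three statistics in Table 3 can be checked from the raw data of Table 1. The sketch below uses Python's standard `statistics` module; the variable names are illustrative and not part of the original solution.

```python
import statistics

# Passenger counts copied from Table 1 (60 stations, weekday peak time).
passengers = [
    456, 1189, 410, 318, 648, 2300, 382, 248, 379, 1240, 2048, 272,
    267, 1134, 733, 262, 682, 906, 338, 1750, 530, 1584, 3045, 323,
    1311, 1536, 1606, 982, 878, 169, 583, 548, 429, 658, 344, 2450,
    538, 494, 1946, 268, 435, 862, 866, 579, 1348, 1022, 1618, 1021,
    401, 1181, 1178, 637, 2745, 1000, 2900, 962, 697, 401, 1442, 1115,
]

mean_value = statistics.mean(passengers)      # arithmetic mean of the raw data
median_value = statistics.median(passengers)  # average of the 30th and 31st ordered values
mode_value = statistics.mode(passengers)      # most frequent value

print(round(mean_value, 4), median_value, mode_value)
```

Because the number of observations is even, the median is the average of the two middle values (733 and 862), which is why it is not itself one of the observed counts.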
Question 2
You are the manager of the supermarket on the ground floor of Holmes Building. You are
wondering if there is a relation between the number of students attending class at Holmes
Institute each day, and the amount of chocolate bars sold. That is, do you sell more chocolate
bars when there are a lot of Holmes students around, and less when Holmes is quiet? If there
is a relationship, you might want to keep less chocolate bars in stock when Holmes is closed
over the upcoming holiday. With the help of the campus manager, you have compiled the
following list covering 7 weeks.
Table 4: Weekly attendance and number of chocolate bars sold
Weekly attendance Number of chocolate bars sold
472 6 916
413 5 884
503 7 223
612 8 158
399 6 014
538 7 209
455 6 214
a. Is above a population or a sample? Explain the difference.
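Since the data cover only 7 weeks, this is a sample. A population would contain every week of interest, whereas a sample is a subset of the population used to draw conclusions about it.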
b. Calculate the standard deviation of the weekly attendance
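Since we are finding the standard deviation of a sample, the formula is
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$
where s = sample standard deviation, x_i = the i-th observation, \bar{x} = sample mean and
n = number of scores in the sample.
Mean of weekly attendance = (472+413+503+612+399+538+455)/7 = 3392/7 = 484.571
n - 1 = 7 - 1 = 6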
So the standard deviation is
$$s = \sqrt{\frac{(472-484.571)^2 + (413-484.571)^2 + (503-484.571)^2 + (612-484.571)^2 + (399-484.571)^2 + (538-484.571)^2 + (455-484.571)^2}{6}} = \sqrt{\frac{32909.71}{6}} \approx 74.060$$
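The calculation above can be verified in a few lines. This is a minimal sketch with illustrative variable names; it computes the sample standard deviation step by step (dividing by n - 1) and cross-checks it against `statistics.stdev` from the standard library.

```python
import math
import statistics

# Weekly attendance copied from Table 4.
attendance = [472, 413, 503, 612, 399, 538, 455]

n = len(attendance)
mean = sum(attendance) / n
# Sum of squared deviations from the mean.
squared_deviations = sum((x - mean) ** 2 for x in attendance)
# Sample standard deviation: divide by n - 1, then take the square root.
s = math.sqrt(squared_deviations / (n - 1))

print(round(mean, 3), round(s, 3))

# statistics.stdev also uses the n - 1 denominator, so the two should agree.
assert math.isclose(s, statistics.stdev(attendance))
```

Using `statistics.pstdev` instead would divide by n and give the population standard deviation, which is not what the question asks for here.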
c. Calculate the Inter Quartile Range (IQR) of the chocolate bars sold. When is the IQR more
useful than the standard deviation?
To find the IQR, we first arrange the data in increasing order: 5884, 6014, 6214, 6916, 7209,
7223, 8158.
Q1 is the point separating the lowest 25% of values from the highest 75%, and Q3 is the point
separating the lowest 75% of values from the highest 25%.
From the ordered data, Q1 = (6014+6214)/2 = 6114
Q3 = (7209+7223)/2 = 7216
IQR = Q3 - Q1 = 7216 - 6114 = 1102
The IQR is more useful than the SD when the data set contains extreme values. It measures the
spread of the middle 50% of the data and so leaves out the extreme values. If the data contain
outliers, the IQR is therefore the better measure, whereas the standard deviation cannot omit
extreme values (Sachs, 2012).
In our example the lowest value is 5884 and the highest is 8158, a difference of 2274. In that
case the IQR gives a better judgment of spread than the SD.
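The quartiles can be checked with Python's `statistics.quantiles`. Quartile conventions differ between textbooks and software, so this is only a sketch under an assumption: the `"inclusive"` method is chosen because it reproduces the Q1 and Q3 values used in this answer, and other conventions can give different quartiles for a data set this small. The variable names are illustrative.

```python
import statistics

# Number of chocolate bars sold, copied from Table 4.
bars_sold = [6916, 5884, 7223, 8158, 6014, 7209, 6214]

# n=4 splits the data into quartiles; the function sorts the data internally.
# method="inclusive" is an assumption chosen to match the hand calculation.
q1, q2, q3 = statistics.quantiles(bars_sold, n=4, method="inclusive")
iqr = q3 - q1

print(q1, q3, iqr)
```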
d. Calculate the correlation coefficient. Using the problem we started with, interpret the
correlation coefficient.
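To find the correlation coefficient we use the formula
$$r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - \left(\sum X\right)^2\right]\left[n\sum Y^2 - \left(\sum Y\right)^2\right]}}$$
where n = number of observations
X = variable 1 (Weekly attendance)
Y = variable 2 (Number of chocolate bars sold)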
Table 5: correlation coefficient table
(Based on: Cox, 2018)
S. No  Weekly attendance (X)  Number of chocolate bars sold (Y)  XY        X^2      Y^2
1      472                    6916                               3264352   222784   47831056
2      413                    5884                               2430092   170569   34621456
3      503                    7223                               3633169   253009   52171729
4      612                    8158                               4992696   374544   66552964
5      399                    6014                               2399586   159201   36168196
6      538                    7209                               3878442   289444   51969681
7      455                    6214                               2827370   207025   38613796
TOTAL  3392                   47618                              23425707  1676576  327928878
Substituting the totals from Table 5 into the correlation formula, we get
$$r = \frac{(7)(23425707) - (3392)(47618)}{\sqrt{\left[(7)(1676576) - 3392^2\right]\left[(7)(327928878) - 47618^2\right]}} = \frac{2459693}{\sqrt{6456805445696}} \approx 0.967993$$
Since r is close to +1, there is a strong positive linear relationship: as the weekly attendance
of Holmes students increases, the number of chocolate bars sold also increases. The number of
chocolate bars sold for a given weekly attendance can be estimated using linear regression.
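The column totals of Table 5 and the resulting coefficient can be reproduced as follows. This is a minimal sketch using the same computational formula as the hand calculation; the variable names are illustrative.

```python
import math

# Data copied from Table 5.
attendance = [472, 413, 503, 612, 399, 538, 455]        # X
bars_sold = [6916, 5884, 7223, 8158, 6014, 7209, 6214]  # Y

n = len(attendance)
sum_x = sum(attendance)
sum_y = sum(bars_sold)
sum_xy = sum(x * y for x, y in zip(attendance, bars_sold))
sum_x2 = sum(x * x for x in attendance)
sum_y2 = sum(y * y for y in bars_sold)

# Pearson correlation via the computational (sums-based) formula.
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(sum_xy, sum_x2, sum_y2, round(r, 6))
```

All intermediate sums are integers, so the only rounding happens in the final square root and division.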
Question 3
You are the manager of the supermarket on the ground floor of Holmes Building. You are
wondering if there is a relation between the number of students attending class at Holmes
Institute each day, and the amount of chocolate bars sold. That is, do you sell more chocolate
bars when there are a lot of Holmes students around, and less when Holmes is quiet? If there
is a relationship, you might want to keep less chocolate bars in stock when Holmes is closed
over the upcoming holiday. With the help of the campus manager, you have compiled the
following list covering 7 weeks.
Table 6: Weekly attendance and number of chocolate bars sold
Weekly attendance Number of chocolate bars sold
472 6916
413 5884
503 7223
612 8158
399 6014
538 7209
455 6214
a. Calculate AND interpret the Regression Equation
To develop the regression equation, we first build the regression table below.
Table 7: Interpretation of Regression Equation
(Based on: Draper and Smith, 2014)
S. No  Weekly attendance (X)  Number of chocolate bars sold (Y)  xi - x̄     yi - ȳ     (xi - x̄)^2  (yi - ȳ)^2  (xi - x̄)(yi - ȳ)
1      472                    6916                               -12.5714   113.429    158.0401    12866.14    -1425.961331
2      413                    5884                               -71.5714   -918.571   5122.465    843772.7    65743.41247
3      503                    7223                               18.4286    420.429    339.6133    176760.5    7747.917869
4      612                    8158                               127.4286   1355.429   16238.05    1837188     172720.4199
5      399                    6014                               -85.5714   -788.571   7322.464    621844.2    67479.12447
6      538                    7209                               53.4286    406.429    2854.615    165184.5    21714.93247
7      455                    6214                               -29.5714   -588.571   874.4677    346415.8    17404.86847
TOTAL  3392                   47618                                                    32909.71    4004032     351384.7143
Mean   484.5714               6802.571
The regression equation is a linear equation of the form: ŷ = b0 + b1x
where ŷ = predicted value of the dependent variable,
b0 = intercept, b1 = regression coefficient (slope) and x = independent variable.
First we solve for the regression coefficient b1:
$$b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$$
= 351384.7143/32909.71
= 10.67723382
Then we solve for the intercept b0: $$b_0 = \bar{y} - b_1 \bar{x}$$
= 6802.571 - (10.67723382 * 484.5714) = 1628.688985
Finally the regression equation is ŷ = 1628.688985 + (10.67723382 * x)
Here x (weekly attendance) is the independent variable and y (number of chocolate bars sold) is
the dependent variable. The slope means that each additional unit of weekly attendance is
associated with about 10.68 more chocolate bars sold, and the intercept means that predicted
sales are about 1629 bars when attendance is zero.
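The slope and intercept can be recomputed from the data in Table 6 using the deviation sums of Table 7. This is a minimal sketch of ordinary least squares for one predictor; the variable names and the helper `predict` are illustrative and not part of the original solution.

```python
# Data copied from Table 6.
attendance = [472, 413, 503, 612, 399, 538, 455]        # x
bars_sold = [6916, 5884, 7223, 8158, 6014, 7209, 6214]  # y

n = len(attendance)
x_bar = sum(attendance) / n
y_bar = sum(bars_sold) / n

# Deviation sums, matching the TOTAL row of Table 7.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(attendance, bars_sold))
s_xx = sum((x - x_bar) ** 2 for x in attendance)

b1 = s_xy / s_xx           # slope (regression coefficient)
b0 = y_bar - b1 * x_bar    # intercept


def predict(x):
    """Predicted number of chocolate bars sold for weekly attendance x."""
    return b0 + b1 * x


print(round(b1, 6), round(b0, 3))
```

A fitted least-squares line always passes through the point of means, so `predict(x_bar)` returns the mean number of bars sold, which is a quick sanity check on the two coefficients.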