Analysis of Data using T-test and Regression

Added on  2020/01/16

AI Summary
This assignment involves analyzing a dataset using statistical methods such as the t-test and regression analysis. It presents tables showcasing the results of these analyses, including mean differences, variances, t-statistics, p-values, R-squared values, ANOVA results, and coefficient estimates. The data is analyzed to test hypotheses and understand the relationship between variables.

FINANCIAL STATISTICS

TABLE OF CONTENTS
Executive summary
INTRODUCTION
(2) Descriptive statistics of the variables
(3) Calculation of average age and transaction done by the business firms
(4) Hypothesis testing
(5) Evaluation of linear relationship between transaction dollar and age
CONCLUSION
REFERENCES
Figure 1 Variation in balance amount from mean value
Figure 2 Chart on variation in transaction
Figure 3 Chart on age
Figure 4 Line fit plot
Executive summary
This study analyzes a bank data set and identifies several useful findings. The main findings are that men and women hold similar account balances, that age does not play a significant role in determining transaction value, and that transactions made with Visa and non-Visa cards are of similar size. All of these findings are based on a detailed analysis of the data set.
INTRODUCTION
Analytics is a broad field that firms use to solve their problems. This report applies statistical tools such as the mean and the mode. Mean values of age and transactions are also estimated at the 95% confidence level. At the end of the report, a regression model is applied and its results are interpreted systematically.
(2) Descriptive statistics of the variables
Descriptive statistics are basic statistical tools that can be used to analyze a firm's business. Several of these tools can be applied to the variables in this data set. They are explained and interpreted below.
Figure 1 Variation in balance amount from mean value
Figure 2 Chart on variation in transaction
Figure 3 Chart on age
• Mean: The mean is one of the most important statistical tools and is used by most data scientists (Weiss and Weiss, 2012). It identifies the average level of a variable. The mean age is 58, so on average the people who hold accounts at the bank are about 58 years old. On average these people keep a balance of about 1533 in their bank account. The data set also shows a mean transaction of 8668, so it can be said that people make large transactions from their bank accounts.
• Median: The median is an important statistical tool because it divides the ordered data into two equal halves (Bickel and Lehmann, 2012). It therefore helps in analyzing data systematically. The median age is 53.5, which is below the mean age of 58, so a number of people in the sample are well above the median age. The median balance is 275.72, far below the mean balance of 1533, which shows that balances vary widely and that a few large balances pull the mean upward. The median transaction is 8500 and the mean is 8668, so transaction amounts above the median do not rise very sharply.
• Mode: The mode is the value that occurs most often in the data set (Wall and Jenkins, 2012). The mode of age is 51, which means that 51 is the most common age in the sample. The mode of balance is 47, so several people hold exactly this balance in their bank account. The mode of transaction is 10,000, which is above both the mean and the median of the same variable. This shows that a number of people make large transactions that lie well above the mean.
• Standard deviation: The standard deviation of age is about 10.7, which shows that people of quite different ages are covered in the sample (Urdan, 2011). The standard deviation of balance is 2998, which is nearly twice the mean balance of 1533, so the amounts people keep in their bank accounts differ considerably. For transaction the standard deviation is 3639, which is moderate relative to the mean of 8668. It can be said that transaction values show only a moderate spread around the mean.
• Range: The range is the difference between the maximum and minimum values of the data set. The appendix reports a range of 19 for age and 6000 for transaction. These figures are smaller than the spread implied by the minimum and maximum values reported in the same table (35 to 78 for age and 2000 to 20000 for transaction, which give 43 and 18000). Either way, there is a large difference between the youngest and oldest people in the data set, and some people make very large transactions from their bank accounts.
• Coefficient of variation: The coefficient of variation (COV) is the standard deviation divided by the mean, so it expresses the amount of variability per unit of the mean (What is descriptive statistics, 2016). The table in the appendix shows a COV of 0.18 for age, which is low. For transaction the COV is 0.42, which is moderate, so transaction values vary moderately relative to their mean. For balance the COV is 1.96, or roughly 2, which is high and shows that balances are very widely dispersed relative to the mean balance.
• Interquartile range: The interquartile range is the difference between the third quartile (Q3) and the first quartile (Q1), so it covers the middle half of the data. For age the interquartile range is 17 (Q1 is 50 and Q3 is 67), which means that three quarters of the sample units are aged 50 or above. For balance the interquartile range is also high at 1050.50 (Q1 is 53.82 and Q3 is 1104.32), which means that three quarters of the sample units hold a balance of at least 53.82. For transaction the interquartile range is 4000 (Q1 is 6000 and Q3 is 10000), so three quarters of the people make transactions of 6000 or more.
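The measures described in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `describe` helper and the `sample_ages` values are hypothetical and are not the report's data, and the "inclusive" quartile convention is one of several in common use, so results may differ slightly from other tools.

```python
import statistics


def describe(values):
    """Return the descriptive measures discussed above for one variable."""
    data = sorted(values)
    mean = statistics.mean(data)
    stdev = statistics.stdev(data)  # sample standard deviation (n - 1)
    # "inclusive" quartiles interpolate between observed values;
    # other quartile conventions can give slightly different results.
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "mean": mean,
        "median": statistics.median(data),
        "mode": statistics.mode(data),
        "stdev": stdev,
        "range": data[-1] - data[0],  # maximum minus minimum
        "cv": stdev / mean,           # coefficient of variation
        "iqr": q3 - q1,               # interquartile range
    }


# Hypothetical ages for illustration only, NOT the report's data.
sample_ages = [35, 51, 51, 55, 62, 67, 78]
summary = describe(sample_ages)
for name, value in summary.items():
    print(f"{name:>6}: {value:.2f}")
```

Because the range is computed directly as maximum minus minimum, a sketch like this is also a quick way to cross-check a reported range against the reported extremes.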
(3) Calculation of average age and transaction done by the business firms
A confidence interval gives a range that is likely to contain the true mean of a variable. The confidence level is usually set at 95%, which means that the researcher is 95% confident that the true mean lies inside the range, and there is a 5% chance that it lies outside it. The interval reported for the mean age is 55.88 to 60.86, and the interval reported for the mean transaction is 7822 to 9514. Both intervals in the appendix are computed with a z value of 1.645, which is the critical value for a two-sided 90% interval; with the two-sided 95% value of 1.96 the ranges would be somewhat wider. These results indicate that the average age of respondents is likely to stay within the range given above, and that the average transaction is likely to stay within 7822 to 9514 if data are collected again in the coming period. Comparing each range with the mean of the relevant variable suggests that results will not change much in the coming months, on the assumption that the same trend continues in newly collected data.
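The interval arithmetic in Tables 2 and 3 (mean plus or minus Z times STDEV divided by the square root of n) can be reproduced as a rough check. The sketch below is an illustration, not the author's worksheet: the `mean_ci` helper is a hypothetical name, and n = 50 is an assumption inferred from the table's square-root entry of 7.0711. It computes the interval once with the table's z value of 1.645 and once with the two-sided 95% critical value of the standard normal distribution.

```python
import math
from statistics import NormalDist


def mean_ci(mean, stdev, n, z):
    """Return (lower, upper) for mean +/- z * stdev / sqrt(n)."""
    margin = z * stdev / math.sqrt(n)
    return mean - margin, mean + margin


# Values copied from Tables 1 and 2 (age); n = 50 is inferred from the
# table's square-root entry 7.0711 and is an assumption.
age_mean, age_sd, n = 58.369565, 10.708997, 50

# Interval as reported, using the table's z value of 1.645.
reported = mean_ci(age_mean, age_sd, n, z=1.645)

# Two-sided 95% critical value of the standard normal (about 1.96).
z95 = NormalDist().inv_cdf(0.975)
two_sided_95 = mean_ci(age_mean, age_sd, n, z=z95)

print(f"z = 1.645: {reported[0]:.2f} to {reported[1]:.2f}")
print(f"z = {z95:.3f}: {two_sided_95[0]:.2f} to {two_sided_95[1]:.2f}")
```

The same helper applies to the transaction variable by swapping in the mean and standard deviation from Table 3.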
(4) Hypothesis testing
(a) Mean difference between female and male account balances
In this study a hypothesis test was conducted to identify whether there is a significant difference between the mean account balances of males and females. The table in the appendix shows a two-tailed p-value of 0.147, which is greater than 0.05, and a t statistic of 1.52, which is below the critical value of 2.12. This means that there is no significant difference between the mean balances held in the accounts of males and females. A possible explanation is that both men and women are employed and that the difference in their salaries is minor, although the data set does not test this directly.
(b) Identification of difference in the value of transactions on Visa and non-Visa cards
A second hypothesis test identifies whether there is a significant difference between the mean values of transactions made through Visa and non-Visa cards. The results indicate that there is no significant difference between them. The two-tailed p-value is 0.92, which is greater than 0.05, and the t statistic of 0.10 is far below the critical value of 2.01, which supports this interpretation.
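Both tests in this section are equal-variance (pooled) two-sample t-tests, and their t statistics can be recomputed from the summary figures in Tables 4 and 5. The sketch below is a minimal illustration: `pooled_t` and the dictionary labels are hypothetical names, and the decision rule simply compares the absolute t statistic with the two-tail critical values printed in the tables rather than computing a p-value.

```python
import math


def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Equal-variance two-sample t statistic from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err, df


# Summary statistics and two-tail critical values copied from Tables 4 and 5.
# The labels are illustrative names, not taken from the tables.
tests = {
    "balance by gender": (
        (4283.421, 19746440, 9, 1635.088889, 7494140.453, 9), 2.119905),
    "transaction by card type": (
        (8663.462, 8984712, 26, 8562.5, 18202446, 24), 2.010635),
}

results = {}
for label, (stats, t_critical) in tests.items():
    t_stat, df = pooled_t(*stats)
    reject = abs(t_stat) > t_critical  # two-tailed test at the 5% level
    results[label] = (t_stat, df, reject)
    print(f"{label}: t = {t_stat:.4f}, df = {df}, reject H0: {reject}")
```

In both cases the absolute t statistic stays below the critical value, which is consistent with the conclusion of no significant difference.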
(5) Evaluation of linear relationship between transaction dollar and age
Figure 4 Line fit plot
In this section the hypothesis is that there is no linear relationship between transaction value and age. To test this hypothesis a regression model is applied to the raw data. The table shows that the value of R is 0.22, which means that the relationship between the variables is very weak. The value of R square is 0.049, or about 4.9%, which means that age explains only about 4.9% of the variation in the dependent variable. It can be said that age does not play a meaningful role in determining the transactions made by people of a particular age group. The significance level is 0.12, which is greater than 0.05, so the slope is not significantly different from zero and the hypothesis of no linear relationship cannot be rejected. The regression degree of freedom is 1 because the model contains a single independent variable. The coefficient of the independent variable (age) is -69.67, which means that each additional year of age is associated with a transaction value lower by about 69.67. This effect is small, and its 95% interval of -158.77 to 19.43 includes zero, which again indicates that age does not play a significant role in determining the amount an individual transacts from a bank account. The intercept of 12,580 is the predicted transaction value when age is zero. Because that lies far outside the ages observed in the sample, it should not be read as the expected average transaction. Hence, in order to identify why withdrawals are frequently observed in people's bank accounts, other variables need to be identified.
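The headline regression figures can be cross-checked against one another using only the numbers printed in Tables 7 and 8. The sketch below is illustrative: the variable names and the `predict_transaction` helper are hypothetical, and the age of 58 used in the example is simply the rounded sample mean from Table 1.

```python
# Figures copied from Tables 7 and 8.
ss_regression, ss_total = 31512065.31, 643401250
ms_regression, ms_residual = 31512065.31, 12747691.35
intercept, slope, slope_se = 12580.74489, -69.67225741, 44.31361886

r_squared = ss_regression / ss_total   # share of variance explained
f_stat = ms_regression / ms_residual   # overall F statistic
t_slope = slope / slope_se             # t statistic for the age coefficient


def predict_transaction(age):
    """Fitted transaction value for a given age."""
    return intercept + slope * age


print(f"R squared = {r_squared:.4f}")
print(f"F = {f_stat:.4f}, t for slope = {t_slope:.4f}")
print(f"Fitted transaction at age 58 = {predict_transaction(58):.2f}")
```

With a single predictor the F statistic equals the square of the slope's t statistic, which is why both carry the same p-value of 0.1225. Evaluating the fitted line near the mean age gives a value close to the sample mean transaction, whereas the intercept alone describes age zero.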
The value of Spearman's rho is -0.22, which reflects a negative relationship between the variables. However, this negative relationship is weak. Hence, it can be said that there is no significant relationship between the variables. This is confirmed by the p-value of 0.1225, which is greater than 0.05.
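The significance of a correlation of this size can be checked with the usual t transformation, t = r * sqrt(n - 2) / sqrt(1 - r^2). This formula is exact for Pearson's r and is commonly used as an approximation for Spearman's rho. The sketch below assumes r equals the Multiple R from Table 6 with a negative sign, n = 50 observations, and the two-tail critical value for 48 degrees of freedom as printed in Table 5.

```python
import math

# Assumptions: r is Multiple R from Table 6 with a negative sign,
# n is the number of observations from Table 6.
r, n = -0.221308189, 50
t_critical = 2.010635  # two-tail critical value for df = 48 (Table 5)

t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
significant = abs(t_stat) > t_critical

print(f"t = {t_stat:.4f}, significant at 5%: {significant}")
```

The resulting t statistic matches the t statistic of the age coefficient in Table 8, so the correlation and the regression lead to the same decision.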
CONCLUSION
On the basis of the above discussion it is concluded that statistics is a broad field that helps researchers analyze data more effectively. Business firms such as banks must analyze data from time to time in order to make better business decisions. It is also concluded that males and females have similar bank balances. There is no significant difference in the amounts that individuals transact with Visa and non-Visa cards. Finally, it is concluded that age does not significantly affect the amount that people transact from their bank accounts.
REFERENCES
Books & journals
Weiss, N.A. and Weiss, C.A., 2012. Introductory statistics. London: Pearson Education.
Bickel, P.J. and Lehmann, E.L., 2012. Descriptive statistics for nonparametric models IV.
Spread. In Selected Works of EL Lehmann (pp. 519-526). Springer US.
Wall, J.V. and Jenkins, C.R., 2012. Practical statistics for astronomers. Cambridge University
Press.
Urdan, T.C., 2011. Statistics in plain English. Routledge.
Online
What is descriptive statistics, 2016. [Online]. Available through: <http://study.com/academy/lesson/what-is-descriptive-statistics-examples-lesson-quiz.html>. [Accessed on 27th October 2016].
APPENDIX
(2)
Table 1 Descriptive statistics
Statistic                  Age         Balance $    Open   Transaction (in whole $)
Mean                       58.369565   1532.99      1      8668.478261
Median                     53.5        275.72       1      8500
Mode                       51          47           1      10000
STDEV                      10.708997   2998.0675    0      3638.549141
Minimum                    35          0            0      2000
Maximum                    78          13298.28     1      20000
Range                      19          35           1      6000
Coefficient of variation   0.1834688   1.9556993    0      0.419744854
Interquartile range
Q1                         50          53.82        1      6000
Q3                         67          1104.3175    1      10000
Interquartile range        17          1050.4975    0      4000
(3)
Table 2 Mean at 95% confidence interval
Age
Mean 58.37
Median 56.00
Mode 81.00
Standard deviation 10.71
Z value 1.645
Square root 7.071067812
STDEV/SR 1.514480827
Z*STDEV/SR 2.49132096
Mean-(Z*STDEV/SR) 55.88
Mean+(Z*STDEV/SR) 60.86

Data set mean 57
Table 3 Mean at 95% CI
Transaction
Mean 8668.48
Median 0.00
Mode 0.00
Standard deviation 3638.55
Z value 1.645
Square root 7.071067812
STDEV/SR 514.5685543
Z*STDEV/SR 846.4652719
Mean-(Z*STDEV/SR) 7822.01
Mean+(Z*STDEV/SR) 9514.94
Data set mean 8,022
(4)
4.1
Table 4 T test for first hypothesis
                               Variable 1   Variable 2
Mean                           4283.421     1635.088889
Variance                       19746440     7494140.453
Observations                   9            9
Pooled Variance                13620290
Hypothesized Mean Difference   0
df                             16
t Stat                         1.522248
P(T<=t) one-tail               0.073732
t Critical one-tail            1.745884
P(T<=t) two-tail               0.147464
t Critical two-tail            2.119905
4.2
Table 5 T test for second hypothesis
                               Variable 1   Variable 2
Mean                           8663.462     8562.5
Variance                       8984712      18202446
Observations                   26           24
Pooled Variance                13401542
Hypothesized Mean Difference   0
df                             48
t Stat                         0.097428
P(T<=t) one-tail               0.461396
t Critical one-tail            1.677224
P(T<=t) two-tail               0.922792
t Critical two-tail            2.010635
(5)
Table 6 Regression Statistics
Regression Statistics
Multiple R 0.221308189
R Square 0.048977314
Adjusted R Square 0.029164342
Standard Error 3570.390924
Observations 50
Table 7 ANOVA table
             df   SS            MS            F             Significance F
Regression   1    31512065.31   31512065.31   2.471982137   0.122460855
Residual     48   611889184.7   12747691.35
Total        49   643401250
Table 8 Coefficient table
               Coefficients   Standard Error   t Stat        P-value       Lower 95%      Upper 95%     Lower 95.0%    Upper 95.0%
Intercept      12580.74489    2572.374085      4.890713588   1.16851E-05   7408.640146    17752.84964   7408.640146    17752.84964
X Variable 1   -69.67225741   44.31361886      -1.57225384   0.122460855   -158.7707597   19.4262449    -158.7707597   19.4262449