Data Analysis Project: Card Fraud Analysis and Recommendations

Added on  2020/05/04

AI Summary
This project, titled "Foundations of Data Analysis," investigates card fraud through statistical analysis. The research examines data from 420 customers to identify the frequency of online and offline card fraud, and the impact of various factors on customer satisfaction with the credit card fraud resolution team. The study employs t-tests, ANOVA, and regression analysis to test hypotheses related to gender, age groups, and the influence of response time, advice, and communication on overall customer satisfaction. The findings reveal that online card fraud is more prevalent than offline fraud, and that customer satisfaction is influenced by the fraud resolution team's response time, advice, and communication skills. The project concludes with recommendations for improving security measures and customer service to mitigate card fraud and enhance customer satisfaction.
Running Head: FOUNDATIONS OF DATA ANALYSIS
Foundations of Data Analysis
Name of the Student
Name of the University
Author Note
Executive Summary
This research was conducted to identify the frequency of online and offline card fraud and to
understand the need to reduce both card fraud and the service time required to resolve it.
Online card fraud was observed to occur more often than offline card fraud, so the company
should adopt better security plans to reduce it. Further, the overall satisfaction of the
customers is influenced by their satisfaction with the ‘response time’, ‘the level of advice’ and
‘the level of communication’ of the credit card fraud resolution team. Thus, these services
provided by the company should also be improved.
Table of Contents
1.0 Introduction
2.0 Research Design
3.0 Hypothesis Development
4.0 Statistical Technique and Justification
5.0 Results and Interpretations
5.1 Test for Question 1
5.2 Test for Question 2
5.3 Test for Question 3
5.4 Test for Question 4
5.5 Test for Question 5
6.0 Analysis and Summary of the Results
7.0 Recommendations
REFERENCES
1.0 Introduction
Visa Inc. operates one of the largest retail electronic payment networks in the world and is
one of the most recognized brands in financial services. The company facilitates global
commerce through the transfer of information and value among financial institutions,
consumers, businesses, merchants and government entities.
Many types of fraud occur in the world today, and credit card fraud is among the most
frequent. This type of fraud happens online as well as offline. Visa Inc. is therefore running
this research to identify some specific issues and reduce the frequency of card fraud.
The primary objectives of this research are discussed as follows.
To identify whether the number of card frauds experienced differs across gender.
To identify whether there is any difference across age groups regarding card fraud.
To determine whether the average time required to resolve a card fraud problem is less than 12 hours.
To determine the frequency of occurrence of online and offline card fraud.
To determine whether there is any difference between the frequency of online and offline card fraud.
To identify the influence of customers’ satisfaction scores for ‘response time’, ‘the level of advice’ and ‘the level of communication’ on the overall satisfaction with the credit card fraud resolution team.
2.0 Research Design
To perform this research, data were collected on customers’ experience of personal fraud.
A total of 2000 customers were selected using simple random sampling and asked to fill in
the questionnaire. Of these, only 420 responded, so the response rate is only 21 percent
(420 out of 2000).
The ethical considerations that have to be kept in mind while collecting the data and doing
the research are given below (Cacciattolo, 2015):
The participants involved in the research should not be subjected to any type of harm.
Respect and priority should be given to the dignity of the participants involved in the research.
The participants should give full consent before participating in the study.
The privacy of the participants taking part in the research must be assured.
The data collected for the research purpose should be kept confidential and must not be disclosed to anybody other than the individuals who are directly related to the research.
The participants of the survey must be assured that their identities will not be disclosed anywhere.
The research aims and objectives must not be exaggerated.
Any affiliations of the research and any sources of funding for the research must be disclosed at the time of the research.
Communications that are necessary in doing the research must be done honestly and transparently.
Misleading information and biased analysis or representation of the data must be avoided.
3.0 Hypothesis Development
For the purpose of the research, the following hypotheses can be framed:
Question 1: Is the number of card fraud experienced the same across gender?
Null Hypothesis (H01): There is no significant difference between the card fraud experienced by
males and females.
Alternate Hypothesis (HA1): There is significant difference between the card fraud experienced
by males and females.
Question 2: Are there differences across age groups regarding card fraud?
Null Hypothesis (H02): There is no significant difference across age groups regarding card fraud.
Alternate Hypothesis (HA2): There is a significant difference across age groups regarding card
fraud.
Question 3: Is the average response time experienced by the customers less than the 12 hours
set by the company?
Null Hypothesis (H03): There is no significant difference in the average response time from 12
hours.
Alternate Hypothesis (HA3): The average response time is less than 12 hours.
Question 4: Is the frequency of online card fraud more than that of offline card fraud?
Null Hypothesis (H04): There is no significant difference in the frequency of online card fraud
from offline card fraud.
Alternate Hypothesis (HA4): There is significant difference in the frequency of online card fraud
from offline card fraud.
Question 5: Do any of the customers’ satisfaction scores of ‘response time’, ‘the level of advice’
and ‘the level of communication’ influence the overall satisfaction with the credit card fraud
resolution team?
Null Hypothesis (H05): The customers’ satisfaction scores of ‘response time’, ‘the level of
advice’ and ‘the level of communication’ do not influence the overall satisfaction with the credit
card fraud resolution team.
Alternate Hypothesis (HA5): The customers’ satisfaction scores of ‘response time’, ‘the level of
advice’ and ‘the level of communication’ influence the overall satisfaction with the credit card
fraud resolution team.
4.0 Statistical Technique and Justification
The hypotheses stated above have to be tested using appropriate statistical techniques. The
techniques required to test them are discussed here.
To test the first hypothesis, a two-sample t-test will be used. A two-sample t-test, also
called an independent-samples t-test, is the most appropriate test for comparing the means of
a single variable between two different groups (Traitler, Coleman & Burbidge, 2017).
To test the second hypothesis, an analysis of variance (ANOVA) test will be used, as this is
the most appropriate test to compare the means of more than two groups of a single variable
(Wiley & Pace, 2015).
To test the third hypothesis, a one-sample t-test will be used, as this is the most
appropriate test to compare the mean of one variable with a predetermined value of that mean
(Chachi, Taheri & Viertl, 2016).
To test the fourth hypothesis, a two-sample t-test will be performed, as this test compares
the means of two different groups of a single variable.
To test the fifth hypothesis, regression analysis will be used, as it makes it possible to
find out whether the independent variables have any influence on the dependent variable
(Draper & Smith, 2014).
5.0 Results and Interpretations
5.1 Test for Question 1
It can be observed that 66 percent of the respondents (278 out of 420) have experienced card
fraud in the last 12 months, while 34 percent (142 out of 420) have not. Thus, most of the
people who responded to the survey have experienced card fraud. The figures are given in
Table 5.1 and Figure 5.1.
Table 5.1: Number of people who faced card fraud in last 12 months
Response to Question 1 | Count
1 (experienced card fraud) | 278
2 (did not experience card fraud) | 142
Grand Total | 420
[Pie chart "Number of respondents who Faced Card Fraud": response 1, 66%; response 2, 34%]
Figure 5.1: Percentage of people who faced card fraud in last 12 months
Thus, as shown before, out of 420 respondents, 278 have experienced card fraud. Now,
the difference between the numbers of card frauds experienced by these 278 people across
gender has to be tested. At first, the difference between the numbers of offline card frauds has
been tested.
Table 5.2: Two-Sample t-test for difference in offline fraud
Statistic | Male | Female
Mean | 4.40 | 4.24
Variance | 23.44 | 23.64
Observations | 126 | 152
Pooled Variance | 23.55 |
Hypothesized Mean Difference | 0 |
df | 276 |
t Stat | 0.262 |
P(T<=t) one-tail | 0.397 |
t Critical one-tail | 1.650 |
P(T<=t) two-tail | 0.793 |
t Critical two-tail | 1.969 |
Statistical Interpretation:
From Table 5.2, it is evident that t-calculated (0.262) is less than the two-tail t-critical
(1.969) and the p-value (0.793) is greater than the 5 percent level of significance. Thus, we
fail to reject the null hypothesis (H01): there is no significant difference between the offline
card fraud experienced by males and females at the 5 percent level of significance.
Non-Statistical Interpretation:
The average number of times females experience offline card fraud does not differ much
from the number of times males experience offline card fraud. Therefore, everyone should be
much more careful so that nobody can defraud them.
Table 5.3: Two-Sample t-test for difference in online fraud
Statistic | Male | Female
Mean | 5.80 | 6.66
Variance | 14.98 | 15.52
Observations | 126 | 152
Pooled Variance | 15.28 |
Hypothesized Mean Difference | 0 |
df | 276 |
t Stat | -1.818 |
P(T<=t) one-tail | 0.035 |
t Critical one-tail | 1.650 |
P(T<=t) two-tail | 0.070 |
t Critical two-tail | 1.969 |
Statistical Interpretation:
From Table 5.3, it is evident that the absolute value of t-calculated (1.818) is less than the
two-tail t-critical (1.969) and the two-tail p-value (0.070) is greater than the 5 percent level
of significance. Thus, we fail to reject the null hypothesis (H01): there is no significant
difference between the online card fraud experienced by males and females at the 5 percent
level of significance.
Non-Statistical Interpretation:
The average number of times females experience online card fraud does not differ much
from the number of times males experience online card fraud. Therefore, everyone should be
much more careful while using their cards online so that nobody can defraud them.
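The t statistics in Tables 5.2 and 5.3 can be approximately reproduced from the summary statistics alone. The following is a minimal sketch, not the analysis actually used for the report: the function name `pooled_t` is illustrative, and the inputs are the rounded means and variances from the tables, so the results differ slightly from the tabulated values.

```python
import math

def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Equal-variance two-sample t-test from summary statistics.

    Returns (t statistic, degrees of freedom, pooled variance).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err, df, pooled_var

# Two-tail critical value for df = 276 at the 5 percent level, taken from the tables
T_CRIT_TWO_TAIL = 1.969

# (mean, variance, n) for males, then females, as reported in Tables 5.2 and 5.3
offline = pooled_t(4.40, 23.44, 126, 4.24, 23.64, 152)
online = pooled_t(5.80, 14.98, 126, 6.66, 15.52, 152)

for label, (t_stat, df, pooled_var) in (("offline", offline), ("online", online)):
    decision = "reject H01" if abs(t_stat) > T_CRIT_TWO_TAIL else "fail to reject H01"
    print(f"{label}: t = {t_stat:.3f}, df = {df}, "
          f"pooled variance = {pooled_var:.2f} -> {decision}")
```

In both cases the absolute value of t stays below the two-tail critical value, which agrees with the decision not to reject H01.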
5.2 Test for Question 2
To test the hypothesis stated for question 2, the following test has been performed:
Table 5.4: Summary Statistics for ANOVA on Offline Card Fraud
Groups | Count | Sum | Average | Variance
Less than 25 years | 67 | 326 | 4.87 | 24.45
26-35 years | 66 | 309 | 4.68 | 24.25
36-45 years | 80 | 358 | 4.48 | 23.64
46-55 years | 42 | 106 | 2.52 | 18.21
More than 55 years | 23 | 100 | 4.35 | 24.15
Table 5.5: ANOVA Table on Offline Card Fraud
Source of Variation | SS | df | MS | F | P-value | F crit
Between Groups | 166.021 | 4 | 41.505 | 1.788 | 0.131 | 2.405
Within Groups | 6335.753 | 273 | 23.208 | | |
Total | 6501.773 | 277 | | | |
Statistical Interpretation:
From Table 5.5, it is evident that F-calculated (1.788) is less than F-critical (2.405) and the
p-value (0.131) is greater than the 5 percent level of significance. Thus, we fail to reject the
null hypothesis (H02): there is no significant difference between the offline card fraud
experienced across different age groups at the 5 percent level of significance.
Non-Statistical Interpretation:
The average number of times people of different age groups experience offline card fraud
does not differ significantly. Thus, the data give no evidence that any age group is more
likely than another to experience card fraud offline.
Table 5.6: Summary Statistics for ANOVA on Online Card Fraud
Groups | Count | Sum | Average | Variance
Less than 25 years | 67 | 417 | 6.22 | 16.02
26-35 years | 66 | 463 | 7.02 | 12.66
36-45 years | 80 | 516 | 6.45 | 15.74
46-55 years | 42 | 206 | 4.90 | 16.04
More than 55 years | 23 | 141 | 6.13 | 16.66
Table 5.7: ANOVA Table on Online Card Fraud
Source of Variation | SS | df | MS | F | P-value | F crit
Between Groups | 118.112 | 4 | 29.528 | 1.943 | 0.104 | 2.405
Within Groups | 4148.654 | 273 | 15.197 | | |
Total | 4266.766 | 277 | | | |
Statistical Interpretation:
From Table 5.7, it is evident that F-calculated (1.943) is less than F-critical (2.405) and the
p-value (0.104) is greater than the 5 percent level of significance. Thus, we fail to reject the
null hypothesis (H02): there is no significant difference between the online card fraud
experienced across different age groups at the 5 percent level of significance.
Non-Statistical Interpretation:
The average number of times people of different age groups experience online card fraud
does not differ significantly. Thus, the data give no evidence that any age group is more
likely than another to experience card fraud, whether online or offline.
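The ANOVA tables above can be rebuilt from the group counts, sums and variances in Tables 5.4 and 5.6. The sketch below is illustrative only: the function name `one_way_anova` is an assumption, and because the variances are rounded, the within-group sum of squares differs slightly from the tabulated one.

```python
def one_way_anova(groups):
    """One-way ANOVA from summary statistics.

    groups: list of (count, total, variance) tuples, one per group.
    Returns (ss_between, ss_within, df_between, df_within, f_stat).
    """
    n_total = sum(n for n, _, _ in groups)
    grand_mean = sum(total for _, total, _ in groups) / n_total
    ss_between = sum(n * (total / n - grand_mean) ** 2 for n, total, _ in groups)
    ss_within = sum((n - 1) * var for n, _, var in groups)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_stat

# Critical F value for (4, 273) degrees of freedom at the 5 percent level, from the tables
F_CRIT = 2.405

# (count, sum, variance) per age group, from Table 5.4 (offline) and Table 5.6 (online)
offline_groups = [(67, 326, 24.45), (66, 309, 24.25), (80, 358, 23.64),
                  (42, 106, 18.21), (23, 100, 24.15)]
online_groups = [(67, 417, 16.02), (66, 463, 12.66), (80, 516, 15.74),
                 (42, 206, 16.04), (23, 141, 16.66)]

offline_anova = one_way_anova(offline_groups)
online_anova = one_way_anova(online_groups)

for label, result in (("offline", offline_anova), ("online", online_anova)):
    ss_b, ss_w, df_b, df_w, f_stat = result
    decision = "reject H02" if f_stat > F_CRIT else "fail to reject H02"
    print(f"{label}: SSB = {ss_b:.3f}, SSW = {ss_w:.3f}, "
          f"F({df_b}, {df_w}) = {f_stat:.3f} -> {decision}")
```

Both F statistics fall below the critical value of 2.405, which agrees with the decision not to reject H02.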
5.3 Test for Question 3
From Table 5.8 given below, it can be seen that the average time lost by a customer in
resolving the most recent incident of credit card fraud is 13.65 hours.
Table 5.8: Descriptive statistics for amount of time lost (in hours) in resolving the most
recent incident of credit card fraud
Mean 13.65
Standard Error 1.05
Median 1
Mode 1
Standard Deviation 17.562
Sample Variance 308.430
Kurtosis -0.991
Skewness 0.844
Range 50
Minimum 0
Maximum 50
Sum 3795
Count 278
Visa Inc. has set a target time of 12 hours. To test whether the average response time is
significantly less than this target, the following test has been done.
Table 5.9: One-Sample t-test for difference in service time from predefined mean
Statistic | Amount of time lost (in hours) in resolving the most recent incident of credit card fraud
Mean | 13.65
Variance | 308.43
Observations | 278
Hypothesized Mean Difference | 12
df | 277
t Stat | 1.568
P(T<=t) one-tail | 0.059
t Critical one-tail | 1.650
P(T<=t) two-tail | 0.118
t Critical two-tail | 1.969
Statistical interpretation:
From Table 5.9, it is evident that t-calculated (1.568) is less than the one-tail t-critical
(1.650) and the one-tail p-value (0.059) is greater than the 5 percent level of significance.
Thus, we fail to reject the null hypothesis (H03). Moreover, because the sample mean (13.65
hours) is above 12 hours, the data give no support to the alternate hypothesis (HA3) that the
average service time in resolving the problem of card fraud is less than 12 hours.
Non-statistical interpretation:
The average service time (13.65 hours) is not less than 12 hours (value obtained from
forecasting model). Therefore, the observed average service time, rather than the 12-hour
figure, should be used while advertising for the company.
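The one-sample t statistic in Table 5.9 follows directly from the descriptive statistics in Table 5.8. The sketch below is illustrative (variable names are assumptions) and uses the tabulated one-tail critical value instead of computing a p-value. It also makes explicit that a lower-tailed test of HA3 would require a large negative t, which is not the case here.

```python
import math

# Descriptive statistics from Table 5.8
sample_mean = 13.65   # hours
std_dev = 17.562      # hours
n = 278
target_mean = 12.0    # hours, the value set by the company

std_err = std_dev / math.sqrt(n)
t_stat = (sample_mean - target_mean) / std_err

# One-tail critical value for df = 277 at the 5 percent level, from Table 5.9
T_CRIT_ONE_TAIL = 1.650

# HA3 claims the mean is LESS than 12 hours, so it is supported only if
# t falls below the negative critical value.
supports_less_than_12 = t_stat < -T_CRIT_ONE_TAIL

print(f"standard error = {std_err:.2f}, t = {t_stat:.3f}")
print("HA3 supported" if supports_less_than_12 else "HA3 not supported")
```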
5.4 Test for Question 4
From Table 5.10, it can be seen that, on average, a respondent experienced about 4 offline
card frauds (1199/278 ≈ 4.31) and about 6 online card frauds (1743/278 ≈ 6.27).
Table 5.10: Descriptive statistics measures for the frequency of online and offline card fraud
Statistic | Offline Card Fraud | Online Card Fraud
Mean | 4 | 6
Standard Error | 0.29 | 0.24
Median | 0 | 6
Mode | 0 | 1
Standard Deviation | 4.845 | 3.925
Sample Variance | 23.472 | 15.403
Kurtosis | -1.896 | -1.346
Skewness | 0.292 | -0.075
Range | 10 | 12
Minimum | 0 | 0
Maximum | 10 | 12
Sum | 1199 | 1743
Count | 278 | 278
Table 5.11: Two-Sample t-test for difference in offline and online card fraud
Statistic | Offline Card Fraud | Online Card Fraud
Mean | 4 | 6
Variance | 23.47 | 15.40
Observations | 278 | 278
Pooled Variance | 19.44 |
Hypothesized Mean Difference | 0 |
df | 554 |
t Stat | -5.233 |
P(T<=t) one-tail | 0.000 |
t Critical one-tail | 1.648 |
P(T<=t) two-tail | 0.000 |
t Critical two-tail | 1.964 |
Statistical Interpretation:
From Table 5.11, it is evident that the absolute value of t-calculated (5.233) is greater than
the two-tail t-critical (1.964) and the p-value (0.000) is less than the 5 percent level of
significance. Thus, we reject the null hypothesis (H04) that there is no significant difference
between the frequency of offline and online card fraud at the 5 percent level of significance.
Non-Statistical Interpretation:
On average, people experience online card fraud considerably more often than offline card
fraud. Therefore, Visa Inc. must invest in updated online security that will decrease the
number of card frauds during online transactions.
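The size of the difference between offline and online fraud can be made concrete with a confidence interval. The sketch below is illustrative and is not part of the original analysis. It recomputes the unrounded means from the sums and counts in Table 5.10, reproduces the pooled t statistic of Table 5.11, and adds a 95 percent interval for the difference using the tabulated two-tail critical value. Like the report, it treats the two measures as independent samples.

```python
import math

n = 278                       # respondents who experienced card fraud
mean_offline = 1199 / n       # unrounded mean from Sum / Count in Table 5.10
mean_online = 1743 / n
var_offline, var_online = 23.472, 15.403

# Pooled variance and standard error for two groups of equal size
pooled_var = ((n - 1) * var_offline + (n - 1) * var_online) / (2 * n - 2)
std_err = math.sqrt(pooled_var * (2 / n))

diff = mean_offline - mean_online
t_stat = diff / std_err

# Two-tail critical value for df = 554 at the 5 percent level, from Table 5.11
T_CRIT_TWO_TAIL = 1.964
ci_low = diff - T_CRIT_TWO_TAIL * std_err
ci_high = diff + T_CRIT_TWO_TAIL * std_err

print(f"means: offline {mean_offline:.2f}, online {mean_online:.2f}")
print(f"t = {t_stat:.3f}, 95% CI for (offline - online): "
      f"({ci_low:.2f}, {ci_high:.2f})")
```

Under these assumptions the whole interval lies below zero, which agrees with the rejection of H04.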
5.5 Test for Question 5
The following analysis has been performed to test the influence of the customers’
satisfaction scores of ‘response time’, ‘the level of advice’, and ‘the level of communication’ on
the overall satisfaction with the credit card fraud resolution team.
Table 5.12: Regression Statistics
Multiple R 0.80
R Square 0.65
Adjusted R Square 0.64
Standard Error 1.04
Observations 278
Table 5.13: ANOVA
Source | df | SS | MS | F | Significance F
Regression | 3 | 536.252 | 178.751 | 166.415 | 0.000
Residual | 274 | 294.310 | 1.074 | |
Total | 277 | 830.561 | | |
Table 5.14: Regression Coefficients
Variable | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95%
Intercept | 1.725 | 0.233 | 7.387 | 0.000 | 1.265 | 2.184
Response Time | 0.246 | 0.058 | 4.271 | 0.000 | 0.133 | 0.360
Level of Advice | 0.152 | 0.122 | 1.251 | 0.212 | -0.087 | 0.392
Level of Communication | 0.244 | 0.123 | 1.985 | 0.048 | 0.002 | 0.487
Statistical Interpretation
From Table 5.13, the regression as a whole is significant (F = 166.415, Significance F =
0.000). Thus, the null hypothesis (H05) is rejected: taken together, the satisfaction scores do
influence the overall satisfaction with the credit card fraud resolution team. From Table 5.14,
the coefficients of ‘response time’ (p-value 0.000) and ‘level of communication’ (p-value
0.048) are significantly different from zero at the 5 percent level of significance. The
coefficient of ‘level of advice’ (p-value 0.212) is not, as its 95 percent confidence interval
(-0.087 to 0.392) includes zero. 65 percent of the variation in overall satisfaction can be
explained by the variables ‘response time’, ‘level of advice’ and ‘level of communication’.
The prediction equation can be given as follows:
Overall satisfaction = 1.725 + (0.246 * Response Time) + (0.152 * Level of Advice) + (0.244 *
Level of Communication)
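The prediction equation and the regression summary can be checked with a few lines of code. The sketch below is illustrative: the function name and the example scores of 5 are assumptions (the report does not state the rating scale), while the coefficients and sums of squares come from Tables 5.13 and 5.14.

```python
def predict_overall_satisfaction(response_time, advice, communication):
    """Apply the fitted equation using the coefficients from Table 5.14."""
    return (1.725
            + 0.246 * response_time
            + 0.152 * advice
            + 0.244 * communication)

# Sums of squares and degrees of freedom from Table 5.13
ss_regression, ss_residual = 536.252, 294.310
df_regression, df_residual = 3, 274

# R Square is the share of total variation explained by the regression
r_squared = ss_regression / (ss_regression + ss_residual)

# F is the ratio of the regression mean square to the residual mean square
f_stat = (ss_regression / df_regression) / (ss_residual / df_residual)

# Hypothetical respondent who scores every item as 5
example = predict_overall_satisfaction(5, 5, 5)

print(f"R Square = {r_squared:.2f}, F = {f_stat:.3f}")
print(f"predicted overall satisfaction for scores (5, 5, 5) = {example:.3f}")
```

The recomputed R Square and F agree with Tables 5.12 and 5.13 to the reported precision.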
Non Statistical Interpretation
About 65 percent of the variation in the overall satisfaction score can be explained by the
scores of response time, level of advice and the level of communication given by the customers.
Since this share is quite high, the company should try to improve these scores in order to
maximize the overall satisfaction of the customers.
6.0 Analysis and Summary of the Results
From the analysis conducted above, the average number of card frauds (online or offline)
does not differ significantly across gender. The average number of online and offline card
frauds also does not differ significantly across different age groups. The average time taken
to resolve a fraud incident in the sample (13.65 hours) is more than 12 hours, so the data do
not support the claim that the average is less than 12 hours. It has also been observed that
the frequency of online card fraud is much higher than that of offline card fraud. Thus, the
company should take suitable measures to increase security and reduce the frequency of card
fraud during online transactions. Further, it has also been observed from the analysis that the
overall satisfaction of the customers is influenced by the satisfaction scores for the response
time, level of advice and level of communication of the card fraud resolution team, taken
together.
7.0 Recommendations
Visa Inc. should take rapid measures on customer security to reduce the frequency of card
fraud that is currently taking place during online transactions. The company should also
improve the response time, level of advice and level of communication of the card fraud
resolution team in order to increase the overall satisfaction of the customers.
REFERENCES
Cacciattolo, M. (2015). Ethical considerations in research. In The Praxis of English Language
Teaching and Learning (PELT) (pp. 61-79). SensePublishers.
Chachi, J., Taheri, S. M., & Viertl, R. (2016). Testing statistical hypotheses based on fuzzy
confidence intervals. Austrian Journal of Statistics, 41(4), 267-286.
Draper, N. R., & Smith, H. (2014). Applied regression analysis. John Wiley & Sons.
Traitler, H., Coleman, B., & Burbidge, A. (2017). Testing the hypotheses. Food Industry R&D:
A New Approach, 227-247.
Wiley, J. F., & Pace, L. A. (2015). Analysis of variance. In Beginning R (pp. 111-120). Apress.