Fundamentals of Data Analytics: Telecommunication Fraud Detection
Added on 2020/03/04
Report
AI Summary
This report provides a comprehensive overview of telecommunication fraud detection using data analytics. It introduces the problem of fraudulent activities in telecommunication systems and proposes data analytics techniques as a solution. The report outlines the aims, objectives, and potential outcomes of detecting telecommunication fraud, emphasizing the importance of safeguarding telecommunication systems and protecting the interests of management, system administrators, and clients. It explores the background of the problem, including various types of fraud such as tumbling fraud, calling card fraud, cloning fraud, and subscriber fraud. The report then delves into the current nature of the problem, discussing the limitations of existing analytics tools and the complexities of the fraud landscape. It details the data analytics scenario and methodology, focusing on the Cross Industry Standard Process for Data Mining (CRISP-DM), SEMMA, and ASUM-DM. The report explains data collection, organization strategies, and data mining methods, including clustering methods and the different phases of CRISP-DM. It also discusses the evaluation and deployment of results, emphasizing the need for continuous monitoring and maintenance to adapt to the evolving nature of fraud. The report concludes with a detailed bibliography of cited sources. This report is a valuable resource for students and professionals interested in data analytics and fraud detection in the telecommunications industry.

Running Head: FUNDAMENTALS OF DATA ANALYTICS
DETECTING TELECOMMUNICATION FRAUD
STUDENT:
INSTITUTION:
DATE:
Detecting telecommunication fraud
Introduction
Fraudulent telecommunication activity is common on unmanaged computer systems. This project
proposal addresses the detection of telecommunication fraud using several established data
analytics techniques. The tools available for detecting potentially fraudulent telecommunication
activity consist of a computer interfaced to receive call information and record it. Operators
within the computer are set up to compare the parameters of the current call with the past usage
pattern of that particular subscriber. The output indicates the areas where there is potential
fraud (Palacios, 2016). The comparison tools use structured data analytics techniques, as
presented in this report.
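The comparison of a current call with a subscriber's past practice can be illustrated with a
minimal sketch. It assumes that call duration is the only parameter compared and that a call is
suspicious when it lies more than three standard deviations from the subscriber's historical
mean; the function names, the threshold, and the sample durations are illustrative assumptions,
not part of any cited system.

```python
from statistics import mean, stdev


def deviation_score(past_durations, current_duration):
    """Return how many standard deviations the current call is from the past mean."""
    if len(past_durations) < 2:
        return 0.0  # too little history to judge the call
    mu = mean(past_durations)
    sigma = stdev(past_durations)
    if sigma == 0:
        # All past calls were identical: any different value is an extreme deviation.
        return 0.0 if current_duration == mu else float("inf")
    return abs(current_duration - mu) / sigma


def is_suspicious(past_durations, current_duration, threshold=3.0):
    """Flag the call when its deviation exceeds the (assumed) threshold."""
    return deviation_score(past_durations, current_duration) > threshold


# Hypothetical call durations in seconds for one subscriber.
history = [120, 95, 110, 130, 105]
print(is_suspicious(history, 115))   # a call close to the usual pattern
print(is_suspicious(history, 3600))  # an hour-long call, far outside the pattern
```

A production system would compare several parameters (destination, time of day, cost) rather than
duration alone.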
Aims, objectives and possible outcomes
Aims
The purpose of detecting telecommunication fraud is to identify key potential fraud areas,
identify instances of fraudulent operations, and act on them to safeguard a telecommunication
system. The intended result is a system that is free from fraud and protects the interests of
management, system administrators, and clients. Many institutions handle large volumes of call
data. A standard organization can record up to 29 calls in a minute. Calls typically include
satisfaction complaints, suggestions, and feedback from customers, dealers, and employees in
their regular operations.
Objectives
The purpose of the report is to enable the users of a telecommunication system to accurately
pinpoint and eliminate fraud as a source of loss to an organization. With large volumes of
communications data, it is difficult to verify whether individual records are genuine. The only
practical way to handle this task is to implement an automated data analytics system. The data
will comprise the past calls of a particular caller, the current call record, and the fraud
results. Data mining tools are used to extract suspected call records from the information
haystack, compare them with the caller's past records, and display the results (Orlaith, 2016).
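The three kinds of data named above (past calls, the current call record, and the fraud result)
could be grouped as in the sketch below. All class names, field names, and sample values are
hypothetical; the report does not specify a schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CallRecord:
    subscriber_id: str
    destination: str
    start_time: str   # ISO 8601 timestamp, kept as text for simplicity
    duration_s: int


@dataclass
class FraudCase:
    subscriber_id: str
    history: List[CallRecord] = field(default_factory=list)  # past calls
    current: Optional[CallRecord] = None                      # call under review
    flagged: bool = False                                     # fraud result
    reason: str = ""


def build_case(subscriber_id, all_records, current):
    """Collect one subscriber's past calls next to the call under review."""
    own = [r for r in all_records if r.subscriber_id == subscriber_id]
    return FraudCase(subscriber_id=subscriber_id, history=own, current=current)


records = [
    CallRecord("S1", "555-0100", "2020-03-01T10:00:00", 120),
    CallRecord("S2", "555-0101", "2020-03-01T11:00:00", 60),
]
case = build_case("S1", records, CallRecord("S1", "555-0199", "2020-03-04T09:30:00", 3600))
print(len(case.history), case.flagged)
```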
Possible outcomes
The resulting outcomes are call records flagged as fraud related. When the data analytics tools
are implemented, there will be a higher likelihood of detecting telecommunication fraud, stopping
cases in progress, and launching a subsequent investigation to safeguard the interests of the
organization. Cost is the primary factor inhibiting implementation. Proper application of data
analytics tools will support the overall goals of the organization. A successful process also
improves the image of an organization with the public and investors (Omar, 2017). A few
financially viable firms in the country have already adopted a fraud detection system, most
reporting positive progress regardless of the accompanying development costs. This is a
significant achievement in growth. All human and robotic fraud attempts are recorded (Salgado,
2016).
Background of the problem
Telecommunication systems comprise both wireless systems and systems that use transmission
lines. Telecom fraud is unauthorized usage in which the user has not paid for the
services. Monitoring systems provide the apparatus and tools to detect the potential usages of
such applications. Access to information is becoming essential in every field of business,
government, and science, with a large increase in the use of wireless systems. Fraudulent usage
of telecommunication systems is also on the rise, causing losses of at least $600 billion a year,
and the high figures have created demand for a system to detect and prevent those activities
(Monde, 2017).
The varieties of fraud include tumbling fraud, in which different IDs are generated to place
different calls. It is easy to commit if pre-call verification of the IDs is not conducted
(Zolotová, 2017). The user identifications remain unassigned and cannot be billed. Calling card
fraud is committed by misappropriating a valid calling card number and using it to make calls
that are billed to the unsuspecting subscriber. Cloning fraud is associated with cellular
systems: a valid customer identification is stolen and cloned into a mobile phone, and the calls
are billed to the subscriber's ID. Tumbling-clone fraud is a hybrid of the tumbling and cloning
scams. Cellular phone calls are placed on successive customers' IDs, all programmed into the
telephone. Tumbling-clone fraud is the hardest of these to detect. Another type is subscriber
fraud, conducted by an otherwise valid customer (Monde, 2017). The customer uses a system without
the intention of paying and continues to do so until access to the service is blocked.
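The pre-call verification that blocks tumbling fraud can be sketched as a lookup against the set
of assigned subscriber IDs before a call is connected. The function names and the sample IDs are
hypothetical, and a real network would query its subscriber database rather than an in-memory
set.

```python
def verify_before_call(subscriber_id, assigned_ids):
    """Accept a call only if the presented ID is assigned to a billable subscriber."""
    return subscriber_id in assigned_ids


def screen_calls(call_attempts, assigned_ids):
    """Split call attempts into accepted and rejected IDs, preserving order."""
    accepted, rejected = [], []
    for sid in call_attempts:
        if verify_before_call(sid, assigned_ids):
            accepted.append(sid)
        else:
            rejected.append(sid)  # unassigned ID: would be unbillable if connected
    return accepted, rejected


assigned = {"ID-1", "ID-2", "ID-3"}
attempts = ["ID-1", "ID-9", "ID-2", "ID-7"]  # ID-9 and ID-7 are generated IDs
accepted, rejected = screen_calls(attempts, assigned)
print(accepted, rejected)
```

This check does not stop cloning fraud, because a cloned ID is a valid, assigned one.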
The current nature of the problem
The analytics tools available include the spectral clustering technique, cross-object
relationships, the Recency, Frequency and Monetary (RFM) method, and Customer Lifetime
Value. System implementation imposes high operating costs on a business, and hence
acquisition is limited to big telecommunication companies. Furthermore, the problem is complex
and needs to be reviewed regularly because the nature of fraud keeps changing. Few
service providers are available in the industry, which drives prices higher. The database nature
of the fraud has also made it difficult to outsource services to external solution providers. The
delicate nature of the business and industry competitiveness call for extreme privacy in internal
issues (Lopez, 2017).
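Of the tools listed in this section, the Recency, Frequency and Monetary (RFM) method is the
simplest to illustrate. The sketch below assumes each call is a (subscriber ID, call date,
charge) tuple and computes days since the last call, number of calls, and total charge per
subscriber; the record layout and sample values are assumptions made for illustration.

```python
from datetime import date


def rfm(calls, today):
    """Return {subscriber_id: (recency_days, frequency, monetary)}."""
    summary = {}
    for sid, day, charge in calls:
        last, freq, money = summary.get(sid, (None, 0, 0.0))
        if last is None or day > last:
            last = day  # keep the most recent call date
        summary[sid] = (last, freq + 1, money + charge)
    return {
        sid: ((today - last).days, freq, round(money, 2))
        for sid, (last, freq, money) in summary.items()
    }


calls = [
    ("A", date(2020, 1, 1), 2.5),
    ("A", date(2020, 1, 10), 1.5),
    ("B", date(2020, 1, 5), 10.0),
]
print(rfm(calls, today=date(2020, 1, 15)))
```

A sudden jump in frequency or monetary value against a subscriber's own earlier figures is the
kind of change a fraud analyst would then inspect.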
Data analytics scenario and methodology
Formulation of the problem and data mining techniques
The Cross Industry Standard Process for Data Mining (CRISP-DM) is the leading approach used
in data mining because of its effectiveness. Another methodology for conducting the process is
SEMMA. The Analytics Solution Unified Method for Data Mining (ASUM-DM) was released more
recently to refine and extend CRISP-DM. The phases of CRISP-DM are business understanding, data
understanding, data preparation, modeling, evaluation, and deployment.
In the telecommunication industry, it is hard to predict the next date, time, and duration of
calls by genuine subscribers (Jaratsri, 2017). Subscribers do not have a standard frequency of
making calls, even the biggest customers. To solve the problem, customers are categorized by
their perceived loyalty. The method helps to classify them by their most likely next use. Data
must be collected and analyzed according to type, method of recording, storage format, and any
other possible changes. Clustering is the best method for classifying subscribers’ data. It eases
prediction by analyzing the data and producing a model that will be useful in future decision
making. The analysis is done through predictions for specific categories (Lersel, 2017).
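Clustering of subscribers can be illustrated with a small k-means sketch. The report names
clustering without fixing an algorithm, so k-means is an assumption here, as are the two features
(calls per day, average call duration in seconds), the sample values, and the deterministic
choice of the first k points as starting centroids.

```python
def kmeans(points, k, iterations=20):
    """Plain k-means: returns (labels, centroids) for a list of numeric tuples."""
    centroids = [list(p) for p in points[:k]]  # deterministic start: first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Hypothetical subscribers: (calls per day, average call duration in seconds).
subscribers = [(2, 60), (3, 75), (2, 50), (40, 900), (45, 1000), (38, 950)]
labels, centroids = kmeans(subscribers, k=2)
print(labels)
```

In practice the features would be scaled first, since the duration values otherwise dominate the
distance calculation.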
Data collection and organization strategy
The Cross Industry Standard Process for Data Mining lays down the process for tackling the
fraud problem. The data collected is prepared for the modeling stage. Data collection is followed
by familiarization to identify quality problems and gain insights. Subsets are created to form
hypotheses about hidden information. Data preparation covers all activities concerning the
subscribers that enable construction of a final dataset. It can be done multiple times. The
processes in data preparation cover tabulating past call records, selecting the subscribers'
characteristics, and transforming and then cleaning the data in preparation for modeling.
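The preparation steps described above (selection of characteristics, transformation, and
cleaning) might look like the following sketch. The dictionary keys, the choice to convert
seconds to minutes, and the rules for discarding records are all assumptions for illustration.

```python
def prepare(raw_records):
    """Select the needed fields, convert duration to minutes, and drop bad rows."""
    seen = set()
    cleaned = []
    for rec in raw_records:
        sid = rec.get("subscriber_id")
        start = rec.get("start_time")
        dur = rec.get("duration_s")
        # Cleaning: skip rows with missing fields or impossible durations.
        if not sid or start is None or dur is None or dur < 0:
            continue
        # Cleaning: skip exact repeats of the same subscriber and start time.
        key = (sid, start)
        if key in seen:
            continue
        seen.add(key)
        # Selection and transformation: keep three fields, duration in minutes.
        cleaned.append(
            {"subscriber_id": sid, "start_time": start, "duration_min": dur / 60}
        )
    return cleaned


raw = [
    {"subscriber_id": "A", "start_time": "2020-03-01T10:00:00", "duration_s": 120},
    {"subscriber_id": "A", "start_time": "2020-03-01T10:00:00", "duration_s": 120},
    {"subscriber_id": None, "start_time": "2020-03-01T11:00:00", "duration_s": 30},
    {"subscriber_id": "B", "start_time": "2020-03-01T12:00:00", "duration_s": -5},
    {"subscriber_id": "C", "start_time": "2020-03-01T13:00:00", "duration_s": 90},
]
print(prepare(raw))
```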
In the modeling phase, several techniques are available (Lersel, 2017). The parameters are
tuned to optimal values. All the available data mining methods address the same problem of
fraudulent users in telecommunication and appear to have similar data requirements.
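Tuning a parameter to an optimal value can be sketched as a small grid search. The example
assumes the model produces an anomaly score per call, that labelled examples are available, and
that the F1 score is the quality measure; none of these choices comes from the cited sources.

```python
def f1_at(threshold, scores, labels):
    """F1 score when every call scoring above the threshold is flagged as fraud."""
    tp = sum(1 for s, y in zip(scores, labels) if s > threshold and y)
    fp = sum(1 for s, y in zip(scores, labels) if s > threshold and not y)
    fn = sum(1 for s, y in zip(scores, labels) if s <= threshold and y)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def tune_threshold(scores, labels, candidates):
    """Return the candidate threshold with the highest F1 score."""
    return max(candidates, key=lambda t: f1_at(t, scores, labels))


# Hypothetical anomaly scores and ground-truth fraud labels.
scores = [0.2, 0.5, 1.1, 2.8, 3.5, 4.2]
labels = [False, False, False, False, True, True]
best = tune_threshold(scores, labels, candidates=[1.0, 2.0, 3.0, 4.0])
print(best)  # 3.0 separates the two fraud cases from the rest
```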
Data mining methods
The Cross Industry Standard Process for Data Mining (CRISP-DM)
It has six phases, described below, which do not follow a strict one-way order. The process
continues even after a solution is found. The business must be defined in terms of its
objectives. Here, the purpose of the firm is the detection of fraud. The standard decision model
is used as a design plan. The preliminary data collected must be analyzed to identify problems
with its quality. Data subsets are also formed here. Data is cleaned and attributed multiple
times. The final dataset is prepared to be fed into the modeling tools. The modeling methods have
preset standard values with similar and specific data requirements (Itani, 2017).
Models are built for data analysis according to the quality parameter. The steps executed in the
construction of the models must be thoroughly reviewed to achieve telecommunication
objectives. At deployment, data is presented to the administrators according to their
requirements. Report generation, data scoring, and data mining are among the things that should
be considered. The client telecommunication firms are then able to develop an appropriate
business strategy.
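The six CRISP-DM phases and their cyclic, non-linear order can be sketched as an enumeration with
a transition function. The phase names follow the CRISP-DM standard; the rule that a failed
evaluation returns to business understanding is a simplifying assumption made for this sketch.

```python
from enum import Enum


class Phase(Enum):
    BUSINESS_UNDERSTANDING = 1
    DATA_UNDERSTANDING = 2
    DATA_PREPARATION = 3
    MODELING = 4
    EVALUATION = 5
    DEPLOYMENT = 6


ORDER = list(Phase)  # Enum members iterate in definition order


def next_phase(current, evaluation_passed=True):
    """Return the phase that follows `current`; the cycle restarts after deployment."""
    if current is Phase.EVALUATION and not evaluation_passed:
        # Assumed rule: a model that fails review sends the project back to the start.
        return Phase.BUSINESS_UNDERSTANDING
    idx = ORDER.index(current)
    return ORDER[(idx + 1) % len(ORDER)]


print(next_phase(Phase.MODELING))
print(next_phase(Phase.EVALUATION, evaluation_passed=False))
print(next_phase(Phase.DEPLOYMENT))
```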
Sample, Explore, Modify, Model and Assess (SEMMA) model
It is another methodology, supported by statistical and computer intelligence software, that
guides data mining. It is a logical functional tool for organizing data in a generalized manner
(Longjun, 2017). Sampling involves selecting a data set large enough to contain sufficient
information for modeling. The data is partitioned into smaller samples for efficiency. The data
is then explored to understand relationships between the variables and to find anomalies.
Modification selects, creates, and transforms the variables, and the modeling phase creates
models that provide the desired output. Assessment evaluates the results for usefulness and
reliability (Figueiras, 2016). The main criticism of SEMMA is its focus on modeling alone.
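The sampling step of SEMMA, in which the data is partitioned into smaller samples, can be
illustrated with a seeded random split. The 70/30 ratio, the seed, and the function name are
assumptions for illustration only.

```python
import random


def partition(records, train_fraction=0.7, seed=42):
    """Shuffle a copy of the records and split it into training and validation sets."""
    shuffled = list(records)               # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


call_ids = list(range(10))  # stand-ins for call records
train, validation = partition(call_ids)
print(len(train), len(validation))
```

Because fraud cases are rare, a stratified split that keeps the fraud ratio equal in both
partitions is often preferable to a purely random one.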
Analytics Solution Unified Method for Data Mining (ASUM-DM)
It is an extension of CRISP-DM refined by IBM. It covers all the properties of its predecessor
and extends it with functionality for more detailed, smaller sets, which most analysts label as
too complex, reducing its popularity (Hofmann, 2016).
Knowledge Discovery in Databases (KDD)
It describes technologies and methods that help people extract useful information from rapidly
expanding volumes of information.
Evaluation of the results
The assessment of results calls for an evaluation of the data. The mining of results is carried
out in line with the principles of business success. The models are acceptable according to
world standards. Evaluation is not a final step but a continuous process. The next step is to
lay out the solution and then formulate a decision. The model results must be tested on a
simulation platform to assess their accuracy (Darshana, 2017).
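Assessing accuracy on test data can be sketched with a confusion matrix and the measures derived
from it. The function name and the sample predictions are hypothetical; precision and recall are
included alongside accuracy because fraud is rare and accuracy alone can look high even when most
fraud is missed.

```python
def evaluate(predicted, actual):
    """Return confusion-matrix counts and accuracy, precision, and recall."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    tn = sum(1 for p, a in zip(predicted, actual) if not p and not a)
    total = tp + fp + fn + tn
    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical model output and ground truth for eight calls (True = fraud).
predicted = [True, False, True, False, False, False, False, False]
actual = [True, True, False, False, False, False, False, False]
print(evaluate(predicted, actual))
```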
Deployment of results into business
The final phase of the CRISP-DM methodology is the implementation of the decision. It should
be done within the first month to avoid further losses to the business. The business has to
develop appropriate strategies to comply with the analysis solutions (Fahmi, 2017). Continuous
monitoring and maintenance must be carried out because the fraud environment evolves each day.
The business findings must be documented in a final report for presentation to the client. An
expert review of the documentation will eliminate possible errors. Error-free documentation
provides a reference for subsequent analyses and prevents repeating work on areas that have
already been handled, which would create additional costs for the business and waste time.
Records also facilitate tracking of the identified fraud cases (Al-asadi, 2017).
The client will be required to keep in regular contact with the analysts for support when
assistance is needed.
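Continuous monitoring after deployment can be sketched as a check on whether the recent share of
flagged calls has drifted away from the share seen at deployment time. The baseline rate, the 50%
tolerance, and the function name are assumptions; a real deployment would choose its own
indicators.

```python
def needs_review(baseline_rate, recent_flags, tolerance=0.5):
    """Return True when the recent flag rate departs from the baseline by more than
    `tolerance` times the baseline (for example, 0.5 means a 50% relative change)."""
    if not recent_flags:
        return False  # nothing observed yet
    recent_rate = sum(recent_flags) / len(recent_flags)
    return abs(recent_rate - baseline_rate) > tolerance * baseline_rate


# Hypothetical: 2% of calls were flagged at deployment time.
stable_week = [1] * 2 + [0] * 98   # still 2% flagged
shifted_week = [1] * 5 + [0] * 95  # 5% flagged: the fraud pattern may have changed
print(needs_review(0.02, stable_week))
print(needs_review(0.02, shifted_week))
```

A positive result would prompt the analysts to revisit the earlier CRISP-DM phases and retrain or
retune the model.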
Bibliography
Al-asadi, T., 2017. A Survey on Web Mining Techniques and Applications. International
Journal of Advanced Science, Engineering and Information Technology, 7(4), pp. 1178-1184.
Darshana, P., 2017. Privacy-Preserving Associative Classification. Cham, Springer.
Fahmi, N., 2017. Fuzzy Logic for an Implementation Environment Health Monitoring System
Based on Wireless Sensor Network. Journal of Telecommunication, Electronic, and Computer
Engineering, 2(4), pp. 119-122.
Figueiras, P., 2016. Big Data Harmonization for Intelligent Mobility: On the Move to
Meaningful Internet Systems. Cham, Springer.
Hofmann, C., 2016. A Two-Layer Method for Sedentary Behaviors Classification Using
Smartphones. Tokyo, Springer.
Itani, N., 2017. Link Mining Process. Journal of Technology and Science, 7(149), pp. 254-261.
Jaratsri, R., 2017. Data Mining Techniques for Predicting. Journal of Telecommunication,
Electronic, and Computer Engineering, 2(4), pp. 95-99.
Lersel, V., 2017. Going concern decision prediction using predictive analytics. Analytics, 1(9),
pp. 43-44.
Longjun, Z., 2017. Privacy-Preserving Data Mining on Big Data Computing Platform: Trends
and Future. Cham, Springer.
Lopez, J., 2017. Application of Data Mining Algorithms to Classify Biological Data. Cham,
Springer.
Monde, A., 2017. Application of Data Mining techniques to identify the significant patterns.
Stellenbosch, Stellenbosch University.
Omar, N., 2017. Home-Based Intrusion Detection System. Journal of Telecommunication,
Electronic, and Computer Engineering, 2(4), pp. 107-111.
Orlaith, M., 2016. Predicting Intake of Applications for First Registration in the Property
Registration Authority. Dublin Institute of Technology, 4(17), pp. 133-139.
Palacios, H., 2016. A comparative between CRISP-DM and SEMMA. Journal of Technology,
3(9), pp. 1-93.
Salgado, R., 2016. Data mining and cluster organisations. Database Systems, 7(4), pp. 1-59.
Zolotová, I., 2017. Data mining in cloud usage data with Matlab's statistics and machine
learning toolbox. London, IEEE.