DATA4000 Case Study: Machine Learning in Telecommunications
Added on 2022/07/28
Case Study
AI Summary
This case study analyzes the application of data analytics, machine learning, and data science within the telecommunications industry and fire departments. The study examines the use of classification algorithms, particularly logistic regression, for churn prediction in the telecommunications sector and the potential of predictive modeling using sensor data for fire risk assessment. The analysis includes a discussion on the types of analytics used, challenges faced, and recommendations for stakeholders. The document also explores the role of business analytics, the importance of sourcing analytics professionals, and the different phases of business analytics, including descriptive, predictive, and prescriptive analytics. The provided data is a CSV file that is used to showcase how machine learning is implemented to derive insights for business decisions.

Topic
Name of Student
Name of Professor
Name of School
State and City
Date

Introduction
1. Artificial intelligence, big data and machine learning dominate modern analytics. Machine
learning, a subset of artificial intelligence that is commonly applied to big data, handles some
of the most demanding problems in the field. This study focuses on machine learning, its purpose
and the solution it offers to a selected industry. We begin with the industry to which the
analytics will be applied, then turn to the machine learning algorithms and the analysis software
to be used. The industry of focus is telecommunications, which is made up of technology
companies that enable communication and connectivity between individuals across different
networks. Most of these companies have a data analytics department that analyses data sets of
varying designs and formats in search of insights. The department exists to provide better
insight into the activities, trends and traffic of the company and, by extension, of the industry
at large (Alpaydin, E., 2020). The industry as a whole, however, is too large to analyse
directly. This study therefore focuses on a single telecommunication company and treats its
results as a reflection of the wider industry, from which inferences can be extrapolated.
A telecommunication data analytics department typically wants to understand four things: how
sales traffic is monetized, the rate at which new customers are acquired, how well existing and
new customers are retained, and how many existing customers are lost. Most telecommunication
companies are privately owned, so profit is a major concern. The focus here is therefore on
customer behaviour, since the way customers consume the company's product largely determines the
profit it earns. Retaining as many customers as possible is essential to growing profit margins
over time, so the department will classify existing customers into two categories: those who are
loyal to the company and those who would churn, that is, move to a competing company, if the
promised services were disrupted or their expectations were not met (Bell, J., 2020).
Because there are only two categories, the analysis calls for a classification algorithm. The
data set contains several variables, some relevant and some irrelevant. A linear regression will
first be used to identify the variables that are highly statistically significant for
classification. Logistic regression will then be used as the classification algorithm on those
variables (Brunton, S.L., Noack, B.R. and Koumoutsakos, P., 2020).
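To make the idea of logistic regression for churn concrete, the following is a minimal sketch
written with the Python standard library only. It is not the analysis carried out in this study.
The two features (support calls and tenure in years), the eight customers and the training
settings are all hypothetical values chosen for illustration, and in practice an established
library would normally be used instead of hand-written gradient descent.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, features):
    """Estimated probability that a customer with these features churns."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(score)

def fit_logistic_regression(rows, labels, learning_rate=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, label in zip(rows, labels):
            error = predict_proba(weights, bias, features) - label
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias

# Hypothetical customers: [support_calls, tenure_years]; label 1 = churned, 0 = loyal.
rows = [
    [0.0, 5.0], [1.0, 4.0], [1.0, 6.0], [2.0, 5.0],
    [4.0, 1.0], [5.0, 0.5], [6.0, 1.0], [5.0, 2.0],
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

weights, bias = fit_logistic_regression(rows, labels)

# Score two new, equally hypothetical customers.
likely_churner = predict_proba(weights, bias, [6.0, 0.5])
likely_loyal = predict_proba(weights, bias, [0.0, 6.0])
print(f"many calls, short tenure: {likely_churner:.2f}")
print(f"no calls, long tenure:    {likely_loyal:.2f}")
```

The model outputs a probability rather than a hard label, so the company can choose its own
cut-off for deciding which customers to treat as at risk of churning.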

The main challenge of using logistic regression to identify which customers will churn and which
will not is that accuracy may fall depending on the set of variables chosen for the model.
The logistic regression classification will help stakeholders such as designers, promoters and
the sales department decide how to handle customers who are more likely to churn, so that they
can be retained and profits can grow in the future (Carrasquilla, J., 2020).
2. The Amsterdam fire department, on the other hand, should also rely heavily on machine
learning and data science. In this case, however, a shared vocabulary needs to be developed so
that firefighters can understand the data that is collected. If the data cannot be easily
understood, there is little point in using it to help put out the fires being handled at any one
time. The fire engines should be fitted with sensors to help establish the extent of the damage
in the specific areas hit by fire.
Predictive models such as a support vector machine could be used to assess the risks in specific
locations where crews are putting out fires. It is important to note that fire spreads very fast.
If data science is to be used while a fire is being fought, highly skilled officials are needed
to produce data quickly enough to assess the damage at hand and help combat the fire as soon as
possible. This is because the data arriving at any moment may not match the existing analysis.
Role of Business Analytics
The data set used in this case is a CSV file. The response variable is the churn column, a
binary categorical column whose entries are either yes or no. Because the response is binary,
the business analytics problem is clearly one for classification algorithms: what needs to be
known is the percentage of customers that falls into each category of the churn column. Many
classification algorithms could be applied to the response variable. Those employed here include
logistic regression, which is the simplest and easiest to work with, and the decision tree. An
advantage of both algorithms is that they are developed on a split of the data set. The main data
set is divided in two, and the first part, the training data set, is used to develop either
algorithm on reliable variables so that it adopts the characteristics of the data (Hey, T.,
Butler, K., Jackson, S. and Thiyagalingam, J., 2020). Once the relevant characteristics have
been adopted, the second part, the test data set, is supplied to the algorithm that was
developed on the training data set. This
action is intended to produce predictions with the same characteristics and results as those the
algorithm encountered during its development on the training data set. Because the training and
test data sets are drawn from the same main data set, they share the same variables and
therefore similar characteristics, which is what makes the resulting predictions reliable (Noé,
F., De Fabritiis, G. and Clementi, C., 2020).
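The train and test split described above can be sketched as follows, using only the Python
standard library. The data is synthetic and hypothetical (one feature, the number of support
calls, with churn defined by a fixed rule), the 75/25 split and the seed are arbitrary choices,
and the model is a one-level decision tree (a single threshold), used here only because it is the
simplest possible stand-in for the decision tree mentioned in the text.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and cut it into a training part and a test part."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def fit_stump(train_rows):
    """Pick the threshold that misclassifies the fewest training rows.

    The rule is: predict churn when the feature value is at or above the threshold.
    """
    candidates = sorted({x for x, _ in train_rows})

    def errors(threshold):
        return sum((x >= threshold) != bool(y) for x, y in train_rows)

    return min(candidates, key=errors)

def accuracy(threshold, rows):
    """Share of rows for which the threshold rule matches the recorded label."""
    correct = sum((x >= threshold) == bool(y) for x, y in rows)
    return correct / len(rows)

# Synthetic, hypothetical customers: (support_calls, churned).
# Churn is set by a fixed rule so the example stays small and noise-free.
customers = [(i % 10, 1 if i % 10 >= 5 else 0) for i in range(100)]

train_rows, test_rows = train_test_split(customers)
threshold = fit_stump(train_rows)          # learned from the training set only
train_accuracy = accuracy(threshold, train_rows)
test_accuracy = accuracy(threshold, test_rows)  # checked on rows the model never saw

print(f"threshold: {threshold}")
print(f"train accuracy: {train_accuracy:.2f}, test accuracy: {test_accuracy:.2f}")
```

The key design point is that the threshold is chosen from the training rows alone, so the
accuracy on the test rows is an honest estimate of how the rule behaves on unseen customers.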
The classification produced by either algorithm can therefore help the stakeholders responsible
for decision making to make viable decisions based on the percentage of customers that are loyal
and the percentage that are not. This, in turn, helps the different departments of a company,
and therefore the industry, to act accordingly to win customers over to the services and
products on offer, which eventually leads to improved profits over time (Virmani, C., Choudhary,
T., Pillai, A. and Rani, M., 2020). A classification problem that is used across learning
institutions for practice in understanding what classification actually is, is the Iris data
set. The Iris data set contains measurements of iris flowers in three categories: Setosa,
Versicolor and Virginica. The attributes are sepal length, sepal width, petal length and petal
width.
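As a small illustration of multi-class classification on Iris-style data, the sketch below
assigns a flower to the species whose average measurements are closest to it (a nearest-centroid
rule). This technique is an assumption made for brevity and is not one of the algorithms
discussed in this study. The six rows are a handful of illustrative measurements in the format of
the Iris data, not the full data set.

```python
import math

# Illustrative rows in the Iris format, in centimetres:
# (sepal length, sepal width, petal length, petal width)
samples = {
    "setosa": [(5.1, 3.5, 1.4, 0.2), (4.9, 3.0, 1.4, 0.2)],
    "versicolor": [(7.0, 3.2, 4.7, 1.4), (6.4, 3.2, 4.5, 1.5)],
    "virginica": [(6.3, 3.3, 6.0, 2.5), (5.8, 2.7, 5.1, 1.9)],
}

def centroid(rows):
    """Column-wise mean of a list of equally long measurement tuples."""
    return tuple(sum(column) / len(rows) for column in zip(*rows))

centroids = {species: centroid(rows) for species, rows in samples.items()}

def classify(flower):
    """Return the species whose centroid is nearest in Euclidean distance."""
    return min(centroids, key=lambda species: math.dist(flower, centroids[species]))

print(classify((5.0, 3.4, 1.5, 0.2)))
print(classify((6.0, 2.9, 4.5, 1.5)))
print(classify((6.5, 3.0, 5.8, 2.2)))
```

Unlike the churn problem, which has two categories, this example has three, but the workflow is
the same: learn from labelled rows, then assign a category to a new row.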
Sourcing Analytics Professionals
The problem statements addressed in this analytics research call for different types of
analytics. Because the data is a CSV file, a descriptive summary of every variable in the data
set is needed first. This includes measures of central tendency such as the mean, median and
mode, together with the standard deviation and a measure of skewness. In addition, the data set
must be checked for missing data points and then cleaned, either by filling in the missing
values or by deleting the rows that contain them.
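The descriptive step can be sketched with the Python standard library as shown below. The column
name and its values are hypothetical, missing entries are represented as None, and filling with
the median is only one of several reasonable choices.

```python
import statistics

# Hypothetical column from the CSV file; None marks a missing data point.
monthly_charges = [20.0, 30.0, None, 30.0, 50.0, None, 70.0]

observed = [value for value in monthly_charges if value is not None]
missing_count = len(monthly_charges) - len(observed)

def skewness(values):
    """Population skewness: third central moment divided by the variance to the power 1.5."""
    mean = statistics.mean(values)
    m2 = sum((v - mean) ** 2 for v in values) / len(values)
    m3 = sum((v - mean) ** 3 for v in values) / len(values)
    return m3 / m2 ** 1.5

summary = {
    "mean": statistics.mean(observed),
    "median": statistics.median(observed),
    "mode": statistics.mode(observed),
    "stdev": statistics.stdev(observed),  # sample standard deviation
    "skewness": skewness(observed),
    "missing": missing_count,
}

# Two cleaning options: fill the gaps with the median, or drop the incomplete rows.
filled = [summary["median"] if value is None else value for value in monthly_charges]
dropped = list(observed)

for name, value in summary.items():
    print(f"{name}: {value}")
```

Filling keeps every row at the cost of inventing values, while dropping keeps only real values at
the cost of a smaller data set, so the choice depends on how much data is missing.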
The second and most important type is predictive analytics, which is where the decision tree or
logistic regression algorithm comes in. The original data set is split into a training set and a
test set because the training set is used to build the classification algorithm (in this case
the decision tree or logistic regression), whereas the test set, as the name suggests, is used to
test the validity of the algorithm built on the training set. The test set is therefore used to
make predictions and so to confirm the predictive analytics (Watt, J., Borhani, R. and
Katsaggelos, A., 2020).
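One common way to test the validity of a classifier on the test set is to compare its
predictions with the recorded labels and count the four possible outcomes. The sketch below does
this for a hypothetical set of eight test customers. The label and prediction lists are made up
for illustration, and the choice of accuracy, precision and recall as the measures is an
assumption, since the study does not name specific metrics.

```python
def confusion_counts(y_true, y_pred):
    """Count true positives, false positives, false negatives and true negatives.

    A label of 1 means the customer churned, 0 means the customer stayed.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    return tp, fp, fn, tn

def evaluate(y_true, y_pred):
    """Summarize test-set performance as accuracy, precision and recall."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Hypothetical test-set labels and the model's predictions for the same customers.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

counts = confusion_counts(y_true, y_pred)
metrics = evaluate(y_true, y_pred)
print(counts)
print(metrics)
```

For churn, recall is often the measure of most interest, because a churner the model misses is a
customer the company has no chance to retain.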
Business analytics has different phases, and these are set out in the remainder of this
discussion. To start with, as already noted, descriptive business analytics gives a summarized
numerical representation of the entire data set. From it, one can see the measures of central
tendency and the skewness of the data set. Once the major distribution of the data set has been
understood, predictive analytics can follow, which in this problem statement is done through
classification algorithms. To name a few, these include logistic regression, the decision tree
and the random forest. The algorithms are first trained on the training data set and then
validated on the test data set, from which predictions are made, hence the name predictive
analytics. The final phase of the three is prescriptive business analytics. Prescriptive
analytics is the stage at which the results of descriptive and predictive analytics are used to
influence the business decisions made by the relevant management of an industry or organization.
It is a very sensitive phase because this is where decisions are made, and all of those
decisions depend on the results of the first two phases. Before decisions are made and
implemented, prescriptive analytics quantifies the effects of possible future decisions so that
they can be adjusted and advised on. It combines several techniques, including business rules,
algorithms, machine learning processes and computational modelling procedures. Because it is the
final phase of business analytics, it can be delayed if the results are deemed untrustworthy.
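The idea of quantifying the effect of a decision before making it can be illustrated with a
simple business rule: offer a retention discount only when the expected gain is positive. The
sketch below is an assumption-laden example and not a method taken from this study. The customer
value, offer cost, offer success rate and churn probabilities are all hypothetical figures.

```python
def expected_gain(churn_probability, customer_value, offer_cost, offer_success_rate):
    """Expected profit from making a retention offer to one customer.

    The offer only pays off if the customer would have churned and the offer works.
    """
    return churn_probability * offer_success_rate * customer_value - offer_cost

def recommend_offers(customers, customer_value=500.0, offer_cost=50.0,
                     offer_success_rate=0.3):
    """Return the customers for whom the offer has a positive expected gain."""
    return [
        name
        for name, churn_probability in customers
        if expected_gain(churn_probability, customer_value,
                         offer_cost, offer_success_rate) > 0
    ]

# Hypothetical output of the predictive phase: (customer, predicted churn probability).
customers = [("A", 0.8), ("B", 0.1), ("C", 0.4)]

targets = recommend_offers(customers)
print(targets)
```

With these figures the break-even churn probability is 50 / (0.3 * 500), roughly 0.33, so the
rule turns a predicted probability into a concrete recommendation for management.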
The business analytics tasks described in this report require specialized, well-defined skills
that people in other roles do not have. The roles that fit these tasks are therefore the data
scientist, the data visualization analyst and the analytics translator. A data scientist can
give the best results in descriptive analytics, then provide suitable machine learning
algorithms for predictive analytics, and later help management decide on the basis of the
results that have been deduced. Analytics translators, likewise, help interpret the data
analysed by the data scientist. The data scientist and the analytics translator need to
coordinate when results are presented so that they are conveyed to management in a way that is
easier to understand. Every data analytics department therefore needs a mix of skill sets,
because each has its own specialized area of action. The business analyst, on the other hand, is
more of a data analyst and therefore has limited skills in handling machine learning algorithms.
Conclusion
It is clear that data science calls for more than one specialist or job description, because its
several roles need different people to fulfil them. Data science now accounts for a larger share
of decision making in business analytics, the profession itself is growing, and all businesses
should embrace its use.

References
Alpaydin, E., 2020. Introduction to machine learning. MIT Press.
Bell, J., 2020. Machine learning: hands-on for developers and technical professionals. John
Wiley & Sons.
Brunton, S.L., Noack, B.R. and Koumoutsakos, P., 2020. Machine learning for fluid
mechanics. Annual Review of Fluid Mechanics, 52, pp.477-508.
Carrasquilla, J., 2020. Machine Learning for Quantum Matter. arXiv preprint arXiv:2003.11040.
Hey, T., Butler, K., Jackson, S. and Thiyagalingam, J., 2020. Machine learning and big scientific
data. Philosophical Transactions of the Royal Society A, 378(2166), p.20190054.
Noé, F., De Fabritiis, G. and Clementi, C., 2020. Machine learning for protein folding and
dynamics. Current Opinion in Structural Biology, 60, pp.77-84.
Virmani, C., Choudhary, T., Pillai, A. and Rani, M., 2020. Applications of Machine Learning in
Cyber Security. In Handbook of Research on Machine and Deep Learning Applications for
Cyber Security (pp. 83-103). IGI Global.
Watt, J., Borhani, R. and Katsaggelos, A., 2020. Machine learning refined: foundations,
algorithms, and applications. Cambridge University Press.