Big Data and Machine Learning

   

Added on  2022-07-28

Topic
Name of Student
Name of Professor
Name of School
State and City
Date

Introduction
1. Artificial intelligence, big data and machine learning dominate current technology, and machine learning does much of the heavy lifting among them. Machine learning is a subset of artificial intelligence, and it depends on big data for the volume of examples it learns from. This study focuses on machine learning: its purpose and the solution it offers to a selected industry. To set the scene, we first look at the industry to which the analytics will be applied, the machine learning algorithms that will be used, and the analysis software that will run them.

The industry of focus is telecommunications. It is made up of technology companies that enable communication and connectivity between individuals across different networks. Every company has a data analytics department that analyses data sets of varying design and format in search of insights. The department exists to provide better insight into the trends in the activity and traffic of the company, and therefore of the industry at large (Alpaydin, E., 2020). The difficulty is that the industry as a whole is too large to examine in a single analysis. The study will therefore focus on one telecommunication company, and its results will be treated as a reflection of the entire industry, so that inferences and interpretations can be extrapolated.

A telecommunication company's data analytics department has several analytic goals. These include understanding the monetization of sales traffic, the rate at which new customers are acquired, the retention of existing and new customers, and the loss of existing customers. Most telecommunication companies are privately owned, so profit is a major concern. The focus will therefore be on customer behaviour as customers consume the company's product, because that behaviour tends to determine the level of profit the company earns. There is pressure to retain as many customers as possible so that profit margins grow over time. For this reason, the data analytics department will classify existing customers into two categories: those who are loyal to the company, and those who would churn if there were any hiccup in the services the company promised them (Bell, J., 2020).
Only two categories of customer are considered: those who are loyal and those who are not (customers who would move to another telecommunication company if their service expectations were not met). Because the outcome has two classes, the analysis will use classification algorithms. The data set contains several variables, some relevant and some irrelevant. This mix calls for a linear regression algorithm, which will be used here to identify the variables that are highly statistically significant for classification. Once those variables have been identified, the classification itself will be carried out with logistic regression (Brunton, S.L., Noack, B.R. and Koumoutsakos, P., 2020).
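
The two-step approach described above (screen the variables with linear regression, then classify with logistic regression) can be sketched in plain Python. This is only an illustration under stated assumptions: the column names, the toy values, the 5% significance threshold and the gradient-descent settings are all hypothetical and do not come from the study's dataset. Regressing a yes/no outcome on each variable is used here as a rough screen, in keeping with the text.

```python
import math

# Toy data: every column name and value is hypothetical.
columns = {
    "support_calls": [0, 1, 0, 2, 1, 4, 5, 3, 6, 4],
    "tenure_years":  [5, 6, 4, 7, 5, 1, 2, 3, 1, 2],
    "random_code":   [3, 1, 4, 1, 5, 2, 4, 1, 5, 3],
}
churn = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]  # 1 = churned, 0 = stayed


def slope_t_stat(x, y):
    """t-statistic of the slope in a simple linear regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return slope / std_err


# Assumed cut-off: two-sided 5% critical value of Student's t with n - 2 = 8 df.
T_CRITICAL = 2.306

# Step 1: keep only the variables whose slope is statistically significant.
selected = [name for name, values in columns.items()
            if abs(slope_t_stat(values, churn)) > T_CRITICAL]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit logistic regression by batch gradient descent on the log loss."""
    n = len(rows)
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for row, label in zip(rows, labels):
            p = sigmoid(bias + sum(w * v for w, v in zip(weights, row)))
            for j, v in enumerate(row):
                grad_w[j] += (p - label) * v
            grad_b += p - label
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Step 2: classify churn using only the selected variables.
rows = [[columns[name][i] for name in selected] for i in range(len(churn))]
weights, bias = fit_logistic(rows, churn)
predictions = [
    1 if sigmoid(bias + sum(w * v for w, v in zip(weights, row))) >= 0.5 else 0
    for row in rows
]
```

On this toy data the screen keeps the two variables that actually differ between loyal and churning customers and drops the unrelated one. In practice a library implementation and a held-out test set would be preferable to fitting and predicting on the same rows.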

The main challenge of using logistic regression to meet the business objective of knowing which customers will churn and which will not is that accuracy may fall depending on the set of variables chosen for the model.
The logistic regression classification will help stakeholders such as designers, promoters and the sales department decide how to handle the customers who churn at a higher rate, so that they can be retained and profits can grow in the future (Carrasquilla, J., 2020).
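
Since the concern above is that accuracy changes with the chosen variables, it helps to be explicit about how accuracy would be measured. The sketch below is one possible way to do it, not the study's own evaluation: the yes/no labels are made-up values, and the function names are hypothetical. It also reports recall for the churn class, because a model can score well on accuracy while still missing many churners.

```python
def confusion_counts(actual, predicted, positive="yes"):
    """Count true/false positives and negatives, treating `positive` as churn."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts


def accuracy(counts):
    """Share of all customers that were classified correctly."""
    return (counts["tp"] + counts["tn"]) / sum(counts.values())


def churn_recall(counts):
    """Share of actual churners that the model identified."""
    return counts["tp"] / (counts["tp"] + counts["fn"])


# Made-up labels for eight customers (not taken from the study's data).
actual = ["yes", "no", "no", "yes", "no", "yes", "no", "no"]
predicted = ["yes", "no", "no", "no", "no", "yes", "yes", "no"]

counts = confusion_counts(actual, predicted)
```

Comparing these figures across models built on different variable sets would show how much the choice of variables costs or gains.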
2. The Amsterdam fire department, on the other hand, should also depend largely on machine learning and data science. In this case, though, a shared vocabulary needs to be developed so that firefighters can understand the data that is collected. If the data cannot be easily understood, there is little point in using it to help put out the fires being addressed at any particular time. The fire engines should be fitted with sensors to help establish the extent of the damage in the specific areas hit by fire.
Predictive modelling, for example with a support vector machine, could be used to assess the risks present in specific locations when firefighters are deployed. It is important to note that fire spreads very fast. If data science is to be involved in putting out a fire as it happens, highly skilled officials will be needed to generate data quickly enough to assess the damage at hand and help combat the fire as soon as possible. This is because the data flowing in at a given moment may not match the existing analysis.
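
To make the support vector machine idea concrete, here is a minimal linear SVM trained by sub-gradient descent on the hinge loss. It is a sketch under assumptions: the two features (standardised temperature and smoke density), the readings, the risk labels and the training settings are all hypothetical, and nothing here describes a system the fire department actually uses.

```python
# Hypothetical standardised sensor readings per location:
# (temperature, smoke_density). Labels: +1 = high risk, -1 = low risk.
readings = [
    (1.0, 0.9), (0.8, 1.1), (1.1, 1.0), (0.9, 0.8),
    (-1.0, -0.8), (-0.9, -1.1), (-1.2, -1.0), (-0.8, -0.9),
]
labels = [1, 1, 1, 1, -1, -1, -1, -1]


def score(weights, bias, x):
    """Signed distance-like score; its sign gives the predicted class."""
    return sum(w * v for w, v in zip(weights, x)) + bias


def train_linear_svm(rows, targets, lam=0.01, lr=0.1, epochs=200):
    """Minimise lam/2 * ||w||^2 + mean hinge loss by sub-gradient descent."""
    n = len(rows)
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [lam * w for w in weights]  # gradient of the regulariser
        grad_b = 0.0
        for x, y in zip(rows, targets):
            if y * score(weights, bias, x) < 1:  # inside the margin: hinge active
                for j, v in enumerate(x):
                    grad_w[j] -= y * v / n
                grad_b -= y / n
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


def predict(weights, bias, x):
    return 1 if score(weights, bias, x) >= 0 else -1


weights, bias = train_linear_svm(readings, labels)
train_predictions = [predict(weights, bias, x) for x in readings]
```

A new reading can then be scored with `predict(weights, bias, (1.0, 1.0))`. Because the model is small and fast to evaluate, it suits the situation described above, where data arrives quickly and the assessment has to keep up with it.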
Role of Business Analytics
The dataset to be used is a CSV file. The response variable is the churn column, which is binary: each entry is either yes or no. Because the churn column is binary, classification algorithms will clearly be used to solve the business analytics problem, since what needs to be known is the percentage of customers that falls into each category of the churn column. Many classification algorithms can be applied to the response variable. Those that will be used here because they can give better results include logistic regression, which is the simplest and easiest to manipulate, and the decision tree classification algorithm. The best part of using both logistic regression and the decision tree is that both allow the dataset to be split. The split produces two datasets from the main one. Either algorithm is developed on the first split, termed the training dataset, using reliable variables, so that the model adopts the characteristics of the data (Hey, T., Butler, K., Jackson, S. and Thiyagalingam, J., 2020). Once the relevant characteristics have been adopted, the second split, termed the test dataset, is supplied to the developed algorithm. This
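
The workflow described in this section (read the CSV, look at the share of each churn category, split into training and test sets, develop on one and check on the other) can be sketched as follows. The CSV content, the `support_calls` column, the 25% test fraction and the seed are assumptions made for illustration. The classifier is a one-split decision tree (a decision stump), used here as the simplest stand-in for the decision tree algorithm.

```python
import csv
import io
import random

# Hypothetical stand-in for the churn CSV; the column names are assumptions.
CSV_TEXT = """support_calls,churn
0,no
9,yes
1,no
8,yes
0,no
9,yes
1,no
8,yes
0,no
8,yes
1,no
9,yes
"""


def load_rows(text):
    """Parse the CSV text into (support_calls, churn) pairs."""
    reader = csv.DictReader(io.StringIO(text))
    return [(float(r["support_calls"]), r["churn"]) for r in reader]


def class_shares(rows):
    """Percentage of customers in each churn category."""
    counts = {}
    for _, label in rows:
        counts[label] = counts.get(label, 0) + 1
    return {label: 100.0 * c / len(rows) for label, c in counts.items()}


def train_test_split(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of the rows and cut it into training and test sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def fit_stump(train):
    """One-split decision tree: predict 'yes' when support_calls > threshold."""
    values = sorted({v for v, _ in train})
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(t):
        return sum((v > t) != (label == "yes") for v, label in train)

    return min(candidates, key=errors)


def accuracy(threshold, rows):
    """Share of rows the stump classifies correctly."""
    correct = sum((v > threshold) == (label == "yes") for v, label in rows)
    return correct / len(rows)


rows = load_rows(CSV_TEXT)
shares = class_shares(rows)
train, test = train_test_split(rows)
threshold = fit_stump(train)
test_accuracy = accuracy(threshold, test)
```

The model only ever sees the training rows while it is being fitted, so the accuracy on the test rows is an honest indication of how it would behave on customers it has not seen.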
