
Big Data Mining Process and Application

Write a critique of a journal article on the CRISP-DM model and offer a critical analysis of a 'Big Data' mining problem domain.

   

Added on  2023-04-23

About This Document

This article discusses the process of big data mining and its applications. It critiques the CRISP-DM model and provides a critical analysis of big data mining problems. The article also mentions data mining tools like Apache Mahout and Weka. Subject: Computer Science. Document Type: Article.

Introduction
Data mining is the process by which large datasets are sorted to establish relationships
and identify patterns. Computer scientists have used data mining tools widely to predict future
trends. The term gained currency in the 1990s and won wider acceptance in 1996, when
organizations began to recognize the value of data mining. In that year, the CRISP-DM model
was introduced by four industry leaders, which encouraged broader adoption of data mining.
“The CRISP-DM Model: The New Blueprint for Data Mining” is an article by Colin Shearer.
The article organizes the data mining process into six phases. This paper has two parts. The
first part gives a critique of Shearer’s article as it applies to data mining, including a
review of some related articles. The second part presents a critical analysis of a big data
mining problem domain.
Overview of the CRISP-DM model article by Colin Shearer
Shearer’s article describes six phases of data mining. The first is the business
understanding phase, which requires understanding the project requirements from the
organization’s point of view and converting them into a data mining problem definition. In this
phase, the team assesses the situation and produces the project plan. Data understanding, the
second phase, starts with initial data collection and proceeds to familiarization with the
organization’s data in order to discover initial insights and identify data quality problems.
Data preparation, the third phase, covers all the activities that construct the final dataset.
The fourth phase is the modelling phase, in which various modelling techniques are selected
and their parameters are calibrated to optimal values. The fifth phase is the evaluation phase,
which reviews the model’s construction. The author states that the role of this phase is to
determine critically whether the essential business issues have been adequately considered.
The last phase is deployment, which may consist of generating a report or of implementing the
data mining process across the organization (Shearer, 2000).
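The six phases described above can be sketched as a simple ordered pipeline. This is only an illustrative sketch in Python: the names `Phase`, `run_phases`, and the `handlers` mapping are assumptions introduced here, not part of Shearer's article or of any CRISP-DM tooling, and the example data is made up.

```python
from enum import Enum


class Phase(Enum):
    """The six CRISP-DM phases, in the order the article lists them."""
    BUSINESS_UNDERSTANDING = 1
    DATA_UNDERSTANDING = 2
    DATA_PREPARATION = 3
    MODELING = 4
    EVALUATION = 5
    DEPLOYMENT = 6


def run_phases(handlers, context=None):
    """Run one pass through the phases in order.

    `handlers` is a hypothetical mapping from Phase to a function that
    takes a context dict and returns an updated one. Phases without a
    handler are recorded as visited but otherwise do nothing.
    """
    context = dict(context or {})
    completed = []
    for phase in Phase:  # Enum iteration follows definition order
        handler = handlers.get(phase)
        if handler is not None:
            context = handler(context)
        completed.append(phase)
    context["completed"] = completed
    return context


# Hypothetical handlers: state a goal, then drop missing values.
handlers = {
    Phase.BUSINESS_UNDERSTANDING: lambda ctx: {**ctx, "goal": "reduce churn"},
    Phase.DATA_PREPARATION: lambda ctx: {
        **ctx,
        "rows": [r for r in ctx.get("raw", []) if r is not None],
    },
}
result = run_phases(handlers, {"raw": [1, None, 3]})
```

A single linear pass like this is a simplification. A fuller sketch would let the evaluation step send the project back to an earlier phase.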
A review of the articles used
The first article used to critique the CRISP-DM model is by James Taylor, one of the
leading experts in analytic technology, whose achievements include building decision support
systems. In his article “Decisions Management Solutions”, Taylor begins with an overview of
the CRISP-DM model and then describes four issues with it: lack of clarity, mindless rework,
blind hand-off to IT, and failure to iterate.
The second article used in this paper is “Data Science methodology for cybersecurity
projects” by Farhad Foroughi and Peter Luksch, which discusses data mining from a
cybersecurity point of view. Specifically, the authors highlight the differences between the
Team Data Science Process (TDSP) and the CRISP-DM model and point out some of the issues
with CRISP-DM. The third article is “What is wrong with CRISP-DM, and is there an
alternative?” by Jen Stirrup. The author begins with an overview of the CRISP-DM and TDSP
processes and concludes by highlighting the issues with the CRISP-DM model.
Critique
To start with, Shearer’s article lacks clarity when compared with current big data mining
practices, according to James Taylor (2017). Today, most companies, including small
businesses, handle complex issues. Shearer does not go into detail about business problems
or how CRISP-DM analysis can help businesses. The team that implements a data mining project
is usually limited to business objectives, project goals and some metrics that measure success.
An appropriate data mining model therefore ought to provide a clear, detailed analysis that can
assist in big data mining.
Taylor (2017) goes on to state that when the model lacks clarity, the team has very few
options. Data mining teams often look for new data and new modelling techniques rather than
working with the organization or its business partners to re-evaluate the business issue. Taylor
also notes that IT specialists are not engaged during model development on how the analytical
needs of data mining are to be met, which results in a model that is thrown over the wall to
those who must deploy it. This increases the cost and time of deploying a model that may never
have a business impact. Lastly, the model fails to iterate: it is not kept up to date as
business circumstances change (Taylor, 2017).
Second, Shearer gives too little attention to the fifth phase, the evaluation phase. This
phase also needs to cover quality assurance. In addition, like the TDSP process model, the
CRISP-DM model needs to provide a dynamic framework in which the first phase not only defines
the business idea from the organization’s point of view but also identifies some of the possible
scenarios and evaluates them, ending with a project plan for delivering the solution. The
second phase, which concentrates on data understanding, needs to perform data acquisition as
well, as in the TDSP process. This phase should also include fact-finding and familiarization
with big data; the TDSP process has been clearly outlined by Farhad Foroughi and Peter Luksch
(2018). In addition, the modeling phase also has to be verified against