Analyzing Data Mining with CRISP-DM and SEMMA: A Comprehensive Report


Added on  2023/04/21

AI Summary
This report provides an overview of two prominent data mining methodologies: CRISP-DM and SEMMA. CRISP-DM, or Cross-Industry Standard Process for Data Mining, consists of six phases: business understanding, data understanding, data preparation, modeling, evaluation, and deployment. SEMMA, which stands for Sample, Explore, Modify, Model, and Assess, offers an alternative approach. The report details each phase within both methodologies, highlighting their purpose and function in the data mining process. It concludes by noting that both methodologies are implemented using data mining tools to analyze data, identify issues, and display results in a visualization model.
Data Modelling
Contents
Introduction
CRISP-DM
SEMMA
Conclusion
References
Introduction
Data mining is a process of data analysis in which data collected in a common store, such
as a data warehouse, are examined with algorithms. Its central concept is knowledge
discovery, that is, finding useful patterns in the data. The Cross-Industry Standard Process
for Data Mining (CRISP-DM) was launched in 1996 and comprises six phases. A second
model, SEMMA, comprises five phases, namely Sample, Explore, Modify, Model and
Assess, whose initials form its name. Analytical projects typically follow one of these two
methodologies, CRISP-DM or SEMMA (Wowczko, 2015).
CRISP-DM
CRISP-DM is a data mining methodology and process model that includes six phases,
namely business understanding, data understanding, data preparation, modelling,
evaluation and deployment. The business understanding phase identifies the problem to be
solved (Han, Kamber and Pei, 2012). The data understanding phase collects the data and
identifies data quality issues. The data preparation phase selects attributes and cleanses the
data. The modelling phase applies modelling techniques to the prepared data. The
evaluation phase reviews the model against the business objectives and identifies any
business issues that remain unaddressed, and the deployment phase presents the results in a
form the business can use.
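The data understanding and data preparation phases can be illustrated with a minimal Python
sketch. The records, attribute names and helper functions below are hypothetical and are not
taken from the cited sources; the sketch assumes that a missing value is stored as None and
that incomplete rows are simply dropped, which is only one of several possible cleansing
choices.

```python
# Hypothetical records; None marks a missing value.
records = [
    {"age": 34, "income": 52000, "city": "Leeds"},
    {"age": None, "income": 48000, "city": "York"},
    {"age": 29, "income": None, "city": "Leeds"},
    {"age": 41, "income": 61000, "city": "Hull"},
]


def count_missing(rows):
    """Data understanding: count missing values per attribute."""
    counts = {}
    for row in rows:
        for key, value in row.items():
            counts[key] = counts.get(key, 0) + (value is None)
    return counts


def prepare(rows, attributes):
    """Data preparation: keep the selected attributes, drop incomplete rows."""
    selected = [{key: row[key] for key in attributes} for row in rows]
    return [row for row in selected if all(v is not None for v in row.values())]


issues = count_missing(records)             # quality issues found in the raw data
clean = prepare(records, ["age", "income"])  # attribute selection and cleansing
print(issues)
print(clean)
```

Here the quality check runs before any rows are removed, so the issues found in the data
understanding phase can inform which attributes are selected in the data preparation phase.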
SEMMA
SEMMA stands for Sample, Explore, Modify, Model and Assess. The Sample phase deals with
data partitioning (Shafique and Qaiser, 2014): a portion is extracted from a large dataset
so that the data can be retrieved and processed efficiently. The Explore phase is used for
understanding the data and displays its output as visualizations. The Modify phase
comprises actions such as selecting, creating and transforming variables. The Model phase
applies a modelling technique to the prepared data to produce the output. Finally, the
results are evaluated and reported in the Assess phase.
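The five SEMMA steps can be walked through in a short Python sketch. It is an illustration
under stated assumptions and not the procedure of any particular data mining tool: the
dataset is synthetic, the 30/10 partition is arbitrary, the model is an ordinary
least-squares line, and the final step uses mean absolute error as its measure.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Synthetic (x, y) pairs that roughly follow y = 2x + 1 with small noise.
data = [(x, 2.0 * x + 1.0 + rng.uniform(-0.5, 0.5)) for x in range(40)]

# Sample: partition the data into a training set and a validation set.
rng.shuffle(data)
train, valid = data[:30], data[30:]

# Explore: summary statistics of the training data.
mean_x = statistics.mean(x for x, _ in train)
mean_y = statistics.mean(y for _, y in train)


# Modify: transform the input variable by centering it on its mean.
def center(x):
    return x - mean_x


# Model: fit a least-squares line on the centered variable.
sxx = sum(center(x) ** 2 for x, _ in train)
sxy = sum(center(x) * (y - mean_y) for x, y in train)
slope = sxy / sxx


def predict(x):
    return mean_y + slope * center(x)


# Last step: measure the mean absolute error on the held-out validation set.
mae = statistics.mean(abs(predict(x) - y) for x, y in valid)
print(f"slope={slope:.3f}, validation MAE={mae:.3f}")
```

Holding the validation rows back until the last step is what makes the reported error a
fair measure of the model, since those rows play no part in fitting it.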
Conclusion
Two methodologies, CRISP-DM and SEMMA, are used for the data mining process. Both are
implemented with data mining tools. The data is first analyzed and then checked for
quality issues. Finally, the output is displayed in a visualization model.
References
Han, J., Kamber, M. and Pei, J. (2012). Data mining. Waltham, MA: Morgan
Kaufmann/Elsevier.
Shafique, U. and Qaiser, H. (2014). A Comparative Study of Data Mining Process Models
(KDD, CRISP-DM and SEMMA). International Journal of Innovation and Scientific
Research, 12(1).
Wowczko, I. (2015). A Case Study of Evaluating Job Readiness with Data Mining Tools and
CRISP-DM Methodology. International Journal for Infonomics, 8(3), pp.1066-1070.