MITS4003 Data Mining Report: Practical Machine Learning Techniques

Added on  2023/04/23

AI Summary
This report delves into the realm of data mining, presenting a comprehensive overview of the subject. It begins with an introduction to data mining, explaining its core concepts and processes, including its role in problem-solving and data analysis. The report then explores various applications of data mining, such as market analysis, corporate analysis and risk management, and fraud detection. It discusses the different stages of data mining, from business understanding to deployment, and highlights the six data mining classes: anomaly detection, association rule learning, clustering, classification, regression, and summarization. The report also touches upon the evolution of data mining, its relationship with artificial intelligence and applied statistics, and its application in diverse database types. The report concludes by summarizing the importance of data mining in the context of knowledge discovery in databases (KDD).
Running head: DATA MINING
DATA MINING – PRACTICAL MACHINE LEARNING TOOLS
AND TECHNIQUES
Name of the Student
Student ID
Name of the University
AUTHOR Name
(Ian H. Witten, Eibe Frank, Mark A. Hall & Christopher J. Pal)
Table of Contents
Introduction
About data mining
Applications of data mining
Conclusion
References
Introduction
The aim of this paper is to examine data mining. The report discusses the concepts
applied by the authors of the chosen work and describes the main components of data
mining. Data mining is the process of discovering patterns in data that are stored
electronically. It deals with solving problems by analysing data that are already present
in a database (Maione et al., 2016). The patterns that data mining discovers must be
meaningful, so that acting on them brings some advantage. Data mining is an
interdisciplinary field that aims at extracting information from data held on computers.
About data mining
The term data mining appeared around 1990. Data mining is the analysis step of the
process of knowledge discovery in databases. Apart from analysis, data mining also covers
aspects of database management, modelling and inference considerations. The major
difference between data mining and data analysis is that data analysis tests models and
hypotheses on a data set, for example by analysing the effectiveness of a marketing
campaign, whereas data mining aims to extract knowledge and patterns from large data sets
(Witten et al., 2016). The term is also applied more loosely to large-scale information
processing, which covers the collection, extraction, analysis and statistics of data. The
actual data mining task is the automatic analysis of large data sets to find groups of
data records (cluster analysis), unusual records (anomaly detection) and dependencies
(association rule mining and sequential pattern mining). Early pattern extraction relied
on methods such as regression analysis (Larose, 2015). As data sets have grown in size and
complexity, methods such as cluster analysis, genetic algorithms, decision trees and
support vector machines have become important. Data mining is the process of applying
these methods to uncover hidden patterns in large data sets. In doing so it helps to
narrow the gap between artificial intelligence and applied statistics. The data mining
process is divided into six phases: business understanding, data understanding, data
preparation, modelling, evaluation and deployment.
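The idea of picking out unusual records (anomaly detection) can be illustrated with a
minimal sketch. It assumes a simple z-score rule, in which a value counts as unusual when
it lies more than a chosen number of standard deviations from the mean. The function name
find_anomalies, the threshold and the sample figures are hypothetical and are not taken
from the cited sources.

```python
from statistics import mean, pstdev

def find_anomalies(values, threshold=2.0):
    """Return the values whose z-score magnitude exceeds the threshold.

    Assumption: a plain z-score rule; real systems often use more robust methods.
    """
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values identical, so nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Hypothetical daily sales figures with one record that is clearly out of line.
daily_sales = [10, 12, 11, 13, 12, 11, 95]
print(find_anomalies(daily_sales))
```

A flagged record is only a candidate: it may be a genuine anomaly or a data error, and
needs to be investigated further before any conclusion is drawn.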
The six data mining classes are anomaly detection, association rule learning,
clustering, classification, regression and summarization. Anomaly detection identifies
unusual data records, which may be interesting anomalies or data errors that need to be
investigated further. Association rule learning searches for relationships among the
variables. Clustering discovers groups of records that are similar to each other
(García, Luengo & Herrera, 2015). Classification generalises a known structure so that it
can be applied to new data. Regression finds the function that models the data with the
least error. Summarization produces a more compact representation of the data set,
including report generation. In short, data mining is the extraction of knowledge from
collections of data. The main applications of data mining are market analysis, fraud
detection, production control, science exploration, customer retention and risk
management.
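The regression class described above, finding the function that fits the data with the
least error, can be sketched for the simplest case of a straight line. This is an
illustration under stated assumptions, not a method taken from the cited sources: it uses
ordinary least squares, and the names fit_line and squared_error as well as the sample
data are hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the common factor 1/n cancels).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def squared_error(xs, ys, slope, intercept):
    """Sum of squared differences between observed and predicted values."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data: advertising spend against revenue.
ad_spend = [1, 2, 3, 4, 5]
revenue = [3, 5, 7, 9, 11]
slope, intercept = fit_line(ad_spend, revenue)
print(slope, intercept)
print(squared_error(ad_spend, revenue, slope, intercept))
```

Any other line through the same points gives a larger value of squared_error, which is
what "least error" means in this setting.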
Applications of data mining
Data mining has great importance in the following areas:
Market analysis and management: Data mining supports marketing by building customer
profiles and analysing the products that customers prefer. With the help of data mining,
customer requirements can be identified clearly, and the factors that attract new
customers can be determined (Gamarra, Guerrero & Montero, 2016). Data mining is also
used to determine customer purchasing patterns and to find associations between
product sales. In addition, it helps to identify clusters of model customers who
share the same characteristics.
Corporate analysis and risk management: In the corporate sector, data mining supports
financial planning and asset evaluation, which includes the analysis of cash flow and
the evaluation of assets. It also supports resource planning (Dua & Du, 2016). Risk
management is likewise carried out with the help of data mining, which allows the
activities of competitors to be monitored and the direction of the market to be seen
clearly.
Fraud detection: Data mining plays a major role in detecting fraud in credit card
services and telecommunications. It helps to identify the location of fraudulent
telephone calls, together with their duration and the day on which they were made. It
also helps to analyse patterns that deviate from expected norms.
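The association between product sales mentioned under market analysis is usually
measured with two quantities, support and confidence. The following is a minimal sketch
of both, assuming each purchase is recorded as a set of product names. The function
names and the shopping baskets are hypothetical and serve only as an illustration.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    hits = sum(1 for t in transactions if items <= set(t))
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Share of transactions containing `antecedent` that also contain `consequent`."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # the antecedent never occurs, so the rule cannot be assessed
    return support(transactions, set(antecedent) | set(consequent)) / base

# Hypothetical shopping baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
print(support(baskets, {"bread", "milk"}))       # 2 of the 4 baskets
print(confidence(baskets, {"bread"}, {"milk"}))  # 2 of the 3 baskets with bread
```

A rule such as "customers who buy bread also buy milk" is normally reported only when
both its support and its confidence exceed thresholds chosen by the analyst.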
Data mining focuses on data patterns that can be analysed. There are basically two
types of function involved in data mining: descriptive functions, and classification and
prediction functions. The descriptive functions include concept description, mining of
correlations, mining of clusters and mining of frequent patterns. Classification and
prediction involves finding a model that describes the data classes or concepts. The
main aim is to use this model to predict the class of objects whose class is not yet
known. Data mining can be applied to relational databases, data warehouses, structured
and unstructured databases, and object-oriented databases (Cao, 2015). Data mining is
also known as knowledge discovery in databases (KDD). The KDD process consists of seven
steps. The first is data cleaning, which removes irrelevant data from the database. The
second is data integration, which combines heterogeneous data sources into a single data
unit. The third is data selection, which retrieves from the database the data relevant to
the analysis. The fourth is data transformation, which converts the selected data into a
form suitable for mining. Data mining itself is the fifth step and applies various
techniques to extract data patterns. It is followed by pattern evaluation and, finally,
knowledge representation.
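The seven KDD steps can be pictured as a chain of small functions, each feeding the
next. The sketch below is an illustration only: the function names, the record layout
and the choice of simple frequency counting as the mining step are all assumptions made
for this example, not part of the cited sources.

```python
from collections import Counter

def clean(records):
    """Data cleaning: drop records with missing fields."""
    return [r for r in records if all(v is not None for v in r.values())]

def integrate(*sources):
    """Data integration: combine several sources into a single list."""
    return [r for source in sources for r in source]

def select(records, fields):
    """Data selection: keep only the fields relevant to the analysis."""
    return [{f: r[f] for f in fields} for r in records]

def transform(records):
    """Data transformation: normalise text so that equal values compare equal."""
    return [{k: str(v).strip().lower() for k, v in r.items()} for r in records]

def mine(records, field):
    """Data mining (assumed here to be frequency counting of one field)."""
    return Counter(r[field] for r in records)

def evaluate(counts, min_count):
    """Pattern evaluation: keep patterns seen at least `min_count` times."""
    return {value: n for value, n in counts.items() if n >= min_count}

def present(patterns):
    """Knowledge representation: render the patterns as readable lines."""
    return [f"{value}: {n} purchases" for value, n in sorted(patterns.items())]

# Two hypothetical data sources with inconsistent formatting and a missing value.
store_a = [{"customer": "c1", "product": "Milk "}, {"customer": "c2", "product": None}]
store_b = [{"customer": "c3", "product": "milk"}, {"customer": "c4", "product": "Bread"}]

combined = integrate(clean(store_a), clean(store_b))
transformed = transform(select(combined, ["product"]))
patterns = evaluate(mine(transformed, "product"), min_count=2)
report_lines = present(patterns)
print(report_lines)
```

Keeping each step as a separate function mirrors the staged description above and makes
it possible to replace one stage, for example the mining technique, without touching the
others.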
Conclusion
The above report has described data mining in detail. From the chosen paper it can be
stated that the research covers every crucial point regarding data mining. Data mining
plays an important role in database management systems (DBMS). It is also known as KDD
because it is concerned with knowledge discovery in databases. The paper has described
the applications of data mining and their importance. Every stage of the process is
important for understanding the concept of data mining.
References
Cao, L. (2015). Actionable knowledge discovery and delivery. In Metasynthetic computing and
engineering of complex systems (pp. 287-312). Springer, London.
Dua, S., & Du, X. (2016). Data mining and machine learning in cybersecurity. Auerbach
Publications.
Gamarra, C., Guerrero, J. M., & Montero, E. (2016). A knowledge discovery in databases approach
for industrial microgrid planning. Renewable and Sustainable Energy Reviews, 60, 615-630.
García, S., Luengo, J., & Herrera, F. (2015). Data preprocessing in data mining (pp. 195-243).
Switzerland: Springer International Publishing.
Larose, D. T. (2015). Data mining and predictive analytics. John Wiley & Sons.
Maione, C., Batista, B. L., Campiglia, A. D., Barbosa Jr, F., & Barbosa, R. M. (2016). Classification
of geographic origin of rice by data mining and inductively coupled plasma mass
spectrometry. Computers and Electronics in Agriculture, 121, 101-107.
Witten, I. H., Frank, E., Hall, M. A., & Pal, C. J. (2016). Data mining: Practical machine learning
tools and techniques. Morgan Kaufmann.
Zheng, Y. (2015). Trajectory data mining: an overview. ACM Transactions on Intelligent Systems and
Technology (TIST), 6(3), 29.