
Evaluation Criteria for Data Mining Software

   

Added on  2022-09-18

Running head: EVALUATION CRITERIA FOR DATA MINING SOFTWARE 1
EVALUATION CRITERIA FOR DATA MINING SOFTWARE
Student’s Name
Professor’s Name
Institutional Affiliation
Date Due
Executive summary
Applying data mining algorithms requires powerful software tools. Data mining is a sensitive
undertaking, so selecting and evaluating the available software requires intensive effort. As
technology evolves, many software packages have been modified to meet users' needs, and the
number of available data mining tools continues to rise, which makes it challenging to settle on
one or two tools that satisfy a given user. Statistics is one of the fields that relies heavily on
data mining tools and software. It involves collecting a wide variety of data, which must be
simplified before they can be interpreted. Complex statistical data are synthesized, analyzed, and
interpreted using SAS Enterprise Miner (EM) and the Statistical Package for the Social Sciences
(SPSS). Each of these data mining packages has its own strengths and weaknesses and therefore
needs to be evaluated. The most effective criteria for evaluating data mining software include,
but are not limited to, performance, functionality, usability, and ancillary support. This paper
explains these criteria. Both SAS EM and SPSS rely heavily on decision tree (DT) algorithms to
execute their statistical functions.
Contents
Chapter 1: Introduction
Objectives of the study
Limitations of the study
Chapter 2: Overview of DT induction
Tree pruning
Important DT algorithms
Chapter 3: Evaluation criteria
Functionality
Ancillary task support
Usability
Performance
Chapter 4: Description of the DT induction software
EM features for the data mining process
Chapter 5: Description of SPSS
The core functions of SPSS
The benefits of using SPSS
Chapter 6: Comparative Analysis in terms of Relevant Criteria
Chapter 7: Conclusion
References
Chapter 1: Introduction
Data mining traces its roots to artificial intelligence, statistics, machine learning, and
database research. Advances in these fields have led to the invention of many data mining tools.
Mainframe programs have been developed for statistical analysis, and a variety of standalone,
server-based, and web-based software is now used to make sense of statistical data sets. Data
mining is central to knowledge discovery in databases (KDD). Data mining involves applying
particular algorithms that produce a specific enumeration of models (or patterns) over the data.
KDD is the nontrivial process of identifying novel, valid, potentially useful, and ultimately
understandable patterns in data. The term KDD is often used synonymously with data mining. Many
software tools now support the KDD field, which has transformed and grown. It is therefore
sensible to ask which data mining software is best placed to transform the market.
Meanwhile, it is challenging for business users to settle on the data mining software that best
meets their budget and utility needs. Choosing the wrong data mining software wastes time, money,
and personnel resources and can produce spurious results (Dušanka et al., 2017). To mitigate
these risks, users need to understand the criteria for evaluating data mining software. With
these criteria, they can make an informed decision about the right data mining tool for the
nature of their data. This paper evaluates SAS Enterprise Miner (EM) against the Statistical
Package for the Social Sciences (SPSS). Both packages are evaluated using usability, performance,
functionality, and ancillary support criteria. To solve classification and regression problems,
both rely on decision tree (DT) algorithms. A decision tree is a supervised learning algorithm
that is easy to interpret because of its tree structure. DT algorithms use human-readable
"if ... then ..." rules to extract predictive information (Upadhyay, Pradesh & Verma, 2019).
The DT algorithms discussed in this paper include CART, C4.5, ID3, and CHAID.
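To make the "if ... then ..." rule format concrete, the following minimal Python sketch walks a small hand-built tree and prints one rule per leaf. The tree, its feature names (contains_link, sender_known), and its class labels are invented for illustration; the sketch does not reproduce the output of SAS EM, SPSS, or any of the algorithms named above.

```python
# Hypothetical toy tree: feature names, branch values, and labels are invented.
toy_tree = {
    "feature": "contains_link",
    "branches": {
        "yes": {
            "feature": "sender_known",
            "branches": {
                "yes": {"label": "not spam"},
                "no": {"label": "spam"},
            },
        },
        "no": {"label": "not spam"},
    },
}


def extract_rules(node, conditions=()):
    """Return one 'IF ... THEN ...' string for every leaf below `node`."""
    if "label" in node:  # leaf: the conditions collected so far form the premise
        premise = " AND ".join(conditions) if conditions else "TRUE"
        return [f"IF {premise} THEN class = {node['label']}"]
    rules = []
    for value, child in node["branches"].items():
        test = f"{node['feature']} = {value}"
        rules.extend(extract_rules(child, conditions + (test,)))
    return rules


rules = extract_rules(toy_tree)
for rule in rules:
    print(rule)
```

Each path from the root to a leaf becomes one rule, which is why a tree can be read directly by a person without any further processing.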
Objectives of the study
The main objectives of this paper are:
To describe the DT induction software
To evaluate SAS EM and SPSS using different evaluation criteria
To provide insight into the best data mining software
The decision tree induction software packages covered in this paper are:
Statistical Package for the Social Sciences (SPSS)
SAS Enterprise Miner (EM)
Limitations of the study
The current study highlights only the evaluation criteria for data mining tools. Therefore,
it provides little information about the criteria for selecting data mining software.
The DT induction software discussed in this paper is limited to SAS EM and SPSS; therefore,
little information is provided on other data mining software.
Chapter 2: Overview of DT induction
The decision tree (DT) is one of the most useful supervised learning algorithms. In
supervised learning, the behavior to be predicted is already known and the existing data are
already labeled. In unsupervised learning, by contrast, there are no output variables to guide
the learning process, so algorithms explore the data to find patterns. Business organizations use
DT algorithms to estimate customer lifetime value and churn rates. DT algorithms are also used in
autonomous vehicles, where they help recognize pedestrians (Quirynen, Berntorp & Cairano, 2018).
DT algorithms repeatedly divide data into smaller subsets based on characteristic features until
the subsets are small enough to be described by a single label. They are well suited to
regression problems (where a value is predicted, for instance, a property price) and
classification problems (where data are sorted into classes, for instance, deciding whether an
email is spam). Regression trees are used when the target variable is continuous or quantitative
(for instance, to predict the probability of rainfall), whereas classification trees are used
when the dependent variable is qualitative or categorical. For instance, if a doctor wants to
determine a patient's blood group, a classification tree is the most appropriate. DTs are widely
used in machine learning (ML) and have many real-world applications, which makes them very
important. They have relevant applications in several industries.
DT algorithms are used in medicine for the early detection of cognitive impairment
(Su et al., 2019). They can also predict the possible future development of dementia.
DT algorithms are used to build chatbots, which have revolutionized the healthcare sector
by gathering information from patients through friendly chats. Chatbots have also
transformed the customer care sector. Internet platforms such as Google and Amazon are
acquiring chatbots to help manage their customer care services (Ikedinachi et al., 2019).
DTs are trained to recognize different causes of forest loss from satellite imagery. They
can be used to predict possible causes of forest destruction such as wildfires, large- or
small-scale agriculture, logging of tree plantations, and urbanization (Srivastava et al., 2019).
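The recursive splitting described earlier in this chapter, in which data are divided into smaller subsets until each subset can be described by one label, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used by SAS EM or SPSS: it assumes numeric features, binary splits of the form "feature <= threshold", and Gini impurity as the split criterion (the measure used by CART). The toy email data and the function names are invented for the example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 0.0 when every label in the subset is the same."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def best_split(rows, labels):
    """Return the (feature, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
            right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
            if not left or not right:
                continue  # a split must put at least one row on each side
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (feature, threshold), score
    return best


def build_tree(rows, labels):
    """Split recursively until a subset is pure or no split reduces impurity."""
    split = best_split(rows, labels)
    if split is None:
        # Classification leaf: store the majority label of the subset.
        return {"label": Counter(labels).most_common(1)[0][0]}
    feature, threshold = split
    left = [(r, lab) for r, lab in zip(rows, labels) if r[feature] <= threshold]
    right = [(r, lab) for r, lab in zip(rows, labels) if r[feature] > threshold]
    return {
        "feature": feature,
        "threshold": threshold,
        "left": build_tree([r for r, _ in left], [lab for _, lab in left]),
        "right": build_tree([r for r, _ in right], [lab for _, lab in right]),
    }


def predict(tree, row):
    """Follow the tests from the root down to a leaf and return its label."""
    while "label" not in tree:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["label"]


# Invented toy data: (number of links, number of exclamation marks) per email.
rows = [(0, 0), (1, 0), (0, 1), (5, 3), (6, 1), (4, 4)]
labels = ["ham", "ham", "ham", "spam", "spam", "spam"]
tree = build_tree(rows, labels)
print(tree)
print(predict(tree, (7, 2)))
```

The sketch builds a classification tree because each leaf stores the majority class of its subset. A regression tree follows the same recursion but would typically store the mean of a continuous target in each leaf and choose splits by a variance-based criterion instead of Gini impurity.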