MITS5509: Report on Data Mining Techniques for Intelligent Systems

Added on 2022/09/01

AI Summary

This report analyzes a research paper on data mining techniques applied to intelligent systems, specifically focusing on breast cancer detection. It covers the background of the research, highlighting the significance of breast cancer and the challenges in analyzing related data. It discusses the methods used, including three classifiers: Sequential Minimal Optimization (SMO, a support vector machine trainer), IBK (K-nearest neighbors), and Best First decision trees, along with the Wisconsin data set and the Weka toolkit. The findings section details the classification of the data set, the experimental results, and the evaluation criteria used. The report also addresses the issues and limitations, such as the static nature of the data, and concludes by summarizing the application of data mining techniques and their effectiveness in achieving the research objectives. A list of references is included.

Running head: REPORT FOR DATA MINING TECHNIQUES FOR IS
REPORT
FOR
DATA MINING TECHNIQUES
WITH
IS
Name of the Student
Name of the University
Author Note:

Introduction:
Data mining is one of the fastest-growing technologies and has strongly influenced the
operations of several sensitive industries. It is the practice of using data analytics tools to
analyze a large set of data in order to determine the relationships between the variables present
in that data set. [1]
The primary objective of this report is to analyze a research paper that presents a detailed
study of the application of data mining to breast cancer detection, in order to give a clear
picture of how the technology is applied. The report discusses the methods used in the research
and its findings, then elaborates on the issues highlighted by the researcher. Lastly, it
concludes by stating the limitations of the study.
Background:
Breast cancer is one of the most harmful types of cancer faced by women, and it is reported to
be the second most common cause of cancer death among women. The research focuses primarily on
the urban cities of India in order to assess the significance of breast cancer there. The
significance of the disease is also increasing globally, and it threatens women across the
world. [2]
However, it is very difficult to analyze a large set of data related to this domain. For this
reason, the paper sets out to examine breast cancer with the data mining techniques that are
also used to analyze large sets of banking, retail, and telecommunication data. Recognizing the
importance of this research area, the researcher has focused primarily on data analysis with
effective data mining techniques. [3]
To detect breast cancer, it is first essential to classify the patterns in breast cancer data.
To detect these patterns, the researcher used three classifiers.
Methods:
In the selected paper, the researcher investigates breast cancer with three classifiers:
Sequential Minimal Optimization (SMO), IBK (the K-nearest neighbors classifier), and Best First
(BF) trees. Of these, only the BF tree is a decision tree. SMO is an algorithm for training
support vector machines (SVMs) and is one of the fastest methods for doing so. [4] Its benefits
in this area include effective problem solving and analytical capability. The researcher
describes SMO as one of the fastest linear SVM methods for light data sets; in this research,
however, large sets of data had to be analyzed to get the desired result. [5]
K-nearest neighbors (KNN) classification is another effective technique; it classifies data
based on similarity. The classifier labels a selected point by considering the points nearest to
it, which makes it useful for classifying unknown instances. The technique is reported to be
effective for analyzing large sets of data. However, it also has limitations when a large data
set consists of many different variables, because it relies only on the nearest neighbors. [6]
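The nearest-neighbor idea described above can be illustrated with a minimal sketch. This is not the IBK implementation used in the reviewed paper; the function name `knn_predict`, the choice of Euclidean distance, the value k=3, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (feature_vector, label) pairs.
    Distance is Euclidean (an assumption; other metrics are possible).
    """
    # Sort the training points by distance to the query and keep the k closest.
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # Majority vote over the labels of those neighbors.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Toy data invented for illustration (not taken from the Wisconsin data set).
train = [
    ((1, 1), "benign"),
    ((2, 1), "benign"),
    ((1, 2), "benign"),
    ((8, 9), "malignant"),
    ((9, 8), "malignant"),
    ((9, 9), "malignant"),
]

print(knn_predict(train, (2, 2)))
print(knn_predict(train, (8, 8)))
```

Because every prediction compares the query against all training points, the cost of a single classification grows with the size of the data set, which is one practical reason the technique can struggle on large data with many variables.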
The best-first (BF) tree algorithm is also an effective classification technique. It follows a
divide-and-conquer process and selects nodes for splitting based on their impurity. A benefit of
this technique is that it continues the classification process until every node of the tree is
pure. Node purity is measured with information gain and the Gini index. To conduct the research,
the Wisconsin data set was used. [7]
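The two purity measures named above can be made concrete with a short sketch. The function names (`gini`, `entropy`, `impurity_reduction`) and the toy split are hypothetical and chosen for illustration; the sketch shows only how a candidate split is scored, not the full tree-building procedure from the paper.

```python
import math
from collections import Counter


def gini(labels):
    """Gini index of a node: 0 for a pure node, higher for mixed nodes."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy (in bits) of a node; information gain is based on it."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def impurity_reduction(parent, left, right, measure):
    """Impurity of the parent minus the size-weighted impurity of its children.

    With `measure=entropy` this quantity is the information gain of the split.
    """
    n = len(parent)
    weighted = (len(left) / n) * measure(left) + (len(right) / n) * measure(right)
    return measure(parent) - weighted


# Toy node with 4 benign and 4 malignant instances, split into two children.
parent = ["benign"] * 4 + ["malignant"] * 4
left = ["benign"] * 3 + ["malignant"] * 1
right = ["benign"] * 1 + ["malignant"] * 3

print(impurity_reduction(parent, left, right, gini))     # Gini reduction
print(impurity_reduction(parent, left, right, entropy))  # information gain
```

A tree builder would score every candidate split this way and prefer the one with the largest reduction; a node whose impurity is zero is pure and needs no further splitting.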
The research used the Weka toolkit to run the data mining algorithms mentioned above.
Findings:
The selected data set was obtained from the UC Irvine (UCI) Machine Learning Repository and is
the Wisconsin breast cancer data set. It consists of 699 instances, 9 integer attributes, and 2
classes: malignant and benign. In the experimental results, the researcher first describes the
final data set and then models and classifies it.
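As a rough illustration of the data set's shape, the sketch below parses rows laid out as a sample ID, nine integer attributes, and a class code. It assumes the comma-separated layout of the UCI distribution, in which class code 2 means benign, 4 means malignant, and "?" marks a missing value. The sample rows and the names `CLASS_NAMES` and `parse_row` are invented for illustration and are not taken from the paper.

```python
# Class codes assumed from the UCI distribution of the data set.
CLASS_NAMES = {2: "benign", 4: "malignant"}


def parse_row(line):
    """Parse one comma-separated row into (features, label).

    Layout assumed: ID, nine integer attributes, class code.
    A "?" in an attribute column is treated as a missing value (None).
    """
    fields = line.strip().split(",")
    # fields[0] is the sample ID and is not used as a feature.
    features = [None if value == "?" else int(value) for value in fields[1:10]]
    label = CLASS_NAMES[int(fields[10])]
    return features, label


# Made-up rows in the assumed layout (the IDs are hypothetical).
rows = [
    "1000001,5,1,1,1,2,1,3,1,1,2",
    "1000002,8,10,10,8,7,10,9,7,1,4",
    "1000003,6,8,8,1,3,?,3,7,1,2",
]

for row in rows:
    print(parse_row(row))
```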
The classification was assessed using four evaluation criteria: time to build the model,
correctly classified instances, incorrectly classified instances, and accuracy.
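The four criteria can be computed for any classifier with a few lines of code. The sketch below is an assumption-laden illustration, not the Weka procedure used in the paper: `evaluate` and `build_majority_classifier` are hypothetical names, the baseline simply predicts the most common training label, and the data are invented.

```python
import time
from collections import Counter


def build_majority_classifier(train):
    """Hypothetical baseline: always predict the most common training label."""
    majority = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda features: majority


def evaluate(build_model, train, test):
    """Report the four criteria: build time, correct, incorrect, accuracy."""
    start = time.perf_counter()
    model = build_model(train)
    build_seconds = time.perf_counter() - start

    correct = sum(1 for features, label in test if model(features) == label)
    incorrect = len(test) - correct
    return {
        "build_seconds": build_seconds,
        "correct": correct,
        "incorrect": incorrect,
        "accuracy": correct / len(test),
    }


# Toy data invented for illustration.
train = [((1,), "benign"), ((2,), "benign"), ((9,), "malignant")]
test = [((1,), "benign"), ((2,), "benign"), ((3,), "benign"), ((9,), "malignant")]

result = evaluate(build_majority_classifier, train, test)
print(result)
```

Accuracy here is simply the number of correctly classified instances divided by the total number of test instances, so the second and third criteria always sum to the size of the test set.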
In the results and findings, the researcher also analyzes the data using measures such as the
kappa statistic in order to get the best results from the investigation. On this basis, the
research appears effective in analyzing the relevant aspects of the selected area. [8]
The researcher followed the necessary steps and also presented the results in graphical format.
The classification (confusion) matrix counts the right and wrong predictions by comparing the
actual values in the data set with the values predicted by the trained model.
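A classification matrix of this kind, and an agreement measure derived from it, can be sketched as follows. The sketch assumes that the kappa measure mentioned in the findings is Cohen's kappa, which is an interpretation rather than something the report states. The function names and the example labels are invented for illustration.

```python
def confusion_matrix(actual, predicted, positive="malignant"):
    """Count true/false positives and negatives for a two-class problem."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1  # malignant correctly predicted as malignant
        elif a != positive and p == positive:
            fp += 1  # benign wrongly predicted as malignant
        elif a == positive and p != positive:
            fn += 1  # malignant wrongly predicted as benign
        else:
            tn += 1  # benign correctly predicted as benign
    return tp, fp, fn, tn


def cohen_kappa(tp, fp, fn, tn):
    """Cohen's kappa: agreement beyond what chance alone would produce.

    Undefined (division by zero) when the chance agreement equals 1.
    """
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    return (observed - expected) / (1 - expected)


# Invented labels: 4 malignant and 6 benign cases with two mistakes.
actual = ["malignant"] * 4 + ["benign"] * 6
predicted = ["malignant"] * 3 + ["benign"] * 6 + ["malignant"]

tp, fp, fn, tn = confusion_matrix(actual, predicted)
print(tp, fp, fn, tn)
print(cohen_kappa(tp, fp, fn, tn))
```

The diagonal counts (true positives and true negatives) are the right predictions and the off-diagonal counts are the wrong ones, so accuracy can be read directly from the same matrix.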
This study also gives a better understanding of the selected data set and of the variables it
contains, which helps in drawing out the relationships between those variables. [9]
Issues:
The researcher highlighted a few factors that limited the research, including a lack of time
and the static nature of the collected data. Despite these concerns, the author applied the
relevant aspects of the research area and used several effective techniques to analyze the
gathered data and extract the best results from the research. [10]
The static nature of the data set was a limitation because the selected data set does not
contain updated data. However, since the research primarily aimed to determine the pattern of
breast cancer using data mining capabilities, it can be stated that by analyzing these data the
researcher has effectively achieved the desired result. [11]
The researcher concludes the paper by stating the need to determine the pattern of breast
cancer and how the application of data mining techniques helped to get the desired results. In
summary, the paper covers the relevant aspects of the selected research topic, and it has also
helped me gain detailed information about data mining techniques and how they work when
classifying input data. [12]
Conclusion:
This report has focused on data mining, one of the most prominent current technologies, and its
application. To support the discussion, an article on the application of data mining to breast
cancer detection was analyzed. The report provides a detailed discussion of the background of
the research, which helps to explain the primary objective behind the investigation. It also
includes a detailed analysis of the research methods and findings, which helps to explain the
procedure of the research, and it discusses the limitations that arose while doing the research.
Thus, it can be concluded that this report helps to give a detailed understanding of the
selected topic.

Reference:
[1] V. Chaurasia and S. Pal, "A novel approach for breast cancer detection using data mining
techniques", International Journal of Innovative Research in Computer and Communication
Engineering (An ISO 3297: 2007 Certified Organization), vol. 2.
[2] J. Brooks, M. Kerr and J. Guttag, "Using machine learning to draw inferences from pass
location data in soccer", Statistical Analysis and Data Mining: The ASA Data Science Journal,
vol. 9, no. 5, pp. 338-349, 2016. Available: 10.1002/sam.11318.
[3] Y. Ye, T. Li, D. Adjeroh and S. Iyengar, "A Survey on Malware Detection Using Data Mining
Techniques", ACM Computing Surveys, vol. 50, no. 3, pp. 1-40, 2017. Available:
10.1145/3073559.
[4] H. Pourghasemi, S. Yousefi, A. Kornejady and A. Cerdà, "Performance assessment of
individual and ensemble data-mining techniques for gully erosion modeling", Science of The
Total Environment, vol. 609, pp. 764-775, 2017. Available: 10.1016/j.scitotenv.2017.07.198.
[5] E. Costa, B. Fonseca, M. Santana, F. de Araújo and J. Rego, "Evaluating the effectiveness of
educational data mining techniques for early prediction of students' academic failure in
introductory programming courses", Computers in Human Behavior, vol. 73, pp. 247-256, 2017.
Available: 10.1016/j.chb.2017.01.047.
[6] A. Souri and R. Hosseini, "A state-of-the-art survey of malware detection approaches using
data mining techniques", Human-centric Computing and Information Sciences, vol. 8, no. 1,
2018. Available: 10.1186/s13673-018-0125-x.
[7] S. Goswami, S. Chakraborty, S. Ghosh, A. Chakrabarti and B. Chakraborty, "A review on
application of data mining techniques to combat natural disasters", Ain Shams Engineering
Journal, vol. 9, no. 3, pp. 365-378, 2018. Available: 10.1016/j.asej.2016.01.012.
[8] B. Peromingo, D. Caballero, A. Rodríguez, A. Caro and M. Rodríguez, "Application of data
mining techniques to predict the production of aflatoxin B1 in dry-cured ham", Food Control,
vol. 108, p. 106884, 2020. Available: 10.1016/j.foodcont.2019.106884.
[9] P. Hachesu, M. Ahmadi, S. Alizadeh and F. Sadoughi, "Use of Data Mining Techniques to
Determine and Predict Length of Stay of Cardiac Patients", Healthcare Informatics Research,
vol. 19, no. 2, p. 121, 2013. Available: 10.4258/hir.2013.19.2.121.
[10] M. R, "Prediction of Diabetes Disease Using Classification Data Mining
Techniques", International Journal of Engineering and Technology, vol. 9, no. 5, pp. 3610-3614,
2017. Available: 10.21817/ijet/2017/v9i5/170905319.
[11] "Using Data Mining Strategies in Clinical Decision Making", CIN: Computers, Informatics,
Nursing, vol. 34, no. 10, p. 484, 2016. Available: 10.1097/01.ncn.0000504587.62271.53.
[12] N. Agrawal and A. Jawdekar, "A Survey Report On Current Research and Development of
Data Processing In Web Usage Data Mining", International Journal of Database Theory and
Application, vol. 9, no. 5, pp. 101-110, 2016. Available: 10.14257/ijdta.2016.9.5.10.
