Data Mining Analysis: Historical, Theoretical Background and Tools
Added on 2021/06/14
Homework Assignment
AI Summary
This assignment delves into the historical background of data mining, tracing its evolution from the 1760s to the present day, highlighting key milestones like the introduction of Bayes' Theorem and the emergence of data mining in the 1990s. It then analyzes the theoretical underpinnings of data mining, emphasizing the relational model and the significance of statistics. The assignment further identifies and evaluates various data mining tools used in the industry, including WEKA, Orange, NLTK, R-Programming, and Rapid Miner. It explores both traditional approaches like classification, association, outlier detection, and clustering, and modern approaches such as web mining, neural networks, and machine learning. Finally, the assignment reviews the benefits of data mining for organizations, including enhanced decision-making, improved security, better planning, and increased customer satisfaction, emphasizing the importance of data mining for data-driven organizations.

P1 Investigate the historical background of data mining.
What is data mining?
Data mining is the process of discovering patterns in large collections of data, using methods
from database systems, machine learning and statistics.
The main task of data mining is to analyse large amounts of data, automatically or
semi-automatically, to extract previously unknown, interesting patterns such as dependencies,
statistical regularities and groups of data records. It typically relies on database techniques
such as spatial indices. These patterns can be seen as a kind of summary of the input data and
can be used for further analysis.
Historical background of data mining
The term “data mining” came into use in the 1990s, but the field has a long history of evolution.
Fig: Different phases of data mining
In 1763, Thomas Bayes's paper introducing what is now called Bayes' theorem was published. The
theorem relates the probability of a hypothesis given new evidence to its prior probability.
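Bayes' theorem can be illustrated with a small calculation. The sketch below is only an
illustration: the function name and the fraud-detection numbers are hypothetical and do not come
from the assignment. It computes P(H | E) = P(E | H) P(H) / P(E), where P(E) is expanded over the
two cases "H is true" and "H is false".

```python
def posterior(prior: float, likelihood: float, false_positive_rate: float) -> float:
    """Return P(H | E) from P(H), P(E | H) and P(E | not H) using Bayes' theorem."""
    # Total probability of observing the evidence, P(E).
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: 1% of transactions are fraudulent; a detector flags
# 90% of fraudulent transactions and 5% of legitimate ones.
p_fraud_given_flag = posterior(prior=0.01, likelihood=0.90, false_positive_rate=0.05)
print(f"P(fraud | flagged) = {p_fraud_given_flag:.3f}")
```

Even with a fairly accurate detector, the posterior stays low because the prior probability of
fraud is small, which is the kind of updating the theorem describes.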
In 1805, Adrien-Marie Legendre published the method of least squares, which Carl Friedrich Gauss
also developed, to determine the orbits of bodies around the Sun (comets and planets). Regression
based on least squares is one of the core tools for estimating the likely relationships between
variables.
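As a minimal sketch of least-squares regression, the code below fits a straight line
y = slope * x + intercept to a handful of points using the closed-form solution for one predictor.
The function name and the sample data are assumptions made for illustration only.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx  # assumes the x values are not all identical
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations that lie roughly on the line y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept = fit_line(xs, ys)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
```

The fitted line minimises the sum of squared vertical distances between the points and the line,
which is what "least squares" refers to.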
In the 1970s, sophisticated database management systems made it possible to store and query
large volumes of data, and data warehouses allowed users to move away from a purely
transaction-oriented way of working with data.
In the 1980s, HNC trademarked the phrase “Database Mining” to protect a product named
Database Mining Workstation. After that, in 1989, Gregory Piatetsky-Shapiro coined the term
“Knowledge Discovery in Databases” (KDD).
The term “data mining” appeared in the database community in the 1990s. Many companies
use data mining to examine their data and identify ways to grow their customer base. In 2001,
William S. Cleveland proposed data science as an independent discipline, and Jeff Hammerbacher
and DJ Patil later used the term “data science” to build their teams and define their roles at
Facebook and LinkedIn. DJ Patil became the first Chief Data Scientist at the White House in 2015.
Nowadays data mining is widely used in many sectors, including science, engineering and
business. Applications such as national security, card transactions, the stock market and genome
sequencing are only the tip of the iceberg for data mining. Deep learning has become one of the
most active techniques. Capable of capturing dependencies and complicated patterns far beyond
other techniques, it is reigniting some of the most important challenges in data mining, data
science and computing.
P2 Analyze the theoretical background of data mining and identify data mining tools
used in industry.
Theoretical background of data mining:
The main focus of research in data mining is developing good procedures for different kinds of
tasks. Effort also goes into other parts of the field, namely database topics, visualization,
extensions of data mining, and user interface issues.

Data mining is fundamentally an applied field, but it is important to understand its theory as
well. Studying the theory helps us understand the relational model that underlies much of it.
Codd’s relational model is an example of an effective theory for describing the structure of
data. A common view of the theory of data mining is that it is statistics. It goes beyond
statistics, however, extending to relational databases and the storing and transferring of data.
Some data mining tools in industry
Today data mining is used in many different sectors. To carry out the process well, we need to
use data mining tools. Some popular tools are:
WEKA: This tool is easy to use. It is a Java-based tool that includes many methods for
visualization, statistical analysis, clustering etc.
fig: WEKA
Python-based Orange and NLTK: Orange is a platform written in Python. Python is regarded as
one of the easiest programming languages to learn, which is one reason it is so popular.
Orange supports text analysis, machine learning and data analysis.
NLTK (Natural Language Toolkit) is also a powerful tool for mining language data, and it is
written in Python.
fig: Orange fig: NLTK
R-Programming tool: R gives data miners a programming language to work with. It is written in
C and Fortran. It is suitable for linear and non-linear modelling, clustering, classification
and time-series analysis of data.
fig: R-Programming tool
Rapid Miner: It is very popular because it is easy to operate. It is an open-source application
that requires no coding. It supports complex data mining processes such as visualization,
predictive analysis, and preprocessing.
fig: Rapid Miner
M1 Evaluate traditional and modern approaches to data mining and show the building
blocks of both approaches.
Traditional approaches:
Traditional data mining tools are very useful in the business sector. Through complex
algorithms, they help companies find patterns and trends in their data. Some run on the desktop
and display the patterns they find, while others gather information from an outside server. Some
are available for both Windows and UNIX, but others are built for one particular operating
system. Some traditional techniques are:

Classification: Classification is a more complex technique. It assigns items to predefined
categories based on their attributes. The categories can then be used for further analysis.
Association: Association tracks patterns that depend on related variables. It helps express
the likelihood of a relationship between data items within large data sets.
Outlier detection: In many cases, identifying only the main pattern does not give a clear
picture of the data set. It is also necessary to be able to find the outliers in the data.
Clustering: Clustering is closely related to classification. It organizes data items into
groups based on their similarities.
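Two of the techniques above, association and outlier detection, can be sketched in a few lines.
The code below is a simplified illustration under stated assumptions: association is measured
with the usual support and confidence of a rule, and outliers are flagged with a simple z-score
threshold, which is only one of many possible methods. All function names and data are
hypothetical.

```python
from statistics import mean, pstdev


def support(transactions, items):
    """Fraction of transactions (sets) that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) as support(both) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    # Assumes the antecedent occurs in at least one transaction.
    return support(transactions, both) / support(transactions, antecedent)


def zscore_outliers(values, threshold=2.0):
    """Return values lying more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, so nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical shopping baskets for the rule {bread} -> {milk}.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
rule_support = support(baskets, {"bread", "milk"})
rule_confidence = confidence(baskets, {"bread"}, {"milk"})

# Hypothetical purchase amounts with one unusually large value.
amounts = [20, 22, 19, 21, 23, 20, 95]
outliers = zscore_outliers(amounts)
print(rule_support, rule_confidence, outliers)
```

Support says how often the items occur together, while confidence says how often the rule holds
when its left-hand side occurs. The threshold of 2.0 is an arbitrary choice for this small sample.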
Modern Approaches:
Modern data mining refers to the practice of searching for useful or relevant information
across large data sets.
Web mining: Web mining applies data mining techniques and algorithms to extract information
directly from the web. There are many techniques for collecting data in web mining, and many
tools are available for it. Among them, Scrapy and Octoparse are well known. Web mining gives
access to a very wide range of data.
Neural Network: A neural network is a computing method built from layers of interconnected
nodes. The network converts the data into increasingly abstract representations, layer by
layer. The behaviour of each layer is determined by the strength of the connections between
nodes. The nodes are organized in tiers. Raw input data enters the first tier and is routed
through its nodes, and the result is passed to the next tier as that tier's input.
Machine Learning: Machine learning relies on computer algorithms. It is a subfield of
artificial intelligence (AI) that builds a mathematical model from example data, also known as
training data. It is closely related to computational statistics, which focuses on making
predictions using computers.
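The tier-by-tier flow described for neural networks can be sketched as a forward pass. This is
a minimal illustration and not a trained model: the weights, biases and the choice of a sigmoid
activation are assumptions made for the example.

```python
import math


def sigmoid(x):
    """Squash a number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, biases):
    """Compute one tier: each node takes a weighted sum of the inputs plus a bias."""
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + bias)
        for node_weights, bias in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Pass the raw input through every tier; each output feeds the next tier."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations


# Hypothetical network: 2 inputs -> 3 hidden nodes -> 1 output node.
network = [
    ([[0.5, -0.2], [0.8, 0.1], [-0.3, 0.9]], [0.0, 0.1, -0.1]),
    ([[1.0, -1.0, 0.5]], [0.2]),
]
output = forward([1.0, 2.0], network)
print(output)
```

The weights play the role of the connection strengths mentioned above. In machine learning they
would be adjusted from training data rather than written by hand.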
D1 Review how an organization benefits from data mining.
Today, many multinational companies and organizations have their own data collections. These
huge collections of data are used for decision-making, research and development. They let
organizations get information easily when they need it. Most importantly, organizations can
analyze the data and find the information they require.
Data mining is very important for the success of data-driven organizations, and its benefits
are substantial. Respondents identified 30 positive impacts of data mining in the business
sector. Of these, the top 10 benefits of data mining are:
1. Enhanced decision-making (56%)
2. Improved security risk posture (47%)
3. Better planning and forecasting (44%)
4. Competitive advantage (41%)
5. Cost reduction (41%)
6. Client achievement (40%)
7. New revenue streams (40%)
8. New customer achievement (34%)
9. Improved customer relationships (31%)
10. Development of new products (31%)
Through data mining, a business owner can change how the business operates, provided the
conditions needed for data mining investments to succeed are met. Every task has some barriers.
Companies that overcome these barriers can exploit the advantages of data mining. Successful
companies:
Are aware of their needs and can state them.
Identify and evaluate the data sources and data mining tools available to them.
Manage applications, such as BI systems, that are interconnected with each other.
Identify which data is suitable for their requirements.
If companies use data mining in their business, they have a high chance of success in that
business.


