Data Mining Assignment: Security, Privacy, Ethics, and Data Analysis
Added on 2021/04/24
Report
AI Summary
This data mining assignment report explores the critical aspects of data mining, including security, privacy, and ethical considerations, supported by research articles discussing the impact of big data on human rights. The second part of the assignment focuses on practical data analysis using Weka software. It involves analyzing student assignment and final exam marks through summary statistics, histograms, and scatterplots to visualize the data distribution and relationships. Furthermore, the assignment demonstrates the application of the "Unsupervised Discretize Filter" and techniques for handling missing values within the dataset. The report concludes by highlighting the importance of data security, privacy, and the ethical implications of data collection and analysis, emphasizing the need for transparency and responsible data handling practices.

Running head: DATA MINING-2
Data Mining-2
Name of Student
Name of University
Course ID

Table of Contents
Introduction:
Discussion:
Task 1:
Task 2:
Conclusion:
References:

Introduction:
The first part of this assignment discusses security, privacy and ethics in data mining. Two
research articles provided in the assignment file are used to describe the limited utility and
impractical scientific knowledge. The dynamic association between human rights and big data is
covered in the discussion section.
The second part of the assignment covers data analysis with the “Weka” software. The marks of
the assignments and the final exam are analysed with summary statistics and the necessary
visualisations. Basic tasks such as handling missing values are also carried out in the second
part.
Discussion:
Task 1:
Topic: Security, Privacy and Ethics in Data Mining.
As more personal data is gathered and more powerful computers become available, the scope for
both legitimate and abusive use of that data grows. The digital revolution has many
inter-associated aspects, including enhanced capabilities for storing data and the models
applied to produce knowledge from it. Big data is also an area exposed to criminal activity and
provocation. It is mainly used in the public health, biochemical research, property dealing and
insurance sectors (Tasioulas, 2016). The emergence of big data is an example of scientific and
technological innovation that brings enormous potential benefits as well as market risks.
Heightened security for big data can sometimes reduce the level of privacy. Law enforcement
agencies gather data on people who could be treated as potential terrorists, while business
companies gather data on potential customers in order to deliver targeted advertising and track
online activity. Organisations such as Google, Apple and Amazon are concerned with obtaining
more intelligence and decrypted versions of data. The countermeasures against internet theft
and cyber crime are auditing and corporate methods, encryption, control over data access,
backups and intrusion detection. Data control and data incorporation, supported by the
sophisticated software of the business authorities, have reduced the vulnerability of data sets
(Ryoo, 2016).
Transparency is the vital factor in addressing security and privacy issues. Big data handlers
should disclose who can access the data and who can grant that access. Business organisations
may earn public trust by providing security controls that protect large-scale data.
Ethics in enterprises that handle big data improves health and health resources. Big data
analysis can also be pursued out of self-interest, such as satisfying curiosity and career
advancement. Consumer demand drives security and privacy, which is critical to a heightened
level of security through vehicles. In particular, a dogmatic understanding of human rights,
when confronted with these developments, becomes unresponsive to the changing circumstances
that bear on the rights to privacy and authenticity.
Task 2:
Text editor and “Arff” file creation:
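As a rough illustration of what a marks file in ARFF format could look like, the sketch below builds the text of a small ARFF file in Python. The relation name, the attribute names and the two data rows are hypothetical placeholders, not the data used in this report; a missing value is written as `?`, which is how ARFF marks it.

```python
def to_arff(relation, attributes, rows):
    """Build ARFF text for numeric attributes; None is written as '?' (missing)."""
    lines = [f"@relation {relation}", ""]
    lines += [f"@attribute {name} numeric" for name in attributes]
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join("?" if v is None else str(v) for v in row))
    return "\n".join(lines) + "\n"


# Hypothetical attribute names and placeholder marks (not the report's data).
attributes = ["assignment1", "assignment2", "assignment3", "assignment4", "final_exam"]
rows = [
    [45, 88, 50, 35, 60],
    [62, None, 41, 30, 55],  # the second assignment mark is missing
]

arff_text = to_arff("student_marks", attributes, rows)
print(arff_text)
```

The same text could equally be typed by hand in a text editor and saved with the `.arff` extension.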
Summary statistics and histograms:
Table 1: Summary statistics table of marks of Assignment-1
The summary statistics for Assignment 1 indicate that the lowest mark is 30 and the highest
mark is 84. The average mark of Assignment 1 is 49.103 and the standard deviation is 13.28.
Figure 1: Histogram of marks of Assignment-1
The histogram of the Assignment 1 marks shows that the distribution is right skewed: most of
the students received marks below 57, with a tail towards the higher marks.
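The direction of skew can also be checked numerically rather than read off a histogram alone. The sketch below computes the same kind of summary statistics (minimum, maximum, mean, standard deviation) together with a moment-based skewness coefficient, where a positive value indicates a right-skewed distribution (long tail towards high marks) and a negative value a left-skewed one. The list of marks is a hypothetical example, not the report's data, and the function name `summarize` is an assumption made for illustration.

```python
import statistics


def summarize(marks):
    """Return min, max, mean, sample standard deviation and skewness of marks."""
    n = len(marks)
    mean = statistics.mean(marks)
    sd = statistics.stdev(marks)  # sample standard deviation (divides by n - 1)
    # Moment-based skewness: third central moment over second central moment^1.5
    m2 = sum((x - mean) ** 2 for x in marks) / n
    m3 = sum((x - mean) ** 3 for x in marks) / n
    skewness = m3 / m2 ** 1.5
    return {"min": min(marks), "max": max(marks), "mean": mean,
            "sd": sd, "skewness": skewness}


# Hypothetical marks: most values are low, with a few high ones.
marks = [30, 35, 38, 40, 42, 45, 48, 50, 55, 70, 84]
stats = summarize(marks)
print(stats)
```

For this made-up list the skewness coefficient is positive, which matches a histogram whose bulk sits at the lower marks.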
Table 2: Summary statistics table of marks of Assignment-2
The summary statistics for Assignment 2 indicate that the lowest mark is 0 and the highest mark
is 97. The average mark of Assignment 2 is 88.35 and the standard deviation is 17.69.
Figure 2: Histogram of marks of Assignment-2
The histogram of the Assignment 2 marks shows that the distribution is left skewed: most of the
students received marks above 80, with a tail towards the lower marks.
Table 3: Summary statistics table of marks of Assignment-3
The summary statistics for Assignment 3 indicate that the lowest mark is 5 and the highest mark
is 100. The average mark of Assignment 3 is 46.06 and the standard deviation is 22.20.
Figure 3: Histogram of marks of Assignment-3
The histogram of the Assignment 3 marks shows that the distribution is right skewed: most of
the students received marks below 52.5, with a tail towards the higher marks.
Table 4: Summary statistics table of marks of Assignment-4
The summary statistics for Assignment 4 indicate that the lowest mark is 8 and the highest mark
is 50. The average mark of Assignment 4 is 35.324 and the standard deviation is 7.94.
Figure 4: Histogram of marks of Assignment-4
The histogram of the Assignment 4 marks shows that the distribution is left skewed: most of the
students received marks above 29, with a tail towards the lower marks.
Table 5: Summary statistics table of marks of Final Exam
The summary statistics for the final exam indicate that the lowest mark is 42 and the highest
mark is 78. The average mark of the final exam is 59.65 and the standard deviation is 8.282.
Figure 5: Histogram of marks of Final-Exam
Scatterplots:
Figure 6: Scatter plot of grades of Assignment 1 (X) vs. Final exam (Y)
Figure 7: Scatter plot of grades of Assignment 2 (X) vs. Final exam (Y)
Figure 8: Scatter plot of grades of Assignment 3 (X) vs. Final exam (Y)
Figure 9: Scatter plot of grades of Assignment 4 (X) vs. Final exam (Y)
3. Applying the “Unsupervised Discretise Filter” to the marks of Assignment-4 gives the
outcomes shown below:
Table 6: Frequency distribution table from the “Unsupervised Discretise Filter” applied to the marks of Assignment-4
Figure 10: Histogram from the “Unsupervised Discretise Filter” applied to the marks of Assignment-4
The histogram obtained from the “Unsupervised Discretise Filter” for Assignment-4 shows that
most of the marks lie in the intervals 29 to 33.2 and 33.2 to 37.4 (frequency = 8 in each
interval). Here, the scale is discrete, not continuous.
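The interval boundaries quoted above (29, 33.2 and 37.4) are consistent with equal-width binning of the Assignment-4 range, 8 to 50, into ten bins of width 4.2. The sketch below reproduces that kind of binning in plain Python. It is an illustration under stated assumptions, not the filter's own code: the bin count of ten, the choice to treat each interval as closed on the right, and the list of marks are all assumptions, and the marks are hypothetical rather than the report's data.

```python
from bisect import bisect_left


def equal_width_edges(lo, hi, bins):
    """Return the bins + 1 boundaries of equal-width intervals over [lo, hi]."""
    width = (hi - lo) / bins
    return [round(lo + i * width, 6) for i in range(bins + 1)]


def discretize(values, lo, hi, bins):
    """Count values per interval; a value equal to a cut point goes to the lower bin."""
    cuts = equal_width_edges(lo, hi, bins)[1:-1]  # interior cut points only
    counts = [0] * bins
    for v in values:
        counts[bisect_left(cuts, v)] += 1
    return counts


edges = equal_width_edges(8, 50, 10)
print(edges)

# Hypothetical marks, used only to show how counts per interval are produced.
marks = [8, 20, 30, 31, 32, 34, 35, 36, 40, 50]
counts = discretize(marks, 8, 50, 10)
print(counts)
```

After discretisation each mark is represented only by the interval it falls in, which is why the resulting scale is discrete.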
4. Using “Filters” to fill missing values:
In the following process, the missing values in the whole data set can be filled with the mean
value of the data set.
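Mean-based filling can be sketched in a few lines of Python. The example below assumes that the mean is taken separately for each attribute (column) over its non-missing values, which is one common reading of filling with the mean; the column names, the values and the helper name `fill_missing_with_mean` are hypothetical.

```python
def fill_missing_with_mean(column):
    """Replace None entries with the mean of the non-missing entries."""
    present = [v for v in column if v is not None]
    if not present:
        return list(column)  # nothing to average, leave the column unchanged
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in column]


# Hypothetical columns with one missing mark each (None marks a missing value).
data = {
    "assignment1": [40, None, 60],
    "final_exam": [50, 70, None],
}
filled = {name: fill_missing_with_mean(col) for name, col in data.items()}
print(filled)
```

Filling with the mean leaves each attribute's average unchanged, although it reduces its spread, which is worth keeping in mind when the filled data are summarised afterwards.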