Improving Data Utility through Differential Privacy


Added on  2021/04/24

AI Summary
The provided assignment delves into the complexities of big data security, focusing on differential privacy as a means to protect sensitive information. It identifies techniques for enhancing data utility while maintaining confidentiality. The author discusses tools such as decomposition, transformation, and composition in the context of big data applications.
Literature Review (Secondary Research) Template
Student Name &
CSU ID
Project Topic Title
NOTE: Please use YOUR OWN WORDS when completing this template.
Your literature review should be in scope and MUST address all of your project's questions.
You should ONLY use the CSU library or another university library; Google search is NOT allowed. The papers
you select should be from the last 3 years. If you are in 2018, then you need to collect papers from 2018, 2017, and 2016.
We encourage you to search for journal papers rather than conference papers, as they will give you more
details.
Check the ranking (Q1, Q2, etc.) of the journal against the Excel sheet uploaded to Interact.
Example (How to work on each section in template below):
Version 1.0 _ Week 1
1
Reference in APA format
URL of the Reference
Authors Names and Emails and Level of Journal (Q1, Q2, … Qn)
Keywords in this Reference
Yao, X., Zhou, X., & Ma, J. (2016,
April). Differential Privacy of Big
Data: An Overview. In Big Data
Security on Cloud
(BigDataSecurity), IEEE
International Conference on High
Performance and Smart
Computing (HPSC), and IEEE
International Conference on
Intelligent Data and Security (IDS),
2016 IEEE 2nd International
Conference on (pp. 7-12). IEEE.
http://ieeexplore.ieee.org/abstract/document/7502257/?reload=true
Author Names and emails:
Xiaoming Yao, Xiaoyi Zhou
College of Information Science and
Technology
Hainan University
Haikou, China
xiaomingyao@163.com
Jixin Ma
School of Computing and Mathematical
Science
University of Greenwich
London, UK
J.Ma@gre.ac.uk
Journal Level: NA
Differential privacy; statistical databases; data mining; utility; big data
The Name of the Current
Solution (Technique/ Method/
Scheme/ Algorithm/ Model/
Tool/ Framework/ ... etc )
The Goal (Purpose) of this
Solution & What is the Problem
that need to be solved
What are the components of it?
Differential Privacy of Big Data: An Overview
Techniques: This paper reviews the development and implementation of differential privacy mechanisms that can protect the data handled by big data tools. Applying these mechanisms gives the information system a more secure environment and makes the statistical database server harder for external attackers to compromise.
Tools: The tools used to implement privacy within big data applications include:
Matrix mechanism
MWEM algorithm
Applied Area: These mechanisms can be applied in statistical database management systems, data mining, big data applications, etc.
The objective of the authors is to identify security mechanisms that protect big data tools and their applications.
Problem: The security approaches currently used by businesses are very weak.
Purpose (Goal): This paper describes the role of differential privacy mechanisms and presents the results of reviewing and re-examining recent improvements in differential privacy applications. Another goal is to define how privacy can be implemented.
The components discussed by the authors in this paper are as follows:
Personalized differential privacy
Geometry-based error bound
Stateful mechanism with the IDs
The Process (Mechanism) of this Work; Means How the Problem has Solved & Advantage & Disadvantage of
Each Step in This Process
Process Steps - Pre Operation | Advantage | Disadvantage (Limitation)
1 Identification of different differential privacy methods | Applying these methods addresses the privacy and security issues directly. | NA
2 Consideration of internet-based applications | At this stage a massive amount of data can be collected for commercial analysis and for academic research. | If a proper source is not identified, the collected data may not meet the actual requirement.
Process Steps - During Surgery | Advantage | Disadvantage (Limitation)
1 Matrix mechanism | Multiple correlated queries can be structured easily. | If the Laplace mechanism or a plain mathematical formula is used instead, the chance of greater error increases.
2 Secure group differential private query (SDQ) | During data mining, it combines techniques from differential privacy and secure multiparty computation, so the intermediate data stays protected throughout the operation. | Many other sophisticated methods exist that can serve the cryptographic purpose.
3 Process optimization | It helps to restrict the errors. | The query workload is assumed to be known in advance.
4 Linear query mechanism | It also helps to reduce the error rate. | Linear queries can be run many times, but their sensitivity differs each time.
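The link between a linear query's sensitivity and the amount of noise it receives can be made concrete with a small sketch. This is an illustration only, assuming NumPy and the standard Laplace mechanism; the histogram values, the query weights, and the function names `laplace_mechanism` and `answer_linear_query` are hypothetical and do not come from the reviewed paper.

```python
import numpy as np


def laplace_mechanism(true_answer, sensitivity, epsilon, rng):
    """Return true_answer plus Laplace noise of scale sensitivity / epsilon."""
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    return true_answer + rng.laplace(0.0, sensitivity / epsilon)


def answer_linear_query(histogram, weights, epsilon, rng):
    """Answer sum_i weights[i] * histogram[i] with Laplace noise.

    Adding or removing one record changes a single histogram cell by 1,
    so the answer changes by at most max |weights[i]| (the sensitivity).
    """
    histogram = np.asarray(histogram, dtype=float)
    weights = np.asarray(weights, dtype=float)
    sensitivity = float(np.max(np.abs(weights)))
    true_answer = float(weights @ histogram)
    noisy = laplace_mechanism(true_answer, sensitivity, epsilon, rng)
    return noisy, sensitivity


rng = np.random.default_rng(0)
histogram = [10, 20, 30]      # hypothetical cell counts
count_query = [1, 1, 0]       # sensitivity 1
weighted_query = [1, 0, 2]    # sensitivity 2, so twice the noise scale
for weights in (count_query, weighted_query):
    noisy, sensitivity = answer_linear_query(histogram, weights, 0.5, rng)
    print(f"sensitivity={sensitivity}, noise scale={sensitivity / 0.5}, noisy={noisy:.2f}")
```

Two queries over the same data get different noise scales here, which is one way to read the limitation noted for the linear query mechanism.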
Major Impact Factors in this Work
Dependent Variable | Independent Variable
The big data tool and its application in the organization. The big data tool is the dependent variable here because it cannot be used securely without a proper differential privacy mechanism. | Differential privacy mechanisms such as the Laplace mechanism, the MWEM algorithm, and the matrix mechanism are the independent variables.
Input and Output Feature of This Solution
Input - Pre Operation:
Output - Pre Operation:
Input - During Surgery: For the MWEM algorithm, the inputs are the dataset, the values of the data, and the privacy budget.
Output - During Surgery: Proper privacy for the big data tool and its application.
Contribution & The Value of This Work
I think that this paper is very valuable from both the business and the information security perspectives. It presents different differential privacy mechanisms and their applications. Information security officers can use these mechanisms to protect their datasets from external attacks.
The authors apply mathematical derivations to make the big data privacy solution effective for users. With the help of this mathematical treatment, the statistical database also becomes more secure and technically stronger.
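To show how the MWEM inputs named above (a dataset and a privacy budget) are consumed, here is a minimal sketch of the algorithm as it is usually described: select a badly answered query with the exponential mechanism, measure it with Laplace noise, then apply a multiplicative weights update. It assumes NumPy, a dataset already summarised as a histogram, a public record count, and queries with entries in [0, 1]; the function name `mwem`, the example data, and the even budget split per round are assumptions for illustration, not details taken from the reviewed paper.

```python
import numpy as np


def mwem(hist, queries, epsilon, iterations, rng):
    """Sketch of MWEM: returns a synthetic histogram with the same total."""
    hist = np.asarray(hist, dtype=float)
    queries = np.asarray(queries, dtype=float)  # shape (num_queries, domain_size)
    n = hist.sum()                              # assumed to be public
    if n <= 0:
        raise ValueError("histogram must contain at least one record")
    synthetic = np.full(hist.shape, n / hist.size)  # start from uniform
    eps_round = epsilon / iterations                # budget spent per round
    true_answers = queries @ hist
    for _ in range(iterations):
        # Exponential mechanism with eps_round / 2: favour queries that
        # the current synthetic histogram answers poorly.
        errors = np.abs(queries @ synthetic - true_answers)
        scores = (eps_round / 2.0) * errors / 2.0
        probs = np.exp(scores - scores.max())
        probs /= probs.sum()
        idx = rng.choice(len(queries), p=probs)
        # Laplace mechanism with the other eps_round / 2.
        measurement = true_answers[idx] + rng.laplace(0.0, 2.0 / eps_round)
        # Multiplicative weights update toward the noisy measurement.
        diff = measurement - queries[idx] @ synthetic
        synthetic = synthetic * np.exp(queries[idx] * diff / (2.0 * n))
        synthetic *= n / synthetic.sum()
    return synthetic


rng = np.random.default_rng(0)
hist = [40, 30, 20, 10]                       # hypothetical dataset
prefix_queries = np.tril(np.ones((4, 4)))     # hypothetical workload
print(mwem(hist, prefix_queries, epsilon=1.0, iterations=5, rng=rng))
```

The sketch returns the final synthetic histogram for brevity; descriptions of MWEM often average the intermediate histograms instead.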
1. What in the method could have been better?
The combination with cryptography is the only mechanism that requires further improvement. No other method mentioned in this paper needs improvement, since the approach already provides a good trade-off between privacy and dataset utility.
2. What was missed in the authors' analysis?
Some challenges arise with the cryptography combination, and the authors fail to identify those areas of analysis.
3. Was there a technique that could have been used, or a question that could have been asked, that the researchers did not use or ask?
A further question could be put to the authors to draw more results from the analysis.
4. Were the conclusions justified, and how?
Yes, the conclusion is completely justified because these differential privacy mechanisms can preserve the privacy of the datasets and of all other correlated processes.
Analyse This Work By Critical Thinking
1. Different advanced mathematical approaches are used to protect datasets from external attacks.
2. The authors fail to identify all the cryptographic concepts that are necessary for securing information stored on the data server. Without adequate security approaches, information may be hijacked by external attackers.
The Tools That Assessed this Work
The practical tools identified and applied in this paper include the matrix mechanism and the MWEM algorithm.
Diagram/Flowchart
2
Reference in APA format
URL of the Reference
Authors Names and Emails and Level of Journal (Q1, Q2, … Qn)
Keywords in this Reference
Hua, J., Tang, A., Fang, Y., Shen, Z., & Zhong, S. (2016). Privacy-preserving utility verification of the data published by non-interactive differentially private mechanisms. IEEE Transactions on Information Forensics and Security, 11(10), 2298-2311.
http://ieeexplore.ieee.org/abstract/document/7416007/
Author Names and emails:
Jingyu Hua, An Tang, Yixin Fang,
Zhenyu Shen, and Sheng Zhong
Email: NA
Journal Level: NA
Collaborative data publishing, utility
verification, differential privacy.
The Name of the Current
Solution (Technique/ Method/
Scheme/ Algorithm/ Model/
Tool/ Framework/ ... etc )
The Goal (Purpose) of this
Solution & What is the Problem
that need to be solved
What are the components of it?
Techniques: Collaborative data publishing, utility verification, and differential privacy.
Tools: Encryption is the only tool or method mentioned in this paper.
Applied Area: The mechanism applies only to published datasets.
Problem: The main problem identified in this paper is the lack of privacy mechanisms for protecting data from external attackers.
Purpose (Goal): To identify how central data publishers are responsible for aggregating sensitive data. The goal of the article is to establish privacy-preserving utility verification for data published by non-interactive differentially private mechanisms. Another goal is to identify the mechanism through which the private data can be encrypted.
Pre-Operation:
Data identification
Data collection
Data collaboration
During Surgery:
Differentially private data publishing
Utility verification
Differential privacy
The Process (Mechanism) of this Work; Means How the Problem has Solved & Advantage & Disadvantage of
Each Step in This Process
Process Steps - Pre Operation | Advantage | Disadvantage (Limitation)
1 Data identification | Service providers can collect a large amount of data from the required sources. | If the proper data is not collected, it can lead to a major loss.
2 Data collection | Collected data can be widely used, and outsourced whenever needed. | NA
3 Data collaboration | Collected data can serve different purposes. | NA
Process Steps - During Surgery | Advantage | Disadvantage (Limitation)
1 Differentially private data publishing | The published data can be accessed more easily than other data. | NA
2 Utility verification | Verification helps to identify and understand the features of the data and the areas where the information can be widely used. | NA
3 Differential privacy | With proper privacy, no unauthorized user can access the data in the published dataset. | NA
Major Impact Factors in this Work
Dependent Variable | Independent Variable
Sensitive data | Central data publisher
Input and Output Feature of This Solution
Input: The generalization operation is the input of the article.
Output: A successful privacy and security operation through which the collaborated information can be secured from external attackers.
Contribution & The Value of This Work
The solution highlights the issues of privacy preservation for published collaborative data. The authors also demonstrate how utility can be measured. In addition, they propose a privacy-preserving verification mechanism and analyse its security and efficiency.
The authors highlight the data privacy issues and discuss both DiffPart and DiffGen in relation to set-valued data. The mechanism can be used in the real world and is expected to be successful.
1. What in the method could have been better?
The method that requires improvement is the cryptographic algorithm.
2. What was missed in the authors' analysis?
The authors have analysed all the relevant tools and techniques, and even how the techniques can be improved further. Thus no point was missed in this analysis.
3. Was there a technique that could have been used, or a question that could have been asked, that the researchers did not use or ask?
How the mechanism works in a horizontally distributed context is another question that could be put to the authors.
4. Were the conclusions justified, and how?
The conclusion is fully justified. Some changes occur as the number of providers and the size of the data vary, but the conclusion still provides positive answers to the learners.
Analyse This Work By Critical Thinking
This journal paper addresses the problem of verifying released data. Similar mechanisms are described for relational data and for set-valued data.
The Tools That Assessed this Work
Anonymization techniques and cryptography are the tools that assessed this work.
Diagram/Flowchart
3
Reference in APA format
URL of the Reference
Authors Names and Emails and Level of Journal (Q1, Q2, … Qn)
Keywords in this Reference
Li, H., Cui, J., Lin, X., & Ma, J.
(2016, December). Improving the
utility in differential private
histogram publishing: Theoretical
study and practice. In Big Data
(Big Data), 2016 IEEE International
Conference on (pp. 1100-1109).
IEEE.
http://ieeexplore.ieee.org/abstract/document/7840713/
Author Names and emails:
Hui Li∗, Jiangtao Cui†, Xiaobin Lin†
and Jianfeng Ma∗
∗School of Cyber Engineering, Xidian
University, Xi’an, China
Email: hli@xidian.edu.cn,
jfma@mail.xidian.edu.cn
†School of Computer Science and
Technology, Xidian University, Xi’an,
China
Email: cuijt@xidian.edu.cn,
xblin8816@gmail.com
Journal Level: NA
Histogram, Big Data, differential privacy
The Name of the Current
Solution (Technique/ Method/
Scheme/ Algorithm/ Model/
Tool/ Framework/ ... etc )
The Goal (Purpose) of this
Solution & What is the Problem
that need to be solved
What are the components of it?
Techniques: Privacy techniques such as noise injection, differential privacy, the state-of-the-art methods, and the matrix mechanism.
Tools: Both statistical tools and fundamental tools are used in this paper.
Applied Area: These techniques are applied when publishing differentially private histograms.
Problem: Because modern techniques are not widely adopted, data is not kept secure when histograms are published.
Purpose (Goal):
To identify the different data publishing issues
To evaluate the role of differential privacy during data publishing
To show that, under the same privacy budget, the schemes can be implemented with a lower chance of error
Pre-Operation:
Data mining
Computer vision
During Surgery:
Sanitizing algorithm
The Process (Mechanism) of this Work; Means How the Problem has Solved & Advantage & Disadvantage of
Each Step in This Process
Process Steps - Pre Operation | Advantage | Disadvantage (Limitation)
1 Identification of issues | If all the issues with sensitive histogram data are identified in the initial phase, it is easier for the researcher to choose which techniques to apply. | NA
2 Study of the different existing techniques | All the existing techniques need to be identified in the initial phase to ensure that the researchers apply the proper technique in the correct place. | NA
Process Steps - During Surgery | Advantage | Disadvantage (Limitation)
1 Theoretical study | It helps to design an accurate algorithm that may significantly outperform the state-of-the-art methods. | NA
2 Grouping | This is the next approach, through which different appropriate algorithms can be combined. | NA
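The difference between plain noise injection and grouping for histogram publishing can be sketched as follows. This is a simplified illustration, assuming NumPy, unit sensitivity (one record affects one bin), and fixed-size groups of adjacent bins; the reviewed paper's own grouping strategy is not reproduced here, and the names `noisy_histogram`, `grouped_noisy_histogram`, and `mse` are hypothetical.

```python
import numpy as np


def noisy_histogram(counts, epsilon, rng):
    """Noise injection: add independent Laplace(1/epsilon) noise to every bin."""
    counts = np.asarray(counts, dtype=float)
    return counts + rng.laplace(0.0, 1.0 / epsilon, size=counts.shape)


def grouped_noisy_histogram(counts, group_size, epsilon, rng):
    """Grouping: perturb each group's total once, then spread it evenly.

    Groups are fixed-size runs of adjacent bins (an assumption), so one
    record still changes exactly one group total by 1.
    """
    counts = np.asarray(counts, dtype=float)
    released = np.empty_like(counts)
    for start in range(0, counts.size, group_size):
        block = counts[start:start + group_size]
        noisy_total = block.sum() + rng.laplace(0.0, 1.0 / epsilon)
        released[start:start + group_size] = noisy_total / block.size
    return released


def mse(true_counts, released):
    """Mean squared error between the true and the released histogram."""
    true_counts = np.asarray(true_counts, dtype=float)
    return float(np.mean((true_counts - np.asarray(released, dtype=float)) ** 2))


rng = np.random.default_rng(1)
counts = [10, 10, 10, 10, 50, 50, 50, 50]   # hypothetical histogram
print("per-bin noise MSE:", mse(counts, noisy_histogram(counts, 0.1, rng)))
print("grouped MSE:      ", mse(counts, grouped_noisy_histogram(counts, 4, 0.1, rng)))
```

Grouping trades noise error for approximation error: bins with similar counts lose little when averaged, while each bin carries only a fraction of the group's noise. Groups chosen by inspecting the data would need their own share of the privacy budget.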
Major Impact Factors in this Work
Private data published in a histogram can be kept secure if proper differential privacy algorithms are applied. Thus, the data published in the histogram depends on differential privacy (the independent variable) and other privacy techniques.
Dependent Variable | Independent Variable
Private histogram publication | Differential privacy
Input and Output Feature of This Solution
Input - Pre Operation: Real-life datasets
Output - Pre Operation: The performance of all the methods, and their results, across a varied range of sizes.
Input - During Surgery: Three different real-world datasets
Output - During Surgery: A comparison across the three datasets indicates that the approach yields the lowest MSE compared with the series of baseline methods.
Contribution & The Value of This Work
This paper presents a detailed investigation of the issues of privacy preservation in histogram publication.
The authors have carried out an extensive investigation, which makes the paper very helpful to learners.
1. What in the method could have been better?
The presentation and the techniques discussed in this paper are appropriate. In addition, the paper shows how overall utility can be maximized by balancing the budget. Although a two-step algorithm, DAWA, is presented, the application of this method could have been elaborated better.
2. What was missed in the authors' analysis?
The paper mentions that the DAWA algorithm supports dynamic histogram updates, but it does not explain how.
3. Was there a technique that could have been used, or a question that could have been asked, that the researchers did not use or ask?
A question that may be asked of the researchers is how the state-of-the-art methods incorporate both the noise injection technique and the grouping technique.
4. Were the conclusions justified, and how?
The conclusion is justified: the techniques and mechanisms described and illustrated in the paper are summarised in the conclusion. However, the authors do not highlight the limitations of the study, which is an area for improvement.
Analyse This Work By Critical Thinking
Histograms are applied in many fields, such as data mining and computer vision. The paper correctly demonstrates the techniques that should be applied to real datasets in order to report data distributions, publish data, and keep the data secure from external attackers.
The Tools That Assessed this Work
Grouping, noise injection, the matrix mechanism, and the state-of-the-art methods are the four techniques mentioned in this paper.
Diagram/Flowchart
4
Reference in APA format
URL of the Reference
Authors Names and Emails and Level of Journal (Q1, Q2, … Qn)
Keywords in this Reference
Yao, X., Zhou, X., & Ma, J. (2016, April). Differential Privacy of Big Data: An Overview. In Big Data Security on Cloud (BigDataSecurity), IEEE International Conference on High Performance and Smart Computing (HPSC), and IEEE International Conference on Intelligent Data and Security (IDS), 2016 IEEE 2nd International Conference on (pp. 7-12). IEEE.
http://ieeexplore.ieee.org/abstract/document/7949053/
Author Names and emails: Xinyu Yang, Teng Wang, Xuebin Ren, and Wei Yu
Email: NA
Journal Level: NA
Differential privacy, data security and cloud big data security.
The Name of the Current
Solution (Technique/ Method/
Scheme/ Algorithm/ Model/
Tool/ Framework/ ... etc )
The Goal (Purpose) of this
Solution & What is the Problem
that need to be solved
What are the components of it?
Techniques: To build a synopsis of the original datasets, the paper describes techniques such as transformation, composition, and decomposition. Another technique, the Bayesian network, is also mentioned.
Tools: The composition tool can be used to design differentially private algorithms.
Applied Area: These techniques are applied in big data applications to improve data utility in sequential data publishing.
Problem: Because data is generated and shared so extensively, privacy threats have become one of the most frequently raised problems.
Purpose (Goal): The goal of this paper is to identify the privacy concerns and to survey approaches for improving data utility.
Pre-Operation: Big data, privacy-preserving schemes, sequential data
During Surgery: Differential privacy, data utility, data correlations
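One common reading of composition as a design tool is sequential composition: when several differentially private steps run on the same data, their epsilon values add up, so an algorithm can be designed by splitting a total budget across its steps. The following is a minimal bookkeeping sketch under that assumption; the class name `PrivacyBudget` and the example amounts are hypothetical and are not taken from the reviewed paper.

```python
class PrivacyBudget:
    """Track epsilon spent under basic sequential composition."""

    def __init__(self, total_epsilon):
        if total_epsilon <= 0:
            raise ValueError("total_epsilon must be positive")
        self.total = float(total_epsilon)
        self.spent = 0.0

    @property
    def remaining(self):
        return self.total - self.spent

    def spend(self, epsilon):
        """Reserve epsilon for one private step, or refuse if it would overspend."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if self.spent + epsilon > self.total + 1e-12:
            raise RuntimeError("privacy budget exceeded")
        self.spent += epsilon
        return self.remaining


budget = PrivacyBudget(1.0)
budget.spend(0.4)   # e.g. a first release of the sequence
budget.spend(0.6)   # e.g. a second release; the budget is now used up
print("remaining:", budget.remaining)
```

For sequential data this matters because every additional release of the same stream draws on the same budget, which is one reason utility degrades as more time steps are published.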
The Process (Mechanism) of this Work; Means How the Problem has Solved & Advantage & Disadvantage of
Each Step in This Process
Process Steps - Pre Operation | Advantage | Disadvantage (Limitation)
1 Distribution optimization | It helps to optimize the probability density function. | NA
2 Sensitivity calibration | It helps to calibrate the sensitivity and, by lowering it, makes the mechanism smoother. It also reduces the granularity of the injected noise. | NA
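Why lowering the sensitivity reduces the injected noise follows from the standard definition of the Laplace mechanism, stated here as background rather than as a result of the reviewed paper. The symbols are assumptions of this note: $f$ is a query with $k$ outputs, $D \sim D'$ are datasets that differ in one record, and $\varepsilon$ is the privacy budget.

```latex
\Delta f = \max_{D \sim D'} \lVert f(D) - f(D') \rVert_1,
\qquad
M(D) = f(D) + (Y_1, \dots, Y_k),
\quad Y_i \sim \mathrm{Lap}\!\left(\frac{\Delta f}{\varepsilon}\right),
\qquad
\mathrm{Var}(Y_i) = \frac{2\,(\Delta f)^2}{\varepsilon^2}.
```

Halving $\Delta f$ at a fixed $\varepsilon$ therefore cuts the noise variance to a quarter.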
Process Steps - During Surgery | Advantage | Disadvantage (Limitation)
1 Synopsis of the original datasets | Through transformation, composition, and decomposition, a synopsis of the original data can be built. | NA
2 Correlation exploitation | It exploits the correlation among data records and eliminates data redundancy. | NA
Major Impact Factors in this Work
In this case, big data (the dependent variable) depends on differential privacy (the independent variable). By using this privacy technique, data can be kept secure from external attackers.
Dependent Variable | Independent Variable
Big data | Differential privacy
Input and Output Feature of This Solution
Input - Pre Operation: An analysis of big data applications and the frequent causes of threats to them.
Output - Pre Operation: The output identifies different techniques through which confidential data can be kept private.
Input - During Surgery: Data utility in differentially private sequential data publishing.
Output - During Surgery: The outputs of the pre-operation provide the techniques and tools through which sequential data publishing can be kept private.
Contribution & The Value of This Work
Many techniques are available to protect sensitive data, and most of them are discussed in this paper. Within differential privacy, techniques for improving data utility are also offered as a solution.
The authors provide an extensive discussion and elaborate on all the possible solutions in terms of techniques and tools. Although some of the reasons behind the limitations are not mentioned, the authors make a significant contribution to big data security.
1. What in the method could have been better?
When designing the privacy preservation mechanism, the temporal correlation of online data could have been elaborated much better.
2. What was missed in the authors' analysis?
The future issues that may affect big data security are not clearly described in this paper. In addition, the paper does not explain how edge data sources can be kept secure from external attacks.
3. Was there a technique that could have been used, or a question that could have been asked, that the researchers did not use or ask?
A question for the authors is how future risks arising from the temporal correlation of sequential data can be minimized, if not resolved.
4. Were the conclusions justified, and how?
The conclusion is justified because it successfully outlines the serious challenges that might hinder differential privacy protection.
Analyse This Work By Critical Thinking
The differential privacy methods are divided into two groups according to perspective: first from the aspect of differential privacy itself, and second from the aspect of the original dataset.
The Tools That Assessed this Work
The tools that assessed this work include big data, decomposition, transformation, and composition tools.
Diagram/Flowchart
© Dr Abeer Alsadoon 2018_CSU Sydney Study Centre