University Report: Security and Privacy Issues in Analytics

Added on 2021/06/16

AI Summary

This report addresses security and privacy concerns in data analytics, focusing on the implementation of k-anonymity as a privacy protection model. It begins with an executive summary highlighting the growing pressure to share sensitive health information and the associated privacy risks. The report then provides an overview of the technology, including Android SDK, and how it enables k-anonymity. It details the concept of k-anonymity, explaining how it works to anonymize data and protect individual privacy within organizations. The report includes a practical implementation guide, covering the steps and considerations for applying k-anonymity to real-world scenarios. The report discusses the benefits of using k-anonymity, particularly in the context of mobile applications and data sharing. It explores the technology solution, the implementation guide, and the potential challenges and considerations for organizations adopting this approach, ultimately aiming to provide a comprehensive understanding of how k-anonymity can enhance data security and privacy in analytics.

Security & Privacy Issues in Analytics
Security and Privacy Issues in Analytics
Student Name
University Name
Date
Executive Summary
The pressure to share health information is growing, to the point that sensitive data is
sometimes made publicly available. However, disclosing personal health-related information raises
serious privacy issues. To alleviate this concern, data holders can anonymize data before
disclosing it. A popular anonymization method is k-anonymity, although the actual
re-identification risk of k-anonymous data sets has not been evaluated. This report develops an
organizational implementation guide to demonstrate how k-anonymity is used as a model for
privacy protection and how it can be implemented within an organization.
Contents
Introduction
Technology Overview
ANDROID TECHNOLOGY
K-anonymity across an organization to protect the privacy of organizational data
Technology Solution
K-Anonymity Implementation Guide
Conclusion
References
Option A
Introduction
Societies are experiencing exponential growth in the amount and variety of data collected,
including person-specific information, as computer technology, disk storage space and network
connectivity become more and more affordable. Data owners operate autonomously and with limited
knowledge, so it remains difficult for them to publish data without infringing privacy,
confidentiality or national interests. In many cases the survival of a database also depends on
the data holder's ability to produce anonymous data: not releasing the information at all may
reduce the demand for the data, whereas failing to provide proper protection in the released
version may cause harm to the public or others (Chung, 2016). This study examines how Android SDK
technology allows the implementation of k-anonymity as a model for privacy protection.
Technology Overview
Privacy protection of personal data is accomplished through many techniques, such as
k-anonymity, l-diversity and t-closeness, but these have so far been implemented only on laptop
or desktop computer systems. Nowadays people prefer to carry mobile devices rather than laptops
or desktops, because some of the work done on laptops, such as file sharing and image and video
sharing, can also be done on mobile devices. Sharing, however, may compromise data privacy. To
protect the information, the approach taken here is to apply k-anonymity: a value of k is
selected, and each individual's data in a release must be indistinguishable from that of at least
k-1 other individuals whose data also appear in the release. This technique is implemented using
the Android SDK. Whenever users request information, an anonymized version is sent instead of
the original data (Run, 2012).
ANDROID TECHNOLOGY
Android is one of the most widely used platforms on modern mobile phones. It consists of a
Linux-based kernel; middleware, libraries and APIs written in C; and application software running
on an application framework that includes Java-compatible libraries based on Apache Harmony.
Android uses the Dalvik virtual machine to run Dalvik executable (dex) code, which is usually
translated from Java bytecode. ARM is the main hardware platform for Android; the Android-x86
project adds support for x86, and Google TV uses an x86 version (Wang, Xu and Sun, 2011). Android
offers developers the means to build rich, innovative applications. Developers can use the device
hardware, access location information, run background services, set alarms, add status bar
notifications, and so on. Developers have full access to the same framework APIs that are used by
the core applications. The application architecture is designed to simplify component reuse: any
application can publish its capabilities, and any other application can then use them (subject to
the security constraints enforced by the framework) (El Emam and Dankar, 2008). The same
mechanism allows users to replace components. Underlying all applications is a set of services,
including a rich set of views for building applications, such as text boxes, tabs, buttons and
even an embeddable browser. The Resource Manager provides access to non-code resources, such as
localized graphics. The Notification Manager lets applications display their alerts in the
status bar (Nielson and Gollmann, 2014).
K-anonymity across an organization to protect the privacy of organizational data
Advances in technology offer users many ways to provide and obtain data, one of which is
mobile applications. For example, when a user requests health information from a hospital, a
mobile application can be used to immediately provide the user with the
data. The application must also provide security, because parts of the data contain sensitive
information. This protection is provided by a method called k-anonymity. The k-anonymity
property states that for each record in a release there must be at least k-1 other individuals
whose data also appear in the release and cannot be distinguished from that record. The method
rests on a notion of privacy that measures how well the original data can be estimated from the
modified data. Privacy-preserving data mining is concerned with identifying the sensitive
attributes of the original data and protecting them against direct and indirect disclosure
(Wang, 2018). Organizations use k-anonymity to protect such sensitive information. The
k-anonymity model ties protection against information leakage to how likely it is that a
respondent can be recognized from the released information. K-anonymous data mining has long
been regarded as a means of protection, while mobile applications are increasingly used to
release the results of data mining. Many organizations publish microdata for purposes such as
medical, business and demographic research (Fellows, Tan and Zhu, 2010). Disclosing this data
may compromise privacy. To provide anonymity, explicit identifiers (for example social security
numbers, addresses and phone numbers) are encrypted. Other attributes, however, such as gender,
age and zip code, can identify an individual when combined. The information available to an
attacker is therefore a serious problem. Personal data is collected daily when people purchase
goods or carry out everyday activities (for example financial or demographic) (Huang, 2014).
Sites such as Amazon and Flipkart, like other information agencies, hold information about
ordinary consumers (Chen and Li, 2016). Such data, often made publicly available, can easily be
sold and can be used to link identities to de-identified information. This data is usually
stored in Excel format, which can include
name, age, postal code and date of birth, mainly in medical, personal, financial and other
fields. This data may be used for research purposes, but personal privacy may be threatened as a
result. Anonymized data of this kind can be delivered by a mobile application: as soon as the
user requests data, the application can provide it (Tan, 2012).
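The k-anonymity property described above can be checked mechanically. The following Python sketch is illustrative only: the column names, the sample records and the function names are assumptions made for this example and do not come from the application described in this report. It groups records by their quasi-identifier values (here gender, age band and zip prefix) and tests whether every group contains at least k records.

```python
from collections import Counter

# Hypothetical release: the quasi-identifiers are already generalized and
# "diagnosis" is the sensitive attribute. All values are made up.
RELEASE = [
    {"gender": "F", "age": "30-39", "zip": "479**", "diagnosis": "asthma"},
    {"gender": "F", "age": "30-39", "zip": "479**", "diagnosis": "flu"},
    {"gender": "M", "age": "50-59", "zip": "473**", "diagnosis": "diabetes"},
    {"gender": "M", "age": "50-59", "zip": "473**", "diagnosis": "flu"},
    {"gender": "M", "age": "50-59", "zip": "473**", "diagnosis": "asthma"},
]
QUASI_IDENTIFIERS = ("gender", "age", "zip")


def group_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every record shares its quasi-identifier values with at least k-1 others."""
    return all(n >= k for n in group_sizes(records, quasi_identifiers).values())


if __name__ == "__main__":
    print(is_k_anonymous(RELEASE, QUASI_IDENTIFIERS, 2))
    print(is_k_anonymous(RELEASE, QUASI_IDENTIFIERS, 3))
```

With these sample records the smallest group has two members, so the table is 2-anonymous but not 3-anonymous.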
Technology Solution
With advances in technology over the past decades, medical institutions have accumulated a
large amount of health-related electronic data. This data is a valuable resource for
researchers, decision-makers and analysts. For instance, epidemiologists can use emergency visit
records to identify outbreaks that demand further investigation and to take appropriate action
in time. Public health information is likewise provided to the public as part of public health
awareness and education. For instance, electronic birth certificates can provide a rich source
for researchers studying risk factors for infant deaths or other adverse reproductive outcomes,
and they can give advocates, healthcare providers, government and non-profit organizations
specific local data on child health issues to help with policy development. Like other health
departments across the country, the Marion County Public Health Department (MCPHD) in Indiana
gives access to public data marts; its Data Mart is an internet-based program that provides
birth and death data. The user can retrieve information on two characteristics, such as birth
risk factors summarized by year, census, race and so on (Agrawal and Kesdogan, 2013).
Protecting the privacy of personal data on handheld devices motivates the development of an
Android application. Because of the huge increase in mobile usage, many people prefer to work on
mobile phones rather than laptops: the phone is not limited to audio or video use but can also
be used for several other purposes, such as exchanging files and health or financial
information. Therefore,
when users need to exchange personal information, or when companies or researchers request it,
the information must be provided immediately without compromising the security or privacy of the
personal data. This protection is provided by anonymizing the data with the k-anonymity method,
where the value of k determines the level of anonymity. A release is k-anonymous if the
information about each person cannot be distinguished from that of at least k-1 other
individuals whose information also appears in the release. Mobile applications are one of the
many ways in which technology lets users provide data, and during such exchanges sensitive
information may leak. For example, information may leak when two bank managers exchange details
of their branch offices, or when employees in different branches of a hospital exchange
health-related information (Run, 2012). For this reason, privacy must be provided whenever
sensitive information is exchanged. It is provided by anonymizing the data with the k-anonymity
method, which is implemented as an Android application. K-anonymity is a well-suited technique
for releasing large
amounts of data for business use.
K-Anonymity Implementation Guide
K-anonymity is a method used to ensure that an individual cannot be distinguished from at least
k-1 other individuals. However, most approaches to k-anonymization focus on improving the
efficiency of the algorithms rather than on ensuring that the anonymized data remains useful
from a researcher's point of view (Run et al., 2012). A new data utility metric is introduced
called research value (RV), which extends existing utility measures by taking data constraints
into account in order to make queries against the anonymized data more effective. To anonymize
a given set of original data, two algorithms are proposed that use predefined generalizations
supplied by a data content expert, together with the corresponding research values, to assess
the data utility of each attribute as the data is generalized to ensure anonymity. In addition,
an automated algorithm is proposed that uses clustering and the RV to anonymize data sets. All
of the proposed algorithms scale well when the number of attributes in the data set is large
(Sweeney, 2012).
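The idea of generalizing attributes along predefined hierarchies until the data becomes k-anonymous can be sketched as follows. This is not the RV-based algorithm referred to above, whose utility formula is not given in this report; as a stand-in, the sketch simply prefers the lowest total generalization level. The hierarchies (10-year age bands per level, one masked zip digit per level), the sample records and all function names are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical raw records: (age, zip code).
RAW = [(34, "47906"), (36, "47905"), (31, "47901"),
       (52, "47302"), (58, "47305"), (55, "47309")]


def generalize_age(age, level):
    """Level 0 keeps the age; level n maps it to a band 10*n years wide."""
    if level == 0:
        return str(age)
    width = 10 * level
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def generalize_zip(zip_code, level):
    """Replace the last `level` digits of the zip code with '*'."""
    level = min(level, len(zip_code))
    return zip_code[:len(zip_code) - level] + "*" * level


def smallest_group(rows, age_level, zip_level):
    """Size of the smallest group after generalizing both attributes."""
    groups = Counter((generalize_age(a, age_level), generalize_zip(z, zip_level))
                     for a, z in rows)
    return min(groups.values())


def least_generalization(rows, k, max_level=3):
    """Return the (age_level, zip_level) pair with the lowest total level that
    makes the rows k-anonymous, or None if no pair up to max_level does."""
    for total in range(2 * max_level + 1):
        for age_level in range(max(0, total - max_level), min(total, max_level) + 1):
            zip_level = total - age_level
            if smallest_group(rows, age_level, zip_level) >= k:
                return age_level, zip_level
    return None


if __name__ == "__main__":
    print(least_generalization(RAW, 3))
```

For the sample records, one level of generalization on each attribute (10-year age bands and one masked zip digit) is the first combination found that yields groups of three. A utility-driven method would replace the "lowest total level" rule with a measure of how much analytical value each generalization destroys.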
The main modules and fields of the application are as follows:
Login: This module authenticates users. Once logged in, the user can carry out the operations
that the algorithm requires.
Browsing: The browse module is used to choose a specific Excel file from a directory and run
the k-anonymity algorithm on it. The name of the selected file is displayed in the file path
field (Li and Yang, 2013).
Reset file: This module is used to select a new file or to stop an operation that is in
progress. With a single button click, the application refreshes and clears all fields.
Do k-anonymous: When this button is clicked, the application first validates the selected file.
Only files in Excel format are accepted; if any other type of file is selected, an error message
is displayed. If the chosen file is an Excel file, the application starts processing it.
Result file name: The operations performed on the selected file generate an output that is
saved in another file or folder, under the name given in the result file or folder name field
and in the same path as the input file. The resulting file can be sent to anyone who needs it
(Chen and Li, 2016).
Sensitive columns: The selected file can contain any number of sensitive columns, and the user
chooses them as needed. The choice is made from the attributes (column names) found in the
given input file. These attributes are displayed in a spinner, and no anonymization operations
are performed on the sensitive columns (Suryadevara and Rizkalla, 2015).
Number of columns: In this module the user selects how many columns are to be processed. The
minimum and maximum values are determined automatically from the values the user supplies and
are sorted by value in this field. Only numeric values are displayed in the spinner. Once the
user has entered the number of columns, only that number of key columns with numeric values is
offered for modification.
Repeat columns: Depending on the number of columns chosen by the user, any unique values in
those columns are repeated so that others cannot recognize the original value in a column (Rao,
Sheshikala and Prakash, 2017). In this unit the attributes with string values are replicated,
and they are displayed in spinners within the numeric range.
Create * column: The user selects the number of unique columns to be hidden. In this unit,
every row is compared with the other rows (Yuan, Wu and Lu, 2013). Where rows are otherwise
similar, the values that differ from one row to another are replaced with an asterisk (*). This
unit works on the attributes that have numeric values in a column.
Mail attachment: When the algorithm has finished, its output is saved under the result file
name. The result file can be viewed and attached with a single click on the attachment button,
and the send button emails it to the address of the intended end user (Liu and Wang, 2010).
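A few of the module behaviours listed above can be sketched in Python. This is a hedged illustration, not the application's code: the function names, the accepted extensions, the example path and the sample rows are assumptions, and reading the Excel file itself is left out. The sketch covers the Excel-only file check, placing the result file in the same path as the input file, and one possible reading of the asterisk step, in which the chosen columns are masked in every row whose combination of values occurs fewer than k times.

```python
from collections import Counter
from pathlib import PurePosixPath

# Assumed set of extensions treated as "Excel format".
EXCEL_SUFFIXES = {".xls", ".xlsx"}

# Hypothetical rows as they might look after loading a spreadsheet.
ROWS = [
    {"age": 34, "zip": "47906", "diagnosis": "flu"},
    {"age": 34, "zip": "47906", "diagnosis": "asthma"},
    {"age": 58, "zip": "47302", "diagnosis": "diabetes"},
]


def is_excel_file(path):
    """Accept only files whose extension marks them as Excel workbooks."""
    return PurePosixPath(path).suffix.lower() in EXCEL_SUFFIXES


def result_path(input_path, result_name):
    """Place the result file in the same directory as the input file."""
    return str(PurePosixPath(input_path).with_name(result_name))


def mask_rare_rows(rows, columns, k):
    """Return copies of the rows in which the chosen columns are replaced by '*'
    wherever the row's combination of values occurs fewer than k times."""
    counts = Counter(tuple(row[c] for c in columns) for row in rows)
    masked = []
    for row in rows:
        key = tuple(row[c] for c in columns)
        if counts[key] < k:
            row = {**row, **{c: "*" for c in columns}}
        masked.append(row)
    return masked


if __name__ == "__main__":
    source = "/sdcard/data/patients.xlsx"  # hypothetical path
    if is_excel_file(source):
        print(result_path(source, "patients_result.xlsx"))
        for row in mask_rare_rows(ROWS, ["age", "zip"], 2):
            print(row)
    else:
        print("Error: only Excel files are accepted")
```

Note that the masked rows themselves form a group; if fewer than k rows end up masked, that group would still need to be checked, merged or dropped before release.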
Guidelines for Applying k-Anonymity
How k-anonymity should be applied depends on the re-identification scenario that one is
protecting against. To protect against the prosecutor re-identification scenario, in which the
intruder knows that the target individual is in the released data, k-anonymity must be used. If
the prosecutor scenario does not apply, k-anonymity is not recommended and k-map should be used
instead (or the hypothesis testing method D4) (Yuan, Wu and Lu, 2013). If both scenarios are
plausible, k-anonymity must be used because it is the most protective. It is therefore
important to decide whether the prosecutor scenario applies (Kasurde and Bhati, 2016).
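The difference between the two scenarios can be made concrete with a small calculation. In the sketch below, which uses made-up records and hypothetical function names, the prosecutor risk is taken as one over the size of the smallest group in the released sample, while the risk in the other scenario (often called the journalist scenario, where the intruder does not know who is in the sample) is taken as one over the size of the corresponding group in the population from which the sample was drawn. It assumes that every released record also appears in the population list.

```python
from collections import Counter

QI = ("gender", "age")

# Hypothetical population from which the released sample was drawn.
POPULATION = (
    [{"gender": "F", "age": "30-39"}] * 4
    + [{"gender": "M", "age": "50-59"}] * 2
)
# Hypothetical released sample: one of the women and both men.
SAMPLE = [{"gender": "F", "age": "30-39"}] + [{"gender": "M", "age": "50-59"}] * 2


def class_sizes(records, quasi_identifiers):
    """Count the records that share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def prosecutor_risk(sample, quasi_identifiers):
    """Worst-case risk when the intruder knows the target is in the sample:
    1 / (size of the smallest group in the sample)."""
    return 1 / min(class_sizes(sample, quasi_identifiers).values())


def journalist_risk(sample, population, quasi_identifiers):
    """Worst-case risk when the intruder does not know who is in the sample:
    1 / (population size of the smallest group that appears in the sample)."""
    population_sizes = class_sizes(population, quasi_identifiers)
    sample_groups = class_sizes(sample, quasi_identifiers)
    return max(1 / population_sizes[group] for group in sample_groups)


if __name__ == "__main__":
    print(prosecutor_risk(SAMPLE, QI))
    print(journalist_risk(SAMPLE, POPULATION, QI))
```

For these records the prosecutor risk is 1.0, because one woman is unique in the sample, so the sample is not 2-anonymous. The journalist risk is 0.5, because every group in the sample has at least two members in the population, so the release satisfies 2-map. This is why k-anonymity is the more protective, and sometimes over-protective, criterion.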
Conclusion
The study concludes that the exchange of data is very important, and that data utility needs to
be maximized while private information is protected. In a k-anonymous table, every combination
of quasi-identifier values is shared by at least k records. K-anonymity protects records
against re-identification, but it has limits: when the sensitive values within a group lack
diversity, a k-anonymous table can still reveal information, and k-anonymity cannot prevent
attacks based on background knowledge. Disclosure of such data can cause serious privacy issues.
For example, consider clinical and experimental data on the individuals who took part in a
clinical trial, published alongside the trial articles on a journal's website. If an
individual's record in this public data can be re-identified, that person's privacy is violated.
Such incidents can reduce the number of people willing to take part in studies and, if they
occur in Canada, would violate privacy laws. It is therefore important to understand precisely
the types of re-identification attack that can be launched against a data set, and the ways in
which the data can be properly anonymized before it is disclosed. Anonymization distorts data.
Over-anonymization reduces data quality, which can make the data unsuitable for some analyses
and may lead to erroneous or biased results. It is therefore very important to balance the
degree of anonymization against the amount of information lost.
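The lack-of-diversity problem mentioned above can be illustrated with a short check. The sketch below uses a made-up table and hypothetical function names; it counts the distinct sensitive values in each group, which corresponds to the simplest ("distinct") form of the l-diversity idea named in the Technology Overview.

```python
from collections import defaultdict

QI = ("age", "zip")

# Hypothetical 2-anonymous table: both records in the first group share a diagnosis.
TABLE = [
    {"age": "30-39", "zip": "479**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "479**", "diagnosis": "flu"},
    {"age": "50-59", "zip": "473**", "diagnosis": "asthma"},
    {"age": "50-59", "zip": "473**", "diagnosis": "diabetes"},
]


def sensitive_values_per_group(records, quasi_identifiers, sensitive):
    """Collect the set of sensitive values seen in each quasi-identifier group."""
    groups = defaultdict(set)
    for r in records:
        groups[tuple(r[q] for q in quasi_identifiers)].add(r[sensitive])
    return dict(groups)


def is_l_diverse(records, quasi_identifiers, sensitive, min_distinct):
    """True if every group contains at least `min_distinct` different sensitive values."""
    groups = sensitive_values_per_group(records, quasi_identifiers, sensitive)
    return all(len(values) >= min_distinct for values in groups.values())


if __name__ == "__main__":
    print(is_l_diverse(TABLE, QI, "diagnosis", 2))
```

The table is 2-anonymous, yet anyone who knows that a person in their thirties from the 479 area is in it learns that the person has flu, because that group contains only one distinct diagnosis.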
References
Agrawal, D. and Kesdogan, D. (2013). Measuring anonymity: the disclosure attack. IEEE
Security & Privacy Magazine, 1(6), pp.27-34.
Chen, X. and Li, Y. (2016). The Causality Test of Network Technical Anonymity and Perceptive
Anonymity. International Journal of Future Generation Communication and Networking, 9(3),
pp.279-288.
Chung, W. (2016). Social media analytics: Security and privacy issues. Journal of Information
Privacy and Security, 12(3), pp.105-106.
El Emam, K. and Dankar, F. (2008). Protecting Privacy Using k-Anonymity. Journal of the
American Medical Informatics Association, 15(5), pp.627-637.
Fellows, M., Tan, X. and Zhu, B. (2010). Frontiers in algorithmics and algorithmic aspects in
information and management.
Huang, X., Liu, J., Han, Z. and Yang, J. (2014). A new anonymity model for privacy-preserving
data publishing. China Communications, 11(9), pp.47-59.
Kasurde, A. and Bhati, P. (2016). Implementation of Robust Barcode Modulation Mechanism for
Large Data Trans Reception Using Android Device. International Journal of Engineering and
Computer Science.
Li, W. and Yang, G. (2013). Energy-saving data aggregation algorithm for protecting privacy
and integrity. Journal of Computer Applications, 33(9), pp.2505-2510.