Emerging Technologies and Innovation: Privacy Threats in Big Data

Added on  2023/06/12

Report
AI Summary
This report investigates privacy threats associated with big data, focusing on the increasing breaches and lack of privacy enforcement in user data. It highlights the opportunities and challenges presented by big data, particularly concerning security and privacy protocols. Through a systematic literature review, the study identifies data breaches and abuse as major issues. The proposed solution involves using Fully Homomorphic Encryption (FHE), other encryption methods, and Intrusion Detection and Prevention Systems (IDS/IPS) to enhance data security. It discusses the limitations of traditional security measures and suggests modern approaches, including Open Rights Management systems, to ensure data privacy and anonymity, especially in social media data mining. The approach assumes initial registration of system services on the platform, assigning unique credentials to each service. The rights management platform manages user-generated content (UGC), enabling secure storage of content with user-defined permissions and restrictions, thereby retaining privacy.
EMERGING TECHNOLOGIES AND INNOVATION: PRIVACY THREATS IN A BIG DATA 1
Emerging Technologies and Innovation: Privacy Threats in Big Data
Name
Date
Table of Contents
Abstract
Introduction
Related Work
Methodology
Challenges and Issues
Proposed Approach
Performance Evaluation
Conclusion
Abstract
This paper investigates and discusses the concept of privacy in relation to user information,
prompted by the increase in breaches and the lack of privacy enforcement for user data and
information when big data is used for various purposes, including business and technological
innovation applications. A review of past work and a systematic literature review identify data
breaches and abuse as major big data privacy issues. This paper proposes the use of Fully
Homomorphic Encryption (FHE), together with other encryption methods and Intrusion Detection and
Prevention Systems (IDS/IPS), to guarantee the privacy and security of user data.
An experimental evaluation of the method shows promising results.
Introduction
The advent of big data has created numerous opportunities for businesses and organizations; in
the process, data has been generated in volumes that exceed the capacity of commonly used
software tools for proper capture, management, and timely analysis and use. The quantity of data
to be analyzed is expected to double every two years. Most of this data is unstructured and comes
from various inputs, including sensors, social media, surveillance, scientific applications, image
and video archives, medical records, internet searches and indexes, system logs, and business
transactions (Kerr & Earle, 2016). The number of devices connected to the Internet of Things
continues to increase to unprecedented levels, generating large amounts of data that require
processing before they can be understood and used productively. It has also become popular and
cost-effective to use on-demand, cloud-based computing and processing power to analyze and gain
insights from this data. As big data expands, the traditional security and privacy protocols
tailored to private computing systems, such as demilitarized zones and firewalls, are no longer
effective (Kache, 2015). In big data, security protocols are expected to work across heterogeneous
hardware, network domains, and operating system components. The collection and use of people’s
data in big data applications has been met with stiff resistance from consumers, with growing
concerns expressed over the methods that organizations use to collect and use private data and
information (Martin, 2015; 'Le VPN', 2017). The potential impact of privacy and security
breaches is highlighted by the recent Facebook case, in which there was a massive breach of
privacy and security with regard to user data ('The Economic Times', 2018). This paper discusses
the issue of privacy in big data by first reviewing related work, then setting out the research
methodology, discussing the challenges and issues faced, and presenting a proposed approach,
before evaluating performance and drawing conclusions.
Related Work
According to Lu, Zhu, Liu, Liu, & Shao (2014), big data has received great attention in recent
times because it can generate new and useful knowledge for economic and technical benefit, and
because of the challenges posed by its high volume, high velocity, and variety (the 3V’s). Apart
from the 3V challenges, security and privacy have also emerged as an important issue in big data:
if data is not authentic, the mined information is unconvincing, and if privacy is not properly
addressed, there may be reluctance or resistance to data sharing. As such, the authors propose an
efficient privacy-preserving mechanism, based on an algorithm, to guarantee security in big data.
In a systematic review of literature, Kaisler, Armour, Espinosa, & Money (2013) discuss the
concept of big data and the issues and challenges
facing big data, moving forward. The authors discuss the issues facing big data, including
volume, processing, storage, transportation, and ownership, providing a basis on which to
understand big data (Kaisler, Armour, Espinosa, & Money, 2013). Xu, Jiang, Wang, Yuan, & Ren
(2014), through a review of literature and methods of data mining, specifically the Knowledge
Discovery in Databases (KDD) process, discuss the techniques used in KDD in light of big data
privacy and security risks. By analyzing the KDD process, the authors identify steps that can
eventually result in data breaches or loss of privacy, including data integration, data
selection, and data transformation.
Further, the authors identify the types of users involved in KDD applications, including data
providers, data collectors, data miners, and decision makers. Following this review, the authors
propose methods to ensure privacy and data protection while undertaking data mining. The
proposed approaches include privacy-preserving association rule mining, privacy-preserving
classification of data, decision trees, Naive Bayesian classification, and data
provenance. These methods apply to different players in data mining. Moura & Serrão (2015)
point to the increased sharing of personal data and information on public clouds and social
networks from a variety of devices, which makes data privacy and security, especially in the
context of big data, an important and pressing issue. The authors also argue that traditional
methods for enhancing data security, including demilitarized zones and firewalls, are not
suitable for supporting security in big data computing systems. By reviewing existing literature,
discussing some of the sources and causes of risks to data security in big data, and using case
studies, the authors propose Software Defined Networking (SDN) as a novel approach to
implementing security in big data and addressing data privacy concerns.
Narayanan, Huey, & Felten (2016) argue that once data is released to the public, it is not
possible to take it back. Over time, additional datasets become public, and further analytics can
reveal more information about the original data, including personally identifiable information
(PII). This makes big data increasingly vulnerable to re-identification, especially because the
ad hoc de-identification methods presently in use are prone to exploitation by adversaries. It is
not possible to know the probability of data being re-identified in the future, and so the authors
call for a precautionary approach to securing privacy in big data. They add that risks to data
privacy go beyond stereotypical re-identification, and that it is impossible to know for certain
the privacy risk for data protected using ad hoc de-identification. According to Tene & Polonetsky
(2013), big data, data mining, and data analytics play a huge and critical role; data can be mined
and analyzed in its raw form without the need to store and access data from structured databases.
However, it comes with the challenge and
problem of data privacy concerns, which can provoke a regulatory backlash that stifles the
benefits of big data.
The researchers propose that policy makers must balance the benefits of big data against
privacy concerns, especially the need for privacy and the definition of personally identifiable
information (PII). Sagiroglu & Sinanc (2013) discuss the concept of big data and its various
aspects, including sources of data and their transmission, storage, and mining, and then
discuss in detail the privacy issues and concerns in big data. The authors, in an extensive review
of literature, show that keeping data in a single place increases the chance of breaches because
it becomes a target for attacks. The authors propose controlled storage management with encryption,
restricted access to data, and securing the networks through which big data is managed. Terzi,
Terzi, & Sagiroglu (2015) provide a fresh perspective on big data security and privacy, in which
extra security measures must be put in place. The authors suggest, based on their
research and literature review, that big data networks need additional protection: encryption,
controlled access to devices and to network resources, anonymization of data before it is
analyzed, communication over secure channels, and continuous monitoring of networks for threats.
Methodology
This paper uses a critical systematic review of literature, in which clearly formulated
questions guide explicit and systematic approaches to the identification,
selection, and critical appraisal of relevant research, and to collecting and analyzing data from
those studies, in order to generate novel solutions to the issue of privacy in big data.
Challenges and Issues
As more data is collected from connected devices and systems, existing security
measures such as firewalls and DMZs are becoming increasingly irrelevant as a means of ensuring
big data security. The present issues in big data security and privacy fall into four main areas:
infrastructure, data privacy, data management and integrity, and reactive security (Kaisler,
Armour, Espinosa, & Money, 2013). With regard to infrastructure, the main issues include secure
distributed data processing and security and privacy best practices for non-relational databases.
With regard to data privacy, the main issues include data analysis through data mining methods
that preserve data privacy, the use of cryptography for data security and privacy, and granular
access control. The challenges in data management and integrity relate to granular audits, secure
storage of data and transaction logs, and data provenance. Reactive security and privacy issues
concern validation and
end-to-end filtering, and real-time supervision of privacy and security levels. The Internet of
Things (IoT) is a major area of concern for privacy and security in big data. It has become
difficult to do anything in present-day life without one’s identity being associated with the
task, from surfing the web to making social media comments and engaging in e-commerce. Security is
also greatly compromised by vulnerabilities such as insecure web interfaces, insufficient
authentication and authorization, lack of encryption, insecure cloud and mobile device interfaces,
inadequate security configurability, insecure firmware and software, and poor physical security. In
addition, companies track and collect user data without users’ knowledge and pass it on to third
parties such as marketers for commercial gain, exposing private user data without consent.
Proposed Approach
A novel approach is proposed that combines several methods, tools, and techniques to
ensure data privacy and security are maintained in big data use. The limitations of traditional
techniques for ensuring data privacy and security can be overcome using modern approaches that
include Fully Homomorphic Encryption (FHE), Secure Function Evaluation (SFE), and Functional
Encryption (FE). FHE is an encryption approach that allows arbitrary computations to be
performed directly on ciphertext, producing encrypted results that, when decrypted, match the
results of the same operations performed on the plaintext; partially homomorphic schemes such as
unpadded RSA, by contrast, support only a single type of operation. FHE enables database queries
to be encrypted and keeps user information private from the location where the data is stored. FHE
also allows private, encrypted queries to search engines, which helps ensure that private user
information remains private. Searches can also be conducted on encrypted data, such as encrypted
social media data, which helps keep identities private. The approach also uses an open rights
management system, specifically OpenSDRM, a system architecture that allows different content
business models to be implemented. The architecture is shown below.
This approach, together with FHE, will ensure social media information is mined with
privacy and anonymity retained. The proposed approach assumes initial registration of system
services on the platform, meaning that each of the different services has to be registered
individually. Unique credentials are assigned to each service on the platform. The
rights management platform manages user-generated content (UGC) and enables secure storage of
content in locations that have been configured. When social media users upload UGC, it
remains protected, and the permissions, rights, and restrictions on this content are user-defined,
which helps retain privacy. Content generators and those wishing to use such content, such
as data mining firms, are registered and authenticated on the social network platform as well as
on the rights management platform. Because users wishing to access UGC on the platform must be
registered and authenticated, and given that UGC is presented in a special URI form, user privacy
is achieved.
This is because the special URI is intercepted by the rights management platform,
allowing a secure access process. Another approach is an intelligent intrusion detection and
prevention system (IDS/IPS) based on a software-defined network (SDN). A Kinetic module
controls the IDS/IPS behavior using the Kinetic language, a framework for controlling
SDNs in which network policies can be defined as finite state machines (FSMs). Several types of
dynamic event can trigger transitions between FSM states. The IDS/IPS security module ensures that
traffic from infected, non-privileged hosts is dropped; traffic from an infected but privileged
host is automatically redirected to a garden wall host, where corrective measures are taken
on the infected host; and a non-infected host has its traffic directed to the intended destination.
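The per-host policy described above can be sketched as a small finite state machine. The following is a minimal illustration in plain Python, not Kinetic code: the names `HostPolicy`, `Action`, and `on_event` are hypothetical, and the drop rule is read here as applying to hosts that are both infected and non-privileged, which is an assumption about the intended behavior.

```python
from enum import Enum


class Action(Enum):
    """Forwarding decisions available to the hypothetical IDS/IPS module."""
    FORWARD = "forward_to_destination"
    REDIRECT = "redirect_to_garden_wall"
    DROP = "drop"


class HostPolicy:
    """Per-host state machine; the state is the pair (infected, privileged)."""

    def __init__(self, privileged=False):
        self.infected = False
        self.privileged = privileged

    def on_event(self, name, value):
        """Apply a dynamic event, move to the new state, return the action."""
        if name == "infected":
            self.infected = bool(value)
        elif name == "privileged":
            self.privileged = bool(value)
        else:
            raise ValueError(f"unknown event type: {name}")
        return self.action()

    def action(self):
        """Map the current state to a forwarding decision."""
        if not self.infected:
            return Action.FORWARD      # clean host: normal delivery
        if self.privileged:
            return Action.REDIRECT     # infected but privileged: garden wall
        return Action.DROP             # infected and non-privileged (assumed rule)


if __name__ == "__main__":
    host = HostPolicy(privileged=False)
    print(host.action())
    print(host.on_event("infected", True))
    print(host.on_event("privileged", True))
    print(host.on_event("infected", False))
```

Keeping the state as two booleans makes every transition explicit: each event changes one flag, and the action is always recomputed from the full state rather than stored separately.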
Performance Evaluation
Evaluating the two approaches using a simulation in Linux showed promising outcomes in
ensuring that private user data is secured. The use of FHE together with IDS/IPS not only ensures
that user data is kept private, both in databases and in internet search queries, but also that the
information remains secure from intrusion and unauthorized access, such as attacks undertaken
using hacking techniques.
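The simulation code is not given in this paper. Purely as an illustration of the property the encryption component relies on, namely that a computation on ciphertexts decrypts to the result of the same computation on plaintexts, the sketch below uses unpadded ("textbook") RSA with tiny primes. This is an assumption-laden toy: textbook RSA is only multiplicatively homomorphic, so it is not FHE, and the parameters and function names are hypothetical and insecure.

```python
# Toy demonstration of a homomorphic property using textbook RSA.
# NOT fully homomorphic and NOT secure: tiny primes, no padding.

P, Q = 61, 53                 # assumed toy primes
N = P * Q                     # public modulus
PHI = (P - 1) * (Q - 1)       # Euler's totient of N
E = 17                        # public exponent, coprime with PHI
D = pow(E, -1, PHI)           # private exponent (modular inverse, Python 3.8+)


def encrypt(m):
    """Encrypt an integer 0 <= m < N with the public key (E, N)."""
    return pow(m, E, N)


def decrypt(c):
    """Decrypt a ciphertext with the private key (D, N)."""
    return pow(c, D, N)


def homomorphic_product(a, b):
    """Multiply two values while they are encrypted, then decrypt the result."""
    c = (encrypt(a) * encrypt(b)) % N   # the server only ever sees ciphertexts
    return decrypt(c)


if __name__ == "__main__":
    a, b = 7, 12
    # The decrypted product of ciphertexts equals the plaintext product mod N.
    assert homomorphic_product(a, b) == (a * b) % N
    print("decrypted product matches plaintext product")
```

A fully homomorphic scheme extends this idea to both addition and multiplication, which is what allows arbitrary queries to be evaluated over encrypted data.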
Conclusion
The increased use of big data and of interconnected devices, as well as technological
advancements, has led to massive data volumes being generated. The generation and use of big
data has several economic and technical innovation benefits, but also raises risks of data privacy
breaches, along with the 3V’s challenges. In this paper, past approaches have been evaluated
using a systematic review of literature, and a combined approach using FHE encryption technologies
and IDS/IPS has been proposed to ensure personal user data remains private and secure, even when
it is mined for insights in big data analytics. An evaluation of the approach shows the proposed
methods are highly promising in ensuring big data privacy and security.
References
Kaisler, S., Armour, F., Espinosa, A., & Money, W. (2013). Big Data: Issues and Challenges Moving
Forward. In 46th Hawaii International Conference on System Sciences (pp. 995-1003).
Hawaii: IEEE Computer Society.
Kache, F. (Ed.). (2015). Dealing with digital information richness in supply chain management: A
review and a Big Data analytics approach. Kassel: Univ.-Press.
Kerr, I., & Earle, J. (2016, August 10). Prediction, Preemption, Presumption | Stanford Law Review.
Retrieved from https://www.stanfordlawreview.org/online/privacy-and-big-data-prediction-preemption-presumption/
Xu, L., Jiang, C., Wang, J., Yuan, J., & Ren, Y. (2014). Information Security in Big
Data: Privacy and Data Mining. IEEE Access, 2, 1149-1176.
http://dx.doi.org/10.1109/access.2014.2362522
'Le VPN'. (2017, October 10). Why Do Companies Collect Big Data and Store Personal Data? | Le
VPN. Retrieved from https://www.le-vpn.com/why-companies-collect-big-data/
Lu, R., Zhu, H., Liu, X., Liu, J. K., & Shao, J. (2014). Toward efficient and privacy-preserving
computing in big data era. IEEE Network, 28(4), 46-50. doi:10.1109/mnet.2014.6863131
Martin, K. E. (2015). Ethical Issues in Big Data Industry. MIS Quarterly Executive, 4(2), 67-85.
Retrieved from
https://www.researchgate.net/publication/273772472_Ethical_Issues_in_Big_Data_Industry
Moura, J., & Serrão, C. (2015). Security and Privacy Issues of Big Data. Handbook Of Research On
Trends And Future Directions In Big Data And Web Intelligence, 3(1), 20-52.
http://dx.doi.org/10.4018/978-1-4666-8505-5.ch002
Narayanan, A., Huey, J., & Felten, E. (2016). A Precautionary Approach to Big Data Privacy. Data
Protection On The Move, 24, 357-385. http://dx.doi.org/10.1007/978-94-017-7376-8_13
Tene, O., & Polonetsky, J. (2013). Big Data for All: Privacy and User Control in the Age of
Analytics. Northwestern Journal Of Technology And Intellectual Property, 11(5).
Sagiroglu, S., & Sinanc, D. (May 01, 2013). Big data: A review. In 2013 International Conference
on Collaboration Technologies and Systems (CTS 2013). 42-47. Ankara; Hawaii: IEEE
Computer Society.
Terzi, D., Terzi, R., & Sagiroglu, S. (2015). A Survey on Security and Privacy Issues in Big Data.
In The 10th International Conference for Internet Technology and Secured Transactions (pp.
202-206). London: International Conference for Internet Technology and Secured
Transactions.
'The Economic Times'. (2018, April 11). Mark Zuckerberg apologises to Congress over massive
Facebook breach. Retrieved from https://economictimes.indiatimes.com/tech/internet/mark-zuckerberg-apologises-to-congress-over-massive-facebook-breach/articleshow/63704093.cms