Ethical Issues and Challenges in Big Data Systems - INF30018

Added on  2023/01/03

Report
AI Summary
This report provides an in-depth analysis of the ethical issues surrounding big data. It begins with an executive summary outlining the core problems, including security and privacy breaches, discrimination, and challenges in data sharing and access. The introduction sets the stage by highlighting the exponential growth of data and the features that distinguish big data from massive datasets. The report then delves into three key ethical issues: security and privacy, focusing on data breaches, the importance of data security, and the need for skilled professionals to protect vast datasets; discrimination, discussing how big data analytics can lead to discriminatory practices and the need for transparency and anti-discrimination laws; and information sharing and data access, exploring issues related to data governance, the need for open data, and the challenges of data sharing between companies. The report concludes by emphasizing that if the tools and techniques of big data are implemented and used properly, the challenges of the big data system can be mitigated. The report provides detailed discussions on each of these areas, citing relevant literature and offering potential solutions, making it a valuable resource for understanding the ethical implications of big data.
Running head: BIG DATA
Big Data
Name of the Student
Name of the University
Author’s Note
Executive Summary
Big data is a huge volume of data that pours in from many different sources and arrives in
varied formats. Large amounts of data were already stored in databases, but a customary
database system cannot handle data of this scale and variety. Big data is therefore a
collection of data sets in varied formats, and it is a crucial asset that can yield innumerable
benefits. To combat privacy and security breaches, organizations should invest in high-quality
anti-malware software and implement entry security such as tokenization or encryption. The
connection between the data collection system and the data storage system should be secure.
Organizations also need to ensure that inaccurate data is never passed to a trusted analytics
system, so that the data analytics tools can deliver a high level of accuracy. There are also
many legal protections that help secure the data in a big data system. The government has
identified the issues and challenges for security and privacy; however, many of them have
not yet been addressed. This study concludes that if the tools and techniques of big data are
implemented and used properly, they can mitigate the challenges of the big data system.
Table of Contents
Introduction
Ethical Issues related to Big Data
Issue 1: Security and Privacy
Issue 2: Discrimination
Issue 3: Sharing of Information and Data Access
Conclusion
References
Introduction
Over the past 20 years, information has grown on a large scale in many different fields.
According to a 2011 report from IDC (International Data Corporation), the total volume of
data created and copied was 1.8 ZB, on the order of 10^21 B. This volume had increased nearly
nine times in five years (Chen, Mao and Liu 2014), and the figure is expected to double every
two years. Big data is an abstract concept. Apart from sheer volume, it has other features
that distinguish it from merely "massive data" or "very big data". Big data refers to data
sets that cannot be perceived, acquired, managed and processed by traditional software,
hardware and IT tools (Hashem et al. 2015). The importance of big data integration has led to
substantial research over the past few years on topics such as data fusion, record linkage and
schema mapping to deal with the challenges of big data integration (Dong and Srivastava
2013). According to Gandomi and Haider (2015), the fast adoption and evolution of big data
has leapfrogged the discourse to popular outlets. They highlight the need to develop efficient
and appropriate methods for leveraging the massive volumes of varied data in unstructured
audio, video and text formats, which constitute about 93% of big data.
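
The two growth figures quoted from Chen, Mao and Liu (2014) can be compared with a short calculation. The following is a minimal sketch in Python; the function names are hypothetical and the model assumes steady compound growth, which the report does not state. Under that assumption, a ninefold increase in five years corresponds to a yearly growth factor of roughly 1.55 and a doubling time of roughly 1.6 years, which is somewhat faster than doubling every two years (a yearly factor of roughly 1.41).

```python
import math

def annual_factor(total_factor: float, years: float) -> float:
    """Yearly growth factor implied by growing `total_factor` times in `years` years.

    Assumes steady compound growth (an assumption made for this sketch).
    """
    return total_factor ** (1 / years)

def doubling_time(total_factor: float, years: float) -> float:
    """Years needed to double at the growth rate implied by the arguments."""
    return years * math.log(2) / math.log(total_factor)

# Ninefold growth in five years, as quoted in the introduction.
nine_in_five = annual_factor(9, 5)
# Doubling every two years, the projection quoted in the introduction.
double_in_two = annual_factor(2, 2)

print(f"yearly factor (9x in 5 years): {nine_in_five:.3f}")
print(f"yearly factor (2x in 2 years): {double_in_two:.3f}")
print(f"doubling time at the 9x-in-5-years rate: {doubling_time(9, 5):.2f} years")
```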
The issues raised by big data include security and privacy, information sharing and data
access, processing and storage, skill requirements, analytical challenges, discrimination and
technical challenges (Katal, Wazid and Goudar 2013).
Ethical Issues related to Big Data
Issue 1: Security and Privacy
In the new age of technology, data security is not limited to the personal information of
individuals; it also covers the research and analysis performed on people's data, which is
targeted at their behaviour too. Data privacy rests on the relationship between the
collection of data and its distribution to the required place (Kshetri 2014). Data security is
the protection of data about individuals that is held on a digital platform (Zhang 2018). It
aims to prevent data breaches by unwanted parties such as hackers. A data breach is the
release of confidential information into an environment that is not authorized to access it.
A breach can take several forms: data may be stolen, transmitted or copied by unauthorized
individuals.
According to Katal, Wazid and Goudar (2013), the biggest issue in the use of Big Data is the
erosion of individual privacy. A person's information should not be released to someone who
is not allowed to access it, whether deliberately or by mistake. Information about a person
may also be combined with a large data set to produce new information about that person,
and the owner of the information may not want those new facts revealed. An organization
can use the information it collects about the people working there for its own purposes, such
as adding value to the business. It does so by creating insights into the lives of its
employees, often without their knowledge. This amounts to a data breach.
The rise in social stratification is another concern. Here a literate individual takes
advantage of the information obtained from Big Data analysis about individuals who are
underprivileged. This breach takes place without the knowledge of the affected individual,
so it becomes impossible to identify the culprit.
As the size of Big Data grows day by day, the chance of a data security breach also
increases. The devices connected to Big Data are exposed as well, and many organizations
struggle to manage their databases and their data security even before the abnormalities
added through Big Data.
Protecting data from manipulation requires data security professionals with a high level of
expertise and knowledge. According to Gamage (2016), organisations must have skilled
professionals to protect their vast data, and they can invest in high-quality anti-malware
software. Another security solution can reside in Big Data-style analysis. Attribute-based
encryption is another way of protecting confidential data: it grants access on the basis of
attributes rather than on the environment where the data is stored.
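
The idea of granting access by attribute can be illustrated with a small sketch. This is not attribute-based encryption itself, which relies on specialised cryptography; it is a plain attribute-based access check in Python that shows only the policy idea. The `Record` type, the attribute names and the rule that a reader must hold every required attribute are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Record:
    payload: str
    required: frozenset  # attributes a reader must hold (hypothetical policy format)

def can_read(user_attributes: set, record: Record) -> bool:
    """Allow access only if the user holds every attribute the record requires."""
    return record.required <= user_attributes

def read(user_attributes: set, record: Record) -> str:
    """Return the payload, or raise if the attribute policy is not satisfied."""
    if not can_read(user_attributes, record):
        raise PermissionError("attributes do not satisfy the record's policy")
    return record.payload

# Example: the record is readable by HR managers, wherever the data is stored.
salary = Record("salary band C", frozenset({"hr", "manager"}))
print(can_read({"hr", "manager", "london"}, salary))
print(can_read({"hr"}, salary))
```

The decision depends only on who is asking, not on which server or database holds the record, which is the property the paragraph above describes.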
Issue 2: Discrimination
According to Kshetri (2014), big data analytics such as predictive analytics and credit
scoring offer a number of opportunities but also raise considerable concerns. Among these,
the key risk is discrimination. Although discrimination has been examined previously, an
extensive study of discrimination in big data is lacking. The author examines big data in
order to comprehend the causes and consequences of discrimination, to identify the barriers
to fair data mining and to explore potential solutions. One of the most worrying and most
researched aspects of big data techniques is their strong potential for discrimination. There
is no universally accepted definition of discrimination in big data. The term refers to acts,
practices and policies that impose a relative disadvantage on people because of their
membership of a recognized social or vulnerable group, based on ethnic minority, language,
religion, race, skin colour, political opinion, gender and so on.
According to Favaretto, De Clercq and Elger (2019), many articles have addressed the strong
potential for discrimination when big data techniques are used in daily life. The
discrimination may affect historically vulnerable categories. Predictive analytics and scoring
systems may also introduce new forms of discrimination in sectors such as healthcare and
insurance. The consequences of discrimination in big data are attributed to shortcomings of
the law and to human bias. The proposed solutions therefore include transparency-enhancing
strategies, the implementation of data protection legislation and comprehensive auditing
strategies.
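
One simple form an audit of a scoring system could take is to compare how often each group receives a favourable decision. The sketch below, in Python, is an assumption-laden illustration rather than a method taken from the cited literature: the input format, the function names and the choice of a rate ratio as the metric are all hypothetical.

```python
from collections import defaultdict

def approval_rates(decisions):
    """decisions: iterable of (group, approved) pairs; returns approval rate per group."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, ok in decisions:
        totals[group] += 1
        approved[group] += int(ok)
    return {g: approved[g] / totals[g] for g in totals}

def rate_ratio(decisions):
    """Lowest group approval rate divided by the highest (1.0 means equal rates)."""
    rates = approval_rates(decisions)
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # no group was approved at all, so the rates are equal
    return min(rates.values()) / highest

# Hypothetical credit decisions for two groups.
sample = [("a", True), ("a", True), ("a", True), ("a", False),
          ("b", True), ("b", False), ("b", False), ("b", False)]
print(approval_rates(sample))
print(rate_ratio(sample))
```

A ratio well below 1.0 does not prove discrimination on its own, but it flags a disparity that a human reviewer should examine.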
According to Barocas et al. (2017), the civil rights legislation of the last century
responded to real-world conditions. In some nations, people were denied access to the basic
building blocks of security and opportunity, such as admission to universities, housing,
employment and financial services. The denial could be based on family status, disability,
sexual orientation, gender, religion, sex, origin, race and colour. Today, anti-discrimination
laws help to enforce the tenet that all humans are to be treated equally. These safeguards
are essential protections against discrimination. Big data techniques and tools have the
potential to enhance the ability to detect and prevent discrimination. However, if the
techniques are not applied with care, the tools can mask, exacerbate or perpetuate harmful
discrimination. Data analytics may help to solve the challenges that arise. A set of
recommendations, or an exhaustive look at how to avoid discrimination, is needed as big data
becomes more central to the work of business and government.
Issue 3: Sharing of Information and Data Access
According to Varian (2014), data access is generally defined as the ability of an individual
to access and retrieve information held in a database. An individual who is authorized to
access data can manipulate it (add, delete, update and retrieve the stored data), and with
the introduction of Big Data these manipulations take place on a large scale. Data access is
of two types: sequential access and random access. In sequential access, the data is read in
order until the required item is located, so every preceding part of the data has to be read
before the item can be operated on. In random access, the user can store and retrieve data
at any desired location, each access takes roughly constant time, and the data may be split
into several portions located at arbitrary positions.
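
The difference between the two access types can be sketched with an in-memory byte store. The following Python example is a minimal illustration under stated assumptions: records have a fixed width of 8 bytes, values fit within that width, and the helper names are hypothetical.

```python
import io

RECORD_SIZE = 8  # fixed record width in bytes (an assumption for this sketch)

def make_store(values):
    """Pack each value into a fixed-width, space-padded record."""
    data = b"".join(str(v).encode().ljust(RECORD_SIZE) for v in values)
    return io.BytesIO(data)

def sequential_find(store, target):
    """Read records in order from the start until the target is found."""
    store.seek(0)
    index = 0
    while True:
        record = store.read(RECORD_SIZE)
        if not record:
            return None  # reached the end without a match
        if record.strip() == str(target).encode():
            return index
        index += 1

def random_read(store, index):
    """Jump straight to record `index` without reading the ones before it."""
    store.seek(index * RECORD_SIZE)
    return store.read(RECORD_SIZE).strip().decode()

store = make_store([10, 20, 30, 40])
print(sequential_find(store, 30))  # scans records 0, 1 and 2 in turn
print(random_read(store, 3))       # seeks directly to the fourth record
```

Sequential search does work proportional to the position of the item, while the random read computes the offset and performs a single seek.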
The information collected from individuals must be complete, up to date and accurate. The
overall process of data governance and management becomes tricky and complicated. It is
therefore necessary to make data open and to make it available to government agencies with
metadata and APIs. This supports better decisions, improved productivity and enhanced
business intelligence. However, sharing data between companies is awkward, because each
company needs to maintain its business edge. The collected data is stored in a database for
future use. Data collected from different sources is integrated in line with privacy
requirements (George, Haas and Pentland 2014). Private record linkage has raised several
issues, ranging from handling errors to the use of cryptographic solutions.
Once data has been collected, it is shared within the organization so that it can be used to
its full potential. Some shared data, such as location data collected from mobile devices and
sent to traffic planners, may be lost in the middle of the sharing process, or wrong
information may be shared, which can cause difficulties for further traffic planning (Kitchin
2014). Data can also be accessed illegally while it is being shared. In certain cases, data
may be shared within a group of individuals with conflicting interests. To guard against data
theft, some organizations add noise to the data they share; however, noisy data may not be
usable in certain domains, which reduces the usefulness of the data.
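
The trade-off between protection and usefulness can be seen in a small sketch. The Python example below adds zero-mean Gaussian noise to each value before sharing; the choice of Gaussian noise, the `sigma` parameter and the function name are assumptions for illustration, and the sketch offers no formal privacy guarantee.

```python
import random
from statistics import mean

def add_noise(values, sigma, seed=None):
    """Return a copy of `values` with zero-mean Gaussian noise added to each item."""
    rng = random.Random(seed)  # a seed makes the sketch reproducible
    return [v + rng.gauss(0.0, sigma) for v in values]

# Hypothetical sensor readings to be shared with another party.
readings = [float(i) for i in range(1000)]
shared = add_noise(readings, sigma=5.0, seed=1)

# Individual values are perturbed, while the average stays close to the original.
print(abs(shared[0] - readings[0]))
print(abs(mean(shared) - mean(readings)))
```

Aggregate statistics survive the noise reasonably well, but any domain that needs exact individual values (for example, billing) cannot use the shared copy, which is the loss of usefulness noted above.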
A balance must be maintained between data collectors and data processors. Several
incentive issues arise in the process of Big Data sharing. When Big Data is accessed, the
secrecy requirements of every individual element in the shared data need to be traced (Dong
et al. 2015). Data shared between a sender and a receiver must be authenticated, and both
parties must be authorized to send or receive that particular data. The roles of the
individuals involved must be monitored to avoid further issues. Implementing secrecy and
access control is mandatory when adopting authentication.
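
One common way to authenticate data passed between a sender and a receiver is a keyed hash (HMAC). The sketch below uses Python's standard `hmac` and `hashlib` modules; the shared key, the message contents and the function names `sign` and `verify` are assumptions for illustration, and the report itself does not prescribe a specific mechanism.

```python
import hashlib
import hmac

def sign(key: bytes, message: bytes) -> str:
    """Compute an HMAC-SHA256 tag that the sender attaches to the message."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()

def verify(key: bytes, message: bytes, tag: str) -> bool:
    """Check the tag in constant time; False means tampering or a wrong key."""
    return hmac.compare_digest(sign(key, message), tag)

shared_key = b"example-key-known-to-both-parties"  # hypothetical shared secret
message = b"location batch 42"

tag = sign(shared_key, message)
print(verify(shared_key, message, tag))               # genuine message
print(verify(shared_key, b"location batch 43", tag))  # altered in transit
```

Only a party that holds the shared key can produce a valid tag, so a successful check shows both that the data was not altered and that it came from an authorized sender.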
Conclusion
Big data always offers great promise but also creates considerable risks. This report finds
that unfair discrimination is the most underestimated issue in big data. Various strategies
have been proposed, including practical algorithmic or computational methods and human
solutions. Transparency is the most commonly proposed solution for enhancing algorithmic
fairness. Improving transparency and resolving the black box challenge may therefore be the
best way to tackle discrimination in big data analytics. However, the study identifies
considerable barriers to the suggested strategies, such as shortcomings of legislation, human
bias, conceptual challenges and technical difficulties, all of which hamper the
implementation of fair big data practices. Because of the risk of discrimination in big data
and predictive analytics, and the shortage of empirical studies, more empirical research is
essential for assessing how discriminatory practices emerge, whether accidentally or
deliberately, in sectors such as migration, marketing and healthcare. Moreover, this study
has focused on the challenges of big data. If big data technology is implemented properly,
it could also be an effective tool for preventing those challenges.
Big Data offers great promise but also poses considerable risks. The literature review
highlights that unfair discrimination is one of the most pressing, but at the same time an
often underestimated issue in data mining. A wide range of papers proposed solutions on how
to avoid discrimination in the use of data technologies. Though most of the suggested
strategies were practical computational/algorithmic methods, numerous papers recommended
human solutions. Transparency was a commonly suggested solution to enhance algorithmic
fairness. Improving algorithmic transparency and resolving the black box issue might thus be
the best course to undertake when dealing with discriminatory issues in data analytics.
However, our study results identify a considerable number of barriers to the proposed
strategies, such as technical difficulties, conceptual challenges, human bias and
shortcomings of legislation, all of which hamper the implementation of such fair data mining
practices. Due to the risk of discrimination in data mining and predictive analytics and the
striking shortage of empirical studies on the topic that our review has brought to light, we
argue that more empirical research is needed to assess how discriminatory practices are
deliberately and accidentally emerging from their increased use in numerous sectors such as
healthcare, marketing and migration. Moreover, since most studies focused on the negative
discriminatory consequences of Big Data, more research is needed on how data mining
technologies, if properly implemented, could also be an effective tool to prevent unfair
discrimination and promote equality. As more reports from the press are emerging on the
positive use of data technologies to assist vulnerable groups, future research should focus
on the diffusion of similar beneficial applications.
As we live in the era of big data, there is a need for modern, high-performance and capable
equipment along with scalable techniques and algorithms to deal with the issues and
challenges that arise when working with large data sets. Big data analytics is one of the
reasons for the universal success of any business organization. Organizations lagging behind
in big data analytics are likely to be handicapped, as they would suffer monetary losses in
terms of their future customers and better future investments. The birth of big data revealed
the shortcomings of existing data mining technologies, which in turn raised new challenges.
In this paper, we have presented a brief overview of big data along with its key properties,
and identified some challenges of big data. A very brief introduction to, and comparison of,
the most popular big data processing frameworks, Hadoop MapReduce and Apache Spark, is
presented, which helps young researchers and data scientists to analyze big data and uncover
hidden, unknown patterns.
References
Barocas, S., Bradley, E., Honavar, V. and Provost, F., 2017. Big data, data science, and civil
rights. arXiv preprint arXiv:1706.03102.
Chen, C.P. and Zhang, C.Y., 2014. Data-intensive applications, challenges, techniques and
technologies: A survey on Big Data. Information sciences, 275, pp.314-347.
Chen, M., Mao, S. and Liu, Y., 2014. Big data: A survey. Mobile networks and
applications, 19(2), pp.171-209.
Dong, X., Li, R., He, H., Zhou, W., Xue, Z. and Wu, H., 2015. Secure sensitive data sharing
on a big data platform. Tsinghua science and technology, 20(1), pp.72-80.
Dong, X.L. and Srivastava, D., 2013, April. Big data integration. In 2013 IEEE 29th
international conference on data engineering (ICDE) (pp. 1245-1248). IEEE.
Favaretto, M., De Clercq, E. and Elger, B., 2019. Big Data and discrimination: perils,
promises and solutions. A systematic review. Journal of Big Data, 6(1).
Gamage, P., 2016. New development: Leveraging 'big data' analytics in the public sector.
Public Money & Management, 36(5), pp.385-390.
Gandomi, A. and Haider, M., 2015. Beyond the hype: Big data concepts, methods, and
analytics. International journal of information management, 35(2), pp.137-144.
George, G., Haas, M.R. and Pentland, A., 2014. Big data and management.
Hashem, I.A.T., Yaqoob, I., Anuar, N.B., Mokhtar, S., Gani, A. and Khan, S.U., 2015. The
rise of “big data” on cloud computing: Review and open research issues. Information
systems, 47, pp.98-115.
Katal, A., Wazid, M. and Goudar, R.H., 2013, August. Big data: issues, challenges, tools and
good practices. In 2013 Sixth international conference on contemporary computing (IC3) (pp.
404-409). IEEE.
Kitchin, R., 2014. The real-time city? Big data and smart urbanism. GeoJournal, 79(1),
pp.1-14.
Kshetri, N., 2014. Big data's impact on privacy, security and consumer welfare.
Telecommunications Policy, 38(11), pp.1134-1145.
Najafabadi, M.M., Villanustre, F., Khoshgoftaar, T.M., Seliya, N., Wald, R. and
Muharemagic, E., 2015. Deep learning applications and challenges in big data
analytics. Journal of Big Data, 2(1), p.1.
Varian, H.R., 2014. Big data: New tricks for econometrics. Journal of Economic
Perspectives, 28(2), pp.3-28.
Zhang, D., 2018, October. Big data security and privacy protection. In 8th International
Conference on Management and Computer Science (ICMCS 2018). Atlantis Press.