SIT719 Report: Analytics Issues at Dumnonia Corporation
VerifiedAdded on 2022/12/26
Report
AI Summary
This report analyzes the data analytics challenges faced by Dumnonia Corporation, an Australian insurance company, particularly concerning the security and privacy of its extensive customer data. The report explores the drivers for implementing k-anonymity, a method to anonymize data and protect sensitive information, and assesses the technology solutions through interviews with key stakeholders. The analysis covers the company's existing IT infrastructure, cloud systems, and the concerns of the CEO, CIO, and CSO regarding data breaches, cyber threats, and the impact of encryption on system performance. The report also discusses the implementation guide for k-anonymity, data provenance, and potential issues like data encryption and its impact on system speed. It highlights the need for robust security policies, and the importance of protecting customer data while ensuring the effective use of big data analytics.

A report Secondary and Primary Issues in Analytics - Dumnonia Corporation 1
A report Secondary and Primary Issues in Analytics - Dumnonia Corporation
Student
Course
Tutor
Institutional Affiliations
State
Date
Executive summary
Dumnonia is one of the most competitive insurance corporations in Australia and insures an
enormous population. The organization must keep information on that entire population as Big
Data hosted on its own machines. It offers medical insurance among various other types of
insurance. As a result, customer information, including identification numbers, payment details
and health information, is kept in the system. The corporation’s CEO is deeply concerned about
the security of this sensitive information and of the organization’s assets, a critical matter
because the Big Data holds medical data and other information about the organization’s
customers. Securing this data will instill confidence among the organization’s clients and
employees.
This document discusses certain aspects of data analysis together with security and privacy
matters. The organization has outdated security systems such as an IDS and a firewall. It also
uses organizational technologies, including malware protection, encryption and passwords, to
reduce security risks. The corporation is, however, unsure of the critical aspects of protecting
its big data system. It depends on SDN as the security solution for its big data. Big Data is
growing along with the use of the public cloud, while traditional security approaches remain in
use mainly in the private sector. As a large organization, Dumnonia should not depend on
traditional security solutions, because doing so would put its sensitive data at risk.
The organizational drivers for Dumnonia Corporation relating to the implementation of k-anonymity
Dumnonia is a large corporation that provides insurance services across a wide range of
domains. Its customer population is increasing. Because Dumnonia operates in two countries, it
holds a large volume of data; this big data is still expanding, and the expansion brings privacy
challenges. These challenges drive Dumnonia Corporation towards adopting the k-anonymity
approach. K-anonymity is a method that anonymizes the data fields of an organization’s big data
system so that private data cannot be pinpointed to the records of an individual.
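The idea can be illustrated with a minimal sketch. It assumes the usual definition of k-anonymity, in which every combination of quasi-identifier values must be shared by at least k records. The records, the choice of age and postcode as quasi-identifiers, and the function names are hypothetical and are not taken from Dumnonia’s systems.

```python
from collections import Counter

# Hypothetical records: (age, postcode, diagnosis). Age and postcode are the
# assumed quasi-identifiers; diagnosis is the sensitive attribute.
records = [
    (34, "3125", "asthma"),
    (36, "3127", "diabetes"),
    (38, "3121", "asthma"),
    (52, "3220", "flu"),
    (55, "3216", "diabetes"),
    (58, "3228", "flu"),
]


def generalize(record):
    """Coarsen the quasi-identifiers: 10-year age band, 2-digit postcode prefix."""
    age, postcode, diagnosis = record
    low = (age // 10) * 10
    return (f"{low}-{low + 9}", postcode[:2] + "**", diagnosis)


def is_k_anonymous(rows, k):
    """Return True if every quasi-identifier combination occurs in at least k rows."""
    groups = Counter((age, postcode) for age, postcode, _ in rows)
    return all(count >= k for count in groups.values())


generalized = [generalize(r) for r in records]
raw_ok = is_k_anonymous(records, 3)              # every raw record is unique
generalized_ok = is_k_anonymous(generalized, 3)  # two groups of three records
```

In this sketch the raw table fails the check for k = 3 because each age and postcode pair is unique, while the generalized table passes because each coarsened pair covers three customers.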
The organization’s data management system is a traditional, organization-based system: it uses
an organizational data warehouse and links to its clients through mobile application or website
access. Additionally, the organization has a policy that should suit its customers so as to keep
them happy. This cannot be achieved while customers’ privacy is at risk. The corporation also
needs to gain a competitive advantage by exploiting its data (Ye, Cheng, Yuan, Xu, Gao, and
Cheng, 2016, pp. 268-272; Wu, Zhu, Wu, and Ding, 2014, pp.97-107). For example, if it knows that
customers in a given postcode are experiencing privacy issues, the corporation should launch new
privacy preservation strategies as part of its new big data approach. With this approach, the
organization’s customers will be more satisfied, as their security concerns will be addressed.
This is another necessity that drives Dumnonia towards implementing the k-anonymity approach.
The Technology Solution Assessment
This section presents the corporation’s drivers concerning the implementation of k-anonymity.
The discussion is based on the three interviews, as shown below.
Interview 1
Dumnonia has invested considerably in its current IT system. The organization has a cloud
system, customer-facing web portals and mobile applications developed for its customers. Given
the organization’s ever-growing data, the corporation has taken a critical step towards its
goal. Cloud systems have become a popular vehicle for housing big data workloads, and many
organizations have been successful with them (Hashem, Yaqoob, Anuar, Mokhtar, Gani, and Khan,
2015, pp.98-115). Nevertheless, using big data and cloud systems is associated with various
challenges.
Cloud systems are subject to security and privacy breaches, and a privacy breach has side
effects that could be very costly for Dumnonia. As such, implementing the k-anonymity approach
will play a critical role in system security. Dumnonia also has dedicated third-party support
from its IT partners in India. The organization, however, needs to be aware that its customers’
privacy should be preserved before it publishes big data to that third party. Two privacy
objectives are achieved when big data is anonymized: preventing unique identity disclosure and
preventing sensitive attribute disclosure (Daries et al. 2014, pp.94). Therefore, coupling
k-anonymity with the cloud and big data system is the route by which the organization can
achieve privacy for its customers. Dumnonia is not yet up to date as far as
the implementation of the k-anonymity approach is concerned. An analysis was done that involved
reviewing corporations that have found it difficult to implement a big data system. Dumnonia
also shares data with other corporations for insurance purposes, which is essential for the
overall success of the project.
Given this concern, the organization should protect the data of its customers living in New
Zealand and Australia, which is again a major security breach issue. As such, the organization
needs to implement the k-anonymity approach, because it is intended to safeguard security and
privacy for the organization (Chandra, Ray, and Goswami, 2017, pp. 89-94; Hodges, and Creese,
2013, pp. 613-621). The Dumnonia CEO is also concerned about the organization’s operational
system, as the corporation will adopt the k-anonymity approach in a new system and would like to
draw data from all sectors of the organization. The CEO is worried that the corporation is still
not aware of the consequences of implementing the new approach. He speaks from the point of view
of both the initial implementation and the current system operation. Moreover, the CEO is well
aware of other corporations that have not been successful in adopting the Big Data approach.
Because Dumnonia needs to expand its business and services to many countries, the organization
needs to adopt the k-anonymity approach with full rules and regulations in order to protect the
system from fraud (Jutla, and Bodorik, 2015, pp. 1919-1928).
In k-anonymity, there is no randomization that cyber criminals could exploit to tamper with the
organization’s data. The analysis here concerns encryption, which slows down operations when a
larger volume of data is calculated and processed. Issues are compared and matched during
processing, after which the large volume of data is calculated, thus slowing down sources that
appear to be autonomous. The k-anonymity approach involves data storage,
networking and effective data collection (Menandas, and Joshi, 2014, pp.68-80). Working with
autonomous services makes it easier to deal with decentralized and distributed control systems
and to find the evolving relationships among data sets.
Interview 2
The implementation of k-anonymity is reported here with some updates to ensure that a data
breach cannot occur without authenticated authorization (Jagadish, 2016, pp.77-84; Atat, Liu,
Wu, Li, Ye, and Yang, 2018, pp.73603-73636). The aim when releasing versions of privately held
data is that the people who are the subjects of the data cannot be recognized, and this is the
key driver of the organization’s Big Data strategy. The issue of holding customers’ personal
medical information is also discussed in this interview. According to the interview, Guinevere
is always worried about security and privacy breaches of the data; she believes that
implementing the k-anonymity approach will provide full support for information security,
thereby protecting customers’ personal information and privacy. She further discussed some
algorithms related to the k-anonymity approach that can strengthen security. These include
p-sensitive k-anonymity algorithms, which she describes as a simple version or an extension of
the k-anonymity method; she is, however, not clear about the advantages and disadvantages of
the k-anonymity extensions.
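To make the trade-off more concrete, the following is a small sketch of a p-sensitive k-anonymity check. It assumes the common definition of the extension: a table is p-sensitive k-anonymous if it is k-anonymous and every quasi-identifier group also contains at least p distinct sensitive values. The sample rows and the function name are illustrative assumptions only.

```python
from collections import defaultdict


def is_p_sensitive_k_anonymous(rows, k, p):
    """Check k-anonymity plus at least p distinct sensitive values per group.

    Each row is a tuple whose last element is the sensitive attribute and whose
    remaining elements are the (already generalized) quasi-identifiers.
    """
    groups = defaultdict(list)
    for *quasi, sensitive in rows:
        groups[tuple(quasi)].append(sensitive)
    return all(
        len(values) >= k and len(set(values)) >= p
        for values in groups.values()
    )


# Hypothetical generalized table: (age band, postcode prefix, diagnosis).
rows = [
    ("30-39", "31**", "asthma"),
    ("30-39", "31**", "diabetes"),
    ("30-39", "31**", "asthma"),
    ("50-59", "32**", "flu"),
    ("50-59", "32**", "flu"),
    ("50-59", "32**", "flu"),
]

plain_k = is_p_sensitive_k_anonymous(rows, k=3, p=1)   # ordinary 3-anonymity
with_p = is_p_sensitive_k_anonymous(rows, k=3, p=2)    # stricter extension
```

The table satisfies plain 3-anonymity, but it fails the check with p = 2 because everyone in the second group has the same diagnosis, so knowing that a customer belongs to that group already reveals the sensitive value. The advantage of the extension is that it catches this case; the disadvantage is that meeting it usually requires more generalization and therefore less detailed data.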
Randomization can be set up through an effective understanding of how the sensitive data is
loaded, and it can be implemented easily by use of other records. The security standard here
depends on multi-party computation, which mainly involves computing functions over sets of
inputs through distributed techniques (Ramani, 2019, pp. 2014-2038; Naderi, and Alizadeh, 2018,
pp.775-784). Inference from the data is the major issue, and it is therefore essential to use
privacy-preserving data mining. This
allows calculations based on aggregated statistics over the data sets without interfering with
people’s privacy.
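If the passage is read as referring to secure multi-party computation, one simple illustration is an additive secret-sharing sum, in which several parties learn a combined total without revealing their individual figures. This is a sketch under that assumption, not a description of any system at Dumnonia; the party values, the modulus and the function names are hypothetical.

```python
import secrets

MODULUS = 2**61 - 1  # all shares are values modulo this large number


def split_into_shares(value, parties):
    """Split value into `parties` random shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def secure_sum(private_values):
    """Compute the total of the private values using only exchanged shares."""
    parties = len(private_values)
    # Row i holds the shares that party i hands out, one per party.
    distributed = [split_into_shares(v, parties) for v in private_values]
    # Each party adds up the shares it received and publishes only that subtotal.
    subtotals = [sum(column) % MODULUS for column in zip(*distributed)]
    return sum(subtotals) % MODULUS


claim_totals = [1200, 450, 3100]  # hypothetical per-party figures
total = secure_sum(claim_totals)
```

Each individual share looks random, so no single party sees another party’s figure, yet the published subtotals add up to the true aggregate.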
Interview 3
In this interview, the supporting documents and the policies related to the k-anonymity privacy
approach are discussed. As the Dumnonia CSO, Constantine’s primary focus is to eliminate data
breaches and instill robust security policies, because Constantine was involved in the
corporation’s initial big data system. As a result, handling and transferring big data from one
place to another is the major concern for providing data security (Kenekar, and Dani, 2017, pp.
167-190). On this basis, the main purpose of this part is to recommend some additional features
for the k-anonymity approach so that it can be enhanced further and offer full authentication
for the security issues.
This discussion is mainly based on various measures for effective modeling of privacy and
quality metrics. The p-sensitive k-anonymity extension enables understanding and targeting of
the data set attributes. Its main focus is preventing re-identification of users and handling
potential privacy breaches. The procedures are directed at encryption in Dumnonia’s Big Data
system to enhance security and privacy within the organization’s information system (Munir,
Al-Mutairi, and Mohammed, eds., 2015, pp.32; Fu, Wang, Qi, Liao, and Li, 2018, pp.569-585).
However, the CSO’s main worry is that the corporation’s operations will be slowed down when any
encryption is implemented within the Big Data system. For example, computation and processing
of massive data would slow down the organization’s system because the data would need to be
encrypted and decrypted constantly. The encryption process should take place before data enters
and when it leaves the big data system.
Moreover, there have been issues regarding k-anonymity. One is that implementing k-anonymity
may reduce the need to encrypt data, because much of the data will be anonymous (Damiani et al.
2018, pp. 94). This is a point that Dumnonia’s senior management team has not considered while
implementing the k-anonymity approach.
Another issue raised in this section involves data provenance. Data provenance is a record of
the history of the data residing in a machine (Litoiu, and Shtern, Bitnobi Inc, 2018,
pp.46-65). It provides metadata; therefore, if it is not given the care it deserves,
unauthorized changes to the metadata may lead to wrong and unusable data sets, making it
difficult to find the required information. Additionally, untraceable sources may be a big
obstacle to tracing the roots of fake data generation and security breaches.
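One way to make unauthorized metadata changes detectable is to chain the provenance entries together with cryptographic hashes. The sketch below uses Python’s standard hashlib and json modules; the entry fields and function names are hypothetical, and this is only one possible approach rather than a method the report’s sources prescribe.

```python
import hashlib
import json


def entry_hash(entry, previous_hash):
    """Hash one provenance entry together with the hash of the entry before it."""
    payload = json.dumps(entry, sort_keys=True) + previous_hash
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(entries):
    """Return the entries as a list of links, each carrying its chained hash."""
    chain, previous = [], ""
    for entry in entries:
        previous = entry_hash(entry, previous)
        chain.append({"entry": entry, "hash": previous})
    return chain


def verify_chain(chain):
    """Recompute every hash; any edited entry breaks the chain from that point."""
    previous = ""
    for link in chain:
        if entry_hash(link["entry"], previous) != link["hash"]:
            return False
        previous = link["hash"]
    return True


log = build_chain([
    {"dataset": "claims", "action": "ingest", "source": "web portal"},
    {"dataset": "claims", "action": "anonymize", "source": "batch job"},
])
intact = verify_chain(log)

log[0]["entry"]["source"] = "unknown"   # simulate an unauthorized metadata edit
tampered_detected = not verify_chain(log)
```

Because each hash depends on all earlier entries, an edit to any metadata record is detected when the chain is verified, which helps with tracing the source of a change.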
The implementation guide for k-anonymity
This section presents the implementation guide for k-anonymity. The guide is presented with
reference to the technology solution assessment of Dumnonia Corporation discussed in the
previous sections, starting with interview 1, then interview 2 and lastly interview 3.
Interview 1
In order to prevent security attacks that might lead to a data breach, the k-anonymity
implementation must be kept up to date, and the microdata has to be processed with a modified
microdata k-anonymization method. As the volume of data increases, finding an effective
technique for anonymizing the data becomes challenging. However, this section will propose a
better algorithm after a series of trials and systematic comparisons, as was
discussed in the preceding sections of the document. The proposal will also consider the
algorithm’s efficiency and effectiveness.
The literature has helped researchers to find the relationship that exists between the value of
k, the choice of quasi-identifiers, the degree of anonymization and the execution time, where k
is treated as an arbitrary value since it has been taken as p or something else in the previous
section. Similarly, some anonymization algorithms have to be employed (Jordan, 2017, pp. 01).
However, the worry is the operational aspect of the system with data drawn from across the
entire corporation. Adoption of the Big Data system promises the ability to share data sets
with government entities and other corporations. There is concern among the organization’s
staff regarding how k-anonymity will help in dealing with security and privacy to ensure data
preservation and to work on the operational costing strategies. It is essential to understand
what is meant by holding the organization’s data sets and allowing them to be shared easily.
K-anonymity is the reference approach in this scenario, where the MapReduce method could be
used to construct the anonymized data, thus handling scenarios involving non-published data
(Bilfaqih, and Khatoon, 2016, pp.09). This algorithm should be coupled with an operation, still
to be proposed and worked on, that addresses the scalability issues.
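As a rough illustration of how a MapReduce-style job could support k-anonymity at scale, the sketch below counts records per quasi-identifier group across data partitions and reports the groups that fall below k. It is written in plain Python rather than against a real MapReduce framework, and the partitions, field layout and function names are assumptions made for the example.

```python
from collections import defaultdict
from itertools import chain


def map_phase(partition):
    """Emit (quasi-identifier key, 1) for each record in one data partition."""
    return [((age_band, region), 1) for age_band, region, _ in partition]


def reduce_phase(pairs):
    """Sum the counts per key, as a reducer would after the shuffle step."""
    totals = defaultdict(int)
    for key, count in pairs:
        totals[key] += count
    return dict(totals)


def groups_below_k(partitions, k):
    """Return the quasi-identifier groups that have fewer than k records."""
    pairs = chain.from_iterable(map_phase(p) for p in partitions)
    counts = reduce_phase(pairs)
    return sorted(key for key, count in counts.items() if count < k)


# Hypothetical partitions, e.g. one per storage node.
partitions = [
    [("30-39", "31**", "asthma"), ("30-39", "31**", "flu")],
    [("30-39", "31**", "diabetes"), ("50-59", "32**", "flu")],
]
violations = groups_below_k(partitions, 3)
```

Because the map step works on each partition independently, the counting can be spread over many machines; the groups it reports would then need further generalization or suppression before the data is shared.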
Interview 2
Because a data breach would create a very bad impression of Dumnonia Corporation, one of the
organization’s priorities should be protecting the corporation’s information system through Big
Data security. Moreover, the organization still uses traditional encryption technology,
including the use of passwords, for transferring files from one place to another. Data
processing speed will decrease when encryption is applied to protect sensitive data (Yang,
Wang, Ren, and Yu, 2017, pp. 243-263). It is
discussed in the preceding sections of the document. This will be done along with its efficiency
and effectiveness.
Literature has help researchers to find out the relationship that exist between the k values,
the choosing of the a quasi-identifier, the degree of anonymization as well as the focus on the
time of execution where k is considered to be a random value since it has been taken as p or
something else in the previous section. Similarly, some algorithms for anonymization has to be
employed (Jordan, 2017, pp. 01). However, the worry is the system’s operation aspect with the
system data across the entire corporate. Adoption of the Big Data system promises the ability to
share data sets with government entities and other corporates. There is concern among the
organization staff regarding how the k-anonymity will help in dealing with security and privacy
to ensure data preservation and work on the operational costing strategies. It is essential to
understand what is meant by holding and allowing the easy sharing of the organization’s data
sets. The k-anonymity is the referenced in this scenario, where the MapReduce method could in
proper working with construction thus handling the scenarios involving the non-published data
(Bilfaqih, and Khatoon, 2016, pp.09). This algorithm should be coupled with an operation that
need to get proposed and worked on in order to fix up the issues regarding scalability.
Interview 2
A data breach would seriously damage the reputation of Dumnonia Corporation, so one of the
organization’s priorities should be protecting the corporate information system through
Big Data security. Moreover, the organization still uses traditional encryption technology,
including passwords for transferring files from one place to another. Data processing speed
will be affected when encryption is done to protect sensitive data (Yang, Wang, Ren, and Yu,
2017, pp. 243-263). It is
necessary to enact management policies for access to cryptographic material, under which the
security of static information is managed using specific types of calculations.
System security also involves handling user-generated social network content and Big Data
applications such as the web-based big data system (Antonatos, Braghin, Holohan, Gkoufas, and
Mac Aonghusa, 2018, pp. 1531-1542). However, the organization’s concern is that even when the
target data is anonymized, its attributes may be stored in cookies in users’ web browsers, a
mechanism that browsers adopt automatically. These cookies could later be used to identify the
organization’s users, which may lead to a privacy breach. Guinevere also expressed her concerns
about the areas that may be subject to data breaches across various platforms. She
acknowledged, however, that she was not fully aware of the problem or its actual cause.
Interview 3
The UDP-based data transfer protocol is an efficient protocol for transferring massive data
sets in a Big Data system over a high-speed WAN (Burmeister, Lang, Bayrle, Catalkaya, Stelzer,
and Schiebel, 2016, pp. 162-176). Nevertheless, the approach brings some implementation
problems when it is used with big data systems or the cloud and when encryption takes place.
Dumnonia Corporation should focus on fabrication in the reports generated from corporate data
that may have passed through encryption and decryption.
Additionally, it is essential to focus on unauthorized changes to metadata, since
untraceable data sources make it harder to identify the causes of security incidents and to
determine which cases are related to fraud. The process of encryption
should take place before data enters or leaves the big data system. As discussed in the
previous sections of this document, adopting k-anonymity can also reduce the need to encrypt
data, because much of the data will be anonymous.
Comparison of the publicly available implementations
The algorithms proposed for this project are the Mondrian and Datafly algorithms.
Mondrian algorithm: this is a multidimensional algorithm that partitions the domain space
into regions, each containing at least k records.
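The partitioning idea can be sketched as follows. This is a simplified illustration and not
the ARX or UTD implementation: it assumes a non-empty list of rows with numeric
quasi-identifiers, cuts at the median, and ranks attributes by their raw value range rather
than a normalized one. The names mondrian and generalize and the sample rows are hypothetical.

```python
def mondrian(rows, qids, k):
    """Split rows (dicts with numeric quasi-identifiers) into groups of at least k rows."""
    if k < 1:
        raise ValueError("k must be at least 1")

    def spread(q):
        values = [r[q] for r in rows]
        return max(values) - min(values)

    # Try attributes from the widest to the narrowest value range.
    for q in sorted(qids, key=spread, reverse=True):
        ordered = sorted(rows, key=lambda r: r[q])
        median = ordered[len(ordered) // 2][q]
        left = [r for r in ordered if r[q] < median]
        right = [r for r in ordered if r[q] >= median]
        if len(left) >= k and len(right) >= k:
            return mondrian(left, qids, k) + mondrian(right, qids, k)
    return [rows]  # no allowable cut, so this region becomes one group


def generalize(group, qids):
    """Replace each quasi-identifier value by the (min, max) range of its group."""
    ranges = {q: (min(r[q] for r in group), max(r[q] for r in group)) for q in qids}
    return [{**r, **ranges} for r in group]


# Hypothetical sample rows (illustration only).
rows = [
    {"age": 25, "zip": 3000}, {"age": 27, "zip": 3001}, {"age": 31, "zip": 3050},
    {"age": 35, "zip": 3052}, {"age": 52, "zip": 3100}, {"age": 58, "zip": 3101},
]
groups = mondrian(rows, ["age", "zip"], k=2)
for group in groups:
    print(generalize(group, ["age", "zip"]))
```

Each returned group corresponds to one region of the domain space, and every region holds at
least k records because a cut is only accepted when both halves do.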
Datafly algorithm: this is an algorithm that provides anonymity in medical data. It
generalizes the attribute with the most distinct values until k-anonymity is fulfilled.
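The Datafly heuristic described above can be sketched in a few lines. This is a simplified
illustration under stated assumptions: the generalization hierarchies are hypothetical
functions supplied by the caller, and rows are suppressed only once every hierarchy is
exhausted, whereas published descriptions of Datafly stop generalizing earlier and suppress
the few remaining outliers.

```python
from collections import Counter


def datafly(rows, hierarchies, k):
    """Generalize the QID with the most distinct values until every group has at least k rows.

    hierarchies maps each quasi-identifier to a list of functions; function i lifts a
    value from generalization level i to level i + 1.
    """
    qids = list(hierarchies)
    level = {q: 0 for q in qids}
    rows = [dict(r) for r in rows]  # work on copies so the input is untouched

    def key(r):
        return tuple(r[q] for q in qids)

    while True:
        counts = Counter(key(r) for r in rows)
        if all(c >= k for c in counts.values()):
            return rows
        candidates = [q for q in qids if level[q] < len(hierarchies[q])]
        if not candidates:
            # Hierarchies exhausted: suppress the rows that still violate k.
            return [r for r in rows if counts[key(r)] >= k]
        target = max(candidates, key=lambda q: len({r[q] for r in rows}))
        lift = hierarchies[target][level[target]]
        for r in rows:
            r[target] = lift(r[target])
        level[target] += 1


def decade(age):
    low = age // 10 * 10
    return f"{low}-{low + 9}"


# Hypothetical hierarchies and records (illustration only).
hierarchies = {
    "age": [decade, lambda _: "*"],
    "zip": [lambda z: z[:2] + "**", lambda _: "*"],
}
patients = [
    {"age": 34, "zip": "3021", "condition": "flu"},
    {"age": 36, "zip": "3025", "condition": "asthma"},
    {"age": 47, "zip": "3121", "condition": "flu"},
    {"age": 49, "zip": "3127", "condition": "diabetes"},
]
for row in datafly(patients, hierarchies, k=2):
    print(row)
```

Because a whole attribute is generalized at each step, every record is coarsened together,
which is one reason full-domain approaches of this kind can lose more detail than
multidimensional partitioning.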
This section presents a comparison of the publicly available implementations of the
k-anonymity algorithms mentioned above. The publicly available implementations we use are from
the ARX Data Anonymization Tool and the UTD Anonymization Toolbox respectively. This was done
in order to find out the factors that affect the performance of the algorithms, so as to
provide a guide for selecting the most appropriate algorithm for Dumnonia Corporation. The
Adult data sets were accessed from the ARX Data Anonymization Tool. The NCP (Normalized
Certainty Penalty) percentage was produced for each data set for k=10 and analyzed. According
to the experiment, Mondrian shows lower sensitivity than Datafly. It is also seen that the
algorithms performed much better on the Adult data set with regard to efficiency.
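As a rough illustration of how an NCP percentage can be computed, the sketch below uses one
common formulation for numeric attributes: each group is penalized by the width of its value
range relative to the full range of the attribute, weighted by group size and averaged over
all records and quasi-identifiers. The function name and the sample groups are assumptions;
the tools used in the experiment may compute the metric differently, for example for
categorical attributes.

```python
def ncp_percentage(groups, qids):
    """Information loss (0 to 100) for equivalence classes holding original numeric values."""
    all_rows = [r for g in groups for r in g]
    domain = {q: max(r[q] for r in all_rows) - min(r[q] for r in all_rows) for q in qids}
    penalty = 0.0
    for g in groups:
        for q in qids:
            if domain[q] > 0:
                width = max(r[q] for r in g) - min(r[q] for r in g)
                penalty += len(g) * width / domain[q]
    return 100.0 * penalty / (len(qids) * len(all_rows))


# Hypothetical equivalence classes (illustration only).
groups = [
    [{"age": 25}, {"age": 35}],
    [{"age": 45}, {"age": 65}],
]
print(ncp_percentage(groups, ["age"]))  # 37.5
```

A value of 0 means no generalization was applied, while 100 means every record was
generalized to the full range of every quasi-identifier.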
In the UT Dallas anonymization tool, Adult and INFORMS were used as data sets. The
following table presents the configuration that was used in the experiment.
Experiment   Parameter                                      The size of datasets
|QIDs|       |QIDs| = 2; k-value = 3                        Adult: 60,475; INFORMS: 60,000
k-value      |QIDs| = 6; k-value = [10, 30, 60, 120, 240]   Adult: 60,475; INFORMS: 60,000
Size         |QIDs| = 6; k-value                            Adult: 10k, 20k, 40k, 50k, 100k, 200k
Table 1: The dataset configuration used in the UT Dallas anonymization tool
The following shows the meaning of the varied parameters used:
a. k-value: this parameter shows the level of privacy that an anonymization algorithm must
satisfy.
b. |QIDs|: this shows the number of attributes contained in the QID (quasi-identifier)
set.
Executing all algorithms within one framework enables a fair performance comparison. In
the UTD Anonymization Toolbox, the intermediate anonymized data sets were hosted in the system
database when this implementation was carried out. The application was run with all
attributes selected. Concerning Mondrian and Datafly, Mondrian performed better than Datafly;
this is attributed to the fact that Mondrian produces the maximum number of equivalence
classes (EQ). Therefore, with regard to group-size-based metrics, it can be deduced that
Mondrian performs much better than Datafly.
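The group-size-based metrics referred to above can be sketched as follows. The metric names
follow common usage in the k-anonymity literature (number of equivalence classes, normalized
average class size, discernibility), and the two lists of class sizes are made-up inputs for
illustration, not results from the experiment.

```python
def group_size_metrics(group_sizes, k):
    """Summarize a partition given the size of each equivalence class."""
    n = sum(group_sizes)
    count = len(group_sizes)
    return {
        "equivalence_classes": count,
        "average_size": n / count,
        # Normalized average class size: 1.0 means every class has exactly k records.
        "c_avg": n / count / k,
        # Discernibility: each record is charged the size of its class.
        "discernibility": sum(s * s for s in group_sizes),
    }


# Hypothetical partitions of the same 10 records with k = 2 (illustration only).
many_small_classes = [2, 2, 2, 4]
one_large_class = [10]
print(group_size_metrics(many_small_classes, k=2))
print(group_size_metrics(one_large_class, k=2))
```

More equivalence classes for the same number of records means smaller classes, so both the
normalized average size and the discernibility score drop, which is why producing many
classes is read as better performance on these metrics.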
Based on the series of experiments conducted using the two publicly available
implementations, the scenarios where each algorithm did well and poorly were identified
according to the metrics of interest. It can be noted from the results that no single
algorithm performs best in every case; rather, the performance of the algorithms is affected
by various factors, including the characteristics of the data sets and the desired privacy
requirements.
Reference list
Antonatos, S., Braghin, S., Holohan, N., Gkoufas, Y. and Mac Aonghusa, P., 2018, April.
PRIMA: An End-to-End Framework for Privacy at Scale. In 2018 IEEE 34th International
Conference on Data Engineering (ICDE) (pp. 1531-1542). IEEE.
Atat, R., Liu, L., Wu, J., Li, G., Ye, C. and Yang, Y., 2018. Big data meet cyber-physical
systems: A panoramic survey. IEEE Access, 6, pp.73603-73636.
Bilfaqih, S.M. and Khatoon, S., 2016. Data Mining Model for Big Data Analysis, pp.09.
Burmeister, S., Lang, J., Bayrle, N., Catalkaya, M., Stelzer, B. and Schiebel, E., 2016. Big Data
im Kontext von Industrie 4.0. Eine Technologievorausschau anhand IT-gestützter
bibliometrischer Analyse und Szenariotechnik. Institute of Technology and Process
Management. Ulm: Universität Ulm, pp. 162-176.
Chandra, S., Ray, S. and Goswami, R.T., 2017, January. Big data security in healthcare: survey
on frameworks and algorithms. In 2017 IEEE 7th International Advance Computing Conference
(IACC) (pp. 89-94). IEEE.
Damiani, E., Rana, S., Lu, S., Accorsi, R., Ardagna, C., Arpinar, B., Bellandi, V., Bhowmik, R.,
Braghin, C., Choi, B. and Cimato, S., IJBD. (2018). Editorial Board, pp. 94.
Daries, J.P., Reich, J., Waldo, J., Young, E.M., Whittinghill, J., Ho, A.D., Seaton, D.T. and
Chuang, I., 2014. Privacy, anonymity, and big data in the social sciences.
Reference list
Antonatos, S., Braghin, S., Holohan, N., Gkoufas, Y. and Mac Aonghusa, P., 2018, April.
PRIMA: An End-to-End Framework for Privacy at Scale. In 2018 IEEE 34th International
Conference on Data Engineering (ICDE) (pp. 1531-1542). IEEE.
Atat, R., Liu, L., Wu, J., Li, G., Ye, C. and Yang, Y., 2018. Big data meet cyber-physical
systems: A panoramic survey. IEEE Access, 6, pp.73603-73636.
Bilfaqih, S.M. and Khatoon, S., 2016. Data Mining Model for Big Data Analysis, pp.09.
Burmeister, S., Lang, J., Bayrle, N., Catalkaya, M., Stelzer, B. and Schiebel, E., 2016. Big Data
im Kontext von Industrie 4.0. Eine Technologievorausschau anhand IT
‐gestützter
bibliometrischer Analyse und Szenariotechnik. Institute of Technology and Process
Management. Ulm: Universität Ulm, pp. 162-176.
Chandra, S., Ray, S. and Goswami, R.T., 2017, January. Big data security in healthcare: survey
on frameworks and algorithms. In 2017 IEEE 7th International Advance Computing Conference
(IACC) (pp. 89-94). IEEE.
Damiani, E., Rana, S., Lu, S., Accorsi, R., Ardagna, C., Arpinar, B., Bellandi, V., Bhowmik, R.,
Braghin, C., Choi, B. and Cimato, S., IJBD. (2018). Editorial Board, pp. 94.
Daries, J.P., Reich, J., Waldo, J., Young, E.M., Whittinghill, J., Ho, A.D., Seaton, D.T. and
Chuang, I., 2014. Privacy, anonymity, and big data in the social sciences.

A report Secondary and Primary Issues in Analytics - Dumnonia Corporation 15
Fu, X., Wang, J., Qi, Q., Liao, J. and Li, T., 2018. Incentive Mechanisms for Resource Scaling-
out Game of Stream Big Data Analytics. Journal of Grid Computing, 16(4), pp.569-585.
Hashem, I.A.T., Yaqoob, I., Anuar, N.B., Mokhtar, S., Gani, A. and Khan, S.U., 2015. The rise
of “big data” on cloud computing: Review and open research issues. Information systems, 47,
pp.98-115.
Hodges, D. and Creese, S., 2013, October. Breaking the arc: risk control for big data. In 2013
IEEE International Conference on Big Data (pp. 613-621). IEEE.
Jagadish, H.V., 2016. The Values Challenge for Big Data. Bulletin of the IEEE Computer
Society Technical Committee on Data Engineering, pp.77-84.
Jordan, C., 2017. Big data analytics: balancing individuals’ privacy rights and business
interests (Doctoral dissertation, Canterbury Christ Church University), pp. 01.
Jutla, D.N. and Bodorik, P., 2015, October. PAUSE: A privacy architecture for heterogeneous
big data environments. In 2015 IEEE International Conference on Big Data (Big Data) (pp.
1919-1928). IEEE.
Kenekar, T.V. and Dani, A.R., 2017. Privacy Preserving Data Mining on Unstructured Data.
In Privacy and Security Policies in Big Data (pp. 167-190). IGI Global.
Litoiu, M. and Shtern, M., Bitnobi Inc, 2018. Systems and methods of controlled sharing of big
data. U.S. Patent Application 15/525,636, pp.46-65.
Fu, X., Wang, J., Qi, Q., Liao, J. and Li, T., 2018. Incentive Mechanisms for Resource Scaling-
out Game of Stream Big Data Analytics. Journal of Grid Computing, 16(4), pp.569-585.
Hashem, I.A.T., Yaqoob, I., Anuar, N.B., Mokhtar, S., Gani, A. and Khan, S.U., 2015. The rise
of “big data” on cloud computing: Review and open research issues. Information systems, 47,
pp.98-115.
Hodges, D. and Creese, S., 2013, October. Breaking the arc: risk control for big data. In 2013
IEEE International Conference on Big Data (pp. 613-621). IEEE.
Jagadish, H.V., 2016. The Values Challenge for Big Data. Bulletin of the IEEE Computer
Society Technical Committee on Data Engineering, pp.77-84.
Jordan, C., 2017. Big data analytics: balancing individuals’ privacy rights and business
interests (Doctoral dissertation, Canterbury Christ Church University), pp. 01.
Jutla, D.N. and Bodorik, P., 2015, October. PAUSE: A privacy architecture for heterogeneous
big data environments. In 2015 IEEE International Conference on Big Data (Big Data) (pp.
1919-1928). IEEE.
Kenekar, T.V. and Dani, A.R., 2017. Privacy Preserving Data Mining on Unstructured Data.
In Privacy and Security Policies in Big Data (pp. 167-190). IGI Global.
Litoiu, M. and Shtern, M., Bitnobi Inc, 2018. Systems and methods of controlled sharing of big
data. U.S. Patent Application 15/525,636, pp.46-65.

A report Secondary and Primary Issues in Analytics - Dumnonia Corporation 16
Menandas, J.J. and Joshi, J.J., 2014. Data mining with parallel processing technique for
complexity reduction and characterization of big data. Glob J Adv Res, 1, pp.68-80.
Munir, K., Al-Mutairi, M.S. and Mohammed, L.A. eds., 2015. Handbook of Research on
Security Considerations in Cloud Computing. Information Science Reference, pp.32.
Naderi, N. and Alizadeh, H., 2018. Privacy and Security of Big Data in THE
Cloud. International Journal of Information, Security and Systems Management, 7(1), pp.775-
784.
Ramani, K., 2019. Impact of Big Data on Security: Big Data Security Issues and Defense
Schemes. In Cloud Security: Concepts, Methodologies, Tools, and Applications (pp. 2014-2038).
IGI Global.
UTD Anonymization ToolBox. Accessed on 9th May 2019 from:
<http://cs.utdallas.edu/dspl/cgi-bin/toolbox/index.php>
Wu, X., Zhu, X., Wu, G.Q. and Ding, W., 2014. Data mining with big data. IEEE transactions
on knowledge and data engineering, 26(1), pp.97-107.
Yang, X., Wang, T., Ren, X. and Yu, W., 2017. Survey on improving data utility in differentially
private sequential data publishing. IEEE Transactions on Big Data, pp. 243-263.
Ye, H., Cheng, X., Yuan, M., Xu, L., Gao, J. and Cheng, C., 2016, September. A survey of
security and privacy in big data. In 2016 16th international symposium on communications and
information technologies (iscit) (pp. 268-272). IEEE.
Menandas, J.J. and Joshi, J.J., 2014. Data mining with parallel processing technique for
complexity reduction and characterization of big data. Glob J Adv Res, 1, pp.68-80.
Munir, K., Al-Mutairi, M.S. and Mohammed, L.A. eds., 2015. Handbook of Research on
Security Considerations in Cloud Computing. Information Science Reference, pp.32.
Naderi, N. and Alizadeh, H., 2018. Privacy and Security of Big Data in THE
Cloud. International Journal of Information, Security and Systems Management, 7(1), pp.775-
784.
Ramani, K., 2019. Impact of Big Data on Security: Big Data Security Issues and Defense
Schemes. In Cloud Security: Concepts, Methodologies, Tools, and Applications (pp. 2014-2038).
IGI Global.
UTD Anonymization ToolBox. Accessed on 9th May 2019 from: < http://cs.utdallas.edu/dspl/cgi-
bin/toolbox/index.php >
Wu, X., Zhu, X., Wu, G.Q. and Ding, W., 2014. Data mining with big data. IEEE transactions
on knowledge and data engineering, 26(1), pp.97-107.
Yang, X., Wang, T., Ren, X. and Yu, W., 2017. Survey on improving data utility in differentially
private sequential data publishing. IEEE Transactions on Big Data, pp. 243-263.
Ye, H., Cheng, X., Yuan, M., Xu, L., Gao, J. and Cheng, C., 2016, September. A survey of
security and privacy in big data. In 2016 16th international symposium on communications and
information technologies (iscit) (pp. 268-272). IEEE.
1 out of 16
Related Documents

Your All-in-One AI-Powered Toolkit for Academic Success.
+13062052269
info@desklib.com
Available 24*7 on WhatsApp / Email
Unlock your academic potential
© 2024 | Zucol Services PVT LTD | All rights reserved.