MBA 610 - Big Data Analysis: Applications, Challenges, and Web Mining
Added on 2023/06/11
Essay
AI Summary
This essay provides a comprehensive overview of big data analysis, highlighting its applications in business, associated limitations, and the role of web mining. It discusses the significance of big data in contemporary business processes, emphasizing the importance of considering the 4Vs (volume, velocity, veracity, and variety). The essay also examines the challenges and threats related to deploying big data in cloud environments, along with the benefits of analyzing big data for cost savings, time reduction, and new product development. Furthermore, it delves into the Hadoop and Rapid Miner platforms, in-memory computing, and the use of predictive and prescriptive modeling. The essay concludes by addressing web mining techniques and their role in extracting valuable information from the web.

Part 1
Big data refers to data sets whose volume, velocity, veracity and variety (the 4Vs) make
storage, management and analytics difficult for traditional data repositories. It is presently
a significant emerging technology. It is essential to consider big data and analytics together:
big data describes the recent explosion of varied data from varied sources, while analytics is
the exploration of that data to derive interesting and relevant trends and patterns that can be
used to inform decisions, improve processes and even initiate new business models (El-Seoud,
El-Sofany, Abdelfattah, & Mohamed, 2017). Various challenges and threats must be considered
when deploying big data in a cloud environment. Security vulnerabilities arise from the
integration of big data with the cloud environment, which results in platform heterogeneity
(Manyika, et al., 2011).
Online services and mobile apps gain various benefits from analyzing the data they collect. A
company that gathers data from any source and analyzes it can achieve cost savings, time
reductions, new product development, a better understanding of market conditions, and control
over its online reputation. The accuracy of big data analysis can be improved through the use
of advanced technology. A company can use big data platforms such as Hadoop and Apache Spark to
analyze very large data sets (McKinsey & Company, 2018). Innovative big data technology can
also generate more sophisticated models that better represent the collected data. Some
companies might prefer to work with big data providers, selecting from the available choices
the one that suits their needs and generates accurate results (Delgado, 2015).

Although big data analytics is a remarkable tool that can assist in business decisions, it has
certain limitations as well. Data analysts use big data to tease out correlations, that is,
cases where one variable is associated with another; however, not all of these correlations are
meaningful. Big data can be used to find correlations and insights using a limitless range of
queries and questions, and it is up to the user to judge the significance of the questions.
Getting the right answer to the wrong question can be costly for the user, the clients and the
business. In addition, big data analytics is susceptible to data breaches, and it can be
difficult to transfer data consistently to specialists for repeat analysis (Ciklum, 2017).
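The point that not every correlation is meaningful can be illustrated with a small sketch. It is not taken from the cited sources: it assumes purely random, unrelated variables and uses hypothetical function names (`pearson`, `strongest_chance_correlation`) to show that screening many candidate variables against one target tends to turn up a fairly strong correlation by chance alone.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def strongest_chance_correlation(n_variables=200, n_samples=20, seed=0):
    """Correlate one random target with many unrelated random variables.

    Every series is independent noise (an assumption of this sketch), so any
    correlation found here is spurious by construction.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_samples)]
    best = 0.0
    for _ in range(n_variables):
        candidate = [rng.gauss(0, 1) for _ in range(n_samples)]
        r = pearson(target, candidate)
        if abs(r) > abs(best):
            best = r  # keep the strongest correlation seen so far
    return best
```

With few samples and many candidate variables, the strongest correlation returned is usually far from zero even though no real relationship exists, which is why the analyst still has to judge whether a correlation answers a sensible question.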
Nevertheless, it is beneficial for organizations to analyze big data, as it reveals information
about customer behavior that helps them conduct business effectively. It assists in developing
new business applications and ensuring customer satisfaction. For this reason, before deciding
to work with big data, a company should address various people, organization and technology
issues. Well-planned analytical processes and people with talent and skills are required to
manage the technologies and carry out an effective big data analytics initiative. Organizations
and people should decide what they want from big data before working with it, and should
determine the security access for big data. The general way to quantify return on investment
from the system is to measure the speed of transactions and then infer its significance from
the standpoint of captured revenues (Shacklett, 2015). Thus, the right steps must be taken to
optimize its value for the benefit of the organization.

Part 2
Hadoop is the de facto platform for big data and plays a significant part in big data
analytics. The Rapid Miner platform is a strong solution for dealing with unstructured data
such as text files, web traffic logs and images. The analytical engines in Rapid Miner use
in-memory data storage that is highly optimized for the data access typically performed in
analytical processes. In-memory analytics is generally the quickest way to develop analytical
models. The size of the data set is limited by the available memory or hardware; with more
memory, larger data sets can be analyzed. Hadoop offers a distributed storage engine along with
the possibility of using the Hadoop cluster as a distributed analytical engine for big data
analytics. It is not suited to quick, interactive analysis, and runtime depends on the capacity
of the Hadoop cluster; however, it has practically unlimited scalability. Whereas memory was
traditionally a scarce resource in databases, in-memory computing uses memory first and avoids
touching the disk. In-memory computing is advancing in speed along with the evolution of memory
technology, which is expected to make persistent memory possible (Anadiotis, 2017).
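The distributed style of analysis attributed to Hadoop above can be sketched as a map, shuffle and reduce pipeline. The example below is a single-process simulation in plain Python, not the Hadoop API; the function names and the idea of passing partitions as nested lists are assumptions made for illustration only.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(partitions):
    """Count words across partitions.

    Each inner list stands in for a block of data stored on a separate node;
    here the partitions are simply processed one after another.
    """
    mapped = []
    for partition in partitions:
        for line in partition:
            mapped.extend(map_phase(line))
    return dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())
```

Because each partition is mapped independently, the same logic can in principle be spread over more machines as data grows, whereas an in-memory engine would load all lines into one process and is bounded by that machine's memory.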
Sophisticated predictive and prescriptive modeling can help organizations anticipate business
opportunities and make decisions that influence profits in areas such as targeting marketing
campaigns and avoiding equipment failures. In predictive analytics, historical data sets are
mined for patterns that indicate future situations and behaviors, while prescriptive analytics
builds on the results of predictive analytics in order to recommend actions that take the
greatest advantage of the predicted outcomes. Big data analytics tools can consume an extensive
range of data types: structured data, with distinct and consistent fields, such as transaction
data kept in relational databases; semi-structured data, for
example, mobile application log files or web server logs; as well as unstructured data,
including emails, text messages, text files and social media posts (Loshin, 2015).
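The relationship between predictive and prescriptive analytics described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a method from the cited sources: the predictive step is a simple least-squares trend line over historical readings, and the prescriptive step is a hypothetical threshold rule for equipment maintenance. All names and the rule itself are invented for the example.

```python
def fit_line(xs, ys):
    """Predictive step: ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Forecast the value at a future point x from the fitted trend."""
    return slope * x + intercept


def prescribe(forecast, failure_threshold):
    """Prescriptive step: turn a forecast into a recommended action.

    The threshold rule is a hypothetical stand-in for a real decision model.
    """
    if forecast >= failure_threshold:
        return "schedule maintenance"
    return "continue monitoring"
```

For instance, fitting past sensor readings per week, forecasting the next week, and passing that forecast to `prescribe` yields a recommendation rather than just a number, which is the distinction the paragraph draws between the two kinds of analytics.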
Web mining is the process of using data mining techniques and algorithms to extract information
directly from the Web, by mining web content, hyperlinks, web documents and server logs. The
main objective of Web mining is to search for patterns in web data by gathering and examining
the information so as to gain insight into trends, the industry and users in general (Patel,
Chauhan, & Patel, 2011). It is considered a branch of data mining that focuses on the World
Wide Web as the primary data source, including all of its components, from Web content to
server logs and everything in between. The data mined from the Web might be the collection of
facts that Web pages contain, and might consist of text or structured data such as lists and
tables, as well as images, video and audio.
There are various categories of Web mining, which are:
Web content mining- It is the process of mining valuable information from the content of web
pages and documents, mainly text, images and audio or video files.
Web structure mining- It is the process of analyzing the nodes and link structure of a website
using graph theory. Two things can be obtained through this: the structure of a website in
terms of its connections with other sites, and the document structure of the website itself,
that is, the way individual pages are connected (Techopedia, 2018).
Web usage mining- It is the process of extracting patterns and information from server logs to
gain an understanding of user activity, together with information
about where the users are from, the number of clicks on each item on the site and the
various activities being conducted on the site.
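Two of the categories above, web usage mining and web structure mining, can be sketched with the Python standard library. The example assumes server logs in a Common Log Format-like layout and a link graph given as a dictionary of page to outgoing links; both the input shapes and the function names are assumptions made for illustration.

```python
import re
from collections import Counter

# Assumed layout: host ident user [time] "METHOD path PROTOCOL" status size
LOG_PATTERN = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]+\] "(\S+) (\S+) [^"]*" (\d{3}) \S+'
)


def usage_summary(log_lines):
    """Web usage mining: count requests per page and per client host."""
    hits_per_page = Counter()
    hits_per_host = Counter()
    for line in log_lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed layout
        host, _method, path, _status = match.groups()
        hits_per_page[path] += 1
        hits_per_host[host] += 1
    return hits_per_page, hits_per_host


def in_degree(links):
    """Web structure mining: count incoming links for each page.

    `links` maps a page to the list of pages it links to, i.e. a directed graph.
    """
    counts = Counter()
    for source, targets in links.items():
        counts[source] += 0  # make sure pages with no incoming links appear
        for target in targets:
            counts[target] += 1
    return counts
```

The per-page and per-host counts correspond to the click and visitor information mentioned for usage mining, while the in-degree of a page is one of the simplest graph measures of how a site's pages are connected. Real analyses would typically use richer measures and more robust log parsing.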
References
Anadiotis, G. (2017). In-memory computing: Where fast data meets big data. Retrieved from
Zdnet.com:
https://www.zdnet.com/article/in-memory-computing-where-fast-data-meets-big-data/
Ciklum. (2017). Limitations of Big Data analytics. Retrieved from Ciklum.com:
https://www.ciklum.com/blog/limitations-of-big-data-analytics/
Delgado, R. (2015). Improving the accuracy of Big Data analysis. Retrieved from
Dataconomy.com:
http://dataconomy.com/2015/10/improving-the-accuracy-of-big-data-analysis-2/
El-Seoud, S. A., El-Sofany, H. F., Abdelfattah, M., & Mohamed, R. (2017). Big Data and cloud
computing: trends and challenges. iJIM, 11(2), 34-52.
Loshin, D. (2015). How big data analytics tools can help your organization. Retrieved from
Techtarget.com:
https://searchbusinessanalytics.techtarget.com/feature/How-big-data-analytics-tools-can-help-your-organization
Manyika, J., Chui, M., Brown, B., Bughin, J., Dobbs, R., Roxburgh, C., & Byers, A. H. (2011).
Big Data: The next frontier for innovation, competition, and productivity. McKinsey
Global Institute.
McKinsey & Company. (2018). How companies are using big data and analytics. Retrieved
from Mckinsey.com:
https://www.mckinsey.com/business-functions/mckinsey-analytics/our-insights/how-companies-are-using-big-data-and-analytics
Patel, K. B., Chauhan, J. A., & Patel, J. D. (2011). Web mining in E-Commerce: Pattern
discovery, issues and applications. International Journal of P2P Network Trends and
Technology, 11(3), 40-45.
Shacklett, M. (2015). 10 things you shouldn't expect big data to do. Retrieved from
Techrepublic.com:
https://www.techrepublic.com/blog/10-things/10-things-you-shouldnt-expect-big-data-to-do/
Techopedia. (2018). Web mining. Retrieved from Techopedia.com:
https://www.techopedia.com/definition/15634/web-mining