Business Intelligence Using Big Data

Added on  2023/01/17

AI Summary
This document provides an overview of business intelligence using big data. It covers topics such as the Vs of big data, big data technologies like Hadoop and Spark, NoSQL databases, predictive analytics, artificial intelligence, and more. The document also includes a case study on how United Airlines uses big data. Suitable for courses in business intelligence, data analytics, and big data.

BUSINESS INTELLIGENCE USING BIG DATA
By Name
Course
Instructor
Institution
Location
Date

Contents
Introduction
Explaining the Vs of big data
Big Data Technologies
Hadoop Ecosystem
Spark
R
Data Lakes
NoSQL Databases
Predictive Analytics
Artificial Intelligence
Streaming Analytics
Edge computing
Blockchain
Big Data Modeling
Techniques for Data Modeling
Data Modeling in Social Networks
Modeling of Data in Cloud Environment
Data Model based on Ontology
Big Data for United Airlines
Architecture of Big Data used by United Airlines
How United Airlines may use big data to customize customer needs
Providing highly personalized offers
Calls that are more insightful
Provision of smarter and safer flights
Provision of real time baggage status
References
Introduction
Big data is an evolving term that describes a large volume of structured, semi-structured, and
unstructured data that can be mined for information and used in machine learning projects and
other advanced analytics applications. Big data is commonly characterized by the 3Vs: the
extreme volume of data, the wide variety of data types, and the velocity at which the data has to
be processed. These characteristics were first identified by Doug Laney, then an analyst at META
Group (later acquired by Gartner), in a report published in 2001 (Abbasi, Sarker and Chiang,
2016). More recently, several other Vs have been added to descriptions of big data, among them
veracity, variability, and value. Although big data does not equate to any specific volume of
data, the term is normally used to describe terabytes, petabytes, and even exabytes of data
captured over time.
Explaining the Vs of big data
Such voluminous data may come from a wide range of sources, for instance customer databases,
business transaction systems, medical records, mobile applications, real-time data sensors,
machine-generated data, internet clickstream logs, and the collected results of scientific
experiments (Ahmed et al., 2017). The data may be left in its raw form or pre-processed with
data mining tools or data preparation software prior to analysis.
Big data also encompasses a wide variety of data types: structured data in SQL databases and
data warehouses; unstructured data, such as document and text files held in Hadoop clusters or
NoSQL systems; and semi-structured data, for instance web server logs or streaming data from
various sources. Big data also includes multiple, simultaneous sources of data that may not
otherwise be integrated. For instance, a data analytics project may try to measure the success of
a product and its future sales by correlating sales data, online buyer review data, and return data
for that specific product.
Big Data Technologies
Hadoop Ecosystem
Although Apache Hadoop may not be as dominant as it once was, it is almost impossible to
discuss big data without mentioning this open source framework for the distributed processing
of large data sets (Al-Ali et al., 2017). Forrester predicted that nearly 100% of large enterprises
would adopt Hadoop and related technologies such as Spark for big data analytics within the
following two years. Over the years, Hadoop has grown to incorporate a whole ecosystem of
related software, and numerous big data solutions depend on it. Zion Market Research has
forecast that the market for Hadoop-based products and services will continue to grow at a
50 per cent CAGR through 2022.
Spark
Apache Spark is part of the Hadoop ecosystem, but its use has become so widespread that it
merits its own category. It is an engine for processing big data within Hadoop and has been
reported to be up to 100 times faster than the standard Hadoop engine, MapReduce.
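To make the MapReduce model mentioned above more concrete, the following is a minimal single-process sketch of its three stages (map, shuffle, reduce) applied to a word count. It uses plain Python only, not the Hadoop or Spark APIs, and the function names and sample sentences are illustrative assumptions.

```python
from collections import defaultdict

def map_phase(document):
    """Emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle_phase(pairs):
    """Group all emitted values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Sum the grouped values for each key."""
    return {key: sum(values) for key, values in grouped.items()}

# Hypothetical input records; a real job would read these from distributed storage.
documents = ["big data needs big clusters", "spark processes big data"]
pairs = [pair for doc in documents for pair in map_phase(doc)]
word_counts = reduce_phase(shuffle_phase(pairs))
print(word_counts["big"])  # 3
```

In a real cluster the map and reduce functions run in parallel on many machines; the sketch only shows the data flow between the stages.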
According to the AtScale 2016 Big Data Maturity Survey, about 25 per cent of respondents had
already adopted Spark in their production systems, while another 33 per cent had Spark projects
in development. This is a clear indication of sizeable and growing interest in Spark, and
numerous vendors with Hadoop offerings also provide Spark-based products.
R
R, another open source project, is a programming language and software environment designed
for working with statistics. A favorite of many data scientists, it is managed by the R Foundation
and made available under the GNU General Public License (GPL). Many popular integrated
development environments, among them Visual Studio and Eclipse, support the language
(Côrte-Real, Oliveira and Ruivo, 2017).
Several organizations that rank the popularity of programming languages indicate that R has
become one of the most popular languages in the world. This is significant because the
languages near the top of such charts are usually general-purpose languages that can be used for
many types of work. For a language used almost exclusively for big data projects to rank so
close to the top demonstrates both the importance of big data and the importance of the
language in its field.
Data Lakes
A number of enterprises are setting up data lakes to make their vast stores of data easier to
access. These are large repositories that gather data from many different sources and store it in
its natural state. This differs from a data warehouse, which also gathers data from disparate
sources but processes and structures it before storage. In this case, the lake and warehouse
metaphors are fairly accurate. If data is like water, a data lake is natural and unfiltered, like a
body of water, while a data warehouse is more like a collection of water bottles kept on shelves.
Data lakes are particularly attractive when an enterprise wants to store data but is not yet sure
how the data will be used (De Mauro, Greco and Grimaldi, 2015). Much IoT data may fit into
this category, and the IoT trend is contributing to the growth of data lakes.
NoSQL Databases
Conventional relational database management systems (RDBMSs) store information in
structured, defined rows and columns. Developers and database administrators query, modify,
and manage the data in these systems using a special language called SQL. NoSQL databases
specialize in storing unstructured data and providing fast performance, although they do not
offer the same level of consistency as RDBMSs (De Mauro, Greco and Grimaldi, 2015).
Popular NoSQL databases include Redis, MongoDB, Couchbase, and Cassandra, among many
others, and even leading RDBMS vendors such as IBM and Oracle also offer NoSQL databases.
NoSQL databases have gained significant popularity as the big data trend has grown.
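The contrast between fixed rows and columns and schema-free storage can be sketched as follows. The relational side uses Python's built-in sqlite3 module; the document side is only a plain dictionary standing in for a NoSQL store, not the client API of Redis, MongoDB, or any other product. The table, keys, and customer records are hypothetical.

```python
import json
import sqlite3

# Relational side: the schema is fixed before any row is inserted.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.execute("INSERT INTO customers (id, name, city) VALUES (?, ?, ?)", (1, "Ada", "Chicago"))
relational_row = conn.execute(
    "SELECT name, city FROM customers WHERE id = ?", (1,)
).fetchone()

# Document side: each record is a self-describing JSON document stored under a key,
# so two records in the same collection may carry different fields.
document_store = {}
document_store["cust:1"] = json.dumps({"name": "Ada", "city": "Chicago"})
document_store["cust:2"] = json.dumps(
    {"name": "Lin", "loyalty": {"tier": "gold", "miles": 48000}}
)

second_customer = json.loads(document_store["cust:2"])
print(relational_row)                      # ('Ada', 'Chicago')
print(second_customer["loyalty"]["tier"])  # gold
conn.close()
```

Adding the nested loyalty field to the relational table would require a schema change, whereas the document store accepts it without one.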
Predictive Analytics
Predictive analytics is a subset of big data analytics that tries to forecast future events or
behavior based on historical data. It draws on data mining, machine learning, and modeling
techniques to predict what is likely to happen next. It is commonly applied in fraud detection,
credit scoring, marketing, finance, and business analysis. In recent years, developments in
artificial intelligence have greatly improved the capabilities of predictive analytics solutions. As
a result, enterprises have started investing more in big data solutions with predictive
capabilities.
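As a small illustration of forecasting from historical data, the sketch below fits a straight line to past values by ordinary least squares and extrapolates one period ahead. It is only one of many possible modeling techniques, and the monthly sales figures are invented for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical history: sales for months 1 to 5.
months = [1, 2, 3, 4, 5]
sales = [100.0, 110.0, 120.0, 130.0, 140.0]

slope, intercept = fit_line(months, sales)
forecast_month_6 = slope * 6 + intercept
print(forecast_month_6)  # 150.0
```

Production systems typically use richer models and many more variables, but the principle of learning a pattern from history and projecting it forward is the same.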
Artificial Intelligence
Although the concept of artificial intelligence (AI) has been around for nearly as long as
computers, the technology has become truly useful only in the last few years. In many ways, the
big data trend has driven advances in AI, particularly in two subsets of the field: machine
learning and deep learning (Gandomi and Haider, 2015). The generally accepted definition of
machine learning is a technology that gives computers the ability to learn without being
explicitly programmed.
In big data analytics, machine learning enables a system to analyze historical data, recognize
patterns, build models, and predict future outcomes. Deep learning is a kind of machine learning
that relies on artificial neural networks and uses multiple layers of algorithms to analyze data. It
holds a great deal of promise as a field for enabling analytics tools to recognize the content of
images and videos and then process it appropriately (Gunasekaran et al., 2018).
Streaming Analytics
As organizations have become more familiar with the capabilities of big data analytics
solutions, they have started demanding faster access to insights. For these enterprises, streaming
analytics, with its ability to analyze data as it is being created, is something of a holy grail.
They are looking for solutions that can accept input from multiple disparate sources, process it,
and return insights immediately, or as close to immediately as possible. This is particularly
desirable for new IoT deployments, which are helping to drive interest in streaming big data
analytics.
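The idea of analyzing data as it is created, rather than after it has been stored, can be sketched with a generator that updates a rolling average each time a new reading arrives. The window size and sensor values are assumptions made for the example; a real deployment would read from a message queue or socket rather than a list.

```python
from collections import deque

def rolling_average(stream, window_size):
    """Yield the average of the most recent readings as each new one arrives."""
    window = deque(maxlen=window_size)  # old readings fall out automatically
    for reading in stream:
        window.append(reading)
        yield sum(window) / len(window)

# Hypothetical sensor feed, consumed one reading at a time.
sensor_readings = iter([10, 20, 30, 40])
averages = list(rolling_average(sensor_readings, window_size=2))
print(averages)  # [10.0, 15.0, 25.0, 35.0]
```

Because the generator never holds more than one window of data, an insight is available after every reading without waiting for the full data set.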
Edge computing
In addition to spurring interest in streaming analytics, the IoT is also generating more interest
in edge computing. Edge computing is, to some extent, the reverse of cloud computing. Rather
than transmitting data to a centralized server for analysis, edge computing systems analyze data
very close to where it is generated, at the network edge (Hashem et al., 2016). The main benefit
of an edge computing system is that it reduces the volume of information that must be
transmitted over the network, which lowers network traffic and the associated costs. It also
reduces the demand on data centers and cloud computing facilities, removing a potential single
point of failure and freeing up capacity for other workloads.
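The reduction in network traffic can be illustrated with a short sketch in which a device summarizes its own readings and sends one summary instead of every raw value. The summary fields and the temperature readings are hypothetical, and no particular edge platform is implied.

```python
def summarize_at_edge(readings):
    """Reduce a batch of raw readings to one small summary record on the device."""
    return {
        "count": len(readings),
        "min": min(readings),
        "max": max(readings),
        "mean": sum(readings) / len(readings),
    }

# Hypothetical temperature readings collected by one device.
raw_readings = [21.0, 21.5, 22.0, 21.5]

summary = summarize_at_edge(raw_readings)
messages_without_edge = len(raw_readings)  # every reading sent to the data center
messages_with_edge = 1                     # only the summary is sent
print(summary["mean"])                     # 21.5
```

The trade-off is that the central system sees only the summary, so the summary fields have to be chosen with the later analysis in mind.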
Blockchain
Blockchain, another favorite of forward-looking analysts and venture capitalists, is the
distributed database technology that underlies the Bitcoin digital currency. The distinguishing
characteristic of a blockchain database is that once data has been written, it cannot be changed
or deleted after the fact. It is also regarded as highly secure, which makes it an attractive choice
for big data applications in sensitive industries such as health care, insurance, and banking.
Blockchain technology is still in its infancy, and use cases are still being developed (Kimble
and Milolidakis, 2015). Nevertheless, numerous vendors, among them Microsoft, IBM, and
AWS, along with multiple start-ups, have launched experimental or introductory solutions based
on blockchain technology.
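Why written data is hard to change after the fact can be shown with a simplified hash chain, in which each block stores the hash of the previous one. This sketch covers only the tamper-evidence idea; it omits the distribution, consensus, and signing that a real blockchain requires, and the ledger entries are invented.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder previous hash for the first block

def block_hash(index, data, previous_hash):
    """SHA-256 over the block contents, including the previous block's hash."""
    payload = json.dumps(
        {"index": index, "data": data, "previous_hash": previous_hash}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def append_block(chain, data):
    previous_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    index = len(chain)
    chain.append({
        "index": index,
        "data": data,
        "previous_hash": previous_hash,
        "hash": block_hash(index, data, previous_hash),
    })

def is_valid(chain):
    """Recompute every hash and check each link to the preceding block."""
    for i, block in enumerate(chain):
        expected_previous = chain[i - 1]["hash"] if i > 0 else GENESIS_HASH
        if block["previous_hash"] != expected_previous:
            return False
        if block["hash"] != block_hash(block["index"], block["data"], block["previous_hash"]):
            return False
    return True

ledger = []
append_block(ledger, "policy 1001 issued")
append_block(ledger, "claim 55 paid")
valid_before = is_valid(ledger)             # True

ledger[0]["data"] = "policy 1001 cancelled"  # tamper with an earlier record
valid_after = is_valid(ledger)               # False: the stored hash no longer matches
```

Any edit to an earlier block invalidates its stored hash, so the change is detectable by anyone who re-verifies the chain.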
Big Data Modeling
Big data has attracted growing attention and recognition in recent years because of its broad
research and application prospects. As interest in big data rises, so does the effort devoted to
storing and analyzing it. In the past, most DBMS packages managed structured data only. More
recently, data has been found to include an unstructured part as well, and that part plays a
significant role in decision making. Web logs, for instance, make up a large share of website
data and are mostly unstructured, yet they remain an important resource for understanding
customer interests (Larson and Chang, 2016). Most of the data generated today is unstructured,
which makes data management a challenge, so modeling needs to be carried out before analytics
is applied to big data. Modeling is important because big data is composed of structured,
semi-structured, and unstructured data, and about 85% of it tends to be unstructured or
semi-structured.
Techniques for Data Modeling
Data Modeling in Social Networks
The data model that uses Google's Bigtable to store social network data, including content and
comments, is shown in the accompanying figure. The table can be viewed as a key/value model
made up of n rows, each of which has a unique identifier in the row key field. Every row key is
associated with multiple columns, and every column stores a column key and a column value. A
column holds n key/value pairs, and each data item is uniquely identified by its column key.
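The row key and column key structure described above can be sketched as a nested dictionary. This is only an in-memory illustration of the key/value layout, not the Bigtable client API, and the row keys, column keys, and post contents are hypothetical.

```python
# Outer key: unique row key. Inner key: column key. Inner value: column value.
social_table = {
    "post:1001": {
        "content": "Boarding now",
        "comment:1": "Safe travels",
        "comment:2": "Enjoy",
    },
    "post:1002": {
        "content": "Landed",
        "comment:1": "Welcome back",
    },
}

def get_cell(table, row_key, column_key):
    """Look up one value by its row key and column key."""
    return table[row_key][column_key]

def put_cell(table, row_key, column_key, value):
    """Write one value; rows may have different sets of columns."""
    table.setdefault(row_key, {})[column_key] = value

put_cell(social_table, "post:1002", "comment:2", "How was it?")
print(get_cell(social_table, "post:1001", "comment:1"))  # Safe travels
```

Because each row carries its own set of column keys, a post with two comments and a post with two hundred fit in the same table without a schema change.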
Modeling of Data in Cloud Environment
This model first builds a schema for big data. To create the schema, the technique first
identifies the kind of data arriving from the various sources. If the identified data is
unstructured, the key information is obtained from it by developing metadata (Phillips-Wren et
al., 2015). To develop the metadata, entities are extracted, such as names and publisher, along
with facts, such as content types and issues.
Once the required information has been extracted, it is grouped by data type, and a table is
generated that is mapped to the structured data schema. After both schemas are mapped, a
unified schema is created that holds information about all the data stored in the database, in the
Big Data Dictionary.
The data is stored on commodity hardware at the cloud storage level using Hadoop's HDFS.
Clusters are formed based on data type: two types of cluster are formed, one for storing
unstructured data and another for storing structured data. The unstructured data cluster is
classified further by category of unstructured data. For instance, if there are three groups of
unstructured data, such as audio, video, and text, then three clusters are formed (Wamba et al.,
2015).
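The steps above (identify the incoming data type, build metadata, and assign each item to a cluster by category) can be sketched as follows. The extension-to-category mapping, the metadata fields, and the file names are assumptions made for illustration; the source does not specify how types are detected, and no HDFS calls are made.

```python
import os
from collections import defaultdict

# Assumed mapping from file extension to data category.
CATEGORY_BY_EXTENSION = {
    ".csv": "structured",
    ".mp3": "audio",
    ".mp4": "video",
    ".txt": "text",
}

def build_metadata(filename, publisher):
    """Create a small metadata record for one incoming item."""
    extension = os.path.splitext(filename)[1].lower()
    return {
        "name": filename,
        "publisher": publisher,
        "content_type": CATEGORY_BY_EXTENSION.get(extension, "unknown"),
    }

def assign_clusters(records):
    """Group item names by content type, one cluster per category."""
    clusters = defaultdict(list)
    for record in records:
        clusters[record["content_type"]].append(record["name"])
    return dict(clusters)

# Hypothetical incoming items as (file name, publisher) pairs.
incoming = [
    ("bookings.csv", "sales"),
    ("call_0001.mp3", "support"),
    ("safety_brief.mp4", "training"),
    ("review_17.txt", "web"),
]

big_data_dictionary = [build_metadata(name, publisher) for name, publisher in incoming]
clusters = assign_clusters(big_data_dictionary)
print(sorted(clusters))  # ['audio', 'structured', 'text', 'video']
```

Here the list of metadata records plays the role of a very small Big Data Dictionary, and the three unstructured categories end up in three separate clusters.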
Data Model based on Ontology
An ontology is an explicit specification of a conceptualization; it refers to the abstract
modeling of real-world objects, for instance concepts, identities, and constraints. Because big
data is extracted from various data sources, the data may not be shareable, and the sources may
not be able to interpret one another's data. The accompanying flow diagram analyzes how
ontology technology can be combined with a data model that conforms to the MapReduce
framework to address the challenges posed by unstructured data.

Big Data for United Airlines
Architecture of Big Data used by United Airlines
United Airlines used two major data platforms to meet the needs of its customers: Teradata and
the Hortonworks platform. Teradata is one of the most mature data platforms, having been in
existence for more than 20 years, while Hortonworks is an emerging technology that offers
economical data science, is data lake friendly, and supports new frameworks at a relatively
faster rate than Teradata. The target architecture for United Airlines' use of big data in
satisfying customer needs revolves around security and governance. A common security
strategy is implemented using Teradata-GDPR, while governance is achieved using Apache
Atlas.
Figure 2: Target architecture data lake/curated layer for United Airlines
How United Airlines may use big data to customize customer needs
Providing highly personalized offers
Matters become more interesting when a single airline booking is composed of numerous
moving parts, so airlines must get creative to stay on the ball and outdo their competitors.
United Airlines shifted from a conventional "collect and analyze" approach for upselling
add-ons to what has been described as a smart "collect, detect, act" system, which analyzes more
than 150 variables in a customer's profile (Wang, Kung and Byrd, 2018). Everything from
previous destinations to prior purchases is evaluated in about 200 milliseconds to determine a
passenger's likely actions and to develop a tailor-made offer. The new system has increased
United Airlines' year-over-year revenue by more than 15%.
Calls that are more insightful
United Airlines has partnered with Aspect, a software company, to develop a unique suite of
customer contact and workforce optimization solutions. One solution that stands out is the
speech analytics tool, which enables customer service representatives to understand the nuances
of each recorded customer interaction (Xu, Frankwick and Ramirez, 2016). It does not end at
phone calls. Representatives at United Airlines can also analyze online data from various
channels, including social media, to gather more information about customers in real time.
Various metrics are used to guide service personnel to the most suitable solution in each
scenario. United Airlines has long been admired for its award-winning customer service, so
numerous other airlines are likely to follow suit soon.
Provision of smarter and safer flights
Every engine of a Boeing 737 generates more than 20 terabytes of data every hour, so a typical
six-hour cross-country flight on this twin-engine aircraft would generate about 240 terabytes of
sensor data from one plane. United Airlines has more than 800 airplanes and therefore a great
deal of data it is able to mine. According to experts, United Airlines has joined forces with
NASA to mine this massive cache of sensor data in an effort to enhance air safety. NASA has
developed an automated system that can crunch large data sets to detect anomalies that may
indicate possible safety issues; it is powered by a machine learning algorithm that teaches
computers what information to look for. United Airlines hopes that NASA's expertise will
someday enable it to fly its entire fleet of more than 700 planes without any incident (Xu,
Frankwick and Ramirez, 2016).
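The 240-terabyte figure can be checked with a short calculation. The per-engine rate and flight duration come from the text; the engine count of two is the one added assumption, based on the Boeing 737 being a twin-engine aircraft.

```python
TB_PER_ENGINE_PER_HOUR = 20  # from the text: more than 20 TB per engine per hour
ENGINES_PER_AIRCRAFT = 2     # assumption: the Boeing 737 has two engines
FLIGHT_HOURS = 6             # from the text: a six-hour cross-country flight

tb_per_flight = TB_PER_ENGINE_PER_HOUR * ENGINES_PER_AIRCRAFT * FLIGHT_HOURS
print(tb_per_flight)  # 240
```

Since the text gives 20 terabytes as a lower bound per engine, the result is best read as an approximate minimum per flight.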
Provision of real time baggage status
According to one United Airlines employee, every customer has had the experience of boarding
a flight after checking their bags and worrying whether the bags made it on board. This
universal fear is the reason United Airlines introduced its innovative baggage tracking app for
customers. The new app takes the behind-the-scenes baggage check data that United Airlines
personnel use to make sure bags do not go missing and presents it in an easy-to-use tracker for
customers who may be anxious (Xu, Frankwick and Ramirez, 2016). Radio Frequency
Identification (RFID) will replace barcode hand scanning, which has been the industry standard
since the 1990s. United Airlines has long had a significant interest in assuring the safety of the
more than 100 million bags it handles every year. It is one of very few airlines taking such a
large and untested step to enhance customer satisfaction.
References
Abbasi, A., Sarker, S. and Chiang, R.H., 2016. Big data research in information systems: Toward
an inclusive research agenda. Journal of the Association for Information Systems, 17(2), p.I
Ahmed, E., Yaqoob, I., Hashem, I.A.T., Khan, I., Ahmed, A.I.A., Imran, M. and Vasilakos,
A.V., 2017. The role of big data analytics in Internet of Things. Computer Networks, 129,
pp.459-471
Al-Ali, A.R., Zualkernan, I.A., Rashid, M., Gupta, R. and Alikarar, M., 2017. A smart home
energy management system using IoT and big data analytics approach. IEEE Transactions on
Consumer Electronics, 63(4), pp.426-434
Côrte-Real, N., Oliveira, T. and Ruivo, P., 2017. Assessing business value of Big Data Analytics
in European firms. Journal of Business Research, 70, pp.379-390
De Mauro, A., Greco, M. and Grimaldi, M., 2015, February. What is big data? A consensual
definition and a review of key research topics. In AIP conference proceedings (Vol. 1644, No. 1,
pp. 97-104). AIP
De Mauro, A., Greco, M. and Grimaldi, M., 2016. A formal definition of Big Data based on its
essential features. Library Review, 65(3), pp.122-135
Gandomi, A. and Haider, M., 2015. Beyond the hype: Big data concepts, methods, and
analytics. International journal of information management, 35(2), pp.137-144
Gunasekaran, A., Yusuf, Y.Y., Adeleye, E.O. and Papadopoulos, T., 2018. Agile manufacturing
practices: the role of big data and business analytics with multiple case studies. International
Journal of Production Research, 56(1-2), pp.385-397
Hashem, I.A.T., Chang, V., Anuar, N.B., Adewole, K., Yaqoob, I., Gani, A., Ahmed, E. and
Chiroma, H., 2016. The role of big data in smart city. International Journal of Information
Management, 36(5), pp.748-758
Kimble, C. and Milolidakis, G., 2015. Big data and business intelligence: Debunking the
myths. Global Business and Organizational Excellence, 35(1), pp.23-34
Larson, D. and Chang, V., 2016. A review and future direction of agile, business intelligence,
analytics and data science. International Journal of Information Management, 36(5), pp.700-710
Phillips-Wren, G.E., Iyer, L.S., Kulkarni, U.R. and Ariyachandra, T., 2015. Business Analytics
in the Context of Big Data: A Roadmap for Research. CAIS, 37, p.23
Wamba, S.F., Akter, S., Edwards, A., Chopin, G. and Gnanzou, D., 2015. How 'big data' can
make big impact: Findings from a systematic review and a longitudinal case study. International
Journal of Production Economics, 165, pp.234-246
Wang, Y., Kung, L. and Byrd, T.A., 2018. Big data analytics: Understanding its capabilities and
potential benefits for healthcare organizations. Technological Forecasting and Social
Change, 126, pp.3-13
Xu, Z., Frankwick, G.L. and Ramirez, E., 2016. Effects of big data analytics and traditional
marketing analytics on new product success: A knowledge fusion perspective. Journal of
Business Research, 69(5), pp.1562-1566