Big Data Characteristics, Tools and Applications - ITECH 2201 Cloud Computing

   

Added on  2023-06-11

ITECH 2201 Cloud Computing
School of Science, Information Technology & Engineering
Workbook for Week 6 (Big Data)
Please note: Every effort was made to ensure the given web links are accessible. However,
if they are broken, please use any appropriate video/article and reference it in your answer.
Part A (4 Marks)
Exercise 1: Data Science (1 mark)
Read the article at http://datascience.berkeley.edu/about/what-is-data-science/ and
answer the following:
What is Data Science?
Data science is the study of where information comes from and how it can be converted into
a useful resource for business and IT strategies. It helps to analyze patterns in
structured and unstructured data.
According to IBM's estimate, what percentage of the data in the world today has been
created in the past two years?
According to IBM's estimate, over 90% of the data present in the world today has been
created in the last two years.
What is the value of a petabyte of storage?
One petabyte of storage equals 10^15 bytes, which is 1,000 terabytes or 1,000,000 gigabytes.
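The conversion above can be checked with a short Python sketch. It assumes decimal (SI) units, where each step up is a factor of 1,000; the constant and function names are illustrative only and do not come from the workbook.

```python
# Decimal (SI) unit sizes in bytes. These names are assumptions for illustration.
BYTES_PER_GB = 10**9
BYTES_PER_TB = 10**12
BYTES_PER_PB = 10**15


def petabytes_to(unit_bytes: int, petabytes: int = 1) -> int:
    """Convert whole petabytes into a unit whose size in bytes is unit_bytes."""
    return petabytes * BYTES_PER_PB // unit_bytes


print(petabytes_to(BYTES_PER_GB))  # gigabytes in one petabyte
print(petabytes_to(BYTES_PER_TB))  # terabytes in one petabyte
print(petabytes_to(1))             # bytes in one petabyte
```

Note that the binary unit, the pebibyte (2^50 bytes), is slightly larger than the decimal petabyte used here.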
CRICOS Provider No. 00103D Insert file name here Page 1 of 8

For each course, both foundation and advanced, that you find at
http://datascience.berkeley.edu/academics/curriculum/ briefly state (in 2 to 3 lines) what
it offers, based on the given course description as well as the video. The purpose
of this question is to understand the different streams available in Data Science.
Students who have fared well in object-oriented programming (OOP) take 12 units of
coursework in the foundation courses. Students who are less proficient in object-oriented
programming are assigned 15 units of coursework.
The foundation courses include subjects such as applied machine learning, statistics for
data science, data analysis and sorted data engineering.
The advanced courses include subjects such as data visualization, machine learning at
scale, scaling up big data, causality and experiments, human values and deep learning.
Exercise 2: Characteristics of Big Data (2 marks)
Read the following research paper from IEEE Xplore Digital Library
Ali-ud-din Khan, M.; Uddin, M.F.; Gupta, N., "Seven V's of Big Data understanding Big
Data to extract value," 2014 Zone 1 Conference of the American Society for Engineering
Education (ASEE Zone 1), pp. 1-5, 3-5 April 2014
and answer the following questions:
Summarise the motivation of the author (in one paragraph)
The author argues that big data is the only solution to a number of problems that
industries face today. It is present everywhere. It benefits large and small businesses
alike, as well as entertainment, law enforcement and film making. Even huge organizations
such as Google and Facebook use this technology. The paper cites technologies such as
Hadoop, MapReduce and Bigtable in connection with Google. The author further explains the
significance of the technology in other fields such as finance, politics, sustainability,
biological research and education.
What are the 7 V's mentioned in the paper? Briefly describe each V in one paragraph.
The 7 V's in the paper are volume, velocity, value, validity, veracity, volatility and
variety. Volume refers to the creation of big data from numerous sources such as research
studies, images, video, text and audio. Data can also be taken from government documents,
telemetry, web pages and social media, which explains the volatility aspect. The velocity
aspect states that a system handling big data should have infrastructure capable of
processing it at high speed. Value and veracity come from the different uses of
clouds and browsers for big data.
Explore the author’s future work by using the reference [4] in the research paper.
Summarise your understanding of how Big Data can improve the healthcare sector in 300
words.
Big data can be used in the healthcare sector for a number of purposes: to support
personalized medicines and treatments, to deter unscrupulous behavior by employees, and
to improve treatment services.
Exercise 3: Big Data Platform (1 mark)
In order to build a big data platform, one has to acquire, organize and analyse the big
data. Go through the following links and answer the questions that follow the links. Check
the videos and change the wordings.
http://www.infochimps.com/infochimps-cloud/how-it-works/
