
Big Data Workbook for Week 6 - ITECH 2201 Cloud Computing

   

Added on  2023-06-11

ITECH 2201 Cloud Computing
School of Science, Information Technology & Engineering
Workbook for Week 6 (Big Data)
Please note: Every effort was made to ensure the given web links are accessible. However,
if they are broken, please use any appropriate video/article and refer to it in your answer.
Part A (4 Marks)
Exercise 1: Data Science (1 mark)
Read the article at http://datascience.berkeley.edu/about/what-is-data-science/ and
answer the following:
What is Data Science?
Data science is now one of the fastest-growing and most in-demand fields, with public and
private organizations of all kinds seeking data professionals. The article also highlights the
limited supply of professionals who can work with data at scale, which is reflected in rapidly
rising salaries for data analysts, statisticians, data engineers, etc. Because this is a newly
emerging field, the main challenge is finding techniques to use all this data more effectively.
According to IBM estimation, what is the percent of the data in the world today that has been
created in the past two years?
According to IBM's estimate, as reported in 2017, 90% of the data in the world today has been
created in the last two years. The report adds that new sensor technologies and devices will
accelerate the rate of data growth even further. The main challenge faced by marketers is the
growing expectation of customers that their needs and preferences will be understood in every
interaction and transaction.
What is the value of petabyte storage?
CRICOS Provider No. 00103D Insert file name here Page 1 of 10

In the context of enterprise storage, systems have started to leave the terabyte behind,
moving to the petabyte and on towards exabyte storage. One petabyte (PB) is 10^15 bytes
of data, which equals 1,000 terabytes (TB) or 1,000,000 gigabytes (GB). Examples of
storage systems sold at this scale include IBM Scale Out Network Attached Storage
(SONAS), Hitachi NAS Platform (HNAS), and Panasas ActiveStor.
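The unit relationships in this answer can be checked with a short Python sketch. It assumes decimal (SI) units, where each step up is a factor of 1,000. The constant and function names are illustrative only and do not come from the workbook.

```python
# Decimal (SI) storage units, expressed in bytes.
# Assumption: powers of 1000, not the binary powers of 1024.
BYTES_PER_GB = 10**9
BYTES_PER_TB = 10**12
BYTES_PER_PB = 10**15
BYTES_PER_EB = 10**18


def petabytes_to(unit_bytes, petabytes=1):
    """Convert a whole number of petabytes to a unit of the given byte size."""
    return petabytes * BYTES_PER_PB // unit_bytes


pb_in_tb = petabytes_to(BYTES_PER_TB)    # terabytes in one petabyte
pb_in_gb = petabytes_to(BYTES_PER_GB)    # gigabytes in one petabyte
eb_in_pb = BYTES_PER_EB // BYTES_PER_PB  # petabytes in one exabyte

print(pb_in_tb, pb_in_gb, eb_in_pb)
```

Under the binary convention the factors would be 1,024 rather than 1,000, so the figures above hold only for the decimal definition used in the answer.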
For each course, both foundation and advanced, that you find at
http://datascience.berkeley.edu/academics/curriculum/ briefly state (in 2 to 3 lines) what
it offers, based on the given course description as well as the video. The purpose
of this question is to understand the different streams available in Data Science.
According to the article, the foundation courses mainly offer knowledge of and proficiency
in object-oriented programming, and the curriculum is made up of units of both foundation
and advanced coursework.
The advanced courses, in contrast, mainly offer knowledge of experiments and causality
and of the human values behind data. They also cover statistical techniques for time
series, panel data, and discrete responses.
Exercise 2: Characteristics of Big Data (2 marks)
Read the following research paper from IEEE Xplore Digital Library
Ali-ud-din Khan, M.; Uddin, M.F.; Gupta, N., "Seven V's of Big Data understanding Big
Data to extract value," 2014 Zone 1 Conference of the American Society for Engineering
Education (ASEE Zone 1), pp. 1-5, 3-5 April 2014
and answer the following questions:
Summarise the motivation of the author (in one paragraph)
In this article, Khan, Uddin and Gupta (2014) explain that their motivation for writing
the paper is to outline the main arguments surrounding Big Data. They also discuss how to
derive better results from the raw material of big data in the Internet and technology
world. In addition, they note that large amounts of data have to be processed and analysed
in response to queries, alongside traditional techniques such as SQL.
What are the 7 V's mentioned in the paper? Briefly describe each V in one paragraph.
The seven V's discussed in the article are described below:
1. Volume: Volume refers to the size of the data, which includes audio, video, records of
natural disasters, weather forecasts, etc. The key point is how big data differs from
traditional data, which can be accessed with an ordinary SQL query.
2. Velocity: The speed, or velocity, of data is discussed from two perspectives. The first
is the rate at which new data arrives, and the second is the speed at which data moves.
3. Variety: Another difference between traditional data and big data is the variety of
forms that big data takes, since it is mainly acquired directly from user interfaces.
4. Veracity: Veracity concerns the reliability of the data. Traditional data is normalized,
whereas big data is acquired directly from different users and is therefore less reliable.
Hence, checking veracity is an important stage in processing big data.
5. Validity: Validity concerns the accuracy and correctness of the data with respect to its
intended use. Data can be truthful and still not be valid or suitable for a particular
situation.
6. Volatility: Volatility concerns the data retention policy, which applies to both big data
and traditional data. A retention policy is easy to implement for traditional data, whereas
the variety, volume, and velocity of big data magnify the difficulty.
7. Value: Unlike the six V's above, which are characteristics of the data, value is the
desired outcome of big data analysis. According to researchers, the value of the data
needs to exceed the cost of owning and managing it.
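The retention policy mentioned under Volatility can be illustrated with a minimal Python sketch. The Record type, its field names, and the one-year retention period are hypothetical and chosen only for illustration; they do not come from the paper.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Record:
    """Hypothetical stored record; the field names are illustrative only."""
    key: str
    created_at: datetime


def apply_retention(records, now, retention):
    """Keep only the records whose age does not exceed the retention period."""
    cutoff = now - retention
    return [r for r in records if r.created_at >= cutoff]


now = datetime(2014, 4, 5)
records = [
    Record("old", datetime(2012, 1, 1)),
    Record("recent", datetime(2014, 3, 1)),
]

# Assumed retention period of one year (365 days).
kept = apply_retention(records, now, timedelta(days=365))
kept_keys = [r.key for r in kept]
print(kept_keys)
```

For a small, structured data set a filter like this is straightforward. The answer's point is that applying the same rule becomes harder as the volume, variety, and velocity of the data grow.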