HIVE in Big data - Importance, Features and Use Cases

Added on 2023-06-01

About This Document

This article discusses the importance of HIVE in Big data, its features and use cases. It explains how HIVE enables reading, writing and managing large datasets in distributed storage. The article also covers the benefits of using HIVE for batch processing workloads and ETL, and how it is a good choice for environments experiencing tremendous growth in data volume.

Table of Contents
HIVE in Big data
Summary
References

HIVE in Big data
The big data industry has mastered the art of gathering and logging
terabytes of data, but the challenge is to base forecasts and make decisions on this
real data, which is why Apache Hive is so important. Hive projects a structure
onto the data and queries it with a SQL-like syntax, running map and reduce tasks
on large datasets. Hive data warehouse software enables reading, writing, and
managing large datasets in distributed storage. Queries written in the Hive query
language (HiveQL), which is very similar to SQL, are converted into a series of jobs
that execute on a Hadoop cluster through MapReduce or Apache Spark. Users can run
batch processing workloads with Hive while also analyzing the same data for
interactive SQL or machine-learning workloads using tools like Apache Impala or
Apache Spark, all within a single platform. As part of Cloudera's CDH distribution,
Hive also benefits from (Akerkar, 2014):
• Unified resource management provided by YARN
• Simplified deployment and administration provided by Cloudera Manager
• Shared security and governance to meet compliance requirements, provided by Apache
Sentry and Cloudera Navigator
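The idea of projecting a structure onto raw data and then answering a SQL-like query with map and reduce steps can be sketched in plain Python. This is only a conceptual illustration, not Hive's actual execution engine: the sample lines, the SCHEMA definition, the table name sales_log in the comment, and every function name are hypothetical and chosen for this example.

```python
from collections import defaultdict

# Hypothetical raw, tab-delimited lines as they might sit in distributed storage.
RAW_LINES = [
    "2023-05-01\tsales\t120.50",
    "2023-05-01\tsupport\t30.00",
    "2023-05-02\tsales\t79.50",
    "2023-05-02\tsupport\t45.25",
    "2023-05-03\tsales\t200.00",
]

# Hypothetical schema applied at read time: (column name, type converter).
SCHEMA = [("day", str), ("dept", str), ("amount", float)]


def project_schema(line):
    """Split one raw line and cast each field according to SCHEMA."""
    fields = line.split("\t")
    return {name: cast(value) for (name, cast), value in zip(SCHEMA, fields)}


def map_phase(rows):
    """Emit (key, value) pairs: the GROUP BY column and the value to aggregate."""
    for row in rows:
        yield row["dept"], row["amount"]


def shuffle_phase(pairs):
    """Collect all values that share a key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Aggregate the values of each key; here the aggregate is SUM(amount)."""
    return {key: sum(values) for key, values in grouped.items()}


def run_query(lines):
    """Conceptual equivalent of:
    SELECT dept, SUM(amount) FROM sales_log GROUP BY dept
    """
    rows = (project_schema(line) for line in lines)
    return reduce_phase(shuffle_phase(map_phase(rows)))


totals = run_query(RAW_LINES)
print(totals)  # {'sales': 400.0, 'support': 75.25}
```

In this sketch the raw text is never rewritten; the schema is applied only when the data is read, and the grouping column becomes the key that routes values to the reduce step. A real Hive deployment distributes these phases across the cluster rather than running them in one process.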
Because Hive is a petabyte-scale data warehouse system built on the
Hadoop platform, it is a good choice for environments experiencing tremendous growth in
data volume. The underlying MapReduce interface to HDFS is hard to program
directly, but Hive provides a SQL interface, making it possible to use existing
programming skills for data preparation. Hive on MapReduce or Spark is
best suited to batch data preparation or ETL:
• You need to run scheduled batch jobs with very large ETL sorts with joins to
prepare data for Hadoop. Most data served to BI users in Impala is
prepared by ETL developers using Hive (Mohanty, Jagadeesh and Srivatsa, 2013).
• You run data transfer or conversion jobs that take many
hours. With Hive, if a problem occurs partway through such a job, it recovers
and continues.
• You receive or provide data in diverse formats, where the Hive SerDes and
variety of UDFs make it convenient to ingest and convert the data. Typically, ...