
Moving Faculty Information System from RDBMS to Hadoop using Pig Latin Scripts

   

Added on  2023-06-04

Contents
1. Introduction
2. Dataset
3. Upload the dataset
4. Pig Latin Scripts
Task: 1
Task: 2
Task: 3
5. Discussion
6. Conclusion
References
1. Introduction
A large, state-owned, multi-campus university wants to move its Faculty
Information System (FIS) from a relational database management system (RDBMS)
implemented in MS-Access to Hadoop. The production server holds data going back
more than fifty years, collected and compiled from the university's 20 campuses.
The plan is to move this growing body of data (i.e., the relational tables and
associated data files), whose size has started to affect the system's response
and is causing a growing number of related data issues.
Pig Latin is described as a data flow language, in which the result of every
processing step forms a new dataset, or relation (Gates and Dai, 2016). We use
Pig scripts to split the dataset into separate files, one per category. The
project is built in the Cloudera Hue environment. The first step is to upload the
provided dataset, an employee dataset, and import it into Hue. Hue stands for
Hadoop User Experience, and it supports Apache Hadoop and its ecosystem. It is a
web-based query interface that visualizes data; its output depends entirely on
the dataset and the query being run.
The objective of this report is to carry out further work on the datasets, in
particular moving them from the local file system into storage on the Hadoop
system, so that basic analytics can later be extracted from them.
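The move from the local file system into Hadoop storage can be sketched in Pig itself, since a Pig script may run Hadoop file system commands through `fs`. This is a minimal sketch, not the report's actual script: the file name `faculty.csv`, both directory paths, and the comma delimiter are assumptions made for illustration.

```pig
-- Hypothetical paths and file name; adjust them to the actual environment.
-- Create a target directory in HDFS and copy the local export into it.
fs -mkdir -p /user/cloudera/fis
fs -put /home/cloudera/faculty.csv /user/cloudera/fis/faculty.csv

-- Load the uploaded file as a relation. No schema is declared yet, so
-- fields are addressed by position ($0, $1, ...) and typed as bytearray.
-- The comma delimiter is an assumption about the export format.
faculty_raw = LOAD '/user/cloudera/fis/faculty.csv' USING PigStorage(',');

-- Look at a few records to confirm that the upload and delimiter are right.
preview = LIMIT faculty_raw 5;
DUMP preview;
```

The same upload can be done through the Hue file browser; the script form is mainly useful when the step has to be repeated.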
2. Dataset
The provided dataset is a list of faculty members and contains the following
information: staff name, location, title, grade, university, course, date of
joining, type, LWD, division, highest qualification, major, all qualifications,
reports, document and criteria. The dataset can be separated by category. The
first task is to separate the dataset by the degree held by each staff member.
The second task separates the dataset by staff experience, and the last task
separates it by the place where each staff member earned their last degree.
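The three separations described above can be sketched as follows. This is an illustration under stated assumptions rather than the report's own scripts: the input path, the column order, the `yyyy-MM-dd` date format, the degree patterns, the experience thresholds, and the use of the `university` field as the place of the last degree are all hypothetical. It also assumes that no field contains an embedded comma and that the file has no header row.

```pig
-- Assumed path and column order; all fields are loaded as text.
faculty = LOAD '/user/cloudera/fis/faculty.csv' USING PigStorage(',')
    AS (staff_name:chararray, location:chararray, title:chararray,
        grade:chararray, university:chararray, course:chararray,
        date_of_joining:chararray, staff_type:chararray, lwd:chararray,
        division:chararray, highest_qualification:chararray,
        major:chararray, all_qualifications:chararray,
        reports:chararray, document:chararray, criteria:chararray);

-- Task 1: separate by degree. The patterns are illustrative only.
SPLIT faculty INTO
    phd IF UPPER(highest_qualification) MATCHES '.*PHD.*',
    masters IF UPPER(highest_qualification) MATCHES '.*MASTER.*',
    other_degree OTHERWISE;

-- Task 2: separate by experience, measured in whole years since joining.
-- Assumes dates look like 2010-08-15; other formats need another pattern.
with_exp = FOREACH faculty GENERATE staff_name, highest_qualification,
    YearsBetween(CurrentTime(),
                 ToDate(date_of_joining, 'yyyy-MM-dd')) AS years_exp;
SPLIT with_exp INTO
    junior IF years_exp < 5,
    mid_career IF (years_exp >= 5 AND years_exp < 15),
    senior IF years_exp >= 15;

-- Task 3: collect staff by the place of their last degree, taken here
-- to be the university field, and count the staff per place.
by_place = GROUP faculty BY university;
place_counts = FOREACH by_place GENERATE group AS university,
    COUNT(faculty) AS staff_count;

-- Write some of the results back to HDFS. The output directories are
-- hypothetical and must not exist before the script runs.
STORE phd INTO '/user/cloudera/fis/out/degree_phd' USING PigStorage(',');
STORE senior INTO '/user/cloudera/fis/out/exp_senior' USING PigStorage(',');
STORE place_counts INTO '/user/cloudera/fis/out/place_counts'
    USING PigStorage(',');
```

`SPLIT` suits the first two tasks because the number of categories is fixed in advance. For the third task the set of places is not known beforehand, so the sketch groups on the field instead of naming one output relation per place. `CurrentTime` requires Pig 0.12 or later.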