
Big Data Analysis with Hadoop and Pig

   

Added on  2020-04-21


Table of Contents
1. Introduction
2. Objective
3. Requirements
4. Processing techniques
5. Data processing procedures
6. Conclusion
7. References

1. Introduction

VirtualBox is a software virtualization package that runs on top of a host operating system, and it is used for the implementation in this project. Cloudera provides a virtual machine that makes it possible to work under the conditions described in the handout. The Cloudera QuickStart VM is imported into VirtualBox, and Hue can be installed in the QuickStart VM. Pig scripts for the sales data are then added and run to analyze the data. The requirements of the project are VirtualBox, the Cloudera QuickStart VM, Hue, and a Pig script.

2. Objective

To show per-month sales before and after the campaign
To count advertised product sales by month

3. Requirements

Unix and Windows users need the following:
Hadoop
Java
Ant
JUnit

The format of the data in the original data input file:
1. A .pig file is a text file.
2. Apache Pig supports user-defined functions written in programming languages, which are called from scripts. A script can be run from a file with the .pig extension.

Apache Pig has 2 modes (Grunt shell):
1. Local mode: runs on the local host against the local file system. It is used for testing. HDFS is not required.
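
The two objectives stated above (per-month sales before and after the campaign, and advertised product sales counted by month) come down to grouping records by month and counting them. The sketch below illustrates that logic in plain Python rather than Pig Latin, so it can be tried without a Hadoop setup. The sample records, the field layout (sale date, product, advertised flag), and the campaign start date are all hypothetical, since the source does not show the input data.

```python
from collections import Counter
from datetime import date

# Hypothetical sample records: (sale date, product name, advertised flag).
# The real input layout is not given in the source.
SALES = [
    (date(2019, 1, 14), "widget", True),
    (date(2019, 1, 30), "gadget", False),
    (date(2019, 2, 3), "widget", True),
    (date(2019, 3, 9), "widget", True),
    (date(2019, 3, 21), "gadget", False),
    (date(2019, 3, 25), "widget", True),
]

# Hypothetical campaign start date (an assumption for illustration).
CAMPAIGN_START = date(2019, 3, 1)


def month_key(d):
    """Return a sortable 'YYYY-MM' label for a date."""
    return f"{d.year:04d}-{d.month:02d}"


def sales_per_month(records):
    """Count all sales per month (group by month, then count)."""
    return Counter(month_key(d) for d, _, _ in records)


def advertised_sales_per_month(records):
    """Count only the sales of advertised products per month."""
    return Counter(month_key(d) for d, _, advertised in records if advertised)


def split_by_campaign(records, start):
    """Return (before, after) monthly sales counts around the campaign start."""
    before = [r for r in records if r[0] < start]
    after = [r for r in records if r[0] >= start]
    return sales_per_month(before), sales_per_month(after)


if __name__ == "__main__":
    before, after = split_by_campaign(SALES, CAMPAIGN_START)
    print("Before campaign:", dict(before))
    print("After campaign:", dict(after))
    print("Advertised by month:", dict(advertised_sales_per_month(SALES)))
```

In a Pig script the same steps would typically be written as a filter on the relation followed by a group-by-month and a count. The Python version only serves to check the expected numbers on a small sample before running the job on the full data.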