This assignment covers data analysis with Apache Pig on the Hadoop Distributed File System (HDFS). The task is to load the 'CIS_FacultyList.csv' dataset into HDFS and then manipulate it with Pig. The solution creates new datasets from three criteria: degree level (Bachelors, Masters, Doctorate), years of teaching experience (less than 5 years, more than 5 years), and the location of the last degree (North America or elsewhere). It uses Pig's SPLIT operator to categorize the records and DUMP to display the resulting data (in Pig, a relation's schema is printed with DESCRIBE rather than DUMP). Finally, the solution includes instructions for copying the processed data from HDFS back to the local file system. The exercise provides practical experience with data processing and analysis in a big data environment.
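The workflow described above can be sketched as a short Pig Latin script. This is only an illustration under stated assumptions, not the assignment's actual solution: the column names and types, the literal values used in the conditions, and every HDFS path under `/user/hadoop` are hypothetical, since the text does not give the real layout of 'CIS_FacultyList.csv'.

```pig
-- Run from the shell first to place the file in HDFS (target path is an assumption):
--   hdfs dfs -put CIS_FacultyList.csv /user/hadoop/CIS_FacultyList.csv

-- Assumed schema: the real columns of CIS_FacultyList.csv are not given in the text.
faculty = LOAD '/user/hadoop/CIS_FacultyList.csv'
    USING PigStorage(',')
    AS (name:chararray, degree:chararray, years_teaching:int, degree_location:chararray);

-- Categorize by degree level (the literal values are assumptions).
SPLIT faculty INTO
    bachelors IF degree == 'Bachelors',
    masters   IF degree == 'Masters',
    doctorate IF degree == 'Doctorate';

-- Categorize by teaching experience, mirroring the two stated criteria.
-- A row with exactly 5 years matches neither condition and lands in neither relation.
SPLIT faculty INTO
    under_5_years IF years_teaching < 5,
    over_5_years  IF years_teaching > 5;

-- Categorize by where the last degree was earned.
SPLIT faculty INTO
    north_america IF degree_location == 'North America',
    elsewhere OTHERWISE;

-- DESCRIBE prints a relation's schema; DUMP prints its tuples.
DESCRIBE doctorate;
DUMP doctorate;

-- Write one result back to HDFS (output path is an assumption).
STORE doctorate INTO '/user/hadoop/output/doctorate' USING PigStorage(',');

-- Run from the shell afterwards to copy the result to the local file system:
--   hdfs dfs -get /user/hadoop/output/doctorate ./doctorate
```

SPLIT does not require every tuple to match a condition, so it is worth checking how boundary values and any header row in the CSV are handled before relying on the counts in each output relation.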