Project [100 points]
Due on 11.20 at 11:55pm

Instructor's Comments:
The project is designed to develop students' ability to apply programming skills to:
  o information extraction
  o sentiment analysis

Now, let us break the problem into 10 parts.

Task 1 [5 points]: Download moviereview.txt and read it into your program.
Task 2 [5 points]: Extract all words in moviereview.txt.
Task 3 [5 points]: Lowercase all words.
Task 4 [15 points]: Calculate word frequency (i.e., how many times each word appears in moviereview.txt).
Task 5 [15 points]: Based on the word frequency you calculated in Task 4, sort the words in descending order and display the top 5 words and their frequencies in the console, following the format below. The output below is not the answer.

  Top 5 words in moviereview.txt, organized in a descending order:
  and appears 100 times
  film appears 40 times
  that appears 34 times
  is appears 33 times
  are appears 30 times

Task 6 [5 points]: Download positive.txt and read it into your program.
Task 7 [5 points]: Extract all words in positive.txt.
Task 8 [5 points]: Lowercase all words.
Task 9 [15 points]: Calculate word frequency (i.e., how many times each word in positive.txt appears in moviereview.txt).
Task 10 [15 points]: Based on the word frequency you calculated in Task 9, sort the words in descending order and display the words with frequency greater than 5 in the console, following the format below. The output below is not the answer.

  Words in positive.txt that appear greater than 5 times in moviereview.txt, organized in a descending order:
  satisfy appears 10 times
  magical appears 9 times
  great appears 7 times
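
The ten tasks can be sketched in Python as below. This is one possible approach, not the instructor's reference solution. It rests on several assumptions that the handout does not state: both files are already downloaded into the working directory and are UTF-8 text, a "word" is a run of letters with an optional internal apostrophe, and ties in frequency are broken alphabetically. The helper names (extract_words, word_frequency, top_words, positive_word_counts, report) are made up for illustration.

```python
import re
from collections import Counter


def extract_words(text):
    """Tasks 2-3 and 7-8: return every word in text, lowercased.

    Assumption: a word is a run of letters, optionally followed by an
    apostrophe and more letters (so "don't" stays one word).
    """
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def word_frequency(words):
    """Task 4: map each word to the number of times it appears."""
    return Counter(words)


def top_words(freq, n=5):
    """Task 5: the n most frequent (word, count) pairs, descending.

    Ties are broken alphabetically (an assumption, for stable output).
    """
    ranked = sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:n]


def positive_word_counts(positive_words, review_freq, threshold=5):
    """Tasks 9-10: count each positive word in the review text and keep
    those appearing more than `threshold` times, sorted descending."""
    counts = {
        word: review_freq[word]  # Counter returns 0 for unseen words
        for word in set(positive_words)
        if review_freq[word] > threshold
    }
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def read_text(path):
    """Tasks 1 and 6: read a whole file (assumed UTF-8) into a string."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()


def report(review_path="moviereview.txt", positive_path="positive.txt"):
    """Run all ten tasks and print results in the handout's format."""
    review_freq = word_frequency(extract_words(read_text(review_path)))
    positive_words = extract_words(read_text(positive_path))

    print("Top 5 words in moviereview.txt, "
          "organized in a descending order:")
    for word, count in top_words(review_freq, 5):
        print(f"{word} appears {count} times")

    print("Words in positive.txt that appear greater than 5 times in "
          "moviereview.txt, organized in a descending order:")
    for word, count in positive_word_counts(positive_words, review_freq, 5):
        print(f"{word} appears {count} times")
```

Calling report() with both files in the current directory prints the two listings. Counter is used because it returns 0 for words that never occur in the review, which keeps the Task 9 lookup free of special cases.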