Added on - 09 Oct 2019

Aims

In this phase you will:

  • carry out basic data manipulation using spreadsheet formulas
  • implement a simple recommender system in a spreadsheet

Introduction

In 2014, Hollywood released more than 600 movies: that is, one movie every day, plus four more during the weekend for you to watch. In the last decade, the way consumers access such a huge collection of movies has changed. With the proliferation of high-speed Internet, online movie rental companies (either via DVD or streaming) are killing the physical video rental shops.

Launched in 1999, Netflix, then the largest US online movie rental company, announced its billionth DVD delivery in February 2007. It claimed to spend about $300 million a year on postage alone. Around 2008, Australia's BigPond Movies and QuickFlix had thousands of DVDs on offer only a mouse click away. Subsequently, in 2009, Netflix made its 2 billionth delivery, on Blu-ray. Recently, in March 2015, Netflix officially launched its streaming service in Australia, competing with existing providers like Stan and Presto.

Parallel to this trend, consumers (professional critics or moviegoers) have been sharing and exchanging information about movies online. IMDb, the largest online movie database, stores information about movies, actors, directors, and any other information you can think of. There are many other similar sites that provide a comprehensive collection of reviews and critiques: Metacritic, Rotten Tomatoes, Yahoo! Movies, and Movie Review Query Engine. All of these sites allow people to collaboratively discuss and rate their favourite movies online.

One important function of these sites is to help people select movies they like by looking at lists of movie reviews from around the world.
This is a case of information filtering. Online recommendation systems are becoming important information filtering tools as we are overwhelmed by digital content. Pandora's music recommendation system and Amazon's book recommendations are such examples. These systems are very useful, not only for the audience to find their way through millions of options, but also for businesses to up-sell their products (Do you want to upsize your Big Mac meal?). It is so important that Netflix offered US$1,000,000 to anyone who could improve its movie recommendation engine.

Recommender system

A 'recommender system' presents a list of items (books, movies, music) that are likely to be of interest to a user, based on what it knows about that user and the items. It makes use of intrinsic properties of the large collection of items (the content-based approach), the user's social environment (the collaborative filtering approach), or a combination of both. There are many ways to predict what a person would like, but there is no one correct way, as the billions of dollars spent on marketing will attest.
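
To make the content-based approach more concrete, here is a minimal Python sketch that scores two items by how many attributes they share. It uses Jaccard similarity, which is only one of many possible measures and is not prescribed by this assignment. The movie titles, the attribute tags and the function names `jaccard` and `most_similar` are all hypothetical, chosen for illustration only.

```python
def jaccard(a: set, b: set) -> float:
    """Share of attributes two items have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical items: each one is described by a set of attribute tags.
movies = {
    "Movie A": {"genre:Action", "director:D1", "actor:X", "actor:Y"},
    "Movie B": {"genre:Action", "director:D1", "actor:X", "actor:Z"},
    "Movie C": {"genre:Comedy", "director:D2", "actor:Z"},
}


def most_similar(title: str, catalogue: dict) -> list:
    """Rank every other item by similarity to `title`, best match first."""
    target = catalogue[title]
    scores = [
        (other, jaccard(target, tags))
        for other, tags in catalogue.items()
        if other != title
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


ranking = most_similar("Movie A", movies)
```

Here "Movie B" ranks above "Movie C" for "Movie A" because it shares the genre, the director and one actor, while "Movie C" shares nothing. A collaborative filtering approach would instead compare users' profiles or ratings rather than item attributes.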
In a 'movie recommender' system, for example, a content-based approach may employ information such as actors, directors and/or movie genre. The combination of what the audience thinks about the movies and the audience profiles can be utilized in the collaborative filtering approach (two users with the same profile are more likely to enjoy the same movies). As people's borrowing/consuming habits get recorded, the amount of data that can be used in the system only increases.

In this assignment, you will build a simple version of such a system, which uses information about movies to find similar movies and produce recommendations.

Tasks

Data Sets

The given data set (attached) contains information about 291 popular feature films produced from 1969 to 2008. The data set captures data such as the movie name, censorship rating, genre, director, actors, scores from various critics, and worldwide gross.

Part 1 - Basic Task

Using spreadsheet formulas, complete the following tasks and answer all the relevant questions.

1. Compare the performance of movie genres based on the worldwide gross of the movies in each genre. Ignore genres that have fewer than 5 movies. Visualise the comparison using an appropriate chart type. Which three genres are the worst performers? Compare the performance of movie ratings (PG, G, etc.) based on the same measure. Again, ignore ratings that have fewer than 5 movies. Visualise the comparison again. Do PG-rated movies generally earn more than R-rated movies?

2. Which three of the given reviewers in the movie data (Washington Post, Chicago Sun-Times, The New York Times, LA Weekly, Los Angeles Times, Rolling Stone, Wall Street Journal, Entertainment Weekly, Empire, Variety, Salon.com, The Onion (A.V. Club), TV Guide, Slate) are the most consistent with the 'metascore'? You can do this by calculating the average gap between the metascore value and the score from a particular reviewer. Visualise the average gaps of all reviewers to see how close they are to the metascore.
Consider 0 as an empty score. State your assumption when dealing with missing data.

3. Present a table of actors versus genres to show the number of movies in each genre that a particular actor is featured in. Show only actors who have appeared in at least 6 movies. Colour the cells that contain these counts so that higher counts can be distinguished from lower counts. Include as the last column the total number of movies the actor is featured in. Correspondingly, include as the last row the total number of movies within each genre. Present the actor names in descending order (based on the movie count).

...genre......
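
The average-gap calculation described in task 2 can be sketched outside the spreadsheet as follows. This is a Python illustration under stated assumptions, not the spreadsheet formula the task asks for: the gap is taken as the absolute difference, movies with a score of 0 are skipped rather than counted as a gap, and the reviewer names and scores are hypothetical rather than taken from the data set.

```python
def average_gap(metascores, reviewer_scores):
    """Mean absolute gap, skipping movies where either score is 0 (missing)."""
    gaps = [
        abs(m - r)
        for m, r in zip(metascores, reviewer_scores)
        if m != 0 and r != 0
    ]
    # None signals that the reviewer has no usable scores at all.
    return sum(gaps) / len(gaps) if gaps else None


# Hypothetical scores for four movies; 0 marks a missing review.
metascore = [80, 60, 70, 50]
reviews = {
    "Reviewer A": [85, 0, 70, 40],
    "Reviewer B": [60, 90, 0, 0],
}

gaps = {name: average_gap(metascore, scores) for name, scores in reviews.items()}

# Smallest average gap first; reviewers without usable scores are left out.
ranked = sorted((n for n in gaps if gaps[n] is not None), key=gaps.get)
```

With these toy numbers, "Reviewer A" averages a gap of 5.0 over three scored movies and "Reviewer B" averages 25.0 over two, so "Reviewer A" is the more consistent with the metascore. Treating a 0 as a real score instead would inflate the gap of any reviewer who skipped many movies, which is why the missing-data assumption needs to be stated.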