PRT564 DATA ANALYTICS AND VISUALISATION
Project
__________________________________________________________________________________
Due on Fri Week 12

INTRODUCTION

You are provided with a list of potential topics from which to choose one to research and write your final report on. Each topic comes with only one reference to start from, whereas the final report requires at least 15 references, so you will need to do further research of your own along the lines of the topic. You may also choose your own topic of interest, even if it is not in the suggested list, subject to the lecturer’s approval.

Your report must include a relevant case study that matches the report theme. The case study should contain text descriptions, plots and code snippets covering:
i. framing a problem with hypotheses relevant to the report topic;
ii. choosing, collecting, processing and feature engineering a dataset (at least several megabytes in size) for the problem; and
iii. visualising and analysing the processed dataset through model/algorithm building and evaluation.

Recall that the data analytics lifecycle comprises six phases: Discovery, Data Preparation, Model Planning, Model Building, Communicate Results, and Operationalize. The case study focuses only on the first five phases.

PROJECT SPECIFICATION

For the case study, you can use any public dataset available online. Many datasets are available from https://archive.ics.uci.edu/ml/index.php. Alternatively, you can search online for datasets and source code matching the theme of your report, but make sure you provide references to them (e.g. a paper or a Kaggle problem). Your data + code package must also be submitted through Learnline, accompanied by a brief user manual (running and testing instructions) as a subsection of the Case Study section of your report, so that the lecturer can run and test your uploaded package.

You are to complete this assignment in a group of 2 or 3. Groups must be formed by the end of week 2.
You should then proceed to choose a topic of interest.

DELIVERABLE TEMPLATE & SUBMISSION

The final report (approximately 3500 words) should contain:
• Abstract - A brief summary of the contents of the report.
• Introduction - An explanation of the purpose of the study, a statement of the state-of-the-art research question(s), and a brief introduction to the relevant case study to be presented next.
• Literature review - A critical assessment of the work done so far on this topic. The assessment should systematically outline, compare and discuss the methods and results of previous studies, then point out their shortcomings and suggest further improvements.
• Case Study - State how the case study is related to the report topic/theme/domain.
  i. Specific problem scenario and framing with initial hypotheses (e.g. churn prediction to prevent the loss of customers). Make sure the problem is related to the report topic.
  ii. Data selection, collection and description; initial data processing (into clean training and testing datasets or dataframes); data analysis and feature engineering.
  iii. Model and parameter selection by testing various model sketches and configurations to gain insight into trade-offs, scenarios and sensitivity analyses.
  iv. Model evaluation and presentation of significant results (to high-level sponsors or expert-level analysts) through visualisation and discussion. Discuss model limitations, lessons learnt, suggestions, analytics value and the quantifiable value added to the client.
  v. A brief user manual (running and testing instructions) for the submitted code and data.
• Conclusion - State the conclusions, findings and implications of the literature review and the presented case study, and point to directions for further work in the area.
• References - At least 20 relevant articles need to be covered. For referenced conference papers, the quality/ranking can be checked through http://portal.core.edu.au/confranks/. For journal articles, check their impact factors; IEEE/ACM Transactions are recommended. Other online articles and resources should also be of high quality and authority. Use Google, Google Scholar and online digital libraries for your research.

Note that for the case study, you are not required to adhere strictly to the textbook formats and examples (e.g. the final presentation components); treat them as reference points instead. The content, logical workflow, insightful analysis and presentation quality of the case study will be valued more. Feel free to explore and document your findings and thoughts as a data scientist in this assignment!

Each group is required to submit a report as outlined above together with a zipped data + code package. At the start or the end of each paragraph/section of the report, please note down the name(s) of the respective contributor(s) for the purpose of peer assessment.
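
As an illustration of steps ii to iv of the Case Study template, the following is a minimal sketch only, not a required approach. It uses synthetic toy data and a hand-written k-nearest-neighbour classifier so that it runs with the Python standard library alone. The function names, split fractions and candidate k values are all illustrative assumptions; a real case study would use a public dataset and most likely an established library.

```python
import random
from collections import Counter


def make_toy_data(n=200, seed=0):
    """Synthetic two-class points; a stand-in for a real public dataset."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        # Class 1 is shifted by +2 on both features so the classes are separable.
        features = (rng.gauss(2.0 * label, 1.0), rng.gauss(2.0 * label, 1.0))
        rows.append((features, label))
    return rows


def split(rows, held_out_fraction, seed):
    """Shuffle a copy of the rows and return (kept, held_out)."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_held_out = int(len(shuffled) * held_out_fraction)
    return shuffled[n_held_out:], shuffled[:n_held_out]


def knn_predict(train_rows, point, k):
    """Majority vote among the k training rows closest to the point."""
    def squared_distance(row):
        (x0, x1), _ = row
        return (x0 - point[0]) ** 2 + (x1 - point[1]) ** 2

    nearest = sorted(train_rows, key=squared_distance)[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train_rows, eval_rows, k):
    """Fraction of eval_rows whose label is predicted correctly."""
    correct = sum(knn_predict(train_rows, x, k) == y for x, y in eval_rows)
    return correct / len(eval_rows)


# Step ii: data processing into training, validation and testing sets.
rows = make_toy_data()
train_val, test = split(rows, held_out_fraction=0.25, seed=1)
train, val = split(train_val, held_out_fraction=0.25, seed=2)

# Step iii: parameter selection using the validation set only.
val_scores = {k: accuracy(train, val, k) for k in (1, 3, 5, 7)}
best_k = max(val_scores, key=val_scores.get)

# Step iv: a single final evaluation on the untouched test set.
test_accuracy = accuracy(train_val, test, best_k)
print("validation accuracy per k:", val_scores)
print("chosen k:", best_k, "test accuracy:", round(test_accuracy, 3))
```

The test set is consulted only once, after the configuration has been chosen, so the reported test accuracy is not biased by the parameter search. Fixing the random seeds keeps the run reproducible, which also makes the running and testing instructions in step v easier to write.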