Introduction to Data Mining: Project Overview.

Added on - 19 Sep 2019

1) Read a delimited file (pipe or comma delimited) into a data-frame.
   Consider using Hospital Compare data as a data source:
   https://data.medicare.gov/data/hospital-compare
   (click on “download csv flat files”)

   BONUS CREDIT: For bonus credit, create a table or tables in Postgres, populate the table(s) with insert statements, and read the data into a data-frame using R. The DDL and insert statements should be submitted with the assignment. The more elaborate the database, the more bonus credit you are likely to receive (e.g. creating two tables and joining them together is worth more than a single table).

2) Apply some cursory validations (checking for nulls and blanks) and rename your columns if necessary.

3) Split your data into a training and a testing dataset (80% training and 20% testing).
   Hint: Use the “subset” function in R.

4) Using a library, implement an algorithm that we’ve discussed in class using the 80% training data. Model options include:
   • Regression (Linear, Logistic)
   • Naive Bayes (Bernoulli, Multinomial, MLE)
   • Clustering (Hierarchical, k-Means)
   • k-Nearest Neighbors (as a classifier or predictor)
   • TF-IDF
   • Other (approval needed)

5) Apply the model to the 20% testing data and provide some measure of model performance. Note that for clustering, a testing/training split is not necessary.
   • Z-test
   • Confusion Matrix
   • ROC Curve
   • Inter-cluster SS (sum of squares)
   • Precision/Recall, Specificity & Sensitivity

6) Visualize the model in some way with a simple plot.
   • Scatterplots
   • Correlation Matrix
   • Histograms

7) Provide a one-paragraph write-up on what business problem is being solved with your project and why the model was selected.

BONUS CREDIT: Use R-Shiny to present the data in a browser. The more elaborate the UI (from a functionality and style perspective), the more bonus credit you are likely to receive.

Submission Instructions:
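
As an illustration of the read, validate, and split steps described in the overview, here is a minimal base-R sketch. It is not the required solution. The inline sample data, the column names, and the seed are all assumptions made up for this example; a real Hospital Compare file will have different columns and would be read from disk rather than from a string.

```r
# Hypothetical pipe-delimited sample standing in for a downloaded flat file.
# All names and values below are invented for illustration.
raw_lines <- c(
  "Provider ID|Hospital Name|State|Rating",
  "10001|Alpha General|AL|3",
  "10002|Beta Medical|AL|4",
  "10003||AK|2",                  # blank hospital name
  "10004|Delta Clinic|AK|NA",     # missing rating
  "10005|Echo Regional|AZ|5",
  "10006|Foxtrot Health|AZ|1",
  "10007|Golf Memorial|AL|3",
  "10008|Hotel Community|AK|4",
  "10009|India Care|AZ|2",
  "10010|Juliet Hospital|AL|5",
  "10011|Kilo Medical|AK|3",
  "10012|Lima General|AZ|4"
)

# Read the delimited text into a data frame.
hospitals <- read.table(text = paste(raw_lines, collapse = "\n"),
                        sep = "|", header = TRUE, quote = "",
                        stringsAsFactors = FALSE, check.names = FALSE,
                        na.strings = "NA")

# Rename columns to names that are easier to use in formulas.
names(hospitals) <- c("provider_id", "hospital_name", "state", "rating")

# Cursory validation: count nulls (NA) and blanks (empty strings) per column.
na_counts <- colSums(is.na(hospitals))
blank_counts <- sapply(hospitals, function(col) {
  sum(!is.na(col) & trimws(as.character(col)) == "")
})

# Keep only rows that pass both checks for the columns used here.
clean <- subset(hospitals, !is.na(rating) & trimws(hospital_name) != "")

# 80/20 split: flag a random 80% of rows, then use subset() on the flag.
set.seed(42)  # arbitrary seed so the split is reproducible
n <- nrow(clean)
train_rows <- sample(seq_len(n), size = floor(0.8 * n))
clean$is_train <- seq_len(n) %in% train_rows

train <- subset(clean, is_train)
test  <- subset(clean, !is_train)
```

Printing na_counts and blank_counts before filtering shows which columns need attention. Whether to drop or impute the flagged rows is a modelling decision this sketch does not address; it simply drops them.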