SIT717: Clustering Analysis of BBC Health Twitter Data Report

Added on 2022/12/15

AI Summary

This report presents a clustering analysis of BBC Health Twitter data, applying data mining techniques to classify and segment health-related tweets. The analysis uses the bbchealth.txt dataset from the UCI Machine Learning Repository, which contains tweet IDs, timestamps, and health-related tweet text. Due to limitations, the analysis primarily uses the tweet IDs and timestamps. The core method is the K-means clustering algorithm, implemented in Weka, which segments the data into clusters based on time-based criteria. The report details the data preprocessing steps, explains the rationale for choosing K-means, and compares it with the fuzzy c-means algorithm. The results section presents the clustered model in terms of minimum sum of squared error and model building time. The report also covers the dataset summary, data mining techniques, results, evaluation, and conclusions.
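The idea of segmenting tweets by time and comparing models by sum of squared error can be sketched in a few lines of plain Python. This is only an illustration, not the Weka workflow the report uses. It assumes each line of bbchealth.txt has the form `id|timestamp|text`, with timestamps like `Thu Apr 09 01:30:00 +0000 2015`, and it assumes hour of day as the time feature, since the summary does not say which time-based criteria were used. The names `parse_hour` and `kmeans_1d` are hypothetical, and the sample lines are made up.

```python
import random
from datetime import datetime


def parse_hour(line):
    """Return the hour of day as a float from an 'id|timestamp|text' line.

    The pipe-separated layout and the timestamp format are assumptions.
    """
    # Split at most twice so pipes inside the tweet text are left alone.
    _tweet_id, stamp, _text = line.split("|", 2)
    ts = datetime.strptime(stamp, "%a %b %d %H:%M:%S %z %Y")
    return ts.hour + ts.minute / 60.0


def kmeans_1d(values, k, iterations=100, seed=0):
    """Lloyd's K-means on one-dimensional data.

    Returns (centroids, sse), where sse is the sum of squared distances
    from each value to its nearest centroid.
    """
    rng = random.Random(seed)
    centroids = rng.sample(values, k)  # pick k starting points from the data
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: (v - centroids[i]) ** 2)
            clusters[nearest].append(v)
        # Move each centroid to its cluster mean; keep it if the cluster is empty.
        updated = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if updated == centroids:  # assignments have stabilised
            break
        centroids = updated
    sse = sum(min((v - c) ** 2 for c in centroids) for v in values)
    return centroids, sse


# Made-up sample lines in the assumed format (not real tweets).
sample_lines = [
    "1|Thu Apr 09 01:30:00 +0000 2015|example text a",
    "2|Thu Apr 09 02:00:00 +0000 2015|example text b",
    "3|Wed Apr 08 13:00:00 +0000 2015|example text c",
    "4|Wed Apr 08 14:30:00 +0000 2015|example text d",
    "5|Tue Apr 07 22:00:00 +0000 2015|example text e",
    "6|Tue Apr 07 23:00:00 +0000 2015|example text f",
]
hours = [parse_hour(line) for line in sample_lines]

for k in (1, 2, 3):
    centroids, sse = kmeans_1d(hours, k)
    print(k, [round(c, 2) for c in centroids], round(sse, 2))
```

The sketch treats hour of day as a linear quantity, so 23:00 and 01:00 count as far apart even though they are close on the clock. This is a simplification. K-means results also depend on the starting centroids, which is why the seed is fixed here.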
