This project applies machine learning classification to the prediction of software defects. The student used a dataset of static code metrics and classified it with a Naive Bayes model. The work covered data exploration, visualization with box plots and scatter plots, and dimensionality reduction with Principal Component Analysis (PCA). The student implemented the Naive Bayes classifier and evaluated it using accuracy, classification reports, and confusion matrices. Different feature subsets were tested to find the set that gave the best model performance. The project concludes that Naive Bayes was not a good fit for this particular dataset, and the student suggested exploring other algorithms in future work. The project follows the guidelines provided by the University of Hertfordshire for the Foundations of Data Science module (7COM1073).
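
The classification and evaluation steps summarized above can be illustrated with a minimal sketch of a Gaussian Naive Bayes classifier, scored with accuracy and a confusion matrix, in plain Python. This is not the student's code. The summary does not state which library or Naive Bayes variant was used, so the Gaussian variant, the function names (`fit_gaussian_nb`, `predict`, `confusion_matrix`, `accuracy`), the smoothing constant `eps`, and the toy data (two made-up static code metrics per module) are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y, eps=1e-9):
    """Estimate a log-prior and per-feature mean/variance for each class.

    `eps` is a hypothetical smoothing constant added to every variance
    so that a constant feature does not cause a division by zero.
    """
    rows_by_class = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_class[label].append(row)

    model = {}
    for label, rows in rows_by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        variances = [
            sum((v - m) ** 2 for v in col) / n + eps
            for col, m in zip(columns, means)
        ]
        model[label] = (math.log(n / len(X)), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log-posterior for one sample."""
    def log_posterior(params):
        log_prior, means, variances = params
        # Naive assumption: features are independent given the class,
        # so the per-feature Gaussian log-likelihoods simply add up.
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
            for x, m, var in zip(row, means, variances)
        )
    return max(model, key=lambda label: log_posterior(model[label]))


def confusion_matrix(y_true, y_pred, labels):
    """Count outcomes; rows are true labels, columns are predictions."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true one."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up toy data, NOT taken from the project's dataset.
# Each row is [lines_of_code, cyclomatic_complexity] (hypothetical metrics);
# label 1 means "defective", label 0 means "not defective".
X_train = [[10, 1], [12, 2], [11, 1], [40, 8], [45, 9], [42, 7]]
y_train = [0, 0, 0, 1, 1, 1]
X_test = [[9, 1], [44, 8]]
y_test = [0, 1]

model = fit_gaussian_nb(X_train, y_train)
y_pred = [predict(model, row) for row in X_test]

print("accuracy:", accuracy(y_test, y_pred))
print("confusion matrix:", confusion_matrix(y_test, y_pred, labels=[0, 1]))
```

The independence assumption made explicit in `log_posterior` is one plausible reason a Naive Bayes model can underperform on static code metrics, since such metrics are often strongly correlated with one another. The sketch itself does not test that hypothesis.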