
Data Mining and Text Mining for Crime Incidents in Chicago

   

Added on  2023-04-21

University
Semester
DATA MINING AND WEB ANALYTICS
Student ID
Student Name
Submission Date

Table of Contents
1. Problem Identification
2. Analysis
2.1 Data Mining
2.1.1 K-Means
2.1.2 C & R Tree
2.1.3 Neural Network
2.2 Text Mining
3. Recommendation
4. Conclusion
References
Appendix

1. Problem Identification
Crime incidents in Chicago pose a risk to public safety. The problem addressed here is to
identify the zones and timelines of delinquency in Chicago to support better risk
management. The crimes considered include physical attacks or fights with a weapon, sexual
battery, robbery or shoplifting, and vandalism. It was also assumed that requiring a
benchmark of law enforcement contact could reduce the subjective judgment associated with
the incidents. Crimes in community and domestic areas include those that took place in
community and domestic area buildings, on grounds, on buses, and at community, school and
domestic area-sponsored events or activities. We therefore developed a model that addresses
the identified problem using data mining and text mining algorithms. This assignment aims to
solve that specific problem: determining the zones and times of crime in order to improve
crime risk management.
2. Analysis
2.1 Data Mining
Data mining techniques operate on structured data. Our analysis uses the crime incidents
data, which contains the following fields: ID, Date, Primary Type, Block, Case Number, Year,
Description, IUCR, Location Description, Domestic, Beat, Arrest, District, FBI Code,
Community Area, Ward, X Coordinate, Y Coordinate, Latitude, Longitude, Location and
Updated On.
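As a rough illustration of how such records could be prepared for analysis, the sketch below reads a few of the listed fields and derives an hour-of-day value for the timeline analysis. The two sample rows are made up for illustration, and the date format string is an assumption about how the Date field is stored; neither is taken from the assignment's data.

```python
import csv
import io
from datetime import datetime

# Made-up sample rows (NOT real incidents); only a subset of the fields is shown.
SAMPLE = """ID,Date,Primary Type,Arrest,Domestic,District,Year
1,03/15/2017 11:30:00 PM,THEFT,false,false,12,2017
2,07/04/2018 02:05:00 AM,BATTERY,true,true,7,2018
"""

# Assumption: dates look like "MM/DD/YYYY HH:MM:SS AM/PM".
DATE_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def load_incidents(handle):
    """Read incident rows and convert the fields used in the analysis."""
    incidents = []
    for record in csv.DictReader(handle):
        occurred = datetime.strptime(record["Date"], DATE_FORMAT)
        incidents.append({
            "primary_type": record["Primary Type"],
            "arrest": record["Arrest"].strip().lower() == "true",
            "domestic": record["Domestic"].strip().lower() == "true",
            "district": int(record["District"]),
            "year": int(record["Year"]),
            "hour": occurred.hour,  # derived feature for the timeline analysis
        })
    return incidents


incidents = load_incidents(io.StringIO(SAMPLE))
```

Converting Arrest and Domestic to booleans and District and Year to integers keeps the later modelling steps free of string handling.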
Data mining offers various classification and clustering techniques. In our analysis we
chose K-Means as the clustering method and the C&R Tree as the classification method. A
Neural Network method is also used for better prediction (Blunch and Blunch, 2013).
2.1.1 K-Means
K-Means analysis is used to cluster the crime incidents by the zones and timelines of
delinquency in Chicago in order to address the identified problem. It recognizes patterns
in the crime data without requiring an exact match to any stored pattern. A related but
distinct technique, k-nearest neighbours (KNN) analysis, is used for computing the values of
a continuous target. In that case, the average or median target value of the nearest
neighbours is used as the predicted value for a new case.
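The nearest-neighbour prediction for a continuous target can be sketched as follows. This is a minimal illustration on made-up numbers, not the assignment's model; the function name `knn_predict` and the toy data are assumptions introduced here.

```python
import math
from statistics import mean


def knn_predict(train_x, train_y, query, k=3):
    """Predict a continuous target as the mean target of the k nearest neighbours."""
    # Rank training cases by Euclidean distance to the new case.
    ranked = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    nearest = ranked[:k]
    return mean(train_y[i] for i in nearest)


# Toy data (hypothetical): one input feature, one continuous target.
train_x = [(1.0,), (2.0,), (3.0,), (10.0,)]
train_y = [10.0, 20.0, 30.0, 100.0]

prediction = knn_predict(train_x, train_y, query=(2.1,), k=3)
```

Replacing `mean` with `statistics.median` gives the median-based variant mentioned in the text.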
In our analysis, we use the following fields: arrest, domestic, district and year. These
fields are used to predict crime arrests in domestic and district areas year by year. We
have also predicted values for crime incidents based on the zones and timelines of
delinquency in Chicago, which supports better risk management for Chicago crime incidents.
The K-Means output screenshots are shown in the Appendix.
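For readers unfamiliar with the technique, the K-Means procedure alternates between assigning each record to its nearest centroid and recomputing each centroid as the mean of its members. The sketch below shows that loop on made-up coordinate pairs. It assumes the first k points as initial centroids, which is a simplification chosen for reproducibility, and it is not the tool or configuration used to produce the screenshots in the Appendix.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain K-Means (Lloyd's algorithm) on a list of numeric tuples."""
    # Assumption: the first k points are the initial centroids (naive but deterministic).
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[c])  # keep the centroid of an empty cluster
        if updated == centroids:
            break  # converged: centroids stopped moving
        centroids = updated
    return centroids, labels


# Made-up (latitude, longitude) pairs forming two loose groups.
points = [
    (41.88, -87.63), (41.75, -87.60), (41.89, -87.62),
    (41.76, -87.61), (41.87, -87.64), (41.74, -87.59),
]
centroids, labels = kmeans(points, k=2)
```

In practice the initial centroids are usually chosen at random or with a seeding scheme, and the inputs are scaled so that no single field dominates the distance.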
2.1.2 C & R Tree
C&R tree analysis is applied to the crime incidents based on the zones and timelines of
delinquency in Chicago in order to address the identified problem. C&R stands for
Classification and Regression. The C&R Tree node is a tree-based classification and
prediction method (Garson, 2012). It uses recursive partitioning to split the training
records into segments with similar values in the output field. The C&R Tree node begins by
examining the input fields to find the best split, measured by the reduction in an impurity
index that results from the split.
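The split criterion can be illustrated with a small calculation. The sketch below assumes the Gini index as the impurity measure, which is a common choice for categorical targets; the text does not state which index was used. The labels are made-up arrest outcomes.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_reduction(parent, left, right):
    """Parent impurity minus the size-weighted impurity of the two child segments."""
    n = len(parent)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted


# Made-up arrest outcomes for four records.
parent = [True, True, False, False]

# A split that separates the classes completely removes all impurity.
perfect = impurity_reduction(parent, [True, True], [False, False])

# A split that leaves both segments mixed removes none.
useless = impurity_reduction(parent, [True, False], [True, False])
```

A tree builder would evaluate this reduction for every candidate split on every input field and keep the split with the largest value, then repeat the process within each resulting segment.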
In our analysis, we use the following fields: arrest, domestic, district and year. These
fields are used to predict crime arrests in domestic and district areas year by year. We
have also predicted values based on the zones and timelines of delinquency in Chicago by
year, domestic, community area and primary type. The Classification and Regression (C&R)
Tree output screenshots are shown in the Appendix.
2.1.3 Neural Network
A neural network method is used to analyse the crime incidents based on the zones and
timelines of delinquency in Chicago through operations such as classification, feature
mining, clustering, pattern recognition and prediction. It can model complex relationships
between inputs and outputs or find patterns in data, and it is used for storing, recognizing
and retrieving patterns or database entries (Stahlbock, Abou-Nasr and Weiss, 2018).
Moreover, it is used for solving combinatorial optimization problems, for filtering noise
from measurement data and for controlling ill-defined problems. Additionally, the neural
network method is used for estimating sampled functions when the form of the functions is
not known. Pattern recognition and function estimation are the two abilities that make
artificial neural networks (ANN) a common utility in data mining (Perner, 2015).
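To make the idea of learning an input-output relationship concrete, the sketch below trains a single sigmoid neuron, the simplest building block of a neural network, by gradient descent on toy binary data. The network architecture used in the assignment is not described in the text, so the function names, learning rate, epoch count and data here are all illustrative assumptions.

```python
import math


def sigmoid(z):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def activation(weights, bias, x):
    """Output of the neuron for one input vector."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_neuron(samples, targets, epochs=5000, lr=0.5):
    """Fit one sigmoid neuron with full-batch gradient descent on the log loss."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(samples, targets):
            error = activation(weights, bias, x) - y  # gradient of log loss w.r.t. z
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights, bias, x):
    """Classify as positive when the neuron's output is at least 0.5."""
    return activation(weights, bias, x) >= 0.5


# Toy data (hypothetical): the target is 1 only when both binary inputs are 1.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 0, 0, 1]

weights, bias = train_neuron(samples, targets)
predictions = [predict(weights, bias, x) for x in samples]
```

A single neuron can only learn linearly separable patterns; networks with hidden layers stack many such units to capture the more complex relationships described above.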
In our analysis, we use the following fields: arrest, domestic, community area and primary
type. These fields are used to predict crime arrests in domestic
