
Modeling & Computing Techniques: Machine Learning and Artificial Intelligence

   

Added on  2022-08-24

Running head: MODELING & COMPUTING TECHNIQUES
Modeling & Computing Techniques
Student Name:
Student ID:
University Name:
Paper code:
Executive Summary
Machine learning and artificial intelligence are considered among the leading and most powerful
technologies in use today. Notably, humans have not yet seen the full potential of these
technologies, because such systems can learn automatically from historical data and past
experience. Machine learning is generally used to transform information into knowledge: models
extract useful information and hidden patterns from the data and make decisions based on it
with minimal human involvement.
The dataset used in the analysis contains information on every building (houses,
apartments, etc.) sold in the New York City property market over a 12-month period. It covers
five different boroughs and contains a total of 84,548 records.
The model used for predicting the sale price is an artificial neural network built with
deep learning. A Keras regressor is trained on the training dataset and then evaluated on the
test dataset to see how well it predicts the values of the target variable. Deep learning or
neural network models are often reported to give better predictions than other models.
The analysis comprises thorough data exploration, visualization and, finally, prediction,
in order to gain in-depth knowledge of the dataset. A machine learning model with Keras layers
and TensorFlow as the backend has been developed using neural network techniques. The report
concludes with an assessment of how well the model predicts the sale price and of the hidden
patterns and information drawn from the data.
Table of Contents
Executive Summary
Introduction
Discussion
Introduction and observation of the dataset
The proposed model for price prediction
Conclusion
References
Appendix
Introduction
Machine learning and artificial intelligence are considered among the leading and most
powerful technologies in use today (Alpaydin, 2020). Notably, humans have not yet seen the full
potential of these technologies, because such systems can learn automatically from historical
data and past experience. Machine learning is generally used to transform information into
knowledge (Bishop, 2006). Machine learning models find hidden patterns in the data and make
decisions based on it with minimal human involvement (Moolayil, Moolayil & John, 2019). There
are two main categories of machine learning algorithms: supervised and unsupervised learning
(Brownlee, 2016).
In supervised learning the dataset contains labelled data, so both the inputs and the
corresponding outputs are known, whereas in unsupervised learning the inputs are known but the
data is unlabelled and the outputs are unknown (Campesato, 2020). In this analysis the target
variable is the sale price attribute, and the goal is to predict the sale price using an
artificial neural network built with deep learning methods (Chernick, 1998).
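The supervised setting described above can be illustrated with a minimal sketch. It is not the model used in this report: the rows, the 80/20 split and the mean-price baseline are assumptions chosen only to show how labelled data is divided into a training and a test set and how predictions of a numeric target can be scored with the root mean squared error (RMSE).

```python
import math
import random

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle the rows and hold out a fraction of them for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))

# Hypothetical labelled rows: (gross square feet, sale price).
rows = [
    (800, 400_000), (1_200, 610_000), (950, 480_000), (2_000, 990_000),
    (1_500, 760_000), (700, 350_000), (1_100, 540_000), (1_800, 905_000),
    (1_300, 655_000), (1_000, 505_000),
]
train, test = train_test_split(rows, test_fraction=0.2)

# Baseline "model": always predict the mean sale price of the training set.
mean_price = sum(price for _, price in train) / len(train)
baseline_rmse = rmse([price for _, price in test], [mean_price] * len(test))
print(f"baseline RMSE on the held-out rows: {baseline_rmse:,.0f}")
```

A trained regressor is only useful if its error on the held-out rows is clearly lower than that of such a naive baseline.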
Deep learning is a subfield of machine learning that is built on networks of layers and
is capable of learning from data that is unstructured and unlabelled (Daniel, 2013). Different
kinds of layers can be used to build a neural network model. For this analysis only dense
layers have been used to build the artificial neural network model (Dietterich, 1997).
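To make the notion of a dense layer concrete, the following sketch computes the forward pass of a small fully connected network in plain Python. It is an illustration only and not the Keras model of this report: the layer sizes, the random weights, the ReLU activation and the feature values are all assumptions. Each dense layer computes activation(W·x + b), and the final layer is left linear because the target is a numeric price.

```python
import random

def relu(x):
    """Rectified linear unit: negative values are clipped to zero."""
    return max(0.0, x)

def dense(inputs, weights, biases, activation=None):
    """Compute activation(W.x + b) for one fully connected layer.

    `weights` holds one list of input weights per output unit.
    """
    outputs = []
    for unit_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(unit_weights, inputs)) + bias
        outputs.append(activation(z) if activation else z)
    return outputs

def init_layer(n_in, n_out, rng):
    """Create small random weights and zero biases (untrained, for illustration)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

rng = random.Random(42)
hidden_w, hidden_b = init_layer(3, 4, rng)  # 3 input features -> 4 hidden units
out_w, out_b = init_layer(4, 1, rng)        # 4 hidden units -> 1 predicted value

features = [0.2, -1.0, 0.5]                 # hypothetical scaled feature values
hidden = dense(features, hidden_w, hidden_b, activation=relu)
prediction = dense(hidden, out_w, out_b)[0]  # linear output for regression
print(prediction)
```

In a framework such as Keras the same structure is expressed by stacking dense layers, and the weights are learned from the training data rather than drawn at random.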
The accuracy and performance of a model also depend on the data. If the data contains
many missing or null values, the model will not be able to predict properly, because the data
is not a good fit for the model (Géron, 2019). The cleaner the data, the more accurately the
model will predict the target variable. Deep learning has often been observed to give more
accurate results than older learning algorithms (Mitchell, 1997).
Discussion
Introduction and observation of the dataset
Exploring the attributes of the dataset:
1. BOROUGH: The borough attribute consists of five classes, one for each borough in which
properties have been sold: 1 for 'Manhattan', 2 for 'Bronx', 3 for 'Brooklyn', 4 for
'Queens' and 5 for 'Staten Island'. These values should be treated as categorical.
2. NEIGHBORHOOD: This attribute gives the neighborhood name of the property. The name is
determined by the Department of Finance assessors and is generally the same as the
common name of the neighborhood. There may be slight differences in neighborhood
boundaries, and some sub-neighborhoods might not be included. The attribute is
categorical.
3. BUILDING CLASS CATEGORY: This attribute is used to identify similar properties in
the Rolling Sales files without having to look up individual building classes (Norris, 2020).
The files are sorted by Borough, Neighborhood, Building Class Category, Block and Lot.
The values of the attribute are categorical.
4. TAX CLASS AT PRESENT: Every property in the city is assigned to one of four tax
classes (1, 2, 3 and 4) based on the use of the property. This attribute is
categorical.
Class 1: Includes most residential property of up to three units (one-, two- and
three-family houses and small stores or offices), vacant land zoned for residential use,
and most condominiums of no more than three stories.
Class 2: Includes all other property that is primarily residential, such as
condominiums and cooperatives.
Class 3: Includes property with equipment owned by a telephone, gas or electric
company.
Class 4: Includes all properties not included in classes 1, 2 and 3, such as
factories, garages, offices and warehouses.
5. BLOCK and LOT: A tax block is a subdivision of the borough. The block and lot together
distinguish one unit of real property from another, such as the different condominiums
in a single building (Yao, 1999). The tax lot is a subdivision of a tax block and
represents the unique location of the property. Treating these attributes as categorical
is not practical, as there are about 11,000 unique blocks in the dataset. Hence both
block and lot will be used as numerical attributes in the analysis.
6. BUILDING CLASS AT PRESENT: This attribute describes the constructive use of a
property. The first position is a letter that describes the general class of the
property; for example, “A” signifies one-family homes, “O” signifies office buildings and
“R” signifies condominiums (Michie & Spiegelhalter, 1994). The second position, a
number, adds more specific information; continuing the previous examples, “A0” is a
Cape Cod style one-family home, “O4” is a tower type office building and “R5” is a
commercial condominium unit. The attribute is categorical, as each value is a code
identifying a type of property.
7. ADDRESS: The street address of the property as listed in the sales file. For coop
sales the apartment number is included in the address field.
8. ZIP CODE: The postal code of the property. This variable should be categorical.
9. RESIDENTIAL UNITS: The total number of residential units listed for the property.
This variable should be numeric.
10. COMMERCIAL UNITS: The total number of commercial units listed for the property.
This variable should be numeric.
11. TOTAL UNITS: The total number of units listed for the property. This variable
should be numeric.
12. LAND SQUARE FEET: The land area of the property, measured in square feet. This
attribute should be numeric.
13. GROSS SQUARE FEET: The total area of all floors of the building, measured from the
exterior surfaces of the outside walls, including the land area and the space within any
building or structure on the property. This attribute will be numeric.
14. YEAR BUILT: The year in which the property was built. The values of this attribute
will be treated as categorical.
15. TAX CLASS AT TIME OF SALE and BUILDING CLASS AT TIME OF SALE: Both of these
attributes will be categorical.
16. SALE PRICE: This variable should be numeric.
17. SALE DATE: This variable should be a date-time. However, the "year" or "month" part
can be saved as a new categorical variable.
18. EASEMENT: An easement is a right, such as a right of way, that allows an entity to
make limited use of another’s property.
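The attribute types listed above suggest a few simple encoding steps, sketched below. This is a hedged illustration rather than the preprocessing actually used in the report: the helper names, the example record and the date format are hypothetical. It shows the borough code being mapped to its name and to a one-hot vector, a building class code being split into its letter and number, and the year and month being extracted from a sale date.

```python
from datetime import datetime

# Borough codes as described in the attribute list.
BOROUGH_NAMES = {1: "Manhattan", 2: "Bronx", 3: "Brooklyn", 4: "Queens", 5: "Staten Island"}

def one_hot_borough(code):
    """Return a 5-element indicator vector for a borough code in 1..5."""
    return [1 if code == borough else 0 for borough in sorted(BOROUGH_NAMES)]

def split_building_class(code):
    """Split a building class such as 'A0' into (class letter, detail)."""
    return code[0], code[1:]

def sale_year_month(sale_date, fmt="%Y-%m-%d %H:%M:%S"):
    """Extract (year, month) from a sale date string; `fmt` is an assumed format."""
    parsed = datetime.strptime(sale_date, fmt)
    return parsed.year, parsed.month

# Hypothetical record, for illustration only.
record = {
    "BOROUGH": 3,
    "BUILDING CLASS AT PRESENT": "R5",
    "SALE DATE": "2017-03-14 00:00:00",
}

print(BOROUGH_NAMES[record["BOROUGH"]])
print(one_hot_borough(record["BOROUGH"]))
print(split_building_class(record["BUILDING CLASS AT PRESENT"]))
print(sale_year_month(record["SALE DATE"]))
```

One-hot encoding keeps the model from reading an order into the borough codes, which are labels rather than quantities.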
The dataset contains many blank and null values, which no model can process reliably.
Data cleaning and pre-processing therefore need to be performed in order to obtain a cleaner
dataset to work on.
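A minimal cleaning step of this kind might look as follows. The sketch makes several assumptions that are not taken from the dataset itself: the set of placeholder strings treated as missing, the choice of numeric columns, and the rule that a row is dropped only when its sale price is missing. The raw rows are invented for illustration.

```python
# Assumed placeholders for missing values; adjust to the actual file.
MISSING_MARKERS = {"", "-", "na", "nan"}

def to_number(raw):
    """Convert a raw field to float, or return None if it is blank or a placeholder."""
    text = str(raw).strip().replace(",", "")
    if text.lower() in MISSING_MARKERS:
        return None
    try:
        return float(text)
    except ValueError:
        return None

def clean_rows(rows, required=("SALE PRICE",),
               numeric=("SALE PRICE", "GROSS SQUARE FEET")):
    """Parse the numeric columns and drop rows whose required fields are missing."""
    cleaned = []
    for row in rows:
        parsed = dict(row)
        for column in numeric:
            parsed[column] = to_number(row.get(column, ""))
        if all(parsed.get(column) is not None for column in required):
            cleaned.append(parsed)
    return cleaned

# Hypothetical raw rows, as they might be read from a CSV file.
raw_rows = [
    {"SALE PRICE": "650,000", "GROSS SQUARE FEET": "1200"},
    {"SALE PRICE": " -  ", "GROSS SQUARE FEET": "900"},
    {"SALE PRICE": "480000", "GROSS SQUARE FEET": " "},
]
cleaned = clean_rows(raw_rows)
print(len(cleaned), "of", len(raw_rows), "rows kept")
```

Rows without a sale price are dropped because the target of a supervised model cannot be missing, whereas a missing feature such as the gross square feet is kept as None so that it can be imputed or filtered in a later step.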
Figure 1: Distribution of sales over the year
