MODELING & COMPUTING TECHNIQUES


Added on  2022/08/24

Running head: MODELING & COMPUTING TECHNIQUES
Modeling & Computing Techniques
Students Name:
Student ID:
University Name:
Paper code:

Executive Summary
Machine learning and artificial intelligence are among the most powerful technologies in use
today, and their full potential has yet to be seen. Their strength is that a system can learn
automatically from historical data and past experience. Machine learning is generally used to
transform information into knowledge: models uncover useful information and hidden patterns in
the data and make decisions based on that data with minimal human involvement.
The dataset used in this analysis describes every building or building unit (house, apartment,
etc.) sold in the New York City property market over a 12-month period. It covers the five
boroughs of the city and contains 84,548 records.
The model used to predict the sale price is an artificial neural network built with deep
learning. A KerasRegressor estimator is fitted on the training set and then evaluated on the
test set to see how well it predicts the values of the target variable. Deep learning or neural
network models are often reported to give better predictions than other models.
The analysis covers data exploration, visualization and, finally, prediction, in order to gain
in-depth knowledge of the dataset. The model is built from Keras dense layers with TensorFlow as
the backend. The report concludes with how well the model predicts the sale price and which
hidden patterns and information were found in the data.
Table of Contents
Executive Summary
Introduction
Discussion
Introduction and observation of the dataset
The proposed model for price prediction
Conclusion
References
Appendix
Introduction
Machine learning and artificial intelligence are among the most powerful technologies in use
today (Alpaydin, 2020), and their full potential has yet to be seen. Their strength is that a
system can learn automatically from historical data and past experience. Machine learning is
generally used to transform information into knowledge (Bishop, 2006). Machine learning models
find hidden patterns in the data and make decisions based on the data with minimal human
involvement (Moolayil, Moolayil & John, 2019). There are two main categories of machine learning
algorithms: supervised and unsupervised learning (Brownlee, 2016).
In supervised learning the dataset contains labelled data, so the output for each input is
known, whereas in unsupervised learning the inputs are known but the data is unlabelled and the
outputs are unknown (Campesato, 2020). In this analysis the target variable is the sale price
attribute, and the goal is to predict it with an artificial neural network using deep learning
methods (Chernick, 1998).
Deep learning is a subfield of machine learning that is based on networks of layers and is also
capable of learning from data that is unstructured and unlabelled (Daniel, 2013). Different
kinds of layers can be used to build a neural network model. For this analysis only dense layers
have been used to build the artificial neural network model (Dietterich, 1997).
The accuracy and performance of a model also depend on the data. If the data contains many
missing or null values, the model will not be able to predict properly because the data is not a
good fit for the model (Géron, 2019). The cleaner the data, the more accurately the model will
predict the target variable. Deep learning has been observed to give more accurate results than
older learning algorithms (Mitchell, 1997).
Discussion
Introduction and observation of the dataset
Exploring the attributes of the dataset:
1. BOROUGH: This attribute has five classes, one for each borough in which properties were
sold: 1 for 'Manhattan', 2 for 'Bronx', 3 for 'Brooklyn', 4 for 'Queens' and 5 for
'Staten Island'. These values should be treated as categorical.
2. NEIGHBORHOOD: This attribute gives the neighborhood name of the property. The name is
assigned by Department of Finance assessors and is generally the same as the common name of
the neighborhood. There may be a few differences, and some sub-neighborhoods might not be
included. The attribute is categorical.
3. BUILDING CLASS CATEGORY: This attribute makes it possible to identify similar properties
in the Rolling Sales files without having to look up individual building classes (Norris, 2020).
The files are organized by Borough, Neighborhood, Building Class Category, Block and Lot.
The attribute is categorical.
4. TAX CLASS AT PRESENT: Every property in the city is assigned to one of four tax classes
(1, 2, 3 and 4) based on the use of the property. The attribute is categorical.
Class 1: Includes most residential property of up to three units (one-, two- and three-family
houses and small stores or offices with one or two apartments attached), vacant land zoned
for residential use, and most condominiums of no more than three stories.
Class 2: Includes all other property that is primarily residential, such as cooperatives and
condominiums.
Class 3: Includes property with equipment owned by a telephone, gas or electric company.
Class 4: Includes all other properties not included in classes 1, 2 and 3, such as factories,
garages, offices and warehouses.
5. BLOCK and LOT: A tax block is a subdivision of the borough. A tax lot is a subdivision of
a tax block and represents the unique location of the property. Together, block and lot
distinguish one unit of real property from another, such as the different condominiums in a
single building (Yao, 1999). Treating them as categorical makes little sense because the
dataset contains about 11,000 unique blocks. Hence both block and lot are used as numerical
attributes in the analysis.
6. BUILDING CLASS AT PRESENT: This attribute describes the constructive use of a property.
The first position is a letter describing the general class of the property; for example,
“A” signifies one-family homes, “O” signifies office buildings and “R” signifies
condominiums (Michie & Spiegelhalter, 1994). The second position is a number that adds more
specific information: “A0” is a Cape Cod style one-family home, “O4” is a tower-type office
building and “R5” is a commercial condominium unit. The attribute is categorical because each
code identifies a class of property.
7. ADDRESS: The street address of the property as listed in the sales file. Coop sales
include the apartment number in the address field.
8. ZIP CODE: The postal code of the property. This variable should be categorical.
9. RESIDENTIAL UNITS: The number of residential units at the listed property. This variable
should be numeric.
10. COMMERCIAL UNITS: The number of commercial units at the listed property. This variable
should be numeric.
11. TOTAL UNITS: The total number of units at the listed property. This variable should be
numeric.
12. LAND SQUARE FEET: The land area of the property in square feet. This attribute should be
numeric.
13. GROSS SQUARE FEET: The total area of the building as measured from the exterior surfaces
of its outside walls; outside space on the property is also taken into account. This attribute
is numeric.
14. YEAR BUILT: The year in which the property was built. The attribute is treated as
categorical.
15. TAX CLASS AT TIME OF SALE and BUILDING CLASS AT TIME OF SALE: Both attributes are
categorical.
16. SALE PRICE: This variable should be numeric.
17. SALE DATE: This variable should be a datetime. The "year" or "month" part can also be
saved as a new categorical variable.
18. EASEMENT: An easement is a right that allows an entity to make limited use of another's
real property.
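The handling of SALE DATE described in the list can be sketched as follows. This is a minimal illustration on a made-up three-row sample, not the real file; the column names follow the dataset, while the values and the `sales` variable are assumptions.

```python
import pandas as pd

# Hypothetical mini sample: column name as in the dataset, values made up.
sales = pd.DataFrame(
    {"SALE DATE": ["2017-01-15 00:00:00", "2016-09-30 00:00:00", "not a date"]}
)

# Parse the text column; entries that cannot be parsed become NaT instead of raising.
sales["SALE DATE"] = pd.to_datetime(sales["SALE DATE"], errors="coerce")

# Keep the year and month parts as separate categorical features.
sales["sale_year"] = sales["SALE DATE"].dt.year.astype("category")
sales["sale_month"] = sales["SALE DATE"].dt.month.astype("category")

print(sales)
```

Storing year and month as categories treats them as labels rather than quantities, which matches the way the list above classifies the other code-like attributes.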
The dataset contains many blank entries and null values, which no model can process well. Data
cleaning and pre-processing therefore need to be performed in order to obtain a cleaner dataset
to work on.
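A minimal sketch of this kind of cleaning, assuming blank strings and non-numeric placeholders mark the missing entries. The four rows below are made up for illustration, and the `raw` and `clean` names are hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up rows: blanks and a dash placeholder stand in for missing values.
raw = pd.DataFrame({
    "SALE PRICE": ["450000", " -  ", "450000", "1200000"],
    "GROSS SQUARE FEET": ["1200", "900", "1200", " "],
    "BOROUGH": [3, 1, 3, 4],
})

# 1. Turn empty or whitespace-only cells into proper missing values.
clean = raw.replace(r"^\s*$", np.nan, regex=True)

# 2. Convert the numeric columns; anything non-numeric is coerced to NaN.
for col in ["SALE PRICE", "GROSS SQUARE FEET"]:
    clean[col] = pd.to_numeric(clean[col], errors="coerce")

# 3. Drop incomplete rows, then drop exact duplicates.
clean = clean.dropna().drop_duplicates()

print(clean)
```

Dropping every incomplete row is the simplest option; imputing values instead would keep more records at the cost of extra assumptions.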
Figure 1: Distribution of sales over the year
Figure 1 shows the trend of the sale price over the 12-month period.
Figure 2: Average SALE PRICE on each BOROUGH
Figure 2 depicts the average sale price for the five boroughs. Manhattan has the highest
average sale price, whereas Staten Island has the lowest over the year (Zirilli, 1996).
Figure 3: Sales per month
Different analyses can be used to find patterns and useful information in a dataset. Figure 3
shows the number of sales in each of the 12 months for all boroughs combined.
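The two aggregations behind these figures can be sketched with pandas group-by operations. The five rows below are invented for illustration and do not reproduce the figures; only the column names follow the dataset.

```python
import pandas as pd

# Invented sample rows; only the column names follow the dataset.
sales = pd.DataFrame({
    "BOROUGH": ["Manhattan", "Manhattan", "Queens", "Queens", "Bronx"],
    "sale_month": [1, 2, 1, 1, 2],
    "SALE PRICE": [900000, 1100000, 500000, 700000, 400000],
})

# Average sale price per borough, lowest first.
avg_price = sales.groupby("BOROUGH")["SALE PRICE"].mean().sort_values()

# Number of sales in each month across all boroughs.
sales_per_month = sales.groupby("sale_month")["SALE PRICE"].count()

print(avg_price)
print(sales_per_month)
```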

The dataset contains null values, missing values and duplicate values, which need to be
removed to gain a clear insight into the data (Zurada, 1992). A correlation matrix was also used
to identify and delete less important attributes that contribute little to the dataset (Gulli &
Pal, 2017).
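One way such a correlation-based filter could look is sketched below. The report does not state the rule that was actually applied, so the threshold of 0.1, the `NOISE` column and the sample values are all assumptions made for illustration.

```python
import pandas as pd

# Invented data: price depends on area, while NOISE is unrelated by construction.
frame = pd.DataFrame({
    "GROSS SQUARE FEET": [500, 1000, 1500, 2000, 2500, 3000, 3500, 4000],
    "NOISE": [1, -1, -1, 1, 1, -1, -1, 1],
})
frame["SALE PRICE"] = 300 * frame["GROSS SQUARE FEET"]

# Absolute Pearson correlation of every feature with the target.
corr_with_target = frame.corr()["SALE PRICE"].drop("SALE PRICE").abs()

# Hypothetical cut-off: features below it are considered unimportant.
threshold = 0.1
weak = corr_with_target[corr_with_target < threshold].index.tolist()

reduced = frame.drop(columns=weak)
print(weak, list(reduced.columns))
```

Pearson correlation only captures linear relationships, so a feature removed this way may still carry non-linear information that a neural network could use.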
After the analysis and visualization, the pre-processed data is adjusted further and split
into a training set and a test set, which are then fed to the model (Jain, Mao & Mohiuddin,
1996).
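The split and scaling step can be sketched with scikit-learn as below. The 80/20 split mirrors the appendix; the synthetic arrays and the fixed `random_state` are assumptions added so that the example is reproducible.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the encoded feature matrix and the sale prices.
rng = np.random.default_rng(42)
X = rng.uniform(0, 100, size=(50, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(0, 1, 50)

# Hold out 20% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit the scaler on the training rows only, then reuse it for the test rows,
# so that no information from the test set leaks into training.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(X_train_scaled.shape, X_test_scaled.shape)
```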
The proposed model for price prediction
One of the most popular libraries for deep learning is Keras; it is widely used to build
neural networks because of its simplicity and ease of use (Hassoun, 1995). Keras is a
high-level Python neural networks library that runs on top of TensorFlow (Ketkar, 2017).
TensorFlow was used as the backend of the neural network while the model was built
(Limsombunchai, 2004).
Various layers can be used to build an artificial neural network, but for this analysis
only dense layers have been used, and the model is wrapped in the KerasRegressor estimator
(Liu, Yang & El Gamal, 2017). The summary of the model, with the total number of parameters,
is shown below:
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense_1 (Dense)              (None, 18)                342
_________________________________________________________________
dense_2 (Dense)              (None, 18)                342
_________________________________________________________________
dense_3 (Dense)              (None, 18)                342
_________________________________________________________________
dense_4 (Dense)              (None, 1)                 19
=================================================================
Total params: 1,045
Trainable params: 1,045
Non-trainable params: 0
The above is the summary of the model used for training and testing on the dataset. The model
has 1,045 parameters in total. A small number of parameters keeps the model quick to train and
less prone to overfitting, although it does not by itself guarantee better results.
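The parameter counts in the summary can be checked by hand: a dense layer has one weight per input-unit pair plus one bias per unit. The short sketch below reproduces the figures, assuming 18 input features as in the appendix; the helper name `dense_params` is hypothetical.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


# 18 input features, three hidden layers of 18 units, one output unit.
layer_sizes = [18, 18, 18, 18, 1]
per_layer = [
    dense_params(n_in, n_out)
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]
total = sum(per_layer)

print(per_layer, total)  # [342, 342, 342, 19] 1045
```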
A dense layer is a fully connected layer, which means that each neuron in the layer is
connected to every neuron in the next layer (Marsland, 2015). It should also be noted that for
a regression task accuracy is not the best way to judge the performance of the model. The
model can instead be judged with an error function: the lower the error, the better the
performance of the model.
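As an illustration of judging a regression model by its error, the sketch below computes the mean absolute error (MAE) and the root mean squared error (RMSE) with NumPy. The report does not name a specific error function beyond the mean squared error loss used for training, so the choice of metrics and the four price pairs are assumptions.

```python
import numpy as np

# Invented actual and predicted sale prices.
y_true = np.array([500000, 750000, 1200000, 300000], dtype=float)
y_pred = np.array([450000, 800000, 1000000, 350000], dtype=float)

errors = y_pred - y_true

# MAE: average size of the error, in the same unit as the price.
mae = np.mean(np.abs(errors))

# RMSE: like MAE, but large errors are penalized more heavily.
rmse = np.sqrt(np.mean(errors ** 2))

print(f"MAE:  {mae:,.0f}")
print(f"RMSE: {rmse:,.0f}")
```

Both metrics are expressed in dollars, which makes them easier to interpret for price prediction than a classification-style accuracy score.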
Figure 4: Actual price vs. predicted price
Figure 4 shows the actual and predicted prices, which are stored as lists in separate
variables (Mehrotra, Mohan & Ranka, 1997). The series are plotted in segments, each graph
covering a different range of index positions. The actual prices are often much higher than
the predicted prices, as the graphs show large spikes in the actual sale prices.
Figure 5: Scatter plot of predicted price against the actual price
Conclusion
From the above analysis and results it can be concluded that the given dataset is not clean,
which is why extensive data cleaning and pre-processing had to be performed. Various findings
have also been shown using different visualization functions. Although the data was not well
suited to a machine learning model, a KerasRegressor with dense layers was nevertheless built
to predict the sale price.
Accuracy is not a good measure for a regression algorithm; instead, the error rate shows how
well the model has performed. In the discussion section of the report, various conclusions
were drawn from the different graphs, and various analyses were performed to gain in-depth
knowledge of the dataset.
A major improvement would be to apply cross-validation to the estimator once the model has
been built. Error rates should also be calculated to see how far each change reduces the error
compared with the previous version. Different layer configurations should also be built to
test how well a redesigned model works with the dataset.
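The suggested cross-validation could be sketched as follows with scikit-learn. To keep the example small and fast, a LinearRegression model stands in for the neural network; this substitution, the synthetic data and the choice of five folds are assumptions, not part of the original analysis.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic regression data standing in for the prepared features and prices.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(0, 0.5, 100)

# Five folds: every row is used for validation exactly once.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

# scikit-learn reports errors as negative scores, so flip the sign.
scores = cross_val_score(
    LinearRegression(), X, y, cv=cv, scoring="neg_mean_absolute_error"
)
mae_per_fold = -scores

print(mae_per_fold, mae_per_fold.mean())
```

Comparing the mean error and its spread across folds gives a more reliable picture than a single train/test split, and the same procedure can be repeated for each redesigned network.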

References
Alpaydin, E. (2020). Introduction to machine learning. MIT Press.
Bishop, C. M. (2006). Pattern recognition and machine learning. Springer.
Brownlee, J. (2016). Deep learning with Python: develop deep learning models on Theano and
TensorFlow using Keras. Machine Learning Mastery.
Campesato, O. (2020). Artificial Intelligence, Machine Learning, and Deep Learning. Stylus
Publishing, LLC.
Chernick, H. (1998). Fiscal capacity in New York: The city versus the region. National Tax
Journal, 531-540.
Daniel, G. (2013). Principles of artificial neural networks (Vol. 7). World Scientific.
Dietterich, T. G. (1997). Machine-learning research. AI magazine, 18(4), 97-97.
Géron, A. (2019). Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow:
Concepts, Tools, and Techniques to Build Intelligent Systems. O'Reilly Media.
Gulli, A., & Pal, S. (2017). Deep learning with Keras. Packt Publishing Ltd.
Hassoun, M. H. (1995). Fundamentals of artificial neural networks. MIT Press.
Jain, A. K., Mao, J., & Mohiuddin, K. M. (1996). Artificial neural networks: A tutorial. Computer,
29(3), 31-44.
Ketkar, N. (2017). Introduction to keras. In Deep learning with Python (pp. 97-111). Apress,
Berkeley, CA.
Limsombunchai, V. (2004, June). House price prediction: hedonic price model vs. artificial neural
network. In New Zealand agricultural and resource economics society conference (pp. 25-
26).
Liu, X., Yang, D., & El Gamal, A. (2017, October). Deep neural network architectures for
modulation classification. In 2017 51st Asilomar Conference on Signals, Systems, and
Computers (pp. 915-919). IEEE.
Marsland, S. (2015). Machine learning: an algorithmic perspective. CRC Press.
Mehrotra, K., Mohan, C. K., & Ranka, S. (1997). Elements of artificial neural networks. MIT
Press.
Michie, D., Spiegelhalter, D. J., & Taylor, C. C. (1994). Machine learning. Neural and Statistical
Classification, 13(1994), 1-298.
Mitchell, T. M. (1997). Machine learning.
Moolayil, J., Moolayil, & John, S. (2019). Learn Keras for Deep Neural Networks. Apress.
Norris, D. J. (2020). Predictions using ANNs and CNNs. In Machine Learning with the Raspberry
Pi (pp. 387-451). Apress, Berkeley, CA.
Yao, X. (1999). Evolving artificial neural networks. Proceedings of the IEEE, 87(9), 1423-1447.
Yegnanarayana, B. (2009). Artificial neural networks. PHI Learning Pvt. Ltd.
Zirilli, J. S. (1996). Financial prediction using neural networks. International Thomson Computer
Press.
Zurada, J. M. (1992). Introduction to artificial neural systems (Vol. 8). St. Paul: West.
Appendix
# importing all the necessary libraries
import numpy as np
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasRegressor
from sklearn.preprocessing import StandardScaler
from matplotlib import pyplot as plt
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from sklearn.metrics import accuracy_score
# Loading the dataset
df=pd.read_csv('dataset.csv')
# Showing the top 10 data of the dataset
df.head(10)
df.info() # Data information and type
df.describe() # Statistical information of the data
df1=df.copy()
# First, let's remove irrelevant columns:
df.drop(["Unnamed: 0"], axis=1, inplace=True)
df.head()
# constructing the date time variable
df['SALE DATE']= pd.to_datetime(df['SALE DATE'], errors='coerce')
df['sale_year'] = pd.DatetimeIndex(df['SALE DATE']).year.astype("category")

df['sale_month'] = pd.DatetimeIndex(df['SALE DATE']).month.astype("category")
pd.crosstab(df['sale_month'],df['sale_year'])
# constructing the numerical variables:
numeric = ["RESIDENTIAL UNITS", "COMMERCIAL UNITS", "TOTAL UNITS",
           "LAND SQUARE FEET", "GROSS SQUARE FEET", "SALE PRICE"]
for col in numeric:
    df[col] = pd.to_numeric(df[col], errors='coerce') # coercing errors to NAs
# constructing the categorical variables:
categorical = ["BOROUGH", "NEIGHBORHOOD", 'BUILDING CLASS CATEGORY',
               'TAX CLASS AT PRESENT', 'BUILDING CLASS AT PRESENT', 'ZIP CODE',
               'TAX CLASS AT TIME OF SALE']
for col in categorical:
    df[col] = df[col].astype("category")
df.replace(' ',np.nan, inplace=True) # Replacing the blank spaces with NaN first
# getting sum of null values for each attribute
df.isna().sum()
# Showing the missing values using a heatmap
plt.figure(figsize=(10,7))
sns.heatmap(df.isnull(),yticklabels=False,cbar=False,cmap='OrRd_r')
# Dropping useless attributes
df.drop(["EASE-MENT","APARTMENT NUMBER"], axis=1, inplace=True)
df=df.dropna()
# finally check if there is any duplicated value:
sum(df.duplicated())
# Dropping the duplicate rows
df.drop_duplicates(inplace=True)
#Capture necessary columns
variables=df.columns
count=[]
for variable in variables:
    length=df[variable].count()
    count.append(length)
#Plot number of available data per variable
plt.figure(figsize=(30,6))
sns.barplot(x=variables, y=count)
plt.title('Available data in percent', fontsize=15)
plt.show()
# Sale price according to sale date
df.groupby('SALE DATE').agg({'SALE PRICE': ['sum']}).plot(figsize=(28,12))
df2= df[(df['SALE PRICE']>10000) & (df['SALE PRICE']<10000000)].copy()
plt.figure(figsize=(12,6))
sns.distplot(df2['SALE PRICE'], kde=True, bins=50, rug=True,color='#D0DB24')
plt.show()
df2= df2[(df2['SALE PRICE']<4000000)]
plt.figure(figsize=(12,6))
sns.distplot(df2['SALE PRICE'], kde=True, bins=50, rug=True,color='g')
plt.show()
# Plotting according to YEAR BUILT
df3=df2[df2['YEAR BUILT']!=0].copy()
plt.figure(figsize=(12,6))
sns.distplot(df3['YEAR BUILT'], bins=50, rug=True,color="r")
plt.show()
# Plotting according to TOTAL UNITS
df4=df3[df3['TOTAL UNITS']!=0].copy()
plt.figure(figsize=(12,6))
sns.distplot(df4['TOTAL UNITS'], bins=50, rug=True,color='#BE19EE')
plt.show()
# Converting the numeric to proper name of the places
#'1':'Manhattan', '2':'Bronx', '3': 'Brooklyn', '4':'Queens','5':'Staten Island'
df4['BOROUGH']= df4['BOROUGH'].map({1:'Manhattan', 2:'Bronx', 3: 'Brooklyn',
4:'Queens',5:'Staten Island'})
df4.head()
plt.figure(figsize=(12,5))
#Plot the data and configure the settings
#CountPlot --> histogram over a categorical, rather than quantitative, variable.
plt.title('Counting number of BOROUGH')
sns.countplot(x='BOROUGH',data=df4)
# Plotting Average SALE PRICE on each BOROUGH
df_bar=df4[['BOROUGH','SALE PRICE']].groupby(by='BOROUGH').mean().sort_values(by =
'SALE PRICE', ascending = True).reset_index()
plt.figure(figsize=(10,8))
sns.barplot(x = 'BOROUGH', y = 'SALE PRICE', data = df_bar)
plt.title('Average SALE PRICE on each BOROUGH')
plt.show()
# Plotting box plot for SALE PRICE on each BOROUGH to find if outliers are present or not
plt.figure(figsize=(12,6))
sns.boxplot(x = 'BOROUGH', y = 'SALE PRICE', data = df4 )
plt.title('Box plots for SALE PRICE on each BOROUGH')
plt.show()
# Plotting Count Sales by each month
df5=df4[['sale_month', 'SALE PRICE']].groupby(by = 'sale_month').count().sort_values(by =
'sale_month', ascending = True).reset_index()
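# Rename the count column explicitly instead of mutating columns.values in place
df5 = df5.rename(columns={'SALE PRICE': 'Sales_count'})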
plt.figure(figsize=(12,6))
sns.barplot(x = 'sale_month', y = 'Sales_count', data = df5 )
plt.title('Count Sales by each month')
plt.show()
# Plotting Commercial Units vs Sale Price

# .copy() so that the 'seasons' column added later does not write into a view of df4
dataset = df4[(df4['COMMERCIAL UNITS']<20) & (df4['TOTAL UNITS']<50) &
              (df4['SALE PRICE']<5000000) & (df4['SALE PRICE']>100000) &
              (df4['GROSS SQUARE FEET']>0)].copy()
plt.figure(figsize=(10,6))
sns.boxplot(x='COMMERCIAL UNITS', y="SALE PRICE", data=dataset)
plt.title('Commercial Units vs Sale Price')
# Plotting Residential Units vs Sale Price
plt.figure(figsize=(10,6))
sns.boxplot(x='RESIDENTIAL UNITS', y='SALE PRICE', data=dataset)
plt.title('Residential Units vs Sale Price')
plt.show()
# Plotting Quantity of properties sold by year built
plt.figure(figsize=(10,6))
plotd=sns.countplot(x=dataset[dataset['YEAR BUILT']>1900]['YEAR BUILT'])
#plotd.set_xlim([1900, 2020])
plt.tick_params(labelbottom=False)
plt.xticks(rotation=30)
plt.title("Quantity of properties sold by year built")
plt.show()
#Generate a column season
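def get_season(x):
    # x is (month % 12 + 3) // 3, so 1 = Dec-Feb, 2 = Mar-May, 3 = Jun-Aug, 4 = Sep-Nov.
    # The labels follow the Northern Hemisphere seasons that apply to New York City.
    if x==1:
        return 'Winter'
    elif x==2:
        return 'Spring'
    elif x==3:
        return 'Summer'
    elif x==4:
        return 'Fall'
    else:
        return ''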
dataset['seasons']=dataset['SALE DATE'].apply(lambda x:x.month)
dataset['seasons']=dataset['seasons'].apply(lambda x:(x%12+3)//3)
dataset['seasons']=dataset['seasons'].apply(get_season)
plt.figure(figsize=(20,25))
df_wo=dataset
sns.relplot(x="BOROUGH", y="SALE PRICE",hue='seasons' ,kind="line",
data=df_wo,legend='full')
df4['SALE DATE'] = df1['SALE DATE'].apply(lambda x: int(x[:4]+x[5:7]+x[8:10]))
df4['SALE DATE'] = df4['SALE DATE'].astype(int)
df4 = df4[df4['SALE PRICE'] != 0]
# Taking the important attributes of the dataset
X = df4[['BOROUGH','NEIGHBORHOOD','BUILDING CLASS CATEGORY','TAX CLASS AT PRESENT',
         'BLOCK','LOT','BUILDING CLASS AT PRESENT','ADDRESS','ZIP CODE',
         'RESIDENTIAL UNITS','COMMERCIAL UNITS','TOTAL UNITS','LAND SQUARE FEET',
         'GROSS SQUARE FEET','YEAR BUILT','TAX CLASS AT TIME OF SALE',
         'BUILDING CLASS AT TIME OF SALE','SALE DATE']].values
y = df4['SALE PRICE'].values
# Labeling all the string values of the specific attributes
labelencoder_X_0 = LabelEncoder()
X[:, 0] = labelencoder_X_0.fit_transform(X[:, 0])
labelencoder_X_1 = LabelEncoder()
X[:, 1] = labelencoder_X_1.fit_transform(X[:, 1])
labelencoder_X_2 = LabelEncoder()
X[:, 2] = labelencoder_X_2.fit_transform(X[:, 2])
labelencoder_X_3 = LabelEncoder()
X[:, 3] = labelencoder_X_3.fit_transform(X[:, 3])
labelencoder_X_6 = LabelEncoder()
X[:, 6] = labelencoder_X_6.fit_transform(X[:, 6])
labelencoder_X_7 = LabelEncoder()
X[:, 7] = labelencoder_X_7.fit_transform(X[:, 7])
labelencoder_X_16 = LabelEncoder()
X[:, 16] = labelencoder_X_16.fit_transform(X[:, 16])
# Splitting the training set and test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2)
# Feature Scaling
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
def baseline_model():
    # create model: three hidden dense layers of 18 units and one output unit
    model = Sequential()
    model.add(Dense(18, input_dim=18, kernel_initializer='normal', activation='relu'))
    # 'units' and 'kernel_initializer' replace the old 'output_dim' and 'init' arguments
    model.add(Dense(units=18, kernel_initializer='uniform', activation='relu'))
    model.add(Dense(units=18, kernel_initializer='uniform', activation='relu'))
    model.add(Dense(1, kernel_initializer='normal'))
    # Compile model
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model
x=baseline_model()
x.summary()
# Fitting to the training set
estimator = KerasRegressor(build_fn=baseline_model, epochs=100, batch_size=10,
verbose=False)
estimator.fit(X_train, y_train)
prediction = estimator.predict(X_test)
# Visualization the results and evaluation
n, length = 5, len(prediction)

sns.set_style('darkgrid', {'axes.facecolor':'black'})
f, axes = plt.subplots(n, 1, figsize=(20,50))
times = 0
# Plot the test set in n consecutive segments, one subplot per segment
for i in range(n):
    if i == 0:
        plt.sca(axes[0])
        plt.plot(y_test[:round(length/n)], color = '#19E3EE', label = 'Real Price')
        plt.plot(prediction[:round(length/n)], color = '#EE1966', label = 'Predicted Price')
        plt.title('NYC Property Price Prediction', fontsize=30)
        plt.ylabel('Price', fontsize=20)
        plt.legend(loc=1, prop={'size': 10})
    else:
        if i == n-1:
            plt.sca(axes[n-1])
            plt.plot(y_test[round(length/n*(n-1)):], color = '#19E3EE', label = 'Real Price')
            plt.plot(prediction[round(length/n*(n-1)):], color = '#EE1966', label = 'Predicted Price')
            plt.ylabel('Price', fontsize=20)
            plt.legend(loc=1, prop={'size': 10})
        else:
            plt.sca(axes[i])
            plt.plot(y_test[round(length/n*i):round(length/n*(i+1))], color = '#19E3EE',
                     label = 'Real Price')
            plt.plot(prediction[round(length/n*i):round(length/n*(i+1))], color = '#EE1966',
                     label = 'Predicted Price')
            plt.ylabel('Price', fontsize=20)
            plt.legend(loc=1, prop={'size': 10})
plt.show()
df_n = pd.DataFrame(list(zip(y_test.astype(int), prediction.astype(int))),columns =['Actual Price',
'Predicted Price'])
df_n.head(10)