============= Report on Bike Sharing ==================

The problem is to count the bikes on rental by hour and by day. Bike sharing systems are a new generation of traditional bike rentals in which the whole process of membership, rental, and return has become automatic. The task in this project is to count how many bikes are on rent and to find out which season and which other parameters affect that count.

Before coding, some important steps are as follows:-
1) Understand the question and think about the solution.
2) Find the missing data points.
3) Prepare the data.
4) Decide on the model that you aim to exceed.
5) Train the model on the training data.
6) Compare the predictions to the test data.
7) Interpret the model and report the results visually and numerically.

This is a regression type of model, i.e. the output is in continuous form. As per your references, I used three types of regression model:
1) Decision Tree
2) Gradient Boosted Tree
3) Linear Regression

Looking at the dataset (.csv file), some of the parameters are useless, so they are removed, i.e. dropped. In my judgment the “Date” column was not needed, so I removed it. After that, I check the shape of the whole data set. I also check the season column and describe it. In this case the target column is “cnt” (i.e. count). The count column records how many bikes are on rent in a particular season and at a particular time. Many count values are available, which is why the main approach is to calculate the count.

Requirements:- Python 2 or 3.

Step 1:- Import all the libraries, such as Pandas, NumPy, plotting (matplotlib), and a number of packages from sklearn.
Step 2:- Read the input file with pandas, e.g. pd.read_csv(File Path).
Step 3:- Find the shape of the data. Check which columns contain 0, because sometimes a 0 value carries no useful information.
Step 4:- Remove the irrelevant features, such as “Date”.
Step 5:- An important step is to look for unique values, i.e. columns in which most of the values are unique. It is best to remove such columns. Otherwise they can sometimes overlap, and in that case the model is most likely to overfit, which affects the accuracy (i.e. the final output).
Step 6:- After that, delete the unique-value columns.
Step 7:- Another important part of the pre-processing step is to check for null values in the dataset. Null values are replaced by the mean, mode, or median. The choice depends on a visualization of the data, such as a normal distribution curve (“bell-shaped”). This graph is very useful for visualizing the data.
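The steps above can be sketched in Python 3 roughly as follows. This is a minimal illustration under stated assumptions, not the report's actual code: the column names “Date” and “cnt” come from the text, while the function names `preprocess` and `compare_models`, the median as the fill value, the 80/20 split, and RMSE as the comparison metric are assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor


def preprocess(df, target="cnt", drop_cols=("Date",)):
    """Steps 3-7: inspect, drop irrelevant and all-unique columns, fill nulls."""
    # Step 3: shape, and how many zeros each numeric column contains.
    print("shape:", df.shape)
    print("zeros per column:")
    print((df.select_dtypes(include="number") == 0).sum())
    # Step 4: drop irrelevant features (names that are absent are ignored).
    df = df.drop(columns=list(drop_cols), errors="ignore")
    # Steps 5-6: drop columns whose value differs on every row (e.g. a row id).
    unique_cols = [c for c in df.columns
                   if c != target and df[c].nunique() == len(df)]
    df = df.drop(columns=unique_cols)
    # Step 7: fill nulls in numeric columns. The median is an assumption here;
    # the report allows mean, mode, or median depending on the distribution.
    return df.fillna(df.median(numeric_only=True))


def compare_models(df, target="cnt", seed=0):
    """Train the three regressors and return test-set RMSE for each."""
    X = df.drop(columns=[target]).select_dtypes(include="number")
    y = df[target]
    # Assumed 80/20 train/test split.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed)
    models = {
        "Decision Tree": DecisionTreeRegressor(random_state=seed),
        "Gradient Boosted Tree": GradientBoostingRegressor(random_state=seed),
        "Linear Regression": LinearRegression(),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        # RMSE (lower is better) is an assumed metric, not named in the report.
        scores[name] = float(np.sqrt(mean_squared_error(y_test, pred)))
    return scores


# Hypothetical usage (the file path is a placeholder):
# df = preprocess(pd.read_csv("bike_sharing.csv"))
# print(compare_models(df))
```

Keeping the pre-processing in one function makes it easy to swap the fill value for the mean or mode after looking at a histogram of each column.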