=============Report on Bike Sharing==================

The problem is to count the bikes on rental, hourly and daily. Bike sharing systems are a new generation of traditional bike rentals where the whole process, from membership to rental and return, has become automatic. In this project the task is to count how many bikes are on rent and to find out how the season and a number of other parameters affect that count.

Before coding, some important steps are as follows:
1) Understand the question and think about the solution.
2) Find the missing data points.
3) Prepare the data.
4) Decide on the model that you aim to exceed.
5) Train the model on the training data.
6) Compare the predictions to the test data.
7) Interpret the model and report the results visually and numerically.

This is a regression type of model, i.e. the output is in continuous form. As per your references, we used three types of regression model:
1) Decision Tree
2) Gradient Boosted Tree
3) Linear regression

When we look at the dataset (.csv file), some of the parameters are useless, so they are dropped. As per my thinking, I removed the “Date” column. After that, I check the shape of the whole data set. I also check the season column and describe it. In this case the target column is “cnt” (i.e. count). The count column records how many bikes are on rent in a particular season and at a particular time. Many count values are available, which is why the main approach is to calculate the count.

Requirements :- Python 2 or 3.

Step 1:- Import all the libraries, such as Pandas, NumPy, plotting (matplotlib) and a number of packages from sklearn.
Step 2:- Read the input file with pandas, e.g. pd.read_csv(File Path)
Step 3:- Find out the shape. Check the columns which include 0, because sometimes a value of 0 is not meaningful.
Step 4:- Remove the irrelevant features, such as “Date”.
Step 5:- An important step is to look at the unique values in each column. A column in which most of the values are unique is best removed. If such a column is kept, there is a high chance that the model overfits, which affects the accuracy (i.e. the final output).
Step 6:- After that, delete the unique-value column.
Step 7:- Another important part of the pre-processing step is to check for null values in the dataset. Null values are replaced by the mean, mode or median. Which one to use depends on the visualization of the data, such as the normal distribution curve (“bell-shaped”). This graph is very useful for visualizing the data.
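
Steps 2 to 7 above can be sketched in pandas roughly as follows. This is only an illustration under stated assumptions, not the code used for the report: the small inline CSV stands in for the real file, the column names "instant", "hr" and "temp" are hypothetical (only "Date", "season" and "cnt" are named in the report), and "unique value column" is read here as an integer column in which every row has a different value. The helper `preprocess` and its `strategy` parameter are likewise made up for this example.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the real .csv file. Only "Date",
# "season" and "cnt" are named in the report; the rest are assumptions.
SAMPLE_CSV = """instant,Date,season,hr,temp,cnt
1,2011-01-01,1,0,0.24,16
2,2011-01-01,1,1,,40
3,2011-01-01,1,2,0.22,32
4,2011-04-01,2,0,0.30,13
5,2011-04-01,2,1,0.34,27
"""


def preprocess(df, target="cnt", drop_cols=("Date",), strategy="median"):
    """Drop irrelevant and identifier-like columns, then fill null values."""
    # Step 4: remove features judged irrelevant.
    df = df.drop(columns=[c for c in drop_cols if c in df.columns])

    # Steps 5-6: drop integer columns in which every value is distinct
    # (assumed meaning of "unique value column"); the target is kept.
    id_like = [
        c for c in df.columns
        if c != target
        and pd.api.types.is_integer_dtype(df[c])
        and df[c].nunique() == len(df)
    ]
    df = df.drop(columns=id_like)

    # Step 7: replace nulls by the mean, mode or median of the column.
    for col in [c for c in df.columns if df[c].isnull().any()]:
        if strategy == "mode":
            fill = df[col].mode().iloc[0]
        elif strategy == "mean":
            fill = df[col].mean()
        else:
            fill = df[col].median()
        df[col] = df[col].fillna(fill)
    return df


# Step 2: read the input (a real run would pass the file path instead).
raw = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Step 3: inspect the shape and count the zeros in each numeric column.
print(raw.shape)
print((raw.select_dtypes("number") == 0).sum())

clean = preprocess(raw)
print(clean.columns.tolist())
print(clean.isnull().sum())
```

The median is used as the default fill because it is less sensitive to skewed data. If a histogram of the column looks roughly bell-shaped, passing `strategy="mean"` would be a reasonable alternative.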