Using Aggregation Functions for Data Analysis

   

Added on  2023-02-01

SIT718 Real World Analytics
Assessment Task 3: Problem Solving
Deakin University 1 Future Learn
Name of Student:

Question 1
#1. Understand the data
# (ii) Read the data file and assign it to a matrix
the.data <- as.matrix(read.table("F:Energy19.txt"))
# (iii) The variable of interest is energy use of appliances (Y).
# To investigate Y, generate a subset of 300 rows, sampled at random from rows 1 to 671
my.data <- the.data[sample(1:671, 300), c(1:6)]
data_frame <- data.frame(my.data) # convert the sampled matrix to a data frame
x1 <- data_frame$V1 # X1: temperature in kitchen area, in Celsius
x2 <- data_frame$V2 # X2: humidity in kitchen area, given as a percentage
x3 <- data_frame$V3 # X3: temperature outside (from weather station), in Celsius
x4 <- data_frame$V4 # X4: humidity outside (from weather station), given as a percentage
x5 <- data_frame$V5 # X5: visibility (from weather station), in km
y <- data_frame$V6 # Y: energy use of appliances, in Wh
# (iv) Using scatter plots and histograms, report on the general relationship between each of the
# variables X1, X2, X3, X4, X5 and the variable of interest Y. Include 5 scatter plots, 6 histograms,
# and 1 or 2 sentences for each of the variables, including the variable of interest Y.
# Each scatter plot puts the predictor on the horizontal axis and Y on the vertical axis.
# plot() draws the figure and returns NULL, so printing the stored value shows NULL.
scatter_plot1 <- plot(x1, y, main = "A Scatter Plot of Energy Use against Kitchen Temperature")
scatter_plot1
scatter_plot2 <- plot(x2, y, main = "A Scatter Plot of Energy Use against Humidity in Kitchen Area")
scatter_plot2
scatter_plot3 <- plot(x3, y, main = "A Scatter Plot of Energy Use against Temperature Outside")
scatter_plot3
scatter_plot4 <- plot(x4, y, main = "A Scatter Plot of Energy Use against Humidity Outside")
scatter_plot4
scatter_plot5 <- plot(x5, y, main = "A Scatter Plot of Energy Use against Visibility")
scatter_plot5
# hist() draws the figure and returns a histogram object, which is printed below.
histogram_1 <- hist(x1, main = "Histogram of Kitchen Temperature")
histogram_1
histogram_2 <- hist(x2, main = "Histogram of Humidity in Kitchen Area")
histogram_2
histogram_3 <- hist(x3, main = "Histogram of Temperature Outside")
histogram_3
histogram_4 <- hist(x4, main = "Histogram of Humidity Outside")
histogram_4
histogram_5 <- hist(x5, main = "Histogram of Visibility")
histogram_5
histogram_6 <- hist(y, main = "Histogram of Energy Use")
histogram_6
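Because sample() draws a different subset on every run, the plots and histogram counts will change from run to run unless the random seed is fixed. The following is a minimal sketch of a reproducible draw, not part of the original submission. The seed value 718 and the synthetic stand-in matrix are assumptions made only so the example runs on its own; in practice the.data would be the matrix read from the energy data file.

```r
# Stand-in for the.data (assumption: 671 rows, 6 numeric columns).
the.data <- matrix(rnorm(671 * 6), ncol = 6)

# Fix the seed before sampling; 718 is an arbitrary, hypothetical choice.
set.seed(718)
rows <- sample(nrow(the.data), 300)   # 300 distinct row indices, no hardcoded row count

# Resetting the same seed reproduces exactly the same indices.
set.seed(718)
rows_again <- sample(nrow(the.data), 300)
identical(rows, rows_again)           # TRUE

my.data <- the.data[rows, 1:6]
dim(my.data)                          # 300 6
```

Using nrow(the.data) instead of the literal 671 keeps the sampling line valid if the input file changes length.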
The Output
> #1. Understand the data
> # (ii) Read the data file and assign it to a matrix
> the.data <- as.matrix(read.table("F:Energy19.txt"))

> # (iii) The variable of interest is energy use of appliances (Y).
> # To investigate Y, generate a subset of 300 rows, sampled at random from rows 1 to 671
> my.data <- the.data[sample(1:671, 300), c(1:6)]
> data_frame <- data.frame(my.data) # convert the sampled matrix to a data frame
> x1 <- data_frame$V1 # X1: temperature in kitchen area, in Celsius
> x2 <- data_frame$V2 # X2: humidity in kitchen area, given as a percentage
> x3 <- data_frame$V3 # X3: temperature outside (from weather station), in Celsius
> x4 <- data_frame$V4 # X4: humidity outside (from weather station), given as a percentage
> x5 <- data_frame$V5 # X5: visibility (from weather station), in km
> y <- data_frame$V6 # Y: energy use of appliances, in Wh
> # (iv) Using scatter plots and histograms, report on the general relationship between each of the
> # variables X1, X2, X3, X4, X5 and the variable of interest Y. Include 5 scatter plots, 6 histograms,
> # and 1 or 2 sentences for each of the variables, including the variable of interest Y.
> # Each scatter plot puts the predictor on the horizontal axis and Y on the vertical axis.
> # plot() draws the figure and returns NULL, so printing the stored value shows NULL.
> scatter_plot1 <- plot(x1, y, main = "A Scatter Plot of Energy Use against Kitchen Temperature")
> scatter_plot1
NULL
> scatter_plot2 <- plot(x2, y, main = "A Scatter Plot of Energy Use against Humidity in Kitchen Area")
> scatter_plot2
NULL
> scatter_plot3 <- plot(x3, y, main = "A Scatter Plot of Energy Use against Temperature Outside")
> scatter_plot3
NULL
> scatter_plot4 <- plot(x4, y, main = "A Scatter Plot of Energy Use against Humidity Outside")
> scatter_plot4
NULL
> scatter_plot5 <- plot(x5, y, main = "A Scatter Plot of Energy Use against Visibility")
> scatter_plot5
NULL
> # hist() draws the figure and returns a histogram object, which is printed below.
> histogram_1 <- hist(x1, main = "Histogram of Kitchen Temperature")
> histogram_1
$`breaks`
[1] 15 16 17 18 19 20 21 22 23 24 25
$counts
[1] 21 43 60 54 48 37 24 10 2 1
$density
[1] 0.070000000 0.143333333 0.200000000 0.180000000 0.160000000 0.123333333
[7] 0.080000000 0.033333333 0.006666667 0.003333333
$mids
[1] 15.5 16.5 17.5 18.5 19.5 20.5 21.5 22.5 23.5 24.5

$xname
[1] "x1"
$equidist
[1] TRUE
attr(,"class")
[1] "histogram"
> histogram_2 <- hist(x2, main = "Histogram of Humidity in Kitchen Area")
> histogram_2
$`breaks`
[1] 26 28 30 32 34 36 38 40 42 44 46 48 50 52 54
$counts
[1] 1 8 17 32 31 51 49 40 25 19 16 6 2 3
$density
[1] 0.001666667 0.013333333 0.028333333 0.053333333 0.051666667 0.085000000
[7] 0.081666667 0.066666667 0.041666667 0.031666667 0.026666667 0.010000000
[13] 0.003333333 0.005000000
$mids
[1] 27 29 31 33 35 37 39 41 43 45 47 49 51 53
$xname
[1] "x2"
$equidist
[1] TRUE
attr(,"class")
[1] "histogram"
> histogram_3 <- hist(x3, main = "Histogram of Temperature Outside")
> histogram_3
$`breaks`
[1] -1 0 1 2 3 4 5 6 7 8
$counts
[1] 3 22 48 73 91 50 10 2 1
$density
[1] 0.010000000 0.073333333 0.160000000 0.243333333 0.303333333 0.166666667
[7] 0.033333333 0.006666667 0.003333333

$mids
[1] -0.5 0.5 1.5 2.5 3.5 4.5 5.5 6.5 7.5
$xname
[1] "x3"
$equidist
[1] TRUE
attr(,"class")
[1] "histogram"
> histogram_4 <- hist(x4, main = "Histogram of Humidity Outside")
> histogram_4
$`breaks`
[1] 60 65 70 75 80 85 90 95 100
$counts
[1] 3 18 28 47 78 73 42 11
$density
[1] 0.002000000 0.012000000 0.018666667 0.031333333 0.052000000 0.048666667
[7] 0.028000000 0.007333333
$mids
[1] 62.5 67.5 72.5 77.5 82.5 87.5 92.5 97.5
$xname
[1] "x4"
$equidist
[1] TRUE
attr(,"class")
[1] "histogram"
> histogram_5 <- hist(x5, main = "Histogram of Visibility")
> histogram_5
$`breaks`
[1] 15 20 25 30 35 40 45 50 55 60 65
$counts
[1] 6 33 65 54 111 9 6 6 9 1
$density
[1] 0.0040000000 0.0220000000 0.0433333333 0.0360000000 0.0740000000
[6] 0.0060000000 0.0040000000 0.0040000000 0.0060000000 0.0006666667
$mids
[1] 17.5 22.5 27.5 32.5 37.5 42.5 47.5 52.5 57.5 62.5
$xname
[1] "x5"
$equidist
[1] TRUE
attr(,"class")
[1] "histogram"
> histogram_6 <- hist(y, main = "Histogram of Energy Use")
> histogram_6
$`breaks`
[1] 20 40 60 80 100 120 140 160 180 200
$counts
[1] 74 133 36 26 11 10 2 3 5
$density
[1] 0.0123333333 0.0221666667 0.0060000000 0.0043333333 0.0018333333 0.0016666667
[7] 0.0003333333 0.0005000000 0.0008333333
$mids
[1] 30 50 70 90 110 130 150 170 190
$xname
[1] "y"
$equidist
[1] TRUE
attr(,"class")
[1] "histogram"
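
The printed histogram objects can be checked by hand. As an illustrative sketch, the block below re-enters the breaks and counts reported for x1 (kitchen temperature) and confirms that the counts cover all 300 sampled rows and that each density equals count / (n * bin width). The variable names are chosen here for illustration and do not appear in the original script.

```r
# Values copied from the printed histogram_1 object.
breaks <- 15:25
counts <- c(21, 43, 60, 54, 48, 37, 24, 10, 2, 1)

n <- sum(counts)                    # total number of observations
widths <- diff(breaks)              # bin widths (all equal to 1 here)
density <- counts / (n * widths)    # how hist() scales counts to densities
mids <- breaks[-length(breaks)] + widths / 2

n                                   # 300
round(density[1:3], 4)              # 0.0700 0.1433 0.2000
sum(density * widths)               # 1, densities integrate to one
mids[which.max(counts)]             # 17.5, midpoint of the most frequent bin
```

The same check applies to the other five histograms; for x2, for example, the bin width is 2, so the first density is 1 / (300 * 2).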
