Data Analysis Report: Web Server Statistics and Visualization


Added on  2022/08/13

Report
AI Summary
This report provides a comprehensive analysis of web server statistics from a computer science department, focusing on data collected over nine weeks across eleven variables. The analysis identifies anomalies in the average successful requests per day, particularly in the third and fifth weeks, indicating irregular data patterns. Five key variables—average successful requests per day, average successful requests for pages per day, total failed requests, total redirected requests, and number of distinct file requests—were selected using a simple random sampling lottery method. The report details measures of central tendency (mean, median, mode) and dispersion (range, standard deviation, quartile deviation) for each selected variable. Graphical representations, including bar charts, pie charts, and histograms, are used to visualize the data, with explanations for the suitability of each chart type. The study employs MS-Excel for both statistical analysis and graphical representation, highlighting the importance of charts and graphs for easy understanding and comparison. The standard deviations for the selected variables are calculated, and references to relevant statistical methods and tools are included.
Running head: STATISTICS
Statistics
Name of the Student:
Name of the University:
Author note:
This study is based on web server statistics for the Department of Computer
Science. The data cover nine weeks and 11 variables, with each column reporting
either a total or an average. The average successful requests per day show some
anomalies (Slezák, Bokes, Námer, & Waczulíková, 2014), and the graphical results
show the same. The series is also irregular, most visibly in the third and
fifth weeks.
The selected variables are average successful requests per day, average successful
requests for pages per day, total failed requests, total redirected requests and
number of distinct file requests. The lottery method of simple random sampling was
applied to select these five variables from the 11 available.
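The lottery method amounts to drawing five names out of eleven, each with equal chance and without replacement. A minimal sketch of that draw follows. The report names only the five selected variables, so the eleven column labels here are hypothetical placeholders, and the `seed` argument is an assumption added only to make the draw repeatable.

```python
import random

# Hypothetical placeholders: the report does not list all eleven column names.
variables = [f"variable_{i}" for i in range(1, 12)]


def lottery_sample(items, k, seed=None):
    """Draw k distinct items, each equally likely (sampling without replacement)."""
    rng = random.Random(seed)
    return rng.sample(items, k)


selected = lottery_sample(variables, 5, seed=0)
print(selected)
```

Each run with a different seed gives a different set of five, which is the point of the method: no variable is favoured by the analyst.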
For average successful requests per day, the measures of central tendency are a
mean of 48111.13 and a median of 46378. The mode does not exist because no value
is repeated in the data sheet. The measures of dispersion, namely the range,
standard deviation and quartile deviation, are 26944, 7643.83 and 4024.9
respectively.
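The statement that the mode "does not exist" follows from every value occurring exactly once. The sketch below shows one way to compute the three measures and to report a missing mode explicitly. The helper name `mode_or_none` and the nine sample values are illustrative assumptions, not the report's data.

```python
from collections import Counter
from statistics import mean, median


def mode_or_none(values):
    """Return the most frequent value, or None when no value repeats."""
    value, count = Counter(values).most_common(1)[0]
    return value if count > 1 else None


# Toy data for illustration only (nine observations, all distinct).
sample = [40, 42, 45, 47, 50, 52, 55, 58, 70]

print("mean:", mean(sample))
print("median:", median(sample))
print("mode:", mode_or_none(sample))
```

With a repeated value, for example `[3, 5, 5, 7]`, the same helper returns the repeated value instead of `None`.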
For average successful requests for pages per day, the measures of central
tendency are a mean of 12746.84 and a median of 13406.6. The mode does not exist
because no value is repeated in the data sheet. The measures of dispersion, namely
the range, standard deviation and quartile deviation, are 6068.4, 2308.4 and
1822.903 respectively.
For total failed requests, the mean, median and mode are 14992.78, 13450 and
13450 respectively. The measures of dispersion, namely the range, standard
deviation and quartile deviation, are 9556, 4211 and 3202.831 respectively.
For total redirected requests, the measures of central tendency are a mean of
15164.89 and a median of 14300. The mode does not exist because no value is
repeated in the data sheet. The measures of dispersion, namely the range,
standard deviation and quartile deviation, are 10105, 3697 and 3052.86
respectively.
For the number of distinct file requests, the measures of central tendency are a
mean of 15164.89 and a median of 14300. The mode does not exist because no value
is repeated in the data sheet. The measures of dispersion, namely the range,
standard deviation and quartile deviation, are 10309, 3573.5 and 3202.83
respectively.
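The three dispersion measures reported above can be sketched as follows, taking the quartile deviation as half the interquartile range, (Q3 - Q1) / 2. The report does not say which quartile convention or which standard deviation (sample or population) was used, so the `inclusive` quartile method and the sample standard deviation are assumptions here. The data are toy values, not the report's.

```python
from statistics import quantiles, stdev


def dispersion(values):
    """Return range, sample standard deviation and quartile deviation."""
    # Assumption: quartiles by linear interpolation over the data range.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "range": max(values) - min(values),
        "std_dev": stdev(values),  # assumption: n - 1 in the denominator
        "quartile_deviation": (q3 - q1) / 2,
    }


# Toy data for illustration only.
sample = [40, 42, 45, 47, 50, 52, 55, 58, 70]
result = dispersion(sample)
print(result)
```

A different quartile convention can change the quartile deviation for small data sets such as nine weekly values, so the convention is worth stating alongside the result.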
[Chart: Avg. successful rqst per day. X-axis: Class (35000 to 40000, 40000 to 45000,
45000 to 50000, 50000 to 55000, 55000 to 60000, 60000 to 65000, 65000 to 70000);
Y-axis: Frequency (0 to 7)]
Figure 1 Bar chart on average successful request per day
Figure 1 shows the bar chart of average successful requests per day. The X-axis
represents the class and the Y-axis represents the frequency. The data fall into
seven classes and include an outlier, so a bar chart is an appropriate
representation.
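The class frequencies behind a chart like Figure 1 can be tabulated as sketched below, using the seven classes of width 5000 from 35000 to 70000 shown on the axis. Treating each class as half-open (lower bound included, upper bound excluded) is an assumption, and the input values are hypothetical, not the report's observations.

```python
def class_frequencies(values, start, stop, width):
    """Count values in half-open classes [lower, lower + width)."""
    counts = {f"{lo} to {lo + width}": 0 for lo in range(start, stop, width)}
    for v in values:
        if start <= v < stop:
            lo = start + int((v - start) // width) * width
            counts[f"{lo} to {lo + width}"] += 1
    return counts


# Hypothetical daily averages for illustration only.
values = [36000, 41000, 46000, 47000, 48000, 52000, 68000]
table = class_frequencies(values, start=35000, stop=70000, width=5000)
for label, freq in table.items():
    print(f"{label}: {freq}")
```

Empty classes stay in the table with a frequency of zero, which is what makes an isolated high value stand out as an outlier when the counts are charted.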
[Chart: Pie chart on Avg. successful rqst for pages per day. Slice labels: 13%, 88%;
legend: 5000 to 10000, 10000 to 15000]
Figure 2 Pie chart on average successful request for pages per day
Figure 2 shows the pie chart of average successful requests for pages per day.
Each slice represents a class, and its size represents that class's share of the
observations. Because the data fall into only two classes, a pie chart is an
appropriate representation.
[Chart: Histogram on Total failed rqst. X-axis: Class (10000 to 15000, 15000 to 20000,
20000 to 25000); Y-axis: Frequency (0 to 7)]
Figure 3 Histogram on total failed request
Figure 3 shows the histogram of total failed requests. The X-axis represents the
class and the Y-axis represents the frequency. The data fall into three classes,
so a histogram is an appropriate representation. The histogram is right-skewed.
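Right skew can also be checked numerically rather than read off the chart. One quick sign is a mean above the median, which the reported figures for total failed requests satisfy (14992.78 against 13450). The sketch below computes a moment-based skewness coefficient, which is positive for right-skewed data. The formula choice (population moments, no small-sample adjustment) is an assumption and may differ in magnitude from a spreadsheet's skewness function, and the inputs are toy values.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (positive means right-skewed)."""
    n = len(values)
    mu = sum(values) / n
    m2 = sum((x - mu) ** 2 for x in values) / n
    m3 = sum((x - mu) ** 3 for x in values) / n
    return m3 / m2 ** 1.5


# Toy data: one large value pulls the tail to the right.
right_tailed = [1, 2, 3, 4, 10]
print(skewness(right_tailed) > 0)
```

With only a handful of weekly observations, a single large value can decide the sign, so the coefficient is best read together with the chart.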
[Chart: Histogram on Total redirected rqst. X-axis: Class (2000 to 6000, 6000 to 10000,
10000 to 14000, 14000 to 18000); Y-axis: Frequency (0 to 6)]
Figure 4 Histogram on total redirected request
Figure 4 shows the histogram of total redirected requests. The X-axis represents
the class and the Y-axis represents the frequency. The data fall into four
classes, so a histogram is an appropriate representation. The histogram is
right-skewed.
[Chart: Bar chart on No. of distinct file rqst. X-axis: Class (10000 to 15000,
15000 to 20000, 20000 to 25000); Y-axis: Frequency (0 to 6)]
Figure 5 Bar chart on number of distinct file request
Figure 5 shows the bar chart of the number of distinct file requests. The X-axis
represents the class and the Y-axis represents the frequency. The data fall into
three classes, so a bar chart is an appropriate representation.
Charts and graphs are important in a visual format because they are easy to
understand, easy to derive and easy to compare.
The standard deviations of the five selected variables are 7463.83, 1822.903,
3202.831, 3052.86 and 3052.245 (Holcomb, 2016).
In the analysis part of this study, the MS-Excel Data Analysis ToolPak was applied
for both the summary statistics and the graphical representation (Cruz, 2013). The
data sheet itself was collected from the web server statistics. MS-Excel is also
well suited to data visualisation for showing different comparisons (Ho & Yu, 2015).
References
Cruz, C. D. (2013). Genes: a software package for analysis in experimental statistics and
quantitative genetics. Acta Scientiarum. Agronomy, 35(3), 271-276.
Ho, A. D., & Yu, C. C. (2015). Descriptive statistics for modern test score distributions:
Skewness, kurtosis, discreteness, and ceiling effects. Educational and Psychological
Measurement, 75(3), 365-388.
Holcomb, Z. C. (2016). Fundamentals of descriptive statistics. Routledge.
Slezák, P., Bokes, P., Námer, P., & Waczulíková, I. (2014). Microsoft Excel add-in for the
statistical analysis of contingency tables. Int J Innovation Educ Res, 2(06), 90-100.