Analyzing Tax Lodgement Methods Using Statistical Techniques in Excel

Verified

Added on 2020/05/28

AI Summary

This assignment involves a comprehensive data analysis project using Excel functions. Students are tasked with calculating the frequency of agent and self tax lodgements using COUNTIF formulas, generating pie charts, and computing proportions. They will perform one-sample and two-sample Z-tests to compare proportions, calculate correlations between age range and lodging methods, and use Chi-square tests to assess associations between tax agents and age groups. Additionally, students will analyze total income amounts through descriptive statistics, regression analysis, and outlier detection, providing a thorough exploration of statistical techniques applied in Excel.

Statistics and Data Analysis
Name of the Student:
Name of the University:
Raw/Rough sheet of the assignment

Paraphrase This Document

Need a fresh take? Get an instant paraphrase of this document with our AI Paraphraser

Lodgement classification:
The frequencies of agent lodgement is found by the excel formula: “=COUNTIF(C2:C1001,
"=1")”. The frequencies of self lodgement is found as: “=COUNTIF(C2:C1001, "=2")”.
Proportions are counted by each count/total frequency. By clicking “Insert”, then “Pie”, pie chart
is generated. The axes, borders are decorated by right clicking “Format chart area”.
In case of proportional Z-test table, proportion (p) and total samples are manually inputted.
Standard error is Square root of p*(1-p)/total sample. Level of significance and confidence limits
are manually declared. Z-critical is calculated as =NORMSINV(1-(1-95%)/2). The upper and
lower confidence intervals are calculated as- p ±z*standard error.
The correlation between age-range and lodgement-method is calculated as
“CORREL(datasetagerange, datasetlodgementmethod)” function.
Lodgment classification2):
The frequency table, pie chart and one-sample proportional z-test is generated by same technique
as the previous dataset.
Two-sample Z-test:
It is incorporated to find the equality of proportions of two types of lodgement methods whose
lodgement type is by agent. In this table, for two samples we manually input total number of
samples, total frequency whose lodging is by agent, lodging proportions of agent lodgement.
Then we calculate total sample and total lodging by summing up the previous two datasets. Next,
we calculate total proportion by – total agent/total population. Now, for the numerator of Z-

statistic, we calculate the difference of proportions of first two datasets. Then we calculate, total
propotion*(1-total proportion).
Then for denominator, we calculate, A = proportion*[(1/total sample of first dataset) + (1/total
sample of second dataset)]. The denominator is square root of A.
Finally, Z-statistic is calculated as, numerator/denominator.
The p-value for Z-statistic is found 0.15386, from online Z-score table. The p-value is greater
than 0.05. Hence null hypothesis of equality of means is granted with, “IF” function formula in
excel.
Lodgment method and age-group:
In the first table, for age-range 0 to 11, we calculated the frequency for each case by
“COUNTIF(dataset, condition)” formula in excel. The frequencies for both self-preparer and Tax
agent are calculated. The total is calculated by summing up both types of tax lodgement. The bar
graph is produced by selecting these two dataset, click “Insert”, click “column chart”.
For establishing the association between Tax agent and age-group, Chi-square test is necessary.
We create a new table and use the function “MMULT(self data, total data)/total sample” for
percentage calculation of expected frequency. The “MMULT(agent data, total data)/total
sample” is also apprehended. Matrix multiplication is done by this formula.
Again a new table is generated for having final frequency table.
“Power(actual-expected)/expected” function is used for this purpose. This is done for both types
of lodgement. Total lodgements are calculated after summing self and tax again.

In a Chi-square table, critical value = 0.05, degrees of freedom =10 (freq. of age-groups -1), Chi-
square = 54.617 = Sum of total lodgements of final table.
p-value of chi-square statistic, “CHIDIST(chi-sq value, d.f.)”. The critical chi-square value is
calculated by given critical region and 10 degrees of freedom = 18.307. As, 54.617>18.307,
significance established. Therefore, the independence of these two categorical variables is
established.
Lodgment method and Total income Amount:
The average and standard deviation as the descriptive statistics is calculated for total amount of
income with the formula “AVERAGE” and “STANDARD DEVIATION”. The average total
income is plotted in bar chart by clicking “Insert” and then “Bar chart” option.
The descriptive statistics is calculated by installing “Analysis Toolpack” and then clicking “Data
analysis”>”Descriptive statistics”. The different total incomes are found by “Filter”-ing
lodgement method.
We generate location measures by “QUARTILE” option. IQR (Inter quartile range) = 3rd
Quartile – 1st Quartile. Upper and lower 95% C.I.s are calculated as, mean±1.96*standard error.
The outlier levels are calculated as, Median ± 1.5*IQR. Finally the number of outliers is
calculated by “COUNIF(data set, condition of outliers)” function in excel.
Total Income Amount and Deduction Amount:
Linear regression is used. Income amount is independent variable and deduction amount is
dependent variable. We click “Data analysis”>”Regression” options. For agent type income
amount, we put amount of income in X variable range, deduction of amount in Y variables

Paraphrase This Document

Need a fresh take? Get an instant paraphrase of this document with our AI Paraphraser

range, select confidence interval and level of significance and make “level” true. The toolpack
generates the regression of these two variables.
A scatter plot is after selecting both quantitative variables, clicking “insert”, then “Scatter” and
then selecting a preferable plot. By right clicking on the graph, we add a trend line and trend
equation on the graph by “Add trendline”, “linear”, “Display equation on chart”, “Display R-
squared value on chart” options.
The same procedure is repeated for “Self preparer” income and deduction amounts. Finally, we
have drawn the conclusions about overall findings.