Regression Analysis: Big Data Use in Healthcare Data Assessment


Added on  2023/06/09

Report
AI Summary
This report outlines the steps taken to perform a regression analysis on healthcare data using big data. The study begins by selecting three independent variables—deviation of policies in healthcare, big data analytics, and performance of big data application—and one dependent variable, big data. Survey questions based on these variables were distributed to a sample of 50 employees, with responses collected on a 5-point Likert scale. The collected data was then analyzed using regression analysis to determine the validity of the null hypothesis (Big Data is not useful for the assessment of healthcare data) versus the alternative hypothesis (Big Data is useful for the assessment of healthcare data). The process involved using Excel's data analysis tools, inputting X and Y ranges, and interpreting the output, including ANOVA test results, coefficient ranges, and adjusted R-square values. The analysis led to the rejection of the null hypothesis and acceptance of the alternative hypothesis, concluding that big data is indeed useful for the assessment of healthcare data, based on p-values less than 0.05.
Document Page
Steps to perform regression analysis:
Regression analysis is performed to determine the relationship between two or more
independent variables and one dependent variable. The steps are as follows:
1. First, based on the selected research questions, we selected three independent
variables for this study: deviation of policies in healthcare, big data analytics and
performance of big data application. The dependent variable is big data.
2. Based on the three independent variables, we prepared survey questions. The
questions were set up on a 5-point Likert scale on which the respondents gave their
opinion.
3. The questions were then distributed to employees; the selected sample size was 50.
4. After collecting the data from the respondents, we prepared a data sheet. We
replaced strongly agree with 1, agree with 2, neutral with 3, disagree with 4 and
strongly disagree with 5.
5. After that, we performed the regression analysis. From this test we were able to
determine whether the null or the alternative hypothesis was the correct one.
6. We found the median of each independent variable by entering the median formula
in the data analysis sheet, where the results are shown.
7. To run the regression, we first clicked the Data Analysis command button on the Data
tab. When Excel displayed the Data Analysis dialog box, we selected Regression from the
list of analysis tools and clicked OK. The following screenshot shows the dialog box.
8. We then identified the X and Y values. We used the Input Y Range text box to identify
the worksheet range holding the dependent variable, and the Input X Range text box to
identify the worksheet range holding the independent variables. We entered
$O$2:$Q$52 into Input X Range and $R$2:$R$52 into Input Y Range.
9. We checked the Labels check box. To place the regression results in a range of
the existing worksheet, one selects the Output Range radio button and enters the
range address in the Output Range text box. We instead selected New Worksheet Ply
to place the results on another worksheet in the same Excel file. From the Residuals
check boxes we chose which residual results to return as part of the regression
analysis, checking the Residuals check box, and then clicked OK.
10. The output appeared on the new worksheet and provided the summary output, the
ANOVA test and the coefficient ranges. The first range holds the basic regression
statistics, such as the R-square value, the standard error and the number of
observations. The regression tool also supplied analysis of variance (ANOVA) data,
including the degrees of freedom, sum of squares, mean square value and
Significance F. Finally, the regression tool supplied information about the regression
line calculated from the data, including the coefficient, standard error, t-stat and
probability values for the intercept and for each independent variable.
11. The adjusted R-square was used to judge how much of the variation in the
dependent variable is explained by the variables in the model. The ANOVA test
was used to decide whether to reject the null hypothesis. A p-value of less than
0.05 means that we must reject the null hypothesis and accept the alternative
hypothesis.
12. The p-values and coefficients work together in regression analysis to show
which relationships in the model are statistically significant.
13. The null hypothesis and alternative hypothesis are:
Null hypothesis (H0): Big Data is not useful for the assessment of the healthcare data
Alternative hypothesis (H1): Big Data is useful for the assessment of the healthcare
data
14. Since all the p-values for the hypothesis, as indicated in the ANOVA tables,
were less than 0.05, we reject the null hypothesis and accept the alternative
hypothesis. The data analysis therefore shows that
“Big Data is useful for the assessment of the healthcare data”.
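The steps above were carried out in Excel. For readers who want to check the same quantities outside Excel, here is a minimal Python sketch of the Likert coding (step 4), the median (step 6) and the regression statistics (steps 10 and 11). It uses the standard library only. The function names (`encode`, `solve`, `fit_ols`) and the toy numbers in the demo are hypothetical; they are not taken from the report or its data sheet. The sketch solves the normal equations directly, so it assumes the independent variables are not perfectly collinear and that the fit is not perfect.

```python
from statistics import median

# Coding from step 4: 1 = strongly agree ... 5 = strongly disagree.
LIKERT_CODES = {
    "strongly agree": 1,
    "agree": 2,
    "neutral": 3,
    "disagree": 4,
    "strongly disagree": 5,
}


def encode(responses):
    """Map Likert labels to the numeric codes used in the data sheet."""
    return [LIKERT_CODES[r.strip().lower()] for r in responses]


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(xs, y):
    """Least-squares fit of y on the columns of xs, with an intercept.

    Returns (coefficients with intercept first, R-square,
    adjusted R-square, F statistic).
    """
    n, k = len(y), len(xs[0])
    design = [[1.0] + [float(v) for v in row] for row in xs]
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k + 1)]
           for i in range(k + 1)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k + 1)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in design]
    y_mean = sum(y) / n
    ss_total = sum((yi - y_mean) ** 2 for yi in y)
    ss_resid = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    r2 = 1 - ss_resid / ss_total
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    f_stat = ((ss_total - ss_resid) / k) / (ss_resid / (n - k - 1))
    return beta, r2, adj_r2, f_stat


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only (not the survey data).
    codes = encode(["Strongly Agree", "Neutral", "Disagree"])
    print("codes:", codes, "median:", median(codes))
    beta, r2, adj_r2, f_stat = fit_ols([[1], [2], [3], [4]], [1, 3, 2, 4])
    print("coefficients:", beta)
    print("R-square:", r2, "adjusted R-square:", adj_r2, "F:", f_stat)
```

The sketch stops at the F statistic. Excel's Significance F is the upper-tail probability of that statistic under an F distribution with k and n - k - 1 degrees of freedom; a statistics library can supply it, for example `scipy.stats.f.sf(f_stat, k, n - k - 1)` if SciPy is available. That value is the one compared against 0.05 in step 11.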