Introduction to Research: Methodology, Design, and Implementation

Added on 2023/06/03

Introduction To Research 1
INTRODUCTION TO RESEARCH
By Name
Course
Instructor
Institution
Location
Date

Table of Contents
Abstract
1.0 Introduction
2.0 Methodology
2.1 Data Sources
2.2 Collection of Data
2.3 Data storage
3.0 Design and Implementation
3.1 Data Pre-Processing
3.2 Feature Selection and Reduction
3.3 Experiment Designing
3.3.1 Detailed Design Steps
3.4 Implementation
3.4.1 Software and Tools
4.0 Result and Analysis
4.1 Result
4.2 Result Summary
5.0 Outline of Experiment and Result Analysis
References

List Of Tables
Table 1 Data Storage Matrix
Table 2 Feature Selection
Table 3 Research Questions
Table 4 Software and Tools
Table 5 Diseases Distribution
List of Figures
Figure 1 Data Pre-Processing
Figure 2 Research Detail Design
Figure 3 Hypertension Distribution
Figure 4 Diabetes Mellitus distribution
Figure 5 HIV/AIDS distribution
Figure 6 Bronchitis Distribution

Abstract
Clinics and other health facilities have collected large data sets about their patients.
Unfortunately, few exploratory studies have mined the big data housed by clinical health
facilities to discover new knowledge. This research paper progressively analyzes large clinical
data sets collected from clinics and uses them to discover new relationships in this field.
Knowledge such as the relationships between medical records can be used strategically to
enhance decision making. The research paper details the methods used in data collection and
the analysis of results, as explained in the subsequent sections.
1.0 Introduction
Despite advances in technology that have led to revolutionary approaches in the health
sector through the automation of most business processes in medicine, few advances have been
made towards the analysis of the enormous data this produces. This is attributed to the use of
rudimentary toolkits which focus on traditional data analysis based on administrative
databases. Such databases provide little insight into the relationships among variables. This
research paper uses a new paradigm of data mining and knowledge discovery to come up with
new and intriguing relationships among the various phenomena in the health sector (Koh and
Tan, 2011). This is made possible by the use of data warehouses and data marts to store
enormous datasets for analysis. The methodology and methods used in this exploratory study
are explained in the section below.

2.0 Methodology
To discover patterns, several data mining methods were used. They include association
among the data sets, classification of data sets into several classes, clustering of similar
records into groups, prediction using machine learning, and the use of decision trees to
identify patterns in the data sets.
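As an illustration of the association method named above, the sketch below computes the support and confidence of a single rule. The records are hypothetical toy data, not the study's data, and the function names `support` and `confidence` are assumptions introduced for this example.

```python
# Hypothetical toy records: each set lists the diagnoses on one patient record.
records = [
    {"hypertension", "diabetes"},
    {"hypertension", "diabetes", "bronchitis"},
    {"hypertension"},
    {"diabetes"},
    {"bronchitis"},
]


def support(itemset, records):
    """Fraction of records that contain every item in `itemset`."""
    hits = sum(1 for record in records if itemset <= record)
    return hits / len(records)


def confidence(antecedent, consequent, records):
    """Estimate of P(consequent | antecedent): support(A and C) / support(A)."""
    return support(antecedent | consequent, records) / support(antecedent, records)


# Rule under test: hypertension -> diabetes
rule_support = support({"hypertension", "diabetes"}, records)
rule_confidence = confidence({"hypertension"}, {"diabetes"}, records)
print(f"support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

A rule with high support and high confidence is a candidate relationship worth examining further; the thresholds to use would depend on the data set.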
2.1 Data Sources
The data being mined was collected from a number of databases in the health sector whose
owners requested anonymity for their data; hence no names shall be given. The DHIS2 platform
used by Australian hospitals and health facilities houses longitudinal patient records covering
the diagnoses patients received and their treatment records. This data formed the primary
source of data for knowledge discovery (Tomar and Agarwal, 2013). The university health
facility was used as a subsidiary source of data, as it was readily available to the researcher
and required less bureaucracy to access and mine. The anonymity of the data was particularly
requested by the authorities of the university (Lynge, Sandegaard and Rebolj, 2011). To keep
this promise, the research does not mention any identifiable information that could be linked
to a particular student at the university. The data sourced was kept in .csv files to enhance
the tracking of the data (Soni et al., 2011). The details of the collection are shown in the
section below.

2.2 Collection of Data
The next phase of data mining is the collection of the data and its conversion into the
format best suited to mining. The health data was collected from the Oracle DB used by the
DHIS2 platform adopted by the hospitals in Australia (Duan, Street, and Xu, 2011). A total of
300000 datasets were collected, and the details are given in the table below.
Data Source | Authorizing organization | Data metadata | Data format | Service charge | Target data source
Data i | Bethseba general hospital | The frequency of patient visits based on gender | Comma-separated values (CSV) | Free | YES
Data ii | Royal Perth Hospital | The frequency of emergency service requested based on gender | Comma-separated values (CSV) | Free | YES
Data iii | Calvary Public Hospital Bruce | The frequency of disease diagnosis based on gender | Comma-separated values (CSV) | Free | YES
Data iv | University Hospital | The frequency of student falling sick based on gender | Comma-separated values (CSV) | Free | YES
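Since each source above is described as a frequency based on gender stored in CSV form, a minimal sketch of deriving such a frequency from a CSV file is given here. The sample content and the column names `visit_id` and `gender` are assumptions for illustration; the actual schema of the collected files is not stated in this paper.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for one collected CSV file. The column
# names `visit_id` and `gender` are assumed, not taken from the source data.
SAMPLE_CSV = """visit_id,gender
1,Male
2,Female
3,Male
4,Male
"""


def visits_by_gender(csv_file):
    """Count rows per value of the (assumed) `gender` column."""
    reader = csv.DictReader(csv_file)
    return Counter(row["gender"] for row in reader)


counts = visits_by_gender(io.StringIO(SAMPLE_CSV))
print(dict(counts))
```

With a real file, `io.StringIO(SAMPLE_CSV)` would be replaced by a file object opened with `newline=""`, as the `csv` module recommends.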
2.3 Data storage
Once the data sources were identified and the data collected, the data should undergo a
structured approach to storage to enhance its integrity and reliability. Proper storage
improves the researcher's access to the data for future reference (Sun and Reddy, 2013). To
achieve this, the researcher set up the matrix below for storing and accessing the data.
Table 1 Data Storage Matrix
Source of Data | Collection Date | File Storage Location | Name of file | File Format | Records Aggregate
Level 5 Hospitals | 10/10/2018 | //draft/data | Research_survey.csv | csv | 35000
Level 4 Hospitals | 5/10/2018 | //draft/data | Research_survey.csv | csv | 25000
University | 15/10/2018 | //draft/data | Research_survey.csv | csv | 30000
Setting up the matrix above ensured the researcher had a logical view of the data being
mined. This is especially important for large record sets kept in data warehouses and data
marts. The researcher used this approach because, at a glance, the exact storage location and
file name can be identified to enhance retrieval and analysis of the data (Mans et al., 2008).
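One way to make the storage matrix usable from analysis scripts is to hold it as a small lookup structure. The sketch below copies the values from Table 1; the dictionary keys and the helper `locate` are hypothetical names introduced for this example, not part of the original study.

```python
# Values copied from Table 1. The key names are assumptions for illustration.
STORAGE_MATRIX = [
    {"source": "Level 5 Hospitals", "collected_on": "10/10/2018",
     "location": "//draft/data", "file_name": "Research_survey.csv",
     "file_format": "csv", "records": 35000},
    {"source": "Level 4 Hospitals", "collected_on": "5/10/2018",
     "location": "//draft/data", "file_name": "Research_survey.csv",
     "file_format": "csv", "records": 25000},
    {"source": "University", "collected_on": "15/10/2018",
     "location": "//draft/data", "file_name": "Research_survey.csv",
     "file_format": "csv", "records": 30000},
]


def locate(source):
    """Return the storage path recorded for a data source."""
    for entry in STORAGE_MATRIX:
        if entry["source"] == source:
            return f"{entry['location']}/{entry['file_name']}"
    raise KeyError(source)


# Aggregate record count across all rows of the matrix.
total_records = sum(entry["records"] for entry in STORAGE_MATRIX)
print(locate("University"), total_records)
```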
Once the storage was done, the data underwent a rigorous processing exercise to discover
insights and knowledge as discussed in the section below.
3.0 Design and Implementation
This phase was quite important, as it was where the key techniques used to discover
knowledge were developed. The larger tasks were divided into smaller sub-tasks to make the
process more manageable and easier for the researcher to adopt and use. This also enhanced
collaboration with peers in gaining insights (Banaee, Ahmed and Loutfi, 2013). The tasks are
explained in the section below.

3.1 Data Pre-Processing
Before processing starts, it is important to reduce the garbage-in, garbage-out
phenomenon, in which poor-quality input collected from the field, a characteristic of most
data collection methods, produces unreliable results. Any inconsistent or incomplete raw data
fetched from the data source was cleaned to remove such anomalies (Malley, Ramazzotti, and
Wu, 2016). Duplicate records were removed at this stage to ensure consistency, while
incomplete records were filtered out so that the data could reliably serve the purpose of
mining and knowledge discovery (Chen et al., 2015).
The steps used to conduct the data pre-processing are shown below.

Figure 1 Data Pre-Processing
The above steps were applied to all the data sets obtained from the field. The next phase
involved the selection of features and the reduction of data dimensions to ensure efficiency
in the mining process.
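The two cleaning operations named in this section, removing duplicates and filtering out incomplete records, can be sketched as follows. This is a minimal illustration on hypothetical rows with assumed field names; it does not reproduce the exact steps of Figure 1.

```python
# Hypothetical raw rows. The field names are assumptions for illustration.
raw_rows = [
    {"patient_id": "P1", "gender": "Male", "diagnosis": "Hypertension"},
    {"patient_id": "P1", "gender": "Male", "diagnosis": "Hypertension"},  # duplicate
    {"patient_id": "P2", "gender": "", "diagnosis": "Bronchitis"},        # incomplete
    {"patient_id": "P3", "gender": "Female", "diagnosis": "TB"},
]


def preprocess(rows):
    """Drop incomplete rows, then drop exact duplicates, preserving order."""
    seen = set()
    cleaned = []
    for row in rows:
        # A row is incomplete if any field is missing or blank.
        if any(value is None or str(value).strip() == "" for value in row.values()):
            continue
        # Build a hashable fingerprint of the row to detect exact duplicates.
        key = tuple(sorted(row.items()))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


clean_rows = preprocess(raw_rows)
print(len(raw_rows), "->", len(clean_rows))
```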
3.2 Feature Selection and Reduction
This phase immediately followed the data pre-processing phase. Here, a subset of the
original data set was extracted. This is very useful in the predictive analytics used to gain
insights into the data. It is made possible by reducing the variables in the data set to the
few which are important in the predictive analytics phase of data mining (Liu and Motoda,
2012). Feature selection served an important purpose: it ensured the researcher worked with
as few variables as possible, to take advantage of analytical methods such as regression.
Feature reduction, on the other hand, involved coming up with a whole new set of variables
after the original ones were reduced (Lyman, Scully and Harrison Jr, 2008). The table below
details the steps.
Table 2 Feature Selection
Date | Source Name | Why Pre-process? | Method Used | Original Records | New Records | New FileName
18/10/2018 | Data i | Feature selection purposes | Use of data integration | 20000 | 10000 | Survey_final.csv
18/10/2018 | Data ii | Cleaning of missing pieces of data | Filtering of data | 15000 | 7000 | Survey_final.csv
18/10/2018 | Data iii | Feature selection | Through data reduction | 20000 | 15000 | Survey_final.csv
18/10/2018 | Data iv | Remove duplicates | Through data reduction | 10000 | 5000 | Survey_final.csv
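The difference between feature selection (keeping a subset of the original variables) and feature reduction (deriving new variables from the originals), as described in this section, can be sketched as below. The columns are hypothetical, the variance filter is one simple selection criterion chosen for illustration rather than the method used in the study, and `pressure_avg` is an illustrative derived variable, not a clinical measure.

```python
from statistics import pvariance

# Hypothetical numeric data set: column name -> one value per record.
columns = {
    "age": [34, 51, 29, 62],
    "visits_per_year": [2, 5, 1, 7],
    "site_code": [1, 1, 1, 1],        # constant, carries no information
    "systolic": [120, 150, 115, 160],
    "diastolic": [80, 95, 75, 100],
}


def select_features(cols, min_variance=0.0):
    """Feature selection: keep only columns whose variance exceeds the threshold."""
    return {name: vals for name, vals in cols.items() if pvariance(vals) > min_variance}


def reduce_features(cols):
    """Feature reduction: replace two related columns with one derived column."""
    reduced = {k: v for k, v in cols.items() if k not in ("systolic", "diastolic")}
    # Simple average of the two readings, used only as an example of a new variable.
    reduced["pressure_avg"] = [
        (s + d) / 2 for s, d in zip(cols["systolic"], cols["diastolic"])
    ]
    return reduced


selected = select_features(columns)
reduced = reduce_features(selected)
print(sorted(selected))
print(sorted(reduced))
```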
Once the features were selected and/or reduced, the final experiment was designed as
discussed in the section below.
3.3 Experiment Designing
To enable the researcher to test the hypothesis, the researcher came up with a blueprint
that details the procedures and processes used, in order to provide a conceptual framework
within which the research would be carried out (Taylor, 2009). The design includes the
detailed steps used in carrying out the research and the various research questions which
formed the basis of the hypothesis for the research (Tsai et al., 2014). The detailed steps
are explained below.

3.3.1 Detailed Design Steps
The research design gives the researcher a work plan for how the research shall be
carried out. This enables the researcher to stick to the given schedule when executing the
research. The detailed design is shown below.

Figure 2 Research Detail Design

The design includes the survey questions below, which were used in the questionnaire and
the interviews conducted. The questions included:
Table 3 Research Questions
S/No | Question
Question 1 | How many diagnoses do you receive in a day, grouped by gender?
Question 2 | Which age group makes more frequent hospital visits?
Question 3 | Which disease is most prevalent among age groups?
Question 4 | How do you rate customer satisfaction with your services?
The questions sought to find out the demographic characteristics of the patients at
various health facilities and institutions. This is key in making strategic decisions. Once
the research blueprint was done, the actual implementation was conducted as described in the
subsequent sections.
3.4 Implementation
The actual data mining was done in this phase. This section describes the various tools,
platforms, and software that the researcher used in conducting the research and data mining.
The aim was to gain more insight into the data collected so far and to come up with patterns
which were not identifiable before the mining (Srinivas, Rao and Govardhan, 2010). The
software used is detailed below.
3.4.1 Software and Tools
Software and tools make the whole process of data mining easier, as most of the data
analysis is done using complex algorithms already implemented in the software. The table
below shows the software and algorithms used in mining the data sets.

Table 4 Software and Tools
4.0 Result and Analysis
Once the experiment was done, the data obtained from the field was fed into the analysis
software for qualitative and quantitative analysis. This ensured the mining was thorough in
its conclusions. The following results were found from the mining.

4.1 Result
The results were obtained by putting the data into the analytics software to gain more
insights. The following results were achieved.
4.1.1 Disease Distribution by Gender
The following table shows the number of cases of each disease, aggregated for males and
females.
Table 5 Diseases Distribution
Disease | Male | Female
Hypertension | 500 | 350
Diabetes Mellitus | 2000 | 780
HIV/AIDS | 5000 | 3500
Bronchitis | 10000 | 1200
TB | 1300 | 800
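The totals and the male share of cases implied by Table 5 can be worked out with a few lines of arithmetic. The sketch below uses only the counts from the table; the names `cases`, `male_share`, and `summary` are introduced for this example.

```python
# Case counts copied from Table 5: disease -> (male, female).
cases = {
    "Hypertension": (500, 350),
    "Diabetes Mellitus": (2000, 780),
    "HIV/AIDS": (5000, 3500),
    "Bronchitis": (10000, 1200),
    "TB": (1300, 800),
}


def male_share(male, female):
    """Fraction of the recorded cases of a disease that are male."""
    return male / (male + female)


summary = {
    disease: {"total": m + f, "male_share": round(male_share(m, f), 3)}
    for disease, (m, f) in cases.items()
}

for disease, row in summary.items():
    print(f"{disease}: total={row['total']}, male share={row['male_share']:.1%}")
```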
The distribution is visualized in the figures below.
Figure 3 Hypertension Distribution (series: Male, Female)
Figure 4 Diabetes Mellitus distribution (series: Male, Female)
Figure 5 HIV/AIDS distribution (series: Male, Female)
Figure 6 Bronchitis Distribution (series: Male, Female)
4.2 Result Summary
From the above results, men account for the majority of recorded cases of each of the
five diseases. This may contribute to hospitals in Australia registering more male deaths
than female deaths, although Table 5 records cases rather than deaths. Most of the deaths
are attributed to Diabetes Mellitus, a metabolic disorder of blood sugar regulation associated
with poor diet and old age. Although HIV/AIDS is a pandemic, most of the Australian
population did not register a high risk of new infections, as the data generally shows a low
number of infections among both males and females. Hypertension is mainly prevalent among
older men, among whom high blood pressure and conditions of the cardiac muscles have wreaked
havoc. This is attributed to old age and, among other things, the stress which men put
themselves under as they provide for the livelihood of their families.

5.0 Outline of Experiment and Result Analysis
The outline of the research gives the researcher a clearer view of the various sub-phases
of data mining. This makes it easier to add more phases as the research progresses. The
outline is shown in the table below.
Section Sub-section
Abstract
Introduction
Method
2.0 Data collection
2.1 Data sources
2.2 Collection Of Data
2.3 Data storage
3.0 Design and Implementation
3.1 Data Pre-Processing
3.2 Feature Selection and Reduction
3.3 Experiment Designing
3.4 Implementation
4.0 Result and Analysis
4.1 Result Estimation
4.2 Result Summary
5.0 Outline Of Experiment and Result Analysis

References
Banaee, H., Ahmed, M.U. and Loutfi, A., 2013. Data mining for wearable sensors in health
monitoring systems: a review of recent trends and challenges. Sensors, 13(12), pp.17472–17500.
Chen, F., Deng, P., Wan, J., Zhang, D., Vasilakos, A.V. and Rong, X., 2015. Data mining for the
internet of things: literature review and challenges. International Journal of Distributed Sensor
Networks, 11(8), p.431047.
Duan, L., Street, W.N. and Xu, E., 2011. Healthcare information systems: data mining methods
in the creation of a clinical recommender system. Enterprise Information Systems, 5(2), pp.169–
181.
Koh, H.C. and Tan, G., 2011. Data mining applications in healthcare. Journal of healthcare
information management, 19(2), p.65.
Liu, H. and Motoda, H., 2012. Feature selection for knowledge discovery and data mining.
Springer Science & Business Media.
Lyman, J.A., Scully, K. and Harrison Jr, J.H., 2008. The development of health care data
warehouses to support data mining. Clinics in laboratory medicine, 28(1), pp.55–71.
Lynge, E., Sandegaard, J.L. and Rebolj, M., 2011. The Danish national patient register.
Scandinavian journal of public health, 39(7_suppl), pp.30–33.
Malley, B., Ramazzotti, D. and Wu, J.T., 2016. Data Pre-processing. In: Secondary Analysis of
Electronic Health Records. Springer, pp.115–141.
Mans, R.S., Schonenberg, M.H., Song, M., van der Aalst, W.M. and Bakker, P.J., 2008.
Application of process mining in healthcare–a case study in a dutch hospital. In: International
joint conference on biomedical engineering systems and technologies. Springer, pp.425–438.
Soni, J., Ansari, U., Sharma, D. and Soni, S., 2011. Predictive data mining for medical diagnosis:
An overview of heart disease prediction. International Journal of Computer Applications, 17(8),
pp.43–48.
Srinivas, K., Rao, G.R. and Govardhan, A., 2010. Analysis of coronary heart disease and
prediction of heart attack in coal mining regions using data mining techniques. In: Computer
Science and Education (ICCSE), 2010 5th International Conference on. IEEE, pp.1344–1349.
Sun, J. and Reddy, C.K., 2013. Big data analytics for healthcare. In: Proceedings of the 19th
ACM SIGKDD international conference on Knowledge discovery and data mining. ACM,
pp.1525–1525.

Taylor, K.E., 2009. A summary of the CMIP5 experiment design.
http://cmip-pcmdi.llnl.gov/cmip5/docs/Taylor_CMIP5_design.pdf.
Tomar, D. and Agarwal, S., 2013. A survey on Data Mining approaches for Healthcare.
International Journal of Bio-Science and Bio-Technology, 5(5), pp.241–266.
Tsai, C.-W., Lai, C.-F., Chiang, M.-C. and Yang, L.T., 2014. Data mining for Internet of Things:
A survey. IEEE Communications Surveys and Tutorials, 16(1), pp.77–97.
