Introduction to Research: Methodology, Design, and Implementation
Added on 2023/06/03
This research paper details the methods used in data collection and analysis of results as explained in the subsequent sections. It covers the methodology, design, and implementation of the research process.
INTRODUCTION TO RESEARCH
By Name
Course
Instructor
Institution
Location
Date
Table of Contents
Abstract
1.0 Introduction
2.0 Methodology
2.1 Data Sources
2.2 Collection of Data
2.3 Data Storage
3.0 Design and Implementation
3.1 Data Pre-Processing
3.2 Feature Selection and Reduction
3.3 Experiment Designing
3.3.1 Detailed Design Steps
3.4 Implementation
3.4.1 Software and Tools
4.0 Result and Analysis
4.1 Result
4.2 Result Summary
5.0 Outline of Experiment and Result Analysis
References
List of Tables
Table 1 Data Storage Matrix
Table 2 Feature Selection
Table 3 Research Questions
Table 4 Software and Tools
Table 5 Diseases Distribution

List of Figures
Figure 1 Data Pre-Processing
Figure 2 Research Detail Design
Figure 3 Hypertension Distribution
Figure 4 Diabetes Mellitus Distribution
Figure 5 HIV/AIDS Distribution
Figure 6 Bronchitis Distribution
Abstract
Clinics and other health facilities have collected large data sets about their patients. Unfortunately, few exploratory studies have mined the big data housed by clinical health facilities to discover new knowledge. This research paper analyzes large clinical data sets collected from clinics and uses them to discover new relationships in this field. Knowledge such as the relationships between medical records can be used strategically to enhance decision making. The paper details the methods used in data collection and in the analysis of results, as explained in the subsequent sections.

1.0 Introduction
Despite advances in technology that have revolutionized the health sector by automating most business processes in medicine, few advances have been made in analyzing the enormous volume of data this produces. This is attributed to the use of a rudimentary toolkit that focuses on traditional data analysis of administrative databases. Such databases offer little insight into the relationships among variables. This research paper uses the paradigm of data mining and knowledge discovery to uncover new relationships among various phenomena in the health sector (Koh and Tan, 2011). This is made possible by using data warehouses and data marts to store enormous data sets for analysis. The methodology and methods used in this exploratory study are explained in the section below.
2.0 Methodology
To discover patterns, several data mining methods were used: association among the data sets, classification of the data sets into classes, clustering of similar records into groups, prediction using machine learning, and decision trees to identify patterns in the data sets.

2.1 Data Sources
The data being mined was collected from a number of databases in the health sector whose owners requested anonymity, so no names are given. The DHIS2 platform used by Australian hospitals and health facilities houses longitudinal patient records covering the diagnoses patients received and their treatment records. This data formed the primary source for knowledge discovery (Tomar and Agarwal, 2013). The university health facility was used as a subsidiary source because its data was readily available to the researcher and required less bureaucracy to access and mine. The university authorities specifically requested that the data remain anonymous (Lynge, Sandegaard and Rebolj, 2011). To keep that promise, this research does not mention any identifiable information that could be linked to a particular student at the university. The sourced data was kept in formal .csv files to make the data easier to track (Soni et al., 2011). The details of the collection are given in Section 2.2.
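Of the methods listed in Section 2.0, association is the simplest to illustrate. The sketch below is not the study's implementation; it only shows how the support and confidence of a rule such as "gender = Male implies diagnosis = Bronchitis" could be computed. The records, field order, and the function name `rule_metrics` are assumptions made for illustration and are not taken from the study data.

```python
# Hypothetical anonymized records as (gender, diagnosis) pairs.
# These values are illustrative only; they are not the study's data.
records = [
    ("Male", "Bronchitis"),
    ("Male", "Bronchitis"),
    ("Male", "Hypertension"),
    ("Female", "Bronchitis"),
    ("Female", "Diabetes Mellitus"),
    ("Male", "Bronchitis"),
]


def rule_metrics(records, antecedent, consequent):
    """Return (support, confidence) for the rule
    gender == antecedent -> diagnosis == consequent."""
    total = len(records)
    with_antecedent = sum(1 for gender, _ in records if gender == antecedent)
    with_both = sum(
        1
        for gender, diagnosis in records
        if gender == antecedent and diagnosis == consequent
    )
    # Support: share of all records that satisfy both sides of the rule.
    support = with_both / total if total else 0.0
    # Confidence: share of antecedent records that also satisfy the consequent.
    confidence = with_both / with_antecedent if with_antecedent else 0.0
    return support, confidence


support, confidence = rule_metrics(records, "Male", "Bronchitis")
print(f"support={support:.2f}, confidence={confidence:.2f}")
```

A rule with high support and high confidence is a candidate relationship worth examining further; neither measure on its own establishes a causal link.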
2.2 Collection of Data
The next phase of data mining is collecting the data and converting it into the format best suited to mining. The health data was collected from the Oracle DB used by the DHIS2 platform adopted by hospitals in Australia (Duan, Street, and Xu, 2011). A total of 300000 datasets was collected, as detailed in the table below.

| Data Source | Authorizing organization | Data metadata | Data format | Service charge | Target data source |
|---|---|---|---|---|---|
| Data i | Bethseba general hospital | The frequency of patient visits based on gender | Comma-separated values (CSV) | Free | YES |
| Data ii | Royal Perth Hospital | The frequency of emergency services requested based on gender | Comma-separated values (CSV) | Free | YES |
| Data iii | Calvary Public Hospital Bruce | The frequency of disease diagnosis based on gender | Comma-separated values (CSV) | Free | YES |
| Data iv | University Hospital | The frequency of students falling sick based on gender | Comma-separated values (CSV) | Free | YES |

2.3 Data Storage
Once the data sources were identified and the data collected, the data needed a structured approach to storage to preserve its integrity and reliability. Proper storage eases the researcher's access to the data for future reference (Sun and Reddy, 2013). To achieve this, the researcher set up the matrix below for storing and accessing the data.

Table 1 Data Storage Matrix

| Source of Data | Collection Date | File Storage Location | Name of File | File Format | Records Aggregate |
|---|---|---|---|---|---|
| Level 5 Hospitals | 10/10/2018 | //draft/data | Research_survey.csv | CSV | 35000 |
| Level 4 Hospitals | 5/10/2018 | //draft/data | Research_survey.csv | CSV | 25000 |
| University | 15/10/2018 | //draft/data | Research_survey.csv | CSV | 30000 |

Setting up the matrix above gave the researcher a logical view of the data being mined. This is especially important for large record sets kept in data warehouses and data marts. With this approach, the exact storage location and file name can be identified at a glance, which eases retrieval and analysis of the data (Mans et al., 2008). Once the data was stored, it underwent a rigorous processing exercise to discover insights and knowledge, as discussed in the section below.

3.0 Design and Implementation
This phase was important because it is where the key techniques for discovering knowledge were developed. The larger tasks were divided into smaller sub-tasks to make the process more manageable and easier for the researcher to adopt and use. This also enhanced collaboration with peers in gaining insights (Banaee, Ahmed and Loutfi, 2013). The tasks are explained in Sections 3.1 to 3.4.
3.1 Data Pre-Processing
Before processing starts, it is important to limit the garbage-in, garbage-out phenomenon: poor-quality input, which is characteristic of most data collection methods, produces poor-quality results. Inconsistent and incomplete raw data fetched from the data source was cleaned to remove such anomalies (Malley, Ramazzotti, and Wu, 2016). Duplicate records were removed at this stage to ensure consistency, while incomplete records were filtered out so that the data could reliably serve the purpose of mining and knowledge discovery (Chen et al., 2015). The steps used to conduct the data pre-processing are shown in Figure 1.
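The two cleaning operations described above, filtering out incomplete records and removing duplicates, can be sketched with the Python standard library. This is a minimal illustration under stated assumptions, not the tooling used in the study: the column names (`patient_id`, `gender`, `diagnosis`), the sample rows, and the function name `preprocess` are all hypothetical.

```python
import csv
import io

# Hypothetical raw extract; column names and rows are illustrative assumptions.
RAW_CSV = """patient_id,gender,diagnosis
1,Male,Bronchitis
1,Male,Bronchitis
2,Female,
3,,Hypertension
4,Female,Diabetes Mellitus
"""


def preprocess(rows, required_fields):
    """Drop incomplete rows, then drop exact duplicates, preserving order."""
    seen = set()
    cleaned = []
    for row in rows:
        # A record is incomplete if any required field is missing or blank.
        if any(not (row.get(field) or "").strip() for field in required_fields):
            continue
        # Two records are duplicates if all required fields match.
        key = tuple(row[field].strip() for field in required_fields)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


rows = list(csv.DictReader(io.StringIO(RAW_CSV)))
cleaned = preprocess(rows, ["patient_id", "gender", "diagnosis"])
print(len(rows), "raw rows ->", len(cleaned), "clean rows")
```

Filtering before de-duplicating keeps the `seen` set free of partial records; in a real extract the definition of "duplicate" (all fields versus a key subset) would need to be agreed with the data owners.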
Figure 1 Data Pre-Processing

The above steps were applied to all the data sets obtained from the field. The next phase involved selecting features and reducing the data dimension to make the mining process efficient.

3.2 Feature Selection and Reduction
This phase immediately followed data pre-processing. Here a subset of the original data set was extracted. This is very useful for the predictive analytics used to gain insights into the data. It is made possible by reducing the variables in the data set to the few that matter in the predictive analytics phase of data mining (Liu and Motoda, 2012). Feature selection ensured that the researcher worked with as few variables as possible, to take advantage of analytical methods such as regression. Feature reduction, on the other hand, involved deriving a whole new set of variables from the original ones (Lyman, Scully and Harrison Jr, 2008). The table below details the steps.

Table 2 Feature Selection

| Date | Source Name | Why Pre-process? | Method Used | Original Records | New Records | New File Name |
|---|---|---|---|---|---|---|
| 18/10/2018 | Data i | Feature selection purposes | Use of data integration | 20000 | 10000 | Survey_final.csv |
| 18/10/2018 | Data ii | Cleaning of missing pieces of data | Filtering of data | 15000 | 7000 | Survey_final.csv |
| 18/10/2018 | Data iii | Feature selection | Through data reduction | 20000 | 15000 | Survey_final.csv |
| 18/10/2018 | Data iv | Remove duplicates | Through data reduction | 10000 | 5000 | Survey_final.csv |

Once the features were selected and/or reduced, the final experiment was designed as discussed in the section below.

3.3 Experiment Designing
To test the hypothesis, the researcher came up with a blueprint detailing the procedures and processes used, to provide a conceptual framework within which the research would be carried out (Taylor, 2009). The design includes the detailed steps used in carrying out the research and the research questions that formed the basis of the hypothesis (Tsai et al., 2014). The detailed steps are explained in Section 3.3.1.
3.3.1 Detailed Design Steps
The research design gives the researcher a work plan for how the research is to be carried out. This enables the researcher to stick to the schedule for executing the research. The detailed design is shown below.
Figure 2 Research Detail Design
The design includes the survey questions below, which were used in the questionnaire and in the interviews conducted.

Table 3 Research Questions

| S/No | Question |
|---|---|
| Question 1 | How many diagnoses do you receive in a day, grouped by gender? |
| Question 2 | Which age group makes the most frequent hospital visits? |
| Question 3 | Which disease is most prevalent among the age groups? |
| Question 4 | How do you rate customer satisfaction with your services? |

The questions sought to find out the demographic characteristics of patients at the various health facilities and institutions. This is key to making strategic decisions. Once the research blueprint was done, the actual implementation was conducted as described in the subsequent sections.

3.4 Implementation
The actual data mining was done in this phase. This section describes the tools, platforms, and software that the researcher used in conducting the research and data mining. The aim is to gain more insight into the data collected so far and to find patterns that were not identifiable before the mining (Srinivas, Rao and Govardhan, 2010). The software used is detailed below.

3.4.1 Software and Tools
Software and tools make the whole process of data mining easier and more enjoyable, as most of the data analysis is done using complex algorithms already implemented in the software. The table below shows the software and algorithms used in mining the data sets.
Table 4 Software and Tools

4.0 Result and Analysis
Once the experiment was done, the data obtained from the field was put into the analysis software for qualitative and quantitative analysis. This ensured that the mining was thorough in its conclusions. The following results were found from the mining.
4.1 Result
The results were obtained after putting the data into the analytics software to gain more insight. The following results were achieved.

4.1.1 Disease Distribution Among Gender
The following table shows the number of cases of each disease, aggregated for males and females.

Table 5 Diseases Distribution

| Disease | Male | Female |
|---|---|---|
| Hypertension | 500 | 350 |
| Diabetes Mellitus | 2000 | 780 |
| HIV/AIDS | 5000 | 3500 |
| Bronchitis | 10000 | 1200 |
| TB | 1300 | 800 |

The distribution is visualized in the figures below.
[Chart: Hypertension, Male vs. Female]
Figure 3 Hypertension Distribution

[Chart: Diabetes Mellitus, Male vs. Female]
Figure 4 Diabetes Mellitus Distribution
[Chart: HIV/AIDS, Male vs. Female, axis from 0 to 6000]
Figure 5 HIV/AIDS Distribution
[Chart: Bronchitis, Male vs. Female, axis from 0 to 12000]
Figure 6 Bronchitis Distribution

4.2 Result Summary
From the above results, it is clear that men account for more recorded cases of each of the five diseases. This may explain why hospitals in Australia register more male than female deaths. Most of the deaths are attributed to Diabetes Mellitus, a metabolic disease associated with poor diet and old age. Although HIV/AIDS is a pandemic, most of the Australian population did not register a high risk of new infections, as the data is read as showing a generally low number of infections among both males and females. Hypertension is mainly prevalent among older men, in whom high blood pressure and related heart conditions have wreaked havoc. This is attributed to old age and, among other things, to the stress men experience in providing for the livelihood of their families.
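The summary's central claim, that men account for more cases of every disease, can be checked directly against Table 5. The sketch below copies the case counts from that table; the function name `male_share` and the percentage formatting are illustrative choices, not part of the study's analysis software.

```python
# Case counts copied from Table 5 as (male, female).
TABLE_5 = {
    "Hypertension": (500, 350),
    "Diabetes Mellitus": (2000, 780),
    "HIV/AIDS": (5000, 3500),
    "Bronchitis": (10000, 1200),
    "TB": (1300, 800),
}


def male_share(male, female):
    """Fraction of a disease's recorded cases that are male."""
    return male / (male + female)


shares = {disease: male_share(m, f) for disease, (m, f) in TABLE_5.items()}
total_male = sum(m for m, _ in TABLE_5.values())
total_female = sum(f for _, f in TABLE_5.values())

for disease, share in shares.items():
    print(f"{disease}: {share:.1%} male")
print("Total cases:", total_male, "male,", total_female, "female")
```

Every share is above one half, which matches the statement that men lead in all five diseases. The table records case counts only, so it does not by itself support conclusions about deaths.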
5.0 Outline of Experiment and Result Analysis
The outline gives the researcher a clearer view of the various sub-phases of data mining. This makes it easier to add more phases as the research progresses. The outline is shown in the table below.

| Section | Sub-section |
|---|---|
| Abstract | |
| Introduction | |
| Method | 2.0 Data collection; 2.1 Data sources; 2.2 Collection of Data; 2.3 Data storage |
| 3.0 Design and Implementation | 3.1 Data Pre-Processing; 3.2 Feature Selection and Reduction; 3.3 Experiment Designing; 3.4 Implementation |
| 4.0 Result and Analysis | 4.1 Result Estimation; 4.2 Result Summary |
| 5.0 Outline of Experiment and Result Analysis | |
References
Banaee, H., Ahmed, M.U. and Loutfi, A., 2013. Data mining for wearable sensors in health monitoring systems: a review of recent trends and challenges. Sensors, 13(12), pp.17472–17500.
Chen, F., Deng, P., Wan, J., Zhang, D., Vasilakos, A.V. and Rong, X., 2015. Data mining for the internet of things: literature review and challenges. International Journal of Distributed Sensor Networks, 11(8), p.431047.
Duan, L., Street, W.N. and Xu, E., 2011. Healthcare information systems: data mining methods in the creation of a clinical recommender system. Enterprise Information Systems, 5(2), pp.169–181.
Koh, H.C. and Tan, G., 2011. Data mining applications in healthcare. Journal of healthcare information management, 19(2), p.65.
Liu, H. and Motoda, H., 2012. Feature selection for knowledge discovery and data mining. Springer Science & Business Media.
Lyman, J.A., Scully, K. and Harrison Jr, J.H., 2008. The development of health care data warehouses to support data mining. Clinics in laboratory medicine, 28(1), pp.55–71.
Lynge, E., Sandegaard, J.L. and Rebolj, M., 2011. The Danish national patient register. Scandinavian journal of public health, 39(7_suppl), pp.30–33.
Malley, B., Ramazzotti, D. and Wu, J.T., 2016. Data Pre-processing. In: Secondary Analysis of Electronic Health Records. Springer, pp.115–141.
Mans, R.S., Schonenberg, M.H., Song, M., van der Aalst, W.M. and Bakker, P.J., 2008. Application of process mining in healthcare – a case study in a dutch hospital. In: International joint conference on biomedical engineering systems and technologies. Springer, pp.425–438.
Soni, J., Ansari, U., Sharma, D. and Soni, S., 2011. Predictive data mining for medical diagnosis: An overview of heart disease prediction. International Journal of Computer Applications, 17(8), pp.43–48.
Srinivas, K., Rao, G.R. and Govardhan, A., 2010. Analysis of coronary heart disease and prediction of heart attack in coal mining regions using data mining techniques. In: Computer Science and Education (ICCSE), 2010 5th International Conference on. IEEE, pp.1344–1349.
Sun, J. and Reddy, C.K., 2013. Big data analytics for healthcare. In: Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, pp.1525–1525.
Taylor, K.E., 2009. A summary of the CMIP5 experiment design. http://cmip-pcmdi.llnl.gov/cmip5/docs/Taylor_CMIP5_design.pdf.
Tomar, D. and Agarwal, S., 2013. A survey on Data Mining approaches for Healthcare. International Journal of Bio-Science and Bio-Technology, 5(5), pp.241–266.
Tsai, C.-W., Lai, C.-F., Chiang, M.-C. and Yang, L.T., 2014. Data mining for Internet of Things: A survey. IEEE Communications Surveys and Tutorials, 16(1), pp.77–97.