Introduction to Research: Data Collection, Design and Implementation
Added on 2023/06/03
AI Summary
This article provides an introduction to research, covering data collection, design and implementation. It includes information on data sources, data pre-processing, feature selection, experiment design, and implementation. The article also includes tables and figures to illustrate the concepts discussed.
Table of Contents
List of figures
List of tables
1. Data collection
  1.1 Data sources
  1.2 Collection of the required data
  1.3 Data storage
2. Design and implementation
  2.1 Data pre-processing
  2.2 Feature selection or dimension reduction
  2.3 Experiment design
    2.3.1 Detailed design steps
  2.4 Implementation of the research
    2.4.1 The software and tools used in data analysis
    2.4.2 The results
3. Results analysis
  3.1 The expected results
  3.2 A summary of the results
4. Outline of Experiment and Result Analysis
References
List of figures
Figure 1: Data pre-processing techniques
Figure 2: A pie chart showing the sex (gender) of the respondents
Figure 3: A pie chart showing the age range of the respondents
Figure 4: A pie chart showing the education level of the respondents
Figure 5: A pie chart showing the background of the respondents
Figure 6: A bar graph showing the results
List of tables
Table 1: Data collection table
Table 2: Data storage table
Table 3: Feature selection/dimension reduction table
Table 4: The questionnaire questions table
Table 5: Demographic information table
Table 6: Analyzed demographic information table
Table 7: The results table
1. Data collection

In any experiment or research project, having appropriate data is essential to success (Patten and Newhart, 2017). Before starting, we must therefore identify the sources from which the required data will be collected. Data collection is the first step in designing and implementing an experiment. Once suitable data sources are identified, the raw data is collected and recorded in tables so that it can be analyzed later, when the experiment is implemented. Data collection involves three major steps: identifying the appropriate data sources, collecting the required data, and storing the collected data.

1.1 Data sources

Before starting an experiment, it is important to identify appropriate sources of the required data. It is equally important to determine exactly what kind of data the experiment requires, to avoid collecting too much data, some of which may be irrelevant (Mutch, 2013). In this case, the data we will collect concerns the use of health apps by people of different sex, age, education, and background. It is therefore important to choose sources or places where we can interact with a variety of people who use health apps to improve their health. Possible sources include public places such as malls and parks, institutions such as colleges and universities, and companies, such as manufacturing companies, that employ many people.

1.2 Collection of the required data

After the appropriate sources are identified, the data is collected and recorded in tables for easy access and interpretation. A table that can be used to record the collected raw data is shown below:

Table 1: Data collection table

| Data source organization | Nature of source organization (colleges, malls, companies) | Data description | Data file format | Charge fee | Target data source |
|---|---|---|---|---|---|
| Data 1 | Public | The number of people who have smartphones | txt | Free | Yes |
| Data 2 | Public | The number of people aware of the existence of health apps in their smartphones | txt | Free | Yes |
| Data 3 | Public | The number of people who download health apps | txt | Free | Yes |
| Data 4 | Public | The number of people who follow these health apps | txt | Free | Yes |
| Data 5 | Public | The people who feel the health apps are effective | txt | Free | Yes |
| Data 6 | Public | The people satisfied with the health apps | txt | Free | Yes |

1.3 Data storage

After collecting and recording the required data, the raw data must be stored appropriately, since it may be needed in the future. One method is to use data storage tables that are saved securely, so that unauthorized people cannot access or interfere with the data (Chodorow, 2013). A table that can be used for data storage is shown below:

Table 2: Data storage table

| Data source name | Date of collection | Saved file location | Saved file name | Saved file format | Number of records |
|---|---|---|---|---|---|
| Survey from colleges | 20/7/2018 | //raw data/ | Survey1.txt | txt | 150 |
| Survey from public places | 21/7/2018 | //raw data/ | Survey2.txt | txt | 250 |
| Survey from companies | 22/7/2018 | //raw data/ | Survey3.txt | txt | 200 |

2. Design and implementation

After data collection and storage, the next major step in an experiment is design and implementation. In this stage, the collected raw data is modified as required so that it can be used in the experiment. The main activities are data pre-processing, feature selection (dimension reduction), the design of the experiment, and finally the implementation of the actual experiment.

2.1 Data pre-processing

Data pre-processing modifies the data and converts it into forms that can be easily understood and analyzed in an experiment. In other words, it prepares the data by giving it the properties that data must have before it can be analyzed (García, Luengo, and Herrera, 2015, pp.195-243). Pre-processing improves the readability and usability of the raw data, so it is important to pre-process the raw data after collecting it and before using it in an experiment. Various techniques are used in data pre-processing, but the major ones are data transformation, data integration, data reduction, and data cleaning (Vijayarani, Ilamathi, and Nithya, 2015, pp.7-16).

Data cleaning fills in missing values, reduces noise, and removes inconsistencies in the data. Cleaning therefore yields complete, less noisy, and consistent data that can be used easily in the analysis (Osborne, 2013).

Data integration combines related data from multiple sources into a single, more coherent dataset that is easier to use in an experiment (Dong and Srivastava, 2015, pp.190-198).

Data transformation modifies the data and converts it into the required forms or formats. Its major activities include normalization, smoothing, aggregation, and generalization of the data (Heer, Hellerstein, and Kandel, 2015).

Data reduction removes unnecessary data without compromising the quality and integrity of the data. The main techniques are data compression, data discretization, numerosity reduction, and dimension reduction (Yıldırım, Özdoğan, and Watson, 2014, pp.72-93). All of these techniques remove unnecessary data from the raw data, leaving only the data required for the experiment.
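The techniques described above can be illustrated with a minimal Python sketch. It is not the procedure used in this research: the helper names (`fill_missing`, `drop_duplicates`, `min_max_normalize`) and the sample ages are hypothetical, and mean imputation and min-max scaling are only one possible choice for cleaning and normalization.

```python
from statistics import mean


def fill_missing(values):
    """Data cleaning: replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def drop_duplicates(records):
    """Data reduction: keep the first occurrence of each record, preserving order."""
    seen = set()
    unique = []
    for record in records:
        if record not in seen:
            seen.add(record)
            unique.append(record)
    return unique


def min_max_normalize(values):
    """Data transformation: rescale values linearly onto the range [0, 1]."""
    low, high = min(values), max(values)
    if high == low:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


# Hypothetical ages with one missing answer (illustrative values only).
ages = [20, None, 40, 30]
cleaned = fill_missing(ages)             # the missing age becomes the mean of 20, 40, 30
normalized = min_max_normalize(cleaned)  # 20 maps to 0.0 and 40 maps to 1.0
```

Records passed to `drop_duplicates` must be hashable, for example tuples of answers.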
The main techniques of data pre-processing are summarized in the figure below:

Figure 1: Data pre-processing techniques (Source: Electronicsmedia.info)

2.2 Feature selection or dimension reduction

Feature selection is the process of eliminating some features of the raw data so that only the most important and meaningful features remain for use in the experiment. It prevents unnecessary data from limiting the efficiency and accuracy of the analysis (Tang, Alelyani, and Liu, 2014, pp.37-39). Feature selection sometimes involves reducing the dimensionality of the data, which is why it is sometimes referred to as dimension reduction (Verde, Irpino, and Balzanella, 2016, pp.344-355).

After data pre-processing and feature selection (dimension reduction), the resulting data can be recorded in the table shown below:

Table 3: Feature selection/dimension reduction table

| Date | Data source name | Purpose of pre-processing | Method of pre-processing | Original data records | Resulting data records | New data file name |
|---|---|---|---|---|---|---|
| 25/7/2018 | Data 1 | Removing inconsistency | Data cleaning | 200 | 198 | Finalsurvey1.txt |
| 25/7/2018 | Data 2 | Filling of missing values | Data cleaning | 175 | 171 | Finalsurvey2.txt |
| 25/7/2018 | Data 3 | Removing unnecessary features | Feature selection | 190 | 177 | Finalsurvey3.txt |
| 25/7/2018 | Data 4 | Removing redundancy | Data reduction | 250 | 195 | Finalsurvey4.txt |
| 25/7/2018 | Data 5 | Combining related data | Data integration | 220 | 208 | Finalsurvey5.txt |
| 25/7/2018 | Data 6 | Transforming the data into the required formats | Data transformation | 200 | 194 | Finalsurvey6.txt |
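One way to read Table 3 is to ask what share of the records survived each pre-processing step. The short Python sketch below computes that share from the record counts in the table. The name `retention_rate` and the rounding to two decimals are our own assumptions for illustration and are not part of the original analysis.

```python
# (data source name, original records, resulting records), copied from Table 3.
TABLE_3 = [
    ("Data 1", 200, 198),
    ("Data 2", 175, 171),
    ("Data 3", 190, 177),
    ("Data 4", 250, 195),
    ("Data 5", 220, 208),
    ("Data 6", 200, 194),
]


def retention_rate(original, resulting):
    """Return the percentage of records kept after pre-processing."""
    if original <= 0:
        raise ValueError("original record count must be positive")
    return round(100 * resulting / original, 2)


# Map each data source to the percentage of its records that were retained.
retention = {name: retention_rate(orig, res) for name, orig, res in TABLE_3}

# The source that lost the largest share of its records.
most_reduced = min(retention, key=retention.get)
```

Under these figures, the redundancy removal applied to Data 4 discards the largest share of records, while the cleaning applied to Data 1 discards the smallest.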
2.3 Experiment design

The experiment design stage is where the methodology is chosen and then applied to the experiment. It is important to choose a methodology that lets the experiment or research run smoothly.

2.3.1 Detailed design steps

We chose a hybrid (mixed methods) methodology for our research. This methodology allows researchers to collect and analyze both numerical and non-numerical data, so it was the most suitable choice, since we expected to encounter both forms of data (Creswell and Clark, 2017). Our main interest was the use of health apps in improving people's health, so we visited places where we could obtain this information. We prepared questionnaire forms with questions that sought to determine the sex (gender), age, education level, and background of the respondents, and how these respondents used health apps on their smartphones. The questions were specific, to help us get the information we wanted, and we avoided personal questions that could put off some respondents (Neuman, 2016). We used statistical methods to record and analyze the data collected from the respondents. The questionnaire questions used in the research are shown in the table below:

Table 4: The questionnaire questions table

| Question # | Question description |
|---|---|
| 1 | Do you have a smartphone? |
| 2 | Are you aware of the health apps available on your smartphone? |
| 3 | Do you download the health apps in your smartphone? |
| 4 | Do you follow these health apps? |
| 5 | Are health apps effective in improving your health? |
| 6 | Are you satisfied with the performance of the health apps? |

After preparing the questionnaire questions, we made another table to help us categorize the respondents by gender, age, education level, and background. This categorization was necessary to understand how health apps are used by people with different demographic profiles. The table below shows how the demographic information of the respondents was categorized:

Table 5: Demographic information table

| Demographic information | Categories |
|---|---|
| Sex | Male; Female |
| Age range | 20 - 30 years; 30 - 40 years; 40 - 50 years; Over 50 years |
| Education level | High school; College diploma; University degree; Masters |
| Background | Student; Business person; Employee |

2.4 Implementation of the research

2.4.1 The software and tools used in data analysis

After preparing the research questions and categorizing the demographic information of the respondents, the researchers collected the data required for the research. The collected data was then analyzed using statistical analysis methods and tools, with Excel as the main software. A table showing the analyzed demographic information of the people who took part in the research is shown below:
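Table 6: Analyzed demographic information table (Total number of respondents = 200)

| Demographic information | Category | Responses, N (%) |
|---|---|---|
| Sex | Male | 112 (56%) |
| Sex | Female | 88 (44%) |
| Age range | 20 - 30 years | 55 (27.5%) |
| Age range | 30 - 40 years | 50 (25%) |
| Age range | 40 - 50 years | 55 (27.5%) |
| Age range | Over 50 years | 40 (20%) |
| Education level | High school | 65 (32.5%) |
| Education level | College diploma | 65 (32.5%) |
| Education level | University degree | 40 (20%) |
| Education level | Masters | 30 (15%) |
| Background | Students | 70 (35%) |
| Background | Business persons | 60 (30%) |
| Background | Employees | 70 (35%) |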
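The percentages in Table 6 follow directly from the counts and the total of 200 respondents. The research itself used Excel, so the Python sketch below is only an equivalent illustration. The names `DEMOGRAPHICS` and `percentages` are hypothetical, and the check that each group sums to the total is an added sanity test rather than a step reported in the research.

```python
TOTAL_RESPONDENTS = 200

# Response counts copied from Table 6.
DEMOGRAPHICS = {
    "Sex": {"Male": 112, "Female": 88},
    "Age range": {
        "20 - 30 years": 55,
        "30 - 40 years": 50,
        "40 - 50 years": 55,
        "Over 50 years": 40,
    },
    "Education level": {
        "High school": 65,
        "College diploma": 65,
        "University degree": 40,
        "Masters": 30,
    },
    "Background": {"Students": 70, "Business persons": 60, "Employees": 70},
}


def percentages(counts, total):
    """Convert category counts to percentages, checking that they cover every respondent."""
    if sum(counts.values()) != total:
        raise ValueError("category counts do not add up to the number of respondents")
    return {label: 100 * n / total for label, n in counts.items()}


# Percentage share of every category within each demographic group.
shares = {
    group: percentages(counts, TOTAL_RESPONDENTS)
    for group, counts in DEMOGRAPHICS.items()
}
```

Every group in Table 6 passes the check, so each respondent is counted exactly once per demographic dimension.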
The information in the table above can also be displayed as pie charts:

Figure 2: A pie chart showing the sex (gender) of the respondents (Male 56%, Female 44%)

Figure 3: A pie chart showing the age range of the respondents (20 - 30 years 28%, 30 - 40 years 25%, 40 - 50 years 28%, Over 50 years 20%)
Figure 4: A pie chart showing the education level of the respondents (High school 32.50%, College diploma 32.50%, University degree 20.00%, Masters 15.00%)

Figure 5: A pie chart showing the background of the respondents (Students 35.00%, Business persons 30.00%, Employees 35.00%)
2.4.2 The results

The results obtained after giving the questionnaire to the 200 participants involved in the research are shown in the table below:

Table 7: The results table (Total number of respondents = 200)

| Information related to the health apps | Responses, N (%) |
|---|---|
| The number of respondents who possess smartphones | 183 (91.5%) |
| The number of respondents aware of the health apps in their smartphones | 146 (73%) |
| The number of respondents who download health apps | 111 (55.5%) |
| The number of respondents who follow the health apps after downloading them | 91 (45.5%) |
| The number of respondents who feel the health apps are effective in improving their health conditions | 79 (39.5%) |
| The number of respondents who are satisfied with the performance of the health apps | 62 (31%) |

The results shown above can be visually represented by the bar graph shown below:
Figure 6: A bar graph showing the results (number of respondents on a scale of 0 to 200: possess smartphones 183; aware of the health apps in their smartphones 146; download the health apps 111; follow the health apps 91; feel the health apps are effective in improving their health 79; satisfied with the performance of the health apps 62)

3. Results analysis

3.1 The expected results

Before doing any research or experiment, it is important to have an idea of the expected results. Knowing what to expect helps researchers stay on track while conducting the research and notice quickly when they drift off it (Gray and Malins, 2016). An idea of the expected results can be obtained from previous similar research or from the available literature covering the concepts the researchers wish to understand. Before conducting our research, we expected to find that many people use smartphones, that most of them are aware of the health apps on their phones, and that some of them use these health apps to improve their health.

3.2 A summary of the results

Our expected results were not far from the actual results. After conducting the research, we found the following:

- Out of the 200 respondents, 183 possessed smartphones.
- 146 of the 200 respondents were aware of the health apps in their smartphones.
- 111 respondents downloaded health apps to their smartphones.
- 91 of the respondents who downloaded health apps followed them to learn how their health could be improved.
- 79 of the respondents who were using health apps felt the apps were effective in improving their health.
- 62 of the respondents who were using health apps were satisfied with the performance of the apps in improving their health conditions.

These results help us understand the performance of health apps in the market. This information is important to the people who design health apps, who can use it to decide whether to design more apps or to improve the existing ones so that they are more effective in improving people's health.
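The summarized results can also be read as a funnel in which each count is compared with the stage before it. The Python sketch below is one way to do that arithmetic. It assumes that each group of respondents is a subset of the preceding group, which the questionnaire order suggests but the research does not state, and the names `FUNNEL` and `funnel_rates` are hypothetical.

```python
TOTAL_RESPONDENTS = 200

# Stage counts copied from Table 7, in questionnaire order.
FUNNEL = [
    ("possess smartphones", 183),
    ("aware of health apps", 146),
    ("download health apps", 111),
    ("follow health apps", 91),
    ("feel health apps are effective", 79),
    ("satisfied with health apps", 62),
]


def funnel_rates(stages, total):
    """Return (label, count, % of all respondents, % of the previous stage) per stage."""
    rows = []
    previous = total
    for label, count in stages:
        share_of_total = round(100 * count / total, 1)
        # Assumption: every stage is a subset of the stage before it.
        share_of_previous = round(100 * count / previous, 1)
        rows.append((label, count, share_of_total, share_of_previous))
        previous = count
    return rows


rows = funnel_rates(FUNNEL, TOTAL_RESPONDENTS)
for label, count, of_total, of_previous in rows:
    print(f"{label}: {count} ({of_total}% of all, {of_previous}% of previous stage)")
```

The share of all respondents reproduces the percentages in Table 7, while the share of the previous stage shows where respondents drop off between steps.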
4. Outline of Experiment and Result Analysis

1. Data collection
  1.1 Data sources
  1.2 Collection of the required data
  1.3 Data storage
2. Design and implementation
  2.1 Data pre-processing
  2.2 Feature selection or dimension reduction
  2.3 Experiment design
    2.3.1 Detailed design steps
  2.4 Implementation of the research
    2.4.1 The software and tools used in data analysis
    2.4.2 The results
3. Results analysis
  3.1 The expected results
  3.2 A summary of the results
References

Chodorow, K., 2013. MongoDB: The Definitive Guide: Powerful and Scalable Data Storage. O'Reilly Media, Inc.
Creswell, J.W. and Clark, V.L.P., 2017. Designing and conducting mixed methods research. Sage Publications.
Dong, X.L. and Srivastava, D., 2015. Big data integration. Synthesis Lectures on Data Management, 7(1), pp.190-198.
García, S., Luengo, J. and Herrera, F., 2015. Data preprocessing in data mining (pp. 195-243). Switzerland: Springer International Publishing.
Gray, C. and Malins, J., 2016. Visualizing research: A guide to the research process in art and design. Routledge.
Heer, J., Hellerstein, J.M. and Kandel, S., 2015. Predictive Interaction for Data Transformation. In CIDR.
Mutch, C., 2013. Doing educational research. NZCER Press.
Neuman, W.L., 2016. Understanding research. Pearson.
Osborne, J.W., 2013. Best practices in data cleaning: A complete guide to everything you need to do before and after collecting your data. Sage.
Patten, M.L. and Newhart, M., 2017. Understanding research methods: An overview of the essentials. Taylor & Francis.
Tang, J., Alelyani, S. and Liu, H., 2014. Feature selection for classification: A review. Data classification: Algorithms and applications, pp.37-39.
Verde, R., Irpino, A. and Balzanella, A., 2016. Dimension reduction techniques for distributional symbolic data. IEEE Transactions on Cybernetics, 46(2), pp.344-355.
Vijayarani, S., Ilamathi, M.J. and Nithya, M., 2015. Preprocessing techniques for text mining - an overview. International Journal of Computer Science & Communication Networks, 5(1), pp.7-16.
Yıldırım, A.A., Özdoğan, C. and Watson, D., 2014. Parallel data reduction techniques for big datasets. In Big data management, technologies, and applications (pp. 72-93). IGI Global.