Introduction to Research: Data Collection, Design and Implementation

Added on 2023/06/03

AI Summary

This article provides an introduction to research, covering data collection, design and implementation. It includes information on data sources, data pre-processing, feature selection, experiment design, and implementation. The article also includes tables and figures to illustrate the concepts discussed.

Introduction to Research
Student’s Name
Professor’s Name
Course
Institution’s Name
Institution’s Location
Date

Table of Contents
List of figures
List of tables
1. Data collection
1.1 Data sources
1.2 Collection of the required data
1.3 Data storage
2. Design and implementation
2.1 Data pre-processing
2.2 Feature selection or dimension reduction
2.3 Experiment design
2.3.1 Detailed design steps
2.4 Implementation of the research
2.4.1 The software and tools used in data analysis
2.4.2 The results
3. Results analysis
3.1 The expected results
3.2 A summary of the results
4. Outline of Experiment and Result Analysis
References

List of figures
Figure 1: Data pre-processing techniques
Figure 2: A pie chart showing the sex (gender) of the respondents
Figure 3: A pie chart showing the age range of the respondents
Figure 4: A pie chart showing the education level of the respondents
Figure 5: A pie chart showing the background of the respondents
Figure 6: A bar graph showing the results

List of tables
Table 1: Data collection table
Table 2: Data storage table
Table 3: Feature selection/dimension reduction table
Table 4: The questionnaire questions table
Table 5: Demographic information table
Table 6: Analyzed demographic information table
Table 7: The results table

1. Data collection
In any experiment or research project, having the required and appropriate data is essential to
success (Patten and Newhart, 2017). Therefore, before starting, we must identify the data sources
from which we will collect the data we need. Data collection is normally the first step when
designing and implementing an experiment. After suitable data sources are identified, the
required raw data is collected and recorded in appropriate tables so that it can be analyzed later,
when the experiment is implemented. Data collection involves three major steps: identifying the
appropriate data sources, collecting the required data, and storing the collected data.
1.1 Data sources
Before starting an experiment, it is important to identify appropriate sources of the required
data. It is equally important to determine exactly what kind of data the experiment requires, to
avoid collecting too much data, some of which may be irrelevant (Mutch, 2013). In this case, we
will collect data on the use of health apps by people of different sex, age, education, and
background. It is therefore important to choose sources or places where we can interact with
people who use different health apps to improve their health conditions. Possible sources include
public places such as malls and public parks, institutions such as colleges and universities, and
companies, such as manufacturing companies, that employ many people.
1.2 Collection of the required data
After the appropriate sources are identified, the data is collected and recorded in tables for
easy access and interpretation. A table that can be used to record the collected raw data is
shown below:
Table 1: Data collection table
| Data source organization | Nature of source organization (colleges, malls, companies) | Data description | Data file format | Charge fee | Target data source |
|---|---|---|---|---|---|
| Data 1 | Public | The number of people who have smartphones | txt | Free | Yes |
| Data 2 | Public | The number of people aware of the existence of health apps in their smartphones | txt | Free | Yes |
| Data 3 | Public | The number of people who download health apps | txt | Free | Yes |
| Data 4 | Public | The number of people who follow these health apps | txt | Free | Yes |
| Data 5 | Public | The people who feel the health apps are effective | txt | Free | Yes |
| Data 6 | Public | The people satisfied with the health apps | txt | Free | Yes |
1.3 Data storage
After the required data is collected and recorded, the raw data must be stored appropriately,
since it may be needed in the future. One method is to use data storage tables, which are saved
and stored securely so that the data cannot be accessed by unauthorized people who might
interfere with it (Chodorow, 2013). A table that can be used for data storage is shown below:
Table 2: Data storage table
| Data source name | Date of collection | Saved file location | Saved file name | Saved file format | Number of records |
|---|---|---|---|---|---|
| Survey from colleges | 20/7/2018 | //raw data/ | Survey1.txt | txt | 150 |
| Survey from public places | 21/7/2018 | //raw data/ | Survey2.txt | txt | 250 |
| Survey from companies | 22/7/2018 | //raw data/ | Survey3.txt | txt | 200 |
2. Design and implementation
After data collection and storage, the next major step in an experiment is design and
implementation. In this stage, the collected raw data is modified as required so that it can be
used in the experiment. The main activities in this step are data pre-processing, dimension
reduction/feature selection, the design of the experiment, and finally the implementation of the
actual experiment.
2.1 Data pre-processing
Data pre-processing modifies the data and converts it into other forms that can be easily
understood and analyzed in an experiment. In other words, it prepares the data for use in an
experiment by giving it the features that data must have before it can be analyzed (García,
Luengo, and Herrera, 2015, pp.195-243). Data pre-processing is a very important process because
it improves the readability and usability of the raw data; therefore, after collecting the raw
data, it is important to pre-process it before we use it in an experiment. Various techniques are
used in data pre-processing, but the major ones are data transformation, data integration, data
reduction, and data cleaning (Vijayarani, Ilamathi, and Nithya, 2015, pp.7-16).
Data cleaning is the pre-processing technique in which missing values are filled in, noise in
the data is reduced, and inconsistencies found in the data are removed. After cleaning, we obtain
complete, less noisy, and consistent data that can be easily used in the experiment analysis
(Osborne, 2013).
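As a rough illustration of data cleaning, the sketch below normalizes inconsistent yes/no answers, fills a missing age with the median of the known ages, and drops records whose answer cannot be recovered. The records, the field names (`has_smartphone`, `age`), and the choice of median imputation are assumptions made for this example; they are not taken from the survey files described in this report.

```python
from statistics import median

# Hypothetical survey records; field names and values are illustrative only.
raw_records = [
    {"id": 1, "has_smartphone": "Yes", "age": 24},
    {"id": 2, "has_smartphone": " yes ", "age": None},  # inconsistent spelling, missing age
    {"id": 3, "has_smartphone": "N", "age": 41},         # inconsistent abbreviation
    {"id": 4, "has_smartphone": None, "age": 35},        # missing answer
]

# Map the spellings we are willing to accept onto one consistent form.
YES_NO = {"yes": "Yes", "y": "Yes", "no": "No", "n": "No"}


def clean(records):
    """Return consistent, complete records; drop those that cannot be repaired."""
    known_ages = [r["age"] for r in records if r["age"] is not None]
    fill_age = median(known_ages)  # assumed imputation rule for missing ages
    cleaned = []
    for r in records:
        answer = r["has_smartphone"]
        if answer is None:
            continue  # the key answer is missing, so the record is unusable
        answer = YES_NO.get(answer.strip().lower())
        if answer is None:
            continue  # unrecognized answer: treat as inconsistent and drop
        age = r["age"] if r["age"] is not None else fill_age
        cleaned.append({"id": r["id"], "has_smartphone": answer, "age": age})
    return cleaned


cleaned = clean(raw_records)
for record in cleaned:
    print(record)
```

Dropping a record is the most conservative choice for a missing key answer; a real study might instead follow up with the respondent.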
Data integration is a pre-processing technique that combines related data from multiple sources
into a single, more coherent data set that can be used in an experiment with much greater ease
(Dong and Srivastava, 2015, pp.190-198).
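A minimal sketch of data integration, assuming each source has already been loaded as a list of records with the same fields. The source names mirror the three survey locations used in this report, but the records themselves and the `integrate` helper are hypothetical.

```python
# Hypothetical records from three sources; only the source names come from the report.
sources = {
    "colleges": [
        {"respondent": "c1", "has_smartphone": "Yes"},
        {"respondent": "c2", "has_smartphone": "No"},
    ],
    "public places": [
        {"respondent": "p1", "has_smartphone": "Yes"},
    ],
    "companies": [
        {"respondent": "k1", "has_smartphone": "Yes"},
    ],
}


def integrate(sources):
    """Combine all sources into one list, tagging each record with its origin."""
    combined = []
    for source_name, records in sources.items():
        for record in records:
            combined.append({**record, "source": source_name})
    return combined


combined = integrate(sources)
print(len(combined), "records after integration")
```

Keeping the `source` tag on every record means the combined data can still be split by origin later in the analysis.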
Data transformation is the pre-processing technique where the data is modified and converted
into the appropriate forms or formats required. Some of the major activities involved in data
transformation include normalization of the data, smoothing of the data, aggregation of the data,
and generalization of the data (Heer, Hellerstein, and Kandel, 2015).
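Three of the transformation activities named above can be sketched on a small list of ages. The ages are made up, and the band boundaries are an assumption: the questionnaire bands share their end points (30 and 40 each appear in two bands), so this example assigns a boundary age to the upper band, except 50, which stays in "40 - 50 years".

```python
from collections import Counter

ages = [20, 30, 40, 60]  # hypothetical respondent ages


def min_max_normalize(values):
    """Normalization: rescale values linearly onto the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # all values equal; avoid dividing by zero
    return [(v - lo) / (hi - lo) for v in values]


def age_band(age):
    """Generalization: replace an exact age with a questionnaire age band.

    Assumes every respondent is at least 20 years old.
    """
    if age < 30:
        return "20 - 30 years"
    if age < 40:
        return "30 - 40 years"
    if age <= 50:
        return "40 - 50 years"
    return "Over 50 years"


normalized = min_max_normalize(ages)
# Aggregation: count how many respondents fall into each band.
band_counts = Counter(age_band(a) for a in ages)

print(normalized)
print(dict(band_counts))
```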
Data reduction involves reducing the data to remove the unnecessary data without interfering
with the quality and the integrity of the data. There are various techniques used in data reduction
where the main techniques include data compression, data discretization, numerosity reduction,
and dimension reduction (Yıldırım, Özdoğan, and Watson, 2014, pp.72-93). All these techniques
help to remove the unnecessary data from the raw data leaving only the required data which can
be used effectively and with much ease in the experiment process.

The main techniques of data pre-processing are summarized in the figure below:
Figure 1: Data pre-processing techniques
Source: Electronicsmedia.info
2.2 Feature selection or dimension reduction
Feature selection is the process of eliminating some of the features of the raw data so that
only the most important and most meaningful features remain for use in an experiment. Feature
selection helps to avoid carrying unnecessary data into the analysis, which may end up limiting
the efficiency and accuracy of the experiment analysis (Tang, Alelyani, and Liu, 2014, pp.37-39).
Feature selection may at times involve reducing the dimensionality of the data, which is why it
is at times referred to as dimension reduction (Verde, Irpino, and Balzanella, 2016, pp.344-355).
After data pre-processing and dimension reduction/feature selection, the data obtained can be
recorded in the table shown below:
Table 3: Feature selection/dimension reduction table
| Date | Data source name | Purpose of pre-processing | Method of pre-processing | Original data records | Resulting data records | The new data file name |
|---|---|---|---|---|---|---|
| 25/7/2018 | Data 1 | Removing inconsistency | Data cleaning | 200 | 198 | Finalsurvey1.txt |
| 25/7/2018 | Data 2 | Filling of missing values | Data cleaning | 175 | 171 | Finalsurvey2.txt |
| 25/7/2018 | Data 3 | Removing unnecessary features | Feature selection | 190 | 177 | Finalsurvey3.txt |
| 25/7/2018 | Data 4 | Removing redundancy | Data reduction | 250 | 195 | Finalsurvey4.txt |
| 25/7/2018 | Data 5 | Combining related data | Data integration | 220 | 208 | Finalsurvey5.txt |
| 25/7/2018 | Data 6 | Transforming the data into the required formats | Data transformation | 200 | 194 | Finalsurvey6.txt |
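The record counts in Table 3 can be checked with a short calculation. The sketch below copies the original and resulting counts from the table and works out, for each pre-processing step, how many records were removed and what share was retained. The `summarize` helper and its field names are illustrative, not part of the original analysis.

```python
# (data source, method, original records, resulting records), copied from Table 3.
steps = [
    ("Data 1", "Data cleaning", 200, 198),
    ("Data 2", "Data cleaning", 175, 171),
    ("Data 3", "Feature selection", 190, 177),
    ("Data 4", "Data reduction", 250, 195),
    ("Data 5", "Data integration", 220, 208),
    ("Data 6", "Data transformation", 200, 194),
]


def summarize(steps):
    """Return records removed and percentage retained for each step."""
    rows = []
    for source, method, original, resulting in steps:
        rows.append({
            "source": source,
            "method": method,
            "removed": original - resulting,
            "retained_pct": round(100 * resulting / original, 1),
        })
    return rows


summary = summarize(steps)
for row in summary:
    print(f'{row["source"]}: {row["method"]} removed {row["removed"]} records, '
          f'{row["retained_pct"]}% retained')
```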

2.3 Experiment design
The experiment design stage is where the methodology is chosen and then applied to the
experiment. It is very important to choose a methodology that will let the experiment or research
run smoothly, without many difficulties.
2.3.1 Detailed design steps
We chose a hybrid methodology for our research. The hybrid methodology allows researchers to
collect and analyze both non-numerical and numerical data, so it was the most suitable
methodology for this research, since we expected to encounter both forms of data (Creswell and
Clark, 2017). Our main interest was the use of health apps in improving people’s health
conditions, so we visited the places where we could get this information. We prepared
questionnaire forms with questions that sought to determine the sex (gender), age, education
level, and background of the respondents and how these respondents used health apps on their
smartphones. The questions were very specific to help us get the information we wanted, and we
avoided asking personal questions that could put off some of the respondents (Neuman, 2016). We
used statistical knowledge to record and analyze the data we collected from the respondents.
The questionnaire questions used in the research are shown in the table below:
Table 4: The questionnaire questions table
| Question # | Question Description |
|---|---|
| 1 | Do you have a smartphone? |
| 2 | Are you aware of the health apps available on your smartphone? |
| 3 | Do you download the health apps in your smartphone? |
| 4 | Do you follow these health apps? |
| 5 | Are health apps effective in improving your health? |
| 6 | Are you satisfied with the performance of the health apps? |
After preparing the questionnaire questions, we made another table to help us categorize the
respondents by gender, age, education level, and background. This categorization was necessary
because it would help us understand how health apps are used by people with different
demographic information.
The table showing how the demographic information of the respondents was categorized is shown
below:
Table 5: Demographic information table
| Demographic information | Responses |
|---|---|
| Sex | Male |
| Sex | Female |
| Age range | 20 – 30 years |
| Age range | 30 – 40 years |
| Age range | 40 – 50 years |
| Age range | Over 50 years |
| Education level | High school |
| Education level | College diploma |
| Education level | University degree |
| Education level | Masters |
| Background | Student |
| Background | Business person |
| Background | Employee |
2.4 Implementation of the research
2.4.1 The software and tools used in data analysis
After preparing the research questions and categorizing the demographic information of the
respondents, the researchers went ahead to collect the data required for the research. The
collected data was then analyzed using statistical analysis methods and tools; Excel was the main
software used in the analysis.
A table showing the analyzed demographic information of the people who took part in the research
is shown below:

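Table 6: Analyzed demographic information table
(Total number of respondents = 200)
| Demographic information | Responses | N (%) |
|---|---|---|
| Sex | Male | 112 (56%) |
| Sex | Female | 88 (44%) |
| Age range | 20 – 30 years | 55 (27.5%) |
| Age range | 30 – 40 years | 50 (25%) |
| Age range | 40 – 50 years | 55 (27.5%) |
| Age range | Over 50 years | 40 (20%) |
| Education level | High school | 65 (32.5%) |
| Education level | College diploma | 65 (32.5%) |
| Education level | University degree | 40 (20%) |
| Education level | Masters | 30 (15%) |
| Background | Students | 70 (35%) |
| Background | Business persons | 60 (30%) |
| Background | Employees | 70 (35%) |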
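The percentages in Table 6 follow directly from the counts and the total of 200 respondents. The report states that the analysis was done in Excel; the Python sketch below is only an equivalent check, with the counts copied from the table and a hypothetical `percentages` helper. It also confirms that the counts in every category add up to the total.

```python
TOTAL_RESPONDENTS = 200

# Counts copied from Table 6.
demographics = {
    "Sex": {"Male": 112, "Female": 88},
    "Age range": {
        "20 - 30 years": 55,
        "30 - 40 years": 50,
        "40 - 50 years": 55,
        "Over 50 years": 40,
    },
    "Education level": {
        "High school": 65,
        "College diploma": 65,
        "University degree": 40,
        "Masters": 30,
    },
    "Background": {"Students": 70, "Business persons": 60, "Employees": 70},
}


def percentages(counts, total=TOTAL_RESPONDENTS):
    """Convert raw counts into percentages of all respondents."""
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


for category, counts in demographics.items():
    # Every respondent should appear exactly once in each category.
    if sum(counts.values()) != TOTAL_RESPONDENTS:
        raise ValueError(f"{category} counts do not add up to {TOTAL_RESPONDENTS}")
    print(category, percentages(counts))
```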

The information in the table above can be displayed visually by the pie charts below:
Figure 2: A pie chart showing the sex (gender) of the respondents
[Pie chart: Male 56%, Female 44%]
Figure 3: A pie chart showing the age range of the respondents
[Pie chart: 20 - 30 years 28%, 30 - 40 years 25%, 40 - 50 years 28%, Over 50 years 20%]
Figure 4: A pie chart showing the education level of the respondents
[Pie chart: High school 32.50%, College diploma 32.50%, University degree 20.00%, Masters 15.00%]
Figure 5: A pie chart showing the background of the respondents
[Pie chart: Students 35.00%, Business persons 30.00%, Employees 35.00%]

2.4.2 The results
The results obtained after giving the questionnaire questions to the 200 participants involved
in the research are shown in the table below:
Table 7: The results table
(Total number of respondents = 200)
| Information related to the health apps | Responses N (%) |
|---|---|
| The number of respondents who possess smartphones | 183 (91.5%) |
| The number of respondents aware of the health apps in their smartphones | 146 (73%) |
| The number of respondents who download health apps | 111 (55.5%) |
| The number of respondents who follow the health apps after downloading them | 91 (45.5%) |
| The number of respondents who feel the health apps are effective in improving their health conditions | 79 (39.5%) |
| The number of respondents who are satisfied with the performance of the health apps | 62 (31%) |
The results shown above can be visually represented by the bar graph shown below:

Figure 6: A bar graph showing the results
[Bar graph, vertical axis from 0 to 200 respondents. Bars: Respondents who possess smartphones
(183); Respondents aware of the health apps in their smartphones (146); Respondents who download
the health apps (111); Respondents who follow the health apps (91); Respondents who feel the
health apps are effective in improving their health (79); Respondents satisfied with the
performance of the health apps (62)]
3. Results analysis
3.1 The expected results
Before doing any research or experiment, it is important to have an idea of the expected
results. Knowing what to expect helps the researchers stay on the right track when conducting the
research, and they can easily tell when they stray from it (Gray and Malins, 2016). An idea of
the expected results can be obtained from previous similar research or from the available
literature covering the concepts the researchers wish to understand. Before conducting our
research, we expected to find that very many people use smartphones, that most of these people
are aware of the health apps on their phones, and that some of them use those health apps to
improve their health status.
3.2 A summary of the results
Our expected results were not very far from the actual results we found after conducting the
research. The summarized results are as follows:
Out of the 200 respondents, 183 possessed smartphones.
146 of the 200 respondents were aware of the health apps in their smartphones.
111 respondents downloaded health apps to their smartphones.
91 of the respondents who downloaded health apps followed them to learn how their health could
be improved.
79 of the respondents who were using health apps felt the health apps were effective in
improving their health.
62 of the respondents who were using health apps were satisfied with the performance of the
health apps in improving their health conditions.
These results help us to understand the performance of health apps in the market. This
information is very important to the people who design health apps, as they can use it to decide
whether to design more apps or improve the existing apps so that they are more effective in
improving people’s health conditions.
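One way to read the counts from Table 7 is as a sequence of stages, asking what share of respondents at each stage also reached the next one. The sketch below does this under an assumption the report does not state explicitly: that each group is a subset of the group before it (for example, that everyone who downloads health apps is also aware of them). The stage labels are shortened and the function names are hypothetical.

```python
TOTAL_RESPONDENTS = 200

# Counts copied from Table 7, in questionnaire order; labels are shortened.
stages = [
    ("possess smartphones", 183),
    ("aware of health apps", 146),
    ("download health apps", 111),
    ("follow health apps", 91),
    ("feel health apps are effective", 79),
    ("satisfied with health apps", 62),
]


def share_of_total(count, total=TOTAL_RESPONDENTS):
    """Percentage of all respondents, as reported in Table 7."""
    return round(100 * count / total, 1)


def stage_retention(stages):
    """Percentage of each stage that also reached the following stage.

    Assumes each stage is a subset of the previous one, which the
    report does not state explicitly.
    """
    rates = []
    for (_, previous), (label, current) in zip(stages, stages[1:]):
        rates.append((label, round(100 * current / previous, 1)))
    return rates


for label, count in stages:
    print(f"{label}: {count} ({share_of_total(count)}% of all respondents)")

for label, rate in stage_retention(stages):
    print(f"{label}: {rate}% of the previous stage")
```

If the subset assumption holds, the stage-to-stage figures show where respondents drop off, which the percentages of the whole sample alone do not reveal.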

4. Outline of Experiment and Result Analysis
1. Data collection
1.1 Data sources
1.2 Collection of the required data
1.3 Data storage
2. Design and implementation
2.1 Data pre-processing
2.2 Feature selection or dimension reduction
2.3 Experiment design
2.3.1 Detailed design steps
2.4 Implementation of the research
2.4.1 The software and tools used in data analysis
2.4.2 The results
3. Results analysis
3.1 The expected results
3.2 A summary of the results

References
Chodorow, K., 2013. MongoDB: The Definitive Guide: Powerful and Scalable Data Storage.
O’Reilly Media, Inc.
Creswell, J.W. and Clark, V.L.P., 2017. Designing and conducting mixed methods research.
Sage Publications.
Dong, X.L. and Srivastava, D., 2015. Big data integration. Synthesis Lectures on Data
Management, 7(1), pp.190-198.
García, S., Luengo, J. and Herrera, F., 2015. Data preprocessing in data mining (pp.195-243).
Switzerland: Springer International Publishing.
Gray, C. and Malins, J., 2016. Visualizing research: A guide to the research process in art and
design. Routledge.
Heer, J., Hellerstein, J.M. and Kandel, S., 2015. Predictive Interaction for Data Transformation.
In CIDR.
Mutch, C., 2013. Doing educational research. NZCER Press.
Neuman, W.L., 2016. Understanding research. Pearson.
Osborne, J.W., 2013. Best practices in data cleaning: A complete guide to everything you need
to do before and after collecting your data. Sage.
Patten, M.L. and Newhart, M., 2017. Understanding research methods: An overview of the
essentials. Taylor & Francis.
Tang, J., Alelyani, S. and Liu, H., 2014. Feature selection for classification: A review. Data
classification: Algorithms and applications, pp.37-39.
Verde, R., Irpino, A. and Balzanella, A., 2016. Dimension reduction techniques for distributional
symbolic data. IEEE Transactions on Cybernetics, 46(2), pp.344-355.
Vijayarani, S., Ilamathi, M.J. and Nithya, M., 2015. Preprocessing techniques for text mining-an
overview. International Journal of Computer Science & Communication Networks, 5(1), pp.7-16.
Yıldırım, A.A., Özdoğan, C. and Watson, D., 2014. Parallel data reduction techniques for big
datasets. In Big data management, technologies, and applications (pp.72-93). IGI Global.