Trang H Nguyen: Literature Review on Masked Face Detection
Added on 2020/10/01

Masked Face Detection: A literature review
Trang H Nguyen
Computer Science Department, Earlham College
thnguyen17@earlham.edu
1. Introduction
In 2019, coronavirus disease began to severely affect the world. According to the World Health Organization’s Weekly Epidemiological and Operational report of August 30th, 2020 [1], coronavirus disease 2019 (COVID-19) has caused over 800,000 deaths and affected more than 25 million people from at least 37 countries [2]. Other severe respiratory diseases have appeared in the past two decades, such as SARS (severe acute respiratory syndrome) in 2002 [3] and MERS (Middle East respiratory syndrome), reported in 2012 [4], which caused similarly severe damage. Liu et al. [5] reported that the reproductive number of COVID-19 is higher than that of SARS. Hence, it is easy to understand why people’s health is a major concern today, as more and more deaths occur because of these viruses. Governments all over the world have made public health their top priority [6].
Fortunately, research shows that one of the simplest yet most effective ways to help prevent the spread of the virus is wearing a surgical face mask [7]. Most public venues, such as supermarkets [8], museums [9] and restaurants [10], strictly require customers to wear masks in order to use their services [11]. However, this does not mean that everyone agrees or volunteers to wear a mask. Some people think that wearing a mask affects their freedom; others claim that it is unnecessary because they believe the rules do not apply to them, and they are in denial [12]. As a result, it is hard to keep track of the number of people who do not wear masks, especially in public places.
A Google Scholar search for the keywords “face-mask detection” returns more than 18,000 results. This suggests that the problem is of widespread interest in computer science. Face mask detection has thus become a crucial computer vision topic that can help society.
In this literature review, I will first cover the significant research on how object detection and image classification relate to this problem and how that research can help detect whether a person is wearing a mask. Secondly, I will present the set of methods that I have researched and suggest using. Finally, I will conclude by listing the expectations for possible future work.
2. Dataset
Different studies use different datasets for different purposes. This section introduces the datasets I chose and explains why I chose them.
I am going to use three datasets. The Mask Dataset from the Kaggle website [13] contains more than 800 images; the Real-world Masked Face Recognition Dataset (RMFRD) [14] contains more than 90,000 images; and the MAsked FAces dataset (MAFA) [15] contains more than 30,000 images. The images in these datasets are divided into three types that are significant for the purpose of this research: people who wear masks, people who do not wear masks, and people who do not wear masks but whose faces are occluded by other objects.
In some cases the machine might get confused; for example, it might mistake something like a scarf for a mask (Figure 1). Each dataset has its own distribution, so I chose three datasets to evaluate my model. Although all three datasets contain all three types of images, each has its own focus. The Mask Dataset mainly focuses on the first two types: people who wear masks and people who do not wear masks. The RMFRD dataset contains all three types: people who wear masks, people who do not wear masks, and people who do not wear masks but whose faces are occluded by other objects. The MAFA dataset also contains all three types of images. More interestingly, this dataset has many images with different face angles, such as right-side and left-side faces. For this research, I will mainly use the MAFA dataset and the RMFRD dataset. The Mask Dataset can be used to measure how well the model generalizes to new data, as an optional extension to the main project if time permits.
2.1. Mask Dataset
This dataset [13] contains 853 images belonging to three types (with mask, without mask, and mask worn incorrectly), as well as their bounding boxes in the PASCAL VOC format.
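Since the bounding boxes are stored in the PASCAL VOC format, each image comes with an XML annotation file. The following is a minimal sketch of how such a file could be read with Python’s standard library. The function name, the sample file content and the label strings are illustrative assumptions and are not taken from the dataset itself.

```python
import xml.etree.ElementTree as ET


def parse_voc_annotation(xml_text):
    """Return the labelled boxes found in one PASCAL VOC annotation."""
    root = ET.fromstring(xml_text)
    objects = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        objects.append({
            "label": obj.findtext("name"),
            "xmin": int(box.findtext("xmin")),
            "ymin": int(box.findtext("ymin")),
            "xmax": int(box.findtext("xmax")),
            "ymax": int(box.findtext("ymax")),
        })
    return objects


# Hypothetical annotation, written by hand for illustration only.
SAMPLE = """
<annotation>
  <filename>example.png</filename>
  <object>
    <name>with_mask</name>
    <bndbox><xmin>79</xmin><ymin>105</ymin><xmax>109</xmax><ymax>142</ymax></bndbox>
  </object>
  <object>
    <name>without_mask</name>
    <bndbox><xmin>185</xmin><ymin>100</ymin><xmax>226</xmax><ymax>144</ymax></bndbox>
  </object>
</annotation>
"""

boxes = parse_voc_annotation(SAMPLE)
```

Each entry in `boxes` pairs a class label with the pixel coordinates of one face, which is the form a detector or classifier would typically be trained on.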
2.2. Real-world Masked Face Recognition Dataset
The authors of this research [14] proposed three masked face datasets: the Masked Face Detection Dataset (MFDD), the Real-world Masked Face Recognition Dataset (RMFRD) and the Simulated Masked Face Recognition Dataset (SMFRD). However, we will only discuss the RMFRD dataset, since it is the dataset that contains images of real people. The authors also stated that it is the world’s largest freely accessible real-world masked face dataset.
According to the authors, they selected images of celebrities and well-known people and then used a Python crawler tool to crawl the images and crop the frontal face from them. For celebrities and well-known people for whom no pictures with masks could be found or accessed, the authors took images from the Internet and converted them into simulated masked-face images by placing a mask image over the face, as if the person were wearing a mask. In total, the dataset includes 5,000 pictures of 525 people wearing masks and 90,000 images of the same 525 subjects without masks.
2.3. MAFA Dataset
This dataset [16] was built from a set of more than 300K facial images collected from the Internet through Flickr, Google and Bing. The authors searched for images with the keywords “face”, “mask”, “occlusion” and “cover”. After that, the
authors eliminated images that contained faces without any occlusion, and then kept only images with a minimum side length of 80 pixels. They ended up with 30,811 images and 35,806 masked faces; some images in this dataset contain more than one masked face. Six attributes were manually annotated for each face: face location, eye location, mask location, face orientation, occlusion degree, and mask type.
Figure 1: Example images of the Mask Dataset (1), the MAFA dataset (2), and RMFRD (3, 4)
The data also cover various angles of a person’s face: there are images showing frontal faces, images showing left-side faces and images showing right-side faces. This variety helps a model become more flexible, since left-side and right-side faces are more challenging to detect.
The masked faces in this dataset are also labelled with several mask types: simple mask (man-made objects with pure color), complex mask (man-made objects with complex textures or logos), human body (face covered by hand, hair, etc.) and hybrid mask (combinations of at least two of the aforementioned mask types). I think we might count the human body type as an occlusion rather than as a mask. To sum up, this dataset is challenging for face detection, but it contains diverse real-world face poses.
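To make the filtering rules and the six annotated attributes described above concrete, here is a small sketch. The class names, field names, value encodings and the `keep_image` helper are hypothetical and do not reflect the actual MAFA file layout; only the 80-pixel minimum side length and the list of attributes come from the text.

```python
from dataclasses import dataclass
from typing import List, Tuple

MIN_SIDE = 80  # minimum image side length in pixels, as stated in the text

Box = Tuple[int, int, int, int]  # (x, y, width, height), an assumed convention


@dataclass
class FaceAnnotation:
    """The six attributes annotated per face (field names are assumptions)."""
    face_box: Box
    eye_points: Tuple[Tuple[int, int], Tuple[int, int]]
    mask_box: Box
    orientation: str        # e.g. "front", "left", "right"
    occlusion_degree: int   # assumed encoding: 0 means no occlusion
    mask_type: str          # "simple", "complex", "human_body" or "hybrid"


@dataclass
class ImageRecord:
    width: int
    height: int
    faces: List[FaceAnnotation]


def keep_image(record: ImageRecord) -> bool:
    """Keep images with at least one occluded face and a large enough side."""
    has_occlusion = any(face.occlusion_degree > 0 for face in record.faces)
    return has_occlusion and min(record.width, record.height) >= MIN_SIDE


masked = FaceAnnotation((10, 10, 60, 60), ((25, 30), (55, 30)),
                        (10, 40, 60, 30), "left", 2, "simple")
bare = FaceAnnotation((10, 10, 60, 60), ((25, 30), (55, 30)),
                      (0, 0, 0, 0), "front", 0, "simple")

kept = keep_image(ImageRecord(120, 90, [masked]))       # occluded, large enough
too_small = keep_image(ImageRecord(120, 64, [masked]))  # side below 80 pixels
no_occlusion = keep_image(ImageRecord(120, 90, [bare])) # face is not occluded
```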
3. Methods
Masked face detection problems can be divided into two stages. The first stage is face detection: detecting the faces in the images of the datasets. In this stage, we will use the Haar Classifier [17] and Cascaded Convolutional Networks [18] methods. The Haar Classifier can easily be applied to all three types of images mentioned in section 2; however, this method works better with frontal faces. That is why I also consider using cascaded convolutional networks, which work more effectively with both frontal faces and faces seen from other angles (such as the left-side and right-side faces in the MAFA dataset mentioned in section 2). The second stage is predicting whether the detected faces are wearing masks or not. This section analyses some methods used by major researchers for each of the above-mentioned stages. While much research has been done on the first stage, the second stage has only recently caught researchers’ attention.
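The two-stage structure described above can be sketched as a simple composition of a face detector and a mask classifier. This is only an outline under stated assumptions: `detect_masks`, `crop`, the box convention and the stand-in detector and classifier below are hypothetical placeholders, not the methods reviewed in this section.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]   # (x, y, width, height), an assumed convention
Image = Sequence[Sequence[int]]   # grayscale image as rows of pixel values


def crop(image: Image, box: Box) -> List[List[int]]:
    """Cut the face region out of the image."""
    x, y, w, h = box
    return [list(row[x:x + w]) for row in image[y:y + h]]


def detect_masks(
    image: Image,
    detect_faces: Callable[[Image], List[Box]],
    classify_face: Callable[[Image], str],
) -> List[Tuple[Box, str]]:
    """Stage 1 finds face boxes; stage 2 labels each cropped face."""
    results = []
    for box in detect_faces(image):
        results.append((box, classify_face(crop(image, box))))
    return results


# Stand-in components, for illustration only.
def fake_detector(image: Image) -> List[Box]:
    return [(1, 1, 2, 2)]


def fake_classifier(face: Image) -> str:
    mean = sum(sum(row) for row in face) / (len(face) * len(face[0]))
    return "mask" if mean > 100 else "no_mask"


picture = [[0, 0, 0, 0],
           [0, 200, 200, 0],
           [0, 200, 200, 0],
           [0, 0, 0, 0]]
labels = detect_masks(picture, fake_detector, fake_classifier)
```

Keeping the two stages behind separate functions makes it possible to swap in a different detector or classifier without changing the rest of the pipeline.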
3.1. Face Detection
3.1.1. Facial Feature Detection Using Haar Classifiers
Based on the Haar feature, Viola and Jones [19] proposed a method called Haar Classifiers that is used to detect objects rapidly. This method can also be adapted to detect human faces. The common Haar features fall into three types: edge features, line features and center-surround features. These features are run all over the image in order to detect five important parts of the face: the two eyes, the nose, the mouth, and the overall face shape. This helps narrow the analysis to the areas of the image with the highest probability of containing each feature. Although the limitation of this method is that it can only detect frontal faces, the number of false positives is reduced and the detection speed is increased.
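As an illustration of how an edge feature can be evaluated, the sketch below computes a two-rectangle Haar-like feature using an integral image, the data structure the Viola-Jones detector relies on for speed. The function names and the tiny test image are assumptions made for this example; a real detector combines many such features in a boosted cascade, which is not shown here.

```python
def integral_image(img):
    """Build a summed-area table with one extra row and column of zeros."""
    height, width = len(img), len(img[0])
    table = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += img[y][x]
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table


def rect_sum(table, x, y, w, h):
    """Sum of the pixels in the rectangle with top-left (x, y) and size w x h."""
    return (table[y + h][x + w] - table[y][x + w]
            - table[y + h][x] + table[y][x])


def edge_feature(table, x, y, w, h):
    """Two-rectangle edge feature: left half minus right half."""
    half = w // 2
    left = rect_sum(table, x, y, half, h)
    right = rect_sum(table, x + half, y, half, h)
    return left - right


# A 4x4 image with a bright left half and a dark right half.
image = [[10, 10, 0, 0] for _ in range(4)]
table = integral_image(image)
response = edge_feature(table, 0, 0, 4, 4)
```

A strong response indicates a vertical brightness edge inside the window, while a uniform region gives a response of zero.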
3.1.2. Joint Face Detection and Alignment using Multi-task
Cascaded Convolutional Networks
The authors of this paper [18] proposed a deep cascaded multi-task framework for face detection and alignment, in order to detect faces in difficult environments, such as faces with various poses (including left-side and right-side faces) or faces under challenging illumination and occlusion conditions. The method contains three main stages to predict face and landmark locations: the Proposal Network (P-Net), the Refinement Network (R-Net) and the Output Network (O-Net). More specifically, when the machine receives a picture as input, it first resizes the picture into an image pyramid (that is, copies of the same picture at different sizes) so that faces can be detected at different levels of detail. Next, the image is processed by P-Net, whose output is a set of candidate boxes marking possible faces in the image. In other words, the machine “scans” the image, finds multiple boxes that it considers to be human faces, and then eliminates the boxes with lower confidence (for example, some boxes are smaller than others and score lower). This stage produces more than one box, even though a picture sometimes contains only one face, because in some cases a picture contains more than one face or person. After that comes the R-Net stage, which is similar to the P-Net stage. However, this stage keeps eliminating boxes with lower face confidence and focuses more on the shape and angle of the face, so that it can settle on the final box that fits the face in the image. This leads to the final stage, O-Net. Based on the face box, this stage finalizes the positions of the five most important facial landmarks: the two eyes, the nose and the two corners of the mouth. Overall, this paper gives a bigger picture of what face detection is, how approaches differ, and how this method can help detect a person’s face, which helps a lot with a challenging dataset like MAFA.
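Two steps of the cascade described above lend themselves to a short sketch: choosing the scales of the image pyramid, and discarding low-confidence or heavily overlapping candidate boxes with non-maximum suppression (NMS), which the MTCNN paper uses to merge candidates. The function names and the default values (a 12-pixel network input, a minimum face size of 20 pixels, a scale factor of 0.709 and both thresholds) are assumptions borrowed from common implementations, not values stated in this review.

```python
def pyramid_scales(height, width, min_face_size=20, factor=0.709, net_input=12):
    """Scales at which the picture is resized before the first network."""
    scale = net_input / min_face_size
    min_side = min(height, width) * scale
    scales = []
    while min_side >= net_input:
        scales.append(scale)
        scale *= factor
        min_side *= factor
    return scales


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5, score_threshold=0.6):
    """Return indices of kept boxes, highest confidence first."""
    candidates = [i for i in range(len(boxes)) if scores[i] >= score_threshold]
    candidates.sort(key=lambda i: scores[i], reverse=True)
    kept = []
    for i in candidates:
        # Keep a box only if it does not overlap a better box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
    return kept


scales = pyramid_scales(100, 100)
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60), (0, 0, 5, 5)]
scores = [0.9, 0.8, 0.7, 0.3]
kept = nms(boxes, scores)
```

In this example the second box overlaps the first too strongly and the fourth box falls below the confidence threshold, so only the first and third boxes survive.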
3.2. Masked Face Classification
3.2.1. Masked Face Classification Using Convolutional Neural
Network
Convolutional Neural Networks (CNNs) have been widely used for image classification tasks, based on the structure of LeNet [20] and AlexNet [21]. They have applications in image and video recognition, recommendation systems, image classification, medical image analysis, natural language processing and financial time
series. Above all, CNNs are known for outstanding classification performance on image data. A CNN consists of an input and an output layer, as well as multiple hidden layers [22]. A basic architecture consists of convolution layers followed by pooling layers (usually max pooling) and fully connected (FC) layers. FC layers connect every neuron in one layer to every neuron in another layer [22]. The final activation function is softmax, with one output per class. Multiple modern models have been proposed for image classification, such as VGGNet [23], GoogLeNet [24] and ResNets [25].
Figure 2: Example images of common Haar features (1), a feature applied to an image (2), and detected objects: face (white), eyes (red), nose (blue) and mouth (green) (3)
Figure 3: Example image of the CNN framework that includes three-stage multi-task deep convolutional networks.
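The building blocks named above (convolution, max pooling, a fully connected layer and softmax) can be written out in a few lines of plain Python. This is a toy sketch for illustration, not a trainable model: the function names, the tiny input and all weights are made up, and in practice a deep learning library would be used instead.

```python
import math


def conv2d(image, kernel):
    """Valid 2D convolution (cross-correlation, as in most CNN libraries)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[y + i][x + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for x in range(out_w)]
            for y in range(out_h)]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling with a square window."""
    return [[max(feature_map[y + i][x + j]
                 for i in range(size) for j in range(size))
             for x in range(0, len(feature_map[0]) - size + 1, size)]
            for y in range(0, len(feature_map) - size + 1, size)]


def fully_connected(inputs, weights, biases):
    """Every input is connected to every output neuron."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def softmax(logits):
    """Turn class scores into probabilities that sum to one."""
    peak = max(logits)  # subtracting the maximum avoids overflow
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
features = conv2d(image, [[1, 0], [0, 1]])
pooled = max_pool(features)
flat = [v for row in pooled for v in row]
# One made-up weight row per class: mask, no mask, occluded.
logits = fully_connected(flat, [[0.1], [0.0], [-0.1]], [0.0, 0.0, 0.0])
probabilities = softmax(logits)
```

The three probabilities correspond to the three image types discussed in section 2, and the class with the largest probability is the prediction.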
4. Conclusion
This literature review covered how machine learning can be used to detect face masks while COVID-19 threatens people’s health all over the world, which will hopefully help improve technology’s contribution to public healthcare. Here I use three datasets that contain three main types of images. Each dataset has its own focus; however, depending on the time available, I may reconsider which dataset(s) to use to train and test my model. On the methods side, the first stage is face detection, which helps detect faces in public places. I am considering Haar classifiers and Multi-task Cascaded Convolutional Networks to identify faces in crowded places and faces that are partly blocked by something. Haar classifiers would mainly detect frontal faces, and Multi-task Cascaded Convolutional Networks would help detect left-side and right-side faces. The second stage is masked face detection using LLE-CNNs, in order to determine the type of mask on people’s faces so that the machine does not mistake another object occluding the face, such as a phone, hand or scarf, for a mask.
The most challenging part of this work is training a model that does not mistake something else for a mask. For now, given the time limit, the expected accuracy of the method is 50%.
5. References
[1] W. H. Organization, “Weekly epidemiological and operational
(covid-19) updates august 2020,” August 2020.
[2] K.-S. Yuen, Z.-W. Ye, S.-Y. Fung, C.-P. Chan, and D.-Y. Jin,
“Sars-cov-2 and covid-19: The most important research ques-
tions,” Cell & bioscience, vol. 10, no. 1, pp. 1–5, 2020.
[3] C. for Disease Control Syndrome, “Severe acute respiratory syn-
drome (sars),” May 2020.
[4] ——, “Severe acute respiratory syndrome (sars),” May 2020.
[5] Y. Liu, A. A. Gayle, A. Wilder-Smith, and J. Rocklöv, “The reproductive
number of COVID-19 is higher compared to SARS coronavirus,”
Journal of Travel Medicine, vol. 27, no. 2, 02 2020, taaa021.
[Online]. Available: https://doi.org/10.1093/jtm/taaa021
[6] Y. Fang, Y. Nie, and M. Penny, “Transmission dynamics of the
covid-19 outbreak and effectiveness of government interventions:
A data-driven analysis,” Journal of Medical Virology, vol. 92, 2020.
[7] N. H. Leung, D. K. Chu, E. Y. Shiu, K.-H. Chan, J. J. McDevitt,
B. J. Hau, H.-L. Yen, Y. Li, D. K. Ip, J. M. Peiris et al., “Respira-
tory virus shedding in exhaled breath and efficacy of face masks,”
Nature medicine, vol. 26, no. 5, pp. 676–680, 2020.
[8] S. C. Dacona Smith Chief Operating Office, Lance de la Rosa
Chief Operating Office, “A simple step to help keep you safe:
Walmart and sam’s club require shoppers to wear face covering,”
2020.
[9] T. B. Chris Anderson, “Guest over the age of 2 required to wear
masks when children’s museum of cleveland reopens on monday,”
2020.
[10] D. Nelson, “All the restaurants stores that require masks right
now,” 2020.
[11] Y. Fang, Y. Nie, and M. Penny, “Transmission dynamics of the
covid-19 outbreak and effectiveness of government interventions:
A data-driven analysis,” Journal of Medical Virology, 2020.
[12] C. Gillespie, “Why do some people refuse to wear a face mask in
public?” 2020.
[13] “Mask dataset.” [Online]. Available:
https://makeml.app/datasets/mask
[14] Z. Wang, G. Wang, B. Huang, Z. Xiong, Q. Hong, H. Wu, P. Yi,
K. Jiang, N. Wang, Y. Pei et al., “Masked face recognition dataset
and application,” arXiv preprint arXiv:2003.09093, 2020.
Nose (blue), and Mouth (green) (3)
Figure 3: Example images of CNN framework that includes three-stage multi-task deep convolutional networks.
series. Nevertheless, CNN is known for outstanding classifi-
cation performance on image data. A CNN consists of an in-
put and an output layer, as well as multiple hidden layers [22].
Basic architect would be convolution layers follow by pooling
layer (usually max pooling), and fully connected (FC) layers.
FC layers connect every neuron in one layer to every neuron in
another layer [22]. The final activation function would be Soft-
max corresponding to the number of classes. Multiple modern
models have been proposed for Image classifcation like VG-
GNet [23], GoogleNet [24], ResNets [25].
4. Conclusion
This literature review covered how machine learning used to
detect face masks during COVID-19 situations threatened peo-
ple’s health all over the world, which hopefully can help in im-
proving technology to public healthcare’s contribution. In here
I use three sources of dataset contain three main type of images.
Each dataset have it own unique, however, due to time permit, I
might consider again which dataset(s) I should use to train and
test my model. For the method part, the first method is Face De-
tection in use to help detect faces in public place. I consider to
use Haar Classification and Multi-task Cascaded Convolutional
Networks to identify faces in crowded places or focus on faces
when getting blocked by something. Haar Classification would
use to detect mainly front-face angle and Multi-task Cascaded
Convolution Networks would help in detect faces with left and
right-side faces angle. The second method is Masked Detec-
tion using LLE-CNNs in order to determine the type of mask
on people’s faces so that the machine does not confuse when
the person’s face occlude by other thing such as a phone, hand
or scarf as a mask.
Although the most challenging of all the paper is how to
train a model so that the machine will not misunderstand some-
thing else as a mask, at this moment under time limit, the ex-
pected accuracy for the method’s result is 50%.
5. References
[1] World Health Organization, “Weekly epidemiological and operational
(COVID-19) updates August 2020,” August 2020.
[2] K.-S. Yuen, Z.-W. Ye, S.-Y. Fung, C.-P. Chan, and D.-Y. Jin,
“SARS-CoV-2 and COVID-19: The most important research questions,”
Cell & Bioscience, vol. 10, no. 1, pp. 1–5, 2020.
[3] Centers for Disease Control and Prevention, “Severe acute
respiratory syndrome (SARS),” May 2020.
[4] Centers for Disease Control and Prevention, “Severe acute
respiratory syndrome (SARS),” May 2020.
[5] Y. Liu, A. A. Gayle, A. Wilder-Smith, and J. Rocklöv, “The reproductive
number of COVID-19 is higher compared to SARS coronavirus,”
Journal of Travel Medicine, vol. 27, no. 2, 02 2020, taaa021.
[Online]. Available: https://doi.org/10.1093/jtm/taaa021
[6] Y. Fang, Y. Nie, and M. Penny, “Transmission dynamics of the
COVID-19 outbreak and effectiveness of government interventions:
A data-driven analysis,” Journal of Medical Virology, vol. 92, 2020.
[7] N. H. Leung, D. K. Chu, E. Y. Shiu, K.-H. Chan, J. J. McDevitt,
B. J. Hau, H.-L. Yen, Y. Li, D. K. Ip, J. M. Peiris et al.,
“Respiratory virus shedding in exhaled breath and efficacy of face masks,”
Nature Medicine, vol. 26, no. 5, pp. 676–680, 2020.
[8] D. Smith and L. de la Rosa, “A simple step to help keep you safe:
Walmart and Sam’s Club require shoppers to wear face covering,”
2020.
[9] T. B. Chris Anderson, “Guest over the age of 2 required to wear
masks when Children’s Museum of Cleveland reopens on Monday,”
2020.
[10] D. Nelson, “All the restaurants stores that require masks right
now,” 2020.
[11] Y. Fang, Y. Nie, and M. Penny, “Transmission dynamics of the
COVID-19 outbreak and effectiveness of government interventions:
A data-driven analysis,” Journal of Medical Virology, 2020.
[12] C. Gillespie, “Why do some people refuse to wear a face mask in
public?” 2020.
[13] “Mask dataset.” [Online]. Available:
https://makeml.app/datasets/mask
[14] Z. Wang, G. Wang, B. Huang, Z. Xiong, Q. Hong, H. Wu, P. Yi,
K. Jiang, N. Wang, Y. Pei et al., “Masked face recognition dataset
and application,” arXiv preprint arXiv:2003.09093, 2020.
[15] AIZOOTech, “FaceMaskDetection,”
https://github.com/AIZOOTech/FaceMaskDetection, 2020.
[16] S. Ge, J. Li, Q. Ye, and Z. Luo, “Detecting masked faces in the
wild with LLE-CNNs,” in 2017 IEEE Conference on Computer Vision
and Pattern Recognition (CVPR), 2017, pp. 426–434.
[17] P. Viola and M. Jones, “Rapid object detection using a boosted
cascade of simple features,” 2001.
[18] K. Zhang, Z. Zhang, Z. Li, and Y. Qiao, “Joint face detection
and alignment using multitask cascaded convolutional networks,”
IEEE Signal Processing Letters, vol. 23, no. 10, pp. 1499–1503,
2016.
[19] P. I. Wilson and J. Fernandez, “Facial feature detection using Haar
classifiers,” Journal of Computing Sciences in Colleges, vol. 21,
no. 4, pp. 127–133, 2006.
[20] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based
learning applied to document recognition,” 1998.
[21] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet
classification with deep convolutional neural networks.”
[22] W. contributors, “Convolutional neural network — Wikipedia, the
free encyclopedia,” 2020, online; accessed 7-September-2020.
[23] M. Hollemans, “Convolutional neural networks on
the iPhone with VGGNet,” 2006. [Online]. Available:
https://machinethink.net/blog/convolutional-neural-networks-on-the-iphone-with-vggnet/
[24] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov,
D. Erhan, V. Vanhoucke, and A. Rabinovich, “Going deeper
with convolutions,” 2014.
[25] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for
image recognition,” 2015.
https://github.com/AIZOOTech/FaceMaskDetection, 2020.
[16] S. Ge, J. Li, Q. Ye, and Z. Luo, “Detecting masked faces in the
wild with lle-cnns,” in 2017 IEEE Conference on Computer Vision
and Pattern Recognition (CVPR), 2017, pp. 426–434.
[17] M. J. Paul Viola, “Rapid object detection using a boosted cascade
of simple features,” 2001.
[18] K. Zhang, Z. Zhang, Z. Li, and Y. Qiao, “Joint face detection
and alignment using multitask cascaded convolutional networks,”
IEEE Signal Processing Letters, vol. 23, no. 10, pp. 1499–1503,
2016.
[19] P. I. Wilson and J. Fernandez, “Facial feature detection using haar
classifiers,” Journal of Computing Sciences in Colleges, vol. 21,
no. 4, pp. 127–133, 2006.
[20] Y. B. P. H. Yann LeCun, L´eon Botton, “Gradient-based learning
applied to document recognition,” 1998.
[21] G. E. Alex Krizhevsky, Illya Sutkever, “Imagenet classification
with deep convolutional neural networks.”
[22] W. contributors, “Convolutional neural network — Wikipedia, the
free encyclopedia,” 2020, online; accessed 7-September-2020.
[23] M. Hollemans, “Convolutional neural networks on
the iphone with vggnet,” 2006. [Online]. Avail-
able: https://machinethink.net/blog/convolutional-neural-
networks-on-the-iphone-with-vggnet/
[24] Y. J. P. S. S. R. D. A. D. E. V. V. A. R. G. I. U. o. N. C. C. H. U.
o. M. A. A. M. L. I. Christian Szegedy, Wei Liu, “Going deeper
with convolutions,” 2014.
[25] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for
image recognition,” 2015.
1 out of 4
Related Documents
Your All-in-One AI-Powered Toolkit for Academic Success.
+13062052269
info@desklib.com
Available 24*7 on WhatsApp / Email
Unlock your academic potential
Copyright © 2020–2026 A2Z Services. All Rights Reserved. Developed and managed by ZUCOL.




