Analyzing ASR Software for Enhanced English Pronunciation Skills

Added on  2019/12/28

Report
AI Summary
This report investigates the effectiveness of Automatic Speech Recognition (ASR) technology, particularly Eyespeak software, in enhancing English pronunciation skills for students, especially those learning English as a second language. It explores the basic concepts of ASR, its application in Computer-Assisted Language Learning (CALL), and the dimensions of ASR-based CALL, including pedagogical requirements such as input, output, and feedback. The report also examines the use of audio and visual training sessions and the role of speech technology in language learning. It highlights the benefits of ASR in providing personalized learning experiences, reducing dependency on teachers, and offering access to diverse learning materials. Furthermore, the report discusses the importance of pronunciation in communication and the role of Computer-assisted Pronunciation Training (CAPT) systems in improving language skills. The report concludes by emphasizing the potential of ASR to address pronunciation challenges and improve the overall learning experience for English language learners.
The usefulness of using Automatic
Speech Recognition (ASR)
Eyespeak software in improving
students' English pronunciation
TABLE OF CONTENTS
ABSTRACT
INTRODUCTION
1. Basic Concept of ASR
2. ASR in improving students' English pronunciation
3. Dimensions of ASR based CALL
3.1 Pedagogical requirements
3.1.1 Input
3.1.2 Output
3.1.3 Feedback
3.2 Audio and visual sessions of training
3.3 Speech technology in language learning
4. Effectiveness of ASR
5. ASR for teaching pronunciation
CONCLUSION
REFERENCES
ABSTRACT
This study focuses on how reliably Automatic Speech Recognition (ASR) technology can be
used to teach English pronunciation to students. It concerns learners for whom English is a
second language and who want to improve their pronunciation. The report evaluates the main
concepts behind ASR software and treats it as a leading support tool whose features adapt to
each student's preferred style of learning.
INTRODUCTION
Technology is developing at a rapid pace in today's working environment. However, there
are still areas in which technology has yet to achieve genuinely self-reliant systems. One
route is to build artificially intelligent systems that communicate naturally and make
predictions the way humans do. This remains an unresolved problem on which scientists are
steadily experimenting (Beelders and Blignaut, 2011), and it is expected to be an area of
continued development. Automatic Speech Recognition (ASR) technology is one such
development, and it has already produced inventions that benefit specialised groups of
users. ASR is a technology that allows people to use their voice to interact with a computer
program through its interface. In this way, the interaction resembles an ordinary
conversation between human beings.
1. Basic Concept of ASR
Speech recognition (SR) is an important topic in electrical engineering and computer
science, where it refers to translating spoken words into text. It is another term for ASR
and is also known as speech to text (STT). ASR software works automatically, using
computer-driven processing of recorded speech to convert it into readable text (Demenko,
2009). In this context, ASR is a technology that lets an electronic device identify the
phrases a person speaks into a telephone or microphone and convert them into text. The ideal
ASR system would interpret any fluently spoken speech, in any language, with 100% accuracy.
However, ASR still struggles in noisy surroundings and when the speaker does not
produce clear, fluent speech. Even so, its use is growing daily as it is built into more
applications (Cucchiarini, 2009). Research now aims to let ASR systems improve their
recognition power with much larger vocabularies. This report is concerned with applying ASR
as an aid to learning a second language, particularly widely spoken international languages,
by making it easier for learners to pick up their distinct pronunciation.
ASR is used in the Eyespeak software to improve the English pronunciation of foreign
students. Students who are learning English as a second language often want to improve
their pronunciation first so that they can communicate effectively (Kim, 2006). The most
advanced ASR technologies currently available are built around Natural Language Processing
(NLP). This variant of ASR comes closest to allowing people to hold a real conversation with
a machine, although it still has room for improvement because it gives only 96 to 99%
accuracy.
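The accuracy figure quoted above is not defined in the report. One common way to measure ASR accuracy is the word error rate (WER): the number of word substitutions, deletions and insertions needed to turn the recogniser's output into the reference sentence, divided by the number of reference words. The sketch below is a minimal illustration under that assumption; the function names are hypothetical and are not part of Eyespeak or any other product mentioned here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # previous[j] is the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Accuracy taken as 1 - WER, floored at zero (an assumed definition)."""
    return max(0.0, 1.0 - word_error_rate(reference, hypothesis))


if __name__ == "__main__":
    spoken = "the cat sat on the mat"
    heard = "the cat sit on mat"
    print(f"WER: {word_error_rate(spoken, heard):.2f}")
    print(f"Accuracy: {word_accuracy(spoken, heard):.2f}")
```

Under this definition, an accuracy of 96 to 99% would correspond to roughly one to four word errors per hundred words spoken.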
An example is the Siri interface on the iPhone, an advanced system that lets people hold
an open-ended chat that imitates a real conversation. It gives users far more choices than a
system restricted to a limited set of words (Demenko, Wagner and Cylwik, 2010). Directed
dialogue conversation is a simpler form of ASR in which the person can converse with the
machine interface only by selecting from a limited set of choices. The system poses narrowly
defined requests to the user, so information outside that predefined scope cannot be
obtained. Examples include machine-driven telephone banking and other customer service
interfaces that use directed dialogue ASR software.
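The directed dialogue idea described above can be sketched in a few lines: the system accepts only a fixed list of keywords and asks the caller to try again when none is heard. This is an illustrative sketch only. The menu entries and function names are hypothetical, and the transcript is assumed to have already been produced by some speech recogniser.

```python
from typing import Optional

# Hypothetical menu for a telephone banking line; the keys are the
# only words the dialogue accepts.
MENU = {
    "balance": "Reading out your account balance.",
    "transfer": "Starting a money transfer.",
    "agent": "Connecting you to a customer service agent.",
}


def match_choice(transcript: str) -> Optional[str]:
    """Return the first menu keyword found in the recognised text, or None."""
    for word in transcript.lower().split():
        cleaned = word.strip(".,!?")
        if cleaned in MENU:
            return cleaned
    return None


def respond(transcript: str) -> str:
    """Answer a recognised utterance, or repeat the allowed choices."""
    choice = match_choice(transcript)
    if choice is None:
        options = ", ".join(MENU)
        return f"Sorry, please say one of: {options}."
    return MENU[choice]


if __name__ == "__main__":
    print(respond("I would like my balance, please"))
    print(respond("What is the weather today?"))
```

Because the vocabulary is so small, the recogniser only has to tell a handful of words apart, which is why this form of ASR is the simpler of the two.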
2. ASR in improving students' English pronunciation
Here ASR is used to help learners of English improve their pronunciation, especially
foreigners for whom English is a second language. It falls under Computer-Assisted Language
Learning (CALL) built on ASR software. Speech technology is increasingly used, most visibly
in foreign language pedagogy (Strik and Cucchiarini, 2009), and this has led to newer
branches of CALL. The approach has clear benefits: it eases time constraints by reducing
students' dependence on their teachers. It also lets users work at their own pace without
stress and store a profile so that they can track their progress in the subject. It gives
continual access to additional learning materials, such as visualisations and recordings,
that play an important role in learning English as a second language (L2), particularly its
pronunciation.
CALL also supports a more creative outlook in language teaching. One example is
constructivist learning in a multimedia environment, in which students and instructors
collaborate on projects. Historically, pronunciation was largely ignored in language
learning in favour of vocabulary and grammar. Tutors believed that pronunciation drills had
no substantial effect and could even be detrimental (Strik, 2009). However, later
investigations showed that pronunciation is important for communication and that it also
improves learners' cognitive abilities. Regular training sessions contribute greatly to
improving pronunciation, which has in turn increased learners' interest in working on it.
Computer-Assisted Pronunciation Training (CAPT) systems apply the techniques of CALL to help
students improve their English pronunciation.
3. Dimensions of ASR based CALL
A closer look at a CALL system shows that its ASR features provide an effective solution
for pronunciation learning. CAPT has three distinct requirements, each with its own
attributes, as discussed below.
3.1 Pedagogical requirements
This is the foremost factor affecting pronunciation learning, because pedagogy is based
on distinct methods of teaching (Wester, 2013). It is divided into three components that
together support CAPT, as described below.
3.1.1 Input
Input is a basic ingredient of successful language learning. Students should have access
to ample input so that they have accessible target models to learn from. Many types of input
exist to support learners, and a majority of studies have shown that input benefits the
process of pronunciation learning (Zavaliagkos, 2011). However, the learning material should
be genuine and relevant to the student's needs, which also helps the learner stay with the
plan in the long term. Input can be presented in written, oral or audio-visual form and
should correspond to the different learning styles of individuals.
3.1.2 Output
Output is another significant factor in CAPT: learners need to go beyond merely
listening. Foreign students who want to acquire good English pronunciation should take an
active part in practising an intelligible accent (Elimat and AbuSeileek, 2014). Output means
encouraging speech production, ideally in a stress-free environment where students do not
hesitate to take part. It is also a communicative task in L2 training for university
students, who are often anxious about speaking in a foreign language. This is because they
are afraid of failing and of losing their linguistic identity.
3.1.3 Feedback
Feedback is a debated component: it has been little investigated, and there is no clear
agreement on its effectiveness, especially for university students learning English
pronunciation as an L2. Nevertheless, it is regarded as essential for correcting learners'
errors (McCrocklin, 2014). The feedback method most commonly used by teachers is the recast,
broadly known as repetition with change. Feedback should not rely on the students' own
perceptions and should encourage learners towards self-improvement.
3.2 Audio and visual sessions of training
Another element of CAPT is training students with audio-visual equipment (Neri, 2007).
Audio-visual technicians play an important part here: they are responsible for setting up
the equipment used to train learners, in support of the educators. This applies mainly to
organised conferences and seminars held to instruct learners.
3.3 Speech technology in language learning
This element concerns problem recognition. Educators are responsible for identifying any
difficulty students have in delivering fluent, well-pronounced speech in English
(Zajechowski, 2014). Instructors then need to give feedback to improve the learners' English
pronunciation.
4. Effectiveness of ASR
Several aspects make an ASR system effective in helping an organisation accomplish its
goals and objectives. ASR improves accessibility for people who are deaf or hard of hearing,
and it can help users improve their listening skills (Demenko, 2009). This matters because
pronunciation development also depends on oral and listening skills. Traditional methods of
pronunciation development are complex, which increases the overall cost of a programme.
Because an ASR system works automatically, it helps to reduce that cost.
Searchable text is another key benefit of developing pronunciation through an ASR
system. Diverse standards of pronunciation can be maintained with ASR, because it allows
authentic interaction with the target language and its users (Strik and Cucchiarini, 2009).
It also promotes group work, which helps learners interact with one another. Pronunciation
improvement depends greatly on interaction with other people because it is tied to oral
skills. ASR also allows students to take part in open-ended learning through diverse
activities, and it helps them learn about the target language and the social contexts in
which it is used. Exploring personal and societal goals is a further benefit for learners
(Zavaliagkos, 2011).
ASR depends on computers, which support the pronunciation module. Students use computers
as tools for making proper use of information, and computers allow language practice through
online and software programs. Computers have great capacity to improve language learning,
which benefits pronunciation learning (Wester, 2013). A typical ASR system receives acoustic
input from the speaker through a microphone. It then analyses the input against stored
patterns and models, which lets the system produce its output as text. This saves time and
is convenient, because the whole process relies on technology rather than manual effort.
ASR is also useful for automated telephone lines, where it improves the level of
interaction. Pronunciation practice can also be supported by Siri on the iPhone, which is
itself an ASR program that depends on pronunciation (Zavaliagkos, 2011). Siri accepts
commands only if the words are pronounced accurately, so students can use it to improve
their pronunciation skills. When used for pronunciation training, ASR is a tool that allows
students to practise at their own speed, getting feedback from the words recognized.
However, ASR has been criticized in dictation programs for low rates of accurate recognition
for non-native speakers of the language.
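The idea of getting feedback from the words recognized can be illustrated by comparing the sentence a student was asked to read with the text the recogniser returned. The sketch below uses Python's standard difflib module for the comparison. It is an assumption about how such feedback could work, not a description of Eyespeak or Siri, and the function names are hypothetical. A mismatch only suggests a possible pronunciation problem, since the recogniser itself can be wrong.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

PUNCTUATION = ".,!?;:"


def _words(text: str) -> List[str]:
    """Lower-case the text and strip surrounding punctuation from each word."""
    cleaned = (word.strip(PUNCTUATION).lower() for word in text.split())
    return [word for word in cleaned if word]


def pronunciation_feedback(target: str, recognised: str) -> List[Tuple[str, str]]:
    """Return (expected, heard) pairs for target words that were not recognised."""
    target_words = _words(target)
    heard_words = _words(recognised)
    matcher = SequenceMatcher(a=target_words, b=heard_words, autojunk=False)
    problems = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # "replace": a different word was heard; "delete": nothing was heard.
        if tag in ("replace", "delete"):
            expected = " ".join(target_words[i1:i2])
            heard = " ".join(heard_words[j1:j2])
            problems.append((expected, heard))
    return problems


if __name__ == "__main__":
    for expected, heard in pronunciation_feedback(
        "I think this is three", "I sink this is tree"
    ):
        print(f"Expected '{expected}' but heard '{heard or 'nothing'}'")
```

A student could repeat the flagged words at their own pace until the recogniser returns the target sentence unchanged.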
ASR systems have, however, been improving in evaluation accuracy for non-native
speakers. Such programs also recognise pronunciation, which makes a learning programme more
effective. Text assistance can also be provided for Dutch speakers. Work to improve ASR's
accuracy in evaluation continues, and ASR programs geared toward non-native speakers have
reasonably high levels of accuracy (Strik and Cucchiarini, 2009).
5. ASR for teaching pronunciation
Learning pronunciation is a critical challenge for adults. A key difficulty for students
is perceiving sound contrasts in the foreign language. Teachers use ASR to raise awareness
of pronunciation. ASR opens up new possibilities for training conversational skills and
allows new features to be added to teaching environments. It does require the support of a
CALL system, and it offers a number of advantages for pronunciation training (Demenko,
2009). Focusing on these aspects promotes self-directed learning in the classroom because it
allows the use of the foreign language to be reduced.
Education systems that incorporate ASR modules allow better interaction and identify the
errors an individual makes. The learner receives immediate feedback so that pronunciation
errors can be corrected. A number of commercial institutions now use ASR technology to teach
L2 pronunciation (Cucchiarini, 2009). CALL applications are also important for using ASR
technology effectively. ASR-based systems allow students to participate actively in the oral
skills module, and feedback helps identify mistakes in learning. Developing an ASR module is
therefore significant, because it helps to recognise and score non-native speech. It is also
essential that the ASR module is introduced with proper attention to the Dutch language
course.
ASR also supports pronunciation training in group classes with a teacher. Part of the
training can be organised as individual sessions on the computer to give the programme
better support. During individual sessions, the lecture notes and all interaction with the
student should be stored in a log file (Strik, 2009). Teachers also need to supervise the
sessions to make sure that standards are maintained and to deal with students' problems as
they arise. The presence of teachers is beneficial because it helps reduce the anxiety of
technophobic learners. Overall, this combination benefits the working of the teaching
programme.
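The recommendation to store each individual session in a log file can be sketched as follows. This is a minimal illustration that assumes one JSON record per line. The file layout, field names and function names are all hypothetical, as the report does not specify a log format.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def log_attempt(log_path, student_id, target, recognised, notes=""):
    """Append one practice attempt to the session log and return the record."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "student_id": student_id,
        "target": target,          # sentence the student was asked to say
        "recognised": recognised,  # text returned by the recogniser
        "notes": notes,            # optional teacher or lecture notes
    }
    with Path(log_path).open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry) + "\n")
    return entry


def read_log(log_path):
    """Load every logged attempt so a teacher can review the session."""
    with Path(log_path).open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


if __name__ == "__main__":
    path = "session_log.jsonl"  # hypothetical file name
    log_attempt(path, "student-01", "three", "tree", notes="practise the 'th' sound")
    log_attempt(path, "student-01", "three", "three")
    for record in read_log(path):
        print(record["timestamp"], record["target"], "->", record["recognised"])
```

Appending one record per attempt keeps the log simple to write during a session and easy for a supervising teacher to read back afterwards.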
Effective exercise design is significant because it supports students and helps their
oral skills. It also helps learners understand pronunciation better so that goals and
objectives can be met. Pronunciation errors can be overcome by using an ASR system (Kim,
2006). In this respect, the errors targeted should meet a few criteria: they should be
frequent, persistent, perceptually important and reliably detectable by automatic
techniques. Focusing on these criteria makes the education system more effective.
Pronunciation training also improves when attention is given to both segmental and
suprasegmental aspects. This improves the quality of learners' speech (Cucchiarini, 2009),
because it allows a focus on speech sounds, word stress and sentence stress. Pronunciation
training through an ASR system ensures that students receive feedback in real time. Group
learning can be improved through questionnaire-based evaluation.
CONCLUSION
In summary, this report concludes that ASR plays a crucial role in supporting learners
who are trying to improve their English pronunciation and for whom English is a second
language. The report is divided into five main sections. The first sets out the basic
concept of ASR. The second defines the contribution of ASR to improving the English
pronunciation of learners. The third specifies the dimensions of ASR-based CALL, the fourth
assesses the effectiveness of ASR, and the last describes the roles of ASR in teaching
pronunciation.
REFERENCES
Books and Journals
Beelders, T.R. and Blignaut, P.J. (2011). The Usability of Speech and Eye Gaze as a Multimodal
Interface for a Word Processor. INTECH Open Access Publisher.
Cucchiarini, C. (2009). Comparing different approaches for automatic pronunciation error
detection. Speech Communication. 51(10). pp.845-852.
Demenko, G. (2009). The EURONOUNCE corpus of non-native Polish for ASR-based
pronunciation tutoring system. SlaTE. pp. 85-88.
Demenko, G., Wagner, A. and Cylwik, N. (2010). The use of speech technology in foreign
language pronunciation training. Archives of Acoustics, 35(3), pp.309-329.
Kim, I.S. (2006). Automatic speech recognition: Reliability and pedagogical implications for
teaching pronunciation. Educational Technology & Society. 9(1). pp.322-334.
Strik, H. (2009). Oral proficiency training in Dutch L2: The contribution of ASR-based
corrective feedback. Speech Communication. 51(10). pp.853-863.
Strik, H. and Cucchiarini, C. (2009). Modeling pronunciation variation for ASR: A survey of the
literature. Speech Communication. 29(2). pp.225-246.
Wester, M. (2013). Pronunciation modeling for ASR–knowledge-based and data-derived
methods. Computer Speech & Language. 17(1). pp.69-85.
Zavaliagkos, G. (2011). Stochastic pronunciation modelling from hand-labelled phonetic
corpora. Speech Communication. 29(2). pp.209-224.
Online
Elimat, A. K. and AbuSeileek, A. F. (2014). Automatic speech recognition technology as an
effective means for teaching pronunciation. [PDF]. Available
through:<http://journal.jaltcall.org/articles/10_1_Elimat.pdf>. [Accessed on 5th December
2016].
McCrocklin, S. M. (2014). The potential of Automatic Speech Recognition for fostering
pronunciation learners' autonomy. [Online]. Available
through:<http://lib.dr.iastate.edu/cgi/viewcontent.cgi?article=4909&context=etd>.
[Accessed on 5th December 2016].
Neri, A. (2007). The pedagogical effectiveness of ASR-based Computer Assisted Pronunciation
Training. [PDF]. Available
through:<http://hstrik.ruhosting.nl/wordpress/wp-content/uploads/2013/02/Neri-PhD-thesis.pdf>.
[Accessed on 5th December 2016].
Zajechowski, M. (2014). Automatic Speech Recognition (ASR) Software – An Introduction.
[Online]. Available
through:<http://usabilitygeek.com/automatic-speech-recognition-asr-software-an-introduction/>.
[Accessed on 5th December 2016].