
Corpus Linguistics | Report

   

Added on  2022-09-09

Running Head: CORPUS LINGUISTICS 1
Corpus Linguistics
Name
Professor
Course
Date

"The challenges and benefits of semantically annotating a corpus."
Introduction
Corpus annotation is closely related to corpus mark-up. A corpus is used in linguistics
research to support the extraction of the linguistic facts present in it. To improve the
understanding of a text by machines, semantic information is added to lexical items through
the incorporation of metadata tags. This process is known as semantic annotation (Mautner,
2016). When developing a natural language understanding system, it is important to build a
language resource that covers the linguistic variety found in a given natural language. These
natural language resources, known as corpora, are often accompanied by metadata describing
the document and the tokens that make up the corpus. Adding such metadata to a corpus is
known as labelling or annotation. Annotations that attach information to a corpus can apply
to a sentence, to the text as a whole, to its words, and to its conditions. The process can be
carried out automatically or manually.
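As a minimal sketch of the levels of annotation described above, the following Python
fragment attaches metadata to a document, to a sentence, and to individual tokens. The class
names, tag keys, and sample sentence are assumptions made for illustration; they do not come
from any particular annotation scheme or tool.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    text: str
    tags: dict = field(default_factory=dict)  # word-level annotations


@dataclass
class Sentence:
    tokens: list
    tags: dict = field(default_factory=dict)  # sentence-level annotations


@dataclass
class Document:
    sentences: list
    meta: dict = field(default_factory=dict)  # document-level metadata


# Hypothetical one-sentence corpus, annotated manually at all three levels.
doc = Document(
    meta={"source": "example", "language": "en"},
    sentences=[
        Sentence(
            tags={"type": "declarative"},
            tokens=[
                Token("She", {"pos": "PRON"}),
                Token("left", {"pos": "VERB", "lemma": "leave"}),
                Token("early", {"pos": "ADV"}),
            ],
        )
    ],
)


def tokens_with(document, key, value):
    """Return the text of every token whose annotation `key` equals `value`."""
    return [
        token.text
        for sentence in document.sentences
        for token in sentence.tokens
        if token.tags.get(key) == value
    ]


print(tokens_with(doc, "pos", "VERB"))
```

Keeping the annotations in separate dictionaries leaves the original text untouched, so the
raw corpus can always be recovered from the annotated one.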
Annotations make it possible to draw different types of inferences in the study of natural
language. The applications range from automatic translators to information extractors. The
purpose of annotation is to supply additional information. This extra information helps
establish the context in which a lexical item occurs and so helps eliminate ambiguity
(Th. Gries, 2015). Corpus annotation may be applied at distinct levels of linguistic
structure to give the grammatical class of the annotated constituents, their correlational
phenomena, their morphology, and their phonetics. It may also cover other categories relating
to the structure of the annotated text and to the text as a whole. Several aspects of a
lexical item can be annotated, and they are broadly divided into two categories: syntactic
and semantic.

Syntactic annotations add information about the form of the lexical item, such as its
dictionary form (lemma) and its part-of-speech tag. Semantic annotation, on the other hand,
is the act of attaching to terms the references needed to bring out their actual meaning. The
main contribution of semantic annotation is to remove ambiguities in the meaning of texts as
processed by computers. Following Stubbs and Pustejovsky, semantic annotation can be further
divided into two types: the annotation of semantic types and the annotation of semantic
roles. In semantic typing, a linguistic item is labelled with a type identifier from a
reserved ontology or vocabulary, indicating what it denotes (Alves & Vale, 2017).
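Semantic typing as described above can be sketched in a few lines of Python. The ontology,
the lexicon, and the function names below are hypothetical and kept deliberately small; a
real project would draw its type identifiers from an established ontology or controlled
vocabulary.

```python
# Hypothetical mini-ontology: type identifier -> parent type (None marks the root).
ONTOLOGY = {
    "Entity": None,
    "Person": "Entity",
    "Location": "Entity",
    "Organization": "Entity",
}

# Hypothetical lexicon mapping surface forms to a type identifier from the ontology.
LEXICON = {
    "Marie Curie": "Person",
    "Paris": "Location",
    "UNESCO": "Organization",
}


def annotate_types(phrases):
    """Pair each phrase with its semantic type, or None when no type is known."""
    annotated = []
    for phrase in phrases:
        sem_type = LEXICON.get(phrase)
        if sem_type is not None and sem_type not in ONTOLOGY:
            raise ValueError(f"type {sem_type!r} is not defined in the ontology")
        annotated.append((phrase, sem_type))
    return annotated


def is_a(sem_type, ancestor):
    """Return True if `sem_type` equals `ancestor` or descends from it."""
    while sem_type is not None:
        if sem_type == ancestor:
            return True
        sem_type = ONTOLOGY[sem_type]
    return False


print(annotate_types(["Marie Curie", "visited", "Paris"]))
print(is_a("Person", "Entity"))
```

Because every label must exist in the ontology, a program can reason over the hierarchy, for
example by treating every Person as an Entity.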
In semantic role annotation, on the other hand, a linguistic item is marked as playing a
particular semantic role relative to a role assigner, e.g., a verb. When producing a new
semantically annotated corpus, we use semantic annotation based on high-level concepts, which
are used to enrich web content. The Semantic Web rests on the principle that all content on
the web should be expressed in such a way that computers can readily identify the material
(Gries & Berez, 2017). Semantic annotation based on ontologies plays an essential role in the
semantic enrichment of web content in support of the Semantic Web.
Semantic annotations are important and support a wide range of NLP applications. However,
producing them is extremely resource-intensive and time-consuming. When annotation is carried
out by humans, time, the diversity of the language, and cost still delay the work or prevent
it from being carried out. Automating the annotation routine with computational tools could
therefore offer an answer. NLP thus applies methods that learn from previously annotated
corpora, using machine learning to optimize the task and reduce its complexity (Hilpert &
Gries, 2016).
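To illustrate the idea of learning from previously annotated data, the sketch below trains a
most-frequent-label baseline and applies it to new text. This is a deliberately simple
stand-in for the machine learning methods mentioned above, not the approach of the cited
authors; the training sentences, the label names, and the "O" (no type) and "UNK" (unseen
word) markers are assumptions for illustration.

```python
from collections import Counter, defaultdict


def train(annotated_sentences):
    """Learn, for each word, the label it most often carries in the annotated data."""
    counts = defaultdict(Counter)
    for sentence in annotated_sentences:
        for word, label in sentence:
            counts[word.lower()][label] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}


def annotate(model, words, default="UNK"):
    """Label new words with the learned model; unseen words get `default`."""
    return [(word, model.get(word.lower(), default)) for word in words]


# Hypothetical manually annotated sentences: (word, semantic label) pairs.
TRAINING = [
    [("the", "O"), ("bank", "Organization"), ("closed", "O")],
    [("river", "Location"), ("bank", "Location")],
    [("bank", "Organization"), ("loans", "O")],
]

model = train(TRAINING)
print(annotate(model, ["The", "bank", "reopened"]))
```

The baseline ignores context, so it always assigns "bank" its most frequent label. Resolving
such cases is where context-aware learning methods and larger annotated corpora come in.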

The motivation for creating and using this type of corpus
The motivation for semantically annotating a corpus lies in the added value it gives the
corpus. Semantic annotation enriches the corpus as a source of linguistic information to be
applied in development and future research. Beyond this, there are several further
motivations for semantically annotating a corpus, as follows.
First, it becomes much easier to retrieve relevant information from an annotated corpus.
Semino (2017) observed, for instance, that without part-of-speech tagging it is hard to
retrieve left as an adjective from a raw corpus. The reason is that its various meanings and
uses cannot be determined from its orthographic form or its context alone.
A good illustration is the orthographic form left, which has a meaning opposite to right. In
that sense it can be an adverb, an adjective, or a noun. It can also be the past tense or
past participle form of leave. With the appropriate part-of-speech annotations, these
distinct uses of left can readily be told apart.
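The retrieval problem with left can be sketched as a query over a part-of-speech annotated
corpus. The four sentences, the tag names, and the find function are hypothetical and serve
only to show why the annotation matters.

```python
# Hypothetical annotated corpus: each sentence is a list of (word, part-of-speech) pairs.
# The tag names are illustrative and not tied to a specific tagset.
CORPUS = [
    [("Turn", "VERB"), ("left", "ADV"), ("here", "ADV")],
    [("She", "PRON"), ("left", "VERB"), ("early", "ADV")],
    [("Use", "VERB"), ("your", "PRON"), ("left", "ADJ"), ("hand", "NOUN")],
    [("The", "DET"), ("shop", "NOUN"), ("is", "AUX"), ("on", "ADP"),
     ("the", "DET"), ("left", "NOUN")],
]


def find(corpus, word, pos=None):
    """Return (sentence index, sentence text) for each match of `word`.

    With pos=None the search sees only the orthographic form, as in a raw corpus.
    """
    hits = []
    for index, sentence in enumerate(corpus):
        for token, tag in sentence:
            if token.lower() == word and (pos is None or tag == pos):
                hits.append((index, " ".join(t for t, _ in sentence)))
    return hits


print(len(find(CORPUS, "left")))   # matches on the orthographic form alone
print(find(CORPUS, "left", "ADJ"))  # matches restricted by the annotation
```

Searching on the form alone returns every sentence, while the tag-restricted query isolates
the adjectival use without any manual inspection.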
Corpus annotations also enable machine and human analysts to retrieve and exploit analyses
of which they are not themselves capable.
For example, someone who does not understand Chinese can, given an appropriately annotated
corpus, find out a great deal about the language by using that corpus. The speed of data
extraction is another advantage of a semantically annotated corpus. Even assuming that an
individual can carry out the required linguistic analysis, it is hard for that person to
explore a raw corpus as reliably and swiftly as an annotated one (Strang, 2016).
