Bioinformatics Technologies

This assignment focuses on the fundamentals of Bioinformatics and information gathering in the field of research.


Added on  2023-06-13

About This Document

This article explains the importance of DNA, RNA and protein in protein synthesis. It lists three primary sequence databases for obtaining DNA and protein sequences, three protein structure databases and three genome databases, each with its URL. It also covers Myosin VI, a protein involved in cancer in humans: its function, the major pathways involved and the number of exons present. Finally, it explains how to obtain the protein sequence of Myosin VI using the NCBI Gene Database.

Q1. Basic knowledge of DNA, RNA and protein is important for a better understanding of
the subject. The following questions will help you get an overall idea of the basic biological
molecules and databases. (8 Marks)
a) What is the central dogma? Explain the process briefly and the importance of DNA,
RNA and protein in protein synthesis.
The central dogma of life describes the flow of genetic information from DNA to RNA, and from
RNA to protein, which is the functional product. [“Yourgenome” (n.d).]
Three processes are involved in this flow of information. [David, L.N & Michael,
M.C (2004)]
The first step is “Replication” (DNA → DNA): parental DNA is copied to make a daughter
DNA molecule, so the information is preserved rather than converted.
The second step is “Transcription” (DNA → RNA): the genetic information encoded in DNA
is copied into RNA, specifically messenger RNA (mRNA).
The third step is “Translation” (RNA → protein): the genetic message encoded in mRNA is
translated on the ribosomes into a polypeptide with a particular sequence of amino
acids.
FIG-1: Diagrammatic representation of the central dogma of life. [The
Central Dogma (n.d).]
DNA, RNA and protein each play a crucial role in the central dogma of life. [Lodish, H., Berk, A.
& Zipursky, S.L., et al (2000).]
DNA (deoxyribonucleic acid) stores genetic information and carries it from one
generation to the next.
RNA (ribonucleic acid) carries this information to the ribosomes and helps decode it.
Three kinds of RNA are involved: messenger RNA (mRNA) carries the genetic
information copied from DNA; transfer RNA (tRNA) reads the codons in the mRNA and
delivers the matching amino acids; and ribosomal RNA (rRNA) associates with a set of
proteins to form the ribosomes.
Proteins are the building blocks of the body and carry out most of the tasks that cells
need in order to function.
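The transcription and translation steps described above can be illustrated with a minimal Python sketch. It assumes the input is the coding strand of DNA and uses only a small subset of the standard genetic code, chosen for this example; the function names `transcribe` and `translate` and the example sequence are illustrative and do not come from the assignment.

```python
# Subset of the standard genetic code: only the codons used in the example.
# A complete table has 64 entries; this reduced table is an assumption made
# to keep the sketch short.
CODON_TABLE = {
    "AUG": "M",  # methionine, also the start codon
    "GCU": "A",  # alanine
    "UGG": "W",  # tryptophan
    "AAA": "K",  # lysine
    "UAA": "*",  # stop
    "UAG": "*",  # stop
    "UGA": "*",  # stop
}


def transcribe(coding_dna):
    """Transcription: return the mRNA for a coding-strand DNA sequence (T -> U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Translation: read the mRNA codon by codon until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]  # KeyError if codon not in subset
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


dna = "ATGGCTTGGAAATAA"        # hypothetical coding sequence
mrna = transcribe(dna)         # "AUGGCUUGGAAAUAA"
protein = translate(mrna)      # "MAWK"
print(dna, "->", mrna, "->", protein)
```

Replication is not modelled here; the sketch only covers the DNA → RNA → protein part of the flow.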
b) List 3 primary sequence databases along with their respective URLs for obtaining
DNA and protein sequences.
A database is a computerized archive used to store and organize information so that it can be
accessed, managed and updated easily. Biological databases are
libraries of life sciences information, where data is collected from scientific experiments,
published literature, high-throughput experimental technology, and computational analyses.
Biological databases can be classified into three types [Jin, X. (2006)]:
1. Primary databases: contain original biological data submitted by scientists and
scientific institutes.
2. Secondary databases: contain information derived from the original biological data,
either computationally processed or manually curated.
3. Specialized databases: contain information on a specific research interest or on a
particular organism.
Sequence databases consist of DNA or protein sequences and information about those sequences.
A primary sequence database contains molecular biology data in its original format. Primary
sequence databases hold a mixture of data from many different organisms, including
whole genome sequences, gene sequences derived from genomic DNA or mRNA
(cDNA), chromosome sequences, complete or partial sequences, and annotated or un-annotated
entries with established or predicted functions. A large number of entries therefore has to be
screened to identify the sequence of interest in a primary database.
The three primary sequence databases are [“Biological Databases” (n.d)]:
1) GenBank, the nucleotide sequence database maintained by NCBI. The URL of the database is
https://www.ncbi.nlm.nih.gov/genbank/.
2) EMBL (European Molecular Biology Laboratory), the European nucleotide sequence database.
The URL https://www.embl.org/ is the laboratory's home page; the sequence data itself is
served by the European Nucleotide Archive (ENA) at https://www.ebi.ac.uk/ena.
3) UniProtKB, a knowledge base of protein sequences and their functions. The URL of the
database is https://www.uniprot.org/.
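Sequences from these databases are commonly exchanged in FASTA format, and NCBI records can be requested through the E-utilities `efetch` endpoint. The sketch below only builds the request URL and parses FASTA text; it makes no network request. The accession `EXAMPLE_ACCESSION` is a hypothetical placeholder, and the helper names are assumptions made for illustration.

```python
from urllib.parse import urlencode

# NCBI E-utilities endpoint for retrieving records.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(db, accession):
    """Build an efetch URL that asks for one record as plain-text FASTA.

    db is an Entrez database name such as "nuccore" (nucleotide) or "protein".
    """
    params = {"db": db, "id": accession, "rettype": "fasta", "retmode": "text"}
    return EFETCH + "?" + urlencode(params)


def parse_fasta(text):
    """Parse FASTA text into a dict mapping each header line to its sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


# Hypothetical accession; replace with a real one before fetching.
print(efetch_fasta_url("nuccore", "EXAMPLE_ACCESSION"))

sample = ">seq1 example record\nATGGCT\nTGGAAA\n>seq2\nTAA\n"
print(parse_fasta(sample))
```

Keeping URL construction separate from parsing means the parser can be reused on FASTA files downloaded from any of the three databases.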
c) List 3 protein structure databases with their respective URLs.
Proteins are linear, unbranched polymers made of long chains of amino acids. The function of
a protein invariably depends on its interactions with other molecules, which can lead to
changes in protein conformation.
The structural organization of proteins can be classified into four levels [Particle Science
Drug Development Services (2009)]:
1. Primary structure: the amino acid sequence specified by the genetic information.
2. Secondary structure: local folds of the polypeptide chain formed by neighbouring
amino acids. The two main types are the α-helix and the β-sheet.
3. Tertiary structure: the overall three-dimensional shape of the entire protein molecule.
4. Quaternary structure: the arrangement of multiple identical or different polypeptide
chains (protein subunits) within one protein.
Protein structure databases hold a wide variety of information about the three-dimensional
structures and functions of proteins.
The three protein structure databases are [“Israel Science and Technology Directory” (n.d)]:
1) PDB (Protein Data Bank), the worldwide central repository of protein structural
information. The URL of the database is https://www.rcsb.org/.
2) SCOPe (Structural Classification of Proteins - extended), which contains a
classification of protein structures and, within that classification, their sequences. The
URL of the database is http://scop.berkeley.edu/.
3) CATH/Gene3D, which classifies protein structures by Class, Architecture, Topology and
Homology. The URL of the database is http://www.cathdb.info/.
d) What is a genome? List three genome databases with their respective URLs.
The word genome is a combination of the words “gene” and “chromosome”. A gene is the
biological unit of heredity and a chromosome is the carrier of genetic information in the form
of genes. The genome is therefore defined as the complete set of hereditary instructions in each
cell of a living organism that is needed for its development and growth.
This set of instructions is made up of DNA. Every living organism has its own genome.
For example, the human genome is very large, consisting of about 3.2 billion
bases of DNA, but genome size differs between organisms [“Yourgenome” (n.d)].
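To give the 3.2 billion figure a sense of scale, the following back-of-the-envelope sketch estimates how much storage one copy of the human genome would need. It assumes an idealized encoding in which each of the four bases (A, C, G, T) takes 2 bits, and ignores ambiguity codes, headers and compression; the function name is illustrative.

```python
GENOME_BASES = 3_200_000_000  # human genome size quoted in the text
BITS_PER_BASE = 2             # four possible bases -> log2(4) = 2 bits (assumption)


def packed_size_bytes(n_bases, bits_per_base=BITS_PER_BASE):
    """Bytes needed to store n_bases at a fixed number of bits per base."""
    return n_bases * bits_per_base // 8


packed = packed_size_bytes(GENOME_BASES)       # 800_000_000 bytes
as_text = packed_size_bytes(GENOME_BASES, 8)   # one byte per base: 3_200_000_000 bytes
print(f"2-bit packed: {packed / 1e9:.1f} GB, plain text: {as_text / 1e9:.1f} GB")
```

Under these assumptions the packed genome occupies about 0.8 GB, and about 3.2 GB when stored as one character per base.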
Genome databases are organized collections of information resulting from the
sequencing or mapping of genomes or of genome products (transcripts, proteins).
They contain a variety of biological information.
There are different types of genome databases, such as Human Genome Databases, Model Organism
Databases (MOD), Other Organism Databases, Organelle Databases, and Virus Databases [Jin,
X. (2006)].
The three genome databases are [Winston, H. (2005)]:
1) MaizeGDB (Maize Genome Database), which focuses on maize (Zea mays), both a crop plant
and a model organism. The URL of the database is https://www.maizegdb.org/#.
2) OMIM (Online Mendelian Inheritance in Man), a catalogue of human genes and genetic
disorders covering the whole genome. The URL of the database is
https://www.omim.org/.
3) Ensembl, a genome browser for vertebrate genomes that supports research in
comparative genomics, evolution, sequence variation and transcriptional regulation. The
URL of the database is https://asia.ensembl.org/index.html.
Q2. Bioinformatics is an intelligent method for obtaining biological knowledge using
computational techniques. In this question you will execute a workflow to produce a
biological outcome. (12 Marks)
We will investigate Myosin VI, a protein involved in cancer in humans.
a) Using the NCBI Gene Database, investigate Myosin VI for Homo sapiens. This
display has a lot of information; list the information you infer about the particular
gene.
Brief Function of the gene (In your own words).
FIG-2: Search for a particular gene in NCBI Gene Database
FIG-3: Result obtained from the search about the gene
FIG-4: Details of the gene MYO6
Myosins are actin-based motor proteins. In Homo sapiens, Myosin VI is encoded by MYO6, a
protein-coding gene. The gene plays a role in various intracellular
processes such as cell migration and membrane trafficking. It encodes a reverse-direction motor
protein that moves towards the minus end of actin filaments, and it also plays
an important role in intracellular vesicle and organelle transport.
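The search in the NCBI Gene Database can also be expressed as an E-utilities `esearch` request. The sketch below only constructs the URL and sends nothing over the network. The field tags `[Gene Name]` and `[Organism]` and the helper names are assumptions chosen for illustration, not the exact query typed into the web page.

```python
from urllib.parse import urlencode

# NCBI E-utilities endpoint for searching an Entrez database.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def gene_term(symbol, organism):
    """Combine a gene symbol and an organism name into one Entrez query term."""
    return f"{symbol}[Gene Name] AND {organism}[Organism]"


def esearch_url(db, term, retmax=20):
    """Build an esearch URL that returns matching record IDs as JSON."""
    params = {"db": db, "term": term, "retmode": "json", "retmax": retmax}
    return ESEARCH + "?" + urlencode(params)


term = gene_term("MYO6", "Homo sapiens")
url = esearch_url("gene", term)
print(term)
print(url)
```

Restricting the term by organism keeps orthologues of MYO6 in other species out of the result list.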
What are the major pathways involved and how many exons are present?
The information on major pathways is obtained from “Pathways from BioSystems”:
o Gap junction degradation, organism-specific biosystem (from REACTOME)
o Gap junction trafficking, organism-specific biosystem (from REACTOME)
o Gap junction trafficking and regulation, organism-specific biosystem (from REACTOME)
o Glutamate Binding, Activation of AMPA Receptors and Synaptic Plasticity, organism-specific biosystem (from REACTOME)
o Membrane Trafficking, organism-specific biosystem (from REACTOME)
o Neuronal System, organism-specific biosystem (from REACTOME)
o Neurotransmitter Receptor Binding And Downstream Transmission In The Postsynaptic Cell, organism-specific biosystem (from REACTOME)
o Stabilization and expansion of the E-cadherin adherens junction, organism-specific biosystem (from Pathway Interaction Database)
o Trafficking of AMPA receptors, organism-specific biosystem (from REACTOME)
o Transmission across Chemical Synapses, organism-specific biosystem (from REACTOME)
o Vesicle-mediated transport, organism-specific biosystem (from REACTOME)
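The pathway list above can be tallied by source database with a few lines of Python. The pathway names and sources are copied from the list; the variable names are illustrative.

```python
from collections import Counter

# (pathway name, source database) pairs taken from the list above.
PATHWAYS = [
    ("Gap junction degradation", "REACTOME"),
    ("Gap junction trafficking", "REACTOME"),
    ("Gap junction trafficking and regulation", "REACTOME"),
    ("Glutamate Binding, Activation of AMPA Receptors and Synaptic Plasticity", "REACTOME"),
    ("Membrane Trafficking", "REACTOME"),
    ("Neuronal System", "REACTOME"),
    ("Neurotransmitter Receptor Binding And Downstream Transmission In The Postsynaptic Cell", "REACTOME"),
    ("Stabilization and expansion of the E-cadherin adherens junction", "Pathway Interaction Database"),
    ("Trafficking of AMPA receptors", "REACTOME"),
    ("Transmission across Chemical Synapses", "REACTOME"),
    ("Vesicle-mediated transport", "REACTOME"),
]

# Count how many pathways each source database contributes.
source_counts = Counter(source for _, source in PATHWAYS)

for source, count in source_counts.most_common():
    print(f"{source}: {count}")
print("Total pathways:", len(PATHWAYS))
```

Of the 11 pathways listed, 10 come from REACTOME and 1 from the Pathway Interaction Database.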
The exon count reported for the MYO6 gene is 37.