Bioinformatics: Storing and Searching Genetic Information

   

Added on  2023-04-20

Running head: BIOINFORMATICS
BIOINFORMATICS
Name of the Student:
Name of the University:
Author Note:
Genetic information on proteins and DNA, gathered from genome sequencing all over the
world, must be stored in an intelligent and organised manner so that it can be accessed
easily. This information is therefore stored in electronic databases. Separate databases are
available for nucleotide and protein sequences¹. Swiss-Prot is an example of a protein
sequence database, and EMBL is an example of a nucleotide sequence database. With the
storage of this information comes the need to search these huge databases. BLAST (Basic
Local Alignment Search Tool) and FASTA are two algorithms for sequence similarity
searching in those databases. FASTA is also the name of a plain-text sequence format;
sequences extracted from these databases are commonly presented in that format, while
search hits are reported as BLAST or FASTA output.
Nucleotide and protein sequences can be used to study variation between two
organisms. Although the DNA sequence is unique to each species, a certain percentage of
similarity remains between two related organisms. Genetic variation, or mutation, is the
driving force of the evolutionary process in nature. Scientists and taxonomists use these
similarities to place newly found species in the animal or plant kingdom. Differences in
DNA sequence arise through different molecular mechanisms, generally a small local change
in the DNA sequence or a rearrangement of DNA segments. Species that follow different
evolutionary pathways accumulate these changes, but most of the sequence remains similar
between closely related organisms. Moreover, genetic variation in nature commonly occurs
at random rather than as a targeted response. Hence, it can be argued that evolution
happens because of these sequence changes rather than vice versa². DNA encodes the
proteins of a biological organism. Because DNA is similar between two closely related
organisms, their proteins are also similar in sequence and structure. DNA-DNA
hybridization is a technique by which the similarity between two related organisms can be
measured³.
1 Pevsner, Jonathan. Bioinformatics and functional genomics. John Wiley & Sons, 2015.
2 Arber, Werner. "Molecular mechanisms driving Darwinian evolution." Mathematical and Computer
Modelling 47, no. 7-8 (2008): 666-674.
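The idea that related organisms share a certain percentage of sequence can be made concrete with a percent-identity calculation. The following is a minimal Python sketch, assuming the two sequences are already aligned to the same length with `-` marking gaps. The function name `percent_identity` and the toy sequences are hypothetical, and other definitions of percent identity (for example, dividing by the full alignment length) are also in use.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return the percentage of gap-free columns that hold identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0  # nothing comparable, so report zero identity
    return 100.0 * matches / compared


# Hypothetical toy sequences, not taken from any real organism.
seq_a = "ATGCCGTA-GCT"
seq_b = "ATGACGTATGCT"
print(f"{percent_identity(seq_a, seq_b):.1f}% identical")
```

In this sketch gapped columns are excluded from the denominator, so the result reflects only positions where both sequences have a residue.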
To infer homology and evolutionary relationships between organisms, their biological
sequences must be aligned. Multiple Sequence Alignment (MSA) is generally the alignment
of three or more biological sequences (protein or nucleic acid) of similar length. Many
MSA tools are available, including MUSCLE, T-Coffee, MView, Clustal Omega, and Kalign.
MUSCLE is an accurate MSA tool that is particularly good with protein sequences and suits
medium to large alignments. T-Coffee is a consistency-based MSA tool suitable for small
alignments. Clustal Omega is a relatively new MSA tool that uses HMM profile-profile
techniques to generate alignments; it is also suitable for medium to large alignments.
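Pairwise alignment is the simplest case of the problem that MSA generalises. As an illustration only, here is a minimal Python sketch of Needleman-Wunsch global alignment for two sequences. It is not the method used by MUSCLE, T-Coffee, or Clustal Omega, and the scoring values (match +1, mismatch -1, gap -2) are assumptions chosen for readability rather than a real substitution matrix.

```python
from typing import Tuple


def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -2) -> Tuple[int, str, str]:
    """Globally align two sequences; return (score, aligned_a, aligned_b)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # First row and column: aligning a prefix against nothing costs only gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    # Fill the table with the best score for every pair of prefixes.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,  # align two residues
                              score[i - 1][j] + gap,      # gap in b
                              score[i][j - 1] + gap)      # gap in a
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[len(a)][len(b)], "".join(reversed(out_a)), "".join(reversed(out_b))


# Hypothetical toy input, not a real biological sequence.
total, top, bottom = needleman_wunsch("ACGT", "AGT")
print(total)
print(top)
print(bottom)
```

Dedicated MSA tools use heuristics to scale beyond two sequences, because exact dynamic programming of this kind becomes impractical as the number of sequences grows.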
The protein in focus for this article is isoform 2 of human amiloride-sensitive amine
oxidase [copper-containing], encoded by the gene AOC1 (UniProt accession P19801-2); the
related information is derived from UniProt. FASTA format is used to present the protein
sequence, BLAST is used for the alignment information and phylogenetic tree, and Clustal
Omega is used for Multiple Sequence Alignment.
The sequence of this protein in FASTA format is as follows:
>sp|P19801-2|AOC1_HUMAN Isoform 2 of Amiloride-sensitive amine oxidase [copper-containing] OS=Homo sapiens OX=9606 GN=AOC1
MPALGWAVAAILMLQTAMAEPSPGTLPRKAGVFSDLSNQELKAVHSFLWSKKELRLQPSS
TTTMAKNTVFLIEMLLPKKYHVLRFLDKGERHPVREARAVIFFGDQEHPNVTEFAVGPLP
GPCYMRALSPRPGYQSSWASRPISTAEYALLYHTLQEATKPLHQFFLNTTGFSFQDCHDR
CLAFTDVAPRGVASGQRRSWLIIQRYVEGYFLHPTGLELLVDHGSTDAGHWAVEQVWYNG
KFYGSPEELARKYADGEVDVVVLEDPLPGGKGHDSTEEPPLFSSHKPRGDFPSPIHVSGP
RLVQPHGPRFRLEGNAVLYGGWSFAFRLRSSSGLQVLNVHFGGERIAYEVSVQEAVALYG
GHTPAGMQTKYLDVGWGLGSVTHELAPGIDCPETATFLDTFHYYDADDPVHYPRALCLFE
MPTGVPLRRHFNSNFKGGFNFYAGLKGQVLVLRTTSTVYNYDYIWDFIFYPNGVMEAKMH
ATGYVHATFYTPEGLRHGTRLHTHLIGNIHTHLVHYRVDLDVAGTKNSFQTLQMKLENIT
NPWSPRHRVVQPTLEQTQYSWERQAAFRFKRKLPKYLLFTSPQENPWGHKRTYRLQIHSM
ADQVLPPGWQEEQAITWARTEGGQPRALSQAASPVPGRYPLAVTKYRESELCSSSIYHQN
DPWHPPVVFEQFLHNNENIENEDLVAWVTVGFLHIPHSEDIPNTATPGNSVGFLLRPFNF
FPEDPSLASRDTVIVWPRDNGPNYVQRWIPEDRDCSMPPPFSYNGTYRPV
3 Felgueiras, Juliana, Joana Vieira Silva, and Margarida Fardilha. "Adding biological meaning to human
protein-protein interactions identified by yeast two-hybrid screenings: A guide through bioinformatics
tools." Journal of Proteomics 171 (2018): 127-140.
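A FASTA record such as the one shown above consists of a header line starting with `>` followed by one or more lines of sequence. The following is a minimal Python sketch of how such a record might be read. The helper names `parse_fasta` and `uniprot_header_fields` are hypothetical, the header is assumed to follow the `db|accession|entry_name description` layout seen in the record above, and the example text is truncated to the first two sequence lines for brevity.

```python
from typing import Dict, Iterator, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]  # drop the leading '>'
            chunks = []
        elif header is not None:
            chunks.append(line)  # sequence lines are concatenated
    if header is not None:
        yield header, "".join(chunks)


def uniprot_header_fields(header: str) -> Dict[str, str]:
    """Split an assumed 'db|accession|entry_name description' header."""
    db, accession, rest = header.split("|", 2)
    entry_name = rest.split(" ", 1)[0]
    return {"db": db, "accession": accession, "entry_name": entry_name}


# Truncated copy of the record above: only the first two sequence lines.
record_text = """
>sp|P19801-2|AOC1_HUMAN Isoform 2 of Amiloride-sensitive amine oxidase [copper-containing] OS=Homo sapiens OX=9606 GN=AOC1
MPALGWAVAAILMLQTAMAEPSPGTLPRKAGVFSDLSNQELKAVHSFLWSKKELRLQPSS
TTTMAKNTVFLIEMLLPKKYHVLRFLDKGERHPVREARAVIFFGDQEHPNVTEFAVGPLP
"""

records = list(parse_fasta(record_text))
for header, sequence in records:
    fields = uniprot_header_fields(header)
    print(fields["accession"], fields["entry_name"], len(sequence))
```

Established libraries provide more robust FASTA parsing; this sketch only shows the structure of the format.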
The proteins associated with this protein include Amine oxidase (Fragment)
(C9J0G8_HUMAN), Amine oxidase (Fragment) (C9J2J4_HUMAN), Amine oxidase
(A0A2J8RK70_PONAB), Amine oxidase (A0A0D9R3E0_CHLSB), Amine oxidase
(F6TU25_MACMU) and others. The potential isoforms of this protein are Amine oxidase
(C9J2J4_HUMAN) and Amine oxidase (C9J0G8_HUMAN)⁴.
The alignment of these potential isoforms with the given protein is as follows⁵:
CLUSTAL O(1.2.4) multiple sequence alignment

SP|P19801|AOC1_HUMAN      MPALGWAVAAILMLQTAMAEPSPGTLPRKAGVFSDLSNQELKAVHSFLWSKKELRLQPSS 60
TR|C9J0G8|C9J0G8_HUMAN    MPALGWAVAAILMLQTAMAEPSPGTLPRKAGVFSDLSNQELKAVHSFLWSKKELRLQPSS 60
TR|C9J2J4|C9J2J4_HUMAN    MPALGWAVAAILMLQTAMAEPSPGTLPRKAGVFSDLSNQELKAVHSFLWSKKELRLQPSS 60
                          ************************************************************

SP|P19801|AOC1_HUMAN      TTTMAKNTVFLIEMLLPKKYHVLRFLDKGERHPVREARAVIFFGDQEHPNVTEFAVGPLP 120
TR|C9J0G8|C9J0G8_HUMAN    TTTMAKNTVFLIEMLLPKKYHVLRFLDKGERHPVREARAVIFFGDQEHPNVTEFAVGPLP 120
TR|C9J2J4|C9J2J4_HUMAN    TTTMAKNTVFLIEMLLPKKYHVLRFLDKGERHPVREARAVIFFGDQEHPNVTEFAVGPLP 120
                          ************************************************************

SP|P19801|AOC1_HUMAN      GPCYMRALSPRPGYQSSWASRPISTAEYALLYHTLQEATKPLHQFFLNTTGFSFQDCHDR 180
TR|C9J0G8|C9J0G8_HUMAN    GPCYMRALSPRPGYQSSWASRPISTAEYALLYHTLQEATKPLHQFFLNTTGFSFQDCHDR 180
TR|C9J2J4|C9J2J4_HUMAN    GPCYMRALSPRPGYQSSWASRPISTAEYALLYHTLQEATKPLHQFFLNTTGFSFQDCHDR 180
                          ************************************************************

SP|P19801|AOC1_HUMAN      CLAFTDVAPRGVASGQRRSWLIIQRYVEGYFLHPTGLELLVDHGSTDAGHWAVEQVWYNG 240
TR|C9J0G8|C9J0G8_HUMAN    CLAFTDVAPRGVASGQRRSWLIIQRYVEGYFLHPTGLELLVDHGSTDAGHWAVEQVWYNG 240
TR|C9J2J4|C9J2J4_HUMAN    CLAFTDVAPRGVASGQRRSWLIIQRYVEGYFLHPTGLELLVDHGSTDAGHWAVEQVWYNG 240
                          ************************************************************
4 AOC1 (Human) | Gene Target - Pubchem (2018) Pubchem.ncbi.nlm.nih.gov
<https://pubchem.ncbi.nlm.nih.gov/target/gene/AOC1/human#section=PDB-Structures>.
5 CALIPHO Bioinformatics (2018) Nextprot.org <https://www.nextprot.org/entry/NX_P19801/sequence>.
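In the Clustal alignment above, the row of `*` characters under each block marks columns where every sequence has the same residue. The following is a minimal Python sketch of how such a line can be derived from aligned rows. The function name `conservation_line` is hypothetical, and the sketch deliberately omits the `:` and `.` symbols that Clustal uses for strongly and weakly similar residue groups.

```python
from typing import List


def conservation_line(aligned: List[str]) -> str:
    """Return '*' for fully conserved columns and a space for all others."""
    if not aligned:
        return ""
    length = len(aligned[0])
    if any(len(row) != length for row in aligned):
        raise ValueError("all aligned rows must have the same length")
    marks = []
    for column in zip(*aligned):
        first = column[0]
        # A column counts as conserved only if it is gap-free and identical.
        conserved = first != "-" and all(res == first for res in column)
        marks.append("*" if conserved else " ")
    return "".join(marks)


# First block of the alignment above: the three rows are identical.
block_row = "MPALGWAVAAILMLQTAMAEPSPGTLPRKAGVFSDLSNQELKAVHSFLWSKKELRLQPSS"
first_block = [block_row, block_row, block_row]
print(conservation_line(first_block))

# Hypothetical toy rows showing a gap and a substitution.
print(repr(conservation_line(["ACD", "ACE", "A-D"])))
```

For the first block, every column is identical across the three rows, which is consistent with the unbroken row of asterisks shown in the alignment.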