Bioinformatics Assignment Sample


Added on  2021/06/18

BIOINFORMATICS ASSIGNMENT 1
I
declare that all material in this assessment is my own work except where there is clear
acknowledgement and reference to the work of others. I have read the Academic Honesty and Assessment
Obligations for Coursework Students Policy and Academic Dishonesty Procedures
(http://www.adelaide.edu.au/policies/230/).
I give permission for my assessment work to be reproduced and submitted to other academic staff
for the purposes of assessment and to be copied, submitted and retained in a form suitable for
electronic checking of plagiarism.
Signed………………………………………………. Date ……………………………………………
Table of Contents
A. Introductory Bioinformatics
1. Terms commonly used in bioinformatics
2. DNA Sequence Translation
3. Sequence homology
4. Learn To Retrieve Gene/mRNA/Protein Information
B. Genetic Analysis of a Human Cancer Disease Using Databases
School of Biological Sciences
Assessment Cover Sheet
Student Name
Student ID
Assessment Title
Course/Program
Lecturer/Tutor
Date Submitted
OFFICE USE ONLY
Date Received

5. Analysis of a Human Genetic Disease
6. Acquiring Sequence Information
7. UniProt/Swiss-Prot Database
8. Basic Analysis of Evolutionarily Conserved Sequences
9. Multiple Sequence Alignments across Different Species
10. THREE-DIMENSIONAL VIEWING OF AN IDENTIFIED PROTEIN STRUCTURE:
References
A. Introductory Bioinformatics
1. Terms commonly used in bioinformatics
BLAST
The Basic Local Alignment Search Tool (BLAST) finds regions of local similarity between
sequences. It compares nucleotide or protein sequences to sequence databases and calculates the
statistical significance of the matches.
GenBank
It is a genetic sequence database. GenBank stores an annotated collection of publicly available
DNA sequences. It is part of the International Nucleotide Sequence Database Collaboration and is
designed to provide and encourage access within the scientific community to up-to-date and
comprehensive DNA sequence information.
Ensembl
Ensembl is a genome browser for vertebrate genomes that supports research in comparative
genomics, evolution, sequence variation and transcriptional regulation. Ensembl annotates genes,
computes multiple alignments, predicts regulatory function and stores disease data. It includes
tools such as BLAST, BLAT, BioMart and the Variant Effect Predictor for all supported species.
GEO
Gene Expression Omnibus (GEO) is an international public repository which archives and
freely distributes microarray, next-generation sequencing and other forms of high-throughput
functional genomic data submitted by the research community. Tools are provided to help users
query and download experiments and curated gene expression profiles.
Pfam
Pfam is a database that includes a large collection of protein families, each represented by
multiple sequence alignments and hidden Markov models (HMMs). It is available on the World Wide
Web and provides higher-level groupings of related entries, called clans, which are collections
of Pfam entries.
KEGG
Kyoto Encyclopedia of Genes and Genomes (KEGG) is a collection of databases dealing with
genomes, biological pathways, diseases, drugs and chemical substances. KEGG is a database
resource for understanding the high-level functions and utilities of the biological system, such
as the cell, the organism and the ecosystem, from molecular-level information, especially
large-scale molecular datasets generated by genome sequencing and other high-throughput
experimental technologies.
OMIM
Online Mendelian Inheritance in Man (OMIM) is a database initiated by Dr. Victor A.
McKusick in the early 1960s. It is a comprehensive, authoritative compendium of human genes and
genetic phenotypes which is freely available and updated routinely.
PDB
The Protein Data Bank (PDB) is an open-access digital data resource for biology and medicine
and a leading source of experimental data central to scientific discovery. PDB provides access to
3D structure data for large biological molecules such as proteins, DNA and RNA, which helps in
understanding their roles in human and animal health.
GO
Gene Ontology (GO) is a major bioinformatics initiative to create a computational
representation of our evolving knowledge of how genes encode biological functions at the
molecular, cellular and tissue system levels. The GO resource plays an important role in
supporting biomedical research, including the interpretation of large-scale molecular experiments
and the computational analysis of biological knowledge.
R
R is a comprehensive statistical environment and programming language for professional
data analysis and graphical display. The Bioconductor project, which is associated with R,
provides many additional R packages for statistical data analysis in different fields of the life
sciences, such as tools for microarray, next-generation sequencing and genome analysis. R is
freely available and runs on all common operating systems.
2. DNA Sequence Translation
A.
What is the correct open reading frame?
MADKELKFLVVDDFSTMRRIVRNLLKELGFNNVEEAEDGVDALNKLQAGGYGFVISDWNM
PNMDGLELLKTIRADGAMSALPVLMVTAEAKKENIIAAAQAGASGYVVKPFTAATLEEKL
NKIFEKLGM
Why are there 6 reading frames instead of 3?
The coding strand is the DNA strand whose base order matches the RNA transcript of a particular
gene. Because a gene lies entirely on one DNA strand, that strand can be read in 3 reading frames
(starting at the first, second or third base), and only 1 of them contains the correct codon
sequence for the gene. When the genome is viewed as a whole, however, one gene may lie on one
strand while another gene lies on the opposite strand, so the coding strand for one gene is the
non-coding strand for the other. Because both strands can contain genes, each strand has to be
read in its 3 frames, which gives 6 reading frames in total.
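The six-frame idea can be sketched in a few lines of Python. This is a minimal illustration, not the translation tool used for the answer below: the function names are hypothetical, the standard genetic code is assumed, and stop codons are written as "-" to match the output format shown in this assignment.

```python
# Standard genetic code, built in TCAG order; "-" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY--CC-WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO)
)


def reverse_complement(dna):
    """Return the opposite strand, read 5' to 3'."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna):
    """Translate 3 frames on the given strand and 3 on its reverse complement."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"5'3' Frame {offset + 1}"] = translate(dna[offset:])
        frames[f"3'5' Frame {offset + 1}"] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    for name, protein in six_frames("ATGGCCGATAAATAA").items():
        print(name, protein)
```

The correct open reading frame is usually the frame with the longest run between a start codon (M) and the first stop, which is why Frame 1 stands out in the answer that follows.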
Save and paste the protein sequence you obtain from the “Compact M- no space” format here?
Answer:-
5'3' Frame 1
MADKELKFLVVDDFSTMRRIVRNLLKELGFNNVEEAEDGVDALNKLQAGGYGFVISDWNM
PNMDGLELLKTIRADGAMSALPVLMVTAEAKKENIIAAAQAGASGYVVKPFTAATLEEKL
NKIFEKLGM-
5'3' Frame 2
WRIKNLNFWLWMTFPPCDA-CVTC-KSWDSIMLRKRKMASTLSISCRQAVMDLLSPTGTC
PIWMAWNC-KQFVRMARCRHCQC-W-LQKRRKRTSLLRRKRGPVAMW-SHLPPRRWRKNS
TKSLRNWAC
5'3' Frame 3
GG-RT-IFGCG-LFHHATHSA-PAERAGIQ-C-GSGRWRRRSQ-VAGRRLWICYLRLEHA
QYGWPGIAENNSCGWRDVGIASVNGDCRSEEREHHCCGASGGQWLCGEAIYRRDAGGKTQ
QNL-ETGHV
3'5' Frame 1
SHAQFLKDFVEFFLQRRGGKWLHHIATGPRLRRSNDVLFLRFCSHH-HWQCRHRAIRTNC
FQQFQAIHIGHVPVGDNKSITACLQLIESVDAIFRFLNIIESQLFQQVTHYASHGGKVIH
NQKFKFFIRH
3'5' Frame 2
HMPSFSKILLSFSSSVAAVNGFTT-PLAPACAAAMMFSFFASAVTINTGNADIAPSARIV
FSNSRPSILGMFQSEITNP-PPACNLLRASTPSSASSTLLNPSSFSRLRTMRRMVEKSST
TKNLSSLSA
3'5' Frame 3
TCPVSQRFC-VFPPASRR-MASPHSHWPPLAPQQ-CSLSSLLQSPLTLAMPTSRHPHELF
SAIPGHPYWACSSRR-QIHNRLPATY-ERRRHLPLPQHY-IPALSAGYALCVAWWKSHPQ
PKI-VLYPP
B
What is the protein encoded by the ORF you identified and what species is it from?
Answer- Chemotaxis protein
Species- Escherichia coli
To what family of proteins does it belong?
Answer- Family- CheY
What is the function of this protein?
Answer- It is involved in transmitting sensory signals from the chemoreceptors to the flagellar
motors. The protein also takes part in changing the direction of flagellar rotation.
3. Sequence homology
Define the terms
Homologue: - A gene or protein sequence that shares a common evolutionary ancestor with another
sequence. (The term is also used for chromosomes that are similar in physical aspects and genetic
information and pair with each other during meiosis.)
Orthologue: - Sequences that share a common ancestor and diverged through a speciation event.
Paralogue: - Sequences descended from an ancestral gene that underwent a duplication event.
List the details of each of these homologous sequences here, including its maximum identity
Answer- Homologous sequence
1 adkelkflvv ddestmrriv rnllkelgfn nveeaedgvd alnklqaggy gfvisdwmmp
61 nmdglellkt iradgamsal pvlmvtalak keniiaaaqa gasgyvvkpf taatleekln
121 kifeklgm
Accession no.- 3F7N_A
Source- Escherichia coli K-12
Amino Acid- 128
Max score- 246
Max Ident- 98%
1 adkelkflvv ddfstmrriv rnllkelgfn nveeaedgvd alnklqaggf gfiicdwnmp
61 nmdglellkt iradsamsal pvlmvtaeak keniiaaaqa gasgyvvkpf taatleekln
121 kifeklgm
Accession no.- 2CHY_A
Source- Salmonella enterica subsp. enterica serovar Typhimurium
Amino Acid- 128
Max score- 251
Max Ident- 97%

Define the terms score and P value and give an example to explain?
E value
The expectation value, or E value, is a parameter that describes the number of hits one can
expect to see by chance when searching a database of a particular size. The E value decreases
exponentially as the score of the match increases; the lower the E value, the more significant
the score and the alignment.
Example:-
Query: CPn0189
Aligned with CT131 hypothetical protein; Score (bits) = 1240, E value = 0.0
Query: 1 MKRRSWLKILGICLGSSIVLGFLIFLPQLLSTESRKYLVFSLIHKESGLSCSAEELKISW 60
MKR W KI G L + L L LP+ S+ES KYL S+++KE+GL E+L +SW
Sbjct: 1 MKRSPWYKIFGYYLLVGVPLALLALLPKFFSSESGKYLFLSVLNKETGLQFEIEQLHLSW 60
Query: 61 FGRQTARKIKLTG-EAKDEVFSAEKFELDGSLLRLLIYKKPKGITLSGWSLKINEPASID 119
FG QTA+KI++ G ++ E+F+AEK + GSL RLL+Y+ PK +TL+GWSL+I+E S++
Sbjct: 61 FGSQTAKKIRIRGIDSDSEIFAAEKIIVKGSLPRLLLYRFPKALTLTGWSLQIDESLSMN 120
Figure 1 shows the E value in sequence search by using BLAST
Score
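A score is a number used to assess the biological relevance of a finding. In BLAST it is
calculated from the substitution matrix and the gap penalties, and a higher score indicates a
better alignment.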
Figure 2 shows the score (258 bits) in a sequence search
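The link between the bit score and the E value can be illustrated with the standard relation E = m × n × 2^(−S′), where S′ is the bit score, m the query length and n the database length. The sketch below is an approximation under that assumption: the function name and the example lengths are made up for illustration, and BLAST itself uses effective (edge-corrected) lengths, so reported values will differ.

```python
def expected_hits(bit_score, query_len, db_len):
    """Approximate E value from a bit score: E = m * n * 2**(-S').

    query_len (m) and db_len (n) are taken as raw lengths here, which is a
    simplifying assumption; BLAST applies length corrections before this step.
    """
    return query_len * db_len * 2.0 ** (-bit_score)


if __name__ == "__main__":
    # Hypothetical search: a 128-residue query against 1,000,000 residues.
    for score in (20, 40, 80):
        print(score, expected_hits(score, 128, 1_000_000))
```

Each extra bit of score halves the expected number of chance hits, which is why very high scores are reported with an E value of 0.0.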
Define identity and conservative substitution in a protein alignment?
Identity
Identity is the extent to which two sequences have the same residues at the same
positions in an alignment, often expressed as a percentage.
Conservative substitution
It is a change at a specific position of an amino acid sequence, or less commonly a DNA sequence,
which preserves the physico-chemical properties of the original residue or achieves a positive
score in the scoring matrix in use.
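Both definitions can be made concrete with a short Python sketch that counts identical and conservative positions in a pairwise alignment. The residue groups used here are a simplified physico-chemical grouping chosen for illustration only; BLAST instead marks a substitution as conservative when it has a positive score in the chosen matrix, so its counts can differ. The function name is hypothetical.

```python
# Simplified physico-chemical groups (an assumption for illustration;
# BLAST uses positive substitution-matrix scores instead).
GROUPS = ["AVLIM", "FWY", "KRH", "DE", "STNQ", "C", "G", "P"]
GROUP_OF = {aa: index for index, group in enumerate(GROUPS) for aa in group}


def alignment_stats(seq_a, seq_b):
    """Count identical and conservative positions in two aligned sequences.

    Both strings must have the same length; '-' marks a gap. Percent identity
    is taken over the full alignment length, gaps included.
    """
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("aligned sequences must be non-empty and equal in length")
    identical = conservative = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if "-" in (a, b):
            continue  # a gap is neither identical nor conservative
        if a == b:
            identical += 1
        elif a in GROUP_OF and GROUP_OF[a] == GROUP_OF.get(b):
            conservative += 1
    return {
        "identical": identical,
        "conservative": conservative,
        "identity_pct": 100.0 * identical / len(seq_a),
    }


if __name__ == "__main__":
    print(alignment_stats("ADKELKF", "ADKDLRF"))
```

In the toy example, E→D and K→R stay within the acidic and basic groups, so they count as conservative rather than identical.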
What is the default substitution matrix on the BLAST page?
Answer- BLOSUM62 (Blocks Substitution Matrix 62)
What other matrices are available?
PAM30
PAM70
PAM250
BLOSUM45
BLOSUM80
BLOSUM62
BLOSUM50
BLOSUM90
List the names for these substitution matrices?
Point accepted mutation 30
Point accepted mutation 70
Point accepted mutation 250
Blocks Substitution Matrix 45
Blocks Substitution Matrix 80
Blocks Substitution Matrix 62
Blocks Substitution Matrix 50
Blocks Substitution Matrix 90
What is the difference between the main two?
PAM versus BLOSUM:
1. PAM: developed to score alignments between closely related protein sequences.
   BLOSUM: can be used to score alignments between evolutionarily divergent protein sequences.
2. PAM: based on global alignments.
   BLOSUM: based on local alignments.
3. PAM: the underlying alignments have higher similarity than BLOSUM alignments.
   BLOSUM: the underlying alignments have lower similarity than PAM alignments.
4. PAM: less divergent.
   BLOSUM: more divergent.
5. PAM: a higher number in the matrix name denotes a greater evolutionary distance.
   BLOSUM: a higher number in the matrix name indicates higher sequence similarity and a
   smaller evolutionary distance.
6. PAM: examples are PAM30 and PAM70.
   BLOSUM: examples are BLOSUM62 and BLOSUM80.
Does the use of a different substitution matrix affect the results of the search?
Answer:
When a different substitution matrix (PAM70) was used, both the E values and the scores changed,
so the two sets of results differ.
4. Learn To Retrieve Gene/mRNA/Protein Information
A. Use the NCBI database to retrieve information for the 6 genes given below, mRNA and protein
sequence (Learn to use Refseq/Gene/Protein/Nucleotide database)?
Answer
BRCA2
Gene Type- protein coding
Organism –Homo sapiens
Preferred names- breast cancer type 2 susceptibility protein
Function-
It is involved in the maintenance of genome stability.
It is considered a tumour suppressor gene.
It contains many copies of a 70 aa motif called the BRC motif, and these motifs mediate
binding to the RAD51 recombinase, which functions in DNA repair.
Number of exons present – 27
Location of the gene on the chromosome – 13q13.1 (on chromosome no. 13, NC_000013.11)
Separate protein and mRNA sequence
Accession no. AAG46030
Protein sequence 1
YDTEIDRSRRSAIKKIMERDDTAAKTLVLCVSDIISLSANISETSSSKTSSADTQKVA
Accession no. AAK29432
Protein sequence 2
TAAPKCKEMQNSLNNDKNLVSIETVVPPKLLSDNLCRQTENLKTSKSIFLKVKVHENVEKETAKSPATCY
TNQSPYSVIENSALAFYTSCSRKTSVSQTSLLEAKKWLREGIFDGQPERINTADYVGNYLYENNSNSTIA
ENDKNHLSEKQDTYLSNSSMSNSYSYHSDEVYNDSGYLSKNKLDSGIEPVLKNVEDQKNTSFSKVISNVK
DANAYPQTVNEDICVEELVTSSSPCKNKNAAIKLSISNSNNFEVGPPAFRIASGKIVCVSHETIKKVKDI
FTDSFSKVIKENNENKSKICQTKIMAGCYEALDDSEDILHNSLDNDECSMHSHKVFADIQSEEILQHNQN
MSGLEKVSKISPCDVSLETSDICKCSIGKLHKSVSSANTCGIFSTASGKSVQVSDASLQNARQVFSEIED
STKQVFSKVLFKSNEHSDQLTREENTAIRTPEHLISQKGFSYNVVNSSAFSGFSTASGKQVSILESSLHK
VKGVLEEFDLIRTEHSLHYSPTSRQNVSKILPRVDKRNPEHCVNSEMEKTCSKEFKLSNNLNVEGGSSEN
NHSIKVSPYLSQFQQDKQQLVLGTKVSLVENIHVLGKEQASPKNVKMEIGKTETFSDVPVKTNIEVCSTY
SKDSENYFETEAVEIAKAFMEDDELTDSKLPSHATHSLFTCPENEEMVLSNSRIGKRRGEPLILVGKCSF
LPFVLPITIFKVFIQ
Accession no. AAN61409
Protein sequence 3
MPIGSKERPTFFEIFKTRCNKA
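Records like the ones above can also be retrieved programmatically. The sketch below builds a request to the NCBI E-utilities `efetch` endpoint for a given accession; it is a minimal illustration using only the Python standard library, the helper names are hypothetical, and the assignment itself used the NCBI web pages rather than this route. Running `fetch_record` needs network access.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(db, accession, rettype="fasta"):
    """Build an efetch URL, e.g. db='protein' or db='nuccore'."""
    query = urlencode(
        {"db": db, "id": accession, "rettype": rettype, "retmode": "text"}
    )
    return f"{EFETCH}?{query}"


def fetch_record(db, accession, rettype="fasta"):
    """Download one record as text (requires an internet connection)."""
    with urlopen(efetch_url(db, accession, rettype)) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Accession taken from the answer above.
    print(efetch_url("protein", "AAG46030"))
```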
mRNA Sequence 1
1 gtggcgcgag cttctgaaac taggcggcag aggcggagcc gctgtggcac tgctgcgcct
61 ctgctgcgcc tcgggtgtct tttgcggcgg tgggtcgccg ccgggagaag cgtgagggga
121 cagatttgtg accggcgcgg tttttgtcag cttactccgg ccaaaaaaga actgcacctc
181 tggagcggac ttatttacca agcattggag gaatatcgta ggtaaaaatg cctattggat
241 ccaaagagag gccaacattt tttgaaattt ttaagacacg ctgcaacaaa gcagatttag
301 gaccaataag tcttaattgg tttgaagaac tttcttcaga agctccaccc tataattctg
361 aacctgcaga agaatctgaa cataaaaaca acaattacga accaaaccta tttaaaactc
421 cacaaaggaa accatcttat aatcagctgg cttcaactcc aataatattc aaagagcaag
481 ggctgactct gccgctgtac caatctcctg taaaagaatt agataaattc aaattagact
541 taggaaggaa tgttcccaat agtagacata aaagtcttcg cacagtgaaa actaaaatgg
601 atcaagcaga tgatgtttcc tgtccacttc taaattcttg tcttagtgaa agtcctgttg
661 ttctacaatg tacacatgta acaccacaaa gagataagtc agtggtatgt gggagtttgt
721 ttcatacacc aaagtttgtg aagggtcgtc agacaccaaa acatatttct gaaagtctag
781 gagctgaggt ggatcctgat atgtcttggt caagttcttt agctacacca cccaccctta
841 gttctactgt gctcatagtc agaaatgaag aagcatctga aactgtattt cctcatgata
901 ctactgctaa tgtgaaaagc tatttttcca atcatgatga aagtctgaag aaaaatgata
961 gatttatcgc ttctgtgaca gacagtgaaa acacaaatca aagagaagct gcaagtcatg
1021 gatttggaaa aacatcaggg aattcattta aagtaaatag ctgcaaagac cacattggaa
1081 agtcaatgcc aaatgtccta gaagatgaag tatatgaaac agttgtagat acctctgaag
1141 aagatagttt ttcattatgt ttttctaaat gtagaacaaa aaatctacaa aaagtaagaa
1201 ctagcaagac taggaaaaaa attttccatg aagcaaacgc tgatgaatgt gaaaaatcta
1261 aaaaccaagt gaaagaaaaa tactcatttg tatctgaagt ggaaccaaat gatactgatc
1321 cattagattc aaatgtagca aatcagaagc cctttgagag tggaagtgac aaaatctcca
1381 aggaagttgt accgtctttg gcctgtgaat ggtctcaact aaccctttca ggtctaaatg
1441 gagcccagat ggagaaaata cccctattgc atatttcttc atgtgaccaa aatatttcag
1501 aaaaagacct attagacaca gagaacaaaa gaaagaaaga ttttcttact tcagagaatt
1561 ctttgccacg tatttctagc ctaccaaaat cagagaagcc attaaatgag gaaacagtgg
1621 taaataagag agatgaagag cagcatcttg aatctcatac agactgcatt cttgcagtaa
1681 agcaggcaat atctggaact tctccagtgg cttcttcatt tcagggtatc aaaaagtcta
1741 tattcagaat aagagaatca cctaaagaga ctttcaatgc aagtttttca ggtcatatga
1801 ctgatccaaa ctttaaaaaa gaaactgaag cctctgaaag tggactggaa atacatactg
1861 tttgctcaca gaaggaggac tccttatgtc caaatttaat tgataatgga agctggccag
1921 ccaccaccac acagaattct gtagctttga agaatgcagg tttaatatcc actttgaaaa
1981 agaaaacaaa taagtttatt tatgctatac atgatgaaac atcttataaa ggaaaaaaaa
2041 taccgaaaga ccaaaaatca gaactaatta actgttcagc ccagtttgaa gcaaatgctt
2101 ttgaagcacc acttacattt gcaaatgctg attcaggttt attgcattct tctgtgaaaa
2161 gaagctgttc acagaatgat tctgaagaac caactttgtc cttaactagc tcttttggga
2221 caattctgag gaaatgttct agaaatgaaa catgttctaa taatacagta atctctcagg
2281 atcttgatta taaagaagca aaatgtaata aggaaaaact acagttattt attaccccag
2341 aagctgattc tctgtcatgc ctgcaggaag gacagtgtga aaatgatcca aaaagcaaaa
2401 aagtttcaga tataaaagaa gaggtcttgg ctgcagcatg tcacccagta caacattcaa
2461 aagtggaata cagtgatact gactttcaat cccagaaaag tcttttatat gatcatgaaa
2521 atgccagcac tcttatttta actcctactt ccaaggatgt tctgtcaaac ctagtcatga
2581 tttctagagg caaagaatca tacaaaatgt cagacaagct caaaggtaac aattatgaat
2641 ctgatgttga attaaccaaa aatattccca tggaaaagaa tcaagatgta tgtgctttaa
2701 atgaaaatta taaaaacgtt gagctgttgc cacctgaaaa atacatgaga gtagcatcac
2761 cttcaagaaa ggtacaattc aaccaaaaca caaatctaag agtaatccaa aaaaatcaag
2821 aagaaactac ttcaatttca aaaataactg tcaatccaga ctctgaagaa cttttctcag
2881 acaatgagaa taattttgtc ttccaagtag ctaatgaaag gaataatctt gctttaggaa
2941 atactaagga acttcatgaa acagacttga cttgtgtaaa cgaacccatt ttcaagaact
3001 ctaccatggt tttatatgga gacacaggtg ataaacaagc aacccaagtg tcaattaaaa
3061 aagatttggt ttatgttctt gcagaggaga acaaaaatag tgtaaagcag catataaaaa
3121 tgactctagg tcaagattta aaatcggaca tctccttgaa tatagataaa ataccagaaa
3181 aaaataatga ttacatgaac aaatgggcag gactcttagg tccaatttca aatcacagtt
3241 ttggaggtag cttcagaaca gcttcaaata aggaaatcaa gctctctgaa cataacatta
3301 agaagagcaa aatgttcttc aaagatattg aagaacaata tcctactagt ttagcttgtg
3361 ttgaaattgt aaataccttg gcattagata atcaaaagaa actgagcaag cctcagtcaa
3421 ttaatactgt atctgcacat ttacagagta gtgtagttgt ttctgattgt aaaaatagtc
3481 atataacccc tcagatgtta ttttccaagc aggattttaa ttcaaaccat aatttaacac
3541 ctagccaaaa ggcagaaatt acagaacttt ctactatatt agaagaatca ggaagtcagt
3601 ttgaatttac tcagtttaga aaaccaagct acatattgca gaagagtaca tttgaagtgc
3661 ctgaaaacca gatgactatc ttaaagacca cttctgagga atgcagagat gctgatcttc
3721 atgtcataat gaatgcccca tcgattggtc aggtagacag cagcaagcaa tttgaaggta
3781 cagttgaaat taaacggaag tttgctggcc tgttgaaaaa tgactgtaac aaaagtgctt
3841 ctggttattt aacagatgaa aatgaagtgg ggtttagggg cttttattct gctcatggca
3901 caaaactgaa tgtttctact gaagctctgc aaaaagctgt gaaactgttt agtgatattg
3961 agaatattag tgaggaaact tctgcagagg tacatccaat aagtttatct tcaagtaaat
4021 gtcatgattc tgttgtttca atgtttaaga tagaaaatca taatgataaa actgtaagtg
4081 aaaaaaataa taaatgccaa ctgatattac aaaataatat tgaaatgact actggcactt
4141 ttgttgaaga aattactgaa aattacaaga gaaatactga aaatgaagat aacaaatata
4201 ctgctgccag tagaaattct cataacttag aatttgatgg cagtgattca agtaaaaatg
4261 atactgtttg tattcataaa gatgaaacgg acttgctatt tactgatcag cacaacatat
4321 gtcttaaatt atctggccag tttatgaagg agggaaacac tcagattaaa gaagatttgt
4381 cagatttaac ttttttggaa gttgcgaaag ctcaagaagc atgtcatggt aatacttcaa
4441 ataaagaaca gttaactgct actaaaacgg agcaaaatat aaaagatttt gagacttctg
4501 atacattttt tcagactgca agtgggaaaa atattagtgt cgccaaagag tcatttaata
4561 aaattgtaaa tttctttgat cagaaaccag aagaattgca taacttttcc ttaaattctg
4621 aattacattc tgacataaga aagaacaaaa tggacattct aagttatgag gaaacagaca
4681 tagttaaaca caaaatactg aaagaaagtg tcccagttgg tactggaaat caactagtga
4741 ccttccaggg acaacccgaa cgtgatgaaa agatcaaaga acctactcta ttgggttttc
4801 atacagctag cgggaaaaaa gttaaaattg caaaggaatc tttggacaaa gtgaaaaacc
4861 tttttgatga aaaagagcaa ggtactagtg aaatcaccag ttttagccat caatgggcaa
4921 agaccctaaa gtacagagag gcctgtaaag accttgaatt agcatgtgag accattgaga
4981 tcacagctgc cccaaagtgt aaagaaatgc agaattctct caataatgat aaaaaccttg
5041 tttctattga gactgtggtg ccacctaagc tcttaagtga taatttatgt agacaaactg
5101 aaaatctcaa aacatcaaaa agtatctttt tgaaagttaa agtacatgaa aatgtagaaa
5161 aagaaacagc aaaaagtcct gcaacttgtt acacaaatca gtccccttat tcagtcattg
5221 aaaattcagc cttagctttt tacacaagtt gtagtagaaa aacttctgtg agtcagactt
5281 cattacttga agcaaaaaaa tggcttagag aaggaatatt tgatggtcaa ccagaaagaa
5341 taaatactgc agattatgta ggaaattatt tgtatgaaaa taattcaaac agtactatag
5401 ctgaaaatga caaaaatcat ctctccgaaa aacaagatac ttatttaagt aacagtagca
5461 tgtctaacag ctattcctac cattctgatg aggtatataa tgattcagga tatctctcaa
5521 aaaataaact tgattctggt attgagccag tattgaagaa tgttgaagat caaaaaaaca
5581 ctagtttttc caaagtaata tccaatgtaa aagatgcaaa tgcataccca caaactgtaa
5641 atgaagatat ttgcgttgag gaacttgtga ctagctcttc accctgcaaa aataaaaatg
5701 cagccattaa attgtccata tctaatagta ataattttga ggtagggcca cctgcattta
5761 ggatagccag tggtaaaatc gtttgtgttt cacatgaaac aattaaaaaa gtgaaagaca
5821 tatttacaga cagtttcagt aaagtaatta aggaaaacaa cgagaataaa tcaaaaattt
5881 gccaaacgaa aattatggca ggttgttacg aggcattgga tgattcagag gatattcttc
5941 ataactctct agataatgat gaatgtagca cgcattcaca taaggttttt gctgacattc
6001 agagtgaaga aattttacaa cataaccaaa atatgtctgg attggagaaa gtttctaaaa
6061 tatcaccttg tgatgttagt ttggaaactt cagatatatg taaatgtagt atagggaagc
6121 ttcataagtc agtctcatct gcaaatactt gtgggatttt tagcacagca agtggaaaat
6181 ctgtccaggt atcagatgct tcattacaaa acgcaagaca agtgttttct gaaatagaag
6241 atagtaccaa gcaagtcttt tccaaagtat tgtttaaaag taacgaacat tcagaccagc
6301 tcacaagaga agaaaatact gctatacgta ctccagaaca tttaatatcc caaaaaggct
6361 tttcatataa tgtggtaaat tcatctgctt tctctggatt tagtacagca agtggaaagc
6421 aagtttccat tttagaaagt tccttacaca aagttaaggg agtgttagag gaatttgatt
6481 taatcagaac tgagcatagt cttcactatt cacctacgtc tagacaaaat gtatcaaaaa
6541 tacttcctcg tgttgataag agaaacccag agcactgtgt aaactcagaa atggaaaaaa
6601 cctgcagtaa agaatttaaa ttatcaaata acttaaatgt tgaaggtggt tcttcagaaa
6661 ataatcactc tattaaagtt tctccatatc tctctcaatt tcaacaagac aaacaacagt
6721 tggtattagg aaccaaagtg tcacttgttg agaacattca tgttttggga aaagaacagg
6781 cttcacctaa aaacgtaaaa atggaaattg gtaaaactga aactttttct gatgttcctg
6841 tgaaaacaaa tatagaagtt tgttctactt actccaaaga ttcagaaaac tactttgaaa
6901 cagaagcagt agaaattgct aaagctttta tggaagatga tgaactgaca gattctaaac
6961 tgccaagtca tgccacacat tctcttttta catgtcccga aaatgaggaa atggttttgt
7021 caaattcaag aattggaaaa agaagaggag agccccttat cttagtggga gaaccctcaa
7081 tcaaaagaaa cttattaaat gaatttgaca ggataataga aaatcaagaa aaatccttaa
7141 aggcttcaaa aagcactcca gatggcacaa taaaagatcg aagattgttt atgcatcatg
7201 tttctttaga gccgattacc tgtgtaccct ttcgcacaac taaggaacgt caagagatac
7261 agaatccaaa ttttaccgca cctggtcaag aatttctgtc taaatctcat ttgtatgaac
7321 atctgacttt ggaaaaatct tcaagcaatt tagcagtttc aggacatcca ttttatcaag
7381 tttctgctac aagaaatgaa aaaatgagac acttgattac tacaggcaga ccaaccaaag
7441 tctttgttcc accttttaaa actaaatcac attttcacag agttgaacag tgtgttagga
7501 atattaactt ggaggaaaac agacaaaagc aaaacattga tggacatggc tctgatgata
7561 gtaaaaataa gattaatgac aatgagattc atcagtttaa caaaaacaac tccaatcaag
7621 cagcagctgt aactttcaca aagtgtgaag aagaaccttt agatttaatt acaagtcttc
7681 agaatgccag agatatacag gatatgcgaa ttaagaagaa acaaaggcaa cgcgtctttc
7741 cacagccagg cagtctgtat cttgcaaaaa catccactct gcctcgaatc tctctgaaag
7801 cagcagtagg aggccaagtt ccctctgcgt gttctcataa acagctgtat acgtatggcg
7861 tttctaaaca ttgcataaaa attaacagca aaaatgcaga gtcttttcag tttcacactg
7921 aagattattt tggtaaggaa agtttatgga ctggaaaagg aatacagttg gctgatggtg
7981 gatggctcat accctccaat gatggaaagg ctggaaaaga agaattttat agggctctgt
8041 gtgacactcc aggtgtggat ccaaagctta tttctagaat ttgggtttat aatcactata
8101 gatggatcat atggaaactg gcagctatgg aatgtgcctt tcctaaggaa tttgctaata
8161 gatgcctaag cccagaaagg gtgcttcttc aactaaaata cagatatgat acggaaattg
8221 atagaagcag aagatcggct ataaaaaaga taatggaaag ggatgacaca gctgcaaaaa
8281 cacttgttct ctgtgtttct gacataattt cattgagcgc aaatatatct gaaacttcta
8341 gcaataaaac tagtagtgca gatacccaaa aagtggccat tattgaactt acagatgggt
8401 ggtatgctgt taaggcccag ttagatcctc ccctcttagc tgtcttaaag aatggcagac
8461 tgacagttgg tcagaagatt attcttcatg gagcagaact ggtgggctct cctgatgcct
8521 gtacacctct tgaagcccca gaatctctta tgttaaagat ttctgctaac agtactcggc
8581 ctgctcgctg gtataccaaa cttggattct ttcctgaccc tagacctttt cctctgccct
8641 tatcatcgct tttcagtgat ggaggaaatg ttggttgtgt tgatgtaatt attcaaagag
8701 cataccctat acagtggatg gagaagacat catctggatt atacatattt cgcaatgaaa
8761 gagaggaaga aaaggaagca gcaaaatatg tggaggccca acaaaagaga ctagaagcct
8821 tattcactaa aattcaggag gaatttgaag aacatgaaga aaacacaaca aaaccatatt
8881 taccatcacg tgcactaaca agacagcaag ttcgtgcttt gcaagatggt gcagagcttt
8941 atgaagcagt gaagaatgca gcagacccag cttaccttga gggttatttc agtgaagagc
9001 agttaagagc cttgaataat cacaggcaaa tgttgaatga taagaaacaa gctcagatcc
9061 agttggaaat taggaaggcc atggaatctg ctgaacaaaa ggaacaaggt ttatcaaggg
9121 atgtcacaac cgtgtggaag ttgcgtattg taagctattc aaaaaaagaa aaagattcag
9181 ttatactgag tatttggcgt ccatcatcag atttatattc tctgttaaca gaaggaaaga
9241 gatacagaat ttatcatctt gcaacttcaa aatctaaaag taaatctgaa agagctaaca
9301 tacagttagc agcgacaaaa aaaactcagt atcaacaact accggtttca gatgaaattt
9361 tatttcagat ttaccagcca cgggagcccc ttcacttcag caaattttta gatccagact
9421 ttcagccatc ttgttctgag gtggacctaa taggatttgt cgtttctgtt gtgaaaaaaa
9481 caggacttgc ccctttcgtc tatttgtcag acgaatgtta caatttactg gcaataaagt
9541 tttggataga ccttaatgag gacattatta agcctcatat gttaattgct gcaagcaacc
9601 tccagtggcg accagaatcc aaatcaggcc ttcttacttt atttgctgga gatttttctg
9661 tgttttctgc tagtccaaaa gagggccact ttcaagagac attcaacaaa atgaaaaata
9721 ctgttgagaa tattgacata ctttgcaatg aagcagaaaa caagcttatg catatactgc
9781 atgcaaatga tcccaagtgg tccaccccaa ctaaagactg tacttcaggg ccgtacactg
9841 ctcaaatcat tcctggtaca ggaaacaagc ttctgatgtc ttctcctaat tgtgagatat
9901 attatcaaag tcctttatca ctttgtatgg ccaaaaggaa gtctgtttcc acacctgtct
9961 cagcccagat gacttcaaag tcttgtaaag gggagaaaga gattgatgac caaaagaact
10021 gcaaaaagag aagagccttg gatttcttga gtagactgcc tttacctcca cctgttagtc
10081 ccatttgtac atttgtttct ccggctgcac agaaggcatt tcagccacca aggagttgtg
10141 gcaccaaata cgaaacaccc ataaagaaaa aagaactgaa ttctcctcag atgactccat
10201 ttaaaaaatt caatgaaatt tctcttttgg aaagtaattc aatagctgac gaagaacttg
10261 cattgataaa tacccaagct cttttgtctg gttcaacagg agaaaaacaa tttatatctg
10321 tcagtgaatc cactaggact gctcccacca gttcagaaga ttatctcaga ctgaaacgac
10381 gttgtactac atctctgatc aaagaacagg agagttccca ggccagtacg gaagaatgtg
10441 agaaaaataa gcaggacaca attacaacta aaaaatatat ctaagcattt gcaaaggcga
10501 caataaatta ttgacgctta acctttccag tttataagac tggaatataa tttcaaacca
10561 cacattagta cttatgttgc acaatgagaa aagaaattag tttcaaattt acctcagcgt
10621 ttgtgtatcg ggcaaaaatc gttttgcccg attccgtatt ggtatacttt tgcttcagtt
10681 gcatatctta aaactaaatg taatttatta actaatcaag aaaaacatct ttggctgagc
10741 tcggtggctc atgcctgtaa tcccaacact ttgagaagct gaggtgggag gagtgcttga
10801 ggccaggagt tcaagaccag cctgggcaac atagggagac ccccatcttt acaaagaaaa
10861 aaaaaagggg aaaagaaaat cttttaaatc tttggatttg atcactacaa gtattatttt
10921 acaagtgaaa taaacatacc attttctttt agattgtgtc attaaatgga atgaggtctc
10981 ttagtacagt tattttgatg cagataattc cttttagttt agctactatt ttaggggatt
11041 ttttttagag gtaactcact atgaaatagt tctccttaat gcaaatatgt tggttctgct
11101 atagttccat cctgttcaaa agtcaggatg aatatgaaga gtggtgtttc cttttgagca
11161 attcttcatc cttaagtcag catgattata agaaaaatag aaccctcagt gtaactctaa
11221 ttccttttta ctattccagt gtgatctctg aaattaaatt acttcaacta aaaattcaaa
11281 tactttaaat cagaagattt catagttaat ttattttttt tttcaacaaa atggtcatcc
11341 aaactcaaac ttgagaaaat atcttgcttt caaattggca ctgatt
//
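The numbered, space-separated layout above is the GenBank sequence format, which has to be cleaned before the sequence can be searched or translated. A minimal sketch, assuming each line starts with a position number followed by blocks of bases and that "//" ends the record (the function name is hypothetical):

```python
def parse_origin(lines):
    """Join GenBank-style sequence lines into one lowercase string.

    Leading position numbers, spaces, blank lines and the '//' terminator
    are dropped.
    """
    blocks = []
    for line in lines:
        parts = line.split()
        if not parts or parts[0] == "//":
            continue
        if parts[0].isdigit():
            parts = parts[1:]  # drop the position number
        blocks.extend(parts)
    return "".join(blocks).lower()


if __name__ == "__main__":
    # One line copied from mRNA Sequence 1 above.
    line = "181 tggagcggac ttatttacca agcattggag gaatatcgta ggtaaaaatg cctattggat"
    seq = parse_origin([line, "//"])
    start = seq.find("atg")
    print(len(seq), 181 + start)  # sequence length and position of the ATG
```

The ATG found on this line is followed by the codons cct att gga tcc, which translate to PIGS and so match the start of protein sequence 3 (MPIGS...).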
mRNA Sequence 2
1 ccgtaattag ttaacgtgaa tacttttaaa tgacgttaag tttttctggt tattggctcc
61 caccatatcc gctttcgggt gcagtttggt tcgaagcaca attataatct ctccccatct
121 tcatgataag ttctgttctt gccccaaaag atgggcctaa gacttcttta ccaagaacat
181 gaccgttgaa tacaaaagga gaccaacttt ttgggaaatt tttaaagcga gatgcagcac
241 agcagattta ggaccaataa gcctcaattg gtttgaagaa ctttcttcag aagccccacc
301 atataatact gaacctccag aagaatctga atataaaccc caaggccatg aaccacagct
361 atttaaaaca ccacagagga atccctctta ccatcagttt gcttcaactc caataatgtt
421 caaagaacaa agtcaaactc taccactaga ccagtcacct ttcaaagaat tagggaatgt
481 tgttgcaaat agtaaacgta aacatcacag caaaaagaag gccagaaagg accctgtggt
541 agatgttgcc agtctgccgc tgaaagcttg tcccagtgaa agcccttgta ctccgcgatg
601 cacacaggtg gcaccgcagc gaagaaagcc agtggtatct ggaagtttat tctatacacc
661 aaaacttgag gagacaccaa aacatatttc tgaaagtctg ggagttgaag tggatccgga
721 tatgtcttgg acaagttcat tagctacacc accaaccctt agttccactg tgctcatagc
781 ccgagatgaa gaagcacaca gaaatgcatt tccagccgac tctcctgcta gtttgaaaag
841 ctatttttct aaccacaatg aaagtctgaa aaagaatgat agatttattc cctctgtgtc
901 tgacagtgaa aacaaaagcc agcaagaagc ttttagtcag ggactggaga aaatgttagg
961 ggattcatct agcaaaataa atcgcttcag agactgcctt agaaaaccaa taccaaatgt
1021 tctagaagat ggggagacag ctgtagatac ttcaggagaa gatagtttct cgttatgttt
1081 tcctaaacgt agaaccagaa atctacagaa aacgagaatg ggcaagatga agaagaaaat
1141 cttcagtgaa acaaggactg atggattaag tgaagaagcc agaggacaag ctgatgataa
1201 aaactcattt gcacttgaaa ttgaaccaag agatagtgag cccttagatc catctgtaac
1261 aaaccagaag cccctttaca gccagagtgg ggacatctcc agtgaagctg gccagtgttc
1321 agacagtata tggtctcagc cagatccctc tggtctaaat ggaacccaga caagaaaaat
1381 acccctactc catatttctt ctcataagca aagtatttta gaagacttca tagatatgaa
1441 gaaggaaggt actggctcta ttacttttcc tcatatttct agtcttccag aaccagaaaa
1501 gatgttcagt gaggaaactc tggtagacaa ggaacatgaa ggacagcatc ttgagtcact
1561 tgaagactcc atctcaggga aacaaatggt gtctggaacg agtcaaacag cctgtctctc
1621 tccgagtatc aggaagtcta tagtcaagat gagagagcca cttgaagaga ctttggatac
1681 tgttttctca gacagtatga ccagctcagc ctttacagaa gaacttgacg cctctgcagg
1741 gggactggaa atacatactg catgctcaca gagagaggat tctttatgtc ctagttcagt
1801 ggacactgga agctggccaa ccactctcac tgacacttct gcaactgtga aaaatgcagg
1861 tttaataacc actctcaaaa ataaaagaag aaagtttatt tactctgtaa gtgatgatgc
1921 atctcatcaa ggaaaaaaac tacagacaca gagacagtca gagcttacta acctttcagc
1981 cccgtttgaa gcaagtgctt ttgaagtacc attccccttt acaaatgtag attcaggtat
2041 accagattct tcaatcaaaa gaagcaattt accaaatgat cctgaagagc catctttgtc
2101 cttgaccaac tcttttgtga ctgctgccag taaagaaatt agttatattc atgcattgat
2161 atctcaggat ctaaatgaca aagaagcaat actcagtgaa gaaaagccac agccatatac
2221 agccctagaa gctgactttc tgtcgtgctt gccagaaaga tcatgtgaaa atgatcaaaa
2281 aagtccaaaa gtttccgacc gaaaagaaaa agtcttagtc tcagcatgtc gtccttcagg
2341 aaggcttgca gcagcagtgc agctcagcag cattagcttt gactctcagg aaaaccctct
2401 tggtagccac aacgtaacaa gtactcttaa attaactccc agcccgaaga cacctctgtc
2461 aaagccagtt gtggtttcta gagggaaaat gtgtaaaatg ccagagaaac tgcaatgtaa
2521 gagttgtaaa gataatattg aattaagcaa aaacatccct ctgggggtta atgaaatgtg
2581 tgtcttaagt gaaaattctg aaacacctga gcttctgcca cctctagaat atataacaga
2641 agtgtcctca tcagtgaagt cacagttcaa tcaaaataca aaaatagcag tcgtacaaaa
2701 ggaccaaaaa gactcaactt ttatttcaga agtaacagtc catatgaatt ctgaagaact
2761 tttcccagaa aaggagaata attttgcttt tcaagtaacc aatgaaagca ataaacccaa
2821 tataggaagt actgtggaat tccaggaaga agacctcagc cacacaaaag ggcatagtct
2881 caagaactct cccatgacag tagatagaga cctagatgat gagcaagcag gccaagtgtt
2941 gattacagag gactcagatt cattagcagt agtccatgat tgtacaaaga agagcagaaa
3001 tactatagag cagcatcaga aaggaactgc agacaaagac ttcaagtcaa attcctcctt
3061 gtatttgaaa tcagatggga acaatgatta tttagacaaa tggtcagagt tcttggatcc
3121 actcatgaac cataaatttg gaggtagctt cagaacagct tccaataaag aaataaaact
3181 ttcagaggat aatgtcaaga aaagcaaaat gttcttcaaa gatatcgaag aacagtatcc
3241 tactagttta gattgtattg acactgttag taccctacaa ttagcaaaca agaagagact
3301 aagtgaacct catacatttg atttgaagtc aggtactact gtatctacac agtgtcatag
3361 tcaatcatct gtttctcatg aagatactca cacagcacct cagatgttat cttcaaagca
3421 agattttcat tcaagtcata acttaacgcc cagccaaaaa gcagaaatta cagaactgtc
3481 tactatcttg gaagaatcag gaagtcagtt tgaattcaca cagttcaaaa atccaagcca
3541 catagcacag aataatacat ctgcagtgct tggaaaccag atggctgttg taaggaccgc
3601 ttctgaggag tggaaagatg ttgatcttca tctcccactg aatccctcct ctgtaggtca
3661 gatagatcac aacaagaaat tcgaatgttt ggttggagtt aagcaaagct cttctcacct
3721 gttagaagac acttgtaacc aaaatacatc ttgtttttta ccgataaaag aaatggagtt
3781 tggaggattt tgttctgctc ttggcacaaa acttagtgtg tctaatgagg ctctgagaaa
3841 ggctatgaaa ctgttcagtg acattgaaaa tattagtgag gagccttcta caaaagtagg
3901 accaagagga ttctcttcat gtgcacatca tgattctgtt gcttccgtgt ttaagataaa
3961 gaaacaaaac actgataaaa gttttgatga gaaatctagt aagtgccagg taacagtaca
4021 aaataataag gaaatgacta cctgtattct tgttgatgaa aatcctgaaa attatgtaaa
4081 gaatataaaa caagataaca actatactgg ctctcaaaga aatgcttata aattagaaaa
4141 ctctgatgtt agtaaatcaa gtacaagtgg cacagtttat attaataaag gtgacagtga
4201 tttacctttt gctgctgaaa aaggcaataa gtatcctgag tcatgtaccc aatatgtgag
4261 ggaagaaaat gcacaaatta aggaaagtgt atcagattta acatgtttgg aagtcatgaa
4321 agctgaggaa acatgtcata tgaaatcttc agataaagaa caattacctt cagataagat
4381 ggaacaaaat atgaaagagt ttaatatatc ttttcagact gcaagcggga aaaatatcag
4441 agtctccaaa gagtcactaa ataaaagtgt gaatatttta gatcaggaaa cagaagactt
4501 gactgtcact tcagattctt tgaattctaa aattctttgt ggcataaata aggacaaaat
4561 gcatatttca tgtcacaaga aatcaatcaa tattaaaaag gtatttgaag aacatttccc
4621 aattggaact gtcagtcaat taccagctct tcagcagtat cctgaatatg aaatagaaag
4681 tatcaaagaa cctactctgt tgagttttca tacagctagt gggaaaaaag tcaaaattat
4741 gcaggaatct ttggacaaag tgaaaaatct ttttgatgag acacagtatg ttaggaaaac
4801 caccaatttt ggtcatcaag aatcaaaacc cctgaaggac agagaggact ataaagaaag
4861 acttacatta gcatatgaga aaattgaagt aactgcctca aaatgtgaag aaatgcagaa
4921 ctttgtctct aagcagactg aaatgctacc ccagcaaaat gatcatatgt ataggcaaac
4981 tgaaaatctc acatcaaatg gtagctctcc caaagtacat ggaaacatag aaaataaaat
5041 agaaaagaat cctagaattt gctgtatttg tcagtcctca tactttgtca ctgaagattc
5101 tgctttggca tgttatacgg gggacagtag aaaaacttgt gtcggagagt cttctctgtc
5161 caaaggcaaa aaatggctta gagaacaaag tgataagctt ggaacaagaa atactattga
5221 aatccaatgt gtaaaggaac acactgaaga ttttgcagga aatgccttat atgaacatag
5281 tttagtcatt atcagaactg aaattgatac aagtcatgtc tctgaaaacc aagcttcaac
5341 cctctttagt gaccctaatg tgtgtcacag ctatctatcc cattctagtt tttgtcatca
5401 tgatgatatg cataatgatt caggatattt cttaaaagat aaaattgatt ctgatgtcca
5461 gccagacatg aagaatactg aaggcaatgc cattttccct aaaatatctg ctacaaaaga
5521 aataaaacta cacccacaaa ctgtaaatga agagtgtgtt caaaaactgg agactaatgc
5581 ttcaccatat gcaaataaaa atatagccat tgactcagct atgctggatt taaggaattg
5641 taaggtaggc tcacctgtat tcattacaac tcattcacaa gaaactgtaa gaatgaaaga
5701 gatattcaca gataactgta gtaaaatagt cgaacaaaac agggagagta aaccagacac
5761 ttgccagaca agctgtcata aagcattgga taattcagag gattttatat gtcctagctc
5821 ttcaggtgat gtctgcataa actcacctat ggctattttt tatcctcaaa gtgaacaaat
5881 tttacagcat aaccaaagtg tgtctggact gaagaaagct gcaacaccac ctgttagttt
5941 ggaaacttgg gatacatgta aatctataag aggatctcct caggaagtcc atccttcacg
6001 cacttatgga atttttagca cagcaagtgg aaaagctgta caagtatctg atgcttcatt
6061 ggaaaaggca cggcaagtgt tttctgagat agatggtgat gctaaacagt tagcttccat
6121 ggtgtcactg gaaggtaatg aaaaatcaca tcactctgtg aaaagagaaa gctctgtggt
6181 gcataacacc catggtgtat tgtcactccg aaaaaccctc ccaggcaatg tcagttcatc
6241 tgtattctct ggatttagca ccgcaggtgg aaaactggtc acagtttcag aaagtgcttt

6301 acataaagtt aaaggaatgt tagaggaatt tgatttgatc agaactgaac atactctcca
6361 gcattcacct acacctgaag acgtatcaaa aatacctcct caaccttgtc ttgaaagcag
6421 aaccccagaa tactctgtaa gctctaaatt gcagaaaacc tacaatgata aatccaggtc
6481 accaagtaat tataaagaaa gtggttcttc aggcaatact caatctcttg aagtttctcc
6541 ccaactctct cagatggaga gaaagcaaga aacacagtcg gtattaggaa caaaagtgtc
6601 ccagaggaag actaatatct tggaaaaaaa acaaaactta ccccaaaaca taaaaataga
6661 aagtaataaa atggaaacat tttctgatgt ttccatgaaa acaaatgtag gagagtacta
6721 ctccaaagag ccagagaact attttgaaac agaagcagtg gaaattgcca aagcttttat
6781 ggaagatgat gagctgacgg attctgaaca gactcatgcc aaatgctcac tgtttgcatg
6841 cccccaaaac gaggctttat taaattcgag aactagaaaa agaggaggaa tggctggtgt
6901 tgcagttgga caacccccaa ttaaaaggag cttattaaat gaatttgaca ggataataga
6961 aagtaaagga aaatccttaa cgccttcaaa aagcactcca gatggcacaa taaaagacag
7021 acgattgttc acacaccaca tgtctttaga gccggttacc tgtggaccct tctgctcaag
7081 taaagaaagg caagaaaccc agagcccaca tgttacctca cctgctcaag gacttcagtc
7141 taaggggcat ccttcaagac actcagctgt gggaaagtct tcaagcaatc ctacagtttc
7201 tgccctaaga tctgaaagga ccagacactc agtctcagac aaatccacca aagtctttgt
7261 cccacctttc aaggtgaaat cacggtttca cagagatgaa cattttgata gcaagaatgt
7321 taatttggag ggaaaaaacc aaaagagcgc agatggagtc agtgaagatg ggaacgacag
7381 tgactttcct cagtttaaca aagatttaat gtcaagcctt cagaatgcca gagacctaca
7441 ggatatacga attaaaaaca aagaaaggca tcatctctgt ccgcagccag gcagtctgta
7501 tcttacaaaa tcatccaccc tgccccggat ttctctgcag gccgcagtag gagacagcgt
7561 tccttctgcc tgttctccta agcagctcta tatgtatggc gtttctaaag catgcataag
7621 cgttaacagc aaaaacgctg agtattttca gtttgccatt gaggatcact ttggtaaaga
7681 agctctgtgc gctggcaaag gctttcggtt ggcagatggt ggatggctaa tcccctccga
7741 tgacgggaag gctgggaaag aagaatttta cagggctctg tgtgacactc caggtgtgga
7801 tccaaagctg atttctagcg tctgggtctc taaccactac aggtggattg tatggaaact
7861 ggcagctatg gaatttgctt ttcctaagga attcgcgaat agatgtctaa acccagagcg
7921 agtgctgctt caactaaaat acagatacga tgtggaaatt gacaatagca gcagatcagc
7981 tctcaagaag atactggaaa gggatgatac agcagcaaaa acgcttgtcc tctgtgtttc
8041 tgatatcatt tcactaagca caaatgtgtc tgaaacttca ggcagtaaag ctagcagtga
8101 ggacagtaac aaagtagaca cgatcgaact cacagatggg tggtatgctg tcaaggccca
8161 gctagaccct ccactcctgg ctctcgtaaa gagtgggaga ctcactgtgg gtcagaagat
8221 cattactcag ggagcagagc tggtgggctc tcccgatgcc tgtgcacctc tggaagcccc
8281 agactccctt aggctaaaga tttctgcaaa cagcacgcgg cctgctcgct ggcacagcaa
8341 gctggggttc tttcatgacc ccaggccctt ccctctgccc ttgtcctcac tgttcagtga
8401 tggaggaaat gttggttgtg tggatgtcat cgttcagaga gtgtaccctc tacagtgggt
8461 ggagaagacg gtgtctggat cgtacatatt tcgtaatgag agagaggagg agaaggaagc
8521 gctgagattt gcagaggccc agcagaagaa actagaggcc ttgttcacca aagtccacac
8581 agagcttaaa gagcatgaag aagacatagc ccagcggcgt gtgctgtccc gggcactcac
8641 acggcagcag gtccacgctc tgcaggacgg tgcagagctc tatgcagcag tgcaggatgc
8701 atcagaccca gagcacctgg agacttgttt cagcgaagag cagctgagag ccttgaacaa
8761 ctacagacaa atgctgagcg ataagaaaca agcgcggatc cagtcagagt tccggaaggc
8821 cctggaggcc gctgagaaag aagagggttt atcaagggac gtctcaactg tgtggaagct
8881 tcgtgttaca agctacaaga aaagagaaaa atcagctctg ttgagtatct ggcgtccatc
8941 ttcagacttg ccctccctgt taacagaagg acagagatac agaatctatc atctttcggt
9001 gtcaaaatcg aagaataaat ttgagtggcc cagcatccag ttaacagcca caaagagaac
9061 tcagtatcaa cagctgccgg tttccagtga gaccctgctc cagctttacc agcccaggga
9121 gctccttccc ttcagcaaac tgtcagatcc agccttccag ccaccttgtt ctgaagtgga
9181 tgtagtagga gttgtagttt ctgttgtaaa accaataggt cttgctcctt tggtctactt
9241 gtcagacgag tgccttcatt tattagtggt gaaatttgga atagacctta atgaagacat
9301 aaagccacgt gtgcttattg ctgcgagcaa cctccagtgg cggccggagt ccacgtcacg
9361 agtgccaact ttatttgctg ggaacttttc cgtattctct gccagtccaa aggaggccca
9421 ctttcaggag agggtcacga acatgaagca tgctattgag aacatcgaca cattttacaa
9481 ggaggcagaa aagaagctta tacaggtgct gaagggagac agtccaaagt ggtccacccc
9541 gaacaaagac cccacccgag aaccctaccc agcctccact tgctctgctt cagaccttgc
9601 ttcaggaggt cagttaccga ggagttcacc tactgatcag caaagttatc gaagtccttt
9661 atcatgctgc acaccgacgg ggaaatctac acccctggcc cactcagcct ggatggcagc
9721 caagtcttgt agtggggaga atgagattga ggaccccaag acctgtcgaa aaaaaagggc
9781 cttggatctc ctaagtcggc tgcccttacc gccaccgctc agtcctgtct gcacctttgt
9841 ctctccggct gcacagaagg cctttcagcc accacggagc tgtggcacca aatacccaac
9901 acctctaaaa aaagagggac ccagttcccc ttggagcagg gcgccatttc agaaggccag
9961 tggcgtttct ctcctggact gtgattcagt agctgatgaa gaacttgcct tgctcagtac
10021 ccaagccctt gtgcctcact cagtgggagg aagtgaacaa gtgtttccca gtgattccac
10081 aaggacagaa ggaccctcag ccagcacaga ggccagacca gcaaatagat ccaagaggga
10141 gtctctgagg gactgtagag atgacagcga tgggaaattg gctgctgaga cagttccaga
10201 ctactcataa gtgtccgtgt gtccagacag gaatacactt attgacgcat ttgtttactc
10261 tacagatgct ctacctggta actctttagg attagacttg atcaccacaa gaattatttt
10321 gcatacaaaa taaacatagc tttgctttta
mRNA Sequence 3
1 cgcgtcggcg gagcgggttt ttggcgcgag gatctgaaag aaggtcggcg gaggcggagg
61 cggagctgct ggggcttggc gctctggaag tcgtcccagc cgcgggtcgc cgaggaaagg
121 agcctgcggg tcagctttct ggccgaagtg ccggcgcgaa tttgttagcc gtctccggcc
181 aaaaagagcg gcacctcgga aggcgagtta tttaccaagc actggagtaa tattgtagat
241 aaaaatgcct gttggatgca aagagaggcc aacatttttt gaaattttta agacgcggtg
301 caatcaagca ggaaaggata ttaccgatag taaacataaa agttgttgca caatgaagtc
361 taaaatggat cgagcaaatg atgttaccag cccacctcta aattcttatc ttagtgaaag
421 tcctgttcta cgatgtacac atgtaacacc acagagagaa aagtcggtgg tatgtggaag
481 cttatttcat acaccgaagc ttatgaaggg tcagacacca aaacgcattt ctgaaagtct
541 aggagctgag gtggattctg atatgtcttg gtcaagttct ttagccacac caccaaccct
601 tagttctact gtgctaatag tcagagatga agaagtatct gcagctgtat ttcctaatga
661 cactactgct atttttaaaa gctgtttttc taaccatgat gaaagtctga agaaaaatga
721 tcggtttatc ccttgtggtc caggcaaaga aaacaaaaat caaagggaag ctaaaagtca
781 aagtttgggg aattcatttg gtaaagtaaa tagcaccaaa gaccattttg taaagtctac
841 accaaatgtc ctagaggatg aagtacatga aaaagttcta gatgtttctg aagaagaaga
901 tagtttttca ttatgtgttc ctaaatataa aacaagaaat ctacaaaaaa taaaaactag
961 caaaactagg aaaaatattt ttaatgagac aaaaaccagt gaatgtgaag aagctaaaaa
1021 gcaaatgaaa gaaaataaac attcattggt atctgaaatg gaaccaaatg acagtcatcc
1081 attagattgg aatgtaacac atgagaagcc ctttgggaat ggaactgaca aaatctccaa
1141 ggaaattgta ctgtcttcag cctctggatg ttctgaccta accctctcaa gtctaaatgg
1201 agctcagatg gagaaaacac ctctattgca tacttcttat gaccaaaata attcagaaaa
1261 agacctcata atcacagata aagaatgcac caacttcatt actttggaaa attcttggcc
1321 acagatttca aatgtaccaa agtattcaga gaagacgtta aatgaggaaa tagtagtaaa
1381 taagataaac gaagggcagt gtcttgaatc tcatgaagat tccgttgttt cggtaaagca
1441 agcaatatat gaaactactt taatagcttc tccacttcag ggtatcagaa agtctatatt
1501 caggataaga gaatcacctg aagggatgtc caatgcaatg ttctcaaata atatgactaa
1561 tccaaacttt aaagaacctg aagcctctga aagtggattg gaaaaacata ctatttgctc
1621 tcagaaagag gattctttat gtacaagttc aattgatgat ggaagctggc cagcaactat
1681 caaacatact tctgtagctt tgaagaattt aggtttaata tctagtttga aaaagaaaac
1741 aaaaaagttt atttacgtta taaatgatga aacatctaat caaggcctga aaacacagaa
1801 agaccaagag tcaagactaa ttaacctttc gacccaattt gaagcaaatg cttttgaagg
1861 acccctgaca tttacaaatg ctgattcagg tttattgcat tcttcttcca tcaaaaaaaa
1921 ctgtttacag aatgactcag aaaaaccagc tttgtcttta accagctctt ttgggacaat
1981 tctgagaaaa gtttccagta atggagccag ttctcctaat aataaaataa tatctcagga
2041 tcctgattat aaagaagcaa aaattaataa gaaaaaattg gagtcattta taaccacaga
2101 aactgattgt ctgtcatccc tgcaggaaaa acattgggaa gatgatgcaa aaaaaccaag
2161 agtttccgat ataaaagaaa aagtcttgcc tacagcaagt caccctcctg tgccacattc
2221 agaagtggaa ggtaatgata ttcactttca gtctccagaa agcttttcat ttgactgtga
2281 taataccagt ctgttaactc ctagctctag ggattctcca tcaagcctag ttgtgatgtc
2341 tagaggaaaa gaatcatata aaatatcaga gaaactaaaa tgtaagaatc atgaaactgg
2401 ttttgaatta accaaaaata ttcccatgga aaagaatcaa gacatacatg ttttaaatgc
2461 agattctaaa aatgctaaac tgttgtcaac tgaaaaacat ataacagtag catcatcttc
2521 agtaaaggtt cagttcaacc aaaatgcaaa tctcaccaca atccaaaaag accaaaaaga
2581 aactacttta atttcaaaaa taactgttaa tccaaactct gaagaacttt tcccagatga
2641 tgaaaataat tttgtcttaa agataactaa tgaaagtaat actcctgttt taggaaatac
2701 taaggaacta catgattcaa acctctgttg tgtaagagat tctgttccta agaactctac
2761 catggtagta tgtacagacc tggatgacaa acaaacagcc aaagtgtcga ttatgaaaga
2821 ttgttattca tcaagcatag atgatcttac agaaaggaac agaagtacca taaagcaaca
2881 actaaaaatg actctagatc aagattcaaa atcagacatt acctcagata tagttaggaa
2941 atcaaatgga aacagtgatt atatggataa ttgggcaaga ctgtctgatc caatttcaaa
3001 tcacagtttt gaaaatggct tcaaaacagc ttctaataaa gagataaaac tctctgaaaa
3061 caacattagg aaaagtaaaa tgcttttcaa agatattgag gaacattatc ctactaactt
3121 agcatgtctt gaaattgtaa atacttcatc attagaaagt caaaagaaac caagcaaatc
3181 tcatgcactt gatccacagt caattaatat catatctggg tttgtgcaga atagcacata
3241 tgtttctgat agtgaaagtg gtcacacagc tcctccaact ttatctttaa agcaagattt
3301 tgattcaaat cgtaatttaa ctcctagtca aaaggcagaa attacagaac tttctactat
3361 tttggaagaa tcaggaagcc agtttgaatt tacacagttt agaaaaccaa gccacataat
3421 acagaaaaat ccatttgaaa tgcctgaaaa ccagctgact atcttgaata gcacttctaa
3481 ggaatggaaa gatgatgatc ttcatctcac aactaatgct ccatctatca gtcaggtaga
3541 tagcaagaaa tctgaaggta taattggagg taagcagaag tttgcttgct tgtcaagaac
3601 cagctgtaac agaagtgctt ctggctattc aacagataaa aatgaagtgg agtttagagg
3661 cttttattct gctcgtggca caaaactgaa tgttggtagt gaagcattgc aaaaagctaa
3721 gaaactgttc agtgaccttg agaatatcaa tgaggaaact tctgtagaag tagatagaag
3781 tttctcctca agcaaataca atgattctgt ctcaatgatt cagatagaag attgtaatga
3841 taaaaattta aatgagaaaa ataataaatg ccggctaata ctacaaaata atattgaaat
3901 gactactgac atttttgttg aagaatatac tgaaagttac aggagaaata cagaaaatga
3961 aggtaaccaa tgtactgacg ctggtagaaa tacttgtaac tcagaatctg atggcagtga
4021 ttcaagtaaa aatgatacag tttatattca tgaagaagaa aatggcttgc cctgtattga
4081 tcagcacaac atagatctga aattatttag ccagtttatg aaggagggga acactcaaat
4141 taaagaaggt ttgtcagatt taacctgttt ggaagttatg aaagctgaag aaacatctca
4201 tgttactatg tcaaataaac agcagttaac agctaatacg gggcaaaaca taaaagattt
4261 tgacactttt tatttatcct ttcagactgc aagcagaaaa aatataaggg tctccaaaga
4321 gtcattaaat aaagctagaa gtctccttaa tcaaaaatgg acagaagaag aattaaataa
4381 cttttcagat tccttgaatt ctgaattact tcctggcata gatatcaaga aaacagacat
4441 ctcaaatcat gaggtaatag aaaatactga aagaaaagac aaaataacga aagaaagtga
4501 cctaattggt actgaaaata tattactgat cctgcagcaa agaccagaaa gtaaaataaa
4561 aaagatcaaa gaatctgctg tgttgggttt tcatacagct agtgggaaaa aaatagaaat
4621 tacaaaggaa tctttggaca aagtaaaaaa tctttttgaa gaaaaagagc aagataatag
4681 tgaaatcact aattttagcc atcgaggggc aaagatgtcc aaggacagag aagaatgtaa
4741 agatgggcgt gaattagctt gtgggacaac tgaaataaca actaccccag agtatgaaga
4801 aactcacagt tctctagaga agaaaaaact tgtttctaat gagattgcag ccttaagacc
4861 caggctctta agtgataatt tatacaaaca aactgaaaat cttaaaatat cagatcatgc
4921 ctctcagaaa gttgatgtac atgaaaatac agaaaaagaa acagcaaaaa agcctacaat
4981 gtatacaaat caatccactt attcagccat tgaaaactca cctttaacat tttacacagg
5041 acacggaaga aaaatttctg tgagtgaggc ttcactattt gaagcaaaaa aatggcttag
5101 agaaggagaa tgggatgatc aatcagaaag aataaatgct gccaaggtta actgcttaaa
5161 agaatatcct gatgattacg tagaaaatcc ttcatgtgga aatagttcaa atagtgccat
5221 aactgaaaat gacaaaaatc atctctctga aaaacaaggc tcaacttatt taagtaatag
5281 taccatgtct aacagctatt cataccatcc tggcttttgt cattctagtg aagtgtataa
5341 taaatcagaa tatctttcaa gaagtaaaat tgataattct ggtattgaac cagtaataaa
5401 gaatattaga gagagaaaaa acattggttt ttctgaaata atgtcccctg gaagagaagc
5461 agacacagac ccacaaagtg taaatgaaga tatttgtgtt gagaaacttg cgactaactc
5521 ttcatgcaaa aataaaaata cagccattaa agtggccata tctgactcaa ataattttaa
5581 tacaattcaa aagttgaatt ctgattcaaa taattctgta cctgcataca gtacagtaaa
5641 tagtaaaaga gtctttgttg cacaccagac aaaagtgaca gaggggttta cagacaactg
5701 cagcatggta actaaacaaa acaccaagag taaatcagac acttgccatg cagaaattgt
5761 ggcagattat cctaaggcac tggatgattc agaggctatt tttcctaact ctctgggtgc
5821 tatagaatgt tcaccttcac ataaggtttt tgctgacatt caaagtgaac aaacttcaca
5881 acttaaccaa agtatgtctg gattggagaa agtttctgaa acaccacctt gtcagattaa
5941 ttcaaaaact tctgatagat gtgaacttcc tagggggaag cttcccaagt cagtctctta
6001 cacaaatgca tgtgggattt ttagcacagc aagtggaaaa tctgtacaag tatcagatgc
6061 tgcaatacaa aaggcaagag aggtgttttc taagctagaa gatagtgcca agcaactctt
6121 tcctgaagta tcacttaaag ataatgaaga acattcagaa aagttcacaa atgaagaaaa
6181 tactgtgata tatacctccc aaaatttact atcatctgct ttctctggat ttaggacagc
6241 aagtgggaaa caagttccag tttctgaaag tgccttatgc aaagttaagg gaatgttaga
6301 agaattcaat ctgatcagaa ctgaaagttg tcttcagcat tcatctactt ctagacaaga
6361 tgtatcaaaa atgcctcctc cctcttgtat tggtaagaga accccagaac actccagaaa
6421 ctccaaattg gataaagcct gcaataaaga atttagatta tcaagtaact gtaacaatca
6481 gagtggttct tcagaaaatc atcactctat taaagtttct ccatgtccct ctcaattgaa
6541 gcgagacaaa ccacagttgc tagtcggaag caaaggatca cttgttgaga acattcatcc
6601 tttgggaaaa gaacaagctt tacctaaaaa tataaaaaca gagattggga aagctgaaac
6661 ttttcctaat cttcctgtga aaacaaatat agaattttgt tctacttact ccaaggatcc

6721 agaaaactat tttgaaacag aaaccgtaga gattgccaaa gcttttatgg aagatggtga
6781 gctgacagat tccgaactgc taagtcatgc caaacacttt gtttttacat gccaaaacac
6841 taaggaaatg gttttgttaa attcaagaat tggaaaaaga agaggagatg cacttgtctc
6901 agttggagaa cccccaatta aaagaaactt gttaaatgaa ttcgacagga taataaaaaa
6961 tcaagaaaca tctttaaaag cttcaaaaag cactccagac ggcatcctaa aagacagaag
7021 cttgtttatg catcatattt ctttagagcc aatttcctgt ggaccctttc gcacaactga
7081 ggaacggcaa gaaatacaga atccaaattt cactgcacct ggtcaagaat ttttgcctaa
7141 atctcatttt tatgaacacc tggcttcaga aaaatcttca agtaatttat cagtttcacg
7201 gcaaccattt tgtatggttc ctgccacagg aaatgaaaaa aggagacact tgattgctcc
7261 aggcaaacca gtgaaagtct ttgtcccacc ttttaaaact aaatcacatt ttcacagaga
7321 tgagcagtgc attagcaaga atactaaatt ggaaaaaaac aaacaaaact ccaaagacat
7381 agatgaactt ggctctggtg atagtgaaaa aaatattaat gacagtggaa tccatcagct
7441 taagaaaaat aactccaatc aagcagcaac tataatattc acaaagaatg aaaaagaacc
7501 tttagattta attacaaatc ttcagaacgc cagagatata caggatatgc ggattaaaaa
7561 gaaacaaagg cagcatattt ttccacagcc aggtagtctg tatcttgcaa aaacctccac
7621 tttgcctaga atctctctga gagaagcagt agaaggccga gtcccctctg catgttctca
7681 taaacagctc tatatgtatg gtgtttccaa acattgtgta aaaataaaca gcaaaaatgc
7741 agagtctttt cagtttcatg ctcaggatta ttttggtaag gaaggcctat ggtctggaga
7801 aggaatacaa ttggctgatg gtggatggct cataccctcc aatgatggaa agattggaaa
7861 agaagaattt tatagggctc tgtgtgacac cccaggtgtg gatccaaatt gtatttctag
7921 agtttgggta tataatcact atagatggat tatatggaaa ttggcagcca tggaatttgc
7981 ctttcctaag gaatttgcta ataggtgtct aagtccagaa agagtgcttc ttcaactaaa
8041 atacagatat gatgtggaaa ttgataaaag cagaagatca gctataaaga agataatgga
8101 aagggatgac acagctgcaa aaacacttgt tctctgtatt tctgaaatca tttcgtcaag
8161 tgcagatata tctgaaactt ctagtagtaa aactagtagt gtgggtacca aaaaagtggg
8221 cattattgag ctcacagatg ggtggtatgc tattaaggcc cagttagacc ctcccctctt
8281 agctctcgta aagaacggga gattgactgt gggtcagaag atcactattc atggagcaga
8341 actggtaggc tctcctgatg cctgcacacc acttgaagcc ccagaatctc ttatgttaaa
8401 gatttctgct aacagtactc gtcctgcttg ctggtatacc aaacttggat tctctcctga
8461 tcctagacct ttccctctcc ccttgtcatc acttttcagt gatggaggaa atgttggttg
8521 tgttgatgta gttgttcaaa gagcataccc aatacagtgg atggagagga ccccatctgg
8581 attatgcata tttcgcaatg aaagagagga agaaaaggaa gcaacaaaat atgcagaaat
8641 ccaacaaaag aaactagaag ttttattcaa taaaattcaa gcagaatttg aaaagaatga
8701 tgaaaatata acaaagcagt gtataccatc atgtgcatta acaagacagc agatctgtgc
8761 tctgcaagat ggtgcagagc tttatgaagc agtgacaaat gcaccagacc caagtgacct
8821 ggagggttat tttagtgaag agcagttaag agccttgaat aatcacagac agatgttgaa
8881 tgataagaag caagcacaga tccagttaga attcaagaag gctatggaat ctgctgagca
8941 aggagaacaa attctaccaa gggatgttac aactgtgtgg aagttacgta tcataagcta
9001 caggaaaaaa gaaaaagatt cagttacatt gagtatctgg cgtccatcac cagatttata
9061 ttccctgtta atagaaggaa agagatacag aatctatcat cttgcagcat cacaatctaa
9121 aagtaaatct ggaaaagcca acacacagct aacagcaaca aagaaaactc agtaccagca
9181 actaccagca tcagatgaaa tcctatccca agtttatcag ccaagggaac ccctttactt
9241 caacaaactg ttggatccgg acttccaacc accttgttct gaggtggacc taataggatt
9301 tgtagtttct gttgtgaaaa aaataggtct tgctcctgtg gtctatttgt cagatgaatg
9361 ccataattta ttggcaataa agttctggac tgattttaat gaagacatta ttaaacctta
9421 cacattaatt gctgcaagca acctccagtg gcgaccagaa gccaaatcag gaattcctac
9481 tttatttgct ggagattttt ccaggttttc tgccagtcca aaggaggagc attttcaaga
9541 gacattccac aaaatgaaaa atactgttga gaatattggt atgttttaca atgatgcaga
9601 aaacaaactt gtgcatatac ttaatgcaaa tgatcccaag ttgtccaccc cgactaaaga
9661 ctatgcttca gagccacaca cagctcaaat agtccttggc ataggaaata aatttctgat
9721 gtcttctccc aataatgaga tgaattatca gagtccttta tcactttgta agccaaaaga
9781 gaagtctgtc cccatacctg gatcaaccca aatgacttca aagtcttatt gtaaagagga
9841 gaaagagatg gatgacccaa aaacctgcaa aaagagaaga gccttggact ttttgagtag
9901 agtgccttta cctccatctg tcagtcccat ttgtacattt gtttctccag ctgcacagaa
9961 ggcatttcag ccaccacgga gttgcggcac caaatatgaa acactgatga agaaagagtt
10021 gaattctcca cagatgactc cacgtaaatt taatgacctt tcccttttgg aaagtgattc
10081 aatagcagac gaagaactcg caatgataaa cacccaagcc cttttgttgg gttcaccagg
10141 agaacatcaa cttgtgtctg tcagtgactc taccaggact gctcccacga gctcaaaaga
10201 ttatcttgga ctgaaaaggc attctactgc acccggggtc agaggacccg agagccccca
10261 ggcctgcacc aggaagcggg agccccgtgt acagaacaca agtgatctga aaaggacatc
10321 tctgagactg cagaggcaac aaacacaaaa atgacaatga attggtgact gactcaacct
10381 ttccaatgtg tggaaaacac agcctcaacc tgtatgtcaa gatgtgcata atgagacaag
10441 aaagaccaca tcccaaatct cctgtgtgct tgtctatctt aggaaacctg gcctatctct
10501 gtactggtcg gtgtacttta tttcagttat gtgtctgaaa attgtgtatt tattagctaa
10561 tcaagaaaaa aaatctcctt taaactctta tgattggata tgatcaagta tattttacaa
10621 agtaaacaca ctttttcttt aaattgtgtc cctaattaaa tgaaagtagg tttcaaagta
10681 ctgttatttt gactcctgta gttcttttta ggtgacttgg ttttgttttg tttttcggag
10741 gtaacctact atgaaccagt tttccttaat aaacgtgttg gttctcttat agttgtatcc
10801 tgatcaaaag tcaggaggag taaggaacaa acagcagtgc tctctctgga ccagttcttt
10861 aaccttacgt cagcataagt gcaagaaaaa cagaatcctc aatgtgattc ctttttatga
10921 ttctagtgtg attgctgaat tatttcaatt aaaaattcaa atgcttttaa a
//
Gene name PEX1
Accession no. Z28197, Y13137
Protein Sequence 1
TGGTTGGTGGTGCGATTTAATGCGTATGCTTGCCTGTGGCGTTAAAATGGTCTATACGTTTATGCTAATA
TTTCTTGATGTAATGCAGGGCCTATGAATACATGGAGGTTTGTTGCCGGTGGTATACGGTTGAAAAAAAA
AAGAAATAGATATATATATATTTCTTCTTTGTATAGACTTCTGCGAGAAAAAGAAAAGATAAGGAAAAAT
GAAAAAAAAAGCGTATAGGTGCTAAGAAGAATTAAGTATCAAAAGGTAGCAGGACATTTATGTTATATAT
GCAACTTCGGATTTCGTTCAATACATCTTGACCGGTCTCCACTGTTACTGGCATATTTCTCGGGTTTTTC
CTTTTCACCACAAGGGAAAAAAAATGCGTTATGTAATTTCACCAATATTTTTCCCGTCCTGCGTGCTCGC
ACAATACAATAGGGTCATGTCACGTAATAAGCAACGTGCAGCCGCATTTTTTGCCCTTTAAAGGGAAACG
CGCTTTGTTCTTTTCTTCTTCCTTTTCACATAAGGGAGAGTCGGCTACCAATGTCGATGGAATTCTCACC
ATTGGGCATCTCACCGTTCCTATCCTTCTGGAACCTATCGTAAATCCCCCTTAATTTGACCAACTCACTT
GTTGAAATACTAGGTTTTGTCTCTTGACAGGCTTCCAATAAATCGTTTATTGTGACCACAGCTGTCAGCT
CAGAAGCAGCTGAAGTGCTTGTCTTTGTCTCGTGCACCACATCTTGCTGCAATAGAGTCTTTAGCCGTAG
TCTATTTTCTTCTCTACGCCCGTGCTCATTGATACTGAAGTATTCTATATTATCATTGCCGGGTACCACT
TCCGATTGGTCTGCAGCAGACAGCCACCTATGTACAGATTTCAAATAAGCATTGTAGCAGAGCCCCTGTA
AATCAGCACCCGAAAAACCAGCTGTCTTTTCAGCGATCAACTTCAAGTCTGCATTTTTCTCCAATGCAAA
TTTTTTTTGTCCCGTATCCTTGTCTTTTGAATTGACGATAGCTTGCAAGATATCTAACCTCTCTGATTCA
GTCGGTATATTACAGATCACACTTTTGTCTAATCTTCCCGGTCTTAACAATGCGCTATCAATCAAATCAG
GTCTACTTGTAGCTGCTAGTATATATACACCATCAAGGCCCTCGGCACCATCCATTTGGGTCAATAATTG
ATTGACTACACGGTCTGTGACACCAGTGGAGTCATGCCCTCTCTTTGGCGCAATAGAATCGAACTCGTCA
AAAAATAGAATACAGGGTTTGACGGACTGTGCCCTTTCAAACAATTCTCTTATGTTTTGTTCGCTGGCAC
CTATAAACTTGTTTAAAATCTCTGGTCCCTTAACGGAGATAAAGTTTAACCCACATTGTTGTGCCACGGC
GCTCGCCAGAAGCGTTTTACCACAACCAGGATAACCGTAAAGCAAGATTCCTGATCTCAATCTTAGGGGA
CAATTAACGAAAATAGGCTCATATTTTGTGGGCCACTCTAAGGTTTCTAGAAGGACATCTTTGGCATTTG
CTAAAGCACCAATATCCCCCCATTTGATGTTCGTTTCTTTTGTTAGCTTTACTCCACGCAACGCAGATGG
TGTAAACGCACTAAGCGACTTCGAAAAAAGTTCCCTTGTTACAACATTATCACAATCTCTTTCCAACTGC
AAGTCGTAGAAAATCTTTTCTGTGAATATTTCCAAATCTAGTGGCGAAAATCCTTCTGTCTCCAACGACA
AATCACTGAATTGCAAATCTCGATTTAGTTTCATGATCTGGTTTTTCGAGAAAAAATACTCCAGTAACTT
CGCTCTTGCATGTTTGTCAGGTGCTCTCAAAGACCACGTCTCTGAAACAAAATGCTTATCAAACAATAGC
GGATTAATTTGAGTTTTCTGCTTGCCTGAAAATAAAACTCTGATTCGTTTATTATCCTTGTTAAATATCT
TTGTCACCTGATTGATGAAAAAATTCAAAAGTTTGCTTGCATTATCCCATTGGCCATTATTAGAAGGATC
TCCGTCATTGGCTTGAGGTTTTCCAAATAGAGCCTCAACGTTATCCAACACAATCAAAGAAGGACCATAC
CAATAACAAAAAGAACACCATTCCATAATTAATTTTTGGGTTTTATCTAAATTTGATGTCTCGTGCAATG
TTTCACAATCCGCATATTTAACGAAGATGTGATGATCTTTTTCCACTTCATTTATAAGCTCTTTTAATAA
CCTTGTTTTACCGATCCCTTGCTTACCATCTAATATAATAGCTGGTGTGGCAATAATAGGCGATGTCAAA
TAATTGACCATTTCTTTCTTGATACTATTTACCGTAATGAAATCATCTTCATCTTTACTAGTGCGCGAAA
CTTCGCCAGTTTCTTTGACGTGGTAATGTTTAGGCAAATGCCTTTCGATAATATCTTTAACTTCTTCTTT
ACCCATTTGGGTTACTTTCCATTGAACGGATTCATTGGAAATTTCATTCAAATTACACAACTGCTGCTCG
GATTCACCTTTTTTTATTTCGATTATAATTTGCTCTGTTGGAAGTATTAAGTTATTGGTCAATAAGCTCC
CGCCCAACAACTTAGAATACTGATCCCCACTCTTGGTAGGAACGTCCTTGCCAAAATATTTGATATTGAC
TGTTGCATTTCTTCCGGATATTATATTTGCTTGATTCATTTGTAGAAACTCTAGTTTGATTTTTGCTCCG
TTCATTGGGTGCGTAAAGAATGCATCCCAAAGATGAGAGCTTAATGCAATGTGATTTTCGGGTATCTGAC
TATCACATTTTATAAAAACCCCAATCTTCTTCGAAGGGATACCAACTGATTTATTATCTGAATCGCTTTT
CTTGGACTGCCTTAAGCTACATTTTACGATAGAAGCGTAGCCCTTCTGAGAAGGCAATTGTGCCCCGTCA
CTGATATATACGACAAAAAGGTTATCTTTAGGGAAGTCCATTTTGCACACAGTACTCCTCAATATTACCT
TCTTTAAAAGTTGGATAGCCCCATTTTTTAGTATTGTCTTATTACTATGACCATACTCAGCCTTCACCAA
ACGTGTTTTATTGACTTTTGGTGCCACAACAACCAAGGAACCATCTGTGATCCTCGCAGACTTCATTGAT
GGCTCCACCCTGTCAATTTTGAACTTTGTAACAATTCCTTCCAAATAACAAATCAAAGTCTCACCTGGGG
TAACTATACGGGTTTGGTGCAAAATCTCACCATTTTGGAACCTCATCGCATTGGCATCGATAATTTCCCA
ATCATCACTCGTTTCTGGCGTGACATACACCTCCGTAGCCAGGTGTGTGTGGTCATATCGCTGGATGTAT
AAGTCCACCAATGGAGACTTTTGATTTAAATCATATACTGTGGCCAAAACAGGATTGATAAGAACAACAT
TCTCACTCGATCCAGAATCATGGCCATCCCATCCAAGATGAACAATAGGTATATCTGAATTATGAGAATG
GACTGCTATACCGAATTCCTGGATAGCGTAGTTGGTTGACTCTAACACGTTTATAATTGAATGCGGTAAC
CTTAAAAAGTTTCCTACTATGGCATTGGAGAATTGGATTCTCAGATTCTCAAACTTCAACCTCTTGGTCG
TCGTCATTTGAAAGAGCAGTTCCTCAGGTGTTTCTTGTTACTGCCGTCCCTTCCAGAAGAAATTCGACGA
TGACAATAGTCGGAGAGATATTCCGGACCTTTATAATTGCGGCTACGCACAAGAATAAATCGCGATAGTA
ATCGGATATAAGACAATAAAGTCCAGCTAACTTGGTTGGAGAATTCGACAGATTACCATGAATATAGAAA
TCGATGTTTTTTTTTGTCCACGCCGATTGATCAGCCATTTTTTTTTATCAGGGTCGCAACTGCCCTGTGC
AAATAAGAAAGCCTGAAATGGTTGCACAAAGACCACAAACAAGGGCATACGCCAGTACAGTATTATGGAC
TCATGCCATCGAGGTTACATAGAATTAGTTTAAACAAAAACCAACTATCCTCTAATATCCATGGTACGAG
TATATATAATTGTATGAAGTGTAAGAATATGATATAAATAATGAGAGCACTCATAGAGAGTAACGGCGCT
GAATTGTAAGAACTGTCACGTAGTAACACAATCACTGAAAGCATATAAAGTGAAATGAGAAACGGTAGTA
ATATCATATGCATTACCAATTCCTTTTAGTGAATTTCCATGTGCCCGAGGGGGGACTTCCAGTATATTCT
GTTACATTTAAGTAAATACTTCATCAACAAATAGGGCCCAGCGGTTTCTTACACTAATTATACATTTACA
ACGTAAATGCTCCAGCTCAAGCATTATTTATATATCTTGTTTTCAAAAGAAAAATATAGCAATTAGACTC
CATGGCCAAGTTGGTTAAGGCGTGCGACTGTTAATCGCAAGATCGTGAGTTCAACCCTCACTGGGGTCGC
TTATTTTTAACTTTTTTACACTGAAGAAACAAATCAATTCTAATACAATGCCAATGGCTTCAAATCAACC
TTTGGTTATAATAGTTAGGAGAATCGATGCTTACATAGTATATTATTTAAACAATTAATTTGATTTAAAA
GATTTTTGTTTCCCTTGCTGCTGTCATTGGCTTTCTAAATTACATAAAGCAGATGAAAAGCCACACTATG
CTGAAAATAGTGTGTGGATGCAT
Accession no. NM_001179763
Protein Sequence 2
ATGACGACGACCAAGAGGTTGAAGTTTGAGAATCTGAGAATCCAATTCTCCAATGCCATAGTAGGAAACT
TTTTAAGGTTACCGCATTCAATTATAAACGTGTTAGAGTCAACCAACTACGCTATCCAGGAATTCGGTAT
AGCAGTCCATTCTCATAATTCAGATATACCTATTGTTCATCTTGGATGGGATGGCCATGATTCTGGATCG
AGTGAGAATGTTGTTCTTATCAATCCTGTTTTGGCCACAGTATATGATTTAAATCAAAAGTCTCCATTGG
TGGACTTATACATCCAGCGATATGACCACACACACCTGGCTACGGAGGTGTATGTCACGCCAGAAACGAG
TGATGATTGGGAAATTATCGATGCCAATGCGATGAGGTTCCAAAATGGTGAGATTTTGCACCAAACCCGT
ATAGTTACCCCAGGTGAGACTTTGATTTGTTATTTGGAAGGAATTGTTACAAAGTTCAAAATTGACAGGG
TGGAGCCATCAATGAAGTCTGCGAGGATCACAGATGGTTCCTTGGTTGTTGTGGCACCAAAAGTCAATAA
AACACGTTTGGTGAAGGCTGAGTATGGTCATAGTAATAAGACAATACTAAAAAATGGGGCTATCCAACTT
TTAAAGAAGGTAATATTGAGGAGTACTGTGTGCAAAATGGACTTCCCTAAAGATAACCTTTTTGTCGTAT
ATATCAGTGACGGGGCACAATTGCCTTCTCAGAAGGGCTACGCTTCTATCGTAAAATGTAGCTTAAGGCA
GTCCAAGAAAAGCGATTCAGATAATAAATCAGTTGGTATCCCTTCGAAGAAGATTGGGGTTTTTATAAAA
TGTGATAGTCAGATACCCGAAAATCACATTGCATTAAGCTCTCATCTTTGGGATGCATTCTTTACGCACC
CAATGAACGGAGCAAAAATCAAACTAGAGTTTCTACAAATGAATCAAGCAAATATAATATCCGGAAGAAA
TGCAACAGTCAATATCAAATATTTTGGCAAGGACGTTCCTACCAAGAGTGGGGATCAGTATTCTAAGTTG
TTGGGCGGGAGCTTATTGACCAATAACTTAATACTTCCAACAGAGCAAATTATAATCGAAATAAAAAAAG
GTGAATCCGAGCAGCAGTTGTGTAATTTGAATGAAATTTCCAATGAATCCGTTCAATGGAAAGTAACCCA
AATGGGTAAAGAAGAAGTTAAAGATATTATCGAAAGGCATTTGCCTAAACATTACCACGTCAAAGAAACT
GGCGAAGTTTCGCGCACTAGTAAAGATGAAGATGATTTCATTACGGTAAATAGTATCAAGAAAGAAATGG
TCAATTATTTGACATCGCCTATTATTGCCACACCAGCTATTATATTAGATGGTAAGCAAGGGATCGGTAA
AACAAGGTTATTAAAAGAGCTTATAAATGAAGTGGAAAAAGATCATCACATCTTCGTTAAATATGCGGAT
TGTGAAACATTGCACGAGACATCAAATTTAGATAAAACCCAAAAATTAATTATGGAATGGTGTTCTTTTT
GTTATTGGTATGGTCCTTCTTTGATTGTGTTGGATAACGTTGAGGCTCTATTTGGAAAACCTCAAGCCAA
TGACGGAGATCCTTCTAATAATGGCCAATGGGATAATGCAAGCAAACTTTTGAATTTTTTCATCAATCAG
GTGACAAAGATATTTAACAAGGATAATAAACGAATCAGAGTTTTATTTTCAGGCAAGCAGAAAACTCAAA
TTAATCCGCTATTGTTTGATAAGCATTTTGTTTCAGAGACGTGGTCTTTGAGAGCACCTGACAAACATGC
AAGAGCGAAGTTACTGGAGTATTTTTTCTCGAAAAACCAGATCATGAAACTAAATCGAGATTTGCAATTC
AGTGATTTGTCGTTGGAGACAGAAGGATTTTCGCCACTAGATTTGGAAATATTCACAGAAAAGATTTTCT
ACGACTTGCAGTTGGAAAGAGATTGTGATAATGTTGTAACAAGGGAACTTTTTTCGAAGTCGCTTAGTGC
GTTTACACCATCTGCGTTGCGTGGAGTAAAGCTAACAAAAGAAACGAACATCAAATGGGGGGATATTGGT

GCTTTAGCAAATGCCAAAGATGTCCTTCTAGAAACCTTAGAGTGGCCCACAAAATATGAGCCTATTTTCG
TTAATTGTCCCCTAAGATTGAGATCAGGAATCTTGCTTTACGGTTATCCTGGTTGTGGTAAAACGCTTCT
GGCGAGCGCCGTGGCACAACAATGTGGGTTAAACTTTATCTCCGTTAAGGGACCAGAGATTTTAAACAAG
TTTATAGGTGCCAGCGAACAAAACATAAGAGAATTGTTTGAAAGGGCACAGTCCGTCAAACCCTGTATTC
TATTTTTTGACGAGTTCGATTCTATTGCGCCAAAGAGAGGGCATGACTCCACTGGTGTCACAGACCGTGT
AGTCAATCAATTATTGACCCAAATGGATGGTGCCGAGGGCCTTGATGGTGTATATATACTAGCAGCTACA
AGTAGACCTGATTTGATTGATAGCGCATTGTTAAGACCGGGAAGATTAGACAAAAGTGTGATCTGTAATA
TACCGACTGAATCAGAGAGGTTAGATATCTTGCAAGCTATCGTCAATTCAAAAGACAAGGATACGGGACA
AAAAAAATTTGCATTGGAGAAAAATGCAGACTTGAAGTTGATCGCTGAAAAGACAGCTGGTTTTTCGGGT
GCTGATTTACAGGGGCTCTGCTACAATGCTTATTTGAAATCTGTACATAGGTGGCTGTCTGCTGCAGACC
AATCGGAAGTGGTACCCGGCAATGATAATATAGAATACTTCAGTATCAATGAGCACGGGCGTAGAGAAGA
AAATAGACTACGGCTAAAGACTCTATTGCAGCAAGATGTGGTGCACGAGACAAAGACAAGCACTTCAGCT
GCTTCTGAGCTGACAGCTGTGGTCACAATAAACGATTTATTGGAAGCCTGTCAAGAGACAAAACCTAGTA
TTTCAACAAGTGAGTTGGTCAAATTAAGGGGGATTTACGATAGGTTCCAGAAGGATAGGAACGGTGAGAT
GCCCAATGGTGAGAATTCCATCGACATTGGTAGCCGACTCTCCCTTATGTGA
Protein Sequence 3
ATCTTTTTGTAAATTTGAATTTTGAAATTTTTTTCACAAAAAAATTTGGCACTCCAACCCATTTATTCCT
TTAATAAATTGACAAAAAAAAAATATATATCTTCGACCATAGTTGTTGAGTGTGCAAGGAGGCTACTACA
TAGTTTTCGTCCTCAAGATAAGTTTACTTTAGCAACAACATACTACTGTTAATAAGTTTTACTCGGTGAA
TACGAGTAACGTTGAGTTGATTATTTCAAACTATATAATTATACTTCTAAATAGTATATGCGGTGTATTG
TTTCGTATAAAAGTCTAAGATCATGTCTGGTTAATGTTCCGGAATTGCTGTTGGAATCTATCTCTGAACC
TGTCCAAAACTACGCCGTTCAGGCTGTAGTTTGTAAAAACGATATAAAGAAAACATTTTATTTTGGGATT
TCTGGAATTCCGTCACAATTTTCATTTGAGATCGATTCTACCTACGCCCATACTCTGAAATTAGCAGAAA
ATCAAGAGATAAATTTATCAATAATCGATTGTACTCATGAAATTGAGCAGCTAGAAATTGAACCTGTCAC
CTCTAATGATTGGGAAATCGCTGAACGCAATGCAGCTTGGCTGGAAGAAAATCTTTTAGTACAATATCGT
GTAGCTACCACCGAGAGATTTATTATTTATCTTCCTTCCGGTACATTTATCCAATTCCAGCCTCTTAAAT
TGATTCCCAGTTCTTTATGCGGGCGCCTGTTACGAACTACTGAAGTGTTGATTACTCCCAAACCCAATAC
ATCTGCCATTGAAGTTAAAGAAAATAGAAAAGTTAATCTTAGATGTGTTGTAGAAAACCGCTTGTTGCCT
GACAGCGTTACTGCTGATTCACCAGCTCTTTGCGTATTTCTACCTCTAAATTTTCCGGATAGGCCAGATG
TAGTATATATGGATGGTGGAAATTTAAAATCCACAATTGTATGTCAATGTGTATCGTGCCCTTTTCAAAT
ACCTGGTCATTTTTTTATCTCAAAGAGCTTAGCTTTGTCATACAGCATAAAAACAGGTTTCAAATTCCAA
ATATGGAAGGCTCATAACCCTCCCTCTTCCTCAAAATTTATTCTAGAGCAAAAGGGTCTTCCACCAGAAA
GCAACTTGAGTTCTGAGCTGGTGGCTGCCAAGCTAAAAAACAGCTATTTAATGGACGGCATGACTTTAAA
ACTTGTTGACATAGCAGTTTCTTATTCTGTTAGCGGACTCAGTGGTGTTGTTAAAAATCCAATTCAAGAT
ATAAAAATTACAAGTGATACAAACGTCCCAGTAAATGCAGGTATAAGAAATAATAGTCCTCGTTTGTCTA
TGCAGCCGTTTCCGCATGAATTTGCGCAAGTACGAAATGCTGTATTCTTGCACCAAAATATCTATATCAA
TGGACCCAAAGGTTGTGGAAAATCAAATTTAGTTCACTCATTGTTTGATTATTATTCTCTTAACTCAATC
TATTTCCAGATGATTGTATCTTGTTCAGAAATCGATCGATCATCTTTTGCCAAATTCCAAAGCTTTTGGA
ACAATGTATTTATACAAGCAGAGAGGTACGAACCCTCAATAATTTATCTAGATGATGTTCATTGTCTTAT
ATCTTCCTCAAATGAAAATGGGGAGCTTGGATTTGTGGAAGAGCGAGAAATAGCATTTTTGCAACATCAA
ATAATTAATTTAAAAAGAAAAAGAAAAATAATTTTCATTGGATTTGGGGAGGAATTTTTAACTTTCTCAG
AAAATTTAGTTTTGCCTTTGCTTTTTCAAATAAAAATAGCTCTACCATCGCTTGCTGTGACTAGGAGGAA
AGAAATTTTAACAACTATCTTTCAAGAAAATTTCTCTGATATAACTATGGATAGTATTGAATTTATTTCA
GTTAAAACGGAAGGCTACCTGATGACAGATTTAGTTTTATTCGTGAAACGATTACTGTCTGAAGCTTTTG
TTGAAAAGATACAAAATGGGCCAAAACATTTAATGAATAAGGGATTGATTGAGAAGACCTTAAAAGATTT
TGTCCCATTGCAGCTTCGGAAAGCTAAATTTGTCAAGTCCAGTATTCGGTGGATTGATATTGCGGGCATG
CAAGAAGCGAAAGAAGCTGTGAGAGATATAATAGAGTCTCCTGTTAAATACTCTCTCATATATAAACAAT
GTAGGCTTCGCCTTCCCACGGGAATTCTACTTTTTGGTTATCCCGGGTGCGGCAAAACGTATTTGGCTTC
CGCAATTTCAAGTACTTTTCCAGTTCAGTTTATAAGCATAAAAGGGCCCGAACTTTTGGATAAGTATATT
GGGAAATCGGAGCAGGGTGTTCGAGATTTGTTCAGTAGAGCTCAAATGGCCAAGCCGTGTGTTTTGTTTT
TTGACGAATTTGATTCAGTAGCACCCAGAAGAGGACAAGATAGTACTGGAGTTACGGACAGAGTTGTTAA
TCAAATACTTACACAAATGGATGGAGCTGAGAGCTTAGATGGTGTTTACATTGTGGCAGCTACTACTAGG
CCTGATATGATTGATCCTGCATTATTACGTCCTGGTCGTCTCGATAAGCTGATATTCTGTGATTTGCCTA
ACGAAGAAGAAAGACTTGAAGTTTTACAGAAATTGGCAAACCGATTCCATATTGAAAATGCGGCCATGTT
AAAGAAATTGTCAACTTTAACAGATGGTTATACTTATGCTGACTTATCGTCACTTCTGTACGATGCGCAT
TTGATTGCTGTTCATAAACTATTGAAACGTGTAAGTATAAACGCAGTGGACCCCAGTCAAACCACAAGTT
CCTTCACTAACTTGACGACTGAGTCAAAAAGGAATGCGTCAATGCTTGCTCTTCCACCTGAAAGTCGCTA
TAATCAAAATATGCAAAGTATGAGTGATTCAAAAAGTGTCGTAATAGAAGATTATATGCTCATGGAAGCT
TTGAAGAAAAATAGTCCAAGCTTAAACTCGGAAGAGTTTGAGCACCTTTCAAACCTATATAGAGATTTTC
GTTCAAAGCTTTTTGAGCCAGAACTTAATGCAAGGAATACGGACGTGGGAAGTAAGACTAGACAAATCTA
AAGCATTAGTATACTTTAAGCAAATAAAAATGAAAAAAT
mRNA Sequence 1
ATCTTTTTGTAAATTTGAATTTTGAAATTTTTTTCACAAAAAAATTTGGCACTCCAACCCATTTATTCCT
TTAATAAATTGACAAAAAAAAAATATATATCTTCGACCATAGTTGTTGAGTGTGCAAGGAGGCTACTACA
TAGTTTTCGTCCTCAAGATAAGTTTACTTTAGCAACAACATACTACTGTTAATAAGTTTTACTCGGTGAA
TACGAGTAACGTTGAGTTGATTATTTCAAACTATATAATTATACTTCTAAATAGTATATGCGGTGTATTG
TTTCGTATAAAAGTCTAAGATCATGTCTGGTTAATGTTCCGGAATTGCTGTTGGAATCTATCTCTGAACC
TGTCCAAAACTACGCCGTTCAGGCTGTAGTTTGTAAAAACGATATAAAGAAAACATTTTATTTTGGGATT
TCTGGAATTCCGTCACAATTTTCATTTGAGATCGATTCTACCTACGCCCATACTCTGAAATTAGCAGAAA
ATCAAGAGGTAAAAATATTTGTCAGCAATGAAATGTTTTTACTAATCTCTTAGATAAATTTATCAATAAT
CGATTGTACTCATGAAATTGAGCAGCTAGAAATTGAACCTGTCACCTCTAATGATTGGGAAATCGCTGAA
CGCAATGCAGCTTGGCTGGAAGAAAATCTTTTAGTACAATATCGTGTAGCTACCACCGAGAGATTTATTA
TTTATCTTCCTTCCGGTACATTTATCCAATTCCAGCCTCTTAAATTGATTCCCAGTTCTTTATGCGGGCG
CCTGTTACGAACTACTGAAGTGTTGATTACTCCCAAACCCAATACATCTGCCATTGAAGTTAAAGAAAAT
AGAAAAGTTAATCTTAGATGTGTTGTAGAAAACCGCTTGTTGCCTGACAGCGTTACTGCTGATTCACCAG
CTCTTTGCGTATTTCTACCTCTAAATTTTCCGGATAGGCCAGATGTAGTATATATGGATGGTGGAAATTT
AAAATCCACAATTGTATGTCAATGTGTATCGTGCCCTTTTCAAATACCTGGTCATTTTTTTATCTCAAAG
AGCTTAGCTTTGTCATACAGCATAAAAACAGGTTTCAAATTCCAAATATGGAAGGCTCATAACCCTCCCT
CTTCCTCAAAATTTATTCTAGAGCAAAAGGGTCTTCCACCAGAAAGCAACTTGAGTTCTGAGCTGGTGGC
TGCCAAGCTAAAAAACAGCTATTTAATGGACGGCATGACTTTAAAACTTGTTGACATAGCAGTTTCTTAT
TCTGTTAGCGGACTCAGTGGTGTTGTTAAAAATCCAATTCAAGATATAAAAATTACAAGTGATACAAACG
TCCCAGTAAATGCAGGTATAAGAAATAATAGTCCTCGTTTGTCTATGCAGCCGTTTCCGCATGAATTTGC
GCAAGTACGAAATGCTGTATTCTTGCACCAAAATATCTATATCAATGGACCCAAAGGTTGTGGAAAATCA
AATTTAGTTCACTCATTGTTTGATTATTATTCTCTTAACTCAATCTATTTCCAGATGATTGTATCTTGTT
CAGAAATCGATCGATCATCTTTTGCCAAATTCCAAAGCTTTTGGAACAATGTATTTATACAAGCAGAGAG
GTACGAACCCTCAATAATTTATCTAGATGATGTTCATTGTCTTATATCTTCCTCAAATGAAAATGGGGAG
CTTGGATTTGTGGAAGAGCGAGAAATAGCATTTTTGCAACATCAAATAATTAATTTAAAAAGAAAAAGAA
AAATAATTTTCATTGGATTTGGGGAGGAATTTTTAACTTTCTCAGAAAATTTAGTTTTGCCTTTGCTTTT
TCAAATAAAAATAGCTCTACCATCGCTTGCTGTGACTAGGAGGAAAGAAATTTTAACAACTATCTTTCAA
GAAAATTTCTCTGATATAACTATGGATAGTATTGAATTTATTTCAGTTAAAACGGAAGGCTACCTGATGA
CAGATTTAGTTTTATTCGTGAAACGATTACTGTCTGAAGCTTTTGTTGAAAAGATACAAAATGGGCCAAA
ACATTTAATGAATAAGGGATTGATTGAGAAGACCTTAAAAGATTTTGTCCCATTGCAGCTTCGGAAAGCT
AAATTTGTCAAGTCCAGTATTCGGTGGATTGATATTGCGGGCATGCAAGAAGCGAAAGAAGCTGTGAGAG
ATATAATAGAGTCTCCTGTTAAATACTCTCTCATATATAAACAATGTAGGCTTCGCCTTCCCACGGGAAT
TCTACTTTTTGGTTATCCCGGGTGCGGCAAAACGTATTTGGCTTCCGCAATTTCAAGTACTTTTCCAGTT
CAGTTTATAAGCATAAAAGGGCCCGAACTTTTGGATAAGTATATTGGGAAATCGGAGCAGGGTGTTCGAG
ATTTGTTCAGTAGAGCTCAAATGGCCAAGCCGTGTGTTTTGTTTTTTGACGAATTTGATTCAGTAGCACC
CAGAAGAGGACAAGATAGTACTGGAGTTACGGACAGAGTTGTTAATCAAATACTTACACAAATGGATGGA
GCTGAGAGCTTAGATGGTGTTTACATTGTGGCAGCTACTACTAGGCCTGATATGATTGATCCTGCATTAT
TACGTCCTGGTCGTCTCGATAAGCTGATATTCTGTGATTTGCCTAACGAAGAAGAAAGACTTGAAGTTTT
ACAGAAATTGGCAAACCGATTCCATATTGAAAATGCGGCCATGTTAAAGAAATTGTCAACTTTAACAGAT
GGTTATACTTATGCTGACTTATCGTCACTTCTGTACGATGCGCATTTGATTGCTGTTCATAAACTATTGA
AACGTGTAAGTATAAACGCAGTGGACCCCAGTCAAACCACAAGTTCCTTCACTAACTTGACGACTGAGTC
AAAAAGGAATGCGTCAATGCTTGCTCTTCCACCTGAAAGTCGCTATAATCAAAATATGCAAAGTATGAGT
GATTCAAAAAGTGTCGTAATAGAAGATTATATGCTCATGGAAGCTTTGAAGAAAAATAGTCCAAGCTTAA
ACTCGGAAGAGTTTGAGCACCTTTCAAACCTATATAGAGATTTTCGTTCAAAGCTTTTTGAGCCAGAACT
TAATGCAAGGAATACGGACGTGGGAAGTAAGACTAGACAAATCTAAAGCATTAGTATACTTTAAGCAAAT
AAAAATGAAAAAAT
mRNA sequence 2
GTAGAACTTTATGTGCTTCCTTACATTGGTATATTTCAGGCACATAAATATTCTTCAACTTACAATTCTA
AGTATTTTGTTTATACTAAAAGGAGCTGAATAACGTTTATACAGTGCTGACATTGAAATCTATTTGCTTT
CTTTGGAATATAAGCGCATGCTGAGTTACTTTCGCAGGCCAAGCCATATCCAACCACCATTTTTGTGCCA
AGCTTTTATGCAAGGTTAATTCCTTGTACTGCTTGTTATGTTATAATATATCAACATCTTAACAGTTTTC
ATATCTTCCTTTATATTCTATTAATTGAATTTCAAACATCGTTTTATTGAGCTCATTTACATCAACCGGT
TCAATGCAAACAGTAATGATGGATGACATTCAAAGCACTGATTCTATTGCTGAAAAAGATAATCACTCTA
ATAATGAATCTAACTTTACTTGGAAAGCGTTTCGTGAACAAGTGGAAAAGCATTTTTCTAAAATTGAAAG
GCTTCACCAAGTCCTTGGAACAGATGGAGACAATTCATCATTATTTGAGTTGTTTACAACGGCAATGAAT
GCCCAGCTTCATGAAATGGAACAGTGCCAGAAAAAACTTGAAGATGACTGTCAGCAAAGAATTGATTCAA
TCAGATTTTTGGTTTCCTCATTAAAGTTAACGGATGATACTTCTAGTCTCAAAATTGAGTCTCCTTTAAT
TCAGTGTTTGAATCGTTTGTCAATGGTAGAAGGACAATATATGGCACAGTATGATCAAAAGTTAAGTACG
ATTAAAGGTATGTAATCGTCTTTAATTTAGACTTGTGTTTTAACTGATGTATAGAAATGTATCACAAATT
GGAGTCATATTGTAACCGCTTAGGAAGTCCGTTCGTTTTACCTGATTTTGAGAATTCATTTTTATCTGAT
GTATCCGATGCTTTTACTGAATCTTTGAGAGGACGCATCAACGAAGCCGAAAAGGAGATTGATGCGAGAT
TAGAGGTTATTAATTCCTTTGAAGAAGAAATTTTGGGTTTGTGGTCTGAACTCGGTGTTGAGCCCGCTGA
TGTTCCACAATACGAACAATTGCTTGAATCCCATACTAATCGACCAAATGATGTTTATGTTACTCAAGAA
CTTATCGACCAACTTTGCAAGCAAAAAGAAGTTTTTTCCGCTGAAAAAGAAAAGAGAAGTGATCATTTAA
AAAGTATACAATCAGAAGTTAGCAACTTGTGGAATAAGCTTCAAGTTTCTCCCAATGAACAAAGTCAATT
TGGCGATTCATCAAACATTAATCAAGAAAATATTTCATTATGGGAAACTGAACTTGAAAAACTTCATCAG
TTAAAAAAGGAGCATTTACCCATTTTTTTAGAAGACTGTCGTCAACAAATTCTTCAGCTTTGGGATTCTC
TGTTTTATTCAGAAGAACAAAGAAAGTCCTTTACACCTATGTATGAAGACATTATTACAGAGCAGGTTCT
TACGGCCCATGAAAACTATATAAAGCAACTAGAGGCCGAAGTTTCTGCTAATAAGTCCTTTTTAAGCTTA
ATTAATCGCTATGCCTCTTTAATAGAAGGAAAGAAAGAGCTTGAAGCTAGTTCTAATGATGCCTCTCGTC
TAACACAACGGGGACGCCGGGACCCAGGTTTACTTCTACGTGAAGAGAAAATCCGTAAGCGACTTTCTAG
AGAACTTCCTAAGGTTCAGTCGCTGCTTATACCAGAGATTACAGCATGGGAAGAAAGAAATGGAAGGACG
TTCCTTTTTTATGATGAACCACTTCTCAAGATTTGCCAAGAGGCCACTCAACCAAAATCATTATATAGAA
GTGCAAGTGCTGCCGCAAACCGCCCGAAAACAGCAACTACAACGGACTCTGTTAATAGAACACCTTCTCA
ACGAGGGCGTGTAGCTGTACCTTCAACACCAAGTGTTAGGTCCGCTTCTCGAGCTATGACGAGTCCAAGG
ACACCGCTTCCTAGAGTAAAAAACACTCAAAATCCAAGTCGTTCCATTAGTGCAGAACCGCCATCAGCAA
CCAGTACCGCCAATAGAAGACACCCCACTGCTAATCGAATTGATATAAACGCTAGATTAAACAGTGCTAG
TCGGTCTCGAAGCGCGAACATGATAAGACAAGGGGCAAATGGTAGTGACAGCAATATGTCTTCTTCACCC
GTTTCTGGAAATTCCAATACCCCTTTTAACAAGTTTCCAAATTCTGTATCTCGCAATACACATTTTGAAT
CCAAGTCACCGCACCCAAATTACTCTCGAACTCCTCATGAAACGTATTCAAAGGCTTCATCTAAGAACGT
CCCATTAAGTCCTCCAAAGCAGCGTGTAGTTAATGAACACGCTTTAAATATTATGTCGGAAAAATTGCAA
AGAACTAATCTGAAAGAACAAACACCCGAGATGGACATTGAAAACAGCTCGCAGAACCTTCCTTTTTCTC
CTATGAAGATATCCCCCATAAGAGCATCACCCGTAAAGACAATTCCATCATCACCGTCCCCCACTACCAA
CATTTTTTCTGCTCCACTCAACAATATTACAAATTGTACACCGATGGAGGATGAATGGGGAGAAGAAGGC
TTTTAAGCTTCTTATTTACCTAATCGATCAAATTTAAATATACATATTTTTGCATATGAATACAGCATAT
AGATAATTCATAAAAGTTTATTAACTGAGGTCATAATTAAAAGACTATTTACACCTAA
mRNA Sequence 3
AATATTCTTAATACAAAACATTCATCTAAAAAACCACGATGAATTTTTCAAACGGTTCAAAATCGTCTAC
TTTTACAATTGCTCCTTCGGGTTCATGTATTGCTTTACCTCCTCAGCGGGGTGTAGCAACGAGTAAGTAT
GCTGTTCATGCTTCTTGCCTCCAAGAATATCTCGACAAGGAAGCTTGGAAGGATGACACATTGATCATCG
ACCTTCGTCCCGTATCCGAGTTTTCCAAATCTAGAATTAAGGGCTCCGTCAATCTTTCTTTACCTGCAAC
CTTAATTAAGCGTCCTGCCTTTTCGGTCGCGCGTATCATCAGCAATCTTCACGACGTCGACGACAAGAGA
GATTTTCAAAACTGGCAAGAGTTTTCATCCATTCTTGTTTGTGTTCCTGCTTGGATTGCTAACTACGTAA
CGAACGCTGAGGTAATTGGTGAAAAATTTAGAAAAGAATCCTATTCTGGTGACTTCGGTATTTTAGATTT
AGACTATTCGAAAGTCTCGGGCAAGTATCCATCTGTGATTGACAACTCTCCTGTAAAAAGTAAATTAGGT
GCTTTGCCTAGCGCTCGCCCCAGATTGTCTTACTCTGCTGCTCAAACTGCTCCTATCTCTTTGTCCAGCG
AAGGCTCCGATTACTTCTCTCGTCCTCCTCCCACTCCTAATGTTGCCGGTTTGTCTTTAAATAACTTTTT
CTGTCCTCTACCTGAGAACAAGGACAACAAATCTTCTCCATTTGGCAGTGCTACAGTCCAGACACCCTGC
TTACACAGTGTGCCCGATGCTTTTACCAATCCAGATGTTGCGACCCTCTACCAGAAGTTCTTGCGCCTGC
AATCATTGGAGCATCAACGTCTAGTTTCTTGTTCGGATCGTAATTCTCAGTGGAGTACCGTTGATTCATT
GAGTAATACATCGTATAAGAAAAACAGATATACCGACATTGTTCCTTACAATTGTACTCGTGTGCATTTG
AAAAGAACTAGCCCTTCAGAGCTCGATTATATCAATGCTTCCTTTATCAAAACTGAAACTTCAAATTACA
TCGCTTGCCAAGGTTCTATTTCTCGTTCTATTAGCGATTTCTGGCACATGGTATGGGACAATGTGGAGAA
TATAGGCACTATTGTTATGCTTGGTTCATTGTTCGAGGCTGGCCGCGAAATGTGTACTGCATATTGGCCA
AGTAATGGTATTGGAGATAAACAAGTGTATGGAGATTACTGTGTCAAACAAATTTCGGAGGAAAATGTCG
ATAATTCTCGATTCATTTTGCGAAAGTTTGAAATTCAGAATGCCAATTTTCCTTCAGTTAAAAAAGTACA
TCACTATCAATATCCTAATTGGTCTGATTGTAATTCTCCTGAAAACGTGAAATCTATGGTTGAGTTCTTA
AAATATGTGAACAACTCTCACGGATCAGGAAATACTATTGTGCACTGCTCTGCCGGTGTTGGTCGGACAG
GAACCTTTATTGTACTAGATACGATCCTACGCTTCCCAGAAAGCAAACTTTCTGGTTTCAATCCTTCTGT
CGCCGATTCTTCAGACGTCGTTTTCCAGTTGGTTGATCATATCCGAAAGCAGCGGATGAAGATGGTTCAA
ACCTTCACACAATTTAAATATGTGTATGACTTGATCGATTCTTTGCAAAAATCTCAAGTTCATTTCCCGG
TTTTAACATGAAATTTTTGACTGGATTTTTCTTGGCAATATATATTCGTGTTTTAATCGATTCCTTTATT
TCTTGTACTTGTAAAGTGTCTTTTTTTTTACATTTGCATTTGAATTTACTGTCAATGTTTGTGAAATTCA
GTTGCTTTAATCACGTTGTCTTTTATTTCAAAAAGTATATTTGAGAACTAGGCTTTTTAATGATATC
Gene lpxD
Accession No. AY616592
Protein sequence 1
ATGCCTTCAATTCGACTGGCTGATTTAGCGCAGCAGTTGGATGCAGAACTACACGGTGATGGCGATATCG
TCATCACCGGCGTTGCGTCCATGCAATCTGCACAAACAGGTCACATTACGTTCATGGTTAACCCAAAATA
CCGTGAGCATTTAGGCTTGTGCCAGGCGTCCGCGGTTGTCATGACCCAGGACGATCTTCCTTTCGCGAAA
AGTGCCGCACTGGTAGTGAAGAATCCCTACCTGACTTACGCGCGCATGGCGCAAATTTTAGATACCACGC
CGCAGCCCGCGCAGAACATTGCACCCAGTGCGGTGATCGACGCGACGGCGAAGCTGGGTAACAACGTATC
GATTGGCGCTAACGCGGTGATTGAGTCCGGCGTTGAACTGGGCGATAACGTGATTATCGGTGCCGGTTGC
TTCGTAGGTAAAAACAGCAAAATCGGTGCAGGTTCGCGTCTCTGGGCGAACGTAACCATTTACCATGAGA
TCCAGATCGGTCAGAATTGCCTGATCCAGTCCGGAACAGTGGTAGGCGCAGACGGCTTTGGTTATGCCAA
CGATCGTGGTAACTGGGTGAAGATCCCACAGATTGGTCGCGTAATTATTGGCGATCGCGTGGAGATCGGT
GCCTGCACAACCATCGATCGCGGCGCGCTGGATGACACTATTATTGGCAATGGCGTGATCATTGATAACC
AGTGCCAGATTGCACATAACGTCGTGATTGGCGACAATACGGCGGTTGCCGGTGGCGTCATTATGGCGGG
CAGCCTGAAAATTGGTCGTTACTGCATGATCGGCGGAGCCAGCGTAATCAACGGGCATATGGAAATATGC
GACAAAGTGACGGTTACGGGCATGGGTATGGTGATGCGTCCCATCACTGAACCAGGCGTCTATTCCTCAG
GCATTCCGCTGCAACCCAACAAAGTCTGGCGCAAAACCGCTGCACTGGTGATGAACATTGATGACATGAG
CAAGCGTCTGAAATCGCTTGAGCGCAAGGTTAATCAACAAGACTAA
Accession No. NXIR01000284
Protein sequence 2
TCCCTACCTGACCTATGCGCGCATGGCCCAAATTCTTGATACCACGCCGCAGCCGGCACAGAACATTGCA
GCCAGTGCTGCGATTGATCCGACGGCTCACGCAGGACGATCTTCCGTTTGCCACTAGCGCTGCACTGGTA
GTGAAAAATCCCTACCTGACCTATGCGCGCATGGCCCAAATTCTTGATACCACGCCGCAGCCGGCACAGA
ACATTGCAGCCAGTGCTGCGATTGATCCGACGGCTCA
Accession No NZ_PYBI01000155
Protein sequence 3
CAGCCAGGATGCGCCTAGCGAGCACATCAAGCTGCTTGAAGCGCGCGGCATTTTTGCGCCACGTGCGATT
GTCGGTGAGAGGGGTGCCGGACGAATACTCGCCCGGCTCATGGATGGAATTACGTACGACAGATTTGCCG
GTGATGACGACCTTGTCGCAGATCTCCAGGTGGCCGACTACGCCGACATGCCCGCCGAGCAGGCAGTAAC
GGCCGATCTTGGCGCTACCGGCAATGCCGGTGCAGCCGGCAATCGCACTGTGTGCGCCGATGCGGCAGTT
GTGGGCGATCTGCACCAGGTTGTCCACGCGCACGTCTTCTTCCAGCACGGTGTCTTCCAGCGCGCCGCGG
TCGATACAGGTATTGGCGCCGATCTCGCAATCGTCGCCGATCACCACGCCGCCCAGTTGCGGCACCTTGA
TCCAGCGGCCGGCGTCCATTGCCAGGCCGAAACCGTCTGCGCCGATCACCGCACCCGGGTGGATGCGCAC
GCGCTTGCCCAGCCG
Accession No NP_414721
mRNA sequence 1
MPSIRLADLAQQLDAELHGDGDIVITGVASMQSAQTGHITFMVNPKYREHLGLCQASAVVMTQDDLPFAK
SAALVVKNPYLTYARMAQILDTTPQPAQNIAPSAVIDATAKLGNNVSIGANAVIESGVELGDNVIIGAGC
FVGKNSKIGAGSRLWANVTIYHEIQIGQNCLIQSGTVVGADGFGYANDRGNWVKIPQIGRVIIGDRVEIG
ACTTIDRGALDDTIIGNGVIIDNQCQIAHNVVIGDNTAVAGGVIMAGSLKIGRYCMIGGASVINGHMEIC
DKVTVTGMGMVMRPITEPGVYSSGIPLQPNKVWRKTAALVMNIDDMSKRLKSLERKVNQQD
Accession No NC_000913
mRNA sequence 2
ATGAGCACTATCGAAGAACGCGTTAAGAAAATTATCGGCGAACAGCTGGGCGTTAAGCAGGAAGAAGTTA
CCAACAATGCTTCTTTCGTTGAAGACCTGGGCGCGGATTCTCTTGACACCGTTGAGCTGGTAATGGCTCT
GGAAGAAGAGTTTGATACTGAGATTCCGGACGAAGAAGCTGAGAAAATCACCACCGTTCAGGCTGCCATT
GATTACATCAACGGCCACCAGGCGTAA
Accession No NP_414723
mRNA sequence 3
MIDKSAFVHPTAIVEEGASIGANAHIGPFCIVGPHVEIGEGTVLKSHVVVNGHTKIGRDNEIYQFASIGE
VNQDLKYAGEPTRVEIGDRNRIRESVTIHRGTVQGGGLTKVGSDNLLMINAHIAHDCTVGNRCILANNAT
LAGHVSVDDFAIIGGMTAVHQFCIIGAHVMVGGCSGVAQDVPPYVIAQGNHATPFGVNIEGLKRRGFSRE
AITAIRNAYKLIYRSGKTLDEVKPEIAELAETYPEVKAFTDFFARSTRGLIR
GenBank format
The GenBank format stores an annotation section and a sequence section. The annotation section begins
with the LOCUS line. The start of the sequence section is indicated by ORIGIN and the end of the record
is marked by //. It is a plain-text format that stores DNA data as a character sequence, and it is the
format used by the U.S. National Center for Biotechnology Information (NCBI).
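The layout described above can be sketched in Python. This is a minimal illustration, not NCBI's own parser: the function name `parse_genbank` and the short example record are assumptions made for this sketch, and only the LOCUS, ORIGIN and // markers named in the text are used.

```python
def parse_genbank(text):
    """Split a GenBank-style record into annotation lines and one sequence string."""
    annotation = []
    sequence_parts = []
    in_sequence = False
    for line in text.splitlines():
        if line.startswith("//"):        # end-of-record marker
            break
        if line.startswith("ORIGIN"):    # the sequence section starts on the next line
            in_sequence = True
            continue
        if in_sequence:
            # Sequence lines look like "   61 cccgtgcccc agccctgcgc ...":
            # drop the leading position number and join the 10-base blocks.
            sequence_parts.extend(line.split()[1:])
        else:
            annotation.append(line)
    return annotation, "".join(sequence_parts).upper()


# Hypothetical record, shortened for illustration only.
record = """LOCUS       EXAMPLE       20 bp    DNA
DEFINITION  Hypothetical record for illustration.
ORIGIN
        1 gtacctcgga tcccctgact
//
"""

annotation, sequence = parse_genbank(record)
print(annotation[0])
print(sequence)
```

The numbered, space-separated lowercase layout handled here is the same one used for the CDH1 mRNA sequence pasted later in this assignment.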
FASTA Format
FASTA format is a text-based format used to represent a DNA sequence, in which bases are indicated
using a single-letter code (A, C, G, T, and N). A sequence in FASTA format starts with a single-line
identifier and description beginning with ">", followed by lines of DNA sequence data.
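As a minimal sketch of how such a file can be read, the Python function below splits FASTA text into header lines and sequences. The function name `parse_fasta` and the example header are assumptions made for illustration; the bases are the first 30 bases of the lpxD sequence (AY616592) listed earlier.

```python
def parse_fasta(text):
    """Return a dict mapping each header (without '>') to its joined sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue                      # skip blank lines
        if line.startswith(">"):          # identifier/description line
            header = line[1:]
            records[header] = []
        elif header is not None:          # sequence line belonging to the last header
            records[header].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


# Hypothetical header; sequence lines are wrapped as they would be in a file.
fasta_text = """>example_lpxD hypothetical description
ATGCCTTCAATTCGACTGGC
TGATTTAGCG
"""

records = parse_fasta(fasta_text)
for name, seq in records.items():
    print(name, len(seq))
```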
Gene
A gene is the basic physical and functional unit of heredity and is made up of DNA. It acts as
an instruction to create molecules called proteins, and it occupies a specific position on a chromosome.
A gene produces its protein through transcription, translation and post-translational modification.
Information on a gene includes nomenclature, chromosomal localisation, gene products and their
attributes, related markers, phenotypes and interactions.
CDS (coding sequences or coding regions)
It is the part of a gene or mRNA that codes for a protein. Introns and the 5’ and 3’ UTRs are not
coding sequences. The coding region of a mature mRNA contains everything from the start codon to the
stop codon.
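The "start codon to stop codon" rule can be illustrated with a short Python sketch. It assumes the standard genetic code and, as a simplification, takes the first ATG as the start codon; real annotation relies on the CDS coordinates in the database record. The function names and the flanking bases in the example are hypothetical, while the 21 coding bases are the start of the lpxD sequence listed earlier.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_cds(seq):
    """Return the region from the first ATG to the first in-frame stop codon."""
    seq = seq.upper().replace("U", "T")
    start = seq.find("ATG")
    if start == -1:
        return ""
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return seq[start:i + 3]       # stop codon included
    return ""                             # no in-frame stop codon found


def translate(cds):
    """Translate codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE.get(cds[i:i + 3], "X")  # X for unknown codons
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Invented flanking bases (GGC ... CCC) around a short coding stretch.
toy_mrna = "GGC" + "ATGCCTTCAATTCGACTGGCT" + "TAA" + "CCC"
cds = find_cds(toy_mrna)
print(cds)
print(translate(cds))
```

The translated toy stretch begins MPSIRLA, matching the start of the lpxD amino acid sequence pasted under accession NP_414721.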
mRNA
Messenger RNA (mRNA) is a subtype of RNA molecule. It carries a portion of the DNA code to
other parts of the cell for processing. mRNA is formed during transcription, in which a single strand
of DNA is read by RNA polymerase and the mRNA is synthesized.
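A small Python sketch of this step is given below. It relies on the fact that the mRNA is complementary to the template strand, so it has the same sequence as the coding strand with U in place of T. The function names are assumptions for illustration, and both strands are written 5' to 3'.

```python
def transcribe_coding(coding_strand):
    """mRNA from the coding strand: same sequence with T replaced by U."""
    return coding_strand.upper().replace("T", "U")


def transcribe_template(template_strand):
    """mRNA from the template strand: complement each base, then reverse."""
    pairs = str.maketrans("ACGT", "UGCA")
    return template_strand.upper().translate(pairs)[::-1]


coding = "ATGCCTTCA"       # first nine bases of the lpxD sequence listed earlier
template = "TGAAGGCAT"     # reverse complement of the coding strand

print(transcribe_coding(coding))
print(transcribe_template(template))
```

Both calls give the same mRNA, which shows that the two strands describe the same transcript.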
B
Save and paste the sequences here as FASTA files.
Accession number- NM_007299.3
CTTAGCGGTAGCCCCTTGGTTTCCGTGGCAACGGAAAAGCGCGGGAATTACAGATAAATTAAAACTGCGA
CTGCGCGGCGTGAGCTCGCTGAGACTTCCTGGACGGGGGACAGGCTGTGGGGTTTCTCAGATAACTGGGC
CCCTGCGCTCAGGAGGCCTTCACCCTCTGCTCTGGTTCATTGGAACAGAAAGAAATGGATTTATCTGCTC
TTCGCGTTGAAGAAGTACAAAATGTCATTAATGCTATGCAGAAAATCTTAGAGTGTCCCATCTGTCTGGA
GTTGATCAAGGAACCTGTCTCCACAAAGTGTGACCACATATTTTGCAAATTTTGCATGCTGAAACTTCTC
AACCAGAAGAAAGGGCCTTCACAGTGTCCTTTATGTAAGAATGATATAACCAAAAGGAGCCTACAAGAAA
GTACGAGATTTAGTCAACTTGTTGAAGAGCTATTGAAAATCATTTGTGCTTTTCAGCTTGACACAGGTTT
GGAGTATGCAAACAGCTATAATTTTGCAAAAAAGGAAAATAACTCTCCTGAACATCTAAAAGATGAAGTT
TCTATCATCCAAAGTATGGGCTACAGAAACCGTGCCAAAAGACTTCTACAGAGTGAACCCGAAAATCCTT
CCTTGCAGGAAACCAGTCTCAGTGTCCAACTCTCTAACCTTGGAACTGTGAGAACTCTGAGGACAAAGCA
GCGGATACAACCTCAAAAGACGTCTGTCTACATTGAATTGGGATCTGATTCTTCTGAAGATACCGTTAAT
AAGGCAACTTATTGCAGTGTGGGAGATCAAGAATTGTTACAAATCACCCCTCAAGGAACCAGGGATGAAA
TCAGTTTGGATTCTGCAAAAAAGGCTGCTTGTGAATTTTCTGAGACGGATGTAACAAATACTGAACATCA
TCAACCCAGTAATAATGATTTGAACACCACTGAGAAGCGTGCAGCTGAGAGGCATCCAGAAAAGTATCAG
GGTGAAGCAGCATCTGGGTGTGAGAGTGAAACAAGCGTCTCTGAAGACTGCTCAGGGCTATCCTCTCAGA
GTGACATTTTAACCACTCAGCAGAGGGATACCATGCAACATAACCTGATAAAGCTCCAGCAGGAAATGGC
TGAACTAGAAGCTGTGTTAGAACAGCATGGGAGCCAGCCTTCTAACAGCTACCCTTCCATCATAAGTGAC
TCTTCTGCCCTTGAGGACCTGCGAAATCCAGAACAAAGCACATCAGAAAAAGTATTAACTTCACAGAAAA
GTAGTGAATACCCTATAAGCCAGAATCCAGAAGGCCTTTCTGCTGACAAGTTTGAGGTGTCTGCAGATAG
TTCTACCAGTAAAAATAAAGAACCAGGAGTGGAAAGGTCATCCCCTTCTAAATGCCCATCATTAGATGAT
AGGTGGTACATGCACAGTTGCTCTGGGAGTCTTCAGAATAGAAACTACCCATCTCAAGAGGAGCTCATTA
AGGTTGTTGATGTGGAGGAGCAACAGCTGGAAGAGTCTGGGCCACACGATTTGACGGAAACATCTTACTT
GCCAAGGCAAGATCTAGAGGGAACCCCTTACCTGGAATCTGGAATCAGCCTCTTCTCTGATGACCCTGAA
TCTGATCCTTCTGAAGACAGAGCCCCAGAGTCAGCTCGTGTTGGCAACATACCATCTTCAACCTCTGCAT
TGAAAGTTCCCCAATTGAAAGTTGCAGAATCTGCCCAGAGTCCAGCTGCTGCTCATACTACTGATACTGC
TGGGTATAATGCAATGGAAGAAAGTGTGAGCAGGGAGAAGCCAGAATTGACAGCTTCAACAGAAAGGGTC
AACAAAAGAATGTCCATGGTGGTGTCTGGCCTGACCCCAGAAGAATTTATGCTCGTGTACAAGTTTGCCA
GAAAACACCACATCACTTTAACTAATCTAATTACTGAAGAGACTACTCATGTTGTTATGAAAACAGATGC
TGAGTTTGTGTGTGAACGGACACTGAAATATTTTCTAGGAATTGCGGGAGGAAAATGGGTAGTTAGCTAT
TTCTGGGTGACCCAGTCTATTAAAGAAAGAAAAATGCTGAATGAGCATGATTTTGAAGTCAGAGGAGATG
TGGTCAATGGAAGAAACCACCAAGGTCCAAAGCGAGCAAGAGAATCCCAGGACAGAAAGATCTTCAGGGG
GCTAGAAATCTGTTGCTATGGGCCCTTCACCAACATGCCCACAGGGTGTCCACCCAATTGTGGTTGTGCA
GCCAGATGCCTGGACAGAGGACAATGGCTTCCATGCAATTGGGCAGATGTGTGAGGCACCTGTGGTGACC
CGAGAGTGGGTGTTGGACAGTGTAGCACTCTACCAGTGCCAGGAGCTGGACACCTACCTGATACCCCAGA
TCCCCCACAGCCACTACTGACTGCAGCCAGCCACAGGTACAGAGCCACAGGACCCCAAGAATGAGCTTAC
AAAGTGGCCTTTCCAGGCCCTGGGAGCTCCTCTCACTCTTCAGTCCTTCTACTGTCCTGGCTACTAAATA
TTTTATGTACATCAGCCTGAAAAGGACTTCTGGCTATGCAAGGGTCCCTTAAAGATTTTCTGCTTGAAGT
CTCCCTTGGAAATCTGCCATGAGCACAAAATTATGGTAATTTTTCACCTGAGAAGATTTTAAAACCATTT
AAACGCCACCAATTGAGCAAGATGCTGATTCATTATTTATCAGCCCTATTCTTTCTATTCAGGCTGTTGT
TGGCTTAGGGCTGGAAGCACAGAGTGGCTTGGCCTCAAGAGAATAGCTGGTTTCCCTAAGTTTACTTCTC
TAAAACCCTGTGTTCACAAAGGCAGAGAGTCAGACCCTTCAATGGAAGGAGAGTGCTTGGGATCGATTAT
GTGACTTAAAGTCAGAATAGTCCTTGGGCAGTTCTCAAATGTTGGAGTGGAACATTGGGGAGGAAATTCT
GAGGCAGGTATTAGAAATGAAAAGGAAACTTGAAACCTGGGCATGGTGGCTCACGCCTGTAATCCCAGCA
CTTTGGGAGGCCAAGGTGGGCAGATCACTGGAGGTCAGGAGTTCGAAACCAGCCTGGCCAACATGGTGAA
ACCCCATCTCTACTAAAAATACAGAAATTAGCCGGTCATGGTGGTGGACACCTGTAATCCCAGCTACTCA
GGTGGCTAAGGCAGGAGAATCACTTCAGCCCGGGAGGTGGAGGTTGCAGTGAGCCAAGATCATACCACGG
CACTCCAGCCTGGGTGACAGTGAGACTGTGGCTCAAAAAAAAAAAAAAAAAAAGGAAAATGAAACTAGAA
GAGATTTCTAAAAGTCTGAGATATATTTGCTAGATTTCTAAAGAATGTGTTCTAAAACAGCAGAAGATTT
TCAAGAACCGGTTTCCAAAGACAGTCTTCTAATTCCTCATTAGTAATAAGTAAAATGTTTATTGTTGTAG
CTCTGGTATATAATCCATTCCTCTTAAAATATAAGACCTCTGGCATGAATATTTCATATCTATAAAATGA
CAGATCCCACCAGGAAGGAAGCTGTTGCTTTCTTTGAGGTGATTTTTTTCCTTTGCTCCCTGTTGCTGAA
ACCATACAGCTTCATAAATAATTTTGCTTGCTGAAGGAAGAAAAAGTGTTTTTCATAAACCCATTATCCA
GGACTGTTTATAGCTGTTGGAAGGACTAGGTCTTCCCTAGCCCCCCCAGTGTGCAAGGGCAGTGAAGACT
TGATTGTACAAAATACGTTTTGTAAATGTTGTGCTGTTAACACTGCAAATAAACTTGGTAGCAAACACTT
CCAAAAAAAAAAAAAAAAAA
Accession no. NM_005225.2
TTGGCGCGTAAAAGTGGCCGGGACTTTGCAGGCAGCGGCGGCCGGGGGCGGAGCGGGATCGAGCCCTCGC
CGAGGCCTGCCGCCATGGGCCCGCGCCGCCGCCGCCGCCTGTCACCCGGGCCGCGCGGGCCGTGAGCGTC
ATGGCCTTGGCCGGGGCCCCTGCGGGCGGCCCATGCGCGCCGGCGCTGGAGGCCCTGCTCGGGGCCGGCG
CGCTGCGGCTGCTCGACTCCTCGCAGATCGTCATCATCTCCGCCGCGCAGGACGCCAGCGCCCCGCCGGC
TCCCACCGGCCCCGCGGCGCCCGCCGCCGGCCCCTGCGACCCTGACCTGCTGCTCTTCGCCACACCGCAG
GCGCCCCGGCCCACACCCAGTGCGCCGCGGCCCGCGCTCGGCCGCCCGCCGGTGAAGCGGAGGCTGGACC
TGGAAACTGACCATCAGTACCTGGCCGAGAGCAGTGGGCCAGCTCGGGGCAGAGGCCGCCATCCAGGAAA
AGGTGTGAAATCCCCGGGGGAGAAGTCACGCTATGAGACCTCACTGAATCTGACCACCAAGCGCTTCCTG
GAGCTGCTGAGCCACTCGGCTGACGGTGTCGTCGACCTGAACTGGGCTGCCGAGGTGCTGAAGGTGCAGA
AGCGGCGCATCTATGACATCACCAACGTCCTTGAGGGCATCCAGCTCATTGCCAAGAAGTCCAAGAACCA
CATCCAGTGGCTGGGCAGCCACACCACAGTGGGCGTCGGCGGACGGCTTGAGGGGTTGACCCAGGACCTC
CGACAGCTGCAGGAGAGCGAGCAGCAGCTGGACCACCTGATGAATATCTGTACTACGCAGCTGCGCCTGC
TCTCCGAGGACACTGACAGCCAGCGCCTGGCCTACGTGACGTGTCAGGACCTTCGTAGCATTGCAGACCC
TGCAGAGCAGATGGTTATGGTGATCAAAGCCCCTCCTGAGACCCAGCTCCAAGCCGTGGACTCTTCGGAG
AACTTTCAGATCTCCCTTAAGAGCAAACAAGGCCCGATCGATGTTTTCCTGTGCCCTGAGGAGACCGTAG
GTGGGATCAGCCCTGGGAAGACCCCATCCCAGGAGGTCACTTCTGAGGAGGAGAACAGGGCCACTGACTC
TGCCACCATAGTGTCACCACCACCATCATCTCCCCCCTCATCCCTCACCACAGATCCCAGCCAGTCTCTA
CTCAGCCTGGAGCAAGAACCGCTGTTGTCCCGGATGGGCAGCCTGCGGGCTCCCGTGGACGAGGACCGCC
TGTCCCCGCTGGTGGCGGCCGACTCGCTCCTGGAGCATGTGCGGGAGGACTTCTCCGGCCTCCTCCCTGA
GGAGTTCATCAGCCTTTCCCCACCCCACGAGGCCCTCGACTACCACTTCGGCCTCGAGGAGGGCGAGGGC
ATCAGAGACCTCTTCGACTGTGACTTTGGGGACCTCACCCCCCTGGATTTCTGACAGGGCTTGGAGGGAC
CAGGGTTTCCAGAGATGCTCACCTTGTCTCTGCAGCCCTGGAGCCCCCTGTCCCTGGCCGTCCTCCCAGC
CTGTTTGGAAACATTTAATTTATACCCCTCTCCTCTGTCTCCAGAAGCTTCTAGCTCTGGGGTCTGGCTA
CCGCTAGGAGGCTGAGCAAGCCAGGAAGGGAAGGAGTCTGTGTGGTGTGTATGTGCATGCAGCCTACACC
CACACGTGTGTACCGGGGGTGAATGTGTGTGAGCATGTGTGTGTGCATGTACCGGGGAATGAAGGTGAAC
ATACACCTCTGTGTGTGCACTGCAGACACGCCCCAGTGTGTCCACATGTGTGTGCATGAGTCCATGTGTG
CGCGTGGGGGGGCTCTAACTGCACTTTCGGCCCTTTTGCTCTGGGGGTCCCACAAGGCCCAGGGCAGTGC
CTGCTCCCAGAATCTGGTGCTCTGACCAGGCCAGGTGGGGAGGCTTTGGCTGGCTGGGCGTGTAGGACGG
TGAGAGCACTTCTGTCTTAAAGGTTTTTTCTGATTGAAGCTTTAATGGAGCGTTATTTATTTATCGAGGC
CTCTTTGGTGAGCCTGGGGAATCAGCAAAGGGGAGGAGGGGTGTGGGGTTGATACCCCAACTCCCTCTAC
CCTTGAGCAAGGGCAGGGGTCCCTGAGCTGTTCTTCTGCCCCATACTGAAGGAACTGAGGCCTGGGTGAT
TTATTTATTGGGAAAGTGAGGGAGGGAGACAGACTGACTGACAGCCATGGGTGGTCAGATGGTGGGGTGG
GCCCTCTCCAGGGGGCCAGTTCAGGGCCCCAGCTGCCCCCCAGGATGGATATGAGATGGGAGAGGTGAGT
GGGGGACCTTCACTGATGTGGGCAGGAGGGGTGGTGAAGGCCTCCCCCAGCCCAGACCCTGTGGTCCCTC
CTGCAGTGTCTGAAGCGCCTGCCTCCCCACTGCTCTGCCCCACCCTCCAATCTGCACTTTGATTTGCTTC
CTAACAGCTCTGTTCCCTCCTGCTTTGGTTTTAATAAATATTTTGATGACGTTTGGGCCGGGTTTTGGGA
CTCTGTTGGGAACATTTCGGGGCGGGAGAGGCCAAGGTTGCTGGGGAAATGCCCATTCTCCACTTCCCTT
CTCCCTGTCCGTGCCCGATTTGATTTGAGCCTCATAACTCGAAGAAAGGTCAGCTTCCTCGCTGTTTTGG
TCCTAACTCAAAAGCAGATCCAGTAAAGGTTTTTGTTGTAAAAAAAAAAAAAAAAAAAAAAA
Accession Number- NM_003106.3
GGATGGTTGTCTATTAACTTGTTCAAAAAAGTATCAGGAGTTGTCAAGGCAGAGAAGAGAGTGTTTGCAA
AAGGGGGAAAGTAGTTTGCTGCCTCTTTAAGACTAGGACTGAGAGAAAGAAGAGGAGAGAGAAAGAAAGG
GAGAGAAGTTTGAGCCCCAGGCTTAAGCCTTTCCAAAAAATAATAATAACAATCATCGGCGGCGGCAGGA
TCGGCCAGAGGAGGAGGGAAGCGCTTTTTTTGATCCTGATTCCAGTTTGCCTCTCTCTTTTTTTCCCCCA
AATTATTCTTCGCCTGATTTTCCTCGCGGAGCCCTGCGCTCCCGACACCCCCGCCCGCCTCCCCTCCTCC
TCTCCCCCCGCCCGCGGGCCCCCCAAAGTCCCGGCCGGGCCGAGGGTCGGCGGCCGCCGGCGGGCCGGGC
CCGCGCACAGCGCCCGCATGTACAACATGATGGAGACGGAGCTGAAGCCGCCGGGCCCGCAGCAAACTTC
GGGGGGCGGCGGCGGCAACTCCACCGCGGCGGCGGCCGGCGGCAACCAGAAAAACAGCCCGGACCGCGTC
AAGCGGCCCATGAATGCCTTCATGGTGTGGTCCCGCGGGCAGCGGCGCAAGATGGCCCAGGAGAACCCCA
AGATGCACAACTCGGAGATCAGCAAGCGCCTGGGCGCCGAGTGGAAACTTTTGTCGGAGACGGAGAAGCG
GCCGTTCATCGACGAGGCTAAGCGGCTGCGAGCGCTGCACATGAAGGAGCACCCGGATTATAAATACCGG
CCCCGGCGGAAAACCAAGACGCTCATGAAGAAGGATAAGTACACGCTGCCCGGCGGGCTGCTGGCCCCCG
GCGGCAATAGCATGGCGAGCGGGGTCGGGGTGGGCGCCGGCCTGGGCGCGGGCGTGAACCAGCGCATGGA
CAGTTACGCGCACATGAACGGCTGGAGCAACGGCAGCTACAGCATGATGCAGGACCAGCTGGGCTACCCG
CAGCACCCGGGCCTCAATGCGCACGGCGCAGCGCAGATGCAGCCCATGCACCGCTACGACGTGAGCGCCC
TGCAGTACAACTCCATGACCAGCTCGCAGACCTACATGAACGGCTCGCCCACCTACAGCATGTCCTACTC
GCAGCAGGGCACCCCTGGCATGGCTCTTGGCTCCATGGGTTCGGTGGTCAAGTCCGAGGCCAGCTCCAGC
CCCCCTGTGGTTACCTCTTCCTCCCACTCCAGGGCGCCCTGCCAGGCCGGGGACCTCCGGGACATGATCA
GCATGTATCTCCCCGGCGCCGAGGTGCCGGAACCCGCCGCCCCCAGCAGACTTCACATGTCCCAGCACTA
CCAGAGCGGCCCGGTGCCCGGCACGGCCATTAACGGCACACTGCCCCTCTCACACATGTGAGGGCCGGAC
AGCGAACTGGAGGGGGGAGAAATTTTCAAAGAAAAACGAGGGAAATGGGAGGGGTGCAAAAGAGGAGAGT
AAGAAACAGCATGGAGAAAACCCGGTACGCTCAAAAAGAAAAAGGAAAAAAAAAAATCCCATCACCCACA
GCAAATGACAGCTGCAAAAGAGAACACCAATCCCATCCACACTCACGCAAAAACCGCGATGCCGACAAGA
AAACTTTTATGAGAGAGATCCTGGACTTCTTTTTGGGGGACTATTTTTGTACAGAGAAAACCTGGGGAGG
GTGGGGAGGGCGGGGGAATGGACCTTGTATAGATCTGGAGGAAAGAAAGCTACGAAAAACTTTTTAAAAG
TTCTAGTGGTACGGTAGGAGCTTTGCAGGAAGTTTGCAAAAGTCTTTACCAATAATATTTAGAGCTAGTC
TCCAAGCGACGAAAAAAATGTTTTAATATTTGCAAGCAACTTTTGTACAGTATTTATCGAGATAAACATG
GCAATCAAAATGTCCATTGTTTATAAGCTGAGAATTTGCCAATATTTTTCAAGGAGAGGCTTCTTGCTGA
ATTTTGATTCTGCAGCTGAAATTTAGGACAGTTGCAAACGTGAAAAGAAGAAAATTATTCAAATTTGGAC
ATTTTAATTGTTTAAAAATTGTACAAAAGGAAAAAATTAGAATAAGTACTGGCGAACCATCTCTGTGGTC
TTGTTTAAAAAGGGCAAAAGTTTTAGACTGTACTAAATTTTATAACTTACTGTTAAAAGCAAAAATGGCC
ATGCAGGTTGACACCGTTGGTAATTTATAATAGCTTTTGTTCGATCCCAACTTTCCATTTTGTTCAGATA
AAAAAAACCATGAAATTACTGTGTTTGAAATATTTTCTTATGGTTTGTAATATTTCTGTAAATTTATTGT
GATATTTTAAGGTTTTCCCCCCTTTATTTTCCGTAGTTGTATTTTAAAAGATTCGGCTCTGTATTATTTG
AATCAGTCTGCCGAGAATCCATGTATATATTTGAACTAATATCATCCTTATAACAGGTACATTTTCAACT
TAAGTTTTTACTCCATTATGCACAGTTTGAGATAAATAAATTTTTGAAATATGGACACTGAAAAAAAAAA
Accession Number- NM_004550.4
GCCCCAGGAGAGGCAGAGAGTGAGGGAAAGGGCCTGGCCGGCATGCACAGATAGGATCACGGTCCTGGGA
GAATTCCTGCTCTTATAGTCTAACCTACCATGGCTTCTCTTTTCTCAAGGCTCCCTCATGCTGCCCTTTG
GCCCTAGTGGCTGGTTTCCAGGGCTGAGGGGACTGAGTGAGCTGCCTGAGAAAAGAGGGTAGGGAACAGA
AAAGCCAGCCAGGAGCTGTGGGAGGAAACGCCCTCAGTAAAGATGACCGCGGTCACTGTTATCTAAACGC
AAGTGAAGCCGAGTCACAGGACCCGGATGTTGTCAGTTCGACGGTAAACGACCCTGCCAGCTTCCAAGAG
GGCGGCTTCACTGTGCGAATAGGTGAGAAGCCAAGAAGGAGGCGCGCTGGAGTTACTTCCGCCCGGTTCT
CCTTCCCGCAGTCTGCAGCCGGAGTAAGATGGCGGCGCTGAGGGCTTTGTGCGGCTTCCGGGGCGTCGCG
GCCCAGGTGCTGCGGCCTGGGGCTGGAGTCCGATTGCCGATTCAGCCCAGCAGAGGTGTTCGGCAGTGGC
AGCCAGATGTGGAATGGGCACAGCAGTTTGGGGGAGCTGTTATGTACCCAAGCAAAGAAACAGCCCACTG
GAAGCCTCCACCTTGGAATGATGTGGACCCTCCAAAGGACACAATTGTGAAGAACATTACCCTGAACTTT
GGGCCCCAACACCCAGCAGCGCATGGTGTCCTGCGACTAGTGATGGAATTGAGTGGGGAGATGGTGCGGA
AGTGTGATCCTCACATCGGGCTCCTGCACCGAGGCACTGAGAAGCTCATTGAATACAAGACCTATCTTCA
GGCCCTTCCATACTTTGACCGGCTAGACTATGTGTCCATGATGTGTAACGAACAGGCCTATTCTCTAGCT
GTGGAGAAGTTGCTAAACATCCGGCCTCCTCCTCGGGCACAGTGGATCCGAGTGCTGTTTGGAGAAATCA
CACGTTTGTTGAACCACATCATGGCTGTGACCACACATGCCCTGGACCTTGGGGCCATGACCCCTTTCTT
CTGGCTGTTTGAAGAAAGGGAGAAGATGTTTGAGTTCTACGAGCGAGTGTCTGGAGCCCGAATGCATGCT
GCTTATATCCGGCCAGGAGGAGTGCACCAGGACCTACCCCTTGGGCTTATGGATGACATTTATCAGTTTT
CTAAGAACTTCTCTCTTCGGCTTGATGAGTTGGAGGAGTTGCTGACCAACAATAGGATCTGGCGAAATCG
GACAATTGACATTGGGGTTGTAACAGCAGAAGAAGCACTTAACTATGGTTTTAGTGGAGTGATGCTTCGG
GGCTCAGGCATCCAGTGGGACCTGCGGAAGACCCAGCCCTATGATGTTTACGACCAGGTTGAGTTTGATG
TTCCTGTTGGTTCTCGAGGGGACTGCTATGATAGGTACCTGTGCCGGGTGGAGGAGATGCGCCAGTCCCT
GAGAATTATCGCACAGTGTCTAAACAAGATGCCTCCTGGGGAGATCAAGGTTGATGATGCCAAAGTGTCT
CCACCTAAGCGAGCAGAGATGAAGACTTCCATGGAGTCACTGATTCATCACTTTAAGTTGTATACTGAGG
GCTACCAAGTTCCTCCAGGAGCCACATATACTGCCATTGAGGCTCCCAAGGGAGAGTTTGGGGTGTACCT
GGTGTCTGATGGCAGCAGCCGCCCTTATCGATGCAAGATCAAGGCTCCTGGTTTTGCCCATCTGGCTGGT
TTGGACAAGATGTCTAAGGGACACATGTTGGCAGATGTCGTTGCCATCATAGGTACCCAAGATATTGTAT
TTGGAGAAGTAGATCGGTGAGCAGGGGAGCAGCGTTTGATCCCCCCTGCCTATCAGCTTCTTCTGTGGAG
CCTGTTCCTCACTGGAAATTGGCCTCTGTGTGTGTGTGTGTGTGTGTGTGTGTGTGTATGTTCATGTACA
CTTGGCTGTCAGGCTTTCTGTGCATGTACTAAAAAAGGAGAAATTATAATAAATTAGCCGTCTTGCGGCC
CCTAGGCCTAAAAAAAAAAAAAAAAAAAA
Find information on protein TP53 from human and pig, save and paste the sequence
TP53 Homo sapiens
MEEPQSDPSVEPPLSQETFSDLWKLLPENNVLSPLPSQAMDDLMLSPDDIEQWFTEDPGP
DEAPRMPEAAPPVAPAPAAPTPAAPAPAPSWPLSSSVPSQKTYQGSYGFRLGFLHSGTAK
SVTCTYSPALNKMFCQLAKTCPVQLWVDSTPPPGTRVRAMAIYKQSQHMTEVVRRCPHHE
RCSDSDGLAPPQHLIRVEGNLRVEYLDDRNTFRHSVVVPYEPPEVGSDCTTIHYNYMCNS
SCMGGMNRRPILTIITLEDSSGNLLGRNSFEVRVCACPGRDRRTEEENLRKKGEPHHELP
PGSTKRALPNNTSSSPQPKKKPLDGEYFTLQIRGRERFEMFRELNEALELKDAQAGKEPG
GSRAHSSHLKSKKGQSTSRHKKLMFKTEGPDSD
Pig
MEESQSELGVEPPLSQETFSDLWKLLPENNLLSSELSLAAVNDLLLSPVTNWLDENPDDA
SRVPAPPAATAPAPAAPAPATSWPLSSFVPSQKTYPGSYDFRLGFLHSGTAKSVTCTYSP
ALNKLFCQLAKTCPVQLWVSSPPPPGTRVRAMAIYKKSEYMTEVVRRCPHHERSSDYSDG
LAPPQHLIRVEGNLRAEYLDDRNTFRHSVVVPYEPPEVGSDCTTIHYNFMCNSSCMGGMN
RRPILTIITLEDASGNLLGRNSFEVRVCACPGRDRRTEEENFLKKGQSCPEPPPGSTKRA
LPTSTSSSPVQKKKPLDGEYFTLQIRGRERFEMFRELNDALELKDAQTARESGENRAHSS
HLKSKKGQSPSRHKKPMFKREGPDSD
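One way to compare the two TP53 sequences is a global pairwise alignment followed by a percent identity count. The sketch below is a plain Needleman-Wunsch implementation in Python with simple match, mismatch and gap scores; these scores and the function names are assumptions chosen for illustration, whereas tools such as BLAST use substitution matrices and different gap penalties. Only the first 60 residues of each sequence above are used to keep the example short.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment; returns the two gapped strings."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diagonal, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    i, j = len(a), len(b)
    out_a, out_b = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + (
            match if a[i - 1] == b[j - 1] else mismatch
        ):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Identical, non-gap columns as a percentage of the alignment length."""
    same = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * same / len(aligned_a)


human = "MEEPQSDPSVEPPLSQETFSDLWKLLPENNVLSPLPSQAMDDLMLSPDDIEQWFTEDPGP"
pig = "MEESQSELGVEPPLSQETFSDLWKLLPENNLLSSELSLAAVNDLLLSPVTNWLDENPDDA"

aligned_human, aligned_pig = global_align(human, pig)
print(aligned_human)
print(aligned_pig)
print(round(percent_identity(aligned_human, aligned_pig), 1))
```

The table fill is quadratic in sequence length, which is fine for sequences of a few hundred residues such as the full TP53 proteins.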
D
Name of the gene and its function
Answer: FMR1
Function- Altered FMR1 gene expression is associated with the pathogenesis of fragile X syndrome,
especially mental retardation, through its early transcription and distribution.
FMR1 also supports function in the mature testes, as indicated by its greater expression in
spermatogonia. FMR1 expression in spermatogonia is essential for germ cell proliferation.
It inhibits translation of many mRNAs at monomer concentrations.
Alternative names
FMR1_HUMAN
FMRP
FRAXA
Protein FMR-1
Chromosomal location (Homo sapiens)
Cytogenetic location: Xq27.3

Molecular location: base pairs 147,911,951 to 147,951,127 on the X chromosome.
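The size of the region follows from these two coordinates. The short Python sketch below assumes the coordinates are 1-based and inclusive, which is why 1 is added; the helper names are hypothetical.

```python
def parse_position(text):
    """Turn a coordinate written with commas or spaces into an integer."""
    return int(text.replace(",", "").replace(" ", ""))


def region_length(start, end):
    """Length in base pairs of a 1-based, inclusive interval."""
    return end - start + 1


fmr1_start = parse_position("147,911,951")
fmr1_end = parse_position("147,951,127")
print(region_length(fmr1_start, fmr1_end))  # 39177
```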
Two phenotypes associated with fragile x syndrome
The features of this syndrome become more prominent with age.
A long and narrow face, wide ears, and a prominent jaw and forehead. The fingers are usually
flexible and the feet are flat (Hagerman et al. 2017).
Related publication on fragile x syndrome
Cholesterol levels in fragile X syndrome by Berry-Kravis, Shah, Mathur, Darnell and Ouyang (2015).
E
Use the OMIM database to find diseases associated with Matrix Metalloproteases MMP9, MMP2 and
MMP14
MMP9 (matrix metallopeptidase 9)
Alternative names
CLG4B
GELB
MANDP2
MMP-9
Chromosomal Location
Cytogenetic Location: 20q13.12
Molecular Location: base pairs 46,008,908 to 46,016,561 on chromosome 20 (Homo sapiens)
Disease associated
Metaphyseal anadysplasia 2
Related publication
“MMP-9 supplied by bone marrow-derived cells contributes to skin carcinogenesis” by Coussens,
Tinkle, Hanahan and Werb (2000).
F
What are SNPs and its importance?
Answer
Single nucleotide polymorphisms (SNPs) are common but minute differences that occur in human DNA
at a frequency of about 1 in every 1000 bases. A SNP is a single base-pair position in the genome
at which more than one of the four bases is commonly present in the natural population.
Functions:
They provide the densest predictable map of genetic differences.
They act as biological markers to find the exact location of genes that are associated with disease.
SNPs may also help to predict an individual’s response to some drugs.
What does CDH1 code for?
CDH1 codes for a protein called epithelial cadherin, or E-cadherin.
What disease are the mutations in this gene associated with?
Gastric Cancer
Find a SNP that has pathogenic clinical significance. rs13689
Paste the region affected by the SNP here
TTGGGCTCTTTTAGGGTAAGAAGTT[A/C/G/T]GTGTCTTTGTCTGGCCACATCTTGA
Find a SNP that has a benign clinical significance.
rs8045438
Paste the region affected by the SNP here (a string of around 55 bases)
TAAACAATTTTGTTAAACCATTGTC[A/G]AATTTGTTTTATTTGATTCCACCAC
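The bracket notation used in these flanking strings lists the possible alleles at the variant position. As a minimal sketch, the Python function below expands such a string into one full sequence per allele; the function name `expand_snp` is an assumption, and a "-" allele is treated as a deletion (no base inserted).

```python
import re


def expand_snp(region):
    """Expand 'LEFT[A/G]RIGHT' into a list with one sequence per allele."""
    match = re.fullmatch(r"([ACGT]*)\[([^\]]+)\]([ACGT]*)", region)
    if match is None:
        raise ValueError("expected flanking sequence with one [allele/allele] block")
    left, alleles, right = match.groups()
    return [left + ("" if allele == "-" else allele) + right
            for allele in alleles.split("/")]


region = "TAAACAATTTTGTTAAACCATTGTC[A/G]AATTTGTTTTATTTGATTCCACCAC"
variants = expand_snp(region)
for sequence in variants:
    print(sequence)
```

With 25 flanking bases on each side, each expanded sequence here is 51 bases long and the variant base sits at index 25.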
What is the difference between a benign and pathogenic SNP in this gene?
Answer
SNP          Significance   Chromosomal position   Contig allele   Ancestral allele
rs13689      Benign         68834619               T               C
rs33935154   Pathogenic     68822138               G               G
Frame shift variant of rs587781276
Affected string
CCTTAGAGGTCAGCGTGTGTGACTG[-/TG]AAGGGGCCGCTGGCGTCTGTAGGAA
Effect of SNP
The pathogenic allele carries a TG insertion ([-/TG]) in the CDH1 gene on chromosome 16. Because
two bases are inserted, the reading frame downstream of the variant is shifted.
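The frame shift can be made visible by splitting the affected string into codons with and without the TG insertion. In the Python sketch below the reading frame is an assumption (the flanking string is simply read from its first base, since the true frame is not given here), so the codons are illustrative rather than the real CDH1 codons.

```python
def codons(seq, frame=0):
    """Split a sequence into complete codons starting at the given frame."""
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


left = "CCTTAGAGGTCAGCGTGTGTGACTG"
right = "AAGGGGCCGCTGGCGTCTGTAGGAA"

reference = left + right            # allele "-": nothing inserted
variant = left + "TG" + right       # allele "TG": two extra bases

ref_codons = codons(reference)
var_codons = codons(variant)

# Index of the first codon that differs between the two alleles.
first_changed = next(
    i for i, (r, v) in enumerate(zip(ref_codons, var_codons)) if r != v
)
print(first_changed, ref_codons[first_changed], var_codons[first_changed])
print((len(variant) - len(reference)) % 3 != 0)  # True when the frame is shifted
```

Because the inserted length is not a multiple of three, every codon from the insertion point onward is read from shifted positions.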
Save and paste the mRNA sequence of the CDH1 gene
Answer
1 gtacctcgga tcccctgact tgcgagggac gcattcgggc cgcaagctcc gcgccccagc
61 cccgtgcccc agccctgcgc cccttcctct cccgtcgtca ccgcttccct tcttccaaga
121 aagttcgggt cctgaggagc ggagcggcct ggaagcctcg cgcgctccgg accccccagt
181 gatgggagtg gggcgtgggt ggtgaggggc gagcgcggct ttcctgcccc ctccagcgca
241 gaccgaggcg ggggcgtctg gccgcggagt ccggcggggt gggctcgcgc gggcggtggg
301 ggcgtgaagc ggggtgtagg gggtggggtg tggagaaggg gtgccctggt gcaagtcgag
361 ggggagccag gagtcgtggg gacgatcttc gagggaagga gaggggcatc cgtagaaata
421 aaggcacctg ccatgccaag aaaggtcgta aataggagtg agggtcccgg ggataagaaa
481 gtgaggtcgg aggaggtggg agcgcccctc gctctgagga gtggtgcatt cccggtctaa
541 ggaaagtggg gtcctggaga ataaagacat ctccaataaa atgagaaagg agactgaaag
601 ggaacggtgg gctaggtctt gagggggtga ctcggcggcc ctcccgggag ttcctggggg
661 ctcggcggcc gtaggtttcg gggtggggga gggtgacgtc gctgcccgcc cgtcccgggg
721 ctgcgggctg gggtcctccc ccaatcccga cgccgggagc gagggagggg cggcgctgtt
781 ggtttcggtg agcaggaggg aaccctccga gtcacccggt tccatctacc tttcccccac
841 cccag
//
Compare protein sequence with genbank sequence
Protein sequence FASTA format
MEPGDSEGSLLLTETNSAAPPSLPASGLGEDPSPQPRDGRAATSPSPTPKPTAAEPPGTP
GRAAESPPQDLAHRSLSVSFLILLEMSLFSRTPLSLDRECTTPQSEGRSHLLRPHFLIPG
TLTPIYDLSWHGRCLYFYGCPSPSLEDRPHDSWLPLDLHQGTPSPHPTPYTPLHAPTARA
SPPRRTPRPDAPASVCAGGGRKAALAPHHPRPTPITGGSGAREASRPLRSSGPELSWKKG
SGDDGRGRGAGLGHGAGARSLRPECVPRKSGDPRY

Un-translated regions
Answer
Untranslated regions (UTRs) are the parts of the mRNA that do not code for protein. Both the upstream
(5') and downstream (3') UTRs are important in translation; the 3' UTR plays an important role in translation termination.
UTRs also take part in many regulatory aspects of gene expression in eukaryotes.
Now go to rs121964874 [Homo sapiens] and paste the 55 base sequence with SNP here
Answer- GGCCGCTGGCGTCTGTAGGAAGGCA[A/C/G/T]AGCCTGTCGAAGCAGGATTGCAAAT
Exact location of SNP- 68,823,557
rs16260 (A)
This SNP is located in the promoter region of the E-cadherin (CDH1) gene and is associated with an
increased risk of hereditary prostate cancer.
CTAGCAACTCCAGGCTAGAGGGTCA[A/C]CGCGTCTATGCGAGGCCGGGTGGGC
Expasy FASTA Format
MCXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
GenBank protein sequence
1 mgpwsrslsa lllllqvssw lcqepepchp gfdaesytft vprrhlergr vlgrvnfedc
61 tgrqrtayfs ldtrfkvgtd gvitvkrplr fhnpqihflv yawdstyrkf stkvtlntvg
121 hhhrppphqa svsgiqaell tfpnsspglr rqkrdwvipp iscpenekgp fpknlvqiks
181 nkdkegkvfy sitgqgadtp pvgvfiiere tgwlkvtepl dreriatytl fshavssngn
241 avedpmeili tvtdqndnkp eftqevfkgs vmegalpgts vmevtatdad ddvntynaai
301 aytilsqdpe lpdknmftin rntgvisvvt tgldresfpt ytlvvqaadl qgeglsttat
361 avitvtdtnd nppifnptty kgqvpenean vvittlkvtd adapntpawe avytilnddg
421 gqfvvttnpv nndgilktak gldfeakqqy ilhvavtnvv pfevslttst atvtvdvldv
481 neapifvppe krvevsedfg vgqeitsyta qepdtfmeqk ityriwrdta nwleinpdtg
541 aistraeldr edfehvknst ytaliiatdn gspvatgtgt lllilsdvnd napipeprti
601 ffcernpkpq viniidadlp pntspftael thgasanwti qyndptqesi ilkpkmalev
661 gdykinlklm dnqnkdqvtt levsvcdceg aagvcrkaqp veaglqipai lgilggilal
721 lililllllf lrrravvkep llppeddtrd nvyyydeegg geedqdfdls qlhrgldarp
781 evtrndvapt lmsvprylpr panpdeignf idenlkaadt dptappydsl lvfdyegsgs
841 eaaslsslns sesdkdqdyd ylnewgnrfk kladmyggge dd
//
Do SNPs need to be located in the coding region (exon) of a gene to have an effect?
Answer
SNPs do not need to be located in a coding region to have an effect, although the probability of an
effect is higher there. Single-nucleotide polymorphisms may be found within the coding sequences
of genes or in non-coding regions. SNPs within a coding sequence do not necessarily change
the amino acid sequence of the protein produced, because the genetic code is degenerate. SNPs that
are not in protein-coding regions may still alter gene splicing, transcription
factor binding, mRNA degradation, or the sequence of a non-coding RNA. A SNP that affects gene
expression in this way is known as an eSNP (expression SNP) and may lie upstream or downstream of the
gene.
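The degeneracy argument can be illustrated with a short Python sketch. It assumes the standard genetic code (written here as a compact 64-letter string in TCAG order) and uses made-up example codons. The function name classify_coding_snp is hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def classify_coding_snp(codon, position, new_base):
    """Classify a single-base change inside one codon (position is 0, 1 or 2)."""
    mutated = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    if before == after:
        kind = "synonymous"      # same amino acid: degeneracy hides the change
    elif after == "*":
        kind = "nonsense"        # new stop codon
    else:
        kind = "missense"        # different amino acid
    return mutated, before, after, kind

print(classify_coding_snp("GAA", 2, "G"))  # third-base change
print(classify_coding_snp("GAA", 1, "T"))  # second-base change
print(classify_coding_snp("CAG", 0, "T"))  # first-base change
```

The three calls show one change of each kind: a third-base change that leaves the amino acid unchanged, a second-base change that replaces it, and a first-base change that creates a stop codon.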
B. Genetic Analysis of a Human Cancer Disease Using Databases
5. Analysis of a Human Genetic Disease
On what chromosome is the gene HTT ?
Answer- 4p16
What are the cytogenetic location and genomic co-ordinates of the gene?
Answer-
Cytogenetic Location: 4p16.3, which is the short (p) arm of chromosome 4 at position 16.3
Genomic Coordinates - 4:3,074,680-3,243,959
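The size of the region follows directly from the coordinates. A minimal Python sketch, assuming the coordinates are 1-based and inclusive at both ends (an assumption, since the answer does not state the convention); parse_region is a hypothetical helper name.

```python
def parse_region(region):
    """Parse 'chrom:start-end' (commas allowed) into (chrom, start, end)."""
    chrom, span = region.split(":")
    start_text, end_text = span.replace(",", "").split("-")
    return chrom, int(start_text), int(end_text)

chrom, start, end = parse_region("4:3,074,680-3,243,959")
length = end - start + 1  # both ends counted (assumed closed interval)
print(chrom, start, end, length)
```

Under that assumption the gene spans 169,280 bases of chromosome 4.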
What is the OMIM reference number for this gene?
613004 (HTT gene entry; the Huntington disease phenotype entry is 143100)
What does HTT code for?
The HTT gene encodes huntingtin, a ubiquitously expressed nuclear protein that binds to
various transcription factors to regulate transcription.
What roles does HTT play in the cell and how can they be related to Huntington Disease?
Answer-
Huntingtin plays an essential role in nerve cells (neurons) in the brain and is necessary for normal
development before birth. Within cells, huntingtin takes part in chemical signalling, transporting materials,
binding to proteins and other structures, and protecting the cell from self-destruction. The inherited
mutation that causes Huntington disease is known as the CAG trinucleotide repeat expansion. This
mutation increases the size of the CAG segment in the HTT gene. People with Huntington disease have 36 to
120 CAG repeats.
What change in HTT results in Huntington disease?
Answer - An expanded CAG repeat on a single allele of the HTT gene. The increased CAG repeat size in HTT is
associated with Huntington disease.
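Counting the repeat can be sketched in a few lines of Python. This is an illustrative sketch only: the thresholds are the ones quoted in this assignment's answers (36 or more repeats for Huntington disease, 29 to 35 for intermediate alleles), not a clinical standard, and the function names and the toy sequence are made up for the example.

```python
import re

def longest_cag_run(sequence):
    """Return the number of CAG units in the longest uninterrupted CAG tract."""
    runs = re.findall(r"(?:CAG)+", sequence.upper())
    return max((len(run) // 3 for run in runs), default=0)

def classify_repeat_count(count):
    """Thresholds taken from this assignment's answers, not a clinical standard."""
    if count >= 36:
        return "Huntington disease range"
    if 29 <= count <= 35:
        return "intermediate allele"
    return "below the intermediate range"

# Toy sequence: 21 CAG units, then CAA interrupts the tract.
toy = "TCCTTC" + "CAG" * 21 + "CAACAG" + "CCGCCA"
n = longest_cag_run(toy)
print(n, classify_repeat_count(n))
```

Only uninterrupted CAG units are counted, so a CAA codon ends the tract even though it also encodes glutamine.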
List 5 allelic variants of HTT and define the molecular basis for the dysfunctional gene product in each
1. .0001 HUNTINGTON DISEASE, (CAG)n EXPANSION
Huntington disease is caused by expansion of a polymorphic trinucleotide repeat, (CAG)n, which encodes
glutamine and lies in the N-terminal coding region of the HTT gene.
2. .0002 LOPES-MACIEL-RODAN SYNDROME (HTT, PRO703LEU)
Compound heterozygous mutations in the HTT gene were identified as responsible for
Lopes-Maciel-Rodan syndrome; this variant is a Pro703-to-Leu substitution.
3. .0003 LOPES-MACIEL-RODAN SYNDROME (HTT, THR1260MET)
The c.3779C-T transition, a Thr1260-to-Met substitution, was found in compound heterozygous state in
a patient with this syndrome.
4. .0004 LOPES-MACIEL-RODAN SYNDROME (HTT, IVS34DS, G-A, +1)
Compound heterozygous mutations were identified in the HTT gene: a G-to-A transition in intron
34, predicted to result in abnormal gene splicing, and a c.8156T-A transversion.
5. .0005 LOPES-MACIEL-RODAN SYNDROME (HTT, PHE2719LEU)
The c.8156T-A transversion (c.8156T-A, NM_002111.7) in the HTT gene creates
a Phe2719-to-Leu substitution and was found in compound heterozygous state in 3 sibs with this
syndrome.
Do all mutations lead to same disease? Explain why.
Answer
Not all mutations in HTT lead to Huntington disease. Huntington disease results from CAG repeat
expansion, which can arise from intermediate alleles carrying 29 to 35 CAG repeats. The other variants
listed above, found by exome sequencing and confirmed by Sanger sequencing, segregated with a
different disorder, Lopes-Maciel-Rodan syndrome.
6. Acquiring Sequence Information
HTT Gene
Official name: Huntingtin
Identity number: 3064
Number of exons: 67
Accession number for transcript
mRNA: NM_002111
Protein: NP_002102
Length of mRNA: 13498 bp
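Accession strings like the ones above have a regular shape that is easy to split. A minimal Python sketch, assuming the PREFIX_NUMBER[.VERSION] layout seen in this record; parse_accession is a hypothetical helper, and the lookup table deliberately maps only the two prefixes confirmed by the HTT entry itself.

```python
def parse_accession(accession):
    """Split 'PREFIX_NUMBER[.VERSION]' into parts; version is None if absent."""
    base, _, version = accession.partition(".")
    prefix, _, number = base.partition("_")
    return prefix + "_", number, int(version) if version else None

# Only the two prefixes confirmed by the HTT record are mapped here.
MOLECULE_BY_PREFIX = {"NM_": "mRNA", "NP_": "protein"}

for acc in ("NM_002111", "NP_002102", "NM_002111.8"):
    prefix, number, version = parse_accession(acc)
    print(acc, MOLECULE_BY_PREFIX.get(prefix, "unknown"), number, version)
```

The version suffix matters when comparing sequences, since NM_002111.7 and NM_002111.8 are different revisions of the same transcript record.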
Other genes
Accession prefix  Official name                           Gene ID    No. of exons  mRNA accession  Protein accession  mRNA transcript length (bp)
AC_               achaete                                 30981      1             BT022154.1      P10083.1           917
NC_               Nedd-2 like caspase                     100500764  8             AB489117.1      BAJ16361.1         1682
NG_               nackig                                  103929     NA            NA              AAA30607.1         NA
NT_               no turning deletion region              18202      NA            NA              NA                 NA
NW_               narrow                                  3772015    4             BT126129.1      ADZ36849.1         2387
NS_               Protein ABA deficient 4, chloroplastic  103444131  6             JN941557.1      AEX97076.1         727
NZ_b              Not found (NF)                          NF         NF            NF              NF                 NF
NM_               Neutrophil migration                    4827       NF            NF              NF                 NF
NR_               Nitrate reductase                       100736473  4             HQ616893.1      P17570.1           739
XM_c              Marginal coil Xmc                       100491647  1             XM_002944874    XP_002944920       2623
XR_c              NF                                      NF         NF            NF              NF                 NF
AP_               Apterous                                35509      10            NM_165445       NP_724428          4114
NP_               Notopleural                             35904      10            NM_001259284    NP_001246213       3822
YP_c              NF                                      NF         NF            NF              NF                 NF
XP_c              Xeroderma pigmentosum                   36697      6             NM_166087       NP_72545           4683
ZP_c              Zona pellucida                          100379210  0             AV630570.1      AAV335105.1        2337
Copy and paste the sequence (including the title line) here.
>NM_002111.8 Homo sapiens huntingtin (HTT), mRNA
GCTGCCGGGACGGGTCCAAGATGGACGGCCGCTCAGGTTCTGCTTTTACCTGCGGCCCAGAGCCCCATTC
ATTGCCCCGGTGCTGAGCGGCGCCGCGAGTCGGCCCGAGGCCTCCGGGGACTGCCGTGCCGGGCGGGAGA
CCGCCATGGCGACCCTGGAAAAGCTGATGAAGGCCTTCGAGTCCCTCAAGTCCTTCCAGCAGCAGCAGCA
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAACAGCCGCCACCGCCGCCG
CCGCCGCCGCCGCCTCCTCAGCTTCCTCAGCCGCCGCCGCAGGCACAGCCGCTGCTGCCTCAGCCGCAGC
CGCCCCCGCCGCCGCCCCCGCCGCCACCCGGCCCGGCTGTGGCTGAGGAGCCGCTGCACCGACCAAAGAA
AGAACTTTCAGCTACCAAGAAAGACCGTGTGAATCATTGTCTGACAATATGTGAAAACATAGTGGCACAG
TCTGTCAGAAATTCTCCAGAATTTCAGAAACTTCTGGGCATCGCTATGGAACTTTTTCTGCTGTGCAGTG
ATGACGCAGAGTCAGATGTCAGGATGGTGGCTGACGAATGCCTCAACAAAGTTATCAAAGCTTTGATGGA
TTCTAATCTTCCAAGGTTACAGCTCGAGCTCTATAAGGAAATTAAAAAGAATGGTGCCCCTCGGAGTTTG
CGTGCTGCCCTGTGGAGGTTTGCTGAGCTGGCTCACCTGGTTCGGCCTCAGAAATGCAGGCCTTACCTGG
TGAACCTTCTGCCGTGCCTGACTCGAACAAGCAAGAGACCCGAAGAATCAGTCCAGGAGACCTTGGCTGC
AGCTGTTCCCAAAATTATGGCTTCTTTTGGCAATTTTGCAAATGACAATGAAATTAAGGTTTTGTTAAAG
GCCTTCATAGCGAACCTGAAGTCAAGCTCCCCCACCATTCGGCGGACAGCGGCTGGATCAGCAGTGAGCA
TCTGCCAGCACTCAAGAAGGACACAATATTTCTATAGTTGGCTACTAAATGTGCTCTTAGGCTTACTCGT
TCCTGTCGAGGATGAACACTCCACTCTGCTGATTCTTGGCGTGCTGCTCACCCTGAGGTATTTGGTGCCC
TTGCTGCAGCAGCAGGTCAAGGACACAAGCCTGAAAGGCAGCTTCGGAGTGACAAGGAAAGAAATGGAAG
TCTCTCCTTCTGCAGAGCAGCTTGTCCAGGTTTATGAACTGACGTTACATCATACACAGCACCAAGACCA
CAATGTTGTGACCGGAGCCCTGGAGCTGTTGCAGCAGCTCTTCAGAACGCCTCCACCCGAGCTTCTGCAA
ACCCTGACCGCAGTCGGGGGCATTGGGCAGCTCACCGCTGCTAAGGAGGAGTCTGGTGGCCGAAGCCGTA
GTGGGAGTATTGTGGAACTTATAGCTGGAGGGGGTTCCTCATGCAGCCCTGTCCTTTCAAGAAAACAAAA
AGGCAAAGTGCTCTTAGGAGAAGAAGAAGCCTTGGAGGATGACTCTGAATCGAGATCGGATGTCAGCAGC
TCTGCCTTAACAGCCTCAGTGAAGGATGAGATCAGTGGAGAGCTGGCTGCTTCTTCAGGGGTTTCCACTC
CAGGGTCAGCAGGTCATGACATCATCACAGAACAGCCACGGTCACAGCACACACTGCAGGCGGACTCAGT
GGATCTGGCCAGCTGTGACTTGACAAGCTCTGCCACTGATGGGGATGAGGAGGATATCTTGAGCCACAGC
TCCAGCCAGGTCAGCGCCGTCCCATCTGACCCTGCCATGGACCTGAATGATGGGACCCAGGCCTCGTCGC
CCATCAGCGACAGCTCCCAGACCACCACCGAAGGGCCTGATTCAGCTGTTACCCCTTCAGACAGTTCTGA
AATTGTGTTAGACGGTACCGACAACCAGTATTTGGGCCTGCAGATTGGACAGCCCCAGGATGAAGATGAG
GAAGCCACAGGTATTCTTCCTGATGAAGCCTCGGAGGCCTTCAGGAACTCTTCCATGGCCCTTCAACAGG
CACATTTATTGAAAAACATGAGTCACTGCAGGCAGCCTTCTGACAGCAGTGTTGATAAATTTGTGTTGAG
AGATGAAGCTACTGAACCGGGTGATCAAGAAAACAAGCCTTGCCGCATCAAAGGTGACATTGGACAGTCC
ACTGATGATGACTCTGCACCTCTTGTCCATTGTGTCCGCCTTTTATCTGCTTCGTTTTTGCTAACAGGGG
GAAAAAATGTGCTGGTTCCGGACAGGGATGTGAGGGTCAGCGTGAAGGCCCTGGCCCTCAGCTGTGTGGG
AGCAGCTGTGGCCCTCCACCCGGAATCTTTCTTCAGCAAACTCTATAAAGTTCCTCTTGACACCACGGAA
TACCCTGAGGAACAGTATGTCTCAGACATCTTGAACTACATCGATCATGGAGACCCACAGGTTCGAGGAG
CCACTGCCATTCTCTGTGGGACCCTCATCTGCTCCATCCTCAGCAGGTCCCGCTTCCACGTGGGAGATTG
GATGGGCACCATTAGAACCCTCACAGGAAATACATTTTCTTTGGCGGATTGCATTCCTTTGCTGCGGAAA
ACACTGAAGGATGAGTCTTCTGTTACTTGCAAGTTAGCTTGTACAGCTGTGAGGAACTGTGTCATGAGTC
TCTGCAGCAGCAGCTACAGTGAGTTAGGACTGCAGCTGATCATCGATGTGCTGACTCTGAGGAACAGTTC
CTATTGGCTGGTGAGGACAGAGCTTCTGGAAACCCTTGCAGAGATTGACTTCAGGCTGGTGAGCTTTTTG
GAGGCAAAAGCAGAAAACTTACACAGAGGGGCTCATCATTATACAGGGCTTTTAAAACTGCAAGAACGAG
TGCTCAATAATGTTGTCATCCATTTGCTTGGAGATGAAGACCCCAGGGTGCGACATGTTGCCGCAGCATC
ACTAATTAGGCTTGTCCCAAAGCTGTTTTATAAATGTGACCAAGGACAAGCTGATCCAGTAGTGGCCGTG
GCAAGAGATCAAAGCAGTGTTTACCTGAAACTTCTCATGCATGAGACGCAGCCTCCATCTCATTTCTCCG
TCAGCACAATAACCAGAATATATAGAGGCTATAACCTACTACCAAGCATAACAGACGTCACTATGGAAAA
TAACCTTTCAAGAGTTATTGCAGCAGTTTCTCATGAACTAATCACATCAACCACCAGAGCACTCACATTT
GGATGCTGTGAAGCTTTGTGTCTTCTTTCCACTGCCTTCCCAGTTTGCATTTGGAGTTTAGGTTGGCACT
GTGGAGTGCCTCCACTGAGTGCCTCAGATGAGTCTAGGAAGAGCTGTACCGTTGGGATGGCCACAATGAT
TCTGACCCTGCTCTCGTCAGCTTGGTTCCCATTGGATCTCTCAGCCCATCAAGATGCTTTGATTTTGGCC
GGAAACTTGCTTGCAGCCAGTGCTCCCAAATCTCTGAGAAGTTCATGGGCCTCTGAAGAAGAAGCCAACC
CAGCAGCCACCAAGCAAGAGGAGGTCTGGCCAGCCCTGGGGGACCGGGCCCTGGTGCCCATGGTGGAGCA
GCTCTTCTCTCACCTGCTGAAGGTGATTAACATTTGTGCCCACGTCCTGGATGACGTGGCTCCTGGACCC
GCAATAAAGGCAGCCTTGCCTTCTCTAACAAACCCCCCTTCTCTAAGTCCCATCCGACGAAAGGGGAAGG
AGAAAGAACCAGGAGAACAAGCATCTGTACCGTTGAGTCCCAAGAAAGGCAGTGAGGCCAGTGCAGCTTC
TAGACAATCTGATACCTCAGGTCCTGTTACAACAAGTAAATCCTCATCACTGGGGAGTTTCTATCATCTT
CCTTCATACCTCAAACTGCATGATGTCCTGAAAGCTACACACGCTAACTACAAGGTCACGCTGGATCTTC
AGAACAGCACGGAAAAGTTTGGAGGGTTTCTCCGCTCAGCCTTGGATGTTCTTTCTCAGATACTAGAGCT
GGCCACACTGCAGGACATTGGGAAGTGTGTTGAAGAGATCCTAGGATACCTGAAATCCTGCTTTAGTCGA
GAACCAATGATGGCAACTGTTTGTGTTCAACAATTGTTGAAGACTCTCTTTGGCACAAACTTGGCCTCCC
AGTTTGATGGCTTATCTTCCAACCCCAGCAAGTCACAAGGCCGAGCACAGCGCCTTGGCTCCTCCAGTGT
GAGGCCAGGCTTGTACCACTACTGCTTCATGGCCCCGTACACCCACTTCACCCAGGCCCTCGCTGACGCC
AGCCTGAGGAACATGGTGCAGGCGGAGCAGGAGAACGACACCTCGGGATGGTTTGATGTCCTCCAGAAAG
TGTCTACCCAGTTGAAGACAAACCTCACGAGTGTCACAAAGAACCGTGCAGATAAGAATGCTATTCATAA
TCACATTCGTTTGTTTGAACCTCTTGTTATAAAAGCTTTAAAACAGTACACGACTACAACATGTGTGCAG
TTACAGAAGCAGGTTTTAGATTTGCTGGCGCAGCTGGTTCAGTTACGGGTTAATTACTGTCTTCTGGATT
CAGATCAGGTGTTTATTGGCTTTGTATTGAAACAGTTTGAATACATTGAAGTGGGCCAGTTCAGGGAATC
AGAGGCAATCATTCCAAACATCTTTTTCTTCTTGGTATTACTATCTTATGAACGCTATCATTCAAAACAG
ATCATTGGAATTCCTAAAATCATTCAGCTCTGTGATGGCATCATGGCCAGTGGAAGGAAGGCTGTGACAC
ATGCCATACCGGCTCTGCAGCCCATAGTCCACGACCTCTTTGTATTAAGAGGAACAAATAAAGCTGATGC
AGGAAAAGAGCTTGAAACCCAAAAAGAGGTGGTGGTGTCAATGTTACTGAGACTCATCCAGTACCATCAG
GTGTTGGAGATGTTCATTCTTGTCCTGCAGCAGTGCCACAAGGAGAATGAAGACAAGTGGAAGCGACTGT
CTCGACAGATAGCTGACATCATCCTCCCAATGTTAGCCAAACAGCAGATGCACATTGACTCTCATGAAGC
CCTTGGAGTGTTAAATACATTATTTGAGATTTTGGCCCCTTCCTCCCTCCGTCCGGTAGACATGCTTTTA
CGGAGTATGTTCGTCACTCCAAACACAATGGCGTCCGTGAGCACTGTTCAACTGTGGATATCGGGAATTC
TGGCCATTTTGAGGGTTCTGATTTCCCAGTCAACTGAAGATATTGTTCTTTCTCGTATTCAGGAGCTCTC
CTTCTCTCCGTATTTAATCTCCTGTACAGTAATTAATAGGTTAAGAGATGGGGACAGTACTTCAACGCTA
GAAGAACACAGTGAAGGGAAACAAATAAAGAATTTGCCAGAAGAAACATTTTCAAGGTTTCTATTACAAC
TGGTTGGTATTCTTTTAGAAGACATTGTTACAAAACAGCTGAAGGTGGAAATGAGTGAGCAGCAACATAC
TTTCTATTGCCAGGAACTAGGCACACTGCTAATGTGTCTGATCCACATCTTCAAGTCTGGAATGTTCCGG
AGAATCACAGCAGCTGCCACTAGGCTGTTCCGCAGTGATGGCTGTGGCGGCAGTTTCTACACCCTGGACA
GCTTGAACTTGCGGGCTCGTTCCATGATCACCACCCACCCGGCCCTGGTGCTGCTCTGGTGTCAGATACT
GCTGCTTGTCAACCACACCGACTACCGCTGGTGGGCAGAAGTGCAGCAGACCCCGAAAAGACACAGTCTG
TCCAGCACAAAGTTACTTAGTCCCCAGATGTCTGGAGAAGAGGAGGATTCTGACTTGGCAGCCAAACTTG
GAATGTGCAATAGAGAAATAGTACGAAGAGGGGCTCTCATTCTCTTCTGTGATTATGTCTGTCAGAACCT
CCATGACTCCGAGCACTTAACGTGGCTCATTGTAAATCACATTCAAGATCTGATCAGCCTTTCCCACGAG
CCTCCAGTACAGGACTTCATCAGTGCCGTTCATCGGAACTCTGCTGCCAGCGGCCTGTTCATCCAGGCAA
TTCAGTCTCGTTGTGAAAACCTTTCAACTCCAACCATGCTGAAGAAAACTCTTCAGTGCTTGGAGGGGAT
CCATCTCAGCCAGTCGGGAGCTGTGCTCACGCTGTATGTGGACAGGCTTCTGTGCACCCCTTTCCGTGTG
CTGGCTCGCATGGTCGACATCCTTGCTTGTCGCCGGGTAGAAATGCTTCTGGCTGCAAATTTACAGAGCA
GCATGGCCCAGTTGCCAATGGAAGAACTCAACAGAATCCAGGAATACCTTCAGAGCAGCGGGCTCGCTCA
GAGACACCAAAGGCTCTATTCCCTGCTGGACAGGTTTCGTCTCTCCACCATGCAAGACTCACTTAGTCCC
TCTCCTCCAGTCTCTTCCCACCCGCTGGACGGGGATGGGCACGTGTCACTGGAAACAGTGAGTCCGGACA
AAGACTGGTACGTTCATCTTGTCAAATCCCAGTGTTGGACCAGGTCAGATTCTGCACTGCTGGAAGGTGC
AGAGCTGGTGAATCGGATTCCTGCTGAAGATATGAATGCCTTCATGATGAACTCGGAGTTCAACCTAAGC
CTGCTAGCTCCATGCTTAAGCCTAGGGATGAGTGAAATTTCTGGTGGCCAGAAGAGTGCCCTTTTTGAAG
CAGCCCGTGAGGTGACTCTGGCCCGTGTGAGCGGCACCGTGCAGCAGCTCCCTGCTGTCCATCATGTCTT
CCAGCCCGAGCTGCCTGCAGAGCCGGCGGCCTACTGGAGCAAGTTGAATGATCTGTTTGGGGATGCTGCA
CTGTATCAGTCCCTGCCCACTCTGGCCCGGGCCCTGGCACAGTACCTGGTGGTGGTCTCCAAACTGCCCA
GTCATTTGCACCTTCCTCCTGAGAAAGAGAAGGACATTGTGAAATTCGTGGTGGCAACCCTTGAGGCCCT
GTCCTGGCATTTGATCCATGAGCAGATCCCGCTGAGTCTGGATCTCCAGGCAGGGCTGGACTGCTGCTGC
CTGGCCCTGCAGCTGCCTGGCCTCTGGAGCGTGGTCTCCTCCACAGAGTTTGTGACCCACGCCTGCTCCC
TCATCTACTGTGTGCACTTCATCCTGGAGGCCGTTGCAGTGCAGCCTGGAGAGCAGCTTCTTAGTCCAGA
AAGAAGGACAAATACCCCAAAAGCCATCAGCGAGGAGGAGGAGGAAGTAGATCCAAACACACAGAATCCT
AAGTATATCACTGCAGCCTGTGAGATGGTGGCAGAAATGGTGGAGTCTCTGCAGTCGGTGTTGGCCTTGG
GTCATAAAAGGAATAGCGGCGTGCCGGCGTTTCTCACGCCATTGCTAAGGAACATCATCATCAGCCTGGC
CCGCCTGCCCCTTGTCAACAGCTACACACGTGTGCCCCCACTGGTGTGGAAGCTTGGATGGTCACCCAAA
CCGGGAGGGGATTTTGGCACAGCATTCCCTGAGATCCCCGTGGAGTTCCTCCAGGAAAAGGAAGTCTTTA
AGGAGTTCATCTACCGCATCAACACACTAGGCTGGACCAGTCGTACTCAGTTTGAAGAAACTTGGGCCAC
CCTCCTTGGTGTCCTGGTGACGCAGCCCCTCGTGATGGAGCAGGAGGAGAGCCCACCAGAAGAAGACACA
GAGAGGACCCAGATCAACGTCCTGGCCGTGCAGGCCATCACCTCACTGGTGCTCAGTGCAATGACTGTGC
CTGTGGCCGGCAACCCAGCTGTAAGCTGCTTGGAGCAGCAGCCCCGGAACAAGCCTCTGAAAGCTCTCGA
CACCAGGTTTGGGAGGAAGCTGAGCATTATCAGAGGGATTGTGGAGCAAGAGATTCAAGCAATGGTTTCA
AAGAGAGAGAATATTGCCACCCATCATTTATATCAGGCATGGGATCCTGTCCCTTCTCTGTCTCCGGCTA
CTACAGGTGCCCTCATCAGCCACGAGAAGCTGCTGCTACAGATCAACCCCGAGCGGGAGCTGGGGAGCAT
GAGCTACAAACTCGGCCAGGTGTCCATACACTCCGTGTGGCTGGGGAACAGCATCACACCCCTGAGGGAG
GAGGAATGGGACGAGGAAGAGGAGGAGGAGGCCGACGCCCCTGCACCTTCGTCACCACCCACGTCTCCAG
TCAACTCCAGGAAACACCGGGCTGGAGTTGACATCCACTCCTGTTCGCAGTTTTTGCTTGAGTTGTACAG
CCGCTGGATCCTGCCGTCCAGCTCAGCCAGGAGGACCCCGGCCATCCTGATCAGTGAGGTGGTCAGATCC
CTTCTAGTGGTCTCAGACTTGTTCACCGAGCGCAACCAGTTTGAGCTGATGTATGTGACGCTGACAGAAC
TGCGAAGGGTGCACCCTTCAGAAGACGAGATCCTCGCTCAGTACCTGGTGCCTGCCACCTGCAAGGCAGC
TGCCGTCCTTGGGATGGACAAGGCCGTGGCGGAGCCTGTCAGCCGCCTGCTGGAGAGCACGCTCAGGAGC
AGCCACCTGCCCAGCAGGGTTGGAGCCCTGCACGGCGTCCTCTATGTGCTGGAGTGCGACCTGCTGGACG
ACACTGCCAAGCAGCTCATCCCGGTCATCAGCGACTATCTCCTCTCCAACCTGAAAGGGATCGCCCACTG
CGTGAACATTCACAGCCAGCAGCACGTACTGGTCATGTGTGCCACTGCGTTTTACCTCATTGAGAACTAT
CCTCTGGACGTAGGGCCGGAATTTTCAGCATCAATAATACAGATGTGTGGGGTGATGCTGTCTGGAAGTG
AGGAGTCCACCCCCTCCATCATTTACCACTGTGCCCTCAGAGGCCTGGAGCGCCTCCTGCTCTCTGAGCA
GCTCTCCCGCCTGGATGCAGAATCGCTGGTCAAGCTGAGTGTGGACAGAGTGAACGTGCACAGCCCGCAC
CGGGCCATGGCGGCTCTGGGCCTGATGCTCACCTGCATGTACACAGGAAAGGAGAAAGTCAGTCCGGGTA
GAACTTCAGACCCTAATCCTGCAGCCCCCGACAGCGAGTCAGTGATTGTTGCTATGGAGCGGGTATCTGT
TCTTTTTGATAGGATCAGGAAAGGCTTTCCTTGTGAAGCCAGAGTGGTGGCCAGGATCCTGCCCCAGTTT
CTAGACGACTTCTTCCCACCCCAGGACATCATGAACAAAGTCATCGGAGAGTTTCTGTCCAACCAGCAGC
CATACCCCCAGTTCATGGCCACCGTGGTGTATAAGGTGTTTCAGACTCTGCACAGCACCGGGCAGTCGTC
CATGGTCCGGGACTGGGTCATGCTGTCCCTCTCCAACTTCACGCAGAGGGCCCCGGTCGCCATGGCCACG
TGGAGCCTCTCCTGCTTCTTTGTCAGCGCGTCCACCAGCCCGTGGGTCGCGGCGATCCTCCCACATGTCA
TCAGCAGGATGGGCAAGCTGGAGCAGGTGGACGTGAACCTTTTCTGCCTGGTCGCCACAGACTTCTACAG
ACACCAGATAGAGGAGGAGCTCGACCGCAGGGCCTTCCAGTCTGTGCTTGAGGTGGTTGCAGCCCCAGGA
AGCCCATATCACCGGCTGCTGACTTGTTTACGAAATGTCCACAAGGTCACCACCTGCTGAGCGCCATGGT
GGGAGAGACTGTGAGGCGGCAGCTGGGGCCGGAGCCTTTGGAAGTCTGCGCCCTTGTGCCCTGCCTCCAC
CGAGCCAGCTTGGTCCCTATGGGCTTCCGCACATGCCGCGGGCGGCCAGGCAACGTGCGTGTCTCTGCCA
TGTGGCAGAAGTGCTCTTTGTGGCAGTGGCCAGGCAGGGAGTGTCTGCAGTCCTGGTGGGGCTGAGCCTG
AGGCCTTCCAGAAAGCAGGAGCAGCTGTGCTGCACCCCATGTGGGTGACCAGGTCCTTTCTCCTGATAGT
CACCTGCTGGTTGTTGCCAGGTTGCAGCTGCTCTTGCATCTGGGCCAGAAGTCCTCCCTCCTGCAGGCTG
GCTGTTGGCCCCTCTGCTGTCCTGCAGTAGAAGGTGCCGTGAGCAGGCTTTGGGAACACTGGCCTGGGTC
TCCCTGGTGGGGTGTGCATGCCACGCCCCGTGTCTGGATGCACAGATGCCATGGCCTGTGCTGGGCCAGT
GGCTGGGGGTGCTAGACACCCGGCACCATTCTCCCTTCTCTCTTTTCTTCTCAGGATTTAAAATTTAATT
ATATCAGTAAAGAGATTAATTTTAACGTAACTCTTTCTATGCCCGTGTAAAGTATGTGAATCGCAAGGCC
TGTGCTGCATGCGACAGCGTCCGGGGTGGTGGACAGGGCCCCCGGCCACGCTCCCTCTCCTGTAGCCACT
GGCATAGCCCTCCTGAGCACCCGCTGACATTTCCGTTGTACATGTTCCTGTTTATGCATTCACAAGGTGA
CTGGGATGTAGAGAGGCGTTAGTGGGCAGGTGGCCACAGCAGGACTGAGGACAGGCCCCCATTATCCTAG
GGGTGCGCTCACCTGCAGCCCCTCCTCCTCGGGCACAGACGACTGTCGTTCTCCACCCACCAGTCAGGGA
CAGCAGCCTCCCTGTCACTCAGCTGAGAAGGCCAGCCCTCCCTGGCTGTGAGCAGCCTCCACTGTGTCCA
GAGACATGGGCCTCCCACTCCTGTTCCTTGCTAGCCCTGGGGTGGCGTCTGCCTAGGAGCTGGCTGGCAG
GTGTTGGGACCTGCTGCTCCATGGATGCATGCCCTAAGAGTGTCACTGAGCTGTGTTTTGTCTGAGCCTC
TCTCGGTCAACAGCAAAGCTTGGTGTCTTGGCACTGTTAGTGACAGAGCCCAGCATCCCTTCTGCCCCCG
TTCCAGCTGACATCTTGCACGGTGACCCCTTTTAGTCAGGAGAGTGCAGATCTGTGCTCATCGGAGACTG
CCCCACGGCCCTGTCAGAGCCGCCACTCCTATCCCCAGGCCAGGTCCCTGGACCAGCCTCCTGTTTGCAG
GCCCAGAGGAGCCAAGTCATTAAAATGGAAGTGGATTCTGGATGGCCGGGCTGCTGCTGATGTAGGAGCT
GGATTTGGGAGCTCTGCTTGCCGACTGGCTGTGAGACGAGGCAGGGGCTCTGCTTCCTCAGCCCTAGAGG
CGAGCCAGGCAAGGTTGGCGACTGTCATGTGGCTTGGTTTGGTCATGCCCGTCGATGTTTTGGGTATTGA
ATGTGGTAAGTGGAGGAAATGTTGGAACTCTGTGCAGGTGCTGCCTTGAGACCCCCAAGCTTCCACCTGT
CCCTCTCCTATGTGGCAGCTGGGGAGCAGCTGAGATGTGGACTTGTATGCTGCCCACATACGTGAGGGGG
AGCTGAAAGGGAGCCCCTCCTCTGAGCAGCCTCTGCCAGGCCTGTATGAGGCTTTTCCCACCAGCTCCCA
ACAGAGGCCTCCCCCAGCCAGGACCACCTCGTCCTCGTGGCGGGGCAGCAGGAGCGGTAGAAAGGGGTCC
GATGTTTGAGGAGGCCCTTAAGGGAAGCTACTGAATTATAACACGTAAGAAAATCACCATTCCGTATTGG
TTGGGGGCTCCTGTTTCTCATCCTAGCTTTTTCCTGGAAAGCCCGCTAGAAGGTTTGGGAACGAGGGGAA
AGTTCTCAGAACTGTTGGCTGCTCCCCACCCGCCTCCCGCCTCCCCCGCAGGTTATGTCAGCAGCTCTGA
GACAGCAGTATCACAGGCCAGATGTTGTTCCTGGCTAGATGTTTACATTTGTAAGAAATAACACTGTGAA
TGTAAAACAGAGCCATTCCCTTGGAATGCATATCGCTGGGCTCAACATAGAGTTTGTCTTCCTCTTGTTT
ACGACGTGATCTAAACCAGTCCTTAGCAAGGGGCTCAGAACACCCCGCTCTGGCAGTAGGTGTCCCCCAC
CCCCAAAGACCTGCCTGTGTGCTCCGGAGATGAATATGAGCTCATTAGTAAAAATGACTTCACCCACGCA
TATACATAAAGTATCCATGCATGTGCATATAGACACATCTATAATTTTACACACACACCTCTCAAGACGG
AGATGCATGGCCTCTAAGAGTGCCCGTGTCGGTTCTTCCTGGAAGTTGACTTTCCTTAGACCCGCCAGGT
CAAGTTAGCCGCGTGACGGACATCCAGGCGTGGGACGTGGTCAGGGCAGGGCTCATTCATTGCCCACTAG
GATCCCACTGGCGAAGATGGTCTCCATATCAGCTCTCTGCAGAAGGGAGGAAGACTTTATCATGTTCCTA
AAAATCTGTGGCAAGCACCCATCGTATTATCCAAATTTTGTTGCAAATGTGATTAATTTGGTTGTCAAGT
TTTGGGGGTGGGCTGTGGGGAGATTGCTTTTGTTTTCCTGCTGGTAATATCGGGAAAGATTTTAATGAAA
CCAGGGTAGAATTGTTTGGCAATGCACTGAAGCGTGTTTCTTTCCCAAAATGTGCCTCCCTTCCGCTGCG
GGCCCAGCTGAGTCTATGTAGGTGATGTTTCCAGCTGCCAAGTGCTCTTTGTTACTGTCCACCCTCATTT
CTGCCAGCGCATGTGTCCTTTCAAGGGGAAAATGTGAAGCTGAACCCCCTCCAGACACCCAGAATGTAGC
ATCTGAGAAGGCCCTGTGCCCTAAAGGACACCCCTCGCCCCCATCTTCATGGAGGGGGTCATTTCAGAGC
CCTCGGAGCCAATGAACAGCTCCTCCTCTTGGAGCTGAGATGAGCCCCACGTGGAGCTCGGGACGGATAG
TAGACAGCAATAACTCGGTGTGTGGCCGCCTGGCAGGTGGAACTTCCTCCCGTTGCGGGGTGGAGTGAGG
TTAGTTCTGTGTGTCTGGTGGGTGGAGTCAGGCTTCTCTTGCTACCTGTGAGCATCCTTCCCAGCAGACA
TCCTCATCGGGCTTTGTCCCTCCCCCGCTTCCTCCCTCTGCGGGGAGGACCCGGGACCACAGCTGCTGGC
CAGGGTAGACTTGGAGCTGTCCTCCAGAGGGGTCACGTGTAGGAGTGAGAAGAAGGAAGATCTTGAGAGC
TGCTGAGGGACCTTGGAGAGCTCAGGATGGCTCAGACGAGGACACTCGCTTGCCGGGCCTGGGCCTCCTG
GGAAGGAGGGAGCTGCTCAGAATGCCGCATGACAACTGAAGGCAACCTGGAAGGTTCAGGGGCCGCTCTT
CCCCCATGTGCCTGTCACGCTCTGGTGCAGTCAAAGGAACGCCTTCCCCTCAGTTGTTTCTAAGAGCAGA
GTCTCCCGCTGCAATCTGGGTGGTAACTGCCAGCCTTGGAGGATCGTGGCCAACGTGGACCTGCCTACGG
AGGGTGGGCTCTGACCCAAGTGGGGCCTCCTTGTCCAGGTCTCACTGCTTTGCACCGTGGTCAGAGGGAC
TGTCAGCTGAGCTTGAGCTCCCCTGGAGCCAGCAGGGCTGTGATGGGCGAGTCCCGGAGCCCCACCCAGA
CCTGAATGCTTCTGAGAGCAAAGGGAAGGACTGACGAGAGATGTATATTTAATTTTTTAACTGCTGCAAA
CATTGTACATCCAAATTAAAGGAAAAAAATGGAAACCATCAAAAAAAAAAAAAAAAAA
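A pasted FASTA record such as the one above can be read back and summarised with a small parser. The following Python sketch is illustrative only: read_fasta and gc_fraction are hypothetical helper names, and the example input is just the first 20 bases of the record, split over two lines to mimic FASTA wrapping.

```python
def read_fasta(text):
    """Parse FASTA text into a {header: sequence} dict (headers without '>')."""
    records, header, chunks = {}, None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue                      # skip blank lines
        if line.startswith(">"):
            if header is not None:
                records[header] = "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.upper())   # sequence lines are concatenated
    if header is not None:
        records[header] = "".join(chunks)
    return records

def gc_fraction(sequence):
    """Fraction of G and C bases; 0.0 for an empty sequence."""
    if not sequence:
        return 0.0
    return (sequence.count("G") + sequence.count("C")) / len(sequence)

# Toy input: only the first 20 bases of the record, to keep the example short.
example = ">NM_002111.8 Homo sapiens huntingtin (HTT), mRNA\nGCTGCCGGGA\nCGGGTCCAAG\n"
for header, seq in read_fasta(example).items():
    print(header.split()[0], len(seq), round(gc_fraction(seq), 2))
```

Run on the full record, the same len(seq) call gives a direct check of the mRNA length reported in section 6.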
7. Uniprot/Swiss-Prot Database
What is the putative function of the protein?
Answer:
May play a role in microtubule-mediated transport or vesicle function.
Beta tubulin binding
Dynein intermediate chain binding
Heat shock protein binding
Identical protein binding
Ion channel binding
Kinase binding
P53 binding
Protein binding
Transcription factor binding
Biological process
Animal organ development
Apoptotic process
Establishment of mitotic spindle orientation
Golgi organisation
Regulation of extrinsic apoptotic signalling pathway.
Alternative names
HD
IT15
HD_Human
Huntingtin
Intracellular localisation
Nucleus

Other location: cytoplasm
The mutant huntingtin protein colocalizes with AKAP8L in the nuclear matrix of neurons in
Huntington disease. It shuttles between the cytoplasm and the nucleus in a Ran GTPase-independent manner.
Key domains
Pfam1
sd00044
PRINTS1
InterPro1
Full length
3,142
Post-translational modification
Cleaved by apopain downstream of the polyglutamine stretch. The resulting N-terminal
fragment is cytotoxic and initiates apoptosis.
Phosphorylation at Ser-1179 and Ser-1199 by CDK5 in response to DNA damage in the nuclei
of neurons protects neurons against polyglutamine expansion as well as DNA damage mediated toxicity.
8. Basic Analysis of Evolutionarily Conserved Sequences
Reasonable homologous matches
250
Best matched non-human sequence
A0A2J8K6F3_PANTR – HTT isoform 1 from Pan troglodytes (chimpanzee)
Mouse homologue
G3X9H5_MOUSE - huntingtin, Mus musculus
Match score – 90.6%
Dog homologue
FQPPPPPPPPPP7Y7_CANLF - Canis lupus familiaris
Match score- 91.2%
Natural variant
MATLEKLMKAFESLKSFQQQQQQQQQQQQQQQQQQQQQPPPPPPPPPPPQLPQPPPQAQP
LLPQPQPPPPPPPPPPGPAVAEEPLHRPKKELSATKKDR
Why natural variants are important
To find the exact location of a mutation in the gene.
To find the effect of the variation on the protein.
E values
The expectation value (E value) is the number of different alignments with scores equal
to or better than S that are expected to occur in a database search by chance. The lower the E value, the
more significant the score and the alignment.
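The relationship between score and E value can be made concrete with a small calculation. This Python sketch assumes the simplified bit-score form E = m * n * 2^(-S'), where m is the query length, n the total database length and S' the bit score; this formula is not given in the assignment, and real BLAST output applies further corrections. The database size below is made up for illustration.

```python
def expect_value(bit_score, query_length, database_length):
    """E = m * n * 2**(-S'), simplified: no edge-effect or composition corrections."""
    return query_length * database_length * 2.0 ** (-bit_score)

m = 3142            # length of human huntingtin, as listed in section 7
n = 1_000_000_000   # hypothetical database size, chosen only for the example
for s in (40, 60, 80):
    print(s, expect_value(s, m, n))
```

Each additional bit of score halves the E value, which is why high-scoring alignments have E values very close to zero.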
Identity
Identity is the extent to which two sequences have the same residues at the same positions in an alignment,
generally expressed as a percentage.
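As a worked illustration of this definition, the Python sketch below computes percent identity for two already aligned sequences. It assumes identity is counted over all alignment columns, with gap columns treated as mismatches; other tools use different denominators, so the numbers are not directly comparable with BLAST output. The function name is hypothetical and the second pair of strings is a toy example.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Identical, non-gap columns as a percentage of all alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        return 0.0
    matches = sum(a == b and a != gap for a, b in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)

# Ten-residue fragment where human has T and the macaque sequence has S.
print(percent_identity("LLPCLTRTSK", "LLPCLSRTSK"))
# Toy pair with a gapped region in the second sequence.
print(percent_identity("VELIAGGGSS", "VELIG-----"))
```

The first pair differs at one of ten positions (90% identity); in the second, the gap columns pull the value down to 40%.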
Are two proteins more highly conserved if they have a higher percentage of positives or identities?
Answer- Yes
What is meant by PREDICTED in the matched sequences? Is a Blast search helpful in these cases and
if so why?
Answer
It means that the sequence record was predicted by automated computational analysis
and annotated using a gene prediction method.
A BLAST search is helpful in these cases because it aligns the predicted sequence against matching
sequences from many organisms and returns a large number of results ranked by best match.
9. Multiple Sequence Alignments across Different Species
sp|P42858|HD_HUMAN MATLEKLMKAFESLKSFQQQQQQQQQQQQQQQQQQQQQPPPPPPPPPPPQLPQPPPQAQP 60
tr|A0A2K6QU33|A0A2K6QU33_RHIRO ----------------FFETES-------------------------------R-SVARA 12
tr|A0A2K6AF47|A0A2K6AF47_MANLE ------------------------------------------------------------ 0
tr|A0A2K6CY39|A0A2K6CY39_MACNE MATLEKLMKAFESLKSFQQQQQQ----------QQQQQPPPPPPPPPPPQLPQP-PQAQP 49
sp|P42858|HD_HUMAN LLPQPQPPPPPPPPPPGPAVAEEPLHRPKKELSATKKDRVNHCLTICENIVAQSVRNSPE 120
tr|A0A2K6QU33|A0A2K6QU33_RHIRO GVQWPN-LSSLQAPPPGFTLGLQAPATSKKELSATKKDRVNHCLTICENIVAQSVRNSPE 71
tr|A0A2K6AF47|A0A2K6AF47_MANLE SVPQRNLGA-LQPPPPGF-KQFLCLSLPSKELSATKKDRVNHCLTICENIVAQSVRNSPE 58
tr|A0A2K6CY39|A0A2K6CY39_MACNE MLPQPQPPPPPPPPPPGPAVAEEPLHRPKKELSATKKDRVNHCLTICENIVAQSVRNSPE 109
: : **** .*******************************
sp|P42858|HD_HUMAN FQKLLGIAMELFLLCSDDAESDVRMVADECLNKVIKALMDSNLPRLQLELYKEIKKNGAP 180
tr|A0A2K6QU33|A0A2K6QU33_RHIRO FQKLLGIAMELFLLCSDDAESDVRMVADECLNKVIKALMDSNLPRLQLELYKEIKKNGAP 131
tr|A0A2K6AF47|A0A2K6AF47_MANLE FQKLLGIAMELFLLCSDDAESDVRMVADECLNKVIKALMDSNLPRLQLELYKEIKKNGAP 118
tr|A0A2K6CY39|A0A2K6CY39_MACNE FQKLLGIAMELFLLCSDDAESDVRMVADECLNKVIKALMDSNLPRLQLELYKEIKKNGAP 169
************************************************************
sp|P42858|HD_HUMAN RSLRAALWRFAELAHLVRPQKCRPYLVNLLPCLTRTSKRPEESVQETLAAAVPKIMASFG 240
tr|A0A2K6QU33|A0A2K6QU33_RHIRO RSLRAALWRFAELAHLVRPQKCRPYLVNLLPCLSRTSKRPEESVQETLAAAVPKIMASFG 191
tr|A0A2K6AF47|A0A2K6AF47_MANLE RSLRAALWRFAELAHLVRPQKCRPYLVNLLPCLSRTSKRPEESVQETLAAAVPKIMASFG 178
tr|A0A2K6CY39|A0A2K6CY39_MACNE RSLRAALWRFAELAHLVRPQKCRPYLVNLLPCLSRTSKRPEESVQETLAAAVPKIMASFG 229
*********************************:**************************
sp|P42858|HD_HUMAN NFANDNEIKVLLKAFIANLKSSSPTIRRTAAGSAVSICQHSRRTQYFYSWLLNVLLGLLV 300
tr|A0A2K6QU33|A0A2K6QU33_RHIRO NFANDNEIKVLLKAFIANLKSSSPTIRRTAAGSAVSICQHSRRTQYFYSWLLNVLLGLLV 251
tr|A0A2K6AF47|A0A2K6AF47_MANLE NFANDNEIKVLLKAFIANLKSSSPTIRRTAAGSAVSICQHSRRTQYFYSWLLNVLLGLLV 238
tr|A0A2K6CY39|A0A2K6CY39_MACNE NFANDNEIKVLLKAFIANLKSSSPTIRRTAAGSAVSICQHSRRTQYFYSWLLNVLLGLLV 289
************************************************************
sp|P42858|HD_HUMAN PVEDEHSTLLILGVLLTLRYLVPLLQQQVKDTSLKGSFGVTRKEMEVSPSAEQLVQVYEL 360
tr|A0A2K6QU33|A0A2K6QU33_RHIRO PVEEEHSTLLILGVLLTLRYLVPLLQQQVKDTSLKGSFGVTRKEMEVSPSAEQLVQVYEL 311
tr|A0A2K6AF47|A0A2K6AF47_MANLE PVEEEHSTLLILGVLLTLRYLVPLLQQQVKDTSLKGSFGVTRKEMEVSPSAEQLVQVYEL 298
tr|A0A2K6CY39|A0A2K6CY39_MACNE PVEEEHSTLLILGVLLTLRYLVPLLQQQVKDTSLKGSFGVTRKEMEVSPSAEQLVQVYEL 349
***:********************************************************
sp|P42858|HD_HUMAN TLHHTQHQDHNVVTGALELLQQLFRTPPPELLQTLTAVGGIGQLTAAKEESGGRSRSGSI 420
tr|A0A2K6QU33|A0A2K6QU33_RHIRO TLHHTQHQDHNVVTGALELLQQLFRTPPPELLQALTTVGGIGQLTAAKEESGGRSRSGSI 371
tr|A0A2K6AF47|A0A2K6AF47_MANLE TLHHTQHQDHNVVTGALELLQQLFRTPPPELLQALTTVGGIGQLTAAKEESGGRSRSGSI 358
tr|A0A2K6CY39|A0A2K6CY39_MACNE TLHHTQHQDHNVVTGALELLQQLFRTPPPELLQALTTVGGIGQLTAAKEESGGRSRSGSI 409
*********************************:**:***********************
sp|P42858|HD_HUMAN VELIAGGGSSCSPVLSRKQKGKVLLGEEEALEDDSESRSDVSSSALTASVKDEISGELAA 480
tr|A0A2K6QU33|A0A2K6QU33_RHIRO VELIG--------MLVHYLLGKVLLGEEEALEDDSESRSDVSSSAFAASVKDDISGELAT 423
tr|A0A2K6AF47|A0A2K6AF47_MANLE VELIG--------MLVHYLLGKVLLGEEEALEDDSESRSDVSSSAFAASVKDDISGELAT 410
tr|A0A2K6CY39|A0A2K6CY39_MACNE VELIG--------MLVHYLLGKVLLGEEEALEDDSESRSDVSSSAFAASVKDDISGELAT 461
****. :* : *************************::*****:******:

sp|P42858|HD_HUMAN SSGVSTPGSAGHDIITEQPRSQHTLQADSVDLASCDLTSSATDGDEEDILSHSSSQVSAV 540
tr|A0A2K6QU33|A0A2K6QU33_RHIRO SSGVSTPGSAGHDIITEQPRSQHTLQADSVDLASCDLTSSATDGDEEDILSHSSSQVSAV 483
tr|A0A2K6AF47|A0A2K6AF47_MANLE SSGVSTPGSTGHDIITEQPRSQHTLQADSVDLASCDLTSSATDGDEEDILSHSSSQVSAV 470
tr|A0A2K6CY39|A0A2K6CY39_MACNE SSGVSTPGSAGHDIITEQPRSQHTLQADSVDLASCDLTSSATDGDEEDILSHSSSQVSAV 521
*********:**************************************************
sp|P42858|HD_HUMAN PSDPAMDLNDGTQASSPISDSSQTTTEGPDSAVTPSDSSEIVLDGTDNQYLGLQIGQPQD 600
tr|A0A2K6QU33|A0A2K6QU33_RHIRO PSDPAMDLNDGTQASSPISDSSQTTTEGPDSAVTPSDSSEIVLDGTDNQYLGLQIGQPQD 543
tr|A0A2K6AF47|A0A2K6AF47_MANLE PSDPAMDLNDGTQASSPISDSSQTTTEGPDSAVTPSDSSEIVLDGTDNQYLGLQIGQPQD 530
tr|A0A2K6CY39|A0A2K6CY39_MACNE PSDPAMDLNDGTQASSPISDSSQTTTEGPDSAVTPSDSSEIVLDGTDNQYLGLQIGQPQD 581
************************************************************
sp|P42858|HD_HUMAN EDEEATGILPDEASEAFRNSSMALQQAHLLKNMSHCRQPSDSSVDKFVLRDEATEPGDQE 660
tr|A0A2K6QU33|A0A2K6QU33_RHIRO EDEEATGVLPDAASEAFRNSSMALQQAHLLKNMSHSRQPSDSSVDKFVLRDEATEPGDQE 603
tr|A0A2K6AF47|A0A2K6AF47_MANLE EDEEATGVLPDEASEAFRNSSMALQQAHLLKNMSHSRQPSDSSVDKFVLRDEATEPGDQE 590
tr|A0A2K6CY39|A0A2K6CY39_MACNE EDEEATGVLPDEASEAFRNSSMALQQAHLLKNMSHSRQPSDSSVDKFVLRDEATEPGDQE 641
*******:*** ***********************.************************
sp|P42858|HD_HUMAN NKPCRIKGDIGQSTDDDSAPLVHCVRLLSASFLLTGGKNVLVPDRDVRVSVKALALSCVG 720
tr|A0A2K6QU33|A0A2K6QU33_RHIRO NKPCRIKGDIGRSNDDDSAPLVHCVRLLSASFLLTGGKNVLVPDRDVRVSVKALALSCVG 663
tr|A0A2K6AF47|A0A2K6AF47_MANLE NKPCRIKGDIGQSTDDDSAPLVHCVRLLSASFLLTGGKNVLVPDRDVRVSVKALALSCVG 650
tr|A0A2K6CY39|A0A2K6CY39_MACNE NKPCRIKGDIGQSTDDDSAPLVHCVRLLSASFLLTGGKNVLVPDRDVRVSVKALALSCVG 701
***********:*.**********************************************
sp|P42858|HD_HUMAN AAVALHPESFFSKLYKVPLDTTEYPEEQYVSDILNYIDHGDPQVRGATAILCGTLICSIL 780
tr|A0A2K6QU33|A0A2K6QU33_RHIRO AAVALHPESFFSKLYKVPLDTTEYPEEQYVSDILNYIDHGDPQVRGATAILCGTLICSIL 723
tr|A0A2K6AF47|A0A2K6AF47_MANLE AAVALHPESFFSKLYKVPLDTTEYPEEQYVSDILNYIDHGDPQVRGATAILCGTLICSIL 710
tr|A0A2K6CY39|A0A2K6CY39_MACNE AAVALHPESFFSKLYKVPLDTTEYPEEQYVSDILNYIDHGDPQVRGATAILCGTLICSIL 761
************************************************************
sp|P42858|HD_HUMAN SRSRFHVGDWMGTIRTLTGNTFSLADCIPLLRKTLKDESSVTCKLACTAVRNCVMSLCSS 840
tr|A0A2K6QU33|A0A2K6QU33_RHIRO SRSRFHVGDWMGAIRTLTGNTFSLVDCIPLLRKTLKDESSVTCKLACTAVRHCVMSLCSS 783
tr|A0A2K6AF47|A0A2K6AF47_MANLE SRSRFHVGDWMGAIRTLTGNTFSLADCIPLLRKTLKDESSVTCKLACTAVRHCVMSLCSS 770
tr|A0A2K6CY39|A0A2K6CY39_MACNE SRSRFHVGDWMGAIRTLTGNTFSLADCIPLLRETLKDESSVTCKLACTAVRHCVMSLCSS 821
************:***********.*******:******************:********
sp|P42858|HD_HUMAN SYSELGLQLIIDVLTLRNSSYWLVRTELLETLAEIDFRLVSFLEAKAENLHRGAHHYTGL 900
tr|A0A2K6QU33|A0A2K6QU33_RHIRO SYSELGLQLIIDVLTLRNSSYWLVRTELLETLAEIDFRLVSFLEAKAENLHRGAHHYTGL 843
tr|A0A2K6AF47|A0A2K6AF47_MANLE SYSELGLQLIIDVLTLRNSSYWLVRTELLETLAEIDFRLVSFLEAKAENLHRGAHHYTGL 830
tr|A0A2K6CY39|A0A2K6CY39_MACNE SYSELGLQLIIDVLTLRNSSYWLVRTELLETLAEIDFRLVSFLEAKAENLHRGAHHYTGL 881
************************************************************
sp|P42858|HD_HUMAN LKLQERVLNNVVIHLLGDEDPRVRHVAAASLIRLVPKLFYKCDQGQADPVVAVARDQSSV 960
tr|A0A2K6QU33|A0A2K6QU33_RHIRO LKLQERVLNNVVIHLLGDEDPRVRHVAAASLIRLVPKLFYKCDQGQADPVVAVARDQSSV 903
tr|A0A2K6AF47|A0A2K6AF47_MANLE LKLQERVLNNVVIHLLGDEDPRVRHVAAASLIRLVPKLFYKCDQGQADPVVAVARDQSSV 890
tr|A0A2K6CY39|A0A2K6CY39_MACNE LKLQERVLNNVVIHLLGDEDPRVRHVAAASLIRLVPKLFYKCDQGQADPVVAVARDQSSV 941
************************************************************
sp|P42858|HD_HUMAN YLKLLMHETQPPSHFSVSTITRIYRGYNLLPSITDVTMENNLSRVIAAVSHELITSTTRA 1020
tr|A0A2K6QU33|A0A2K6QU33_RHIRO YLKLLMHETQPPSHFSVSTITRIYRGYNLLPSITDVTMENNLSRVIAAVSHELITSTTRA 963
tr|A0A2K6AF47|A0A2K6AF47_MANLE YLKLLMHETQPPSHFSVSTITRIYRGYNLLPSITDVTMENNLSRVIAAVSHELITSTTRA 950
tr|A0A2K6CY39|A0A2K6CY39_MACNE YLKLLMHETQPPSHFSVSTITRIYRGYNLLPSITDVTMENNLSRVIAAVSHELITSTTRA 1001
************************************************************
sp|P42858|HD_HUMAN LTFGCCEALCLLSTAFPVCIWSLGWHCGVPPLSASDESRKSCTVGMATMILTLLSSAWFP 1080
tr|A0A2K6QU33|A0A2K6QU33_RHIRO LTFGCCEALCLLSTAFPVCIWSLGWHCGVPPLSASDESRKSCTVGMATMILTLLSSAWFP 1023
tr|A0A2K6AF47|A0A2K6AF47_MANLE LTFGCCEALCLLSTAFPVCIWSLGWHCGVPPLSASDESRKSCTVGMATMILTLLSSAWFP 1010
tr|A0A2K6CY39|A0A2K6CY39_MACNE LTFGCCEALCLLSTAFPVCIWSLGWHCGVPPLSASDESRKSCTVGMATMILTLLSSAWFP 1061
************************************************************
sp|P42858|HD_HUMAN LDLSAHQDALILAGNLLAASAPKSLRSSWASEEEANPAATKQEEVWPALGDRALVPMVEQ 1140
tr|A0A2K6QU33|A0A2K6QU33_RHIRO LDLSAHQDALILAGNLLAASAPKSLRSSWASEEEANPAATKQEEVWPALGDRALVPMVEQ 1083
tr|A0A2K6AF47|A0A2K6AF47_MANLE LDLSAHQDALILAGNLLAASAPKSLRSSWASEEEANPAATKQEEVWPALGDRALVPMVEQ 1070
tr|A0A2K6CY39|A0A2K6CY39_MACNE LDLSAHQDALILAGNLLAASAPKSLRSSWASEEEANPAATKQEEVWPALGDRALVPMVEQ 1121
************************************************************
sp|P42858|HD_HUMAN LFSHLLKVINICAHVLDDVAPGPAIKAALPSLTNPPSLSPIRRKGKEKEPGEQASVPLSP 1200
tr|A0A2K6QU33|A0A2K6QU33_RHIRO LFSHLLKVINICAHVLDDVAPGPAIKAALPSLTNPPSLSPIRRKGKEKEPGEQASVPLSP 1143
tr|A0A2K6AF47|A0A2K6AF47_MANLE LFSHLLKVINICAHVLDDVAPGPAIKAALPSLTNPPSLSPIRRKGKEKEPGEQASVPLSP 1130
tr|A0A2K6CY39|A0A2K6CY39_MACNE LFSHLLKVINICAHVLDDVAPGPAIKAALPSLTNPPSLSPIRRKGKEKEPGEQASVPLSP 1181
************************************************************
sp|P42858|HD_HUMAN KKGSEASAASRQSDTSGPVTTSKSSSLGSFYHLPSYLKLHDVLKATHANYKVTLDLQNST 1260
tr|A0A2K6QU33|A0A2K6QU33_RHIRO KKGSEASAASRQSDTSGPVTTSKSSSLGSFYHLPSYLKLHDVLKATHANYKVTLDLQNST 1203
tr|A0A2K6AF47|A0A2K6AF47_MANLE KKGSEASAASRQSDTSGPVTTSKSSSLGSFYHLPSYLKLHDVLKATHANYKVTLDLQNST 1190
tr|A0A2K6CY39|A0A2K6CY39_MACNE KKGSEASAASRQSDTSGPVTTSKSSSLGSFYHLPSYLKLHDVLKATHANYKVTLDLQNST 1241
************************************************************
sp|P42858|HD_HUMAN EKFGGFLRSALDVLSQILELATLQDIGKCVEEILGYLKSCFSREPMMATVCVQQLLKTLF 1320
tr|A0A2K6QU33|A0A2K6QU33_RHIRO EKFGGFLRSALDVLSQILELATLQDIGKCVEEILGYLKSCFSREPMMATVCVQQLLKTLF 1263
tr|A0A2K6AF47|A0A2K6AF47_MANLE EKFGGFLRSALDVLSQILELATLQDIGKVCVLF---LKCCS--------CHVVCLLKTLF 1239
tr|A0A2K6CY39|A0A2K6CY39_MACNE EKFGGFLRSALDVLSQILELATLQDIGKCVEEILGYLKSCFSREPMMATVCVQQLLKTLF 1301
**************************** : **.* * ******
sp|P42858|HD_HUMAN GTNLASQFDGLSSNPSKSQGRAQRLGSSSVRPGLYHYCFMAPYTHFTQALADASLRNMVQ 1380
tr|A0A2K6QU33|A0A2K6QU33_RHIRO GTNLASQFDGLSSNPSKSQGRAQRLGSSSVRPGLYHYCFMAPYTHFTQALADASLRNMVQ 1323
tr|A0A2K6AF47|A0A2K6AF47_MANLE GTNLASQFDGLSSNPSKSQGRAQRLGSSSVRPGLYHYCFMAPYTHFTQALADASLRNMVQ 1299
tr|A0A2K6CY39|A0A2K6CY39_MACNE GTNLASQFDGLSSNPSKSQGRAQRLGSSSVRPGLYHYCFMAPYTHFTQALADASLRNMVQ 1361
************************************************************
sp|P42858|HD_HUMAN AEQENDTSGWFDVLQKVSTQLKTNLTSVTKNRADKNAIHNHIRLFEPLVIKALKQYTTTT 1440
tr|A0A2K6QU33|A0A2K6QU33_RHIRO AEQEHDTSGWFDVLQKVSTQLKTNLTSVTKSRADKNAIHNHIRLFEPLVIKALKQYTTTT 1383
tr|A0A2K6AF47|A0A2K6AF47_MANLE AEQEHDTSGWFDVLQKVSTQLKTNLTSVTKNRADKNAIHNHIRLFEPLVIKALKQYTTTT 1359
tr|A0A2K6CY39|A0A2K6CY39_MACNE AEQEHDTSGWFDVLQKVSTQLKTNLTSVTKNRADKNAIHNHIRLFEPLVIKALKQYTTTT 1421
****:*************************.*****************************
sp|P42858|HD_HUMAN CVQLQKQVLDLLAQLVQLRVNYCLLDSDQVFIGFVLKQFEYIEVGQFRESEAIIPNIFFF 1500
tr|A0A2K6QU33|A0A2K6QU33_RHIRO SVQLQKQVLDLLAQLVQLRVNYCLLDSDQVFIGFVLKQFEYIEVGQFRESEAIIPNIFFF 1443
tr|A0A2K6AF47|A0A2K6AF47_MANLE SVQLQKQVLDLLAQLVQLRVNYCLLDSDQVFIGFVLKQFEYIEVGQFRESEAIIPNIFFF 1419
tr|A0A2K6CY39|A0A2K6CY39_MACNE SVQLQKQVLDLLAQLVQLRVNYCLLDSDQVFIGFVLKQFEYIEVGQFRESEAIIPNIFFF 1481
.***********************************************************
sp|P42858|HD_HUMAN LVLLSYERYHSKQIIGIPKIIQLCDGIMASGRKAVTHAIPALQPIVHDLFVLRGTNKADA 1560
tr|A0A2K6QU33|A0A2K6QU33_RHIRO LVLLSYERYHSKQIIGIPKIIQLCDGIMASGRKAVTHAIPALQPIVHDLFVLRGTNKADA 1503
tr|A0A2K6AF47|A0A2K6AF47_MANLE LVLLSYERYHSKQIIGIPKIIQLCDGIMASGRKAVTHAIPALQPIVHDLFVLRGTNKADA 1479
tr|A0A2K6CY39|A0A2K6CY39_MACNE LVLLSYERYHSKQIIGIPKIIQLCDGIMASGRKAVTHAIPALQPIVHDLFVLRGTNKADA 1541
************************************************************
sp|P42858|HD_HUMAN GKELETQKEVVVSMLLRLIQYHQVLEMFILVLQQCHKENEDKWKRLSRQIADIILPMLAK 1620
tr|A0A2K6QU33|A0A2K6QU33_RHIRO GKELETQKEVVVSMLLRLIQYHQVLEMFILVLQQCHKENEDKWKRLSRQIADIILPMLAK 1563
tr|A0A2K6AF47|A0A2K6AF47_MANLE GKELETQKEVVVSMLLRLIQYHQVLEMFILVLQQCHKENEDKWKRLSRQIADIILPMLAK 1539
tr|A0A2K6CY39|A0A2K6CY39_MACNE GKELETQKEVVVSMLLRLIQYHQVLEMFILVLQQCHKENEDKWKRLSRQIADIILPMLAK 1601
************************************************************
sp|P42858|HD_HUMAN QQMHIDSHEALGVLNTLFEILAPSSLRPVDMLLRSMFVTPNTMASVSTVQLWISGILAIL 1680
tr|A0A2K6QU33|A0A2K6QU33_RHIRO QQMHIDSHEALGVLNTLFEILAPSSLRPVDMLLRSMFVTPNTMASVSTVQLWISGILAIL 1623
tr|A0A2K6AF47|A0A2K6AF47_MANLE QQMHIDSHEALGVLNTLFEILAPSSLRPVDMLLRSMFVTPNTMASVSTVQLWISGILAIL 1599
tr|A0A2K6CY39|A0A2K6CY39_MACNE QQMHIDSHEALGVLNTLFEILAPSSLRPVDMLLRSMFVTPNTMASVSTVQLWISGILAIL 1661
************************************************************
sp|P42858|HD_HUMAN RVLISQSTEDIVLSRIQELSFSPYLISCTVINRLRDGDSTSTLEEHSEGKQIKNLPEETF 1740
tr|A0A2K6QU33|A0A2K6QU33_RHIRO RVLISQSTEDIVLSRIQELSFSPYLISCPVINRLRDGDSNSALEEHSEGKQIKNLPEETF 1683
tr|A0A2K6AF47|A0A2K6AF47_MANLE RVLISQSTEDIVLSRIQELSFSPYLISCPVINRLRDGDSNSALEEHSEGKQIKNLPEETF 1659
tr|A0A2K6CY39|A0A2K6CY39_MACNE RVLISQSTEDIVLSRIQELSFSPYLISCPVINRLRDGDSNSALEEHSEGKQIKNLPEETF 1721
**************************** **********.*:******************
sp|P42858|HD_HUMAN SRFLLQLVGILLEDIVTKQLKVEMSEQQHTFYCQELGTLLMCLIHIFKSGMFRRITAAAT 1800
tr|A0A2K6QU33|A0A2K6QU33_RHIRO SRFLLQLVGILLEDIVTKQLKVEMSEQQHTFYCQELGTLLMCLIHIFKSGMFRRITAAAT 1743
tr|A0A2K6AF47|A0A2K6AF47_MANLE SRFLLQLVGILLEDIVTKQLKVEMSEQQHTFYCQELGTLLMCLIHIFKSGMFRRITAAAT 1719
tr|A0A2K6CY39|A0A2K6CY39_MACNE SRFLLQLVGILLEDIVTKQLKVEMSEQQHTFYCQELGTLLMCLIHIFKSGMFRRITAAAT 1781
************************************************************
sp|P42858|HD_HUMAN RLFRSDGCGGSFYTLDSLNLRARSMITTHPALVLLWCQILLLVNHTDYRWWAEVQQTPKR 1860
tr|A0A2K6QU33|A0A2K6QU33_RHIRO RLFRSDGCDGSFYTLDSLNLRARSMITTHPALVLLWCQILLLVNHTDYRWWAEVQQTPKR 1803
tr|A0A2K6AF47|A0A2K6AF47_MANLE RLFRSDGCGGSFYTLDSLNLRARSMITTHPALVLLWCQILLLVNHTDYRWWAEVQQTPKR 1779
tr|A0A2K6CY39|A0A2K6CY39_MACNE RLFRSDGCGGSFYTLDSLNLRARSMITTHPALVLLWCQILLLVNHTDYRWWAEVQQTPKR 1841
********.***************************************************
sp|P42858|HD_HUMAN HSLSSTKLLSPQMSGEEEDSDLAAKLGMCNREIVRRGALILFCDYVCQNLHDSEHLTWLI 1920
tr|A0A2K6QU33|A0A2K6QU33_RHIRO HSLSSTKLLSPQMSGEEEDSDLAAKLGMCNREIVRRGALILFCDYVCQNLHDSEHLTWLI 1863
tr|A0A2K6AF47|A0A2K6AF47_MANLE HSLSSTKLLSPQMSGEEEDSDLAAKLGMCNREIVRRGALILFCDYVCQNLHDSEHLTWLI 1839
tr|A0A2K6CY39|A0A2K6CY39_MACNE HSLSSTKLLSPQMSGEEEDSDLAAKLGMCNREIVRRGALILFCDYVCQNLHDSEHLTWLI 1901
************************************************************
sp|P42858|HD_HUMAN VNHIQDLISLSHEPPVQDFISAVHRNSAASGLFIQAIQSRCENLSTPTMLKKTLQCLEGI 1980
tr|A0A2K6QU33|A0A2K6QU33_RHIRO VNHIQDLISLSHEPPVQDFISAVHRNSAASGLFIQAIQSRCENLSAPTTLKKTLQCLEGI 1923
tr|A0A2K6AF47|A0A2K6AF47_MANLE VNHIQDLISLSHEPPVQDFISAVHRNSAASGLFIQAIQSRCENLSTPTTLKKTLQCLEGI 1899
tr|A0A2K6CY39|A0A2K6CY39_MACNE VNHIQDLISLSHEPPVQDFISAVHRNSAASGLFIQAIQSRCENLSTPTTLKKTLQCLEGI 1961
*********************************************:** ***********
sp|P42858|HD_HUMAN HLSQSGAVLTLYVDRLLCTPFRVLARMVDILACRRVEMLLAANLQSSMAQLPMEELNRIQ 2040
tr|A0A2K6QU33|A0A2K6QU33_RHIRO HLSQSGAVLTLYVDRLLCTPFRVLARMVDILACRRVEMLLAANLQSSMAQLPMEELNRIQ 1983
tr|A0A2K6AF47|A0A2K6AF47_MANLE HLSQSGAVLTLYVDRLLCTPFRVLARMVDILACRRVEMLLAANLQSSMAQLPMEELNRIQ 1959
tr|A0A2K6CY39|A0A2K6CY39_MACNE HLSQSGAVLTLYVDRLLCTPFRVLARMVDILACRRVEMLLAANLQSSMAQLPMEELNRIQ 2021
************************************************************
sp|P42858|HD_HUMAN EYLQSSGLAQRHQRLYSLLDRFRLSTMQDSLSPSPPVSSHPLDGDGHVSLETVSPDKDWY 2100
tr|A0A2K6QU33|A0A2K6QU33_RHIRO EYLQSSGLAQRHQRLYSLLDRFRLSTMQDSLSPSPPVSSHPLDGDGHVSLETVSPDKVWY 2043
tr|A0A2K6AF47|A0A2K6AF47_MANLE EYLQSSGLAQRHQRLYSLLDRFRLSTMQDSLSPSPPASSHPLDGDGHVSLETVSPDRDWY 2019
tr|A0A2K6CY39|A0A2K6CY39_MACNE EYLQSSGLAQRHQRLYSLLDRFRLSTMQDSLSPSPPVSSHPLDGDGHVSLETVSPDKDWY 2081
************************************.*******************: **
sp|P42858|HD_HUMAN VHLVKSQCWTRSDSALLEGAELVNRIPAEDMNAFMMNSEFNLSLLAPCLSLGMSEISGGQ 2160
tr|A0A2K6QU33|A0A2K6QU33_RHIRO IHLVKSQCWTRSDSALLEGAELVNRIPAEDMSAFMMNSEFNLSLLAPCLSLGMSEISGGQ 2103
tr|A0A2K6AF47|A0A2K6AF47_MANLE IHLVKSQCWTRSDSALLEGAELVNRIPAEDMSAFMMNSEFNLSLLAPCLSLGMSEISGGQ 2079
tr|A0A2K6CY39|A0A2K6CY39_MACNE IHLVKSQCWTRSDSALLEGAELVNRIPAEDMSAFMMNSEFNLSLLAPCLSLGMSEISGGQ 2141
:******************************.****************************
sp|P42858|HD_HUMAN KSALFEAAREVTLARVSGTVQQLPAVHHVFQPELPAEPAAYWSKLNDLFGDAALYQSLPT 2220
tr|A0A2K6QU33|A0A2K6QU33_RHIRO KSPLFEAAREVTLARVSDTVQQLPAVHHVFQSDLPAEPAAYWSKLNDLFGDAALYQSLTT 2163
tr|A0A2K6AF47|A0A2K6AF47_MANLE KSPLFEAAREVTLARVSSTVQQLPAVHHVFQSDLPAEPAAYWSKLNDLFGDAALYPSLTT 2139
tr|A0A2K6CY39|A0A2K6CY39_MACNE KSPLFEAAREVTLARVSSTVQQLPAVHHVFQSDLPAEPAAYWSKLNDLFGDAALYQSLTT 2201
** **************.************* :********************** ** *
sp|P42858|HD_HUMAN LARALAQYLVVVSKLPSHLHLPPEKEKDIVKFVVATLEALSWHLIHEQIPLSLDLQAGLD 2280
tr|A0A2K6QU33|A0A2K6QU33_RHIRO LARALAQYLVAVSKLPSHLHLPPEKEKDTLKFVVATLEALSWHLIHEQIPLSLDLQAGLD 2223
tr|A0A2K6AF47|A0A2K6AF47_MANLE LARALAQYLVAVSKLPSHLHLPPEKEKDTVKFVVATLEALSWHLIHEQIPLSLDLQAGLD 2199
tr|A0A2K6CY39|A0A2K6CY39_MACNE LARALAQYLVAVSKLPSHLHLPPEKEKDTMKFVVATLEALSWHLIHEQIPLSLDLQAGLD 2261
**********.***************** :******************************
sp|P42858|HD_HUMAN CCCLALQLPGLWSVVSSTEFVTHACSLIYCVHFILEAVAVQPGEQLLSPERRTNTPKAIS 2340
tr|A0A2K6QU33|A0A2K6QU33_RHIRO CCCLALQLPGLWSVVSSAEFVTHACSLIHCVHFILEAVAVQPGEQLLSPERRTNTPKASR 2283
tr|A0A2K6AF47|A0A2K6AF47_MANLE CCCLALQLPGLWSVVSSAEFVTHACSLIHCVHFLLEAVAVQPGEQLLSPERRTNTPKAIR 2259
tr|A0A2K6CY39|A0A2K6CY39_MACNE CCCLALQLPGLWSVVSSAEFVTHACSLIHCVHFILEAVAVQPGEQLLSPERRTNTPKAIR 2321
*****************:**********:****:************************
sp|P42858|HD_HUMAN EEEEEVDPNTQNPKYITAACEMVAEMVESLQSVLALGHKRNSGVPAFLTPLLRNIIISLA 2400
tr|A0A2K6QU33|A0A2K6QU33_RHIRO EEEEEVDPNTQNPKYITAACEMVAEMVESLQSVLALGHKRNSGVPAFLTSVLRNIVVSLA 2343
tr|A0A2K6AF47|A0A2K6AF47_MANLE EEEEEVDPNTQNPKYITAACEMVAEMVESLQSVLALGHKRNSGVPAFLTSVLRNIVVSLA 2319
tr|A0A2K6CY39|A0A2K6CY39_MACNE EEEEEIDPNTQNPKYITAACEMVAEMVESLQSVLALGHKRNSGVPAFLTSVLRNIVVSLA 2381
*****:******************************************* :****::***
sp|P42858|HD_HUMAN RLPLVNSYTRVPPLVWKLGWSPKPGGDFGTAFPEIPVEFLQEKEVFKEFIYRINTLGWTS 2460
tr|A0A2K6QU33|A0A2K6QU33_RHIRO RLPLVNSYTRVPPLVWKLGWLPKPGGDFGTAFPEIPVEFLQEKEVFKEFIYRINTLGWTS 2403
tr|A0A2K6AF47|A0A2K6AF47_MANLE RLPLVNSYTRVPPLVWKLGWSPKPGGDFGTAFPEIPVEFLQEKEVFKEFIYRINTLGWTS 2379
tr|A0A2K6CY39|A0A2K6CY39_MACNE RLPLVNSYTRVPPLVWKLGWSPKPGGDFGTAFPEIPVEFLQEKEVFKEFIYRINTLGWTS 2441
******************** ***************************************
sp|P42858|HD_HUMAN RTQFEETWATLLGVLVTQPLVMEQEESPPEEDTERTQINVLAVQAITSLVLSAMTVPVAG 2520
tr|A0A2K6QU33|A0A2K6QU33_RHIRO RTQFEETWATLLGVLVTQPLVMEQEESPPEEDTERTQINVLAVQAITSLVLSAMTVPVAG 2463
tr|A0A2K6AF47|A0A2K6AF47_MANLE RTQFEETWATLLGVLVTQPLVMEQEESPPEEDTERTQINVLAVQAITSLVLSAMTVPVAG 2439
tr|A0A2K6CY39|A0A2K6CY39_MACNE RTQFEETWATLLGVLVTQPLVMEQEESPPEEDTERTQINVLAVQAITSLVLSAMTVPVAG 2501
************************************************************
sp|P42858|HD_HUMAN NPAVSCLEQQPRNKPLKALDTRFGRKLSIIRGIVEQEIQAMVSKRENIATHHLYQAWDPV 2580
tr|A0A2K6QU33|A0A2K6QU33_RHIRO NPAVSCLEQQPRNKPLKALDTRFGRKLSIIRGIVEQEIQAMVSKRENIATHHLYQAWDPV 2523
tr|A0A2K6AF47|A0A2K6AF47_MANLE NPAVSCLEQQPRNKPLKALDTRFGRKLSIIRGIVEQEIQAMVSKRENIATHHLYQAWDPV 2499
tr|A0A2K6CY39|A0A2K6CY39_MACNE NPAVSCLEQQPRNKPLKALDTRFGRKLSIIRGIVEQEIQAMVSKRENIATHHLYQAWDPV 2561
************************************************************
sp|P42858|HD_HUMAN PSLSPATTGALISHEKLLLQINPERELGSMSYKLGQVSIHSVWLGNSITPLREEEWDEEE 2640
tr|A0A2K6QU33|A0A2K6QU33_RHIRO PSLSPATTGALISHEKLLLQINPERELGSVSYKLGQVSIHSVWLGNSITPLREEEWDEEE 2583
tr|A0A2K6AF47|A0A2K6AF47_MANLE PSLSPATTGALISHEKLLLQINPERELGSVSYKLGQVSIHSVWLGNSITPLREEEWDEEE 2559
tr|A0A2K6CY39|A0A2K6CY39_MACNE PSLSPATTGALISHEKLLLQINPERELGSVSYKLGQVSIHSVWLGNSITPLREEEWDEEE 2621
*****************************:******************************
sp|P42858|HD_HUMAN EEEADAPAPSSPPTSPVNSRKHRAGVDIHSCSQFLLELYSRWILPSSSARRTPAILISEV 2700
tr|A0A2K6QU33|A0A2K6QU33_RHIRO EEEADAPAPSSPPTSPVNSRKHRAGVDIHSCSQFLLELYSRWILPSNSARRTPAILISEV 2643
tr|A0A2K6AF47|A0A2K6AF47_MANLE EEEADAPAPSSPPTSPVNSRKHRAGVDIHSCSQFLLELYSRWILPSNSARRTPAILISEV 2619
tr|A0A2K6CY39|A0A2K6CY39_MACNE EEEADAPAPSSPPTSPVNSRKHRAGVDIHSCSQFLLELYSRWILPSNSARRTPAILISEV 2681
**********************************************.*************
sp|P42858|HD_HUMAN VRSLLVVSDLFTERNQFELMYVTLTELRRVHPSEDEILAQYLVPATCKAAAVLGMDKAVA 2760
tr|A0A2K6QU33|A0A2K6QU33_RHIRO VRSLLVVSDLFTERNQFELMYVTLTELRRVHPSEDEILAQYLVPATCKAAAVLGMDKVVA 2703
tr|A0A2K6AF47|A0A2K6AF47_MANLE VRSLLVVSDLFTERNQFELMYVTLTELRRVHPSEDEILAQYLVPATCKAAAVLGMDKVVA 2679
tr|A0A2K6CY39|A0A2K6CY39_MACNE VRSLLVVSDLFTERNQFELMYVTLTELRRVHPSEDEILAQYLVPATCKAAAVLGMDKVVA 2741
*********************************************************.**
sp|P42858|HD_HUMAN EPVSRLLESTLRSSHLPSRVGALHGVLYVLECDLLDDTAKQLIPVISDYLLSNLKGIAHC 2820
tr|A0A2K6QU33|A0A2K6QU33_RHIRO EPVSRLLESTLRSSHLPSRVGALHGILYVLECDLLDDTAKQLIPVISDYLLSNLKGIAHC 2763

tr|A0A2K6AF47|A0A2K6AF47_MANLE EPVSRLLESTLRSSHLPSRVGALHGILYVLECDLLDDTAKQLIPVISDYLLSNLKGIAHC 2739
tr|A0A2K6CY39|A0A2K6CY39_MACNE EPVSRLLESTLRSSHLPSRVGALHGILYVLECDLLDDTAKQLIPVISDYLLSNLKGIAHC 2801
*************************:**********************************
sp|P42858|HD_HUMAN VNIHSQQHVLVMCATAFYLIENYPLDVGPEFSASIIQMCGVMLSGSEESTPSIIYHCALR 2880
tr|A0A2K6QU33|A0A2K6QU33_RHIRO VNIHSQQHVLVMCATAFYLIENYPLDVGPEFSASIIQMCGVMLSGSEESTPSIIYHCALR 2823
tr|A0A2K6AF47|A0A2K6AF47_MANLE VNIHSQQHVLVMCATAFYLIENYPLDVGPEFSASIIQMCGVMLSGSEESTPSIIYHCALR 2799
tr|A0A2K6CY39|A0A2K6CY39_MACNE VNIHSQQHVLVMCATAFYLIENYPLDVGPEFSASIIQMCGVMLSGSEESTPSIIYHCALR 2861
************************************************************
sp|P42858|HD_HUMAN GLERLLLSEQLSRLDAESLVKLSVDRVNVHSPHRAMAALGLMLTCMYTGKEKVSPGRTSD 2940
tr|A0A2K6QU33|A0A2K6QU33_RHIRO GLERLLLSEQLSRLDAESLVKLSVDRVNVHSPHRAMAALGLMLTCMYTGKEKVSPGRTSD 2883
tr|A0A2K6AF47|A0A2K6AF47_MANLE GLERLLLSEQLSRLDAESLVKLSVDRVNVHSPHRAMAALGLMLTCMYTGKEKVSPGRTSD 2859
tr|A0A2K6CY39|A0A2K6CY39_MACNE GLERLLLSEQLSRLDAESLVKLSVDRVNVHSPHRAMAALGLMLTCMYTGKEKVSPGRTSD 2921
************************************************************
sp|P42858|HD_HUMAN PNPAAPDSESVIVAMERVSVLFDRIRKGFPCEARVVARILPQFLDDFFPPQDIMNKVIGE 3000
tr|A0A2K6QU33|A0A2K6QU33_RHIRO PNPAAPDSESVIVAMERVSVLFDRIRKGFPCEARVVARILPQFLDDFFPPQDIMNKVIGE 2943
tr|A0A2K6AF47|A0A2K6AF47_MANLE PNPAAPDSESVIVAMERVSVLFDRIRKGFPCEARVVARILPQFLDDFFPPQDIMNKVIGE 2919
tr|A0A2K6CY39|A0A2K6CY39_MACNE PNPAAPDSESVIVAMERVSVLFDRIRKGFPCEARVVARILPQFLDDFFPPQDIMNKVIGE 2981
************************************************************
sp|P42858|HD_HUMAN FLSNQQPYPQFMATVVYKVFQTLHSTGQSSMVRDWVMLSLSNFTQRAPVAMATWSLSCFF 3060
tr|A0A2K6QU33|A0A2K6QU33_RHIRO FLSNQQPYPQFMATVVYKVFQTLHSTGQSSMVRDWVMLSLSNFTQRTPVAMATWSLSCFF 3003
tr|A0A2K6AF47|A0A2K6AF47_MANLE FLSNQQPYPQFMATVVYKVFQTLHSTGQSSMVRDWVMLSLSNFTQRTPVAMATWSLSCFF 2979
tr|A0A2K6CY39|A0A2K6CY39_MACNE FLSNQQPYPQFMATVVYKVFQTLHSTGQSSMVRDWVMLSLSNFTQRTPVAMATWSLSCFF 3041
**********************************************:*************
sp|P42858|HD_HUMAN VSASTSPWVAAILPHVISRMGKLEQVDVNLFCLVATDFYRHQIEEELDRRAFQSVLEVVA 3120
tr|A0A2K6QU33|A0A2K6QU33_RHIRO VSASTSPWVAAILPHVISRMGKLEQVDVNLFCLVATDFYRHQIEEELDRRAFQSVFEVVA 3063
tr|A0A2K6AF47|A0A2K6AF47_MANLE VSASTSPWVAAILPHVISRMGKLEQVDVNLFCLVATDFYRHQIEEELDRRAFQSVFEVVA 3039
tr|A0A2K6CY39|A0A2K6CY39_MACNE VSASTSPWVAAILPHVISRMGKLEQVDVNLFCLVATDFYRHQIEEELDRRAFQSVFEVVA 3101
*******************************************************:****
sp|P42858|HD_HUMAN APGSPYHRLLTCLRNVHKVTTC 3142
tr|A0A2K6QU33|A0A2K6QU33_RHIRO APGSPYHRLLTCLRNVHKVTTC 3085
tr|A0A2K6AF47|A0A2K6AF47_MANLE APGSPYHRLLTCLRNVHKVTTC 3061
tr|A0A2K6CY39|A0A2K6CY39_MACNE APGSPYHRLLTCLRNVHKVTTC 3123
**********************
(Note: see the table below for the meaning of the colors assigned to each letter.)
What symbols are used for positions in the alignment that contain identical, highly homologous,
homologous, and non-homologous residues?
Answer-
An * (asterisk) indicates a position that has a single, fully conserved residue.
A : (colon) indicates conservation between groups with strongly similar properties, scoring > 0.5 in the
Gonnet PAM 250 matrix.
A . (period) indicates conservation between groups with weakly similar properties, scoring =< 0.5 (and > 0)
in the Gonnet PAM 250 matrix.
A blank space indicates a position with no conservation (non-homologous residues).
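The symbol definitions above can be turned into a quick tally of how conserved one alignment block is. The following is a minimal sketch, not part of the original analysis: the function name `summarize_consensus` and the 60-column block width are assumptions for illustration, and the consensus line is taken as it appears in this document (without the leading offset that a raw Clustal file would carry).

```python
from collections import Counter

# Labels for the four consensus symbols described in the answer above.
SYMBOL_LABELS = {
    "*": "identical",
    ":": "strongly similar",
    ".": "weakly similar",
    " ": "not conserved",
}


def summarize_consensus(consensus: str, width: int = 60) -> dict:
    """Count each consensus symbol in one alignment block.

    `width` is the number of aligned columns in the block (assumed 60 here).
    Trailing blanks are often trimmed from saved output, so the line is
    padded back to `width` before counting.
    """
    padded = consensus.ljust(width)
    counts = Counter(padded)
    return {label: counts.get(symbol, 0) for symbol, label in SYMBOL_LABELS.items()}


# Consensus line of the block that ends at human residue 840.
example = "************:***********.*******:******************:********"
print(summarize_consensus(example))
```

Summing these tallies over all blocks gives a rough picture of how much of the huntingtin sequence is identical across the four primates.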
What do the colors mean?
RESIDUE     COLOR     PROPERTIES
AVFPMILW    RED       Small (small + hydrophobic, including aromatic except Y)
DE          BLUE      Acidic
RK          MAGENTA   Basic (excluding H)
STYHCNGQ    GREEN     Hydroxyl + sulfhydryl + amine + G
Others      GREY      Unusual amino/imino acids, etc.
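The colour table can also be expressed as a small lookup, which makes it easy to check the colour of any residue in the alignment. This is only an illustrative sketch: the names `COLOR_GROUPS` and `residue_color` are hypothetical, and the groups are copied from the table rather than taken from any Clustal source code.

```python
# Residue groups and colours as listed in the table above.
COLOR_GROUPS = {
    "AVFPMILW": "RED",      # small / hydrophobic
    "DE": "BLUE",           # acidic
    "RK": "MAGENTA",        # basic
    "STYHCNGQ": "GREEN",    # hydroxyl / sulfhydryl / amine / G
}


def residue_color(residue: str) -> str:
    """Return the colour for a one-letter amino acid code."""
    code = residue.upper()
    if len(code) != 1:
        raise ValueError("expected a single one-letter residue code")
    for members, color in COLOR_GROUPS.items():
        if code in members:
            return color
    # Anything not listed falls into the 'Others' row of the table.
    return "GREY"


# Colours for the 14-residue peptide chain listed in section 10.
print([residue_color(r) for r in "EKLMKAFESLKSFQ"])
```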
10. THREE-DIMENSIONAL VIEWING OF AN IDENTIFIED PROTEIN STRUCTURE:
What is the sequence of the Huntingtin peptide? What protein fold does it encompass?
>3LRH:A|PDBID|CHAIN|SEQUENCE
MGSQPVLTQSPSVSAAPRQRVTISVSGSNSNIGSNTVNWIQQLPGRAPELLMYDDDLLAPGVSDRFSGSRSGTSASLTIS
GLQSEDEADYYAATWDDSLNGWVFGGGTKVTVLSAHHHHHH
>3LRH:B|PDBID|CHAIN|SEQUENCE
EKLMKAFESLKSFQ
>3LRH:C|PDBID|CHAIN|SEQUENCE
MGSQPVLTQSPSVSAAPRQRVTISVSGSNSNIGSNTVNWIQQLPGRAPELLMYDDDLLAPGVSDRFSGSRSGTSASLTIS
GLQSEDEADYYAATWDDSLNGWVFGGGTKVTVLSAHHHHHH
>3LRH:D|PDBID|CHAIN|SEQUENCE
EKLMKAFESLKSFQ
>3LRH:E|PDBID|CHAIN|SEQUENCE
MGSQPVLTQSPSVSAAPRQRVTISVSGSNSNIGSNTVNWIQQLPGRAPELLMYDDDLLAPGVSDRFSGSRSGTSASLTIS
GLQSEDEADYYAATWDDSLNGWVFGGGTKVTVLSAHHHHHH
>3LRH:F|PDBID|CHAIN|SEQUENCE
EKLMKAFESLKSFQ
>3LRH:G|PDBID|CHAIN|SEQUENCE
MGSQPVLTQSPSVSAAPRQRVTISVSGSNSNIGSNTVNWIQQLPGRAPELLMYDDDLLAPGVSDRFSGSRSGTSASLTIS
GLQSEDEADYYAATWDDSLNGWVFGGGTKVTVLSAHHHHHH
>3LRH:H|PDBID|CHAIN|SEQUENCE
EKLMKAFESLKSFQ
>3LRH:I|PDBID|CHAIN|SEQUENCE
MGSQPVLTQSPSVSAAPRQRVTISVSGSNSNIGSNTVNWIQQLPGRAPELLMYDDDLLAPGVSDRFSGSRSGTSASLTIS
GLQSEDEADYYAATWDDSLNGWVFGGGTKVTVLSAHHHHHH
>3LRH:J|PDBID|CHAIN|SEQUENCE
EKLMKAFESLKSFQ
>3LRH:K|PDBID|CHAIN|SEQUENCE
MGSQPVLTQSPSVSAAPRQRVTISVSGSNSNIGSNTVNWIQQLPGRAPELLMYDDDLLAPGVSDRFSGSRSGTSASLTIS
GLQSEDEADYYAATWDDSLNGWVFGGGTKVTVLSAHHHHHH
>3LRH:L|PDBID|CHAIN|SEQUENCE
EKLMKAFESLKSFQ
>3LRH:M|PDBID|CHAIN|SEQUENCE
MGSQPVLTQSPSVSAAPRQRVTISVSGSNSNIGSNTVNWIQQLPGRAPELLMYDDDLLAPGVSDRFSGSRSGTSASLTIS
GLQSEDEADYYAATWDDSLNGWVFGGGTKVTVLSAHHHHHH
>3LRH:N|PDBID|CHAIN|SEQUENCE
EKLMKAFESLKSFQ
>3LRH:O|PDBID|CHAIN|SEQUENCE
MGSQPVLTQSPSVSAAPRQRVTISVSGSNSNIGSNTVNWIQQLPGRAPELLMYDDDLLAPGVSDRFSGSRSGTSASLTIS
GLQSEDEADYYAATWDDSLNGWVFGGGTKVTVLSAHHHHHH
>3LRH:P|PDBID|CHAIN|SEQUENCE
EKLMKAFESLKSFQ
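The sixteen chains listed above contain only two distinct sequences, which is easier to see by grouping the FASTA records programmatically. The sketch below is a hedged illustration rather than part of the original workflow: `parse_fasta` and `group_chains` are hypothetical helper names, and the sample input is shortened to two chains copied from the listing.

```python
from collections import defaultdict


def parse_fasta(text: str) -> dict:
    """Return {header: sequence}, joining wrapped sequence lines."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


def group_chains(records: dict) -> dict:
    """Map each distinct sequence to the chain IDs that carry it."""
    groups = defaultdict(list)
    for header, seq in records.items():
        # Header format assumed from the listing: "3LRH:A|PDBID|CHAIN|SEQUENCE"
        chain_id = header.split("|")[0].split(":")[1]
        groups[seq].append(chain_id)
    return dict(groups)


# Shortened sample: two of the short chains from the listing above.
sample = """>3LRH:B|PDBID|CHAIN|SEQUENCE
EKLMKAFESLKSFQ
>3LRH:D|PDBID|CHAIN|SEQUENCE
EKLMKAFESLKSFQ
"""
for seq, chains in group_chains(parse_fasta(sample)).items():
    print(len(seq), chains, seq)
```

Run on the full listing, this would separate the long chains ending in a His tag (HHHHHH) from the short 14-residue chains, which appear to be the huntingtin peptide.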
Protein fold
It is an armadillo (ARM)-like fold: a multi-helical fold made up of two curved layers
of alpha helices arranged in a regular right-handed superhelix.
Which option allows you to best differentiate the huntingtin peptide from the VL domain?
Answer- Choosing the “By Secondary Structure” colour option best differentiates the huntingtin
peptide from the VL domain.
Structure with best angle
Figure 3 showing 3D structure with best angle
Structure under cartoon style
Answer: Choosing the “By Hydrophobicity” colour option allows the best differentiation of the huntingtin peptide.
Structure with best angle

Figure 4 showing 3D structure by hydrophobicity
Name four residues located in hydrophobic regions. For these four residues, what secondary structure are they
found within?
Answer-
1. ILE
Position: Atom: [ILE] 74: A.CA ()
Secondary structure found within: Beta strand
2. SER
Position: Atom: [SER] 73: A.CA ()
Secondary structure found within: Beta strand
3. PHE
Position: Atom: [PHE] 63: A.CA ()
Secondary structure found within: Beta strand
4. TRP
Position: Atom: [TRP] 36: A.CA ()
Secondary structure found within: Beta strand
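One way to sanity-check which residues count as hydrophobic is to look them up on a published hydropathy scale. The sketch below uses the Kyte-Doolittle index as an assumption; the structure viewer's “By Hydrophobicity” colouring may be based on a different scale, and a residue can sit in a hydrophobic region of the structure without being strongly hydrophobic itself. The helper name `classify` is hypothetical.

```python
# Kyte-Doolittle hydropathy index; positive values are hydrophobic.
KYTE_DOOLITTLE = {
    "ALA": 1.8, "ARG": -4.5, "ASN": -3.5, "ASP": -3.5, "CYS": 2.5,
    "GLN": -3.5, "GLU": -3.5, "GLY": -0.4, "HIS": -3.2, "ILE": 4.5,
    "LEU": 3.8, "LYS": -3.9, "MET": 1.9, "PHE": 2.8, "PRO": -1.6,
    "SER": -0.8, "THR": -0.7, "TRP": -0.9, "TYR": -1.3, "VAL": 4.2,
}


def classify(residue_name: str) -> tuple:
    """Return (score, label) for a three-letter residue name."""
    score = KYTE_DOOLITTLE[residue_name.upper()]
    label = "hydrophobic" if score > 0 else "hydrophilic"
    return score, label


# The four residues named in the answer above.
for name in ("ILE", "SER", "PHE", "TRP"):
    print(name, *classify(name))
```

On this particular scale isoleucine and phenylalanine score as hydrophobic, while serine and tryptophan score slightly below zero, so the classification depends on the scale chosen.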
Number 1 result found?
Answer:
PDB No. 6EZ8
Name: Human Huntingtin-HAP40 complex structure
Method used to obtain this structure: Electron Microscopy
Resolution: 4.0 Å
What percentage of the structure of Huntingtin is predicted to be helical?
Answer: 34% helical (161 helices; 1084 residues)
