Practical Bioinformatics: Protein Sequence Analysis and Prediction


Added on  2019/09/20

Practical Assignment
AI Summary
This practical bioinformatics assignment requires analyzing a given protein sequence to predict its likely function and, if possible, model its structure. The task involves using various bioinformatics methods, including sequence database searches (BLAST, FASTA, Smith-Waterman), domain structure deduction, analysis using PROSITE, PRINTS, BLOCKS or Pfam, PSI-BLAST searches, multiple sequence alignments, phylogenetic tree construction, and structure prediction techniques like comparative modeling and fold recognition. The analysis should also consider secondary structure prediction, trans-membrane segment prediction, and potential protein-protein interactions. The assignment emphasizes concise presentation of results, with an abstract, introduction, methods, results and discussion, and conclusions sections, alongside proper referencing. Students are encouraged to conduct a mini-research project, exploring the protein sequence and using the provided hints to guide their investigations and analysis.
Practical Bioinformatics
80%
3800 words
Introduction:
You are required to analyse a protein sequence using bioinformatics
methods. What is its likely function? If it does not have a known
structure, can you build a model of the structure? What does the
model tell you about its function? The sequence might also be a
"hypothetical" protein whose structure has already been determined by a
structural genomics consortium. In that case, can you use the
sequence and known structure to determine the protein's function? You
are not restricted to methods covered in the lectures, but you
should focus on methods whose general aim is the prediction of
protein function and/or structure from sequence.
Protein sequence:
MSPSVEETTS VTESIMFAIV SFKHMGPFEG YSMSADRAAS DLLIGMFGSV SLVNLLTIIG
CLWVLRVTRP PVSVMIFTWN LVLSQFFSIL ATMLSKGIML RGALNLSLCR LVLFVDDVGL
YSTALFFLFL ILDRLSAISY GRDLWHHETR ENAGVALYAV AFAWVLSIVA AVPTAATGSL
DYRWLGCQIP IQYAAVDLTI KMWFLLGAPM IAVLANVVEL AYSDRRDHVW SYVGRVCTFY
VTCLMLFVPY YCFRVLRGVL QPASAAGTGF GIMDYVELAT RTLLTMRLGI LPLFIIAFFS
REPTKDLDDS FDYLVERCQQ SCHGHFVRRL VQALKRAMYS VELAVCYFST SVRDVAEAVK
KSSSRCYADA TSAAVVVTTT TSEKATLVEH AEGMASEMCP GTTIDVSAES SSVLCTDGEN
TVASDATVTA L
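Most search and prediction servers expect the sequence as a single FASTA record without the spaces used in the blocked layout above. The following is a minimal sketch of that conversion, not a required step. The function name `to_fasta`, the record name `query_protein`, and the choice to accept only the 20 standard one-letter residue codes are assumptions made for illustration.

```python
# The 20 standard amino-acid one-letter codes (assumption: no ambiguity codes).
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def to_fasta(raw, name="query_protein", width=60):
    """Remove whitespace from a blocked sequence and return one FASTA record."""
    seq = "".join(raw.split()).upper()
    unknown = sorted(set(seq) - AMINO_ACIDS)
    if unknown:
        raise ValueError(f"unexpected residue codes: {unknown}")
    # Wrap the cleaned sequence at a fixed line width.
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return f">{name}\n" + "\n".join(lines)


# Usage with only the first three blocks of the sequence, for brevity.
print(to_fasta("MSPSVEETTS VTESIMFAIV SFKHMGPFEG"))
```

Pasting the complete sequence into `to_fasta` in the same way gives a record that can be submitted to the tools listed in the hints.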
The paper should begin with an abstract of up to 250 words, followed by an
introduction detailing what you have done and why it is interesting,
perhaps with a brief literature review if relevant. You are not expected
to give details of the methods you have used, but do cite primary
references if you use them. The remainder of the paper should be:
Results and Discussion section giving relevant results and
discussing their significance;
Conclusions section where you review the significance of your
results and comment on the usefulness of the methods used;
References.
Marks will be awarded as follows:
Abstract (5%) - awarded for a clear and concise abstract of the
paper.
Introduction (10%) - awarded for a clear introduction to the study
and its motivation.
Methods (5%) - awarded for the choice of a suitable number of
relevant investigations. Do not include a literature review of
the methods used.
Results and Discussion (30%) - awarded for the clarity of the
presentation of results, and the choice of an appropriate level of
detail.
Conclusions (20%) - awarded for a discussion showing theoretical
insight into the methods chosen, the likely accuracy of any
predictions, and the biological relevance of the results.
References (10%) - awarded for appropriate and adequate use of
references.
Presentation (20%) - awarded for clear presentation in all sections.
Overlong papers will incur a 5% penalty, just as they are penalised when
submitted to real scientific journals. Good marks will be obtained if
the relevant information is given concisely, but with sufficient detail
that an expert reader could repeat the investigations if necessary.
Hints
The assessment is open ended, and is therefore more like a mini research
project. Here are some ideas about the sorts of things you might
investigate:
Searching protein sequence databases for related sequences using
BLAST, FASTA or Smith-Waterman algorithms.
Prediction of likely function of the sequence by similarity methods.
Deduction of the domain structure of the sequence from the results
of sequence searches.
Analysis of the appearance of the sequence, or domains from it, in
other organisms, or other kingdoms of life.
Analysis of the sequence using PROSITE, PRINTS, BLOCKS or Pfam.
Doing database searches with PSI-BLAST.
Making multiple alignments of the sequence (or domains from it)
with related sequences.
Making phylogenetic trees based on multiple alignments.
In the case of a sequence of known structure, searching for related
structures.
Prediction of secondary structure for the sequence, or a domain
from it.
Prediction of tertiary structure - Comparative Modelling.
Prediction of tertiary structure - Fold Recognition.
Prediction of trans-membrane segments.
Prediction of protein-protein interactions.
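As an illustration of one of the hints above, prediction of trans-membrane segments, the sketch below computes a sliding-window hydropathy profile using the Kyte-Doolittle scale. It is a rough first look only; dedicated trans-membrane predictors are generally more reliable. The function names are hypothetical, and the window of 19 residues and threshold of 1.6 are the values commonly quoted for this scale, used here as assumptions rather than requirements of the assignment. The test sequence is synthetic, not the assignment sequence.

```python
# Kyte-Doolittle hydropathy values (Kyte & Doolittle, 1982).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full window; entry i covers seq[i:i+window]."""
    values = [KD[res] for res in seq]
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def candidate_tm_segments(seq, window=19, threshold=1.6):
    """Return 1-based (first, last) window-centre positions of runs whose
    mean hydropathy exceeds the threshold."""
    half = window // 2
    profile = hydropathy_profile(seq, window)
    segments, start = [], None
    for i, score in enumerate(profile):
        centre = i + half + 1  # 1-based residue number at the window centre
        if score > threshold and start is None:
            start = centre                      # a hydrophobic run begins
        elif score <= threshold and start is not None:
            segments.append((start, centre - 1))  # the run ended one step ago
            start = None
    if start is not None:
        # The run extends to the centre of the last window.
        segments.append((start, len(profile) + half))
    return segments


# Synthetic example: a 20-residue leucine stretch flanked by aspartates.
toy = "D" * 10 + "L" * 20 + "D" * 10
print(candidate_tm_segments(toy))
```

The reported positions are window centres, so the boundaries of a real segment are only approximate; comparing the output with a dedicated predictor is a sensible check on both methods.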