University Bioinformatics: Sequence Analysis Homework Assignment
Added on 2022/09/11
This assignment solution delves into sequence analysis within the realm of bioinformatics. Part 1 focuses on experimental design, posing questions related to cancer immunotherapy and the effects of a compound SK234 on breast cancer. It includes hypotheses, experimental controls, potential errors, and statistical analyses like ANOVA, EC50, and IC50. Part 2 explores sequence analysis using BLASTn, examining a Plasmodium vivax sequence and Homo sapiens HBB gene sequences, including sequence alignment and blastx analysis to identify protein sequence similarities and mutations. Part 3 applies ddCt analysis to determine gene expression changes in diseased patients. The assignment integrates bioinformatics tools and techniques to analyze genetic data and understand biological processes.

Running head: SEQUENCE ANALYSIS
SEQUENCE ANALYSIS
Name of the Student
Name of the University
Author note

Part 1
a) Questions for the new information
I. What will be the key biomarker to establish the tumour-killing effect of SK234?
II. How can we speed up the identification of biomarkers for the prompt recognition of breast
cancer?
III. Does this drug induce different cell phases in two different cell lines?
IV. Is the action of the SK234 compound guided by a combination of the proteomic and genetic
information of the tumour?
V. Which parameters regulate a proficient immune system that mounts a complete response
to cancers after induction of the drug?
The hypothesis for question V is that cancer immunotherapy will show a clinical benefit in
breast cancer treatment.
The control for the experiment will be an established cancer immunotherapy drug such as
nivolumab, ipilimumab or avelumab.
The experimental group will be the cells treated with the SK234 compound.
The most likely experimental error is the dosage. Every compound used in treating breast
cancer is used at a particular concentration, and because SK234 is a new compound under test,
a dosage error is possible.
The model organism can be a mouse model. We may choose 6 molecules of the SK234 compound.
These 6 molecules may be tested on n out of 60,000 cells (n = number of cells from the mouse
model). The value of n will be chosen so that each molecule can be tested in triplicate. In
addition, we can perform EC50 and IC50 analysis of the drug. If the EC50 value is 1 µM, then the drug
would be considered a good candidate for compound optimization. The data will be presented both
graphically and in tables so that they can be analysed more easily. ANOVA will also be carried
out to give a clear picture of the drug activity. EC50 is the concentration of an agonist that
produces a response halfway between the baseline and maximum responses. It characterizes the
position of the dose-response curve for SK234 and is frequently used as a measure of an
agonist's potency. IC50 is a measure of the potency of a substance in inhibiting a specific
biochemical or biological function: the concentration needed to inhibit that function by half.
The one-way analysis of variance (ANOVA) will help to identify whether there are statistically
significant differences among the means of three or more independent groups. As we have chosen
a mouse model, the mice will be euthanized and the tumours will be weighed. The weight of the
tumour is expected to differ between the control and experimental groups. This comparative
analysis will help us to understand the effect of compound SK234. Recent work in cancer
immunotherapy suggests that vaccination and in vitro stimulation of T cells can supplement the
function of the immune system, predominantly through the development of immune memory. At the
end, all the data will be presented with graphs, and screenshots of the bioinformatics tests as
well as pictures will be provided so that a clear and distinct analysis of the data can be
made. The composition of the compound will be stated so that the comparative analysis can be done.
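The two statistics described above can be illustrated with a minimal sketch. It assumes a simple log-linear interpolation for EC50 (a full analysis would usually fit a sigmoidal dose-response model instead) and computes only the one-way ANOVA F statistic, not its p-value. The function names and all data values are hypothetical and are not taken from the assignment.

```python
import math


def ec50_by_interpolation(concentrations, responses):
    """Estimate EC50 by interpolating on log10(concentration) between the
    two measured points that bracket the half-maximal response.

    Assumes positive, ascending concentrations and a response that rises
    with concentration (agonist-style curve).
    """
    baseline, maximum = min(responses), max(responses)
    half = baseline + (maximum - baseline) / 2.0
    points = list(zip(concentrations, responses))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(points, points[1:]):
        if r_lo <= half <= r_hi and r_hi != r_lo:
            frac = (half - r_lo) / (r_hi - r_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    raise ValueError("half-maximal response is not bracketed by the data")


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Hypothetical dose-response data (concentration in µM, response in %).
doses = [0.01, 0.1, 1.0, 10.0, 100.0]
effect = [0.0, 10.0, 50.0, 90.0, 100.0]
print("EC50 estimate (µM):", ec50_by_interpolation(doses, effect))

# Hypothetical tumour weights (g), three mice per group.
tumour_weights = [[1.0, 1.2, 1.1], [0.9, 1.0, 0.8], [0.5, 0.6, 0.4]]
print("ANOVA F statistic:", one_way_anova_f(tumour_weights))
```

The F statistic would then be compared with the F distribution with (k - 1, N - k) degrees of freedom to obtain a p-value, which is normally done with a statistics package.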

Part 2
a)
Fig A: Sequence for Fig 1 retrieved through BLASTn analysis
The person was infected with Plasmodium vivax, a protozoan parasite and human pathogen. This
parasite is common and widely distributed, and it is a frequent cause of recurring malaria. It
is less virulent than Plasmodium falciparum, which is the deadliest of the five human malaria
parasites.
The above sequence was chosen because its E-value is 0. An E-value of zero means that a match
this good is essentially not expected by chance, so the hit is a highly significant, near-identical
match to the given sequence, and the prediction was based on it.
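As a rough illustration of why a strong hit is reported with an E-value of 0, the sketch below uses the standard relation between bit score and expected number of chance hits, E = m x n x 2^(-S'), where m is the query length and n the database length. The function name and the lengths and scores used are hypothetical; BLAST itself works with effective lengths and rounds very small values down to 0.

```python
def expect_value(bit_score, query_len, db_len):
    """Expected number of chance hits with at least this bit score.

    Uses E = m * n * 2**(-S'), with m = query_len and n = db_len.
    """
    return query_len * db_len * 2.0 ** (-bit_score)


# Hypothetical search space: a 500-base query against 1e9 database bases.
for score in (30, 100, 900):
    print(score, expect_value(score, 500, 1_000_000_000))
```

The E-value falls exponentially as the bit score rises, so a long, near-identical alignment quickly reaches values too small to display as anything but 0.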

b)
Field                           Fig 2, Fig 3, Fig 4, Fig 5
sequence name                   Homo sapiens haemoglobin subunit beta (HBB), mRNA
gene name                       HBB
database accession number       NM_000518.5
chromosomal location            chromosome 11, map 11p15.4
size (bp)                       628
exon count of the (full) gene   3
Because all four are human sequences, the retrieved records give the same output for each.
Fig B: Use of a bioinformatics tool to retrieve the sequences for Fig 2, 3, 4 and 5

All the data obtained are for HBB and all are human samples; therefore the chromosome number,
exon count, size and accession number are the same for each.
c)
Fig C: Sequence alignment for the chosen sequences
The above images show the blastn alignment of the sequences in Fig 2-4 against the sequence in
Fig 5. Because all are human sequences, there is no major difference between them. However, the
identity score is 100% for Sequence 1 and 99% for sequences 2-5. Hence it is
assumed that all the sequences are very similar to each other. (A PDF is attached for a better
view of the sequence.)
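The identity score quoted for an alignment can be reproduced with a short sketch: it counts identical columns and divides by the alignment length, which is how BLAST reports percent identity. The function names are hypothetical and the aligned strings are made-up examples, not the sequences from the figures.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two aligned sequences of equal length.

    Identical, non-gap columns are divided by the total alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap)
    return 100.0 * matches / len(aligned_a)


def mismatch_positions(aligned_a, aligned_b):
    """1-based alignment columns where the two sequences differ."""
    return [i for i, (x, y) in enumerate(zip(aligned_a, aligned_b), start=1) if x != y]


# Made-up aligned fragments with one substitution.
query = "ATGGTGCATCTGACTCCTGAG"
subject = "ATGGTGCATCTGACTCCTGTG"
print(percent_identity(query, subject), mismatch_positions(query, subject))
```

Listing the mismatching columns alongside the percentage shows where the sequences differ, which a 99% score on its own does not.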
d) blastx translates the nucleotide query into protein sequence in all six reading frames and
compares the translations with a protein database.
Fig D: Blastx analysis of the data showing protein sequence similarity
How closely a hit matches the query is judged from the maximum score and the E-value. An
E-value close to 0 indicates the most significant match, meaning the hit is very unlikely to
have arisen by chance. Hence the sequence Chain B, Haemoglobin (Val Beta1 Met) Mutant [Homo
sapiens] was chosen as the identified match, since it had the best E-value. The identity of the
match was 98%, which means the sequences are highly similar but not identical. Hits with lower
identity suggest that some mutation may be present. A mutation can affect both the amino acid
sequence and the structure of a protein. For example, if an uncharacterized protein is isolated
and its structure and amino acid sequence suggest protein kinase activity, a mutation could
affect that activity.
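The translation step that blastx performs before searching can be sketched as follows. This is only an illustration of six-frame translation with the standard genetic code, not BLAST's implementation; the function names are hypothetical and the short nucleotide fragment is an assumed example rather than the sequence from the figures.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons from the start of seq; unknown codons give X."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))


def six_frame_translations(seq):
    """Translate both strands in all three reading frames."""
    seq = seq.upper()
    frames = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(s[offset:])
    return frames


# Assumed example fragment (27 bases).
fragment = "ATGGTGCATCTGACTCCTGAGGAGAAG"
for frame, protein in six_frame_translations(fragment).items():
    print(frame, protein)
```

Each of the six protein strings is then compared with the protein database, and the frame that gives the best-scoring alignment is reported with the hit.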

e) A mutation can change a protein that plays a critical role in the human body. It can disrupt
normal development and lead to a genetic disorder. Gene mutations can also prevent an embryo
from surviving until birth (Niederberger, 2018). Even a small change in a sequence can lead to
a serious health defect, although some mutations alter a DNA sequence without changing the
function of the protein made by the gene. In the above case there is a slight change in the
sequence, and the child is found to be affected because of this sequence change (Haji
Abdolvahab, Venselaar, Fazeli, Arab & Behmanesh, 2019). Many gene mutations that could cause a
genetic disorder are repaired by specific enzymes before the gene is expressed and an altered
protein is produced. DNA can be mutated or damaged in several ways, and DNA repair is an
important mechanism by which the body defends itself from disease.
Fig E: Point mutation in the sequence of DNA. (Hopf et al., 2017)
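The difference between a point mutation that changes the protein and one that does not can be shown with a small sketch. It compares two coding sequences codon by codon and labels each change as silent, missense or nonsense. The function name and both sequences are hypothetical; the GTG to ATG change is only an assumed example of a Val to Met substitution and is not read from the figures.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_point_mutations(ref_cds, alt_cds):
    """Compare two equal-length, in-frame coding sequences codon by codon.

    Returns tuples of (codon number, ref codon, alt codon, ref aa, alt aa, kind).
    """
    if len(ref_cds) != len(alt_cds):
        raise ValueError("sequences must have the same length")
    changes = []
    for i in range(0, len(ref_cds) - 2, 3):
        ref_codon, alt_codon = ref_cds[i:i + 3], alt_cds[i:i + 3]
        if ref_codon == alt_codon:
            continue
        ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
        if ref_aa == alt_aa:
            kind = "silent"      # DNA changes, protein does not
        elif alt_aa == "*":
            kind = "nonsense"    # premature stop codon
        else:
            kind = "missense"    # one amino acid replaced by another
        changes.append((i // 3 + 1, ref_codon, alt_codon, ref_aa, alt_aa, kind))
    return changes


reference = "ATGGTGCATCTGACTCCTGAGGAGAAG"
val_to_met = "ATGATGCATCTGACTCCTGAGGAGAAG"   # codon 2: GTG -> ATG
silent = "ATGGTGCATCTGACTCCTGAAGAGAAG"       # codon 7: GAG -> GAA
print(classify_point_mutations(reference, val_to_met))
print(classify_point_mutations(reference, silent))
```

A silent change leaves the amino acid sequence untouched, whereas a missense change such as Val to Met alters one residue and may or may not affect function.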

Part 3
Fold change from ddCt (diseased Ct minus healthy Ct)
Gene 1: ddCt = 10.27 - 10.47 = -0.20; 2^-(ddCt) = 2^(0.20) = 1.15
Gene 2: ddCt = 10.86 - 10.31 = 0.55; 2^-(0.55) = 0.6830
Gene 3: ddCt = 11.01 - 10.21 = 0.80; 2^-(0.80) = 0.57435
Gene 4: ddCt = 10.89 - 10.58 = 0.31; 2^-(0.31) = 0.80664
Gene 5: ddCt = 11.30 - 10.08 = 1.22; 2^-(1.22) = 0.42928
Increase or decrease percentage, calculated as (fold change - 1) x 100
Gene 1: (1.15 - 1) x 100 = +14.9% (increase, as the ddCt value is negative)
Gene 2: (0.6830 - 1) x 100 = -31.7% (decrease, as the ddCt value is positive)
Gene 3: (0.57435 - 1) x 100 = -42.6% (decrease, as the ddCt value is positive)
Gene 4: (0.80664 - 1) x 100 = -19.3% (decrease, as the ddCt value is positive)
Gene 5: (0.42928 - 1) x 100 = -57.1% (decrease, as the ddCt value is positive)
Except for Gene 1, the diseased patient has a greater Ct value than the healthy individual. A
higher Ct value means less starting template, so genes 2-5 are expressed at a lower level in the
diseased patient, whereas Gene 1 is expressed at a slightly higher level. Hence the percentage
change in the diseased patient is negative for genes 2-5 and positive for Gene 1.
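The ddCt arithmetic can be checked with a short sketch. It assumes that the first Ct value listed for each gene is from the diseased patient and the second from the healthy individual, and that the values are already normalized to a reference gene, so their difference can be treated as ddCt. The percent change is taken as (fold change - 1) x 100. The variable and function names are hypothetical.

```python
# Ct values from Part 3 as (diseased, healthy); this ordering is an assumption.
ct_values = {
    "Gene 1": (10.27, 10.47),
    "Gene 2": (10.86, 10.31),
    "Gene 3": (11.01, 10.21),
    "Gene 4": (10.89, 10.58),
    "Gene 5": (11.30, 10.08),
}


def fold_change(ct_diseased, ct_healthy):
    """Relative expression 2^-(ddCt), with ddCt = diseased Ct - healthy Ct."""
    return 2.0 ** -(ct_diseased - ct_healthy)


def percent_change(ct_diseased, ct_healthy):
    """Percent increase (positive) or decrease (negative) in expression."""
    return (fold_change(ct_diseased, ct_healthy) - 1.0) * 100.0


for gene, (diseased, healthy) in ct_values.items():
    fc = fold_change(diseased, healthy)
    pc = percent_change(diseased, healthy)
    direction = "increase" if pc > 0 else "decrease"
    print(f"{gene}: ddCt = {diseased - healthy:+.2f}, fold = {fc:.4f}, {pc:+.1f}% ({direction})")
```

Because each PCR cycle roughly doubles the product, a positive ddCt (a later Ct in the diseased sample) corresponds to a fold change below 1, that is, lower expression.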

References
Haji Abdolvahab, M., Venselaar, H., Fazeli, A., Arab, S., & Behmanesh, M. (2019). Point
Mutation Approach to Reduce Antigenicity of Interferon Beta. International Journal of
Peptide Research and Therapeutics. doi: 10.1007/s10989-019-09938-9
Hopf, T., Ingraham, J., Poelwijk, F., Schärfe, C., Springer, M., Sander, C., & Marks, D. (2017).
Mutation effects predicted from sequence co-variation. Nature Biotechnology, 35(2),
128-135. doi: 10.1038/nbt.3769
Niederberger, C. (2018). Re: Correction of a Pathogenic Gene Mutation in Human Embryos.
Journal of Urology, 199(2), 330-332. doi: 10.1016/j.juro.2017.11.028