DE-NOVO DE novo Assignment


Added on  2021-04-19

DE-NOVO

De novo is a term for making predictions about biological features using computational models, without comparison to already existing data. De novo assemblies are assessed with several methods, which fall into two groups: technological and biological. The technological methods are statistical description and analysis of fosmid sequences. The biological methods are genome simulation, multiple sequence alignment, and copy number error.

Statistical description centres on the N50 of an assembly, a weighted median of the lengths of the sequences it contains: it is the length of the longest sequence such that all sequences of at least that length together make up at least half of the total assembly length. The NG50 is similar to the N50. The only difference is that in the NG50 the total assembly length is replaced by a rough estimate of the length of the genome being assembled. Fosmid sequences helped in assessing the accuracy of the respective genome assemblies. The method was mainly used in assessing birds and snakes. ‘Evaluation and validation of de novo and hybrid assembly techniques to derive high quality genome sequences’ (2)

Among the biological methods, genome simulation comes first. The genome for the simulation, termed the root genome, was constructed from a DNA sequence and its annotations and was later divided into four chromosomes of approximately equal length. The second method is multiple sequence alignment (MSA). An MSA can be divided into columns, each representing a set of individual base pair positions in the input sequences that are, in most cases, considered homologous. The coverage of an MSA is the proportion of haplotype-containing columns that also contain positions from the assembly. The third method is copy number error: for each column, the copy number in the simulated diploid genome can be described by an interval with a minimum and a maximum.
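The N50 and NG50 statistics described above can be illustrated with a minimal Python sketch. The function names (`nx50`, `n50`, `ng50`) and the toy contig lengths are assumptions made for illustration and do not come from any assembly toolkit. The sketch uses the common definition in which sequences are sorted from longest to shortest and accumulated until they reach half of the target length.

```python
def nx50(lengths, target_total):
    """Length L such that sequences of length >= L sum to at least
    half of target_total. Returns None if that half is never reached."""
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= target_total:
            return length
    return None


def n50(lengths):
    # N50: the target is the total length of the assembly itself.
    return nx50(lengths, sum(lengths))


def ng50(lengths, estimated_genome_size):
    # NG50: the target is an (assumed) estimate of the genome length.
    return nx50(lengths, estimated_genome_size)


# Hypothetical contig lengths, total 290.
contigs = [80, 70, 50, 40, 30, 20]
print(n50(contigs))         # 70  (80 + 70 = 150 >= 145)
print(ng50(contigs, 400))   # 50  (80 + 70 + 50 = 200 >= 200)
print(ng50(contigs, 1000))  # None (assembly covers less than half the estimate)
```

Because the NG50 is measured against the same genome estimate for every assembly, it allows assemblies of different total length to be compared, whereas the N50 depends on each assembly's own size.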
To establish whether the assemblies were producing too many or too few copies of homologous positions, the haplotype-containing columns in which the assembly's copy number lay outside the copy number interval were identified.

In conclusion, the Assemblathon competition, which aimed to comprehensively assess the state of the art in de novo assembly, received a total of forty-one assemblies from seventeen different groups. The simulated benchmark, including the correct answer, the assemblies, and the code used to evaluate them, is now freely available to the public. Numerous subclasses of the assembly problem can be distinguished by, among other things, the nature of the reads, the type of sequence being assembled, and the availability of previously assembled sequence such as a reference genome or the genome of a closely related species. To assess accuracy, assemblies are compared with finished sequences derived from sequencing experiments held out of the assembly process. When a reference genome or sequence is available, the assembly is compared against the reference. Comparison can also be made against a well-sequenced related species, for example using the complete genomic sequence of an out-group. Lastly, with simulation the haplotypes are known a priori, so an assembly can be assessed against them through multiple sequence alignment. This method allows us to evaluate haplotype-specific contributions to the assemblies.
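The MSA-based measures discussed above, coverage and copy number error, can be sketched on a toy alignment. Everything here is an assumption made for illustration: the `Column` record, the choice to take the copy number interval as the minimum and maximum copy count across the two haplotypes, and the example data. It is not the evaluation code released with the benchmark.

```python
from dataclasses import dataclass


@dataclass
class Column:
    hap_copies: tuple  # copies of this position in each haplotype, e.g. (1, 2)
    asm_copies: int    # copies of this position in the assembly


def haplotype_columns(columns):
    # Keep only columns that contain at least one haplotype position.
    return [c for c in columns if any(n > 0 for n in c.hap_copies)]


def coverage(columns):
    """Proportion of haplotype columns that also contain assembly positions."""
    hap = haplotype_columns(columns)
    if not hap:
        return 0.0
    return sum(1 for c in hap if c.asm_copies > 0) / len(hap)


def copy_number_errors(columns):
    """Proportions of haplotype columns whose assembly copy number lies
    above the interval maximum (excess) or below its minimum (deficiency)."""
    hap = haplotype_columns(columns)
    if not hap:
        return 0.0, 0.0
    excess = sum(1 for c in hap if c.asm_copies > max(c.hap_copies))
    deficiency = sum(1 for c in hap if c.asm_copies < min(c.hap_copies))
    return excess / len(hap), deficiency / len(hap)


# Hypothetical alignment columns for a diploid genome and one assembly.
columns = [
    Column((1, 1), 1),  # correct copy number
    Column((1, 2), 3),  # too many copies in the assembly
    Column((1, 1), 0),  # position missing from the assembly
    Column((0, 1), 0),  # present in one haplotype only; 0 is inside [0, 1]
    Column((0, 0), 1),  # assembly-only column, ignored by both measures
]
print(coverage(columns))            # 0.5
print(copy_number_errors(columns))  # (0.25, 0.25)
```

Reporting excess and deficiency separately shows whether an assembly tends to duplicate homologous positions or to collapse them.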