
WG1: Technology Watch

Objectives

The objective of this group is to provide timely alerts on new technology developments in next-generation sequencing. The members read papers on the latest technologies, applications, and upcoming developments, and report the current trends to the rest of the Action's members.

A description of tasks, deliverables, etc. can be found here: wg1_task_description.ppt

Chairs and Members

Chairs:

  • Ralf Herwig, Max Planck Institute for Molecular Genetics (Berlin, DE)
  • Thomas Svensson, Karolinska Institute (Stockholm, SE)

Members:

  • Robert Lyle, University Hospital (Oslo, NO)
  • Laurent Falquet, Vital-IT, Swiss Institute of Bioinformatics (Lausanne, CH)
  • Jean Imbert, Université de la Méditerranée (Marseille, FR)

Former members:

  • Alberto Policriti
  • Peter Rice, European Bioinformatics Institute (Hinxton, GB)
  • Endre Barta, University of Debrecen (Debrecen, HU)

WG1 Initial technology report

The purpose of the initial technology report will be:

  • to provide an overview on existing HTS (high-throughput sequencing) technologies
  • to provide an overview on computational approaches for selected applications of HTS in genome research
  • to identify major current bottlenecks and limitations
  • to provide a useful resource overview
  • to outline future developments in HTS

to come soon…

WG1 HTS Library

The purpose of the WG1 HTS Library is to provide summaries of publications, technical papers, conferences, etc. in the field of HTS.

1. Sequencing technology

Authors: Benjamin A Flusberg, Dale R Webster, Jessica H Lee, Kevin J Travers, Eric C Olivares, Tyson A Clark, Jonas Korlach & Stephen W Turner

Title: Direct detection of DNA methylation during single-molecule, real-time sequencing

Journal: Nature Methods

Year/Issue: 2010 Vol 7 No 6

Summary: This paper describes the first application of Pacific Biosciences SMRT sequencing to the detection of DNA methylation. The authors show that both mA and mC can be detected directly during sequencing. A breakthrough technology.

PubMed: http://www.ncbi.nlm.nih.gov/pubmed/20453866

Additional links: http://www.ncbi.nlm.nih.gov/pubmed/20517344.1 http://www.ncbi.nlm.nih.gov/pubmed/20508637.1

Contributor: LF

Authors: Clark TA, Murray IA, Morgan RD, Kislyuk AO, Spittle KE, Boitano M, Fomenkov A, Roberts RJ, Korlach J.

Title: Characterization of DNA methyltransferase specificities using single-molecule, real-time DNA sequencing.

Journal: Nucleic Acids Res.

Year/Issue: 2012 Feb;40(4):e29. Epub 2011 Dec 7.

Summary: This paper describes the first detection of N4-methylcytosine during sequencing, in addition to N6-methyladenine and 5-methylcytosine, using Pacific Biosciences SMRT sequencing.

PubMed: http://www.ncbi.nlm.nih.gov/pubmed/22156058

Additional links:

Contributor: LF

2. Applications & Methods

  • RNA-seq
  • Exome
  • Methylation
  • Genome assembly
  • Comparative genomics
  • ChIP-seq
  • microRNA
  • metagenomics
  • related technology

  • Computational analysis of genome-wide methylation data generated by MeDIP-seq

Authors: Chavez L, Jozefczuk J, Grimm C, Dietrich J, Timmermann B, Lehrach H, Herwig R*, Adjaye J*. *equal contribution

Title: Computational analysis of genome-wide DNA methylation during the differentiation of human embryonic stem cells along the endodermal lineage.

Journal: Genome Research

Year/Issue: 2010 Oct;20(10):1441-50. Epub 2010 Aug 27.

Summary: This paper describes MEDIPS, a software pipeline for genome-wide methylation studies with the MeDIP-seq (methylated DNA immunoprecipitation followed by sequencing) approach. The core of the pipeline is a newly developed normalization method for MeDIP-seq data. The rationale behind the method is based on the concept of coupling factors introduced by the BATMAN method of Down et al., 2008. Using a specific distance function for calculating coupling factors, the authors estimated, in genomic windows, the dependency between total CpG density and MeDIP-seq signals for the low range of coupling factors. MEDIPS then weights the MeDIP-seq signals by the estimated coupling-factor-dependent normalization parameters using a linear model. The authors report a correlation of 0.83 between MEDIPS-normalized MeDIP-seq data and benchmark data generated with bisulfite sequencing. They also applied the approach to the analysis of genome-wide differential methylation between human embryonic stem cells (hESCs) and stem cells differentiated to definitive endoderm.

PubMed: http://www.ncbi.nlm.nih.gov/pubmed/20802089

Additional links: MEDIPS tutorial http://medips.molgen.mpg.de; MEDIPS package in Bioconductor http://bioc.ism.ac.jp/2.8/bioc/html/MEDIPS.html
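
The normalization idea in the summary above can be illustrated with a small Python sketch. This is not the MEDIPS implementation (MEDIPS is an R/Bioconductor package) and is only one simplified reading of the summary: fit a straight line to signal versus coupling factor over the low range of coupling factors, then divide each window's signal by the value the line predicts. The function names, the cut-off `max_cf` and the input values are all assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def normalise_signals(coupling, signals, max_cf):
    """Scale each window's signal by the signal expected from its coupling factor.

    The line is fitted only on windows whose coupling factor is <= max_cf
    (the "low range"); max_cf is a hypothetical parameter of this sketch.
    """
    low = [(c, s) for c, s in zip(coupling, signals) if c <= max_cf]
    slope, intercept = fit_line([c for c, _ in low], [s for _, s in low])
    normalised = []
    for c, s in zip(coupling, signals):
        expected = slope * c + intercept
        # Guard against non-positive expectations instead of dividing by them.
        normalised.append(s / expected if expected > 0 else 0.0)
    return normalised


# Invented per-window values: coupling factors and raw MeDIP-seq signals.
coupling_factors = [1, 2, 3, 4, 10]
raw_signals = [2, 4, 6, 8, 5]
print(normalise_signals(coupling_factors, raw_signals, max_cf=4))
```

In this toy input the first four windows follow the fitted line exactly, while the CpG-dense last window has a much lower signal than its coupling factor would predict, so its normalized value is well below 1.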

  • Comparison of assembly tools

Authors: Salzberg SL, Phillippy AM, Zimin A, Puiu D, Magoc T, Koren S, Treangen TJ, Schatz MC, Delcher AL, Roberts M, Marçais G, Pop M, Yorke JA.

Title: GAGE: a critical evaluation of genome assemblies and assembly algorithms

Journal: Genome Research

Year/Issue: 2012 Jan 12

Summary: This paper compares the performance of various de novo assemblers on real short-read sequencing datasets, with the goal of defining a Genome Assembly Gold-standard Evaluation (GAGE). Four different genomes were compared, each sequenced with 2 or 3 libraries of various insert sizes (small 155-400bp, medium 2280-4000bp, and large 8-35kbp), using 8 different assemblers (ABySS, Allpaths-LG, Bambus2, CABOG, MSR-CA, SGA, SOAPdenovo, Velvet). The results support 3 main conclusions: 1) the quality of the data is more important than the assembler, so correcting the reads is of crucial importance, something that Allpaths-LG does very well; 2) the degree of contiguity of an assembly varies enormously, not only among assemblers but also among the target genomes; 3) the correctness of the assemblies varies widely and is not correlated with the statistics on contiguity. As a criticism, one can argue that the authors did not optimize the parameters for some assemblers, such as the k-mer value for Velvet, SOAPdenovo, ABySS and MSR-CA. For example, using only a single k-mer of 31 when the reads are much longer (101bp or more) is a real disadvantage for those assemblers. The datasets, on the other hand, were chosen to fit the requirements of Allpaths-LG so that it could be compared with the others.

PubMed: http://www.ncbi.nlm.nih.gov/pubmed/22147368

Additional links: Suppl. material http://genome.cshlp.org/content/early/2012/01/12/gr.131383.111/suppl/DC1; Data used by the authors http://gage.cbcb.umd.edu/data

Contributor: LF
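
The k-mer criticism above can be made concrete with a short Python sketch. It lists the k-mers that occur more than once in a sequence, which are the ones that make a k-mer-based assembly ambiguous. The 15-base sequence is invented for illustration (it contains the 4-base repeat ACGT twice) and the function names are hypothetical; real assemblers work on reads rather than a known genome and are far more elaborate.

```python
from collections import Counter


def kmers(seq, k):
    """Return all overlapping substrings of length k, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def repeated_kmers(seq, k):
    """Return the sorted list of distinct k-mers occurring more than once."""
    counts = Counter(kmers(seq, k))
    return sorted(kmer for kmer, n in counts.items() if n > 1)


# Toy sequence (an assumption of this sketch) with the repeat ACGT at two places.
toy_genome = "GGACGTCCTACGTAA"
for k in (3, 4, 5):
    print(k, repeated_kmers(toy_genome, k))

# A 101bp read contributes 101 - 31 + 1 k-mers when k = 31.
print(len(kmers("A" * 101, 31)))
```

Once k exceeds the repeat length, no k-mer is repeated any more. This is the reason a single small k can penalize an assembler when the reads would allow a larger one, at the price of fewer k-mers per read.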

  • Comparison of RNA-seq assembly tools

Authors: Qiong-Yi Zhao, Yi Wang, Yi-Meng Kong, Da Luo, Xuan Li, Pei Hao

Title: Optimizing de novo transcriptome assembly from short-read RNA-Seq data: a comparative study

Journal: BMC Bioinformatics

Year/Issue: 2011 Dec 14, 12(Suppl 14):S2

Summary: This paper compares the performance of various de novo RNA-seq assemblers on real public short-read sequencing data sets. Three different transcriptomes were compared: Drosophila melanogaster PE76bp Illumina, Schizosaccharomyces pombe 68PE strand-specific Illumina, and Camellia sinensis 75PE Illumina. The software compared comprised four single k-mer assemblers (SK: SOAPdenovo, ABySS, Oases and Trinity) and three multiple k-mer methods (MK: SOAPdenovo-MK, trans-ABySS and Oases-MK). Well written and detailed, the article shows that Trinity has an edge over the other assemblers but is much slower. Oases-MK is a good compromise when time is limited, but Oases can require more RAM.

PubMed: http://www.ncbi.nlm.nih.gov/pubmed/22373417

Additional links: Suppl. material http://www.biomedcentral.com/1471-2105/12/S14/S2/additional

Contributor: LF

  • Genome assembly issues

Authors: Liliana Florea, Alexander Souvorov, Theodore S. Kalbfleisch, Steven L. Salzberg

Title: Genome Assembly Has a Major Impact on Gene Content: A Comparison of Annotation in Two Bos Taurus Assemblies

Journal: PLoS One

Year/Issue: 2011, 6(6),e21400

Summary: This paper demonstrates something that seems obvious on reflection but had never been shown: the quality of a genome assembly affects the quality of its gene annotation. The authors compare 2 assemblies of the same Bos taurus genome, obtained from identical starting data sets but with an improved assembly program (Celera WGS) for the second. The second assembly clearly looks better in terms of classical statistics (N50, contig length, scaffold length, number of gaps, etc.), but interestingly the annotation (done with the same pipeline) highlighted 16% of structural variations. These were sometimes difficult to connect between the two annotations, and their quality was sometimes better with the first assembly than with the second. With the increasing number of draft genomes being published, it is perhaps worth investing effort in improving the finishing of those genomes, or in developing tools to assess and measure the accuracy of a genome assembly.

PubMed: http://www.ncbi.nlm.nih.gov/pubmed/21731731

Contributor: LF
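
Of the classical statistics mentioned above, N50 is the simplest to state: sort the contigs from longest to shortest and accumulate their lengths; N50 is the length of the contig at which the running total first reaches half of the total assembly length. A minimal Python sketch follows. The contig lengths are made-up values for illustration and are not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids rounding issues with total / 2.
        if running * 2 >= total:
            return length
    raise ValueError("n50() requires at least one contig length")


# Two hypothetical assemblies with the same total length (290).
fragmented = [80, 70, 50, 40, 30, 20]
contiguous = [150, 100, 40]
print(n50(fragmented), n50(contiguous))
```

Both invented assemblies cover the same number of bases, yet the second has a much higher N50. The statistic measures contiguity only and says nothing about correctness, which is consistent with the paper's observation that better statistics do not guarantee a better annotation.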

  • SNP and INDEL calling

Authors: Heng Li

Title: Exploring single-sample SNP and INDEL calling with whole-genome de novo assembly

Journal: Bioinformatics

Year/Issue: 2012 May 7 (ahead of pub).

Summary: The author presents a new way to store forward and reverse-complement DNA sequences in an FM-index. This makes it possible to develop a new de novo assembler, called “fermi”, that achieves quality similar to that of other assemblers. SNPs and short INDELs can be called from this assembly, with INDEL calling outperforming current methods. A further point of interest is that the assembled unitigs are a lossless reduced representation of the reads, preserving small variants and copy numbers, which suggests a possible new way to compress reads that would be non-redundant and smaller in size. However, the computational cost is prohibitive.

PubMed: http://www.ncbi.nlm.nih.gov/pubmed/22569178

Contributor: LF
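
To give an idea of what indexing both strands means, here is a deliberately naive Python sketch. It is not the data structure described in the paper: it simply builds a plain FM-index over a sequence concatenated with its reverse complement and counts exact matches by backward search. All function names are hypothetical, and the suffix sorting used here is only practical for toy inputs.

```python
def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T string."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def build_fm_index(text):
    """Build a naive FM-index by sorting all suffixes (toy inputs only)."""
    suffix_order = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (wrapping around at 0).
    bwt = [text[i - 1] for i in suffix_order]
    alphabet = sorted(set(text))
    # first[c]: number of characters in the text that sort before c.
    first, total = {}, 0
    for c in alphabet:
        first[c] = total
        total += text.count(c)
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {c: [0] for c in alphabet}
    for ch in bwt:
        for c in alphabet:
            occ[c].append(occ[c][-1] + (ch == c))
    return {"first": first, "occ": occ, "size": len(text)}


def count_occurrences(index, pattern):
    """Count exact matches of pattern using backward search."""
    lo, hi = 0, index["size"]
    for ch in reversed(pattern):
        if ch not in index["first"]:
            return 0
        lo = index["first"][ch] + index["occ"][ch][lo]
        hi = index["first"][ch] + index["occ"][ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


def index_both_strands(seq):
    """Index a sequence together with its reverse complement."""
    # "$" terminates each strand so that no match can span the junction.
    return build_fm_index(seq + "$" + reverse_complement(seq) + "$")


index = index_both_strands("GATTACA")
print(count_occurrences(index, "TTA"))  # query found on the forward strand
print(count_occurrences(index, "TAA"))  # its reverse complement
```

Because both strands are in the index, a query and its reverse complement always return the same count, so a read can be matched without knowing which strand it came from.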

3. Data integration & functional analysis

  • Gene Ontology

Authors: Louis du Plessis, Nives Skunca and Christophe Dessimoz

Title: The what, where, how and why of gene ontology – a primer for bioinformaticians

Journal: Briefings in Bioinformatics

Year/Issue: 2011, 12, 723-735

Summary: A good review of Gene Ontology, describing the advantages and pitfalls of using this classification resource.

PubMed: http://www.ncbi.nlm.nih.gov/pubmed/21330331

Contributor: LF

4. Databases, resources

5. New genomes

6. Personalized medicine

7. Genetics, statistics

8. Hardware, parallel computing

9. Meetings, projects


  • Human Genome Meeting 2012 in Sydney, Australia, 11.-14.03.2012

The main topic of the 16th Human Genome Meeting (HGM) is “Genetics and Genomics in Personalised Medicine”. The focus of this meeting is consistent with one of the aims of the Human Genome Organization, which is to foster the integration of genomic sciences in biology and medicine towards improving human health. The power of our current sequencing and genotyping technologies and their attendant analytical tools is providing remarkable precision and completeness in our understanding of the genetic causes of disease. The goal of this meeting is to explore the impact of next-generation genomic approaches on medicine and health.

Link: http://www.hgm2012.org/

Participants' reports: to come soon


  • European Conference on Computational Biology (ECCB) in Basel, Switzerland, 09.-12.09.2012

The European Conference on Computational Biology is the key European computational biology event in 2012, uniting scientists working in a broad range of disciplines, including bioinformatics, computational biology, biology, medicine, and systems biology. One of its featured research areas will be sequencing technology and personalized medicine.

Link: http://www.eccb12.org/home

Participants' reports: to come soon

WG1 Material collection

The purpose of this section is to provide additional materials, for example from SEQAHEAD meetings, presentations, reports, etc.

www.zotero.org_groups_seqahead_items

This is a link to Zotero

 
Last modified: 2012/07/26 11:33 by Laurent Falquet