Next-generation DNA sequencing (NGS) has been widely adopted by the research community. It offers a fast and sensitive approach to discovering the genetic and transcriptomic nuances of clinical samples.
The larger research groups have already purchased sequencing machines and have characterised a wide variety of clinical samples and cell lines. For the rest of us, the rapidly evolving technologies are bewildering and the utility of the data can seem questionable. This presentation cuts through the hyperbole and urban legends surrounding NGS and RNA-Seq to explore the risks and opportunities afforded by today's sequencing platforms.
Several of the more common genome sequencing and data analysis pitfalls will be discussed. What is a batch effect and how does it manifest? Which artefacts does formalin-fixed, paraffin-embedded (FFPE) storage introduce into our data? How reliable are our lists of copy number variants (CNVs)? Which of our genetic variants are actually somatic?
Some of the more common data formats in genome bioinformatics will also be demystified: what is a BAM file, how does it relate to a VCF file, and why? These files contain a massive amount of information, and bioinformaticians can present the data in a myriad of ways. Opening up these files encourages a more meaningful exchange of information between bioinformaticians, researchers and clinicians.