Academic Journal

Next-generation sequencing revolution through big data analytics.

Bibliographic Details
Title: Next-generation sequencing revolution through big data analytics.
Authors: Tripathi, Rashmi; Sharma, Pawan; Chakraborty, Pavan; Varadwaj, Pritish Kumar
Source: Frontiers in Life Science; Jun 2016, Vol. 9 Issue 2, p119-149, 31p
Abstract: Next-generation sequencing (NGS) technology has led to an unrivaled explosion in the amount of genomic data, and this escalation has in turn raised the challenges of sharing, archiving, integrating and analyzing these data. The scale and efficiency of NGS have made it difficult to analyze these vast genomic data, including gene interactions, annotations and expression studies. However, this limitation can be overcome by tools and algorithms built on a big data framework. Here we review the current state of knowledge of big data algorithms for NGS that reveal hidden patterns in sequencing, analysis and annotation. The Apache Hadoop framework provides an on-demand, adaptable environment for large-scale data analysis. It has several components for partitioning large-scale data across clusters of commodity hardware in a fault-tolerant manner. Packages such as MapReduce, CloudBurst, Crossbow, Myrna, Eoulsan, DistMap, Seal and Contrail support various NGS applications, such as adapter trimming, quality checking, read mapping, de novo assembly, quantification, expression analysis, variant analysis and annotation. This review covers current applications of Hadoop technology, with their usage and limitations from the perspective of NGS. [ABSTRACT FROM AUTHOR]
Subject Terms: NUCLEOTIDE sequencing, BIG data, GENOMES, MATHEMATICAL models
Copyright of Frontiers in Life Science is the property of Taylor & Francis Ltd and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
ISSN: 2155-3769
DOI: 10.1080/21553769.2016.1178180
Database: Complementary Index