Algorithms for genome sequence analysis

Project duration: 2014 - 2017

Funding: Croatian Science Foundation

Key collaborator: Niranjan Nagarajan (A*STAR GIS, Singapore)

The overarching goal of the project is to develop accurate, fast algorithms and tools for analyzing whole-genome sequencing data. The project focuses on output from third-generation sequencing machines, which produce longer but more error-prone sequence reads. It builds on sequence alignment algorithms, graph algorithms and signal processing methods, which will be applied to DNA sequence assembly, RNA sequencing data and sequence similarity database search. The algorithms should be able to handle data obtained from mammalian and plant genomes (sizes greater than 10^9 base pairs). Special emphasis will be put on multi-core, many-core (GPU - graphics processing unit) and intra-core (Intel's SSE - Streaming SIMD Extensions and AVX - Advanced Vector Extensions) parallelism. Additionally, the algorithms should scale well across the available computational architecture. All algorithms will be implemented in C/C++. The research will result in novel algorithms tailored to the characteristics of current and future long-read sequencing data. The implemented methods will advance the state of the art in sequence similarity database search, RNA-seq mapping and DNA assembly, ideally providing researchers with methods that return results in feasible time with limited computational resources. This, in turn, could affect current practices in genomic research, help design new medical strategies, and enable faster and more accurate analyses and diagnoses.

© 2017 Complex Network and Bioinformatics Group