8:30 - 8:40 | Opening Remarks |
8:40 - 9:20 | Keynote 1: Onur Mutlu (ETH, CMU)
“Accelerating Genome Analysis: A Primer on an Ongoing Journey” (slides) |
09:20 - 09:40 | Mohammed Alser+, Hasan Hassan*, Akash Kumar&, Onur Mutlu* and Can Alkan+ (+Bilkent Univ., *ETH Zurich, &TU Dresden)
Exploring Speed/Accuracy Trade-offs in Hardware Accelerated Pre-Alignment in Genome Analysis (slides) |
09:40 - 10:00 | Lisa Wu, Frank Nothaft, Brendan Sweeney, David Bruns-Smith, Sagar Karandikar, Johnny Le, Howard Mao, Krste Asanovic, David Patterson and Anthony Joseph (UC Berkeley)
Accelerating Duplicate Marking In The Cloud |
10:00 - 10:30 | Coffee break |
10:30 - 11:10 | Invited Talk: Bertil Schmidt (JGU Mainz)
“Next-Generation Sequencing: Big Data meets High Performance Computing Architectures” |
11:10 - 11:30 | Wenqin Huangfu+, Zhenhua Zhu*, Tianqi Tang+, Xing Hu+, Yu Wang* and Yuan Xie+ (+UCSB, *Tsinghua University)
GAME: GPU Acceleration of Metagenomics Clustering |
11:30 - 11:50 | Jose M. Herruzo+, Sonia Gonzalez-Navarro+, Pablo Ibañez*, Victor Viñals*, Jesus Alastruey* and Oscar Plata+ (+Univ. of Malaga, *Univ. of Zaragoza)
Exact Alignment with FM-index on the Intel Xeon Phi Knights Landing Processor |
11:50 - 13:30 | Lunch |
13:30 - 14:10 | Keynote 2: Srinivas Aluru (Georgia Tech)
“Automata Processor and its Applications in Bioinformatics” |
14:10 - 14:30 | Tommy Tracy II, Jack Wadden, Kevin Skadron and Mircea Stan (UVA)
Streaming Gap-Aware Seed Alignment on the Cache Automaton |
14:30 - 14:50 | Roman Kaplan, Leonid Yavits and Ran Ginosar (Technion)
Processing-in-Storage Architecture for Large-Scale Biological Sequence Alignment |
14:50 - 15:10 | Xueqi Li, Guangming Tan, Yuanrong Wang and Ninghui Sun (ICT)
The Genomic Benchmark Suite: Characterization and Architecture Implications |
15:10 - 15:30 | Coffee break |
15:30 - 16:10 | Invited Talk: Can Alkan (Bilkent University)
"Addressing Computational Burden to Realize Precision Medicine" (slides) |
16:10 - 16:30 | Sergiu Mosanu and Mircea Stan (UVA)
Burrows-Wheeler Short Read Aligner on AWS EC2 F1 (slides) |
16:30 - 16:50 | Angélica Alejandra Serrano-Rubio, Amilcar Meneses-Viveros, Guillermo B. Morales-Luna and Mireya Paredes-López (CINVESTAV-IPN)
Towards BIMAX: Binary Inclusion-MAXimal parallel implementation for gene expression analysis |
16:50 - 17:00 | Short break |
17:00 - 17:15 | Meysam Taassori+, Anirban Nag+, Keeton Hodgson+, Ali Shafiee* and Rajeev Balasubramonian+ (+Univ. of Utah, *Samsung Electronics)
Memory: The Dominant Bottleneck in Genomic Workloads (slides) |
17:15 - 17:30 | Meysam Roodi and Andreas Moshovos (Univ. of Toronto)
Gene Sequencing: Where Time Goes |
17:30 - 17:45 | Calvin Bulla, Lluc Alvarez and Miquel Moreto (BSC)
Are Next-Generation HPC Systems Ready for Population-level Genomics Data Analytics? (slides) |
17:45 - 17:50 | Closing remarks |
Social Event |
18:15 | Bus leaves for the social event (Heurigen) |
Talk abstract:
Genome analysis is the foundation of many scientific and medical discoveries as well as a key pillar of personalized medicine.
Any analysis of a genome fundamentally starts with the reconstruction of the genome from its sequenced fragments.
This process is called read mapping.
One key goal of read mapping is to find the variations that are present between the sequenced genome and reference genome(s) and to tolerate the errors introduced by the genome sequencing process.
Read mapping is currently a major bottleneck in the entire genome analysis pipeline because state-of-the-art genome sequencing technologies are able to sequence a genome much faster than the computational techniques that are employed to reconstruct the genome.
New sequencing technologies, like nanopore sequencing, greatly exacerbate this problem while at the same time making genome sequencing much less costly.
This talk describes our ongoing journey in greatly improving the performance of genome read mapping.
We first provide a brief background on read mappers that can comprehensively find variations and tolerate sequencing errors.
Then, we describe both algorithmic and hardware-based acceleration approaches.
Algorithmic approaches exploit the structure of the genome as well as the structure of the underlying hardware.
Hardware-based acceleration approaches exploit specialized microarchitectures or new execution paradigms like processing in memory.
We show that significant improvements are possible with both algorithmic and hardware-based approaches and their combination.
We conclude with a foreshadowing of future challenges brought about by very low-cost yet highly error-prone new sequencing technologies.
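As a rough illustration of what "read mapping that tolerates sequencing errors" means, the sketch below slides a read along a reference and reports every position where the edit distance to the same-length window stays within a threshold. This is a deliberately naive baseline and not the method presented in the talk; the function names `edit_distance` and `map_read` and the parameter `max_edits` are assumptions made for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion from a
                            curr[j - 1] + 1,      # insertion into a
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def map_read(reference: str, read: str, max_edits: int) -> list[tuple[int, int]]:
    """Return (position, distance) for every window within max_edits of the read.

    Hypothetical helper: compares the read against each same-length window of
    the reference, which costs O(len(reference) * len(read)^2) time overall.
    """
    n = len(read)
    hits = []
    for pos in range(len(reference) - n + 1):
        d = edit_distance(read, reference[pos:pos + n])
        if d <= max_edits:
            hits.append((pos, d))
    return hits


if __name__ == "__main__":
    # Exact matches of the read at two locations in a toy reference.
    print(map_read("GATTACAGATTACA", "TTACA", 0))  # [(2, 0), (9, 0)]
    # The same locations are still found when the read carries one error.
    print(map_read("GATTACAGATTACA", "TTGCA", 1))  # [(2, 1), (9, 1)]
```

The quadratic work per candidate position is what makes this approach impractical at genome scale, which is the kind of cost that algorithmic filtering and hardware acceleration aim to reduce.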
Talk abstract: This talk will introduce the Micron Automata Processor (AP), a novel computing architecture that enables massively parallel execution of numerous non-deterministic finite automata. The processor inspires a new programming paradigm of solving problems using complex pattern matching engines executed over streaming data. The first part of this talk will focus on the processor characteristics, programming and execution environment, and design principles we discovered that are of value in developing applications on the AP. The second part will feature my group's research on developing bioinformatics algorithms for the AP including database search and motif detection.
Talk abstract: The progress of NGS has a major impact on medical and genomic research. This high-throughput technology can now produce billions of short DNA fragments, amounting to more than a few terabytes of data, in a single run. This leads to massive datasets used by a wide range of applications including personalized cancer treatment and precision medicine. In addition to the hugely increased throughput, the cost of using high-throughput technologies has been dramatically decreasing. A low sequencing cost of around US$1K per genome has now rendered large population-scale projects feasible. However, in order to make effective use of the produced data, the design of big data algorithms and their efficient implementation on modern HPC systems are required. In this talk, I will present the design of scalable algorithms for metagenomic read classification and for massively parallel hash maps on multi-GPU nodes.
Talk abstract:
The main computational bottleneck of HTS data analysis is to map the reads to a reference genome, for which clusters are typically used.
However, building clusters large enough to handle hundreds of petabytes of data is infeasible.
Additionally, the reference genome is periodically updated to fix errors and include newly sequenced insertions, so in many large-scale genome projects the reads are realigned to the new reference.
Therefore, we need to explore volunteer grid computing technologies to help ameliorate the need for large clusters.
However, since the computational demands of HTS read mapping are substantial, and the turnaround of analysis should be fast, we also need a method to motivate volunteers to dedicate valuable resources.
For this purpose, we propose to merge distributed read mapping techniques with the popular cryptocurrency protocols.
Cryptocurrencies such as Bitcoin calculate a value (called a nonce) to ensure that the creation of new blocks (i.e. “money”) is limited in the system; however, this calculation serves no other practical purpose.
Our solution (Coinami) replaces nonce with a token signed by an authority that can be acquired by returning the alignment results assigned by the authority.
Authorities have two main tasks in our system: 1) inject new problem sets (i.e. “alignment problems”) into the system, and 2) check for the validity of the results to prevent counterfeit
leonid.yavits@nububbles.com
Short bio: Leonid received his MSc and PhD in Electrical Engineering from the Technion. After graduating, he co-founded VisionTech where he co-designed a single chip MPEG2 codec. Following VisionTech’s acquisition by Broadcom, he co-founded Horizon Semiconductors where he co-designed a Set Top Box on chip for cable and satellite TV.
Leonid is a postdoctoral fellow in Electrical Engineering at the Technion. He has co-authored a number of patents and research papers on SoC and ASIC. His research interests include non-von Neumann computer architectures and processing in memory.
romankap@gmail.com
Short bio:
Roman received his BSc and MSc from the Faculty of Electrical Engineering, Technion, Israel in 2009 and 2015, respectively. He is now a PhD candidate in the same faculty under the supervision of Prof. Ran Ginosar.
Roman's research interests are parallel computer architectures, in-data processing, accelerators for machine learning and big data, and novel computer architectures for bioinformatics applications.