Sections that describe advanced, often technical concepts that are not central to the main flow of the book are marked with a star. See the Teaching page for a list of logical dependencies among chapters. Use the Map of HTS to see how the main computational steps in high-throughput sequencing map to book chapters.
Part I – Preliminaries
1. Molecular biology and
high-throughput sequencing
1.1 DNA, RNA, proteins
1.2 Genetic variations
1.3 High-throughput sequencing
2. Algorithm design
2.1 Complexity analysis
2.2 Data representations
2.3 Reductions
2.4 Literature
3. Data structures
3.1 Dynamic range minimum queries
3.2 Bitvector rank and select operations
3.3 Wavelet tree
3.3.1 Balanced representation
3.3.2 Range queries
3.4 Literature
4. Graphs
4.1 Directed acyclic graphs (DAGs)
4.1.1 Topological ordering
4.1.2 Shortest paths
4.2 Arbitrary directed graphs
4.2.1 Eulerian paths
4.2.2 Shortest paths and the Bellman-Ford method
4.3 Literature
5. Network flows
5.1 Flows and their decompositions
5.2 Minimum-cost flows and circulations
5.2.1 The residual graph
5.2.2 A pseudo-polynomial algorithm
5.3 Bipartite matching problems
5.3.1 Perfect matching
5.3.2 Matching with capacity constraints
5.3.3 Matching with residual constraints
5.4 Covering problems
5.4.1 Disjoint cycle cover
5.4.2 Minimum path cover in a DAG
5.5 Literature
Part V – Applications
14. Genomics
14.1 Variation calling
14.1.1 Calling small variants
14.1.2 Calling large variants
14.2 Variation calling over pan-genomes
14.2.1 Alignments on a set of individual genomes
14.2.2 Alignments on the labeled DAG of a population
14.2.3 Evaluation of variation calling results
14.3 Haplotype assembly and phasing
14.4 Literature
15. Transcriptomics
15.1 Estimating the expression of annotated transcripts
15.2 Transcript assembly
15.2.1 Short reads
15.2.2 Long reads
15.2.3 Paired-end reads
15.3 Simultaneous assembly and expression estimation
15.4 Transcript alignment with co-linear chaining
15.5 Literature
16. Metagenomics
16.1 Species estimation
16.1.1 Single-read methods
16.1.2 Multi-read and coverage-sensitive methods
16.2 Read clustering
16.2.1 Filtering reads from low-frequency species
16.2.2 Initializing clusters
16.2.3 Growing clusters
16.3 Comparing metagenomic samples
16.3.1 Sequence-based methods
16.3.2 Read-based methods
16.3.3 Multi-sample methods
16.4 Literature