This book is a guide to building computational gene finders and describes the state of the art in computational gene-finding methods, with a focus on comparative approaches. Fully updated and expanded, this new edition examines next-generation sequencing (NGS) technology. The book also discusses conditional random fields, extending its broad coverage of topics spanning probability theory, statistics, information theory, optimization theory, and numerical analysis. Features: introduces the fundamental terms and concepts in the field; discusses algorithms for single-species gene finding and approaches to pairwise and multiple sequence alignment, then describes how the strengths of both areas can be combined to improve the accuracy of gene finding; explores the gene features most commonly captured by a computational gene model, and explains the basics of parameter training; illustrates how to implement a comparative gene finder; examines NGS techniques and how to build a genome annotation pipeline.
Chapter 1 Introduction
Chapter 2 Single Species Gene Finding
Chapter 3 Sequence Alignment
Chapter 4 Comparative Gene Finding
Chapter 5 Gene Structure Submodels
Chapter 6 Parameter Training
Chapter 7 Implementation of a Comparative Gene Finder
Chapter 8 Annotation Pipelines for Next-Generation Sequencing Projects