by David W. Mount. Cold Spring Harbor Laboratory Press (2001). 564 pages. ISBN 0-87969-608-7. £70
The idea of making a living in sequence analysis in the early days of bioinformatics risked provoking the incredulity and, often, the derision of one's peers. Why line up letters when there are so many interesting experiments waiting? The growth of bioinformatics and, more importantly, of sequence databases, however, has made this new discipline an essential part of the biologist's repertoire. Now, every experimental result can be cast in the light of evolutionary history to provide novel insights into homologous systems.
The maturation of a discipline requires degree courses, proselytizers, historians and textbooks. In each of these respects, David Mount's Bioinformatics: Sequence and Genome Analysis reflects the coming of age of bioinformatics. It is a foundation textbook that archives the early history of the field and describes the emergence of key analytical methodologies. If you have ever been curious about the inner workings of the BLAST algorithm, about what is meant by the threading of sequence onto structure, or about what Rosetta stone proteins are, then this book is for you.
In the first two chapters, the reader is gently introduced to the early history of computational biology and to humdrum, but necessary, matters such as formats and data structures. Pair-wise and multiple alignment methods are then plumbed in depth, with good pointers to many of the classic papers in the field. This reading is not for the fainthearted, especially those who seldom stray beyond the bounds of BLAST: the in-depth, and often step-by-step, algorithmic analysis is spread over two chapters and some 150 pages. Subsequent chapters describe a gamut of bioinformatics prediction methods relating to RNA structure, phylogeny, homology, genes and protein structures. The last chapter concentrates on the fast-moving field of genome analysis. Currently, such research is the preserve of well-heeled bioinformatics research groups because it requires significant computational resources. However, as genome sequencing becomes more routine and computer prices tumble, this area could prove fruitful for smaller research groups.
The author and publishers have addressed the rapid evolution of bioinformatics methodologies by providing a website (http://www.bioinformaticsonline.org), which displays weblinks, examples and problems at no extra charge. The website also promises to keep the book's readers up-to-date in the future with the latest strategies and technologies. Although this book was first published in 2001, the fast pace of progress in the field guarantees that some of the material is already out of date. Advances in 2001 and 2002, such as refinements to PSI-BLAST (Schäffer et al., 2001), genome analysis tools such as BLAT (Kent, 2002) and gene prediction algorithms based on genome-genome alignments (Korf et al., 2001), would be necessary additions to any subsequent edition.
Some of the material in this book would have benefited from a more critical discussion of the respective merits of the various bioinformatics tools and their applicability to different biological problems. Nevertheless, much of the vocabulary and many of the 'rules of thumb' in bioinformatics are explained succinctly enough. Thus, Bioinformatics: Sequence and Genome Analysis should find a place in any advanced undergraduate or graduate bioinformatics degree course.