Engineering is the art or science of making practical. –Samuel C. Florman
One man’s “magic” is another man’s engineering. –Robert Heinlein
In the 21st century, biology is being transformed from a purely lab-based science into one that is also an information science. As such, biologists have had to draw assistance from those in mathematics, computer science, and engineering. The result has been the development of the fields of bioinformatics and computational biology (terms often used interchangeably). The major goal of these fields is to extract new biological insights from the large and noisy sets of data being generated by high-throughput technologies. Initially, the main problems in bioinformatics were how to create and maintain databases for massive amounts of DNA sequence data. Addressing these challenges also involved the development of efficient interfaces for researchers to access, submit, and revise data. Bioinformatics has since expanded into the development of software that also analyzes and interprets these data. While bioinformatics has come to mean many things, this text uses the term to refer to the analysis of static data such as DNA and protein sequences, techniques for finding genes or evolutionary patterns, and cluster analysis of microarray data. Algorithms for bioinformatics are not covered in this text.
The focus of this text is modeling, analysis, and design methods for systems biology. Systems biology has been the subject of several books, each of which gives it a somewhat different meaning. This book uses the term to mean the study of the mechanisms underlying complex molecular processes as integrated into systems or pathways made up of many interacting genes and proteins. In other words, systems biology is concerned with the analysis of dynamic models. While it has long been known that developing dynamic models of complete systems is essential to understanding biological processes, it is only recently that the emergence of new high-throughput experimental data acquisition methods has made it possible to explore such computational models. Example experimental techniques include cDNA microarrays and oligonucleotide chips, mass spectrometric identification of gel-separated proteins, yeast two-hybrid systems, and genome-wide location analysis (ChIP-chip).
Systems biology involves the collection of large experimental data sets using high-throughput technologies, the development of mathematical models that predict important elements in these data, the design of software to accurately and efficiently make predictions in silico (i.e., on a computer), quality assessment of models by comparing numerical simulations with experimental data, and the design of new synthetic biological systems. The ultimate goal of systems biology is to develop models and analytical methods that provide reasonable predictions of experimental results. While it will never replace experimental methods, the application of computational approaches to gain understanding of various biological processes has the promise of helping experimentalists work more efficiently. These methods may also help gain insight into biological mechanisms, information that may not be obtainable from any known experimental method. Eventually, such models and analytical techniques could have a substantial impact on society, for example by aiding in drug discovery.
Systems biologists analyze several types of molecular systems, including genetic regulatory networks, metabolic networks, and protein networks. The primary focus of this book is genetic regulatory networks, which are referred to as genetic circuits in this book. These circuits regulate gene expression at many molecular levels through numerous feedback mechanisms. Chapter 1 presents the basic molecular biology and biochemistry principles that are needed to understand these circuits. A few bacterial genetic circuits are well understood. One such circuit is the lysis/lysogeny decision circuit of the phage lambda. This circuit is described in Chapter 1 and used as a running example throughout this book.
During the genomic age, standards for representing sequence data were (and still are) essential. Data collected from a variety of sources could not be easily used by multiple researchers without a standard data format. For systems biology, standard data formats are also being developed. One format that seems to be gaining some traction is the systems biology markup language (SBML), an XML-based language for representing chemical reaction networks. All the types of models described in this book can be reduced to a set of biochemical reactions. The basic structure of an SBML model is a list of chemical species coupled with a list of chemical reactions. Each chemical reaction includes a list of reactants, products, and modifiers. It also includes a mathematical description of the kinetic rate law governing the dynamics of this reaction. SBML is not intended for constructing models by hand. Fortunately, several graphical user interfaces (GUIs) have been developed for entering or drawing up chemical reaction networks, which can then be exported in the SBML format.
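The model structure described above (a list of species plus a list of reactions, each with reactants, products, modifiers, and a kinetic rate law) can be sketched in a few lines of Python. This is only an illustrative in-memory analogue, not SBML itself and not the API of any SBML library: the class names, the `derivatives` helper, the species names, and the rate constants are all hypothetical choices made for this sketch.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

State = Dict[str, float]  # species name -> current amount


@dataclass
class Reaction:
    """One reaction: reactants, products, modifiers, and a kinetic rate law."""
    name: str
    reactants: Dict[str, int]            # species -> stoichiometry consumed
    products: Dict[str, int]             # species -> stoichiometry produced
    modifiers: List[str]                 # species that affect the rate but are not consumed
    rate_law: Callable[[State], float]   # maps the current state to a reaction rate


@dataclass
class Model:
    """A reaction network: initial species amounts plus a list of reactions."""
    species: State
    reactions: List[Reaction] = field(default_factory=list)


def derivatives(model: Model, state: State) -> State:
    """Net rate of change of every species implied by the reaction list."""
    rates = {name: 0.0 for name in model.species}
    for rxn in model.reactions:
        v = rxn.rate_law(state)
        for s, n in rxn.reactants.items():
            rates[s] -= n * v
        for s, n in rxn.products.items():
            rates[s] += n * v
    return rates


# Hypothetical two-reaction network: a gene acts as a modifier for protein
# production, and the protein degrades at a rate proportional to its amount.
example = Model(
    species={"gene": 1.0, "protein": 0.0},
    reactions=[
        Reaction("production", {}, {"protein": 1}, ["gene"],
                 lambda s: 10.0 * s["gene"]),
        Reaction("degradation", {"protein": 1}, {}, [],
                 lambda s: 0.5 * s["protein"]),
    ],
)
```

Calling `derivatives(example, example.species)` gives the initial rate of change of each species. Keeping the rate law as a function of the full state is what lets a modifier such as `gene` influence a reaction without appearing among its reactants or products.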
Another essential item in the genomic age was the development of biological databases, which provide repositories for storing large bodies of data that can be easily updated, queried, and retrieved. Databases have been developed to store different sets of information ranging from nucleotide sequences within GenBank to biomedical literature at PubMed. Recently, a new database has also been developed for SBML models of various biochemical systems.
A final essential piece of the puzzle is the development of tools for analysis. An excellent list of bioinformatics tools can be found on NCBI’s website. A list of systems biology tools that support the SBML language can be found at the SBML website. This text concentrates on describing the methods used by such tools.
Given our vast experience in reasoning about complex circuits and systems, engineers are uniquely equipped to assist with the development of tools for the modeling and analysis of genetic circuits. It has been shown that viewing a genetic circuit as an electronic circuit can yield new insights and open up the application of engineering tools to genetic circuits. Therefore, as in the sequencing of the human genome, collaborations between engineers and biologists will be essential to the success of systems biology. The major goal of this textbook is to facilitate these collaborations.
An engineering approach involves three parts. First, engineers examine experimental data in order to learn mathematical models. Second, engineers develop efficient abstraction and simulation methods to analyze these models. Finally, engineers use these analytical methods to guide the design of new circuits. This book discusses all three aspects of this engineering approach as applied to genetic circuits. Chapter 2 describes modern experimental techniques and methods for learning genetic circuit models from the data generated by these experiments. The next four chapters explore methods for analyzing these genetic circuit models. Perhaps the most common method for modeling and analysis uses differential equations, which are the subject of Chapter 3. In genetic circuits, however, the numbers of molecules involved are typically very small, thereby requiring the use of stochastic analysis; these methods are described in Chapter 4. Stochastic analysis can be extremely complex, often limiting its application to only the simplest systems. To reduce this complexity, Chapter 5 presents several reaction-based abstraction methods to simplify the models while maintaining reasonable accuracy. Since the state space of these models is still often quite large, Chapter 6 presents logical abstraction to reduce this state space and further improve analysis time. Finally, using these analytical methods, researchers are beginning to design synthetic genetic circuits as described in Chapter 7. It is our hope that this book will prove to be useful both to engineers who wish to learn about genetic circuits and to biologists who would like to learn about engineering techniques that can be used to study their systems of interest.
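The contrast drawn above between differential-equation analysis and stochastic analysis can be illustrated with a minimal sketch. It assumes a hypothetical one-species system in which a protein is produced at a constant rate `k_prod` and degrades at rate `k_deg` per molecule; the function names and parameter values are invented for this example. The deterministic version uses simple forward-Euler integration, and the stochastic version uses Gillespie's direct method, one widely used stochastic simulation algorithm. Whether these match the specific methods presented in Chapters 3 and 4 is an assumption.

```python
import random


def simulate_ode(k_prod, k_deg, p0, t_end, dt=0.01):
    """Forward-Euler integration of dP/dt = k_prod - k_deg * P."""
    p = float(p0)
    for _ in range(int(round(t_end / dt))):
        p += dt * (k_prod - k_deg * p)
    return p


def simulate_ssa(k_prod, k_deg, p0, t_end, rng):
    """Gillespie direct method for the same production/degradation system.

    The molecule count stays an integer and changes by one molecule
    per reaction event.
    """
    p, t = int(p0), 0.0
    while True:
        a_prod = k_prod          # propensity of a production event
        a_deg = k_deg * p        # propensity of a degradation event
        a_total = a_prod + a_deg
        if a_total == 0.0:
            return p             # no reaction can fire any more
        t += rng.expovariate(a_total)   # waiting time to the next event
        if t > t_end:
            return p
        if rng.random() * a_total < a_prod:
            p += 1
        else:
            p -= 1


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    print("ODE value at t = 50:", simulate_ode(10.0, 0.5, 0, 50.0))
    print("Five stochastic runs:",
          [simulate_ssa(10.0, 0.5, 0, 50.0, rng) for _ in range(5)])
```

The deterministic run settles at the steady state `k_prod / k_deg`, whereas each stochastic run ends at a different integer count scattered around that value. With molecule counts this small, that run-to-run variation is a sizeable fraction of the mean, which is the motivation given above for stochastic analysis.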
This textbook has been used several times in an advanced undergraduate/graduate level course on the modeling, analysis, and design of biological circuits. In a semester version of this course, students select their own biological circuit and learn about it from research papers. Homework each week includes a paper-and-pencil problem and a software tutorial using the phage lambda example. Students then repeat the model design or analysis using their selected circuit. Finally, students complete a more extensive final project, which ranges from modeling and analysis to the development of software, depending on the student’s background and interests. Throughout, the students use our iBioSim tool, which supports most of the topics discussed in this textbook. This tool allows one to construct models, learn from experimental data, and perform either differential or stochastic analyses utilizing automatic reaction-based and logical abstractions. Course examples including lecture materials and assignments as well as our iBioSim software are freely available from:
Chris J. Myers
Salt Lake City, Utah