Communication between and within cells is essential for multicellular life. While intracellular signal transduction pathways are often specified in molecular terms, the information content they transmit remains poorly defined. Here, we review research efforts to merge biological experimentation with concepts of communication that emerge from the engineering disciplines of signal processing and control theory. We discuss the challenges of performing experiments that quantitate information transfer at the molecular level, and we highlight recent studies that have advanced toward a clearer definition of the information content carried by signaling molecules. Across these studies, we emphasize a theme of increasingly well-matched experimental and theoretical approaches to decode the data streams directing cellular behavior.
Introduction: parallels between systems for information transfer
The study of cellular signal transduction – the transfer of non-genetic information within and between cells – has been a key interface point between experimental biology and systems biology. For biomedical researchers and experimental biologists, signal transduction pathways are of interest because of their central role in co-ordinating organismal development and physiological homeostasis. The etiology of most chronic human diseases can be traced to abnormal function of a regulatory network, such as mutations that alter signaling protein activity. For engineers and scientists trained in quantitative methods, these inter- and intracellular communication networks have characteristics that parallel well-studied problems in communication, presenting an attractive challenge for the application of well-established theoretical tools with the hope of overcoming some of the limitations of purely experimental research. The relationship between these fields has been renewed and revisited many times over the past 30 years [1–3]. The most important biological insights have emerged when experimental and quantitative tools are carefully and thoughtfully conjoined. In this article, we focus on a particular area of renewed collaboration, in which advances in the ability to detect signaling events with a high level of detail in individual cells have enabled connections to the engineering discipline of signal processing. Our discussion is intended as a guide to further reading, rather than a comprehensive review, with the goal of drawing interest to emerging questions in this area that will likely be rewarding over the next few years.
In traditional signal transduction experiments, the significance of a biochemical event is often evaluated (sometimes subconsciously) by its apparent magnitude. For example, a band on a Western blot representing the phosphorylation of a protein in stimulated cells may be ten-fold more intense than the corresponding band from unstimulated cells. This increase may be judged more significant than the one in which a two-fold change is induced. However, such judgements are often made in the absence of knowledge of whether these differences have functional importance within the cells of interest. A two-fold increase may be sufficient to saturate the process being studied, while a ten-fold increase evokes no further response. Conversely, both activity levels might fall below the threshold to induce a cellular response of interest. The true significance of the result (for the cell) depends on the strength of the signal relative to the responsiveness of the next step in the process. Establishing this quantitative relationship between signal and response is often challenging and leads to a great deal of ambiguity in both conceptual and formal models of signaling processes. In this article, we explore both the experimental challenges inherent in addressing such questions within signaling pathways and the broader biological concepts that have emerged from research in this area.
The scenario described above is an example of a problem inherent in any multistage communication system, whether natural or human-engineered. Communication systems – including signaling pathways, neural networks, or electronic circuits – consist of multiple elements in sequence, each of which receives an input signal and produces an output signal (Figure 1A). The basic function of each element - whether it be a kinase within a signaling cascade, a neurone within a neural pathway, or a transistor within a radio - is to produce an output signal that is variable and dependent on the input signal. The simplest types of element simply relay the input signal without changing it, while more complex elements can transform the input signal in a number of ways to create the output signal. In engineering, the relationship between the input and the output for each element is known as its ‘transfer function’. Importantly, transfer functions can be used to characterize the behavior of individual elements within a system (e.g. the response of a single protein kinase within a cascade; Figure 1B) or larger sets of connected components (e.g. an entire kinase cascade; Figure 1C). This concept has played a central role in the development of electronic communication systems, from radio broadcasting to cellular telephones, and is also essential to our understanding of information processing in neural and sensory systems. Yet, the application of such ideas to intracellular signaling pathways has been limited, especially in mammalian systems.
Figure 1. Alignment of transfer functions in signal transduction cascades.
A fundamental concept has emerged in these other communication-related fields that has strong implications for signal transduction pathways: for a pathway to transmit information – such as the concentration of an extracellular ligand – the transfer functions of every element in the pathway must be well aligned. If they are not, the ability of the system to act as a conduit for information is severely compromised. If elements are connected without attention to alignment, it is more likely than not that their input and output ranges will be mismatched, and the output of one element will be either too high or too low, leading to either saturation of the downstream element or failure to stimulate a response. A simple illustration of such a situation can be obtained visually, by walking from a dark room into bright sunlight; the initial discomfort and visual difficulty of this experience result, in part, from saturation of the opsin-coupled G-protein signaling pathway in the photoreceptor cells of the eye. Fortunately, the visual system has adaptive properties that quickly adjust the transfer function of the system, enabling more intense input signals to be effectively processed. Within both engineering and neurobiology, such adaptation is known as ‘gain control’. Mechanisms facilitating gain control have been well studied in neuronal and sensory systems, where they are essential for ensuring that the outputs and inputs of successive neurones are appropriately matched.
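The effect of alignment can be made concrete with a minimal sketch. It assumes, purely for illustration, that each element follows a Hill-type transfer function; the function names, EC50 values, and dose list are hypothetical and are not taken from any measured pathway. Two elements are placed in series, and the span of the final output is compared when the downstream element is aligned with the upstream output range, saturated by it, or never reached by it.

```python
def hill(x, ec50, n=2.0, top=1.0):
    """Hypothetical transfer function of one element: output for input x."""
    return top * x**n / (ec50**n + x**n)

def cascade(x, ec50_upstream, ec50_downstream):
    """Two elements in series: the upstream output is the downstream input."""
    return hill(hill(x, ec50_upstream), ec50_downstream)

def output_span(ec50_downstream, doses, ec50_upstream=1.0):
    """Difference between the largest and smallest cascade output over the doses."""
    outputs = [cascade(d, ec50_upstream, ec50_downstream) for d in doses]
    return max(outputs) - min(outputs)

# Assumed stimulus concentrations (arbitrary units) spanning a 100-fold range.
doses = [0.1, 0.3, 1.0, 3.0, 10.0]

# The upstream element outputs values between 0 and 1, so:
aligned = output_span(0.5, doses)        # downstream EC50 sits inside that range
saturated = output_span(0.001, doses)    # downstream is saturated at every dose
unresponsive = output_span(50.0, doses)  # downstream threshold is never reached

print(f"aligned: {aligned:.3f}")
print(f"saturated: {saturated:.3f}")
print(f"unresponsive: {unresponsive:.4f}")
```

In this toy setting only the aligned cascade produces outputs that differ appreciably between doses; the two mismatched cases return nearly the same output regardless of the input, which is the loss of information described above.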
The application of concepts from the signal processing field (see Box 1), such as transfer functions and gain control, to intracellular signaling pathways has remained limited. Much of this conceptual gap can be attributed to a lack of appropriate experimental data with which to accurately measure transfer functions. To fully employ concepts from the signal processing field, the ideal data collection method would quantitate specific signaling protein activities within individual cells to avoid artifacts from averaging across heterogeneous cells. It would also provide high temporal resolution, to determine when signaling reaches steady state or whether frequency-modulated responses occur, and would permit both monitoring of multiple molecular signals and repeated stimulation of the same cell. While fully realizing this ideal remains difficult, studies based on live-cell imaging now enable many of these criteria to be achieved with reasonable effort. These experiments can be conducted with relatively inexpensive widefield epifluorescence microscopes, and detailed protocols for all stages of setup and analysis are available [8–10]. When combined with appropriate quantitative methods, such studies have uncovered noteworthy characteristics that are common amongst mammalian signal transduction pathways. We focus on three topics in cellular communication that stand out for recent work bringing together theoretical and experimental aspects: dynamic range, signal processing, and information transfer.
For experimental biologists, incorporating quantitative methods from other fields into their area of study can seem intimidating. However, what is frequently perceived as the main obstacle – an inherent difficulty in mathematical approaches – is more often in reality a lack of familiarity with the existence and capabilities of quantitative methods. Thus, a collaborator broadly experienced in quantitative analysis is often an essential resource. Of course, the most important ingredient in any successful collaboration is a shared interest in the topic of study. But a second main question is where to look for such a collaborator, and what areas of expertise are most germane for a potential project. We provide a short summary below of quantitative approaches that integrate well with modern signal transduction studies. On a university campus, individuals skilled in these areas can be found in a wide range of departments, including the usual suspects of systems biology, statistics, physics, and various engineering fields. However, there are now many similarly trained researchers in less obvious disciplines, including plant biology, epidemiology, and sociology – especially as these fields incorporate larger datasets and quantitative methods.
Dynamical systems. Methods to simulate how systems evolve over time, using differential equations or similar types of models to represent the changes in interlinked parameters. Important for the majority of modeling studies in signal transduction.
Commonly used in: Most engineering disciplines, systems biology, economics, ecology.
Biological example: Will the activity of a kinase cascade remain elevated following stimulation, return to baseline levels, or oscillate?
System identification. Methods to determine which model best represents the relationship between measured variables. Important for characterizing the transfer functions between signaling molecules and pathways. Many applications include machine learning techniques.
Commonly used in: Electrical, mechanical, and aeronautical engineering, systems biology, computer science and bioinformatics.
Biological example: Can a mathematical model accurately predict which genes are activated under a given stimulus?
Control theory. Methods to predict the behavior of systems with feedback loops, and to control devices using feedback (for example thermostats). Important for systems with heavy or complex regulation, and for designing strategies to modulate systems (e.g. therapeutics).
Commonly used in: Electrical, mechanical, and aeronautical engineering.
Biological example: How does the cell maintain constant levels of metabolites such as ATP?
Information theory. Methods to quantitate information – how much a given measurement tells you about a system. Important for assessing the reliability of a signaling system, especially in the presence of confounding noise.
Commonly used in: Electrical engineering, communications, applied physics.
Biological example: How different must two concentrations of ligand be for a signaling pathway to reliably distinguish between them?
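As an illustration of the dynamical systems entry in the summary above, the following sketch integrates a toy model of one kinase with simple Euler steps. Everything in it is an assumption made for illustration: the rate constants, the form of the inhibition term, and the variable names do not describe any specific pathway. It addresses one version of the example question, namely whether activity stays elevated or relaxes back toward baseline once a slow negative feedback is present.

```python
def simulate(feedback, stimulus=1.0, dt=0.01, t_end=200.0):
    """Euler integration of a toy kinase activity A with an optional slow inhibitor I."""
    k_on, k_off = 1.0, 1.0  # hypothetical activation and deactivation rates
    k_i, K = 0.1, 0.2       # hypothetical inhibitor turnover rate and potency
    A, I = 0.0, 0.0
    trace = []
    for _ in range(int(t_end / dt)):
        # Activation of the inactive pool (1 - A), damped by the inhibitor.
        drive = k_on * stimulus * (1.0 - A) / (1.0 + I / K)
        dA = drive - k_off * A
        # The inhibitor slowly tracks kinase activity when feedback is enabled.
        dI = k_i * (A - I) if feedback else 0.0
        A += dt * dA
        I += dt * dI
        trace.append(A)
    return trace

sustained = simulate(feedback=False)
adapting = simulate(feedback=True)

print(f"no feedback:   peak {max(sustained):.2f}, final {sustained[-1]:.2f}")
print(f"with feedback: peak {max(adapting):.2f}, final {adapting[-1]:.2f}")
```

Without feedback the activity rises to a plateau and stays there; with the assumed slow inhibitor it peaks and then settles at a lower level. A model of this size does not oscillate, and more elaborate feedback structures would be needed to explore that case.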
Measuring signaling events across their dynamic range
The challenge of quantitating the informational content of a signaling event is intimately linked to the problem of measuring that event within the cell. This connection is fundamental: in such experiments, the experimentalist is attempting to perform, in essence, the same task that the signaling pathway itself performs within the cell – that of distinguishing different levels of the input signal, with sufficient accuracy to control a cellular process (or to understand the regulation of that process, in the case of the researcher). Both the experimentalist and the signaling pathway face limits on the accuracy with which this signal can be quantitated. These include: (i) measurement noise, (ii) limits on sensitivity that determine the lower limit of input signal that can be reliably detected, and (iii) saturation when the detection process reaches its maximum value at a submaximal input strength, preventing it from distinguishing further increases in the input. In any experiment, it must be remembered that the measurement process itself is an element with its own transfer function, and taking this transformation into account is essential for correctly interpreting the data.
An instructive example of the relationship between cellular signals and measurements can be found in recent experiments in which two fluorescent protein-based reporters for the same kinase were monitored in the same cell [12,13]. One reporter, EKAR3, is a FRET-based construct, in which phosphorylation by the kinase ERK induces a conformational change leading to a shift in the emission properties of a fluorescent protein FRET pair. In the other reporter, ERKTR, phosphorylation of the reporter by ERK disrupts a nuclear localization sequence fused to a red fluorescent protein, causing the fluorescent protein to be exported from the nucleus (Figure 2A). Both reporters are reversible, respond rapidly to changes in ERK activity induced by upstream growth factors or pharmacological inhibitors, and typically show strong qualitative agreement in their responses. However, when reporter responses were compared on a cell-by-cell basis for different stimulus strengths, a clear differential relationship emerged: EKAR3 was capable of detecting smaller impulses of ERK, but reached saturation at lower levels than ERKTR (Figure 2B,C). Because these measurements used different detection modalities but were made simultaneously within the same cell, technical artifacts could be ruled out and the differences attributed to different dynamic ranges of response for the two reporters, demonstrating that such differences impact signaling at the level of an individual cell.
Figure 2. Parallel signaling processes with differing transfer parameters.
From this result, it becomes clear that each of the reporter signals represents a different view of cellular ERK activity, as filtered through their respective transfer functions. Considering this difference a bit more deeply, it is noteworthy that while the reporters are synthetic proteins, they operate on principles similar to many endogenous kinase substrates, in which phosphorylation induces changes in conformation or cellular localization. Thus, it can be inferred that amongst the many natural protein substrates of ERK, there may be significant variation in the dynamic range of input to which they respond. Such differences have in fact been documented amongst endogenous substrates, and may function to create unique responses to different strengths of signals. One example of such ordering can be found in the substrates of cyclin-dependent kinases (CDKs), whose sensitivities to phosphorylation vary such that they respond in a staged manner to progressively higher CDK activity over the course of the cell cycle. Similarly, differences in promoter sensitivity underlie ordered responses to morphogen gradients, and may enable time-dependent protein synthesis such that large protein complexes can be assembled more efficiently. Thus, in considering carefully the transfer properties involved in accurately measuring even one signaling element, we not only learn about the quality and limitations of the data, but also gain insight into organizing principles for how cells process the information contained within the pathway’s activity profile.
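The idea that two readouts of the same activity can report on different parts of its range can be sketched as follows. The two hypothetical readouts below share one Hill-type form and differ only in an assumed EC50; these values, the noise level, and the activity pairs are invented for illustration and are not the measured properties of EKAR3 or ERKTR.

```python
def readout(activity, ec50, n=2.0):
    """Hypothetical reporter signal (0 to 1) for a given kinase activity."""
    return activity**n / (ec50**n + activity**n)

def can_distinguish(low, high, ec50, noise=0.05):
    """True when two activity levels give signals that differ by more than the assumed noise."""
    return readout(high, ec50) - readout(low, ec50) > noise

SENSITIVE_EC50 = 0.2  # assumed: detects weak activity but saturates early
BROAD_EC50 = 2.0      # assumed: needs more activity but saturates late

weak_pair = (0.05, 0.2)   # two low activity levels (arbitrary units)
strong_pair = (2.0, 8.0)  # two high activity levels (arbitrary units)

results = {
    ("sensitive", "weak"): can_distinguish(*weak_pair, SENSITIVE_EC50),
    ("sensitive", "strong"): can_distinguish(*strong_pair, SENSITIVE_EC50),
    ("broad", "weak"): can_distinguish(*weak_pair, BROAD_EC50),
    ("broad", "strong"): can_distinguish(*strong_pair, BROAD_EC50),
}

for (reporter, pair), ok in results.items():
    print(f"{reporter:9s} reporter, {pair:6s} pair: distinguishable = {ok}")
```

Under these assumptions each readout resolves only the pair of activity levels that falls within its own dynamic range, so the two together cover a wider range than either alone. The same logic would apply to endogenous substrates with different sensitivities.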
Another transformation inherent in many experimental approaches is the averaging of a measured quantity across a large number of cells. Determining an average behavior is often an essential step in analysis and interpretation of data, even when data are collected with single-cell resolution. However, relying on averages alone can severely distort the apparent transfer function for a process. For example, individual cells within a population might display a three-fold increase in signal intensity between their unstimulated and stimulated responses, but vary by ten-fold in the concentration of stimulus to which they respond (Figure 3). Population-average measurements of these cells would indicate that the overall response spans a more than ten-fold range, with an EC50 defined by the median individual response, while individual cell measurements would indicate only a three-fold range. Thus, a population measurement of dynamic range will in fact represent a convolution of the individual cell behaviors and the range of response sensitivities amongst cells, rather than the range of the response in any individual cell. Both quantities may contain important information for the system and may be useful in characterizing the response, but need to be distinguished if the ultimate goal is a quantitative model of typical cell behavior. Fitting a single-cell model to the ten-fold range of output would impose unrealistic constraints on the model and would likely lead to errors as other parts of the model are adjusted to accommodate this inaccuracy.
Figure 3. Relationship between single-cell and population-level responses.
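The distortion introduced by averaging can be reproduced with a small numerical sketch. It reads 'range' as the span of stimulus concentrations over which the response changes (here, the ratio of the doses giving 90% and 10% of the full response). It assumes that every cell follows the same steep Hill curve with a three-fold change in signal, and that EC50 values are spread evenly on a log scale over a ten-fold range. The Hill coefficient, the dose grid, and the number of cells are illustrative choices only.

```python
def cell_response(dose, ec50, n=4.0, baseline=1.0, fold=3.0):
    """Signal of one hypothetical cell: rises from baseline to fold * baseline."""
    frac = dose**n / (ec50**n + dose**n)
    return baseline + (fold - 1.0) * baseline * frac

def responsive_range(curve, doses):
    """Ratio of the doses at which the curve first reaches 90% and 10% of its span."""
    lo, hi = curve[0], curve[-1]

    def crossing(level):
        target = lo + level * (hi - lo)
        for d, y in zip(doses, curve):
            if y >= target:
                return d

    return crossing(0.9) / crossing(0.1)

# Log-spaced doses from 0.001 to 1000 (arbitrary units).
doses = [10 ** (k / 50.0) for k in range(-150, 151)]
# 21 cells whose EC50 values span a ten-fold range (about 0.32 to 3.2).
ec50s = [10 ** (k / 20.0) for k in range(-10, 11)]

single = [cell_response(d, 1.0) for d in doses]
population = [sum(cell_response(d, e) for e in ec50s) / len(ec50s) for d in doses]

single_range = responsive_range(single, doses)
population_range = responsive_range(population, doses)

print(f"single cell: responsive over ~{single_range:.1f}-fold in dose")
print(f"population average: responsive over ~{population_range:.1f}-fold in dose")
```

With these assumptions each cell is responsive over roughly a three-fold range of doses, while the averaged curve is responsive over a several-fold wider range. Fitting a single-cell model to the averaged curve would therefore assign each cell a much shallower dose response than it actually has.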
Signal processing by single cells: lessons from variability
Signaling events are now routinely measured with single-cell resolution. The most accessible approach is to use activity-specific antibodies or other fluorescent probes, to capture a ‘snapshot’ measurement of the signaling activity status of many cells at a fixed point in time [18,19]. Because of the need to chemically fix cells, such datasets are limited in their time resolution, but they can nonetheless establish the distribution of responses possible for a given cell type. When coupled with information theory, a mathematical framework for quantitating the relationship between the stimulus and intracellular responses, such datasets can be used to estimate the ability of the pathway to resolve different strengths of input stimuli [20,21]. Such studies have revealed the variability inherent in signal processes, but are limited in their ability to quantitate time-dependent responses, potentially failing to account for temporal information.
To capture information carried in the time domain, live-cell measurements with fluorescent protein-based reporters allow for the tracking of cells for minutes, hours, or days following a stimulus. A growing number of studies have revealed that many signaling pathways respond to a simple step-up stimulus with diverse kinetic outputs [23–25] (for a comprehensive review, see ). One way in which kinetics may carry information is termed ‘dose-to-duration’ encoding, in which different input strengths induce transient pulses of response with varying durations. Downstream signaling elements that integrate the amount of time that the upstream signal remains active are thus not limited in their response by the range of amplitudes of the upstream signal. Live-cell experiments can also measure each cell in multiple states, such as pre- and post-stimulus. Such within-cell comparisons can be essential, because some pathways employ fold-change detection in which the response is proportional to the relative change, as opposed to the absolute magnitude of the input signal [27,28].
In a more sophisticated form of analysis, complex time-dependent input signals are used as stimuli, and measurements are recorded at a downstream node. Such stimulus patterns can be generated using microfluidics to make rapid changes in extracellular ligand concentration [30,31] or optogenetic tools such as engineered light responsive signaling elements. This approach comes much closer to methods used to characterize transfer functions in electronic circuits, because it allows the experimenter to determine the response to both the magnitude and the frequency of the stimulus. Moreover, the same cell can be repeatedly interrogated, avoiding the assumptions inherent in combining data from different cells’ responses.
One insight consistent amongst all of these studies is that signaling pathways operating in individual cells have fairly limited dynamic ranges. Moreover, cells within a population often diverge substantially in their threshold of response to a stimulus [13,33–35]. Thus, the situation depicted in Figure 3 is relatively common and has implications for many commonly held models of cellular functions. Perhaps most strikingly, it means that no individual cell can mount reliably distinct responses across more than a small range of input concentrations. While studies examining this issue have debated different approaches to quantitate the output function, collectively they suggest that the upper limit of the ‘channel capacity’ for many mammalian signaling pathways is approximately 2 bits. This means that the output magnitude of a stimulated pathway, at any given time, can distinguish up to approximately four different concentrations of stimulus.
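The link between bits and distinguishable concentrations can be illustrated with a short calculation of mutual information for a discrete channel. The two joint probability tables below are invented examples, not experimental data. The calculation uses equally likely inputs, whereas channel capacity is formally the maximum of this quantity over all input distributions.

```python
import math

def mutual_information_bits(joint):
    """Mutual information in bits, where joint[i][j] = P(input i, output j)."""
    p_in = [sum(row) for row in joint]
    p_out = [sum(col) for col in zip(*joint)]
    total = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:
                total += p * math.log2(p / (p_in[i] * p_out[j]))
    return total

def distinguishable_levels(bits):
    """Number of input levels that can be told apart with the given information."""
    return 2 ** bits

# Hypothetical pathway that maps four equally likely doses to four outputs without error.
perfect = [[0.25 if i == j else 0.0 for j in range(4)] for i in range(4)]

# Hypothetical noisy pathway: the matching output occurs 70% of the time,
# and each of the three other outputs occurs 10% of the time.
noisy = [[0.25 * (0.7 if i == j else 0.1) for j in range(4)] for i in range(4)]

perfect_bits = mutual_information_bits(perfect)
noisy_bits = mutual_information_bits(noisy)

print(f"error-free pathway: {perfect_bits:.2f} bits")
print(f"noisy pathway: {noisy_bits:.2f} bits")
print(f"2 bits correspond to {distinguishable_levels(2)} distinguishable levels")
```

An error-free pathway with four outputs carries 2 bits, corresponding to four distinguishable concentrations. Adding overlap between the output distributions lowers the information below that value even though four outputs are still nominally present.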
A ramification of this seemingly limited view of cellular information processing is that, because of the high degree of variability between cells, ‘decision-making’ functions may instead be performed at the population level rather than at the single-cell level. For example, at a subsaturating dose of a ligand, only a subset of cells will respond due to the heterogeneity in responsiveness. While many of these responders may already be saturated, further increases in ligand concentration will push a number of the non-responders over their activation threshold, leading to an increased response at the population level. Such population-level information processing is supported by theoretical models, which point out that heterogeneity may be essential when regulating inherently binary responses such as cell death. For example, if all cells shared the same sensitivity to a death-inducing ligand like TNF-α, the only regulatory possibilities would be to either retain or eliminate the entire population. With a broadly distributed range of sensitivities, it becomes possible to reduce the population by a desired fraction depending on the dose of the ligand. Thus, broad distributions of signal pathway responsiveness can provide an organism with a higher degree of control over the behavior of a tissue [36,37].
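The argument about graded control of a binary response can be sketched by assuming that each cell responds in an all-or-none manner once the dose exceeds its own threshold, and that thresholds follow a log-normal distribution across the population. The choice of distribution, its median and width, and the dose list are illustrative assumptions.

```python
import math

def fraction_responding(dose, median_threshold=1.0, sigma=1.0):
    """Fraction of cells whose threshold lies at or below the dose.

    Thresholds are assumed log-normal; sigma is the spread of ln(threshold),
    and sigma = 0 describes a population of identical cells.
    """
    if sigma == 0:
        return 1.0 if dose >= median_threshold else 0.0
    z = (math.log(dose) - math.log(median_threshold)) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

doses = [0.25, 0.5, 1.0, 2.0, 4.0]  # assumed ligand doses (arbitrary units)

uniform_population = [fraction_responding(d, sigma=0.0) for d in doses]
mixed_population = [round(fraction_responding(d, sigma=1.0), 2) for d in doses]

print("identical cells:    ", uniform_population)
print("heterogeneous cells:", mixed_population)
```

With identical cells the responding fraction jumps from none to all at a single dose, whereas the heterogeneous population yields intermediate fractions that increase steadily with dose. This is the sense in which variability allows a tissue to be reduced by a chosen fraction rather than retained or eliminated as a whole.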
An elegant experimental example of such population-level control can be found in the Drosophila embryo. Nuclei respond to the morphogen Bicoid – which is distributed in an anterior-posterior gradient – by activating certain genes wherever the Bicoid concentration exceeds a certain level. This response has a high degree of precision, enabling each nucleus to determine unambiguously whether it lies on the anterior or posterior side of the boundary. Detailed studies of the kinetics of this process in individual nuclei reveal that their individual ranges of response are too small to explain the full dynamic range. Instead, the organism-level response depends on the fact that each nucleus is probabilistic in its capacity to respond and thus is independent from the strength or duration of response. Because the mRNAs produced by each nucleus are exported and shared within the local area of the syncytium, the collective mRNA production in each region averages the local nuclei, such that the stochastic nature of induction enhances the overall dynamic range.
Gain control in multi-step signaling pathways
Population-level information processing is a reasonable strategy for functions that are collective properties of multiple cells, such as the total amount of a molecule synthesized or the fraction of cells taking on a given fate. However, other cellular functions appear to demand careful cell-autonomous regulation. For example, maintenance of cellular ATP levels is essential for cell viability, and deviations from the normative concentration must be quickly and accurately corrected to prevent cell death. This need for accuracy implies that some regulatory functions may require tighter regulatory control than has been measured for signaling pathways thus far. Accordingly, cells have developed strategies for gain control, also termed Dose Response Alignment (DoRA), to reduce the tendency toward mismatches in the dynamic range in signaling cascades. Despite the fact that it is relatively difficult to ensure that the quantitative parameters of a pathway maintain DoRA, such regulation appears to be pervasive across many systems, implying that DoRA is a trait selected for in the evolution of regulatory systems.
A key component for DoRA is negative feedback from a lower tier of the cascade to a higher tier. The capacity of negative feedback to shift multi-tiered responses toward alignment is well-known from many human-designed systems in which it is used to linearize the system response. Such a strategy has also been confirmed to operate and produce DoRA within the yeast MAPK cascade, and may potentially explain other instances of DoRA. Importantly, however, negative feedback alone is insufficient to produce DoRA; also required is a mechanism for comparing the input with the output and adjusting the output to match the input. Such a mechanism is analogous to ‘proportional control’ in engineered systems, where the strength of feedback is adjusted to be proportional to the difference between output and input. Further exploration of the mechanisms for DoRA in biochemical networks is ongoing, and differences in the results of theoretical studies to date underscore the complexity of this problem [41,42], which remains surprisingly understudied despite its fundamental importance to intracellular signaling functions.
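The linearizing effect of negative feedback in engineered systems can be illustrated with a generic sketch, which is not a model of the yeast MAPK cascade or of any biochemical network. It assumes a high-gain, saturating element (a tanh function) that is driven either directly by the input or by the difference between the input and its own fed-back output. The gain, the feedback strength, and the input values are arbitrary.

```python
import math

def open_loop_output(x, gain=50.0):
    """Saturating high-gain element driven directly by the input."""
    return math.tanh(gain * x)

def closed_loop_output(x, gain=50.0, feedback=1.0):
    """Same element driven by the error (x - feedback * y).

    The steady state satisfies y = tanh(gain * (x - feedback * y)); it is
    found by bisection, since y - tanh(...) increases monotonically with y.
    """
    lo, hi = -1.0, 1.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if mid - math.tanh(gain * (x - feedback * mid)) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

inputs = [0.1, 0.2, 0.4, 0.8]
open_loop = [open_loop_output(x) for x in inputs]
closed_loop = [closed_loop_output(x) for x in inputs]

for x, y_open, y_closed in zip(inputs, open_loop, closed_loop):
    print(f"input {x:.1f}: open loop {y_open:.3f}, closed loop {y_closed:.3f}")
```

Without feedback the element is saturated at every input tested, so its output carries almost no information about the input. When the element instead responds to the difference between input and output, the output tracks the input nearly linearly across the same range, which is the behavior that the comparison mechanism described above is meant to achieve.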
Ultimately, to measure the transfer function of a signaling process as it occurs in the cell, the researcher must make accurate measurements of both the input and output signals in such a way that the dynamic range of each measurement matches or exceeds the range of the input and output values as they occur in the cell. Given the caveats of population measurements relative to single-cell ones (discussed above), such measurements would best be made using time-lapse single-cell techniques. However, despite advances in live-cell signaling reporters, there remain few cases in which multiple reporters are available for sequential steps within a pathway. One area where such sequential measurements are both achievable and of high interest is in understanding information transfer from signaling pathways to their target genes. Studies using the yeast transcription factor Msn2 have pioneered concepts in this area [24,44]. Msn2 responds to multiple stresses, including glucose limitation, high salt, and oxidative stress, raising the question of how subsets of genes specific to each stress are activated by the same transcription factor. Live-cell measurements have made it possible to link the dynamics of Msn2 localization, which varies according to the initiating stress, to the expression of different Msn2 client genes. This work has revealed that different dynamic patterns and magnitudes of Msn2, when coupled to differential response parameters for target gene promoters, can encode specific gene expression profiles. In mammalian cells, similar concepts have begun to be explored in the p53 and Ras/MAPK systems [13,45,46], where gene expression profiles are more complex.
Echoing the theme of the single-element measurements discussed in the preceding section, from these studies it is clear that there is a high degree of cell-to-cell variability that limits the amount of information that can be transmitted by any one pathway. What remains to be determined is whether this variability results primarily from stochastic events, or whether it reflects differences in the activity of pathways not measured. Ultimately, answering this question will be of great importance in interpreting genome-wide single-cell expression profiling, which is now entering widespread use.
For many, the ultimate goal of signal transduction research is to understand the cellular communication systems that regulate human physiology, in sufficient detail to know which components are required for healthy functioning, and to identify means to restore function when these systems break down. For much of its history, work in this field has focussed on enumeration of the proteins, small molecules, and nucleic acids involved in these processes and elucidation of their interactions. It has been noted many times that while this cataloging is an essential first step, models that capture systems-level behavior will be needed to integrate this knowledge. While a wave of ‘systems biology’ research over the past 15 years has endeavored to build such models, it remains unclear to many whether this goal has been achieved, given the failure of broadly useful and predictive quantitative models to emerge. If this goal has not yet been achieved, is it because we still lack key molecular details or because we have yet to develop the correct modeling approaches? It can be argued that, while both are likely to be true, there is an important third obstacle: many of the experiments reported thus far have failed to quantitate the key operational properties of signaling networks that enable predictive modeling [7,47]. Many of the studies discussed here have appreciated this gap and have contributed to an emerging conceptual framework that stresses the importance of these properties, while others have developed new tools aimed at making such measurements accessible. With these technologies now in place, we anticipate that the next few years will yield substantial insights into the properties of signal transduction pathways that enable them to transmit information, an appreciation of the limitations on reliability, and ultimately quantitative models of how cell fates and identities are specified through signaling pathways.
Concepts used to engineer communication systems can augment cell signaling studies.
The dynamic range of signaling processes is an essential factor determining how they transfer information about stimuli.
Single cells are often limited in their information processing capacity, but variability between cells can improve signal discrimination at the tissue level.
Operational properties of signaling pathways are essential parameters for predictive models.
This work was supported by the National Institute of General Medical Sciences [grant number 1R01GM115650].
The authors declare that there are no competing interests associated with the manuscript.