Systems modelled in the context of molecular and cellular biology are difficult to represent with a single calibrated numerical model. Flux optimisation hypotheses have shown tremendous promise to accurately predict bacterial metabolism but they require a precise understanding of metabolic reactions occurring in the considered species. Unfortunately, this information may not be available for more complex organisms or non-cultured microorganisms such as those evidenced in microbiomes with metagenomic techniques. In both cases, flux optimisation techniques may not be applicable to elucidate systems functioning. In this context, we describe how automatic reasoning allows relevant features of an unconventional biological system to be identified despite a lack of data. A particular focus is put on the use of Answer Set Programming, a logic programming paradigm with combinatorial optimisation functionalities. We describe its usage to over-approximate metabolic responses of biological systems and solve gap-filling problems. In this review, we compare steady-states and Boolean abstractions of metabolic models and illustrate their complementarity via applications to the metabolic analysis of macro-algae. Ongoing applications of this formalism explore the emerging field of systems ecology, notably elucidating interactions between a consortium of microbes and a host organism. As the first step in this field, we will illustrate how the reduction in microbiotas according to expected metabolic phenotypes can be addressed with gap-filling problems.
Systems biology and metabolism
Systems biology consists in considering an organism or an interacting group of organisms as a whole rather than studying its components individually. This can be contemplated through the exploration of genome-scale metabolic networks (GSMs), which contain all biochemical reactions and pathways that are expected to occur in a cell. The first published metabolic networks were associated with model organisms characterised by low biological complexity or widely studied by the community: Haemophilus influenzae, Escherichia coli, and Arabidopsis thaliana. The construction of these models was enabled by extensive literature-based curation and expert knowledge together with experimentation, including the possibility of genetic alterations.
Metabolic networks of unconventional organisms
Since then, the rise of sequencing technologies has paved the way for the study of the metabolism of thousands of organisms, with strong heterogeneity between and within taxonomic groups [5,6]. Most of these sequenced genomes correspond to unconventional or non-model organisms, that is, species with little available background knowledge, or that are difficult to cultivate so that most or all of the available information is derived from their genome sequence (possibly from metagenomic studies). The de novo prediction of the functions of unknown proteins is generally not contemplated for such organisms, and the large phylogenetic distance from related model organisms frequently limits the transferability of predictions from these models. In the same vein, the validation of predicted functions is difficult, and experiments on unconventional organisms mostly involve perturbations of their environment. In model organisms, on the contrary, it is possible to perform genetic modifications in order to investigate the role of a targeted unknown protein. As a result, disentangling the metabolism of unconventional organisms necessitates methods that differ from those applicable to well-studied organisms.
Metabolic network models are used to mathematically predict the growth rate [8,9] or the optimum yield (a function of a linear combination of rates) [10,11] of the organism by solving optimisation problems [12,13]. A GSM reconstruction is obtained first by taking into account all genomic information related to the organism and then by manually refining the draft according to specific knowledge about the organism (adding missing reactions, removing those that were falsely inferred) until the mass-balance equilibrium of internal metabolites and cofactors is satisfied and growth can be adequately predicted.
In this framework, the so-called gap-filling step, which consists in adding reactions to obtain a relevant biomass prediction, is crucial. However, especially in unconventional organisms, this gap-filling step may also have negative impacts. A first drawback is the risk of adding reactions for which the considered organism has no associated gene, either because the associated enzyme is not annotated, or because the considered function is performed by a specific pathway that has not yet been identified. A second drawback is related to the underlying hypothesis of ensuring biomass production. Indeed, GSMs are considered to be valid when they accurately predict biomass production. The composition of this biomass function can be confirmed experimentally, or derived from models [15–17], in accordance with the literature. However, the objective is much more difficult to define when the organism cannot be cultured individually, as is the case for many unconventional organisms. One example is the difficulty of growing one of the most studied brown algae, Ectocarpus siliculosus, in axenic conditions. This alga, a model among the stramenopiles [18,19], was the first brown alga to be fully sequenced and has been extensively studied, notably at the metabolic scale [21,22]. Yet, at least in standard culture media, E. siliculosus does not grow properly in axenic conditions, that is, individually, without its associated symbionts. Axenic culture leads to altered physiology and morphology of the alga, which can be restored after inoculation of bacterial isolates. These biotic dependencies are commonly observed when trying to culture unconventional organisms. For bacteria, in particular, the auxotrophies in their ecological niches are difficult to identify.
This calls for prudence in the interpretation of gap-filling results when focusing on a single unconventional organism and, more generally, for organisms that live in symbiosis, where several objective functions may have to be combined.
In the last decade, increasing knowledge about the wide range of interactions occurring between hosts and microorganisms or between microorganisms in their natural environment has led to expanding the concept of systems biology to what can be called ecosystems biology or microbial systems ecology [26,27]. Microbiotas are communities of microorganisms that can be found in a given environment, or in association with a specific host species, such as the highly studied human gut microbiota. The organisms that form these microbiotas are frequently unculturable using standard laboratory techniques and fall into the definition of unconventional organisms. As a consequence, they are less studied experimentally, or not studied at all, in which case all information about them is provided by omics data. In addition to interacting with their abiotic environment, these symbiotic organisms have mutual interactions, leading to communities with complex physiology and phenotypes.
The recent technical improvements in metataxonomics and metagenomics have allowed for an increased focus on these complex communities, including the unculturable part of living organisms. This led to an expansion of the limits of the tree of life and is now providing biologists and modellers with an unprecedented amount of genomic data regarding host–microbial systems, inducing a change of paradigm in the study of biological species. Organisms now tend to be investigated as members of a complex ecosystem rather than as independent individuals. Highly studied host–microbial systems include the human gut, the rhizosphere, arthropods and marine organisms. This change of paradigm has an important impact on the field of metabolic networks in systems biology. The goal of this mini-review is to revisit the role of the gap-filling methods used in the construction of a metabolic network. We discuss, in particular, how the study of communities can be reformulated as a gap-filling problem and why the steady-state hypothesis may have to be complemented by a more qualitative system abstraction. This leads us to introduce new optimisation problems derived from the gap-filling problem that can be efficiently applied to the study of the metabolism of individuals and communities of unconventional organisms.
As the reader will find, the translation of biological concepts and insights into formal problems is highly dependent on slight differences in mathematical abstraction that can result in divergent predictions. This necessitates the use of precise notation to accurately specify different abstractions. We accept that not all readers will be familiar with such notation, but we expect that the accompanying text will make the general principles and aims of the approaches accessible.
Steady-state and Boolean frameworks
A metabolic network is composed of reactions that transform metabolic compounds into other metabolic compounds. From the information associated with a metabolic network, mathematical approaches should theoretically enable modelling the dynamic response of cellular metabolism in a given medium with ordinary differential equation (ODE) models. In practice, the parameters of these models cannot be numerically fitted because of both non-linearities and a lack of experimental possibilities of manipulating a metabolic system. Kinetic models therefore target parts of the metabolism but are not contemplated at genome-scale. To overcome this obstacle, several modelling abstractions have been introduced, such as the steady-state abstraction relying on flux balance analysis (FBA) and the Boolean abstraction based on network expansion. Both abstractions predict the family of reactions (and compounds) that can be activated (produced) from the compounds of the extracellular medium.
Formal definition of a metabolic network
Let us introduce some notation to clarify the theoretical background. We formally describe a metabolic network by a bipartite directed graph $G = (R \cup M, E)$, where $R$ and $M$ stand for reaction and metabolite nodes. When $(m, r) \in E$ (respectively, $(r, m) \in E$), with $m \in M$ and $r \in R$, the metabolite $m$ is called a substrate (respectively, product) of the reaction $r$. The edge labels $s : E \to \mathbb{R}$ describe the stoichiometric coefficients of the considered compounds in the considered reactions. Such coefficients are gathered in the stoichiometric matrix $S$. Each reaction $r$ is associated with a variable $v_r$ which represents the reaction flux activity. The complete vector $v$ is known as the rate laws of the system. It is bounded by a vector of lower bounds $lb$ and upper bounds $ub$. Media compounds representing available nutrients (or seeds) are formally denoted by $N$, with $N \subseteq M$.
Here, internal compounds are assumed not to accumulate, so that the system behaviour is constrained by the linear relations $S \cdot v = 0$, where $S$ is the stoichiometric matrix and the function $v$ represents the rate laws of the system. The steady-state hypothesis is known to be valid for relatively short time slots (several minutes), assuming that regulatory transcriptional and signalling timescales are disjoint. In this context, activated reactions from a medium are those for which an admissible flux can be carried in at least one flux distribution of the system. The steady-state abstraction is widely used to construct and analyse metabolic models [46–48] and their association in small communities [49–51]. In the following, as detailed in Figure 1, reactions activated from a medium under this abstraction will be denoted by the set active (in its steady-state sense).
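As a compact restatement of the steady-state notion of activation described above, the condition can be sketched as follows. The symbols are assumptions chosen for illustration: $S$ for the stoichiometric matrix, $v$ for the flux vector, $lb$ and $ub$ for its bounds, $R$ for the set of reactions, and $\mathrm{active}_{ss}$ as a hypothetical name for the resulting set.

```latex
\mathrm{active}_{ss} \;=\; \bigl\{\, r \in R \;\bigm|\; \exists\, v :\;
    S \cdot v = 0,\quad lb \le v \le ub,\quad v_r \neq 0 \,\bigr\}
```

In this sketch, the medium enters only through the bounds: an import reaction is given a non-zero bound when the corresponding nutrient is available, and is forced to zero otherwise.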
Figure 1. Different abstractions of metabolism.
In the growth phase, the steady-state hypothesis of equilibrium for internal metabolites is not expected to be met [52,53], although FBA-based predictions can be accurate. The Boolean hypothesis is modelled by rules which activate each reaction as soon as all its substrates have been made available by the activation of upstream reactions. In this framework, activated or inactivated reactions, as well as producible or unproducible compounds, can have either 0 or 1 as discrete values. A metabolic compound is producible either if it is a medium compound or if it is the product of an activated reaction. A reaction is activated only if all its substrates are producible. This rule is applied until no additional reaction can be activated, leading to a steady state of the system that models the capability of metabolic activation from the medium compounds. This is the main concept underlying network expansion, introduced by Ebenhöh and colleagues [42,55], which has been applied to the study of metabolic networks both individually [56–59] and in communities [60–63]. Please note that, according to Boolean rules, reversing a flux requires changing the value of the input fluxes, so that the Boolean hypothesis implicitly assumes that metabolic fluxes cannot be reversed during the response to a perturbation.
Formally, this hypothesis can be modelled as follows. Let $x_m$ be a Boolean variable associated with each metabolic compound $m$ in the network. The associated Boolean dynamic $F$ is defined by $F(x)_m = x_m \lor \bigvee_{r : (r,m) \in E} \bigwedge_{m' : (m',r) \in E} x_{m'}$, that is, a compound remains producible once produced, and becomes producible as soon as all substrates of one of the reactions producing it are producible. Metabolic compounds that are producible from nutrients are those whose associated Boolean variable equals 1 for any fixed point of $F$ having non-zero values for nutrients. Equivalently, they are obtained by computing the fixed point of $F$ having the minimum number of non-zero values while having non-zero values for nutrients. A reaction is considered activated if all its substrates are producible. Such sets of reactions will be denoted by active, in its Boolean sense (see the formalisation in Figure 1). Activated reactions in a Boolean framework can be computed recursively.
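The recursive computation of Boolean activation can be illustrated with a minimal sketch of network expansion. The function name `expand`, the dictionary encoding of reactions and the toy network are assumptions made for illustration; they are not taken from the tools cited in this review.

```python
def expand(reactions, seeds):
    """Network expansion under the Boolean abstraction.

    reactions: dict mapping a reaction id to (substrates, products), both sets.
    seeds: set of medium compounds (nutrients).
    Returns (producible compounds, activated reactions).
    """
    producible = set(seeds)
    active = set()
    changed = True
    while changed:  # iterate until the fixed point is reached
        changed = False
        for rid, (substrates, products) in reactions.items():
            # A reaction is activated once all its substrates are producible.
            if rid not in active and substrates <= producible:
                active.add(rid)
                producible |= products
                changed = True
    return producible, active


# Hypothetical toy network (not the network of Figure 1).
toy = {
    "r1": ({"A"}, {"B"}),
    "r2": ({"B", "C"}, {"D"}),       # blocked: C is never available
    "r3": ({"E"}, {"E", "F"}),       # self-dependent cycle: E requires E
}
scope, active = expand(toy, seeds={"A"})
print(sorted(scope), sorted(active))
```

In this toy case, the self-dependent reaction `r3` is never activated, which reflects the way the Boolean abstraction treats cycles that are not fed by the medium.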
Differences between abstractions
Although these two concepts of activated reactions seem close in terms of dynamics, they differ slightly in the way they model the impact of internal cycles. Differences are shown in Figure 1. In the example network, import reactions bring nutrients in from the extracellular medium. In the Boolean abstraction, these imports initiate the production of downstream compounds through the activation of the reactions that consume the nutrients. In the steady-state abstraction, however, one of these compounds is not producible, because activating the corresponding reaction would break the mass-balance equilibrium of a dead-end metabolite. In addition, another compound can be produced if and only if a ratio condition on the network, given in Figure 1, is satisfied; otherwise, internal metabolites would accumulate. A further difference between the Boolean and the steady-state abstractions concerns a compound whose production depends on an internal cycle. In the Boolean abstraction, the reaction that feeds this compound cannot be activated because it requires a substrate which depends on itself and therefore cannot be directly produced from the medium. The compound therefore cannot be produced. On the contrary, it can be produced under the steady-state abstraction, because the cycle can be self-activated as soon as the linear constraints allow the corresponding flux to have a non-zero value. According to the linear constraints, this is possible if and only if the cycle does not require all the produced compounds to self-regenerate.
In order to reconcile these different interpretations, we can introduce a concept of hybrid activation, which requires both the steady-state and the Boolean activation conditions to hold. This is the most stringent notion of activation: it verifies both that fluxes can occur in a metabolic network according to mass balance and that all internal cycles can be fed by import reactions. However, Figure 1 shows that only a few reactions in a metabolic network may satisfy these properties. Formally, a reaction is hybrid-activated if it is activated under both the steady-state and the Boolean abstractions.
These examples confirm, as discussed in [64,65], that the steady-state abstraction of a metabolic network is highly sensitive to the network stoichiometry and to putative accumulations of internal metabolites. Under the assumption that a few cofactors are present at the initiation of the system, the Boolean abstraction yields predictions similar to those of the steady-state abstraction in terms of metabolite production while being more resilient to network inaccuracies. In this sense, it appears suitable for a preliminary study of the metabolism of unconventional organisms. However, the Boolean abstraction fails to take into account de novo synthesis of compounds involved in cycles (for example, cofactors) and mass-balance equilibrium. The treatment of these metabolites differs between studies: they can be added to the list of available metabolites or removed from reactions [66–69]. Therefore, while the main characteristics of a metabolic network can be identified with network expansion, the steady-state abstraction is required to elucidate behaviours linked to stoichiometric constraints.
Gap-filling problems for individual and community modelling
As explained in the introduction, metabolic networks automatically built from genome annotations or orthology searches are often unable to predict either biomass production or the production of experimentally observed metabolic compounds. This is either due to errors or missing knowledge in the genome annotation procedures, or reflects the inability of a species to grow without other symbiotic species. In this section, we detail why the extended concept of the gap-filling procedure is useful to address both the issue of curating a single-species metabolic network and that of studying the role of metabolic complementarities between species.
The generic principle underlying gap-filling algorithms for metabolic networks is to select reactions within a reaction database in order to restore the functionality of a model with respect to an expected objective. The selection of reactions is often performed according to a parsimony principle aiming at minimising the number of modifications to the system. When possible, predicted reactions to be added to the system are validated with genome-based or knowledge-based studies, so that the algorithms can be run iteratively if the predicted reactions appear to be irrelevant. The result of the gap-filling method depends both on the framework used to define a functional model (steady-state or Boolean) and on the database from which reactions are picked.
Formalisation of the gap-filling problem
Following a parsimony principle, the metabolic gap-filling problem aims at selecting, in a database of putative reactions, a set of reactions with a minimum size such that all reactions that are experimentally known or expected to be activated are predicted in silico to be activated from the growth medium according to the extended metabolic network.
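To make the parsimony principle concrete, here is a minimal brute-force sketch of gap-filling under the Boolean abstraction. It assumes that the objective is expressed as a set of target compounds to be produced, and all names (`producible`, `gapfill`, the toy draft and database) are hypothetical. Exhaustive enumeration is only workable for tiny examples; it is not how dedicated solvers proceed.

```python
from itertools import combinations


def producible(reactions, seeds):
    """Compounds reachable from the seeds by network expansion."""
    compounds = set(seeds)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions.values():
            if substrates <= compounds and not products <= compounds:
                compounds |= products
                changed = True
    return compounds


def gapfill(draft, database, seeds, targets):
    """Return all minimum-size sets of database reactions restoring the targets."""
    candidates = sorted(set(database) - set(draft))
    for size in range(len(candidates) + 1):  # smallest additions first
        solutions = []
        for added in combinations(candidates, size):
            network = dict(draft)
            network.update({rid: database[rid] for rid in added})
            if targets <= producible(network, seeds):
                solutions.append(set(added))
        if solutions:
            return solutions
    return []


# Hypothetical draft network and reaction database.
draft = {"r1": ({"A"}, {"B"})}
database = {
    "r2": ({"B"}, {"C"}),
    "r3": ({"C"}, {"T"}),
    "r4": ({"B"}, {"T"}),
    "r5": ({"X"}, {"T"}),
    "r6": ({"A"}, {"T"}),
}
solutions = gapfill(draft, database, seeds={"A"}, targets={"T"})
print(solutions)
```

Several minimum-size solutions can coexist, which is why exploring their union and intersection can be more informative than retaining a single one.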
An underlying hypothesis of gap-filling algorithms is that the considered species are able to sustain their growth with the available nutrients. Although this hypothesis is valid for many prokaryotes studied experimentally, it is less straightforward, especially for animals and plants, which harbour major biotic interactions. This is further evidenced by the difficulties encountered in cultivating unconventional organisms, notably eukaryotes, in axenic conditions, highlighting the metabolic dependency of those species on other organisms [23,76]. Microbial interactions can provide their host or other symbionts with metabolic compounds that are otherwise costly to produce. However, the precise identification of these metabolites for an adequate definition of the nutrients required by the gap-filled organism is a challenge. Such interactions are then suspected to be involved in the evolution of metabolic interdependencies within symbiotic communities [24,78].
Gap-filling a gene-soup microbiota to select reduced communities
Revisiting metabolism at the systems ecology scale consists in studying complementarities between species that enable a community to collectively operate metabolic functions. This can also be viewed as a gap-filling problem in which interactions between species are to be identified within several metabolic networks, each of which serves as a database of metabolic functions to fill gaps in the other networks. Similar to other analyses of communities, several levels of modelling can be considered for such gap-filling. The community model can be compartmentalised or grouped together. The latter is described as gene-soup (or lumped, or mixed-bag): a microbiota is represented as a meta-organism or pan-metabolism in which all metabolites and all reactions of the organisms are gathered in a unique abstracted compartment. This gene-soup framework allowed, for instance, the evaluation of the effect of obesity and inflammatory bowel disease on the gut microbiota. This framework is also very useful for the selection of artificial communities based on metabolic complementarities, in order to reduce the complexity of a native microbiota and to test biological hypotheses on symbiotic interactions. To that end, we define a reduced microbiota to be a family of species that has the same targeted metabolic properties as the complete microbiota.
In practice, solving this problem is often less complex than solving the gap-filling problem for individual organisms, and notably, one solution can be found in a simplified steady-state model. However, many different combinations of bacteria are expected to ensure the targeted metabolic objective due to the functional redundancy of microbiotas [80,81]; the space of solutions can therefore be large. We confirmed this hypothesis by performing a complete enumeration of solutions to the optimisation problem in the Boolean framework, evidencing a strong redundancy in artificially reduced communities associated with the gut microbiota. In total, 86.5% of source-product pairs connected by a metabolic pathway in the gene-soup gut microbiota could be equivalently operated by more than 100 minimal artificial communities, and 49.8% could be operated by more than 1000 equivalent minimal communities.
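The selection of reduced communities in the gene-soup framework can be sketched in the same brute-force spirit. The example assumes the Boolean abstraction and target compounds as the objective; the function names and the four toy species are hypothetical, and the enumeration is only meant to show how several equivalent minimal communities can arise.

```python
from itertools import combinations


def producible(reactions, seeds):
    """Compounds reachable from the seeds by network expansion."""
    compounds = set(seeds)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions.values():
            if substrates <= compounds and not products <= compounds:
                compounds |= products
                changed = True
    return compounds


def minimal_communities(species, seeds, targets):
    """Enumerate the smallest species sets whose pooled reactions reach the targets."""
    names = sorted(species)
    for size in range(1, len(names) + 1):
        found = []
        for group in combinations(names, size):
            # Gene-soup: pool all reactions of the group in one compartment.
            pooled = {}
            for sp in group:
                for rid, reaction in species[sp].items():
                    pooled[f"{sp}:{rid}"] = reaction
            if targets <= producible(pooled, seeds):
                found.append(set(group))
        if found:
            return found
    return []


# Hypothetical microbiota: sp2 and sp3 are functionally redundant.
species = {
    "sp1": {"r1": ({"A"}, {"B"})},
    "sp2": {"r1": ({"B"}, {"T"})},
    "sp3": {"r1": ({"B"}, {"T"})},
    "sp4": {"r1": ({"A"}, {"C"})},
}
communities = minimal_communities(species, seeds={"A"}, targets={"T"})
print(communities)
```

Here `sp1` appears in every minimal community whereas `sp2` and `sp3` are interchangeable, a small-scale analogue of the functional redundancy discussed above.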
Gap-filling a compartmentalised microbiota to identify metabolic complementarity
The gene-soup framework is intrinsically limited by the fact that the cost of interactions between species is not taken into account, nor are these interactions precisely identified. To address this obstacle, the compartmentalised framework is designed such that all metabolic networks of the microbiota form compartments that additionally share an external compartment. Exchange reactions between compartments operate the export and import of metabolic compounds between the different metabolic networks. Compartmentalised communities have been modelled for the design of growth media, the inference of interactions in communities [51,83], or the study of microbiota evolution [25,84]. In this framework, compartmentalised reduced microbiotas are defined to be families of organisms that have metabolic properties equivalent to the initial community, while taking into account the energetic cost of metabolic exchanges between species (import and export). The weight of exchange reactions can be a transport cost (if available) or they can be considered to have equal weights if no information allows differentiating them.
Solving optimisation problems associated with gap-filling
The three optimisation problems stated above can be addressed with logic solving approaches. Answer set programming (ASP) is a declarative approach oriented toward both knowledge processing, with a non-monotonic logic programming approach, and combinatorial optimisation problem-solving, such as the optimisation problems presented above when considering the Boolean abstraction. As with the solving of linear problems using LP, the problem is formulated in a dedicated language while the solving of the problem is left to the solver. The first advantage of ASP is its high-level modelling language: problems are formulated according to a first-order propositional logic which provides expressive power and flexibility in problem description. As shown in [57,74], this flexibility for extending a problem statement allowed formulating an ASP program to solve the three gap-filling problems in a unified framework. A second advantage of ASP is the high performance of the underlying solvers, designed to take advantage of SAT-based solving techniques. Solving modes of these solvers enable exploring the space of solutions, for instance by computing the intersection or the union of all solutions in addition to the more computationally demanding alternative of enumerating them. This appeared to be very useful for exploring the search space of the gap-filling problems instead of selecting a single solution [58,63].
Roughly, in ASP, the focus is on the problem specification and reasoning rather than the algorithmic part. A problem is expressed as a set of logical rules (clauses) of the form $h \leftarrow a_1, \dots, a_m, \mathrm{not}\ b_1, \dots, \mathrm{not}\ b_n$, where $h$, each $a_i$ and each $b_j$ are literals. Each proposition is a predicate, encoded by a function whose arguments can be constant atoms or variables over a finite domain. A rule states that the head $h$ is proven to be true if the body of the rule is satisfied, i.e. $a_1, \dots, a_m$ are true and it cannot be proven that $b_1, \dots, b_n$ are true. By default, all atoms are supposed to be false unless a rule proves that they are true. Optimisation rules can be described with specific predicates. Together, the syntax allows for formulating a very wide range of combinatorial optimisation problems and possibly combining them in a unified formalism.
The main limitation of ASP resides in the fact that it is ill-suited for solving linear problems such as those yielded by the steady-state abstraction. This limitation has been partially overcome by relying on the theory reasoning capacities of an ASP solver, which allow extending ASP to express and solve linear constraints in addition to combinatorial constraints. These technologies are therefore very promising to provide a general framework to model and solve all optimisation problems related to metabolism.
Applications to unconventional organisms: the example of macroalgae
The example of macroalgal metabolism illustrates how the combinations of different semantics and different gap-filling problems can help elucidate the characteristics of unconventional species at the metabolic scale, from their individual metabolism to host–microbial interactions.
Using Boolean abstraction to shed light on the evolution of the algal metabolic processes
Brown algae (part of the stramenopiles) are important members of marine ecosystems. E. siliculosus is a model to study the biology of these organisms. The first reconstruction of its metabolic network was published four years after the publication of its genome. The authors used the Boolean abstraction to gap-fill the network, resulting in 44 reactions identified as sufficient to activate the production of all the targeted metabolic compounds and the algal biomass. Such Boolean gap-filling can be computed within a few minutes. The study of the metabolic network shed light on the evolution of metabolic processes. It suggested that E. siliculosus has the potential to produce phenylalanine and tyrosine from prephenate and arogenate, but does not possess a phenylalanine hydroxylase as found in other stramenopiles. It also possesses the complete eukaryote molybdenum cofactor biosynthesis pathway, as well as a second molybdopterin synthase that was most likely acquired via horizontal gene transfer from cyanobacteria.
Combining abstractions of metabolism to capture algal properties
Applied to other macroalgae, we noted, however, that the Boolean abstraction may not always be sufficient to model the metabolic properties of a network. The study of the Cladosiphon okamuranus metabolic network highlighted that, although 67 reactions were sufficient to produce biomass according to the Boolean abstraction, they failed to explain the biomass production according to a steady-state framework. A method for solving the hybrid (Boolean and steady-state) problem was implemented and enabled the identification of a single missing reaction needed to degrade one component that was not part of the biomass function, but accumulated during biomass production. Orthologues of proteins known to catalyse this reaction in other organisms were identified in C. okamuranus. The complete approach allowed studying biosynthetic pathways for carotenoid production, highlighting both reactions preserved through evolution and the specificities related to brown algae.
Similar difficulties were encountered in the study of the red alga Chondrus crispus. For this organism, an exhaustive comparison of the experimentally detected metabolites with knowledge databases of metabolic reactions evidenced that, for many compounds, biosynthetic pathways could not be inferred with gap-filling algorithms, because of incomplete biochemical knowledge and incomplete conservation of biochemical pathways during evolution. Specific methods based on the Boolean abstraction were required to infer reactions by analogy with metabolic transformations occurring in other plants. Results suggest that even metabolic pathways previously considered as conserved, like sterol or mycosporine-like amino acid (MAA) synthesis, undergo substantial turnover, evidencing a phenomenon termed ‘metabolic pathway drift’, i.e. the fact that a given phenotype can be conserved even if the underlying molecular mechanisms are changing.
An observation from these examples, and more generally from the study of high-quality GSM reconstructions, is the persistent need for curation and for the use of several semantics, methods and tools combined with each other [22,90–92]. The development of new automatic procedures [93–96] facilitates reconstructions but cannot fully substitute for refinements operated by experts. Indeed, the publications of high-quality GSMs mention the curation effort and literature-based improvements made to the models [98,99]. For the algal GSMs presented above, the combination of abstractions was crucial as it enabled the assessment of the completeness of the model at each reconstruction step, thereby pinpointing the metabolic pathways that required curation.
From metabolic network gap-filling to suggestions of host–microbial metabolic complementarities
The study of the metabolic network of E. siliculosus also shed light on the importance of symbionts for the metabolism of the algal host. A gap-filling algorithm was used to fill the metabolic network of E. siliculosus. The GSM was then analysed further, and the coenzyme A biosynthesis pathway was notably scrutinised. Gap-filling suggested that a reaction producing beta-alanine was required to initiate the production of vitamin B5 (pantothenic acid), a precursor of the pathway. However, genome-based studies concluded that no gene could be identified in the algal genome for the enzyme producing beta-alanine. The corresponding gene was nonetheless present in a bacterium, Candidatus Phaeomarinobacter ectocarpi, known to live symbiotically with the alga, suggesting a putative host–microbial complementarity regarding beta-alanine and other putative interactions. The absence of this enzyme in brown algae was later confirmed, although the confirming study also highlighted a potential alternative biosynthetic pathway from 3-aminopropanal.
Computation and indirect validation of reduced microbiota
The above example confirms that the result of gap-filling algorithms at the individual scale should be considered cautiously, in light of a putative role of biotic interactions. This is consistent with earlier results showing that, when cultured in axenic conditions, E. siliculosus exhibited alterations of its morphology and physiology. The main objective in the context of this brown alga, but more broadly for host–microbial systems, is to identify the precise mechanisms of metabolic interactions.
In this direction, the non-compartmentalised community selection problem was applied to E. siliculosus and 10 bacteria isolated from the algal microbiome to identify metabolic complementarities. The algorithm allowed predicting consortia of three bacteria that would best complement the algal metabolism. Co-culture experiments were set up with a subset of these consortia to monitor algal growth as well as the presence of key metabolites. Although bacterial communities were only modified (and not fully controlled) in the experiments, the data demonstrated a significant increase in algal growth in cultures inoculated with the selected consortia, suggesting that metabolic complementarity is a good indicator of beneficial metabolite exchanges in microbiota. These results constitute a promising application of community reduction and selection of microorganisms based on metabolic complementarity. However, this study also experimentally observed the evolution of the algal microbiome after bacterial inoculation, demonstrating the presence of bacteria that were undetected in the axenic medium. This highlights new experimental challenges to test the predictions made by automated reasoning approaches when working with unconventional organisms.
The rise of high-throughput and cost-effective sequencing paved the way for the study of the metabolism of thousands of organisms with little available background knowledge, many of which are difficult to cultivate. A change of paradigm is occurring in GSM reconstruction with the extensive use of automatic methods that are under active development in order to reduce the need for manual curation and costly experiments to refine the networks, and to account for the limited knowledge of non-cultured organisms. From now on, metabolic modelling appears conceivable for ‘unconventional’ or ‘non-model’ organisms for which the main available data is the genomic sequence. In such cases, the precision of GSM reconstruction is impaired by the lack of data available for these organisms or close relatives, impeding the use of modelling approaches based on flux optimisation and compelling the development of new approaches for their analysis.
Combining several abstractions of metabolism (Boolean, steady-state and hybrid) offers a way to investigate the metabolic capabilities of an unconventional organism, and these abstractions should be used together when possible to better understand its metabolism. In particular, logic programming approaches and Boolean abstractions of metabolic networks are promising for predicting the metabolic capacities of these organisms as well as their biological roles in symbiotic communities.
Efforts are still needed in the direction of facilitating metabolic inference for non-model organisms. In particular, the field of metabolism will benefit from methods dedicated to infer reactions beyond those present in knowledge databases, and thereby account for the still-unknown pool of functional sequences in less-studied genomes. In addition, the microbiota context in which unconventional organisms are mostly studied has to be taken into account. Building and refining models for microorganisms that cannot be cultured remains an open challenge, and metagenomic data together with metagenome-assembled genomes need to be linked to metabolic modelling. The existence of multiple and complementary formalisms to abstract metabolism will certainly prove useful to address these challenges in the next few years.
The authors declare that there are no competing interests associated with this manuscript.
This work was partially funded by ANR project IDEALG (ANR-10-BTBR-04) ‘Investissements d’Avenir, Biotechnologies-Bioressources’.