Seeds are essential for human civilization, so understanding the molecular events underpinning seed development and the zygotic embryo it contains is important. In addition, the approach of somatic embryogenesis is a critical propagation and regeneration strategy to increase desirable genotypes, to develop new genetically modified plants to meet agricultural challenges, and at a basic science level, to test gene function. We briefly review some of the transcription factors (TFs) involved in establishing primary and apical meristems during zygotic embryogenesis, as well as TFs necessary and/or sufficient to drive somatic embryo programs. We focus on the model plant Arabidopsis for which many tools are available, and review as well as speculate about comparisons and contrasts between zygotic and somatic embryo processes.
Seeds first appeared a little under four hundred million years ago. Fossil evidence places the origin of angiosperms (flowering plants) during the Cretaceous, but molecular evidence supports an earlier origin. The evolution of the seed was an innovation that led to the predominance of seed plants. Orthodox seeds allow the zygotic embryo to survive in a dry state for long periods, even under adverse conditions. Seeds can perceive environmental cues and complete germination when the chance of survival to the next seed set is optimal. The young plant relies on storage products deposited in the seed, generally in the endosperm or cotyledons, until the plant becomes autotrophic. Humans directly rely on seeds for ∼70% of our daily calories. Understanding seed development can support work to increase food production and to meet current and future global demands.
Embryogenesis consists of the processes occurring from the one-cell zygote to the mature embryo within the seed. In orthodox seeds, two main phases occur, the first of which is morphogenesis, during which the plant body plan is established. Later in embryogenesis, but overlapping with morphogenesis, maturation programs lead to deposition of storage products such as lipids, carbohydrates, and/or proteins, and accumulation of products for desiccation tolerance in preparation for the loss of water. In addition, dormancy may be established .
Early zygotic embryogenesis (ZE) has been difficult to study because the cells involved are few and embedded in several layers of maternal tissues. In addition, embryo-defective mutants identified by forward genetic approaches tend to have lesions in essential genes rather than in genes involved in embryogenesis specifically. Similarly, pattern formation mutants tend to be defective in gene products that are needed throughout the life cycle rather than explicitly for the establishment of domains in the embryo [5–7]. A more accessible model for ZE has been somatic embryogenesis (SE, also used as an abbreviation for somatic embryo), whereby somatic cells redifferentiate to follow an embryo program. How congruent the molecular programs of ZE and SE are remains an open question. However, SE is also an important mode of plant propagation and regeneration, allowing genetic engineering to meet agricultural challenges. Many plants, or even particular cultivars of a species, are nevertheless recalcitrant to SE (some recent reviews include [8–10]). Thus, understanding processes promoting SE formation is also relevant. In this review, we focus on key transcription factors (TFs) necessary and/or sufficient for promoting zygotic and somatic embryo programs in the model plant Arabidopsis thaliana (henceforth Arabidopsis).
In both ZE and SE, certain events must occur including the establishment of the body axis (polarity), and of the primary and apical meristems. As such, we briefly discuss key factors involved in the initiation of tissue systems during ZE, and speculate on possible parallels (or not) with SE. Many of the gene products discussed have roles outside of embryogenesis, but we limit discussion to embryo processes.
Overview of morphogenesis
Seed development in flowering plants begins with double fertilization, in which one of the sperm nuclei fuses with the egg cell nucleus, forming a diploid zygote. In most angiosperms, a second sperm nucleus fuses with two polar nuclei of the central cell, forming a triploid endosperm that supports and nourishes the developing embryo. Some recent reviews on double fertilization include [12–14]. The embryo and the endosperm grow enclosed by maternal integument tissues that later form the seed coat. Thus, the seed consists of three genotypes: the embryo with a 1 : 1 maternal (M) : paternal (P) ratio, the triploid endosperm with a 2M : 1P ratio, and the seed coat with a diploid 2M genotype.
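The genome dosage of the three seed compartments described above can be tabulated in a few lines. The following is only an illustrative sketch: the dictionary layout and the function names `ploidy` and `maternal_fraction` are hypothetical conveniences, not part of any published tool.

```python
from fractions import Fraction

# Genome copies contributed by each parent to the three seed compartments,
# as stated in the text (M = maternal, P = paternal).
SEED_COMPARTMENTS = {
    "embryo": {"M": 1, "P": 1},     # egg nucleus (1M) + sperm nucleus (1P)
    "endosperm": {"M": 2, "P": 1},  # two polar nuclei (2M) + sperm nucleus (1P)
    "seed_coat": {"M": 2, "P": 0},  # maternal integuments, diploid
}


def ploidy(compartment):
    """Total number of genome copies in the compartment."""
    genomes = SEED_COMPARTMENTS[compartment]
    return genomes["M"] + genomes["P"]


def maternal_fraction(compartment):
    """Fraction of genome copies that are maternal in origin."""
    genomes = SEED_COMPARTMENTS[compartment]
    return Fraction(genomes["M"], ploidy(compartment))


for name in SEED_COMPARTMENTS:
    print(name, ploidy(name), maternal_fraction(name))
```

Using `Fraction` keeps the ratios exact, so the 2M : 1P endosperm is reported as 2/3 maternal rather than a rounded decimal.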
Upon fertilization, a positive signal from the zygote causes the co-ordinated development of the endosperm [16,17]. There have been various studies showing endosperm development is critical for embryo viability [18–20]. In most angiosperms, the triploid endosperm initially undergoes free nuclear division without cytokinesis forming a syncytium . Later, the endosperm cellularizes [21,22]. The endosperm is largely transient and is mostly consumed by the embryo in some species, such as Arabidopsis, whereas in other plants (e.g. Zea mays), the endosperm is persistent and also supports the young seedling . This review does not address endosperm development. However, for further details on endosperm, the reader may refer to reviews [18,21,23,24]. For some interesting recent literature on endosperm-embryo communication, please see recent reviews [25,26]. Imprinting is also a fascinating phenomenon not covered in this review, but some recent reviews include [27,28].
Embryogenesis is generally considered to occur in two phases: morphogenesis and maturation. Morphogenesis is the phase of embryogenesis during which the plant body is established, starting with the zygote [11,29]. The primary meristems that give rise to the epidermis, vascular system, and ground tissue are established and are called the protoderm, procambium or provascular meristem, and ground meristem, respectively. The shoot and root apical meristems (SAM and RAM, respectively) that will perpetuate these primary meristems after completion of germination are also founded. A more detailed description of Arabidopsis morphogenesis and some key TFs involved are outlined below.
When considering developmental processes, it is important to realize that plant cells cannot generally move due to cell walls, so cellular migration, as found in animals, does not contribute to development. However, many plant cells are in symplastic communication via ‘channels’ called plasmodesmata (PD) that are regulated pathways connecting the plasma membrane, endoplasmic reticulum, and cytoplasm, cell-to-cell. PD can selectively allow transport of relatively large molecules including TFs between cells. Some recent reviews include [30,31].
Overlapping with and continuing after morphogenesis, maturation programs are active to accumulate storage reserve materials in the seeds. Additionally, desiccation tolerance is acquired, dormancy programs may be established, and desiccation occurs [32–35]. Excellent reviews for the maturation processes include [36–38].
Morphogenesis in Arabidopsis
Polarity and AC and BC determination
Cellular polarity is an important mechanism involved in many aspects of plant development, including early embryogenesis. In many species, including Arabidopsis, angiosperm egg cells are polarized with most of the cytoplasm and nucleus toward the chalazal (apical) end of the ovule (also a polar structure) and vacuoles at the micropylar (basal) end. Generally, a transient, more symmetrical stage occurs after fusion with the sperm cell, where the nucleus is more centrally located, and the large vacuole divides into many small, more evenly distributed vacuoles within the cell ([39,40] reviewed in ). Repolarization of the zygote involves elongation, nuclear migration towards the chalazal end and formation of a large vacuole at the micropylar end. The zygote elongates approximately three-fold in Arabidopsis [42,43].
In almost all flowering plant species the zygote undergoes an asymmetric transverse division [44,45], but there are exceptions (e.g. Triticum aestivum) [45,46]. This division leads to an apical daughter cell (AC) which always provides cells that contribute to the embryo proper (EP; the part of the embryo that upon completion of germination develops into the seedling) and a basal daughter cell (BC) forming the suspensor and at least part of the root (for review ). The suspensor is a transient structure that supports the developing embryo. Interestingly, suspensor cells have greater developmental potential and can express EP programs, up to forming viable embryos, when there are defects in the EP (reviewed in ).
Many embryogenesis/seed development studies have been focused on the model plant Arabidopsis because of abundant molecular and genetic tools available . In addition, morphogenesis consists of a simple and predictable pattern which is invariant [29,35] and visualization of development is relatively easy [50,51]. Not all dicots show this type of predictable pattern of divisions; for example, Gossypium hirsutum (cotton) has more random early divisions after the first asymmetric division, but organizes a recognizable protoderm by early globular stage , as well as the other primary and apical meristems later in morphogenesis.
In Arabidopsis, after the first division, the smaller AC generates a spherical pro-embryo by undergoing three rounds of cell division; first two rounds of longitudinal divisions at right angles to form four cells of equal size, followed by one transverse division that gives two tiers, producing an eight-cell stage also called an octant stage embryo (Figure 1) [29,34,41,53,54]. At the octant stage, the apical–basal axis of the embryo can be distinguished as the apical/upper tier embryo domain (UT), which generates the shoot apical meristem (SAM) and most of the cotyledons, and the basal/lower tier embryo domain (LT), which will contribute to the hypocotyl, RAM, and parts of the cotyledons. The BC divides transversely, generating a filament of seven to nine cells by the globular stage of EP development, giving the extraembryonic suspensor (S), with the exception of the uppermost cell that contributes to the root meristem through the formation of the hypophysis (Figure 1). The suspensor connects the pro-embryo to maternal tissue and provides nutrients and growth factors as well as pushing the developing embryo into the nutrient-rich endosperm [35,55,56].
Figure 1. Different stages of embryo development in Arabidopsis (dicot).
Molecular components of zygote polarity establishment have been identified in Arabidopsis. Several mutants show compromised zygote elongation, defects in the first asymmetric division, and/or problems in basal cell identity. These include yoda (yda), gnom (gn), grounded (grd), short suspensor (ssp), wrky2 (named for a protein domain that includes the amino acid motif WRKY), and the brassinosteroid signaling kinase 1 (bsk1) bsk2 double mutant.
The YDA pathway is involved in zygote elongation and development of basal cell lineages in Arabidopsis [57,58]. Zygotes homozygous for loss-of-function in yda fail to elongate and show an almost symmetrical first division with a normal-sized AC and a smaller-sized BC. The AC lineage exhibits a normal division pattern up to the octant stage, but the BC shows a random division pattern rather than forming the normal file of cells. YDA encodes a mitogen-activated protein kinase kinase kinase (MAPKKK). MAPKKKs participate in signaling events via a MAP kinase cascade, involving MAPKKs (MKK4 and MKK5 in the YDA pathway) and MAPKs (MPK3 and MPK6; the mpk3 mpk6 double mutant also has defects in zygote elongation and division, and lacks a developed suspensor). Interestingly, a gain-of-function yda mutant that has a deletion in the amino-terminal region, leading to the deregulation of this kinase, results in exaggerated suspensor growth.
Recently, WRKY DNA-BINDING PROTEIN 2 (WRKY2) was shown to be a target of the YDA/MKK4/5/MPK3/6 kinase cascade. Interestingly, the cascade appears to be initiated upon fertilization, when the sperm contributes SSP mRNA that is then translated in the zygote. SSP is a Brassicaceae-specific member of the brassinosteroid signaling kinase (BSK) family; the family members BSK1 and BSK2 also function upstream of YDA signaling. SSP appears to have lost residues involved in controlling an intramolecular interaction, such that the tetratricopeptide repeat (TPR) domain no longer interacts with the kinase domain. Generally, in BSK proteins this interaction is involved in a negative autoregulation that is relieved by phosphorylation. Loss of this interaction in SSP allows it to constitutively activate YDA and subsequently the rest of the kinase cascade, leading to activation of WRKY2 and expression of the WUSCHEL-RELATED HOMEOBOX transcriptional regulators WOX8 and WOX9 [61–63].
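The order of signaling components described above can be written down as a simple linear pathway. This is a deliberately minimal sketch: the list `CASCADE` and the helper names are hypothetical, and the linear layout ignores redundancy (for example, BSK1 and BSK2 acting alongside SSP) and any branching not mentioned in the text.

```python
# Linear order of the cascade as described in the text, most upstream first.
# Grouped entries such as "MKK4/5" stand for the redundant kinases named there.
CASCADE = ["SSP", "YDA", "MKK4/5", "MPK3/6", "WRKY2", "WOX8/WOX9"]


def upstream_of(component):
    """Components acting before `component` in the linear cascade."""
    return CASCADE[:CASCADE.index(component)]


def downstream_of(component):
    """Components acting after `component` in the linear cascade."""
    return CASCADE[CASCADE.index(component) + 1:]


print("Upstream of WRKY2:", upstream_of("WRKY2"))
print("Downstream of YDA:", downstream_of("YDA"))
```

A list is enough here because the text describes a single chain; a graph structure would be needed to represent parallel inputs into YDA.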
WOX8 and another family member, WOX2, along with WRKY2, are expressed in the egg and zygote, but after the first division, WOX8 is found only in the BC along with WOX9, whereas WOX2 is in the AC [62,64,65]. WOX8 and WOX9 are important for BC lineage determination, as wox8 wox9 double mutants show abnormal suspensor development. Defects are also apparent in the EPs, which arrest prior to the globular stage or even earlier and develop as finger-like projections [66,67]. Because WOX8 and WOX9 are initially expressed only in the basal lineage, the EP defects are a non-cell-autonomous effect that functions through WOX8/9 activating WOX2 expression in the apical lineage. WOX2, together with auxin signaling, is involved in the specification of AC fate and in embryonic shoot patterning. Some of these events are diagrammed in Figure 2. Many other factors are also involved, including GRD/RKD4, a RWP-RK transcriptional regulator [58,68], the small peptides CLAVATA-Like 8 (CLE8) and EMBRYO SURROUNDING FACTORs (ESFs) [69,70], and ZYGOTIC ARREST1 (ZAR1), which, like SSP, is a member of the receptor-like kinase (RLK)/Pelle kinase family.
Figure 2. Some key events during early embryogenesis in Arabidopsis.
Auxin and early events
The auxin-dependent pathway is another critical pathway in early embryogenesis. In Arabidopsis, after zygote division, auxin is transported from the BC to the AC by the suspensor-specific auxin efflux carrier PIN7 (Figure 2), forming an auxin response maximum that contributes to pro-embryo specification [72,73]. YUCCA (YUC) family members of the auxin biosynthetic pathway, including YUC3, YUC4, and YUC9, are a source of basal auxin. YUC3/4/9 are initially expressed in the suspensor, but loss-of-function mutants (yuc3/4/9) show abnormalities in apical regions of the embryo. Recently, Robert et al. showed that maternally supplied auxin from synthesis in the integuments is also involved in very early EP development. At the 16-cell EP stage, an apical auxin source is generated by TRYPTOPHAN AMINOTRANSFERASE OF ARABIDOPSIS1 (TAA1), another key enzyme involved in auxin biosynthesis, and YUC1/YUC4. PIN7 becomes relocalized such that it directs auxin flow basipetally. PIN1, another auxin efflux carrier, contributes to this new auxin maximum in the hypophysis and upper suspensor cells [72,73]. In addition to the PIN transporters, two redundant auxin influx carriers, AUX1 and LIKE AUX1 (LAX1), also help to establish the basal auxin accumulation.
Proper localization of the PIN auxin efflux carriers is key to morphogenesis. The gnom mutant was originally isolated as a pattern formation mutant [6,76]. Defects in this gene produce a range of phenotypes, one of which is a ball-shaped seedling that lacks normal apical–basal patterning and in which the first zygotic division is symmetrical. GNOM encodes a GDP/GTP exchange factor for small G proteins of the ARF class (ARF-GEF; ARF here denotes ADP-ribosylation factor) and, via endosomal recycling, is involved in the proper localization of PIN1 needed to move auxin basipetally [77,78].
Interestingly, both PIN1 (auxin efflux) and AUX1/LAX2 (auxin influx) are regulated by AUXIN RESPONSE FACTOR5/MONOPTEROS (ARF5/MP) in the inner embryonic cells, and this is important for hypophysis/root development. ARFs bind to auxin response elements (cis elements, AREs), but associated AUX/IAA proteins block their ability to regulate target genes. In the case of MP, the AUX/IAA protein is BODENLOS/INDOLE-3-ACETIC ACID INDUCIBLE12 (BDL/IAA12). When auxin is present, it binds to an auxin receptor (TRANSPORT INHIBITOR RESPONSE1, TIR1), activating a complex that ubiquitinates BDL, marking this protein for degradation by the proteasome and thereby freeing MP to regulate gene expression (for review, ). Both the mp loss-of-function mutant and the bdl gain-of-function mutant (in which BDL cannot be targeted for degradation via ubiquitination) lack roots [6,80].
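The derepression logic described above, and why two different mutants share a rootless phenotype, can be captured as a toy Boolean model. This is a sketch under strong simplifying assumptions: the function name and its parameters are hypothetical, all quantities are treated as on/off, and no kinetics or other ARF/IAA pairs are considered.

```python
def mp_regulates_targets(auxin_present, mp_functional=True, bdl_degradable=True):
    """Toy Boolean model: can MP regulate its target genes?"""
    # BDL is removed only if auxin triggers its ubiquitination AND the
    # BDL protein can actually be degraded.
    bdl_removed = auxin_present and bdl_degradable
    # MP acts only if it is functional and no longer blocked by BDL.
    return mp_functional and bdl_removed


wild_type = mp_regulates_targets(auxin_present=True)
no_auxin = mp_regulates_targets(auxin_present=False)
mp_loss_of_function = mp_regulates_targets(True, mp_functional=False)
bdl_gain_of_function = mp_regulates_targets(True, bdl_degradable=False)

print(wild_type, no_auxin, mp_loss_of_function, bdl_gain_of_function)
```

In this simplified view, the mp loss-of-function and bdl gain-of-function cases both end with MP unable to regulate its targets, which is consistent with both mutants lacking roots.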
Besides impacting on the accumulation of auxin transporters, MP directly regulates expression of TARGET OF MP7 (TMO7), a basic helix-loop-helix (bHLH) TF. MP is present in the inner cells of the very young globular embryo, but TMO7 moves from this site of synthesis to the hypophysis via PD, and is involved in root development [81–83]. Interestingly, an upstream regulator of MP expression involves SSP . In the ssp mutant, MP is ectopically expressed and the ssp mp double mutant partially restores suspensor development compared with the short suspensor found in ssp. Thus, SSP signaling is important to restrict MP expression in suspensor cells.
Another ARF/IAA pair (ARF9/IAA10 and potentially redundant suspensor-expressed factors) has an antagonistic relationship with MP and is important for suspensor and hypophysis development. Defects in auxin response in the suspensor lead to unusual cell divisions and activation of genes normally expressed in the EP. Thus, auxin has different effects in different cells, possibly mediated by different complements of expressed ARF and associated IAA genes. In the uppermost suspensor cell, auxin is needed for hypophysis specification and division; in the other suspensor cells, however, it restricts cell division and EP programs. Some ARFs are activators of gene expression while others repress gene expression, and MP and ARF9 belong to these different groups, as measured by a synthetic auxin-responsive promoter driving expression of a reporter gene.
The suspensor has a broader developmental potential that is restricted by the EP. Many studies in which the EP is ablated, either physically or by a genetic defect, found that cells in the suspensor can take on EP characteristics up to forming viable embryos, as in the twin mutants (reviewed in ). More recently, Liu et al. used in vivo living cell laser ablation to remove the EP from the suspensor within the developing seed and found that, if performed at the globular stage or earlier, the top-most suspensor cell accumulated auxin and formed a new embryo. Recently, transcriptome analysis of the suspensor-to-embryo transition has revealed a role for bHLH TFs in mediating this process.
Primary meristem establishment
The upper- and lower-tier cells each undergo a tangential (also called periclinal) division, giving rise to 16 cells: eight inner cells and eight outer cells. This stage is called the dermatogen stage because from this point on, the outer cells will divide anticlinally, giving rise to the protoderm, while the inner cells are the founder cells of the provasculature and the ground meristem (Figure 1). The WOX TFs, together with auxin signaling, have been linked to these tangential divisions, which are disturbed in wox2 single and wox2 wox8 double mutants. This phenotype is enhanced when the wox2 wox8 double or wox1 wox2 wox3 triple mutant is combined with mutations in mp/arf5. Expression of an ectopically accumulating, non-degradable version of BODENLOS (BDL), which inhibits MP/ARF5 gene regulation even in the presence of auxin, causes defects at the transition to the dermatogen stage. However, this dominant, gain-of-function bdl mutant later forms a protodermal layer, suggesting that while auxin response is important at the eight- to 16-cell transition for protoderm formation, it is not essential for epidermis formation.
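The cell numbers in the apical cell lineage up to the dermatogen stage follow from simple doubling. The sketch below assumes that every cell of the pro-embryo divides exactly once per round, as the invariant Arabidopsis pattern in the text implies; the names `AC_DIVISIONS` and `ac_lineage_counts` are hypothetical.

```python
# Rounds of division in the apical cell (AC) lineage, in order, as described
# in the text: (division orientation, resulting stage).
AC_DIVISIONS = [
    ("longitudinal", "2-cell"),
    ("longitudinal", "4-cell"),
    ("transverse", "octant"),
    ("tangential", "dermatogen"),
]


def ac_lineage_counts():
    """Return {stage: cell count}, assuming each cell divides once per round."""
    counts = {}
    cells = 1  # the AC itself, produced by the first zygotic division
    for _orientation, stage in AC_DIVISIONS:
        cells *= 2
        counts[stage] = cells
    return counts


counts = ac_lineage_counts()
# Each tangential division of an octant cell yields one inner and one outer cell.
inner = outer = counts["dermatogen"] // 2
print(counts, inner, outer)
```

The model covers only the embryo proper; suspensor cells derived from the BC are not counted.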
The differentiation of the outer layer and inner cells accords with the gene expression pattern of two homeodomain leucine zipper class IV (HD-ZIP IV) TFs, ARABIDOPSIS THALIANA MERISTEM LAYER 1 (ATML1) , and PROTODERMAL FACTOR 2 (PDF2) . Transcript accumulation corresponding to these two TFs is initially detected throughout the early EP, but immediately after the tangential divisions, expression becomes preferential to the protodermal layer [90–92]. The atml1 pdf2 double mutant shows severe defects in epidermal cell specification leading to embryo lethality [89,93,94]. ATML1 is also sufficient for protodermal identity [95,96].
Further divisions form the globular stages (32–64 cells), during which the ground meristem and the provascular/procambial meristem, which give rise to the ground and vascular tissues, are specified, establishing the radial axis [35,53,55,97] (Figure 1). Also during this stage, the hypophysis, derived from the top-most cell of the basal cell lineage, divides asymmetrically to form a smaller lens-shaped cell, which is the precursor of the quiescent center (QC), and a larger basal cell, which is the precursor of the distal stem cells of the root meristem, the columella [35,53,55,97].
Little is known about the initiation of the ground meristem compared with subsequent maintenance and patterning. MP, that ‘marks’ the inner cells at the dermatogen stage, has a role in ground tissue initiation , as well as provascular meristem development (discussed below). The ground tissue daughter cells will then divide to give the endodermis (most interior layer of the cortex; roles include regulation of transport of water and ions into the vascular tissue) and the cortex (exterior layer) cell lineages. Furthermore, auxin activity was found to be highest in vascular tissue, lowest in the protoderm, and intermediate in the ground tissue in globular stage embryos. Studies on vasculature formation have largely focused on MP. During early embryogenesis, mp mutants show defects in the characteristic divisions to generate the vascular tissues and fail to develop an embryonic root [80,99–101]. Later, transcript profiling studies revealed numerous direct MP target TFs, which act downstream of MP in root initiation, including bHLH TFs TMO5 and TMO5-LIKE1 (T5L1) . TMO5 marks the provascular initial cells in the early globular stage embryo, indicating that it mediates MP functions in the pro-embryo. The tmo5/t5l1 double mutants show a reduced vascular bundle [81,102]. These defects are also observed in a bHLH146 mutant, also called lonesome highway (lhw) . A double mutation of lhw and its close homolog lonesome highway-like 3 (lhl3) shows severe phenotypic defects in vascular tissue formation . TMO5 and LHW interact to form a heterodimeric complex [102,103], and constitutive co-expression of TMO5 and LHW activates periclinal divisions in all cell types, indicating that the TMO5-LHW dimer is necessary and sufficient to regulate the periclinal divisions in the establishment and for the maintenance of the vasculature in the post-embryonic root . 
LONELY GUY 4 (LOG4), an enzyme involved in the biosynthesis of cytokinin (CK, a plant hormone), was identified as a direct target of the TMO5-LHW complex. Single log4 mutants do not have any phenotype, but higher-order mutants with orthologs show severe defects in embryonic vascular tissue development and patterning. Another interesting study identified LOG3, LOG4, and ARABIDOPSIS HISTIDINE PHOSPHOTRANSFER PROTEIN 6 (AHP6) as direct targets of the LHW-T5L1 complex. Additional reading on the initiation of vascular development includes reviews [107,108].
Shoot and root apical meristem establishment
At the end of the late globular stage, cells at the top peripheral regions begin dividing more rapidly to form the two cotyledons. Because this is a transition from a radially symmetrical globular stage embryo to a bilaterally symmetric heart stage embryo, this stage is sometimes referred to as ‘transition or triangular stage’ (not shown in Figure 1). The embryo greens during this stage and vascular differentiation (xylem and phloem) becomes apparent. Continued cell division and expansion generate later heart and then torpedo stages, and the RAM is organized. The SAM becomes anatomically apparent during torpedo stages as a population of smaller cells. The suspensor degenerates during later morphogenesis  and further expansion causes the embryo to bend to fit within the seed coat (bent cotyledon and mature stage; Figure 1).
Small populations of cells within the SAM and RAM divide slowly to provide the surrounding stem cells, and these regions are called the organizing centers . The organizing centers maintain the stem cell identity of adjacent cells . While the primary meristems are established in the embryo, they are perpetuated after completion of germination by the apical meristems.
Two families of TFs are involved in shoot versus root identity, specifically the class III HD-ZIP family and the AP2-domain PLETHORA (PLT) family [113,114], respectively. The PLT genes (PLT1, 2, 3, and 4; PLT3 is also called AINTEGUMENTA-LIKE6, AIL6; PLT4 is also called BABY BOOM, BBM) are expressed in the LT cells at the eight-cell stage of EP development. The HD-ZIP III family includes genes encoding the TFs PHABULOSA (PHB), PHAVOLUTA (PHV), REVOLUTA (REV), ARABIDOPSIS THALIANA HOMEOBOX 8 (ATHB8), and ATHB15 (also called ICU4), which are expressed by the 16-cell stage in the upper tier cells at the globular stage of embryogenesis and are redundant in function for post-embryonic shoot maintenance. HD-ZIP III family TFs regulate the formation of the SAM, the boundary between the SAM and the cotyledons, the central portion of the embryo, and the dorsal/ventral pattern of leaves during post-embryonic development. The HD-ZIP IIIs are post-transcriptionally regulated by microRNA 165/166 (MIR165/166) family members. Constitutive expression of PLT1 or PLT2 induces the hypocotyl, root, and root stem cell niche from the basal region of the embryo, indicating a central role for PLT1 and PLT2 in basal cell fate determination. TOPLESS (TPL) encodes a transcriptional co-repressor that regulates expression of PLT1 and PLT2 [117,118]. In the tpl-1 mutant, both PLT1 and PLT2 are constitutively expressed at both ends of the embryo, resulting in a range of phenotypes, the most severe being the transformation of the shoot to a root. TPL associates with the PLT promoters, so this may be a result of direct regulation, although presumably via interaction with another protein, as TPL is not a DNA-binding factor. TPL has been shown to interact with BDL as well as many other TFs. PLTs are induced by auxin, although they do not appear to represent direct ARF targets.
A mutation in PHB that eliminates the ability of MIR165/166 to regulate the transcript suppresses the double-root phenotype of tpl-1. Conversely, using the PLT2 promoter to ectopically express forms of HD-ZIP IIIs that cannot be regulated by MIR165/166 converts basal cells into a second SAM and produces a double-headed seedling. These studies demonstrate antagonistic roles for the HD-ZIP IIIs and PLTs in determining apical and basal cell fate. However, it is not clear whether the HD-ZIP IIIs and PLTs directly control each other's expression, or how they interact to maintain the boundary between apical and basal cells. These interactions are diagrammed in Figure 3.
Figure 3. PLT and HD-ZIP III have an antagonistic relationship to determine shoot versus root identity.
A comprehensive discussion of all interesting genes involved in SAM and RAM development is beyond the scope of this review. Some of the other TFs (e.g. MP, TMO7) that are involved in establishing the QC of the root are described above, but many other important factors have been reported that are not discussed here. The interested reader is referred to recent reviews for further reading, including [121,122]. Several molecular pathways have been identified that regulate the organization of the SAM. WUSCHEL (WUS) is involved in specifying the organizing center (OC) in the SAM and participates in feedback regulation with CLAVATA3 (CLV3) [123,124]. Besides the well-known factors, including the KNOTTED-1 homeodomain protein homolog SHOOT MERISTEMLESS (STM) and class III HD-ZIP factors, more genes have been shown to regulate WUS expression, including the class II HD-ZIP genes ATHB2, ATHB4, HOMEOBOX ARABIDOPSIS THALIANA 1 (HAT1), HAT2, and HAT3 [126–128]. Several class II HD-ZIP genes (e.g. HAT2, HAT3, and ATHB4) are directly regulated by the class III HD-ZIP gene REV, indicating that the HD-ZIP III genes might regulate development by activation of the HD-ZIP II genes [129,130]. For more information, some recent reviews include [131–134].
Overview of somatic embryogenesis
SE is a process by which somatic cells redifferentiate to form embryos. These somatic embryos may form directly on the explant (direct SE) or there may be an intervening embryogenic callus phase (indirect SE). Somatic embryos are bipolar structures that do not have a vascular connection with the explant material. SE can be divided into inductive and developmental phases: changes in differentiation status and acquisition of embryogenic competence occur during the former phase, and differentiation into SEs occurs during the latter phase.
Induction of SE typically involves the use of plant growth regulators (PGRs), often with a stress treatment. Most commonly, treatment with auxin is involved, usually the synthetic auxin 2,4-D, sometimes in combination with CKs (reviewed in ). The synthetic auxin 2,4-D may act as an auxin, and/or as a stressing agent, and may induce synthesis of endogenous auxin. The PGR/stress treatment can lead to redifferentiation of cells that are competent for SE, via de- or trans-differentiation to a status with greater potency and establishment of embryo programs by embryo induction. These stages are followed by embryo development, which often requires removal of the exogenously added auxin. Although SE is typically thought of as an in vitro phenomenon, Kalanchoë (commonly known as ‘mother of thousands’) forms somatic embryos that convert into plantlets at the leaf margins .
A loss of symplastic communication is also a feature of reprogramming of cells (reviewed in ). Recently, Godel-Jedrychowska et al. documented that symplastic isolation of embryogenic cells is required for redifferentiation during SE. The cells competent for SE express WOX2 (a marker of the AC), but although auxin is commonly used for induction, cells that form embryos show reduced auxin response (as measured by a synthetic auxin promoter driving a reporter gene). A reduction in callose biosynthesis, which would cause deficiencies in symplastic isolation, repressed SE. These findings applied both to SE induced from WT immature zygotic embryos (IZE) and to SE induced by overexpression of BABY BOOM (BBM, discussed below).
Changes in epigenetic state are involved in redifferentiation processes, and involve DNA methylation as well as histone modifications (reviewed in ). As discussed below, many Arabidopsis genes when mutated or compromised in expression, result in SE development and these tend to encode factors involved in epigenetic regulation at the developmental transition from embryo to seedling.
Somatic embryos have been used as a more easily accessible model for zygotic processes that occur embedded in multiple layers of maternal tissues, but there are some obvious differences. SEs lack communication with endosperm/maternal tissues, as they develop without these tissues, and there are differences in maturation processes, specifically a lack of desiccation and dormancy. SEs may have a reduced suspensor or even lack an obvious one, but morphological stages in the EP are similar to ZE. Often, the initial division of embryogenic cells in SE is asymmetric, similar to ZE (reviewed in ). However, in microspore embryogenesis, the asymmetric division necessary for the development of the male gametophyte is replaced by a symmetric division, producing a haploid SE.
Summary of SE systems in Arabidopsis
There are both direct and indirect SE systems in Arabidopsis. In almost all cases, embryonic or meristematic tissues serve as the explant source. Gaj used a direct SE system that she developed and subsequently documented the requirement for several TFs and proteins involved in hormone synthesis and response [143,144]. Briefly, IZE are isolated from developing seeds and used as the explant. A range of developmental stages were tested, with later stages, corresponding to bent cotyledon embryos (400–700 μm), showing 65–90% SE production for Col and Ws ecotypes . The explants were placed on B5 medium supplemented with 5 μM 2,4-D and sucrose. Within 3 weeks, primary somatic embryos (PSEs) are evident on the explant (Figure 4A). It was possible to move these PSEs to auxin-free medium (in some protocols, also including gibberellic acid (GA)) to allow conversion and recovery of plantlets. The cells producing the embryos originate within the protodermis and subprotodermal layers of the adaxial side of the cotyledons . Formation of PSE requires LEC1, LEC2, and FUS3 (these key embryo TFs are discussed below), with loss-of-function alleles of these genes showing only 1–4% SE production, and this rare SE formation is via an indirect route involving callus . Higher-order mutants of these genes only produce callus. Gaj also reported that younger stage IZE explants more often produce embryogenic callus.
Figure 4. Somatic embryo systems in Arabidopsis.
Su et al. used a modified system from Ikeda-Iwai et al. that involves an indirect route of SE production. As described above, IZEs are cultured to produce PSE that are then moved to a liquid medium with a higher concentration of 2,4-D (9 μM) to generate embryogenic callus. This can be cycled in this higher auxin medium or moved to auxin-free medium to induce secondary SE (SSE). If cultured too long in the high 2,4-D medium, the callus loses embryogenic capacity, but this can be reestablished by culture on solid B5 medium with lower 2,4-D (Figure 4B).
Ikeda-Iwai et al.  also developed an Arabidopsis SE system from vegetative tissues (Figure 4C). This system required the use of meristematic tissues (SAM, floral buds, or axillary buds; SAM was most efficient) and subjected the explants to osmotic stress before moving them to B5 medium with 4.5 μM 2,4-D. After 10–21 days, callus forms at the shoot apical region, followed by SE development. If the explants are moved to 2,4-D-free medium, additional SE forms on the callus. The best conditions for the Col ecotype involve stress treatment with mannitol for 6 h. The developmental stage of the SAM is important, with shoot apical explants from 5-day-old seedlings giving the most efficient SE at 29%. Younger explants die, whereas explants from older seedlings only generate callus.
Other Arabidopsis SE systems appear to develop at or near SAMs [150–152] either directly or via embryogenic callus, or from IZE explants . Few systems in Arabidopsis are from other tissues. Mathur et al.  reported SE-like structures from protoplasts derived from auxin conditioned roots, and Luo and Koop  were able to obtain apparent globular stage embryos with a suspensor-like structure from leaf protoplasts, but only early stages of development occurred.
Arabidopsis transcription factors that promote embryo identity
TFs necessary and sufficient for embryogenesis
While forward genetic screens to identify embryo defective mutants generally result in the identification of non-redundant genes essential for viability rather than for embryo development or designation of domains within the embryo specifically [5,53], some embryo defective mutants have lesions in genes that are more specific for seed development. These include genes encoding four central regulators LEAFY COTYLEDON1 (LEC1), ABSCISIC ACID INSENSITIVE3 (ABI3), FUSCA3 (FUS3), and LEAFY COTYLEDON2 (LEC2), otherwise referred to as the LAFL factors, that are key to embryo development. These genes are necessary and sufficient for embryo processes, and show the highest levels of transcript accumulation in the developing embryo and endosperm ( and Arabidopsis eFP Browser ), although they have also been reported to have developmental roles after completion of germination [158–163].
To be necessary, a loss-of-function of the factor would result in an embryo defective phenotype (whether specific to embryogenesis or not). Sufficiency is tested by ectopic expression of the factor and assessment of whether it can drive embryo-specific programs outside of the embryo-context. The LAFL factors are among those necessary and sufficient (to different extents; discussed below) to drive embryo processes. Other TFs may be sufficient to drive embryo programs, but not necessary, perhaps due to redundancy. Others may show embryo deficient phenotypes, indicating necessity, but are not sufficient to drive embryo programs, often because they are needed for cell viability rather than embryo specification, but also possibly due to interacting factors missing in the ectopically expressed domains.
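The two tests described above can be read as a simple two-way classification. The following is a minimal sketch of that logic only; the `Evidence` record and the `classify` function are hypothetical names introduced here for illustration, and the two example entries are deliberately simplified readings of what this review states about LEC1 and AGL15 (real evidence depends on allele, stage, tissue, and culture conditions).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """Hypothetical summary of the two genetic tests for one TF."""
    lof_embryo_defective: bool     # loss-of-function gives an embryo-defective phenotype
    ectopic_embryo_programs: bool  # ectopic expression drives embryo programs outside the embryo


def classify(ev: Evidence) -> str:
    """Map the outcome of the two tests onto the categories used in the text."""
    if ev.lof_embryo_defective and ev.ectopic_embryo_programs:
        return "necessary and sufficient"
    if ev.lof_embryo_defective:
        return "necessary, not sufficient"
    if ev.ectopic_embryo_programs:
        return "sufficient, not necessary"
    return "neither"


# Simplified illustrative entries (assumptions, not a curated dataset).
examples = {
    "LEC1": Evidence(lof_embryo_defective=True, ectopic_embryo_programs=True),
    "AGL15": Evidence(lof_embryo_defective=False, ectopic_embryo_programs=True),
}

for name, ev in examples.items():
    print(f"{name}: {classify(ev)}")
```

A "sufficient, not necessary" result does not by itself distinguish a dispensable factor from one masked by redundancy, which is why the text treats redundancy as a likely explanation.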
While putative orthologs of LEC1 and ABI3, and perhaps LEC2, appear to have arisen prior to the ‘invention’ of the seed, FUS3 orthologs have only been found in seed plants to date ( and reviewed in ). Roles for putative orthologs in non-seed plants include desiccation tolerance, and many aspects of LAFL loss-of-function phenotypes indicate that they may have been recruited to control processes during the maturation phase and establish the quiescent state of the mature seed.
LEAFY COTYLEDON 1 (LEC1, At1g21970) is a NF-YB/HAP3 subunit of a CCAAT box-binding factor (CBF) . NF-YBs form complexes with NF-YA/HAP2 and NF-YC/HAP5 subunits, with the NF-YA subunit primarily responsible for recognition of the DNA cis element (the CCAAT box) (reviewed in ). NF-YB subunits include a broadly conserved histone-like domain (the central B domain) that encompasses residues necessary for complex formation as well as residues contributing to binding DNA . The loss-of-function lec1 shows defects in embryo morphogenesis and embryonic maturation, including decreased accumulation of storage products and loss of acquisition of desiccation tolerance. When developing embryos are rescued prior to desiccation, the resulting lec1 homozygous seedlings show leaf-like traits in the cotyledons, including the development of trichomes (in Arabidopsis trichomes are a leaf, but not a cotyledon feature) [34,169,170]. LEC1 is preferentially expressed in the seeds, beginning early in embryo development and declining during maturation as shown in Figure 5A [166,171–174]. Early defects include abnormal cell divisions in the suspensor . The loss-of-function mutant also shows precocious activation of the SAM during embryogenesis. When LEC1 is constitutively expressed using a semi-constitutive 35S viral promoter , the genetically engineered plants produced somatic embryos (SE) on the seedlings, which indicates that LEC1 is sufficient to promote embryo identity . However, this sufficiency has a timeframe. When the 35S : LEC1 transgene included a glucocorticoid receptor (GR) domain that allows the TF to only be transported into the nucleus upon hormone treatment with the synthetic steroid dexamethasone, only early timepoints during or shortly after completion of germination resulted in embryo/embryo-like tissue. By 4 days after germination, the seedlings did not produce this tissue and instead appeared wild type . 
The central B domain of the LEC1 protein was confirmed to be required and sufficient for embryogenesis. Moreover, an amino acid residue (aspartic acid at position 55), which specifically exists in LEC1-type B domain proteins (LEC1 and L1L), but not in non-LEC1-type NF-YB subunits that have a lysine in this position, was verified to be necessary for embryogenesis . LEC1 interacts with other proteins such as bZIP67, LEC2, and others to control various programs during seed development [167,176]. The propagules that arise on Kalanchoë, by what appears to be a combination of organogenesis and SE, require an altered form of KdLEC1 [137,177].
Figure 5. Interactions between key regulators of embryogenesis, SE and the transition to seedling development.
LEAFY COTYLEDON1-LIKE (L1L, At5g47670) is the gene most closely related to LEC1, encoding a LEC1-type NF-YB subunit with 83% sequence identity. L1L is primarily expressed in developing seeds and transcript accumulation is highest during maturation (Figure 5A). The l1l RNAi suppression lines showed incompletely penetrant phenotypes ranging from arrest at the globular stage with abnormal suspensor divisions to underdeveloped cotyledons. Although somatic embryos were not produced on vegetative tissues in 35S : L1L, the seedlings did express embryo-specific programs (e.g. accumulation of storage products) in leaves that had cotyledon characteristics. In addition, the overexpression line could suppress the desiccation intolerant phenotype of the lec1 mutant, as could L1L driven by the LEC1 regulatory sequences .
LEAFY COTYLEDON 2 (LEC2, At1g28300) contains a B3 domain and is another major TF involved in embryogenesis in plants. Loss-of-function lec2 shows a similar phenotype to lec1 in that both have trichomes on their cotyledons upon embryo rescue, and defects in embryo shape . Ectopic LEC2 expression in vegetative cells could induce a somatic embryo-like seedling phenotype demonstrating sufficiency to drive embryo programs . Like LEC1, there is a timeframe for sufficiency. When a 35S : LEC2 transgene is expressed, the ‘seedlings’ produce masses of SE . But when an inducible form (35S : LEC2-GR) is used, while seedlings treated with dexamethasone activate embryo programs and leaves take on embryo-like features, they do not produce somatic embryos [180–182].
FUSCA3 (FUS3, At3g26790) is a B3 domain TF. Loss-of-function fus3 seeds are desiccation intolerant but, like lec1 and lec2, can be rescued into culture prior to drying, upon which the cotyledons show ‘leafy’ traits (trichomes). These trichomes were much less prevalent than in lec1 or lec2 mutant plants, and the embryos have a more normal bent cotyledon shape [170,183]. Unlike LEC1 and LEC2 overexpression, which produces somatic embryos on the seedlings, the FUS3 ectopic phenotype was milder and produced cotyledon-like leaves . FUS3 transcript is primarily expressed in seeds but is also detected at low levels in tissues after completion of germination . FUS3 transcript accumulation begins by the eight-cell embryo stage, and continues relatively late during embryo development compared with LEC1/LEC2 (Figure 5A and [171–174,186–190]). FUS3 is positively regulated by the plant hormone abscisic acid (ABA) and negatively regulated by GA [191,192].
ABSCISIC ACID INSENSITIVE3 (ABI3, At3g24650) is the third B3 domain TF of the LAFL genes. Like lec1, lec2, and fus3 mutants, abi3 seeds are desiccation intolerant, but homozygous plants could be generated by rescue before the seed dries. Unlike the other LAFL mutants, abi3 did not show abnormal leafy cotyledons, but did show reduced seed storage protein accumulation. ABI3 is also involved in ABA perception and response [193,194]. Ectopic expression of ABI3 did not produce embryos on vegetative tissues, but some seed-specific storage protein mRNA was found in seedlings, indicating the ability to ectopically activate some embryo programs . ABI3 is expressed at highest levels during the late stages of embryo development (maturation phase) [171–174,188,195].
Genes that are sufficient for embryo identity/somatic embryogenesis
In addition to the key LAFL factors described above, many other genes were found to be sufficient to confer embryo identity when ectopically expressed, but were not necessary for normal zygotic embryo development. This may be due to redundant gene function(s) in the embryo . Some of these genes can, like LEC1 and LEC2, promote SE on various parts of the seedlings. Others only promote SE on particular tissues (e.g. WUS on roots) or in particular culture conditions (e.g. AGL15, AGL18).
AGAMOUS-Like15 (AGL15, At5g13790) and AGAMOUS-Like18 (AGL18, At3g57390) encode MADS domain TFs that are preferentially, but not exclusively, expressed during seed development [197,198]. Overexpression of these factors dramatically promoted secondary embryo production from zygotic embryo explants in the absence of exogenous hormones [199,200]. Subculturing allowed continued somatic embryo development in the absence of any exogenously added hormones for a long period of time, with the oldest 35S : AGL15 cultures being nearly 24 years old [199–201]. Additionally, both AGL15 and AGL18 can promote SE development from the SAM when mature seeds are allowed to complete germination in liquid medium containing the synthetic auxin 2,4-D ( and Paul and Perry, unpublished). These proteins can interact .
SOMATIC EMBRYOGENESIS RECEPTOR KINASE 1 (AtSERK1, At1g71830) does not encode a TF, but rather a leucine-rich repeat (LRR) transmembrane RLK that is broadly expressed, including in developing seeds. It can nevertheless promote SE when ectopically expressed in explants cultured on auxin-containing media . It has been reported to form a complex that includes AGL15 . Phosphorylation of AGL15 has been reported and is important for function [204,205].
Several APETALA2 (AP2) domain TFs including BABY BOOM/PLETHORA4 (BBM/PLT4, At5g17430), EMBRYOMAKER/AINTEGUMENTA-LIKE5/PLT5 (EMK, At5g57390), PLETHORA2 (PLT2, At1g51190), and some of the WOUND INDUCED DEDIFFERENTIATION (WIND) factors promote SE from seedlings in Arabidopsis when ectopically expressed [113,147,206–208]. Other members of the PLT family (PLT1, 2, 3, and 7) can also induce SE when ectopically expressed .
Additional TFs can promote SE on roots when ectopically and/or overexpressed. These TFs include WUSCHEL (WUS, At2g17950) [210–212], PLANT GROWTH ACTIVATOR37 (PGA37/MYB118, At3g27785), which encodes a MYB transcription factor, and MYB115 (At5g40360), a paralogous gene of PGA37 .
The Arabidopsis genome contains five RWP-RK DOMAIN-CONTAINING (RKD) genes. Expression of AtRKD1 (At1g18790) and AtRKD2 (At1g74480) was detected in reproductive organs, mainly in the egg cell. Single or double mutants of these two genes did not show any phenotype or embryo morphological differences compared with wild type, but overexpression of AtRKD1 or AtRKD2 individually led to the proliferation of tissue expressing egg cell markers . AtRKD4 (At5g53040) is another member of this family that is preferentially expressed in the early EP and suspensor. As discussed above, deficiencies in RKD4 (also called GROUNDED, GRD) cause abnormal zygotic cell elongation and subsequent division. Increased ectopic expression of GRD/RKD4 produced embryo-like structures on roots .
Loss-of-function mutants that produce ectopic embryos
Many other genes promote SE in Arabidopsis when present as loss-of-function alleles and these include pkl [215,216], val1 val2 [217,218] (VAL1 is also called HSI2; and VAL2 is also called HSL1), clf swn , hda6 hda19 RNAi , and a double knock-out/down of AtBMI1A/B  or AtRING1a/b . These mutants/RNAi plants produce embryos/embryonic calli on aberrant seedlings and may be considered as having defects in the transition from embryo to post-germinative development. Interestingly, this group of genes encodes proteins that include domains shown or predicted to be involved in epigenetic control of gene expression, in particular repression of gene expression. Where it has been investigated, expression of LAFL and other genes mentioned above is derepressed in the seedlings, and in the case of AGL15, VAL1 directly represses gene expression [218,220–224]. VAL1 forms protein interactions with HDA6, while VAL2/HSL1 interacts with HDA19 [225,226]. VAL1 and VAL2 have also been reported to interact [225,227]. Other studies found evidence for direct repression of LAFL genes by VAL–HDA complexes, but these studies used the 35S promoter for ChIP and a different stage of seedling development [225,226], rather than the native promoter as used in Chen et al. , where the association of VAL1 with LAFL regulatory regions was not found. In addition, CLF and SWN, which are components of the Polycomb Repressive Complex2 (PRC2) involved in depositing trimethylation on lysine-27 of histone 3 (H3K27me3, an epigenetic mark associated with repression of expression), may directly regulate LEC2, FUS3, and ABI3 .
Interaction network among these transcription factors
Genes responsive to the accumulation of many of the factors mentioned above have been identified, and in some cases, those that are bound by the TF determined (so-called direct responsive targets) [159,176,180,201,209,224–226,229–234]. Extensive interaction exists among these genes and the products that they encode, and some direct regulatory interactions are shown in Figure 5B. At the bent cotyledon (B-COT) stage during ZE development, LEC1 could directly bind and regulate LEC2, FUS3, ABI3, AGL15, ATSERK1, L1L, PGA37, and EMK, as well as potentially autoregulating itself . Some of these genes are also directly induced targets of LEC1 during SE (FUS3, ABI3, L1L, and EMK). LEC2 directly regulates AGL15 and FUS3 [180,234] and indirectly regulates ABI3 . Recently, WIND1, VAL1/HSI2, and MYB118 were also found to be directly responsive LEC2 targets . Interestingly, LEC2 interacted with FUS3 in vitro in pull-down assays and in vivo in BiFC assays . LEC1 and LEC2 can be coimmunoprecipitated in the presence of NF-YC2 and the OLE1 promoter (a target DNA sequence) in pull-down experiments . FUS3 direct target genes include LEC1, AGL15, ABI3, BBM, and L1L, and FUS3 also autoregulates . ABI3 could regulate itself and FUS3 [188,231]. BBM regulates several of the LAFL genes .
WUS did not bind to any of the genes mentioned above in shoot apices of Arabidopsis  or interact with any of them, which suggests that WUS may participate in embryo development through a different pathway, that it acts at a relatively downstream position in the pathway, or that the context of the shoot apex did not permit binding to these genes. However, several genes that regulate WUS are directly induced targets of LEC1, LEC2, and/or AGL15 in ZE and SE (see below). Regulatory regions of WUS were associated with BBM , although the response of transcript accumulation was not documented . A microarray comparison of a MYB118 overexpression line with the wild type control showed increased FUS3 transcript accumulation in response to increased MYB118 . In addition, microRNAs are involved in SE. As an example, PLT1 and PLT2 were found to be regulated by miRNA396 via GROWTH REGULATING FACTOR (GRF) TFs [239,240].
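The direct regulatory relationships listed in this paragraph can be written down as a small directed graph, which makes autoregulation and reciprocal regulation easy to read off. The sketch below is illustrative only: it encodes just the edges named in the paragraph (BBM is left out as a regulator because its LAFL targets are not itemized here), it ignores stage, tissue, and the sign of regulation, and the function names are hypothetical.

```python
# Regulator -> set of direct targets, as listed in the paragraph above.
# Note: PGA37 and MYB118 are two names for the same gene; both are kept
# here exactly as they appear in the text.
edges = {
    "LEC1": {"LEC2", "FUS3", "ABI3", "AGL15", "SERK1", "L1L", "PGA37", "EMK", "LEC1"},
    "LEC2": {"AGL15", "FUS3", "WIND1", "VAL1", "MYB118"},
    "FUS3": {"LEC1", "AGL15", "ABI3", "BBM", "L1L", "FUS3"},
    "ABI3": {"ABI3", "FUS3"},
}


def autoregulators(graph):
    """TFs that list themselves among their own direct targets."""
    return sorted(tf for tf, targets in graph.items() if tf in targets)


def reciprocal_pairs(graph):
    """Pairs of distinct TFs that each directly regulate the other."""
    pairs = set()
    for tf, targets in graph.items():
        for target in targets:
            if target != tf and tf in graph.get(target, set()):
                pairs.add(tuple(sorted((tf, target))))
    return sorted(pairs)


def regulators_of(graph, gene):
    """All TFs in the graph that directly regulate the given gene."""
    return sorted(tf for tf, targets in graph.items() if gene in targets)


print(autoregulators(edges))          # ['ABI3', 'FUS3', 'LEC1']
print(reciprocal_pairs(edges))        # [('ABI3', 'FUS3'), ('FUS3', 'LEC1')]
print(regulators_of(edges, "AGL15"))  # ['FUS3', 'LEC1', 'LEC2']
```

Even this reduced graph shows the feedback structure among the LAFL factors: FUS3 sits in reciprocal loops with both LEC1 and ABI3, and AGL15 receives input from three of the four regulators encoded here.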
Is there congruency between gene regulatory programs during SE and ZE?
At a molecular level, there have been reports of both similarities and differences between SE and ZE transcriptomes. A study in cotton indicated a high degree of similarity between SE and ZE transcriptomes, with the main difference being transcript accumulation from many stress-associated genes in SE . However, in Arabidopsis, Hofmann et al.  found few similarities between ZE and SEs, with the SE datasets that they evaluated appearing more similar to shoot and root tissues, and germinating seeds. In another study, Magnani et al.  separated cells that would undergo SE from callus by using INTACT (isolation of nuclei tagged in specific cell types), in which the SE competent cells were labeled by driving the chimeric protein used for INTACT with a LEC2 promoter. They found increased transcript accumulation of WIND1 and BBM, but not LEC1 or AGL15, in the SE-competent cells compared with callus cells. They proposed that WIND1 and BBM, along with SERK1, confer embryogenic competence. They also proposed that LEC1 and AGL15 transcripts were absent because the stage sampled was too early for LEC2 to have led to the expression of these genes. Additionally, transcripts of the very early fate markers WOX2 and WOX9 were not detected, possibly due to differences in very early patterning between SE and ZE. As a whole, they found that the transcriptome of early embryogenic cells resembled that of octant stage zygotic embryos and meristematic cells.
The study by Kadokura et al.  that used the SE system of Ikeda-Iwai et al.  where the embryos formed from the proximity of stress-treated young seedling SAMs (Figure 4C) found key embryo TFs expressed in the SE-forming explants compared with non-SE-forming explants. They determined that embryonic commitment occurs by day 3 of culture at which time they could see the LAFL transcripts preferentially in SE producing explants. As a whole, the SE-forming explants had a mixed embryo, SAM, and root identity.
Wickramasuriya and Dunwell  looked at a direct SE system where SE is derived from IZE (from Gaj ; Figure 4A), sampling for RNA-seq after 5, 10 and 15 days of culture, comparing these stages to leaf tissue. Many embryo markers, including the LAFLs, L1L, BBM, and AGL15 were expressed at early time points as were many genes encoding TFs involved in polarity and pattern formation (WOX9, WUS, MP, PLT1/2, and STM). Gaj demonstrated that functional LAFL genes are essential for SE in this system .
A recent study  provided an explanation for why the use of IZEs is important for SE development in Arabidopsis. They assessed chromatin architecture and transcriptomes in response to culturing bent cotyledon IZE or 3-day-old seedlings on medium with 2,4-D. The former could produce SE but not the latter. They found that the ‘embryo’ developmental stage of the explants is important for auxin-induced chromatin changes to reprogram cells. Seedlings on 2,4-D had lost the open chromatin architecture at key TF genes, including FUS3 and LEC2, and this could not be reset to an open form in seedlings. While SAM tissue (or nearby tissue) can produce SEs, this only works for young seedlings, or seeds allowed to complete germination under conditions to promote SE [150,243]. It is possible that these very young SAMs are still at least partly ‘embryo’ in nature. For instance, while AGL15 accumulates to its highest level in embryos, one context where similar nuclear accumulation is seen is in the SAM of young seedlings (e.g. 4 days). This is transient, with loss of clear nuclear immunolocalization by 6 days .
How do embryo TFs promote (somatic) embryogenesis?
As discussed above, key TFs are necessary and/or sufficient for embryogenesis and there are both similarities and differences in ZE and SE, both morphologically and molecularly. However, in both cases, after fertilization or induction of SE, respectively, the body plan must be established, including initiation of the primary meristems as well as the development of the shoot and root meristems.
Specific roles for some of the TFs discussed above in zygotic processes have been reported based on phenotypic analysis of loss and/or gain-of-function, and genes directly controlled by these factors. For example, depending on the interacting partner, LEC1 is involved in morphogenesis, maturation processes, and photosynthesis [167,176]. LEC2 has roles in auxin production and signaling, and in maturation processes [180,181]. FUS3 and ABI3 both are involved in maturation processes and response to abiotic stress [230–232]. However, LEC1 transcript is already detectable in the egg cell. FUS3 transcript is initially in the basal lineage, while LEC2, AGL15/18, and BBM transcripts are present in the 24 h zygote , much earlier than the onset of maturation processes. These factors are also up-regulated early in several SE systems using wild type explants (see the discussion above). Loss-of-function mutants of LEC1 and LEC2 have obvious morphological defects, including abnormal suspensor development. While the other genes do not have obvious morphological defects, the LAFLs show more severe phenotypes with higher-order mutants [188,246]. Likewise, the lack of zygotic phenotypes for the other TFs may reflect redundancy. Because ChIP studies should not be completely impacted by redundant factors, as phenotypes and transcriptomics may be, it is instructive, although perhaps speculative, to investigate genes that may be directly regulated by these key factors that are involved in polarity and meristem development and compare to information from SE systems. Some level of redundancy may result from the fact that many TFs may contribute to the regulation of a particular gene. For example, a 2014 study of genome-wide binding sites of 27 Arabidopsis TFs identified more than 1000 highly occupied targets (HOTs), defined as genes bound by seven or more of these TFs, with a maximum of 18 of these TFs bound per gene .
Target hubs, defined as bound by eight or more of the TFs in the study, were enriched for genes encoding regulatory factors. As an example relevant to factors discussed in this review, several key embryo TFs directly induce expression of HATs that are in turn involved in WUS regulation. These HATs have redundant functions (reviewed in ), and therefore phenotypes in SE (and ZE) SAM development in some mutants may be masked by the regulation of different HATs by diverse embryo TFs.
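The occupancy thresholds quoted above (seven or more TFs for HOTs, eight or more for target hubs) amount to counting, for each gene, how many TFs have a binding site assigned to it. The following sketch shows that counting step on toy data; the TF and gene names are placeholders, the function names are hypothetical, and this is not the pipeline used in the cited study.

```python
from collections import Counter


def occupancy(binding):
    """Count how many TFs bind each gene.

    binding: dict mapping a TF name to the set of genes it binds.
    """
    counts = Counter()
    for targets in binding.values():
        counts.update(set(targets))  # each TF contributes at most once per gene
    return counts


def highly_occupied(binding, min_tfs=7):
    """Genes bound by at least `min_tfs` of the profiled TFs."""
    return sorted(gene for gene, n in occupancy(binding).items() if n >= min_tfs)


# Toy data (placeholders): eight TFs; geneA is bound by all eight,
# geneB by seven, and geneC by three.
toy = {f"TF{i}": {"geneA"} for i in range(1, 9)}
for i in range(1, 8):
    toy[f"TF{i}"].add("geneB")
for i in range(1, 4):
    toy[f"TF{i}"].add("geneC")

print(highly_occupied(toy, min_tfs=7))  # ['geneA', 'geneB']
print(highly_occupied(toy, min_tfs=8))  # ['geneA']
```

Changing `min_tfs` from 7 to 8 is the only difference between the two definitions given in the text, so the hub set is by construction a subset of the HOT set.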
Previously, based on genes regulated by Arabidopsis AGL15 and its ortholog in soybean (Glycine max), it was reported that AGL15 may act at early stages in the SE process, because many TFs found to be expressed in dedifferentiation processes (also referred to here as ‘cellular reprogramming’, which has been proposed as more accurate than dedifferentiation, ) are regulated by these genes . Of the 44 ANAC, bZIP and WRKY TFs associated with reprogramming , half have regulatory regions associated with LEC1 (data of ), suggesting that LEC1, like AGL15, may act at a very early stage of SE. Pelletier et al. looked at the association at early (4 h) and late (8 days) time points after induction of LEC1 to promote SE, as well as binding of LEC1 in bent cotyledon stage ZE, and the majority of ‘dedifferentiation’ TF genes showed similar binding between at least one SE stage and ZEs. Nearly half (43% and 41%) have regulatory regions associated with LEC2 and BBM, respectively [180,234,236]. Not all necessarily show a response in terms of transcript accumulation to the level of the potentially controlling TF, but this can be affected by the particular stage/tissue used to assess the transcriptome (for example, ABI3 directly binds regulatory regions for a gene encoding a miRNA, but whether this results in expression, repression or no response depends on the stage of seed development; ). For this reason, we have chosen to report on binding, but those that respond are also shown in Supplementary Table S1 and Supplementary Figure S1. Of the 44 factors, more than three-quarters are bound by at least one of the LAFL, AGL15, and/or BBM factors. Interestingly, of the list of these 44 ‘dedifferentiation’ TF genes compiled by Grafi et al. , 84% show transcript accumulation at or before the 32-cell stage of ZE  and Supplementary Table S1.
How much potential control of establishment of the primary meristems and apical meristems may these factors have? Interestingly, LEC2 was recently shown to directly up-regulate WOX2 and WOX3 expression during SE, and the products encoded by these genes are necessary but not sufficient for SE . While other factors involved in polarity and fate after the first division may potentially be directly regulated by LAFL/AGL15/BBM during SE, almost no components associated with BC/suspensor development appear to be controlled by these TFs. These include WOX8, YDA, GRD/RKD4, ARF9, and IAA10, as well as some of the components involved in YDA signaling (Figure 2, Supplementary Table S1). Possibly this reflects the fact that an obvious suspensor is not essential for SE development. On the other hand, several genes encoding components involved in AC determination, besides WOX2, are potential direct targets of LAFL/AGL15/BBM during SE including MP, IAA12, and TMO7.
Auxin is an important factor in both SE and ZE. How may biosynthesis and response be controlled by these factors? LEC1, LEC2, and FUS3 have been found to directly induce the YUCCA genes, which encode products involved in auxin biosynthesis. Other genes encoding key TFs are induced in response to auxin, including AGL15, FUS3, LEC1, LEC2, BBM, and WUS [147,184,234,251–253]. As mentioned above, Godel-Jedrychowska et al.  showed callose deposition and decreased auxin response (as measured by an artificial auxin response transgene), associated with totipotency. In this regard, the fact that AGL15 appears to limit auxin accumulation and response may be relevant [201,254], although high-throughput data also show that MP is directly bound and up-regulated by AGL15 (dataset  GSE17742). The regulatory regions of many other genes that encode products involved in auxin biosynthesis, transport, or response are also associated with one or more of the LAFL/AGL15/BBM factors (Supplementary Table S1).
Some of the WOX genes (WOX2, WOX8, WOX1, and WOX3) have roles in the transition from octant to dermatogen stage of ZE. While the lec mutants do not show defects in ZE protoderm development, functional redundancy is possible. WOX2 and WOX3 were shown to be direct LEC2 target genes and necessary for SE . MP, which encodes a TF involved in the development of all three primary meristems, is directly bound and up-regulated by AGL15 in SE. PDF2, involved in protoderm development, and some of the LOG genes involved in provasculature development are also direct responsive targets of BBM and AGL15, respectively. Jo et al.  reported a role for LEC1 in morphogenesis via regulation of PHV, PHB, and SCR. Several factors involved in shoot-root development are also potential LEC1, LEC2, AGL15, and/or ABI3 direct responsive targets including PHB, PHV, ATHB2, ATHB4, HAT2/3, and PLT3. The list of potential directly controlled genes with products involved in meristem formation is longer if only association with regulatory regions is assessed, regardless of response (Supplementary Table S1).
We have highlighted some of the TFs involved in the control of early processes during embryogenesis and in the ability to specify embryo identity. These factors interact in complex networks, controlling their own expression and activity as well as that of other TFs. They also share control of many other directly regulated genes that are not TFs, and these targets are what actually ‘build’ the cell. Transcriptomic studies can now measure and compare transcriptomes in individual cell types [64,65,255]. Some examples include comparison of egg to zygote transcriptomes , apical compared with basal cell transcriptomes , determination of an embryo epidermal specific transcriptome , and other subregions of the seed/embryo .
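The proportions in this paragraph are overlaps between one fixed list of 44 genes and the binding set of each TF, and the "at least one" figure is the overlap with the union of those sets. The sketch below illustrates the arithmetic only. The gene identifiers and the membership of each set are invented placeholders; only the set sizes (22, 19, and 18 of 44) are chosen to match the half, 43%, and 41% quoted for LEC1, LEC2, and BBM, and the union computed here is a property of the toy data, not a result from the cited studies.

```python
def fraction_bound(candidates, bound):
    """Fraction of candidate genes found in one TF's bound set."""
    return len(candidates & bound) / len(candidates)


def bound_by_any(candidates, binding):
    """Candidate genes bound by at least one TF in `binding`."""
    union = set().union(*binding.values())
    return candidates & union


# Placeholder identifiers for the 44 reprogramming-associated TF genes.
names = [f"TF_gene_{i:02d}" for i in range(44)]
candidates = set(names)

# Hypothetical bound sets: sizes follow the text, membership is arbitrary.
binding = {
    "LEC1": set(names[0:22]),   # 22 of 44
    "LEC2": set(names[10:29]),  # 19 of 44
    "BBM": set(names[20:38]),   # 18 of 44
}

for tf, bound in binding.items():
    print(f"{tf}: {fraction_bound(candidates, bound):.0%}")

any_bound = bound_by_any(candidates, binding)
print(f"bound by at least one (toy data): {len(any_bound)} of {len(candidates)}")
```

Because the bound sets only partly overlap, the union can be well above any single TF's share, which is how individual values near one half are compatible with more than three-quarters being bound by at least one factor.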
Although this review focuses on Arabidopsis, many of the TFs discussed have orthologs in other species including crop plants and comparison of regulatory networks should reveal interactions conserved and therefore presumably most important for driving developmental processes during seed development. Many orthologs of TFs described can induce SE in other plants and some examples include Nicotiana tabacum and G. max LEC1 [260,261], citrus FUS3 , and AGL15 orthologs in G. max and G. hirsutum [263,264]. In addition, computational tools are valuable approaches to determining gene regulatory networks (for a recent review ).
The authors declare that there are no competing interests associated with the manuscript.
We apologize to all of our colleagues who have generated excellent data that we could not discuss or cite due to space constraints. We would like to thank our anonymous reviewers for their comments and suggestions to strengthen this review. We are grateful to Ms. Jeanne Hartman for valuable comments on the manuscript, and Ms. Ju-young (Gloria) Yoon for the generation of Figure 1. This work was supported by the National Science Foundation (grant no. IOS-1656380 to S.E.P.) and by the National Institute of Food and Agriculture, U.S. Department of Agriculture, Hatch project (S.E.P.) under accession number 1013409.
AC, apical daughter cell
BC, basal daughter cell
IZE, immature zygotic embryos
PGRs, plant growth regulators
PSEs, primary somatic embryos
SE, somatic embryogenesis/somatic embryo
ZE, zygotic embryogenesis/zygotic embryo