Challenges and approaches to studying pore-forming proteins

Pore-forming proteins (PFPs) are a broad class of molecules that comprise various families, structural folds, and assembly pathways. In nature, PFPs are most often deployed by their host organisms to defend against other organisms. In humans, this is apparent in the immune system, where several immune effectors possess pore-forming activity. Furthermore, applications of PFPs are found in next-generation low-cost DNA sequencing, agricultural crop protection, pest control, and biosensing. The advent of cryoEM has propelled the field forward. Nevertheless, significant challenges and knowledge gaps remain. Overcoming these challenges is particularly important for the development of custom, purpose-engineered PFPs with novel or desired properties. Emerging single-molecule techniques are helping to address these unanswered questions. Here we review the current challenges in, and approaches to, studying PFPs.


Introduction
Pore-forming proteins (PFPs) represent a highly diverse and growing class of molecules identified in all kingdoms of life (Figure 1). Examples include the MACPF/CDC, aerolysin, ClyA, and colicin families, to name a few. PFPs are particularly remarkable in that they transition from a soluble molecule into an integral transmembrane protein. In nature, PFPs function to target and perforate lipid bilayers, often resulting in cell death.
In a typical scenario, a PFP selectively binds to a target, whereupon enormous conformational changes take place to insert amphipathic regions into the target bilayer (Figure 1a). These structural changes are often accompanied or preceded by oligomerisation. After assembling into homo- or hetero-oligomers, the inserted amphipathic regions define an aqueous channel or pore. The specific mechanism of pore insertion is highly divergent and not shared by all PFPs. In general, PFPs bind to the membrane, oligomerise, and then insert into the bilayer. However, these steps can occur in varying orders across the different PFP families (Figure 1b).
The function of a PFP is frequently accomplished via a simple arrangement of a receptor-binding domain (RBD) and a pore-forming domain (PFD), although multi-component systems also exist [1][2][3][4] (Figure 1a). The RBD can be vastly different and may target specific lipid compositions, protein targets, or even glycans. The PFD is equally diverse and fulfils the role of membrane insertion and often oligomerisation. Once activated, the PFD is capable of completely transitioning from one topological fold to a different one (Figure 1c). The final pore lumen can range from a few nanometres (aerolysin) to as large as 50 nm (some CDCs), or larger if multiple arcs conglomerate together.
The conversion from soluble monomer to integral pores of variable diameter and chemical properties makes PFPs useful in several biotechnology applications. One objective in engineering PFPs is to make tuneable and controllable gateways for the selective flow of compounds, DNA/RNA, and proteins through membranes. In this regard, PFPs are showing promise in the laboratory for DNA sequencing [29,30], proteomics [31], and even compartmentalised biochemical systems [32]. PFPs are also of major interest in industry for their pesticide and antimicrobial properties [33][34][35][36], while also being proposed as components in targeted drug-delivery strategies [37,38].

Figure 1 (caption; beginning not recovered). [...] PDB: 6RW6). Motifs/domains responsible for membrane insertion and membrane binding are coloured red and orange, respectively. (B) Schematic depicting a generic pathway for pore formation into a lipid membrane based on MACPF/CDC pore-forming systems. Soluble components, typically monomers, bind the membrane. Several monomers subsequently come together to form oligomers. Oligomeric intermediates undergo a conformational change inserting transmembrane amphipathic regions into the lipid bilayer, forming an aqueous channel. Some other PFPs systems oligomerise in solution before binding the bilayer. Membrane penetration may precede oligomerisation in some systems. The stoichiometry of the various intermediates is highly dependent on the pore-forming system. Additional intermediates (not depicted) may also be present. (C) Pore forms (volume rendering in UCSF Chimera) of pleurotolysin (PDB: 4V2T), α-haemolysin (PDB: 3M2L), SmhB (PDB: 7A0G) and ClyA (PDB: 6MRW). Low-pass filtered cryoEM maps of aerolysin (EMDB-8187), lysenin (EMDB-8015) and Photorhabdus luminescens Tc holotoxin (TcdA1; EMDB-10313). A single protomer from each pore is coloured purple (and orange, for the two-component system of pleurotolysin). Where applicable, detergent or lipids are coloured yellow. Scale bar is 100 Å for A and C.
Herein, we review the emerging research themes and challenges in the field. We discuss some of the state-of-the-art techniques being employed to study PFPs and offer a perspective on their contributions in the future. Lastly, we explore some recent examples of biotechnological developments that employ PFPs for translational applications.

Current questions and challenges
Over the past decade, our understanding of PFP structure and function has been propelled forward by X-ray crystallography and cryoEM studies. In comparison, investigations into pathway kinetics, lipid interactions, and conformational changes of pore formation are relatively rare, representing a major shortcoming in the literature. Here, we outline some of the current challenges and limitations in our understanding of PFPs.

Identification of conditions for activation
Many novel PFPs have been identified by bioinformatic or high-throughput screening techniques. However, these methods provide little to no information on the conditions required to activate pore formation. Consequently, these conditions remain elusive for many well-studied PFPs. Identifying these conditions is nontrivial and represents a major bottleneck in understanding PFP function, often requiring tedious and labour-intensive screening on a case-by-case basis. Screening can be further complicated if pore formation is triggered by a specific sequence of events. For example, various combinations of triggers have been identified, such as proteolysis [9,10,33,34], changing pH [39][40][41], receptor binding [17,42,43,44], encountering lipids of particular compositions [39,41,45,46,47,48], and physical properties of the membrane [49].
In the absence of knowledge about native activation conditions, artificial methods have enabled the study of PFPs in their pore form. Successful methods include freeze-thaw treatment [17], incubation with detergents [50], and addition of mild denaturants [51].

Assembly pathways and kinetics
The assembly pathways that culminate in pore formation are highly diverse and underpin fundamental regulatory and functional attributes specific to a given PFP [52]. Attempts to characterise events along the assembly pathway are met with fundamental challenges. Firstly, intermediates are often short-lived and rapidly transition into different states. Secondly, pore formation is asynchronous, making it difficult to extract mechanistic details from the population averages provided by ensemble analyses.
In addition to the specific events of assembly, the rates and dynamics of these processes similarly underpin functional and regulatory properties of PFPs. As such, the kinetics of pore formation is a fundamental and complex topic [53][54][55][56][57]. Quantification of these properties yields insight into kinetic bottlenecks for therapeutics, emergent properties of the system, structural and evolutionary constraints of function, and statistical understanding of assembly [52][53][54][55][56][57].
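The consequence of asynchronous assembly can be illustrated with a toy stochastic model. The sketch below assumes, purely for illustration, that each pore completes a fixed number of sequential, irreversible steps with exponentially distributed waiting times; the step count (`n_steps`) and rate (`rate`) are hypothetical values, not measurements from any PFP system. It shows that individual pores finish at widely different times even though the ensemble mean is well defined.

```python
import random
import statistics


def simulate_assembly_time(n_steps, rate, rng):
    """Time (s) for one pore to finish n_steps sequential, irreversible steps.

    Each step has an exponentially distributed waiting time with the
    given rate (1/s). This is an assumed toy mechanism, not a fitted model.
    """
    return sum(rng.expovariate(rate) for _ in range(n_steps))


def simulate_population(n_pores, n_steps, rate, seed=0):
    """Completion times for a population of independently assembling pores."""
    rng = random.Random(seed)
    return [simulate_assembly_time(n_steps, rate, rng) for _ in range(n_pores)]


# Hypothetical parameters: 12 subunit additions at 2 steps per second.
times = simulate_population(n_pores=5000, n_steps=12, rate=2.0)
mean_time = statistics.mean(times)   # ensemble average, near n_steps / rate
spread = statistics.stdev(times)     # pore-to-pore variability
print(f"mean = {mean_time:.2f} s, spread = {spread:.2f} s")
print(f"fastest = {min(times):.2f} s, slowest = {max(times):.2f} s")
```

An ensemble measurement would report only a quantity like `mean_time`, whereas single-molecule methods access the full distribution in `times`.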

Intermediates of pore formation
Excellent progress has been made in determining representative structures of PFPs in various conformational states. These include several stable conformations in the upstream and downstream stages of the pore formation assembly pathway. Notable exceptions include some insecticidal families, such as Cry1 toxins, for which we lack structures of oligomeric species entirely. In comparison, intermediates of pore formation have proven challenging to structurally characterise. These include membrane-bound monomers, small oligomeric assemblies (arcs), early and late prepores, and structures adopted during membrane insertion.
Consequently, studies of intermediate stages of pore formation are rare or require biochemical modifications to the PFP in order to 'trap' and visualise intermediates [18,58,59,60,61,62,63,64]. Moreover, these modifications also raise questions about the validity of the manipulated state, specifically how closely it reflects a true state in nature, if at all. Furthermore, due to the extensive nature of the rearrangements accompanying pore formation, the simple 'start-and-end' structural snapshots provided by these methods leave much to be inferred. As such, detailed atomistic descriptions outlining the conformational changes that occur during pore formation are lacking for many families.

Lipid properties and biophysics
In addition to being a platform for PFPs to bind and insert into, lipid membranes of different physical properties can also modulate and regulate the function of PFP systems. Intuitively, different lipid headgroups can function to specifically bind and anchor a PFP, as observed for several systems [3,39,41,65,66]. A less intuitive effect on pore formation is that of the physical properties of the bilayer, such as width, density, and fluidity [67]. Even lipid rafts have been implicated in PFP function [68,69]. These properties manifest from the lipid packing, phase transition temperature, saturation of the aliphatic regions, and curvature of the bilayer.
It has recently been demonstrated that the activity of lymphocyte perforin is affected by these properties, with high lipid order and bilayer rigidity negatively regulating function [49]. Furthermore, studies of β-barrel folding kinetics in lipid bilayers have shown a relationship between certain physical properties (width, curvature, composition, fluidity) and the efficiency of β-barrel formation [70][71][72][73]. Lipid physical properties likely also modulate the kinetics of PFPs, given differences in protein diffusion rates, membrane binding, sequestration [47], and movement of lipids [74,75] during the formation of a pore.

Emergent (system-level) properties
Emergent properties, also known as collective behaviour, are system-level characteristics that arise from complex interactions between many constituent parts. These properties manifest at the system level, but do not belong to the individual components of the system: 'the whole is greater than the sum'. These include ultrastructure, phase transitions, self-assembly and other macroscopic properties. One example, in the context of PFPs, is the process of self-assembly from monomers into an intricate array of hexagonally arranged prepores, such as the honeycomb arrangement of MPEG1 prepores [39,41].
Understanding and predicting these phenomena is important for designing PFP systems with desired emergent properties. For example, one might use the honeycomb arrangement of MPEG1 to generate an antimicrobial nanosurface for surgical equipment. Such properties are difficult to predict, however, and even more challenging to design. To achieve these goals, biophysicists require computational and theoretical models based on knowledge of system kinetics, structure and mechanisms that recapitulate and describe PFP behaviour.

State-of-the-art tools for studying PFPs
Fundamental questions concerning PFP biology can be classified into key research themes, including structural intermediates, kinetics of pore formation, mode-of-action, PFP-lipid interactions, and translational applications in biotechnology and medicine. Driving these investigations are a broad set of cutting-edge experimental and computational techniques (Figures 2, 3). Collectively, these techniques operate within the single-molecule regime. Here, we cover how several techniques have propelled the field forward in recent years.

Microfluidics
Increasingly, researchers are moving toward high-throughput, parallel, single-molecule based assays that couple microfluidic systems with state-of-the-art techniques (discussed below). The modular nature of microfluidic devices enables complex fluidic circuits to be built in a customisable fashion, integrating liposome fabrication [32,76], protein assays, and small-scale purifications [77] into one (Figure 2a). As such, lab-on-a-chip devices can be developed for specific low-volume, efficient and reproducible experiments for parallel screening, sample optimisation, and data acquisition schemes. To date, microfluidic devices have been developed to perform myriad tasks [76][77][78][79][80][81][82][83]. While only a few studies currently make use of microfluidics or micromanipulation to specifically study PFPs, their use is growing, with future applications in high-throughput lipid screening or measuring entire assembly pathways.

Membrane mimetics & cryoEM
Membrane mimetic technology has drastically accelerated the study of membrane proteins, in conjunction with cryoEM. Membrane mimetics, such as liposomes and nanodiscs, preserve the lipid bilayer while forming in solution (Figure 2b). These offer a distinct advantage over detergents by acting as a membrane platform upon which PFPs may assemble [16,17,39,58,59,61,84,85,86]. Membrane mimetics are particularly valuable for interrogating PFP structure in a lipid environment, and as such have been used extensively (Table 1).
Detergents have also been widely used to study pores by cryoEM and X-ray crystallography [50,84,87]. Some PFP systems spontaneously form prepore and pore oligomers in the presence of detergents [50], while for other PFPs, detergents have been used to extract pores from native membranes or stabilise activated pores in vitro [41,84,85,87].
Amphipathic polymers, such as amphipols or styrene maleic acid polymers (SMALPs), offer an alternative to detergents to extract and/or stabilise membrane proteins. While SMALPs have not been used yet to study PFPs, amphipols have been used to stabilise large oligomeric pores [88]. For the readers' convenience, we have summarised the advantages and disadvantages of various membrane mimetics and their applications (Table 1).
The most established example of time-resolved cryoEM is the development of rapid microfluidic freezing devices, which have enabled millisecond mixing and halting of biochemical reactions before standard cryoEM imaging [93]. In contrast, a new cutting-edge approach achieves microsecond temporal increments by devitrifying the specimen for a controlled period in the microscope using a laser pulse (Figure 2c) [90]. During this devitrified period the specimen can undergo rapid conformational changes before undergoing re-vitrification. This treatment is followed by standard cryoEM imaging.
Lastly, modern software can now estimate the conformational landscape from an ensemble of single particles in some cases [89,94,95,96,97]. In this context, trajectories through the conformational landscape reflect temporal changes of the macromolecule in real space, and can therefore be considered a form of pseudo-time-resolved analysis. Taken together, these techniques provide insight into a new dimension of structure and function. We anticipate these methods to be valuable in probing short-lived (∼µs-ms) structural intermediates of PFPs in coming years, especially when combined with machine learning techniques [98,99].

Atomic force microscopy (AFM)
Time-resolved cryoEM is still in its infancy. In comparison, AFM is a mature technique capable of directly measuring the assembly pathway of a PFP at moderate-to-high spatial and temporal resolutions [67,100] (Figure 3a). In the past, AFM has been used to provide insight into both the oligomeric state(s) of PFPs and their dynamics over time, as well as specific details about PFP activation in the context of lipid membranes [67,100,101]. Indeed, AFM has been used to establish the foundation of mechanistic and structural understanding of various systems, including MACPF/CDCs [53,59,102,103], GSDMs [104], and others [105,106].
Recently, AFM studies have provided both single-molecule kinetics of assembly [53], as well as an understanding of bactericidal activity by imaging pore formation on live cells [107,108]. AFM has been especially useful for studying the interaction of PFPs with lipid bilayers, and particularly the functional impact membrane properties have on PFPs [49]. Modern AFM instruments are becoming increasingly capable of achieving both high temporal and spatial resolutions, especially when combined with new approaches and algorithms [109,110]. With improvements in modern instrumentation and new algorithms, we envisage an AFM 'resolution revolution' to be on the horizon.

Table 1 (fragment; row label not recovered). Lipid diffusion is significantly reduced compared to native bilayers. Can easily study and visualise phase transition properties. AFM (5,6,33-35); LM (36)(37)(38).

Figure 2 (caption fragment). C reproduced from Voss et al. [90]. Stimulation by laser (red) melts the vitreous ice, enabling rapid (photo-dependent; 'stimulus') conformational changes to occur before re-vitrification. Followed by standard cryoEM imaging (green).

Single-molecule fluorescence microscopy
In contrast to AFM, fluorescence microscopy can follow multiple differently labelled components at comparable (or higher) temporal resolutions. As such, fluorescence microscopy, particularly single-molecule modalities, has been used extensively for PFPs [57]. Indeed, smFRET and TIRF studies have been employed to study various aspects of PFP dynamics and kinetics. These include measurements of complex assembly, stoichiometry, diffusion coefficients, single-molecule kinetics and lipid membrane interactions [47,52,54,56,111,112]. Evidently, single-molecule TIRF microscopy is a versatile tool to study multiple stages of pore formation and probe the mechanism of assembly at high temporal resolution (Figure 3c,d).
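As a brief illustration of how smFRET reports on conformational change, the sketch below uses the standard Förster relation, E = 1 / (1 + (r/R0)^6), to convert between donor-acceptor distance and transfer efficiency. The Förster radius of 5 nm and the distances used are assumed, illustrative values, not taken from any study cited here.

```python
def fret_efficiency(distance_nm, r0_nm):
    """FRET efficiency for a donor-acceptor separation (Förster relation)."""
    return 1.0 / (1.0 + (distance_nm / r0_nm) ** 6)


def distance_from_efficiency(efficiency, r0_nm):
    """Invert the Förster relation; valid only for 0 < efficiency < 1."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return r0_nm * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


R0_NM = 5.0  # assumed Förster radius for a hypothetical dye pair

# Hypothetical labelled sites moving apart during a conformational change.
for r in (3.0, 5.0, 7.0):
    print(f"r = {r:.1f} nm -> E = {fret_efficiency(r, R0_NM):.3f}")
```

Because efficiency falls off with the sixth power of distance, the readout is most sensitive to distance changes near R0, which is why the labelling positions chosen largely determine which rearrangements can be resolved.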

Single-channel conductance
Single-channel conductance (SCC) is another powerful and ubiquitous technique used to study individual nanopores [113][114][115][116][117][118] (Figure 3b). By establishing an electric potential across an impermeable bilayer, the presence of a membrane disruption (due to an arc or pore) can be detected by current flow. By measuring the steps and transient drops in current, their duration and amplitude, various details of the underlying molecular process can be extracted. This includes the size distribution of the nanopore, interactions with solutes or binding partners (e.g. Vip1/2), as well as pore assembly pathways and kinetics in the context of systems that form pores via a growing-arc mechanism (Figure 1b) [113][114][115][116][117]. Indeed, SCC forms the technological foundation of new polymer sequencing methods (see applications below). Individual nucleotides or amino acids can be detected by small blockages of nanopores, which produce a characteristic fingerprint as a function of their physical properties, like charge and size (Figure 4).

Figure 4 (caption fragment). Nanopores are also used in metabolomics to detect small molecules. A nanopore (red) can capture an adaptor protein (green) that binds to a specific metabolite e.g., green amino acid. Metabolite binding induces a conformational change in the adaptor protein, leading to a characteristic change in ion flow through the nanopore. Bottom row, left to right. PFPs have been used to deliver antibodies into mammalian cancer cells. Nanopores (green) provide a passage for antibodies (orange) to cross the cell membrane and bind to their intracellular target (red). PFPs are also a staple pest control agent in agriculture, helping to protect crops from lepidopteran pests. PFPs from the crop (grey) are ingested by the pest, and subsequently perforate cells lining the pest's digestive system. Created with BioRender.com.
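The single-channel conductance readout can be sketched with a minimal example. The code below assumes an idealised, nearly noise-free current trace in which each pore insertion appears as an abrupt step; the trace values, the 100 mV holding potential, and the threshold are all hypothetical. Real recordings would normally need filtering and a more robust step-detection method than the simple sample-to-sample difference used here.

```python
def detect_steps(trace_pA, threshold_pA):
    """Return (sample index, step size in pA) for each abrupt current change.

    A step is called whenever two consecutive samples differ by more than
    threshold_pA. This is a deliberately naive detector for clean traces.
    """
    steps = []
    for i in range(1, len(trace_pA)):
        delta = trace_pA[i] - trace_pA[i - 1]
        if abs(delta) > threshold_pA:
            steps.append((i, delta))
    return steps


def conductance_nS(delta_current_pA, voltage_mV):
    """Conductance from Ohm's law, G = dI / V; pA divided by mV gives nS."""
    return delta_current_pA / voltage_mV


# Hypothetical trace: closed bilayer, then two successive insertion events.
trace = [0.2, -0.1, 0.3, 0.0, -0.2,
         50.1, 49.8, 50.3, 50.0, 49.9,
         100.2, 99.9, 100.1, 100.0, 99.8]
VOLTAGE_MV = 100.0  # assumed holding potential

steps = detect_steps(trace, threshold_pA=10.0)
conductances = [conductance_nS(delta, VOLTAGE_MV) for _, delta in steps]
for (index, delta), g in zip(steps, conductances):
    print(f"sample {index}: step of {delta:.1f} pA -> {g:.3f} nS")
```

Step amplitudes of this kind are what allow pore size distributions, or the incremental growth of an arc, to be inferred from an electrical recording.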

Molecular dynamics simulations
Historically, molecular dynamics simulations have enabled investigation of length and time scales not accessible to common experimental techniques (Figure 3e). Atomistic simulations of PFPs have revealed flexible hinge regions [119], protein/lipid interactions [47,74,120], and the distortion or reorganisation of lipid bilayers [75,121,122]. Furthermore, coarse-grained simulations have enabled larger oligomeric assemblies to be studied over longer time scales to investigate initial stages of oligomerisation (Figure 3e). Computational models of PFPs extend beyond structure. Indeed, complex interaction pathways and kinetic processes can be modelled mathematically and simulated to reveal the temporal evolution of the system [55]. Additionally, computational simulations have been advantageous in translational research in studying electrical fields and current flow of nanopores [114,123]. While not a comprehensive list, these examples do show how simulations provide a powerful tool to understand atomistic details, quaternary interactions, and complex, system-level emergent properties of PFPs.
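To make the idea of mathematically modelling a kinetic pathway concrete, the following sketch integrates a deliberately simplified three-state scheme (soluble, then membrane-bound, then inserted) using first-order mass-action kinetics and a forward Euler step. The scheme, the rate constants, and the omission of oligomerisation are all assumptions made for illustration; this is not the model used in the cited work.

```python
def simulate_three_state(k_bind, k_insert, t_end, dt=0.001):
    """Integrate soluble -> bound -> inserted with forward Euler.

    Returns the final fractions (soluble, bound, inserted). Rate constants
    are in 1/s; all protein starts in the soluble state.
    """
    soluble, bound, inserted = 1.0, 0.0, 0.0
    n_steps = int(round(t_end / dt))
    for _ in range(n_steps):
        binding_flux = k_bind * soluble      # soluble -> bound
        insertion_flux = k_insert * bound    # bound -> inserted
        soluble += -binding_flux * dt
        bound += (binding_flux - insertion_flux) * dt
        inserted += insertion_flux * dt
    return soluble, bound, inserted


# Hypothetical rate constants: fast binding, slower insertion.
soluble, bound, inserted = simulate_three_state(k_bind=1.0, k_insert=0.2, t_end=2.0)
print(f"soluble = {soluble:.3f}, bound = {bound:.3f}, inserted = {inserted:.3f}")
```

With insertion slower than binding, the membrane-bound intermediate accumulates transiently, the kind of kinetic bottleneck that such models are intended to expose.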

Efforts to re-engineer & modulate PFPs
Applications for PFPs have emerged in both academic and industrial settings. With nature providing a range of pore scaffolds to harness, re-engineering efforts have aimed to optimise pore properties for specific purposes. These include next-generation DNA sequencing technologies, which correlate subtle changes in the flow of ions through a nanopore with the translocation of nucleotide sequences (Figure 4). PFPs commonly used for DNA sequencing possess one or more narrow constriction points (∼1 nm) within their lumen [115,124]. As DNA is threaded through the pore and encounters these narrow regions, the pore becomes partially blocked, reducing current flow by a characteristic amount for each nucleotide combination. Re-engineering constriction sites (and other residues lining the pore lumen) by point mutations has greatly improved sequencing accuracy [125,126]. More radical changes to pore architecture have also been made, including truncation and covalent/non-covalent addition of adaptor proteins or enzymes [127][128][129]. Re-engineered PFPs fill several useful niches in DNA/RNA sequencing, including long-read sequencing [130][131][132], in-field metagenomics [133][134][135], and even epigenetics [136][137][138][139][140]. There has also been work to extend nanopore-based technology to various proteomic applications, including mass identification [117,123,141,142] and peptide and protein sequencing [143][144][145][146][147] (Figure 4). While nanopore-based proteomics promises extremely high sensitivity, and even the possibility of single-cell proteomics [31], sequencing polypeptides remains a frontier challenge in PFP engineering. In the case of de novo peptide sequencing, difficulties lie in translocating chemically diverse peptides of a non-uniform charge density through the pore and deciphering each of the 20 amino acids.
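The principle of reading a sequence from characteristic current blockades can be sketched as a nearest-level lookup. The example below is a toy: the reference blockade levels are invented, and it assigns one current level per nucleotide, whereas practical base callers work with the combined signal from several nucleotides occupying the constriction and rely on statistical or machine-learning models.

```python
# Hypothetical mean blockade currents (pA) for each nucleotide.
REFERENCE_LEVELS_PA = {"A": 52.0, "C": 47.0, "G": 58.0, "T": 41.0}


def call_base(event_pA, levels=REFERENCE_LEVELS_PA):
    """Assign the nucleotide whose reference level is closest to the event."""
    return min(levels, key=lambda base: abs(levels[base] - event_pA))


def call_bases(events_pA, levels=REFERENCE_LEVELS_PA):
    """Translate a series of blockade events into a nucleotide string."""
    return "".join(call_base(event, levels) for event in events_pA)


# Hypothetical measured blockade events, one per translocated nucleotide.
events = [51.5, 41.8, 57.1, 46.5]
sequence = call_bases(events)
print(sequence)
```

The closer two reference levels sit to one another, the more easily noise causes a miscall, which is one way to view why re-engineering the constriction to separate these levels improves accuracy.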
By trapping individual peptides within the wild-type aerolysin pore, it was possible to distinguish between peptides with single residue substitutions for 13 of the 20 amino acids [147]. However, progress is still required to sequentially 'read' individual residues while passing a peptide through a nanopore.
Another challenge in nanopore proteomics is that protein sequencing requires unfolding and regular ratcheting of the denatured polypeptide chain through the pore. Partnering pores with other enzymes, such as unfoldases, helicases, DNA polymerases, and proteasomes has allowed for polypeptide translocation [144,146,148,149]. However, there is currently an insufficient level of control over single-pass translocation for accurate residue assignment. To address this, the use of a DNA-peptide conjugate enabled repeated re-reading of peptides, improving discrimination between a limited set of peptide variants [144]. While several challenges exist, proteomic nanopores could eventually be used on-bench to detect protein point variants, cleavage events, or post-translational modifications, all at the single-molecule level [123].
Besides nanopore sequencing, PFPs have also been co-opted for metabolomics, allowing the detection of specific ligands, such as sugars, amino acids, and vitamins [150][151][152] (Figure 4). Such systems rely on the pore capturing an adaptor protein capable of recognising a specific ligand, with the binding of the ligand to the adaptor generating a distinct change in the nanopore's ion conductivity. In the future, nanopores may be routinely used for real-time metabolomics to monitor human health [150], or even in analytical devices for industrial applications [153].
As drug and cargo delivery systems, PFPs are being developed to transport various effector molecules into cells [154,155] (Figure 4). For example, clostridium protective antigen toxin (CPAT) has recently been modified to inject antibodies and other therapeutic molecules into mammalian cells in a controlled and specific manner, thereby circumventing the membrane barrier [37,38,156]. As such, PFPs may represent a novel class of drug delivery system.
Meanwhile in agriculture, PFPs derived from the entomopathogenic bacterium Bacillus thuringiensis are used in transgenic crops as a key defence against pests, such as lepidopteran species (Figure 4). These PFPs therefore help to safeguard various essential commodities such as corn, soy, and cotton by successfully controlling crop pests, saving billions of dollars annually [35,157,158]. While hugely successful, new PFPs are required to target emerging pests and to combat the increasingly problematic rise in resistance to existing transgenic crops.
The applications of PFPs mentioned throughout this review have relied on pre-existing pores that have been adapted to suit particular contexts. Eventually, it may be practical to de novo design PFPs for specific purposes allowing complete control over pore properties. In fact, de novo design of simple α and β transmembrane channels was recently achieved [159,160], potentially laying the foundation for tailor-made PFPs designed for specific analytes and applications.

Concluding remarks
The field of PFPs is a remarkably diverse and rich area of investigation, spanning kinetics and structures through to fieldwork and proteomics. Recent developments in technology underpin several advances in the field of PFPs. Increasingly, researchers are relying on advanced techniques and integrative studies to provide insight into the biological role and function of PFPs. Specifically, microfluidic devices and single-molecule techniques are gaining greater traction and providing deeper functional insight into various systems. Researchers are finding novel and exciting areas in which translational applications of PFPs can benefit academia, industry, and biomedicine. It is our perspective that the rational de novo design or re-engineering of custom PFPs is the next frontier in the field of PFPs.

Perspectives
• PFPs are a broad class of molecules which underpin biological processes in immunology, pathogenicity, development and cell signalling. Furthermore, translational applications of PFPs can be found in biomedicine, agriculture and biotechnologies such as DNA and protein sequencing.
• Single-molecule based imaging and spectroscopic technologies are transforming the understanding of molecular processes and mechanisms that govern PFP behaviour and function.
• Improved structural, kinetic and mechanistic understanding of PFPs coupled with cutting-edge computational methods provide a tangible foundation for the next frontier of research in PFPs: de novo design of custom PFP systems for biotechnology and biomedicine.

Competing Interests
The authors declare that there are no competing interests associated with the manuscript.

Open Access Statement
Open access for this article was enabled by the participation of Monash University in an all-inclusive Read & Publish pilot with Portland Press and the Biochemical Society under a transformative agreement with CAUL.

Author Contributions
Both authors contributed to the drafting, writing and revision of this manuscript. Both authors contributed equally to figure conceptualisation and creation.