Amyloid fibril formation is associated with misfolding diseases, as well as fulfilling a functional role. The cross-β molecular architecture has been reported in increasing numbers of amyloid-like fibrillar systems. The Waltz algorithm is able to predict ordered self-assembly of amyloidogenic peptides by taking into account the residue type and position. This algorithm has expanded the amyloid sequence space, and in the present study we characterize the structures of amyloid-like fibrils formed by three peptides identified by Waltz that form fibrils but not crystals. The structural challenge is met by combining electron microscopy, linear dichroism, CD and X-ray fibre diffraction. We propose structures that reveal a cross-β conformation with ‘steric-zipper’ features, giving insights into the role for side chains in peptide packing and stability within fibrils. The amenability of these peptides to structural characterization makes them compelling model systems to use for understanding the relationship between sequence, self-assembly, stability and structure of amyloid fibrils.
Amyloid is a fibrillar, self-assembled, non-covalent proteinaceous polymer. A great number of proteins have been shown to be able to access a self-assembled β-sheet conformation and it has been suggested that under the right conditions any protein may form amyloid. It has been proposed that the amyloid conformation represents an energetically stable conformation that may be accessed by any polypeptide. Indeed, a diverse range of proteins form amyloid with large variations in protein sequence and amyloidogenic precursor structure. Proteins that form amyloid are typically associated with a degenerative disease: amyloid-β with Alzheimer's disease, transthyretin with familial amyloidotic polyneuropathy, IAPP (islet amyloid polypeptide) with Type 2 diabetes and prion or PrP (prion protein) with TSEs (transmissible spongiform encephalopathies). Potentially, other diseases that have previously had unknown underlying pathology are caused by an aggregating peptidic monomer. For example, it has recently been suggested that phenylketonuria may be caused by an accumulation of amyloid-like assemblies of the single amino acid phenylalanine. The ability of a single amino acid to self-assemble in an ordered manner, when compared with the 42-residue protein amyloid-β or the larger ~200 residue PrP, is a striking example of the variety found in amyloid systems. This variety is not only fascinating, but promises to have great utility in the application of such systems to bionanomaterial design.
The number of amyloidogenic systems associated with disease has led to numerous investigations into their fibrillar structure [4,6–8], each revealing a common shared architecture named cross-β . The cross-β architecture is defined by β-strands orientated perpendicular to the fibril long axis stacked to form long β-sheets that extend into a fibril. Two or more of these sheets may associate laterally to form the protofilaments of mature amyloid fibrils. These molecular details were first proposed on the basis of the interpretation of X-ray fibre diffraction data from amyloid-like systems [8,10,11]. This architecture has since been found in various other amyloid systems using ssNMR (solid-state NMR) [4,6,7] and also X-ray crystallography . These same advances have further provided structural details that have revealed insights into side-chain interactions that form the basis for the β-sheet association [6,8,12].
Lately, efforts have been made to expand the sequence space of structurally characterized amyloid systems. These sequences may be identified within larger amyloidogenic sequences or be identified by bioinformatics. Identifying amyloidogenic sequences by a bioinformatics approach not only expands the amyloid sequence space but, through the identification algorithms it requires, also highlights the characteristics of a sequence that give it a propensity to assemble. These algorithms are numerous and based on varying principles, but typically identify short peptide sequences, often hexapeptides [14–18]. A systematic study of the peptide with the sequence STVIIE revealed the important positional and chemical characteristics of a hexapeptide by studying the amyloid-forming propensity of variants of the peptide sequence. We previously reported a new algorithm, called Waltz, for identifying short amyloidogenic sequences on the basis of a sequence position scoring matrix. It has successfully identified a small library of amyloidogenic peptide sequences that self-assemble in solution to form ordered amyloid aggregates.
Charge, hydrophobicity and aromaticity have all been implicated in the structural basis for assembly. The present study provides a basis from which to understand the relationship between sequence and the ability of the peptide to form amyloid fibrils and their resulting structures. We report on the structural and biophysical characterization of three of the peptides, identified by Waltz, able to form highly ordered amyloid-like fibrils with the sequences HYFNIF, RVFNIM and VIYKI. Waltz was designed not only to identify amyloid-forming sequences, but also to better distinguish between amyloid sequences and amorphous aggregates. The characterization of these three Waltz peptides further reinforces the algorithm's ability to do this. This investigation and the analysis of fibrillar systems use time-resolved XRFD (X-ray fibre diffraction) and analysis of LD (linear dichroism) and CD spectroscopy to identify structural characteristics and produce models representative of the fibrillar semi-crystalline structure. In doing so we reveal insights into the relationship between sequence and structure relevant to understanding amyloid formation in disease and as a basis for designing new self-assembling bionanomaterials.
Peptide synthesis and materials
The short peptides were synthesized as freeze-dried powder with N-acetylated and C-amidated capped termini to >95% purity by HPLC (JPT Peptide Technologies). Peptides were incubated in water at 10 mg/ml under quiescent conditions at room temperature (20–25°C) for 1 week unless otherwise stated. Fibres allowed to assemble for longer incubation periods of up to 3 months are referred to as mature. Water was purified by reverse osmosis and filter-sterilized using 0.2 μm membranes (Minisart).
TEM (transmission electron microscopy)
Samples were examined by TEM to confirm fibril formation. Samples were diluted to 0.25–1 mg/ml prior to TEM visualization. For grid preparation, a 4 μl drop of fibril solution was incubated on Formvar/Carbon 400 Mesh Copper grids (Agar Scientific) for 1 min, followed by a 1 min wash with 4 μl of water and negatively stained twice with 4 μl of 2% uranyl acetate for 1 min. Between each stage, excess liquid was removed by blotting with filter paper. Care was taken not to dry out the grid until after the final negative stain step. Grids were visualized using a Hitachi-7100 transmission electron microscope running at 100 kV. Images were taken using an axially mounted (2000 pixels×2000 pixels) Gatan Ultrascan 1000 CCD (charge-coupled-device) camera. The program ImageJ (http://rsbweb.nih.gov/ij/) was used to analyse TEM images  and individual fibril morphologies were inspected after applying the in-built fast Fourier-transform band-pass filter.
A Jasco J-715 spectropolarimeter with a Peltier temperature control system was used to collect CD spectra. All measurements were collected at 20°C with a sample concentration of 1–2 mg/ml, 180–320 nm with a pitch of 0.1 nm at a scan speed of 50 nm/min, a response time of 4 s, slit widths of 1 nm and standard sensitivity. Control buffer spectra were averaged from triplicate accumulations and subtracted from the averaged triplicate accumulations of samples. Pathlengths were adjusted according to the degree of signal against PMT (photomultiplier tube) high-tension voltage to ensure high data quality and were 0.1 mm using quartz demountable cells (Starna Scientific). To check for contributions arising from LD artefacts due to orientation effects, cuvettes were rotated by 90° in the instrument. Where artefacts arising from orientation of fibrils were found, a combination of tip (Sonics Vibra-Cell VCX500, 20 kHz, 40% amplitude, 2 min) and water bath (Fisher Scientific FB15051, 37 kHz, 1 min) sonication was used to disrupt alignment.
Secondary structure analysis was performed on spectra confirmed to contain only CD signals using the online server Dichroweb  with the CDSSTR analysis programme  and best available reference set: SP175 190–240 nm . Careful attention was paid to the closeness of fit between secondary structure model predictions and the experimental data where low NRMSD (normalized root mean square deviation) values indicate a high goodness of fit of the secondary structure model prediction and successful analysis .
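The goodness-of-fit measure mentioned above can be illustrated with a short sketch. It assumes the commonly quoted definition of NRMSD, the square root of the summed squared residuals divided by the summed squared experimental values. The function name and the spectra below are illustrative only and are not data from the present study.

```python
import math

def nrmsd(experimental, calculated):
    """Normalized root mean square deviation between two spectra.

    Assumed definition: sqrt( sum((exp - calc)^2) / sum(exp^2) ),
    with both spectra sampled at the same wavelengths.
    """
    if len(experimental) != len(calculated):
        raise ValueError("spectra must be sampled at the same wavelengths")
    numerator = sum((e - c) ** 2 for e, c in zip(experimental, calculated))
    denominator = sum(e ** 2 for e in experimental)
    return math.sqrt(numerator / denominator)

# Hypothetical ellipticity values (arbitrary units), for illustration only.
exp_spectrum = [4.0, 2.0, -1.0, -3.0, -2.0]
calc_spectrum = [3.8, 2.1, -1.2, -2.9, -2.1]
print(f"NRMSD = {nrmsd(exp_spectrum, calc_spectrum):.3f}")
```

A value close to zero indicates that the reconstructed spectrum follows the experimental one closely; it does not by itself guarantee that the secondary structure assignment is correct.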
A Jasco J-815 spectropolarimeter modified for LD was used to collect LD spectra. Measurements were taken in a 0.5 mm pathlength at 20°C at 200 μg/ml, 180–320 nm with a pitch of 0.2 nm at a scan speed of 100 nm/min, a response time of 1 s, bandwidth of 1 nm and standard sensitivity. Spectra were collected as an average of three accumulated measurements and processed by subtraction of the background signal from water and zeroing at 300–320 nm. Three channels were monitored; LD, HT[V] (high tension [voltage]) and absorbance. To unambiguously detect LD signals, fibrillar samples were aligned by a Couette flow cell apparatus  using rotation speeds of 1500 rev./min, creating fibre alignment parallel with the orientation axis (//). Assembly of the Couette flow cell apparatus was found to align fibres perpendicular to the orientation axis (⊥) and thus comparison spectra were also collected with no Couette flow.
FTIR (Fourier-transform infrared) spectroscopy
Infrared spectra were recorded using a Bruker Tensor 27 infrared spectrophotometer equipped with a Bio-ATR II accessory. Spectra were recorded of dried films at a spectral resolution of 4 cm−1 with 120 accumulations performed per measurement at a wavenumber range of 900–3500 cm−1. Buffer and baseline subtraction from the obtained spectra were made with rescaling in the wavenumber range of 900–1800 cm−1.
Fibril samples were aligned by a variety of methods to produce different textures. Fibrous-textured alignments were created by suspending a 10 μl droplet of 10 mg/ml fibril solution between two wax-tipped 1.2 mm outside diameter, 0.94 mm internal diameter borosilicate capillaries (Harvard Apparatus) and placing in a parafilm-sealed Petri dish to air-dry at room temperature. Film-textured alignments were formed by drawing a 10–50 μl droplet of 10 mg/ml fibril solution into an X-ray transmissible 0.7 mm borosilicate capillary (Capillary Tube Suppliers), which was sealed at one end to prevent further capillary action. This solution was allowed to dry by evaporation to create a film texture. These methods are further described elsewhere. In order to monitor real-time fibre alignment, a cell similar to that described previously was used. Briefly, the cell comprises a semi-enclosed chamber that can support the wax-tipped capillaries used for alignment, such that this process can be monitored by X-ray fibre diffraction in real-time. Samples for fibre diffraction were tested on a home source Rigaku 007HF Cu Kα [λ=1.5419 Å (1 Å=0.1 nm)] rotating anode generator with VariMax-HF mirrors and Saturn 944+ CCD detector. Additionally, synchrotron data were collected at the Diamond I24 microfocus beamline with a wavelength of 0.9778 Å and a MARCCD detector. Clearer was used to measure signal positions in XRFD patterns. Graphical traces were produced by sampling a 60° radial slice of the XRFD pattern on either the meridional or equatorial axis and plotted using Bragg's law. Determined diffraction signals were entered into the unit cell determination module within Clearer and possible unit cells were explored by comparing experimental diffraction signal positions with calculated signals and examining for logical indexing schemes.
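As an illustration of how a position on the detector relates to a real-space spacing through Bragg's law, the following minimal sketch converts a radial distance on a flat detector into a d-spacing. It assumes a flat detector normal to the beam, so that tan(2θ) = r/D and d = λ/(2 sin θ). The function name and the example geometry are hypothetical and do not describe how Clearer performs the conversion.

```python
import math

def d_spacing(radius_mm, distance_mm, wavelength_angstrom):
    """d-spacing (in Å) for a reflection at a given radius on a flat detector.

    radius_mm:    distance of the reflection from the beam centre
    distance_mm:  specimen-to-detector distance
    Assumes the detector plane is perpendicular to the incident beam.
    """
    two_theta = math.atan2(radius_mm, distance_mm)   # scattering angle 2θ
    theta = two_theta / 2.0
    return wavelength_angstrom / (2.0 * math.sin(theta))  # Bragg's law, n = 1

# Hypothetical geometry: Cu Kα wavelength from the text, invented distances.
CU_K_ALPHA = 1.5419  # Å
for r in (10.0, 20.0, 34.0):
    print(f"r = {r:5.1f} mm -> d = {d_spacing(r, 100.0, CU_K_ALPHA):.2f} Å")
```

Reflections further from the beam centre correspond to shorter spacings, which is why the ~4.7 Å meridional signal appears at a larger radius than the ~10 Å equatorial signals.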
Initial peptide models were built in Insight II (Molecular Dimensions) in an ideal β-strand conformation. Rotamer conformations were assigned using the Dunbrack backbone-dependent rotamer library in PyMOL (http://www.pymol.org); variable architectures were explored and assessed using Molprobity, checking for clashes, favourable Φ-Ψ angles and side-chain positions. Crystal lattices were constructed in PyMOL using the unit cell determined by Clearer, and these were minimized to remove side-chain clashing and unfavourable conformations. Explicit lattice minimizations were performed using the CHARMM Prot_all_22_forcefield through NAMD as implemented in VMD in a (3 3 3) solvated lattice.
Simulated XRFD and model validation
Using Clearer, fibre-diffraction patterns were simulated from minimized models. Briefly, the central model of a minimized lattice was used to construct a fibre texture from which the intensity and position of reflections were simulated. The diffraction settings were equivalent to the experimental collection parameters (i.e. specimen-to-detector distance, wavelength and detector parameters). The fibre disorder parameters σθ and σΦ were 0.2 and ∞ respectively, and the crystallite size was set to 400 Å3 with a sampling interval of 1 pixel. All other simulation settings were default.
Pattern comparison was initially qualitative and based on visual comparison of simulated XRFD patterns against the experimental pattern. To more accurately identify closeness of fit (RF), the experimental and simulated reflections were tabulated and a quantitative comparison was made on the basis of signal position and relative intensity.
Sequence identification and origin
The three short peptides, identified by Waltz and shown in the present paper, are five or six residues in length and taken from much larger polypeptides with variable native functions, as shown in Table 1. Of the three sequences, the crystal structure of HYFNIF within the context of the full-length native protein structure is available for assessment (PDB code 2KV2). Interestingly the predicted secondary structure by PSIPred  and observed secondary structure are different (Supplementary Figure S1 at http://www.biochemj.org/bj/450/bj4500275add.htm). Although HYFNIF was identified by Waltz and predicted to form β-strands, the crystal structure of the native protein reveals this sequence adopts an α-helical structure. RVFNIM and VIYKI are predicted to respectively adopt α-helical and β-strand conformations.
|Origin||Residue position scoring|
|Waltz sequence||UniProt ID||Length||Segment||Name||Function||Organism||1||2||3||4||5||6|
|HYFNIF||P54132||1417||27–33||Bloom syndrome protein||DNA helicase||Human||0||0||1||1||2||0|
|VIYKI||P07182||284||183–187||Chorion protein||Egg shell protein||Drosophila||0||1||−1||−2||−2|
In the case of the hexapeptides, the terminal residues are either neutral or slightly unfavoured in the Waltz scoring algorithm, indicating that the core residues are key and may represent a possible core interaction motif. This core sequence is Phe-Asn-Ile at positions 3–5 in the hexapeptides, whereas in the case of the pentapeptide the equivalent positions 3–5 have the sequence Tyr-Lys-Ile. In general the core may be described as having the aromatic pattern A-X-X and the hydrophobicity pattern H-X-H.
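The pattern description above can be checked with a small sketch. The residue classes used here are an assumption made for illustration (aromatic: F, W, Y; hydrophobic: A, V, I, L, M, F, W, Y) and are not taken from the Waltz scoring matrix; the helper names are likewise hypothetical.

```python
# Assumed residue classes, for illustration only.
AROMATIC = set("FWY")
HYDROPHOBIC = set("AVILMFWY")

def core(peptide):
    """Residues at positions 3-5 (1-based) of a penta- or hexapeptide."""
    return peptide[2:5]

def class_pattern(segment, members, symbol):
    """Write `symbol` for residues in `members` and X for all others."""
    return "-".join(symbol if residue in members else "X" for residue in segment)

for peptide in ("HYFNIF", "RVFNIM", "VIYKI"):
    segment = core(peptide)
    print(
        peptide,
        segment,
        class_pattern(segment, AROMATIC, "A"),
        class_pattern(segment, HYDROPHOBIC, "H"),
    )
```

Under these assumed classes, all three cores give the aromatic pattern A-X-X and the hydrophobicity pattern H-X-H, in line with the description in the text.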
Morphology characterization by TEM
TEM of the Waltz peptides revealed that in water they spontaneously form amyloid-like fibrillar morphologies, with widths of approximately 20 nm (Figure 1). Each peptide fibril exhibits characteristic morphologies, implying the complex association of individual protofilaments in a hierarchical structure.
The characteristic morphologies of Waltz fibrils
Fibrils formed by HYFNIF exhibited the greatest range of morphologies, which can broadly be described as having a twisted morphology. The paired filament helices (Figure 1A, i) have a greatest width of 17.04 nm (S.D.±1.06; n=7) and periodicity of 95.16 nm (S.D.±3.15; n=6). Other helical arrangements of ribbons are observed with variable widths and periodicities (Figure 1A, ii and iii). Following long incubation (several months) the fibrils developed into structures that may represent a tubular architecture with widths of 26.03 nm (S.D.±3.29; n=3) (Figure 1A, iv and Supplementary Figure S2 at http://www.biochemj.org/bj/450/bj4500275add.htm).
RVFNIM was more consistently found to form tightly wound ‘ropes’ with a width of 17.61 nm (S.D.±1.23; n=5) with some indication of protofilament structure (Figure 1B, i and ii). Some examples of rope-like protofilaments twisting into a helical arrangement were also observed (Figure 1B, iii) with widths at their widest points of 19.69 nm (S.D.±1.59; n=28). The helical pitch of these morphologies varies dramatically from 147.81 nm (S.D.±21.40; n=11) to indeterminate over a single TEM micrograph (Figure 1B, iv).
VIYKI exhibited the most regular morphology where all observed fibres formed twisted ‘ropes’ (Figure 1C, i–iv) with a width of 21.0 nm (S.D.±1.2; n=6) in a helical twist with a periodicity of 58.1 nm (S.D.±1.7; n=6).
Secondary structure determination by CD and FTIR
The secondary structures of the fibrillar Waltz peptides were investigated using CD and FTIR spectroscopy. Characterization by CD is complicated by both the anisotropic nature and high aromatic content of these systems. A classic β-sheet spectrum typically has a positive and a negative maximum at ~195 and ~216 nm respectively. The CD spectra from the Waltz peptides are more complicated. Each CD spectrum exhibits a strong dependence on sample orientation, with signal intensities that are far greater than normal (Supplementary Figure S3 at http://www.biochemj.org/bj/450/bj4500275add.htm) and signal positions that are not as expected for typical β-sheet peptides. The fibril structure and thus potential for the shear alignment of these systems may give rise to LD artefacts in the CD measurements. To obtain CD spectra with minimal contribution from artefactual LD signals, the Waltz fibrils were sonicated in an attempt to abolish effects from alignment (Supplementary Figure S3) as reported elsewhere . TEM confirmed that, following sonication, fibrils remained present in the solution, but sample orientation CD signal dependence was no longer observed, revealing the true CD signals representative of the Waltz peptides. Dichroweb analysis revealed predominantly β-sheet and other non-helical structures (Supplementary Table S1 at http://www.biochemj.org/bj/450/bj4500275add.htm). We presume the non-helical conformations represent peptides in a random-coil conformation that have been liberated from fibres, but the remaining predominant β-strand structures constitute the Waltz fibrils.
Since the high aromatic content and anisotropy complicate CD analysis, we corroborated these results with FTIR of mature Waltz fibrils. The expected absorption bands for β-sheet structure with two maxima at 1626–1633 and 1666–1676 cm−1 were observed (α-helices exhibit bands at ~1654 cm−1) for all three peptide assemblies (Figure 2). FTIR has been used to distinguish between parallel and antiparallel β-sheets by the presence of the higher wavenumber band at 1695 cm−1, but the reliability of this interpretation is still a matter of debate and so here we interpret the FTIR only in terms of the presence of β-sheet conformation.
The secondary structure of the Waltz fibrils
Chromophore orientation determination by LD
Biophysical analyses identified the β-sheet-rich structure of the Waltz peptide fibrils. The presence of LD artefacts in the CD spectra suggests that a regular and ordered arrangement of structural elements is apparent in the assemblies. The alignment of the Waltz peptide fibrils in the initial CD experiments was the result of sample loading. This cannot be controlled and so we chose to measure the LD signals using a micro-volume Couette flow cell apparatus  as this provides a reproducible alignment methodology to gain more information regarding the orientation of structural elements and chromophores within the amyloid fibrils.
LD on fibrillar systems, in particular amyloid, has been reported previously [22,27,40,41]. Transition moments in chromophores that are regularly orientated produce positive and negative LD signals depending on their orientation. The orientation of chromophores relative to the fibre axis can thus be determined from the magnitude and sign of their LD signals . The LD spectra for the Waltz fibrils were collected when aligned parallel with (//) and perpendicular to (⊥) the orientation axis (Figure 3), where in our instrument the orientation axis is defined as horizontal. Our intention had been to observe orientation under Couette flow; however, the signals shown in Figure 3 indicate that the loading of the sample provided sufficient shear force to induce perpendicular fibril orientation. All three peptides show a large negative signal at ~200 nm arising from the π-π* transition of the β-sheet peptide backbone. The sign of the β-sheet π-π* transition is thus consistent with its polarization being parallel with the fibre long axis. Since this transition occurs perpendicular to the β-strand direction , the data are consistent with β-strands arranged perpendicular to the fibre long axis and are consistent with the cross-β architecture, as observed for other self-assembled systems [22,40].
The LD spectra arising from the Waltz fibrils
Couette flow created significantly more orientation for HYFNIF fibrils, not enough orientation for RVFNIM fibrils to overrule the loading-induced orientation and some orientation for VIYKI fibrils. For HYFNIF and VIYKI fibrils, the signals were also observed in the region expected to arise from aromatic LD spectra (Figure 3). RVFNIM fibrils gave no evidence of orientation of its aromatic chromophores, although this is likely to be due to the fact that phenylalanine has a low absorption coefficient. By way of contrast, HYFNIF and VIYKI have clearly resolved aromatic tyrosine LD signals, the La (230 nm) and Lb (275 nm) transitions, indicating a regular ordered arrangement of these aromatic residues in these structures. The splitting of the Lb tyrosine transition into two components is an excitonic coupling effect indicating that the tyrosine residues in these systems are closely associated and possibly involved in π-π stacking interactions . This phenomenon has been previously noted in LD spectra from fibrils formed by the heptapeptide GNNQQNY . Thus the LD data indicate that the tyrosine long axes for both HYFNIF and VIYKI fibrils are orientated greater than 54.7° from the fibre axis, whereas the tyrosine short axes are orientated less than 54.7° and may adopt a stacked geometry. Knowing the β-sheet content and orientation with the likelihood of aromatic stacking interactions we sought to gather more structural details from XRFD.
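The 54.7° threshold used in this argument follows from the standard expression for the reduced LD of a uniaxially oriented sample, LD^r = (3/2)·S·(3cos²α − 1), where α is the angle between the transition moment and the orientation axis and S is the orientation parameter (0 for an unoriented sample, 1 for perfect orientation). The sketch below is illustrative only: the function name and the choice S = 1 are assumptions, and it takes the fibre axis to lie along the orientation axis.

```python
import math

# Angle at which 3*cos^2(alpha) - 1 = 0, i.e. where the LD sign changes.
MAGIC_ANGLE_DEG = math.degrees(math.acos(1.0 / math.sqrt(3.0)))

def reduced_ld(alpha_deg, s=1.0):
    """Reduced LD for a transition moment at alpha_deg to the orientation axis.

    s is the orientation parameter (assumed 1.0 here for illustration).
    """
    alpha = math.radians(alpha_deg)
    return 1.5 * s * (3.0 * math.cos(alpha) ** 2 - 1.0)

print(f"sign change at {MAGIC_ANGLE_DEG:.1f} degrees")
for angle in (0.0, 30.0, 54.7, 70.0, 90.0):
    print(f"alpha = {angle:5.1f} deg -> LD^r = {reduced_ld(angle):+.2f}")
```

Transition moments closer than about 54.7° to the orientation axis give a positive signal and those further away give a negative one, so the sign of each band constrains the orientation of the corresponding chromophore axis. When the fibres lie perpendicular to the orientation axis, as in the loading-induced alignment described above, the signs are reversed relative to this fibre-parallel case.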
XRFD from Waltz peptide fibrils
Fibrils of Waltz assemblies were readily aligned, producing fibrous-textured alignments, by methods previously described . These alignments exhibit birefringence by cross-polarizing light microscopy, indicating a para-crystalline order of the assemblies (Supplementary Figure S4 at http://www.biochemj.org/bj/450/bj4500275add.htm). Each system exhibited the major meridional and equatorial features associated with the cross-β architecture commonly observed for amyloid-like assemblies . These results are thus consistent with the spectroscopic observations and the deduction of the cross-β arrangement within these peptide assembly systems. The reflections arising from the repetitive interatomic separations within the Waltz fibrils differ between the systems, indicating differences in the packing arrangements of the different peptides.
We also monitored XRFD in real-time over the course of fibre alignment. To our knowledge, we report for the first time the XRFD exhibited by an amyloid-like system over the course of fibre alignment in the hydrated, semi-hydrated and dried aligned state as shown in Figure 4. Importantly, we find that the reflection positions arising from the amyloid architecture are present, and the diffuse scattering from water (~3.5 Å) is still observable. This supports the view that the structures of the fibres are the same in the solution and dried state. The same phenomenon was confirmed for each Waltz system individually (Supplementary Figures S4D and S4E). We also explored the effect of texture on the observed XRFD patterns. Samples of fibres were aligned to produce film-textured alignments where all fibre axes are aligned parallel with the film plane. We note that the film-textured alignments also report the same diffraction signals as the fibrous alignments (Supplementary Figure S5 at http://www.biochemj.org/bj/450/bj4500275add.htm), but believe that the film texture may abolish contribution from fibril packing. In the case of RVFNIM, additional equatorial information is observed in the film-textured alignment and so this pattern was used in subsequent analysis.
Real-time XRFD of VIYKI over the course of alignment
The relatively large number of reflections exhibited by these systems is indicative of a para-crystalline order greater than that typically observed for amyloid-like systems . The quality of the patterns thus presents an opportunity to determine a unit cell on the basis of the reflection positions (Supplementary Table S3 at http://www.biochemj.org/bj/450/bj4500275add.htm) and model the molecular structure within the Waltz fibrillar assemblies. All three peptides share a strong meridional reflection, the distance of which varies between 4.66 and 4.76 Å (Supplementary Table S3), arising from the separation between hydrogen-bonded β-strands along the fibre axis. Higher-order reflections of the principal meridional reflections are present, but the precise position of the meridional reflections is dependent on β-strand separation and relative displacement of these along the fibre axis. Knowing the fibre axis repeat, the equatorial reflections were indexed and predicted unit cell dimensions were determined using Clearer . The determined unit cell dimensions for each system (Supplementary Table S4 at http://www.biochemj.org/bj/450/bj4500275add.htm) are defined as: a, the β-strand chain length; b, the β-sheet spacing; and c, the hydrogen bonding spacing along the fibre axis.
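The idea of comparing observed signal positions with those calculated from a trial unit cell can be sketched as follows. This is not the Clearer implementation: it assumes an orthorhombic cell, for which 1/d² = h²/a² + k²/b² + l²/c², and the cell dimensions, tolerance and function names are hypothetical values chosen for illustration rather than those reported in Supplementary Table S4.

```python
import math
from itertools import product

def d_hkl(h, k, l, a, b, c):
    """Calculated d-spacing (Å) for reflection (h k l) of an orthorhombic cell."""
    inverse_d_squared = (h / a) ** 2 + (k / b) ** 2 + (l / c) ** 2
    return 1.0 / math.sqrt(inverse_d_squared)

def index_reflection(d_observed, a, b, c, max_index=3, tolerance=0.05):
    """List (hkl, d_calc) pairs whose calculated spacing lies within
    `tolerance` Å of the observed spacing."""
    matches = []
    for h, k, l in product(range(max_index + 1), repeat=3):
        if (h, k, l) == (0, 0, 0):
            continue  # the origin is not a reflection
        d_calc = d_hkl(h, k, l, a, b, c)
        if abs(d_calc - d_observed) <= tolerance:
            matches.append(((h, k, l), d_calc))
    return matches

# Hypothetical trial cell (Å): a = chain length, b = sheet spacing, c = strand spacing.
trial_cell = (20.0, 10.0, 4.8)
for d_obs in (10.0, 4.8):
    print(d_obs, index_reflection(d_obs, *trial_cell))
```

A trial cell is plausible when every observed reflection can be assigned low-order indices in this way and the assignment is consistent with whether the signal lies on the meridian or the equator.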
The strongest equatorial reflections pertaining to the distance between β-sheets for the peptides HYFNIF, RVFNIM and VIYKI are 12.4, 10.8 and 9.21 Å respectively. This information allowed for the basic arrangement of the peptides within the repeating cells to be constructed. From X-ray crystallography analysis of amyloidogenic short peptides, a range of cross-β architectures have been proposed and grouped into eight classes. These classes are distinguished by the arrangement of β-strands within β-sheets (parallel or antiparallel), the relative orientation of the β-sheets (face-to-face or face-to-back) and the β-sheet direction (up-up or up-down). This information was used as the basis for using molecular modelling to explore the possible arrangements of the Waltz peptides within their respective repeating cells (Supplementary Online Data at http://www.biochemj.org/bj/450/bj4500275add.htm).
Modelling and assessment by simulated XRFD
Diffraction patterns were simulated from models systematically constructed (Supplementary Figure S6 at http://www.biochemj.org/bj/450/bj4500275add.htm) using the descriptions of parallel amyloid structural classes  and quantitatively compared with the experimental diffraction patterns (Supplementary Table S5 at http://www.biochemj.org/bj/450/bj4500275add.htm). The model structures shown in Figure 5 are those that were explored and found to best fit the diffraction data. A comparison between the simulated and experimental diffraction patterns confirming the validity of the constructed models is shown in Figure 6.
Representative models of the Waltz peptides
Experimental XRFD patterns compared with patterns simulated from modelled structures
Figure 5 shows the most representative models for each of the Waltz systems. Of the models constructed, the majority of simulated patterns compared well with experimental fibre diffraction data, and so we postulate that the experimental fibre diffraction patterns may result from a mixture of architectures. However, the structures shown in Figure 5 represent the predominant representative architectures. The modelled β-sheet separations closely correspond to the major equatorial reflections (Supplementary Table S3) and the β-strand separation along the fibre axis varies between 4.70 and 4.78 Å (Supplementary Table S4). The low-resolution reflections found on the HYFNIF and VIYKI patterns were not easily reproduced, but we attribute this to our simulations modelling the structures as a continuous lattice rather than discrete protofilaments. HYFNIF is found to adopt an up-up architecture where two sheets are displaced by b/2, where the β-sheets are parallel with respect to one another. RVFNIM and VIYKI adopt more classical steric zippers in classes IV and II respectively; the former has parallel β-sheets in a face-to-back and up-down arrangement, whereas the latter β-sheets are face-to-back in an up-up arrangement. The meridional reflections of RVFNIM and VIYKI were better reproduced with β-strands displaced along the fibre axis by c/2.
The present study further confirms the Waltz algorithm's ability to predict ordered aggregation, and we have characterized two hexapeptides and one pentapeptide uniquely identified during the iterative sequence exploration that developed Waltz. The sequences are found to have varying secondary structure propensities, but in their native structures adopt a conformation dependent on their surrounding amino acid sequence and structural environment. When removed from their native sequences, the Waltz peptides form a β-strand conformation that adopts a cross-β architecture, but with differing lateral packing arrangements. Previous work by Johansson and co-workers suggested that amyloidogenic sequences were often ‘promiscuous’ and able to adopt both α-helical and β-strand conformations depending on their context and environment. The results of the present study reinforce this and the concept that the flanking amino acid sequences in amyloidogenic portions of proteins protect against aggregation and self-assembly by forcing the adoption of particular secondary and tertiary structures.
On the nanoscopic scale these systems are observed to adopt characteristic and discrete fibril morphologies that may be based on the underlying differences in molecular packing, but are ultimately typical for amyloid fibrils. The exact relationship between these two levels of structure remains unclear. To model the molecular packing of these peptides, the investigations reported in the present study have required careful consideration of the use of the biophysical and structural techniques employed. It was found that CD data are highly dependent on the fibrillar nature of the assemblies, but we highlight methods, as reported elsewhere [22,27], for the successful interpretation of these data. The identification of LD artefacts in CD data has been reported previously, and the results of the present study clearly reiterate that LD artefacts can occur in CD experiments, the identification of which has broad methodological implications for using CD to study anisotropic systems. Clearly, care should be taken in analyses of these sorts of data, but the previously described phenomena can be usefully rationalized and used in LD experiments to reveal information about chromophore orientation.
Particular care was taken to establish that the hydrated state of these systems was the same as the dried state, through the use of a semi-enclosed chamber to monitor XRFD in real time over the course of fibril alignment. The structures reported in the present study are found to be unaffected by drying, as reported previously.
Using XRFD and Clearer, unit cells were determined for the Waltz peptides in the fibrillar state, together with the architectures that best represent their fibril structure. These peptides do not crystallize in the conditions tested by us (results not shown) and we have focussed on the fibrillar structure owing to concerns about the comparability of crystal structures with the absolute structure of self-assembled fibres [40,49]. Despite this, the proposed crystalline class models of amyloid systems provide a useful framework within which to explore the possible architectures of short amyloidogenic sequences. It is probable that the Waltz fibrils contain a mixture of these architectures where the difference between simulated patterns of different class models is slight. More structural information may be obtainable from crystal structures, but a crystal structure represents a single global minimum, which may not be representative of the local minima of the many polymorphs present in real amyloid fibrils.
Although the assemblies are likely to be polymorphic, we have judged quantitatively the most representative, predominant polymorphs that constitute the Waltz peptide assemblies. Structurally, RVFNIM and VIYKI are found to adopt arrangements with displacement of β-sheets along the fibre axis (c/2) with respect to one another. This has been observed for crystallized short amyloidogenic peptides, but, at the time of writing, the results from the present study are the first example of this being demonstrated through experimental and simulated XRFD. The arrangements adopted vary between the Waltz peptides, which is perhaps surprising given the positional sequence similarity of these peptides, but is a reflection of the adaptability of this conformation. Interestingly, aromatic amino acids appear to provide a possible driving force for assembly in some amyloidogenic peptides [20,40] and others have suggested an important contribution of hydrophobicity. Examination of the peptide arrangements reveals that, although hydrophobicity may drive elongation of β-sheets, it is likely that charge and aromaticity modulate the possible arrangements and architecture within the final fibrillar structure via sheet–sheet interactions and lateral association of protofilaments.
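The displacement of β-sheets along the fibre axis (c/2) can be pictured with a minimal geometric sketch. This is an illustration only and not the modelling procedure used in the present study: the function name `strand_positions` and the value of `C_REPEAT` are hypothetical, and the repeat distance is a placeholder rather than a unit-cell dimension reported here.

```python
def strand_positions(c, n_strands, staggered=False):
    """Return fibre-axis coordinates of successive strands in one β-sheet.

    c          -- repeat distance along the fibre axis (any length unit)
    n_strands  -- number of strands to place
    staggered  -- if True, shift the whole sheet by c/2 along the axis
    """
    offset = c / 2 if staggered else 0.0
    return [i * c + offset for i in range(n_strands)]


# Hypothetical repeat distance, chosen only for illustration.
C_REPEAT = 10.0

sheet_a = strand_positions(C_REPEAT, 3)                  # in-register sheet
sheet_b = strand_positions(C_REPEAT, 3, staggered=True)  # sheet displaced by c/2

for a, b in zip(sheet_a, sheet_b):
    print(f"sheet A strand at {a:5.1f}, sheet B strand at {b:5.1f}")
```

In this picture each strand of the displaced sheet sits halfway between two strands of its neighbour along the fibre axis, which is the sense in which the sheets are offset by c/2 with respect to one another.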
Future studies of these systems have the potential to make valuable and unique contributions to the model systems currently describing the structural space of amyloid assemblies. The characterization of these systems not only further confirms the ability of the Waltz algorithm to predict ordered amyloid aggregation, but also offers an opportunity to understand this conformation further while presenting new highly ordered and well-characterized systems for nanotechnological applications.
Kyle Morris conducted the experimental work, performed analysis and wrote the paper. Alison Rodger provided equipment and expertise and contributed to data analysis. Matthew Hicks contributed to LD data collection and analysis. Maya Debulpaep collected FTIR data. Joost Schymkowitz and Frederic Rousseau contributed materials and wrote the Waltz algorithm on which the paper is based. They also advised on the interpretation and the paper. Louise Serpell managed the project, analysed the results and wrote the paper.
XRFD data were collected at the Diamond I24 microfocus beamline with the expert help of Dr Gwyndaf Evans, Dr Danny Axford and Dr Robin Owen. We thank Dr Julian Thorpe for essential electron microscopy support, Dr Peter Varnai for valuable advice on model minimization and Youssra Al-Hilaly for insightful discussions. The apparatus used to generate the real-time XRFD data was based on a prototype design kindly lent to us by Professor Pawel Sikorski, to whom we are indebted.
This work was supported by the Biotechnology and Biological Sciences Research Council-funded Synthetic components network awarded to L.C.S. and K.L.M. L.C.S. is supported by Alzheimer's Research UK. The Switch Laboratory was supported by grants from the Flanders Institute for Biotechnology (VIB), the University of Leuven, the Funds for Scientific Research Flanders (FWO), the Flanders Institute for Science and Technology (IWT) and the Federal Office for Scientific Affairs, Belgium IUAP P7.