Subcellular proteomics is a powerful new approach that combines subcellular fractionation and MS (mass spectrometry) to identify the protein complement of cellular compartments. The approach has been applied to isolated organelles and major suborganellar structures and each study has identified known proteins not previously understood to associate with the compartment and novel proteins that had been described only as predicted open-reading frames from genome sequencing data. We have utilized subcellular proteomics to analyse the protein components of CCVs (clathrin-coated vesicles) isolated from adult brain. Accounting for identified fragmented peptides allows for a quantitative assessment of protein complexes associated with CCVs, and the identification of many of the known components of post-fusion synaptic vesicles demonstrates that a main function for brain CCVs is to recycle synaptic vesicles. In addition, we have identified a number of novel proteins that participate in CCV formation and function at the trans-Golgi network and the plasma membrane. Characterization of two of these proteins, NECAP1 and NECAP2, has led to the identification of a new consensus motif that mediates protein interactions with the clathrin adaptor protein 2. These studies highlight the ability of proteomics to reveal new insights into the mechanisms and functional roles of subcellular compartments.
The sequencing of the human genome and the genomes of various experimental model systems, coupled with the large-scale sequencing of expressed genes, has provided researchers with an invaluable tool to discover new genes, including those responsible for human diseases. However, it is the decoding of this genomic information into functional proteins and protein assemblies that represents the most formidable challenge for understanding cellular function. Advances in protein and peptide separation technologies combined with innovations in MS (mass spectrometry) have greatly increased the ability to identify rapidly the protein components of a myriad of biological samples [1,2]. This has led to an explosion in the use of proteomics approaches in biology.
It has become increasingly apparent that whole cells and tissues are not currently amenable to satisfactory proteomics analysis. First, these samples are simply too complex. It is estimated that any given cell expresses 10000 gene products, although the actual number of functional proteins is much greater due to splice variants and post-translational modifications. Secondly, the extreme dynamic range of protein expression in a tissue (protein levels can vary by as much as ten orders of magnitude) means that less abundant proteins are masked by those expressed at higher levels. The high sensitivity afforded by techniques such as MudPIT (multidimensional protein identification technology), in which peptides generated from proteolytic digestion of complex samples undergo multiple steps of separation before analysis by tandem MS, can overcome some of these difficulties. For example, Liu et al. have used MudPIT on a crude yeast protein extract to identify proteins present at levels as low as 100–200 copies/cell.
Isolated organelles present an attractive target for proteomics, as their protein complexity is reduced and lower abundance proteins that are specific to the compartment are enriched relative to whole cell lysates (Figure 1). The subcellular proteomics approach is also advantageous in that identified proteins are linked to functional units (Figure 1). For novel proteins, the connection to an organelle can provide the first clues as to the functional role of the protein. Moreover, a global analysis of the protein components of an organelle provides insights into organelle function that are not possible from the identification of a smaller subset of the proteins. Numerous organelles and suborganellar compartments have been analysed by subcellular proteomics. In each case, characterized proteins not previously known to associate with the compartment and novel proteins have been identified. Additionally, several of these studies have led to new understanding of the functional roles of the organelles. It is beyond the scope of this short review to provide a detailed discussion of these studies and the reader is referred to excellent recent reviews [7,8].
Proteomics analysis of CCVs (clathrin-coated vesicles)
CCVs are an important class of transport organelles in eukaryotic cells and the formation of CCVs at the plasma membrane is the central feature of CME (clathrin-mediated endocytosis). CME is the major route of endocytic entry into cells and is responsible for the uptake of nutrients. CME also functions in the down-regulation of receptors, although it has become apparent that certain signalling receptors in fact require CME to engage their intracellular signalling pathways. Moreover, CME contributes to the regulation of the surface levels of numerous proteins including neurotransmitter receptors, with important consequences for cell physiology. CCVs also form at the TGN (trans-Golgi network) and endosomes, where they function in the transport of cargo proteins between the secretory pathway and the endosomal–lysosomal system.
An important component of plasma membrane-derived CCVs is the clathrin adaptor AP-2 (adaptor protein 2) [11,12]. AP-2 is a heterotetramer that contributes to the recruitment of clathrin to the plasma membrane and its assembly into clathrin lattices. AP-2 also functions to recruit cargo into nascent CCPs (clathrin-coated pits). The C-terminal regions of the large subunits of AP-2 (α- and β2-adaptin) form globular structures, referred to as ear domains, that bind accessory proteins functioning in CCV formation. At the TGN and endosomes, the heterotetrameric clathrin adaptor AP-1 plays a role analogous to that of AP-2 at the plasma membrane [11,12].
To understand better the molecular machineries for clathrin-mediated membrane budding, we have performed a proteomics analysis of CCVs isolated from rat brain (Figure 1) [13,14]. CCV proteins from three independent preparations (see  for a detailed review on CCV isolation) were separated by one-dimensional SDS/PAGE, and individual gel slices were analysed in an automated system by nanoscale reversed-phase LC-Q-TOF (liquid chromatography quadrupole time-of-flight) MS/MS (Figure 1). A total of 209 proteins were reproducibly identified. Of these, 92 had been previously shown to be associated with CCVs and 25 appeared to be contaminants, leaving 92 potentially new CCV-associated proteins. The association of a number of these proteins with CCVs was confirmed by Western-blot analysis.
By accounting for all fragmented peptides matched to proteins, we identified peptides assigned specifically to an individual protein and those shared among different proteins. Approximately 50% of all peptides were assigned to 18 proteins that represent most known components of clathrin coats. Remarkably, the single largest category of proteins identified was the 32 that define many of the known components of SVs (synaptic vesicles), and these proteins were generally identified with high numbers of peptides. It has long been supposed that CCVs are responsible for the recycling of SVs following SV collapse into the plasma membrane concomitant with neurotransmitter release [16,17]. However, neurotransmitter release involving intermittent fusion of SVs without complete collapse into the plasma membrane, referred to as kiss-and-run, has also been demonstrated, although the prevalence of this alternative model of fusion is uncertain [18–20]. The finding of a near-complete inventory of the known and well-characterized SV proteins in CCVs from whole brain suggests that the main function for brain CCVs is to recycle SVs. This is consistent with a model in which full fusion of SVs is a prevalent form of neurotransmitter release, although it does not address the relative levels of full fusion versus kiss-and-run mechanisms.
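The distinction drawn above between peptides assigned specifically to one protein and peptides shared among several proteins can be illustrated with a minimal Python sketch. It is not the software used in the study: the function name `partition_peptides`, the input layout (a mapping from each peptide to the proteins it matches) and all peptide and protein identifiers are hypothetical and serve only to show the bookkeeping.

```python
from collections import defaultdict


def partition_peptides(peptide_to_proteins):
    """Count, per protein, the peptides unique to it and the peptides it shares.

    peptide_to_proteins maps a peptide identifier to the list of proteins
    whose sequences contain that peptide (an assumed input format).
    """
    unique_counts = defaultdict(int)
    shared_counts = defaultdict(int)
    for peptide, proteins in peptide_to_proteins.items():
        proteins = set(proteins)  # ignore duplicate protein entries
        # A peptide matched to exactly one protein is protein-specific.
        target = unique_counts if len(proteins) == 1 else shared_counts
        for protein in proteins:
            target[protein] += 1
    return dict(unique_counts), dict(shared_counts)


# Hypothetical peptide and protein identifiers, for illustration only.
matches = {
    "pep1": ["protein_A"],
    "pep2": ["protein_A"],
    "pep3": ["protein_A", "protein_B"],
    "pep4": ["protein_B"],
}
unique, shared = partition_peptides(matches)
print(unique)
print(shared)
```

Keeping the two tallies separate makes it possible to decide later how shared peptides should contribute to any abundance estimate, a choice that this sketch deliberately leaves open.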
Peptide accounting reveals endocytic protein complexes
Organelles are not fixed entities but are instead dynamic structures that remodel themselves constitutively and in response to environmental stimuli. Subcellular proteomics holds promise as an approach to monitor global changes in the protein composition and thus functional properties of organelles under specific conditions. Key to this approach is the ability to measure quantitatively relative protein levels in complex protein mixtures. So far, such analysis has been attempted primarily through the use of stable isotope-labelling methods [21,22]. Protein extracts from two different samples are covalently tagged with distinct isotopes, combined and proteolytically digested, and the ratio of ion intensities of labelled peptide pairs, determined through MS/MS, is used to quantify the relative abundance of their parent proteins. These techniques have proven useful for measuring small changes in protein levels [21,22] but are more problematic over a broader dynamic range. Moreover, a simpler method to measure relative protein abundance would allow for a broader application of quantitative proteomics.
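The ratio-based quantification described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure of the cited studies: the function name `relative_abundance`, the tuple layout of the input, the use of the median to combine several peptide pairs per protein, and all intensity values are hypothetical.

```python
from statistics import median


def relative_abundance(peptide_pairs):
    """Estimate per-protein abundance ratios (sample 1 / sample 2).

    peptide_pairs is an iterable of (protein, intensity_1, intensity_2),
    one entry per labelled peptide pair. Combining pairs by their median
    is an assumption made for this sketch.
    """
    ratios = {}
    for protein, intensity_1, intensity_2 in peptide_pairs:
        if intensity_2 <= 0:
            continue  # skip pairs for which no ratio can be formed
        ratios.setdefault(protein, []).append(intensity_1 / intensity_2)
    return {protein: median(values) for protein, values in ratios.items()}


# Hypothetical ion intensities, for illustration only.
pairs = [
    ("protein_A", 200.0, 100.0),
    ("protein_A", 300.0, 100.0),
    ("protein_A", 210.0, 100.0),
    ("protein_B", 50.0, 100.0),
]
abundance = relative_abundance(pairs)
print(abundance)
```

A robust summary such as the median limits the influence of a single outlying peptide pair; other summaries (for example a mean of log ratios) would be equally reasonable choices.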
By accounting for fragmented peptides identified from CCV preparations, we noticed a correlation between the amount of Coomassie Blue staining in any given gel slice and the number of peptides assigned to proteins in that slice (Figure 1), suggesting a possible linear relationship between peptides and protein abundance. In fact, when peptide accounting was normalized for the size of the protein (larger proteins generate more peptides per mol), the expected 1:1 ratio was found for clathrin heavy and light chains. A comparable stoichiometry was evident for the subunits of the AP complexes. Thus peptide accounting appears to provide a mechanism for determining relative protein expression levels independent of isotope tagging. Indeed, Yates and co-workers have recently demonstrated a linear relationship between the level of sampling for proteins in complex mixtures and the relative abundance of the protein in the mixture. The number of spectra acquired for each protein (spectral sampling) was found to be linear with respect to protein levels over two orders of magnitude. These approaches provide the only easy method to determine relative protein levels within a single proteome. Moreover, they should prove valuable in the analysis of large changes in protein expression between complex protein samples such as organelles, allowing for an accessible means to analyse organelle dynamics.
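The size normalization described above can be made concrete with a short Python sketch. It assumes that molecular mass is an adequate proxy for protein size and that peptide counts scale linearly with it; the function names, peptide counts and masses below are invented for illustration and are not data from the study.

```python
def normalized_peptide_count(peptide_count, mass_kda):
    """Peptide count per unit of protein size (mass in kDa, an assumed proxy)."""
    if mass_kda <= 0:
        raise ValueError("mass_kda must be positive")
    return peptide_count / mass_kda


def stoichiometry(counts, masses, reference):
    """Molar ratio of each protein relative to a chosen reference protein."""
    ref = normalized_peptide_count(counts[reference], masses[reference])
    return {
        name: normalized_peptide_count(counts[name], masses[name]) / ref
        for name in counts
    }


# Invented peptide counts and masses (kDa), for illustration only.
counts = {"heavy_chain": 76, "light_chain": 10}
masses = {"heavy_chain": 190.0, "light_chain": 25.0}
ratios = stoichiometry(counts, masses, reference="heavy_chain")
print(ratios)
```

With these illustrative numbers the raw peptide counts differ by more than sevenfold, yet the size-normalized values coincide, which is the pattern expected for two subunits present in equimolar amounts.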
Identification of novel proteins functioning in clathrin-mediated trafficking
Of the 209 proteins reproducibly identified in CCVs, eight were hypothetical gene products that had not been previously detected at the protein level [13,14]. Five of these proteins have been characterized to function in vesicle trafficking including one with WD-40 repeats and a FYVE domain, which is localized to early endosomes (FENS-1). A second protein contained a serine/threonine kinase domain at its N-terminus and multiple DPF and NPF motifs in its C-terminal region. DPF and NPF motifs bind to the globular ear of the α-subunit of AP-2 (α-ear) and Eps15 homology (EH) domains respectively [24–26]. This protein was independently identified in a screen for α-ear-binding proteins and has been named AAK1 (adaptor-associated kinase 1) [27,28]. AAK1 phosphorylates the μ2 subunit of AP-2 and appears to regulate cargo recruitment to CCPs. A third novel protein, which we named enthoprotin, was independently identified and named Clint and epsinR [30,31]. Enthoprotin binds clathrin and AP-1 and is highly concentrated on CCVs [13,29–31]. Enthoprotin also contains an ENTH (epsin N-terminal homology) domain, a protein and lipid-binding module that until that time had been found exclusively in proteins that function in CME (see  for a review). Interestingly, immunofluorescence studies revealed that enthoprotin co-localizes with AP-1 and other markers of the TGN and the endosomal system. Enthoprotin is in fact the first ENTH domain-bearing protein described to function in clathrin-mediated trafficking on internal membranes. RNAi studies on enthoprotin suggest that it may function primarily in a retrograde trafficking pathway that transports cargo such as the mannose-6-phosphate receptor from early endosomes to the TGN.
The NECAP proteins define a new AP-2-binding motif
Two additional novel proteins from the proteomics analysis of CCVs are highly homologous with each other but share no homology or common domains with any previously characterized proteins. The proteins, named NECAP1 and NECAP2, are highly enriched on the coats of isolated CCVs and localize in part to CCPs at the cell surface. By performing pull-downs with GST-NECAP1, followed by LC-Q-TOF analysis of the affinity-purified proteins, we identified AP-2 as the major NECAP-binding partner. The interaction is mediated exclusively by the α-ear of AP-2, although the NECAPs do not contain any sequences that match known consensus motifs for α-ear binding, DPF/W or FXDXF [24,25,35]. We instead identified the sequence WVQF as necessary and sufficient for AP-2 binding, and disruption of NECAP–AP-2 interactions by overexpression of NECAP constructs containing this motif blocks CME of the transferrin receptor.
Database searches allowed for the recognition of a number of endocytic proteins that contain WVQF-like motifs, including synaptojanin 170, AAK1, auxilin 2 and stonin 2, and two recent studies have indicated that these motifs are used for AP-2 binding [36,37]. Recently, we have performed an extensive mutational analysis of the WVQF motif in the NECAPs, leading to the consensus W-[VAIED]-X-[FW]. Intriguingly, the WVXF motif in NECAPs is found at the extreme C-termini of the proteins and the free carboxylate group is critical for α-ear binding. In all other proteins, the motif is found within the protein but is always followed by a series of acidic residues, and mutation of three acidic residues to alanine in stonin 2 eliminates α-ear binding. In peptide competition experiments, NECAPs do not compete for α-ear binding with proteins using DPF/W and FXDXF-binding motifs, indicating that the WVQF motif must utilize a distinct binding interface on the α-ear. The comparison of NMR spectra for the α-ear in the presence or absence of a NECAP1 peptide containing the WVQF motif revealed a NECAP-binding site in the sandwich subdomain of the α-ear [38,39]. In fact, the binding site is identical with that previously reported to bind to DPW motifs in epsin. Mutations in this site that disrupt NECAP binding have no effect on epsin binding, demonstrating that this site is primarily a binding site for WVQF-like motifs [38,40]. Thus characterization of the NECAPs has revealed a new mechanism for α-ear binding.
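A database search for the consensus W-[VAIED]-X-[FW] can be approximated with a regular expression, as in the following Python sketch. It is an illustration only, not the search procedure used in the studies cited: X is taken to be any of the 20 standard amino acids, the function name `find_motifs` is hypothetical, and the test sequences are made-up strings rather than real protein sequences. The requirement that internal motifs be followed by acidic residues is not modelled.

```python
import re

# Consensus W-[VAIED]-X-[FW]; X is assumed to be any standard residue.
# The lookahead allows overlapping occurrences to be reported.
MOTIF = re.compile(r"(?=(W[VAIED][ACDEFGHIKLMNPQRSTVWY][FW]))")


def find_motifs(sequence):
    """Return (start, motif, is_c_terminal) for each match, 0-based start."""
    seq = sequence.upper()
    hits = []
    for match in MOTIF.finditer(seq):
        start = match.start(1)
        motif = match.group(1)
        # Flag motifs that end exactly at the C-terminus of the sequence.
        is_c_terminal = start + len(motif) == len(seq)
        hits.append((start, motif, is_c_terminal))
    return hits


# Made-up sequences, for illustration only.
print(find_motifs("MKTAWVQF"))     # motif at the extreme C-terminus
print(find_motifs("AAWEDWDDEAA"))  # internal motif
print(find_motifs("WLQF"))         # second position outside the consensus
```

Reporting whether a match sits at the extreme C-terminus reflects the observation that the NECAP motif depends on a free carboxylate group, whereas motifs in other proteins are internal.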
Subcellular proteomics has emerged as an important approach towards the characterization of organelles. For CCVs, we have demonstrated a major role in recycling SVs. Moreover, we have developed a peptide accounting approach that may provide an accessible means to examine dynamic changes in the protein composition of organelles. Finally, we have identified a number of novel CCV-associated proteins. In fact, proteomics analysis of CCVs isolated from rat liver and developing rat brains has identified overlapping yet distinct sets of novel proteins. An advantage of subcellular proteomics is that the localization of a novel protein to a functional unit (an organelle) provides a framework on which to pursue the functional analysis of the protein. This is especially important for proteins such as NECAPs, in which bioinformatics analysis does not reveal any protein domains or modules that may direct functional studies. For the NECAPs, the assignment of the proteins to CCVs allowed for the identification of a new motif for interactions with AP-2. These studies thus reveal the utility of subcellular proteomics towards the analysis of clathrin-mediated trafficking events.
Structure Related to Function: Molecules and Cells: A Focus Topic at BioScience2004, held at SECC Glasgow, U.K., 18–22 July 2004. Edited by D. Alessi (Dundee, U.K.), T. Cass (Imperial College London, U.K.), T. Corfield (Bristol, U.K.), M. Cousin (Edinburgh, U.K.), A. Entwistle (Ludwig Institute for Cancer Research, London, U.K.), I. Fearnley (Cambridge, U.K.), P. Haris (De Montfort, Leicester, U.K.), J. Mayer (Nottingham, U.K.) and M. Tuite (Canterbury, U.K.).
This work was supported by grants from the Canadian Institutes of Health Research and from the Genome Quebec project: Réseau Protéomique de Montréal, Montreal Proteomics Network (RPMPN).