Human liver cathepsin L consists of a heavy chain and a light chain with Mr values of 25,000 and 5,000 respectively. The chains have been purified and their N-terminal amino acid sequences have been determined. The 40 amino acids determined from the heavy chain and the 42 amino acids determined from the light chain are homologous with the N-terminal and C-terminal regions respectively of members of the cysteine proteinase superfamily. It is therefore likely that the two chains of cathepsin L are derived by proteolysis of a single polypeptide precursor. Of the amino acids sequenced, 81% are identical with the homologous portions of a protein sequence for a major cysteine proteinase predicted from a cDNA clone from a mouse macrophage cell line. This is the closest relative amongst the known sequences in the superfamily, which strongly indicates that the protein encoded by this mRNA is cathepsin L. The mouse protein is also probably the major excreted protein of a transformed cell line [Gal & Gottesman (1986) Biochem. Biophys. Res. Commun. 139, 156-162]. The heavy chain is identical in only 71% of its residues with the sequence of ox cathepsin S, providing further evidence that this latter enzyme is probably not a species variant of cathepsin L. The relationship with a second cathepsin cDNA clone, from a bovine library, is much weaker (41% identity), and so this clone remains unidentified.

