Several cDNA fragments coding for bovine submaxillary mucin (BSM) were cloned, and the nucleotide sequence of the largest clone, BSM421, was determined. Two peptide sequences determined from the purified apoBSM were found near the N-terminus of the mucin-coding region of BSM421. This clone does not contain a start or stop codon, but its 3′ end overlaps with the 5′ end of a previously isolated clone, λBSM10. The composite sequence of 1589 amino acid residues consists of five distinct protein domains, which are numbered from the C-terminus. The cysteine-rich domain I can be further divided into a von Willebrand factor type C repeat and a cystine knot. Domains III and V consist of similar repeated peptide sequences with an average of 47 residues. Domains II and IV do not contain such sequences but are similar to domains III and V in being rich in serine and threonine, many of which are predicted to be potential O-glycosylation sites. Domain III also contains two sequences that match the ATP/GTP-binding site motif A (P-loop). Only β-strands and no α-helices are predicted for the partial deduced amino acid sequence. Northern analysis of submaxillary gland RNA with the BSM421 probe detected multiple messages of BSM with sizes from 1.1 to over 10 kb. The tandemly repeated, non-identical peptide sequences of approximately 47 residues in domains III and V of BSM differ from the tandemly repeated, identical 81-residue sequences of pig submaxillary mucin (PSM), although both BSM and PSM contain similar C-terminal domains. In contrast, two peptide sequences of ovine submaxillary mucin are highly similar (86% and 65% identical, respectively) to the corresponding sequences in domain V of BSM.
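As an illustration of what matching the ATP/GTP-binding site motif A involves, the sketch below scans a protein sequence for the PROSITE P-loop consensus [AG]-x(4)-G-K-[ST] (entry PS00017). It is a minimal example under stated assumptions, not the procedure used in this work: the function name `find_p_loops` and the toy sequence are hypothetical, and the toy sequence is not taken from BSM.

```python
import re

# PROSITE PS00017 (ATP/GTP-binding site motif A, P-loop): [AG]-x(4)-G-K-[ST].
# The lookahead lets overlapping candidate sites be reported as well.
P_LOOP = re.compile(r"(?=([AG].{4}GK[ST]))")


def find_p_loops(protein):
    """Return (1-based start position, matched segment) for each P-loop match."""
    return [(m.start() + 1, m.group(1)) for m in P_LOOP.finditer(protein.upper())]


# Hypothetical toy sequence for illustration only; NOT a BSM sequence.
toy = "MSTTAPSGSGKTSPVTGETTTGKSLL"
print(find_p_loops(toy))  # [(5, 'APSGSGKT'), (17, 'GETTTGKS')]
```

A pattern match of this kind indicates only that the consensus is present; it does not by itself show that the site binds nucleotide.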

Present address: Laboratory of Experimental Carcinogenesis, National Cancer Institute, National Institutes of Health, Building 37, Room 3C04, Bethesda, MD 20892, USA.

The nucleotide sequence reported will appear in the DDBJ, EMBL and GenBank Nucleotide Sequence Databases under the accession number AF016589.