Protein Analysis by Mass Spectrometry

What we can do

  • Identify single proteins from a gel band or solution
  • Identify multiple proteins in solution
  • Identify multiple proteins from a cell extract
  • Obtain sufficient sequence for cloning


What we need

  • gel band
    • Coomassie stained
    • mass spectrometry compatible silver stained
    • Sypro stained
  • solution


How it is done

The majority of protein sequence analysis today uses mass spectrometry. There are several steps in analyzing a protein.

  1. Digest the protein to peptides (in gel or solution). Mass spectrometry currently gets limited sequence data from whole proteins, but can easily analyze peptides.
  2. Trypsin is the first choice for digestion: it is readily available and specific, most of its peptides are an ideal size for analysis, and they behave well in the mass spectrometer.
  3. Separate the peptides, usually on a reverse phase column with an acetonitrile gradient. We use columns 75 µm in diameter. We use acetic acid in the solvents because the commonly used trifluoroacetic acid interferes with ionization.
  4. Place ionized peptides in the vapor phase by passing the column eluate, containing peptides and solvent, through a fine tip to form tiny droplets. After the solvent evaporates, the peptides are left in the vapor phase. Charged surfaces move the ionized peptides into the mass spectrometer. Using chromatography to introduce molecules into a mass spectrometer is called LC-MS (liquid chromatography mass spectrometry) or HPLC-MS.
  5. Measure mass of peptides.
  6. Fragment the peptides. Collisions with gas molecules fragment peptides at peptide bonds. This is CID (collision induced dissociation) or CAD (collisionally activated dissociation).
  7. Measure the mass of the fragments from the peptides. Because there are two steps of mass spectrometry (mass of the peptide, then mass of its fragments), this is called MS/MS, or MSn because there can be 2 or more fragmentation steps. A two-step process is also called tandem mass spectrometry.
  8. Use the fragment mass data to determine the sequence of the peptide by seeing which combinations of amino acids give the observed masses of the peptide fragments.
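
Steps 1, 2 and 5 can be sketched in Python. This is a minimal illustration under stated assumptions, not the software used for real analyses. The function names are hypothetical, the cleavage rule (after K or R, but not before P) is the commonly used simplification for trypsin, and missed cleavages and modifications are ignored. The residue masses are standard monoisotopic values.

```python
import re

# Monoisotopic residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O added back for the free N- and C-termini
PROTON = 1.007276   # mass of the proton carried by each positive charge


def trypsin_digest(sequence):
    """Split after K or R, except when the next residue is P (assumed rule)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def peptide_mass(peptide):
    """Monoisotopic mass of the neutral peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def mass_to_charge(mass, charge):
    """m/z observed for a peptide carrying `charge` protons."""
    return (mass + charge * PROTON) / charge


if __name__ == "__main__":
    # Made-up sequence, used only to show the calls.
    for pep in trypsin_digest("MAGKRPLEWKSTR"):
        m = peptide_mass(pep)
        print(pep, round(m, 4), round(mass_to_charge(m, 2), 4))
```

The instrument reports mass-to-charge ratios rather than masses, which is why the sketch converts each peptide mass to m/z for an assumed charge of 2.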


The two mass measurements in steps 5 and 7 require a tandem mass spectrometer, or MS/MS instrument. The two measurements can be performed in

  • two different parts of the instrument (tandem in space)
  • one cell of the instrument that switches modes (tandem in time)

Most data analysis is done by computer, by comparison with known sequences; SEQUEST is the best known program. For new sequences and confirmation of important sequences, data analysis is done by hand.
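
The idea of comparing fragment masses with known sequences can be sketched as follows. This is a simplified illustration only. It is not the scoring used by SEQUEST, which relies on cross-correlation. It considers only singly charged b ions (N-terminal fragments) and y ions (C-terminal fragments) and counts how many theoretical fragments have an observed peak within a tolerance. The function names and the default tolerance are assumptions made for the example.

```python
# Monoisotopic residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def fragment_ions(peptide):
    """Return (b_ions, y_ions) as singly charged m/z values."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, y_ions = [], []
    for i in range(1, len(masses)):
        b_ions.append(sum(masses[:i]) + PROTON)           # first i residues
        y_ions.append(sum(masses[-i:]) + WATER + PROTON)  # last i residues
    return b_ions, y_ions


def count_matches(observed, peptide, tolerance=0.02):
    """Count theoretical fragments that have an observed peak nearby."""
    b_ions, y_ions = fragment_ions(peptide)
    return sum(
        any(abs(obs - theo) <= tolerance for obs in observed)
        for theo in b_ions + y_ions
    )


def best_candidate(observed, candidates, tolerance=0.02):
    """Pick the candidate sequence explaining the most fragment peaks."""
    return max(candidates, key=lambda pep: count_matches(observed, pep, tolerance))
```

Because leucine and isoleucine have the same mass, a comparison like this cannot tell them apart. Real spectra also contain noise and missing fragments, which is why practical programs use more elaborate scores.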

Why you do not get complete sequence data for every protein

Seeing enough peptides to show 70% of the sequence of a protein (70% coverage) is a very successful protein analysis. In a project by the Cell Migration Consortium to analyze a number of proteins involved in cell migration, 80% coverage of a protein is considered sufficient. There are several reasons why an analysis does not find all amino acids.

  • the protein does not digest well
  • peptides are too hydrophilic or too small: they pass through the reverse phase column with the salt and are not analyzed
  • peptides are too large or too hydrophobic: they stick in the gel, adsorb to tubes, do not elute from the column, or are too large for the mass spectrometer to analyze because of poor fragmentation
  • peptides fragment in ways that cannot be analyzed. Many spectra in an analysis cannot be interpreted. Some spectra give only limited data; proline, histidine, and internal lysine and arginine are some of the reasons peptides do not give complete fragmentation data.


If more data is needed, another proteinase is used for digestion.

Mass mapping, protein mass fingerprinting

A less complex method for identifying proteins relies on databases of protein sequences. After digesting a protein with trypsin or some other specific proteinase, the masses of the intact peptides are measured, usually with a MALDI instrument. A program compares the observed masses of peptides with those calculated from the digestion of all proteins in a database.
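
The comparison can be sketched as follows. This is a toy illustration, not any particular search program. The database holds two made-up sequences, the function names are hypothetical, the trypsin rule (after K or R, but not before P) is the usual simplification, and the score is simply the number of observed masses that match a calculated peptide mass within an assumed tolerance.

```python
import re

# Monoisotopic residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565

# Hypothetical database; both sequences are invented for the example.
DATABASE = {
    "protein_A": "MAGKLLEWRSTVDK",
    "protein_B": "GASPKTTYFRNNQEK",
}


def tryptic_masses(sequence):
    """Calculated masses of the tryptic peptides, no missed cleavages."""
    peptides = [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]
    return [sum(RESIDUE_MASS[aa] for aa in p) + WATER for p in peptides]


def rank_proteins(observed_masses, database, tolerance=0.05):
    """Rank database entries by how many observed masses they explain."""
    scores = {}
    for name, sequence in database.items():
        theoretical = tryptic_masses(sequence)
        scores[name] = sum(
            any(abs(obs - theo) <= tolerance for theo in theoretical)
            for obs in observed_masses
        )
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

With a mixture of proteins, the observed masses come from several database entries at once and no single entry explains most of them, which is one way to see why the method suits samples containing one protein.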

This technique only works for proteins whose sequences are available. It is only suitable for samples containing one protein, or two if they have similar abundances. This method was suggested as a way to analyze samples from 2-D gels.

Over the years, it has not been used much, especially with the increased availability of instruments which sequence peptides, and the demand to analyze samples containing multiple proteins.

Other topics:
  • analysis of complex protein mixtures
  • post-translational modifications, including phosphorylation