Briefings in Functional Genomics and Proteomics Advance Access originally published online on May 10, 2006
Briefings in Functional Genomics and Proteomics 2006 5(2):169-175; doi:10.1093/bfgp/ell017
Technique Review
Unravelling the proteome of formalin-fixed paraffin-embedded tissue
Corresponding author. Timothy D. Veenstra, Laboratory of Proteomics and Analytical Technologies, National Cancer Institute at Frederick, SAIC-Frederick, Inc., PO Box B, Frederick, MD 21702, USA. E-mail: veenstra{at}ncifcrf.gov
| ABSTRACT |
|---|
Biomarkers that are detectable in biofluids but originate at the site of diseased tissue would be advantageous because they may provide mechanistic information about the manifestation and progression of the disease. Unfortunately, tissue biopsies are precious samples that can generally be acquired only in small amounts owing to the invasive nature of sample collection. For decades, one of the foundations of pathological diagnosis has been formalin-fixed paraffin-embedded (FFPE) tissue, of which a vast archive exists worldwide. These tissues have also been widely used in immunohistochemistry and in situ hybridization studies that examine the expression of specific proteins or transcripts. Until recently, however, the ability to analyse FFPE tissues using mass spectrometry (MS) has been essentially non-existent. In this review, methods that allow the extraction of peptides from FFPE tissues and their proteomic analysis using MS are described. The ability to identify the proteins extracted from FFPE tissues allows comparative analyses that enable the potential discovery of novel biomarkers at the site of the diseased tissue.
Keywords: biomarker discovery, formalin-fixed paraffin-embedded tissue, laser-capture microdissection, mass spectrometry, proteomics
| INTRODUCTION |
|---|
The discovery of biomarkers faces significant challenges at both the analytical and the physiological level, most notably in the availability of clinical samples. The ultimate goal is to identify and/or track disease-specific biomarkers within clinical samples that can be collected with minimal invasiveness (e.g. urine, serum, plasma, etc.). Unfortunately, the translation of a biomarker from the affected site (e.g. a tumour) to the point of sample collection generally results in its dilution to a level that is undetectable using standard mass spectrometry (MS) methodologies. While MS provides the ability to conduct comparative analyses of biofluids through the direct identification of peptides or proteins within clinical samples acquired from healthy and disease-affected patients, these types of experiments represent proverbial needle-in-the-haystack searches, with the additional complication that nothing is known about the needle. These methods are capable of identifying hundreds of proteins within biofluids as well as quantifying changes in a significant percentage of these species [1, 2]. Unfortunately, MS cannot indicate which of these differentially abundant proteins may be the disease-specific biomarker that is being sought. The typical throughput of these profiling studies is insufficient for large numbers of comparative analyses, and therefore the uncertainty in proceeding to validate any of the potential markers identified in the discovery phase is generally quite high.
Beyond the physiological challenge presented by such biofluid analyses, a fundamental difficulty lies in the acquisition of appropriate clinical samples for MS-based discovery purposes. Tissue biopsies are difficult to obtain and are therefore, in most instances, too valuable to be used in global MS discovery projects. At the same time, a vast archive of formalin-fixed (FF) and paraffin-embedded (PE) tissue samples representing every conceivable condition exists worldwide. These FFPE tissues are prepared by placing samples in a buffered formalin solution comprising 3.7% (w/v) formaldehyde and 10–15% methanol [3]. The process of formalin fixation results in various combinations of intra- and inter-molecular covalent cross-links between proteins, RNA and DNA [4]. Following fixation, the samples are embedded in paraffin, which not only allows tissues to be cut into thin sections (i.e. a few microns thick), but also permits the tissue to be stored at room temperature for an indefinite period of time. These tissues have been used for decades by pathologists to diagnose and stage tumours [5] and by research scientists to evaluate the expression of specific proteins and transcripts by immunohistochemistry (IHC) [6] and in situ hybridization [7], respectively. While inherently valuable in clinical analyses, FFPE tissues have been limited to IHC for protein measurements. IHC is a hypothesis-driven analysis in which antibodies must be chosen to target a specific protein that may or may not be expressed by the cells within the tissue of interest.
Several years ago, a study attempted to extract proteins from FFPE, ethanol-fixed and fresh-frozen tissues. The extracts were analysed using two-dimensional polyacrylamide gel electrophoresis (2D-PAGE) followed by silver stain visualization [8]. Approximately 500 and 700 spots were detected within the gels from the ethanol-fixed and frozen-tissue extracts, respectively, while essentially no protein spots could be visualized from the extract taken from FFPE tissues. Recent developments in extraction methodologies have finally made the analysis of FFPE tissue by MS possible and have provided access to this vast, and clinically important, sample set for biomarker discovery.
| SAMPLE PREPARATION AND MASS SPECTROMETRY ANALYSIS |
|---|
Enhancements in the extraction of peptides from FFPE tissue sections yield samples that are directly compatible with separation by reversed-phase liquid chromatography (RPLC) and subsequent analysis by MS and tandem MS (MS/MS). In this method, populations ranging from 30 000 to 200 000 cells are manually collected or laser-capture microdissected from sections that are cut from FFPE tissue blocks [9–11]. Samples are then treated appropriately in order to facilitate extraction of peptides from the cells, enabling the analysis of these proteomes by MS without additional separation techniques (e.g. 2D-PAGE). Attempts to extract intact proteins have typically met with very little success [8]. It is sometimes overlooked by the general scientific community that MS is not generally used to identify intact proteins when analysing complex proteomes; it relies instead on the identification of peptides generated by enzymatic digestion of a proteome. Once the peptides are extracted from the cells, they are fractionated using RPLC that is coupled directly to a mass spectrometer [12].
In the analysis of complex proteome samples, one of the goals is to identify as many peptides as possible [13]. As an aid to achieving this goal, complex samples are often initially separated by strong cation exchange (SCX) chromatography and individual fractions are systematically analysed by RPLC–MS/MS. Unfortunately, it is difficult to obtain enough total protein (i.e. >100 µg) from tissue sections for SCX to be implemented prior to RPLC–MS/MS analyses. A typical extraction from 50 000 cells will provide on the order of 5–10 µg of peptides. To maximize peptide identifications, however, a single sample can be analysed by RPLC–MS/MS multiple times using gas-phase fractionation in the m/z dimension (GPFm/z) [14]. In this method, the mass spectrometer is instructed to select precursor ions observed within a narrow m/z range (i.e. 200 m/z units). Experiments are run over a series of different m/z ranges that collectively span the entire scanning range of the mass spectrometer. By narrowing the m/z range in any specific RPLC–MS/MS experiment, the instrument can significantly increase the number of precursor ions that can be sampled in that range, potentially resulting in a greater number of MS/MS events and, ultimately, peptide identifications. An example of a proteome sample extracted from FFPE tissue and analysed by GPFm/z is shown in Figure 1. Although the exact same sample was analysed repeatedly, the base-peak chromatograms of the individual runs were remarkably different, showing that distinct peptides were being sampled in the separate experiments. Using this method, a sample containing a total of 10 µg of protein can be analysed by RPLC–MS/MS up to 10 times using different m/z ranges, and the combined data from these separate analyses typically result in the identification of several hundred proteins.
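The partitioning of the scan range into GPFm/z windows can be illustrated with a minimal Python sketch. The 200 m/z window width follows the text; the 400–2000 m/z scanning range and the function name `gpf_windows` are assumptions chosen only for illustration, since the actual range depends on the instrument and method.

```python
def gpf_windows(scan_start, scan_end, width):
    """Split a precursor scan range into contiguous m/z windows of a given width."""
    if width <= 0 or scan_end <= scan_start:
        raise ValueError("need width > 0 and scan_end > scan_start")
    windows = []
    low = scan_start
    while low < scan_end:
        # The final window is clipped so that it never exceeds the scan range.
        high = min(low + width, scan_end)
        windows.append((low, high))
        low = high
    return windows


# Assumed values for illustration: a 400-2000 m/z scan range, 200 m/z windows.
windows = gpf_windows(400, 2000, 200)
for run, (low, high) in enumerate(windows, start=1):
    print(f"run {run}: select precursors in {low}-{high} m/z")
```

Each tuple corresponds to one RPLC–MS/MS injection of the same sample, so the number of windows also sets how much of the extracted material is consumed.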
| COMPARATIVE ANALYSIS OF CELLS EXTRACTED FROM FFPE TISSUES |
|---|
Of great interest in the analysis of any proteome sample is how it compares with one extracted from another cell type or from the same cell type under a different set of environmental conditions. Over the past few years, a number of quantitative proteomic approaches have been developed, many of them focusing on the use of differential stable isotope labelling [15, 16]. Unfortunately, the limited amount of protein extracted from FFPE tissues makes isotope-labelling approaches very difficult. Sample losses associated with the various steps required in these comparative methods would leave too little material for RPLC–MS/MS analysis. Recently, proteomic investigators have been exploring the use of comparative methods that are based on the number of peptides identified for a specific protein [17]. The basic premise is that the number of peptides identified for a specific protein is roughly proportional to its abundance in a proteome sample. This simplistic view can be complemented with other abundance-based determinations that take into account protein molecular weight and the number of possible MS-observable peptides that a protein would be expected to produce. This type of analysis is well-suited to FFPE tissues, allowing direct comparison of cell populations based on peptide abundances despite limited sample quantities.
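One such normalization is the exponentially modified protein abundance index (emPAI) named in the title of reference [17], which relates the number of observed peptides to the number of observable peptides for each protein. The following Python sketch illustrates the calculation as we understand it from that reference; the protein names and peptide counts are hypothetical and are not taken from any of the studies reviewed here.

```python
def empai(n_observed, n_observable):
    """emPAI = 10**(observed / observable) - 1 for a single protein."""
    if n_observable <= 0:
        raise ValueError("n_observable must be positive")
    return 10 ** (n_observed / n_observable) - 1


def molar_percent(empai_by_protein):
    """Express each emPAI value as a percentage of the summed emPAI values."""
    total = sum(empai_by_protein.values())
    if total == 0:
        return {protein: 0.0 for protein in empai_by_protein}
    return {protein: 100 * value / total
            for protein, value in empai_by_protein.items()}


# Hypothetical (observed, observable) peptide counts, for illustration only.
example_counts = {
    "protein_A": (6, 12),
    "protein_B": (2, 20),
    "protein_C": (4, 8),
}
empai_values = {name: empai(obs, possible)
                for name, (obs, possible) in example_counts.items()}
percentages = molar_percent(empai_values)
for name in example_counts:
    print(f"{name}: emPAI = {empai_values[name]:.2f}, "
          f"content = {percentages[name]:.1f} mol %")
```

In this example protein_A and protein_C receive the same index even though more peptides were observed for protein_A, because the larger protein is also expected to yield more observable peptides.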
| ANALYSIS OF FORMALIN-FIXED CELLS IN VITRO |
|---|
One of the first studies to demonstrate the ability to use MS to identify proteins extracted from FFPE cells examined a confluent culture of follicular B-cell lymphoma cells (SUDHL-4) [10]. Several sections were cut from a block of the fixed cells, and lysis buffer was added to extract the protein complement of the cells, which was subsequently digested using trypsin or Glu-C. Approximately 325 proteins were identified by at least two unique peptides in the subsequent RPLC–MS/MS analysis. Among the identifications were many important signalling proteins such as Raf-B, JAK1, STAT1 and protein kinase C (PKC), whose presence in the SUDHL-4 proteome was further confirmed by immunoblot analysis. The ability to detect these signalling proteins attests to the overall sensitivity of the process.
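The requirement that a protein be supported by at least two unique peptides can be expressed as a simple filter. The Python sketch below assumes that the database search has already produced a mapping from each identified peptide to the set of proteins it matches; the function name, peptide labels and protein labels are hypothetical placeholders, not data from the study.

```python
from collections import defaultdict


def proteins_with_unique_support(peptide_to_proteins, min_unique=2):
    """Return proteins identified by at least `min_unique` unique peptides.

    A peptide is treated as unique when it matches exactly one protein.
    """
    unique_counts = defaultdict(int)
    for proteins in peptide_to_proteins.values():
        if len(proteins) == 1:
            (protein,) = proteins
            unique_counts[protein] += 1
    return {protein for protein, n in unique_counts.items() if n >= min_unique}


# Hypothetical search results: peptide label -> proteins containing that peptide.
identified = {
    "peptide_1": {"protein_A"},
    "peptide_2": {"protein_A"},
    "peptide_3": {"protein_A", "protein_B"},  # shared, so not counted as unique
    "peptide_4": {"protein_B"},
    "peptide_5": {"protein_C"},
}
accepted = proteins_with_unique_support(identified)
print(sorted(accepted))
```

Here only protein_A passes the filter; protein_B has a single unique peptide plus one shared peptide, and protein_C has a single unique peptide.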
| APPLICATION TO PROSTATE CANCER |
|---|
Recently, another newly developed methodology for the analysis of FFPE tissues by RPLC–MS/MS was applied to peptides extracted from prostate cancer (PCa), benign prostate hyperplasia (BPH) and stroma cells acquired from FFPE prostate tissue sections [9]. The overall procedure is presented as a schematic in Figure 2. Approximately 200 000 cells were collected from each of these regions using a modified laser-capture microdissection technique; the proteins were then extracted from the cells, digested and analysed directly by RPLC–MS/MS using GPFm/z. Identified peptides from the separate GPFm/z analyses were compiled and compared across all three sets.
A total of 2200 peptides representing 1156 unique proteins (those representing only a single protein in the database) were identified within the PCa population, while 1300 peptides corresponding to 702 unique proteins were identified within the BPH cells. Identified proteins from each sample were evaluated using a subtractive proteomics approach where unique peptides and the total number of times they were observed were compared globally to ascertain protein-abundance differences between cells. Indeed, several prostate-related proteins were identified in this study including phosphatidylethanolamine-binding protein (PEBP), prostatic acid phosphatase (PAP) and prostate-specific antigen (PSA). What is perhaps more telling is that PEBP, PAP and PSA were not only identified by more than one unique, fully tryptic peptide but that they were identified nearly equivalently in both the BPH and PCa regions (Table 1). These results support the common knowledge that while PSA is currently used as a marker for the presence of PCa, it does not accurately distinguish between PCa and other prostatic diseases. Additionally, the use of PAP as a potential marker was previously discontinued for similar reasons as observed in this study where it was also identified by similar numbers of peptides (Table 1).
Additionally, comparative proteomics analyses can provide potentially compelling abundance differences between different cell populations. Indeed, in the evaluation of peptides observed across all three samples, growth differentiation factor-15 (GDF-15) was observed by four total peptides in PCa cells but not observed at all in either BPH or stroma cells. A recent study has shown that serum GDF-15 can function as an independent marker of the presence of PCa [18]. In addition, this study showed that using the combination of the measurements of GDF-15 and PSA serum levels significantly increases the diagnostic specificity for PCa in men. The discovery-driven analysis of the FFPE tissues confirms that GDF-15 protein abundance is indeed up-regulated in PCa cells but not in BPH cells acquired from the same FFPE PCa tissue section [9].
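The subtractive comparison described in this section amounts to looking for proteins with peptide observations in one cell population and none in the others. A minimal Python sketch follows. The GDF-15 values (four peptides in PCa, none in BPH or stroma) are those reported above; the PSA counts are placeholders chosen only to represent a protein seen equally in PCa and BPH, and the function name is hypothetical.

```python
def exclusive_to(counts, target, min_peptides=1):
    """List proteins observed in `target` but in none of the other samples.

    `counts` maps sample name -> {protein: number of peptide observations}.
    """
    others = [sample for sample in counts if sample != target]
    return sorted(
        protein
        for protein, n in counts[target].items()
        if n >= min_peptides
        and all(counts[sample].get(protein, 0) == 0 for sample in others)
    )


peptide_counts = {
    # GDF-15 counts follow the text; PSA counts are illustrative placeholders.
    "PCa": {"GDF-15": 4, "PSA": 5},
    "BPH": {"PSA": 5},
    "stroma": {},
}
print(exclusive_to(peptide_counts, "PCa"))
```

A protein flagged in this way is only a candidate; as the text notes, the number of observations is small and any such difference requires independent validation.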
| COMPARISON OF FFPE AND FRESH-FROZEN TISSUE |
|---|
What is surprising is that the vast majority of the peptides identified in the FFPE PCa tissue study were unmodified, fully tryptic peptides [9]. Of concern in the fixation process and long-term storage of these samples is that a variety of protein modifications may occur, including residual formylation and oxidation. While these types of modifications were observed in the analysis, they did not occur to any appreciable extent and were not included in the final results. In order to further address the potential effects of the fixation process on peptide yields from the extraction process, the authors investigated the effects using a single mouse liver, fixing one-half according to standard protocols and freezing the other [9]. Samples (30 000 cells) were processed similarly following freezing or fixation and analysed in a similar manner as above (Figure 3). While the frozen section yielded a slightly higher number of identified peptides and proteins, the species identified from each tissue type showed excellent overlap. The evaluation of protein or peptide modifications as well as tryptic site specificity (i.e. missed cleavage sites potentially due to the extensive cross-linked network) revealed that in this case there was no significant difference between the differently processed samples. These results were reflected in another study in which the proteomes of FFPE and fresh-frozen SUDHL-4 cells were characterized by liquid chromatography tandem MS (LC–MS/MS) [10]. A total of 324 and 512 proteins were identified in the FFPE and fresh-frozen SUDHL-4 cells, respectively. While both of these studies used limited sample sets, taken together the results suggest that the fixation process will not unduly hinder the use of this vast archive of tissue samples for retrospective analysis of diseased states.
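The overlap between the species identified from fixed and frozen material can be summarized with simple set operations. The Python sketch below shows one way to do so; the protein labels are hypothetical placeholders, and the Jaccard index is used here merely as a convenient summary statistic rather than as the measure reported in the cited studies.

```python
def overlap_summary(ffpe_ids, frozen_ids):
    """Summarize shared and sample-specific identifications for two sets."""
    shared = ffpe_ids & frozen_ids
    union = ffpe_ids | frozen_ids
    return {
        "shared": len(shared),
        "ffpe_only": len(ffpe_ids - frozen_ids),
        "frozen_only": len(frozen_ids - ffpe_ids),
        # Jaccard index: shared identifications divided by all identifications.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Hypothetical identification lists, for illustration only.
ffpe = {"protein_A", "protein_B", "protein_C"}
frozen = {"protein_A", "protein_B", "protein_C", "protein_D"}
summary = overlap_summary(ffpe, frozen)
print(summary)
```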
| DISCUSSION |
|---|
Over the past 5 years, the effort and cost devoted to the discovery of novel biomarkers in biofluids have increased exponentially, albeit with an unfortunate lack of success. While some of the reasons may be related to the technology, the physiological barriers that need to be overcome should not be underestimated. Trying to unearth the identities of true biomarkers that are diluted into a vast biological matrix is extremely difficult. Considering the results, an alternative strategy might be to analyse samples of diseased tissue. Since the purpose of a biomarker discovery-driven proteomic study is to survey a large number of proteins, large sample quantities are typically required. As one of the more prevalent tissue sources available, FFPE tissues could play a critical role in this discovery phase, since fresh biopsies are difficult to obtain and are experimentally precious. A potential biomarker identified through the analysis of FFPE samples could be initially validated at the tissue level by rapidly screening a series of comparable samples (i.e. either FFPE or fresh-frozen). The biomarker can then be targeted within an appropriate biofluid using an assay designed around either an immunoaffinity reagent (e.g. an antibody or aptamer) or an MS experiment that directly interrogates the biofluid for the protein or peptide of interest. While the attributes of MS for broad molecular identification are well-known, it is also extremely effective as an analytical instrument to assay for specific components within complex mixtures. The effectiveness of a tissue-specific biomarker as a biofluid-based biomarker is not a given; careful consideration is needed in choosing which validatable tissue markers might potentially be useful biofluid markers. For instance, proteins known to be located on the cell surface could be shed into the surrounding cell environment and would make good potential candidates for markers that can be readily detected in the circulatory system.
The process of formalin-fixing and paraffin-embedding tissues, which begins by placing the sample in a solution of buffered formalin, has been a staple protocol in the pathological determination of disease. However, the major component of formalin, formaldehyde, is a bifunctional molecule that infiltrates the cells within the tissue and reacts with various active sites on biomolecules, resulting in the formation of inter- and intra-molecular covalent cross-links. It is this covalent locking of proteins that has made the proteomic analysis of FFPE tissues using MS very difficult. As shown above, this challenge was overcome through the development of novel methods to extract peptides, instead of intact proteins, from FFPE tissues. In the majority of MS-based proteomic studies, it is peptides that are analysed by the mass spectrometer, and therefore these new methodologies fit perfectly into the downstream MS proteome analyses that have been developed over the past decade. Although various populations of unmodified, formalin (or other)-modified and formalin cross-linked peptides are undoubtedly extracted from these tissues, the resulting bioinformatic analysis of the MS data can focus solely on the unmodified peptides. It is from this list of unmodified peptides that direct comparisons can be made between samples about the relative abundance of specific proteins.
The initial investigations of FFPE tissues described above show the importance of accessing this significant archive of retrospective collections of disease. However, as with biofluids such as serum and plasma, the effect of the different sample preparation methods used to generate FFPE tissues on the reproducibility of the MS analysis is not yet fully understood. While studies have shown the ability to identify peptides or proteins from FFPE tissue that is several years old [9, 10], there are not enough critical data to draw any solid conclusion concerning the effect of ageing on the ability to conduct effective proteomic analyses on these samples. The ability to explore the proteomes of FFPE tissues, while potentially quite valuable, is still in its infancy and will require the development of additional methodologies and technologies to make it more effective in the hunt for prognostic and diagnostic biomarkers.
| Acknowledgements |
|---|
This project has been funded in whole or in part with federal funds from the National Cancer Institute, National Institutes of Health, under Contract NO1-CO-12400. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products or organization imply endorsement by the United States Government.
| FOOTNOTES |
|---|
Brian Hood is a research scientist in the Laboratory of Proteomics and Analytical Technologies at SAIC-Frederick, Inc. He received his PhD in Biochemistry from The Ohio State University in 2003 under the direction of Dr Russ Hille.
Tom Conrads is the director of the Mass Spectrometry Center at SAIC-Frederick, Inc. He is keenly interested in utilizing mass spectrometry technology to aid in the diagnosis of human diseases.
Tim Veenstra is the director of the Laboratory of Proteomics and Analytical Technologies at SAIC-Frederick, Inc. His research interests lie in finding novel cancer-related biomarkers.
| References |
|---|
1. Anderson NL, Polanski M, Pieper R, et al. The human plasma proteome: a nonredundant list developed by combination of four separate sources. Mol Cell Proteomics 2004; 3:311–26.
2. States DJ, Omenn GS, Blackwell TW, et al. Challenges in deriving high-confidence protein identifications from data gathered by a HUPO plasma proteome collaborative study. Nat Biotechnol 2006; 24:333–8.
3. Fox CH, Johnson FB, Whiting J, et al. Formaldehyde fixation. J Histochem Cytochem 1985; 33:845–53.
4. Kunkel GR, Mehrabian M, Martinson HG. Contact-site cross-linking agents. Mol Cell Biochem 1981; 34:3–13.
5. Wilkinson EJ, Hendricks JB. Role of the pathologist in biomarker studies. J Cell Biochem Suppl 1995; 23:10–8.
6. Werner M, Chott A, Fabiano A, et al. Effect of formalin tissue fixation and processing on immunohistochemistry. Am J Surg Pathol 2000; 24:1016–9.
7. Kadkol SS, Gage WR, Pasternack GR. In situ hybridization-theory and practice. Mol Diagn 1999; 4:169–83.
8. Ahram M, Flaig MJ, Gillespie JW, et al. Evaluation of ethanol-fixed, paraffin-embedded tissues for proteomic applications. Proteomics 2003; 3:413–21.
9. Hood BL, Darfler MM, Thomas G, et al. Proteomic analysis of formalin-fixed prostate cancer tissue. Mol Cell Proteomics 2005; 4:1741–53.
10. Crockett DK, Lin Z, Vaughn CP, et al. Identification of proteins from formalin-fixed paraffin-embedded cells by LC-MS/MS. Lab Invest 2005; 85:1405–15.
11. Palmer-Toy DE, Krastins B, Sarracino DA, et al. Efficient method for the proteomic analysis of fixed and embedded tissues. J Proteome Res 2005; 4:2404–11.
12. Liu H, Lin D, Yates JR, et al. Multidimensional separations for protein/peptide analysis in the post-genomic era. Biotechniques 2002; 32:898–902.
13. Guerrera IC, Kleiner O. Application of mass spectrometry in proteomics. Biosci Rep 2005; 25:71–93.
14. Blonder J, Rodriguez-Galan MC, Lucas DA, et al. Proteomic investigation of natural killer cell microsomes using gas-phase fractionation by mass spectrometry. Biochim Biophys Acta 2004; 1698:87–95.
15. Ong SE, Mann M. Mass spectrometry-based proteomics turns quantitative. Nat Chem Biol 2005; 1:252–62.
16. Schneider LV, Hall MP. Stable isotope methods for high-precision proteomics. Drug Discov Today 2005; 10:353–63.
17. Ishihama Y, Oda Y, Tabata T, et al. Exponentially modified protein abundance index (emPAI) for estimation of absolute protein amount in proteomics by the number of sequenced peptides per protein. Mol Cell Proteomics 2005; 4:1265–72.
18. Brown DA, Stephan C, Ward RL, et al. Measurement of serum levels of macrophage inhibitory cytokine 1 combined with prostate-specific antigen improves prostate cancer diagnosis. Clin Cancer Res 2006; 12:89–96.


