## Joachim Frank

Print publication date: 2006

Print ISBN-13: 9780195182187

Published to Oxford Scholarship Online: April 2010

DOI: 10.1093/acprof:oso/9780195182187.001.0001


# Electron Microscopy of Macromolecular Assemblies

Chapter:
(p.15) 2 Electron Microscopy of Macromolecular Assemblies
Source:
Three-Dimensional Electron Microscopy of Macromolecular Assemblies
Publisher:
Oxford University Press
DOI:10.1093/acprof:oso/9780195182187.003.0002

# Abstract and Keywords

Following an outline of the transmission electron microscope (TEM) and its working principles, this chapter starts with a description of specimen preparation methods for EM imaging, including negative staining, glucose embedment, and ice embedment. The principle of image formation in the TEM is described as it pertains to biological weak phase objects, and in the process, the contrast transfer function (CTF) is introduced. It is shown that EM images are effectively projections of the Coulomb potential distribution of the biological object, convolved with that function. The chapter closes by describing methods for the determination and computational correction of the CTF.

Figure 2.1 Schematic diagram of a transmission electron microscope. Adapted from a drawing provided by Roger C. Wagner, University of Delaware (Web site: www.udel.edu/Biology/Wags/b617/b617.htm).

# 1. Principle of the Transmission Electron Microscope

The transmission electron microscope uses high-energy (100 keV or higher) electrons to form an image of very thin objects.1 Electrons are strongly scattered (p.16) by matter. For instance, for 120 kV electrons, the mean free path for elastic scattering in vitreous ice is in the region of 2800 Å, and for amorphous carbon 1700 Å (Angert et al., 1996). Values for resins used for plastic embedding and for frozen-hydrated tissue closely match those for vitreous ice. The corresponding values for inelastic scattering are 850 and 460 Å, respectively. Electron paths can be bent by magnetic lenses so that an image is formed on a fluorescent screen (for direct viewing), on photographic film, or on the scintillator of a charge-coupled device (CCD) camera (for recording). The much shorter wavelength of electrons, as compared to light, carries the promise of much higher resolution. However, because of lens aberrations, the resolution of an electron microscope falls far short of what one would expect from the de Broglie wavelength (0.037 Å for electrons accelerated to 100 kV); even today’s best electron microscopes have resolutions only in the range 1–2 Å.
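These numbers lend themselves to a quick back-of-the-envelope check. The sketch below (Python) computes the relativistically corrected de Broglie wavelength and, using the mean free paths quoted above, the fraction of electrons that traverse an ice layer without a scattering event; the 500-Å layer thickness is an assumed illustrative value, not a figure from the text.

```python
import math

# Physical constants (SI units, CODATA values)
H = 6.62607015e-34      # Planck constant, J*s
M_E = 9.1093837015e-31  # electron rest mass, kg
Q_E = 1.602176634e-19   # elementary charge, C
C = 2.99792458e8        # speed of light, m/s

def de_broglie_wavelength(kv):
    """Relativistically corrected de Broglie wavelength (in Angstrom)
    of electrons accelerated through `kv` kilovolts."""
    e_kin = Q_E * kv * 1e3  # kinetic energy in joules
    p = math.sqrt(2.0 * M_E * e_kin * (1.0 + e_kin / (2.0 * M_E * C**2)))
    return H / p * 1e10     # meters -> Angstrom

def unscattered_fraction(thickness, mean_free_path):
    """Fraction of electrons traversing a layer of given thickness (in
    Angstrom) without a scattering event (exponential attenuation)."""
    return math.exp(-thickness / mean_free_path)

print(de_broglie_wavelength(100))       # ~0.037 A, as quoted in the text
print(unscattered_fraction(500, 2800))  # elastic, 500-A vitreous ice layer
print(unscattered_fraction(500, 850))   # inelastic, same layer
```

For a 500-Å ice layer, roughly one electron in six is scattered elastically, while a substantially larger fraction suffers an inelastic event, which is why energy losses matter in practice (see the discussion of chromatic aberration below).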

Since electrons are also scattered by the molecules of the air, the electron microscope column needs to be kept permanently at high vacuum. The need to maintain the vacuum makes it necessary to introduce specimens and photographic plates through special airlock mechanisms. The problem posed by the basic incompatibility between the vacuum requirements and the naturally hydrated state of biological specimens has only recently been solved by cryo-electron microscopy (cryo-EM) techniques: the specimen is kept frozen-hydrated throughout the experiment, at a temperature (that of liquid nitrogen or helium) at which the vapor pressure of water is negligible. Before the introduction of practical cryo-techniques at the beginning of the 1980s (Dubochet et al., 1982), molecular specimens were usually prepared in a solution of a heavy metal salt such as uranyl acetate or uranyl molybdate, and subsequently air-dried, with consequent loss of high-resolution features and interior detail (see sections 2.2 and 2.3). Despite these shortcomings, this technique of negative staining is still used in the initial phases of many molecular imaging projects.

For the general principle and theory of scattering and image formation, and the many applications of the transmission electron microscope, the reader is referred to the authoritative works by Reimer (1997) and Spence (1988). In the following, we will make reference to a schematic diagram of a typical transmission electron microscope (Figure 2.1) to explain the essential elements.

Cathode. Electrons are emitted by the cathode, which is kept at a high negative voltage while the rest of the microscope is at ground potential. Two types of cathode have gained practical importance: the thermionic cathode and the field-emission gun (FEG). The thermionic cathode works on the principle that metals, heated to a sufficiently high temperature, emit electrons. Typically, a hairpin-shaped tungsten wire is used. Use of a tip material with low work function, such as lanthanum hexaboride (LaB6), enhances the escape of electrons and leads to a brightness higher than that of tungsten. In the FEG, the exit of electrons is enhanced (by lowering of the potential barrier) and directionally focused by the presence of a strong electric field at the cathode tip. To this end, a voltage in the region of 2 kV is applied to a so-called extractor electrode situated in close vicinity to the tip. As a consequence, extraordinary brightness (p.17) is attained. Brightness is directly related to spatial coherence: to achieve high coherence, which is required for high resolution (see section 3), one has to restrict the emission from the cathode to a very small angular range (or, to use the term to be introduced later on, a small effective source size; see section 3.3.1). Unless the brightness of the cathode is high, this restriction leads to unacceptably low intensities and long exposure times. In the 1990s, commercial instruments equipped with a FEG came into widespread use.

Wehnelt cylinder. The cathode filament is surrounded by a cup-shaped electrode with a central ∼1-mm bore that allows the electrons to exit. Kept at a potential slightly more negative (by ∼500 V) than that of the cathode, the Wehnelt cylinder is specially shaped so that it ushers the electrons emitted from the cathode toward the optic axis and allows a space charge, formed by an electron cloud, to build up. It is the tip of this cloud, closest to the Wehnelt cylinder opening, that acts as the actual electron source.

Condenser lenses. These lenses are part of the illumination system, defining the way the beam impinges on the specimen, which is immersed in the objective lens. Of practical importance are the spot size, that is, the physical size of the beam on the specimen, and the angle of beam convergence, which determines the coherence of the illumination. There are usually two condenser lenses, designated CI and CII, with CI situated immediately below the Wehnelt cylinder. This first condenser lens forms a demagnified image of the source. The size of this source image (“spot size”) can be controlled by the lens current. The second condenser is a weak lens that is used to illuminate the specimen with a variable physical spot size and angle of convergence. At one extreme, it forms an image of the source directly on the specimen (small spot size; high angle of convergence). At the other extreme, it forms a strongly blurred image of the source (large spot size; low angle of convergence). Additional control over the angle of convergence is achieved by variation of the CII apertures.

Objective lens. This lens, the first image-forming lens of the electron microscope, determines the quality of the imaging process. Its small focal length (on the order of millimeters) means that the specimen has to be partially immersed in the bore of the lens, making the design of mechanisms for specimen insertion, support, and tilting quite challenging. Parallel beams are focused in the back focal plane; specifically, beams scattered at the same angle are focused onto the same point. Thus, a diffraction pattern is formed in that plane. The objective lens aperture, located in the back focal plane, determines what portion of the scattered rays participates in the image formation. Contrast formed by screening out widely scattered electrons with the objective lens aperture is termed amplitude contrast. More important in high-resolution electron microscopy (EM) is phase contrast, which is formed by interference between unscattered and elastically scattered electrons that have undergone a phase shift due to wave aberrations and defocusing.

Objective lens aberrations, foremost (i) the third-order spherical aberration, (ii) chromatic aberration, and (iii) axial astigmatism, ultimately limit the resolution of the microscope. Spherical aberration has the effect that for beams (p.18) inclined to the optic axis, the focal length is less than for those parallel to the axis. As a consequence, the image of a point becomes a blurred disk. This aberration can be partially compensated by operating the lens in underfocus, which has just the opposite effect (i.e., the focal length being greater for beams inclined to the axis), though with a different angular dependency (see section 3 where this important compensation mechanism utilized in bright-field microscopy is discussed). Axial astigmatism can be understood as an “unroundness” of the lens magnetic field, which has the effect that the way specimen features are imaged depends on their orientation in the specimen plane. Axial astigmatism can be readily compensated with special corrector elements that are situated directly underneath the objective lens.
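The interplay between spherical aberration and underfocus can be made concrete with the standard wave-aberration phase χ(θ) = (2π/λ)(Csθ⁴/4 − Δzθ²/2), in which a positive (underfocus) Δz term opposes the Cs term. The sketch below (Python) evaluates this expression and computes the Scherzer defocus, one common choice of compensating underfocus; the value Cs = 2 mm is an assumed typical constant, not taken from the text.

```python
import math

WAVELENGTH = 0.037e-10  # electron wavelength at 100 kV, m (from the text)
CS = 2.0e-3             # spherical aberration constant, m (assumed typical value)

def wave_aberration(theta, defocus):
    """Wave-aberration phase chi(theta), in radians, for spherical
    aberration plus defocus; positive defocus = underfocus, whose
    quadratic term partially cancels the quartic Cs term."""
    return (2.0 * math.pi / WAVELENGTH) * (
        CS * theta**4 / 4.0 - defocus * theta**2 / 2.0)

def scherzer_defocus():
    """Scherzer defocus, a standard balance point between the terms."""
    return math.sqrt(1.5 * CS * WAVELENGTH)

dz = scherzer_defocus()
print(dz * 1e9, "nm underfocus")  # ~105 nm for these assumed values
```

Note the opposite angular dependencies mentioned above: the Cs term grows as θ⁴ while the defocus term grows only as θ², so the cancellation is exact at no more than one scattering angle.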

Finally, chromatic aberration determines the way the lens focuses electrons with different energies. A lens with small chromatic aberration will focus electrons slightly differing in energy into the same plane. The importance of this aberration is twofold: electrons may have different energies because the high voltage fluctuates, or they may have different energies as a result of the scattering (elastic versus inelastic scattering, the latter accompanied by an energy loss). Modern electron microscopes are well stabilized, making the first factor unimportant. However, a small chromatic aberration constant also ensures that electrons undergoing moderate amounts of energy loss are not greatly out of focus relative to zero-loss electrons.
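The consequence of a finite chromatic aberration constant can be estimated with the usual first-order expression Δz ≈ Cc·ΔE/E. In the sketch below (Python), the values Cc = 2 mm and a 20-eV energy loss are assumed for illustration only, and the relativistic correction to E is omitted.

```python
def chromatic_defocus(cc_mm, delta_e_ev, e0_kv):
    """First-order estimate of the defocus offset (in nm) for electrons
    that lost delta_e_ev of energy, given a chromatic aberration
    constant cc_mm (in mm) and accelerating voltage e0_kv (in kV)."""
    return cc_mm * 1e6 * delta_e_ev / (e0_kv * 1e3)  # mm -> nm is *1e6

# A 20-eV (plasmon-type) loss at 120 kV with Cc = 2 mm:
print(chromatic_defocus(2.0, 20.0, 120.0), "nm")  # ~333 nm out of focus
```

An energy-loss electron is thus imaged several hundred nanometers out of focus relative to zero-loss electrons, which is why a small Cc (or an energy filter) matters even with a well-stabilized high-voltage supply.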

Intermediate lenses. One or several lenses are situated between the objective and projector lenses. Their strength can be changed to produce overall magnifications over a wide range (roughly 500× to 1,000,000×), though in molecular structure research a narrow range of 35,000× to 60,000× is preferred, as detailed below. Particular settings are available to switch the electron microscope from the imaging to the diffraction mode, a mode important mainly for studying two-dimensional (2D) crystals, and not useful in conjunction with single particles. (Note that in one nomenclature, both the intermediate lenses and the projector lens described below are summarily referred to as “projector lenses.”)

Projector lens. This lens projects the final image onto the screen. Since the final image is quite large, the part of the EM column between the projector lens and screen is sensitive to stray fields that can distort the image, and special magnetic shielding is used.

Fluorescent screen. For direct viewing, a fluorescent screen is placed in the image plane. The fluorescent screen is a metal plate coated with a thin (10–20 µm, depending on voltage) layer of a fluorescent material that converts impinging electrons into bursts of light.

Photographic recording. Upon impact on a photographic emulsion, electrons, just as light, make silver halide grains developable, and an image is formed by grains of metallic silver. Depending on emulsion thickness and electron energy, a fraction of the electrons each produce a cascade of secondary electrons, so that the trail of grains in the emulsion is plume-shaped, normal to the plane of the emulsion, with a significant lateral spread.

CCD recording. CCD cameras consist of a large array of photosensitive silicon diodes that is optically coupled to a scintillator. (p.19) The optical coupling is achieved by fiber optics or, less frequently, by a glass lens of large numerical aperture. The scintillator, like the fluorescent screen used in direct observation, converts the energy set free on impact of an electron into a local light signal. The signals from the individual array elements are read out and integrated over a brief period to form a digital image that is directly stored in the computer. Apart from offering the convenience of direct recording, CCD cameras have the advantage over photographic film that they cover a much wider dynamic range. Their disadvantages are that they are still quite expensive and that they cover a much smaller field of view.

# 2. Specimen Preparation Methods

## 2.1. Introduction

Ideally, we wish to form a three-dimensional (3D) image of a macromolecule in its entirety, at the highest possible resolution. Specimen preparation methods must be designed to perform the seemingly impossible task of stabilizing the initially hydrated molecule so that it can be placed and observed in the vacuum. In addition, the contrast produced by the molecule itself is not normally sufficient for direct observation in the electron microscope, and various contrasting methods have been developed. Negative staining (section 2.2) with heavy metal salts such as uranyl acetate produces high contrast and protects, at least to some extent, the molecule from collapsing. However, the high contrast comes at a hefty price: instead of the molecule, with its interior density variations, only a cast of the exterior surface of the molecule is imaged, and only the molecule’s shape can be reconstructed. To some extent, stain may penetrate into crevices, but this does not alter the fact that only the solvent-accessible boundary of the molecule is visualized.

As sophisticated averaging methods were developed, it became possible to “make sense” of the faint images produced by the molecule itself, but biologically relevant information could be obtained only after methods were found to “sustain” the molecule in a medium that closely approximates the aqueous environment: these methods are embedment in glucose (section 2.3), tannic acid (section 2.4), vitreous ice (“frozen-hydrated”; section 2.5) and cryo-negative staining (section 2.6). It is by now clear, from comparisons with structures solved to atomic resolution, that embedment in vitreous ice leads to the best preservation of interior features, all the way to atomic resolution. For this reason, the main emphasis in this book, in terms of coverage of methods and applications, will be on the processing of data obtained by cryo-EM.

For completeness, this section will conclude with a brief assessment of gold-labeling techniques, which have become important in some applications of 3DEM of macromolecules (section 2.7), and with a discussion of support grids (section 2.8).

## (p.20) 2.2. Negative Staining

### 2.2.1. Principle

The negative staining method, which was introduced by Brenner and Horne (1959), has been widely used to obtain images of macromolecules with high contrast. Typically, an aqueous suspension of the specimen is mixed with 1 to 2% uranyl acetate and applied to a carbon-coated specimen grid. The excess liquid is blotted away and the suspension is allowed to dry. Although, to some extent, the stain goes into crevices and aqueous channels, the structural information in the image is basically limited to the shape of the molecule as it appears in projection. As tilt studies and comparisons with X-ray diffraction data on wet suspensions reveal, the molecule shape is distorted by air-drying (see Crowther, 1976; Kellenberger and Kistler, 1979; Kellenberger et al., 1982). Nevertheless, negative staining has been used with great success in numerous computer reconstructions of viruses and other large regular macromolecular assemblies. Such reconstructions are legitimate because they utilize symmetries, which allow many views to be generated from the view least affected by the distortion. As Crowther (1976) correctly argues, “reconstructions of single isolated particles with no symmetry from a limited series of tilts (Hoppe et al., 1974) are therefore somewhat problematical.” As this book shows, the problems in the approach of Hoppe and coworkers to the reconstruction of negatively stained molecules can be overcome by the use of a radically different method of data collection and interpretation.

Even in the age of cryo-EM, negative staining is still used in high-resolution EM of macromolecules as an important first step in identifying characteristic views and assessing if a molecule is suitable for this type of analysis (see also the re-evaluation done by Bremer et al., 1992). Efforts to obtain a 3D reconstruction of the frozen-hydrated molecule almost always involve a negatively stained specimen as the first step, sometimes with a delay of several years [e.g., 50S ribosomal subunit: Radermacher et al. (1987a, 1992b); Androctonus australis hemocyanin: Boisset et al. (1990b, 1992a); nuclear pore complex: Hinshaw et al. (1992); Akey and Radermacher (1993); skeletal muscle calcium release channel: Wagenknecht et al. (1989a); Radermacher et al. (1994b); RNA polymerase II: Craighead et al., 2002]. A useful compilation covering different techniques of negative staining, including cryo-negative staining, is found in Ohi et al. (2004).

### 2.2.2. Artifacts Introduced by Negative Staining

#### 2.2.2.1. One-sidedness and variability

Biological particles beyond a certain size are often incompletely stained; parts of the particle farthest away from the carbon film “stick out” of the stain layer (“one-sided staining”; Figure 2.2). Consequently, there are parts of the structure that do not contribute to the projection image; one speaks of partial projections. A telltale way by which to recognize partial projections is the observation that “flip” and “flop” views of a particle (i.e., alternate views obtained by flipping the particle on the grid) fail to obey the expected mirror relationship. A striking example of this effect is offered by the 40S subunit of the eukaryotic ribosome with its characteristic R and L side views (Figure 2.3; Frank et al., 1981a, 1982): the two views are quite different in appearance—note, for instance, the massive “beak” of the R view which is fused with the “head,” compared to the narrow, well-defined “beak” presented in the L view. (Much later, cryo-EM reconstructions as well as X-ray structures confirmed this difference in shape between the intersubunit and the cytosolic side of the small ribosomal subunit; see, for instance, Spahn et al., 2001c.)

(p.21)

Figure 2.2 Appearance of negatively stained particles in single versus double carbon layer preparations, as exemplified by micrographs of the calcium release channel of skeletal fast twitch muscle. (a) Single-layer preparation, characterized by crisp appearance of the molecular border and white appearance of parts of the molecule that “stick out.” From Saito et al. (1988), reproduced with permission of The Rockefeller University Press. (b) Double-layer preparation, characterized by a broader region of stain surrounding the molecule and more uniform staining of interior. From Radermacher et al. (1992a), reproduced with permission of the Biophysical Society.

(p.22)

Figure 2.3 One-sidedness of staining in a single-carbon layer preparation produces strong deviations from the mirror relationship between flip/flop related projections. The average of 40S ribosomal subunit images showing the L view (top left) is distinctly different from the average of images in the R view (top right). This effect can be simulated by computing incomplete projections, that is, projections through a partially capped volume (e.g., upper half removed), from a reconstruction that was obtained from particles prepared by the double-layer technique (bottom panels). From Verschoor et al. (1989); reproduced with permission of Elsevier.

Since the thickness of the stain may vary from one particle to the next, “one-sidedness” of staining is frequently accompanied by a high variability in the appearance of the particle. Very detailed—albeit qualitative—observations on these effects go back to the beginning of image analysis (Moody, 1967). After the introduction of multivariate data analysis into EM (see chapter 4), the stain variations could be systematically studied, as these techniques of data analysis allow particle images to be ordered according to stain level or other overall effects. From such studies (van Heel and Frank, 1981; Bijlholt et al., 1982; Frank et al., 1982; Verschoor et al., 1985; Boisset et al., 1990a; Boekema, 1991) we know that the appearance of a particle varies with stain level in a manner reminiscent of a rock partially immersed in water of varying depth (Figures 2.4 and 2.5). In both cases, a contour marks the level where the isolated mass rises above the liquid. Depending on the depth of immersion, the contour will contract or expand. (Note, however, that despite the striking similarity between the two situations, there exists an important difference: for the negatively stained particle, seen in projection, only the immersed part is visible in the image, whereas for the (p.23) rock, seen as a surface from above, only the part sticking out of the water is visible.)

Figure 2.4 The effect of negative staining (uranyl acetate) on the particle shape, illustrated with reconstructions of a hemocyanin molecule from Androctonus australis labeled with four Fabs. (a, c) Side and top views of the molecule reconstructed from single-particle images of a negatively stained preparation. (b, d) Side and top views of the molecule from single-particle images obtained by cryo-EM. Comparison of the side views (a, b) shows the extent of the compression that occurs when the molecule is negatively stained and air-dried. While in the cryo-EM reconstruction (b) the orientations of the Fab molecules reflect the orientations of the protein epitope in the four copies of the corner protein, in the stained preparation they are all bent into a similar orientation, so that they lie near-parallel to the specimen grid. However, comparison of the top views (c, d) shows that the density maps are quite similar when viewed in the direction of the beam, which is the direction of the forces (indicated by arrows) inducing flattening. (a, c) From Boisset et al. (1993b), reproduced with permission of Elsevier. (b, d) From Boisset et al. (1995), reproduced with permission of Elsevier.

Computer calculations can be used to verify the partial-staining model. Detailed matches with observed molecule projections have been achieved by modeling the molecule as a solid stain-excluding “box” surrounded by stain up to a certain z-level. With the 40S ribosomal subunit, Verschoor et al. (1989) were able to simulate the appearance (and pattern of variability) of experimental single-layer projections by partial projections of a 3D reconstruction that was obtained from a double-layer preparation.
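As a toy numerical illustration of this partial-staining model (not the authors' actual procedure), one can project a binary stain-excluding volume surrounded by stain only up to a given z-level and compare the result with the fully immersed case:

```python
import numpy as np

def negative_stain_projection(volume, stain_level):
    """Model image of a negatively stained particle: the particle
    (binary `volume`, z is axis 0) excludes stain, and stain fills the
    surrounding space only up to z index `stain_level`.  The projection
    of the stain along z dominates the image; parts of the particle
    above the stain level leave no imprint (partial projection)."""
    stain = np.ones_like(volume, dtype=float)
    stain[volume > 0] = 0.0          # particle excludes stain
    stain[stain_level:, :, :] = 0.0  # no stain above the stain level
    return stain.sum(axis=0)         # projection along the beam (z)

# Toy particle: a sphere of radius 10 in a 32^3 box
z, y, x = np.mgrid[:32, :32, :32]
sphere = ((z - 16)**2 + (y - 16)**2 + (x - 16)**2 < 10**2).astype(float)

full = negative_stain_projection(sphere, 32)     # fully immersed
partial = negative_stain_projection(sphere, 16)  # "one-sided" staining
# The two images differ wherever the upper half of the particle
# would have contributed to the projection.
print(np.abs(full - partial).max())
```

In the partially stained case, the upper half of the sphere contributes nothing to the image, mimicking the failure of flip and flop views to obey the expected mirror relationship.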

Because partial projections do not generally allow the object to be fully reconstructed, the single-carbon layer technique is unsuitable for quantitative studies of particles with a diameter above a certain limit. In the case of uranyl acetate, this limit is in the range of 150 Å, the maximum height fully covered with stain as inferred from the comparison between actual and simulated projections of the 40S ribosomal subunit (Verschoor et al., 1989). For methylamine tungstate used in conjunction with Butvar films, on the other hand, much larger stain depths appear to be achieved: fatty acid synthetase, with a long dimension of ∼300 Å, appeared to be fully covered (Kolodziej et al., 1997).

(p.24)

Figure 2.5 (a) Images of the 40S ribosomal subunit from HeLa cells, negatively stained in a single-carbon layer preparation. The images were sorted, in one of the first applications of correspondence analysis, according to increasing levels of stain. (b) Optical density traces through the center of the particle. From Frank et al. (1982), reproduced with permission of Elsevier.

The double-carbon layer method of staining [Valentine et al. (1968); Tischendorf et al. (1974); Stöffler and Stöffler-Meilicke (1983); see also Frank et al. (1986), where the relative merits of this technique as compared to those of the single-layer technique are discussed] solves the problem of partial stain immersion (p.25) by providing staining of the particle from both sides without information loss. This method has been used for immunoelectron microscopy (Tischendorf et al., 1974; Lamy, 1987; Boisset et al., 1988) precisely for that reason: antibodies attached to the surface facing away from the primary carbon layer will then be visualized with the same contrast as those close to the carbon film.

Under proper conditions, the double-carbon layer method gives the most consistent results and provides the most detailed information on the surface features of the particle, whereas the single-layer method yields better defined particle outlines but highly variable stain thickness, at least for conventional stains. Outlines of the particles are better defined in the micrograph because a stain meniscus forms around the particle border (see Moody, 1967), producing a sharp increase in scattering absorption there. In contrast to this behavior, the stain is confined to a wedge at the particle border in sandwich preparations, producing a broader, much more uniform band of stain. Intuitively, the electron microscopist is bound to prefer the appearance of particles that are sharply delineated, but the computer analysis shows that images of sandwiched particles are in fact richer in interior features. Thus, the suggestion that sandwiching would lead to an “unfortunate loss of resolution” (Harris and Horne, 1991) is based mainly on an assessment of the visual appearance of particle borders, not on quantitative analysis.

For use with the random-conical reconstruction technique (see chapter 5, section 5), the sandwiching methods (Valentine et al., 1968; Tischendorf et al., 1974; Boublik et al., 1977) have been found to give the most reliable results, as they yield a large proportion of double-layered area on the grid. There is, however, evidence that the sandwiching technique increases the flattening of the particle (i.e., beyond the usual flattening found in single-layer preparations that is due to the drying of the stain; see below).

#### 2.2.2.2. Flattening

Specimens are flattened by negative staining and air-drying (Kellenberger and Kistler, 1979). As a side effect of flattening, and owing to the variability in its degree, large size variations may also be found in projection (Figure 2.6; see Boisset et al., 1993b).

The degree of flattening can be assessed by comparing the z-extent (i.e., the extent in the direction normal to the support grid) of particles reconstructed from different views showing the molecule in orientations related by a 90° rotation. Using such a comparison, Boisset et al. (1990b) found the long dimension of the Androctonus australis hemocyanin to be reduced to 60% of its size as measured in the x–y plane (Figure 2.4). The z-dimension of the calcium release channel is reduced to as little as 30–40% when the reconstruction from the negatively stained preparation (Wagenknecht et al., 1989a) is compared with the reconstruction from cryo-images (Radermacher et al., 1994a, 1994b). Apparently, in that case, the unusual degree of collapse is due to the fragility of the dome-shaped structure on the transmembranous side of the molecule.

Evidently, the degree of flattening depends strongly on the type of specimen. It has been conjectured, for instance (Knauer et al., 1983), that molecular assemblies composed of RNA and protein, such as ribosomes, are more resistant to mechanical forces than those made entirely of protein. Indications (p.26) for particularly strong resistance were the apparent maintenance of the shape of the 30S subunit in the aforementioned reconstructions of Knauer and coworkers, and the result of shadowing experiments by Kellenberger et al. (1982), which indicated that the carbon film yields to, and partially “wraps around,” the ribosomal particle.

Figure 2.6 Size variation of A. australis hemocyanin–Fab complex prepared with the double-carbon layer, negative staining method. (a) Average of small, well-defined molecules that are encountered at places where the surrounding stain is deep; (b) average of large, apparently squashed molecules seen at places where the stain is shallow. The molecules were classified by correspondence analysis. Scale bar, 100 Å. From Boisset et al. (1993b), reproduced with permission of Elsevier.

The behavior of biological structures subjected to mechanical forces might be easiest to understand by considering the specific features of their architecture, which includes the presence or absence of aqueous channels and cavities. Indeed, empty shells of the turnip yellow mosaic virus were shown to be totally collapsed in the sandwiched preparation while maintaining their spherical shape in the single-layer preparation (Kellenberger et al., 1982). Along similar lines, the flattening of the 70S ribosome mentioned above was observed when the particle was oriented such that the interface cavity between the two subunits could be closed by compression. Also, the calcium release channel is a particularly fragile structure that includes many cavities and channels, as evidenced later on by cryo-EM reconstructions (Radermacher et al., 1994a, 1994b; Samsó and Wagenknecht, 1998; Sharma et al., 1998).

As a general caveat, any comparisons of particle dimensions in the z-direction from reconstructions have to be made with some caution. As will become clear later on, reconstructions from a single random-conical projection set are to some extent elongated in the z-direction, as a result of the missing angular data, so that the amount of flattening deduced from a measurement of the z-dimension is actually an underestimate. By applying restoration (see chapter 5, section 10) to the published 50S ribosomal subunit reconstructed from stained specimens, Radermacher et al. (1992b) found that its true z-dimension was in fact considerably smaller than the apparent dimension. A comparison of the structures shown in that abstract suggests a factor of approximately 0.7.

(p.27) Negative staining currently enjoys a revival of sorts in the form of cryo-negative staining. A separate brief section (section 2.6) will be devoted to this technique, which combines the contrast enhancement due to heavy-metal salt with the shape preservation afforded by ice embedment. Many other variants of negative staining have recently been reviewed and critically assessed by Harris and Scheffler (2002). Several unconventional stains not containing metal atoms were systematically explored by Massover and coworkers (Massover and Marsh, 1997, 2000; Massover et al., 2001). These authors explored light-atom derivatives of structure-preserving sugars, among other compounds, as media to preserve the structure of catalase used as a test specimen (Massover and Marsh, 2000).

Among the variety of stains tested, methylamine tungstate (MAT) appears to occupy a special place (Stoops et al., 1991b; Kolodziej et al., 1997). The combination of low charge and neutral pH apparently minimizes the deleterious interaction of this stain with the molecule. Following Stoops et al. (1991b), the specimen is applied to a Butvar film by the spray method, forming a glassy layer of stain entirely surrounding the molecule. (Some additional references attesting to the performance of this staining method with regard to preservation of the molecule architecture are found in chapter 5, section 12.2; see also figure 5.26 for the distribution of orientations.)

## 2.3. Glucose Embedment

The technique of preparing unstained specimens using glucose was introduced by Unwin and Henderson (1975). In this technique, the solution containing the specimen in suspension is applied to a carbon-coated grid and washed with a 1% (w/v) solution of glucose. The rationale for the development of this technique was, in the words of the authors, “to replace the aqueous medium by another liquid which has similar chemical and physical properties, but is non-volatile in addition.” Other hydrophilic molecules initially tried were sucrose, ribose, and inositol. X-ray evidence indicated that this substitution leaves the structure unperturbed up to a resolution of 1/3 to 1/4 Å⁻¹. Although highly successful in the study of bacteriorhodopsin, whose structure could ultimately be solved to a resolution of 1/3 Å⁻¹ (Henderson et al., 1990), glucose embedding has not found widespread application. One reason for this failure to take hold is that the scattering densities of glucose and protein are closely matched, resulting in extremely low contrast as long as the resolution falls short of the 1/7 to 1/10 Å⁻¹ range where secondary structure starts to become discernible. (It is this disadvantage that virtually rules out consideration of glucose embedment in imaging molecules in single-particle form.) Aurothioglucose, a potential remedy that combines the sustaining properties of glucose with the high contrast of gold (Kühlbrandt and Unwin, 1982), has been discounted because of its high sensitivity to radiation damage (see Ohi et al., 2004). The other reason has been the success of frozen-hydrated electron microscopy (cryo-electron microscopy) starting at the beginning of the 1980s, which has the advantage, compared to glucose embedment, that the scattering densities of water and protein are sufficiently different to produce contrast even at low resolutions with a suitable choice of defocus (see section 3.3).

## (p.28) 2.4. Use of Tannic Acid

Tannic acid, or tannin, has been used with success by some researchers to stabilize and preserve thin ordered protein layers (Akey and Edelstein, 1983). It was found to be superior in the collection of high-resolution data for the light-harvesting complex II (Kühlbrandt and Wang, 1991; Wang and Kühlbrandt, 1991; Kühlbrandt et al., 1994). For tannin preservation, the carbon film is floated off water and transferred with the grid onto a 0.5% (w/v) tannic acid solution adjusted to pH 6.0 with KOH (Wang and Kühlbrandt, 1991). These authors in fact found little difference in the preservation of the high-resolution structure when prepared with vitreous ice, glucose, or tannin; the important difference was the high success rate of crystalline preservation with tannin versus the extremely low success rate with the other embedding media. The authors discuss at length the role of tannin in essentially blocking the extraction of detergent from the membrane crystal, which otherwise occurs in free equilibrium with the detergent-free medium. Arguably the most successful application of this method of embedment has been in the solution of the structure of the αβ-tubulin dimer, to atomic resolution, by electron crystallography of zinc-induced tubulin sheets, where a mixture of tannin and glucose was used (Nogales et al., 1998).

## 2.5. Ice-Embedded Specimens

The development of cryo-EM of samples embedded in vitreous ice (Taylor and Glaeser, 1976; Dubochet et al., 1982; Lepault et al., 1983; McDowall et al., 1983; Adrian et al., 1984; Wagenknecht et al., 1988b; see also the reviews by Chiu, 1993 and Stewart, 1990) presented a quantum leap in biological EM as it made it possible to obtain images of fully hydrated macromolecules. The specimen grid, on which an aqueous solution containing the specimen is applied, is rapidly plunged into liquid ethane, whereupon the thin water film vitrifies (Figure 2.7). The rapid cooling rate (see below) prevents the water from turning into crystalline ice. It is the volume change in this phase transition that normally causes damage to the biological specimens.

The extremely high cooling rate obtained (estimated at 100,000°/s) is the result of two factors: the small mass (and thus heat capacity) of the grid and the high speed of the plunge. Measurements of DNA persistence lengths by cryo-EM have shown that the time during which a 0.1-μm thick water film (the typical thickness obtained by blotting) vitrifies is in the microsecond range (Al-Amoudi et al., 2004).

The detailed sequence of steps is as follows (Stewart, 1990):

(i) Apply a few microliters of aqueous specimen (buffer composition is not important) to a grid made hydrophilic by glow discharge. The grid may be bare, covered with holey carbon, or covered with a continuous carbon film.

(ii) Blot for a few seconds, to remove excess buffer, allowing only a thin (∼1000 Å or less) layer to remain.

(iii) Immediately after blotting, submerge the grid in a cryogen—usually liquid ethane supercooled by a surrounding bath of liquid nitrogen.

(p.29) (iv) Transfer the grid from the cryogen to the liquid nitrogen bath containing the cryo-holder. From then on, the grid must be kept under liquid nitrogen at all times.

(v) Transfer the cryo-holder to the transmission electron microscope (TEM), keeping the grid shielded from the atmosphere at all times, and record micrographs using low-dose techniques. Maintain the temperature well below −133°C (the common range is −160 to −175°C).

Figure 2.7 Schematic diagram of a freeze-plunger. A steel rod is mounted such that, once released and propelled by gravity, it can move through the guide collar until the adjustable stop hits the collar. At its end, a pair of tweezers is fastened that holds the EM grid. For the preparation, a small droplet of the buffer containing the molecules is applied to the grid. Excess liquid is blotted off so that a very thin film of the specimen remains on the grid. After blotting, the steel rod is released, and the grid is thereby plunged into the cryogen, normally liquid ethane. The grid is then transferred into liquid nitrogen and mounted onto the cooled specimen holder. From Stewart (1990).

The number of manual steps involved in the whole procedure makes it an “art”; that is, the success rate strongly depends on the skill of the experimenter. For instance, the duration and evenness of blotting control the thickness and uniformity of the ice layer. Apparatuses such as the Vitrobot (Frederick et al., 2000; Braet et al., 2003; now marketed commercially) have been specially designed to achieve high-quality cryo-specimens reproducibly under controlled conditions. High humidity is particularly important since it prevents fast (p.30) evaporation during the period between blotting and immersion of the grid in the cryogen, thus eliminating one of the chief sources of variability in the thickness of the water film.

As with glucose embedment, the advantage of frozen-hydrated specimen preparation is that specimen collapse is avoided and that the measured image contrast is related to the structure of the biological object itself, rather than to an extraneous contrasting agent, such as heavy atom salts in negative staining. Thus, by combining cryo-EM with 3D reconstruction, a quantitative, physically meaningful map of the macromolecule can be obtained, enabling direct comparisons with results from X-ray crystallography.2 Another advantage of cooling is the greater resistance of organic material to radiation damage at low temperatures (see Chiu et al., 1986), although initial estimates have proved overly optimistic. The reason for the reduction in damage is that free radicals produced by ionization during electron irradiation are trapped under these conditions, preventing, or at least reducing, the damage to the structure. About 30 years ago, Taylor and Glaeser (1974) showed that the crystalline order is preserved in thin platelets of catalase cooled down to liquid-nitrogen temperature. Subsequent investigations of a number of protein crystals found general improvements in radiation resistance by a factor between two and six (Chiu et al., 1986; see also the summary given by Dubochet et al., 1988).

An additional reduction of radiation damage is thought to occur when the temperature is further reduced from liquid-nitrogen temperature (about −170°C at the specimen) to a range close to the temperature of liquid helium. Reports on this effect are, however, inconsistent and contradictory. Measurements by Fujiyoshi (1998) indicated that the temperature of the specimen area rises upon irradiation, potentially to a range where the initially substantial dose protection factor becomes quite small. There are also indications that the size of the effect varies, depending on the aggregation form of the specimen (crystalline or single-particle; Y. Fujiyoshi, personal communication). One report (Grassucci et al., 2002) suggests that some of the gain in radiation protection may be offset by a reduction in contrast, which would be consistent with earlier observations that the density of beam-exposed ice increases by ∼20% when the temperature drops below 32 K (Heide and Zeitler, 1985; Jenniskens and Blake, 1994). However, more definitive experiments showing the actual preservation of a molecule at liquid-helium versus liquid-nitrogen temperature have yet to be conducted.

Cryo-electron microscopy is not without problems. Proteins have a scattering density that is only slightly higher than that of vitreous ice; hence, the contrast of unstained molecular assemblies formed by proteins against an ice matrix is quite low. The intrinsically low contrast of cryo-EM micrographs leads to several problems: (i) it causes difficulties in spotting macromolecules, visually or by automated procedures, especially at low defocus; (ii) it poses problems in aligning particles; and (iii) it leads to the requirement of large numbers of particles (p.31) for the statistical fortification of results. Another, more hidden problem is the high proportion of inelastic scattering in ice when compared to other support media (such as carbon film). This leads to a high “inelastic background” in the image, that is, a high portion of Fourier components in the low spatial frequency range that do not obey the linear contrast transfer theory (see section 3.3).

Despite these shortcomings, however, cryo-EM has proved to be highly successful and accurate in depicting biological structure, as can be seen from a growing number of comparisons with structures solved by X-ray crystallography (e.g., GroE: Roseman et al., 2001; Ludtke et al., 2004; 70S E. coli ribosome: Gabashvili et al., 2000; Spahn et al., 2005).

As a note on nomenclature, both the terms “cryo-electron microscopy” and “electron cryo-microscopy” are being used in the field. It would be desirable at this stage to have one term only. Which one is preferable? In one, the prefix “cryo-” modifies the method “electron microscopy” (not “electron”, which might have raised a possible objection); in the other, the prefix “electron” is a modifier of “cryo-microscopy.” While “electron microscopy” is a defined term that still allows all kinds of specifications, “cryo-microscopy” by itself does not exist as a meaningful generic term, and this is the reason why “cryo-electron microscopy” (and its universally accepted abbreviation “cryo-EM”) is the preferable, more logical name for the technique described here.

## 2.6. Hybrid Techniques: Cryo-Negative Staining

With ice embedment, the only source of image contrast is the biological material itself. Practical experience has shown that for alignment (and all operations that depend on the total structural contents of the particle) to work, molecules have to be above a certain size, which is in the region of 100 Å. For globular molecules, this corresponds to ∼400 kD molecular weight (see formula in Appendix of Henderson, 1995). This is well above the minimum molecular weight (100 kD) derived on theoretical grounds by Henderson (1995) (see section 5.5 in chapter 3). Stark et al. (2001) obtained a cryo-EM reconstruction of the spliceosomal U1 snRNP, in the molecular weight range of 200 kD, probably on the borderline of what is feasible.

Thus, the technique of 3D cryo-EM of unstained, ice-embedded molecules faces a practical hurdle as the size of the molecule decreases. It is possible to extend this range to molecules that are significantly smaller by adding a certain amount of stain to the aqueous specimen prior to plunge-freezing. The added contrast of the molecule boundary against the more electron-dense background, rather than the increase of contrast in the molecule’s interior, is instrumental in achieving alignment even for small particles. It is therefore to be expected that the size or molecular weight limit is lower for strongly elongated molecules, and higher for more globular ones, whose shape provides little purchase for rotational alignment algorithms (see chapter 3, section 1 for details on these algorithms).

Cryo-negative staining was introduced by Adrian et al. (1998). The procedure they described is as follows: 4 μl of sample solution is applied to a gold sputter-coated holey carbon film. After about 30 s, the grid is placed onto a 100-μl droplet of 16% ammonium molybdate on a piece of paraffin and allowed to float for 60 s. (p.32) Then the grid is removed for blotting and freeze-plunging according to the usual protocol. The pH range of the stain (7–8) avoids the denaturing effects associated with the low pH of more common staining agents such as uranyl acetate.

Later, de Carlo et al. (2002) reported that application of this specimen preparation technique, apart from giving the benefit of higher contrast [a 10-fold increase in signal-to-noise ratio (SNR)], also resulted in a reduction of beam sensitivity in single molecules of GroEL. Following an exposure of 10 to 30 e⁻/Å², the particles visible in the micrograph were seen to disintegrate and lose contrast in ice, while they remained virtually unchanged in cryo-negative stain. The results obtained by these authors also indicate that the resolution as measured by the Fourier shell correlation of single-particle reconstructions [a measure of reproducibility in Fourier space, to be introduced in later chapters (section 5.2.4, chapter 3; section 8.2, chapter 5)] improves substantially over that with ice (Figure 2.8). There is, however, a question whether this improvement might merely be a reflection of the increased definition of the molecule’s boundary, rather than an improvement in the definition of the molecule’s interior. At any rate, the stain surrounding the molecule appears to be stabilized under the impact of the beam, producing a crisper appearance and a well-reproducible boundary as indicated by the Fourier shell correlation (de Carlo et al., 2002).

Using the method of Adrian et al. (1998), as well as a variant of it employing Valentine’s carbon sandwiching method (Valentine et al., 1968), Golas et al. (2003) were able to reconstruct the multiprotein splicing factor SF3b, which is in the molecular weight region of 450 kD. The same technique (under the name (p.33) of “cryo-stain sandwich technique”) was used by Boehringer et al. (2004) and Jurica et al. (2004) for the 3D visualization of spliceosomal complexes B and C, respectively.

Figure 2.8 Fourier shell correlation plots for different doses and specimen preparation conditions. lowD = low dose (10 e⁻/Å²); highD = high dose (30 e⁻/Å²); water = unstained particles; stain = cryo-negatively stained particles; FSC crit = 3-σ FSC criterion. From de Carlo et al. (2002), reproduced with permission of Elsevier.

## 2.7. Labeling with Gold Clusters

The use of selective stains to mark specific sites or residues of a molecule was explored early on (see review by Koller et al., 1971). The idea of using compounds incorporating single heavy atoms received particular interest; even sequencing of nucleic acids was once thought possible in this way (Beer and Moudrianakis, 1962). The difficulty with such single-atom probes was their relatively low contrast compared to the contrast arising from a column of light atoms in a macromolecule and from the support. Subsequently, a number of different heavy-atom clusters were investigated for their utility in providing specific contrast (see the detailed account in Hainfeld, 1992). Undecagold, a compound that incorporates 11 gold atoms, is clearly visible in the scanning transmission electron microscope (STEM) but not in the conventional electron microscope. Subsequently, Hainfeld and Furuya (1992; see also Hainfeld, 1992) introduced a much more electron-dense probe consisting of a 55-gold-atom cluster (Nanogold; Nanoprobe Inc., Stony Brook, NY), which forms a 14-Å particle that is bound to a single maleimide site. This compound can be specifically linked to naturally occurring or genetically inserted cysteine residues exposed to the solvent.

From theoretical considerations and from the first experiments made with this compound, the scattering density of the gold cluster is high enough to outweigh the contribution, to the EM image, by a projected thick (200−300 Å) protein mass. Because of the presence of the strong amplitude component, which is transferred by cos γ, where γ is the wave aberration function (see section 3.4), the Nanogold cluster stands out as a sharp density peak even in (and, as it turns out, especially in) low-defocus cryo-EM images where the boundaries of macromolecules are virtually invisible (Wagenknecht et al., 1994). The usual highly underfocused images show the cluster somewhat blurred but still as prominent “blobs” superimposed on the molecule. Applications of this method are steadily increasing (e.g., Milligan et al., 1990; Boisset et al., 1992a; Braig et al., 1993; Wagenknecht et al., 1994; Wenzel and Baumeister, 1995; Montesano-Roditis et al., 2001). Boisset and coworkers (1992, 1994a) used the Nanogold cluster to determine the site of thiol ester bonds in human α2-macroglobulin in three dimensions. The cluster stands out as a “core” of high density in the center of the macromolecular complex. Wagenknecht et al. (1994) were able to determine the calmodulin-binding sites on the calcium release channel/ryanodine receptor. Using site-directed Nanogold labeling, Braig et al. (1993) succeeded in mapping the substrate protein to the cavity of GroEL.
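The visibility argument can be made quantitative: phase information is transferred by sin γ and amplitude information by cos γ, and near focus, at low spatial frequencies, γ is close to zero, so sin γ → 0 while cos γ → 1. A small numerical sketch (all parameter values are illustrative, not taken from the text):

```python
import numpy as np

# Why an electron-dense gold cluster remains visible near focus while
# weak-phase detail fades: phase contrast is carried by sin(gamma),
# amplitude contrast by cos(gamma).  Illustrative 100 kV-like values.
lam, cs = 0.037, 2.0e7          # wavelength and Cs, in Angstrom
k = 0.02                        # a low spatial frequency, 1/A

for dz in (500.0, 10000.0):     # near focus vs 1 um underfocus, in A
    gamma = 2 * np.pi * (cs * lam**3 * k**4 / 4 - dz * lam * k**2 / 2)
    phase_transfer = abs(np.sin(gamma))   # ~0 near focus
    amp_transfer = abs(np.cos(gamma))     # ~1 near focus
```

At the near-focus setting, the amplitude channel transfers almost fully while the phase channel is nearly extinguished, which is why the strongly scattering cluster stands out even when the weak-phase boundaries of the macromolecule are invisible.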

## 2.8. Support Grids

Carbon films with holes 1–2μm in diameter, placed on 300 mesh copper or molybdenum grids, provide support and stability to the specimen. The sample (p.34) is either directly deposited onto the holey carbon or onto an additional thin but continuous layer of carbon placed onto the holey carbon.

Figure 2.9 EM grid square (60 μm × 60 μm) with Quantifoil film bearing a frozen-hydrated specimen of ribosomes, at 712 × magnification. (a) CCD image. (b) Same as (a), with marks of the data acquisition program indicating the positions of high-quality data. (From Lei and Frank, unpublished results.)

For the “manual” preparation of the holey carbon films (Baumeister and Seredynski, 1976; Baumeister and Hahn, 1978; Toyoshima, 1989), formvar is dissolved in chloroform, and then both acetone and glycerin are added. Sonication of the resulting mixture for 5 min will produce glycerin droplets of the correct size range within 30 min. The formvar is applied to a clean glass slide and dried. Glycerin droplets leave holes in the dried film. This holey formvar film is floated off the slide and transferred onto EM grids. Grids are then coated with carbon in an evaporator. The formvar is removed by placing the grids overnight in contact with filter paper soaked with chloroform or amyl acetate.

The introduction of regular, perforated carbon support foils (Ermantraut et al., 1998), now commercially produced under the trademark Quantifoil (Jena, Germany), has greatly aided in making data collection more uniform, reproducible, and convenient. These foils are spanned across EM grids and have circular holes arranged on a regular lattice (Figure 2.9), facilitating automated data collection (Potter et al., 1999; Carragher et al., 2000; Zhang et al., 2001; Lei and Frank, unpublished data).

# 3. Principle of Image Formation in the Transmission Electron Microscope

## 3.1. Introduction

Image formation in the electron microscope is a complex process; indeed, it is the subject of separate books (e.g., Spence, 1988; Reimer, 1997). In the later chapters of this volume, there will be frequent references to the contrast transfer function (CTF) and its dependence on the defocus. It is important to understand (p.35) the principles of the underlying theory for two reasons. First, the image is not necessarily a faithful representation of the object’s projection (see Hawkes, 1992), and hence the same can be said for the relationship between the 3D reconstruction computed from such images and the 3D object that it is supposed to represent. It is therefore important to know the imaging conditions that lead to maximum resemblance, as well as the types of computational correction (see section 3.9 in this chapter and section 9 in chapter 5) that are needed to recover the original information. Second, the contrast transfer theory is only an approximation to a comprehensive theory of image formation (see Rose, 1984; Reimer, 1997), and attains its simplicity by ignoring a number of effects whose relative magnitudes vary from one specimen to another. An awareness of these “moving boundaries” of the theory is required to avoid incorrect interpretations, especially when the resolution of the reconstruction approaches the atomic realm.

However, some of these sections may provide detail unnecessary for practitioners of routine methods, and are marked accordingly (*) for judicious skipping.

## 3.2. The Weak-Phase Object Approximation*

The basis of image formation in the electron microscope is the interaction of the electrons with the object. We distinguish between elastic and inelastic scattering. The former involves no transfer of energy; it has a fairly wide angular distribution and gives rise to high-resolution information. The latter involves transfer of energy; its angular distribution is narrow, and it produces an undesired background term in the image. Because this term is of low resolution (as implied by the narrow angular distribution), it is normally tolerated, although it interferes with the quantitative interpretation of the image (see also section 4.3 on energy filtering).

In the wave-optical picture, the “elastic” scattering interaction of the electron with the object is depicted as a phase shift Φ(r) of the incoming wave traveling in the z-direction by

(2.1)
$\Phi(\mathbf{r}) = \sigma \int C(\mathbf{r}, z)\, \mathrm{d}z$
where r is a 2D vector, which we will write as a column vector, $\mathbf{r} = [x, y]^T$; σ is the interaction constant; and C(r, z) is the 3D Coulomb potential distribution within the object. Thus, the incoming plane wave Ψ0 = exp(ikz) is modified according to (figure 2.10)
(2.2)
$\Psi(\mathbf{r}) = \Psi_0 \exp[i\Phi(\mathbf{r})]$
The weak-phase approximation assumes that Φ(r) ≪ 1, enabling the expansion
(2.3)
$\exp[i\Phi(\mathbf{r})] = 1 + i\Phi(\mathbf{r}) - \tfrac{1}{2}\Phi^2(\mathbf{r}) - \cdots$
(p.36) which is normally truncated after the second term. Note that this form implies a decomposition of the wave behind the object into an “unmodified” or “unscattered wave” (term 1), and a “scattered wave” [terms iΦ(r) and following]. If we disregard the higher-order terms, then we can say that the scattered wave is out of phase by 90° with respect to the unscattered wave.
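The quality of this truncation can be checked numerically: the residual of the two-term expansion is of second order in Φ. A minimal sketch, with an arbitrarily chosen small phase shift:

```python
import numpy as np

# Numerical check of the weak-phase expansion, equation (2.3): for a
# small phase shift phi, the exit wave exp(i*phi) is well approximated
# by the first two terms, 1 + i*phi (unscattered plus scattered wave).
phi = 0.05                      # radians; "weak" means phi << 1

exact = np.exp(1j * phi)        # full phase-object factor, equation (2.2)
linear = 1 + 1j * phi           # truncated expansion

residual = abs(exact - linear)  # dominated by the neglected term phi**2/2
```

For phi = 0.05 the residual is about phi²/2 ≈ 0.00125, i.e., some 40 times smaller than the retained scattered-wave term.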

Figure 2.10 A pure phase object and its effect on an incoming plane wave. The phase of the incoming wave is locally shifted by an amount that is proportional to the integral of the potential distribution in the direction of the wave’s propagation. Here, the object is shown as a globule embedded in another medium, such as ice. The incoming wave traveling in the z-direction is uniformly shifted everywhere except at those x, y positions where it has intersected the globular object as well. In those regions, the shift is proportional to the thickness traversed. This has the effect that the exit wave front is deformed. The phase shift shown here is very large, on the order of the wavelength (the distance between successive crests in this drawing). However, note that for a weak phase object, the shift is quite small compared to the wavelength. (Drawing by Michael Watters.)

The Fraunhofer approximation of the diffraction theory (Goodman, 1968) is obtained by assuming that the observation is made in the far distance from the object and close to the optical axis—assumptions that are always fulfilled in the imaging mode of the transmission electron microscope. In that approximation, the wave function in the back focal plane of the objective lens is—in the absence of aberrations—the Fourier transform of equation (2.2) or of the approximated expression in equation (2.3). However, the lens aberrations and the defocusing have the effect of shifting the phase of the scattered wave by an amount expressed by the term

(2.4)
$\exp[-i\chi(\mathbf{k})]$
which is dependent on the coordinates in the back focal plane, which are in turn proportional to the scattering angle and the spatial frequency, k = (kx, ky). The term χ(k) is called the wave aberration function (Figure 2.11). In a polar coordinate system, it takes the form
(2.5)
$\chi(k, \varphi) = \frac{2\pi}{\lambda}\left\{\frac{C_s \lambda^4 k^4}{4} - \left[\Delta z + z_a \sin 2(\varphi - \varphi_0)\right]\frac{\lambda^2 k^2}{2}\right\}$
(p.37) where λ is the electron wavelength; Δz, the defocus of the objective lens; za, the focal difference due to axial astigmatism; φ0, the reference angle defining the azimuthal direction of the axial astigmatism; and Cs, the third-order spherical aberration constant.
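For orientation, the rotationally symmetric part of the wave aberration function can be evaluated numerically. In this sketch the defocus and Cs values are illustrative choices; only the electron wavelength at 100 kV matches the value quoted in section 1:

```python
import numpy as np

# Rotationally symmetric part of the wave aberration function,
#     chi(k) = 2*pi*(Cs*lam**3*k**4/4 - dz*lam*k**2/2),
# evaluated with the relativistic de Broglie wavelength.

def electron_wavelength(v_acc):
    """Relativistic electron wavelength in Angstrom; v_acc in volts."""
    h, m0, e, c = 6.62607e-34, 9.10938e-31, 1.60218e-19, 2.99792e8
    p = np.sqrt(2 * m0 * e * v_acc * (1 + e * v_acc / (2 * m0 * c**2)))
    return (h / p) * 1e10

def chi(k, lam, dz, cs):
    """Wave aberration (radians); k in 1/A, defocus dz and Cs in A."""
    return 2 * np.pi * (cs * lam**3 * k**4 / 4 - dz * lam * k**2 / 2)

lam = electron_wavelength(100e3)   # ~0.037 A, as quoted in section 1
dz, cs = 2.0e4, 2.0e7              # 2 um underfocus, Cs = 2 mm, in A
```

With this sign convention, underfocus (positive Δz) makes χ negative at low spatial frequencies, the defocus term dominating until the k⁴ spherical aberration term takes over.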

Figure 2.11 Wave aberration function of the transmission electron microscope. The curves give the function $\sin\chi(\Delta\hat{z};\hat{k})$, with $\chi(\Delta\hat{z};\hat{k})/\pi = \hat{k}^4/2 - \Delta\hat{z}\,\hat{k}^2$, as a function of the generalized spatial frequency $\hat{k} = k(C_s\lambda^3)^{1/4}$ for different choices of the generalized defocus $\Delta\hat{z} = \Delta z/(C_s\lambda)^{1/2}$. From Hawkes and Kasper (1994), reproduced with permission of Elsevier.

An ideal lens will transform an incoming plane wave into a spherical wave front converging into a single point on the back focal plane (figure 2.12a). Lens aberrations have the effect of deforming the spherical wave front. In particular, the spherical aberration term acts in such a way that the outer zones of the wave front are curved more strongly than the inner zones, leading to a decreased focal length in those outer zones (figure 2.12b).

In summary, the wave function in the back focal plane of the objective lens, Ψbf(k), can be written as the Fourier transform of the wave function Ψ(r) immediately behind the object, multiplied by a term that represents the effect of the phase shift due to the lens aberrations:

(2.6)
$\Psi_{bf}(\mathbf{k}) = \Im\{\Psi(\mathbf{r})\}\, \exp[-i\chi(\mathbf{k})]$

Here and in the following, the symbol ℑ{·} will be used to denote the Fourier transformation of the argument function, and ℑ⁻¹{·} its inverse. Also, as a notational convention, we will use lowercase letters to refer to functions in real (p.38) space and the corresponding capital letters to refer to their Fourier transforms. Thus, ℑ{h(r)} = H(k), etc.

Figure 2.12 The effect of the spherical aberration. (a) In the ideal lens (Cs = 0), an incoming plane wave is converted into a spherical wave converging in the focal plane (here in the focal point fi on the optic axis). (b) In a real lens with finite Cs, the focal length is smaller for the peripheral regions of the lens than for the central regions; hence, the parts of the plane wave intersecting the periphery are focused in a point fo closer to the lens. (Drawing by Michael Watters.)

Next, the wave function in the image plane is obtained from the wave in the back focal plane, after modification by an aperture function A(k), through an inverse Fourier transformation:

(2.7)
$\Psi_i(\mathbf{r}) = \Im^{-1}\{\Psi_{bf}(\mathbf{k})\, A(\mathbf{k})\}$
(2.8)
$A(\mathbf{k}) = \begin{cases} 1 & \text{for } |\mathbf{k}| \leq \theta_1/\lambda \\ 0 & \text{elsewhere} \end{cases}$
where θ1 is the angle corresponding to the radius of the objective aperture. Finally, the observed intensity distribution in the image plane (ignoring irrelevant scaling factors) is
(2.9)
$I(\mathbf{r}) = |\Psi_i(\mathbf{r})|^2$

If the expansion of equation (2.3) is broken off after the second term (“weak phase object approximation”), we see that the image intensity is dominated by a term that results from the interference of the “unmodified wave” with the “scattered wave.” In this case, the imaging mode is referred to as bright-field electron microscopy. If the unmodified wave is blocked off in the back focal plane, we speak of dark-field electron microscopy.

Bright-field EM is the imaging mode most frequently used by far. Because of the dominance of the term that is linear in the scattered wave amplitude, bright-field EM has the unique property that it leads to an image whose contrast is—to a first approximation—linearly related to the projected object potential. The (p.39) description of the relationship between observed image contrast and projected object potential, and the way this relationship is influenced by electron optical parameters, is the subject of the contrast transfer theory (see Hanszen, 1971; Lenz, 1971; Spence, 1988; Hawkes, 1992; Wade, 1992; Hawkes and Kasper, 1994). A brief outline of this theory is presented in the following section.

## 3.3. The Contrast Transfer Theory*

### 3.3.1. The Phase Contrast Transfer Function

If we (i) ignore terms involving higher than first orders in the projected potential Φ(r) and (ii) assume that Φ(r) is real, equation (2.9) yields a linear relationship between O(k) = ℑ{Φ(r)} and the Fourier transform of the image contrast, ℑ{I(r)}:

(2.10)
$\Im\{I(\mathbf{r})\} = O(\mathbf{k})\, A(\mathbf{k}) \sin \gamma(\mathbf{k})$
[A factor of 2 following the derivation of equation (2.10) is omitted here (cf. Frank, 1996), to allow for a convenient definition of the CTF.] Mathematically, the appearance of a simple scalar product in Fourier space, in this case with the factor H(k) = A(k) sin γ(k), means that in real space the image is related to the projected potential by a convolution operation:
(2.11)
$I(\mathbf{r}) = \Phi(\mathbf{r}) \circ h(\mathbf{r})$
(2.12)
$h(\mathbf{r}) = \Im^{-1}\{H(\mathbf{k})\}$
where h(r) is called the point spread function. [The notation using the symbol “∘” (e.g., in Goodman, 1968) is sometimes practical when expressions involving multiple convolutions need to be evaluated.]
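The linear transfer relation can be verified numerically in one dimension by simulating the full image formation chain and comparing the intensity spectrum with the prediction of equation (2.10), including the factor of 2 noted above. The toy aberration coefficients and the object are arbitrary; units are pixels:

```python
import numpy as np

# 1D check of the linear contrast transfer relation, equations
# (2.10)-(2.11): a weak phase object is propagated through an
# aberration-only lens, and the spectrum of the resulting intensity
# is compared with the linear prediction 2*sin(gamma(k))*O(k).
n = 512
x = np.arange(n)
phi = 0.01 * np.exp(-((x - n / 2) ** 2) / (2 * 20.0**2))  # weak phase object

k = np.fft.fftfreq(n)                             # spatial frequency, 1/pixel
gamma = 2 * np.pi * (40.0 * k**4 - 800.0 * k**2)  # toy wave aberration

psi_exit = np.exp(1j * phi)                       # exit wave, equation (2.2)
psi_img = np.fft.ifft(np.fft.fft(psi_exit) * np.exp(-1j * gamma))
intensity = np.abs(psi_img) ** 2                  # image, equation (2.9)

measured = np.abs(np.fft.fft(intensity - 1.0))
predicted = np.abs(2.0 * np.sin(gamma) * np.fft.fft(phi))

# the residual is quadratic in phi, hence small relative to the peak signal
err = np.max(np.abs(measured - predicted)) / np.max(predicted)
```

The agreement degrades, as expected, when phi is made larger and the weak-phase condition Φ ≪ 1 is violated.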

The function sin γ(k) (figures 2.13a and 2.13e) is called the phase contrast transfer function (CTF). It is characterized, as k = |k| increases, by a succession of alternately positive and negative zones, which are rotationally symmetric provided that the axial astigmatism is fully compensated [i.e., za = 0 in equation (2.5)]. The zones have elliptic or more complicated shapes for za ≠ 0.
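The positions of the zone boundaries follow from the zeros of sin γ(k), i.e., from γ(k) = nπ. A short numerical sketch, with illustrative 200 kV-like parameters (all lengths in Ångstrom) and za = 0, so that k = |k| suffices:

```python
import numpy as np

# Locating the alternating positive and negative transfer zones of the
# CTF from the sign changes of sin(gamma(k)).  Illustrative parameters.
lam, cs, dz = 0.0251, 2.0e7, 1.5e4   # wavelength, Cs (2 mm), defocus (1.5 um)

k = np.linspace(1e-4, 0.35, 20000)   # spatial frequency, 1/A
gamma = 2 * np.pi * (cs * lam**3 * k**4 / 4 - dz * lam * k**2 / 2)
ctf = np.sin(gamma)

# sign changes of the CTF mark the boundaries between transfer zones
zeros = k[1:][np.sign(ctf[1:]) != np.sign(ctf[:-1])]
```

With these values the first zero falls near 0.05 Å⁻¹; that is, contrast transfer first vanishes at a resolution of roughly 20 Å, with many further zone boundaries at higher spatial frequencies.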

### 3.3.2. Partial Coherence Effects: The Envelope Function

In equation (2.10), which is derived assuming completely coherent illumination with monochromatic electrons (i.e., electrons all having the same energy), the resolution is limited by the aperture function. In the absence of an aperture, information is transferred out to high spatial frequencies, even though the increasingly rapid oscillations of the CTF make it difficult to exploit that information. In practice, however, the illumination has both finite divergence (or, in other words, the source size is finite) and a finite energy spread. The resulting partial coherence dampens the CTF as we go toward higher spatial frequencies, and ultimately limits the resolution. The theoretical treatment of (p.40) these phenomena is somewhat complicated, and the resulting integrals (e.g., Hanszen and Trepte, 1971a, 1971b; Rose, 1984) make it difficult to gauge the effects of changing defocus, illumination divergence, or energy spread. Approximate envelope representations have been developed (energy spread: Hanszen, 1971; partial spatial coherence: Frank, 1973a; Wade and Frank, 1977), which have the advantage that they reveal the influence of these parameters in a mathematically easily tractable form:

(2.13)
$\Im\{I(\mathbf{r})\} = O(\mathbf{k})\, A(\mathbf{k})\, E(\mathbf{k}) \sin \gamma(\mathbf{k})$
where E(k) is the “compound envelope function”:
(2.14)
$E(\mathbf{k}) = E_i(\mathbf{k})\, E_e(\mathbf{k})$
consisting of the term $E_i(k)$, the envelope function due to partially coherent illumination, and the term $E_e(k)$, the envelope function due to energy spread. (p.41) (For simplicity, only the radial dependence is considered here. It is straightforward to write down the full expression containing the polar-coordinate dependency in the case za ≠ 0.) The effect of the partially coherent illumination alone is shown in figures 2.13b–d and 2.13f–h: increasing the source size (described by the parameter $q_0$, a quantity of dimension 1/length specifying the size of the source as it appears in the back focal plane, or by the generalized parameter $\hat{q}_0$ that will be introduced in the following section) is seen to dampen the high spatial frequency range increasingly. The range of validity of this product representation has been explored by Wade and Frank (1977). The first term is
$E_i(k) = \exp\left[-\pi^2 q_0^2\,(C_s\lambda^3 k^3 - \Delta z\,\lambda k)^2\right]$ (2.15)
for a Gaussian source distribution, and
$E_i(k) = \dfrac{2\,J_1\!\left[2\pi q_0\,(C_s\lambda^3 k^3 - \Delta z\,\lambda k)\right]}{2\pi q_0\,(C_s\lambda^3 k^3 - \Delta z\,\lambda k)}$ (2.16)
for a “top hat” distribution (Frank, 1973a), with $J_1$ denoting the first-order Bessel function. The recurring argument in both expressions, $C_s\lambda^3 k^3 - \Delta z\,\lambda k$, is the gradient of the wave aberration function (Frank, 1973a). It is evident from equations (2.15) and (2.16) that $E_i(k) = 1$ wherever this gradient vanishes.
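As a numerical illustration, the two envelope forms can be sketched as follows. The code assumes the commonly used expressions $\exp[-\pi^2 q_0^2 (C_s\lambda^3 k^3 - \Delta z\,\lambda k)^2]$ for a Gaussian source and $2J_1(x)/x$, with $x = 2\pi q_0 (C_s\lambda^3 k^3 - \Delta z\,\lambda k)$, for a top-hat source; all parameter values (100 kV, $C_s$ = 2 mm, 1 μm underfocus) are merely illustrative:

```python
import numpy as np
from scipy.special import j1  # first-order Bessel function

# Illustrative electron-optical parameters, in Angstrom units
LAM = 0.037   # wavelength at 100 kV
CS = 2.0e7    # spherical aberration, 2 mm
DZ = 1.0e4    # underfocus, 1 um

def aberration_gradient(k, cs=CS, lam=LAM, dz=DZ):
    """The recurring argument Cs*lam^3*k^3 - dz*lam*k (the gradient of
    the wave aberration function, up to a factor of 2*pi)."""
    return cs * lam**3 * k**3 - dz * lam * k

def envelope_gaussian(k, q0, **kw):
    """Partial-coherence envelope for a Gaussian source (assumed form)."""
    g = aberration_gradient(k, **kw)
    return np.exp(-(np.pi * q0 * g) ** 2)

def envelope_tophat(k, q0, **kw):
    """Partial-coherence envelope for a "top hat" source (assumed form):
    2*J1(x)/x with x = 2*pi*q0*(Cs*lam^3*k^3 - dz*lam*k)."""
    x = np.asarray(2.0 * np.pi * q0 * aberration_gradient(k, **kw), dtype=float)
    safe = np.where(np.abs(x) < 1e-9, 1.0, x)
    return np.where(np.abs(x) < 1e-9, 1.0, 2.0 * j1(safe) / safe)
```

Both functions equal unity wherever the aberration gradient vanishes, i.e., at $k^2 = \Delta z/(C_s\lambda^2)$, in line with the remark above.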

Figure 2.13 The influence of a finite illumination angle on the contrast transfer function (CTF) for two defocus values, $\hat{\Delta z} = 1$ (a–d) and $\hat{\Delta z} = 3$ (e–h). (a, e) Undamped CTF; (b, f) $\hat{q}_0 = 0.05$; (c, g) $\hat{q}_0 = 0.1$; (d, h) $\hat{q}_0 = 0.5$. (For the definition of the generalized defocus and source parameters $\hat{\Delta z}$ and $\hat{q}_0$, see section 3.) From Hawkes and Kasper (1994), reproduced with permission of Elsevier.

The envelope due to the energy spread is (Hanszen, 1971; Wade and Frank, 1977)

$E_e(k) = \exp\left[-\left(\tfrac{1}{2}\pi\,\delta z\,\lambda k^2\right)^2\right]$ (2.17)
where δz is the defocus spread due to lens current fluctuations and chromatic aberration in the presence of energy spread. The distinguishing feature of $E_i(k)$ is that it is defocus dependent, whereas $E_e(k)$ is independent of defocus. The combined effect of the two envelopes can be conceptualized as the action of two superimposed virtual apertures, one of which changes with changing defocus and one of which remains constant. For example, if $E_i(k)$ cuts the transmission of information off at $k_1$ and $E_e(k)$ cuts it off at $k_2 < k_1$, then $E_i(k)$ has no effect whatsoever. From this “product veto rule,” an important clue can be derived by looking at the power spectra of a defocus series: if the band limit produced by $E(k) = E_i(k)E_e(k)$ is independent of defocus in a given defocus range, then the energy spread is the limiting factor in that range. In the case where the band limit is observed to be defocus dependent, one part of the defocus range may be dependent, and thus ruled by $E_i(k)$, and the other part may be independent, and limited by $E_e(k)$. In instruments with a field-emission gun, which are increasingly used, both illumination spread and energy spread are quite small, and so both envelopes $E_i(k)$ and $E_e(k)$ remain close to unity up to high spatial frequencies, offering the opportunity for reaching atomic resolution (O’Keefe, 1992; Zhou and Chiu, 1993).
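The product veto rule can be demonstrated numerically with stand-in envelope curves (the particular Gaussian shapes and the 1/e² cutoff criterion below are our choices, not from the text): the compound band limit is always set by the more restrictive of the two envelopes.

```python
import numpy as np

def band_limit(envelope, k, threshold=np.exp(-2)):
    """First spatial frequency at which an envelope drops below a
    conventional cutoff (1/e^2 here); returns k[-1] if it never does."""
    below = np.nonzero(envelope < threshold)[0]
    return k[below[0]] if below.size else k[-1]

k = np.linspace(1e-4, 0.5, 5000)
e_i = np.exp(-(k / 0.30) ** 2)  # stand-in, defocus-dependent illumination envelope
e_e = np.exp(-(k / 0.15) ** 4)  # stand-in, defocus-independent chromatic envelope

k1 = band_limit(e_i, k)             # cutoff of E_i alone
k2 = band_limit(e_e, k)             # cutoff of E_e alone
k_band = band_limit(e_i * e_e, k)   # cutoff of the compound envelope
```

Here $k_2 < k_1$, so the chromatic envelope "vetoes" the information beyond $k_2$ regardless of how gentle the illumination envelope is.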

In figure 2.14, the effect of the partial coherence envelope function on the band limit is illustrated for a modern instrument equipped with a field emission (p.42) source. Even though the instrument is built for high coherence, incorrect settings of the condenser system will result in damping by the envelope function, and thus diminished resolution.

Figure 2.14 Effect of partial coherence on the resolution, as reflected by the extent of Thon rings in the diffraction pattern. The experimental conditions under which the images are taken differ by the “spot size” setting of the Condenser I lens (see section 1), which controls the demagnification of the physical source produced by this lens. A small image of the source (or small numerical setting of CI) results in high coherence, i.e., little damping of the information in Fourier space by the envelope function, or large extent of Thon rings, meaning high resolution. (a) CI setting = 6; (b) CI setting = 7; (c) CI setting = 8. (Images were of a cryo-specimen of ribosomes, taken on an F30 Polara transmission electron microscope operated at liquid-nitrogen temperature and a voltage of 200 kV. Power spectra were calculated using the method described in section 3.6 of this chapter. Data were provided by Bob Grassucci.)

Finally, it should be mentioned that, unless compensated, axial astigmatism, which was left out in equations (2.15)–(2.17) for notational convenience, will create an azimuthal dependence of the effective band limit through the action of the defocus-dependent illumination envelope, which then has to be written as a function of a vector argument, namely as $E_i(\mathbf{k})$.

### 3.3.3. The Contrast Transfer Characteristics

When the value of $\sin\gamma(k)$ is plotted as a function of both spatial frequency k and defocus Δz, we obtain a pattern called the contrast transfer characteristics (Thon, 1971) of the electron microscope. If we again ignore the effect of axial astigmatism, the characteristics are determined entirely by the values of the remaining parameters in equation (2.5), namely λ (electron wavelength) and $C_s$ (third-order spherical aberration coefficient). Following the convention introduced by Hanszen and Trepte (1971a, 1971b), we can introduce dimensionless variables (see Frank, 1973a):

$\hat{k} = k\,(C_s\lambda^3)^{1/4}$ (2.18)
(p.43) and
$\hat{\Delta z} = \Delta z\,(C_s\lambda)^{-1/2}$ (2.19)
For completeness, the generalized source size will also be introduced here:
$\hat{q}_0 = q_0\,(C_s\lambda^3)^{1/4}$ (2.19a)
With these generalized variables, one obtains the standard characteristics, which are independent of voltage and the value of the third-order spherical aberration constant, and hence are the same for all transmission electron microscopes:
$\sin\gamma(\hat{\Delta z}; \hat{k}) = \sin\left[2\pi\left(\frac{\hat{k}^4}{4} - \frac{\hat{\Delta z}\,\hat{k}^2}{2}\right)\right]$ (2.20)
This important diagram is shown in Figure 2.15. The use of generalized coordinates means that this diagram is universal for all transmission electron microscopes. It makes it possible to determine the optimum defocus setting required to bring out features of a certain size range, or to gauge the effect of axial astigmatism.

Features of a size range between $d_1$ and $d_2$ require the transmission of a spatial frequency band between $1/d_1$ and $1/d_2$. One obtains the defocus value, or value range, for which optimum transmission of this band occurs by constructing the intersection between the desired frequency band and the contrast transfer zones (Figure 2.15). The effect of axial astigmatism can be gauged by moving back and forth along the defocus axis by the amount $z_a/2$ from a given defocus position, bearing in mind that this movement is controlled by the azimuthal angle φ according to the behavior of the function sin(2φ): going through the full 360° azimuthal range leads to two complete oscillations of the effective defocus value around the nominal value. Evidently, small values of astigmatism lead to an elliptic appearance of the contrast transfer zones, whereas large values may cause the defocus to oscillate beyond the boundaries of one or several zones, producing quasi-hyperbolic patterns.
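The intersection construction can be sketched numerically, assuming the conventional generalized phase $\gamma = 2\pi(\hat{k}^4/4 - \hat{\Delta z}\,\hat{k}^2/2)$; the frequency band and defocus grid below are arbitrary:

```python
import numpy as np

def ctf_generalized(k_hat, dz_hat):
    """sin(gamma) in dimensionless (generalized) coordinates, assuming
    gamma = 2*pi*(k^4/4 - dz*k^2/2)."""
    return np.sin(2 * np.pi * (k_hat**4 / 4 - dz_hat * k_hat**2 / 2))

def best_defocus(k_lo, k_hi, dz_grid):
    """Scan the generalized defocus for the strongest mean |transfer|
    over the band [k_lo, k_hi] -- the intersection construction in
    numerical form."""
    band = np.linspace(k_lo, k_hi, 400)
    mean_abs = [np.mean(np.abs(ctf_generalized(band, dz))) for dz in dz_grid]
    return dz_grid[int(np.argmax(mean_abs))]

dz_grid = np.linspace(0.0, 3.0, 301)
dz_opt = best_defocus(0.8, 1.2, dz_grid)
```

For a band centered on $\hat{k} = 1$, the scan picks a generalized defocus near $\hat{\Delta z} = 1$, where the stationary point of γ sits inside the band and the transfer plateau covers it.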

### 3.3.4. The Effects of the Contrast Transfer Function

#### 3.3.4.1. Band-pass filtration

The main degrading effects of the underfocus CTF on the image, as compared to those of ideal contrast transfer [i.e., CTF(k) ≡ const. ≡ 1], can be described as a combined low-pass (i.e., resolution-limiting) and high-pass filtration, in addition to several reversals of contrast. The effective low-pass filtration results from the fact that, in the underfocus range (by convention, Δz > 0), the CTF typically has a “plateau” of relative constancy followed by rapid oscillations. In this situation, the high-frequency boundary of the plateau acts as a virtual band limit. The use of information transferred beyond that limit, in the zones of alternating contrast, requires some type of restoration, that is, CTF correction.

(p.44)

Figure 2.15 Representation of the CTF characteristics in the presence of partial coherence, $E(\hat{\Delta z}; \hat{k})\sin\gamma(\hat{\Delta z}; \hat{k})$, showing the effect of the envelope functions. The vertical axis is the spatial frequency, in generalized units $\hat{k}$, and the horizontal axis is the generalized defocus $\hat{\Delta z}$ [see equations (2.18) and (2.19) for the definition of both quantities]. The gray value (white = max.) is proportional to the value of the function. A section along a vertical line $\hat{\Delta z}$ = const. gives the CTF at that defocus, and allows the effective resolution to be gauged as the maximum spatial frequency (or upper transfer limit) beyond which no information transfer occurs. (a) CTF with $E(\hat{\Delta z}; \hat{k}) = 1$ (fully coherent case); (b) partially coherent illumination with the generalized source size $\hat{q}_0 = 0.5$ but zero defocus spread. In this case, the practical resolution is clearly defocus dependent. At high values of underfocus, not only is the resolution limited (upper boundary), but a central band is also eliminated by the effect of the envelope function. (c) Defocus spread $\hat{\delta z} = 0.125$ in generalized units, in the case of a point source. The practical resolution has no defocus dependence in this case. From Wade (1992), reproduced with permission of Elsevier.

(p.45) Information transferred outside the first zone is of little use in the image, unless the polarities of the subsequent, more peripheral zones are “flipped” computationally, so that a continuous positive or negative transfer behavior is achieved within the whole resolution domain. More elaborate schemes employ restoration such as Wiener filtering (Kübler et al., 1978; Welton, 1979; Lepault and Pitt, 1984; Jeng et al., 1989; Frank and Penczek, 1995; Penczek et al., 1997; Grigorieff, 1998), in which not only the polarity (i.e., the phase in units of 180°) but also the amplitude of the CTF is compensated throughout the resolution range based on two or more micrographs with different defocus values (see section 3.9 in this chapter and section 9 in chapter 5). The fidelity of the CTF correction, or its power to restore the original object, is limited by (i) the presence of resolution-limiting envelope terms, (ii) the accuracy to which the transfer function formalism describes image formation, and (iii) the presence of noise.
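The simplest restoration mentioned above—flipping the polarities of the more peripheral zones—can be sketched as follows (the optical parameters are illustrative and the object is a random stand-in image):

```python
import numpy as np

def ctf_1d(k, dz, lam=0.037, cs=2.0e7):
    """Phase-contrast CTF sin(gamma); k in 1/Angstrom, dz > 0 = underfocus."""
    gamma = 2 * np.pi * (cs * lam**3 * k**4 / 4 - dz * lam * k**2 / 2)
    return np.sin(gamma)

def phase_flip(image, ctf2d):
    """Multiply the image transform by sign(CTF): every zone's polarity
    is flipped to a single sign; amplitudes are left uncorrected."""
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.sign(ctf2d)))

n = 64
fy, fx = np.meshgrid(np.fft.fftfreq(n), np.fft.fftfreq(n), indexing="ij")
k2d = np.sqrt(fx**2 + fy**2) / 2.0   # physical frequency for a 2 A pixel
ctf2d = ctf_1d(k2d, dz=2.0e4)

img = np.random.default_rng(0).normal(size=(n, n))        # stand-in object
degraded = np.real(np.fft.ifft2(np.fft.fft2(img) * ctf2d))
restored = phase_flip(degraded, ctf2d)
```

After flipping, the transform of the restored image equals the object transform multiplied by |CTF|—continuous single-signed transfer, but with the amplitude modulation (and the zeros) still in place, which is why Wiener-type restoration goes further.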

The well-known band-pass filtering effect of the CTF is a result of the CTF having a small value over an extended range of low spatial frequencies. The effect of this property on the image of a biological particle is that the particle as a whole does not stand out from the background, but its edges are sharply defined, and short-range interior density variations are exaggerated.

In figure 2.16, a CTF has been applied to a motif for a demonstration of the contrast reversal and edge enhancement effects. One is immediately struck by the “ghost-like” appearance of the familiar object and the diminished segmentation, by density, between different parts of the scene and the background. This effect is caused by the virtual absence of low-resolution information in the Fourier transform. Similarly, low-contrast objects such as single molecules embedded in ice are very hard to make out in the image, unless a much higher defocus (e.g., in the range 3–6 μm, or 30,000–60,000 Å) is used, which forces the CTF to rise swiftly at low spatial frequencies.

#### 3.3.4.2. Point-spread function

Another way of describing the effects of the CTF (or any filter function) is by the appearance of the associated point-spread function, which is the Fourier transform of the CTF and describes the way a single point of the object would be imaged by the electron microscope. Generally, the more closely this function resembles a delta function, the more faithful is the image to the object. (Simple examples of filter functions and associated point-spread functions are provided in appendix 2.) In the practically used range of defocus, the typical point-spread function has a central maximum that is sometimes barely higher than the surrounding maxima, and it might extend over a sizable area of the image. The long-range oscillations of the point-spread function are responsible for the “ringing effect,” that is, the appearance of Fresnel fringes along the borders of the object.
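A minimal sketch of the point-spread function as the Fourier transform of a one-dimensional CTF (the sampling and optical parameters are illustrative):

```python
import numpy as np

# 1D CTF sampled on an FFT frequency grid (2 A pixel, illustrative optics)
n = 256
k = np.fft.fftfreq(n, d=2.0)  # 1/Angstrom
lam, cs, dz = 0.037, 2.0e7, 1.5e4
gamma = 2 * np.pi * (cs * lam**3 * k**4 / 4 - dz * lam * k**2 / 2)
ctf = np.sin(gamma)

# The point-spread function is the Fourier transform of the CTF;
# sin(gamma) is real and even in k, so the PSF comes out real.
psf = np.real(np.fft.fftshift(np.fft.ifft(ctf)))
```

Since the CTF vanishes at k = 0, the PSF values must sum to zero, which forces the long-range sign oscillations responsible for the ringing described above.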

#### 3.3.4.3. Signal-to-noise ratio

To consider the effect of the CTF on the SNR, we make use of Parseval’s theorem (chapter 3, section 4.3) to express the signal variance:
$\mathrm{SNR} = \dfrac{1}{\mathrm{var}(N)} \displaystyle\int_B A^2(k)\,E^2(k)\,\sin^2\gamma(k)\,d\mathbf{k}$
(p.46) where A(k) is the radial amplitude falloff of the object’s transform, E(k) is the envelope function, and var(N) is the noise variance. The squaring in the integrand means that the SNR is insensitive to the degree of fidelity by which the signal is rendered. For instance, any flipping of phases in a spatial frequency band of the CTF has no effect on the signal variance and, by extension, on the SNR. It is easy to see that the SNR oscillates as a function of defocus (Frank and Al-Ali, 1975), since extra lobes of the CTF migrate into the integration area as Δz decreases (i.e., as the underfocus becomes larger). Most pronounced is the effect of the SNR variation when low underfocus values are considered. The slow start of sin γ(k) for low underfocus values will mean that a substantial portion of the Fourier transform in the low spatial frequency region makes no contribution to the SNR. Low-defocus micrographs are, therefore, notorious for poor visibility of unstained particles embedded in a layer of ice.

Figure 2.16 Demonstration of the effect of an electron microscopic transfer function on a real-life motif. (a) The motif. (b) CTF with the profile shown in (d). The first, innermost transfer interval conveys negative contrast, the following transfer interval positive contrast. (c) Motif after application of the CTF. Compared to the original, the distorted image is characterized by three features: (i) inversion of contrast; (ii) diminished contrast of large regions; (iii) edge enhancement (each border is now sharply outlined). With higher defocus, an additional effect becomes apparent: the appearance of fringes (Fresnel fringes) with alternating contrast along borders. As a rule, the resolution of the degraded image is determined by the position of the first zero of the CTF. (e) Degraded image (c) after contrast inversion. (From N. Boisset, unpublished lecture material.)

For a “white” signal spectrum without amplitude falloff [i.e., $A(k) \equiv 1$ throughout], the SNRs of two images with defocus settings $\Delta z_1$ and $\Delta z_2$ are (p.47) related as (see Frank and Al-Ali, 1975)

$\dfrac{\mathrm{SNR}(\Delta z_1)}{\mathrm{SNR}(\Delta z_2)} = \dfrac{\int_B \sin^2\gamma(\Delta z_1; k)\,d\mathbf{k}}{\int_B \sin^2\gamma(\Delta z_2; k)\,d\mathbf{k}}$ (2.20a)

Note in passing that the integral over the squared CTF that appears in equation (2.20a) can be recognized as one of Linfoot’s criteria (Linfoot, 1964) characterizing the performance of an optical instrument (Frank, 1975b): it is termed structural content. Another of Linfoot’s criteria, the unsquared integral over the CTF, is termed correlation quality, as it reflects the overall correlation between an object with “white” spectrum and its CTF-degraded image.
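The integral of the squared CTF can be evaluated numerically (a sketch; the radius of the resolution domain and the defocus values are arbitrary, and a "white" signal spectrum with E(k) = 1 is assumed):

```python
import numpy as np

def structural_content(dz, lam=0.037, cs=2.0e7, k_max=0.25, n=4096):
    """Integral of the squared CTF over a 2D resolution domain of radius
    k_max (radial integration with the 2*pi*k weight); Linfoot's
    'structural content' for a white object spectrum."""
    k = np.linspace(0.0, k_max, n)
    gamma = 2 * np.pi * (cs * lam**3 * k**4 / 4 - dz * lam * k**2 / 2)
    return np.sum(np.sin(gamma)**2 * 2 * np.pi * k) * (k[1] - k[0])

# SNR ratio of two defocus settings for a white spectrum, as in eq. (2.20a)
snr_ratio = structural_content(2.5e4) / structural_content(0.5e4)
```

Scanning `structural_content` over a range of defocus values reproduces the oscillation of the SNR with defocus noted by Frank and Al-Ali (1975).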

## 3.4. Amplitude Contrast

Amplitude contrast of an object arises from a virtual—and locally changing—loss of electrons participating in the “elastic” image formation, either through electrons that are scattered outside the aperture or through those that are removed by inelastic scattering (see Rose, 1984). These amplitude components are, therefore, entirely unaffected by energy filtering. Rose writes in his account of information transfer: “It seems astonishing at first that a (energy-) filtered bright-field image, obtained by removing all inelastically scattered electrons from the beam, represents an elastic image superposed on an inelastic ‘shadow image.’ ” The shadow image that Rose is referring to is produced by amplitude contrast. Since the detailed processes are dependent on the atomic species, the ratio of amplitude to phase contrast is itself a locally varying function. Formally, the amplitude component of an object can be expressed by an imaginary component of the potential in equation (2.3). The Fourier transform of the amplitude component is transferred by $\cos\gamma(\mathbf{k})$, which, unlike the usual term $\sin\gamma(\mathbf{k})$, starts off with a maximal value at low spatial frequencies. The complete expression for the image intensity thus becomes

$I(\mathbf{k}) = O_r(\mathbf{k})\,\sin\gamma(\mathbf{k}) - O_i(\mathbf{k})\,\cos\gamma(\mathbf{k})$ (2.21)
where $O_r(\mathbf{k})$ and $O_i(\mathbf{k})$ are the Fourier transforms of the real (or weak phase) and imaginary (or weak amplitude) portions of the object, respectively (Erickson and Klug, 1970; Frank, 1972c; Wade, 1992). Equation (2.21) is the basis for heavy/light atom discrimination using a defocus series (Frank, 1972c, 1973b; Kirkland et al., 1980; Typke et al., 1992; Frank and Penczek, 1995), following an original idea by Schiske (1968). The reason is that the ratio between amplitude and phase contrast is greater for heavy atoms than for light atoms.

Only for a homogeneous specimen (i.e., a specimen that consists of a single species of atom; see Frank and Penczek, 1995) is it possible to rewrite equation (2.21) in the following way:

$I(\mathbf{k}) = O_r(\mathbf{k})\left[\sin\gamma(\mathbf{k}) - Q(\mathbf{k})\,\cos\gamma(\mathbf{k})\right]$ (2.22)
(p.48) Here, $Q(\mathbf{k}) = O_i(\mathbf{k})/O_r(\mathbf{k})$ is a function characteristic of each atomic species, but it is often assumed that $Q(\mathbf{k})$ is (i) the same for all atoms in the specimen and (ii) constant within the small spatial frequency range of practical interest in most cryo-EM applications (see Toyoshima and Unwin, 1988a, 1988b; Stewart et al., 1993). (A more accurate formulation is found in section 3.5.) With these approximations, it is again possible to speak of a single contrast transfer function:
$H'(\mathbf{k}) = \sin\gamma(\mathbf{k}) - Q_0\,\cos\gamma(\mathbf{k})$ (2.23)

Figure 2.17 Electron-optical contrast transfer function ($C_s$ = 2 mm) for a mixed phase/amplitude object ($Q_0$ = 0.15), for two defocus values: Δz = −0.9 μm (solid line) and Δz = −1.5 μm (dotted line). From Frank and Penczek (1995), reproduced with permission of Urban and Fischer Verlag.

Compared with the function obtained for a pure phase object, the function described by equation (2.23) has its zeros shifted toward higher radii (Figure 2.17). However, the most important change lies in the fact that, at low spatial frequencies, the transfer function starts off with a nonzero term, $H'(0) = -Q_0$. Thus, the cosine-transferred term mitigates the pronounced band-pass filtering effect produced by $\sin\gamma$, which is brought about by the deletion of Fourier components at low spatial frequencies.

The value of $Q_0$ is usually determined by recording a defocus series of the specimen and measuring the positions of the zeros of $H'(k)$ in the diffraction patterns of the micrographs. These diffraction patterns can be obtained either by optical diffraction or by computation (see section 3.6). Other measurements of $Q_0$ were made by following the amplitudes and phases of reflections in the computed Fourier transform of a crystal image as a function of defocus (Erickson and Klug, 1970), or by observing the lines of zero contrast transfer in optical diffraction patterns of strongly astigmatic images (Typke and Radermacher, 1982). Toyoshima and Unwin (1988a) and Toyoshima et al. (1993) obtained $Q_0$ measurements by comparing micrographs taken with equal amounts of underfocus and overfocus.
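The zero positions can be explored numerically, taking the mixed transfer function in the form $H'(k) = \sin\gamma(k) - Q_0\cos\gamma(k)$ (a sketch with illustrative parameters and a crude grid-based zero finder):

```python
import numpy as np

def mixed_ctf(k, dz, q0, lam=0.037, cs=2.0e7):
    """H'(k) = sin(gamma) - Q0*cos(gamma): phase contrast plus a constant
    amplitude-contrast fraction Q0 (parameters illustrative)."""
    gamma = 2 * np.pi * (cs * lam**3 * k**4 / 4 - dz * lam * k**2 / 2)
    return np.sin(gamma) - q0 * np.cos(gamma)

def zeros_of(f, k):
    """Locate sign changes of f on the grid k (crude zero finder)."""
    s = np.sign(f)
    idx = np.nonzero(s[:-1] * s[1:] < 0)[0]
    return k[idx]

k = np.linspace(1e-4, 0.2, 20000)
z_phase = zeros_of(mixed_ctf(k, 9.0e3, 0.00), k)  # pure phase object
z_mixed = zeros_of(mixed_ctf(k, 9.0e3, 0.15), k)  # with amplitude term
```

Comparing the two zero sets for a series of defocus values is, in essence, the fitting procedure by which $Q_0$ is extracted from measured zero positions.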

Averaged values of $Q_0$ for negatively stained (uranyl acetate) specimens on a carbon film range from 0.19 (Zhu and Frank, 1994) to 0.35 (Erickson and Klug, 1970, 1971). The wide range of these measurements reflects not only the presence of considerable experimental errors, but also variations in the relative amount and thickness of the stain compared to that of the carbon film. For specimens in ice, (p.49) the values range from 0.07 (ribosome: Zhu and Frank, 1994) to 0.09 (acetylcholine receptor: Toyoshima and Unwin, 1988a) and 0.14 (tobacco mosaic virus: Smith and Langmore, 1992).

## 3.5. Formulation of Bright-Field Image Formation Using Complex Atomic Scattering Amplitudes

A treatment of image formation that takes account of the atomic composition of the specimen is found in Frank (1972c). We start by considering the wave in the back focal plane, as in equation (2.6), but now trace its contributions from each atom in the object. The scattering contribution from each atom is the atomic scattering amplitude, $f_j(\mathbf{k})$, multiplied by a phase term that relates to the atom’s position in the specimen plane, $\mathbf{r}_j$:

$\Psi(\mathbf{k}) = \sum_j f_j(\mathbf{k})\,\exp(-2\pi i\,\mathbf{k}\cdot\mathbf{r}_j)$ (2.6a)
Amplitude effects are accounted for by the fact that the atomic scattering amplitudes are complex:
$f_j(\mathbf{k}) = f_j^{(r)}(\mathbf{k}) + i\,f_j^{(i)}(\mathbf{k})$ (2.23a)
with the real part relating to the phase, and the imaginary part relating to the amplitude (absorption) properties. The complex electron atomic scattering amplitudes were tabulated by Haase, based on Hartree–Fock (Haase, 1970a) and Thomas–Fermi–Dirac potentials (Haase, 1970b), for voltages used in transmission electron microscopy. As stated in equation (2.21), the Fourier transform of the image intensity in the linear approximation is $I(\mathbf{k}) = O_r(\mathbf{k})\sin\gamma(\mathbf{k}) - O_i(\mathbf{k})\cos\gamma(\mathbf{k})$, but now the terms $O_r(\mathbf{k})$ and $O_i(\mathbf{k})$, relating, respectively, to the phase and amplitude portions of the object, can be explicitly written as
$O_r(\mathbf{k}) = \sum_j f_j^{(r)}(\mathbf{k})\,\exp(-2\pi i\,\mathbf{k}\cdot\mathbf{r}_j)$ (2.23b)
$O_i(\mathbf{k}) = \sum_j f_j^{(i)}(\mathbf{k})\,\exp(-2\pi i\,\mathbf{k}\cdot\mathbf{r}_j)$ (2.23c)
It is clear that it is no longer possible to use the simplification made in equation (2.22), in which $O_i(\mathbf{k})$ was expressed as a quantity proportional to $O_r(\mathbf{k})$, in the form $O_i(\mathbf{k}) = Q(\mathbf{k})O_r(\mathbf{k})$, since each atomic species has its own ratio of amplitude/phase scattering. This ratio goes up with the atomic number Z.

This formulation of image formation demonstrates the limitation of approximations usually made. It becomes important in attempts to computationally predict images from coordinates of X-ray structures composed of atoms with strongly different atomic numbers, as in all structures composed of protein and RNA. If these predictions are made on the basis of the earlier equation (2.22) or even without regard to the radial falloff, as in equation (2.23), then the contrast (p.50) between protein and RNA comes out to be much smaller than in the actual cryo-EM image. For instance, the DNA portion was difficult to recognize in naively modeled images of nucleosomes, while it was readily apparent in cryo-EM images (C. Woodcock, personal communication).

## 3.6. Optical and Computational Diffraction Analysis—The Power Spectrum

The CTF leaves a “signature” in the diffraction pattern of a carbon film, which optical diffraction analysis (Thon, 1966, 1971; Johansen, 1975) or its computational equivalent is able to reveal. Before a micrograph can be considered worthy of the great time investment that is required in the digital processing (starting with scanning and selection of particles), its diffraction pattern or power spectrum should first be analyzed. This can be achieved either by optical or computational means.

The term “power spectrum,” which will frequently surface in the remainder of this book, requires an introduction. It is defined in electrical engineering and statistical optics as the inverse Fourier transform of the autocorrelation function of a stochastic process. The expectation value of the absolute-squared Fourier transform provides the means to calculate the power spectrum. Thus, if we had a large number of images that all had the same statistical characteristics, we could estimate the power spectrum of this ensemble by computing their Fourier transforms, absolute-squaring them, and forming the average spectrum. The question is, can we estimate the power spectrum from a single image? The absolute-squared Fourier transform of a single image (also called periodogram, see Jenkins and Watts, 1968) yields a very poor estimate. But the answer to our question is still yes, under one condition: often the image can be modeled approximately as a stationary stochastic process; that is, the statistical properties do not change when we move a window from one place to the next. In that case, we can estimate the power spectrum by forming an ensemble of subimages, by placing a window in different (partially overlapped) positions, and proceeding as before when we had a large number of images. (An equivalent method, using convolution, will be outlined below in section 3.7.)
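The windowing procedure just described can be sketched as follows (the window size, step, and the white-noise stand-in image are our choices):

```python
import numpy as np

def power_spectrum_estimate(image, window=64, step=32):
    """Estimate the power spectrum by averaging the periodograms of
    50%-overlapped windows, treating the image as a (locally) stationary
    stochastic process."""
    acc, count = np.zeros((window, window)), 0
    for y in range(0, image.shape[0] - window + 1, step):
        for x in range(0, image.shape[1] - window + 1, step):
            patch = image[y:y + window, x:x + window]
            patch = patch - patch.mean()            # "floating" the patch
            acc += np.abs(np.fft.fft2(patch)) ** 2  # periodogram of one window
            count += 1
    return acc / count

rng = np.random.default_rng(1)
img = rng.normal(size=(256, 256))   # stand-in for a carbon-film micrograph
spec = power_spectrum_estimate(img)
```

For a single window this is just the periodogram—a very poor estimate; the averaging over many overlapped windows is what approximates the expectation value in the definition above.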

In contrast, even though it is often used interchangeably with “power spectrum,” the term “diffraction pattern” is reserved for the intensity distribution of a single image, and it therefore has to be equated with the periodogram. A recording of the diffraction pattern on film, during which the area selected for diffraction is moved across the micrograph, would be more closely in line with the concept of a power spectrum.

After these preliminaries, we move on to a description of the optical diffractometer, a simple device that was of invaluable importance, particularly in the early days of electron crystallography. Even though computational analysis has become very fast, the optical analysis is still preferred for quick prescreening of the micrographs collected. The optical diffractometer consists of a coherent light source, a beam expander, a large lens, and a viewing screen, and allows the diffraction pattern of a selected image area to be viewed or recorded (Figure 2.18). In the simplest arrangement, the micrograph is put into the parallel coherent laser (p.51) beam. In the back focal plane, the complex amplitude is given by the Fourier transform of the transmittance function, which for low contrast is proportional to the optical density. The latter, in turn, is in good approximation proportional to the image intensity. In other words, the complex amplitude in the back focal plane of the optical diffractometer is given by equation (2.21). Now, the diffraction pattern D(k) observed or recorded on film is the square of the complex amplitude, that is, D(k) = |I(k)|2. Thus, if we additionally take into account the envelope function E(k) and an additive noise term N(k), ignore terms in which N(k) is mixed with the signal term, and make use of equation (2.23), we obtain

$D(\mathbf{k}) = E^2(\mathbf{k})\left[\sin\gamma(\mathbf{k}) - Q_0\cos\gamma(\mathbf{k})\right]^2 |O_r(\mathbf{k})|^2 + |N(\mathbf{k})|^2$ (2.24)
for the part of the image transform that originates from elastic scattering. (Another part of the image transform, ignored in this formula, comes from inelastic scattering.)

Figure 2.18 Schematic sketch of an optical diffractometer. A beam-expanding telescope is used to form a spot of the coherent laser light on the viewing screen. When a micrograph is placed immediately in front of the lens of the telescope, its diffraction pattern is formed on the viewing screen. From Stewart (1988a), reproduced with permission of Kluwer Academic/Plenum Publishers.

We obtain the interesting result, first noted by Thon (1966), that the object spectrum (i.e., the absolute-squared Fourier transform of the object) is modulated by the squared CTF. In order to describe this modulation, and the conditions for its observability, we must make an approximate yet realistic assumption about the object. A thin carbon film can be characterized as an amorphous, unordered structure. For such an object, the spectrum $|O(\mathbf{k})|^2$ is nearly “white.” What is meant by “whiteness” of a spectrum is that the total signal variance, which is equal to the squared object spectrum integrated over the resolution domain B (Parseval’s theorem),

$\mathrm{var}(o) = \displaystyle\int_B |O(\mathbf{k})|^2\,d\mathbf{k}$ (2.25)
(p.52) where $O(\mathbf{k})$ is the Fourier transform of the “floated,” or average-subtracted, object $[o(\mathbf{r}) - \langle o(\mathbf{r})\rangle]$, is evenly partitioned within that domain. (If we had an instrument that would image this kind of object with an aberration producing a uniform phase shift of π/2 over the whole back focal plane, except at the position of the primary beam, then the optical diffraction pattern would be uniformly white.) Hence, for such a structure, multiplication of its spectrum with the CTF in real instruments will leave a characteristic squared-CTF trace (the “signature” of which we have spoken before).

Observability of this squared CTF signature actually does not require the uniformity implied in the concept of “whiteness.” It is obviously enough to have a sufficiently smooth spectrum, and the radial falloff encountered in practice is no detriment.

We can draw the following (real-space) parallel: in order to see an image on a transparent sheet clearly, one has to place it on a light box that produces uniform, untextured illumination, as in an overhead projector. Similarly, when we image a carbon film in the EM and subsequently analyze the electron micrograph in the optical diffractometer, the carbon film spectrum essentially acts as a uniform virtual light source that makes the CTF (i.e., the transparency, in our analogy, through which the carbon spectrum is “filtered”) visible in the Fourier transform of the image intensity (Figure 2.19). Instead of the carbon film, a 2D crystal with a large unit cell can also be used: in that case, the CTF is evenly sampled by the fine grid of the reciprocal lattice [see the display of the computed Fourier transform of a PhoE crystal, Figure 2.18 in Frank (1996)].

The use of optical diffraction as a means of determining the CTF and its dependence on the defocus and axial astigmatism goes back to Thon (1966). Since then, numerous other electron-optical effects have been measured by optical diffraction: drift (Frank, 1969), illumination source size (Frank, 1976; Saxton, 1977; Troyon, 1977), and coma (Zemlin et al., 1978; Zemlin, 1989a). Typke and Köstler (1977) have shown that the entire wave aberration of the objective lens can be mapped out.

Modern optical diffractometers designed for convenient, routine quality testing of electron micrographs recorded on film are equipped with a CCD camera whose output is monitored on a screen. The entire unit is contained in a light-tight (p.53) box, and the tedium of working in the dark is avoided (B. Amos, personal communication). On the other hand, electron microscopes are also increasingly fitted with a CCD camera and a fast computer capable of producing instantaneous “diagnostic” Fourier transforms of the image on-line (e.g., Koster et al., 1990). With such a device, the data collection can be made more efficient, since the capture of an image in the computer can be deferred until satisfactory imaging conditions have been established. Provided that the size of the specimen area is the same, optical and computational diffraction patterns are essentially equivalent in quality and diagnostic value.

Figure 2.19 Computed power spectra for a number of cryo-EM micrographs in the defocus range 1.2–3.9 μm.

Another note concerns the display mode. In the digital presentation, it is convenient to display the modulus of the Fourier transform, that is, the square root of the power spectrum, because of the limited dynamic range of monitor screens. Logarithmic displays are also occasionally used, but experience shows that these often lead to an unacceptable compression of the dynamic range, bringing out the background strongly and rendering the zeros of the CTF virtually invisible.

## 3.7. Determination of the Contrast Transfer Function

The CTF may either be determined “by hand,” specifically by measuring the positions of the zeros and fitting them to a chart of the CTF characteristics, or by using automated computer-fitting methods.

In the method of manual fitting, the CTF characteristics of the microscope (see section 3.3) are computed for the different voltages in use (e.g., 100 or 200 kV) and displayed on a hard copy, preferably on a scale that enables direct comparison with the print of the optical diffraction pattern. As long as the lens of the microscope remains the same, the CTF characteristics remain unchanged. (In fact, as was pointed out above, a single set of curves covers all possible voltages and spherical aberrations, provided that generalized parameters and variables are used.) In the simplest form of manual defocus determination, a set of measured radii of CTF zero positions is “slid” against the CTF characteristics until a match is achieved. Since the slope of the different branches of the characteristics is shallow in most parts of the pattern, the accuracy of this kind of manual defocus determination is low, but it can be improved by using not one but simultaneously two or more diffraction patterns of a series with known defocus increments. Such a set of measurements forms a “comb” which can be slid against the CTF characteristics in its entirety.

Manual fitting has been largely superseded by computer-assisted interactive or fully automated methods. In earlier fully automated methods (Frank et al., 1970; Frank, 1972c; Henderson et al., 1986), the Fourier modulus |F(k)| (i.e., the square root of what would be called the diffraction pattern) is computed from a field of sufficient size. The theoretical CTF pattern is then matched with the experimental power spectrum using an iterative nonlinear least squares fitting method. The parameters being varied are $\Delta z$, $z_a$, $\varphi_0$, and a multiplicative scaling factor. Thus, the error sum is (Frank et al., 1970)

$S(\Delta z, z_a, \varphi_0) = \sum_j \left[\,|F(\mathbf{k}_j)| - T(\mathbf{k}_j)\,\right]^2$ (2.26)
(p.54) where
$T(\mathbf{k}) = \dfrac{c}{|\mathbf{k}|}\left|\sin\gamma(\mathbf{k};\,\Delta z, z_a, \varphi_0)\right|$ (2.27)
The summation runs over all Fourier elements indexed j, and c is a simple scaling constant.
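A toy version of such a fit, using `scipy.optimize.least_squares` with a radial model of the form $c|\sin\gamma(k)|/k$ (astigmatism is omitted for brevity, and all data are synthetic):

```python
import numpy as np
from scipy.optimize import least_squares

LAM, CS = 0.037, 2.0e7  # Angstrom units; illustrative 100-kV optics

def ctf_abs_model(k, dz, c):
    """Model of the radial Fourier modulus, c*|sin(gamma)|/k; the
    astigmatism parameters (z_a, phi_0) are left out here."""
    gamma = 2 * np.pi * (CS * LAM**3 * k**4 / 4 - dz * LAM * k**2 / 2)
    return c * np.abs(np.sin(gamma)) / k

# Synthetic "observed" modulus with a known defocus of 1.8 um
k = np.linspace(0.01, 0.2, 2000)
rng = np.random.default_rng(3)
observed = ctf_abs_model(k, 1.8e4, 5.0) + rng.normal(0.0, 0.5, k.size)

def residuals(p):
    return ctf_abs_model(k, p[0], p[1]) - observed

# The error surface has many local minima, so several starting defocus
# values are tried and the lowest-cost fit is kept.
fits = [least_squares(residuals, x0=[dz0, 1.0])
        for dz0 in np.linspace(0.5e4, 3.0e4, 26)]
best = min(fits, key=lambda f: f.cost)
```

Keeping the lowest-cost solution over a grid of starting values is the practical answer to the multiple-minima problem discussed in the next paragraph.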

It has not been practical to include envelope parameters in the 2D fitting procedure. The 1/k dependence was also used by other groups (Henderson et al., 1986; Stewart et al., 1993) to match the observed decline in power with spatial frequency. This 1/k dependency, obtained for negatively stained specimens, lacks a theoretical basis but appears to accommodate some of the effects discussed by Henderson and Glaeser (1985), such as specimen drift and fluctuating local charging.

The error function [equation (2.26)] has many local minima. The only way to guarantee that the correct global minimum is found in the nonlinear least-squares procedure is by trying different starting values for Δz. Intuitively, it is clear that smoothing the strongly fluctuating experimental distribution |F(k)| will improve the quality of the fit and the speed of convergence of the iterative algorithm. Smoothing can be accomplished by the following sequence of steps: transform |F(k)|² into real space, which yields a distribution akin to the Patterson function; mask this function to eliminate all terms outside a small radius; and finally transform the result back into Fourier space (Frank et al., 1970). A smooth image spectrum may also be obtained by dividing the image into small, partially overlapping regions pn(r) and computing the average of the Fourier moduli |Fn(k)| = |ℑ{pn(r)}| (Zhu and Frank, 1994; Fernández et al., 1997; Zhu et al., 1997). This method actually comes close to the definition of the power spectrum, at a given spatial frequency k, as an expectation value of |F(k)|² (Fernández et al., 1997) (see section 3.6). This way of estimating the power spectrum proves optimal when the areas used for estimation are overlapped by 50% (Zhu et al., 1996; Fernández et al., 1997). A further boost in signal can be achieved by using a thin carbon film floated onto the ice.
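The overlapping-patch estimate described above can be sketched as follows; the patch size, the 50% overlap, and the white-noise stand-in for a digitized micrograph are illustrative assumptions.

```python
import numpy as np

# Sketch of power-spectrum estimation by averaging periodograms of small,
# 50%-overlapping patches. The input "micrograph" is white noise, so the
# resulting spectrum should be approximately flat.
rng = np.random.default_rng(1)
image = rng.normal(size=(512, 512))   # stand-in for a digitized micrograph

def avg_power_spectrum(img, patch=128):
    step = patch // 2                 # 50% overlap between neighboring patches
    acc = np.zeros((patch, patch))
    count = 0
    for y in range(0, img.shape[0] - patch + 1, step):
        for x in range(0, img.shape[1] - patch + 1, step):
            p = img[y:y + patch, x:x + patch]
            p = p - p.mean()          # remove the patch mean (DC term)
            acc += np.abs(np.fft.fftshift(np.fft.fft2(p))) ** 2
            count += 1
    return acc / count                # averaged periodogram

ps = avg_power_spectrum(image)
```

Averaging over many patches brings the estimate close to the expectation value of |F(k)|², at the cost of coarser frequency sampling set by the patch size.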

More recently, several automated procedures for CTF determination have been described (Huang et al., 2003; Mindell and Grigorieff, 2003; Sander et al., 2003a). In the approach of Mindell and Grigorieff (2003), the power spectrum, after smoothing and background correction, is compared with the analytical function for the CTF, assuming different values for Δz on a grid and different settings for the axial astigmatism. In the second approach (Sander et al., 2003a), power spectra of individual particles are classified by correspondence analysis followed by hierarchical ascendant classification (regarding both subjects, see chapter 4), and the resulting class averages are compared with analytically derived CTFs. Finally, Huang et al. (2003) make use of a heuristic approach, dividing the power spectrum into two regions with strongly different behavior and fitting the radial profile in these regions with different polynomial expressions. Both envelope parameters and axial astigmatism are handled with this method, as well. One of the observations made in this study is that the B-factor formalism, often used to summarily describe the decline of the Fourier amplitudes as seen on the power (p.55) spectrum, is seriously deficient. For a more detailed discussion of this point, see chapter 5, section 9.3.1.

One problem with fully automated methods is that they inevitably contain "fiddle parameters" adjusted to particular kinds of data, as characterized by specimen support, instrument magnification, etc. Manual checks are therefore routinely required. The other class of CTF estimation methods is based entirely on user interaction, and here the emphasis is on making the tools convenient and user-friendly. High sensitivity can be obtained by rotationally averaging the squared Fourier transform and comparing the resulting radial profiles with the theoretical curves. Of course, rotational averaging assumes that the axial astigmatism is well corrected; otherwise, the radial profile will reflect the mean defocus only approximately.

The first works in this category (Zhou and Chiu, 1993; Zhu and Frank, 1994; Zhu et al., 1997) made the user go through a number of steps, at that time without the benefit of a graphics tool. The one-dimensional profile obtained by rotational averaging is first corrected by background subtraction, and the resulting profile is then fitted to a product of the transfer function with envelopes representing the effects of partial coherence, chromatic defocus spread, and other resolution-limiting effects. Background correction is accomplished by fitting the minima of |F(k)| (i.e., regions where the Fourier transform of the image intensity should vanish, in the absence of noise and other contributions not accounted for in the linear bright-field theory) with a slowly varying, well-behaved function of the spatial frequency radius k = |k|. Zhou and Chiu (1993) used a high-order polynomial function, while Zhu et al. (1997) were able to obtain a Gaussian fit of the noise background, independent of the value of defocus, and to determine the effective source size characterizing partial coherence as well.
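The two steps just described, rotational averaging followed by background fitting through the minima, can be sketched as follows. The CTF convention, envelope, and background model below are illustrative assumptions, not the forms used in the cited works.

```python
import numpy as np

# Sketch: azimuthally average a synthetic 2D power spectrum, then fit a
# slowly varying background through its minima and subtract it.
lam, Cs, dz = 0.037, 2.0e7, 15000.0        # assumed 100 kV optics, Angstrom
n = 256
f = np.fft.fftfreq(n, d=4.0)               # 4 Angstrom/pixel sampling
kk = np.sqrt(f[:, None]**2 + f[None, :]**2)
chi = np.pi * lam * dz * kk**2 - 0.5 * np.pi * Cs * lam**3 * kk**4
# CTF^2 times an envelope, plus an assumed smooth background
power = np.sin(chi)**2 * np.exp(-10.0 * kk) + 0.3 + 2.0 * np.exp(-50.0 * kk)

# rotational (azimuthal) average onto 128 radial bins
r = (kk / kk.max() * 127).astype(int)
prof = np.bincount(r.ravel(), weights=power.ravel()) / np.bincount(r.ravel())
kaxis = np.linspace(0.0, kk.max(), prof.size)

# locate local minima (away from the origin) and fit a polynomial background
i = np.arange(1, prof.size - 1)
mins = i[(prof[i] < prof[i - 1]) & (prof[i] < prof[i + 1])]
mins = mins[mins > 10]                     # exclude the region near the origin
bg = np.polyval(np.polyfit(kaxis[mins], prof[mins], 3), kaxis)
corrected = prof - bg                      # ~ CTF^2 modulated by the envelope
```

After subtraction, the corrected profile oscillates between zero (at the CTF zeros) and the envelope, which is what the interactive fitting tools display.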

Later implementations of this approach employ graphics tools. In WEB, the graphics user interface of SPIDER (Frank et al., 1981b, 1996), the rotationally averaged power spectrum is displayed as a 1D profile along with the theoretical CTF curve, which can be tuned to match the former by changing the position of a defocus slider. Figure 2.20a shows the radial profiles of power spectra obtained for cryo-EM micrographs (ribosomes embedded in ice, with a thin carbon film added) with different defocus settings. Figure 2.20b shows an example of a background correction using a polynomial curve obtained by least squares fitting of the minima. The background-corrected curve (bottom in figure 2.20b) can now be interpreted in terms of the square of the CTF modulated by an envelope term that causes the decay at high spatial frequencies. In addition to the tool in SPIDER/WEB, several convenient graphics user tools for interactive CTF estimation have emerged. One, CTFFIND2, is part of the MRC image processing software package (Crowther et al., 1996). Further on the list are tools contained within the packages EM (Hegerl and Altbauer, 1982; Hegerl, 1996), EMAN (Ludtke et al., 1999), IMAGIC (van Heel et al., 1996), and Xmipp (Marabini et al., 1996b).

Axial astigmatism can be taken into account in this fitting process, without the need to abandon the benefit of azimuthal averaging, by dividing the 180° azimuthal range (ignoring the Friedel symmetry-related range) into a number of sectors. For each of these sectors, the defocus is then determined separately. (p.56) However, this method appears to be unreliable when applied to specimens in ice because of the reduced SNR (J. Zhu, personal communication, 1994). Zhou et al. (1996) developed a graphics tool, now integrated into the EMAN package (Ludtke et al., 1999), that allows the user to fit circles or ellipses into the zeros of the experimental 2D power spectrum, thus enabling the estimation of axial astigmatism as well. Huang et al. (2003), mentioned above, report success in estimating axial astigmatism with their automated CTF determination procedure, but this success hinges on a careful estimation of the background.

Figure 2.20 Radial profiles of Fourier modulus (i.e., square root of “power spectrum”) obtained by azimuthal averaging, and background correction. Vertical axis is in arbitrary units. (a) Profiles for three different defocus settings. (b) Background correction. The minima of the experimental profile (upper part of panel) were fitted by a polynomial using least squares, and the resulting background curve was subtracted, to yield the curve in the lower part. Apart from a region close to the origin, the corrected curve conforms with the expectation of the linear CTF theory, and can be interpreted as the square of the CTF with an envelope term.

Related to this subject is the measurement of the entire CTF characteristics and, along with it, of the parameters characterizing energy spread and partial coherence. The purpose here is not to find the parameters characterizing a single micrograph, but to describe the instrument's electron optical performance over an entire defocus range. Krakow et al. (1974) made a CTF versus Δz plot by using a tilted carbon grid and 1D optical diffraction with a cylindrical lens. Frank et al. (1978b) used this method to verify the predicted defocus dependence of the envelope function for partial spatial coherence (Frank, 1973a). Burge and Scott (1975) developed an elegant method of measurement in which a large axial astigmatism is intentionally introduced. As we have seen in the previous section, the effect of a large amount of axial astigmatism is that, as the azimuthal angle goes through the 360° range, the astigmatic defocus component za sin[2(φ − φ0)] [equation (2.5)] sweeps forward and backward through the CTF characteristics, producing a quasi-hyperbolic appearance of most of the contrast transfer zones. For a large enough za, the diffraction pattern, either optically derived (Burge and Scott, 1975) or computed (Möbus and Rühle, 1993), will contain a large segment of the CTF characteristics in an angularly "coded" form. Möbus and Rühle (1993) (p.57) were able to map the hyperbolic pattern into the customary CTF versus Δz diagram by a computational procedure.

## 3.8. Instrumental Correction of the Contrast Transfer Function

Many attempts have been made to improve the instrument so that the CTF conforms more closely to the ideal behavior. In the early 1970s, several groups tried to change the wave aberration function through intervention in the back focal plane. Hoppe and coworkers (Hoppe, 1961; Langer and Hoppe, 1966) designed zone plates, to be inserted in the aperture plane, that would block out all waves giving rise to destructive interference for a particular defocus setting. A few of these plates were actually made, thanks to the early development of microfabrication in the laboratory of the late Gottfried Möllenstedt at the University of Tübingen, Germany. However, contamination build-up, and the need to match the defocus of the instrument precisely with the defocus for which the zone plate was designed, made these correction elements impractical.

Unwin (1970) introduced a device that builds up an electric charge at the center of the objective aperture (actually a spider's thread) whose field modifies the wave aberration function at low angles and thereby corrects the transfer function in the low spatial frequency range. Thin phase plates (Zernike plates) designed to retard portions of the wave have also been tried (see Reimer, 1997). Typically, these would be thin carbon films with a central bore, leaving the central beam in the back focal plane unaffected. These earlier attempts were plagued by the build-up of contamination, so that the intended phase shift could not be maintained for long. The idea was recently revived (Danev and Nagayama, 2001; Majorovits and Schröder, 2002a), this time with more promise of success, since modern vacuum technology reduces contamination rates several-fold.

Another type of phase plate, the Boersch plate, is immune to contamination, as an electric field is created in the immediate vicinity of the optic axis by the use of a small ring-shaped electrode supported by, but insulated from, thin metal wires (Danev and Nagayama, 2001; Majorovits and Schröder, 2002a). Also to be mentioned here is a very promising development: an electromagnetic corrector element that can be used to "tune" the wave aberration function, and in the process also to boost the low-spatial-frequency response if desired (Rose, 1990).

Corrections that are applied after the image has been taken, such as Wiener filtering and the merging of data from a defocus series (see section 2), have traditionally not been counted as “instrumental compensation.” However, as the computerization of electron microscopes is proceeding, the boundaries between image formation, data collection, and on-line postprocessing are becoming increasingly blurred. By making use of computer control of instrument functions, it is now possible to compose output images by integrating the “primary” image collected over the range of one or several parameters. One such scheme, pursued and demonstrated by Taniguchi et al. (1992), is based on a weighted superposition of electronically captured defocus series.

## (p.58) 3.9. Computational Correction of the Contrast Transfer Function

### 3.9.1. Phase Flipping

As has become clear, the CTF fails to transfer the object information, as represented by the object's Fourier transform, with correct phases and amplitudes. All computational corrections are based on knowledge of the CTF as a function of spatial frequency, in terms of the parameters Δz, Cs, and za that occur in the analytical description of the CTF [equations (2.4), (2.5), and (2.10)]. Additionally, the envelope parameters that describe partial coherence effects must be known. The simplest correction of the CTF is by "phase flipping," that is, by performing the following operation on the image transform:

(2.28) F′(k) = sgn[H(k)] F(k)
which ensures that the modified image transform F′(k) has the correct phases throughout the resolution domain. However, such a correction leaves the spatial-frequency-dependent attenuation of the amplitudes uncorrected: as we know, Fourier components residing in regions near the zeros of H(k) are weighted down by the CTF, and those located exactly at the zeros are not transferred at all.
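A minimal sketch of phase flipping on synthetic data follows, assuming the common convention H(k) = −sin χ(k) for the CTF; all numerical values are illustrative.

```python
import numpy as np

# Sketch of phase flipping: multiply each Fourier component of the image
# by the sign of H(k). CTF model and parameters are assumptions.
lam, Cs, dz = 0.037, 2.0e7, 15000.0
n = 256
f = np.fft.fftfreq(n, d=4.0)
kk = np.sqrt(f[:, None]**2 + f[None, :]**2)
chi = np.pi * lam * dz * kk**2 - 0.5 * np.pi * Cs * lam**3 * kk**4
H = -np.sin(chi)

rng = np.random.default_rng(2)
obj = rng.normal(size=(n, n))                    # stand-in object
img = np.fft.ifft2(np.fft.fft2(obj) * H).real    # simulated bright-field image

F_img = np.fft.fft2(img)
restored = np.fft.ifft2(F_img * np.sign(H)).real  # phases corrected; amplitudes
                                                  # near the zeros stay attenuated
```

The restored image correlates better with the object than the raw image does, but components at the zeros of H(k) remain lost, which motivates the Wiener filtering described next in the text.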

### 3.9.2. Wiener Filtering—Introduction

The Wiener filtering approach can be described as a “careful division” by the CTF such that noise amplification is kept within limits:

We are given a micrograph that contains a signal corrupted by additive noise. In Fourier notation,

(2.28a) I(k) = H(k)F(k) + N(k)
We seek an estimate F̂(k) that minimizes the expected mean-squared deviation from the Fourier transform of the object, F(k):
(2.29) E{|F̂(k) − F(k)|²} = min
where E{·} denotes the expectation value computed over an ensemble of images. We now look for a filter function S(k) with the property:
(2.30) F̂(k) = S(k)I(k)
that is, one that yields the desired estimate when applied to the image transform. The solution of this problem, obtained under the assumption that there is no correlation between F(k) and N(k), is the well-known expression:
(2.31) S(k) = H*(k) / [|H(k)|² + PN(k)/PF(k)]
(p.59) where PN(k) and PF(k) are the power spectra of noise and object, respectively. It is easy to see that the filter function corrects both the phase [since it flips the phase according to the sign of H*(k)] and the amplitude of the Fourier transform. The additive term in the denominator of equation (2.31) prevents excessive noise amplification in the neighborhood of H(k) = 0.

Figure 2.21 Demonstration of Wiener filtering. Image001 was produced by projecting a (CTF-corrected) ribosome density map (Gabashvili et al., 2000) into the side-view direction. From this, an "EM image" (dis001) was generated by applying a CTF. Next, noise was added (add001). Application of the Wiener filter assuming a high signal-to-noise ratio (SNR = 100) resulted in the image res001. Low-pass filtration of this image to the bandwidth of the original image produces a "cleaned-up" version, fil001. The texture apparent in both particle and background results from an overly aggressive choice of the Wiener filter, implicit in the overestimation of the SNR, which has the effect that the noise in the regions around the zeros of the CTF is disproportionately amplified. Still, the low-pass filtered, Wiener-filtered version (fil001) is a close approximation of the original.

It is instructive to see the behavior of the Wiener filter in some special cases. The term PN(k)/PF(k), which could be called the "spectral noise-to-signal ratio," can often be approximated by a single constant, c = 1/α, the reciprocal of the signal-to-noise ratio (see chapter 3, section 4.3). Let us look at the filter for the two extreme cases, α = 0 (i.e., the signal is negligible compared with the noise) and α = ∞ (no noise present). In the first case, equation (2.31) yields S(k) = 0, a sensible result since the estimate should be zero when no information is conveyed. In the second case, where no noise is assumed, S(k) becomes 1/H(k); that is, the restoration [equation (2.30)] reverts to a simple division by the transfer function. When the SNR is small but nonzero, as is the case when the Wiener filter is applied to raw data, the denominator in equation (2.31) is strongly dominated by the term 1/α, rendering |H(k)|² negligible. In that case, the filter function becomes S(k) ≈ αH*(k); that is, restoration now amounts to a "conservative" multiplication of the image transform with the transfer function itself. The most important effect of this multiplication is that it flips the phase of the image transform wherever needed in Fourier space. [In the example of figure 2.21, a high SNR was assumed even though the SNR is actually low. Consequently, the restoration is marred by the appearance of some ripples.]
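A minimal numerical sketch of equation (2.31), with PN(k)/PF(k) approximated by a constant 1/SNR as discussed above; the CTF model and all values are illustrative assumptions.

```python
import numpy as np

# Sketch of single-image Wiener filtering with constant noise-to-signal ratio.
lam, Cs, dz = 0.037, 2.0e7, 15000.0   # assumed 100 kV optics, Angstrom
n = 256
f = np.fft.fftfreq(n, d=4.0)
kk = np.sqrt(f[:, None]**2 + f[None, :]**2)
chi = np.pi * lam * dz * kk**2 - 0.5 * np.pi * Cs * lam**3 * kk**4
H = -np.sin(chi)

rng = np.random.default_rng(3)
obj = rng.normal(size=(n, n))         # stand-in "object," unit variance
img = np.fft.ifft2(np.fft.fft2(obj) * H).real + 0.5 * rng.normal(size=(n, n))

snr = 4.0                             # object var 1, noise var 0.25
S = H / (H**2 + 1.0 / snr)            # equation (2.31); H is real here, H* = H
est = np.fft.ifft2(np.fft.fft2(img) * S).real
```

With α → ∞ the filter would revert to division by H(k); with the small SNR of raw data it approaches αH(k), exactly the two limits discussed above.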

### 3.9.3. Wiener Filtering of a Defocus Pair

Thus far, it has been assumed that only a single micrograph is being analyzed. The disadvantage of using a single micrograph is that gaps in the vicinity of the (p.60) zeros of the transfer function remain unfilled. This problem can be solved, and the gaps filled, by using two or more images at different defocus settings (Frank and Penczek, 1995; Penczek et al., 1997). In the following, we show the case of two images, with Fourier transforms I1(k) and I2(k), and two different CTFs H1(k) and H2(k). We seek

(2.31a) F̂(k) = S1(k)I1(k) + S2(k)I2(k)
and the filter functions (n = 1, 2) become
(2.32) Sn(k) = Hn*(k) / [|H1(k)|² + |H2(k)|² + PN(k)/PF(k)],  n = 1, 2
For a suitable choice of defocus settings, none of the zeros of H1(k) and H2(k) coincide, so that information gaps can be entirely avoided. [It should be noted that in the application considered by Frank and Penczek (1995), the images are 3D, resulting from two independent 3D reconstructions, and the spectral noise-to-signal ratio PN(k)/PF(k) (assumed to be the same in both images) is actually much smaller than that for raw images, as a consequence of the averaging that occurs in the 3D reconstruction process.] As in the previous section, where we dealt with the Wiener filtration of a single image, we can, under certain circumstances, simplify equation (2.32) by assuming PN(k)/PF(k) = const. = 1/SNR. Application of the Wiener filter to a defocus doublet is shown below for a test image (figure 2.22).
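The defocus-pair filter of equation (2.32) can be sketched as follows, again with constant PN(k)/PF(k) = 1/SNR. The CTF model, noise level, and the two defoci (chosen so that the zeros of the two CTFs essentially do not coincide over the range considered) are illustrative assumptions.

```python
import numpy as np

# Sketch of the two-image Wiener filter with constant noise-to-signal ratio.
lam, Cs = 0.037, 2.0e7
n = 256
f = np.fft.fftfreq(n, d=4.0)
kk = np.sqrt(f[:, None]**2 + f[None, :]**2)

def ctf(dz):
    chi = np.pi * lam * dz * kk**2 - 0.5 * np.pi * Cs * lam**3 * kk**4
    return -np.sin(chi)

H1, H2 = ctf(13000.0), ctf(21000.0)       # two assumed defocus settings
rng = np.random.default_rng(4)
obj = rng.normal(size=(n, n))
F = np.fft.fft2(obj)
I1 = F * H1 + np.fft.fft2(0.5 * rng.normal(size=(n, n)))  # noisy "micrograph" 1
I2 = F * H2 + np.fft.fft2(0.5 * rng.normal(size=(n, n)))  # noisy "micrograph" 2

snr = 4.0
den = H1**2 + H2**2 + 1.0 / snr
est = np.fft.ifft2((H1 * I1 + H2 * I2) / den).real        # sum of Sn(k)In(k)
```

Where one CTF has a zero, the other generally does not, so the denominator stays away from its noise floor over most of Fourier space.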

### 3.9.4. Wiener Filtering of a Defocus Series N > 2

Information gaps can be avoided even more reliably by using more than two defocus settings. Generalization of the least-squares approach to N images with different defocus settings leads to the estimate (Penczek et al., 1997):

(2.32a) F̂(k) = Σn=1…N Sn(k)In(k)
with the filter functions:
(2.32b) Sn(k) = SNRn Hn*(k) / [1 + Σm=1…N SNRm |Hm(k)|²]
where SNRn is the signal-to-noise ratio of the nth image. [Note that for N = 2 and SNR1 = SNR2, equation (2.32b) reverts to equation (2.32) for the case of constant SNR.]
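A sketch of the N-image filter of equation (2.32b) follows; the defoci, per-image SNR values, and the CTF model are illustrative assumptions.

```python
import numpy as np

# Sketch of the N-image Wiener filter with per-image signal-to-noise ratios.
lam, Cs = 0.037, 2.0e7
n = 128
f = np.fft.fftfreq(n, d=4.0)
kk = np.sqrt(f[:, None]**2 + f[None, :]**2)

def ctf(dz):
    chi = np.pi * lam * dz * kk**2 - 0.5 * np.pi * Cs * lam**3 * kk**4
    return -np.sin(chi)

defoci = [9000.0, 14000.0, 21000.0]   # assumed defocus series (Angstrom)
snrs = [4.0, 3.0, 2.0]                # noise variance of image n is 1/SNRn
rng = np.random.default_rng(5)
obj = rng.normal(size=(n, n))
F = np.fft.fft2(obj)
Hs = [ctf(dz) for dz in defoci]
Is = [F * H + np.fft.fft2(rng.normal(0.0, 1.0 / np.sqrt(s), (n, n)))
      for H, s in zip(Hs, snrs)]

den = 1.0 + sum(s * H**2 for s, H in zip(snrs, Hs))      # equation (2.32b)
Fhat = sum(s * H * I for s, H, I in zip(snrs, Hs, Is)) / den
est = np.fft.ifft2(Fhat).real
```

For N = 2 and equal SNRs this reduces to the two-image filter, matching the note above.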

(p.61)

Figure 2.22 Demonstration of Wiener filtering with N = 2 degraded images having different amounts of underfocus. Graphs on the left and right show how the Wiener filter amplifies Fourier components close to the zeros, but does not go beyond a certain amplitude, whose size depends on the SNR. The graph below the restored image shows the effective transfer function achieved after combining the information from both images. It is seen that two zeros have been entirely eliminated, while a third one, at 1/11 Å−1, is not, since it is shared by both CTFs. (From N. Boisset, unpublished lecture material.)

### 3.9.5. Separate Recovery of Amplitude and Phase Components

In the above descriptions of the phase flipping and Wiener filtering approaches to CTF correction, it has been assumed that the specimen has the same scattering properties throughout. If we take into account the fact that there are different atomic species with different scattering properties, we have to go one step back and start with equation (2.21), which describes the different image components relating to the phase portion [with transform Or(k)] and amplitude portion [with transform Oi(k)] of the object:

(2.32c)

In 1968, Schiske posed the question of whether these two different object components [one following the Friedel law, the other an anti-Friedel law; see equation (A.10) in appendix 1] can be separately retrieved by making use of several measurements of I(k) with different defocus settings; that is, by recording a defocus series of the same object. An advantage of separating the two components is the enhanced contrast between heavy and light atoms ("heavy/light atom (p.62) discrimination"; see Frank, 1972c, 1973b) that we expect to find in the amplitude component. The solution for N = 2 defocus settings is obtained by simply solving the two resulting equations [from (2.32c)] for the two unknowns, yielding

(2.33)
For N > 2, there are more measurements than unknowns, and the problem can be solved by least-squares analysis, resulting in a suppression of noise (Schiske, 1968). Following Schiske’s general formula, one obtains the following expression for the best estimate of the object’s Fourier transform:
(2.34)
which can then be split up into the “phase” and “amplitude” components, by separating the part observing Friedel symmetry (see appendix 1) from the part that does not, as follows:
(2.35a)
and
(2.35b)

In a first application of the Schiske formula, Frank (1972c) demonstrated enhancement in features of stained DNA on a carbon film. In-depth analyses following a similar approach were later developed by Kirkland and coworkers (1980, 1984), mainly for application in materials science. Interestingly, in their test of the technique, Typke et al. (1992) found that small magnification differences associated with the change in defocus produce intolerably large effects in the restored images, and proposed a method for magnification compensation. Figure 2.23 shows the phase and amplitude portions of a specimen field, restored from eight images of a focus series.

In many applications of single-particle reconstruction, CTF correction is carried out as part of the 3D reconstruction procedure. Details will be found in the chapter on 3D reconstruction (chapter 5, section 9). Advantages and disadvantages in carrying out CTF correction at the stage of raw data versus the later stage where volumes have been obtained will be discussed in chapter 5, section 9.2.

## 3.10. Locally Varying CTF and Image Quality

Under stable conditions, we can assume that the entire image field captured by the electron micrograph possesses the same CTF, unless the specimen is tilted or the z-position of the particles varies strongly owing to the thickness of the ice (cf. van Heel et al., 2000). Moreover, with stable specimens and imaging conditions, there is no reason to expect the parameters characterizing the envelope functions to vary across the image field. Indeed, the single-particle reconstruction strategy outlined in chapter 5, section 7, is based on the assumption that the entire micrograph can be characterized by a single CTF. The fact that this strategy has led to a 7.8-Å reconstruction of the ribosome (Spahn et al., in preparation; see figures 5.24 and 6.6) indicates that these stable conditions can be realized.

(p.63)

Figure 2.23 Demonstration of Schiske-type restoration applied to a defocus series of eight micrographs (ranging from –5400 to +5400 Å) showing proteasomes on carbon embedded in vitreous ice. (a, b) Two of the original micrographs. (c) Restored phase part of the object, obtained by using the micrographs (a, b). (d) Restored phase part, obtained by using four micrographs. (e, f) Amplitude and phase parts of the specimen, respectively, restored from the entire defocus series. In the phase part, the particles stand out more strongly than in (c, d), where fewer images were used. The amplitude part (e) reflects the locally changing pattern in the loss of electrons participating in the elastic image formation, either due to inelastic scattering or due to scattering outside the aperture. Another contribution comes from the fact that a single defocus is attributed to a relatively thick specimen (see Frank, 1973a; Typke et al., 1992). The particles are invisible in this part of the restored object. In the phase part (f), the particles stand out strongly from the ice + carbon background on account of the locally increased phase shift. The arrows point to particles in side-view orientations that are virtually invisible in the original micrographs but now stand out from the background. From Typke et al. (1992), reproduced with permission of Elsevier.

(p.64) However, there are conditions of instability that can lead to substantial variability of the power spectrum. Instability can result from local drift and electric charging, as discussed by Henderson (1992). In addition, local variability can result from changes in the thickness of the ice. One way of observing the local variations in the power spectrum is by scanning the aperture defining the illuminated area in the optical diffractometer relative to the micrograph (i.e., sliding the micrograph across the fixed geometry of the optical diffractometer beam) and observing the resulting pattern. Computational analysis offers more options and better quantification: Gao et al. (2002) divided the micrograph into subfields of 512 × 512 pixels, computed the local, azimuthally averaged power spectra, and subjected these to correspondence analysis (see chapter 4, sections 2 and 3). For some of the micrographs analyzed (containing ribosomes in vitreous ice, with a thin carbon film), the amplitude falloff varied substantially across the image field. Sander et al. (2003a), using a similar method of analysis, also found local variability of the amplitude falloff, parameterized by a variable B-factor, in analyzing micrographs of spliceosome specimens prepared with cryo-stain. Another study (Typke et al., 2004) came to similar conclusions, but demonstrated that some improvement is possible by metal coating of the support film surrounding the (specimen-bearing) holes of the holey carbon film. Considering the extent of the variability found by all three groups, it might be advisable to use a diagnostic test as part of the routine screening and preprocessing of the data, so as to avoid processing and refining inferior particles.
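A diagnostic of this general kind can be sketched as follows: tile the field, compute an azimuthally averaged power spectrum per tile, and compare a simple high-to-low frequency power ratio across tiles. The synthetic field (with one quadrant artificially low-pass filtered to mimic a locally stronger amplitude falloff), the tile size, and the ratio bands are illustrative assumptions, not the procedures of the cited studies.

```python
import numpy as np

# Sketch of a local power-spectrum screening diagnostic.
rng = np.random.default_rng(6)
field = rng.normal(size=(1024, 1024))            # stand-in for a micrograph
# simulate a locally stronger falloff by smoothing the top-left quadrant
field[:512, :512] = 0.5 * (field[:512, :512]
                           + np.roll(field[:512, :512], 1, axis=1))

def radial_power(tile):
    p = np.abs(np.fft.fft2(tile - tile.mean()))**2
    f = np.fft.fftfreq(tile.shape[0])
    r = np.sqrt(f[:, None]**2 + f[None, :]**2)
    bins = (r / r.max() * 63).astype(int)        # 64 radial bins
    return np.bincount(bins.ravel(), weights=p.ravel()) / np.bincount(bins.ravel())

t = 512
profiles = {(y, x): radial_power(field[y:y + t, x:x + t])
            for y in range(0, 1024, t) for x in range(0, 1024, t)}

# high-frequency/low-frequency power ratio as a crude falloff measure
ratio = {pos: pr[50:60].mean() / pr[5:15].mean() for pos, pr in profiles.items()}
```

Tiles whose ratio falls well below that of their neighbors would be flagged, and the particles they contain excluded from further processing.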

# 4. Special Imaging Techniques and Devices

## 4.1. Low-Dose Electron Microscopy

Concerns about radiation damage led to careful electron diffraction measurements at the beginning of the 1970s (Glaeser, 1971). The results were not encouraging: the electron diffraction patterns of 2D crystals formed by L-valine were found to disappear entirely after an exposure of 6 e/Å² (Glaeser, 1971). Crystals of the less sensitive adenosine ceased to diffract at an exposure of about 60 e/Å², still much lower than the then "normal" conditions for taking an exposure. To some extent, low temperature affords protection from radiation damage (Taylor and Glaeser, 1974), by trapping reaction products in situ, but the actual gain (five- to eight-fold for catalase and purple membrane crystals at (p.65) −120°C) turned out to be smaller than expected (Glaeser and Taylor, 1978; Hayward and Glaeser, 1979).

At a workshop in Gais, Switzerland (October 1973), the prospects for biological EM with a resolution better than 1/30 Å−1 were discussed, and radiation damage was identified as the primary concern (Beer et al., 1975). The use of averaging to circumvent this hurdle had been suggested earlier on by Glaeser and coworkers (1971; see also Kuo and Glaeser, 1975). Even earlier, a technique termed "minimum dose microscopy," invented by Williams and Fisher (1970), proved to preserve, at 50 e/Å², the thin tail fibers of the negatively stained T4 bacteriophage, which were previously invisible for doses usually exceeding 200 e/Å². In their pioneering work, Unwin and Henderson (1975) obtained a 1/7 Å−1 resolution map of glucose-embedded bacteriorhodopsin by the use of a very low dose, 0.5 e/Å², along with averaging over a large crystal field (10,000 unit cells).

In the current usage of the term, "low dose" refers to a dose equal to, or lower than, 10 e/Å². (Since "dose" actually refers to the radiation absorbed, "low exposure" is the more appropriate term to use.) Experience has shown that larger exposures lead to progressive disordering of the material and eventually to mass loss and bubbling. Susceptibility to radiation damage was found to vary widely among different materials and with different specimen preparation conditions. Changes in the structure of negatively stained catalase crystals were investigated by Unwin (1975). The results of his work indicate that rearrangement of the stain (uranyl acetate) occurs at exposures as low as 10 e/Å². The radiation damage studies by Kunath et al. (1984) on single uranyl acetate-stained glutamine synthetase (glutamate–ammonia ligase) molecules came to a similar conclusion,3 comparing single, electronically recorded frames at intervals of 1 e/Å².

Frozen-hydrated specimens are strongly susceptible to radiation damage. For such specimens, bubbles begin to appear at exposures in the region of 50 e/Å², as first reported by Lepault et al. (1983). A study by Conway et al. (1993) compared the reconstructions of a herpes simplex virus capsid (HSV-1) in a frozen-hydrated preparation obtained with total accumulated exposures of 6 and 30 e/Å². The authors reported that although the nominal resolution (as obtained with the Fourier ring correlation criterion; see chapter 3, section 5.2.4) may change little, there is a strong overall loss of power in the Fourier spectrum with the five-fold increase in dose. Still, the surface representation of the virus based on a 1/30 Å−1 resolution map shows only subtle changes. The article by Conway et al. (1993), incidentally, gives a good overview of the experimental findings accumulated in the 1980s on this subject. For another review of this topic, the reader is referred to the article by Zeitler (1990).

(p.66) Following the work of Unwin and Henderson (1975), several variations of the low-dose recording protocol have been described. In essence, the beam is always first shifted, or deflected, to an area adjacent to the selected specimen area for the purpose of focusing and astigmatism correction, with an intensity that is sufficient for observation. Afterward, the beam is shifted back. The selected area is exposed only once, for the purpose of recording, with the possible exception of an overall survey with an extremely low exposure (on the order of 0.01 e/Å²) at low magnification.

In deciding on how low the recording exposure should be chosen, several factors must be considered: (i) The fog level: when the exposure on the recording medium becomes too low, the density variations disappear in the background. Control of this critical problem is possible by a judicious choice of the electron-optical magnification (Unwin and Henderson, 1975). Indeed, magnifications in the range 40,000×–60,000× are now routinely used following the example of Unwin and Henderson (1975). (ii) The ability to align the images of two particles: the correlation peak due to “self-recognition” or “self-detection” (Frank, 1975a; Saxton and Frank, 1977; see section 3.4 in chapter 3) of the motif buried in both images must stand out from the noisy background, and this stipulation leads to a minimum dose for a given particle size and resolution (Saxton and Frank, 1977). (iii) The statistical requirements: for a given resolution, the minimum number of particles to be averaged, in two or three dimensions, is tied to the recording dose (Unwin and Henderson, 1975; Henderson, 1995). In planning the experiment, we wish to steer away from a dose that leads to unrealistic numbers of particles. However, the judgment on what is realistic is in fact changing rapidly as computers become faster and more powerful and as methods for automated recording and particle selection are being developed (see chapter 3, section 2.4).

Since the lowering of the exposure comes at the price of a lowered signal-to-noise ratio, which translates into lower alignment accuracy and, in turn, impaired resolution, the radiation protection provided by low temperatures can be seen as a bonus that allows the electron dose to be raised. Even an extra factor of 2, estimated as the benefit of switching from liquid-nitrogen to liquid-helium temperature (Chiu et al., 1986), would offer a significant benefit (see also section 5.4 in chapter 3).

## 4.2. Spot Scanning

Recognizing that beam-induced movements of the specimen are responsible for a substantial loss in resolution, Henderson and Glaeser (1985) proposed a novel mode of imaging in the transmission electron microscope whereby only one single small area, in the size range of 1000 Å, is illuminated at a time. This spot is moved over the specimen field on a regular (square or hexagonal) grid. The rationale of this so-called spot scanning technique is that it allows the beam-induced movement to be kept small since the ratio between energy-absorbing area and supporting perimeter is much reduced. After the successful demonstration of this technique with the radiation-sensitive materials vermiculite and paraffin (Downing and Glaeser, 1986; Bullough and Henderson, 1987; Downing, 1991), (p.67) numerous structural studies have made use of it (e.g., Kühlbrandt and Downing, 1989; Soejima et al., 1993; Zhou et al., 1994).

Whether spot scanning leads to improvements for single particles as well is still unknown. Clearly, the yield of structural information from crystals is expected to be much more affected by beam-induced movements than that from single molecules. On the other hand, to date, single particles without symmetry have not been resolved to better than 6 Å, so the exact nature of the bottlenecks preventing the achievement of resolutions on the order of 3 Å remains uncertain.

Computer-controlled electron microscopes now contain spot scanning as a regular feature and also allow dynamic focus control (Zemlin, 1989b; Downing, 1992), so that the entire field of a tilted specimen can be kept at one desired defocus setting. The attainment of constant defocus across the image field is of obvious importance for the processing of images of tilted 2D crystals (Downing, 1992), but it also simplifies the processing of single macromolecules following the random-conical reconstruction protocol (chapter 5, sections 3.5 and 5; see Typke et al., 1990).

On the other hand, it may be desirable to retain the focus gradient in more sophisticated experiments, where restoration or heavy/light atom discrimination based on defocus variation are used (see section 3): the micrograph of a tilted specimen essentially produces a focus series of single particles. At a typical magnification of 50,000× and a tilt angle of 60°, the defocus varies by about 17,000 Å (or 1.7 μm) across the field captured by the micrograph (assumed to be 50 mm in width), in the direction perpendicular to the tilt axis.
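The arithmetic behind these numbers can be checked with a short sketch (the 50 mm micrograph width, 50,000× magnification, and 60° tilt are the values quoted above; the function name is of course only illustrative):

```python
import math

def defocus_spread(mag, tilt_deg, film_width_mm):
    """Defocus variation (in Angstrom) across a tilted specimen,
    measured in the direction perpendicular to the tilt axis."""
    # Width of the specimen area imaged onto the micrograph:
    specimen_width_A = film_width_mm * 1e7 / mag   # 1 mm = 1e7 Angstrom
    # Height range along the optical axis introduced by the tilt:
    return specimen_width_A * math.tan(math.radians(tilt_deg))

dz = defocus_spread(mag=50_000, tilt_deg=60, film_width_mm=50)
print(f"{dz:.0f} A")   # about 17,000 A, i.e., roughly 1.7 um
```

At 50,000× the 50 mm field corresponds to 1 μm (10,000 Å) on the specimen, and tan 60° ≈ 1.73 stretches this into the quoted ~17,000 Å defocus range.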

## 4.3. Energy Filtration

The weak-phase object approximation introduced in section 3.2 enabled us to describe the image formation in terms of a convolution integral. In Fourier space, the corresponding relationship between the Fourier transforms of image contrast and object is very simple, allowing the object function to be recovered by computational methods described in section 3.9. However, this description of image formation is valid only for the bright field image formed by elastically scattered electrons. Inelastically scattered electrons produce another, very blurred image of the object that appears superimposed on the “elastic image”. The formation of this image follows more complicated rules (e.g., Reimer, 1997).

As a consequence, an attempt to correct the image for the effect of the contrast transfer function (and thereby retrieve the true object function) runs into problems, especially in the range of low spatial frequencies where the behavior of the inelastic components can be opposite to that expected for the phase contrast of elastic components (namely in the underfocus range, where Δz > 0). Thus, CTF correction based on the assumption that only elastic components are present will, in the attempt to undo the under-representation of these components, amplify the undesired inelastic component as well, leading to an incorrect, blurred estimate of the object. This problem is especially severe in the case of ice-embedded specimens, for which the cross-section for inelastic scattering exceeds that for elastic scattering.
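The low-frequency amplification can be illustrated with a minimal numerical sketch. The parameter values below (2 μm underfocus, 100 kV electrons, Cs = 2 mm, 7% amplitude contrast) are illustrative assumptions, not values from the text; the point is only that the gain 1/|CTF| of a naive correction by division grows toward low spatial frequencies, exactly where the inelastic contribution is strongest:

```python
import numpy as np

def ctf(s, defocus=20000.0, lam=0.037, cs=2e7, q=0.07):
    """Weak-phase bright-field CTF; s in 1/A, all lengths in A.
    Positive defocus = underfocus, matching the sign convention in
    the text.  Parameter values are illustrative assumptions."""
    chi = np.pi * lam * s**2 * (defocus - 0.5 * cs * lam**2 * s**2)
    return -(np.sqrt(1.0 - q**2) * np.sin(chi) + q * np.cos(chi))

# Naive CTF correction divides the image transform by the CTF, so its
# gain is 1/|CTF|.  Near s = 0 the elastic phase contrast vanishes and
# the gain is large; near the first CTF maximum it is close to 1.
for s in (0.002, 0.010, 0.026):
    print(f"s = {s:.3f} 1/A   gain = {1.0 / abs(ctf(s)):.1f}")
```

A Wiener-type filter bounds this gain, but the qualitative problem remains: the correction boosts most strongly the very frequency band in which the inelastic components violate the weak-phase model.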

(p.68) Another problem caused by inelastic scattering is that it produces a decrease in the SNR of the image to be retrieved (Schröder et al., 1990; Langmore and Smith, 1992). This decrease adversely affects the accuracy of all operations, to be described in the following chapters, that interrelate raw data in which noise is prevalent; for example, alignment (chapter 3, section 3), multivariate data analysis (chapter 4, section 2 and section 3), classification (chapter 4, section 4), and angular refinement (chapter 5, section 7).

The problems outlined above can be overcome by the use of energy-filtering electron microscopes (Schröder et al., 1990; Langmore and Smith, 1992; Smith and Langmore, 1992; Angert et al., 2000). In these specialized microscopes, the electron beam goes through an electron spectrometer at some stage after passing through the specimen. The spectrometer consists of a system of magnets that separate electrons spatially on a plane according to their energies. By placing a slit into this energy-dispersive plane, one can mask out all inelastically scattered electrons, allowing only those electrons to pass that have lost marginal amounts (0–15 eV) of energy (“zero-loss window”). In practice, the energy filter is either placed into the column in front of the projective lens (e.g., the Omega filter; Lanio, 1986) or added below the column as final electron optical element (post-column filter; Krivanek and Ahn, 1986; Gubbens and Krivanek, 1993). The performance of these different types of filters has been assessed by Langmore and Smith (1992) and Uhlemann and Rose (1994). Examples of structural studies on frozen-hydrated specimens in which energy filtering has been employed with success are found in the work on the structure of myosin fragment S1-decorated actin (∼25 Å; Schröder et al., 1993) and the ribosome (25 Å; Frank et al., 1995a, 1995b; Zhu et al., 1997).

Many subsequent studies have demonstrated, however, that both CTF correction and the achievement of higher resolution are possible with unfiltered images, as well. In fact, at increasing resolutions, the influence of the “misbehaved” low spatial frequency terms (i.e., contributions to the image that do not follow CTF theory) is diminished relative to the total structural information transferred. Although there has been steadily increasing interest in energy-filtered instruments, their most promising application appears to be in electron tomography of relatively thick (> 0.1 μm) specimens (see Koster et al., 1997; Grimm et al., 1998). In fact, specimens beyond a certain thickness cannot be studied without the aid of an energy filter.

## 4.4. Direct Image Readout and Automated Data Collection

Newer electron microscopes are equipped with CCD cameras enabling direct image readout. Fully automated data collection schemes have been designed that have already revolutionized electron tomography, and promise to revolutionize single-particle reconstruction, as well. This is not likely to happen, however, before CCD arrays of sufficient size (8 k × 8 k) and readout speed become widely available and affordable. The algorithmic implementation of low-dose data collection from single-particle specimens with specific features (as defined by shape, texture, average density, etc.) is quite complex, as it must incorporate the many contingencies of a decision logic that is unique to each specific specimen (p.69) class. These schemes have been optimized for viruses, helical fibers, and globular particles such as ribosomes (Potter et al., 1999; Carragher et al., 2000; Zhang et al., 2001; Lei and Frank, 2005). A systematic search for suitable areas is facilitated by commercially available perforated grids, such as Quantifoil (see section 2.8). Recently, the first entirely automated single-particle reconstruction of a molecule (keyhole limpet hemocyanin) was performed as proof of the concept (Mouche et al., 2004). (p.70)

## Notes:

(1) This section is not meant as a substitute for a systematic introduction, but serves mainly to provide a general orientation, and to introduce basic terms that will be used in the text.

(2) Strictly speaking, the image obtained in the TEM is derived from the projected Coulomb potential distribution of the object, whereas the diffraction intensities obtained by X-ray diffraction techniques are derived from the electron density distribution of the object.

(3) The spot scanning technique was already used by Kunath et al. (1984) in experiments designed to study the radiation sensitivity of macromolecules. It was simply a rational way of organizing the collection of a radiation damage series (which the authors called “movie”) from the same specimen field. In hindsight, the extraordinary stability of the specimen (0.05 Å/s) must be attributed to the unintended stabilizing effect of limiting the beam to a 1-μm spot.