The Physical World: An Inspirational Tour of Fundamental Physics

Nicholas Manton and Nicholas Mee

Print publication date: 2017

Print ISBN-13: 9780198795933

Published to Oxford Scholarship Online: July 2017

DOI: 10.1093/oso/9780198795933.001.0001


PRINTED FROM OXFORD SCHOLARSHIP ONLINE (www.oxfordscholarship.com). (c) Copyright Oxford University Press, 2017. All Rights Reserved. Under the terms of the licence agreement, an individual user may print out a PDF of a single chapter of a monograph in OSO for personal use (for details see http://www.oxfordscholarship.com/page/privacy-policy). Date: 23 October 2017

General Relativity


6 General Relativity

Oxford University Press

Abstract and Keywords

This chapter presents the physical motivation for general relativity, derives the Einstein field equation and gives concise derivations of the main results of the theory. It begins with the equivalence principle, tidal forces in Newtonian gravity and their connection to curved spacetime geometry. This leads to a derivation of the field equation. Tests of general relativity are considered: Mercury’s perihelion advance, gravitational redshift, the deflection of starlight and gravitational lenses. The exterior and interior Schwarzschild solutions are discussed. Eddington–Finkelstein coordinates are used to describe objects falling into non-rotating black holes. The Kerr metric is used to describe rotating black holes and their astrophysical consequences. Gravitational waves are described and used to explain the orbital decay of binary neutron stars. Their recent detection by LIGO and the beginning of a new era of gravitational wave astronomy is discussed. Finally, the gravitational field equations are derived from the Einstein–Hilbert action.

Keywords:   Einstein equation, curved spacetime, Schwarzschild solution, black hole, LIGO, gravitational waves, binary neutron star, gravitational lens, Einstein–Hilbert action

6.1 The Equivalence Principle

Following the development of his special theory of relativity, it was clear to Einstein that a new theory of gravity was necessary to complete his revolution. In Newton’s theory, gravity appears to act instantaneously at a distance, whereas the cornerstone of special relativity is that the maximum speed of any interaction is the speed of light. In the years following the publication of special relativity, various attempts were made to incorporate a finite speed of interaction into a theory of gravity, but these early ideas proved to be too simplistic.

Gravity is special among forces, because it acts in the same way on all massive bodies. This observation dates back to Galileo’s experiments rolling balls down inclined slopes. Galileo conclusively demonstrated that balls with different masses, released together and allowed to fall under gravity, hit the ground at the same instant if no other forces are acting. This observation is explained in Newtonian physics by the cancellation of mass between Newton’s second law (2.3) and Newton’s law of gravitation (2.80). In Newton’s second law, acceleration equals the applied force divided by the mass of the object being accelerated. In this role, the mass is known as inertial mass, as its effect is to resist the change in motion of the body. Remarkably, the gravitational force on an object is also proportional to its mass. In this role, the mass is known as gravitational mass. The ratio of inertial mass to gravitational mass could in principle be different for different bodies made of different materials, but experimentally, the ratio is always 1, so we can regard inertial mass as the same as gravitational mass. Motion in a gravitational field is then independent of mass. For example, near the Earth’s surface, the equation of motion for a freely falling body is
$$\frac{d^2 z}{dt^2} = -g, \tag{6.1}$$


independent of mass. The cancellation of mass is unique to gravity, as the strength of other forces is unrelated to the mass of the body on which they act. The electrostatic force, for instance, is proportional to the electric charge of the body, not its mass.

In Newton’s theory, the equality of inertial and gravitational mass seems almost accidental, but in 1907 Einstein realized that this feature of gravitation might be the perfect foundation for a new relativistic theory. He raised his insight to the status of a new principle of physics and named it the equivalence principle: the equivalence of gravitational and inertial mass. In special relativity, as in Newtonian mechanics, there is no way to determine one’s absolute velocity. Similarly, according to the equivalence principle, when in free fall, there is no way to determine one’s absolute acceleration, as all nearby bodies fall with the same acceleration. Einstein postulated that for consistency with the rest of physics, this principle must extend to all the laws of physics and not just mechanics. He believed that locally it must be impossible to distinguish between free fall in the presence of gravitating bodies, and being in a state of rest in the absence of gravitating bodies.

Einstein illustrated this with a thought experiment. Imagine being in a lift whose cable has snapped. As the lift falls, the occupants feel weightless, just as though gravity did not exist. The reason is that the lift and everything in it, including every part of the occupants’ bodies, fall with the same downward acceleration g. We can always find a coordinate system for a falling body, such that there is no instantaneous acceleration. In a uniform gravitational field, the appropriate coordinate change is from (z,t) to (y,t) where
$$y = z + \tfrac{1}{2} g t^2. \tag{6.2}$$


Then $\frac{d^2 y}{dt^2} = \frac{d^2 z}{dt^2} + g$, so equation (6.1) is transformed into the equation of motion

$$\frac{d^2 y}{dt^2} = 0. \tag{6.3}$$


The coordinate change has eliminated the effect of gravity. This makes the force of gravity reminiscent of the fictitious forces that we considered in section 5.7, which arise from the choice of coordinates. Equation (6.3) has solutions representing motion with any constant velocity. Therefore the relative motion of freely falling bodies has constant velocity, which is exactly the same as for bodies moving freely in the absence of gravity.
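As a rough numerical check of this coordinate change, the sketch below assumes the uniform-field equation of motion $d^2z/dt^2 = -g$ and the falling coordinate $y = z + \tfrac{1}{2} g t^2$. The function names, the value of `G_ACC` and the initial conditions are illustrative choices, not taken from the text.

```python
# Sketch: one freely falling body described in the ground frame (z)
# and in the freely falling coordinate y = z + g*t**2/2.
G_ACC = 9.81  # m/s^2, assumed strength of the uniform field


def z_position(z0, v0, t, g=G_ACC):
    """Height in the ground frame for a body obeying d2z/dt2 = -g."""
    return z0 + v0 * t - 0.5 * g * t**2


def y_position(z0, v0, t, g=G_ACC):
    """The same body in the falling coordinate y = z + g*t**2/2."""
    return z_position(z0, v0, t, g) + 0.5 * g * t**2


def second_difference(f, t, dt=1e-3):
    """Central finite-difference estimate of the second derivative of f at t."""
    return (f(t + dt) - 2.0 * f(t) + f(t - dt)) / dt**2


# Illustrative initial height (10 m) and upward speed (2 m/s).
acc_z = second_difference(lambda t: z_position(10.0, 2.0, t), 1.0)
acc_y = second_difference(lambda t: y_position(10.0, 2.0, t), 1.0)
print(acc_z, acc_y)
```

Under these assumptions `acc_z` comes out close to `-G_ACC`, while `acc_y` vanishes to within rounding error, which is the statement that the body moves with constant velocity in the falling coordinate.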

We do not feel the force of gravity. We are only aware of it when other forces are acting, such as on the surface of the Earth, where our free fall is stopped by the rigidity of the ground, and our natural frame of reference is non-inertial. In developing general relativity, Einstein would discover a way to model gravity without a gravitational force at all.

6.2 The Newtonian Gravitational Field and Tidal Forces

To understand general relativity, it is useful to first reformulate Newtonian gravity as a field theory. This works very well for massive bodies moving slowly compared to the speed of light, and many of the details are actually very similar to electrostatics. It is important to realize that, in the neighbourhood of the Earth, gravity is weak. For example, a freely falling satellite takes about 90 minutes to orbit the Earth, but a light ray would travel the same distance in about 0.1 seconds, so the satellite motion is comparatively slow.
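The comparison between the satellite and the light ray can be reproduced with a few lines of arithmetic. This is a minimal sketch: it approximates a low orbit by the Earth's circumference, and the constant names and rounded values are assumptions made for illustration.

```python
import math

EARTH_RADIUS_M = 6.371e6        # mean radius of the Earth in metres
SPEED_OF_LIGHT_M_S = 2.998e8    # speed of light in metres per second
ORBIT_PERIOD_S = 90 * 60        # the 90 minute orbit quoted in the text

# Assumption: a low orbit has roughly the length of the Earth's circumference.
orbit_length_m = 2.0 * math.pi * EARTH_RADIUS_M

light_time_s = orbit_length_m / SPEED_OF_LIGHT_M_S
satellite_speed_m_s = orbit_length_m / ORBIT_PERIOD_S
speed_ratio = satellite_speed_m_s / SPEED_OF_LIGHT_M_S

print(f"light travel time: {light_time_s:.3f} s")
print(f"satellite speed / c: {speed_ratio:.2e}")
```

The light travel time comes out at a little over a tenth of a second, and the satellite moves at a few parts in $10^5$ of the speed of light, which is the sense in which the motion is slow.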

The Newtonian gravitational force exerted by one body on another is described by an inverse square law force (2.80). It is proportional to the product of the masses of the bodies and inversely proportional to the square of their separation. This is similar to the Coulomb force between two charges (3.86), which is proportional to the product of the charges, and inversely proportional to the square of their separation. Like an electrostatic force, the gravitational force on a body can be interpreted as due to a gravitational field produced by all the other massive bodies.

We saw in section 3.6 that any static distribution of electric charge produces an electric field that is minus the gradient of a potential, and the most significant property of this potential is that away from the charge sources, it satisfies Laplace’s equation. Newtonian gravity is very similar. The gravitational field is minus the gradient of a potential ϕ(x).

The potential due to a point mass $M$ at the origin is

$$\phi(\mathbf{x}) = -\frac{GM}{r}, \tag{6.4}$$


where $G$ is Newton’s universal gravitational constant, and $r = |\mathbf{x}|$ is the distance from the mass.¹ Minus the gradient of this potential is the inverse square law force on a unit mass. Furthermore, the potential satisfies Laplace’s equation, $\nabla^2 \phi = 0$, except at the origin.

More generally, the gravitational potential ϕ(x) produced by a matter distribution of density ρ(x) satisfies Poisson’s equation,
$$\nabla^2 \phi = 4\pi G \rho. \tag{6.5}$$


For sources that are extended bodies or collections of point masses, ϕ(x) is not generally spherically symmetric. The gravitational force on a test body of mass m at x, due to all the other bodies, is
$$\mathbf{F} = -m \nabla \phi, \tag{6.6}$$


with $\phi$ evaluated at $\mathbf{x}$. The acceleration of the body is therefore

$$\mathbf{a} = -\nabla \phi. \tag{6.7}$$


If the sources of the gravitational potential are located in some finite region, then the total potential that they produce becomes uniform at distances that are large compared to their separation. This is usually modelled by imposing the boundary condition $\phi(\mathbf{x}) \to 0$ as $|\mathbf{x}| \to \infty$.

Even in the absence of a test body, the vector field $-\nabla\phi$ may be identified as a physical gravitational field, permeating space. It is often easier to work with the potential $\phi$ than with the bodies and their component parts that are its sources. For example, to characterize the gravitational field outside the Earth, one need only consider the general solution of Laplace’s equation that approaches zero as $|\mathbf{x}| \to \infty$. This solution is an infinite sum of terms that approach zero with increasing inverse powers of the distance from the centre of the Earth. Their coefficients can be determined by observing the motion of orbiting satellites. As the Earth is spherical to a good approximation, the potential $\phi$ is dominated by the spherically symmetric term $-\frac{GM}{r}$; corrections depend on the deviation of the Earth’s shape from spherical, and on the asymmetrical distribution of the Earth’s mass. Accurate knowledge of the potential is vital for satellite navigation and GPS systems, and tells us something about the internal structure of the Earth.

Around any point, the potential ϕ(x) has a local expansion that determines the gravitational field nearby, and the first two or three terms in the expansion are sufficient to model its main effects. Suppose P is a point just above the Earth’s surface, chosen to be the origin of Cartesian coordinates (x,y,z). Near P,
$$\phi(x, y, z) = \phi_0 + g z + \tfrac{1}{2} h \left(x^2 + y^2 - 2z^2\right), \tag{6.8}$$


with $g$ and $h$ positive constants. (Note that $x^2 + y^2 - 2z^2$ satisfies Laplace’s equation, but $x^2$, $y^2$ and $z^2$ individually do not.) The constant $\phi_0$ does not contribute to the gravitational field. The second term describes the familiar field just above the Earth’s surface, where the potential is proportional to height $z$, and its gradient is the vector $(0, 0, g)$, producing a downward acceleration of magnitude $g$. However, gravity is not perfectly uniform. Objects that are spatially separated do not feel the same force and will have a relative acceleration. The third term in the potential (6.8) captures this. Its gradient is $(hx, hy, -2hz)$, so the total acceleration is $\mathbf{a} = (-hx, -hy, -g + 2hz)$. The downward gravitational acceleration, $g - 2hz$, is reduced above P and increased below P, and there is a sideways acceleration of magnitude $h\sqrt{x^2 + y^2}$ towards the $z$-axis. This correctly describes the relative motion of two or more bodies falling towards the Earth’s centre, as shown in Figure 6.1.


Fig. 6.1 In addition to the downward acceleration on two bodies falling towards the Earth, there is also a relative sideways acceleration.
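The local expansion can be compared with an exact potential in a simple case. The sketch below assumes a perfectly spherical Earth with point-mass potential $-GM/r$, for which expanding about a surface point gives $g = GM/R^2$ and $h = GM/R^3$; these identifications, the constant values and the function names are assumptions of the example rather than statements from the text.

```python
import math

G_NEWTON = 6.674e-11    # m^3 kg^-1 s^-2
EARTH_MASS = 5.972e24   # kg
EARTH_RADIUS = 6.371e6  # m, Earth treated as an exact sphere (assumption)

GM = G_NEWTON * EARTH_MASS
g_coeff = GM / EARTH_RADIUS**2  # coefficient of the linear term
h_coeff = GM / EARTH_RADIUS**3  # coefficient of the quadratic term
phi_0 = -GM / EARTH_RADIUS      # potential at the expansion point P


def phi_exact(x, y, z):
    """Point-mass potential, with P on the z-axis at distance EARTH_RADIUS."""
    return -GM / math.sqrt(x**2 + y**2 + (EARTH_RADIUS + z) ** 2)


def phi_local(x, y, z):
    """Expansion phi_0 + g z + h (x^2 + y^2 - 2 z^2) / 2 about P."""
    return phi_0 + g_coeff * z + 0.5 * h_coeff * (x**2 + y**2 - 2.0 * z**2)


def acceleration(x, y, z):
    """Acceleration a = -grad(phi_local) near P."""
    return (-h_coeff * x, -h_coeff * y, -g_coeff + 2.0 * h_coeff * z)


# Compare at a point 1 km to the side of P and 1 km above it.
sample = (1000.0, 0.0, 1000.0)
error_linear = abs(phi_exact(*sample) - (phi_0 + g_coeff * sample[2]))
error_quadratic = abs(phi_exact(*sample) - phi_local(*sample))
print(g_coeff, h_coeff, error_linear, error_quadratic)
```

With these numbers the quadratic term removes almost all of the error left by the linear approximation, and `acceleration` shows the weaker downward pull above P and the sideways pull towards the $z$-axis.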

The linear term in the expansion (6.8) determines the local, approximately uniform gravitational field, whereas the quadratic terms determine its spatial variation. Although the effects of the linear term can always be removed by a change of coordinates, as in equation (6.3), in general the effects of the quadratic terms cannot. These quadratic terms give rise to tidal effects. The tides produced by the Moon in the vicinity of the Earth are the defining example and are illustrated in Figure 6.2. The additional acceleration of a body on the side of the Earth facing the Moon, the near side, is compared to the additional acceleration on the far side. The difference is known as a tidal acceleration, and was first invoked by Newton to explain the tides. On the near side, the oceans flow because the pull of the Moon on them is greater than the average pull on the bulk of the Earth, and on the far side, the oceans flow because they are pulled less than the bulk of the Earth. The relative acceleration is away from the Earth’s centre. In addition to these effects along the Earth–Moon axis, there are sideways tidal forces in the directions perpendicular to the Earth–Moon axis, as shown in Figure 6.2. The solid Earth is distorted by the pull of the Moon too, but not enough for us to notice it.


Fig. 6.2 The tidal stretching and squeezing of the Earth in the gravitational field of the Moon (greatly exaggerated).

The gravitational field due to the Moon is proportional to $\frac{GM}{r^2}$, where $r$ is the distance from the Moon and $M$ is the Moon’s mass. The diameter of the Earth is small compared to the Earth–Moon separation, so the difference between accelerations on opposite sides of the Earth is proportional to the derivative of $\frac{GM}{r^2}$. The tidal effects are therefore of magnitude $\frac{GM}{R^3}$ per unit distance across the Earth, where $R$ is the Earth–Moon separation.
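To put a number on the lunar tide, the following sketch compares the exact difference in the Moon's pull between the Earth's centre and its near and far sides with the derivative estimate $2GMa/R^3$, where $a$ is the Earth's radius. The rounded constants and variable names are assumptions chosen for illustration.

```python
G_NEWTON = 6.674e-11           # m^3 kg^-1 s^-2
MOON_MASS = 7.342e22           # kg
EARTH_MOON_DISTANCE = 3.844e8  # m, mean centre-to-centre separation
EARTH_RADIUS = 6.371e6         # m


def moon_field(r):
    """Magnitude of the Moon's gravitational field at distance r from it."""
    return G_NEWTON * MOON_MASS / r**2


field_centre = moon_field(EARTH_MOON_DISTANCE)
tidal_near = moon_field(EARTH_MOON_DISTANCE - EARTH_RADIUS) - field_centre
tidal_far = field_centre - moon_field(EARTH_MOON_DISTANCE + EARTH_RADIUS)

# Leading-order estimate from the derivative of GM/r^2.
tidal_estimate = 2.0 * G_NEWTON * MOON_MASS * EARTH_RADIUS / EARTH_MOON_DISTANCE**3

print(tidal_near, tidal_far, tidal_estimate)
```

All three values are close to $10^{-6}\ \mathrm{m\,s^{-2}}$, roughly a ten-millionth of the acceleration due to the Earth's own gravity at its surface, and the near-side and far-side values bracket the derivative estimate.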

Einstein realized that, due to tidal forces, the trajectories of two test particles initially travelling freely along parallel Euclidean lines through a gravitational field do not generally remain parallel. This is very similar to the geodesic deviation of particle trajectories in a curved space that was described in section 5.7.2. So Einstein made the astonishing proposal that gravity could be described in terms of curved spacetime. In this picture, massive freely falling bodies follow geodesics through spacetime, and tidal accelerations arise from spacetime curvature.

As a prelude to discussing curved spacetime further, we will describe some of the geometry of flat Minkowski space.

6.3 Minkowski Space

When discussing special relativity in Chapter 4, we saw the advantages of sewing space and time together into the 4-dimensional spacetime known as Minkowski space. The squared infinitesimal interval between events at (t,x) and (t+dt,x+dx) in Minkowski space is
$$d\tau^2 = dt^2 - d\mathbf{x} \cdot d\mathbf{x}. \tag{6.9}$$


This is analogous to the squared infinitesimal distance $ds^2 = d\mathbf{x} \cdot d\mathbf{x}$ in Euclidean 3-space.² The squared interval $d\tau^2$ is Lorentz invariant, which means that it is the same for all inertial observers in uniform relative motion, even though they may have different notions of what the individual time and space coordinates are.

If $d\tau^2$ is positive, then its positive square root $d\tau$ is called the proper time separation of the events. Infinitesimal vectors $(dt, d\mathbf{x})$ for which $d\tau^2$ is positive are called time-like and those for which $d\tau^2$ is negative are space-like. Vectors for which $d\tau^2$ is zero are light-like and lie on a double cone, called the lightcone, as shown in Figure 6.3.


Fig. 6.3 The lightcone. Light rays travel on the lightcone. The trajectories of massive bodies must remain within the local lightcone throughout spacetime.
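The classification of displacement vectors by the sign of the squared interval can be written as a small helper. This is a sketch in units with $c = 1$ and the convention $d\tau^2 = dt^2 - d\mathbf{x}\cdot d\mathbf{x}$; the function name and the sample displacements are hypothetical.

```python
def classify_interval(dt, dx, dy, dz):
    """Classify a displacement by the sign of dt^2 - (dx^2 + dy^2 + dz^2)."""
    interval_squared = dt**2 - (dx**2 + dy**2 + dz**2)
    if interval_squared > 0:
        return "time-like"
    if interval_squared < 0:
        return "space-like"
    return "light-like"


print(classify_interval(2, 1, 0, 0))  # inside the lightcone
print(classify_interval(1, 1, 0, 0))  # on the lightcone
print(classify_interval(1, 2, 0, 0))  # outside the lightcone
```

Integer inputs are used so that the light-like case is detected exactly; with floating-point data a small tolerance around zero would be needed.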

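Consider a curved worldline $X(\lambda) = (t(\lambda), \mathbf{x}(\lambda))$, parametrized by $\lambda$, with fixed endpoints $X(\lambda_0)$ and $X(\lambda_1)$. The proper time along the worldline is

$$\tau = \int_{\lambda_0}^{\lambda_1} \sqrt{\left(\frac{dt}{d\lambda}\right)^2 - \frac{d\mathbf{x}}{d\lambda} \cdot \frac{d\mathbf{x}}{d\lambda}} \; d\lambda. \tag{6.10}$$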

If X(λ) is the path of a massive particle, the quantity under the square root symbol must be positive. The paths that maximize τ‎ are time-like geodesics, and they are straight lines in Minkowski space, representing a particle moving at constant velocity. There are also geodesics for which the integrand is zero, in which case τ‎ is also zero. Such geodesics are light-like and correspond to light rays. There are other paths for which the integrand in equation (6.10) is the square root of a negative quantity, in which case τ‎ is imaginary. Such paths are called space-like, and nothing physical can move along them.

The reason that $\tau$ is maximized rather than minimized along a particle geodesic is readily understood. In the particle’s rest frame, the worldline is a straight line parallel to the time axis, and for such a trajectory the infinitesimal proper time is $d\tau = dt$. Any deviation in the path will introduce negative, spatial contributions to $d\tau^2$, which reduce $\tau$. As $\tau$ is invariant under a Lorentz transformation, this result is true for all observers.
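A short numerical illustration of this maximization, under the assumption of one space dimension and units with $c = 1$: the proper time is summed over straight segments between events, first for a body at rest and then for a body that travels out at speed 0.6 and returns. The function name and the chosen events are hypothetical.

```python
import math


def proper_time(events):
    """Sum sqrt(dt^2 - dx^2) over straight segments joining (t, x) events."""
    total = 0.0
    for (t0, x0), (t1, x1) in zip(events, events[1:]):
        interval_squared = (t1 - t0) ** 2 - (x1 - x0) ** 2
        if interval_squared < 0:
            raise ValueError("segment is space-like; no particle can follow it")
        total += math.sqrt(interval_squared)
    return total


# Straight worldline: at rest from t = 0 to t = 10.
straight = proper_time([(0.0, 0.0), (10.0, 0.0)])
# Kinked worldline with the same endpoints: out at speed 0.6, then back.
kinked = proper_time([(0.0, 0.0), (5.0, 3.0), (10.0, 0.0)])
print(straight, kinked)
```

The straight worldline accumulates a proper time of 10, the kinked one only 8, in line with the claim that any deviation from the straight path reduces $\tau$.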

We will adopt a uniform notation $x^\mu = (x^0, x^1, x^2, x^3)$ for coordinates in 4-dimensional Minkowski space, where $x^0 = t$ is the time coordinate and $(x^1, x^2, x^3)$ are the space coordinates. These are mixed by Lorentz transformations. Generally, in what follows, Greek indices like $\mu$ and $\nu$ will range from 0 to 3. A 4-vector has a single Greek index, and a tensor has two or more of these indices. The squared infinitesimal interval (6.9) in Minkowski space can be expressed in terms of a metric tensor $\eta_{\mu\nu}$ as

$$d\tau^2 = \eta_{\mu\nu} \, dx^\mu dx^\nu. \tag{6.11}$$


This is known as the Minkowski spacetime metric. The metric tensor is diagonal, with components $\eta_{00} = 1$, $\eta_{11} = \eta_{22} = \eta_{33} = -1$, and the off-diagonal components all zero. It is sometimes convenient to write $\eta_{\mu\nu} = \mathrm{diag}(1, -1, -1, -1)$. The inverse metric tensor $\eta^{\mu\nu}$ has identical components. Minkowski space is the appropriate geometry for special relativistic physics, and like Euclidean space, it is flat.

6.4 Curved Spacetime Geometry

Einstein incorporated gravity by transforming spacetime into a dynamical part of the theory, and allowing it to be curved. This more general curved spacetime has four coordinates that we denote in a uniform way by $y^\mu$. A spacetime point with coordinates $y^\mu$ is often denoted by $y$. Locally, the geometrically meaningful quantity is the squared infinitesimal interval between a point with coordinates $y^\mu$ and a point with coordinates $y^\mu + dy^\mu$. This has the form

$$d\tau^2 = g_{\mu\nu}(y) \, dy^\mu dy^\nu, \tag{6.12}$$


where $g_{\mu\nu}(y)$ is a symmetric $4 \times 4$ matrix varying throughout spacetime, called the spacetime metric tensor. Under any coordinate transformation, the components of $g_{\mu\nu}$ change in such a way that $d\tau^2$ is unchanged.

In 3-dimensional Riemannian geometry, the metric tensor $g_{ij}$ is positive definite everywhere, meaning that by a suitable choice of coordinates it can be brought locally to the form $\delta_{ij}$, whose three diagonal entries are $+1$. In spacetime geometry, we require that by a suitable choice of coordinates, the metric tensor $g_{\mu\nu}$ can be brought locally to the Minkowski form $\eta_{\mu\nu} = \mathrm{diag}(1, -1, -1, -1)$. If this property holds, the metric is said to be Lorentzian. A Lorentzian metric tensor $g_{\mu\nu}$ has an inverse $g^{\mu\nu}$ at each point. This is simply the matrix inverse of $g_{\mu\nu}$, so $g^{\lambda\mu} g_{\mu\nu} = \delta^\lambda_\nu$, where the Kronecker delta symbol $\delta^\lambda_\nu$ has, as before, the value 1 if the indices are the same and 0 otherwise. At each point in spacetime there are infinitesimal time-like and space-like vectors $dy^\mu$, separated by a local lightcone of light-like vectors.

In Chapter 5 we derived the significant results of Riemannian geometry, such as the form of the Christoffel symbols and the Riemann curvature tensor, for a positive definite metric. However, we made no use of the positive definiteness, only that the metric was invertible, so all these results carry over to the Lorentzian metrics of general relativity. We will assume their validity from here on without further comment. The Christoffel symbols are
$$\Gamma^{\mu}_{\nu\sigma} = \tfrac{1}{2} \, g^{\mu\lambda} \left( g_{\lambda\nu,\sigma} + g_{\lambda\sigma,\nu} - g_{\nu\sigma,\lambda} \right), \tag{6.13}$$


and the Riemann curvature tensor is


where the indices run from 0 to 3. In 2-dimensional space, the curvature is completely determined by a single number at each point, the Gaussian curvature. There are many more curvature components at each point in 4-dimensional spacetime. The Riemann curvature tensor $R_{\alpha\lambda\gamma\delta}$ is antisymmetric in its first two indices, and in its last two indices. This gives $\frac{4 \times 3}{2} = 6$ independent combinations for each pair. It is also symmetric under the interchange of these two pairs of indices, which reduces the number of independent components to $\frac{6 \times 7}{2} = 21$. Finally, the components of the Riemann tensor obey the first Bianchi identity (5.59),

$$R_{\alpha\lambda\gamma\delta} + R_{\alpha\gamma\delta\lambda} + R_{\alpha\delta\lambda\gamma} = 0, \tag{6.15}$$


which reduces the total number of independent curvature components to 20.
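The counting argument can be repeated for any dimension. The sketch below follows the same three steps; the generalization to $n$ dimensions, in which the first Bianchi identity removes one component for each choice of four distinct indices, is a standard result assumed here rather than taken from the text, and the function name is hypothetical.

```python
from math import comb


def riemann_component_count(n):
    """Independent components of the Riemann tensor in n dimensions."""
    # Antisymmetry within each index pair: n(n-1)/2 combinations per pair.
    pairs = n * (n - 1) // 2
    # Symmetry under exchange of the two pairs: a symmetric pairs x pairs array.
    symmetric = pairs * (pairs + 1) // 2
    # First Bianchi identity: one constraint per set of four distinct indices.
    bianchi_constraints = comb(n, 4)
    return symmetric - bianchi_constraints


for n in (2, 3, 4):
    print(n, riemann_component_count(n))
```

For $n = 4$ the steps give 6, then 21, then 20, and for $n = 2$ the count is 1, matching the single Gaussian curvature.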

It is always possible to transform to local coordinates around a point, such that $g_{\mu\nu}$ has the standard Minkowskian form $\eta_{\mu\nu}$, and in addition the first derivatives of $g_{\mu\nu}$ are zero. Then the Christoffel symbols are zero at that point. Such coordinates are called inertial, or freely falling, and are analogous to normal coordinates in Riemannian geometry. The existence of inertial coordinates is the mathematical counterpart of the equivalence principle. Physically, the Christoffel symbols of general relativity are generalizations of the gravitational field of Newtonian gravity, so the fact that they vanish in an inertial frame ties in with Einstein’s observation that we do not feel gravity when in free fall. The first derivatives of the Christoffel symbols involve second derivatives of the metric tensor. In general, they do not vanish. Physically, this is to be expected, as it is these derivatives that determine the spacetime curvature of general relativity and the tidal accelerations in the Newtonian picture.

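Let us now consider a particle worldline in curved spacetime, a parametrized time-like path $y(\lambda)$. The integrated interval between fixed endpoints $y(\lambda_0)$ and $y(\lambda_1)$ is

$$\tau = \int_{\lambda_0}^{\lambda_1} \sqrt{g_{\mu\nu}(y(\lambda)) \frac{dy^\mu}{d\lambda} \frac{dy^\nu}{d\lambda}} \; d\lambda. \tag{6.16}$$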

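Notice that because of the square root, $d\lambda$ formally cancels, so $\tau$ is unchanged if the worldline is reparametrized. Maximizing $\tau$ gives the Euler–Lagrange equation³

$$\frac{d^2 y^\mu}{d\lambda^2} + \Gamma^\mu_{\nu\sigma} \frac{dy^\nu}{d\lambda} \frac{dy^\sigma}{d\lambda} = 0. \tag{6.17}$$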


As in Riemannian geometry, this is the geodesic equation, the analogue of (5.84), and λ‎ is no longer arbitrary but linked to the interval along the worldline.

Suppose a geodesic passes through a point P. Using inertial coordinates where $\Gamma^\mu_{\nu\sigma} = 0$, we see that $\frac{d^2 y^\mu}{d\lambda^2} = 0$ at P. Each coordinate $y^\mu$ of the worldline is therefore locally a linear function of $\lambda$, just as for the free motion of a particle in Minkowski space. This is the motion required by the equivalence principle, and implies that a freely falling particle must follow a geodesic through spacetime.

Along a geodesic, the quantity

$$\Xi = g_{\mu\nu}(y(\lambda)) \frac{dy^\mu}{d\lambda} \frac{dy^\nu}{d\lambda} \tag{6.18}$$


is independent of λ‎; in other words, it is conserved. This may be checked by differentiating equation (6.18) with respect to λ‎, and using the geodesic equation (6.17) together with the formula for the Christoffel symbols. Ξ is positive, zero or negative for time-like, light-like or space-like geodesics, respectively. If the geodesic is time-like we can rescale Ξ to be 1, and then the parameter λ‎ becomes the proper time τ‎ along the geodesic. Only time-like geodesics correspond to the trajectories of physical particles.

Along a light-like geodesic, τ‎ is zero, and λ‎ itself is a better parameter. A light-like geodesic is the path of a light ray in curved spacetime. It describes physical light propagation in the geometrical optics limit, where the wavelength is much less than any length scale associated with the curvature.

If the metric of spacetime has symmetries, for example, a rotational symmetry or symmetry under time translation, then geodesic motion has further conservation laws. A continuous symmetry is most simply realized if there is a choice of coordinates $y^\mu$ so that the metric tensor is independent of one of the coordinates, say $y^\alpha$. In that case one can show, again using equation (6.17), that

$$Q = g_{\alpha\nu}(y(\lambda)) \frac{dy^\nu}{d\lambda} \tag{6.19}$$


is conserved along any geodesic. Conservation of Ξ and Q will be useful later when we consider geodesic motion of particles and light in the spacetime surrounding a star or black hole, where there is time translation symmetry and some rotational symmetry.

6.4.1 Weak gravitational fields

According to the equivalence principle, we should be able to model the motion of a freely falling body in a Newtonian gravitational field as a time-like geodesic in a curved spacetime with a suitably defined metric. We know that in a weak gravitational field, Newtonian dynamics works extremely well for bodies moving much slower than the speed of light, so in such circumstances the corresponding metric must be close to Minkowskian. We will therefore use the usual coordinates of Minkowski space, $x^0 = t$ and $x^1, x^2, x^3$.

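If the Newtonian gravitational potential is $\phi(\mathbf{x})$, then the appropriate metric for modelling Newtonian gravity is

$$d\tau^2 = (1 + 2\phi(\mathbf{x})) \, dt^2 - d\mathbf{x} \cdot d\mathbf{x}. \tag{6.20}$$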


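We can neglect any time dependence of $\phi$, as the bodies producing the potential are moving slowly. The only component of the metric tensor that differs from the Minkowski case is $g_{tt} = 1 + 2\phi(\mathbf{x})$, and the difference is small, because in our units, $|\phi| \ll 1$. To verify that this metric has the appropriate geodesics, consider the interval $\tau$ along a worldline $X(t) = (t, \mathbf{x}(t))$ parametrized by $t$, where the velocity $\mathbf{v} = \frac{d\mathbf{x}}{dt}$ is small. The interval is

$$\tau = \int_{t_0}^{t_1} \sqrt{(1 + 2\phi(\mathbf{x}(t))) \left(\frac{dt}{dt}\right)^2 - \frac{d\mathbf{x}}{dt} \cdot \frac{d\mathbf{x}}{dt}} \; dt = \int_{t_0}^{t_1} \sqrt{1 + 2\phi(\mathbf{x}(t)) - \mathbf{v} \cdot \mathbf{v}} \; dt. \tag{6.21}$$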

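Since $\phi$ and $\mathbf{v}$ are small, we can approximate the square root, giving

$$\tau \approx \int_{t_0}^{t_1} \left( 1 + \phi(\mathbf{x}(t)) - \tfrac{1}{2} \mathbf{v} \cdot \mathbf{v} \right) dt. \tag{6.22}$$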


The integral of 1 is path independent and can be dropped.
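The quality of the square-root approximation $\sqrt{1 + 2\phi - \mathbf{v}\cdot\mathbf{v}} \approx 1 + \phi - \tfrac{1}{2}\mathbf{v}\cdot\mathbf{v}$ can be checked numerically. In this sketch the function names are hypothetical and the sample values are assumptions: the first pair is of the order found near the Earth's surface in units with $c = 1$, the second is a deliberately stronger field used to make the error visible.

```python
import math


def exact_integrand(phi, speed):
    """Integrand sqrt(1 + 2*phi - v^2) of the interval, in units with c = 1."""
    return math.sqrt(1.0 + 2.0 * phi - speed**2)


def approximate_integrand(phi, speed):
    """First-order expansion 1 + phi - v^2 / 2."""
    return 1.0 + phi - 0.5 * speed**2


# Roughly Earth-like values: |phi| ~ 7e-10 and v ~ 2.5e-5.
weak_error = abs(exact_integrand(-7e-10, 2.5e-5) - approximate_integrand(-7e-10, 2.5e-5))
# Exaggerated values, chosen only to expose the neglected second-order term.
strong_error = abs(exact_integrand(-0.01, 0.1) - approximate_integrand(-0.01, 0.1))
print(weak_error, strong_error)
```

For the Earth-like values the two expressions agree to within floating-point rounding, while for the exaggerated values the difference is about $10^{-4}$, of the order of the square of the small quantities that were dropped.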

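$\tau$ is the quantity we should maximize to find particle geodesics, but if we multiply by $-m$, where $m$ is a particle mass, then equivalently we can minimize

$$S = \int_{t_0}^{t_1} \left( \tfrac{1}{2} m \, \mathbf{v} \cdot \mathbf{v} - m \phi(\mathbf{x}(t)) \right) dt. \tag{6.23}$$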


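$S$ is the action (2.53) for a non-relativistic particle of mass $m$, with kinetic energy $\tfrac{1}{2} m \, \mathbf{v} \cdot \mathbf{v}$ and potential energy $m\phi$. As we saw in section 2.3, the equation of motion derived by minimizing $S$ is

$$\frac{d^2 \mathbf{x}}{dt^2} = -\nabla \phi, \tag{6.24}$$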


which is the defining equation of Newtonian gravity. This shows that in the low velocity limit, a time-like geodesic in the curved spacetime with metric (6.20) reproduces the motion expected in Newtonian gravity.

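We can explicitly check the low velocity limit of the geodesic equation (6.17). In this limit, $\tau \approx t$ so the derivatives with respect to $\tau$ can be replaced by derivatives with respect to $t$. The dominant Christoffel symbol is

$$\Gamma^i_{tt} = \tfrac{1}{2} g^{i\nu} \left( 2 g_{\nu t,t} - g_{tt,\nu} \right) = -\tfrac{1}{2} g^{ii} g_{tt,i}, \tag{6.25}$$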


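where $g^{ii} = -1$, $g_{tt} = 1 + 2\phi$ and $g_{tt,i} = 2 \frac{\partial \phi}{\partial x^i}$, so $\Gamma^i_{tt} = \frac{\partial \phi}{\partial x^i}$. (There is no summation over $i$ in the last expression in (6.25).) The space components of (6.17) are therefore

$$\frac{d^2 x^i}{dt^2} + \frac{\partial \phi}{\partial x^i} = 0, \tag{6.26}$$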


again agreeing with the Newtonian equation of motion.
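The identification $\Gamma^i_{tt} = \partial\phi/\partial x^i$ can also be checked numerically. The sketch below evaluates the Christoffel symbols of the diagonal metric $\mathrm{diag}(1 + 2\phi, -1, -1, -1)$ by central finite differences. The toy potential $-m/r$ with $m = 0.01$ (units $G = c = 1$), the sample point, the step size and all function names are assumptions of the example.

```python
import math


def phi(x, y, z, mass=0.01):
    """Toy potential -mass/r of a hypothetical point source at the origin."""
    return -mass / math.sqrt(x * x + y * y + z * z)


def metric_component(a, b, point):
    """g_ab of diag(1 + 2*phi, -1, -1, -1) at point = (t, x, y, z)."""
    if a != b:
        return 0.0
    _, x, y, z = point
    return 1.0 + 2.0 * phi(x, y, z) if a == 0 else -1.0


def metric_derivative(a, b, c, point, step=1e-4):
    """Central-difference estimate of the derivative of g_ab along coordinate c."""
    forward, backward = list(point), list(point)
    forward[c] += step
    backward[c] -= step
    difference = metric_component(a, b, forward) - metric_component(a, b, backward)
    return difference / (2.0 * step)


def christoffel(mu, nu, sigma, point):
    """Gamma^mu_{nu sigma} for a diagonal metric, whose inverse is 1/g_{mu mu}."""
    inverse = 1.0 / metric_component(mu, mu, point)
    return 0.5 * inverse * (
        metric_derivative(mu, nu, sigma, point)
        + metric_derivative(mu, sigma, nu, point)
        - metric_derivative(nu, sigma, mu, point)
    )


point = (0.0, 3.0, 4.0, 12.0)            # spatial distance r = 13 from the source
gamma_z_tt = christoffel(3, 0, 0, point)  # index 0 is t, index 3 is z
dphi_dz = 0.01 * 12.0 / 13.0**3           # analytic derivative of -m/r along z
print(gamma_z_tt, dphi_dz)
```

The two printed numbers agree to many decimal places, and `christoffel(0, 0, 0, point)` vanishes because the toy potential is static.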

We will see later that the metric (6.20) does not satisfy the Einstein field equation exactly and a further term involving the Newtonian potential $\phi$ appears in the spatial part of the metric tensor, but this produces a negligible correction to the equation of motion for a slowly moving particle. It is perhaps a little surprising that the most important effect of the Newtonian potential is to distort the term $g_{tt}$ in the spacetime metric tensor. One might have guessed that gravity would curve space. However, a distortion of time is consistent with our earlier finding that free fall in a constant gravitational field appears inertial after the coordinate change (6.2) involving time.

6.5 The Gravitational Field Equation

If we accept the idea that spacetime is curved, then, as we have seen, we can expect massive bodies and light to travel along geodesics—but how does spacetime curvature arise in the first place? What is the gravitational field equation that determines the relationship between matter and spacetime curvature? Einstein assumed that the field equation must satisfy three guiding principles:

  1) it must be generally covariant,

  2) it must be consistent with the equivalence principle,

  3) it must reduce to the equation for the Newtonian gravitational potential, for matter of low density and low velocity.

Principle 1) means that the field equation must be a tensor equation, taking the same form in any coordinate system. Principle 2) had been the idea that initially got the ball rolling. It suggested to Einstein that gravity could be treated as spacetime curvature, because gravity affects all bodies in the same way. Moreover, the equivalence principle implies that even in a gravitational field, physics is indistinguishable in a locally inertial frame from the physics of special relativity. In other words, spacetime is locally Minkowskian. Principle 2) combined with Principle 1) implies that one side of the field equation must be composed of some form of curvature tensor. Principle 3) provides the constant of proportionality relating the mass density to the curvature and leads to a vital check that the field equation is consistent with well established Newtonian physics.

As mentioned in section 6.2, in the presence of a mass density $\rho$, the Newtonian potential $\phi$ obeys Poisson’s equation $\nabla^2 \phi = 4\pi G \rho$. The task faced by Einstein was to find the relativistic counterpart of Poisson’s equation. This should be a covariant equation relating a tensor describing the curvature of spacetime to a tensor describing the distribution of matter, and it should reduce to Poisson’s equation for low mass densities and matter speeds much less than the speed of light.

6.5.1 The energy–momentum tensor

The gravitational source producing the curvature must be a density, like the mass density appearing on the right-hand side of Poisson’s equation—but in a relativistic theory, mass in one frame contributes to energy and momentum in another frame, so energy, mass and momentum must all contribute as sources of gravitational curvature.

Energy is the time component of a 4-vector. Under a Lorentz boost it is multiplied by a gamma factor $\gamma = (1 - \mathbf{v} \cdot \mathbf{v})^{-1/2}$. The energy density (the energy per unit volume) acquires a second factor of $\gamma$, because a volume element contracts by $\gamma$ in the direction of the boost. Therefore the energy density transforms as the 00 component of a 2-index tensor. This tensor is known as the energy–momentum tensor or stress–energy tensor and is denoted by $T^{\mu\nu}$. Although this argument is based on physics in Minkowski space, it also applies in curved spacetime, as the equivalence principle implies that spacetime is always locally Minkowskian.

For pure matter, the density in its rest frame is denoted by $\rho$ and is (by definition) Lorentz invariant. In this frame, $T^{00} = \rho$ is the dominant contribution to $T^{\mu\nu}$. If the matter is moving (and this just depends on the coordinate system that has been chosen) then there is an expression for $T^{\mu\nu}$ that depends on the density $\rho$ and the matter’s local 4-velocity $v^\mu$. The components $T^{i0}$ ($i = 1, 2, 3$) give the density of the momentum in the $i$th direction. $T^{ij}$ is the current, or flux, of the $i$th component of the momentum density in the $j$th direction. It has a contribution from the net flow of the matter, and from the random motion of matter particles colliding at the microscopic level, which generates a pressure.

Astrophysicists refer to an idealized fluid of non-interacting free particles, with negligible relative motion between them, and hence negligible pressure, as a dust. The energy–momentum tensor of a dust takes the simple form
$$T^{\mu\nu} = \rho \, v^\mu v^\nu, \tag{6.27}$$


where $v^\mu$ is the local 4-velocity of the dust. For a more general perfect fluid the energy–momentum tensor includes a pressure term, and has the form

$$T^{\mu\nu} = (\rho + P) \, v^\mu v^\nu - P g^{\mu\nu}, \tag{6.28}$$


where $\rho$ is the density and $P$ is the pressure. $\rho$ and $P$ are Lorentz invariants defined in the local rest frame of the fluid, and are related by an equation of state. They vary throughout spacetime, so they are fields. More generally, $T^{\mu\nu}$ can also include terms that describe electromagnetic radiation or any other physical phenomena.
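The tensor character of the perfect fluid form can be illustrated in Minkowski space. The sketch below assumes the expression $T^{\mu\nu} = (\rho + P) v^\mu v^\nu - P \eta^{\mu\nu}$ with $\eta^{\mu\nu} = \mathrm{diag}(1, -1, -1, -1)$ and units with $c = 1$. It builds the tensor in the rest frame, boosts it along $x$, and compares the result with the same formula evaluated using the boosted 4-velocity. All names and the sample values of density, pressure and speed are hypothetical.

```python
import math

ETA_INVERSE = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, -1.0, 0.0, 0.0],
    [0.0, 0.0, -1.0, 0.0],
    [0.0, 0.0, 0.0, -1.0],
]


def four_velocity(speed):
    """4-velocity (gamma, gamma*v, 0, 0) for motion along x."""
    gamma = 1.0 / math.sqrt(1.0 - speed**2)
    return [gamma, gamma * speed, 0.0, 0.0]


def perfect_fluid(rho, pressure, velocity):
    """T^{mu nu} = (rho + P) v^mu v^nu - P eta^{mu nu}."""
    return [
        [(rho + pressure) * velocity[m] * velocity[n] - pressure * ETA_INVERSE[m][n]
         for n in range(4)]
        for m in range(4)
    ]


def boost_matrix(speed):
    """Lorentz boost taking a body at rest to one moving at +speed along x."""
    gamma = 1.0 / math.sqrt(1.0 - speed**2)
    return [
        [gamma, gamma * speed, 0.0, 0.0],
        [gamma * speed, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform(tensor, boost):
    """Apply the boost to both indices of a 2-index tensor."""
    return [
        [sum(boost[m][a] * boost[n][b] * tensor[a][b]
             for a in range(4) for b in range(4))
         for n in range(4)]
        for m in range(4)
    ]


rest_frame = perfect_fluid(2.0, 0.5, four_velocity(0.0))
boosted = transform(rest_frame, boost_matrix(0.6))
direct = perfect_fluid(2.0, 0.5, four_velocity(0.6))
dust_energy_density = perfect_fluid(2.0, 0.0, four_velocity(0.6))[0][0]
print(boosted[0][0], direct[0][0], dust_energy_density)
```

In the rest frame the tensor is $\mathrm{diag}(\rho, P, P, P)$, and the boosted and directly computed tensors agree component by component. For dust ($P = 0$) the energy density becomes $\gamma^2 \rho$, which shows the two factors of $\gamma$ described above.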

Quite generally, $T^{\mu\nu}$ is symmetric under the interchange of its two indices, so of its sixteen components at each point, only ten are independent (four diagonal and six off-diagonal entries). In particular, the current of the energy density $T^{0i}$ equals the momentum density $T^{i0}$. The components of $T^{\mu\nu}$ are not completely arbitrary functions of space and time, however, as the matter and radiation satisfy their own local field equations. Electromagnetic radiation, for example, obeys Maxwell’s equations adapted to a curved spacetime background. For a dilute gas of matter particles, the effectively free particles move along geodesic paths through spacetime. For denser gases of matter, as occur for example in stars, one needs to consider the equations for fluid motion, where pressure plays a role. These dynamical field equations lead to a further constraint on $T^{\mu\nu}$.

Recall that because electric charge is physically conserved in Minkowski space, there is a local electromagnetic 4-current conservation equation (4.45), which says that the spacetime divergence of $J$ is zero. Using 4-vector notation (and the index notation $_{,\nu}$ for spacetime partial derivatives) this becomes

$$\frac{\partial J^\nu}{\partial x^\nu} = J^\nu{}_{,\nu} = 0 . \tag{6.29}$$

In curved spacetime, the divergence generalizes to the covariant derivative with a contraction of indices, so in coordinates $y^\mu$, the electromagnetic current must satisfy the covariant conservation equation

$$\frac{DJ^\nu}{Dy^\nu} = 0 . \tag{6.30}$$

Because of the equivalence principle, we know that locally, even in curved spacetime, energy and the three components of momentum are conserved. There is a corresponding current conservation law: the covariant spacetime divergence of the energy–momentum tensor is zero. In inertial coordinates

$$\frac{\partial T^{\mu\nu}}{\partial x^\nu} = T^{\mu\nu}{}_{,\nu} = 0 \tag{6.31}$$

and in a general coordinate system this becomes

$$\frac{DT^{\mu\nu}}{Dy^\nu} = 0 . \tag{6.32}$$

This is the further constraint on $T^{\mu\nu}$. In fact, as $\mu$ is a free index running from 0 to 3, there are four local constraints corresponding to the conservation of energy and momentum.

We will now introduce a new shorthand notation. Previously we replaced the partial derivative $\partial/\partial y^\nu$ by the comma notation $_{,\nu}$ and will continue to do this. From here on, we will also replace the covariant derivative $D/Dy^\nu$, which in curved spacetime includes terms containing Christoffel symbols, with the semi-colon notation $_{;\nu}$. For example, for a vector field $V^\mu$,

$$\frac{DV^\mu}{Dy^\nu} = V^\mu{}_{;\nu} = V^\mu{}_{,\nu} + \Gamma^\mu_{\nu\lambda} V^\lambda . \tag{6.33}$$

In this notation, the vanishing of the covariant divergence of the energy–momentum tensor is rewritten as

$$T^{\mu\nu}{}_{;\nu} = 0 . \tag{6.34}$$

6.5.2 The Einstein tensor and the Einstein equation

Einstein realized that the energy–momentum tensor had all the properties required for one side of the gravitational field equation. What he needed was the appropriate tensor, now known as the Einstein tensor, that would sit on the other side of the equation. It would describe the curvature of spacetime and must therefore be related to the Riemann tensor. In empty space, the energy–momentum tensor is zero. This implies that the Einstein tensor on the other side of the equation cannot simply be a multiple of the Riemann tensor, otherwise empty space would be flat and there would be no gravitational effects in the empty space between massive bodies, which is obviously wrong.

The energy–momentum tensor is a symmetric tensor of rank 2, so the Einstein tensor must also have these properties. It is convenient to work with $T_{\mu\nu}$, having lowered its indices using the metric tensor. In Einstein’s notebooks he initially wrote the equation as

$$?_{\mu\nu} = \kappa\, T_{\mu\nu} , \tag{6.35}$$

where $\kappa$ is a constant of proportionality and $?_{\mu\nu}$ is the Einstein tensor, whose form he set out to discover. The energy–momentum tensor has zero divergence, so for consistency, the Einstein tensor must have zero divergence too. The theory then automatically incorporates conservation of energy and momentum.

As we have seen, the Riemann curvature tensor has four indices (it is of rank 4), but there is a closely related rank 2 tensor that derives from it. This is the Ricci tensor $R_{\mu\nu}$. It is obtained by contracting indices, that is, summing over selected components of the Riemann tensor, as follows:

$$R_{\mu\nu} = R^\alpha{}_{\mu\alpha\nu} . \tag{6.36}$$

Due to the symmetry of the Riemann tensor under exchange of the first and second pairs of indices, the Ricci tensor is symmetric in its two indices and therefore has ten independent components. A further contraction of indices is possible, which produces the Ricci scalar

$$R = g^{\mu\nu} R_{\mu\nu} = R^\mu{}_\mu . \tag{6.37}$$

The Ricci tensor, the metric tensor and the Ricci scalar can be combined into a family of symmetric, rank 2 curvature tensors

$$R_{\mu\nu} - \xi\, g_{\mu\nu} R , \tag{6.38}$$

where $\xi$ is an arbitrary constant. We now show that just one value of $\xi$ gives a tensor whose divergence is identically zero in any spacetime. This divergence may be calculated with the help of an identity involving derivatives of the Riemann tensor. As with many tensor equations, the identity is most easily proved by using local inertial coordinates. The Riemann tensor is expressed in terms of the Christoffel symbols in equation (6.14). In inertial coordinates the Christoffel symbols are zero, so by the Leibniz rule, the contribution of the last two terms of the Riemann tensor, which are products of Christoffel symbols, is still zero after differentiating once. Therefore in inertial coordinates,

$$R^\alpha{}_{\lambda\gamma\nu,\mu} = \Gamma^\alpha_{\nu\lambda,\gamma\mu} - \Gamma^\alpha_{\gamma\lambda,\nu\mu} . \tag{6.39}$$

Similarly, by a permutation of the indices,

$$R^\alpha{}_{\lambda\mu\gamma,\nu} = \Gamma^\alpha_{\gamma\lambda,\mu\nu} - \Gamma^\alpha_{\mu\lambda,\gamma\nu} , \qquad R^\alpha{}_{\lambda\nu\mu,\gamma} = \Gamma^\alpha_{\mu\lambda,\nu\gamma} - \Gamma^\alpha_{\nu\lambda,\mu\gamma} . \tag{6.40}$$

Adding these three expressions, and using the symmetry of mixed partial derivatives, gives

$$R^\alpha{}_{\lambda\gamma\nu,\mu} + R^\alpha{}_{\lambda\mu\gamma,\nu} + R^\alpha{}_{\lambda\nu\mu,\gamma} = 0 . \tag{6.41}$$

This expression is true in inertial coordinates. Replacing the partial derivatives with covariant derivatives produces the tensor identity

$$R^\alpha{}_{\lambda\gamma\nu;\mu} + R^\alpha{}_{\lambda\mu\gamma;\nu} + R^\alpha{}_{\lambda\nu\mu;\gamma} = 0 , \tag{6.42}$$

valid in any coordinate system. This is known as the second Bianchi identity.

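This identity takes us a big step towards finding the Einstein tensor. As we have seen, the Ricci tensor $R_{\mu\nu}$ is formed by taking the trace over the first and third indices of the Riemann tensor. If we contract $\alpha$ with $\gamma$ in each term in equation (6.42), we obtain

$$R_{\lambda\nu;\mu} - R_{\lambda\mu;\nu} + R^\alpha{}_{\lambda\nu\mu;\alpha} = 0 , \tag{6.43}$$

where we have used the antisymmetry of the Riemann tensor in its last two indices to obtain the middle term. We can multiply through by $g^{\nu\lambda}$ and contract again, to obtain $R^\lambda{}_{\lambda;\mu} - R^\lambda{}_{\mu;\lambda} + R^{\alpha\lambda}{}_{\lambda\mu;\alpha} = 0$, and hence, using the antisymmetry of $R^{\alpha\lambda}{}_{\lambda\mu}$ in its first two indices, $R^\lambda{}_{\lambda;\mu} - 2R^\lambda{}_{\mu;\lambda} = 0$. As $R = R^\lambda{}_\lambda$ is the Ricci scalar, this becomes

$$R_{;\mu} - 2R^\lambda{}_{\mu;\lambda} = 0 . \tag{6.44}$$

The metric is covariantly constant, as shown in equation (5.45), so we can multiply by the inverse metric $g^{\mu\nu}$ and pull it through the covariant derivative. After dividing by $-2$, swapping the order of the two terms, and using the same symbol for the repeated indices, this gives

$$\left(R^{\mu\nu} - \tfrac{1}{2}\, g^{\mu\nu} R\right)_{;\mu} = 0 . \tag{6.45}$$

We have found a symmetric tensor of rank 2 with zero divergence.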

We therefore fix the constant $\xi$ in equation (6.38) to be $\tfrac{1}{2}$, and define the Einstein tensor (with lowered indices) to be

$$G_{\mu\nu} = R_{\mu\nu} - \tfrac{1}{2}\, g_{\mu\nu} R . \tag{6.46}$$

The field equation of general relativity follows immediately. Plugging $G_{\mu\nu}$ into equation (6.35), we find

$$G_{\mu\nu} = R_{\mu\nu} - \tfrac{1}{2}\, g_{\mu\nu} R = \kappa\, T_{\mu\nu} . \tag{6.47}$$

This is the Einstein equation. It has ten components and equates two symmetric, divergence-free rank 2 tensors. Given a particular distribution of matter, energy and momentum it determines the metric $g_{\mu\nu}$, and therefore how spacetime curves. Spacetime is generally curved even in regions that are empty, where $T_{\mu\nu}=0$, because some components of the Riemann tensor may still be non-zero.

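We can find an alternative form of the Einstein equation as follows. If we multiply equation (6.46) through by $g^{\lambda\mu}$ and contract the indices $\lambda$ and $\nu$, we obtain

$$G^\nu{}_\nu = R^\nu{}_\nu - \tfrac{1}{2}\, \delta^\nu{}_\nu R = R - 2R = -R , \tag{6.48}$$

as $g^{\mu\nu}g_{\mu\nu} = \delta^\nu{}_\nu = 4$. If we write $T = T^\nu{}_\nu$ for the contracted energy–momentum tensor, then contracting the field equation in the same way gives

$$R = -\kappa\, T , \tag{6.49}$$

and substituting this back into the Einstein equation (6.47) gives

$$R_{\mu\nu} = \kappa \left(T_{\mu\nu} - \tfrac{1}{2}\, g_{\mu\nu} T\right) . \tag{6.50}$$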

This is the original form in which Einstein presented the field equation.
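The equivalence of the two forms of the field equation can be checked numerically at a single point. The sketch below assumes a local inertial frame in which the metric is diag(1, −1, −1, −1), picks an arbitrary value of κ and a random symmetric $T_{\mu\nu}$ (all illustrative assumptions), defines $R_{\mu\nu} = \kappa\,(T_{\mu\nu} - \tfrac{1}{2} g_{\mu\nu} T)$, and then confirms that $R_{\mu\nu} - \tfrac{1}{2} g_{\mu\nu} R$ reproduces $\kappa\, T_{\mu\nu}$.

```python
import random

# Diagonal of the metric in a local inertial frame; the inverse metric has the same diagonal.
ETA = [1.0, -1.0, -1.0, -1.0]
KAPPA = 2.5  # arbitrary illustrative constant, not the physical value


def metric(m, n):
    """Component g_{mn} of the diagonal metric."""
    return ETA[m] if m == n else 0.0


def trace(lower):
    """Contract a rank 2 tensor with lowered indices using the inverse metric."""
    return sum(ETA[m] * lower[m][m] for m in range(4))


# Random symmetric energy-momentum tensor with lowered indices.
random.seed(0)
T = [[0.0] * 4 for _ in range(4)]
for m in range(4):
    for n in range(m, 4):
        T[m][n] = T[n][m] = random.uniform(-1.0, 1.0)

T_trace = trace(T)

# Trace-reversed form of the field equation: R_{mn} = kappa (T_{mn} - g_{mn} T / 2).
ricci = [[KAPPA * (T[m][n] - 0.5 * metric(m, n) * T_trace) for n in range(4)]
         for m in range(4)]
ricci_scalar = trace(ricci)

# Rebuild the Einstein tensor and compare it with kappa * T_{mn}.
einstein = [[ricci[m][n] - 0.5 * metric(m, n) * ricci_scalar for n in range(4)]
            for m in range(4)]
max_error = max(abs(einstein[m][n] - KAPPA * T[m][n])
                for m in range(4) for n in range(4))
```

The check relies only on the metric contracting with itself to give 4, which is why the Ricci scalar comes out as $-\kappa T$ and the two forms agree to rounding error.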

6.5.3 Determining the constant of proportionality

In the weak field, low velocity limit we should recover Newtonian gravity in the form of Poisson’s equation, and we can use the 00 component of the field equation in this limit to determine κ‎.

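In inertial coordinates, the Christoffel symbols are zero, but their derivatives may be non-zero. Contracting the first and third index of the Riemann tensor in equation (6.14) gives the Ricci tensor in inertial coordinates: $R_{\mu\nu} = \Gamma^\alpha_{\nu\mu,\alpha} - \Gamma^\alpha_{\alpha\mu,\nu}$. Using the formula (6.13) for the Christoffel symbols, this becomes

$$\begin{aligned} R_{\mu\nu} &= \tfrac{1}{2}\, g^{\alpha\beta}\left(g_{\beta\nu,\mu\alpha} + g_{\beta\mu,\nu\alpha} - g_{\nu\mu,\beta\alpha}\right) - \tfrac{1}{2}\, g^{\alpha\beta}\left(g_{\beta\alpha,\mu\nu} + g_{\beta\mu,\alpha\nu} - g_{\alpha\mu,\beta\nu}\right) \\ &= \tfrac{1}{2}\, g^{\alpha\beta}\left(g_{\beta\nu,\mu\alpha} - g_{\nu\mu,\beta\alpha} - g_{\beta\alpha,\mu\nu} + g_{\alpha\mu,\beta\nu}\right) , \end{aligned} \tag{6.51}$$

as the middle terms in the two brackets cancel. For weak fields, the deviations from flatness are small, so we can write

$$g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu} , \tag{6.52}$$

where $\eta_{\mu\nu}$ is the metric tensor of the Minkowski space background and $|h_{\mu\nu}| \ll 1$, and we can choose the coordinates $y^\mu$ to be the ordinary time and space coordinates $(x^0 = t, x^1, x^2, x^3)$. Discarding terms that are quadratic in $h_{\mu\nu}$, the 00 component of the Ricci tensor is then

$$R_{00} = \tfrac{1}{2}\, \eta^{\alpha\beta}\left(h_{\beta 0,0\alpha} - h_{00,\beta\alpha} - h_{\beta\alpha,00} + h_{\alpha 0,\beta 0}\right) . \tag{6.53}$$

Slowly moving matter produces a slowly varying metric and therefore we can neglect time derivatives. Every term apart from the second in the expression for $R_{00}$ includes at least one explicit time derivative, so

$$R_{00} = -\tfrac{1}{2}\, \eta^{\alpha\beta} h_{00,\alpha\beta} . \tag{6.54}$$

Neglecting the remaining time derivatives in this gives

$$R_{00} = \tfrac{1}{2}\, \nabla^2 h_{00} . \tag{6.55}$$

For slowly moving matter, the rest mass is much greater than the kinetic energy, so the dominant term in the energy–momentum tensor is $T_{00}=\rho$, and for an approximately flat metric, $T^\mu{}_\mu = T = \rho$, so the 00 component of the right-hand side of equation (6.50) is $\tfrac{1}{2}\kappa\rho$. Combining this result with equation (6.55) gives

$$\nabla^2 h_{00} = \kappa\rho . \tag{6.56}$$

In the Newtonian limit, $h_{00} = 2\phi$ from equation (6.20) so

$$\nabla^2 \phi = \tfrac{1}{2}\, \kappa\rho , \tag{6.57}$$

which matches Poisson’s equation (6.5) if $\kappa = 8\pi G$. This fixes $\kappa$ and gives the final form of the Einstein equation,

$$G_{\mu\nu} = R_{\mu\nu} - \tfrac{1}{2}\, g_{\mu\nu} R = 8\pi G\, T_{\mu\nu} . \tag{6.58}$$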

6.6 The Classic Tests of General Relativity

Here we describe three classic tests of general relativity and the historical observations confirming the theory. Detailed calculations of the size of the first two effects are left to subsequent sections.

6.6.1 The perihelion advance of Mercury

Newtonian gravity is described by an inverse square law force. This results in elliptical orbits, which explains Kepler’s first law of planetary motion, as we saw in section 2.7. The inverse square law has an enhanced symmetry, resulting in the conserved Runge–Lenz vector that points along the major axis of a planet’s orbit, and remains fixed in space. Any small additional force acting on the planet will break this symmetry and the effect will be a gradual precession of the axis of the ellipse, as illustrated in Figure 6.4.


Fig. 6.4 Precession of Mercury’s orbit.

Observations in the 19th century showed that Mercury’s orbit around the Sun precesses by 574 arcseconds per century. (1 arcsecond is $\tfrac{1}{60}$ of an arcminute and in turn $\tfrac{1}{3600}$ of a degree.) In around 225,000 years the axis of Mercury’s orbit traces out a complete circuit of the Sun. Most of this precession can be accounted for by perturbations due to the gravitational attraction of the other planets. The pull of Venus accounts for a shift of 277 arcseconds per century. Jupiter adds another 153 arcseconds. The Earth accounts for 90 arcseconds and the rest of the planets about 11 arcseconds more. These contributions total 531 arcseconds, which leaves 43 arcseconds per century unaccounted for.
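The bookkeeping in this paragraph is easy to reproduce. The short Python sketch below sums the planetary contributions quoted above, subtracts them from the observed precession, and converts the observed rate into the time for one full circuit of the orbit's axis; the variable names are illustrative only.

```python
# Planetary contributions to Mercury's precession, in arcseconds per century (from the text).
contributions = {"Venus": 277, "Jupiter": 153, "Earth": 90, "other planets": 11}
observed = 574  # observed precession, arcseconds per century

newtonian_total = sum(contributions.values())
unexplained = observed - newtonian_total

# One full circuit is 360 degrees of 3600 arcseconds each.
ARCSEC_PER_CIRCUIT = 360 * 3600
years_per_circuit = 100 * ARCSEC_PER_CIRCUIT / observed

print(f"Newtonian perturbations: {newtonian_total} arcsec per century")
print(f"Unexplained residual:    {unexplained} arcsec per century")
print(f"Full circuit of the axis: about {years_per_circuit:,.0f} years")
```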

During November 1915, Einstein addressed this issue. He performed a calculation of geodesic motion one step beyond the Newtonian approximation, which is perfectly adequate for analysing small effects in the solar system, and discovered that general relativity introduces an additional force that decreases as the inverse fourth power of distance. In the solar system, this extra term is largest in the case of Mercury, because Mercury is closest to the Sun. The additional force due to general relativity causes Mercury’s orbit to precess by 43 arcseconds per century, just the right amount to explain the total observed precession. This was the moment when Einstein knew that his theory was a success. Beside himself with joyous excitement, Einstein wrote: ‘My wildest dreams have been fulfilled. General covariance. Perihelion motion of Mercury wonderfully exact.’

6.6.2 The deflection of starlight

General relativity predicts that light will be deflected by the curved spacetime around a massive body. Within the solar system, the curvature of spacetime is very small. Even near the Sun, gravity is a weak force. A beam of starlight just grazing the edge of the Sun and following a light-like geodesic is deflected through an angle of just 1.75 arcseconds.

In 1919, a British expedition led by Arthur Eddington and Andrew Crommelin set out to test this prediction by photographing the deflections in the positions of stars located close to the edge of the Sun during a total eclipse. The eclipse would cross Northern Brazil, the Atlantic and Africa on 29 May 1919 and was very favourable for the mission as the duration of totality was six minutes, close to the maximum possible. It was also at an ideal position in the sky, being situated in the open star cluster known as the Hyades, where there are plenty of reasonably bright stars whose positions could be measured. Crommelin’s expedition photographed the eclipse from Sobral in Brazil and Eddington’s expedition photographed the eclipse from the island of Príncipe off the African coast. The measurements from Sobral gave a shift of 1.98±0.16 arcseconds and the results from Príncipe gave a shift of 1.61±0.4 arcseconds, confirming the prediction of general relativity.

The results of the eclipse expedition were hailed as a great triumph for general relativity. Einstein was thrust into the media spotlight and would be celebrated as an intellectual giant for the rest of his life. Figure 6.5 shows some of the news coverage of the expedition from later that year.


Fig. 6.5 Page from the Illustrated London News of 22 November 1919.

6.6.3 Clocks and gravitational redshift

Another prediction of general relativity is that gravity affects the passage of time. The effect can be understood most easily using the Newtonian approximation (6.20) to the spacetime metric, $d\tau^2 = (1+2\phi(\mathbf{x}))\,dt^2 - d\mathbf{x}\cdot d\mathbf{x}$, valid when the potential $\phi$ is small and vanishes at spatial infinity.

The time measured by a clock is its local proper time $\tau$. The proper time gap between ticks is a constant $\Delta\tau$ that is independent of the position or motion of the clock. As the metric is Minkowskian at infinity, a clock at rest there moves inertially, and measures the coordinate time. The gap between its ticks is $\Delta t = \Delta\tau$. A similar clock at rest at position $\mathbf{x}$, deeper in the gravitational potential, will not be moving inertially; it must be accelerating in order to remain at rest, but we assume that the acceleration has no effect on its time-keeping.4 As the gap between ticks of the clock at $\mathbf{x}$ is $\Delta\tau$, we deduce using the metric and the approximation $(1+2\phi(\mathbf{x}))^{1/2} \approx 1+\phi(\mathbf{x})$ that the corresponding gap in coordinate time is $\Delta t = \Delta\tau/(1+\phi(\mathbf{x}))$.

Suppose the ticks from the clock at x are signalled out to infinity. There is a time delay, but the gap Δt in coordinate time between ticks is the same at the position of the clock as it is at infinity. This is because the metric has a symmetry under any time shift, so a physical process can be moved forward by Δt throughout spacetime, and remain physical. The ticks arriving at infinity therefore have separation Δτ/(1+ϕ(x)), which is greater than Δτ because ϕ(x) is negative. The clock at infinity has ticks separated by Δτ, so the clock deeper in the potential appears to an observer at infinity to have slowed down. Conversely, an observer at x receiving clock signals from infinity will observe the signals speeded up compared to a local clock. In summary, gravity affects clocks—not locally, but when one compares their time-keeping at different points.

A test body of mass $m$ has (negative) potential energy $m\phi(\mathbf{x})$ at a point $\mathbf{x}$, and zero potential energy at infinity. The total energy, including the body’s rest energy, is $m(1+\phi(\mathbf{x}))$ at $\mathbf{x}$ and $m$ at infinity. To approach spatial infinity through free motion, the body would need to start with some additional kinetic energy, and this would decrease as the body approached infinity.

Similarly, a photon loses energy as it approaches infinity. Suppose that a photon is emitted from the surface of a massive body such as a star or planet, where the Newtonian potential is $\phi$, and is detected by a distant observer where the potential is effectively zero. Suppose the emitted photon has (angular) frequency $\omega$, and the observed frequency is $\omega'$. The energy of a photon is initially $E = \hbar\omega$, where $\hbar$ is Planck’s constant, as we will discuss in Chapter 7. The photon’s energy decreases in the same way as that of a massive body, so

$$\hbar\omega' = (1+\phi)\,\hbar\omega . \tag{6.59}$$

As $\phi$ is negative, $\omega'$ is less than $\omega$, and we say that the photon has undergone a redshift in its climb out of the gravitational well.5 The pulsing of an electromagnetic wave is an ideal measure of the passage of time. The reduction in energy of the photon between emission and detection can be interpreted as due to a difference in the rate at which time passes at these two points. From the above calculation, we deduce again that a proper time interval at infinity is shorter, by a factor $1+\phi$, than an equivalent proper time interval signalled out to infinity from a location where the potential is $\phi$.
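To get a feel for the size of the effect, the sketch below evaluates the weak-field relation $\omega' = (1+\phi)\,\omega$ for light leaving the surface of the Sun. It assumes the Newtonian potential $\phi = -GM/r$ in units where $c=1$ and uses the rounded solar values that appear later in the chapter ($GM = 1.48\times10^3$ m, radius $6.96\times10^8$ m); the variable names are illustrative.

```python
GM_SUN = 1.48e3   # G*M/c^2 for the Sun, in metres
R_SUN = 6.96e8    # solar radius, in metres
C = 3.00e8        # speed of light, in metres per second

# Dimensionless Newtonian potential at the solar surface (units with c = 1).
phi_surface = -GM_SUN / R_SUN

# Ratio of observed to emitted frequency for a distant observer.
frequency_ratio = 1.0 + phi_surface

# Speed of recession that would produce the same fractional shift (first-order Doppler).
doppler_equivalent = -phi_surface * C

print(f"phi at the solar surface: {phi_surface:.3e}")
print(f"omega'/omega:             {frequency_ratio:.9f}")
print(f"Doppler equivalent:       {doppler_equivalent:.0f} m/s")
```

The shift is a few parts in a million, equivalent to a recession speed of several hundred metres per second, which shows why a much more compact star is needed for an easily measurable effect.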

In 1914, Walter Adams described the first member of a new class of stars later named white dwarfs. The following year, the faint companion of the star Sirius was identified as a second such star. These stars are remarkable because they are extremely faint compared to other stars that exhibit similar spectra. As they are in binary systems their masses can be estimated, and turn out to be comparable to the mass of the Sun, $M_\odot$. (The best modern estimate for the mass of Sirius B is $0.98\,M_\odot$.) Eddington argued in 1924 that these stars could only be so faint because they are very small compared to a normal star. He estimated that they are similar in size to the Earth and therefore must be incredibly compact objects with an exceptionally high density. He calculated the gravitational redshift of light emitted from Sirius B to be the equivalent of a 20 km s$^{-1}$ Doppler shift. The following year Adams made spectrographic observations of Sirius B and measured the shift in the lines in its spectrum. After accounting for the shift due to the orbital motion of the white dwarf, there remained a redshift equivalent to a Doppler shift of 19 km s$^{-1}$, just as Eddington had predicted. This was acclaimed by Eddington as another great triumph for general relativity. In reality, however, there was a great deal of uncertainty in both the measurement and Eddington’s calculation, so the precise agreement was rather coincidental. The modern figure for the Doppler equivalent of the gravitational redshift of Sirius B is 80.42±4.83 km s$^{-1}$.

In 1959, a much more accurate measurement of the gravitational frequency shift was undertaken in a classic experiment at Harvard University by Robert Pound and Glen Rebka. Pound and Rebka fired gamma ray photons down the 22.5 metre Jefferson Tower at the university and measured the blueshift in the frequency of the photons at the bottom of the tower due to their fall in the Earth’s gravitational field. The initial results agreed with the predictions of general relativity to within 10% accuracy. Subsequent improvements to the experiment made by Pound and Joseph Snider brought the accuracy of the agreement to within 1%.
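The size of the shift that Pound and Rebka had to detect can be estimated in a few lines. Over a height $h$ near the Earth's surface the potential difference is approximately $gh$, so the fractional frequency shift is about $gh/c^2$. The sketch assumes a surface gravity of 9.81 m s$^{-2}$ (a standard value, not quoted in the text) together with the 22.5 metre tower height given above.

```python
G_SURFACE = 9.81   # assumed gravitational acceleration at the Earth's surface, m/s^2
HEIGHT = 22.5      # height of the tower, in metres (from the text)
C = 3.00e8         # speed of light, in metres per second

# Weak-field fractional blueshift for a photon falling through HEIGHT.
fractional_shift = G_SURFACE * HEIGHT / C**2

print(f"Fractional frequency shift: {fractional_shift:.2e}")
```

A shift of a few parts in $10^{15}$ explains why the experiment required an exceptionally sharp gamma ray line.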

The effects of gravitational time distortion are now routinely taken into account by the GPS (Global Positioning System) network, which is used daily by millions of people around the world. GPS could not function for more than a few minutes if the predictions of general relativity were not incorporated into the system.

6.7 The Schwarzschild Solution of the Einstein Equation

Karl Schwarzschild was a German mathematician and astrophysicist who was stationed on the Eastern Front in 1915. During the War, he began to suffer from a rare and extremely painful autoimmune disease of the skin known as pemphigus. Somehow, in these incredibly difficult circumstances, Schwarzschild found the most important solutions to the Einstein equation, which had only been published a month earlier. His solutions describe spacetime inside and outside a perfectly spherical body, such as a star or a planet. Schwarzschild’s skin condition soon worsened and in March 1916 he was removed from the front. Two months later he died.

The Einstein equation is a tensor equation relating second partial derivatives of the metric tensor to the matter and energy density. In empty space, the energy–momentum tensor $T_{\mu\nu}$ vanishes, so, as is clear from the form of equation (6.50), the Einstein equation simplifies to

$$R_{\mu\nu} = 0 . \tag{6.60}$$

This is known as the vacuum Einstein equation. The simplest vacuum solution is Minkowski space, the flat spacetime of special relativity, where the entire Riemann tensor vanishes.

Less trivially, the exterior Schwarzschild solution is not flat, and describes the vacuum spacetime around a spherically symmetric body. It is most simply described using polar coordinates $(t, r, \vartheta, \varphi)$. For a body of mass $M$ whose centre is situated at the point $r=0$, the exterior Schwarzschild metric is

$$d\tau^2 = \left(1 - \frac{2GM}{r}\right) dt^2 - \left(1 - \frac{2GM}{r}\right)^{-1} dr^2 - r^2\left(d\vartheta^2 + \sin^2\vartheta\, d\varphi^2\right) . \tag{6.61}$$

The non-zero metric tensor components are

$$g_{tt} = 1 - \frac{2GM}{r} , \quad g_{rr} = -\left(1 - \frac{2GM}{r}\right)^{-1} , \quad g_{\vartheta\vartheta} = -r^2 , \quad g_{\varphi\varphi} = -r^2\sin^2\vartheta . \tag{6.62}$$

The metric tensor is diagonal, and comparing the sign of each component to the Minkowski metric, we see that $t$ should be regarded as a time coordinate and $r, \vartheta, \varphi$ as spatial polar coordinates throughout the region $r > 2GM$.

The Schwarzschild metric is the relativistic counterpart of the Newtonian gravitational field outside a spherically symmetric mass, as described by the gravitational potential $\phi(r) = -\frac{GM}{r}$. Both the Newtonian field and the Schwarzschild metric include a single parameter $M$. We can see at once that $g_{tt} = 1+2\phi$, but the Einstein equation requires that the spatial part of the metric also depends on $\phi$. At radii $r \gg GM$, however, the geometry is completely equivalent to the Newtonian gravitational field.

The Schwarzschild metric tensor has two manifest symmetries, as its components are independent of $\varphi$ and $t$. There is a rotational symmetry associated with a shift of $\varphi$ and, as the final terms in the metric are proportional to the 2-sphere metric, this is part of a full spherical symmetry. The metric is also symmetric under time shifts, and is said to be static. It is, in fact, the most general spherically symmetric solution of the vacuum Einstein equation. This result is known as Birkhoff’s theorem, and means that the Schwarzschild exterior metric even applies around matter that is undergoing spherically symmetric collapse or expansion. It also implies that an empty spherical cavity in a spherically symmetric spacetime is described by the Schwarzschild metric with $M=0$, which is just flat Minkowski space. The equivalent Newtonian result is that the gravitational field vanishes within a spherical shell of matter, because the only spherically symmetric potentials satisfying Laplace’s equation are of the form $\phi = \frac{C}{r} + D$, but $C$ must be zero if there is no singularity at the origin, and then the gradient of $\phi$ is zero.

Figure 6.6 shows a 2-dimensional slice through the space of the exterior Schwarzschild metric, at fixed t and ϑ. The slice ends at the surface of the body, as the exterior metric is no longer applicable inside.


Fig. 6.6 2-dimensional (r,φ)-slice through the exterior Schwarzschild space.

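Proving that the exterior Schwarzschild metric satisfies the vacuum Einstein equation (6.60) is not difficult. The algebra is quite laborious, but it is a useful exercise that we will sketch out. The Christoffel symbols can be worked out by plugging the components of the metric into the defining formula (6.13). Most of the terms are zero, so the Christoffel symbols can be computed quite rapidly. For instance,

$$\Gamma^r_{tt} = \tfrac{1}{2}\, g^{rr}\left(g_{rt,t} + g_{rt,t} - g_{tt,r}\right) = -\tfrac{1}{2}\, g^{rr} g_{tt,r} \tag{6.63}$$

because none of the metric components are time dependent, and all off-diagonal components including $g_{tr}$ are zero. Therefore

$$\Gamma^r_{tt} = \tfrac{1}{2}\, Z\, \frac{2GM}{r^2} = \frac{GM}{r^2}\, Z , \tag{6.64}$$

where $Z = 1 - \frac{2GM}{r}$. The only non-zero Christoffel symbols are

$$\begin{aligned} \Gamma^t_{tr} &= \frac{GM}{r^2 Z} , & \Gamma^r_{tt} &= \frac{GM}{r^2}\, Z , & \Gamma^r_{rr} &= -\frac{GM}{r^2 Z} , \\ \Gamma^r_{\vartheta\vartheta} &= -rZ , & \Gamma^r_{\varphi\varphi} &= -rZ\sin^2\vartheta , & \Gamma^\vartheta_{r\vartheta} &= \frac{1}{r} , \\ \Gamma^\vartheta_{\varphi\varphi} &= -\sin\vartheta\cos\vartheta , & \Gamma^\varphi_{r\varphi} &= \frac{1}{r} , & \Gamma^\varphi_{\vartheta\varphi} &= \cot\vartheta , \end{aligned} \tag{6.65}$$

and they can be used to compute the components of the Riemann tensor. For instance, from equation (6.14),

$$R^r{}_{trt} = \Gamma^r_{tt,r} - \Gamma^r_{rt,t} + \Gamma^r_{r\alpha}\Gamma^\alpha_{tt} - \Gamma^r_{t\alpha}\Gamma^\alpha_{rt} . \tag{6.66}$$

Most terms, such as $\Gamma^r_{tr,t}$ and $\Gamma^r_{r\vartheta}\Gamma^\vartheta_{tt}$, vanish leaving

$$\begin{aligned} R^r{}_{trt} &= \Gamma^r_{tt,r} + \Gamma^r_{rr}\Gamma^r_{tt} - \Gamma^r_{tt}\Gamma^t_{rt} \\ &= -\frac{2GM}{r^3} + \frac{6G^2M^2}{r^4} - \frac{G^2M^2}{r^4} - \frac{G^2M^2}{r^4} \\ &= -\frac{2GM}{r^3}\, Z , \end{aligned} \tag{6.67}$$

where the first two terms in the second line come from the radial derivative of $\Gamma^r_{tt}$. Similar calculations give other components of the Riemann tensor, for instance,

$$R^\vartheta{}_{t\vartheta t} = R^\varphi{}_{t\varphi t} = \frac{GM}{r^3}\, Z . \tag{6.68}$$

These results combine to give the $tt$ component of the Ricci tensor

$$R_{tt} = R^t{}_{ttt} + R^r{}_{trt} + R^\vartheta{}_{t\vartheta t} + R^\varphi{}_{t\varphi t} = 0 - \frac{2GM}{r^3}\, Z + \frac{GM}{r^3}\, Z + \frac{GM}{r^3}\, Z = 0 . \tag{6.69}$$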

It can similarly be verified that all the other components of the Ricci tensor are zero and therefore the Schwarzschild metric satisfies the vacuum Einstein equation.
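The vanishing of the $tt$ component can also be spot-checked numerically. The sketch below works in units with $GM = 1$, assumes the standard Schwarzschild Christoffel symbols $\Gamma^r_{tt} = GMZ/r^2$, $\Gamma^t_{rt} = GM/(r^2 Z)$, $\Gamma^r_{rr} = -GM/(r^2 Z)$ and $\Gamma^\vartheta_{r\vartheta} = \Gamma^\varphi_{r\varphi} = 1/r$ with $Z = 1 - 2GM/r$, and replaces the radial derivative by a central finite difference. The function names, the sample radius and the step size are illustrative choices.

```python
GM = 1.0  # work in units where G*M = 1


def z(r):
    return 1.0 - 2.0 * GM / r


def gamma_r_tt(r):
    return GM * z(r) / r**2


def gamma_t_rt(r):
    return GM / (r**2 * z(r))


def gamma_r_rr(r):
    return -GM / (r**2 * z(r))


def gamma_theta_r_theta(r):
    return 1.0 / r


def d_dr(f, r, h=1e-5):
    """Central finite-difference approximation to df/dr."""
    return (f(r + h) - f(r - h)) / (2.0 * h)


def riemann_r_trt(r):
    """R^r_{trt}: radial derivative of Gamma^r_tt plus the two surviving product terms."""
    return d_dr(gamma_r_tt, r) + gamma_r_rr(r) * gamma_r_tt(r) - gamma_r_tt(r) * gamma_t_rt(r)


def riemann_theta_t_theta_t(r):
    """R^theta_{t theta t}; the phi component has the same value."""
    return gamma_theta_r_theta(r) * gamma_r_tt(r)


def ricci_tt(r):
    """Sum of the non-zero Riemann components that make up R_tt."""
    return riemann_r_trt(r) + 2.0 * riemann_theta_t_theta_t(r)


r_sample = 10.0
closed_form = -2.0 * GM * z(r_sample) / r_sample**3
print(riemann_r_trt(r_sample), closed_form, ricci_tt(r_sample))
```

At the sample radius the finite-difference value of the radial component agrees with $-2GMZ/r^3$, and the sum of the three contributions cancels to within numerical error.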

The vacuum Einstein equation itself does not include any mass parameter, and therefore the above calculation cannot determine the parameter M that appears in the Schwarzschild metric. The simplest way to show that M is the mass of the gravitating body is to consider the Newtonian limit at large r. Alternatively, it may be established by matching the exterior Schwarzschild metric to the interior Schwarzschild metric at the surface of the body. We will discuss the interior metric in section 6.10.

6.7.1 The Newtonian limit

It is illuminating to look further at the exterior Schwarzschild metric in the Newtonian approximation. We have already noted that this metric corresponds to a Newtonian potential $\phi(r) = -\frac{GM}{r}$ at large $r$, whose gradient has magnitude $\frac{GM}{r^2}$. This is rather significant. Newton’s theory was built around an inverse square law force in order to match the observed motions of the planets. There was no inherent reason why the force had to diminish in this way; the choice was made in order to fit the observations. In Einstein’s theory no such choice is possible. The form of the field equation is determined by very general principles, and implies that the Ricci tensor vanishes in empty space. It is a genuine prediction of general relativity that in the Newtonian limit, the gravitational potential around a spherically symmetric body falls off inversely with distance and the force decreases with the inverse square of distance. One of the most significant features of the universe has been deduced from geometrical principles.

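We can also gain insight into the Newtonian limit of general relativity by looking at the equation of geodesic deviation (5.102),

$$\frac{D^2\eta^\mu}{D\tau^2} = R^\mu{}_{\alpha\beta\nu}\, \frac{dy^\alpha}{d\tau} \frac{dy^\beta}{d\tau}\, \eta^\nu , \tag{6.70}$$

where $\eta^\mu$ is a vector linking points on two nearby, time-like geodesics. In Minkowski space the Riemann tensor vanishes, so

$$\frac{d^2\eta^\mu}{d\tau^2} = 0 . \tag{6.71}$$

This is equivalent to Newton’s first law of motion applied to the non-relativistic, relative motion of two bodies.

For motion in the radial direction of Schwarzschild spacetime, the $r$ component of equation (6.70) is

$$\frac{D^2\eta^r}{D\tau^2} = R^r{}_{ttr} \left(\frac{dt}{d\tau}\right)^2 \eta^r . \tag{6.72}$$

From equation (6.67) and the antisymmetry properties of the Riemann tensor we find $R^r{}_{ttr} = \frac{2GM}{r^3} Z$, and for the Schwarzschild metric $\left(\frac{dt}{d\tau}\right)^2 = \frac{1}{Z}$, so

$$\frac{D^2\eta^r}{D\tau^2} = \frac{2GM}{r^3}\, \eta^r . \tag{6.73}$$

In the Newtonian limit, the factor $\frac{2GM}{r^3}$ is interpreted as a tidal stretch along the line pointing radially out from the mass $M$. Geodesic deviation in the transverse $\vartheta$ and $\varphi$ directions can be similarly determined. As $R^\vartheta{}_{tt\vartheta} = R^\varphi{}_{tt\varphi} = -\frac{GM}{r^3} Z$,

$$\frac{D^2\eta^\vartheta}{D\tau^2} = -\frac{GM}{r^3}\, \eta^\vartheta , \qquad \frac{D^2\eta^\varphi}{D\tau^2} = -\frac{GM}{r^3}\, \eta^\varphi . \tag{6.74}$$

The factors $\frac{GM}{r^3}$ are interpreted as tidal squeezes. Figure 6.2 shows these tidal forces acting on the Earth due to the gravitational field of the Moon.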

6.8 Particle Motion in Schwarzschild Spacetime

The spacetime around a massive body such as the Sun is described to a very good approximation by the exterior Schwarzschild metric. A particle freely falling through this spacetime will follow a time-like geodesic as described by equation (6.17). For simplicity, we will suppose this particle has unit mass. As mentioned before, the parameter λ‎ along such a geodesic can be taken to be the proper time τ‎, and the constant Ξ in equation (6.18) is then 1.

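As the metric is spherically symmetric, we can assume the particle’s worldline is in the equatorial plane $\vartheta = \frac{\pi}{2}$, without any loss of generality. The symmetry of the metric under the reflection $\vartheta \to \pi - \vartheta$ implies that any worldline starting tangent to this plane remains there. So $\sin\vartheta = 1$ and $d\vartheta = 0$, and the Schwarzschild metric reduces to

$$d\tau^2 = \left(1 - \frac{2GM}{r}\right) dt^2 - \left(1 - \frac{2GM}{r}\right)^{-1} dr^2 - r^2\, d\varphi^2 . \tag{6.75}$$

A geodesic is a worldline $(t(\tau), r(\tau), \varphi(\tau))$ satisfying

$$\left(1 - \frac{2GM}{r}\right) \left(\frac{dt}{d\tau}\right)^2 - \left(1 - \frac{2GM}{r}\right)^{-1} \left(\frac{dr}{d\tau}\right)^2 - r^2 \left(\frac{d\varphi}{d\tau}\right)^2 = 1 , \tag{6.76}$$

the appropriate version of (6.18).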

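The Schwarzschild metric is static, so the particle has a conserved energy

$$E = \left(1 - \frac{2GM}{r}\right) \frac{dt}{d\tau} , \tag{6.77}$$

as implied by equation (6.19). Similarly, as the metric is symmetric under $\varphi$-rotations, the particle has a conserved angular momentum

$$l = r^2\, \frac{d\varphi}{d\tau} . \tag{6.78}$$

Because of these conserved quantities, equation (6.76) simplifies to

$$\left(1 - \frac{2GM}{r}\right)^{-1} E^2 - \left(1 - \frac{2GM}{r}\right)^{-1} \left(\frac{dr}{d\tau}\right)^2 - \frac{l^2}{r^2} = 1 , \tag{6.79}$$

which rearranges to

$$\tfrac{1}{2} \left(\frac{dr}{d\tau}\right)^2 + V(r) = \tfrac{1}{2}\, E^2 , \tag{6.80}$$

where

$$V(r) = \tfrac{1}{2} - \frac{GM}{r} + \frac{l^2}{2r^2} - \frac{GM l^2}{r^3} . \tag{6.81}$$

The geodesic equation therefore reduces to a 1-dimensional problem of a particle of unit mass and kinetic energy $\tfrac{1}{2}\left(\frac{dr}{d\tau}\right)^2$ moving in the potential $V(r)$, with total ‘energy’ $\tfrac{1}{2}E^2$. The second and third terms in $V$ are the standard Newtonian gravitational potential and centrifugal terms that occur in the analysis of Newtonian orbits, but the final, inverse cubic potential gives a new relativistic term that gives rise to an inverse quartic force responsible for the orbit precession.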

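Let us make the substitution $u = \frac{1}{r}$. Then

$$\frac{dr}{d\tau} = \frac{dr}{d\varphi} \frac{d\varphi}{d\tau} = \frac{l}{r^2} \frac{dr}{d\varphi} = -l\, \frac{du}{d\varphi} . \tag{6.82}$$

Using this change of variable we find from equations (6.80) and (6.81) that

$$\tfrac{1}{2}\, l^2 \left(\frac{du}{d\varphi}\right)^2 + \tfrac{1}{2} - GMu + \tfrac{1}{2}\, l^2 u^2 - GM l^2 u^3 = \tfrac{1}{2}\, E^2 . \tag{6.83}$$

Differentiating with respect to $\varphi$ and dividing through by $l^2 \frac{du}{d\varphi}$ gives

$$\frac{d^2u}{d\varphi^2} + u - \frac{GM}{l^2} = 3GMu^2 . \tag{6.84}$$

In the absence of the term on the right-hand side, the solution is

$$u = \frac{GM}{l^2} (1 + e\cos\varphi) \tag{6.85}$$

or equivalently $r(1+e\cos\varphi) = \frac{l^2}{GM}$, matching the solution (2.95) we found for the Newtonian orbit.6 $e$ and $l$ are constants determined by the initial conditions.

The additional term $3GMu^2$ produces the relativistic modification to the Newtonian orbit. In the solar system this term is very small, and $u$ can be approximated by substituting the Newtonian solution, giving

$$\frac{d^2u}{d\varphi^2} + u - \frac{GM}{l^2} = \frac{3(GM)^3}{l^4} \left(1 + 2e\cos\varphi + e^2\cos^2\varphi\right) . \tag{6.86}$$

The improved solution is then

$$u = \frac{GM}{l^2} (1 + e\cos\varphi) + \frac{3(GM)^3}{l^4} \left\{ \left(1 + \tfrac{1}{2} e^2\right) - \tfrac{1}{6}\, e^2 \cos 2\varphi + e\varphi\sin\varphi \right\} . \tag{6.87}$$

The first term in the braces is a small constant and the second is cyclic, producing a small correction that repeats every orbit and does not increase with time. Keeping only the final increasing term, we obtain

$$u = \frac{GM}{l^2} \left(1 + e\cos\varphi + \frac{3(GM)^2}{l^2}\, e\varphi\sin\varphi\right) . \tag{6.88}$$

The functions of $\varphi$ on the right-hand side can be combined by using the trigonometric expansion

$$\cos\{(1-\alpha)\varphi\} = \cos\varphi\cos\alpha\varphi + \sin\varphi\sin\alpha\varphi \approx \cos\varphi + \alpha\varphi\sin\varphi \tag{6.89}$$

for small $\alpha$, leading to

$$u = \frac{GM}{l^2} \left(1 + e\cos\{(1-\alpha)\varphi\}\right) , \tag{6.90}$$

with

$$\alpha = \frac{3(GM)^2}{l^2} . \tag{6.91}$$

At perihelion, the point of closest approach to the Sun, $r$ reaches its minimum and $u$ its maximum, so $\cos\{(1-\alpha)\varphi\} = 1$, and therefore after $N$ orbits

$$(1-\alpha)\varphi = 2\pi N . \tag{6.92}$$

The angle $\varphi$ at perihelion is therefore

$$\varphi = \frac{2\pi N}{1-\alpha} \approx 2\pi N (1+\alpha) = 2\pi N + \frac{6\pi N (GM)^2}{l^2} , \tag{6.93}$$

so with each orbit, the perihelion advances by

$$\Delta\varphi = \frac{6\pi (GM)^2}{l^2} = \frac{6\pi GM}{a(1-e^2)} , \tag{6.94}$$

where we have recalled that for the Newtonian orbit of a unit mass particle, the relation between angular momentum and the semi-major axis is $\frac{l^2}{GM} = a(1-e^2)$.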
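The perturbative result can be compared with a direct numerical integration of the orbit equation $u'' + u = GM/l^2 + 3GMu^2$, where primes denote derivatives with respect to $\varphi$. The following is a minimal sketch using a hand-written fourth-order Runge–Kutta step; it works in units with $GM = 1$ and deliberately chooses a small angular momentum ($l^2 = 300$, an illustrative assumption far from solar system values) so that the advance is large enough to resolve easily. It starts at a perihelion and locates the next one as the point where $du/d\varphi$ changes sign from positive to negative.

```python
import math


def perihelion_advance(gm, l, e, step=1e-3):
    """Integrate u'' + u = GM/l^2 + 3 GM u^2 from one perihelion to the next.

    Returns the angle travelled between successive perihelia minus 2*pi.
    """
    def accel(u):
        return gm / l**2 + 3.0 * gm * u * u - u

    u = gm / l**2 * (1.0 + e)  # start at perihelion, where u is a maximum
    w = 0.0                    # w = du/dphi vanishes at perihelion
    phi = 0.0
    while phi < 4.0 * math.pi:
        # One RK4 step for the first-order system u' = w, w' = accel(u).
        k1u, k1w = w, accel(u)
        k2u, k2w = w + 0.5 * step * k1w, accel(u + 0.5 * step * k1u)
        k3u, k3w = w + 0.5 * step * k2w, accel(u + 0.5 * step * k2u)
        k4u, k4w = w + step * k3w, accel(u + step * k3u)
        u_new = u + step * (k1u + 2.0 * k2u + 2.0 * k3u + k4u) / 6.0
        w_new = w + step * (k1w + 2.0 * k2w + 2.0 * k3w + k4w) / 6.0
        if w > 0.0 and w_new <= 0.0:
            # du/dphi crossed zero from above: interpolate to locate the perihelion.
            return phi + step * w / (w - w_new) - 2.0 * math.pi
        u, w, phi = u_new, w_new, phi + step
    raise RuntimeError("no perihelion found within two revolutions")


GM = 1.0
L_SQUARED = 300.0
advance = perihelion_advance(GM, math.sqrt(L_SQUARED), e=0.2)
first_order = 6.0 * math.pi * GM**2 / L_SQUARED

print(f"numerical advance per orbit:   {advance:.5f} rad")
print(f"first-order formula 6*pi*(GM)^2/l^2: {first_order:.5f} rad")
```

For this deliberately strong-field choice the two values agree to within a few per cent, with the difference coming from terms of higher order in $\alpha$ that the perturbative treatment discards.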

In the solar system the effect is greatest in the case of Mercury, which is closest to the Sun. Mercury also completes its orbits in less time than the other planets so the deviations from Newtonian behaviour accumulate faster.

Newton’s gravitational constant is $G = 6.67\times10^{-11}$ m$^3$ kg$^{-1}$ s$^{-2}$ and the speed of light is $c = 3.00\times10^{8}$ m s$^{-1}$, so in units where the speed of light is 1, Newton’s constant is $G = 7.42\times10^{-28}$ m kg$^{-1}$. The mass of the Sun is $M_\odot = 1.99\times10^{30}$ kg, so $GM_\odot = 1.48\times10^{3}$ m. The semi-major axis of Mercury’s orbit is $a = 5.79\times10^{10}$ m and its eccentricity is $e = 0.206$. Plugging these numbers in, we find that the rate of perihelion advance is $5.04\times10^{-7}$ radians per orbit. Mercury has an orbital period of 88.0 days, so there are 415 orbits per century. Therefore, the perihelion advance per century is $2.09\times10^{-4}$ radians, or 43.1 arcseconds.
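The arithmetic in this paragraph can be reproduced directly from the perihelion formula $\Delta\varphi = 6\pi GM/(a(1-e^2))$. The sketch below uses the rounded values quoted above and assumes a Julian year of 365.25 days when counting orbits per century; small differences in the last digit relative to the quoted figures come from rounding of the inputs.

```python
import math

GM_SUN = 1.48e3       # G*M/c^2 for the Sun, in metres
SEMI_MAJOR = 5.79e10  # semi-major axis of Mercury's orbit, in metres
ECCENTRICITY = 0.206
PERIOD_DAYS = 88.0

# Perihelion advance per orbit, in radians.
advance_per_orbit = 6.0 * math.pi * GM_SUN / (SEMI_MAJOR * (1.0 - ECCENTRICITY**2))

# Number of orbits in a century of 365.25-day years (an assumed convention).
orbits_per_century = 100.0 * 365.25 / PERIOD_DAYS

advance_per_century = advance_per_orbit * orbits_per_century
arcsec_per_century = math.degrees(advance_per_century) * 3600.0

print(f"advance per orbit:   {advance_per_orbit:.3e} rad")
print(f"orbits per century:  {orbits_per_century:.0f}")
print(f"advance per century: {arcsec_per_century:.1f} arcsec")
```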

In 1974 the first binary neutron star system PSR B1913+16 was discovered by Russell Hulse and Joseph Taylor using the giant radio telescope at Arecibo in Puerto Rico. Neutron stars are collapsed stellar remnants that are compressed to nuclear densities. One of the neutron stars in the binary system generates a pulsar, which is a beam of electromagnetic radiation that points in our direction once every revolution of the neutron star. (We will discuss pulsars in section 13.8.1.) The neutron star rotates 17 times a second, so we receive a pulse of radio waves every 59 milliseconds. The radio pulses are received with an incredible regularity, but vary slowly with a period of 7.75 hours due to the Doppler shifts as the neutron star orbits its companion. These Doppler shifts have enabled astronomers to determine the orbital characteristics of this pulsar system with exquisite precision. Many pulsars are known in binary systems, but in most of these systems the companion is a normal star and the transfer of material on to the neutron star complicates the dynamics. By contrast, PSR B1913+16 is a very clean environment in which to study the orbital mechanics. The intense gravitational field in this system, and the precision with which the position of the neutron stars can be calculated, make it the ideal testing ground for general relativity. Astronomers have determined the masses of the two neutron stars to be $1.4411\pm0.0007\,M_\odot$ for the pulsar and $1.3873\pm0.0007\,M_\odot$ for the companion. Their orbit is highly eccentric, with $e = 0.617$, and the length of the semi-major axis is $9.75\times10^{8}$ m. At the point of closest approach, the separation of the two neutron stars is just 1.1 solar radii; at their furthest separation it is 4.8 solar radii.

The axis of the orbit advances much more rapidly than for Mercury. Plugging the numbers from the above paragraph into the formula (6.94) produces a precession of 4.2 degrees per year. The observed precession is in perfect agreement with this prediction of general relativity. Each day the orbit shifts by 41.4 arcseconds, almost as much as Mercury’s orbit shifts in a century.
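A rough version of this calculation is sketched below. It rests on two assumptions that are not spelled out in the text: that for two bodies of comparable mass the $M$ in the perihelion formula is the total mass of the system, and that $a$ is the semi-major axis of the relative orbit, taken here to be twice the quoted $9.75\times10^{8}$ m (a choice consistent with the quoted closest approach of about 1.1 solar radii). With those assumptions, and the rounded solar value $GM_\odot = 1.48\times10^{3}$ m, the formula gives a precession close to the quoted figure.

```python
import math

GM_SUN = 1.48e3                 # G*M/c^2 for the Sun, in metres
TOTAL_MASS = 1.4411 + 1.3873    # assumed: total mass of the binary, in solar masses
A_RELATIVE = 2.0 * 9.75e8       # assumed: semi-major axis of the relative orbit, in metres
ECCENTRICITY = 0.617
PERIOD_HOURS = 7.75

# Periastron advance per orbit from 6*pi*GM / (a (1 - e^2)), in radians.
advance_per_orbit = (6.0 * math.pi * GM_SUN * TOTAL_MASS
                     / (A_RELATIVE * (1.0 - ECCENTRICITY**2)))

orbits_per_year = 365.25 * 24.0 / PERIOD_HOURS
degrees_per_year = math.degrees(advance_per_orbit * orbits_per_year)
arcsec_per_day = degrees_per_year * 3600.0 / 365.25

print(f"precession: {degrees_per_year:.2f} degrees per year")
print(f"            {arcsec_per_day:.1f} arcseconds per day")
```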

In 2003 a double pulsar system of neutron stars known as PSR J0737-3039A and PSR J0737-3039B was discovered at the Parkes observatory in Australia. This remains the only known binary system in which both components are visible pulsars, enabling the system to be accurately monitored. The orbital period is just 2.4 hours, and the axis of the orbit precesses by 16.90 degrees per year, again confirming general relativity’s prediction.

6.9 Light Deflection in Schwarzschild Spacetime

To calculate the deflection of light passing through the curved spacetime close to a spherical mass, as illustrated in Figure 6.7, we need to find the light rays in Schwarzschild spacetime. A light ray follows a light-like geodesic and we may again assume it is in the equatorial plane $\vartheta=\pi/2$. In equation (6.18) we set $\Xi=0$, and the parameter $\lambda$ is no longer $\tau$. Using the conservation laws for energy and angular momentum and setting $u=1/r$ as before, we find the equation for the light ray

$$\frac{d^2u}{d\varphi^2} + u = 3GMu^2. \tag{6.95}$$

Fig. 6.7 Deflection of light around a massive body.

In the solar system the term on the right is again very small. If this term is neglected, the solution is a straight line

$$u = \frac{1}{b}\cos\varphi, \tag{6.96}$$

where $b$ is the impact parameter, the distance of closest approach to the central mass. For convenience we have chosen $\varphi$ to be zero at closest approach, so $\varphi$ increases from $-\pi/2$ to $\pi/2$ along the line. To find an improved solution, we substitute the straight line solution into the small term on the right-hand side of (6.95), giving

$$\frac{d^2u}{d\varphi^2} + u = \frac{3GM}{b^2}\cos^2\varphi, \tag{6.97}$$

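whose solution is readily seen to be

$$u = \frac{1}{b}\cos\varphi + \frac{GM}{b^2}\left(2-\cos^2\varphi\right). \tag{6.98}$$

At the ends of the light ray, where $u=0$,

$$\frac{1}{b}\cos\varphi + \frac{GM}{b^2}\left(2-\cos^2\varphi\right) = 0. \tag{6.99}$$

As $\varphi$ is close to $\pm\pi/2$, we neglect the $\cos^2\varphi$ term, obtaining

$$\cos\varphi = -\frac{2GM}{b}. \tag{6.100}$$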

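The solution is $\varphi=-\pi/2-\Delta$ in one direction and $\varphi=\pi/2+\Delta$ in the other, where $\Delta$ is small. Using the familiar trigonometric formulae $\cos(-\pi/2-\Delta)=\cos(\pi/2+\Delta)=-\sin\Delta\simeq-\Delta$, we then find

$$\Delta = \frac{2GM}{b}, \tag{6.101}$$

so the full angular deflection is

$$2\Delta = \frac{4GM}{b}. \tag{6.102}$$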

For the Sun, $GM=1.48\times10^{3}$ m, and if we take $b$ to be the solar radius, which is $6.96\times10^{8}$ m, then a ray of starlight that just grazes the edge of the Sun is deflected by an angle of $8.48\times10^{-6}$ radians or 1.75 arcseconds, as Einstein famously predicted and the eclipse expedition of 1919 confirmed.
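The arithmetic behind this figure can be checked with a few lines of Python. This is only a sketch: it assumes the weak-field result that the full deflection is 4GM/b in units with c = 1, the function names are made up for illustration, and the inputs are the rounded values quoted above.

```python
import math

# Geometrized units (c = 1): G*M is a length.
GM_SUN_M = 1.48e3   # G*M for the Sun in metres, as quoted in the text
R_SUN_M = 6.96e8    # solar radius in metres, as quoted in the text


def deflection_angle(gm, b):
    """Full weak-field deflection 4*G*M/b in radians (gm and b in metres)."""
    return 4.0 * gm / b


def radians_to_arcseconds(angle):
    """Convert an angle in radians to arcseconds."""
    return math.degrees(angle) * 3600.0


angle = deflection_angle(GM_SUN_M, R_SUN_M)
print(f"{angle:.2e} rad = {radians_to_arcseconds(angle):.2f} arcsec")
```

With these rounded inputs the angle comes out at about 8.5×10⁻⁶ radians, which agrees with the quoted 1.75 arcseconds to the precision of the inputs.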

The bending of light due to gravity can be seen in gravitational lenses. The light from a galaxy at cosmic distances may be bent around an intervening cluster of galaxies to produce multiple images of the more distant galaxy. Numerous instances of such gravitational lensing systems have been discovered. In the ideal situation where the alignment is exact and the lensing mass is spherically symmetric the image should warp into a circle known as an Einstein ring. An example of an almost perfect Einstein ring is shown in Figure 6.8.


Fig. 6.8 The almost perfect Einstein ring LRG 3-757 photographed by the Hubble Space Telescope Wide Field Camera 3. The ring has a diameter on the celestial sphere of 11 arcseconds (ESA-Hubble and NASA).

Gravitational lenses offer an unambiguous method of determining the mass of a cluster of galaxies. Distances to both the lensing galactic cluster and the more distant galaxy whose distorted image is observed can be determined from their redshifts. (We will look at cosmological redshift in Chapter 14.) The angular size of the ring produced by the gravitational lens can be measured. Combining the distances and the angular size gives the impact parameter b and the total angular deflection 2Δ. Then formula (6.102) can be used to determine the mass of the gravitational lens. Such calculations produce estimates for the amount of material in a cluster of galaxies that greatly exceed what is inferred from the amount of light emitted by the cluster. This suggests that clusters of galaxies are accompanied by a great deal of material that does not emit light and is therefore known as dark matter. The identity of this dark matter is, as yet, unknown. The prime candidate is an unknown species of stable particle that would have been produced in vast quantities in the very early universe. We will return to this question in Chapter 12.

6.10 The Interior Schwarzschild Solution

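The interior Schwarzschild metric describes spacetime in the interior of a spherically symmetric body of density $\rho(r)$ and pressure $P(r)$ whose centre is situated at $r=0$. It takes the form

$$d\tau^2 = e^{2\psi(r)}\,dt^2 - \left(1-\frac{2GM(r)}{r}\right)^{-1}dr^2 - r^2\left(d\vartheta^2+\sin^2\vartheta\,d\varphi^2\right), \tag{6.103}$$

where

$$M(r) = 4\pi\int_0^r \rho(r')\,r'^2\,dr' \tag{6.104}$$

is the integrated mass from the centre, and $\psi(r)$ is the solution of

$$\frac{d\psi}{dr} = \frac{G\left(M(r)+4\pi r^3P(r)\right)}{r\left(r-2GM(r)\right)} \tag{6.105}$$

satisfying $\psi(\infty)=0$.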

In the idealized situation where the density $\rho$ is constant throughout the body, $M(r)=\frac{4}{3}\pi\rho r^3$. In this case, the spatial part of the metric (6.103) is

$$ds^2 = \frac{dr^2}{1-Kr^2} + r^2\left(d\vartheta^2+\sin^2\vartheta\,d\varphi^2\right), \tag{6.106}$$

with $K=\frac{8}{3}\pi G\rho$. This is the metric (5.73) of a 3-sphere of constant curvature $\frac{8}{3}\pi G\rho$, and hence radius $\left(\frac{3}{8\pi G\rho}\right)^{1/2}$. The interior metric only covers part of the 3-sphere, and only a very small part for a body like the Earth, where $GM(r)$ is everywhere much less than $r$, so $Kr^2$ is much less than 1.
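To get a feel for how small this curvature is for the Earth, one can evaluate K and Kr² at the surface. The sketch below restores the factor of c² to work in SI units. The values of G, c and the Earth's mean density and radius are standard reference figures that do not come from the text, and treating the Earth as uniform is itself an idealization.

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2 (assumed reference value)
C = 2.998e8          # speed of light, m/s (assumed reference value)
RHO_EARTH = 5514.0   # mean density of the Earth, kg/m^3 (assumed, uniform-density idealization)
R_EARTH = 6.371e6    # mean radius of the Earth, m (assumed)


def curvature_K(rho):
    """K = (8/3) pi G rho, divided by c^2 so that K has SI units of 1/m^2."""
    return 8.0 * math.pi * G * rho / (3.0 * C**2)


K = curvature_K(RHO_EARTH)
sphere_radius = K ** -0.5          # radius of the corresponding 3-sphere, in metres

print(f"K R^2 at the surface: {K * R_EARTH**2:.2e}")
print(f"3-sphere radius / Earth radius: {sphere_radius / R_EARTH:.0f}")
```

The product Kr² at the surface is of order 10⁻⁹, so the Earth's interior occupies only a tiny cap of the 3-sphere, as stated above.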

Outside a spherically symmetric mass, space is described by the exterior Schwarzschild solution, and has curvature components of both signs. This is similar to the hyperbolic soap film shown in Figure 1.3, where at each point on the surface the curvature components are equal and opposite to produce balancing surface tension forces. In the exterior Schwarzschild geometry there are three spatial dimensions and to satisfy the Einstein equation the inwardly directed curvature in the two angular directions balances the outwardly directed curvature in the radial direction, as given in equation (6.69). In the Newtonian picture, tidal forces stretch bodies radially and squeeze them in the orthogonal directions.

Within the mass, all three spatial curvatures are inwardly directed, so space is positively curved and bodies are squeezed in all three directions. The three curvature components are now balanced in the Einstein equation by the non-gravitational, outward stress exerted by the matter. The positive spatial curvature compresses the material of which the body is composed, and this is resisted by structural forces within the body. In the absence of such forces, which may be electromagnetic or nuclear, the body must collapse.

Figure 6.9 shows a 2-dimensional slice through the exterior and interior spatial metrics corresponding to the space in and around a spherical mass of uniform density. The interior Schwarzschild metric joins the exterior Schwarzschild metric continuously at the surface, where M(r) equals the total mass M. This confirms that the parameter M in the exterior metric is the total mass inside.


Fig. 6.9 2-dimensional slice through the Schwarzschild exterior and interior geometry. (Note that the portion of the 3-sphere within the spherical body has become part of a 2-sphere in the slice.)

6.11 Black Holes

Something strange appears to happen to the exterior Schwarzschild metric (6.61) at the radius $r_S=2GM$, which is known as the Schwarzschild radius. Here, $g_{tt}$ is zero and $g_{rr}$ is infinite. The Schwarzschild radius of a body is usually irrelevant, as the body is physically much larger than this radius and the exterior Schwarzschild geometry morphs into the interior Schwarzschild geometry at the surface of the body. For example, the Schwarzschild radius of the Sun is about 3 km, but the exterior solution only applies at distances greater than the radius of the Sun, which is about 700,000 km. Inside the Sun, to a very good approximation, the geometry is described by the interior Schwarzschild solution.

The Sun is supported by the pressure that arises from thermal motion of its constituent particles, which is dependent on the continuous release of energy through nuclear fusion, as we will discuss in Chapter 13. When a star has consumed its nuclear fuel, it must collapse under its own gravity. The end result depends on the mass of the star. A mass of up to $1.44\,M_\odot$ can be supported by electron degeneracy pressure in the form of a white dwarf. More massive stars collapse to form neutron stars which are supported by nuclear forces and neutron degeneracy pressure. They have radii of 10–15 km, perilously close to their Schwarzschild radii. The maximum mass that can be supported as a neutron star is believed to be in the region of 2–3 $M_\odot$. There is no known mechanism to support a star with a greater mass once its nuclear fuel has been consumed.

If a massive body is crushed under its own gravity to the extent that its radius shrinks within its Schwarzschild radius, then nothing can prevent its inexorable collapse. Such a collapsed object is known as a black hole, as not even light can escape from inside the Schwarzschild radius. The vacuum spacetime around a black hole of mass M is described by the Schwarzschild metric, from radius r=0 outwards.

The observational evidence for the existence of black holes is now overwhelming. Numerous examples are known of black holes with a mass of order ten solar masses, and supermassive black holes with masses of millions or even billions of solar masses are known to inhabit the central regions of most, if not all, galaxies. Recently, detection of gravitational waves, apparently generated in a black hole merger, has provided direct and compelling evidence for the existence of black holes.

A black hole is very small by cosmic standards. Because of this, it is quite unlikely that much material falls directly into one. Rather, a swirling accretion disc is expected to form around a black hole. Friction causes the material in the accretion disc to gradually lose energy and spiral inwards before finally plummeting into the abyss. A large amount of gravitational energy is released in this process. This heats the accretion disc to extremely high temperatures resulting in the emission of X-rays.

We will now examine the nature of the circular orbits in Schwarzschild spacetime with the aim of shedding some light on the energy release in a black hole accretion disc. The orbits available to unit mass particles in the spacetime described by the exterior Schwarzschild metric are given by solutions to equation (6.80). Stable circular orbits with no radial motion are found at the minimum of the potential (6.81). It is convenient to use again the variable $u=1/r$, so the potential is

$$V(u) = \frac{1}{2} - GMu + \frac{1}{2}l^2u^2 - GMl^2u^3. \tag{6.107}$$

Its stationary points are where $dV/du=0$, that is, where

$$-GM + l^2u - 3GMl^2u^2 = 0, \tag{6.108}$$

which we can rewrite as

$$\frac{GM}{l^2} = u\left(1-3GMu\right). \tag{6.109}$$

The right-hand side increases from 0 at $u=0$ to a maximum value $\frac{1}{12GM}$ at $u=\frac{1}{6GM}$, then decreases to 0 at $u=\frac{1}{3GM}$. So for all $l^2>12(GM)^2$, there are two solutions for $u$, one less than and one greater than $\frac{1}{6GM}$. The second derivative of the potential $V(u)$ is $l^2(1-6GMu)$, so the solution with $u<\frac{1}{6GM}$ is a minimum of $V$ and is stable, and the other solution is unstable. In terms of the radius, the orbits with $r>6GM$, three times the Schwarzschild radius, are stable, and those with $6GM>r>3GM$ are unstable.
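The two circular orbits for a given angular momentum can be found by solving the stationarity condition as a quadratic in u. The following is a minimal sketch under stated assumptions: it takes the condition in the form GM/l² = u(1 − 3GMu) and the second derivative l²(1 − 6GMu) described above, works in units with GM = 1 by default, and uses function names invented for the example.

```python
import math


def circular_orbits(l, gm=1.0):
    """Solve GM/l^2 = u(1 - 3GMu) for u = 1/r.

    Returns (u_stable, u_unstable), or None when l^2 < 12 (GM)^2
    and no circular orbit exists.
    """
    # The condition is the quadratic 3GM l^2 u^2 - l^2 u + GM = 0.
    disc = l**4 - 12.0 * gm**2 * l**2
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    u_stable = (l**2 - root) / (6.0 * gm * l**2)     # smaller u, larger radius
    u_unstable = (l**2 + root) / (6.0 * gm * l**2)   # larger u, smaller radius
    return u_stable, u_unstable


def second_derivative(l, u, gm=1.0):
    """d^2V/du^2 = l^2 (1 - 6GMu); positive at a stable orbit."""
    return l**2 * (1.0 - 6.0 * gm * u)


u_s, u_u = circular_orbits(4.0)
print(f"stable r = {1.0 / u_s:.3f} GM, unstable r = {1.0 / u_u:.3f} GM")
```

For l = 4GM this gives a stable orbit at r = 12GM and an unstable one at r = 4GM, on either side of r = 6GM as the argument requires.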

This means that the inner radius of an accretion disc around a black hole of mass $M$ lies at a distance of $r=6GM$, and particles there have the critical value of the angular momentum $l=\sqrt{12}\,GM$. We can readily calculate the energy released by any material that reaches this inner edge. Returning to equation (6.80), we see that the energy of a unit mass particle in a circular orbit around the black hole is given by

$$E^2 = \left(1-2GMu\right)\left(1+l^2u^2\right). \tag{6.110}$$

At the inner edge of the accretion disc, $u=\frac{1}{6GM}$ so $2GMu=\frac{1}{3}$, $l^2u^2=\frac{1}{3}$ and the particle has energy

$$E = \sqrt{\frac{8}{9}} = \frac{2\sqrt{2}}{3} \simeq 0.943. \tag{6.111}$$

The fraction of the mass of the particle that has therefore been released on its trip to this point is

$$1 - E = 1 - \frac{2\sqrt{2}}{3} \simeq 0.057. \tag{6.112}$$

From here on, the particle may be expected to rapidly fall into the black hole taking any kinetic energy generated in the final plummet with it. Therefore we can expect a total of about 5.7% of the mass accreted by the black hole to be emitted as energy before the mass disappears into the black hole. This may be compared to the nuclear fusion of hydrogen into helium, which releases around 0.7% of the mass of the hydrogen as energy. We will see shortly that spinning black holes have the potential to release even more energy into their environment.
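The 5.7% figure follows from evaluating the circular-orbit energy at the inner edge of the disc. Here is a short sketch, assuming the energy relation E² = (1 − 2GMu)(1 + l²u²) for a unit mass particle and units with GM = 1. The 0.7% fusion figure is the one quoted above, and the function name is illustrative.

```python
import math


def circular_orbit_energy(u, l, gm=1.0):
    """Energy per unit rest mass on a circular orbit at u = 1/r."""
    return math.sqrt((1.0 - 2.0 * gm * u) * (1.0 + l**2 * u**2))


gm = 1.0
u_inner = 1.0 / (6.0 * gm)         # inner edge of the disc, r = 6GM
l_inner = math.sqrt(12.0) * gm     # critical angular momentum

E = circular_orbit_energy(u_inner, l_inner, gm)
efficiency = 1.0 - E
FUSION_FRACTION = 0.007            # hydrogen-to-helium fusion, as quoted in the text

print(f"E = {E:.4f}, released fraction = {efficiency:.4f}")
print(f"ratio to fusion: {efficiency / FUSION_FRACTION:.1f}")
```

On these numbers, accretion on to a non-rotating black hole releases roughly eight times more energy per unit mass than hydrogen fusion.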

6.11.1 Eddington–Finkelstein coordinates

The exterior Schwarzschild metric (6.61) is asymptotically Minkowskian. The coordinates (t,r,ϑ,φ) that we have used to describe it are convenient for an observer far from the centre, but from the time component of the metric tensor we see that clocks appear to slow down and stop at the Schwarzschild radius rS=2GM. This implies a large redshift of any signals detected by a distant observer from an object close to the Schwarzschild radius. The redshift affects both the frequency of any emitted radiation and the period of time between radiation pulses. To a distant observer, an object falling into a black hole disappears just before reaching the Schwarzschild radius.

Something even more alarming appears to happen to the radial component of the metric tensor at the Schwarzschild radius. It blows up there, suggesting that the geometry is singular. However, this is an artifact of the coordinate choice, and in reality the metric as a whole remains smooth and Lorentzian at $r=r_S$. To understand this, we need better coordinates. Useful coordinates were discovered by Eddington in 1924 and independently rediscovered by David Finkelstein in 1958. The Eddington–Finkelstein coordinates retain $r,\vartheta,\varphi$ and replace the time $t$ with a new coordinate $v$ defined by

$$v = t + r + 2GM\log\left|\frac{r}{2GM}-1\right|. \tag{6.113}$$

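Differentiating this expression gives

$$dv = dt + \left(1-\frac{2GM}{r}\right)^{-1}dr, \tag{6.114}$$

and substituting for $dt$, the Schwarzschild metric (6.61) is transformed to

$$d\tau^2 = \left(1-\frac{2GM}{r}\right)dv^2 - 2\,dv\,dr - r^2\left(d\vartheta^2+\sin^2\vartheta\,d\varphi^2\right) \tag{6.115}$$

in both the regions $r<2GM$ and $r>2GM$. The metric is now well behaved even at $r=r_S=2GM$. The surface at this radius is a sphere with metric

$$ds^2 = (2GM)^2\left(d\vartheta^2+\sin^2\vartheta\,d\varphi^2\right). \tag{6.116}$$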

This is the boundary between light rays that fall towards the centre of the black hole and those that escape to infinity, and is known as the event horizon of the black hole. The event horizon has area $4\pi(2GM)^2$.

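For large $r$, the logarithmic term in equation (6.113) is negligible compared to $r$, so $t\simeq v-r$ and the metric is approximately

$$d\tau^2 = dv^2 - 2\,dv\,dr - r^2\left(d\vartheta^2+\sin^2\vartheta\,d\varphi^2\right), \tag{6.117}$$

which is the flat Minkowski metric, as one sees by changing to coordinates $(v-r,r,\vartheta,\varphi)$.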

Light travels along light-like geodesics, on which $d\tau^2=0$. We are interested here in the radial light rays, and can make use of the spherical symmetry to set $d\vartheta=d\varphi=0$. There are two radial rays through each point. In flat space far from the black hole they would travel in opposite directions, one radially inwards and one radially outwards, and would be represented on a time–radius diagram by lines at $45^\circ$ to the vertical.

The radial light rays in Eddington–Finkelstein coordinates are given by

$$\left(1-\frac{2GM}{r}\right)dv^2 - 2\,dv\,dr = 0. \tag{6.118}$$

One solution is $dv=0$, implying that $v$ is constant, and it represents a light ray going inwards towards the centre of the black hole. We see from equation (6.113) that as $t$ increases, $r$ must decrease if $v$ is to remain constant. This solution behaves as we might expect, but the second solution of (6.118), which satisfies

$$\frac{dr}{dv} = \frac{1}{2}\left(1-\frac{2GM}{r}\right), \tag{6.119}$$

is more remarkable. When $r>2GM$, $dr/dv$ is positive, so the ray is outgoing. However, when $r<2GM$, $dr/dv$ is negative, so the ray is ingoing. This means that once inside the event horizon of a black hole all the light emitted by a radiating body will ultimately fall inwards to the centre of the black hole. Integrating equation (6.119) gives

$$v = 2r + 4GM\log\left|\frac{r}{2GM}-1\right| + \text{constant}. \tag{6.120}$$

In Figure 6.10, $\tilde{t}=v-r$ is plotted against $r$. The figure shows the radial light-like geodesics around a black hole in Eddington–Finkelstein coordinates. The lightcones appear to tip over as the event horizon is approached. The paths of ingoing light rays, given by our first solution $v=$ constant, are shown as straight lines inclined at $-45^\circ$ to the axes. The curved lines represent our second solution, which, outside the event horizon, are the paths of outgoing light rays, but inside the event horizon are ingoing light rays falling to the centre of the black hole.


Fig. 6.10 Spacetime diagram in advanced Eddington–Finkelstein coordinates.
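The behaviour of the second family of radial light rays can be checked numerically. The sketch below assumes the ray equation dr/dv = ½(1 − 2GM/r) and its integral v = 2r + 4GM log|r/(2GM) − 1| + constant, in units with GM = 1 by default. The function names are illustrative only.

```python
import math


def outgoing_ray_v(r, gm=1.0, const=0.0):
    """v(r) along the second family of radial light rays."""
    return 2.0 * r + 4.0 * gm * math.log(abs(r / (2.0 * gm) - 1.0)) + const


def dr_dv(r, gm=1.0):
    """Slope dr/dv of the second family of radial light rays."""
    return 0.5 * (1.0 - 2.0 * gm / r)


def numerical_dr_dv(r, gm=1.0, h=1e-6):
    """Central-difference estimate of dr/dv from the integrated solution."""
    dv = outgoing_ray_v(r + h, gm) - outgoing_ray_v(r - h, gm)
    return 2.0 * h / dv


for r in (1.0, 1.5, 3.0, 4.0, 10.0):
    where = "inside" if r < 2.0 else "outside"
    print(f"r = {r:5.1f} ({where}): dr/dv = {dr_dv(r):+.3f}, "
          f"numerical = {numerical_dr_dv(r):+.3f}")
```

The slope is negative for every radius inside the horizon and positive outside it, which is the tipping of the lightcones seen in the diagram.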

The trajectory of any material particle always lies within the light cones, where $d\tau^2>0$. Using the lightcone diagram, we can visualize the possible radial trajectories of massive particles.

As we have seen, the apparent metric singularity at $r=r_S$ is innocuous, but the Schwarzschild metric has a singularity of an altogether different character at the point $r=0$. This singularity cannot be removed by a transformation to different coordinates. It is a point of infinite density and infinite spacetime curvature. As no such object can exist physically, it is believed that the prediction of this singularity indicates that general relativity has been stretched into a regime where it no longer accurately represents the physical world. At some point in the gravitational collapse within a black hole, such incredible densities will be reached that the physics can only be described in terms of a quantum theory of gravity. As yet, we do not have a viable quantum theory of gravity, so the centre of a black hole remains a mystery.

6.11.2 The Kerr metric

Although the exterior Schwarzschild metric is a possible geometry of spacetime around a black hole, there is a sense in which it is not physically realistic. Black holes are the result of the gravitational collapse of rotating objects, and are expected to be rapidly spinning. This has been confirmed by astronomical observations. The Schwarzschild metric, being spherically symmetric, describes the spacetime around a non-spinning spherical mass or black hole. In 1963, Roy Kerr found a more general solution of the vacuum Einstein equation. Outside a spinning body or black hole of mass M and angular momentum J, spacetime is described by the axisymmetric Kerr metric


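$$d\tau^2 = \left(1-\frac{2GMr}{\rho^2}\right)dt^2 + \frac{4GMar\sin^2\vartheta}{\rho^2}\,dt\,d\varphi - \frac{\rho^2}{r^2-2GMr+a^2}\,dr^2 - \rho^2\,d\vartheta^2 - \left(r^2+a^2+\frac{2GMra^2\sin^2\vartheta}{\rho^2}\right)\sin^2\vartheta\,d\varphi^2, \tag{6.121}$$

where $a=J/M$ is known as the angular momentum parameter, and $\rho^2=r^2+a^2\cos^2\vartheta$.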

The body generating the metric rotates steadily, so none of the Kerr metric components are functions of time. However, unlike the Schwarzschild metric, the Kerr metric includes a time-space cross-term $g_{t\varphi}\,dt\,d\varphi$, with a coefficient proportional to $J$. Time reversal, $t\to-t$, changes the sign of this term and no others. This may be cancelled by the transformation $\varphi\to-\varphi$, so time reversal is equivalent to reversing the direction of rotation of the body, that is, to reversing the sign of $J$. The Kerr metric is referred to as stationary, but not static. It reduces to the exterior Schwarzschild metric when $J=0$.

The Kerr metric is almost, but not quite, the most general metric representing a black hole. There is an extension known as the Kerr–Newman metric, which includes an electromagnetic field and describes an electrically charged, spinning black hole. In 1972 Stephen Hawking proved that this is the most general metric of an isolated black hole. So, according to general relativity, all black holes can be described in terms of just three parameters: M, J and Q, where M is the mass, J the angular momentum, and Q the charge. This is known as the no-hair theorem. There is no known mechanism for giving a significant charge to a black hole, so it is almost certain that real black holes can be described simply in terms of M and J.

The radius $r_+$ of the event horizon of a spinning black hole is smaller than in the non-spinning case and is given by

$$r_+ = \frac{1}{2}r_S + \sqrt{\frac{1}{4}r_S^2 - a^2}, \tag{6.122}$$

where $r_S=2GM$ is the Schwarzschild radius. The maximum possible angular momentum parameter of the black hole is $a=\frac{1}{2}r_S=GM$, in which case the angular momentum is $J=GM^2$. In this limit the event horizon radius is $r_+=GM$, half the Schwarzschild radius. The angular velocity of the event horizon is $\Omega=\frac{a}{2GMr_+}$. This is the rate at which light rays at the event horizon rotate around the black hole.
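The dependence of the horizon on spin is easy to tabulate. This sketch assumes the horizon radius r₊ = ½r_S + √(¼r_S² − a²) and takes the horizon angular velocity to be a/(2GMr₊), with the factor of G written explicitly so that the units are consistent. Units with GM = 1 are used by default and the function names are hypothetical.

```python
import math


def horizon_radius(a, gm=1.0):
    """Event horizon radius r_+ for angular momentum parameter a (|a| <= GM)."""
    r_s = 2.0 * gm
    if abs(a) > gm:
        raise ValueError("a exceeds the maximum value GM")
    return 0.5 * r_s + math.sqrt(0.25 * r_s**2 - a**2)


def horizon_angular_velocity(a, gm=1.0):
    """Angular velocity of the horizon, a / (2 GM r_+)."""
    return a / (2.0 * gm * horizon_radius(a, gm))


for a in (0.0, 0.6, 0.9, 1.0):
    print(f"a = {a:.1f} GM: r_+ = {horizon_radius(a):.4f} GM, "
          f"Omega = {horizon_angular_velocity(a):.4f} / GM")
```

The radius falls from 2GM at zero spin to GM at maximal spin, where the angular velocity reaches 1/(2GM).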

A freely falling body in the vicinity of a black hole, with no angular momentum, follows a time-like geodesic that takes it inside the event horizon towards the centre of the black hole. To remain static with respect to a distant observer, such that it hovers above the event horizon at a fixed radial coordinate $r$, the body must be subject to an acceleration, provided by a rocket engine perhaps. For such a body, $dr=d\vartheta=d\varphi=0$. If we first consider Schwarzschild spacetime, then on such a trajectory all the terms of the metric (6.61) vanish except $g_{tt}\,dt^2$. The worldline of a massive particle must be time-like, so $d\tau^2>0$, which implies that $g_{tt}>0$ and therefore $\frac{2GM}{r}<1$. This is always true outside the event horizon, so given sufficient acceleration it is always possible for the body to remain static. This is not so in the Kerr spacetime around a spinning black hole. In Kerr spacetime there is a region outside the event horizon known as the ergosphere, in which it is still possible to escape the black hole, but it is not possible to remain static with respect to a distant observer. Static time-like worldlines are again only possible if $g_{tt}$ is positive and therefore

$$1 - \frac{2GMr}{\rho^2} > 0. \tag{6.123}$$

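This condition only holds outside the ergosphere. The boundary of the ergosphere is determined by the quadratic equation

$$r^2 - 2GMr + a^2\cos^2\vartheta = 0, \tag{6.124}$$

so it is the oblate spheroid

$$r = GM + \sqrt{(GM)^2 - a^2\cos^2\vartheta}. \tag{6.125}$$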

Any body inside the ergosphere will necessarily be dragged around by the rotation of the black hole.

The boundary of the ergosphere touches the event horizon at the poles, ϑ=0 and ϑ=π, where the effect of the black hole’s spin disappears. The ergosphere was named by Roger Penrose, who showed in 1969 that it is possible to extract rotational energy from a black hole. According to his scheme, material could be sent into the ergosphere where it would divide into two pieces, one of which is fired into the black hole with negative energy, while the other escapes to infinity with greater total energy than the original material that entered the ergosphere. The Penrose process is depicted in Figure 6.11.


Fig. 6.11 The ergosphere of a rotating black hole.
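The shape of the ergosphere can be explored by comparing its boundary with the event horizon at different polar angles. The sketch assumes the boundary is the outer root of r² − 2GMr + a²cos²ϑ = 0 and that the horizon lies at GM + √((GM)² − a²). It uses units with GM = 1 by default, and the function names are invented for the example.

```python
import math


def ergosphere_radius(theta, a, gm=1.0):
    """Outer root of r^2 - 2GMr + a^2 cos^2(theta) = 0."""
    return gm + math.sqrt(gm**2 - (a * math.cos(theta))**2)


def horizon_radius(a, gm=1.0):
    """Event horizon radius of a black hole with angular momentum parameter a."""
    return gm + math.sqrt(gm**2 - a**2)


a = 0.9
for degrees in (0, 30, 60, 90):
    theta = math.radians(degrees)
    gap = ergosphere_radius(theta, a) - horizon_radius(a)
    print(f"theta = {degrees:2d} deg: ergosphere thickness = {gap:.4f} GM")
```

The thickness vanishes at the poles and is greatest in the equatorial plane, where the boundary sits at r = 2GM whatever the spin.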

According to the Kerr metric, even a massive spinning body such as the Earth drags space around with it. This frame-dragging effect, as it is known, was confirmed by Gravity Probe B, which was placed in Earth orbit in 2004. The frame-dragging due to the Earth’s rotation was measured to an accuracy of around 15% by a quartet of gyroscopes aboard the probe. The effect was measured to be just 40 milliarcseconds per year, which agrees with the prediction of general relativity. The effect is much greater in the vicinity of a black hole; it implies that the accretion disc of a spinning black hole must lie in the black hole’s equatorial plane and rotate in the same direction as the black hole. Orbits in the same direction as the black hole’s spin are known as co-rotating, and orbits in the opposite direction are counter-rotating. There are stable co-rotating particle orbits in the Kerr metric that are much closer to the origin than in the Schwarzschild case, so the inner edge of an accretion disc around a rapidly spinning black hole is much deeper in its gravitational well. This greatly increases the binding energy of the closest stable circular orbit. Orbits of unit mass particles in the equatorial plane ($\vartheta=\pi/2$) of the Kerr metric correspond to solutions of

$$\frac{1}{2}\left(\frac{dr}{d\tau}\right)^2 + V(r) = \frac{1}{2}E^2. \tag{6.126}$$

In terms of $u=1/r$, the effective potential here is

$$V(u) = \frac{1}{2} - GMu + \frac{1}{2}\left(l^2 + a^2(1-E^2)\right)u^2 - GM\left(l-aE\right)^2u^3, \tag{6.127}$$

where $l>0$ for co-rotating orbits and $l<0$ for counter-rotating orbits. $V$ reduces to the Schwarzschild potential (6.107) when $a=0$. For circular orbits, $r$ and hence $u$ is constant, so $V(u)=\frac{1}{2}E^2$. These orbits are stable if $V(u)$ is a minimum, requiring $dV/du=0$ and $d^2V/du^2>0$. The inner edge of the accretion disc lies at the radius $r_{\min}$ where $d^2V/du^2=0$, beyond which there are no stable circular orbits. From the simultaneous equations arising from the conditions $dV/du=d^2V/du^2=0$, we obtain

$$\left(l-aE\right)^2 = \frac{1}{3}r_{\min}^2, \qquad l^2 + a^2(1-E^2) = 2GMr_{\min}. \tag{6.128}$$

Substituting these expressions into the equation $V(u)=\frac{1}{2}E^2$, we find

$$E^2 = 1 - \frac{2GM}{3r_{\min}}. \tag{6.129}$$

The inner edge of the accretion disc is closer to the event horizon than for a Schwarzschild black hole. In the limiting case of a maximally spinning black hole, where $a=GM$, the inner edge coincides with the event horizon⁷ at $r_{\min}=r_+=GM$. In this case, equation (6.129) implies $E=\frac{1}{\sqrt{3}}$. To reach this radius, particles must release a large proportion of their rest mass as energy; as $1-E=1-\frac{1}{\sqrt{3}}\simeq0.42$, a remarkable 42% of the rest mass of material in the accretion disc is converted into other forms of energy prior to the material entering the black hole.
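The way the released fraction grows with spin can be sketched numerically. Two ingredients are assumed here. The first is the relation E² = 1 − 2GM/(3r_min) at the innermost stable circular orbit. The second is the standard closed-form expression for r_min as a function of spin due to Bardeen, Press and Teukolsky, which is not derived in this chapter and is brought in purely for illustration. The function names and the dimensionless spin `a_star` = a/(GM) are conventions of this example.

```python
import math


def isco_radius(a_star, corotating=True):
    """r_min / (GM) for dimensionless spin a_star = a / (GM), with 0 <= a_star <= 1.

    Uses the closed-form result of Bardeen, Press and Teukolsky (assumed here,
    not derived in the text).
    """
    z1 = 1.0 + (1.0 - a_star**2) ** (1.0 / 3.0) * (
        (1.0 + a_star) ** (1.0 / 3.0) + (1.0 - a_star) ** (1.0 / 3.0)
    )
    z2 = math.sqrt(3.0 * a_star**2 + z1**2)
    sign = -1.0 if corotating else 1.0
    return 3.0 + z2 + sign * math.sqrt((3.0 - z1) * (3.0 + z1 + 2.0 * z2))


def released_fraction(a_star, corotating=True):
    """Fraction 1 - E of rest mass released on reaching the inner edge."""
    r_min = isco_radius(a_star, corotating)
    return 1.0 - math.sqrt(1.0 - 2.0 / (3.0 * r_min))


for a_star in (0.0, 0.5, 0.87, 1.0):
    print(f"a = {a_star:.2f} GM: r_min = {isco_radius(a_star):.3f} GM, "
          f"released = {100.0 * released_fraction(a_star):.1f}%")
```

The two limits reproduce the values in the text: r_min = 6GM with about 5.7% released for no spin, and r_min = GM with about 42% released for maximal spin.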

This has important astrophysical consequences. The release of gravitational energy from material falling into a black hole spinning at close to the maximum possible rate will approach 30–40%. For this reason, rapidly spinning supermassive black holes are now generally accepted as the origin of the most energetic phenomena in the universe, such as quasars and active galactic nuclei.

A quasar at a distance of six billion light years, discovered in 2008, appears as four images due to the lensing effect of an intervening galaxy, as shown in Figure 6.12. The energy source of the quasar, designated RX J1131-1231, is thought to be a supermassive black hole with a mass of around $10^{8}\,M_\odot$. The quasar images are magnified by a factor of 50 by the gravitational lens. This has enabled astrophysicists to determine the inner radius of the black hole’s accretion disc, by measuring the broadening of an emission line in the spectrum of iron atoms in the disc due to their gravitational redshift.⁸ It has been estimated that the inner edge of the accretion disc has a radius less than $3GM$, half that of a non-spinning Schwarzschild black hole, so the black hole must be spinning extremely rapidly. The most likely value for the angular momentum parameter is $a\simeq0.87\,GM$.


Fig. 6.12 The quasar designated RX J1131-1231 appears as four images here due to a gravitational lens—the three bright spots on the left of the ring and the one on the right. The diameter of the ring is about 3 arcseconds. (Combined image from NASA’s Chandra X-ray Observatory and the Hubble Space Telescope.)

6.12 Gravitational Waves

When an electrically charged object such as an electron is shaken, it emits electromagnetic waves. This is what happens in a radio transmitter. Pulses of the electromagnetic field propagate through spacetime in accordance with Maxwell’s equations, and produce oscillating forces when they impinge on test charges. Similarly, according to general relativity, shaking or colliding massive objects generate gravitational waves. These ripples in the gravitational field are propagating distortions of the spacetime metric; they are not simply oscillations of the coordinates, because the curvature oscillates too. The detection of a gravitational wave requires the position of at least two test particles to be monitored. Figure 6.13 shows the effect of a passing gravitational wave on a ring of test particles.


Fig. 6.13 The effect of a gravitational wave of one polarization on a ring of test particles. Five frames are shown from one wave cycle.

Gravitational waves have no analogue in Newtonian gravity, because the Newtonian potential $\phi$ is determined by the instantaneous matter density and does not obey a wave equation, so their very existence is a critical test of general relativity. Because gravity is intrinsically so weak, gravitational waves have an incredibly small amplitude. Only the most energetic events in the universe produce waves that could conceivably be detected on Earth. It is thought that the largest gravitational waves incident on the Earth produce fractional changes in distance of the order of $10^{-21}$. Despite this, they carry vast amounts of energy distributed over enormous regions.

As gravitational wave amplitudes are so small, we can safely use the linear approximation to the metric tensor,

$$g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}. \tag{6.130}$$

$\eta_{\mu\nu}$ is the metric tensor of flat Minkowski space and $h_{\mu\nu}$ is a small perturbation that corresponds to the gravitational wave. The vacuum Einstein equation for $g_{\mu\nu}$ reduces to a linear wave equation for $h_{\mu\nu}$. In Cartesian coordinates $(t,x,y,z)$, there are two independent gravitational wave solutions for waves propagating in the $z$-direction. They are both polarized transverse to the $z$-direction and propagate at the speed of light.


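The metric for the polarization shown in Figure 6.13 is

$$d\tau^2 = dt^2 - \left(1+f(t-z)\right)dx^2 - \left(1-f(t-z)\right)dy^2 - dz^2, \tag{6.131}$$

and for the polarization shown in Figure 6.14, rotated by $45^\circ$, it is

$$d\tau^2 = dt^2 - dx^2 - 2g(t-z)\,dx\,dy - dy^2 - dz^2. \tag{6.132}$$

$f(t-z)$ and $g(t-z)$ are arbitrary functions of small amplitude.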
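The distortion of a ring of test particles can be reproduced with a short calculation. The sketch assumes the first polarization has the metric dτ² = dt² − (1 + f)dx² − (1 − f)dy² − dz², so that proper distances along x are scaled by √(1 + f) and those along y by √(1 − f) while the coordinate positions of free particles stay fixed. The sinusoidal profile, the grossly exaggerated amplitude of 0.2 and the function names are all choices made for illustration.

```python
import math


def ring_positions(f, n=8, radius=1.0):
    """Proper positions of n test particles on a ring for wave value f (|f| < 1)."""
    points = []
    for k in range(n):
        angle = 2.0 * math.pi * k / n
        x0 = radius * math.cos(angle)
        y0 = radius * math.sin(angle)
        # Proper lengths stretch along x and shrink along y when f > 0.
        points.append((x0 * math.sqrt(1.0 + f), y0 * math.sqrt(1.0 - f)))
    return points


def wave_profile(t, z=0.0, amplitude=0.2, period=1.0):
    """An assumed sinusoidal f(t - z); real amplitudes are of order 1e-21."""
    return amplitude * math.sin(2.0 * math.pi * (t - z) / period)


# Five frames spanning one wave cycle.
for frame in range(5):
    t = frame / 4.0
    f = wave_profile(t)
    pts = ring_positions(f, n=4)
    width = pts[0][0] - pts[2][0]     # extent along x
    height = pts[1][1] - pts[3][1]    # extent along y
    print(f"t = {t:.2f}: f = {f:+.2f}, width = {width:.3f}, height = {height:.3f}")
```

The ring is alternately stretched along x and squeezed along y, then the reverse half a cycle later, returning to a circle whenever f passes through zero.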


Fig. 6.14 The effect of a gravitational wave of the other polarization.

6.12.1 The detection of gravitational waves

The existence of gravitational waves has been confirmed indirectly by monitoring the Hulse–Taylor binary neutron star system PSR B1913+16, described in section 6.8. The pulsar signal has been observed for several decades and the period of the orbit is gradually diminishing; each year it decreases by 76 microseconds. This can be compared to the expected decrease of the orbital period due to the energy lost through the emission of gravitational radiation, as shown in Figure 6.15. The agreement is a staggeringly good confirmation of general relativity.


Fig. 6.15 Binary neutron star and gravitational wave emission.
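The size of the expected orbital decay can be estimated from the quadrupole formula for a binary in an eccentric orbit. That formula (due to Peters and Mathews) is not given in this chapter and is assumed here purely for illustration, as are the value of GM⊙/c³ and the function name. The masses, period and eccentricity are those quoted for PSR B1913+16 in section 6.8.

```python
import math

G_MSUN_OVER_C3 = 4.925491e-6   # G * M_sun / c^3 in seconds (assumed reference value)
SECONDS_PER_YEAR = 3.15576e7   # Julian year


def period_decay_rate(m1, m2, period_s, e):
    """dP/dt (dimensionless) for a binary radiating gravitational waves.

    m1, m2 are in solar masses, period_s in seconds, e is the eccentricity.
    Assumes the Peters-Mathews quadrupole formula.
    """
    chirp_mass = (m1 * m2) ** 0.6 / (m1 + m2) ** 0.2
    enhancement = (1.0 + 73.0 / 24.0 * e**2 + 37.0 / 96.0 * e**4) / (1.0 - e**2) ** 3.5
    x = 2.0 * math.pi * G_MSUN_OVER_C3 * chirp_mass / period_s
    return -(192.0 * math.pi / 5.0) * x ** (5.0 / 3.0) * enhancement


rate = period_decay_rate(1.4411, 1.3873, 7.75 * 3600.0, 0.617)
microseconds_per_year = -rate * SECONDS_PER_YEAR * 1e6
print(f"predicted decrease: {microseconds_per_year:.0f} microseconds per year")
```

With these rounded inputs the estimate comes out close to the observed decrease of 76 microseconds per year. The eccentricity factor matters a great deal: for a circular orbit of the same period the decay would be about twelve times slower.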

The emission of gravitational radiation has now been confirmed in other binary pulsar systems including PSR J0348+0432, a system discovered at the Green Bank observatory in West Virginia in 2007. This remarkable system consists of a neutron star of mass $2\,M_\odot$ in a tight orbit with a white dwarf of mass $0.17\,M_\odot$. Their orbital period is just 2 hours 27 minutes, and it decays at the expected rate of 8 microseconds per year.

The detection of gravitational waves on Earth has been an important goal for physicists for several decades. Detectors have been built at various locations around the world. These include LIGO (Laser Interferometer Gravitational-Wave Observatory) which has two facilities separated by a distance of 3000 km at Hanford, Washington and Livingston, Louisiana in the United States. A schematic set-up of one of these facilities is shown in Figure 6.16. The interferometers are L-shaped with two perpendicular 4 km long arms. The whole apparatus is housed within an ultrahigh vacuum. A laser beam impinges on a beamsplitter that directs half the beam down each arm of the interferometer. The light is then bounced back and forth 400 times between two mirrors in each arm that act as test masses, before passing through the beamsplitter again where the two half-beams are recombined and sent to a photodetector. This makes the arms effectively 1600 km long. If the light travels exactly the same distance down both arms the waves cancel, with the peaks in one beam meeting the troughs in the other, so no signal is detected by the photodetector. However, a passing gravitational wave changes the relative lengths of the arms very slightly, in which case the light waves no longer perfectly cancel and a signal is detected. The sensitivity of the apparatus is extraordinary, as it must be to have any chance of detecting gravitational waves. The latest phase of operation is dubbed Advanced LIGO. The upgraded detectors are now sensitive to gravitational waves with amplitudes as small as $5\times10^{-22}$. Two widely separated facilities are required to distinguish true gravitational wave events from the inevitable noise from local background disturbances.


Fig. 6.16 LIGO.

On 14 September 2015, four days before the official start of the Advanced LIGO programme, an unmistakable and practically identical signal lasting 0.2 seconds was measured by both detectors within a few milliseconds of each other, as shown in Figure 6.17. This signal was interpreted as a train of gravitational waves produced by the merger of two black holes at a distance of around 1.3 billion light years. It was the first ever detection of a binary black hole system and the most direct observation of black holes ever made. The signal from the event also confirms that gravitational waves travel at the speed of light.


Fig. 6.17 First gravitational wave signal detected by Advanced LIGO.

Binary black holes should emit a continuous stream of gravitational waves at twice their orbital frequency. With their emission, the binary system loses energy and the black holes gradually spiral together. In the final moments of inspiral, the amplitude of the waves increases dramatically. Initially, the newly merged black hole is rather asymmetrical, but it rapidly settles down with a final blast of gravitational waves, known as the ring-down. The signal detected by Advanced LIGO was produced during the final inspiral and ring-down.

Comparison with computer models of black hole merger processes enables researchers to extract a great deal of information about the observed event. The frequency of the gravitational waves allows the masses of the black holes to be deduced and the amplitude of the waves allows their distance to be estimated. Also, the difference in the arrival time of the waves at the two LIGO facilities determines the direction towards the event, at least roughly. Putting all this information together, we know that in this first event, the signal was produced by the merger of two black holes with masses of around $29\,M_\odot$ and $36\,M_\odot$ that coalesced to form a rapidly spinning black hole of $62\,M_\odot$. In the process, an incredible $3\,M_\odot$ was converted into energy in the form of gravitational waves. The resulting black hole has an angular momentum parameter $a\simeq0.67\,GM$.

The energy density of an expanding spherical wavefront of electromagnetic radiation emitted by a star decreases with the inverse square of distance from the star. This follows from the conservation of energy. Similarly, the energy density of a gravitational wave decreases with the inverse square of distance from its source. But there is an important difference in how we detect these two types of wave. With electromagnetic waves, we always measure their energy density or intensity, whether the detector is our eye, a CCD camera or a photographic plate. Gravitational wave detectors, on the other hand, directly measure the amplitude of gravitational waves. This is rather advantageous. The energy density of a wave is proportional to the square of its amplitude, so the wave amplitude only decreases inversely with distance from the source. This means that if the sensitivity of Advanced LIGO could be increased by another factor of 10, the distance out to which a given source can be detected would also increase by a factor of 10, and the volume of space being surveyed would be increased by a factor of 1000. This could increase the rate at which black hole mergers and other extreme events are observed by over 1000 times, as they were probably more common in the distant past. We could possibly see black hole mergers like the one described here all the way back to the Big Bang.
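The scaling argument can be written out in a few lines of Python. This is a minimal sketch that assumes flat Euclidean space and sources spread uniformly through it, ignoring cosmological effects; the function names are invented for the illustration. The last function shows, for comparison, what the gain would be for a detector that responds to energy flux instead of amplitude.

```python
def range_gain(sensitivity_gain):
    """Amplitude falls as 1/distance, so the reach grows linearly with sensitivity."""
    return sensitivity_gain


def surveyed_volume_gain(sensitivity_gain):
    """The volume of a Euclidean sphere grows as the cube of its radius."""
    return range_gain(sensitivity_gain) ** 3


def volume_gain_if_energy_detected(sensitivity_gain):
    """Comparison case: flux falls as 1/distance^2, so reach grows as the square root."""
    return sensitivity_gain ** 1.5


for gain in (3, 10):
    print(gain, surveyed_volume_gain(gain), volume_gain_if_energy_detected(gain))
```

A tenfold improvement therefore gives a factor of 1000 in volume for an amplitude detector, against only about 32 for a detector of energy flux.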

There are already plans to increase the sensitivity of Advanced LIGO by a factor of three in the next round of upgrades, and gravitational wave detectors elsewhere are also coming online. The era of gravitational wave astronomy has only just begun.

6.13 The Einstein–Hilbert Action

In section 3.2 we considered the action $S$ of a classical field theory. It is of the form

$$S = \int \mathcal{L}(\mathbf{x},t)\,d^3x\,dt,$$

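where the Lagrangian density $\mathcal{L}(\mathbf{x},t)$ is integrated over flat 4-dimensional Minkowski space. In the case of a relativistic scalar field $\psi$ the Lagrangian density is

$$\mathcal{L} = \frac{1}{2}\left(\frac{\partial\psi}{\partial t}\right)^2 - \frac{1}{2}\nabla\psi\cdot\nabla\psi - U(\psi).$$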

The principle of least action says that for a physical field evolution $\psi(\mathbf{x},t)$, the action $S$ is stationary under any variation of the field. As we have seen, the field equation may be derived by varying $\psi$, and equating $\delta S$ to zero.

We can use the same procedure in general relativity, but now the dynamical field is the spacetime metric itself, so this must be varied, and we cannot simply assume a fixed, flat background spacetime. Within days of Einstein’s announcement of the field equation of general relativity in November 1915, Hilbert found an appropriate action for the theory, which is now known as the Einstein–Hilbert action. The Lagrangian density must be a scalar, and the simplest one available is the Ricci scalar $R$. This is indeed the Lagrangian density, and the Einstein–Hilbert action is

$$S = \int R\,\sqrt{-g}\,d^4y.$$

Here $\sqrt{-g}\,d^4y$ is the spacetime volume of a coordinate integration element $d^4y$. It is known as the measure of the integral. $g$ is the determinant of the metric $g_{\mu\nu}$, and the minus sign is required for a Lorentzian metric so that $-g$ is positive. $\sqrt{-g}$ is also the Jacobian determinant that arises when one changes from local normal coordinates to a general coordinate system in the integral.

Let us look at a couple of illustrative examples. In section 5.4.1 we considered the metric tensor

$$g_{ij} = \begin{pmatrix} a^2 & 0 \\ 0 & a^2\sin^2\vartheta \end{pmatrix}$$

on a 2-sphere of radius $a$. As there are no off-diagonal terms, the infinitesimal squared distance is

$$ds^2 = a^2\,d\vartheta^2 + a^2\sin^2\vartheta\,d\varphi^2,$$

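and an infinitesimal area element is $a\,d\vartheta \times a\sin\vartheta\,d\varphi = a^2\sin\vartheta\,d\vartheta\,d\varphi = \sqrt{g}\,d\vartheta\,d\varphi$. This is the measure that must be used when integrating over the 2-sphere. The total area of the sphere is

$$A = \int_0^{2\pi}\!\!\int_0^{\pi} a^2\sin\vartheta\,d\vartheta\,d\varphi = 4\pi a^2.$$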

With a diagonal 4-dimensional Lorentzian metric, the measure is

$$\sqrt{-g}\,d^4y = \sqrt{-g_{00}\,g_{11}\,g_{22}\,g_{33}}\;dy^0\,dy^1\,dy^2\,dy^3.$$

For instance, the measure for the exterior Schwarzschild metric (6.61) is

$$\sqrt{-g}\,d^4y = r^2\sin\vartheta\,dt\,dr\,d\vartheta\,d\varphi.$$

The metric is diagonal in these coordinates, so $g$ is the product of the four diagonal entries of the metric tensor. (The factors $Z$ and $Z^{-1}$ in the first two terms of the Schwarzschild metric cancel.) We could change coordinates and in general this would produce off-diagonal terms. However, the appropriate measure is still $\sqrt{-g}\,d^4y$.
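Both examples can be checked numerically. The Python sketch below integrates the 2-sphere measure with a simple midpoint rule, and builds $\sqrt{-g}$ for the Schwarzschild metric from the product of its diagonal entries. It assumes units with $c = 1$, the signature $(+,-,-,-)$ and $Z = 1 - 2GM/r$; the function names and sample values are chosen purely for illustration.

```python
import math


def sphere_area(a, steps=2000):
    """Integrate the measure a^2 sin(theta) dtheta dphi over the 2-sphere (midpoint rule)."""
    d_theta = math.pi / steps
    theta_integral = sum(
        a**2 * math.sin((i + 0.5) * d_theta) * d_theta for i in range(steps)
    )
    return 2.0 * math.pi * theta_integral  # the integrand does not depend on phi


def schwarzschild_sqrt_minus_g(r, theta, gm):
    """sqrt(-g) from the product of the diagonal metric entries (units with c = 1)."""
    z = 1.0 - 2.0 * gm / r  # assumed form of Z, valid outside r = 2GM
    diagonal = [z, -1.0 / z, -(r**2), -(r**2) * math.sin(theta) ** 2]
    return math.sqrt(-math.prod(diagonal))


print(sphere_area(2.0), 4 * math.pi * 2.0**2)
print(schwarzschild_sqrt_minus_g(5.0, 1.0, 1.0), 5.0**2 * math.sin(1.0))
```

The second result does not depend on the mass parameter, which is the cancellation of $Z$ and $Z^{-1}$ noted above.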

Rather than give a complete derivation of the field equation of general relativity from the Einstein–Hilbert action, which is rather technical and complicated, we will partially sketch it out. The field equation is the condition for the action $S$ to be unchanged to first order when a small change $\delta g^{\mu\nu}$ is made to the (inverse) metric tensor. Both the Ricci scalar and the measure vary as the metric varies. To see what this implies, we require the following results, which we simply quote (the first holds up to a total derivative, which does not contribute to $\delta S$):

$$\delta R = R_{\mu\nu}\,\delta g^{\mu\nu}, \qquad \delta\sqrt{-g} = -\frac{1}{2}\sqrt{-g}\,g_{\mu\nu}\,\delta g^{\mu\nu}.$$

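From these expressions, it follows that

$$\delta S = \int \left(R_{\mu\nu} - \frac{1}{2}R\,g_{\mu\nu}\right)\delta g^{\mu\nu}\,\sqrt{-g}\,d^4y.$$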

The tensor in brackets is the Einstein tensor $G_{\mu\nu}$. According to the principle of least action, $\delta S = 0$ for any infinitesimal variation $\delta g^{\mu\nu}$. This will only be true if the tensor in brackets is zero, which tells us that the vacuum Einstein equation is $G_{\mu\nu} = 0$.
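The term $-\frac{1}{2}R\,g_{\mu\nu}$ in the Einstein tensor comes from the variation of the measure, which obeys the standard identity $\delta\sqrt{-g} = -\frac{1}{2}\sqrt{-g}\,g_{\mu\nu}\,\delta g^{\mu\nu}$. The following Python sketch compares this first-order formula with a direct finite difference. To keep it short it is restricted to a diagonal metric, which is a simplifying assumption; the numerical entries and function names are arbitrary choices made for the example.

```python
import math


def sqrt_minus_g(inverse_diagonal):
    """sqrt(-g) for a diagonal metric, given the diagonal entries of g^{mu nu}."""
    # For a diagonal metric g_{mu mu} = 1 / g^{mu mu}, so det(g) = 1 / prod(g^{mu mu}).
    return 1.0 / math.sqrt(-math.prod(inverse_diagonal))


def predicted_change(inverse_diagonal, delta):
    """First-order formula: -1/2 sqrt(-g) g_{mu nu} delta g^{mu nu}."""
    contraction = sum(d / g_up for g_up, d in zip(inverse_diagonal, delta))
    return -0.5 * sqrt_minus_g(inverse_diagonal) * contraction


# Arbitrary (assumed) diagonal inverse metric with signature (+, -, -, -).
g_up = [1.3, -0.7, -2.1, -0.4]
delta_g_up = [1e-6, -2e-6, 3e-6, 5e-7]

perturbed = [g + d for g, d in zip(g_up, delta_g_up)]
actual = sqrt_minus_g(perturbed) - sqrt_minus_g(g_up)
predicted = predicted_change(g_up, delta_g_up)
print(actual, predicted)
```

The two numbers agree up to terms of second order in the small change, as expected of a first-order variation.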

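We can also include matter fields in the theory and then the action becomes

$$S = S_G + S_M = \int R\,\sqrt{-g}\,d^4y + \alpha\int \mathcal{L}_M\,\sqrt{-g}\,d^4y,$$

where $\alpha$ is a constant of proportionality and $\mathcal{L}_M$ is the Lagrangian density for the matter fields. In general the matter Lagrangian will depend on various fields, such as scalar fields or Maxwell fields. If we vary $S_M$ we find

$$\delta S_M = \alpha\int\left(\frac{\delta\mathcal{L}_M}{\delta g^{\mu\nu}} - \frac{1}{2}g_{\mu\nu}\,\mathcal{L}_M\right)\delta g^{\mu\nu}\,\sqrt{-g}\,d^4y.$$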

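The energy–momentum tensor is defined to be (see note 9)

$$T_{\mu\nu} = -2\,\frac{\delta\mathcal{L}_M}{\delta g^{\mu\nu}} + g_{\mu\nu}\,\mathcal{L}_M,$$

so if we vary the whole action $S_G + S_M$ with respect to both the metric and the matter fields, we find

$$G_{\mu\nu} = \frac{1}{2}\alpha\,T_{\mu\nu},$$

together with the field equations of the matter fields in a curved spacetime background. Fixing the constant to be $\alpha = 16\pi G$, we have the Einstein equation in the presence of matter.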

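The only other term that could be added to the Lagrangian density is a constant, $\mathcal{L}_\Lambda = 2\Lambda$, known as the cosmological constant term. (The factor of 2 is conventional.) The variation of the additional action $S_\Lambda$ is

$$\delta S_\Lambda = \int 2\Lambda\,\delta\sqrt{-g}\,d^4y = -\int \Lambda\,g_{\mu\nu}\,\delta g^{\mu\nu}\,\sqrt{-g}\,d^4y.$$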

If this term is included, then the full Einstein equation is

$$G_{\mu\nu} - \Lambda\,g_{\mu\nu} = 8\pi G\,T_{\mu\nu}.$$

We will consider the significance of the cosmological constant in Chapter 14.

6.14 Further Reading

For an overview of gravitation and an introduction to general relativity, see

M. Begelman and M. Rees, Gravity’s Fatal Attraction: Black Holes in the Universe (2nd ed.), Cambridge: CUP, 2010.

N.J. Mee, Gravity: Cracking the Cosmic Code, London: Virtual Image, 2014.

For comprehensive coverage of general relativity, see

I.R. Kenyon, General Relativity, Oxford: OUP, 1990.

S. Carroll, Spacetime and Geometry: An Introduction to General Relativity, San Francisco: Addison Wesley, 2004.

J.B. Hartle, Gravity: An Introduction to Einstein’s General Relativity, San Francisco: Addison Wesley, 2003.

For an approach to general relativity based on particle dynamics, see

J. Franklin, Advanced Mechanics and General Relativity, Cambridge: CUP, 2010.

For a comprehensive treatise on black holes, see

V.P. Frolov and I.D. Novikov, Black Hole Physics: Basic Concepts and New Developments, Dordrecht: Kluwer, 1998.


(1) By convention, there is no factor of 4π here.

(2) From here on, we express the infinitesimal interval as dτ, and an infinitesimal distance as ds, rather than using the notation δτ or δs.

(3) The same equation occurs if the square root is omitted in the integrand, as optimizing the square root of a function is essentially the same problem as optimizing the function itself.

(4) This has been checked for atomic clocks subject to moderate accelerations, but of course fails for clocks that depend on the force of gravity, like pendulum clocks.

(5) Red light forms the low frequency end of the visible spectrum. Redshift is the term used to describe the decrease in frequency of electromagnetic radiation, whether or not it is in the visible part of the spectrum.

(6) The solution sinφ is not needed if we choose φ to be zero at the maximum of u.

(7) The minimum radius of counter-rotating circular orbits is r=9GM for a maximally spinning black hole.

(8) The line corresponds to the emission of 6.4 keV X-ray photons.

(9) This curved spacetime approach to determining the energy–momentum tensor of matter fields is very convenient, and is consistent with what is found by considering energy and momentum conservation in Minkowski spacetime.