General Relativity For Teletubbies

Sir Kevin Aylward B.Sc., Warden of the Kings Ale


Axiomatic Derivation of Special Relativity




This is a reproduction of a posting by Tom Roberts, Physicist, to the sci.physics.relativity newsgroup on 1996/10/22.

The post describes an axiomatic approach to the derivation of Special Relativity. It states a set of axioms and derives the Lorentz Transform from them by general arguments. Its implication is that such a derivation supports the view that “space-time”, that is, “geometry”, explains why clocks incur different readings when subjected to different velocity profiles. It doesn’t, as explained in Geometry&Relativity.


From: tjrob@bluebird.flw.att.com (Tom Roberts)

Subject: A Physicist's Derivation of Special Relativity

Date: 1996/10/22

Message-ID: <54jfst$glp@ssbunews.ih.lucent.com>

organization: Bell Laboratories

keywords: Relativity, Group Theory, Symmetry

newsgroups: sci.physics.relativity


Many criticisms of Special Relativity center on the "assumption" that the speed of light is constant in all reference frames. The derivation given here does not make that assumption; the existence of a universal speed (c) is a natural consequence of the Postulates forming the basis of the derivation.  General symmetry properties of space-time are sufficient to determine the equations of the Lorentz Transformation [to within a topological choice - see below]. The bottom line is that it is IMPOSSIBLE to formulate an alternative to Special Relativity, while obeying the observed symmetries of space-time and agreeing with the experimental evidence [see below about the limitations of the symmetry postulates used in this derivation].

This article will, I hope, show why physicists believe in Special Relativity (within its applicable domain), and are extremely sceptical of "alternative descriptions". Historically, it took a long time for physicists to accept Special Relativity. Even today, the compelling derivation given here is usually not presented in textbooks; I don't know why.

I claim no originality for this derivation; I do not know who originally discovered it; I have re-created it based upon dimly-remembered ideas from graduate school.

Written by:

          Tom Roberts

          Lucent Technologies / Bell Laboratories

          tjroberts@lucent.com

          original date: sometime 1989-1990


Colloquially, a Lorentz Transformation is called a "boost".

This derivation will be heavy going in algebra; I hope it will be understandable to most people with a good understanding of elementary algebra and a smattering of common sense. This is NOT a rigorous mathematical derivation, but one at the level of rigor common to physicists.

NOTATION:

          F(x)             F is a function of x

          a*b              a multiplied by b

          A**2           A squared (raised to the 2nd power == A*A)

          ==               "is identical to", or "is the same as"

          =                 mathematical equality (NOT the FORTRAN meaning)

First, four Postulates will be given, with a brief discussion. Then, the general form of the transformation equations will be derived, followed by a brief discussion of their implications.

THE MAPPING POSTULATE

When two observers observe the same physical space-time, they assign individual coordinate systems to THE SAME points of space-time. There is a relation between the assignments they (separately) make, which is called a coordinate transformation, usually expressed as a consistent set of mathematical formulas relating the coordinates of one observer to the coordinates of the other. The coordinate transformation from one system to the other MUST be one-to-one and onto the other, BECAUSE THEY ARE DESCRIBING THE SAME PHYSICAL SPACE-TIME; the transformation must be invertible (see Relativity Postulate, below). [Mathematicians worry about a lot of conditions for this, and for the other Postulates; this is a Physicist's derivation, and will assume that physical systems satisfy the mathematical conditions necessary (continuity, etc.).]

THE ISOTROPY/HOMOGENEITY POSTULATE

Space is isotropic, in that there is no "preferred direction" in space. The transformation must have the same mathematical form for a boost in any (spatial) direction. Space is also homogeneous, in that there is no "preferred position" in space. The transformation must have the same mathematical form for any origin of coordinates; this applies to time, as well.

THE RELATIVITY POSTULATE

There is no "preferred velocity", or "Preferred coordinate system" - only relative velocities are observable.  If coordinate system S' is moving with velocity v, as observed in coordinate system S, then S is moving with velocity -v, as observed in S'. [This is Einstein's fundamental departure from classical physics. Today, it seems natural.]

THE GROUP POSTULATE

The collection of all possible Transformations must form a group under composition by successive application of transformations.

This is the key postulate, and the one that makes a general derivation of the transformation equations possible; it imposes severe constraints on the form of the equations. It has four important implications:

 1. An identity transformation exists, which maps a coordinate system to itself.

 2. Any transformation has an inverse, which is also a transformation.

 3. The result of applying two transformations in succession is itself a transformation.

 4. The application of three transformations in succession follows the law of associativity [ABC = (AB)C = A(BC)].

[This is a more modern approach to the subject than was common in Einstein's day; Einstein was instrumental in pointing out how important symmetries are in physics, which leads naturally to group theory.]

Those are the Postulates; make sure you understand and believe in them now, because they are sufficient to derive the general form of the transformation equations.

[It may be a surprise to some readers that no postulate includes a statement that light has the same speed in all frames; such a statement is not required. This is just one example of the power of group theory.]

Now for the math...

This derivation will be done in "1+1" dimensions, that is, for one space coordinate and one time coordinate. The derivation would follow similar lines in "3+1" dimensions, but the extra complexity would exhibit no additional features.

There are three frames of reference (or coordinate systems) of interest; they will be called S, S', and S"; their coordinates will be called x and t, x' and t', and x" and t", respectively. They are constructed so that their x, x', and x" axes are all collinear, with the origins of coordinates coincident (i.e. in the exact same place in the real (i.e. physical) space-time); that is, the coordinates x=0,t=0 and x'=0,t'=0 and x"=0,t"=0 all refer to the same point (event) in the real space-time.  The Homogeneity Postulate guarantees that no special significance arises from their coincident origins of coordinates.

All three frames will use the same scales for length and time (these simplifications are not necessary, but relaxing them would add unenlightening complexity).

The difference between the three frames is their relative velocities. We will call the velocity of S' as measured in S, u; the velocity of S" as measured in S', v; and the velocity of S" as measured in S, w. The physical situation ensures that these assignments can all be made. Implicit is the assumption that the relative velocities are constant (but arbitrary).

The Mapping Postulate and the Homogeneity postulate imply that the transformation equations are linear, with coefficients independent of position. That is

          x' = A(u)*x + D(u)*t + E(u)                        1

          t' = B(u)*x + C(u)*t + F(u)                         2

The coefficients (A,B,C,D,E,F) can depend upon the relative velocity between S' and S (i.e. upon u), but there is no other quantity that can have physical relevance. Thus, eqn 1 & 2 are the most general possible transformation equations satisfying the postulates. This is important; if there were other powers of x or t on the right-hand side, the transformation would not be one-to-one everywhere. If the coefficients depended upon x or t (as they do in General Relativity - see below), then space-time would not be homogeneous and isotropic.

[Some other derivations use a postulate that straight lines are transformed into straight lines to deduce the linearity of the transformation equations.]

The translation terms (E(u) and F(u)) can easily be calculated, based upon the construction of the systems S and S'; they are both 0.

They are not functions of u, because we arranged for coincident coordinate origins independently of u (i.e. for each value of u, the origins were individually arranged to be coincident). This is true also for the other transformations (S' to S", and S to S"). The Homogeneity Postulate guarantees that this choice has no physical significance. Since S' is moving with velocity u relative to S, the point x'=0 is moving with velocity u (with respect to S), so x'=0 whenever x=u*t. Putting this into Eqn 1 gives D(u) = -u*A(u), which determines D(u) with no loss in generality:

          x' = A(u) * (x - u*t)                                     3

          t' = B(u)*x + C(u)*t                                    4

Note: u=0 is certainly possible, in which case S'==S, so A(0)=1, B(0)=0, C(0)=1 (i.e. x'=x and t'=t). In the following, u and v will be assumed to be non-zero, but w will have no such restriction.

The transformations S' to S", and S to S" follow similarly:

          x" = A(v) * (x' - v*t')                                  5

          t" = B(v)*x' + C(v)*t'                                  6

          x" = A(w) * (x - w*t)                                  7

          t" = B(w)*x + C(w)*t                                  8

We will now use the Group Postulate to compose Eqns 3 and 4 with Eqns 5 and 6, to get 7 and 8 (i.e. u and v are arbitrarily fixed, and w will be determined from them).

Substituting 3 and 4 into 5 and 6:

          x" = [A(v)*A(u) - A(v)*v*B(u)]*x - [A(v)*v*C(u) + A(v)*A(u)*u]*t     9

          t" = [B(v)*A(u) + C(v)*B(u)]*x + [(-u)*B(v)*A(u) + C(v)*C(u)]*t       10
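The substitution that produces Eqns 9 and 10 is easy to get wrong by hand. As a sanity check, the following minimal Python sketch treats Eqns 3-4 and Eqns 5-6 as 2x2 matrices, fills in arbitrary rational numbers for the values of A, B, and C, and compares the matrix product with the coefficients written out in Eqns 9 and 10. The function names are illustrative assumptions and are not part of the original post.

```python
from fractions import Fraction
import random


def compose(second, first):
    """2x2 matrix product: apply `first`, then `second`."""
    (a, b), (c, d) = second
    (e, f), (g, h) = first
    return ((a*e + b*g, a*f + b*h),
            (c*e + d*g, c*f + d*h))


def boost_matrix(A, B, C, vel):
    """Matrix form of Eqns 3-4: x' = A*(x - vel*t), t' = B*x + C*t."""
    return ((A, -A*vel), (B, C))


def eqn_9_10_coefficients(Au, Bu, Cu, u, Av, Bv, Cv, v):
    """Coefficients of x and t exactly as written in Eqns 9 and 10."""
    return ((Av*Au - Av*v*Bu, -(Av*v*Cu + Av*Au*u)),
            (Bv*Au + Cv*Bu, (-u)*Bv*Au + Cv*Cu))


def check_substitution(trials=200, seed=0):
    """Return True if Eqns 9-10 match the direct composition every time."""
    rng = random.Random(seed)

    def rand():
        # Exact rationals, so the comparison below is exact, not approximate.
        return Fraction(rng.randint(-20, 20), rng.randint(1, 20))

    for _ in range(trials):
        Au, Bu, Cu, u, Av, Bv, Cv, v = (rand() for _ in range(8))
        direct = compose(boost_matrix(Av, Bv, Cv, v),
                         boost_matrix(Au, Bu, Cu, u))
        if direct != eqn_9_10_coefficients(Au, Bu, Cu, u, Av, Bv, Cv, v):
            return False
    return True


if __name__ == "__main__":
    print(check_substitution())
```

Because the check uses independent random values for each coefficient, it tests only the algebra of the substitution; it says nothing yet about what A, B, and C actually are.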

Comparing 9 and 10 with 7 and 8, and equating coefficients of x and t (Eqns 7-10 are each valid for ALL x and ALL t), we conclude:

          A(w) = A(v)*A(u) - A(v)*v*B(u)                         11

          w*A(w) = A(v)*v*C(u) + A(v)*A(u)*u                12

          B(w) = B(v)*A(u) + C(v)*B(u)                             13

          C(w) = C(u)*C(v) - u*B(v)*A(u)                          14

Now, let's consider the special case v=-u. Then Eqns 5 and 6 will be the inverse of Eqns 3 and 4 (Relativity Postulate), so w=0. Using A(0)=1, B(0)=0, C(0)=1, Eqns 11-14 become:

          1 = A(-u)*A(u) + u*A(-u)*B(u)                  15

          0 = A(-u)*(-u)*C(u) + A(-u)*A(u)*u           16

          0 = B(-u)*A(u) + C(-u)*B(u)                       17

          1 = C(u)*C(-u) - u*B(-u)*A(u)                    18

Assuming u*A(-u) is non-zero (see below), eqn 16 says:

          C(u) = A(u)                                       19

[Note this is true in general (not just for v=-u); it is a mathematical statement about the two functions, valid for all u.]

The Isotropy postulate requires that C(-u) = C(u) [if I boost S' in a different direction (i.e. backwards), the clocks of S' must be affected exactly the same as before]. With Eqn 19 this also gives A(-u) = A(u), and Eqn 17 then gives:

          B(-u) = -B(u)                                               20

[Note that the stated symmetries of A(u), B(u), and C(u) are all consistent with their values at u=0 given above.]

With Eqns 19 and 20, Eqns 16 and 17 are satisfied identically, and Eqns 15 and 18 both reduce to:

          1 = A(u)**2 + u*A(u)*B(u)                        21
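The collapse of four equations into one can be checked mechanically. The sketch below, assuming only Eqn 19, the Isotropy condition C(-u) = C(u), and Eqn 20, evaluates the right-hand sides of Eqns 15-18 for arbitrary rational values of A(u), B(u), and u, and compares them with the right-hand side of Eqn 21. The function names are hypothetical.

```python
from fractions import Fraction
import random


def rhs_eqns_15_to_18(A_u, B_u, u):
    """Right-hand sides of Eqns 15-18 after imposing the symmetries.

    Assumes C(u) = A(u) (Eqn 19), C(-u) = C(u) (Isotropy), and
    therefore A(-u) = A(u), plus B(-u) = -B(u) (Eqn 20).
    """
    A_m, B_m = A_u, -B_u   # A(-u), B(-u)
    C_u, C_m = A_u, A_u    # C(u), C(-u)
    e15 = A_m*A_u + u*A_m*B_u
    e16 = A_m*(-u)*C_u + A_m*A_u*u
    e17 = B_m*A_u + C_m*B_u
    e18 = C_u*C_m - u*B_m*A_u
    return e15, e16, e17, e18


def rhs_eqn_21(A_u, B_u, u):
    """Right-hand side of Eqn 21."""
    return A_u**2 + u*A_u*B_u


def check_reduction(trials=200, seed=1):
    """True if 16 and 17 vanish and 15 and 18 both equal Eqn 21's RHS."""
    rng = random.Random(seed)
    for _ in range(trials):
        A_u, B_u, u = (Fraction(rng.randint(-20, 20), rng.randint(1, 20))
                       for _ in range(3))
        e15, e16, e17, e18 = rhs_eqns_15_to_18(A_u, B_u, u)
        target = rhs_eqn_21(A_u, B_u, u)
        if e16 != 0 or e17 != 0 or e15 != target or e18 != target:
            return False
    return True


if __name__ == "__main__":
    print(check_reduction())
```

Exact rational arithmetic is used so that "vanishes identically" can be tested with plain equality rather than a floating-point tolerance.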

Returning to the general case (arbitrary v) and substituting Eqn 19, Eqns 11-14 become:

          A(w) = A(v)*A(u) - v*A(v)*B(u)                         22

          w*A(w) = v*A(u)*A(v) + u*A(u)*A(v)                23

          B(w) = B(v)*A(u) + A(v)*B(u)                             24

          A(w) = A(u)*A(v) - u*B(v)*A(u)                         25

[Note the symmetry of Eqns 22-25 under interchange of u <-> v (interchange 22 and 25); this is expected, as adding collinear velocities should not depend upon their order.]

Eqns 22 and 25 yield:

          v*A(v)*B(u) = u*B(v)*A(u)                       26

or (assuming u*A(u) and v*A(v) are both non-zero):

          B(u)/(u*A(u)) = B(v)/(v*A(v))                    27

Since Eqn 27 must hold for all u and for all v independently, each side of it must equal the same universal constant; call it q:

          q == B(u)/(u*A(u)) = B(v)/(v*A(v))            28

or

          B(u) = q*u*A(u)                                         29

Substituting 29 into Eqn 21 gives:

          1 = A(u)**2 + q*u**2*A(u)**2                  30

Solving for A(u) gives:

          A(u) = 1/sqrt(1+q*u**2)                             31

Combining Eqns 3, 4, 19, 29, and 31, we have the general form of the transformation equations:

          A(u) = 1/sqrt(1+q*u**2)                             31

          x' = A(u) * (x - u*t)                                     32

          t' = q*u*A(u)*x + A(u)*t                            33
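A quick way to gain confidence in Eqns 31-33 is to check the Relativity Postulate numerically: a boost by u followed by a boost by -u should return every event to its original coordinates, for any sign of q. The sketch below is a minimal illustration under that assumption; the function names and the sample values of x, t, u, and q are chosen here for illustration only.

```python
import math


def A(u, q):
    """Eqn 31. Only defined while 1 + q*u**2 > 0."""
    return 1.0 / math.sqrt(1.0 + q*u**2)


def transform(x, t, u, q):
    """Eqns 32-33: coordinates (x', t') of the event (x, t)."""
    a = A(u, q)
    return a*(x - u*t), q*u*a*x + a*t


def round_trip(x, t, u, q):
    """Boost by u, then by -u; should give back (x, t)."""
    x1, t1 = transform(x, t, u, q)
    return transform(x1, t1, -u, q)


if __name__ == "__main__":
    # One sample value of q for each of the three cases: q<0, q=0, q>0.
    for q in (-1.0, 0.0, 0.25):
        print(q, round_trip(3.0, 2.0, 0.6, q))
```

The round trip works for all three signs of q, which is consistent with the statement that the postulates alone do not pick out a value of q.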

By solving Eqn 22 for w (or, more directly, by dividing Eqn 23 by Eqn 22 and using Eqn 29), we get the rule for composition of velocities:

          w = (u + v) / (1 - q*u*v)                    34
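The composition rule can be cross-checked against the group conditions it came from. The sketch below plugs Eqns 29, 31, and 34 back into Eqns 22-25 and returns the difference between the two sides of each equation. It assumes 1 - q*u*v > 0, so that the positive square root in Eqn 31 is the right branch for A(w); the function names are hypothetical.

```python
import math


def A(u, q):
    """Eqn 31."""
    return 1.0 / math.sqrt(1.0 + q*u**2)


def B(u, q):
    """Eqn 29."""
    return q*u*A(u, q)


def compose_velocities(u, v, q):
    """Eqn 34."""
    return (u + v) / (1.0 - q*u*v)


def residuals_22_to_25(u, v, q):
    """Left side minus right side of Eqns 22, 23, 24, 25.

    Assumes 1 - q*u*v > 0 (see the lead-in).
    """
    w = compose_velocities(u, v, q)
    Au, Av, Aw = A(u, q), A(v, q), A(w, q)
    Bu, Bv, Bw = B(u, q), B(v, q), B(w, q)
    return (
        Aw - (Av*Au - v*Av*Bu),          # Eqn 22
        w*Aw - (v*Au*Av + u*Au*Av),      # Eqn 23
        Bw - (Bv*Au + Av*Bu),            # Eqn 24
        Aw - (Au*Av - u*Bv*Au),          # Eqn 25
    )


if __name__ == "__main__":
    for u, v, q in ((0.6, 0.7, -1.0), (0.6, 0.7, 0.0), (0.5, 0.8, 0.25)):
        print(q, residuals_22_to_25(u, v, q))
```

All four residuals should be zero to within floating-point rounding for each of the three sample cases.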

The choice of q is arbitrary. There are three basic choices that have significantly different behavior: zero, negative, and positive.

This is the topological choice mentioned above.

Choosing q=0 yields the Galilean transformation:

          x' = x - u*t                                        35

          t' = t                                                  36

          w = u + v                                          37

          Note the universal time; velocities simply add.

These are the "familiar" transformation equations that are approximately true (to very high accuracy) in our ordinary lives where velocities are small.

Choosing q<0 yields the Lorentz transformation.

By convention, define a constant, c, by q==-1/c**2 [manifestly negative], and let G(u/c)==A(u); we then have:

          x' = G(u/c) * (x - (u/c)*ct)                           38

          ct' = -(u/c)*G(u/c)*x + G(u/c)*ct                 39

          G(u/c) = 1/sqrt(1-(u/c)**2)                          40

          w/c = (u/c + v/c) / (1 + (u/c)*(v/c))              41

Here, ct and ct' are the time coordinates multiplied by c (which gives them the same units as x and x': length).

Normally, G(u/c) is called gamma, and u/c is called beta.

In the limit u/c -> 0, Eqns 38-41 reduce to Eqns 35-37, the Galilean transformation. In the Lorentz case, velocities do not simply add, but have a more complicated composition rule; an object moving with velocity c in one frame moves with velocity c in all frames.
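To get a feel for how good the Galilean limit is, the sketch below compares the velocity sum of Eqn 37 with that of Eqn 41 for everyday speeds. It assumes, purely for illustration, that c takes the value of the speed of light in m/s; the function and constant names are hypothetical.

```python
C_LIGHT = 299_792_458.0  # m/s; assumed value of c, for illustration only


def galilean_sum(u, v):
    """Eqn 37."""
    return u + v


def lorentz_sum(u, v, c=C_LIGHT):
    """Eqn 41, multiplied through by c to give w."""
    return (u + v) / (1.0 + (u/c)*(v/c))


def relative_correction(u, v, c=C_LIGHT):
    """Fractional amount by which Eqn 37 overshoots Eqn 41."""
    return (galilean_sum(u, v) - lorentz_sum(u, v, c)) / galilean_sum(u, v)


if __name__ == "__main__":
    # Two cars at 30 m/s each, then two objects at half of c each.
    print(relative_correction(30.0, 30.0))
    print(relative_correction(0.5*C_LIGHT, 0.5*C_LIGHT))
```

For the two cars the correction is of order u*v/c**2, far below anything noticeable in ordinary life, while at half of c it is a sizeable fraction of the result.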

Note, however, that the transformation equations are not well-behaved when transforming to a frame moving with velocity c; the velocity c serves as a limiting velocity, because G(u/c) goes to infinity as u/c goes to 1. Eqn 41 guarantees that the composition of two velocities will be less than c, as long as the individual velocities are each less than c. If u/c > 1, imaginary numbers appear, leading most physicists to be sceptical of the physical applicability of such velocities.
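The limiting role of c can be illustrated numerically. The following sketch works in units where c = 1 (an assumption made here for convenience) and shows that composing c with any velocity returns c, that repeated sub-c boosts never reach c, and that Eqn 40 has no real value for u/c > 1. The function names are hypothetical.

```python
import math


def gamma(beta):
    """Eqn 40 with beta = u/c.

    Real only for |beta| < 1: math.sqrt raises ValueError for
    |beta| > 1, and the division fails at |beta| == 1.
    """
    return 1.0 / math.sqrt(1.0 - beta**2)


def compose_betas(beta_u, beta_v):
    """Eqn 41 in units where c = 1."""
    return (beta_u + beta_v) / (1.0 + beta_u*beta_v)


def repeated_boosts(beta, n):
    """Compose n successive boosts, each of the same beta."""
    total = 0.0
    for _ in range(n):
        total = compose_betas(total, beta)
    return total


if __name__ == "__main__":
    print(compose_betas(1.0, 0.5))    # c composed with c/2
    print(repeated_boosts(0.5, 10))   # ten boosts of c/2 each
    print(gamma(0.6), gamma(0.999999))
```

In floating-point arithmetic a very long chain of boosts will eventually round to exactly 1.0, so the example keeps n small; that rounding is a numerical artifact and not a property of Eqn 41.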

Choosing q>0 yields a third transformation; here q==+1/c**2 [manifestly positive], and H(u/c)==A(u):

          x' = H(u/c) * (x - (u/c)*ct)                           42

          ct' = (u/c)*H(u/c)*x + H(u/c)*ct                  43

          H(u/c) = 1/sqrt(1+(u/c)**2)                         44

          w/c = (u/c + v/c) / (1 - (u/c)*(v/c))               45

This transformation can be cast into more familiar form by substituting:

          u = c * tan(k)  (where k,l,m are in the range -PI/2 < k,l,m < +PI/2 )

          v = c * tan(l)

          w = c * tan(m)

Then the transformation becomes (with some analytic continuation and trigonometric identities):

          x' = x * cos(k) - ct * sin(k)                          46

          ct' = x * sin(k) + ct * cos(k)                        47

          H(u/c) = 1/sqrt(1+tan(k)**2) = cos(k)         48

          m = k + l                                                    49

This transformation is clearly a simple Euclidean transformation, in which the time coordinate behaves just like the spatial coordinate, and boosts are simple rotations.  The limit (u/c) -> 0 still yields the Galilean transformation. The velocity c serves as a velocity "scale", but nothing dramatic happens to the transformation when (u/c) = 1 (i.e. k = PI/4), or when (u/c) > 1.

The singularity of Eq. 45 disappears when "velocities" are viewed as "angles" in Eq. 49. Note that two positive velocities greater than c are composed into a NEGATIVE velocity (Eq. 45), which is explained by Eq. 46-49 as simply going more than halfway around a circle.
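The correspondence between Eqn 45 and the addition of angles in Eqn 49 can be checked directly. The sketch below uses units where c = 1 (an assumption for convenience) and hypothetical function names; it composes two velocities with Eqn 45, compares the result with tan(k + l), and confirms that Eqns 46-47 preserve the Euclidean quantity x**2 + (ct)**2.

```python
import math


def compose_q_positive(beta_u, beta_v):
    """Eqn 45 in units where c = 1."""
    return (beta_u + beta_v) / (1.0 - beta_u*beta_v)


def rotate(x, ct, k):
    """Eqns 46-47: a boost as a rotation by the angle k."""
    return (x*math.cos(k) - ct*math.sin(k),
            x*math.sin(k) + ct*math.cos(k))


def angle_sum(beta_u, beta_v):
    """Eqn 49: m = k + l, with k = atan(u/c) and l = atan(v/c)."""
    return math.atan(beta_u) + math.atan(beta_v)


if __name__ == "__main__":
    # Two "velocities" greater than c compose to a negative one.
    w = compose_q_positive(2.0, 3.0)
    m = angle_sum(2.0, 3.0)
    print(w, math.tan(m), m > math.pi/2)

    # A rotation leaves x**2 + (ct)**2 unchanged.
    x1, ct1 = rotate(3.0, 4.0, math.atan(2.0))
    print(x1**2 + ct1**2)
```

For u = 2c and v = 3c the summed angle exceeds PI/2, outside the range quoted for k, l, and m, which is one way to see the negative composed velocity.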

Note that there is no velocity that is the same in all frames, and that causality is not necessarily preserved by a coordinate transformation (ct' can run BACKWARDS with respect to ct). It seems very difficult to build a world view based upon Eqs 42-45 (or 46-49).

Before discussing the implications of these transformation equations, let me suggest the following exercises:

Exercise for the reader: At several places in the derivation, the velocities u and v were assumed to be non-zero, and some functions of u or v were also assumed to be non-zero. Verify that all such assumptions are valid.

Exercise for the reader: Re-do the derivation while retaining the translation terms of Eqns 1 and 2; show that their presence does not change the conclusions.

Exercise for experts: As you know, the full Poincare group includes not only the boosts derived here, but also spatial rotations and two point transformations: parity inversion (x' = -x) and time reversal (t' = -t). Don't bother deriving the equations for the full Poincare group (adding rotations is trivial, but tedious). Instead, note that Eqns 31-33 were derived from very general considerations, BUT DO NOT INCLUDE THE POINT TRANSFORMATIONS. Point out exactly where they were left out, and modify the derivation to retain them.

Identifying the actual topology of space-time can only be done by resorting to physical observations of phenomena in the real world (i.e. by doing an experiment). There is a tremendous body of experimental evidence that shows that the speed of light is independent of the velocities of either the source or observer (there are also many other, equivalent observations).

This compels us to choose the Lorentz Transformation (Eqns 38-41), and  to identify the arbitrary constant "c" with the speed of light. No other choice is possible, while satisfying the four Postulates and the experimental evidence.

This is why most (if not all) physicists today believe in Special Relativity - it is IMPOSSIBLE to construct an alternative description without violating one of the postulates or disregarding a very large body of experimental evidence. If you truly believe that Special Relativity simply must be false (for whatever reason), go back and review the four Postulates, and find a hole in them.

Einstein DID find a hole in the four Postulates, and brought us General Relativity. He was, in a very real sense, the first fish to see the water, and to describe it.

Einstein's departure was in the Isotropy/Homogeneity Postulate - he proposed that space-time is isotropic and homogeneous only within an infinitesimal region of any given point in space-time; that is, in the presence of matter, space-time itself is NOT homogeneous, but its geometry is affected by the presence of matter.

[Before General Relativity, Cartesian coordinates were used as a matter of course, and their applicability to the real world was never challenged (Lagrangian mechanics is very different from this). Physical theory had two basic parts, in which the Laws of Physics possess many symmetries (such as isotropy and homogeneity of space-time), while the initial conditions rarely possess the same symmetries (this remains true today, but the lesson has been learned to be careful). Implicitly, these symmetries were applied GLOBALLY, to the entire space-time (e.g. as in the statements of the Postulates above). After General Relativity, Cartesian coordinates have been replaced by general curvilinear coordinates, and the symmetries are LOCAL in nature (i.e. apply only within each infinitesimal region of space-time). Unfortunately, this generality causes enormous complexity in the mathematics; curvilinear coordinates and general coordinate transformations have not yet been successfully applied to the other great advancement in physics of the Twentieth Century - Quantum Mechanics.]

Tom Roberts

Lucent Technologies / Bell Laboratories

tjroberts@lucent.com