General Relativity For Teletubbies
Sir Kevin Aylward B.Sc., Warden of the Kings Ale
Axiomatic Derivation Of Special Relativity
This is a reproduction of a posting by Tom Roberts,
Physicist, to the sci.physics.relativity newsgroup on 1996/10/22.
The post describes an axiomatic approach to the
derivation of Special Relativity. It states axioms and derives from general
procedures the Lorentz Transform. Its implication is that such a derivation
supports the view that “space-time”, that is, “geometry” explains why clocks
incur different readings when subjected to different velocity profiles. It
doesn’t, as explained in Geometry&Relativity.
From: tjrob@bluebird.flw.att.com (Tom Roberts)
Subject: A Physicist's Derivation of Special Relativity
Date: 1996/10/22
Message-ID: <54jfst$glp@ssbunews.ih.lucent.com>
organization: Bell Laboratories
keywords: Relativity, Group Theory, Symmetry
newsgroups: sci.physics.relativity
Many criticisms of Special Relativity center on the
"assumption" that the speed of light is constant in all reference
frames. The derivation given here does not make that assumption; the existence
of a universal speed (c) is a natural consequence of the Postulates forming the
basis of the derivation. General symmetry
properties of space-time are sufficient to determine the equations of the Lorentz
Transformation [to within a topological choice - see below]. The bottom line is
that it is IMPOSSIBLE to formulate an alternative to Special Relativity, while
obeying the observed symmetries of space-time and agreeing with the
experimental evidence [see below about the limitations of the symmetry postulates
used in this derivation].
This article will, I hope, show why physicists
believe in Special Relativity (within its applicable domain), and are extremely
sceptical of "alternative descriptions". Historically, it took a long
time for physicists to accept Special Relativity. Even today, the compelling
derivation given here is usually not presented in textbooks; I don't know why.
I claim no originality for this derivation; I do not
know who originally discovered it; I have re-created it based upon
dimly-remembered ideas from graduate school.
Written by: Tom Roberts
Lucent Technologies / Bell Laboratories
tjroberts@lucent.com
original date: sometime 1989-1990
Colloquially, a Lorentz Transformation is called a
"boost".
This derivation will be heavy going, in algebra; I
hope it will be understandable to most people with a good understanding of
elementary algebra, and a smattering of common sense. This is NOT a rigorous mathematical
derivation, but one at the level of rigor common to physicists.
NOTATION:
F(x)    F is a function of x
a*b     a multiplied by b
A**2    A squared (raised to the 2nd power == A*A)
==      "is identical to", or "is the same as"
=       mathematical equality (NOT the FORTRAN meaning)
First, four Postulates will be given, with a brief
discussion. Then, the general form of the transformation equations will be
derived, followed by a brief discussion of their implications.
THE MAPPING POSTULATE
When two observers observe the same physical
space-time, they assign individual coordinate systems to THE SAME points of
space-time. There is a relation between the assignments they (separately) make,
which is called a coordinate transformation, usually expressed as a consistent set
of mathematical formulas relating the coordinates of one observer to the
coordinates of the other. The coordinate transformation from one system to the
other MUST be one-to-one and onto the other, BECAUSE THEY ARE DESCRIBING THE
SAME PHYSICAL SPACE-TIME; the transformation must be invertible (see Relativity
Postulate, below). [Mathematicians worry about a lot of conditions for this,
and for the other Postulates; this is a Physicist's derivation, and will assume
that physical systems satisfy the mathematical conditions necessary
(continuity, etc.).]
THE ISOTROPY/HOMOGENEITY POSTULATE
Space is isotropic, in that there is no
"preferred direction" in space. The transformation must have the same
mathematical form for a boost in any (spatial) direction. Space is also
homogeneous, in that there is no "preferred position" in space. The
transformation must have the same mathematical form for any origin of
coordinates; this applies to time, as well.
THE RELATIVITY POSTULATE
There is no "preferred velocity", or "preferred
coordinate system" - only relative velocities are observable. If coordinate system S' is moving with
velocity v, as observed in coordinate system S, then S is moving with velocity
-v, as observed in S'. [This is Einstein's fundamental departure from classical
physics. Today, it seems natural.]
THE GROUP POSTULATE
The collection of all possible Transformations must
form a group under composition by successive application of transformations.
This is the key postulate, and the one that makes a
general derivation of the transformation equations possible; it imposes severe
constraints on the form of the equations. It has four important implications:
1. An identity transformation exists, which maps a coordinate system to itself.
2. Any transformation has an inverse, which is also a transformation.
3. The result of applying two transformations in succession is itself a transformation.
4. The application of three transformations in succession follows the law of
associativity [ABC = (AB)C = A(BC)].
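The four properties can be illustrated with a minimal Python sketch. It uses the familiar Galilean boost (x' = x - u*t, t' = t) purely as a stand-in, because the general transformation has not been derived at this point. The helper names galilean and matmul are hypothetical and are not part of the original post.

```python
# Sketch only: Galilean boosts as 2x2 matrices acting on the column (x, t).
# The velocities are chosen to be exactly representable in binary floating
# point, so the comparisons below can use exact equality.

def galilean(u):
    """Matrix for x' = x - u*t, t' = t."""
    return ((1.0, -u), (0.0, 1.0))

def matmul(P, Q):
    """Product P*Q, i.e. apply Q first and then P."""
    return tuple(
        tuple(sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

identity = galilean(0.0)
a, b, c3 = galilean(1.5), galilean(-0.25), galilean(3.0)

has_identity = matmul(identity, a) == a                      # property 1
has_inverse = matmul(galilean(-1.5), a) == identity          # property 2
is_closed = matmul(b, a) == galilean(1.5 - 0.25)             # property 3
is_associative = (
    matmul(matmul(c3, b), a) == matmul(c3, matmul(b, a))     # property 4
)
```

Associativity holds automatically for any transformations that can be written as matrices; the other three properties are the ones that constrain the coefficients in the derivation that follows.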
[This is a more modern approach to the subject than was
common in Einstein's day; Einstein was instrumental in pointing out how
important symmetries are in physics, which leads naturally to group theory.]
Those are the Postulates; make sure you understand
and believe in them now, because they are sufficient to derive the general form
of the transformation equations.
[It may be a surprise to some readers that no
postulate includes a statement that light has the same speed in all frames;
such a statement is not required. This is just one example of the power of
group theory.]
Now for the math...
This derivation will be done in "1+1" dimensions,
that is, for one space coordinate and one time coordinate. The derivation would
follow similar lines in "3+1" dimensions, but the extra complexity
would exhibit no additional features.
There are three frames of reference (or coordinate
systems) of interest; they will be called S, S', and S"; their coordinates
will be called x and t, x' and t', and x" and t", respectively. They
are constructed so that their x, x', and x" axes are all collinear, with
the origins of coordinates coincident (i.e. in the exact same place in the real
(i.e. physical) space-time); that is, the coordinates x=0,t=0 and x'=0,t'=0 and
x"=0,t"=0 all refer to the same point (event) in the real space-time. The Homogeneity Postulate guarantees that no
special significance arises from their coincident origins of coordinates.
All three frames will use the same scales for length
and time (these simplifications are not necessary, but relaxing them would add unenlightening
complexity).
The difference between the three frames is their
relative velocities. We will call the velocity of S' as measured in S, u;
S" as measured in S' is v; S" as measured in S is w. The physical
situation ensures that these assignments can all be made. Implicit is the
assumption that the relative velocities are constant (but arbitrary).
The Mapping Postulate and the Homogeneity Postulate
imply that the transformation equations are linear, with coefficients
independent of position. That is:
x' = A(u)*x + D(u)*t + E(u)    (1)
t' = B(u)*x + C(u)*t + F(u)    (2)
The coefficients (A,B,C,D,E,F) can depend upon the
relative velocity between S' and S (i.e. upon u), but there is no other
quantity that can have physical relevance. Thus, eqn 1 & 2 are the most
general possible transformation equations satisfying the postulates. This is
important; if there were other powers of x or t on the right-hand side, the
transformation would not be one-to-one everywhere. If the coefficients depended
upon x or t (as they do in General Relativity - see below), then space-time
would not be homogeneous and isotropic.
[Some other derivations use a postulate that
straight lines are transformed into straight lines to deduce the linearity of
the transformation equations.]
The translation terms (E(u) and F(u)) can easily be
calculated, based upon the construction of the systems S and S'; they are both
0.
They are not functions of u, because we arranged for
coincident coordinate origins independently of u (i.e. for each value of u, the
origins were individually arranged to be coincident). This is true also for the other
transformations (S' to S", and S to S"). The Homogeneity Postulate guarantees
that this choice has no physical significance. Since S' is moving with velocity
u relative to S, the point x'=0 is moving with velocity u (with respect to S);
this allows us to solve for D(u), with no loss in generality:
x' = A(u) * (x - u*t)    (3)
t' = B(u)*x + C(u)*t    (4)
Note: u=0 is certainly possible, in which case
S'==S, so A(0)=1, B(0)=0, C(0)=1 (i.e. x'=x and t'=t). In the following, u and
v will be assumed to be non-zero, but w will have no such restriction.
The transformations S' to S", and S to S"
follow similarly:
x" = A(v) * (x' - v*t')    (5)
t" = B(v)*x' + C(v)*t'    (6)
x" = A(w) * (x - w*t)    (7)
t" = B(w)*x + C(w)*t    (8)
We will now use the Group Postulate to compose Eqns
3 and 4 with Eqns 5 and 6, to get 7 and 8 (i.e. u and v are arbitrarily fixed, and
w will be determined from them).
Substituting 3 and 4 into 5 and 6:
x" = [A(v)*A(u) - A(v)*v*B(u)]*x - [A(v)*v*C(u) + A(v)*A(u)*u]*t    (9)
t" = [B(v)*A(u) + C(v)*B(u)]*x + [(-u)*B(v)*A(u) + C(v)*C(u)]*t    (10)
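This substitution can be spot-checked numerically. The sketch below assumes arbitrary, made-up values for the coefficients (they are not physical), applies the form of Eqns 3-4 followed by the form of Eqns 5-6, and compares the result with the right-hand sides of Eqns 9 and 10. The function name boost is hypothetical.

```python
# Sketch only: the coefficient values are arbitrary stand-ins, not physics.

def boost(A, B, C, vel, x, t):
    """Apply x' = A*(x - vel*t), t' = B*x + C*t (the form of Eqns 3-4)."""
    return A * (x - vel * t), B * x + C * t

# Arbitrary stand-ins for A(u), B(u), C(u) and A(v), B(v), C(v).
u, Au, Bu, Cu = 0.3, 1.7, -0.4, 0.9
v, Av, Bv, Cv = -0.6, 0.8, 0.25, 1.3

x, t = 2.0, 5.0
x1, t1 = boost(Au, Bu, Cu, u, x, t)      # S  -> S'
x2, t2 = boost(Av, Bv, Cv, v, x1, t1)    # S' -> S"

# Right-hand sides of Eqns 9 and 10, evaluated directly in S coordinates.
x2_eqn9 = (Av*Au - Av*v*Bu)*x - (Av*v*Cu + Av*Au*u)*t
t2_eqn10 = (Bv*Au + Cv*Bu)*x + (-u*Bv*Au + Cv*Cu)*t

print(abs(x2 - x2_eqn9), abs(t2 - t2_eqn10))  # both differences are rounding-level
```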
Comparing 9 and 10 with 7 and 8, and equating
coefficients of
x and t (Eqns 7-10 are each valid for ALL x and ALL
t), we conclude:
A(w) = A(v)*A(u) - A(v)*v*B(u)    (11)
w*A(w) = A(v)*v*C(u) + A(v)*A(u)*u    (12)
B(w) = B(v)*A(u) + C(v)*B(u)    (13)
C(w) = C(u)*C(v) - u*B(v)*A(u)    (14)
Now, let's consider the special case v=-u. Then
5&6 will be the inverse of 3&4 (Relativity Postulate), so w=0. 11-14
become:
1 = A(-u)*A(u) + u*A(-u)*B(u)    (15)
0 = A(-u)*(-u)*C(u) + A(-u)*A(u)*u    (16)
0 = B(-u)*A(u) + C(-u)*B(u)    (17)
1 = C(u)*C(-u) - u*B(-u)*A(u)    (18)
Assuming u*A(-u) is non-zero (see below), eqn 16
says:
C(u) = A(u)    (19)
[Note this is true in general (not just for v=-u);
it is a mathematical statement about the two functions, valid for all u.]
The Isotropy postulate requires that C(-u) = C(u)
[if I boost S' in a different direction (i.e. backwards), the clocks of S' must
be affected exactly the same as before]. This plus eqn 17 gives:
B(-u) = -B(u)    (20)
[Note that the stated symmetries of A(u), B(u), and
C(u) are all consistent with their values at u=0 given above.]
Eqns 15-18 reduce to:
1 = A(u)**2 + u*A(u)*B(u)    (21)
Returning to the general case (arbitrary v), Eqns
11-14 become:
A(w) = A(v)*A(u) - v*A(v)*B(u)    (22)
w*A(w) = v*A(u)*A(v) + u*A(u)*A(v)    (23)
B(w) = B(v)*A(u) + A(v)*B(u)    (24)
A(w) = A(u)*A(v) - u*B(v)*A(u)    (25)
[Note the symmetry of Eqns 22-25 under interchange
of u <-> v (interchange 22 and 25); this is expected, as adding collinear
velocities should not depend upon their order.]
Eqns 22 and 25 yield:
v*A(v)*B(u) = u*B(v)*A(u)    (26)
or (assuming u*A(u) and v*A(v) are both non-zero):
B(u)/(u*A(u)) = B(v)/(v*A(v))    (27)
Since Eqn 27 must hold for all u and for all v, and its left side depends only
on u while its right side depends only on v, each side must equal a universal
constant; call it q:
q == B(u)/(u*A(u)) = B(v)/(v*A(v))    (28)
or
B(u) = q*u*A(u)    (29)
Substituting 29 into Eqn 21 gives:
1 = A(u)**2 + q*u**2*A(u)**2    (30)
Solving for A(u) gives:
A(u) = 1/sqrt(1+q*u**2)    (31)
Combining Eqns 3, 4, 19, 29, and 31, we have the
general form of the transformation equations:
A(u) = 1/sqrt(1+q*u**2)    (31)
x' = A(u) * (x - u*t)    (32)
t' = q*u*A(u)*x + A(u)*t    (33)
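As a rough check of the general form, the following Python sketch implements Eqns 31-33 and verifies two facts used in the derivation: a boost by u followed by a boost by -u returns the original coordinates, and Eqn 21 holds with B(u) = q*u*A(u). The function names and the sample values of q, u, x and t are illustrative assumptions, not part of the original post.

```python
import math

def A(u, q):
    """Eqn 31: A(u) = 1/sqrt(1 + q*u**2)."""
    return 1.0 / math.sqrt(1.0 + q * u * u)

def transform(u, q, x, t):
    """Eqns 32-33: map (x, t) in S to (x', t') in S'."""
    a = A(u, q)
    return a * (x - u * t), q * u * a * x + a * t

u, x, t = 0.5, 2.0, 3.0          # arbitrary sample values
round_trips = {}
eqn21_values = {}
for q in (-0.25, 0.0, 0.25):     # one sample from each sign of q
    x1, t1 = transform(u, q, x, t)
    # Boosting back by -u should undo the boost by u.
    round_trips[q] = transform(-u, q, x1, t1)
    a = A(u, q)
    b = q * u * a                # Eqn 29
    eqn21_values[q] = a**2 + u * a * b   # right-hand side of Eqn 21
```

For each sample q, round_trips[q] should agree with (x, t) and eqn21_values[q] with 1, up to floating-point rounding.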
By dividing Eqn 23 by Eqn 22 (and using Eqn 29 for B(u)), we get the rule for
composition of velocities:
w = (u + v) / (1 - q*u*v)    (34)
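The composition rule can be checked against the transformation itself. The sketch below, under the assumption of arbitrary sample velocities and one sample q of each sign, boosts by u and then by v and compares the result with a single boost by w from Eqn 34. The names transform and compose_velocity are hypothetical.

```python
import math

def transform(u, q, x, t):
    """Eqns 31-33 combined: boost (x, t) by velocity u."""
    a = 1.0 / math.sqrt(1.0 + q * u * u)
    return a * (x - u * t), q * u * a * x + a * t

def compose_velocity(u, v, q):
    """Eqn 34: w = (u + v) / (1 - q*u*v)."""
    return (u + v) / (1.0 - q * u * v)

# Sample values chosen so that 1 - q*u*v > 0 for every q used below.
u, v = 0.3, 0.5
x, t = 2.0, 3.0
results = []
for q in (-1.0, 0.0, 1.0):
    two_step = transform(v, q, *transform(u, q, x, t))         # S -> S' -> S"
    one_step = transform(compose_velocity(u, v, q), q, x, t)   # S -> S" directly
    results.append((q, two_step, one_step))
```

In each entry of results, two_step and one_step should agree to rounding error, which is the Group Postulate's closure property in numerical form.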
The choice of q is arbitrary. There are three basic
choices that have significantly different behavior: zero, negative, and
positive.
This is the topological choice mentioned above.
Choosing q=0 yields the Galilean transformation:
x' = x - u*t    (35)
t' = t    (36)
w = u + v    (37)
Note the universal time; velocities simply add.
These are the "familiar" transformation
equations that are approximately true (to very high accuracy) in our ordinary lives
where velocities are small.
Choosing q<0 yields the Lorentz transformation.
By convention, define a constant, c, by q == -1/c**2 [manifestly negative],
and let G(u/c) == A(u); we have:
x' = G(u/c) * (x - (u/c)*ct)    (38)
ct' = -(u/c)*G(u/c)*x + G(u/c)*ct    (39)
G(u/c) = 1/sqrt(1-(u/c)**2)    (40)
w/c = (u/c + v/c) / (1 + (u/c)*(v/c))    (41)
Here, ct and ct' are the time coordinates multiplied
by c (which gives them the same units as x and x': length).
Normally, G(u/c) is called gamma, and u/c is called
beta.
In the limit u/c -> 0, 38-41 reduce to 35-37, the
Galilean transformation. Here, velocities do not simply add, but have a more
complicated composition rule; an object moving with velocity c in one frame
moves with velocity c in all frames.
Note, however, that the transformation equations are
not well-behaved when transforming to a frame moving with velocity c; the
velocity c serves as a limiting velocity, because G(u/c) goes to infinity as
u/c goes to 1. Eqn 41 guarantees that the composition of two velocities will be less
than c, as long as the individual velocities are each less than c. If u/c > 1, imaginary numbers appear,
leading most physicists to be sceptical of the physical applicability of such
velocities.
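These properties of the q<0 case can be seen numerically. The sketch below works in units of c (velocities are given as beta = u/c) and uses sample values chosen only for illustration; the function names gamma and compose are hypothetical.

```python
import cmath
import math

def gamma(beta):
    """Eqn 40: G(u/c) = 1/sqrt(1 - (u/c)**2), valid for |beta| < 1."""
    return 1.0 / math.sqrt(1.0 - beta**2)

def compose(beta_u, beta_v):
    """Eqn 41 in units of c."""
    return (beta_u + beta_v) / (1.0 + beta_u * beta_v)

# A velocity of c composed with any other velocity is still c.
light_forward = compose(1.0, 0.5)
light_backward = compose(1.0, -0.5)

# Two sub-light velocities compose to a sub-light velocity.
near_limit = compose(0.9, 0.9)

# G grows without bound as beta approaches 1.
gammas = [gamma(b) for b in (0.0, 0.9, 0.99, 0.999)]

# For beta > 1 the square root is imaginary; cmath is used to show this.
gamma_superluminal = 1.0 / cmath.sqrt(1.0 - 1.5**2)
```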
Choosing q>0 yields a third transformation;
here q == +1/c**2 [manifestly positive] and H(u/c) == A(u):
x' = H(u/c) * (x - (u/c)*ct)    (42)
ct' = (u/c)*H(u/c)*x + H(u/c)*ct    (43)
H(u/c) = 1/sqrt(1+(u/c)**2)    (44)
w/c = (u/c + v/c) / (1 - (u/c)*(v/c))    (45)
This transformation can be cast into more familiar
form by substituting:
u = c * tan(k)    (where k, l, m are in the range -PI/2 < k,l,m < +PI/2)
v = c * tan(l)
w = c * tan(m)
Then the transformation becomes (with some analytic
continuation and trigonometric identities):
x' = x * cos(k) - ct * sin(k)    (46)
ct' = x * sin(k) + ct * cos(k)    (47)
H(u/c) = 1/sqrt(1+tan(k)**2) = cos(k)    (48)
m = k + l    (49)
This transformation is clearly a simple Euclidean
transformation, in which the time coordinate behaves just like the spatial coordinate,
and boosts are simple rotations. The
limit (u/c) -> 0 still yields the Galilean transformation. The velocity c
serves as a velocity "scale", but nothing dramatic happens to the
transformation when (u/c) = 1 (i.e. k = PI/4), or when (u/c) > 1.
The singularity of Eq. 45 disappears when "velocities"
are viewed as "angles" in Eq. 49. Note that two positive velocities
greater than c are composed into a NEGATIVE velocity (Eq. 45), which is
explained by Eq. 46-49 as simply going more than halfway around a circle.
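The correspondence between "velocities" and "angles" can be illustrated with a short Python sketch. It works in units of c, uses the sample value u = v = 2c, and checks both that Eqns 42-43 coincide with the rotation of Eqns 46-47 and that the composed velocity from Eqn 45 equals the tangent of the summed angles. The function names are hypothetical.

```python
import math

def compose(beta_u, beta_v):
    """Eqn 45 in units of c."""
    return (beta_u + beta_v) / (1.0 - beta_u * beta_v)

def boost_42_43(beta, x, ct):
    """Eqns 42-44: the q>0 transformation written with H(u/c)."""
    h = 1.0 / math.sqrt(1.0 + beta**2)
    return h * (x - beta * ct), beta * h * x + h * ct

def rotation_46_47(k, x, ct):
    """Eqns 46-47: the same transformation written as a rotation by k."""
    return (x * math.cos(k) - ct * math.sin(k),
            x * math.sin(k) + ct * math.cos(k))

beta = 2.0                      # sample "velocity" greater than c
k = math.atan(beta)             # corresponding angle, from u = c*tan(k)

as_boost = boost_42_43(beta, 1.0, 3.0)
as_rotation = rotation_46_47(k, 1.0, 3.0)

w = compose(beta, beta)         # two positive velocities, each greater than c
angle_sum = k + k               # Eqn 49; exceeds PI/2 here
```

Here w comes out negative even though both inputs are positive, and math.tan(angle_sum) reproduces it, since the summed angle has passed PI/2.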
Note that there is no velocity that is the same in
all frames, and that causality is not necessarily preserved by a coordinate transformation
(ct' can run BACKWARDS with respect to ct). It seems very difficult to build a
world view based upon Eqs 42-45 (or 46-49).
Before discussing the implications of these
transformation equations, let me suggest the following exercises:
Exercise for the reader: At several places in the
derivation, the velocities u and v were assumed to be non-zero, and some
other functions of u or v were also assumed to be non-zero. Verify that all such
assumptions are valid.
Exercise for the reader: Re-do the derivation while
retaining the translation terms of Eqns 1 and 2; show that their presence does
not change the conclusions.
Exercise for experts: As you know, the full Poincare
group includes not only the boosts derived here, but also spatial rotations and
two point transformations: parity inversion (x' = -x) and time reversal (t' =
-t). Don't bother deriving the equations for the full Poincare group (adding
rotations is trivial, but tedious). Instead, note that Eqns 31-33 were derived
from very general considerations, BUT DO NOT INCLUDE THE POINT TRANSFORMATIONS.
Point out exactly where they were left out, and modify the derivation to retain
them.
Identifying the actual topology of space-time can
only be done by resorting to physical observations of phenomena in the real
world (i.e. by doing an experiment). There is a tremendous body of experimental
evidence that shows that the speed of light is independent of the velocities of
either the source or observer (there are also many other, equivalent
observations).
This compels us to choose the Lorentz Transformation
(Eqns 38-41), and to identify the
arbitrary constant "c" with the speed of light. No other choice is
possible, while satisfying the four Postulates and the experimental evidence.
This is why most (if not all) physicists today
believe in Special Relativity - it is IMPOSSIBLE to construct an alternative
description without violating one of the postulates or disregarding a very
large body of experimental evidence. If you truly believe that Special Relativity
simply must be false (for whatever reason), go back and review the four
Postulates, and find a hole in them.
Einstein DID find a hole in the four Postulates, and
brought us General Relativity. He was, in a very real sense, the first fish to see
the water, and to describe it.
Einstein's departure was in the Isotropy/Homogeneity
Postulate - he proposed that space-time is isotropic and homogeneous only
within an infinitesimal region of any given point in space-time; that is, in
the presence of matter, space-time itself is NOT homogeneous, but its geometry
is affected by the presence of matter.
[Before General Relativity, Cartesian coordinates
were used as a matter of course, and their applicability to the real world was
never challenged (Lagrangian mechanics is very different from this). Physical
theory had two basic parts, in which the Laws of Physics possess many symmetries
(such as isotropy and homogeneity of space-time), while the initial conditions
rarely possess the same symmetries (this remains true today, but the lesson has
been learned to be careful). Implicitly, these symmetries were applied GLOBALLY,
to the entire space-time (e.g. as in the statements of the Postulates above).
After General Relativity, Cartesian coordinates have been replaced by general
curvilinear coordinates, and the symmetries are LOCAL in nature (i.e. apply
only within each infinitesimal region of space-time). Unfortunately, this
generality causes enormous complexity in the mathematics; curvilinear
coordinates and general coordinate transformations have not yet been
successfully applied to the other great advancement in physics of the Twentieth
Century - Quantum Mechanics.]
Tom Roberts
Lucent Technologies / Bell Laboratories
tjroberts@lucent.com