We've been working on an essay on the foundations of special relativity (SR). Why do we invest so much time and effort into SR foundations? Honestly, because I think the foundations are never taught, so there's a lot to say. Moreover, the SR converts tend to ignore foundations, evading them and jumping ahead to their conclusions. So the task of foundations is typically neglected by adherents, and it's left to the skeptics to develop the foundational issues. The student of the history of SR will know that there has always been a strong skeptic school in SR (and GR), and that this school frequently included Einstein himself at various times in his life, and for good reason. There is much to critically examine in the SR theory, and this is the purpose of our essay.
Einstein SR and Maxwell
Einstein's theory of special relativity (SR) arose from his study of Maxwell's equations (ME), which date from circa 1870 AD. But what is the logical and mathematical relation between SR and Maxwell? This is an interesting question because they are essentially antagonistic. Although Einstein was much influenced by the Maxwellian field theory viewpoint, his own early work (1905) was based on the antithetical photon model of light.
Briefly, by Maxwell's equations we understand that in a given reference frame $K$ we have the existence of electric and magnetic fields $E,B$ satisfying the four equations on $div(E), div(B)$ and $curl(E)$ and $curl(B)$ relative to a charge volume density $\rho$ and electric current density $J$. In vacuum, where $\rho=0$ and $J=0$, composing the first-order Maxwell equations together yields the second-order fact that the coordinate components of $E,B$ satisfy wave equations with speed of propagation $c=1/\sqrt{\epsilon_0 \mu_0}$. This usually leads to the idea that electromagnetic field disturbances travel at speed $c$ in the aether. And indeed Maxwell's own presentation of the equations expressly assumes an aether as the medium by which the electromagnetic radiation travels.
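As a quick sanity check on the propagation speed, here is a minimal Python sketch that evaluates the standard relation $c=1/\sqrt{\epsilon_0\mu_0}$. The numerical values of $\epsilon_0$ and $\mu_0$ are the CODATA 2018 SI values, which are our assumption and are not quoted anywhere in this essay; the function name `wave_speed` is likewise just illustrative.

```python
import math

# CODATA 2018 values in SI units (an assumption; not quoted in the essay).
EPSILON_0 = 8.8541878128e-12   # vacuum permittivity, F/m
MU_0 = 1.25663706212e-6        # vacuum permeability, N/A^2
C_DEFINED = 299_792_458.0      # speed of light, m/s (fixed by the SI definition)


def wave_speed(epsilon_0: float, mu_0: float) -> float:
    """Propagation speed of the vacuum wave equation, c = 1/sqrt(eps0 * mu0)."""
    return 1.0 / math.sqrt(epsilon_0 * mu_0)


c_from_constants = wave_speed(EPSILON_0, MU_0)
relative_error = abs(c_from_constants - C_DEFINED) / C_DEFINED
print(c_from_constants, relative_error)
```

The two numbers should agree to within the quoted precision of the constants, which is all this check is meant to show.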
Now this author does not really accept Maxwell's field equations as being satisfactory. For example, the magnetic field $B$ is not a rectifiable or reifiable field, meaning it has only potential and not any material substance. The same could be said for Maxwell's electric field, which again is a potential field describing the force experienced by a charged test particle. This is the so-called continental field-theoretic viewpoint after Maxwell, etc.
However, Maxwell's equations were not satisfactory in their predictions on the photoelectric effect. For example, is light a disturbance in the electric or the magnetic field? If light is such a disturbance, then Maxwell's equations predict the interaction of the $E$-wave (or is it the $B$-wave?) with charged test particles.
Here it's interesting to compare Einstein's 1905 explanation of the photoelectric effect using the photon particle theory of light. Thus we tend to interpret Einstein's developments of SR from a photon or corpuscular point of view.
Problem: The classical homogeneous wave equation has the property that the velocity is dependent on the receiver velocity relative to the medium. But SR argues that the assumption on the "rectilinear uniform propagation of light" somehow yields a wave equation where the velocity is receiver independent. But how? [We do not address this important issue here].
SR and Lorentz Groups
The null result of the Michelson-Morley experiments led to Einstein's postulating the Lorentz transformations relating space and time variables $x,t$. Undoubtedly the theory of SR is summarized in the representations of the Lorentz group of linear transformations, namely the isometry group designated $O(ds^2)=O(3,1)$ and its standard linear action on ${\bf{R}}^{3,1}$.
For the mathematician, once a single linear representation is given, there are many algebraic constructions possible to obtain further representations, for example the symmetric and alternating representations. We develop this idea further to try and bridge the assumptions of SR to Maxwell's equations, and especially the wave equation.
Now we discuss several group representations (i.e. linear group actions).
First we begin with the standard linear representation $$\rho_0:{\bf{R}}^{3,1} \times L \to {\bf{R}}^{3,1},$$ where $L:=O(3,1)$ denotes the Lorentz group, which is given by left matrix multiplication $(v, \lambda) \mapsto \lambda.v$.
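To make the standard representation concrete, here is a minimal Python sketch that checks the isometry condition $\lambda^T [h] \lambda = [h]$ for a boost along the $x$-axis, in coordinates $(t,x,y,z)$ with $[h]=\mathrm{diag}(-c^2,1,1,1)$. The helper names (`boost_x`, `preserves_metric`) and the sample values of $u$ and $c$ are our own illustrative choices.

```python
import math


def metric(c):
    """Matrix of the Lorentz metric h in coordinates (t, x, y, z)."""
    return [[-c * c, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def boost_x(u, c):
    """Lorentz boost along x with velocity u, acting on columns (t, x, y, z)."""
    gamma = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    return [[gamma, -gamma * u / c**2, 0.0, 0.0],
            [-gamma * u, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(a):
    return [[a[j][i] for j in range(4)] for i in range(4)]


def preserves_metric(lam, c, tol=1e-9):
    """True when lam^T h lam equals h entrywise, up to tol."""
    h = metric(c)
    pulled_back = matmul(transpose(lam), matmul(h, lam))
    return all(abs(pulled_back[i][j] - h[i][j]) <= tol
               for i in range(4) for j in range(4))


print(preserves_metric(boost_x(1.2, 2.0), 2.0))
```

A non-Lorentz linear map, for instance a pure rescaling of $t$, should fail the same test.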
Next we dualize.
Let $C({\bf{R}}^{3,1})$ be the space of polynomial functions on the space. Abbreviate $C:=C({\bf{R}}^{3,1})$. Naturally we assemble $C$ from the dual functionals $\lambda\in {({\bf{R}}^{3,1})}^*$. Taking products and polynomials in the dual functionals $\lambda$ we obtain the contragredient representation $$\rho_0^*:C( {\bf{R}}^{3,1}) \times L \to C({\bf{R}}^{3,1}). $$
The idea is that the vector spaces $V$ and $V^*$ are isomorphic (non-canonically) in finite dimensions. Moreover the algebra generated by $V^*$ yields an (infinite-dimensional) space of polynomial functions on $V$.
Now what are vector fields?
In differential topology, the vector fields $\frac{\partial }{\partial x}$ act on functions as derivations, i.e. as linear maps $$\frac{\partial }{\partial x}: C \to C $$ satisfying the Leibniz product rule. Iterating these linear maps generates an algebra of operators on $C$, namely the operators polynomial in $\partial / \partial x$. On the other hand, the differential $dx$ itself, as contained in the cotangent space, does not generate an algebra in this way.
Iterating the derivations $\frac{\partial }{\partial x}\circ \frac{\partial }{\partial y}=\frac{\partial^2 }{\partial x \partial y}$ leads to the usual linear differential operators on $C$. We are specially interested in d'Alembert's operator $$\square:= \frac{-1}{c^2} \frac{\partial^2}{\partial t^2} +\frac{\partial^2}{\partial x^2}+\frac{\partial^2}{\partial y^2}+\frac{\partial^2}{\partial z^2} .$$
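To see d'Alembert's operator in action, here is a minimal Python sketch that approximates $\square\phi$ by central second differences and applies it to a travelling wave $\phi=\sin(x-ct)$ and to a static profile $\phi=\sin(x)$. The step size, the sample point, the value $c=2$ and the function names are all illustrative assumptions.

```python
import math


def dalembertian(phi, point, c, step=1e-3):
    """Central-difference estimate of box(phi) at point = (t, x, y, z)."""
    weights = (-1.0 / c**2, 1.0, 1.0, 1.0)   # coefficients of the four second derivatives
    centre = phi(*point)
    total = 0.0
    for axis, weight in enumerate(weights):
        ahead = list(point)
        behind = list(point)
        ahead[axis] += step
        behind[axis] -= step
        second = (phi(*ahead) - 2.0 * centre + phi(*behind)) / step**2
        total += weight * second
    return total


C = 2.0                                  # illustrative propagation speed
POINT = (0.3, 0.7, -0.2, 1.1)            # arbitrary sample point (t, x, y, z)


def travelling_wave(t, x, y, z):
    return math.sin(x - C * t)           # a solution of box(phi) = 0


def static_profile(t, x, y, z):
    return math.sin(x)                   # not a solution: box(phi) = -sin(x)


print(dalembertian(travelling_wave, POINT, C))
print(dalembertian(static_profile, POINT, C))
```

The first value should vanish up to discretization error, while the second should be close to $-\sin(0.7)$.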
Our main proposal, and this is not yet altogether rigorous, is to identify the Minkowski squared line element $ds^2$ as dual in a certain yet-to-be-defined algebraic sense to the d'Alembert operator $\square$. The difficulty is that the symmetric product of the differential operators $\partial/ \partial x$ and $\partial / \partial y$ is distinct from the composition of the differential operators $\partial^2 / \partial x \partial y$.
The term $dx^2$ in Minkowski's line element is formally a section of the $(T^*)^{\otimes 2}$ bundle over the manifold space, here ${\bf{R}}^{4}$. So here is the informal computation. Let us formally relabel the variables $$x_0, x_1, x_2, x_3 = t,x,y,z, $$ respectively. Now the choice of Lorentz metric $h$ can be represented as a square symmetric matrix $$[h]= \begin{pmatrix} -c^2 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$
The choice of $h$ allows us to define an isomorphism between the differential forms and the vector fields, and hence between their symmetric squares, where the symmetric squares of vector fields correspond formally to second-order linear operators. Specifically, $h$ allows us to define an explicit isometry between symmetric $(2,0)$ tensors and symmetric $(0,2)$ tensors.
Lemma: The metric $h$ identifies the dual of $ds^2$ with d'Alembert's wave operator $\square$. That is $(ds^2)^* = \square.$
Proof. The proof is basic linear algebra. First one needs to prove that the $h$-dual of $dx_0$ is $dx_0^* =-c^{-2} \frac{\partial}{\partial x_0}$. Likewise we find $dx_i^*=\frac{\partial}{\partial x_i}$ for $i=1,2,3$. This is all in the tangent space, i.e. between $(1,0)$ and $(0,1)$ tensors. Now we consider the squares, i.e. the symmetric $(2,0)$ tensors. We find that $$(ds^2)^*=-c^2 (dx_0^*)^2 + (dx_1^*)^2 + (dx_2^*)^2 + (dx_3^*)^2, $$ which is equal to $$-c^{-2} (\frac{\partial}{\partial x_0})^2 + (\frac{\partial}{\partial x_1})^2 +(\frac{\partial}{\partial x_2})^2+(\frac{\partial}{\partial x_3})^2, $$ which is equal to d'Alembert's square operator $\square$ as desired. [See our remarks above on the nonrigorous nature of this argument, which is the subject of ongoing investigation.]
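The coefficient bookkeeping in this proof can be checked mechanically. Here is a small Python sketch, restricted to the diagonal metric $[h]=\mathrm{diag}(-c^2,1,1,1)$, that forms the duals $dx_i^* = h_{ii}^{-1}\,\partial/\partial x_i$, substitutes them into $ds^2=\sum_i h_{ii}\,dx_i^2$, and compares the result with the coefficients of $\square$. The function names are our own, and the sketch only verifies the arithmetic, not the (still informal) identification itself.

```python
def metric_diagonal(c):
    """Diagonal entries h_ii of the Lorentz metric in coordinates (t, x, y, z)."""
    return [-c**2, 1.0, 1.0, 1.0]


def dual_of_differentials(c):
    """Coefficients a_i with dx_i^* = a_i * d/dx_i, obtained from the inverse metric."""
    return [1.0 / h_ii for h_ii in metric_diagonal(c)]


def dual_of_line_element(c):
    """Coefficients of (d/dx_i)^2 in (ds^2)^* = sum_i h_ii * (dx_i^*)^2."""
    return [h_ii * a_i**2
            for h_ii, a_i in zip(metric_diagonal(c), dual_of_differentials(c))]


def dalembert_coefficients(c):
    """Coefficients of (d/dx_i)^2 in d'Alembert's operator."""
    return [-1.0 / c**2, 1.0, 1.0, 1.0]


print(dual_of_line_element(3.0))
print(dalembert_coefficients(3.0))
```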
The Lorentz invariance of $\square$ shows that the solutions to the homogeneous wave equation (HWE) are Lorentz covariant: $\square \phi =0$ if and only if $\square (\lambda \cdot \phi) =0$ for every Lorentz transformation $\lambda \in L$. This is the wave equation version of the fact that the null cone $ds^2=0$ is Lorentz covariant.
Now Einstein's (A12) postulates the uniform rectilinear propagation of light in vacuum. This would suggest a corpuscular model of light, being represented as affine parameterized lines $$s\mapsto (s, x(s), y(s), z(s))=(s, \gamma(s)) $$ in ${\bf{R}}^{3,1}$ satisfying $D^2_{ss} \gamma =0$.
Is the equation $D^2_{ss} \gamma=0$ Lorentz covariant? (Yes?)
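One way to probe this question numerically: a Lorentz transformation is linear, so it should send an affine line to an affine line, and the second derivative of the image curve should vanish. Here is a minimal Python sketch under that reading, where $D^2_{ss}$ is taken to be the ordinary second derivative in flat coordinates (our assumption), and where the boost velocity, the line and the helper names are illustrative.

```python
import math


def boost_x(u, c):
    """Lorentz boost along x with velocity u, acting on columns (t, x, y, z)."""
    gamma = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    return [[gamma, -gamma * u / c**2, 0.0, 0.0],
            [-gamma * u, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def apply(matrix, vector):
    return [sum(row[j] * vector[j] for j in range(4)) for row in matrix]


def boosted_line(s, origin, direction, u, c):
    """Image under the boost of the point origin + s * direction."""
    point = [o + s * d for o, d in zip(origin, direction)]
    return apply(boost_x(u, c), point)


def second_difference(curve, s, step=0.5):
    """Componentwise central second difference of a curve s -> R^4."""
    ahead, here, behind = curve(s + step), curve(s), curve(s - step)
    return [(a - 2.0 * m + b) / step**2 for a, m, b in zip(ahead, here, behind)]


C = 2.0


def ray(s):
    # A light ray along x: s -> (s, 1 + C*s, 2, 3), seen after a boost with u = 1.2.
    return boosted_line(s, [0.0, 1.0, 2.0, 3.0], [1.0, C, 0.0, 0.0], 1.2, C)


acceleration = second_difference(ray, 0.7)
print(acceleration)
```

Note that after the boost the time coordinate is no longer equal to the parameter $s$, but it remains affine in $s$, so the curve is still a uniformly parameterized straight line.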
But what are the corresponding "uniform rectilinear" solutions $\phi$ for the dual HWE: $~~\square \phi=0$ ? Compare this.
An idea: there has always been a correspondence between lines in $V$ (one-dimensional linear subspaces) and quadratic functionals via the Veronese embedding, or $\lambda \mapsto \lambda^2$ where $\lambda\in V^*$ is a linear functional.
The following questions will be answered below:
Are the quadratic functions $q(x)=h(v,x)^2/2$ solutions to $\square q=0$ for null vectors $v\in N$? (Yes, we prove this below).
Can we find quadratic functions $q$ whose level sets are everywhere orthogonal to the null cone $N$ ?
The idea would be to derive some canonical solutions $\square q=0$ from quadratics arising from vectors on the null cone.
If $v$ belongs to the null cone, then $q(x):=h(v,x)^2/2$ for $x\in V$ defines a quadratic function on $V$ with $q(v)=h(v,v)^2/2=0$.
It's clear that $q\geq 0$ is minimized along $v^\perp$, i.e. $q(x)=0$ for all $x\in v^\perp$, and $v\in v^\perp$ since $v$ is null. Here $v^\perp$ consists of all $u$ such that $h(u,v)=0$.
Lemma. For every vector $v\in {\bf{R}}^{3,1}$, let $q(x):=h(v,x)^2/2$ be the quadratic form defined by $v$. Then $\square q=0$ if and only if $v \in N$, i.e. $h(v,v)=0$.
Proof. We claim that $\square q=h(v,v)$ when $q(x)=h(v,x)^2/2$. If the vector $v$ has coordinates $v=\langle v_t, v_x, v_y, v_z \rangle$, then $q(x)=h(v,x)^2/2$ is equal to $$(-c^2 v_t t + v_xx+ v_yy+ v_zz)^2/2,$$ which is equal to $$\frac{1}{2}\left(c^4 v_t^2 t^2 +v_x^2 x^2 + v_y^2 y^2 + v_z^2 z^2\right) + (mixed~ terms).$$ Applying d'Alembert's operator we find $$\square q = -c^2 v_t^2+v_x^2 + v_y^2 + v_z^2= h(v,v),$$ since each mixed term is a product of two distinct coordinates, so that $\square(mixed~~terms)=0$, and the claim follows.
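The claim $\square q = h(v,v)$ is easy to test numerically. Here is a minimal Python sketch that applies a central-difference d'Alembertian to $q(x)=h(v,x)^2/2$ for one null and one non-null vector; for a quadratic function the second differences are exact up to rounding. The value $c=2$, the sample vectors, the sample point and the helper names are illustrative assumptions.

```python
def minkowski(u, v, c):
    """The bilinear form h(u, v) with matrix diag(-c^2, 1, 1, 1)."""
    return -c**2 * u[0] * v[0] + u[1] * v[1] + u[2] * v[2] + u[3] * v[3]


def quadratic(v, c):
    """The function q(x) = h(v, x)^2 / 2 attached to the vector v."""
    return lambda x: minkowski(v, x, c) ** 2 / 2.0


def dalembertian(phi, point, c, step=1e-2):
    """Central-difference estimate of box(phi) at point = (t, x, y, z)."""
    weights = (-1.0 / c**2, 1.0, 1.0, 1.0)
    centre = phi(point)
    total = 0.0
    for axis, weight in enumerate(weights):
        ahead = list(point)
        behind = list(point)
        ahead[axis] += step
        behind[axis] -= step
        total += weight * (phi(ahead) - 2.0 * centre + phi(behind)) / step**2
    return total


C = 2.0
POINT = [0.3, 0.7, -0.2, 1.1]
NULL_V = (1.0, 2.0, 0.0, 0.0)        # h(v, v) = -4 + 4 = 0
TIMELIKE_V = (1.0, 1.0, 0.0, 0.0)    # h(v, v) = -4 + 1 = -3

box_null = dalembertian(quadratic(NULL_V, C), POINT, C)
box_timelike = dalembertian(quadratic(TIMELIKE_V, C), POINT, C)
print(box_null, box_timelike)
```

The first value should vanish and the second should reproduce $h(v,v)=-3$, in agreement with the lemma.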
Thus we find that null vectors $v\in N$ yield solutions $q_v$ to HWE.
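A superposition principle also applies, where any signed measure $\mu \in \mathscr{M}(N)$ for which the integral converges yields a $\mu$-averaged solution $q(x):=\int_N q_v (x) d\mu(v)$ to the HWE. Here it would be useful to have a representation theorem, something like: if $\phi$ is any solution of HWE, then $\phi$ can be represented as a $\mu$-average of the $q_v$ as described above. Note that every such average is again a homogeneous quadratic polynomial in $x$, so a statement of this form can hold at most for homogeneous quadratic solutions $\phi$.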
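The simplest instance of this superposition is a discrete signed measure, i.e. a finite weighted sum of the $q_v$. Here is a minimal Python sketch that builds such a sum from a handful of null vectors and checks numerically that it still solves the HWE. The particular null vectors, the signed weights, the value $c=2$ and the helper names are illustrative choices of ours.

```python
def minkowski(u, v, c):
    """The bilinear form h(u, v) with matrix diag(-c^2, 1, 1, 1)."""
    return -c**2 * u[0] * v[0] + u[1] * v[1] + u[2] * v[2] + u[3] * v[3]


def superposition(null_vectors, weights, c):
    """q(x) = sum_k w_k * h(v_k, x)^2 / 2, a discrete mu-average of the q_v."""
    def q(x):
        return sum(w * minkowski(v, x, c) ** 2 / 2.0
                   for v, w in zip(null_vectors, weights))
    return q


def dalembertian(phi, point, c, step=1e-2):
    """Central-difference estimate of box(phi) at point = (t, x, y, z)."""
    coefficients = (-1.0 / c**2, 1.0, 1.0, 1.0)
    centre = phi(point)
    total = 0.0
    for axis, coefficient in enumerate(coefficients):
        ahead = list(point)
        behind = list(point)
        ahead[axis] += step
        behind[axis] -= step
        total += coefficient * (phi(ahead) - 2.0 * centre + phi(behind)) / step**2
    return total


C = 2.0
NULL_VECTORS = [(1.0, 2.0, 0.0, 0.0),
                (1.0, 0.0, -2.0, 0.0),
                (1.0, 0.0, 0.0, 2.0),
                (1.0, 1.2, 1.6, 0.0)]   # each satisfies -c^2 t^2 + |x|^2 = 0
WEIGHTS = [0.5, -1.5, 2.0, 0.25]        # signed weights of the discrete measure

q_mu = superposition(NULL_VECTORS, WEIGHTS, C)
box_q_mu = dalembertian(q_mu, [0.3, 0.7, -0.2, 1.1], C)
print(box_q_mu)
```

By linearity the result is $\sum_k w_k\, h(v_k,v_k)$, which vanishes because every $v_k$ is null.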
[To do: establish the conservation of energy for the HWE from the same principles.]