
Costs and Optimal Transport (OT)

What is OT about? Everybody knows it's about costs, and specifically about minimizing the expected cost of a coupling (or semicoupling, or correlation) $\pi$ between a source $(X,\sigma)$ and a target $(Y, \tau)$. The coupling $\pi$ is a measure on $X\times Y$ which correlates $\sigma$ with $\tau$.
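To make the expected cost of a coupling concrete, here is a minimal discrete sketch. The points `xs`, `ys`, the weights `sigma`, `tau`, and the helper names are hypothetical toy choices, not taken from the thesis. A coupling becomes a matrix `pi` whose row sums are `sigma` and whose column sums are `tau`, and the integral $\int c \, d\pi$ becomes a finite sum.

```python
def marginals(pi):
    """Return the row sums (source marginal) and column sums (target marginal)."""
    source = [sum(row) for row in pi]
    target = [sum(col) for col in zip(*pi)]
    return source, target


def expected_cost(pi, cost):
    """Discrete analogue of the integral of c(x, y) against pi."""
    return sum(
        p * c
        for pi_row, cost_row in zip(pi, cost)
        for p, c in zip(pi_row, cost_row)
    )


# Hypothetical toy data: two source points and two target points on a line.
xs = [0.0, 1.0]
ys = [0.0, 2.0]
sigma = [0.5, 0.5]
tau = [0.5, 0.5]

# Quadratic cost d(x, y)^2 / 2, used here only as a familiar example.
cost = [[(x - y) ** 2 / 2 for y in ys] for x in xs]

# Two different couplings with the same marginals sigma and tau.
independent = [[s * t for t in tau] for s in sigma]  # product measure
monotone = [[0.5, 0.0], [0.0, 0.5]]                  # x_i paired with y_i

print(marginals(independent), marginals(monotone))
print(expected_cost(independent, cost))  # 0.75
print(expected_cost(monotone, cost))     # 0.25
```

Both matrices correlate the same $\sigma$ and $\tau$, yet their expected costs differ. OT asks for the coupling with the least expected cost.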

But what is the cost $c(x,y)$ of transporting a unit mass from $x$ to a unit mass at $y$?

Here we assume that the measures $\sigma, \tau$ represent matter which needs to be conserved. So if a unit mass at $x$ is transported to a unit mass at $y$, then the intermediate masses, say $\mu_s$ for some parameter $s$, need to have total mass $\int \mu_s$ equal to $1$ for all values of $s$. We will comment below on the possible confusion arising from the use of the term "mass" in applications of OT to GR.

A priori, it is difficult to construct costs $c$ between a source $X$ and a target $Y$ which occupy different spaces, with no spatial relationship between points $(x,y)$ and no measure of near or far, or hot and cold. If $X,Y$ occupy different universes, then they might be said to be infinitely far apart, and the only canonical cost appears to be a constant zero (or constant infinity) cost. Therefore, insofar as optimal transport studies geometric transport, it is necessary that the source and target spaces $X,Y$ have some spatial relations between their elements. As trivial as this sounds, it is a very difficult and important problem to construct interesting geometric costs.

In practice the author finds best results are obtained when the target $Y$ is given as a subset of the source $Y\hookrightarrow X$. Interesting topological applications arise when $Y=\partial X$ is a type of rational bordification of $X$, e.g. $Y=\partial X[t]$ where $X[t] \subset X$ is a $\Gamma$-rational excision like Borel-Serre bordifications of locally symmetric spaces. See our thesis for such examples.

The author is also not convinced one can invent interesting geometric costs in abstracto.

So rather we turn to physical models for inspiration. In our view, cost always represents a cost of energy in Joules, and not necessarily of dollars, or kilometers.

So a cost $c(x,y)$ measures an interaction between a unit mass at $x$ and a unit mass at $y$.

In our minds, this tells us that costs $c(x,y)$ represent interaction energies.

In our thesis [Ibid] we compared the properties of attractive costs, e.g. the quadratic geodesic cost $c(x,y)=d(x,y)^2/2$ when $Y\subset X$, with the class of so-called repulsive costs. These costs were interesting because the geometry of the singularities of $c$-optimal transports had very different structures.

We were motivated by electrodynamics. Heuristically, attractive costs represent interaction energies between positively charged source and negatively charged target configurations. Repulsive costs represent interaction energies between, say, positive source and positive target configurations. We recall that opposite charges attract and like charges repel, hence the terminology. This final idiom that like charges repel has a very interesting modification when interaction energy is measured via Weber's potential, but we leave this for a future topic. This is briefly discussed here.
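The contrast between attractive and repulsive costs can be seen in a toy computation. The sketch below is only illustrative: the repulsive cost is assumed to be the Coulomb-like $1/d(x,y)$ for like charges, which is not necessarily the cost used in the thesis, and the points `xs`, `ys` are hypothetical. For uniform measures on equally many points, an optimal coupling can be found among permutations, so for tiny examples we can brute-force them.

```python
from itertools import permutations


def best_matching(xs, ys, cost):
    """Brute-force the permutation p minimizing sum_i cost(xs[i], ys[p[i]])."""
    return min(
        permutations(range(len(ys))),
        key=lambda p: sum(cost(x, ys[j]) for x, j in zip(xs, p)),
    )


def attractive(x, y):
    # Quadratic cost d(x, y)^2 / 2 from the text.
    return (x - y) ** 2 / 2


def repulsive(x, y):
    # Assumed Coulomb-like stand-in 1 / d(x, y); requires x != y.
    return 1.0 / abs(x - y)


# Hypothetical toy data: each source point has one near and one far target.
xs = [0.0, 1.0]
ys = [0.1, 1.1]

print(best_matching(xs, ys, attractive))  # (0, 1): each source goes to its near target
print(best_matching(xs, ys, repulsive))   # (1, 0): each source goes to its far target
```

The same source and target yield opposite optimal matchings depending on the sign of the interaction. Brute force is only viable for a handful of points; larger problems would need an assignment or linear-programming solver.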

GR and OT

We would like to continue to develop our ideas on GR and OT. This seems to be a popular subject, and I know we have an interesting viewpoint on the whole situation. Where are we? In the previous section we were discussing cost as energy. This implies that the expected cost, or integral $\int c(x,y) d\pi(x,y)$, also has well-defined units of energy. Bearing this in mind, let us now continue our discussion of GR and OT.

As we discussed here, what Robert calls the Lorentz distance $\ell(\sigma, \tau)$ between measures $\sigma, \tau$ on $\mathbf{R}^{3,1}$, and which he interprets as the "expected maximal proper time between the events $\mu, \nu$", is not readily interpreted as an energy, and so we find it difficult to accept $\ell$ as a suitable cost in the classical sense.
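For reference, the pointwise quantity $\ell(x,y)$ is easy to compute in flat Minkowski space. The sketch below assumes units with $c=1$, events written as tuples $(t, x_1, x_2, x_3)$, and the convention that $\ell = -\infty$ when $y$ is not in the causal future of $x$. That convention and the function name are assumptions made here for illustration.

```python
import math


def lorentz_distance(x, y):
    """Proper time from event x to event y in Minkowski space (units with c = 1).

    Events are tuples (t, x1, x2, x3). Returns -inf when y is not in the
    causal future of x (an assumed convention).
    """
    dt = y[0] - x[0]
    dr2 = sum((b - a) ** 2 for a, b in zip(x[1:], y[1:]))
    interval = dt * dt - dr2
    if dt < 0 or interval < 0:
        return -math.inf
    return math.sqrt(interval)


origin = (0.0, 0.0, 0.0, 0.0)
turn = (5.0, 3.0, 0.0, 0.0)
end = (10.0, 0.0, 0.0, 0.0)

print(lorentz_distance(origin, turn))                   # 4.0
print(lorentz_distance(origin, (1.0, 2.0, 0.0, 0.0)))   # -inf (spacelike separation)

# Reverse triangle inequality: the straight worldline has the longest proper time.
direct = lorentz_distance(origin, end)
detour = lorentz_distance(origin, turn) + lorentz_distance(turn, end)
print(direct, detour)  # 10.0 8.0
```

The output is a proper time, not an energy, which is the point at issue in this section.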

And so we return to the question: "what does $\ell(x,y)$ represent given events $x,y$?" Given the above discussion, let us then also ask: "what does the additive expected value $\int \ell (x,y) d\pi(x,y)$ represent when $\pi$ is a coupling measure between events $\sigma, \tau$?" Our point here is that the cost has definite energy units in the classical setting, and these energy units are additive, and therefore the integral representing the expected value again has well-defined energy units. However, with the Lorentzian distance $\ell$, there are no units to justify or confirm that the additive average is well-defined. Of course, the numerical integral value is well-defined, but there are many other possible modifications. Robert's paper has anticipated this objection somewhat in his general treatment of $q \ell(x,y)^q$ for $0 < q \leq 1 $.

Is the above question irrelevant? Some might rationalize it away, and dismiss the objection. They might ask "why should $\ell$ need an interpretation?" or "why should $\ell$ require units?".

Our response would be, "well, are we looking for something to compute, or are we looking for something to experimentally verify?" If we are just computing values, and if that is considered physics, then okay, we don't need units. But if physics is to relate to observation, then we necessarily need units. Why? Because units are used to quantify uncertainty. Again, this is the classical viewpoint.

Remark. If the Lorentz-Minkowski metric $ds^2$ were positive definite, then we could happily represent $\ell$ as a sum of positive squares, in which case we would have a formula in units of energy, i.e. kinetic energy, assuming that one can define the inertial mass. But in the SR spacetime formulation, there seems to be no opportunity to introduce inertial mass, and the sum of signed squares has no Riemannian metric meaning.

Problem: Construct Interesting Energetic Costs

So before we digress into a question about GR and OT, let's pose some problems. Basically the usual quadratic cost $c(x,y)=d(x,y)^2/2$ is taken as the canonical cost on a Riemannian manifold. Thus we witness the same thing with proposals for applying OT to SR and GR, and this is indeed natural.

Our question here is where to find more examples of costs. As we developed in our thesis, different costs can generate singularities of very different homotopy type. From our point of view, this depends on whether costs are attractive or repulsive.

For example, with an attractive cost, one is typically looking at ground states which collect near the target. For repulsive costs, however, the ground states are typically deeply nested in the source domain, i.e. states are being repelled from the target and look to escape as far away as possible.

We continue to use electrodynamic energies as the basic supply of interesting costs. The author would be open to hearing other recommendations for interesting costs.

m, matter, mass, Mach.

I can't help myself from making another comment on the challenge of applying OT to GR. In OT one often speaks about mass transport, where it's always assumed that a continuity equation holds. Thus when the measures are transported there is conservation and nothing lost or gained along the way.
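For the record, the continuity equation meant here can be written as follows, where $\mu_s$ is the intermediate measure at parameter $s$ and $v_s$ is a velocity field (a symbol introduced here for illustration, not in the text above):

$$\partial_s \mu_s + \nabla \cdot (\mu_s v_s) = 0.$$

Integrating over $X$, and assuming no flux escapes through the boundary or at infinity, the divergence term integrates to zero, so

$$\frac{d}{ds} \int_X \mu_s = -\int_X \nabla \cdot (\mu_s v_s) = 0,$$

which is the statement that the total $\int \mu_s$ stays constant along the transport.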

So if OT is to study "mass transport" in the setting of GR, are we to assume that mass also will satisfy local conservation? In other words, what are the measures $\sigma, \tau$ actually representing on the spacetime?

Wal Thornhill has made this point himself, that among the greatest hazards and sources of confusion in physics is the unfortunate coincidence that both mass and matter begin with the same letter "m".

Really no joke. That's the cause of all the trouble. What happens is mathematicians and physicists both get lazy and begin to interpret "m" for matter and mass as if they are equivalent or interchangeable. This possibly originates in Newton's own non-definition of mass as simply the presence of immediate ponderable matter, and Newton assumed that there was some unspecified constant of proportionality between mass and matter and so "up to a constant which we can set to $1$" both "m[atter]" and "m[ass]" became confused with the letter $m$.

Einstein's Equivalence Principle is another source of confusion between the gravitational mass $m_g$ of an object (which following Newton is something vaguely defined like the quantity of gravitational charge, or in other words quantity of matter) and the inertial mass $m_i$. Thus Einstein's argument that $m_g=m_i$ is another source of confusion. The argument against Einstein's equivalence principle is elementary, and was recognized by Einstein himself: rotating bodies do not admit global inertial frames! Therefore the inertial frames used to convert the gravitational potential into an inertial reference frame are only defined locally on the tangent space. It is a very limited first-order observation.

Amazingly, there is another "m" that enters the problem of "matter" and "mass", namely Mach! Because Mach proposed that the inertial mass $m_i$ must be defined as the potential energy of the body relative to the fixed stars at infinity. What are the implications? Namely that

matter can neither be created nor destroyed

while inertial mass is variable depending on the interaction of the matter with the matter of the fixed stars at infinity.

This is essentially my understanding of A. K. T. Assis's development of Relational Mechanics. This is also Halton Arp's interpretation of the observed intrinsic red shifts of quasars which are visibly interacting with nearby systems. Arp's idea was that quasars are creation hotspots in the universe, where newly created atoms have less mass because they have been interacting with only a limited part of the universe for a short period of time. Therefore their inertial mass is much smaller, and therefore the wavelengths emitted by the atoms are increased. This increased wavelength is caused not by the usual red shift velocity mechanism, but by the Machian dependence of inertial mass on the other matter in the universe, and these matter-to-matter signals take time. This is the meaning of the intrinsic red shift, as opposed to the Hubble-Einstein velocity red shift. Further details can be found in Arp's book "Seeing Red".
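The step from smaller mass to longer wavelength can be made explicit in the simplest setting. As a sketch, assume the Bohr model of hydrogen and assume all constants other than the electron mass $m_e$ keep their laboratory values. The energy levels are

$$E_n = -\frac{m_e e^4}{8 \varepsilon_0^2 h^2 n^2},$$

so every transition energy is proportional to $m_e$, and every emitted wavelength $\lambda = hc/\Delta E$ is proportional to $1/m_e$. If the emitting atom has electron mass $m$ while laboratory atoms have $m_{\mathrm{lab}}$ (both symbols introduced here for illustration), the resulting red shift $z$ satisfies

$$1 + z = \frac{\lambda_{\mathrm{obs}}}{\lambda_{\mathrm{lab}}} = \frac{m_{\mathrm{lab}}}{m},$$

which is greater than $1$ exactly when $m < m_{\mathrm{lab}}$.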