More Remarks on Craig Alan Feinstein's P versus NP argument.
In previous articles here and here we have been reviewing Craig Alan Feinstein's proposed solution to the $P$ versus $NP$ problem, which has been poorly received by the so-called experts. But we continue to find his papers interesting, and this post is mainly about his Dialogue Concerning The Two Chief World Views, which contains some interesting remarks on precisely what argument CAF is making for $P \neq NP$.
- The first point is that given a finite subset $X \subset \bf{Z}$, there are indeed exponentially many subsets in $2^X$; this is obvious. But generically there are also $2^{\#(X)}$ distinct values in the image of $\epsilon: 2^X \to \bf{Z}$, where $\epsilon(Y) = \sum_{y\in Y} y$ is the natural addition map defined on subsets $Y \in 2^X$.
This point is made as a rebuttal to the objection that if $X$ is bounded a priori, then it is known that there are only polynomially many distinct values in the image of $2^X \to \bf{Z}$. [Reference needed]. We think this is an important observation, but one which might require more evidence or proof. Can we prove more formally that arbitrary subsets $X \subset \bf{Z}$ almost surely have exponentially many distinct values in the image of $\epsilon: 2^X \to \bf{Z}$?
If $X$ is in fact a set of distinct powers of $2$, say $$\{1, 2^2, 2^4, 2^8\},$$ then there are large gaps between the elements, each element exceeding the sum of all smaller ones, and it is clear that the image attains the full $2^{\#(X)}$ distinct values. What is also interesting in these cases is that it is evident that SUBSET-SUM has a negative answer: no nonempty subset sums to zero.
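To make the genericity claim concrete, here is a minimal sketch (our own illustration; distinct_subset_sums is a hypothetical helper, not from CAF's paper) counting the distinct values of $\epsilon$ on the powers-of-2 example and on a random set:

from itertools import combinations
import random

def distinct_subset_sums(X):
    # enumerate all 2^len(X) subsets and collect their sums into a set
    sums = set()
    for r in range(len(X) + 1):
        for subset in combinations(X, r):
            sums.add(sum(subset))
    return len(sums)

powers = [1, 2**2, 2**4, 2**8]
print(distinct_subset_sums(powers), 2**len(powers))   # 16 16: epsilon is injective here

random_set = random.sample(range(-10**9, 10**9), 12)
print(distinct_subset_sums(random_set), 2**12)        # generically 4096 4096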
And this raises the second interesting point in CAF's paper:
- The algorithms competing with the "meet in the middle" algorithm (which has $O(2^{n/2})$ time-step complexity; see the sketch below) must also have the ability to deterministically return a negative answer to SUBSET-SUM(X) whenever there are no zero subset sums for the given input $X$.
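For concreteness, here is a minimal sketch of the meet-in-the-middle idea, in a hash-set variant rather than the classical sorted-lists formulation; subset_sum_mitm is our own naming:

from itertools import combinations

def subset_sum_mitm(X, target=0):
    # enumerate the O(2^(n/2)) subset sums of each half of X, then look
    # for a pair of half-sums adding up to the target
    n = len(X)
    left, right = X[:n // 2], X[n // 2:]

    def sums_of(half):
        # returns (all subset sums incl. the empty subset, sums of nonempty subsets)
        all_s, nonempty = {0}, set()
        for r in range(1, len(half) + 1):
            for c in combinations(half, r):
                all_s.add(sum(c))
                nonempty.add(sum(c))
        return all_s, nonempty

    _, nonempty_left = sums_of(left)
    all_right, nonempty_right = sums_of(right)
    # a nonempty subset splits into a left part and a right part,
    # at least one of which is nonempty
    if target in nonempty_right:
        return True
    return any((target - s) in all_right for s in nonempty_left)

print(subset_sum_mitm([1, 2**2, 2**4, 2**8]))   # False: no nonempty zero sum
print(subset_sum_mitm([3, -5, 7, 11, -2]))      # True: 7 - 5 - 2 = 0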
Perhaps it can be argued that generically the answer to SUBSET-SUM is negative, and therefore the exponential search is almost always required to certify negative answers. It is perhaps useful to contrast this with problems whose instances are known a priori to have solutions, for example sorting a list of integers. There are surely many more examples of problems which are known a priori to terminate with positive solutions, for example Hilbert's Nullstellensatz over the complex numbers, as described by [Blum-Cucker-Shub-Smale].
And actually, it now appears that the whole question of SUBSET-SUM(X) is precisely about the injectivity of the map $\epsilon: 2^X \to \bf{Z}$. Identifying subsets with their indicator functions, $\epsilon$ is basically linear, and $$\epsilon(f)-\epsilon(g) = 0$$ if and only if $\epsilon(f-g)=0$. But $f-g$ represents a signed symmetric difference, so it is not exactly a solution of SUBSET-SUM.
Idea: perhaps the signed subset-sum problem is equivalent to the unsigned subset-sum problem. If we return a signed solution $f-g$, then we can examine whether it is a positive solution by looking at the signs of the coefficients, which adds only linear time-complexity. So maybe we should simply focus on the complexity of the injectivity of $\epsilon$ on $2^X$ and its relation to negative answers, because any algorithm which competes with the MITM algorithm must at least solve this decision problem.
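As a small sketch of this idea (our own illustration; the function name is hypothetical), the following brute-force search hashes subset sums until two distinct subsets collide, returns the signed coefficient vector of $f-g$, and then performs the linear-time sign check described above:

from itertools import combinations

def signed_subset_sum(X):
    # a collision epsilon(f) = epsilon(g) between distinct subsets f, g
    # witnesses non-injectivity of epsilon, i.e. a signed zero sum
    seen = {}
    for r in range(len(X) + 1):
        for f in combinations(range(len(X)), r):
            s = sum(X[i] for i in f)
            if s in seen:
                g = seen[s]
                # coefficient vector of f - g, with entries in {-1, 0, +1}
                coeffs = [0] * len(X)
                for i in f:
                    coeffs[i] += 1
                for i in g:
                    coeffs[i] -= 1
                return coeffs
            seen[s] = f
    return None

coeffs = signed_subset_sum([2, 3, 5, 10])
print(coeffs)                               # [1, 1, -1, 0]: 2 + 3 - 5 = 0
if coeffs is not None:
    # the signed solution is unsigned exactly when no coefficient is negative
    print("unsigned solution:", all(c >= 0 for c in coeffs))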
On the Injectivity of $\epsilon: 2^X \to \bf{Z}$
The idea of the previous section is to relax somewhat and consider the Signed Subset-Sum problem, abbreviated $SS_{\pm}(X)$ for a given input $X$, in contrast to $SS(X)$. From this point of view, the signed subset-sum problem becomes precisely equivalent to deciding the injectivity of the augmentation map $\epsilon: 2^X \to \bf{Z}$: a positive answer to $SS_{\pm}(X)$ is exactly a failure of injectivity.
This suggests the following question:
_What is the computational time-step complexity of deterministically deciding the injectivity of a linear map $T: V_1 \to V_2$?_
What would it mean if the complexity of deciding injectivity of linear maps were essentially $O(\dim(V_1))$, and can it be decreased?
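To give the question some concreteness, here is a minimal sketch (our own, under the assumption that $T$ is presented as a matrix over $\bf{Q}$) deciding injectivity by exact Gaussian elimination; this costs polynomially many field operations, in contrast with the combinatorial map $\epsilon$:

from fractions import Fraction

def is_injective(matrix):
    # T is injective iff its rank equals the dimension of the domain,
    # i.e. the number of columns; we row-reduce exactly over Q
    rows = [[Fraction(x) for x in row] for row in matrix]
    m = len(rows[0])
    rank = 0
    for col in range(m):
        # find a pivot at or below the current rank row
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(len(rows)):
            if r != rank and rows[r][col] != 0:
                factor = rows[r][col] / rows[rank][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank == m

print(is_injective([[1, 2], [3, 4]]))          # True: full column rank
print(is_injective([[1, 2], [2, 4], [3, 6]]))  # False: column 2 = 2 * column 1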
What does category theory say about injectivity?
Does the strange Yoneda Lemma have any application? I know the logicians find it an important representation theorem for certain categories defined over Sets or something, but I never quite understood it.
There are further remarks to be made about CAF's "Two Worlds" paper.
- The critics employ an invalid meta-mathematical critique of CAF's proof when they inaccurately reason that "if your proof were correct, then [THIS] would be true." (A kind of reductio ad absurdum.)
One such example is this: they say that if CAF's argument were applied to finding integer solutions of linear equations $$\sum_i a_i x_i =0,$$ where $a_i\in \bf{Z}$ and $x_i \in \bf{Z}$, it would predict exponential complexity, yet such equations are solved asymptotically much faster than $O(2^{n/2})$. This is true, but it is a trivial example, since solutions to such equations are immediately found. For example, set all unknowns to zero except two. Then it is trivial to solve over $\bf{Q}$ the linear homogeneous equation $ax+by=0$ when $a,b$ are nonzero constant integers, e.g. $(x,y)=(b,-a)$.
However, it is somewhat more interesting for inhomogeneous linear equations over the integers. For example, it is not so easy to solve $$2x_1 + 13x_2 =9.$$ Here we enter into the commutative algebra interpretation, for the range of all possible values obtained by $2x_1 + 13 x_2$ is determined by a unique integer (modulo units), namely the greatest common divisor $\gcd(2, 13)$. And the determination of gcd's is obtained via the Euclidean algorithm. For example, we have $13=6\cdot 2+1$, therefore $1\cdot 13 - 6\cdot 2=1$, and $(x_1, x_2)=(-6, 1)$ solves $2x_1 + 13x_2 = 1$ in integers; scaling by $9$ gives $(x_1, x_2)=(-54, 9)$ solving the original equation.
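Here is a minimal sketch of the extended Euclidean algorithm (a standard implementation, in our own notation) recovering the identity $u\cdot a + v\cdot b = \gcd(a,b)$ and hence the solution above:

def extended_gcd(a, b):
    # returns (g, u, v) with u*a + v*b = g = gcd(a, b)
    if b == 0:
        return a, 1, 0
    g, u, v = extended_gcd(b, a % b)
    return g, v, u - (a // b) * v

g, u, v = extended_gcd(2, 13)
print(g, u, v)              # 1 -6 1, i.e. (-6)*2 + 1*13 = 1
x1, x2 = 9 * u, 9 * v       # scale by the target t = 9
print(2 * x1 + 13 * x2)     # 9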
That's clear, so what do the critics say? Well, the image of $2x_1 + 13x_2$ is indeed large (infinite), yet we have found solutions without searching through this infinite maze. This is true, so what's the difference? Feinstein argues that the linear equation over the integers admits many algebraic identities, or rearrangements. This is essentially the key to the Euclidean algorithm, which reduces the coefficients of the equation into essentially equivalent linear expressions, and ultimately into a trivial equation.
Here enters the idea of the ideal generated by the coefficients over $\bf{Z}$. The image of the linear integer equation is infinite, but it is highly structured, being exactly the multiples of the generator of the ideal, i.e. the gcd of the coefficients. This structure enables the significant speedup in finding solutions. And the point is that this reduction is not possible in the (signed) subset-sum problem for finite sets. Why? Because the image is much less structured, not forming any "ideal".
If we speculate a little, perhaps this is another reason why SUBSET-SUM is so difficult: it is an extremely constrained version of a much easier problem. I.e., if we were allowed to arbitrarily duplicate elements of $X$, then we could realize much better approximate solutions to the zero subset sum.
Complexity of Deciding Divisibility over $\bf{Z}$
This relates to a curious point about the Euclidean algorithm, namely the back-substitution that is necessary to recover an explicit solution. If we are simply studying the decision problem, then we never need this back-substitution. This relates to CAF's observation that the equation $$83x_1 + 18x_2=t,$$ for example, has a solution if and only if there exists $z\in \bf{Z}$ such that $\gcd(83, 18)\cdot z=t$, where $t$ is the integer target value. With the Euclidean algorithm we find $\gcd(83, 18)=1$, and therefore the equation $83x_1 + 18x_2=t$ is effectively equivalent to the equation $$1\cdot z=t,$$ which has a trivial solution. So the Euclidean algorithm allows us to reduce the equation to a simpler, almost trivial equation.
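In code, the decision (as opposed to the search for an explicit solution) is then just a gcd computation and a divisibility test; a minimal sketch, with has_integer_solution being our own hypothetical naming:

from functools import reduce
from math import gcd

def has_integer_solution(coeffs, t):
    # sum_i a_i * x_i = t is solvable in integers iff gcd(a_1,...,a_n) divides t
    g = reduce(gcd, coeffs)
    return t % g == 0

print(has_integer_solution([83, 18], 5))   # True: gcd(83, 18) = 1 divides everything
print(has_integer_solution([4, 6], 7))     # False: gcd(4, 6) = 2 does not divide 7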
But the point CAF makes, and perhaps it deserves more elaboration, is that the equations defining SUBSET-SUM are not defined over structured rings with Euclidean algorithms. This suggests a non-algorithm: trying to perform the Euclidean algorithm with only coefficients of $\pm 1$ allowed, and never being permitted to re-use an element. It is highly constrained.
To conclude, the decidability of integer solutions depends on a reduction, via the Euclidean algorithm finding the primitive generator of the ideal generated by the integer coefficients, to a much simpler equation $$\gcd(a_i)\cdot z=t,$$ and the decision question reduces to: "Is the gcd of the coefficients a divisor of the target?"
So after Euclid, everything is reduced to the question: how is divisibility decided? I.e., how do we decide the function "isDivisible(a,b)"? There's a discussion of the decidability here. Also interesting is the relation to Newton's method, sometimes called Newton-Raphson division.
Remark. The efficiency of Newton-Raphson is not immediately relevant to the question, since it efficiently finds an approximate solution, yet our decision problem is about determining whether $x=t\cdot a^{-1}$ is integral. The question then becomes whether Newton's method is sufficiently accurate to distinguish integral from non-integral rational values!
For example, solving $ax=t$ requires $x=t\cdot a^{-1}$. But the value of $a^{-1}$ can be expressed as the zero of the function $f(x)=\frac{1}{x}-a$, which has derivative $f'(x)=\frac{-1}{x^2}$. Applying Newton's method to find approximate zeros $x_n$, we obtain an approximate quotient equal to the product $t\cdot x_n$, and we must decide whether $t\cdot x_n$ is integral.
The Wikipedia article on Newton-Raphson division has an interesting remark on the quadratic convergence of Newton's method. To quote:
"This squaring of the error at each iteration step – the so-called quadratic convergence of Newton–Raphson's method – has the effect that the number of correct digits in the result roughly doubles for every iteration, a property that becomes extremely valuable when the numbers involved have many digits (e.g. in the large integer domain)"{end quote}
It is simple enough to recall the idea behind Newton's method. At the point $x_n$ we evaluate the $x$-intercept of the tangent line at $(x_n, f(x_n))$, whose equation is $$\frac{y-(1/x_n-a)}{x-x_n} =-\frac{1}{x_n^2}. $$ Setting $y=0$, the $x$-intercept is $$x_{n+1}:=-x_n^2\left(0-\frac{1}{x_n}+a\right)+x_n=x_n(2-ax_n).$$
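The quoted doubling of correct digits is easy to witness numerically; the following few lines (our own illustration) iterate $x_{n+1} = x_n(2-ax_n)$ for $a=7$ and print the error $|1-ax_n|$, which squares at each step:

a, x = 7, 0.1
for n in range(6):
    print(n, x, abs(1 - a*x))   # the error goes 0.3, 0.09, 0.0081, ...
    x = x*(2 - a*x)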
We remark that there is something arbitrary in our choice of auxiliary function $f(x)=\frac{1}{x}-a$: all we really needed was a function $F$ with the prescribed zero $F(a^{-1})=0$ and a well-behaved descent into that zero. That is what is obtained by the analytic expression $f(x)=1/x-a$: calculus gives the slope of the tangent line everywhere as $-1/x^2$, from which the calculation above yields a recursive formula for a quadratically fast converging approximation.
Here we face the same difficulty as outlined by Blum-Cucker-Shub-Smale in their real complexity book.
[To be continued ... -JHM]
We might compare two algorithms for determining divisibility. The first algorithm is naive, deterministic, and originates in the axiom of Eudoxus and Archimedes. It runs like this:
def isDivisible(a, b):
    # step through the multiples of a until we reach or pass b
    k = 0
    while k*a < b:
        k = k + 1
    # the loop exits at the first multiple with k*a >= b
    if k*a == b:
        return True
    elif k*a > b:
        return False
print(isDivisible(7, 39))
print(isDivisible(7, 48))
print(isDivisible(7, 49))
print(isDivisible(7, 59))
print(isDivisible(7, 69))
print(isDivisible(3, 2**(12)-1))
print(isDivisible(5, 2**(12)-1))
print(isDivisible(7, 2**(12)-1))
print(isDivisible(3, 2**(13)-1))
print(isDivisible(19, 2**(13)-1))
What's notable about the function isDivisible(a,b) is that it is guaranteed to return an answer given enough steps or iterates. The Eudoxus-Archimedes method of exhaustion guarantees this fundamental property. It is essentially exhaustive search, iterating patiently through $k$, $k+1$, $k+2$, etc., but it terminates at some definite (though possibly very large) step.
There is a faster algorithm using floating-point approximation. In the following code we implement it; however, it is not really deterministic, and in fact does not decide whether $a$ divides $b$, but returns a floating-point number which is indicative of whether there is divisibility.
def inv(a, initial=0.1, steps=7):
    # Newton iteration for 1/a: the error e_n = 1 - a*x_n squares at each
    # step, so the iteration converges only when 0 < initial < 2/a
    x = initial
    for i in range(steps):
        x = x*(2 - a*x)
        # print("the", i + 1, "th approximant is", x)
    return x
# The second function returns the arithmetic quotient b/a,
# but the algorithm is not yet deterministic about returning True
# when the resulting quotient is integral.
def div(a, b):
    return b*inv(a)
print(div(3,8))
print(div(9, 81))
print(div(5, 1225))
print(div(3, 2**(12)-1))
print(div(5, 2**(12)-1))
print(div(7, 2**(12)-1))
print(div(3, 2**(13)-1))
print(div(19, 2**(13)-1))
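A possible completion of the decision procedure (our own sketch; the tolerance 1e-6 is an assumption, and whether floating-point precision always suffices is exactly the question raised in the Remark above) is to round the approximate quotient to the nearest integer:

def isDivisibleFloat(a, b, tol=1e-6):
    # accept iff the Newton-Raphson quotient is within tol of an integer
    q = div(a, b)
    return abs(q - round(q)) < tol

print(isDivisibleFloat(7, 49))           # True: 49 = 7 * 7
print(isDivisibleFloat(7, 48))           # False
print(isDivisibleFloat(3, 2**(12)-1))    # True: 4095 = 3 * 1365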