A digestion of unit distance constructions
Article excerpt
Suppose that one has a set of points in the plane, which we will think of as the complex plane . Let denote the number of unit distances determined by these points, i.e., pairs of points whose displacement obeys the equation (It makes little difference for the asymptotics, but we will count the pair separately […]
Suppose that one has a set of points in the plane, which we will think of as the complex plane . Let denote the number of unit distances determined by these points, i.e., pairs of points whose displacement obeys the equation
(It makes little difference for the asymptotics, but we will count the pair separately from here.)
The Erdös unit distance problem asks, for a given large number , what is the largest possible value of amongst all sets of cardinality ?
For instance, if one takes to be equally spaced collinear points with unit spacing, one can obtain a linear construction with . Erdös observed that one can improve this construction asymptotically:
Theorem 1 (Erdös construction) There exists point sets of arbitrarily large cardinality such that for some absolute constant 0}" class="latex" />.
In fact, in the construction one could take arbitrarily close to . Erdös famously asked whether had to be bounded above by ; and for decades there was significant effort expended on upper bounding , with the best known upper bound being , established by by Spencer, Szemerédi, and Trotter in 1984. We will note here that it seems extremely difficult to improve this upper bound. One reason for this is that if one replaces the equation (1) with the superficially similar equation
(i.e., replace the unit circle by a standard parabola), then the bound is best possible, as can be seen by taking to be a rectangle in the Gaussian integers of width and height . Hence any improvement of the bound would have to exploit some special property of the unit circle that is not shared by the parabola.
It came as some surprise recently when a team from OpenAI resolved the question of Erdös:
Theorem 2 (OpenAI construction) There exists point sets of arbitrarily large cardinality such that for some absolute constant 0}" class="latex" />.
The optimal value of is still unknown, but the best upper and lower bounds on are tracked at this page; currently we know that .
The construction in Theorem 2 is a heavily modified version of that in Theorem 1, and uses some non-trivial amount of algebraic number theory, in particular the device of Golod, Shafarevich towers of field extensions. However, it was later observed using the Mythos AI that one could get a weaker bound with less algebraic number theory, which after optimizing parameters yields the following intermediate result between Theorem 1 and Theorem 2:
Theorem 3 (Mythos construction) There exists point sets of arbitrarily large cardinality such that for some absolute constant 0}" class="latex" />.
Furthermore, by inserting Golod, Shafarevich towers back into the Mythos construction, one can recover the full strength of Theorem 2.
These results already have a number of expositions; see for instance this article of Alon et al., or this blog post of Bloom. As an exercise for myself, I recently spent some time trying to “digest” these constructions and place them on a common footing, with an emphasis on trying to find the minimal route to either heuristically or rigorously recovering these results relying on as little algebraic number theory as possible. The post here is a writeup of this exercise. (Disclosure: AI tools were useful for providing initial summaries of these arguments, as well as on explaining various fundamentals of algebraic number theory to me.)
The first (trivial) observation is that one can use rescaling to replace the unit distance by any other fixed distance. In particular, for any positive real , if we let denote the number of pairs whose displacement obeys the equation
then it is clear that any construction of a point set with a given value of can be rescaled to another point set of the same cardinality with the corresponding value of . It turns out to be convenient to work with values of that are asymptotically large, for instance the product of several large primes.
All the constructions of good point sets basically involve taking all the elements of a certain ring of algebraic integers up to some height. In the original construction of Erdös, was chosen to be the ring of Gaussian integers , but in fact any ring of integers in a non-trivial bounded degree field extension of would suffice to recover Theorem 1 (though always with the constant not exceeding ). To go beyond this, one has to start considering number fields of unbounded degree. As it turns out, the field extensions arising from Golod, Shafarevich towers are the most efficient for this purpose, and lead to Theorem 2; but one can work with the more elementary construction of number fields generated by many square roots of medium-sized primes, and this suffices for the intermediate result in Theorem 3.
The numerology can be explained as follows. Take to be a ring of integers in some number field of degree , and suppose for sake of argument that is the product of (rational) primes , which for simplicity we will assume to all have comparable magnitude, thus for some and all . Thus, is roughly of the size of . In practice one wants to impose some additional “splitting” conditions on these primes , but the prime number theorem, as well as variants such as the Chebotarev density theorem, suggest that we should be able to keep reasonably close to in size; for instance, if we select primes greedily then we can have . In particular we expect to have in practice.
By construction, splits into the product of rational primes. Moving up to the degree extension, one can optimistically hope that splits further into the product of primes in . Using conjugation symmetry, these primes might split into conjugate pairs . By selecting one element from each pair and multiplying, this generates solutions to (3) in . These solutions will of course have complex magnitude ; one can optimistically hope that they in fact have “height” in some sense.
To take advantage of this, take to be the set of points in of height . As has rank , we therefore expect the size of this set to be roughly
(For this heuristic discussion I will be deliberately vague about what the symbol means.) Meanwhile, using our solutions to (3), we expect to have
But this can be clarified by the heuristic (4). Taking logarithms, we expect to have
In the regime where the degree of the number field is held fixed, we thus expect to exhibit logarithmic type growth in , and on inserting this back into (5) we (heuristically) recover Theorem 1 (with the natural constant ). In fact it is not hard to turn the above heuristics into a rigorous argument, by setting equal the Gaussian integers and selecting all the primes to be , so that they split completely in by the Fermat two-square theorem.
But if one can permit the degree to grow in the construction, and in particular be superpolynomial in , then the above heuristics suggest that we can start improving upon Theorem 1 , and even get all the way to Theorem 2 if we can make the degree go to infinity while keeping the number the number of primes .
If one naively tries this approach by forcing all the primes to split completely in a very high degree number field, one runs into significant technical difficulties, not least of which is the need to obtain good error terms in the Chebotarev density theorem, which touches upon such difficult questions as the Generalized Riemann Hypothesis and the existence of Siegel zeroes. From an algebraic number theory perspective, this is related to the breakdown of unique factorization in such number fields, as measured by the class group. The size of this group is in turn controlled by the discriminant of the field, as per the fundamental theorem of Minkowski in this subject.
But one can hope that the arguments are robust enough to tolerate a little bit of breakdown in unique factorization, so long as the class group is not too large. The most natural way to do this is to use all the standard machinery of algebraic number theory, such as the unique factorization of ideals. But there turns out to be a more elementary (though largely equivalent) approach, which is to weaken the target equation (3) to a congruence equation
This condition does not pin down the value of completely, but so long as we can keep the height of not too much larger than , it does restrict to a sufficiently small set of possible values that a simple application of the pigeonhole principle can allow one to conclude.
In order for this strategy to work well, one needs to locate high-degree number fields of controlled discriminant for which it is relatively easy to at least partially split one’s rational primes into ideals in this field . It turns out that requiring the field to have a tower structure and admit complex multiplication (which basically amounts to it including ) is already sufficient to get a satisfactory amount of splitting. To control discriminants, the most efficient choices are the Golod, Shafarevich towers, for which the (root) discriminant stays bounded; but a more naive choice of a tower of quadratic extensions also gives reasonable control on discriminants and is sufficient to establish Theorem 3.
The three constructions thus sit on a continuum, with the key differences being the selection of the key parameters (the number of primes multiplied together) and (the degree). The Erdös construction keeps the degree fixed and sends the number of primes to infinity. The OpenAI construction does the opposite, keeping the set of primes fixed but sending the degree to infinity. The Mythos construction is a compromise, in which the degree and the number of primes both go to infinity in a coupled fashion. In particular, one could easily imagine an alternate timeline of events in which the Mythos construction was the first to be discovered (by either humans or AI) after the Erdos construction as a reasonably natural modification of the latter, and then subsequently refined (again either by humans or AI) to the OpenAI construction once the significance of Golod, Shafarevich towers was realized.
In this recent paper of Pohoata, the terms “horizontal amplification” and “vertical amplification” were proposed for the technique of constructing large configurations by increasing and , thus the Erdös construction becomes a paradigm for horizontal amplification while the OpenAI construction becomes a paradigm for vertical amplification (and the Mythos construction utilizes both types of amplification). See also this paper of Bloom-Sawin-Schildkraut-Zhelezov for another recent application of vertical amplification.
, 1. Some more details,
Here we sketch how the quadratic extension approach can recover Theorem 3.
As indicated above, we will work with a product of (rational) primes of size . Our only requirements of these primes, beyond their size, will be that they are distinct and equal to mod , so that they split in the Gaussian integers. By the prime number theorem in arithmetic progressions, this allows us to take as small as , so in particular .
To construct , we start by taking distinct medium-sized (rational) primes of size for some medium-sized parameter (eventually, when we optimize parameters, we will take to be a small multiple of ). We will not need any further properties of these primes, so by the prime number theorem we can take as small as , so in particular . We will take to be larger than , so that the primes are distinct from the primes .
We will work in the number field generated by and real square roots . For instance, if and , a typical element in this field would take the form
In this particular example, the ring of integers would consist of those elements (8) for which are rational integers. In general, the ring of integers can be slightly larger than this, but for our purposes we can just work with the “naive” ring of integers generated by and . A typical element of this ring then looks like
where are rational integers and we adopt the convention that . Let us define the (naive) height of such a ring element to be . Then the number of elements of of height at most is as long as is sufficiently large (again I will be vague about what means here).
Our main goal is to find a large number of solutions in to the congruence equation (7). As per the usual Minkowski embedding based on the various ways to embed into or , it is convenient to think of as a lattice in a -dimensional vector space, which (due to the presence of , which excludes purely real embeddings) is naturally thought of as the product of copies of . For instance, in the running example, one can identify a ring element with an element
of , and with this embedding becomes a lattice in . In general, the embedding of into is essentially the Walsh, Fourier transform, weighted by the various square roots of . (The appearance of the Walsh-Fourier transform reflects the fact that the specific number field we are working with is an abelian Galois extension with Galois group .) Because of this, one can readily compute the covolume of the lattice, which up to lower order terms is basically , which with our construction can be crudely bounded by . As long as we keep small compared to , this covolume will be small compared to and will end up being a lower order term.
The reason we care about the covolume (whichis essentially the square root of the discriminant) is because of Minkowski’s theorem, which we will use in this crude form: Minkowski’s theorem: any lattice in a -dimensional space of covolume will contain a non-zero lattice vector of length . Thus for instance will contain some vector of length , and any sublattice of of index will contain a vector of length , and thus also of height by inverting the Walsh-Fourier transform.
Anyway, suppose that we can find some sublattice (in fact, they will be ideals, but we will not need this) of for which we can obtain the inclusion
that is to say one has for all . Then clearly any element of will obey the congruence (7). An obvious choice of would be , but this is way too sparse: this lattice has index in , so the shortest vector one can hope to locate in it will have height , which is too large for our purposes. Instead, we would like to have index (which is the smallest it can be while still yielding the inclusion (9)); then will contain a non-zero element of height , which means that is equal to times an element of of height . The number of such elements is , which will end up being a lower order term that we can easily pigeonhole away.
We claim that we can find at least different sublattices of index that obey (9). To verify this claim, it is a straightforward matter to use the Chinese remainder theorem to work “prime by prime”. Indeed, it suffices to show for each of the primes dividing , that there are different sublattices of of index that obey the inclusion
We can descend now to the finite ring , which is a -dimensional vector space over the finite field . The complex conjugation operation descends to an involution on this vector space, and we are looking for subspaces of dimension with the property that
So now we just need to understand the structure of . The general Wedderburn, Artin theorem tells us that this ring is the product of finite fields, but we can be much more explicit here in this specific situation. Exactly as Minkowski embedding maps rings in number fields into product of copies of and associated to the real and complex embeddings of the ring, we can also embed into a product of copies of and , depending on how we assign square roots to or in or the quadratic extension . We can illustrate this with the running example. As mod , the square roots of in stay in . Suppose first that also splits into square roots in . Then we can embed into by mapping
By counting elements we see that this embedding is in fact an isomorphism. If instead does not split, so that now lie in , then we can embed into by mapping
thus we drop half of the previous embeddings as being conjugate to the half that we retain. Again, counting shows that this is an isomorphism.
In general, one can show that is isomorphic to either (if all the split in ) or (if at least one of the does not split). Furthermore, in the former case the copies of organize into conjugate pairs (corresponding to flipping to ), and in the latter case the copies of organize into conjugate pairs. By selecting one element from each conjugate pair, and taking to be the joint kernel of such elements, we can generate either or different subspaces of dimension with the desired property that . By the aforementioned Chinese remainder argument, this gives the claimed lattices of index .
Invoking Minkowski’s theorem, this now generates non-zero vectors of height obeying (7). There is a technical issue that some of these vectors could conceivably collide with each other; however there is a further Chinese remainder theorem argument (which I will omit here) that shows that any such can belong to at most such lattices. So the number of distinct generated by this argument is at least , and by pigeonholing one can now also get at least at least solutions to the equation
of height for some of height . As long as we select
then the type factors can be neglected, and we approximately have
up to lower order terms. If we choose to be a small multiple of , then we soon calculate that , and we recover Theorem 3.