The distribution of Hecke eigenvalues, Part I

Here is a question I raised at the Puerto-Rico conference during one of the “problem sessions.” Toby Gee seems to remember that I had some half-baked heuristics that predicted both A and B below, but perhaps one of my readers has a more sophisticated suggestion, or even a similarly wild guess (or even a similarly contradictory collection of guesses).

Fix a pair of distinct odd primes p and l. Now consider a random normalized Hecke eigenform f = \sum a_n q^n \in \overline{\mathbf{Z}}[[q]] of weight two and level \Gamma_0(N), where N is squarefree and prime to both p and l. Now take the Hecke eigenvalue a_{l} and reduce it modulo a random prime \mathfrak{p} above p.

Question: As one ranges over all newforms of conductor < X, what is the resulting distribution — if it even exists — of a_l \in \overline{\mathbf{F}}_p?

Let me not be too precise about what random means — for example, there's a question about whether one wants to normalize in some way for Galois conjugates of eigenforms, but none of this will really matter for the very weak questions I have in mind. For example, consider the following two possibilities:

  1. A: The element a_l lies in \mathbf{F}_p 100% of the time.
  2. B: The element a_l lies in \overline{\mathbf{F}}_p \setminus \mathbf{F}_p 100% of the time.

Here by 100% I mean as a proportion of all forms as X \rightarrow \infty, although I confess that I can’t even rule out the extreme version of A where 100% really means every single form.

The specifications on the level are designed to rule out some “trivial” examples. At level \Gamma_0(N) with N squarefree there will be no CM-forms, which is one cheap way to generate large coefficient fields. The level also prevents twisting by characters (an even cheaper trick). Finally, in general, it is possible to generate large coefficient fields for Galois representations by imposing a local condition at some auxiliary prime q. For example, one can impose some supercuspidal condition so that the *local* residual representation at q does not land in \mathrm{GL}_2(\mathbf{F}_{p^m}) for any m not divisible by an arbitrary fixed integer chosen in advance. However, this too is not possible if \pi_q is forced to be either unramified or special (up to unramified quadratic twist).

Note that there do exist infinitely many semi-stable modular elliptic curves, so a_l will lie in \mathbf{F}_p at least infinitely often. This disproves the “extreme” version of B, but doesn’t go very far towards disproving the asymptotic version of B. As for A, every single time you write down a normalized eigenform with coefficients in some field E \ne \mathbf{Q}, you disprove the extreme version of A for a positive density of pairs (p,l). But no finite collection of such forms can disprove A even for a single l and varying p, because there will always be (many) primes which split completely in any finite collection of number fields.
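To make the elliptic curve remark concrete, here is a minimal Python sketch. The curve y^2 + y = x^3 - x^2 (of conductor 11, so squarefree level), the function names, and the sample values of l and p are all illustrative choices of mine and not part of the discussion above. The code counts points naively over \mathbf{F}_l and uses a_l = l + 1 - \#E(\mathbf{F}_l); since a_l is an ordinary integer, its reduction modulo any prime above p automatically lands in \mathbf{F}_p.

```python
def count_points(l):
    """Number of points on y^2 + y = x^3 - x^2 over F_l, including infinity.

    The curve is an illustrative choice (conductor 11); l should be a prime
    different from 11 so that the reduction is good.
    """
    count = 1  # the point at infinity
    for x in range(l):
        rhs = (x**3 - x**2) % l
        for y in range(l):
            if (y * y + y - rhs) % l == 0:
                count += 1
    return count


def hecke_eigenvalue(l):
    """a_l = l + 1 - #E(F_l) for the weight two form attached to the curve."""
    return l + 1 - count_points(l)


def reduce_mod_p(l, p):
    """Image of a_l in F_p, represented as an integer in [0, p)."""
    return hecke_eigenvalue(l) % p


if __name__ == "__main__":
    for l in (3, 5, 7, 13):
        print(l, hecke_eigenvalue(l), reduce_mod_p(l, 5 if l != 5 else 3))
```

A single rational eigenform like this only ever contributes to the \mathbf{F}_p side of the count, which is why it says nothing about the asymptotic version of B.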

Here are four questions:

  1. Can you disprove the extreme version of A for all p and l?
  2. Can you disprove the super-extreme version of A, namely, show that for all primes p, there exists a newform of squarefree level N prime to p such that the residual representation is not defined over \mathbf{F}_p? (equivalently, replace a_l by the collection of all a_l with l prime to Np.)
  3. Can you give any heuristic that suggests that either A or B (in the weaker form) is either false or true?
  4. Do you have any guesses as to the distribution of the a_l?

Right now, as you read this, KB’s computer is churning away in sage generating some data, which will be the topic of Part II. But until then, I would like to hear your opinions/guesses. For me, I think that A is probably false, but I honestly have no feeling for B.

Posted in Mathematics

En Passant IV

My students Richard Moy and Joel Specter have uploaded their paper on partial weight one Hilbert modular forms, previously discussed here, to the arXiv.

Germany now leads the world in both soccer and perfectoid spaces.

This is a recipe for a moist tasty almond cake. Also: quick way to make dessert with stale mint oreo cookies: add brandy and mash together, then add coconut, and, voilà, instant rum balls.

SiriusXM is truly, truly disappointing. They don’t seem to understand that when presenting classical music one should always indicate who the performer is. On top of that, SiriusXM just cancelled one third of its classical stations (that would be one out of three).

Here is Existential Comics, the best comic strip I’ve found since I learnt about Dinosaur Comics from BZ.

I still go to the Bourgeois Pig to chat with Matt, but I don’t buy anything; I stop off at Intelligentsia first and get a Cortado (which they call a Gibraltar) instead.

What were the two topics which dominated the political conversation in Australia when I visited in 2011? Asylum policy and the Carbon Tax (Caahbin Tax). What are the topics today? The very same. Is it possible that Abbott could make Little Johnny H seem enlightened? Oh, talking about Australia, perhaps you would like to play the white Australia game.

This is where I like to sit when I talk to Toby about math. My math comes with the imprimatur of justice!

Posted in Waffle

A public service announcement concerning Fontaine-Mazur for GL(1)

There’s a rumour going around that results from transcendence theory are required to prove the Fontaine-Mazur conjecture for \mathrm{GL}(1). This is not correct. In Serre’s book on \ell-adic representations, he defines a p-adic representation V of a global Galois group G_F to be rational if it is unramified outside finitely many primes and if the characteristic polynomials of \mathrm{Frob}_{\lambda} actually all lie in some fixed number field E rather than over \mathbf{Q}_p. Certainly being rational is a consequence of occurring inside the etale cohomology of a smooth proper scheme X, and one might be motivated to make a conjecture in the converse direction assuming that V is absolutely irreducible. But being “rational” is just a rubbish definition (sorry Serre), a mere proxy for the correct notion of being potentially semistable at all primes dividing p (“geometric,” given the other assumptions on V). And the implication

A character \chi: G_F \rightarrow \overline{\mathbf{Q}}_p is Hodge-Tate \Rightarrow \chi is automorphic

doesn’t require any transcendence results at all. One can’t really blame Serre for not coming up with the Fontaine-Mazur conjecture in 1968. The reason for this confusion seems to be the proof of the Theorem stated on III-20 of Serre’s book on abelian \ell-adic representations (with the modifications noted in the updated version of Serre’s book), namely:

Theorem (Serre-Waldschmidt): If V is an abelian representation of G_F which is rational, then V is locally algebraic.

This argument (even for the case when F is a composite of quadratic fields, the case considered by Serre) requires some transcendence theory. But the implication V is abelian and Hodge-Tate \Rightarrow V is locally algebraic (also proved in Serre) only uses Tate-era p-adic Hodge theory. The other ingredients for Fontaine-Mazur are as follows: First, there is the classification of algebraic Hecke characters (due to Weil, I think). A key point here is that the algebraicity forces the unit group to be annihilated by some element in the integral group ring. However, the representation V occurs in \mathcal{O}^{\times}_F \otimes \mathbf{C} with dimension \dim(V|c = 1) if V is non-trivial, so this forces the existence of representations V of G on which c = - 1, corresponding to CM subfields. The final step is the theory of CM abelian varieties. So although the result is non-trivial, you can rest assured, gentle reader, that you are not secretly invoking subtle transcendence results every time you twist an automorphic Galois representation by a Hodge-Tate character and claim that the result is still automorphic.

Posted in Mathematics

Report from Luminy

For how long has Luminy been infested with bloodthirsty mosquitoes? The combination of mosquitoes in my room with the fact that my bed was 6′ long with a completely unnecessary headboard (which meant that I had to sleep on an angle with my ankles exposed) did not end well.

As for the math, there were plenty of interesting talks, most of which I will not discuss here. Jan Nekovar gave a nice talk explaining how one could prove that the cohomology of compact Shimura varieties of \mathrm{GL}(2)-type is semi-simple. For concreteness, imagine that X is a Hilbert modular surface associated to a real quadratic field F. Suppose that \rho is the Galois representation associated to a cuspidal Hilbert modular form of parallel weight two. Then the Langlands-Kottwitz method shows that the semi-simplification of \rho^{\otimes 2} should occur inside H^2. On the other hand, this argument only ever deals with the trace of Hecke operators and so cannot say anything about semi-simplicity. Nekovar’s argument is to use the Eichler-Shimura relation applied to partial Hecke operators for primes which split completely in both F and the corresponding reflex field. The point is that these operators satisfy a quadratic relation (with distinct eigenvalues for generic elements of the Galois group), and so act semi-simply on H^2 (imagine everything is compact here). Then, by pure group theory, if the image of \rho is large enough, the sheer number of such elements is enough to force semi-simplicity. It is perhaps useful to note that if V is a representation such that V^{ss} = (\rho^{\otimes 2})^{ss} and V is geometric and pure, then V should automatically be semi-simple. This follows from any number of combinations of bits of the standard conjectures, but one way to see it is that if W is geometric and of weight zero, then (by Bloch-Kato) one should have H^1_f(F,W) = 0. The relevant W in the example above is \mathrm{ad}^0(\rho). So in fact one can give an alternate proof of the theorem using the full power of modularity lifting theorems, providing one is willing to omit finitely many primes p. This is really an explanation of why Jan’s result is nice!
For example, as soon as one replaces \rho^{\otimes 2} by \rho^{\otimes n}, one has to start dealing with H^1_f(F,\mathrm{Sym}^{2n}(\rho)(-n)), which gets a little tricky.

These fellows turned up for the Bouillabaisse.

Ana Caraiani talked about a very nice result concerning the sign of Galois representations associated to torsion classes for \mathrm{GL}(n)/F^{+} for totally real fields F^{+} (this was joint work with Bao Le Hung). Namely, the trace of any complex conjugation lies in \{-1,0,1\} (in fact, the result identified the exact characteristic polynomial, which is more general in small characteristic). The basic strategy is to follow Scholze’s construction and “reduce” the problem to the case of essentially self-dual forms, where one has previous results by Taylor, Bellaïche-Chenevier, and Taïbi. However, there is a problem, which is that the regular self-dual automorphic forms one finds congruences with need not be globally irreducible, and perhaps not even cuspidal. Suppose one can show that they decompose into an isobaric sum \pi = \boxplus \pi_i where the \pi_i are self-dual. One runs into problems if too many of the \pi_i are of dimension n_i with n_i odd. However, by considering the weights, only one of the \pi_i can be odd, because otherwise the Hodge-Tate weight zero would occur with multiplicity which would violate the fact that \pi is regular. There is still something to check for the even n_i also, because previous results required some sign assumption on the character \eta such that \pi = \pi^{\vee} \eta. I believe that even getting to this point required a further assumption on the torsion class not coming from the boundary. In the boundary case, there was also a reduction/induction case which also required careful handling of “odd” dimensional pieces, and some computation of a restriction of Hecke operators from the relevant Parabolic/Levi which required a sign to come out correctly. One clever technical step was working with the cohomology of adelic quotients G(F)\backslash G(\mathbf{A})/UK where K is a maximal compact of G(\mathbf{R}) rather than the connected component K^0. 
The advantage of this is that, in the odd dimensional case, this pins down the trace of complex conjugation to be +1 rather than \pm 1. This is clear when n = 1, and that one should expect it to be true follows for n odd by taking determinants.

Peter Scholze gave a talk on his new functor. The basic elements in the construction of this functor are as follows. The Gross-Hopkins period map allows one to view the (infinite level) Lubin-Tate tower as a \mathrm{GL}_n(F)-torsor over the (D^{\times} Severi-Brauer variety) \mathbf{P}^{n-1}_{\mathbf{C}_p}. So, given an admissible representation \pi, one can form the “local system” \mathscr{F}_{\pi} on the base, and then take its cohomology. The key technical point of this construction is to show that the result is admissible for D^{\times}, which amounts to proving finiteness of K-invariants for suitable compact open K of D^{\times}. The first step is to pull back to the (lowest level) part of the Lubin-Tate tower, which one can do because the GH map splits. Now the map from infinite level to the base of the Lubin-Tate tower is really a \mathrm{GL}_n(\mathcal{O}_F)-torsor, so one only has to consider the restriction of \pi to \mathrm{GL}_n(\mathcal{O}_F). But then using the admissibility of \pi, one can look instead at the regular representation of \mathrm{GL}_n(\mathcal{O}_F). Now, by some sort of Shapiro’s Lemma, one can pull everything back up to infinite level. At infinite level, however, one can replace the Lubin-Tate space by the corresponding Drinfeld tower. Now taking K-invariants is something that is “easy” to do, because there is an action of K on the space, and the quotient by K is some sufficiently nice object for which one has (again by Peter) some nice finiteness theorems for cohomology. I should probably have mentioned that at some point we are working with coefficients in \mathcal{O}^+/p, i.e. in the almost world. The main application in the talk was to show that when one patches completed cohomology (a la CEGGPS), then one can recover the Galois representation from the result. This essentially amounts to showing that when one patches together suitable admissible \pi_i, one can also patch the functor. 
This requires more than admissibility of the functor, but some sort of “uniform” admissibility (which is always required for patching). I think the key point here is that if \pi_i is something patched with a group of diamond operators \Delta, then \pi_i has a filtration by |\Delta| copies of the original \pi, and so \mathscr{F}_{\pi_i} has a corresponding uniformly bounded filtration by \mathscr{F}_{\pi}, and so H^{n-1}(\mathbf{P}^{n-1}_{\mathbf{C}_p},\mathscr{F}_{\pi})^K has length at most |\Delta| times the corresponding length for the (fixed for all time) version for \pi. On the other hand, Peter instead pulled out a new piece of kit by patching using ultra-filters. My own feeling about logic is that it is never really necessary to prove anything, and I think PS agreed that it wasn’t strictly required for this particular application. Now I understand that my prejudice may not be justified (for example, it is probably hard to prove various identities concerning orbital integrals in small characteristic directly), but I think it applies in this case. Plus, as a purely expositional remark, if you are going to whip out ultrafilters during a number theory talk then everyone is just going to talk about ultrafilters rather than the beautiful construction!

Posted in Mathematics, Uncategorized

An Obvious Claim

It’s been a while since I saw Serre’s “how to write mathematics badly” lecture, but I’m pretty sure there would have been something about the dangers of using the word “obvious.” After all, if something really is obvious, then it shouldn’t be too difficult to explain why. It is especially embarrassing when someone asks you to clarify a remark/claim in one of your papers which you claim is “obvious” and you find yourself having no idea what the implicit argument was supposed to be. Such a thing happened recently to me, when Toby asked me to explain why the following was true:

Claim: Let N \equiv 3 \mod 4 be prime, and let \epsilon be the fundamental unit of K = \mathbf{Q}(\sqrt{N}). Then \epsilon = a + b \sqrt{N} where a is even and b is odd.

Proof of Claim: Between Toby, Kevin, and myself, we managed to come up with the argument below, following a suggestion of Rebecca Bellovin: It’s easy enough to see (obvious) that a and b are integers and N(\epsilon) = 1. Hence, it suffices to rule out the case that b is even and a is odd. Write a^2 - N b^2 = 1. It follows that a^2 \equiv 1 \mod N, and since N is prime, that a \equiv \pm 1 \mod N. Assuming that a is odd, write a = 2NA \pm 1, and b = 2B. Then the equation above becomes

A(NA \pm 1) = B^2.

Without loss of generality, assume that A is positive. Then this equation implies that A and NA \pm 1 are squares, say A = d^2 and NA \pm 1 = c^2. But then

c^2 - N d^2 = (NA \pm 1) - N A = \pm 1,

and hence \eta = c + d \sqrt{N} is a (smaller) unit (in fact, \eta^2 = \pm \epsilon), contradicting the assumption that \epsilon was a fundamental unit. \quad \square
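As a sanity check on the claim (not a substitute for the descent argument), here is a small brute-force sketch in Python. It relies on the standard fact that for N \equiv 3 \mod 4 the ring of integers is \mathbf{Z}[\sqrt{N}], so the fundamental unit is a + b \sqrt{N} with b \geq 1 minimal such that a^2 - N b^2 = \pm 1. The function name, the search bound, and the list of primes are my own choices for illustration.

```python
from math import isqrt


def fundamental_unit(N, max_b=10**6):
    """Return (a, b, norm) with a + b*sqrt(N) the fundamental unit of Z[sqrt(N)].

    Searches for the smallest b >= 1 such that N*b^2 + norm is a perfect
    square for norm in {-1, +1}. Intended for N squarefree with N = 2 or 3
    mod 4, so that Z[sqrt(N)] is the full ring of integers.
    """
    for b in range(1, max_b + 1):
        for norm in (-1, 1):
            m = N * b * b + norm
            a = isqrt(m)
            if a * a == m:
                return a, b, norm
    raise ValueError("no unit found below the search bound")


# Primes congruent to 3 mod 4 below 100 (hard-coded for the illustration).
PRIMES_3_MOD_4 = [3, 7, 11, 19, 23, 31, 43, 47, 59, 67, 71, 79, 83]

if __name__ == "__main__":
    for N in PRIMES_3_MOD_4:
        a, b, norm = fundamental_unit(N)
        # The claim: norm is +1, a is even, b is odd.
        print(N, a, b, norm, a % 2 == 0 and b % 2 == 1)
```

For example, N = 7 gives 8 + 3 \sqrt{7}, with a even and b odd as claimed.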

This argument is really a 2-descent on the unit group. As Kevin remarked: “So this is a descent argument in a completely elementary situation which I don’t think I’d ever seen before and which proves something that I don’t think I knew … What’s ridiculous is that if the equation had been a cubic and we were after rational solutions then I would have instantly leapt on descent as one of my main tools for attacking it :-/ We live and learn!”

So what was I thinking when I wrote the paper? The actual claim in the paper is this: “If H' is the (2 part of the) strict ray class group of K of conductor (2), then H = H', where H is the (2 part of the) class group.” The “argument” is as follows:

The proof of [the above] is even more straightforward: it follows immediately from a consideration of the units in \mathcal{O}^{\times}_K and the exact sequence

\mathcal{O}^{\times}_K \rightarrow (\mathcal{O}_K/2 \mathcal{O}_K)^{\times} \rightarrow H' \rightarrow H \rightarrow 0.

Well, at least the word obvious was only implicit here. I could try to place the blame on my co-author Matt here, but honestly the phrasing of the claim does sound a little like something I would write.

Next up: a report from Luminy!

Posted in Mathematics

Huuuuuge piles of cash

As widely reported today, the first of the “Breakthrough” prizes in mathematics have been announced. Leaving aside the question as to whether such awards are sensible (Persiflage is more sympathetic to capitalist principles than your average pinko marxist mathematician), I think we should all be happy that the inaugural awardees are beyond reproach, something which could not have been taken for granted. So well done to whoever was on the prize committee.

Of the awardees that I know personally, my impression is that, at least in part, they would feel mildly embarrassed by the large amount of cash involved, even if they (presumably) feel gratified by the deserved acknowledgement of their contributions to mathematics. I do hear, however, that a bottle of Chateau d’Yquem 1967 does wonders to wash away any last remaining vestiges of embarrassment, and I am more than willing to help out with the consumption of said beverage if required.

Posted in Mathematics, Politics

There are non-liftable weight one forms modulo p for any p

Let p be any prime. In this post, we show that there is an integer N prime to p such that H^1(X_1(N),\omega_{\mathbf{Z}}) has a torsion class of order p. Almost equivalently, there exists a Katz modular form of level N and weight one over \mathbf{F}_p which does not lift to characteristic zero. We shall give two different arguments. The first argument will have the virtue that the torsion class is non-trivial after localization at a maximal ideal \mathfrak{m} which is new of level N. The second argument, in contrast, will produce torsion classes at fairly explicit levels. Neither proof, unfortunately, implies the existence of interesting Galois representations unramified at p with image containing \mathrm{SL}_2(\mathbf{F}_p). Rather, the classes will come from deformations of characteristic zero classes. (This post is an elaboration of my comment here.)

The first argument: Let K/\mathbf{Q} be an imaginary cubic extension unramified at p with Galois closure L/\mathbf{Q} with Galois group S_3. There is a corresponding Galois representation:

\rho: G_{\mathbf{Q}} \rightarrow  \mathrm{Gal}(L/\mathbf{Q}) = S_3  \rightarrow  \mathrm{GL}_2(\mathbf{Q}_p).

This representation is modular. Suppose for convenience that p > 3. Associated to \rho is an absolutely irreducible residual representation \overline{\rho}. Let R denote the corresponding universal unramified deformation ring. The only characteristic zero deformations are dihedral. Let R^{\mathrm{dh}} denote the corresponding universal unramified dihedral deformation ring. It’s easy to identify this ring explicitly; it is

R^{\mathrm{dh}} = \mathbf{Z}_p[C_E \otimes \mathbf{Z}_p],

where C_E is the class group of the imaginary quadratic subfield E of L. The ring R will fail to be \mathbf{Z}_p-flat exactly when R \ne  R^{\mathrm{dh}}. Fortunately, this can be determined purely from the reduced tangent space of R. Note that

\mathrm{ad}^0(\rho) \simeq \rho \oplus \eta,

where \eta is the quadratic character of E/\mathbf{Q}. The reduced tangent space of R^{\mathrm{dh}} is the Bloch–Kato Selmer group H^1_f(\mathbf{Q},\overline{\eta}), where H^1_f denotes the subgroup of cohomology classes which are unramified everywhere. So it all comes down to finding K so that H^1_f(\mathbf{Q},\overline{\rho}) is non-zero. However, an elementary argument using inflation-restriction shows that this is equivalent to showing that the class number h_K of K is divisible by p. So we are done provided that we can find a suitable K with class number divisible by p. (I should mention, of course, that we are using the theorem that R = \mathbf{T}_{\mathfrak{m}} which was proved by David Geraghty and me.) The last step follows from the lemma below; the argument is essentially taken from this paper of Bilu–Luca.

Lemma: Fix a prime p > 3. There exists an imaginary cubic field K/\mathbf{Q} of discriminant prime to p and class number divisible by p.

Proof: Consider the field K = \mathbf{Q}(\theta), where

(\theta^2 + 1)(\theta - t^p + 1) - 1 = 0,

and t is an element of \mathbf{Q} to be chosen later. Note that \theta^2 + 1 is manifestly a unit in K. We may compute that

(\theta^2 + \theta + 1) \theta = (1 + \theta^2) t^p.

Since the ideal (\theta,\theta^2 + \theta +  1) = (1) is trivial and \theta^2 + 1 is a unit, it follows that (\theta) = \mathfrak{a}^p for some ideal \mathfrak{a}. We shall show that, for a suitably chosen t, the class of \mathfrak{a} is non-trivial in the class group. If \mathfrak{a} is principal, then, up to a unit, \theta is a pth power. On the other hand, the rank of the unit group of K is one, and \theta^2 + 1 is a unit. Hence, it suffices to choose a t such that:

  1. \theta^2 + 1 generates a subgroup of \mathcal{O}^{\times}_K of index prime to p. Equivalently, \theta^2 + 1 is not a perfect pth power in K.
  2. None of the elements \theta (\theta^2 + 1)^i for i = 0,\ldots,p-1 is a perfect pth power in K.
  3. The polynomial defining K is irreducible.
  4. The discriminant of K is not a square.
  5. The discriminant of K is prime to p.

By working over the function field \mathbf{Q}(t) instead of \mathbf{Q}, one finds that the first four conditions hold for all t \in \mathbf{Q} outside a thin set. (The discriminant \Delta is always negative, so the signature of the field is always (1,1).) On the other hand, the discriminant of the defining polynomial is -3 \mod t, so if one (for example) takes t to be an integer divisible by p then the discriminant will be prime to p. Note that the set of integers divisible by p will contain elements not in any thin set, because the number of integral points of height at most H in a thin set is o(H).
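The two computations used in this proof, the rearranged identity and the congruence for the discriminant, are easy to check by machine. Here is a minimal sketch in plain Python; the helper names and the sample values of t and p are my own, and the discriminant is computed from the standard closed formula for a cubic. Polynomials are coefficient lists with the constant term first.

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (constant first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def defining_poly(t, p):
    """Coefficients of (x^2 + 1)(x - t^p + 1) - 1."""
    s = t**p
    f = poly_mul([1, 0, 1], [1 - s, 1])
    f[0] -= 1
    return f


def rearranged_poly(t, p):
    """Coefficients of (x^2 + x + 1) x - (1 + x^2) t^p."""
    s = t**p
    lhs = poly_mul([1, 1, 1], [0, 1])
    rhs = [s, 0, s, 0]
    return [u - v for u, v in zip(lhs, rhs)]


def cubic_discriminant(f):
    """Discriminant of a x^3 + b x^2 + c x + d, with f = [d, c, b, a]."""
    d, c, b, a = f
    return (18 * a * b * c * d - 4 * b**3 * d + b**2 * c**2
            - 4 * a * c**3 - 27 * a**2 * d**2)


if __name__ == "__main__":
    for t, p in [(5, 5), (10, 5), (7, 7), (-14, 7)]:
        f = defining_poly(t, p)
        disc = cubic_discriminant(f)
        # Same polynomial, discriminant = -3 mod t, and negative discriminant.
        print(t, p, f == rearranged_poly(t, p), (disc + 3) % t == 0, disc < 0)
```

In each sample t is divisible by p, so the discriminant is congruent to -3 modulo p and hence prime to p for p > 3. This only checks the polynomial discriminant; the other conditions in the list still need the thin set argument.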

Second Argument: Let E = \mathbf{Q}(\sqrt{-23}), and let L = E[\theta]/(\theta^3 - \theta + 1) be the Hilbert class field of E. There is a weight one modular form of level \Gamma_1(23) and quadratic character corresponding to the Galois representation:

\rho: \mathrm{Gal}(L/\mathbf{Q}) \rightarrow \mathrm{GL}_2(\mathbf{Q}_p).

Lemma: Let p > 3. Let q = x^2 + 23 y^2 be a prime such that q \equiv 1 \mod p. Equivalently, let q be a prime which splits completely in L(\zeta_p). Then

\# H^1(X(\Gamma_1(23) \cap \Gamma_0(q)),\omega)^{\mathrm{tors}}

is divisible by p. More generally, for any prime q, the quantity above is divisible by any prime divisor of a^2_q - (1+q)^2, where a_{\ell} \in \{2,0,-1\} for a prime \ell \ne 23 is the coefficient of q^{\ell} in q \prod (1-q^n) (1 - q^{23 n}).

Proof: This follows from “level–raising” in characteristic p for weight one forms. Under the hypothesis that p divides a^2_q - (1+q)^2, we find that there is more cohomology (over \mathbf{Z}_p) in level \Gamma_1(23) \cap \Gamma_0(q) than is accounted for by oldforms. Assuming that there is no torsion, this is inconsistent with the fact that there are no newforms in characteristic zero, because weight one forms cannot be Steinberg at any place. (The easiest way to see this is that the eigenvalue of U_q would have to be non-integral; it also follows on the Galois side from local-global compatibility, but this is overkill.) Note that level–raising in this context does not follow from classical level–raising; for the details I refer you to my fifth lecture in Barbados on non-minimal modularity lifting theorems in weight one.
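For anyone who wants to see the numbers, here is a short Python sketch that expands q \prod (1-q^n)(1-q^{23n}) and reads off the coefficients a_{\ell}. The function names and the truncation bound are my own choices. As a sample case, q = 59 = 6^2 + 23 \cdot 1^2 has a_{59} = 2, and 59 \equiv 1 \mod 29, so p = 29 divides a^2_q - (1+q)^2 = 4 - 3600.

```python
def eta_product_coefficients(limit):
    """Coefficients c[0..limit] of q * prod_{n>=1} (1 - q^n)(1 - q^(23 n))."""
    series = [0] * (limit + 1)
    series[1] = 1  # the leading factor q
    for step in (1, 23):
        for n in range(1, limit + 1):
            m = step * n
            if m > limit:
                break
            # Multiply in place by (1 - q^m), from high degree to low so that
            # series[k - m] still holds the old value when it is used.
            for k in range(limit, m - 1, -1):
                series[k] -= series[k - m]
    return series


def is_prime(n):
    """Trial division; fine for the small range used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


if __name__ == "__main__":
    coeffs = eta_product_coefficients(200)
    for ell in range(2, 200):
        if is_prime(ell) and ell != 23:
            assert coeffs[ell] in (2, 0, -1)
    q, p = 59, 29
    print(coeffs[q], (coeffs[q] ** 2 - (1 + q) ** 2) % p)
```

The prime 23 is skipped in the loop because the level is ramified there and the coefficient of q^{23} is 1.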

Posted in Mathematics