Counting solutions to a_p = λ

We know that the eigenvalue of T_2 on \Delta is -24. Are there any other level one cusp forms with the same Hecke eigenvalue? Maeda’s conjecture in its strongest form certainly implies that there are not. But what can one prove along these lines? Conjecturally, one would certainly predict the following:

Conjecture: Fix a tame level N prime to p. If \lambda \ne 0, there are finitely many eigenforms of level N and arbitrary weight such that a_p = \lambda. If \lambda = 0, there are finitely many eigenforms with the additional condition that they do not have CM by a quadratic field in which p is inert.

I have no idea how to prove this conjecture. If one counts the number of such forms of weight \le X, then the trivial bound for eigenforms with a_p = \lambda is O(X^2). When I visited Princeton a few weeks ago, Naser Sardari, a student of Sarnak, showed me a short preprint he is writing which improves this bound by a power saving (additionally, it gives a power saving for each individual weight as well). The most interesting case of this result is when \lambda = 0, but today I want to talk about the much easier case when \lambda \ne 0, where, via some p-adic tricks, one can obtain a substantial improvement on the trivial bound. Let’s start from the following:

Proposition I: Let S_{\lambda}(X) denote the number of cuspidal eigenforms of level N and weight \le X such that a_p = \lambda. Assume that \lambda \ne 0. Then

S_{\lambda}(X) = O(X).

Proof: Since \lambda \ne 0, the p-adic valuation of \lambda is finite. However, all forms with bounded slope belong to one of finitely many Coleman families, so the number of such forms in any weight is bounded. Using Wan’s explicit results, one can even give an explicit bound here that depends only on N, p, and the valuation of \lambda.

The point of this post, however, is to give an improvement on this bound.

Proposition II: Let S_{\lambda}(X) denote the number of cuspidal eigenforms of level N and weight \le X such that a_p = \lambda. Assume that \lambda \ne 0. Then, as X \rightarrow \infty,

S_{\lambda}(X) \ll_{\lambda} \log \log \log \log \log \log \log X.

The argument will (obviously) allow for an arbitrary number of logs. But then the statement would become more cumbersome.

Proof: As in the proof of the previous result, we may reduce to the case where we are considering a single Coleman family \mathcal{F}. Over this family, the function U_p is continuous, and hence so is U_p(U_p - \lambda). More importantly, over a small enough disc, it is an Iwasawa function. Let \Sigma denote an infinite set of integral weights such that, for the relevant points of \mathcal{F}, we have T_p = \lambda, or

U_p(U_p - \lambda) = - p^{k-1}.
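For completeness, here is a sketch of where this equation comes from, assuming trivial nebentypus and the usual relation between T_p on a classical eigenform of weight k and U_p on its p-stabilization. The symbols \alpha and \beta are introduced here for illustration: \alpha is the U_p-eigenvalue and \beta is the other root of the Hecke polynomial, so that

\alpha + \beta = a_p, \qquad \alpha \beta = p^{k-1}.

If a_p = \lambda, then

\alpha(\alpha - \lambda) = \alpha^2 - (\alpha + \beta) \alpha = - \alpha \beta = - p^{k-1}.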

If s is a limit point of \Sigma, then certainly U_p(U_p - \lambda) will vanish at s. Since this function is a non-zero bounded function on a disc, it has only finitely many zeros, and so the set of weights \Sigma will have only finitely many limit points. Thus, we may reduce to the case where the set of weights has a single limit point. In particular, if S_{\lambda}(X) is not bounded, we may imagine that the set \Sigma consists of a sequence of integers (which we may assume to be increasing in the Archimedean norm): k_0, k_1, k_2, \ldots which converge p-adically to s, and, at the relevant point of \mathcal{F}, correspond to an eigenform which satisfies the equation

U_p(U_p - \lambda)(k_i) = - p^{k_i - 1}.

Around a zero s, any Iwasawa function has an asymptotic expansion of the form

F(s + \epsilon) \simeq A \cdot \epsilon^m + \ldots

where the LHS has the same valuation as the leading term of the RHS for sufficiently small \epsilon. If F = U_p(U_p - \lambda), we deduce, by comparing valuations on both sides of the previous equation, that for sufficiently large n,

v(s - k_n) = r k_n + c

for some r = 1/m > 0. Since these valuations are strictly increasing in n, this implies that v(k_{n+1} - k_n) = v(s - k_n), and hence also that

k_{n+1} - k_n > C p^{r k_n}

for some constant C > 0. This iterated exponential growth proves the result. QED.
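To get a feel for how sparse such a sequence is, here is a toy model in Python. It is only a sketch under assumptions that are not part of the argument: the inequality is replaced by the equality k_{n+1} = k_n + p^{k_n}, with the illustrative choices p = 2, r = 1, C = 1 and k_0 = 1, and the function name count_terms is made up for this example. A genuine sequence grows at least as fast as the model with the same constants, so it has no more terms below X.

```python
def count_terms(X, p=2, k0=1):
    """Count the terms k_n <= X of the toy sequence k_{n+1} = k_n + p**k_n.

    Toy model only: p, k0 and the exact recursion are illustrative assumptions.
    """
    count = 0
    k = k0
    while k <= X:
        count += 1
        # If k exceeds the bit length of X then p**k >= 2**k > X, so the
        # next term is already past X; stop before building a huge integer.
        if k > X.bit_length():
            break
        k = k + p**k
    return count


if __name__ == "__main__":
    for exponent in (3, 6, 100, 600):
        print(exponent, count_terms(10**exponent))
```

In this model the terms are 1, 3, 11, 2059, and then a number with more than 600 decimal digits, which is the tower-like growth that the iterated logarithms in Proposition II are measuring.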

The argument also shows that if the set \Sigma is infinite, the limit roots of U_p - \lambda = 0 will be transcendental Liouville numbers, which seems unlikely. The result also applies if one replaces \lambda by a sufficiently continuous function without zeros, say a_2 = 24(1 + 2(k -12)^2). On the other hand, I don’t think these analytic methods will ever be enough to prove the conjectural bound, which is O(1).

Posted in Mathematics | 2 Comments


The space of classical modular cuspforms of level one and weight 24 has dimension two — the smallest weight for which the dimension is not zero or one. What can we say about the Hecke algebra acting on this space without computing it?

Formally, the Hecke algebra \mathbf{T} is a rank two \mathbf{Z}-algebra, which is either an order in the ring of integers of a real quadratic field, or a subring of \mathbf{Z} \oplus \mathbf{Z}. Let’s investigate the completion of this algebra at various primes p.

Let’s first consider the prime p =23. The curve X_0(23) has genus two, and the corresponding Hecke algebra in weight two is \mathbf{Z}[\phi], where \phi is the Golden Ratio. The prime p =23 does not split in this field, and hence modulo p there is a pair of conjugate eigenforms with coefficients in \mathbf{F}_{p^2}. Multiplying by the Hasse invariant, we see that this eigenform also occurs at level one and weight 24 over \mathbf{F}_{p}. It follows that:

\mathbf{T} \otimes \mathbf{Z}_{23} = W(\mathbf{F}_{23^2}).

In particular, \mathbf{T} \otimes \mathbf{Q} = \mathbf{Q}(\sqrt{D}) for some square-free integer D > 0.

Now let us consider primes p < 23. Any Galois representation modulo such a prime will occur — possibly up to twist — in lower weight. Yet all the spaces in lower weight have dimension at most one, and hence it follows that the residue fields of \mathbf{T} are all of the form \mathbf{F}_p. Suppose further that 5 \le p < 23. Then, using theta operators, we may find two distinct eigenforms in weight 24, from which it follows that \mathbf{T} has two distinct residue fields of characteristic p, and so, for 5 \le p < 23, we have:

\mathbf{T} \otimes \mathbf{Z}_p = \mathbf{Z}_p \oplus \mathbf{Z}_p.

One expects at level one that a_2(f) always generates the Hecke field. This is still a conjecture, but we may deduce this unconditionally in weight 24 because the space of cuspforms has dimension two, and so this follows automatically from the Sturm bound! Hence we may write:

\mathbf{T} = \mathbf{Z}[a_2(f)], \quad a_2(f) = \displaystyle{\frac{a + b \sqrt{D}}{2} \in \mathbf{Z} \left[ \frac{1+\sqrt{D}}{2} \right]}

where b \ne 0. Even better, using Hatada’s Theorem — giving congruences for a_2 and a_3 for eigenforms of level one modulo 8 and 3 respectively — we may write

a_2(f) = 12(a + b \sqrt{D}), \quad a,b \in \mathbf{Z}

where b \ne 0. This gives an upper bound on D in light of the Deligne bound |a_2| \le 2 \cdot 2^{23/2}. More precisely, we obtain the bound b^2 D < 2^{27}/24^2, and hence that D < 233017.
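Spelled out, the bound comes from applying Deligne's inequality to both conjugates of a_2(f). This is a sketch, writing a_2'(f) = 12(a - b \sqrt{D}) for the conjugate eigenvalue (notation introduced here):

24 |b| \sqrt{D} = |a_2(f) - a_2'(f)| \le 2 \cdot 2^{25/2} = 2^{27/2},

so that

b^2 D \le 2^{27}/24^2 = 2^{21}/9 \approx 233016.9.

The same estimate applied to the sum a_2(f) + a_2'(f) = 24a gives |a| \le 2^{27/2}/24.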

Let’s now think more carefully about p = 2 and 3. For these primes, there will be a unique Coleman family of slope v(-24) = 3 for p =2 and v(252) = 2 for p = 3. I can’t quite see a pure thought way of proving this, but at least this would be a consequence of the strong form of the GM-conjecture as predicted by Buzzard. So we should expect that, in these cases

\mathbf{T} \otimes \mathbf{Z}_p \hookrightarrow \mathbf{Z}_p \oplus \mathbf{Z}_p.

In addition to congruences for small primes, there will also be congruences between a cuspidal eigenform and the Eisenstein series modulo the primes dividing the numerator of B_{24}, where

\displaystyle{B_{24} = \frac{-1}{2 \cdot 3 \cdot 5 \cdot 7 \cdot 13} \times 103 \times 2294797.}
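The value of B_{24} is easy to confirm by machine. The following Python sketch uses the standard recurrence for Bernoulli numbers (with the convention B_1 = -1/2) and exact rational arithmetic; the function name bernoulli is just a local helper, not a library call.

```python
from fractions import Fraction
from math import comb


def bernoulli(n):
    """Return B_n exactly, using sum_{j<=m} C(m+1, j) B_j = 0 for m >= 1."""
    B = [Fraction(0)] * (n + 1)
    B[0] = Fraction(1)
    for m in range(1, n + 1):
        B[m] = -sum(comb(m + 1, j) * B[j] for j in range(m)) / (m + 1)
    return B[n]


if __name__ == "__main__":
    b24 = bernoulli(24)
    print(b24.numerator, b24.denominator)
    # Compare with the factorization displayed above.
    print(b24 == Fraction(-103 * 2294797, 2 * 3 * 5 * 7 * 13))
```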

I claim that these primes will also have to split in \mathbf{T}. For example, it is impossible for b to be divisible by 2294797, because that would violate the inequality on b^2 D above, and hence it follows that p = 2294797 must also split in \mathbf{T} \otimes \mathbf{Q}. The same argument works for p = 103 having ruled out some very small D. To summarize, we have the following:

The primes 5 \le p < 23, p = 103, 2294797 split in K = \mathbf{Q} (\sqrt{D}), but p = 23 does not split, and D < 233017. Moreover, we expect that 2 and 3 also split.

This is enough to determine D completely up to 72 possibilities, and 9 with the unproven assumption at 2 and 3. On the other hand, all of these D are quite large (the smallest are 3251 and 15791 respectively), which forces b to be very small. But we also have the congruence

12(a + b \sqrt{D}) \equiv 1 + 2^{23} \mod 2294797.

For the remaining D, we can determine, with |b| satisfying the required inequality, whether there exists such a congruence with |a| \le 2^{27/2}/24 \sim 483. A simple check shows that there is a unique solution (with or without the assumption at 2 and 3), and hence, by (something close to) pure thought, we have shown that D = 144169, and moreover (using Deligne’s bound again) that

a_2(f) = 12(45 \pm \sqrt{144169}), \qquad \mathbf{T} = \mathbf{Z}[12 \sqrt{144169}].
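The search described above can be sketched in Python as follows. This is one possible reading of the conditions, not necessarily the computation that was actually run: "split" is taken to mean that the Legendre symbol (D/p) is 1, "does not split" at 23 to mean that it is -1, and the congruence is squared so that no modular square root is needed. The helper names legendre, squarefree, candidates and solutions are invented for this example, and the number of candidates it prints is not guaranteed to reproduce the 72 quoted above, since that depends on exactly which conditions are imposed.

```python
from math import isqrt

P_BIG = 2294797                              # large prime factor of the numerator of B_24
SPLIT = [5, 7, 11, 13, 17, 19, 103, P_BIG]   # primes required to split in Q(sqrt(D))
INERT = 23                                   # prime required not to split
BOUND = 2**27 // 24**2 + 1                   # 233017, so that b^2 D < BOUND


def legendre(n, p):
    """Legendre symbol (n/p) for an odd prime p, via Euler's criterion."""
    r = pow(n % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def squarefree(n):
    return all(n % (d * d) for d in range(2, isqrt(n) + 1))


def candidates():
    """Square-free D with 1 < D < BOUND satisfying the local conditions."""
    return [D for D in range(2, BOUND)
            if legendre(D, INERT) == -1
            and all(legendre(D, p) == 1 for p in SPLIT)
            and squarefree(D)]


def solutions():
    """Triples (a, b, D), b >= 1, with 12(a + b sqrt(D)) = 1 + 2^23 modulo a prime above P_BIG."""
    target = 1 + 2**23
    a_max = isqrt(2**27 // 24**2)            # |a| <= 2^(27/2) / 24
    found = []
    for D in candidates():
        b = 1
        while b * b * D < BOUND:
            for a in range(-a_max, a_max + 1):
                # Squared form of 12 b sqrt(D) = target - 12 a (mod P_BIG);
                # both signs of sqrt(D) are allowed, one for each prime above P_BIG.
                if (pow(target - 12 * a, 2, P_BIG) - 144 * b * b * D) % P_BIG == 0:
                    found.append((a, b, D))
            b += 1
    return found


if __name__ == "__main__":
    print(len(candidates()))
    print(solutions())
```

Restricting to b \ge 1 loses nothing, since changing the sign of b just swaps the two conjugate eigenvalues.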

One can indeed check this is the case directly, if you like. Curiously enough, this Hecke eigenvalue is quite close to the Deligne bound — the probability it is (in absolute value) this big is, assuming a Sato-Tate distribution, slightly under 5%.
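The Sato-Tate estimate is a one-line integral. As a sketch, assume the normalized eigenvalue x = a_2 / (2 \cdot 2^{23/2}) is distributed on [-1, 1] with density (2/\pi) \sqrt{1 - x^2}; the function name sato_tate_tail is a made-up helper.

```python
from math import asin, pi, sqrt


def sato_tate_tail(t):
    """P(|x| >= t) for x on [-1, 1] with density (2/pi) * sqrt(1 - x^2)."""
    return 1 - (2 / pi) * (t * sqrt(1 - t * t) + asin(t))


a2 = 12 * (45 + sqrt(144169))        # the eigenvalue of larger absolute value
t = a2 / (2 * 2 ** (23 / 2))         # normalize by the Deligne bound
print(t, sato_tate_tail(t))
```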

Extra Credit Problem: Hack Ken Ribet’s Yelp password by using the fact that 144169 is his favorite prime number.

Posted in Mathematics | 4 Comments

How not to be wrong

I recently finished listening to Jordan’s book “How Not to Be Wrong,” and thought that I would record some of the notes I made. Unlike other reviews, Persiflage will cut through to the key aspects of the book which perhaps non-specialists may have missed.

Unfortunately, my first few notes did not record the specific time in the recording where the relevant passage occurred, so some of the earlier comments are a little more vague, because I couldn’t go back and check them more carefully.

Title: How Not to Be Wrong: The Power of Mathematical Thinking.
Author: Jordan Ellenberg.
Book Format: Pirated audio copy.

  • OK, Penguin, what have you done to Jordan? It sounds as though before the recording session began, Jordan was force-fed a greasy pizza with a couple of Prozac stuffed in the crust. I was expecting a hyperactive delivery style, but instead there is a relatively calm and measured tone you might expect on any professionally made audio book.
  • Did he just say yoked? Yes, my friends, we have here a student of Barry Mazur.
  • 2377. This is all it says in my notes. I think this was used as a number which was supposed to sound random. But I did wonder whether it had any other significance. A brief web search indicates the full phrase may have been: Moving over to complicated/shallow, you have the problem of …[computing]… the trace of Frobenius on a modular form of conductor 2377. I checked — there are no elliptic curves of conductor 2377. I think there was an opportunity missed to say 5077 instead, thus alluding to the Gross-Zagier plus Goldfeld solution to the class number problem.   Although if there was such an allusion, it may have ruined the implication of being shallow, so never mind.
  • Some reference to Galois representations being deep; unfortunately I didn’t write any further notes here. They are indeed complicated and deep.
  • The claim is made that if you cut a tuna fish sandwich you will be left with two right-angle isosceles triangles. Is this so clear? I mean, does everyone cut their tuna fish sandwiches along the diagonal?
  • Rounding Errors: the range for (I guess?) one standard deviation for some normal distribution with mean 50 is given as 46.2 and 53.7, but these numbers are not symmetric around 50.
  • Infinity of my profit comes from pastry. I liked this line.
  • 4, 21, 23, 34, 39. Repeated strings of numbers on the page are easy to read, but even Jordan is getting a little bored reading out 4, 21, 23, 34, 39 for the n-th time.
  • if your kid drew Jesus on the cross… See two comments up.
  • At this point, I should probably point out to the readers of the book that they are missing out on all the extra fancy technological gizmos that Penguin took advantage of when transferring the book from the page to audio. And by this, I mean that, in approximately 13 and a half hours of reading, we not only have Jordan reading out the text of the book, we are also treated to exactly one such extra, namely, the first 9 notes of Beethoven’s Ode to Joy as played on what appears to be an 8-key child’s keyboard.
  • Ouroboric? Is that really how you pronounce that? It doesn’t seem consistent with the OED’s pronunciation of Ouroboros. Hmmm, but on the other hand, gives something similar to what Jordan says…
  • How Many States should one have expected Nate Silver to get wrong? This might have been another opportunity to mention how the expectation is not the “expected” answer. Presumably, one would expect a high correlation between getting one (close) state wrong and getting another wrong (I’m imagining here that swings undetected by polls would be nationwide rather than statewide). So I have several questions here. Was there anything in Silver’s model which could allow one to predict not only the expected number of states he would get wrong but the *distribution* of the number of states he would get wrong? Because of the stickiness of states, I suppose that the probability that he would get all the states correct is higher than what one might guess from the fact that the expected number of states he would get wrong (from his model) was approximately 3. I’m sure I’ve heard Jordan mention elsewhere that Nate Silver claimed that one should not have expected Silver to get all 50 states right. However, it’s completely consistent to believe that a well designed model could predict both that the expected number of states that Silver would get wrong is 3, and that there is a high probability (at least > 50%) that he really would get all the states correct. So it’s not clear that a criticism of Silver for getting too many states correct is necessarily valid.
  • The problems you meet freshman year are the deepest… Is this true? Matt and I wondered which p-adic modular functions were expressible as convergent sums of finite slope eigenforms, and I still don’t know, but I’m not sure that’s the deepest question ever.
  • Did the student of the introduction listen to the entire book? I think I kind of missed that this was a preface (I think?) and kept expecting her to return.

Summary: Was I convinced at the end that the girl’s time spent doing those 30 definite integrals was worthwhile? I’m not so sure. In fact, I could almost have been convinced that we should slash all the public math departments in half and replace them by statistics departments. On the other hand, by every other measure, the book was a complete success — as a piece of prose, as a source of interesting yet thematically linked historical anecdotes, and as both an exposition and celebration of a certain way of thinking (“mathematical thinking”) which we all aspire to. It was worth every cent.

Audio: On a scale from “Jordan’s talking to you quite loudly on a train in Germany and someone tells you to shut up” to “Ambient waterfall sounds for Ultimate Bedtime Relaxation,” I rate it a 4, which is about where you would wish it to be. (For an inside look at the recording session, see this post.)


Posted in Book Review | 7 Comments

Chenevier on the Eigencurve

Today I wanted to mention a theorem of Chenevier about components of the Eigencurve. Let \mathcal{W} denote weight space (which is basically a union of discs), and let

\pi: \mathcal{E} \rightarrow \mathcal{W}

be the Coleman-Mazur eigencurve together with its natural map to \mathcal{W}. It will do well to consider the versions of the eigencurve corresponding to quaternion algebras D/\mathbf{Q} as well.

Theorem: [Chenevier] Suppose that

  1. \mathcal{E} has “no holes” (that is, a family of finite slope forms over the punctured disc extends over the missing point),
  2. The “halo” of \mathcal{E} is given by a union of finite flat components whose slope tends to zero as x \in \mathcal{W} tends to the boundary of the disc.

Then every non-ordinary component of \mathcal{E} has infinite degree.

In particular, since both of these hypotheses are now known in many cases (properness by Hansheng Diao and Ruochuan Liu, and haloness by Ruochuan Liu, Daqing Wan, and Liang Xiao, at least in the definite quaternion algebra case), the conclusion is also known.

The proof is basically the following. Given a component C of finite degree, the first assumption implies that it actually is proper and finite. One may then consider the norm of U_p on C to the Iwasawa algebra to obtain a bounded (hence Iwasawa) function F = \mathrm{Norm}(U_p). This function cannot have any zeros (again by properness), and hence, by the Weierstrass preparation theorem, it is a power of p times a unit. But that implies that F has constant valuation near the boundary, which contradicts the fact that the slopes are tending to zero (except in the ordinary case).

Naturally one may ask whether \mathcal{E} has only finitely many components, although this seems somewhat harder to prove.

Posted in Mathematics

What does it take to get a raise?

Gauss … [had] a salary that remained fixed from 1807 to 1824.

(see here.) What was Gauss’ salary? My limited google skills were not able to find this information, although I’m not sure how meaningful it would be to translate any such number into today’s dollars. More generally, although there is available data for academic salaries over the past 40 years or so, I’m curious for comparisons that go further back in time.

Posted in Waffle | 6 Comments

The seven types of graduate student applicant

Yes, it’s that time of year again.

  1. Hide and Seek: Contacts you every day about the status of their application, then goes on radio silence the moment they receive an offer, never to be heard of again.
  2. The No Chancer: Has an offer from Harvard, Princeton, and MIT, but still plans to attend the prospective student day because they fancy a three day holiday in (wherever your university is located).
  3. The Copy & Paster: It has always been my dream to attend Michigan University, the best university in the world. Well, good luck with that.
  4. The Nervous Nellie: Has some questions about the graduate program — a lot of questions. Wants you (that is, me) to answer detailed questions about everything from the reasonable (exact duties of a TA, particulars on graduate student stipend and health insurance) to the less so (graduation statistics and data for the last 10 years of graduates, upcoming schedule of faculty sabbaticals for the next three years, tips on the best place in Evanston to purchase toothpaste, etc.).
  5. The Surprise: Never responds to any email query about whether they are interested in coming or whether they have offers from somewhere else, is completely discounted by the committee, but then ends up accepting on April 15.
  6. The Googler: Makes an effort to look at the department website to customize their application, but gets it all wrong: I would really like to work with X, Y, and Z where X is a postdoc, Y has retired, and Z moved to a different institution two years ago.
  7. The Unicorn: Actually accepts the offer well before April 15.

Tell me if I’ve missed anyone.

Posted in Mathematics, Rant | 11 Comments

Only Harvard Grads need apply

It’s hard to take articles in Slate too seriously, but I have to admit I was quite perplexed about the following article (with the concomitant research publication here).

The main thrust of the article seems to be as follows. A disproportionate number of faculty at research universities in the US received PhDs from a small number of prestigious institutions, and hence (?) such hiring practices reflect profound social inequality. Is it just me, or does this appear to be utter bollocks? There is an obvious pair of hypotheses that would completely explain the data, namely:

  1. There is a hierarchical system of admission to graduate programs,
  2. Universities hire the strongest candidates they can, and admit the strongest graduate students they can.

Let’s examine these possibilities in the context of graduate school in mathematics. I have, on several occasions, been responsible for graduate admissions at my institution. I would say, on the whole, that prospective graduate students are among the most class conscious of anyone in academia. I would guess that, at least 75% of the time, a student will accept either the program that is the most highly ranked amongst those where they were admitted or a school within at most one or two places of their highest ranked option.

What about the second hypothesis? The worry here is that universities might view “undergraduate/graduate institution” as a proxy for “quality of candidate.” In my experience (being on hiring committees), this is utterly preposterous. I am not claiming that mathematical judgements are not a slippery thing — there are many variations which relate to matters of taste and inclination — but there are some reasonable objective criteria (GRE scores for graduate applications, publication record for job candidates) which would serve as a check against any implicit bias in this regard.

We here at Persiflage, however, are open to the idea that we may have missed something. So here are some other possibilities:

  1. You are talking about Mathematics, a field for which it is easier to make reliable judgements about the quality of research, and a field for which there is a more pronounced spike in talent at the top of the scale. Is this true? I honestly don’t know. Perhaps whatever field it is that produces papers like the one under consideration is not something for which talent of any kind is an asset, and so there is no real difference between graduates from Harvard or from Podunk U. Less sarcastically, suppose (say) I compare the English department faculty at the top ranked place (taking from this list), Berkeley, with that of a place also ranked in the top 25 schools but closer to the bottom of that list, say UIUC. Then, if I knew something, could I confidently say that one department is much better than the other?
  2. You are talking about the experience of hiring/admitting students at a Group I university. Perhaps it is the case that, for lower ranked universities, there is insufficient expertise to hire on the basis of talent/output, and so PhD institution serves as a lazy way to evaluate the candidate. This seems to be a somewhat condescending argument, but it’s true that I don’t have any idea how hiring works at non-Group I universities. But surely the letters of recommendation would carry the most weight, and they would reflect the quality of research? At the very least, if you are going to claim this is what happens, you need to come up with a way to substantiate that claim.

Ultimately, I certainly don’t feel that I can rule out bias when it comes to hiring, but the fact that the paper under review uses “prestige” as a dirty word and doesn’t seem to acknowledge in any way that there is some correlation between prestige and quality of graduates is highly disturbing. Perhaps, as with this paper, the main goal is to substantiate the political beliefs of the authors rather than to undertake a serious academic inquiry. Still, even if the methodology is flawed, I would like people’s opinion on the conclusion.

Posted in Politics, Rant | 9 Comments