Archive for the ‘maths’ Category
An interesting scholarly article appeared in the journal Studies in Higher Education in February of this year, by Jennifer Chubb and Richard Watermeyer. It investigates some aspects of the research funding system in the UK and Australia.
Give any research academic in Australia today (or the UK, or well, anywhere) a few minutes to vent about their job and you will most likely hear a tirade about grants — whether the writing of research grant applications, the application process, the chances of success, who and what tends to succeed, the pressures universities exert on researchers to obtain them, or any aspect of the related culture.
Well, come to think of it, there might be tirades against many possible things. The university universe is not short on tirades or things to tirade against.
While anyone in the academic world will be very familiar with the standard grievances — and it would take far too long to attempt to make a list — they are grievances usually only aired in private.
What is good about this article is that it uses the medium of a research article to air the views of academics, suitably anonymised, in public. The focus is on a particularly problematic aspect of the process of research funding in Australia and the UK: impact statements.
To quote the article,
In both UK and Australian funding contexts… the perceived merit of a research funding application is now linked to the capacity of the applicant to prescribe convincing (pathways to) research impacts, or more specifically, credible statements of how they will ensure economic and/or societal returns from their research… ‘Impact Statements’ … demand that academics demonstrate an awareness of their external communities and how they will benefit from the proposed research… [and] require that academics demonstrate methodological competency in engaging with their research users, showing how research will be translated and appropriated in ways that most effectively service users’ needs.
On its face, it looks like a good idea: any research asking for public money must make some attempt to justify its effect on society. And that doesn’t just look like a good idea, it is a good idea.
However, most research, including especially most important and worthy research, has zero-to-infinitesimal direct impact on society, or at least very little that can be explained in the few sentences that the word limit of an “impact” statement allows. There are certainly areas that do have direct impact: most medical research; some (but not all) climate research; some renewable energy research; some biotechnology and nanotechnology research, and so on. But of course, the research with the most immediate direct economic or commercial impact is already funded by private capital and does not need public funding. Most research is much slower and more uncertain, methodically working towards a long-term scientific or scholarly goal, with occasional surprises and breakthroughs.
But what impact statements, and the associated culture, demand are not accurate stories with all the complexity of scientific understanding, research programmes, educated guesswork and careful methodology that sensible research requires. That would take too long. Boring! We want impact. In a few words. Major impact. High velocity. Boom. That’s what we’re looking for. And that’s just not how research works.
Alas, simply saying that your research makes the world a better place by improving its store of important scientific and scholarly knowledge, and making society better because by supporting this research the society becomes the kind of society that supports this kind of research, is much too subtle for the politics of the situation to allow. Rather, the politics of the situation make the impact statement into a crude sales pitch.
Thus, we have a situation where, in the principal public statements made to support scientific and scholarly research, the predominant, sufficient and principal good reason for the public to support scientific and scholarly research is out of the question — it is inexpressible. It is also, in effectively preventing full justifications from being aired (at least where it counts), a scientific version of the censorship by concision so familiar in mainstream media.
How would, say, Euler have written an impact statement for his research into, say, analysis? The impact of theorems which gradually improve mathematical understanding, over decades and centuries, to the point where they enable breakthroughs in other sciences, engineering, or technology, is impossible to quantify. Even for those parts of Euler’s research which have had major, definite and decisive impact, like Euler’s theorem in number theory central to RSA encryption, the idea that Euler could have had any inkling of this application, over 200 years later, is laughable. Even in 1940 Hardy’s A Mathematician’s Apology sang the praises of number theory precisely because of its uselessness.
So, justification based on “impact” would have been an impossible task for Euler. And Euler is the most prolific mathematician of all time, one of the greatest mathematicians of all time. God help any lesser mortal.
To be fair, pure mathematics is in some sense too easy a case. The very inapplicability of pure mathematics is so clear that any statement about “impact” in this context can only seriously be understood as a source of amusement. A three-year project to think hard and prove some theorems about some interesting and important field of mathematics (which may have some practical applications one day, though this is impossible to predict and in all likelihood it will not) is so far from the average person’s concept of “impact” that we can only feel that the poor mathematician has been dragged by a faceless bureaucracy into a system designed for someone else, in some other time and place.
Or, perhaps slightly more accurately, and disturbingly, a pure mathematician made to justify their research based on “impact” is a lamb about to be fed to the lions. But thankfully, mathematicians will not be fed to the lions — or at least, not all of them — because the emperor has taken their side. A society without mathematicians produces none of the STEM-literate graduates that the emperor, capital, demands. The survival of the planet, as it turns out, also demands STEM-literate graduates, but as the perilous state of the planet so clearly attests, it is capital, not the planet, which is a much stronger determinant of social outcomes, at least under present social arrangements.
Mathematics aside, the point remains. Requiring 30-second written advertisements called “impact statements” leads to exaggeration, over-speculation, and, at best, twisting of the truth.
But don’t take it from me; it’s much more interesting to requote the senior academics at Australian/UK universities quoted in the article:
It’s virtually impossible to write one of these grants and be fully frank and honest in what it is you’re writing about. (Australia, Professor)
‘illusions’ (UK, Professor); ‘virtually meaningless’, or ‘made up stories’ (Australia, Professor) ‘…taking away from the absolute truth about what should be done’ (UK, Professor). Words such as lying, lies, stories, disguise, hoodwink, game-playing, distorting, fear, distrust, over-engineering, flower-up, bull-dust, disconnected, narrowing and the recurrence of the word ‘problem’
Would I believe it? No, would it help me get the money – yes. (UK, Professor)
I will write my proposals which will have in the middle of them all this work, yeah but on the fringes will tell some untruths about what it might do because that’s the only way it’s going to get funded and you know I’ve got a job to do, and that’s the way I’ve got to do it. It’s a shame isn’t it? (UK, Professor)
If you can find me a single academic who hasn’t had to bullshit or bluff or lie or embellish in order to get grants, then I will find you an academic who is in trouble with his [sic] Head of Department. If you don’t play the game, you don’t do well by your university. So anyone that’s so ethical that they won’t bend the rules in order to play the game is going to be in trouble, which is deplorable. (Australia, Professor)
It’s about survival. It’s not sincere all the way through…that’s when it gets disheartening. It puts people on the back foot and fuels a climate of distrust. (UK, Professor)
It is impossible to predict the outcome of a scientific piece of work, and no matter what framework it is that you want to apply it will be artificial and come out with the wrong answer because if you try to predict things you are on a hiding to nothing. (UK, Professor)
The idea therefore that impact could be factored in in advance was viewed as a dumb question put in there by someone who doesn’t know what research is. I don’t know what you’re supposed to say, something like ‘I’m Columbus, I’m going to discover the West Indies?!’ (Australia, Professor)
It’s disingenuous, no scientist really begins the true process of scientific discovery with the belief it is going to follow this very smooth path to impact because he or she knows full well that that just doesn’t occur and so there’s a real problem with the impact agenda- and that is it’s not true it’s wrong – it flies in the face of scientific practice. (UK, Professor)
It’s really virtually impossible to write an (Australian Research Council) ARC grant now without lying and this is the kind of issue that they should be looking at. (Australia, Professor)
It becomes increasingly difficult – one would be very hard pressed to write a successful grant application that’s fully truthful…you’re going to get phony answers, they’re setting themselves up for lies…[they go on]…it’s absurd to expect every grant proposal to have an impact story. (Australia, Professor)
Trying to force people to tell a causal story is really tight, it’s going to restrict impact to narrow immediate stuff, rather than the big stuff, and force people to be dishonest. (UK, Professor)
They’re just playing games – I mean, I think it’s a whole load of nonsense, you’re looking for short term impact and reward so you’re playing a game…it’s over inflated stuff. (Professor, Australia)
Integration is hard.
When we learn calculus, we learn to differentiate before we can integrate. This is despite the fact that, arguably, integration is an “easier” concept. To my mind at least, when I am given a curve in the plane, the notion of an area bounded by this curve is a very straightforward, intuitive thing; while the notion of its “gradient” or “slope” at a point is a much more subtle, or at least less intuitive idea.
But whether these ideas are natural or not, one is certainly mathematically and technically more difficult than the other. Integration is much more subtle and difficult.
These difficulties highlight the extent to which integration is less a science and more an art form. And in my experience, those difficulties are seen very rarely in high school or undergraduate mathematics, even as students take course after course about calculus and integration. So it is high time we shed some light on this lost art.
Really existing differentiation
In order to see just how hard integration is, let’s first consider how we learn, and apply, the ideas of differentiation.
When we learn differentiation, we first learn a definition that involves limits and difference quotients: the old chestnut $\lim_{h \to 0} \frac{f(x+h) - f(x)}{h}$. We pass through a discussion of chords and tangents, perhaps even supplemented with some physical intuition about average and instantaneous velocity. From this we have a “first principles” approach to calculus, using the formula $f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}$.
This formula, and the whole “first principles” approach, is then promptly forgotten. After we learn the “first principles” of calculus, we then learn a series of rules, techniques and tricks, such as the product rule, quotient rule and chain rule. Using these, combined with a few other “basic” derivatives, most students will never need the “first principles” again.
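As a reminder of what the “first principles” formula actually computes, here is a minimal Python sketch. The function names `difference_quotient` and `square` are illustrative choices, not anything from the original post. It evaluates the slope of a chord for shrinking values of $h$; for $f(x) = x^2$ at $x = 3$ the chord slope works out algebraically to $6 + h$, so the values creep towards the derivative $6$.

```python
import math


def difference_quotient(f, x, h):
    """Slope of the chord joining (x, f(x)) and (x + h, f(x + h))."""
    return (f(x + h) - f(x)) / h


def square(x):
    return x * x


if __name__ == "__main__":
    # The chord slopes approach the tangent slope as h shrinks.
    for h in (1.0, 0.1, 0.01, 0.001):
        print(h, difference_quotient(square, 3.0, h))
```

Nothing here is symbolic: it only approximates a limit numerically, which is exactly why the rules that follow are so much more convenient.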
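More specifically, once we know how to differentiate basic functions like polynomials, trig functions and exponentials,
$$\frac{d}{dx} x^n = n x^{n-1}, \quad \frac{d}{dx} \sin x = \cos x, \quad \frac{d}{dx} \cos x = -\sin x, \quad \frac{d}{dx} e^x = e^x,$$
and we know the rules for how to differentiate their products, quotients, and compositions
$$(fg)' = f'g + fg', \quad \left( \frac{f}{g} \right)' = \frac{f'g - fg'}{g^2}, \quad \frac{d}{dx} f(g(x)) = f'(g(x)) \, g'(x),$$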
we can forget all about “first principles” and mechanically apply these formulae. With some basics down, and armed with the trident of product, quotient, and chain rules, then, we can differentiate most functions we’re likely to come up against.
It turns out, then, that in a certain sense, differentiation is “easy”. You don’t need to know the theory so much as a few basic rules and techniques. And although these rules can be a bit technically demanding, you can use them in a fairly straightforward way. In fact, their use is algorithmic. If you’ve got the technique sufficiently down, then you can mechanically differentiate most functions we’re likely to come across.
Let’s make this a little more precise. What do we mean by “most functions we’re likely to come across”? What are these functions? We mean the elementary functions. We can define these as follows. We start from some “basic” functions: polynomials, rational functions, trigonometric functions and their inverses, exponential functions and logarithms.
We then think of all the functions that you can get by repeatedly adding, subtracting, multiplying, dividing, taking $n$’th roots (i.e. square roots, cube roots, etc) and composing these functions. These functions are the elementary ones. They include functions like the following:
$$\sqrt{1 + x^2}, \quad e^{\sin x}, \quad \frac{\ln x}{x^2 + 1}, \quad \arctan \left( \sqrt[3]{x} \right).$$
(Aside: There’s actually a technicality here. Instead of saying that we can take $n$’th roots of a function, we should actually say that we can take any function which is a solution of a polynomial equation whose coefficients are existing functions. The $n$’th root of a function $f$, i.e. $\sqrt[n]{f}$, is the solution of the polynomial equation in $y$ given by $y^n = f$. That is, you can take an algebraic extension of the function field. Having done this, you can find the derivative of the new function using implicit differentiation. But we will not worry too much about these technicalities.)
Actually, the above definition is not really a very efficient one. If you start from just the constant real functions and the function $x$, then you can build a lot just from them! By repeatedly adding and multiplying $x$’s and constants, you can build any polynomial; and then by dividing polynomials you can build any rational function. If you throw in $e^x$ and $\ln x$, then you also have all the other exponential and logarithmic functions, because for any (positive real) constant $a$,
$$a^x = e^{x \ln a} \quad \text{and} \quad \log_a x = \frac{\ln x}{\ln a},$$
and $\ln a$ is a constant! If you allow yourself to also use complex number constant functions, then you can build the trig functions out of exponentials,
$$\sin x = \frac{e^{ix} - e^{-ix}}{2i}, \quad \cos x = \frac{e^{ix} + e^{-ix}}{2},$$
and then you have $\tan x = \frac{\sin x}{\cos x}$. You can also build hyperbolic trigonometric functions if you wish, since $\sinh x = \frac{e^x - e^{-x}}{2}$, $\cosh x = \frac{e^x + e^{-x}}{2}$, and $\tanh x = \frac{\sinh x}{\cosh x}$.
The formulas above for $\sin x$ and $\cos x$ are relatively well known if you’ve studied complex numbers; a little less well-known are the formulas that allow us to express inverse trigonometric functions in terms of complex numbers, together with logarithms and square roots:
$$\arcsin x = -i \ln \left( ix + \sqrt{1 - x^2} \right), \quad \arccos x = -i \ln \left( x + i \sqrt{1 - x^2} \right), \quad \arctan x = \frac{i}{2} \ln \left( \frac{1 - ix}{1 + ix} \right).$$
(If you haven’t seen these before, try to prove them! There are also logarithmic formulas for inverse hyperbolic trigonometric functions, which are probably slightly more well known as they don’t have complex numbers in them.)
Thus, we can define an elementary function as a function which can be built from the (complex) constant functions together with $x$, $e^x$ and $\ln x$, using a finite number of additions, subtractions, multiplications, divisions, compositions, and $n$’th roots (or really, solving polynomial equations in existing functions but don’t worry about this bit in parentheses).
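The claim that exponentials, logarithms and complex constants are enough to rebuild the trigonometric functions can be spot-checked numerically. The sketch below uses Python’s standard `cmath` module. The helper names are invented for illustration, and the particular forms of the inverse trigonometric formulas are one standard choice on the principal branch, assumed here for real $x$ with $|x| \le 1$ in the arcsine case; they may be written differently elsewhere.

```python
import cmath
import math


def sin_via_exp(x):
    """sin x = (e^{ix} - e^{-ix}) / (2i)."""
    return (cmath.exp(1j * x) - cmath.exp(-1j * x)) / 2j


def cos_via_exp(x):
    """cos x = (e^{ix} + e^{-ix}) / 2."""
    return (cmath.exp(1j * x) + cmath.exp(-1j * x)) / 2


def arcsin_via_log(x):
    """arcsin x = -i ln(ix + sqrt(1 - x^2)), principal branch."""
    return -1j * cmath.log(1j * x + cmath.sqrt(1 - x * x))


def arctan_via_log(x):
    """arctan x = (i/2) ln((1 - ix) / (1 + ix)), principal branch."""
    return 0.5j * cmath.log((1 - 1j * x) / (1 + 1j * x))


def power_via_exp(a, x):
    """a^x = e^{x ln a} for a positive real constant a."""
    return math.exp(x * math.log(a))


if __name__ == "__main__":
    print(sin_via_exp(0.7), math.sin(0.7))
    print(arctan_via_log(1.0), math.atan(1.0))
```

The results come back as complex numbers whose imaginary parts are zero up to rounding, which is the numerical shadow of the fact that the complex constants cancel out in the end.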
The point is, that if you are good enough at the product, chain, and quotient rules, you can differentiate any elementary function. You don’t need any more tricks, though you might need to apply the rules very carefully and many times over! A further point is that when you find the answer, you find that the derivative of an elementary function is another elementary function.
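To make “algorithmic” concrete, here is a toy symbolic differentiator in Python. It is only a sketch under stated assumptions: the nested-tuple encoding of expressions and the names `diff`, `evaluate`, `X`, `gaussian` and `root` are all invented for this illustration, roots are handled as constant powers, and no simplification is attempted. The point is that `diff` never needs to be clever; it applies the product, quotient and chain rules by rote and always hands back another expression of the same kind.

```python
import math

# Expressions are nested tuples (an illustrative encoding, not a standard one):
#   ("const", c)  ("x",)  ("add", u, v)  ("sub", u, v)  ("mul", u, v)
#   ("div", u, v)  ("exp", u)  ("ln", u)  ("pow", u, r) with r a constant


def diff(e):
    """Differentiate e with respect to x by applying the rules mechanically."""
    kind = e[0]
    if kind == "const":
        return ("const", 0.0)
    if kind == "x":
        return ("const", 1.0)
    if kind in ("add", "sub"):  # linearity
        return (kind, diff(e[1]), diff(e[2]))
    if kind == "mul":  # product rule
        u, v = e[1], e[2]
        return ("add", ("mul", diff(u), v), ("mul", u, diff(v)))
    if kind == "div":  # quotient rule
        u, v = e[1], e[2]
        top = ("sub", ("mul", diff(u), v), ("mul", u, diff(v)))
        return ("div", top, ("mul", v, v))
    if kind == "exp":  # chain rule: (e^u)' = e^u * u'
        return ("mul", e, diff(e[1]))
    if kind == "ln":  # chain rule: (ln u)' = u' / u
        return ("div", diff(e[1]), e[1])
    if kind == "pow":  # chain rule: (u^r)' = r * u^(r-1) * u'
        u, r = e[1], e[2]
        return ("mul", ("mul", ("const", r), ("pow", u, r - 1)), diff(u))
    raise ValueError("unknown expression kind: %r" % (kind,))


def evaluate(e, x):
    """Evaluate expression e at the real number x."""
    kind = e[0]
    if kind == "const":
        return e[1]
    if kind == "x":
        return x
    if kind == "exp":
        return math.exp(evaluate(e[1], x))
    if kind == "ln":
        return math.log(evaluate(e[1], x))
    if kind == "pow":
        return evaluate(e[1], x) ** e[2]
    if kind not in ("add", "sub", "mul", "div"):
        raise ValueError("unknown expression kind: %r" % (kind,))
    a, b = evaluate(e[1], x), evaluate(e[2], x)
    if kind == "add":
        return a + b
    if kind == "sub":
        return a - b
    if kind == "mul":
        return a * b
    return a / b


X = ("x",)
# e^{-x^2} and sqrt(1 + x^2), written in the tuple encoding.
gaussian = ("exp", ("mul", ("const", -1.0), ("mul", X, X)))
root = ("pow", ("add", ("const", 1.0), ("mul", X, X)), 0.5)

if __name__ == "__main__":
    print(evaluate(diff(gaussian), 0.5), -math.exp(-0.25))
    print(evaluate(diff(root), 1.0), 1 / math.sqrt(2))
```

There is no analogous twenty-line function for integration, which is the subject of the rest of this post.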
Not so elementary, my dear Watson
When we come to integration, though, everything becomes much more difficult. I’m only going to discuss indefinite integration, i.e. antidifferentiation. Definite integration with terminals just ends up giving you a number, but indefinite integration is essentially the inverse problem to differentiation. If we’re asked to find the indefinite integral $\int f(x) \, dx$, we’re asked to find a function $F(x)$ whose derivative is $f(x)$, i.e. such that $F'(x) = f(x)$. There are many such functions: if you have one such function $F(x)$, then you can add any constant $c$ to it, and the resulting function $F(x) + c$ also has derivative $f(x)$; that is why we tend to write $+c$ at the end of the answer to any indefinite integration question. But it will suffice for us, here, to be able to find one; for the sake of simplicity, I will not write $+c$ in the answers to indefinite integrals. In doing so I lose 1 mark for every integral I solve, but I don’t care!
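If we start with some basic functions like polynomials, trigonometric functions, exponentials and logarithms, some integrals are standard:
$$\int x^n \, dx = \frac{x^{n+1}}{n+1} \ (n \neq -1), \quad \int \sin x \, dx = -\cos x, \quad \int \cos x \, dx = \sin x, \quad \int e^x \, dx = e^x.$$
Some are slightly less standard:
$$\int \frac{1}{x} \, dx = \ln x, \quad \int \tan x \, dx = -\ln (\cos x), \quad \int \ln x \, dx = x \ln x - x.$$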
(You might complain that the integral of $\frac{1}{x}$ should actually be $\ln |x|$. You’d be right, and I am totally sweeping that technicality under the carpet!)
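Some inverse trigonometric integrals, perhaps, are less standard again:
$$\int \arcsin x \, dx = x \arcsin x + \sqrt{1 - x^2}, \quad \int \arccos x \, dx = x \arccos x - \sqrt{1 - x^2}, \quad \int \arctan x \, dx = x \arctan x - \frac{1}{2} \ln (1 + x^2).$$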
So far, so good, although perhaps not always obvious! But now, in general, what if we start to combine these functions? The problem is that if you know how to integrate $f(x)$ and you know how to integrate $g(x)$, it does not follow that you know how to integrate their product $f(x) g(x)$. This is in contrast to differentiation: if you know how to differentiate $f(x)$ and $g(x)$, then you can use the product rule to differentiate $f(x) g(x)$. There is no product rule for integration!
The product rule for differentiation, rather, translates into the integration by parts formula for integration:
$$\int f(x) \, g'(x) \, dx = f(x) \, g(x) - \int f'(x) \, g(x) \, dx.$$
This is not a formula for $\int f(x) g(x) \, dx$! A product rule for integration would say to you “if you can integrate both of my factors, you can integrate me!” But this integration by parts formula says something more along the lines of “if you can integrate one of my factors and differentiate the other, then you can express me in terms of the integral obtained by integrating and differentiating those two factors”. That is a much more subtle statement. A product rule would be a hammer you could use to crack integrals; but the integration by parts formula is a much more subtle card up your sleeve.
Essentially, integration by parts supplies you with a trick which, if you are clever enough, and the integral is conducive to it, you can use to rewrite the integral in terms of a different integral which is hopefully easier. Hopefully. While the product rule for differentiation is an all-purpose tool of the trade — a machine used to calculate derivatives — integration by parts is a subtle trick which, when wielded with enough sophistication and skill, can simplify (rather than calculate) integrals.
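A small worked example may help show why integration by parts is a matter of judgement rather than routine. The example integral, $\int x e^x \, dx$, is chosen here purely for illustration. The same formula is applied twice with the two possible assignments of roles, and only one of them helps.

```latex
% Good choice: differentiate x, integrate e^x.
% Take f(x) = x and g'(x) = e^x, so f'(x) = 1 and g(x) = e^x.
\int x e^x \, dx = x e^x - \int 1 \cdot e^x \, dx = x e^x - e^x .

% Bad choice: differentiate e^x, integrate x.
% Take f(x) = e^x and g'(x) = x, so f'(x) = e^x and g(x) = x^2 / 2.
\int x e^x \, dx = \frac{x^2}{2} e^x - \int \frac{x^2}{2} e^x \, dx ,
% which is true, but the new integral is harder than the one we started with.
```

Both lines are correct applications of the formula; nothing in the formula itself says which one to try.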
Similarly, there is no chain rule for integration. The chain rule for differentiation translates into the integration by substitution formula for integration:
$$\int f(g(x)) \, g'(x) \, dx = \int f(u) \, du, \quad \text{where } u = g(x).$$
A chain rule for integration would say to you “if I am a composition of two functions, and you can integrate both of them, then you can integrate me”. But integration by substitution says, instead, “if I am a composition of two functions, multiplied by the derivative of the inner function, then you can integrate me”. In a certain sense it’s easier than integration by parts, because it calculates the integral and gives you an answer, rather than merely reducing to a different (hopefully simpler) integral. But still, it remains an art form: it requires the skill to see how to regard the integrand as an expression of the form $f(g(x)) \, g'(x)$. Finally, there is no quotient rule for integration either.
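As an illustration of how much depends on that extra factor of the inner derivative, compare the following two integrals, chosen here as examples. They differ only by a factor of $-2x$, yet substitution disposes of the first immediately and says nothing at all about the second.

```latex
% With the inner derivative present: take u = g(x) = -x^2, so g'(x) = -2x.
\int -2x \, e^{-x^2} \, dx = \int e^{u} \, du = e^{u} = e^{-x^2} .

% Without it, the integrand is not of the form f(g(x)) g'(x):
\int e^{-x^2} \, dx = \ ?
% No substitution of this kind applies.
```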
So, while differentiation is a skill which can be learned and applied, integration is an art form for which we learn tricks and strategies, and develop our skills and intuition in applying them. Now, actually there are tables of standard integrals, far far beyond the small examples above. There are theorems about how functions of certain types can be integrated. There are algorithms which can be used to integrate certain, often very complicated, families of functions.
But the question remains: how far can we go? If we see an integral which we can’t immediately solve, do we just need to think a little harder, and apply something from our bag of tricks in a clever new way? Do we just need more skill, or is the integral impossible? How would we tell the difference between a “hard” and an “impossible” integral — and what does that even mean?
In a certain sense, no integrals are “impossible”. An integral of a continuous function always exists, in a certain sense. If you’ve got a continuous function $f$, then its integral is certainly defined as a function, using the definition with Riemann sums; this is a theorem. Even if $f$ is not continuous, it’s possible that the Riemann sum approach can give a well-defined function as the integral. For more exotic functions $f$, there is the more advanced method of Lebesgue integration.
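To see that the integral “exists” in this sense even when no formula is available, one can simply compute Riemann sums. The following Python sketch makes some illustrative assumptions: it uses the midpoint rule with a fixed number of subintervals, fixes the lower terminal at $0$, and the names `riemann_integral`, `gaussian` and `F` are invented for the example. It defines a perfectly good function $F(x) = \int_0^x e^{-t^2} \, dt$ that can be evaluated at any point.

```python
import math


def riemann_integral(f, a, b, n=10000):
    """Midpoint Riemann sum of f over [a, b] using n equal subintervals."""
    width = (b - a) / n
    return sum(f(a + (k + 0.5) * width) for k in range(n)) * width


def gaussian(t):
    return math.exp(-t * t)


def F(x):
    """Numerical antiderivative of e^{-t^2}, normalised so that F(0) = 0."""
    return riemann_integral(gaussian, 0.0, x)


if __name__ == "__main__":
    for x in (0.5, 1.0, 2.0):
        print(x, F(x))
```

The function `F` is well defined and easy to tabulate; what it lacks is a closed-form expression.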
But this is not what we have in mind when we say an “integral is impossible”. What we really mean is that we can’t write a nice formula for the integral. This would happen if the result were not an elementary function.
As we discussed above, if you take an elementary function and differentiate it, you can always calculate the derivative with a sufficiently careful application of product/chain/quotient rules, and the result is another elementary function.
So, we might ask: given an elementary function, even though there might not be any straightforward way to calculate its integral, is the result always another elementary function?
Indomitable impossible integrals
It turns out, the answer is no. There are elementary functions such that, when you take their integral, it is not an elementary function. When you try to integrate such a function, although the integral exists, you can’t write a nice formula for it. And it’s not because you’re not skillful enough. It’s not because you’re not smart enough. The reason you can’t write a nice formula for the integral is because no such formula exists: the integral is not an elementary function.
What is an example of such a function? The simplest example is one that high school students come up against all the time: the Gaussian function
$$f(x) = e^{-x^2}.$$
It’s clearly an elementary function, constructed by composition of a polynomial and the exponential function. But its integral is not elementary.
You might recall that the graph of $y = e^{-x^2}$ is a bell curve. Suitably dilated (normalised), it is the probability density function for a normal distribution. When you calculate probabilities involving normally distributed random variables, you often integrate this function.
You may recall painful time spent in high school looking up a table to find out probabilities for the normal distribution. That table is essentially a table of (definite) integrals for the function $e^{-x^2}$ (or a closely related function). And the reason that it’s a table you have to look up, rather than a formula, is because there is no formula for the integral $\int e^{-x^2} \, dx$. You need a table because the integral of the elementary function $e^{-x^2}$ is not elementary.
There’s no formula for normal distribution probabilities because integration is an art form, rather than algorithmic. And so we are sometimes reduced to the quite non-artistic process of looking up a table to find the integral.
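For the curious, a table of this kind is produced by numerical integration rather than by a formula. The sketch below is a hedged illustration, not a description of how any particular published table was made: it assumes the standard normal density $\frac{1}{\sqrt{2\pi}} e^{-z^2/2}$ (the “closely related function”), uses Simpson’s rule with an arbitrarily chosen number of subintervals, and the names `normal_pdf`, `normal_cdf` and `table` are hypothetical.

```python
import math


def normal_pdf(z):
    """Density of the standard normal distribution."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)


def normal_cdf(z, n=1000):
    """P(Z <= z), computed as 1/2 plus Simpson's rule on [0, z]; n must be even."""
    if z == 0:
        return 0.5
    h = z / n
    total = normal_pdf(0.0) + normal_pdf(z)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * normal_pdf(k * h)
    return 0.5 + total * h / 3


def table(zs):
    """Rows (z, probability rounded to 4 places), as in a printed table."""
    return [(z, round(normal_cdf(z), 4)) for z in zs]


if __name__ == "__main__":
    for row in table([0.0, 0.5, 1.0, 1.5, 1.96, 2.0]):
        print(row)
```

Each entry is the output of a numerical procedure run to four decimal places; there is no expression in $z$ being evaluated behind the scenes.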
Now, when I say that $\int e^{-x^2} \, dx$ is not elementary, I mean that it’s known as a theorem. That is, it has been proved mathematically that $\int e^{-x^2} \, dx$ is not elementary, and so doesn’t have a nice formula. But what could this mean? How could you prove that an integral doesn’t have a nice formula, isn’t an elementary function, can’t be written in a nice way? The proof is a bit complicated, too complicated to recall in complete detail here. But there are some nice ideas involved, and it’s worth recounting some of them here.
Proving the impossible
The fact that $\int e^{-x^2} \, dx$ is not elementary was proved by the French mathematician Joseph Liouville in the mid-19th century. In fact, he proved quite a deal more. Suppose you have an elementary function $f(x)$, and you are trying to find its integral $\int f(x) \, dx$. Now as the integrand is continuous, the integral certainly exists as a continuous function; the question is whether $\int f(x) \, dx$ is elementary or not, i.e. whether there is a formula for $\int f(x) \, dx$ involving only complex numbers, powers of $x$, rational functions, $e^x$ and $\ln x$, and $n$’th roots (and their generalisations).
Liouville’s theorem, amazingly, tells you that if the function you’re looking for is elementary, then it must have a very specific form. Very roughly, Liouville says, $\int f$ can have more logarithms than $f$, but no more exponentials. You can see the germ of this idea in some of the integrals above:
$$\int \frac{1}{x} \, dx = \ln x, \quad \int \tan x \, dx = -\ln (\cos x).$$
In these integrals, a new logarithm appears, that did not appear in the integrand. Never does a new exponential appear. If an exponential appears in the integral, then it appeared in the integrand, as in examples like
$$\int e^x \, dx = e^x, \quad \int x e^x \, dx = x e^x - e^x.$$
To state Liouville’s theorem more precisely, we need the idea of a field of functions. For our purposes, we can think of a field of functions as a collection of functions which is closed under addition, subtraction, multiplication, and division. The polynomials in $x$ do not form a field of functions, because when you divide two polynomials you do not always get a polynomial! However, the rational functions in $x$ do form a field of functions. A rational function in $x$ is the quotient of two polynomials (with complex coefficients) in $x$, i.e. a function like
$$\frac{x^2 + 3x - 1}{x^{10} - 7x^3 + 2}, \quad \frac{2x + 1}{x - 3}, \quad x^3 - 2x + 5, \quad 7.$$
The first example is the quotient of a quadratic by a 10’th degree polynomial; the second example is the quotient of two linear polynomials. The third example illustrates the notion that any polynomial is also a rational function, because you can think of it as itself divided by $1$, and $1$ is a polynomial: $x^3 - 2x + 5 = \frac{x^3 - 2x + 5}{1}$. The final example illustrates the notion that any constant is also a rational function.
The field of rational functions (with complex coefficients) in $x$ is denoted $\mathbb{C}(x)$. You can make bigger fields of functions by including new elements! For instance, you could throw in the exponential function $e^x$, and then you can obtain the larger field of functions $\mathbb{C}(x, e^x)$. The functions in this field are those made up of adding, subtracting, multiplying, and dividing powers of $x$ and the function $e^x$. So this includes functions like
$$x e^x, \quad \frac{e^{2x} + x^3}{x - e^x}, \quad \frac{1}{1 + e^x}.$$
Note however that a function like $e^{e^x}$ does not lie in $\mathbb{C}(x, e^x)$. This function field is made up out of adding, subtracting, multiplying and dividing $x$ and $e^x$, but not by composing these functions.
We can see, then, that this second function field is bigger than the first one: $\mathbb{C}(x) \subset \mathbb{C}(x, e^x)$. In technical language, we say that $\mathbb{C}(x, e^x)$ is a field extension of $\mathbb{C}(x)$. Moreover, both these fields have the nice property that they are closed under differentiation. That is, if you take a rational function and differentiate it, you get another rational function. And if you take a function in $\mathbb{C}(x, e^x)$, involving $x$’s and $e^x$’s, and differentiate it, you get another function in $\mathbb{C}(x, e^x)$. In technical language, we say that $\mathbb{C}(x)$ and $\mathbb{C}(x, e^x)$ are differential fields of functions.
A differential field obtained in this way, by starting from rational functions and then throwing in an exponential, is an example of a field of elementary functions. In general, a field of elementary functions is obtained from the field of rational functions by successively throwing in extra functions, some finite number of times. Each time you add a function it must be either
- an exponential of a function already in the field, or
- a logarithm of a function already in the field, or
- an $n$’th root of a function already in the field (or more generally the root of a polynomial equation with coefficients in the field but as I keep saying don’t worry too much about this!).
Note that, by definition, any function in a field of elementary functions is made up by adding, subtracting, multiplying, and dividing $x$’s, and exponentials, and logarithms, and $n$’th roots (or generalisations thereof). That is, a function in a field of elementary functions is an elementary function! So our definitions of “elementary function” and “field of elementary functions” agree; it would be bad if we used the word “elementary” to mean two different things!
We can now state Liouville’s theorem precisely.
Liouville’s theorem: Let $K$ be a field of elementary functions, and let $f$ be a function in $K$. (Hence $f$ is an elementary function.) If the integral $\int f(x) \, dx$ is elementary, then
$$\int f(x) \, dx = g(x) + \sum_{j=1}^{n} c_j \ln \left( g_j(x) \right),$$
where $n$ is a non-negative integer, each of $c_1, \ldots, c_n$ is a constant, and the functions $g, g_1, \ldots, g_n$ all lie in $K$.
That is, Liouville’s theorem says that the integral of an elementary function $f$ must be a sum of a function $g$ that lies in the same field as $f$, and a constant linear combination of some logarithms of functions in the same field as $f$. The fact that $g$ and each $g_j$ lies in the same field as $f$ means that they cannot be much more complicated than $f$: they must be made up by adding, subtracting, multiplying and dividing the same bunch of functions that you can use to define $f$.
So Liouville’s theorem says, in a precise way, that when you integrate an elementary function $f$, if the result is elementary, then it can’t be much more complicated than $f$, and the only way in which it can be more complicated is that it can have some logarithms in it. This is what we meant when we gave the very rough description “Liouville says $\int f$ can have more logarithms than $f$, but no more exponentials”.
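It may help to see a few familiar integrals written in the shape that the theorem prescribes. The three examples below are my own choices for illustration, and the letters $g$, $g_j$, $c_j$ and $K$ follow the usual textbook statement of the theorem (a function in the field, plus a constant combination of logarithms of functions in the field), which is an assumption about notation rather than something fixed by the text.

```latex
% In K = C(x): no g is needed, just two new logarithms.
\int \frac{1}{x^2 - 1} \, dx = \tfrac{1}{2} \ln (x - 1) - \tfrac{1}{2} \ln (x + 1),
\qquad g = 0, \quad c_1 = \tfrac{1}{2}, \ g_1 = x - 1, \quad c_2 = -\tfrac{1}{2}, \ g_2 = x + 1 .

% In K = C(x): the arctangent is secretly a pair of logarithms with complex constants.
\int \frac{1}{1 + x^2} \, dx = \arctan x = \tfrac{i}{2} \ln (1 - ix) - \tfrac{i}{2} \ln (1 + ix),
\qquad g = 0, \quad c_1 = \tfrac{i}{2}, \ g_1 = 1 - ix, \quad c_2 = -\tfrac{i}{2}, \ g_2 = 1 + ix .

% In K = C(x, e^x): no logarithms at all, and g stays inside the field.
\int x e^x \, dx = x e^x - e^x,
\qquad g = x e^x - e^x, \quad n = 0 .
```

The second example also shows why the constants are allowed to be complex: without them, the arctangent would not fit the pattern.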
Let’s now return to our specific example of the Gaussian function $e^{-x^2}$. What does Liouville’s theorem mean for this function? Well, this function lies in the field of elementary functions where we start from rational functions and then throw in, not $e^x$, but $e^{-x^2}$. That is, we can take $K = \mathbb{C}(x, e^{-x^2})$.
The theorem says that if the integral
$$\int e^{-x^2} \, dx$$
is elementary, then it is given by
$$\int e^{-x^2} \, dx = g(x) + \sum_{j=1}^{n} c_j \ln \left( g_j(x) \right),$$
where $n$ is a non-negative integer, each of $c_1, \ldots, c_n$ is a constant, and the functions $g, g_1, \ldots, g_n$ all lie in $K$. That is, $g, g_1, \ldots, g_n$ are “no more complicated” than $e^{-x^2}$; they are all made by adding, subtracting, multiplying and dividing $x$’s and $e^{-x^2}$’s.
If we differentiate the above equation, we obtain
$$e^{-x^2} = g'(x) + \sum_{j=1}^{n} c_j \frac{g_j'(x)}{g_j(x)}.$$
On the left hand side is the function we started with, $e^{-x^2}$. On the right hand side is an expression involving several functions. However, all the functions $g$ and $g_j$ lie in $K$; they are “no more complicated” than $e^{-x^2}$. Now as $K$ is a differential field, their derivatives $g'$ and $g_j'$ also lie in $K$; they are also “no more complicated”. So in fact the right hand side is an expression involving functions no more complicated than $e^{-x^2}$. They are all just rational functions, with $e^{-x^2}$’s thrown in. And if you think about it, thinking about what you will get for each $\frac{g_j'}{g_j}$, you might find it hard to avoid having a big denominator. You likely won’t be able to cancel the fraction. So you might find, then, that none of the logarithmic terms $c_j \frac{g_j'}{g_j}$ can make this equality work, and all the $c_j$ have to be zero; or in other words, $e^{-x^2} = g'(x)$. But now $g$ is, like everything else here, made up by adding, subtracting, multiplying and dividing $x$’s and $e^{-x^2}$’s. You might find, when you differentiate such a function, that it’s very hard to get a lone $e^{-x^2}$. Every time you differentiate an $e^{-x^2}$ you get a $-2x e^{-x^2}$, which has a pesky extra factor of $-2x$. And even if it appears together with other terms, as something like $x^3 e^{-x^2}$, when you differentiate it you get something like $3x^2 e^{-x^2} - 2x^4 e^{-x^2}$, which still has no isolated $e^{-x^2}$ term. And so, in conclusion, you might find it very difficult to find any functions that make the right hand side equal to $e^{-x^2}$.
Of course, this is not a proof at all; it’s a mere plausibility argument. To prove the integral is not elementary does take a bit more work. But it has been done, and can be found in standard references.
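For readers who want to see roughly where the extra work goes, here is a sketch of the last step as it is usually organised in the references. It takes for granted, without proof, the standard consequence of Liouville’s theorem that $\int e^{-x^2} \, dx$ is elementary only if it equals $R(x) e^{-x^2}$ for some rational function $R$; that reduction is the hard part, and is assumed here rather than derived. The symbol $R$ is introduced only for this sketch.

```latex
% Assumed reduction (from Liouville's theorem): if the integral is elementary,
% then for some rational function R(x) in C(x),
\int e^{-x^2} \, dx = R(x) \, e^{-x^2} .

% Differentiating and cancelling e^{-x^2} gives a differential equation for R:
R'(x) - 2x R(x) = 1 .

% Case 1: R is a nonzero polynomial of degree d.
% Then 2xR has degree d + 1 while R' has degree at most d - 1, so
\deg \left( R' - 2xR \right) = d + 1 \geq 1 ,
% and the left hand side cannot be the constant 1. (R = 0 gives 0, not 1.)

% Case 2: R has a pole of order k >= 1 at some point a.
% Then R' has a pole of order k + 1 at a, while 2xR has a pole of order at most k,
% so R' - 2xR has a pole of order k + 1 at a and again cannot equal 1.

% Either way no rational R exists, so the integral is not elementary.
```

The two cases are just the “pesky extra factor” and the “big denominator” of the plausibility argument, made precise.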
Hopefully, though, this should at least give you some idea why it might be true, and how you might prove, that an integral is “impossible”, and can’t be written with any nice formula.
Mathematics is an amazing place.
Brian Conrad, Impossibility theorems for elementary integration, http://www2.maths.ox.ac.uk/cmi/library/academy/LectureNotes05/Conrad.pdf.
Keith O. Geddes, Stephen R. Czapor, George Labahn, Algorithms for Computer Algebra, Kluwer (1992).
Andy R. Magid, Lectures on Differential Galois Theory, AMS (1994).
(Update 2/3/15: Typo fixed.)