I think our quandary is best displayed in the following diagram:
It feels to me almost like a mathematical theorem: each arrow a virtually inevitable consequence. But to elaborate, to address the most frequent nostrums people offer, I want to add a series of footnotes.
What's happening now? The street gangs in many cities, e.g. Chicago, are out of control. Tribal warfare in many areas, especially the Middle East, shows no signs of abating (haven't Jewish people been fighting the other tribes in Palestine for three millennia -- since the book of Exodus?). As for refugees, it seems to me significant that the refugees seeking asylum in Europe today come from so many countries, not merely Syria but all over Northern Africa and the Middle East. If indeed Bangladesh (population 150 million) becomes the victim of massive floods as many expect, where on earth would its refugees go?
As a mathematician, all this reminds me of the Lotka-Volterra equation. For those who aren't mathematicians, this is a famous model of competing species taught in all introductory differential equation classes. It deals with foxes and rabbits and produces cyclical behavior in which the number of foxes explodes until they reduce the rabbit population to nearly zero, then the foxes starve until the rabbits reproduce and their population in turn explodes, etc. etc. In our case, humans are the foxes and all the rest of the earth -- animal, vegetable and mineral -- are the rabbits. We have gone through half the cycle: the ascendancy of the foxes/humans but not the second half, their collapse. Let us all pray the model fails to predict the future.
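For readers who want the model itself, the classical Lotka-Volterra system for the rabbit (prey) population $x$ and fox (predator) population $y$ is:

```latex
\begin{aligned}
\frac{dx}{dt} &= \alpha x - \beta x y \\[4pt]
\frac{dy}{dt} &= \delta x y - \gamma y
\end{aligned}
```

Here $\alpha, \beta, \gamma, \delta$ are positive constants: rabbits reproduce at rate $\alpha$ and are eaten at a rate proportional to the number of encounters $xy$, while foxes starve at rate $\gamma$ and reproduce in proportion to the prey they catch. The solutions are closed orbits circling the equilibrium $(\gamma/\delta,\ \alpha/\beta)$, producing exactly the boom-and-bust cycles just described.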
Jared Diamond has outlined all the ways previous cultures have blundered into terminal decline in his book "Collapse". The book makes instructive reading for us today. But there are also wild cards that could have a huge impact on the world my grandchildren and great-grandchildren live in. One is CRISPR technology: our rapidly developing skills to modify the genes of all flora and fauna, to design new variations of all life forms including our own. This is surely the opening of Pandora's Box and, just as surely, its temptations are likely to overcome our scruples. This was said best by Oppenheimer: "When you see something that is technically sweet, you go ahead and do it and you argue about what to do about it only after you have had your technical success. That is the way it was with the atomic bomb." The other wild card is the possibility of settling in outer space, an option Dyson has written about. Both of these seem more realistic to me than Kurzweil's wild talk of "the singularity". No matter what transpires, the well known curse (some call it Chinese, some Russian, some Victorian English) "may you live in interesting times" seems to apply.
The blog above has been translated into Latvian by Arija Liepkalnietis from Riga and "AlphaCast". You can find it at this URL.
My vision colleague Alan Yuille wrote me:
Look at the Cambridge Centre for the Study of Existential Risk if you want real pessimism -- the link.

Yes, indeed, those guys wallow in it.
Since the above was written, I have received several skeptical emails about the possibility of continued population growth. My son Jeremy wrote:
... I felt that (your essay) gives short shrift to nostrums 1 and 2, which are really the same nostrum: urbanization everywhere seems to lead to a rapid drop in fertility to below replacement level. This is not necessarily due to a middle class life-style: it seems to hold true for urban rich, urban poor and everyone in between, in small cities and megacities, in a variety of different cultures. A 17th-century author observed that London's population depended on constant migration from the countryside, otherwise it would shrink since deaths outnumbered births. Most of world population growth right now comes from rural Africa, but Africa is urbanizing rapidly. Clearly world population right now is dangerously high and growing, but the demographic transition can lead to a much lower population in a few generations, without need for a massive die-off. This is not to write off overpopulation, but to me, other risks seem even more likely to destroy our species in the next century than overpopulation.
And a good friend, Andrew Love wrote:
What we know is that population growth rates have been declining (not quite as rapidly as birth rates, with which they are often confused) everywhere -- see your cited graph. And that for all except Africa, they are either at sub-zero population growth (ZPG) or, without heroic extrapolation, seem to be approaching ZPG. My college classmate and leading U.S. demographer, Joel Cohen (and no Pollyanna), told me just the other day that half the world is at ZPG or below. That's a datum with respect to which one might be encouraged to make predictions, since it lies comfortably in the past. And which only a true believer could simply ignore. Most striking (and a concept of quite general significance) is ... that averages and gross figures often conceal more than they reveal. In this case a little mental disaggregation tells us that population issues and policy implications are largely best understood as geographically and politically local, rather than worldwide. So, for Africa, largely a matter of birthrate and related issues. For the U.S., largely a matter of immigration. For Europe, native-born demographic collapse and immigration. And so on.
For general reference and caution re apocalyptic predictions I suggest as only a sample the dismal record of your parson Malthus, Paul Ehrlich (The Population Bomb, and sequel The Population Explosion, together with his infamous bet with Julian Simon), The Club of Rome, Dennis Meadows etc. (The Limits to Growth).
In any event the forces suppressing birthrates (and ultimately population growth) which seem to be associated with prosperity, improvement in health, education, advancement of women, technical improvement in birth control, agricultural improvements, etc. appear to be far more powerful than any particular and transitory policy initiative -- even such as China's one-child policy. And I shall hazard the prediction that China's current benign population trend will survive revocation of that policy.
I certainly agree that predictions for population are full of pitfalls. But I think the more than three-fold increase of population in my lifetime bears out a large part of Ehrlich's ideas. Where he went wrong was not to consider that science could more than double grain yields. His bad predictions also illustrate the weakness of crude differential equation modeling. But this weakness also applies to the UN predictions of future population scenarios. Checking the web, I find that the birthrate in India is still about 2.5 (above ZPG), in Pakistan about 3.25, and in China, while the one-child policy stood, it was 1.66, and it seems likely to spike now that the policy has been lifted. I just don't trust the current low birthrates in Europe, Russia and Japan to stay low. This has so much to do with fashion and the zeitgeist, optimism vs. pessimism. It should also be noted that the median age of humanity now is under 30, whereas if the population were stable and healthy, it would have to be near 40.
My biggest fear is not that the present population size couldn't be stabilized at some slightly higher level, but that managing a world that size requires reasonably rational governments to deal with the huge number of problems it creates (e.g. managing megacities with vast slums, need for new jobs, rising expectations for meat and consumer goods). And I don't see many countries with reasonably rational governments. For example, there is no plan to deal with the abysmally poor sanitation situation in India nor any plan for the serf-like classes created there by the caste system, e.g. the more than 150 million Dalits.
Larry Gonick (the author of multiple graphic educational texts, a former student and a collaborator on the book Indra's Pearls) wrote:
This post is pretty much in line with what I thought on election night: "It's the beginning of the end of the world." We certainly live in perilous economic and environmental times. You have to admit, though, that countries like India have maintained themselves fairly successfully under more difficult circumstances than those now faced by the United States. I still think that political leadership (and followership) will play a major role in affecting outcomes. Still, if your flow chart is correct, there's not much to discuss.
That's why I looked back at an earlier post, about Igor Shafarevich. Without knowing the particulars, I think you may be too generous to him. Maybe because of my Jewish forebears, I can't take at face value his particular association of his traditions with the land itself, "narrowly" construed. This association is a barely veiled threat against those in Russia who, like the Jews, were expressly forbidden to own land. One may also be permitted to doubt that he identifies himself with a wretched muzhik, lazily scratching out a bare, inefficient living between the snowdrifts while his master's whip decorates his back, and to suspect that he identifies rather with the aristocrat who owns that land, funds the church, etc.
I also don't know what to make of his dismissal of the undoubted discrimination against Jews in the top Russian math institutes. Frenkel’s "Love and Math" gives a first-hand account from the 1980s. When Shafarevich calls himself a "moderate" nationalist rather than an extremist, one has to remember that when moderate nationalists gain power, they enable or fail to restrain the extremists. We're about to find that out here, I'm afraid.
We all would like to preserve some semblance of the youthful environment we grew up in (bland, suburban, '50s Arizona, in my case), but this is quite different from adherence to a movement dedicated to preserving a "national identity" that is usually a modern construction and false in many respects—just like the "greatness" that you-know-who so airily promises. You're absolutely right that nativist movements are spreading everywhere, but my inclination is to understand them only to the extent that understanding helps to resist them successfully.
If you want to see a devastating account of how such a movement is playing out in Hungary, do read Susan Faludi's superb recent book, In the Darkroom. You won't regret it.
In connection with Larry's comments, I want to add that Fijavan Brenk has translated this post into Russian on her blog, as well as putting up the Shafarevich post in Hungarian here. As for whether and to what degree one might call Shafarevich "anti-Semitic", this is a question on which there will never be a consensus, so further words accomplish nothing. But I want to emphasize that that post was not meant to justify any or all nationalist movements. It was rather to describe my coming to better understand some of the emotional aspects of nationalism that affect an awful lot of people. Larry's childhood and mine seem to have shielded us to a large degree from these emotions. But the speed with which conditions of life everywhere are changing is making them awfully powerful world-wide.
My former student and colleague Prof. Song-Chun Zhu wrote me some pointed comments:
Human reproduction is a topic that people in academics dare not touch, as it is deemed politically incorrect: if a social scientist or policy maker tries to optimize any sort of collective utility function that makes sense for society (or the human race) as a whole, it immediately violates the basic civil rights of individuals.
But let's face it. Human reproduction is a key factor that defeats our immigration system and welfare system. Controlling the global population should be part of the solution to fighting global warming.
The Chinese government has just done this against strong criticism from the West. Most people inside China view birth control as a policy necessary for the environment and for improving people's quality of living. The birth rate in China has dropped drastically in the past two decades.
The left wing in the US has been largely inconsistent on such topics. They have overly emphasized social justice but totally ignored the boundary conditions of the economic equations. And we are about to reach such boundary conditions, i.e. the limits. This might explain why they lost the election.
What is a solution to this? If, following the left-wing suggestions, you use smaller cars, live green and minimize your living space, you only yield living capacity to other countries, or invite more people to migrate to your country. Do you remember a Ph.D. student, George, at the Harvard Robotics Lab in the early 1990s? The hard disk in the lab was shared among all students and postdocs. When George first came to the lab, the computer manager suddenly found all free space was gone. George had loaded gigabytes of junk files to occupy the disk, and then only he knew what to delete when he needed space to store his real data.
I know someone at UCLA. He once told me that he is the 12th child in his family. Then I asked him where he came from. He is a Palestinian living inside Israel. That immediately explained it all.
Terrorism: most of the terrorists in the Middle East are youths who have no jobs in the desert -- but think how many children each family produces. Bin Laden had so many brothers, wives and children.
Immigration: some people were discussing whether we should change the law that automatically grants citizenship to people born in the US. This law invites illegal immigration, as well as "anchor babies" (many pregnant women simply fly to the US to give birth).
Global warming: think about emission and pollution in China and India, primarily produced by the new middle class who desire to live an American-style life.
The equilibrium of this game is the disaster that you are pointing to, unless people change the rules of the game.
My close Indian friend for 50 years, Prof. Seshadri, wrote me these thoughtful comments:
Dear David
Population growth is indeed a very serious problem, but I can't say that it is the root cause. For me the tragedy is the very success of science, which we all admire. The industrial revolution led to impoverishment in India, and it must be the same in other colonies of Western powers. The success of Western medicine is also a reason for the population growth. On the other hand, one cannot say that Britain or the Western powers deliberately brought about the Industrial Revolution; I would call it an accident of history. However, since human nature is not going to change, I agree with you in being pessimistic about the future.
Seshadri
I wouldn't put the finger on science and medicine. Human dominance goes back to stone tools, harnessing fire, skinning animals for clothes, basically the fact that we have a bigger frontal cortex with which we plan, plan and plan some more. The discovery of electricity and microbes are just more recent events that have further enhanced our control of the world -- though not our wisdom.
I first encountered this idea in reading my colleague Phil Lieberman's excellent 1984 book "The Biology and Evolution of Language". Most of this book is devoted to the still controversial idea that Homo sapiens carries a mutation, lacking in Homo neanderthalensis, by which its airway above the larynx was lengthened and straightened, allowing the posterior side of the tongue to form the vowel sounds "ee", "ah", "oo" (i, a, u in standard IPA notation) and thus hugely increase the potential bit-rate of speech. If true, this suggests a clear story for the origin of language, consistent with evidence from the development of the rest of our culture. However, the part of his book that concerns the origin of syntax -- and in particular Chomsky's language organ hypothesis -- is the beginning, esp. chapter 3. His thesis here is:
"The hypothesis I shall develop is that the neural mechanisms that evolved to facilitate the automatization of motor control were preadated for rule-governed behavior, in particular for the syntax of human language."He proceeds to give what he calls "Grammars for Motor Activity", making clear how parse trees almost identical to those of language arise when decomposing actions into smaller and smaller parts. It is curious that these ideas are nowhere referenced in the paper of Hauser, Chomsky et al (Frontiers of Psychology, vol. 5, May 2014) that generated Wolfe's diatribe.
My research connected to the nature of syntax came from studying vision and taking admittedly somewhat controversial positions on the algorithms needed, especially those used for visual object recognition, both in computers and animals. In particular, I believe grammars are needed in parsing images into the patches where different objects are visible and that, moreover, just as faces are made up of eyes, nose and mouth, almost all objects are made up of a structured group of component smaller objects. The set of all objects identified in an image then forms a parse tree similar to those of language grammars. Likewise almost any completed action is made up of smaller actions, compatibly sequenced and grouped into sub-actions. The idea in all cases is that the complete utterance resp. complete image resp. complete action carries many parts, some parts being part of other parts. Taking inclusion as a basic relation, we get a tree of parts with the whole thing at the root of the tree and the smallest constituents at its leaves (computer scientists prefer to visualize their "trees" upside-down with the root at the top, leaves at the bottom, as is usual also for "parse trees"). But at the same time, each part might have been a constituent of a different tree making a different whole, and any part can be replaced by others making a possible new whole -- i.e. parts are interchangeable within limits set by constraints which apply to all trees with these parts. There is a very large set of potential parts, and each whole utterance resp. image resp. action is built up, Lego-like, from small parts put together respecting various rules into larger ones, continuing up to the whole. Summarizing, all these data structures are hierarchical, made up of interchangeable parts and subject to constraints of varying complexity. I believe that any structure of this type should be called a grammar.
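These part-whole trees are easy to make concrete in a few lines of code. The following is only a toy sketch of my own (the class, the phrase types and the sample sentence are invented for illustration, not drawn from any actual vision or language system): a node carries a type and a list of child parts, and a part may be swapped only for another part of the same type -- which is exactly the interchangeability-under-constraints just described.

```python
# A minimal part-whole tree: each node has a "type" (e.g. "NP", "PP")
# and either a list of child parts or, at a leaf, terminal content.
class Part:
    def __init__(self, ptype, children=None, content=None):
        self.ptype = ptype
        self.children = children or []
        self.content = content

    def leaves(self):
        # Collect the terminal content, left to right.
        if self.content is not None:
            return [self.content]
        return [w for c in self.children for w in c.leaves()]

    def substitute(self, old, new):
        # Interchangeability: a part may be replaced only by a part
        # of the same type -- the constraint all trees must respect.
        if new.ptype != old.ptype:
            raise ValueError("parts are interchangeable only within a type")
        self.children = [new if c is old else c for c in self.children]

# "make a cake for Margaret": the prepositional phrase is one part...
pp = Part("PP", [Part("P", content="for"), Part("NP", content="Margaret")])
vp = Part("VP", [Part("V", content="make"), Part("NP", content="a cake"), pp])
print(" ".join(vp.leaves()))   # -> make a cake for Margaret

# ...and can be swapped for any other PP, giving a new well-formed whole.
pp2 = Part("PP", [Part("P", content="for"), Part("NP", content="Jill")])
vp.substitute(pp, pp2)
print(" ".join(vp.leaves()))   # -> make a cake for Jill
```

The same skeleton serves equally for image patches or sub-actions in place of words; only the types and constraints change.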
Here are some examples taken from my talk at a vision workshop in Miami in 2009. Let me start with examples from languages. Remember from your school lessons that an English sentence is made up of a subject, verb and object and that there are modifying adjectives, adverbs, clauses, etc. Here is the parse of an utterance of a very verbal toddler (from the classic paper "What a two-and-a-half-year-old child said in one day", L. Haggerty, J. Genetic Psychology, 1929, p.75): Here we have two classical parse trees plus a question mark for the implied but not spoken subject of the second sentence plus two links between non-adjacent words that are also syntactically connected. The idea of interchangeability is illustrated by the words "for Margaret", a part that can be put in infinitely many other sentences, a part of type "prepositional phrase". The top dotted line is there because the word "cake" must agree in number with the word "it". For instance, if Margaret had said she wanted to make cookies, she would need to say "them" in the second sentence (although such grammatical precision may not have been available to Margaret at that age). A classic example of distant agreement, here between words in one sentence with three embedded clauses, is "Which problem/problems did you say your professor said she thought was/were unsolvable?" Chomsky has used such examples to argue for transformational grammars. This is not unreasonable, but we will argue that identical issues occur in vision, so the neural skills for obeying these constraints must be more primitive and cortically widespread.
In other languages, the parts that are grouped almost never need to be adjacent and agreement is typically between distant parts; e.g. in Virgil we find the Latin sentence
"Ultima Cumaei venit iam carminis aetas", which translates word-for-word as "last of-Cumaea has-arrived now of-song age" or, re-arranging the order as dictated by the disambiguating suffixes: "The last age of the Cumaean song has now arrived". Thus the noun phrase "last age" is made up of the first and last words, the genitive clause "of the Cumaean song" is the second and fifth words, while the verb phrase "has now arrived" is in the very middle. The subject is made up of four words, numbers 1, 2, 5 and 6. So word order is a superficial aspect, but there is still an underlying set of parts, distinguished by case and gender, that are interchangeable with other possible parts, these parts altogether forming a tree (the "deep" structure). In other languages, e.g. Sanskrit, words themselves are compound groups, made by fusing simpler words with elaborate rules that systematically change phonemes, as detailed in Panini's famous c.300 BCE grammar. Here the parse tree leaves can be syllables of the compound words.
It was a real eye-opener to me when it became evident that images, just like sentences, are naturally described by parse trees. For a full development of this theory, see my paper "A Stochastic Grammar of Images" with Song-Chun Zhu. The biggest difference with language grammars is that in images there is no linear order between parts; moreover, when one object partly occludes another, two non-adjacent patches of an image may be parts of one object, with a hidden patch being inferred. Here is an example, courtesy of Song-Chun Zhu, of the sort of parse tree that a simple image leads to: The football match image is at the top, the root. Below this, it is broken into three main objects -- the foreground person, the field and the stadium. These in turn are made up of parts, and this would go on to smaller pieces except that the tree has been truncated. The ultimate leaves, the visual analogs of phonemes, are the tiny patches (e.g. 3 by 3 or somewhat bigger sets of pixels) which, it turns out, are overwhelmingly either uniform, show edges, show bars or show "blobs". This emerges both from statistical analysis and from the neurophysiology of primary visual cortex (V1).
Grammatical constraints are present whenever objects break up into parts whose relative position and size are almost always constrained so as to follow a "template". The early twentieth-century school known as gestalt psychology worked out more complex rules of the grammar of images (although not, of course, using this terminology). They showed, for example, the way that symmetry and consistent orientation of lines and curves create groupings of non-adjacent patches, and demonstrated how powerfully hidden patches, hidden by occlusion, were inferred by subjects. Here is a simple image in which the occluded parts have been added to the parse tree: The blue lines indicate adjacency, solid black arrows are inclusion of one part in another and dotted arrows point to a hidden part. Thus H1 and H2 are the head, separated into the part occluding sky and the part occluding the field, and joined into the larger part H. S is the sky, while VS is the visible part of the sky and H1 conceals an invisible part. Similarly for the field F and VF. The man M is made up of the head H and torso T. One can even make an example analogous to the above sentence concerning the professor's unsolvable problem, in which a chain of partially occluded objects acts similarly to the chain of embedded clauses: Here we have trees to the left of the chief, whose left arm occludes the teepee that occludes the reappearing trees -- whose color must match that of the first trees.
Returning to motor actions and the formation of plans of action, it is evident that actions and plans are hierarchical. Just take the elementary school exercise -- write down the steps required to make a peanut butter sandwich. No matter what the child writes, you can subdivide the action further, e.g. not "walk to the refrigerator" but first locate it, then estimate its distance, then take a set of steps checking for obstacles to be avoided, then reach for the handle, etc. The student can't win because there is so much detail that we take for granted! Clearly actions are made up of interchangeable parts, and clearly they must be assembled so as to satisfy many constraints, some simple, like the next action beginning where the previous left off, and some subtler.
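The sandwich exercise itself can be mimicked mechanically. The sketch below is my own invented illustration (the action names and the expansion table are hypothetical, not from any planning system): a table sends each non-primitive action to its ordered sub-actions, and recursive expansion yields exactly the parse tree of an action, with whatever we choose not to expand as its leaves.

```python
# Hypothetical action grammar: each non-primitive action expands into an
# ordered list of sub-actions; anything absent from the table is a leaf.
GRAMMAR = {
    "make sandwich": ["get ingredients", "spread peanut butter", "close sandwich"],
    "get ingredients": ["walk to refrigerator", "take out jar", "take out bread"],
    "walk to refrigerator": ["locate refrigerator", "estimate distance",
                             "step while avoiding obstacles", "reach for handle"],
}

def expand(action):
    """Return the parse tree of an action as nested lists."""
    subs = GRAMMAR.get(action)
    if subs is None:               # primitive action: a leaf of the tree
        return action
    return [action, [expand(s) for s in subs]]

def leaves(tree):
    """The fully expanded sequence of primitive steps, in order."""
    if isinstance(tree, str):
        return [tree]
    return [leaf for sub in tree[1] for leaf in leaves(sub)]

plan = expand("make sandwich")
print(leaves(plan))
```

The student "can't win" because every leaf here could itself be given an entry in the table and subdivided further; the constraint that each action begin where the previous one ended would be an extra check comparing end-states to start-states along the leaf sequence.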
The grammars of actions are complicated, however, by two extra factors: causality and multiple agents. Some actions cause other things to happen, a twist not present in the parse trees of speech and images. Judea Pearl has written extensively on the mathematics of the relation of causality and correlation and on a different sort of graph, his Bayesian networks and causal trees. Moreover, many actions involve or require more than one person. A key example for human evolution is that of hunting. It is quite remarkable that Everett describes how the Piraha use a very reduced form of their language based on whistling when hunting. From the standpoint of the mental representation of the grammar of actions, a third complication is the use of these grammars in making plans for future actions. An example where some of the many expansions of one plan are shown is:
To summarize, I believe that any animal that can use its eyes to develop mental representations of the world around it or can carry out complex actions involving multiple steps must develop cortical mechanisms for using grammars. This includes all mammals and certain other species, e.g. octopuses and many birds. These grammars involve a mental representation of trees built from interchangeable parts and satisfying large numbers of constraints. Language and sophisticated planning may well be unique to humans, but grammar is a much more widely shared skill. How this is realized, e.g. in mammalian cortex, is a major question, one of the most fundamental in the still early unraveling of how our brains work.
The blog above has been translated into Estonian by Sonja Kulmala in Tartu: here's the link. It's certainly interesting to contrast the grammars in synthetic agglutinative languages like Estonian and Turkish with analytic languages like Chinese.
After reading the above, Prof. Shiva Shankar drew my attention to the following that appears on cover of Frits Staal's book Ritual and Mantras, Rules without Meaning:
An original study of ritual and mantras which shows that rites lead a life of their own, unaffected by religion or society. In its analysis of Vedic ritual, it uses methods inspired by logic, linguistics, anthropology and Asian studies. New insights are offered into various topics including music, bird song and the origin of language. The discussion culminates in a proposal for a new human science that challenges the current dogma of 'the two cultures' of sciences and humanities.

He seems to be saying that rituals and mantras are an embodiment of a purely abstract grammar, expressed in both action and speech with a minimum of semantic baggage.
I met Shafarevich in 1962 at the Stockholm International Congress of Mathematicians. I spent an evening drinking a bit more vodka than was good for me with Shafarevich and Manin. I met them next in 1979 in Moscow, neither having been allowed to travel to the West in the interim. (I recall Manin having a desk with a glass top under which he had kept all the many invitations he had been forced to decline.) But in the meantime, in spite of being so isolated, Shafarevich had built in Moscow one of the best groups of mathematicians working on the synergistic fusion of algebraic geometry with algebraic number theory. He has a strong personality, is a wonderful teacher and is also quite religious (Eastern Orthodox). In addition he has thought deeply about social science and how history molds the character of a country. He is now 93 and I am writing this blog wishing him well in these difficult times that, I think, have made his views important to revisit. Here is a quote from the last section of his essay "Russophobia" (p.29 of this pdf) that provoked the 1992 controversy:
A thousand years of history have forged such national character traits as a belief that the destiny of the individual and the destinies of the people are inseparable in their deepest underlying layers and, at fateful moments of history, are merged; and such traits as a bond with the land—the land in the narrow sense of the word, which grows grain, and the Russian land. These traits have helped it endure terrible trials and to live and work under conditions that have at times been almost inhuman. All hope for our future lies in this ancient tradition. ...
....... We most likely are dealing here with a phenomenon to which present-day science's standard methods of "understanding" are completely inapplicable. It is easier to point out why individual people need peoples. Belonging to his people makes a person a participant in History and privy to the mysteries of the past and future. He can feel himself to be more than a particle of the "living matter" that is for some reason turned out by the gigantic factory of Nature. He is capable of feeling (usually subconsciously) the significance and lofty meaningfulness of humanity's earthly existence and his own role in it. Analogous to the "biological environment," the people is a person's "social environment": a marvelous creation supported and created by our actions, but not by our designs. In many respects it surpasses the capacity of our understanding, but it is also often touchingly defenseless in the face of our thoughtless interference. One can look at History as a two-sided process of interaction between the individual and his "social environment"— the people. We have said what the people gives the individual. For his part, the individual creates the forces that bind the people together and ensure its existence: language, folklore, art, and the recognition of its historical destiny.
These words seem both romantic and an expression of the core of conservative appeals to preserve a country's traditions and cohesiveness, an appeal that we now hear around the world. The bulk of "Russophobia" is an attack on writers who, he believes, have denigrated the Russian "people" and who claim that the Russian people's salvation lies in replacing native Russian values with Western liberal and internationally oriented ideas. Naturally enough, some of these writers are Jews, hence his being called anti-Semitic for writing this essay. This seems quite ironic to me, as the whole rationale for the state of Israel has been the restoration of Jewish traditions, language and religion, a homeland free of outside coercion. So his Jewish critics might have seen some parallels with their own aspirations rather than reading the essay as advocating a return to the days of pogroms. His letter to me, reproduced below, responds directly to some of the criticisms that he received.
However, my own upbringing and beliefs have always leaned towards these Western liberal values, so his essay has forced me to revisit my own biases. My own upbringing was in an international multi-cultural setting. My father had a PhD in anthropology and had started a school in Tanzania based on integrating tribes and teaching them basic technology and hygiene that they could bring back to their villages. Later he worked in the U.N. and invited home friends from many nations. It has always seemed an axiom to me that the world would gradually become one, each culture sharing its values with others and accepting the others' differences. How naive of me to expect anything so simple! Conflicts were far away from our sheltered privileged neighborhood. The woes of the great depression were nowhere to be seen, the devastation of Hiroshima was a world away and I could blithely recite the Apostle's Creed when our neighborhood was not divided into religious ghettos.
Math is the most international profession so travel should have opened my eyes a bit: in 1963 I saw the still devastated Hiroshima with my own eyes; in 1967 I spent 2 weeks in Israel obeying the Torah with separate milk/meat meals; and later in 1967/68 I lived side by side with the highly visible poverty of third world Bombay. What I failed to fully appreciate was the passion with which Japan, Israel and India were all driven by their intact -- and strongly exclusive -- cultures. I was never called a "gaijin" (a strange foreigner) in the largely closed society of Japan though I'm sure that is how I was seen. I saw the contrast between the brown earth on the "West Bank" and the green irrigated land in Israel but not the absence of any trust between the Palestinian and Jewish peoples. I saw people living in the streets and cleaning our apartment with rags in Bombay but not their label of "untouchable" or "dalit". Little did I know how strong Hindu culture is (though my wife, in love with Hindu myths, was enlightened by André Weil that she could not convert and the best she could hope for was to be born a dalit in her next life).
As I see it now, there is a major conflict, not to be papered over, between the tolerant international liberal viewpoint and the passion with which each culture maintains its traditions and passes them on generation after generation. I grew up completely committed to the former, and my whole life working freely with colleagues from every part of the world reinforced this. But now I hear and read more and more voices that say "not so fast": our culture, our jobs, our very identities are vanishing. The rapidity with which technology is advancing and the immense growth of international wealth, private and corporate, support only the "one per cent" and the educated with ties to multiple countries. Moreover, the ever expanding population of refugees relentlessly aggravates the conflict. Every country's unique identity is threatened by these forces, and every country has plenty of right wing politicians riding the reaction to them.
I don't believe there is any simple right or wrong here. Much of the problem is due to the rapidity of change now: everyone's lifetime is long enough for them to see whole livelihoods and communities disappear. It makes no sense to demonize either side. This is the core issue in the US election this year: Clinton represented the liberal "politically correct" internationalist standpoint and promised merely to fine-tune the hurricane of change; Trump wildly asserts he can restore a strong and prosperous America with mid-twentieth century values without giving a hint of how he intends to do this. So Igor, wherever you are, I now look back on your essay "Russophobia" with more sympathy. I still feel that the persecution of Dalits in India and the apartheid of Palestinians in their homeland are evil, but I also see how powerful everyone's link to their traditions is. As you said, "individual people need peoples ... (their) "social environment": a marvelous creation supported and created by our actions". I am now living in a small town in Maine which, I suspect, embodies quite a bit of traditional American values. But even this is threatened by many outside forces: the warming of fishing grounds, the closure of the shore by one-percenters. Who knows whether this town will still be the same when my grandchildren grow up?
Nov.4, 1992
Dear Mumford,
Thank you for your friendly letter. Of course it is hopeless to explain "where I stand" in 1 or 2 pages but I will try to say what I can. Certainly the slogans of patriotism can lead to bad things, but I don't know what slogans can't. You know probably what were the consequences of the slogans "egalite, fraternite, liberte" during "la terreur" and how the idea of "God's own country" became a warrant for the genocide of North-American Indians. I do not see a danger of such tendencies in the movement of mild national flavor to which I belong. Of course, there is the famous "Pamyat" but it is (a) completely isolated, (b) extremely scanty, (c) without any influence at all in this country and (d) probably created exactly to draw a picture of "russian fascism" (but here I am not certain). I was interested to read about my participation in "political rallies where others have explicitly called for 'cleansing' the government of all Jews, the violent removal of Yeltsin and the re-conquest of the former Soviet Union". I never heard such appeals. Of course Yeltsin is a disaster but the common idea is to remove him by constitutional means which is quite possible and even probable if only he himself will not break the Constitution. Indeed it was exactly he who proclaimed the idea to "disperse the parliament". The idea to "re-conquest" the Soviet Union would be stupid if not insane. However, many people, including me, hope the country will re-unite in its principal parts -- simply because the people will see what a tragedy its disruption brings. The lies that are written about me would be not very important. But it is really dangerous if your media are feeding you with information of the same quality on more important subjects. In our country this is exactly the case.
But I think one has to say truly that all fuss about me was provoked by what I wrote about Russian-Jewish relations. The subject is painful but it is never good to avoid difficult situations pretending they do not exist. I tried to write with greatest restraint. Some people say that what I have written may be correct but it can give rise to anger and violence. I do not believe this is probable. But what is the logic of my opponents? My paper is composed mainly of quotations. Why do they not address their appeals to people who write or publish such things that even a quotation from them can provoke violence? But what I have read about myself in American newspapers is beyond any logic. The foreign secretary of the NAS accused me of interfering in the careers of young Jewish mathematicians and preventing them from publishing their papers. Probably such accusations are punishable by court! In reality I have taken many troubles to help my students of Jewish (or partly Jewish) origin -- such as Golod or Manin -- in their careers. Not, of course, because of their origin -- I tried to do the same for all my students. The President of the NAS even makes me responsible for the policy of the Steklov Institute, while Arnold is in the same Institute and Fadeev is even its vice-director, both foreign members of the NAS. Novikov is head of a department there. Are all of them responsible? I also read how I advocated on television the views of "Pamyat" while I did not even mention the name. Formerly I believed that the novel of M. Twain about his attempt to be elected a governor was a parody and a vast exaggeration. Now I think it is a rather accurate description of American life. However I received many letters of support from the US and this comforts me.
With best wishes,
Shafarevich
The above post has been translated into Hungarian and posted by Fijavan Brenk here. This is great but I just want to add that I was not trying to argue for or against Nationalistic movements in the above. I just wanted to say that, as an old man, I have come to understand them better. Personally, I am still an internationalist through relatives, friends and colleagues.
A word of apology before I get started: much of what I want to say is understandable to non-mathematicians, but, in order to make my case, I need to cite many specific mathematicians and mathematical results that are only clear to fellow mathematicians. I have included some background to make the ideas clearer to non-mathematicians but this is an uneasy compromise.
I think one can make a case for dividing mathematicians into several tribes depending on what most strongly drives them into their esoteric world. I like to call these tribes explorers, alchemists, wrestlers and detectives. Of course, many mathematicians move between tribes and some results are not cleanly the property of one tribe.
Explorers: I want to give examples for each tribe of specific beautiful results and specific people I have known and interacted with in this tribe. Arguably the archetypal discovery by explorers was the ancient Greek list of the five Platonic solids: the only 'regular' convex polyhedra (meaning that any face and vertex on that face can be carried to any other such face, vertex pair by a rotation of the polyhedron). This discovery is sometimes attributed to Theaetetus, is described by Plato in the Timaeus dialog and worked out in detail in Euclid's Elements. I find it curious that nowhere, to my knowledge, is an icosahedron or a dodecahedron ever described in Indian or Chinese writings prior to the 17th century merging of their mathematical traditions with those of the West. Enlarging the mathematical universe from three dimensions to higher dimensions started a gold rush for explorers. In the 19th century, the Swiss mathematician Ludwig Schläfli extended the Greek list to regular polytopes in n dimensions, finding 6 in four-dimensional space but only 3 in all higher-dimensional spaces. In the 20th century, exploring all possible low-dimensional manifolds (in their topological, piecewise-linear and differentiable forms) has been a major focus. I knew my contemporary Bill Thurston fairly well and he seems to me to have been clearly a member of the explorer tribe. He was a fantastic topologist, and it was especially intriguing to me that he was born cross-eyed, so that his understanding of the 3D world was forced to depend more on parietal areas and hand-eye coordination than on occipital cortex, stereo-based learning. I never met anyone with anything close to his skill in visualization (except perhaps for H. S. M. Coxeter).
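The Greek classification can in fact be recovered from a single counting fact: for a convex regular polyhedron with p-gonal faces, q of them meeting at each vertex, the angles at a vertex must sum to less than 360°, which forces \( 1/p + 1/q > 1/2 \). A minimal sketch of the enumeration (my illustration, not part of the original text):

```python
# Enumerate Schlafli symbols {p, q} of convex regular polyhedra:
# p-gonal faces, q meeting at each vertex. Convexity forces the
# angle-defect condition 1/p + 1/q > 1/2.
from fractions import Fraction

platonic = [(p, q) for p in range(3, 20) for q in range(3, 20)
            if Fraction(1, p) + Fraction(1, q) > Fraction(1, 2)]

names = {(3, 3): "tetrahedron", (4, 3): "cube", (3, 4): "octahedron",
         (5, 3): "dodecahedron", (3, 5): "icosahedron"}
for p, q in platonic:
    print(f"{{{p},{q}}}: {names[(p, q)]}")  # exactly the five Platonic solids
```

The search range (3 to 19) is generous: the inequality already fails for all larger p and q.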
But explorers are not all geometers: the list of finite simple groups is surely one of the most beautiful and striking discoveries of the 20th century. Although he is not a card-carrying explorer, having devoted much of his career to detective work, in the second half of his career Michael Artin discovered an amazingly rich world of non-commutative rings lying in the middle ground between the almost commutative area and the truly huge free rings. "Rings" are sets of things that can be added and multiplied, but here he allows \( x\cdot y \ne y \cdot x \). He really set foot on a continent where no one had a clue what might be found: this exploration is ongoing. And then there is that most peculiar, almost theological world of 'higher infinities' that the explorations of set theorists have revealed.
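The non-commutativity \( x\cdot y \ne y \cdot x \) allowed here is already visible in the most familiar non-commutative ring, the 2×2 matrices; a minimal sketch (my toy example, not Artin's):

```python
# In the ring of 2x2 matrices, multiplication is not commutative:
# x*y and y*x generally differ, the defining feature of the
# non-commutative rings discussed above.

def matmul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

x = [[0, 1], [0, 0]]
y = [[0, 0], [1, 0]]
xy = matmul(x, y)  # [[1, 0], [0, 0]]
yx = matmul(y, x)  # [[0, 0], [0, 1]]
```

Here x and y are the two "shift" matrices; their products in the two orders are visibly different.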
My own career has been centered in the mapper sub-tribe. My maps are called moduli spaces of varieties (finite-dimensional objects) and moduli spaces of sub-manifolds of Euclidean spaces (infinite-dimensional objects). But one can make the case that the earliest members of the explorer tribe, even the earliest mathematicians, were literally mappers. I have in mind the story told by cuneiform surveying tablets. The earliest organized states in the world confronted the tasks of keeping track of land ownership and of taxing farmers. We are lucky to have a vast collection of Mesopotamian tablets from the late third millennium to the mid first millennium BCE. Many of these tablets contain idealized maps of land or of geometric constructions stimulated by surveying tasks. It seems fairly clear that the scribes who wrote these tablets went on to discover much of the geometric algebra, Pythagoras's rule and the quadratic equation as a result of being presented with practical land use and accounting challenges. They had no interest in questions of proof, only in algorithms related to measuring the earth, its distances and areas, (which they called the wisdom of the goddess Nisaba with her rope and measuring reed).
The Atiyah-Zeki list has very few results of explorers, perhaps because their results are not usually expressed by formulas. However, it contains three gems: #12, the Mandelbrot set; #15, an integer expressible two ways as a sum of two cubes, famous because Ramanujan told it to Hardy; and #28, the (3,4,5) right triangle. Another set of formulas, selected this time by ten mathematical scientists, is the Concinnitas project of Bob Feldman and Dan Rockmore, a portfolio of ten aquatints (the link is to my post and contains the formulas). It contains a gem from the short list of finite simple groups, here the groups discovered by Rimhak Ree. I would like to add that some of the things that gave me the most pleasure in my own research were discovering unusual, previously unknown geometric objects: one was a negatively curved algebraic surface whose homology was the same as that of the positively curved \( \mathbb{P}^2 \).
Alchemists: For many people, the most wonderful results in mathematics are those that reveal a deep relationship between two very distant subjects, for instance a link between algebra and geometry, algebra and analysis or geometry and analysis. Such links suggest that the world has a hidden unity, previously concealed from our mortal eyes but blindingly beautiful if we stumble upon it. An early example of such a link is the connection of the geometric problem of trisecting an angle and the algebraic problem of solving cubic polynomial equations. The first was one of the major unsolved problems of the ancient Greek tradition. In the Renaissance, Italian algebraists found a mysterious formula for the roots of a cubic polynomial. But in the case where all three roots are real, their formula led to complex numbers and cube roots of such numbers. The French mathematician Viète was the 'alchemist' who made the link c.1593: he showed how, if you can trisect angles, you can solve these cubic equations and vice versa. It wasn't until the early 18th century, however, that another Frenchman, Abraham De Moivre, really explained the result with his formula $$ (\cos\theta+i\sin\theta)^n = \cos n\theta+i\sin n\theta.$$ This is surely alchemy. But I would classify the leading mathematicians of the 18th and early 19th century, Leonhard Euler from Switzerland and Carl Friedrich Gauss from Germany, as the 'strip miners' who showed how two dimensional geometry lay behind the algebra of complex numbers. Euler's form of De Moivre's formula appears as #5 (and #1) of the Atiyah-Zeki list.
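Viète's link can be sketched in modern notation he did not have. Start from the triple-angle identity $$ \cos 3\theta = 4\cos^3\theta - 3\cos\theta. $$ Given a depressed cubic \( x^3 + px + q = 0 \) with three real roots (which forces \( p < 0 \)), the substitution \( x = 2\sqrt{-p/3}\,\cos\theta \) collapses the cubic onto the identity, leaving $$ \cos 3\theta = \frac{3q}{2p}\sqrt{\frac{-3}{p}}, $$ so each root is obtained by trisecting an angle whose cosine is known; running the substitution backwards turns any trisection problem into such a cubic.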
My PhD advisor Oscar Zariski was surely an alchemist. His deepest work was showing how the tools of commutative algebra, that had been developed by straight algebraists, had major geometric meaning and could be used to solve some of the most vexing issues of the Italian school of algebraic geometry. More specifically, the algebraic notions of integral closure and of valuation rings were shown to relate to geometry in Zariski's 'Main theorem' and his work on resolving singularities. He used to say that the best work was not that proving new theorems but that creating new techniques that could be used again and again.
The famous Riemann-Roch theorem has been an especially rich source of alchemy. It was from the beginning a link between complex analysis and the geometry of algebraic curves. It was extended by pure algebra to characteristic p, then generalized to higher dimensions by Fritz Hirzebruch using the latest tools of algebraic topology. Then Michael Atiyah and Isadore Singer linked it to general systems of elliptic partial differential equations, thus connecting analysis, topology and geometry at one fell swoop. Out of modesty, Atiyah did not include this in his list but he did put in its special case, the Hirzebruch signature formula, in his aquatint in the Feldman-Rockmore project. These aquatints also include the Dyson-MacDonald combinatorial formula for \( \tau(n) \), numbers which come from complex analysis: surely alchemy. Finally, a most bizarre formula for \( 1/\pi \) appears as formula #14 in the Atiyah-Zeki list. I suspect this was included by the authors because they suspected that many would think it ugly. I have no idea where it comes from but whoever found it belongs to the sub-tribe of Baroque Alchemists. It stands in contrast to the much simpler but nonetheless alchemical formula #30 for \( \pi \).
Wrestlers: Wrestling goes back to Archimedes: he loved estimating \( \pi \) and concocting gigantic numbers. The very large and very small have always held a fascination for wrestlers. Calculus stems from the work of Newton and Leibniz, and in Leibniz's approach depends on distinguishing the size of infinitesimals from the size of their squares, which are infinitely smaller. A laissez-faire attitude towards infinities and infinitesimals dominated the 18th century, resulting in alchemy gone amok, as in Euler's really strange formulas: $$ \frac{1}{2} = 1-1+1-1+1- \cdots, \quad \frac{1}{4}=1-2+3-4+5-\cdots$$ Of course Euler knew these only made sense when viewed in a very special way and he himself had not gone crazy. In fact, many might say the above are very beautiful formulas. A notable, much more understandable achievement of wrestlers in that century was Stirling's formula for the approximate size of n! (#41 in the Atiyah-Zeki list). The modern father of the wrestling tribe in the 19th century should be the Frenchman Augustin-Louis Cauchy, who finally made calculus rigorous. His eponymous inequality, that the absolute value of the dot product of 2 vectors is at most the product of their lengths, $$ |(\vec x \cdot \vec y)| \le \|\vec x\|\cdot \|\vec y\| $$ remains the single most important inequality in math. Atiyah-Zeki include the related triangle inequality as #25.
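Stirling's formula, \( n! \approx \sqrt{2\pi n}\,(n/e)^n \), can be watched converging; a small numerical sketch (my illustration, not from the list itself):

```python
# Watch the ratio n! / (sqrt(2*pi*n) * (n/e)^n) approach 1,
# illustrating Stirling's formula for the size of n!.
import math

def stirling(n):
    """Stirling's approximation to n factorial."""
    return math.sqrt(2 * math.pi * n) * (n / math.e) ** n

ratios = [math.factorial(n) / stirling(n) for n in (1, 5, 10, 50, 100)]
# The ratio decreases monotonically toward 1; the leading error
# term is of size 1/(12n), so at n = 100 it is within 0.1% of 1.
```

A wrestler's point: the formula is an approximation, but one whose error can itself be estimated and controlled.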
I was not trained as a wrestler but I, at least, had a small education later because of my work in applied math. I did fall in love with the wonderful inequalities of the Russian analyst Sergei Sobolev. The simplest of these illustrates what many contemporary wrestlers deal with: say f(x) is a smooth function on the real line. Then for all a,b, one has the simple corollary of Cauchy's inequality: $$ |f(b)-f(a)|^2 \le |b-a| \cdot \textstyle\int ( \tfrac{df}{dx} )^2 dx.$$ Thus one says that a square integral bound on the derivative "controls" its pointwise values. When I was teaching algebraic geometry at Harvard, we used to think of the NYU Courant Institute analysts as the macho guys on the scene, all wrestlers. I have heard that conversely they used the phrase 'French pastry' to describe the abstract approach that had leapt the Atlantic from Paris to Harvard.
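The Sobolev bound really is a one-line corollary: write the difference as an integral and apply Cauchy's inequality to the pair of functions \( 1 \) and \( df/dx \): $$ |f(b)-f(a)| = \Big| \int_a^b 1 \cdot \frac{df}{dx}\, dx \Big| \le \Big( \int_a^b 1^2\, dx \Big)^{1/2} \Big( \int_a^b \big( \tfrac{df}{dx} \big)^2 dx \Big)^{1/2} = |b-a|^{1/2} \Big( \int_a^b \big( \tfrac{df}{dx} \big)^2 dx \Big)^{1/2}, $$ and squaring both sides gives the stated inequality.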
Besides the Courant crowd, Shing-Tung Yau is the most amazing wrestler I have talked to. At one time, he showed me a quick derivation of inequalities I had sweated blood over, and he told me that mastering this skill was one of the big steps in his graduate education. It's crucial to realize that outside pure math, inequalities are central in economics, computer science, statistics, game theory, and operations research. Perhaps the obsession with equalities is an aberration unique to pure math while most of the real world runs on inequalities.
Other examples of wrestler's work in the Atiyah-Zeki list are #11 (Cantor's inequality); #26 (the prime number theorem); and #38 (convexity of logs).
Detectives: Andrew Wiles said he worked on Fermat's claim that \( x^n + y^n = z^n \) has no positive integer solutions if \( n \ge 3 \) obsessively for eight years, describing the work as follows (in this PBS interview ):
I used to come up to my study, and start trying to find patterns. I tried doing calculations which explain some little piece of mathematics. I tried to fit it in with some previous broad conceptual understanding of some part of mathematics that would clarify the particular problem I was thinking about. Sometimes that would involve going and looking it up in a book to see how it's done there. Sometimes it was a question of modifying things a bit, doing a little extra calculation. And sometimes I realized that nothing that had ever been done before was any use at all. Then I just had to find something completely new; it's a mystery where that comes from. I carried this problem around in my head basically the whole time. I would wake up with it first thing in the morning, I would be thinking about it all day, and I would be thinking about it when I went to sleep. Without distraction, I would have the same thing going round and round in my mind. The only way I could relax was when I was with my children. Young children simply aren't interested in Fermat. They just want to hear a story and they're not going to let you do anything else.

Although this is extreme, this sort of pursuit is well known to all mathematicians. The English mathematical physicist Roger Penrose once described his way of working similarly: "My own way of thinking is to ponder long and, I hope, deeply on problems and for a long time ... and I never really let them go." In many ways this is the public's standard idea of what a mathematician does: seek clues, pursue a trail, often hitting dead ends, all in pursuit of a proof of the big theorem. But I think it's more correct to say this is one way of doing math, one style. Many are leery of getting trapped in a quest that they may never fulfill. Peter Sarnak at the Princeton Institute for Advanced Study has described what it feels like to be a research mathematician with the sentence "The steady state of a mathematician is to be blocked".
Arguably Landon Clay may have done math no service by singling out seven of the deepest, most difficult math problems and putting a million-dollar bounty on each. Putting a dollar value on a proof is quite bizarre, and the prize was declined by Grigori Perelman, the only winner in this contest so far. In any case, I believe it is more common among mathematicians to become intimately familiar with a range of related problems while not necessarily actively working on any of them. But these problems are not far from their consciousness, and from time to time a clue will show up, a hint of some connection, and then it all rushes back and hopefully some progress is made on one of the problems.
Among those who attack major problems, a very small number are able to imagine a deeper, more abstract layer of meaning in the problems of the day that others never imagined. They are detectives who feel the answer is deeply hidden, so you need to strip away all the features of the situation that are accidental and thus irrelevant to understanding it. Underneath you find its true mechanisms, what makes it tick. It seems only logical to call such people strip miners, though not in a pejorative sense. The greatest practitioner of this philosophy in the 20th century was Alexander Grothendieck. Of all the mathematicians that I have met, he was the one whom I would unreservedly call a "genius". But there have been others before him.
I consider Eudoxus and his spiritual successor Archimedes to be strip miners. The level they reached was essentially that of a rigorous theory of real numbers, with which they were able to calculate many specific integrals. Book V of Euclid's Elements and Archimedes' The Method of Mechanical Theorems testify to how deeply they dug. Some centuries later and quite independently, Aryabhata in India reached a similar level, now finding what are essentially derivatives and fitting them into specific differential equations. But it is impossible to fully document the achievements of either of these mathematicians, as only fragments of their work survive and there is no way to reconstruct much of the mathematical world in which they worked, the context for their discoveries. Grothendieck's ideas, however, and the mathematical world both before and after his work, are very clearly documented. He considered that the real work in solving a mathematical problem was to find le niveau juste, the right statement of the problem at its proper level of generality. And indeed, his radical abstractions of schemes, functors, K-groups, etc. proved their worth by solving a raft of old problems and transforming the whole face of algebraic geometry. Mike Artin, John Tate and I have documented four of his greatest successes in the obituary to appear in the Notices of the AMS (forthcoming, early 2016). Pretty wonderful French pastry.
Many of the formulas in the Atiyah-Zeki list seem to me to come from the Baptismal tribe. #10 defines e; #13 defines the \( \delta \)-function; #21 defines \( \pi \); #24 defines eigenvectors; #47 defines Möbius maps; #48 defines Clifford algebras. I have not mentioned many of the remaining equations in the Atiyah-Zeki list. It seems to me that many are intermediate results in a developing theory, found by detectives doing great work. It is hard for me to judge which are more beautiful: their attraction comes from their bringing to mind a whole beautiful theory of which they are one part. For instance, #36, \( \langle B,B \rangle_t = t \), the variance of Brownian motion, is hugely important and beautiful, but I would think of it as a natural consequence of the more basic fact that, when you add independent random variables x and y, their standard deviations follow the stochastic version of Pythagoras's rule: $$ \text{St.Dev.}(x+y) = \sqrt{ (\text{St.Dev.}(x))^2 + (\text{St.Dev.}(y))^2}$$
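The stochastic Pythagoras rule is easy to see in simulation; a minimal sketch (my illustration, with an arbitrary choice of one Gaussian and one uniform variable):

```python
# Simulate the "stochastic Pythagoras rule": for independent x and y,
# St.Dev.(x+y) = sqrt(St.Dev.(x)^2 + St.Dev.(y)^2).
import math
import random

def pstdev(values):
    """Population standard deviation of a list of numbers."""
    m = sum(values) / len(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))

random.seed(0)
N = 100_000
x = [random.gauss(0, 3) for _ in range(N)]     # St.Dev. = 3
y = [random.uniform(-2, 2) for _ in range(N)]  # St.Dev. = 4/sqrt(12)
s = [a + b for a, b in zip(x, y)]

lhs = pstdev(s)
rhs = math.hypot(pstdev(x), pstdev(y))
# lhs and rhs agree up to sampling noise (well under 1% at this N).
```

Independence is the whole point: for correlated variables the cross term reappears and the right angle of the "Pythagorean" picture is lost.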
Brain areas for the different forms of beauty?: It is clear that members of each tribe will make different judgements on the relative beauty of specific mathematical formulas or theorems. I want to take up each one in turn and ask what cortical activity they might produce. Explorers clearly find a tremendous thrill in the Systema Naturae, the flora and fauna and gazetteers produced by their explorer colleagues. Exotic creatures like non-standard differential structures on Euclidean 4-space continue to amaze and to defy visualization. But I suspect that geometers have mental tricks that allow them to piggy-back a sense for high dimensional constructions on top of their 3-dimensional skills. Thus constructions like surgery and suspension can be visualized in the simplest cases, and the mind builds the skills that allow the general case to be grasped as an analog of these. I remember Zariski, getting stuck at a certain point in his lectures, drawing a bit of an algebraic plane curve (a cubic with a double point) in the corner of the blackboard to kickstart his intuition. Steve Kosslyn and others have studied cortical activity with fMRI while a subject is forming a visual mental image of some object (here is one reference). There seems to be a complex pattern of widespread activity -- frontal as well as parietal and temporal -- as well as suppression of activity in what I guess is pretty close to Zeki's mOFC (see the blue area in the top row in the figure on p.231 of the cited paper). But people who are not geometers may never use visualization in their research. There's a probably apocryphal story about the algebraist Irving Kaplansky: asked what he saw when he thought about a ring, he replied "I see the letter 'R' ".
The most common 'beautiful' formulas are alchemical. The famous: $$ e^{i\pi} = -1$$ brings together exponential growth with the geometry of the circle. When a formula connects two concepts that would seem to have absolutely nothing to do with each other, you get a chill running down your back. It feels as though the universe wasn't forced to be this way so it is not unreasonable to ask God "why did you decide to make this happen?". In other words, it is hard to dispel a sense of mystery that clings to them. Is there an area of the brain which is active when you can't figure out why something happened, when you are mystified by some event? It would seem hard to devise fMRI experiments to find such a "mystery-center". But I believe that alchemists find the greatest beauty in such mysteries.
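The identity can of course be checked to machine precision, though the numerical check is no substitute for the mystery; a trivial sketch:

```python
# Check Euler's identity e^{i*pi} = -1 in floating point, and the
# underlying fact exp(i*t) = cos(t) + i*sin(t) at a sample angle.
import cmath
import math

error = abs(cmath.exp(1j * cmath.pi) - (-1))  # only rounding error remains

t = 0.7  # an arbitrary sample angle
lhs = cmath.exp(1j * t)
rhs = complex(math.cos(t), math.sin(t))
```

The second pair of lines is the alchemical content: the exponential function, defined by growth, lands exactly on the circle traced by cosine and sine.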
What is going on in the minds of wrestlers? My guess is that estimating the size and relative power of math things is connected to our social behavior, to Darwinian selection of the fittest. Animal life is all about being strong enough to get the stuff you need. A large number of species exist in a hierarchical social setting, with each individual learning rapidly whom to defer to, whom to dominate. And Robin Dunbar has shown that the size of your working social group goes up exponentially with your brain size; thus humans must have large cortical areas devoted to deeply understanding the interactions of their large groups -- he estimates that on average each person lives in a group of some 150 people whom "you wouldn't feel embarrassed about joining for a drink if you happened to bump into them in a bar". Although I have not seen any experiments with this focus, I feel there must be cortical areas specialized for learning social structures and the complex web of pair relationships they contain. (Perhaps anterior cingulate cortex?) Given how central this is in our brains and lives, it feels to me that when structuring math objects, especially functions, by size (rate of growth, degree of smoothness, etc.), you would utilize this machinery built in for creating social hierarchy. I don't mean that you personify these math sizes, but only that making a partially ordered, graph-like structure is a skill you already have because of evolution.
Solving a puzzle is the basic drive for the detective tribe and the goal that gives them the greatest pleasure. In this case, there need not be a beautiful formula that encapsulates the solution. Rather, the proof itself is wonderful and beautiful. (Confession: I personally find quite stupid puzzles like Sudoku rather addictive.) This is surely a central aspect of pre-frontal activity: planning your activities is finding a path in a world satisfying many constraints that leads to some desired goal. Math is, however, a bit different from the world: if you are trying to prove a theorem, you have to be prepared to reverse course and prove its negation. Never put all your money on one result. Perhaps, way out along the imaginary axis, the Riemann zeta function does have a zero with real part not equal to 1/2.
Summarizing, I see visualizing an alien abstract world, finding new mysteries, creating vast hierarchies and solving the hardest puzzles as four aspects of what mathematicians find most beautiful. But each has its characteristic form of beauty that connects it to distinct parts of our mental life. Can we expect to nail each down to a specific part of the brain? Recall that most of the qualities localized by 19th century phrenology have long since been dropped as labels for specific cortical areas. The perception of mathematical beauty may also turn out to be a higher order, derivative phenomenon characterized by patterns of activity widely distributed over the brain.
I've had a few emails relevant to this post. Andrew Solomon writes:
I like Freeman Dyson's Bird/Frog taxonomy, sort of meta to yours. Also the "Beauty is truth, truth beauty" quote is perhaps relevant if one considers, as I do, that the only real truth is math. (Not that I know much about it, but I'm sure that categorical surmise would take a lot of heat, especially if math is perceived as reliant on axioms.) Lastly, while you elegantly articulate the various thought processes that go into math thinking in your tribal division, I would say that resolution might be lost when simply looking at the equations under consideration. I wonder how much visual accessibility determines voting in comparison to the math lying behind a given equation. Similarly, what is the influence of sentiment, e.g., when considering (what might be considered an inside joke) Hardy's taxi number.
And my good friend and one-time student Ulf Persson also focused on the issue of "visual accessibility" by rewriting \( e^{i\pi}=-1 \) :
What about m1_eq3PV)(Mo/? Is that a beautiful formula? This brings up the perhaps trivial issue of beauty lying also in the visual look of a formula, not only its mathematical contents. When artists try to write mathematical formulas, they use a lot of square root signs; maybe those are the most visually striking elements. I recall how intrigued I was by the formulas in Hardy's 'Pure Mathematics'; in particular the partial derivative signs intrigued me, and would do so even after I had become mathematically literate. As to the initial formula, this is an amalgam of old notations. Thus 'm' stands for minus and 'eq' for equality; P means taking the power of what precedes it, in this case 3, and V indicates that everything that comes after should be included in the exponent. (Cardano used V in RV to indicate that anything that came after would be taken the square root of.) Finally, the Greeks used Greek letters to designate numbers; Diophantus dotted them when they should be taken literally as their numerical values, thus 1_, while 3 is thought of as a letter designating e. The imaginary i is given by )( to emphasize its imaginary nature, M is of course multiplication, and o/ should obviously denote pi. There is something to be said for streamlined notation. The point of algebraic notation, as I tell my students, is not just to encode information but, more importantly, to make it easy to manipulate.
And another good friend, also a former student, Larry Gonick (the author of mathematical, scientific and historical cartoon textbooks) wrote me how the "visual accessibility" of Dan Rockmore's series of formulas continues to intrigue the art world (!):
I've been meaning to write since we saw the equation prints at the Crown Point Press opening two weeks ago. Yours I thought the handsomest of the lot, though a couple of integral signs had some swash. The Press put your image on the post card. Sample available upon request. The discussion, as you may imagine, was a little strange. The head of the press put herself and her husband on a panel with Dan Rockmore and Dick Karp from Berkeley. All were good talkers, but the artists knew zero about math (and they spoke first, at great and aimless length). Afterward, I had the definite impression that some art groupies were flinging themselves at Dan. See what you missed?
As a general matter, it is vital to realize that all mathematical models are idealizations. Newton himself realized that modeling planetary motion by his law of gravitational attraction left out tidal effects. But every neat mathematical model can be, and usually is, pursued by pure or applied mathematicians who push it to extremes. Thus some have studied what pure gravitational forces do to planets billions of years from now, when this approximation ceases to be relevant to actual events. I have done the same in my own research, pursuing a beautiful mathematical model for comparing shapes of 3D objects to elicit its mathematical secrets, way beyond its relevance to detecting diseased organs in a human body. As for economics, Keynes himself wrote (in "The General Theory of Employment, Interest and Money", Ch. 21):
Too large a proportion of recent 'mathematical' economics are merely concoctions, as imprecise as the initial assumptions they rest on, which allow the author to lose sight of the complexities and interdependencies of the real world in a maze of pretentious and unhelpful symbols.
I am not an economist, nor have I even taken basic courses in economics. My understanding is that of a bystander living in a sea of economic events and trying to make sense of them. So when I heard of the experiments of Kahneman and Tversky on the irrationality of economic decisions made by average people, it was an "aha" moment about the limitations of economics. Here is one of their experiments: subjects do some task leading to a reward. They are then asked to choose between two possible rewards: $10 or a good-quality pen and pencil set. Most chose the money. A second set of subjects was also offered a reward, but now they had three choices: $10, the previous pen and pencil set, and a clearly inferior mass-produced pen and pencil set. Now most chose the good-quality pen and pencil set! So much for acting rationally on the basis of fixed personal values attached to all goods, as "rational economic agents" should. Kahneman has gone on to show the pervasive effects of emotions on the simplest of tasks.
For some time I have been wondering, reading headlines and so on, about employment, off-shoring, robotics, etc. I thought: why don't I try to find some data myself and see what's up? Through the wonders of Google and the internet, I found a table put out by the Bureau of Labor Statistics on how many people in the US work in each of hundreds of categories of jobs: Employment by detailed occupation. Naturally, one first checks out one's own occupation: out of a total of 145,355,800 working people in the US in 2012, 63,300 are college math science teachers, that is one in 2300 or about 0.04%. Then I checked my wife's occupation, classified as "fine artist" (note that the table includes self-employed people, presumably through their Schedule Cs, as well as those with an employer): only 28,800 (this includes painters, sculptors and illustrators), one in 5000 or 0.02%. I'm sure if you include struggling artists with either no art-related income or undeclared income (!), there would easily be as many of them as of us college math teachers. But now the figure I had started looking for: agricultural workers. This would have been nearly 100% at the time when the USA came into existence. Now it is 815,500, which works out to 0.56% of the entire working population! As for fishing, the main occupation in my town of Tenants Harbor, Maine, there are 31,300 workers, just about the same number as fine artists. My guess is that, allowing for imported food, we are fed by around 1% of our workers. I suppose the main driver for this is the use of massive machines to plant, irrigate, fertilize and harvest huge fields. Wow -- who would have expected this two centuries ago? If we lived a simple life like that of the first settlers (but with machines), could most of us sit back and contemplate nature all day long?
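The ratios quoted above are easy to check from the BLS total; here is a quick sanity check, using the figures exactly as given in the text:

```python
# Sanity-check the occupation shares quoted above, from the 2012 BLS figures.
total = 145_355_800          # total US working people, 2012

occupations = {
    "college math science teachers": 63_300,
    "fine artists": 28_800,
    "agricultural workers": 815_500,
    "fishing workers": 31_300,
}

for name, n in occupations.items():
    print(f"{name}: {n / total:.2%}  (one in {round(total / n)})")
# math science teachers come out to 0.04% (one in ~2300),
# fine artists 0.02%, agricultural workers 0.56%, as stated.
```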
How about the machines? Do we need many people to manufacture these for us? There has been a precipitous decline in manufacturing jobs too, caused first by off-shoring but more recently by the jobs coming back to the US only to be carried out by robots. While I was doing research in computer vision, I used to answer questions about its usefulness by saying that computer-driven cars would be coming in a decade or so. And now they are here. Both the needed analysis of perceptual data and the control of the vehicle can now be done at a human level by computer. This convinces me that full industrial automation is now possible and that at least all the necessities of life can, in principle, be supplied by a very small percentage of the population.
How about the non-necessities: the overhead of a capitalist economy, health, education and fun stuff? With online shopping, stores become a dispensable luxury and, I have to admit, I prefer Amazon to going to a mall. Hard also to admit, but the guts of my job can be dispensed with too: professors can easily be supplanted by MOOCs (massive open online courses, prerecorded from the best lecturer in the country). The full gory picture is detailed in Martin Ford's recent book Rise of the Robots. He conjectures that a large percentage of white-collar jobs can be carried out by AI-type programs. IBM's success in winning Jeopardy! with a computer that had access to big datasets would seem to confirm this. We seem to be facing long-term structural unemployment that will get progressively worse as computers take over more and more tasks. Now all this flies in the face of economists' orthodoxy. Some years back, as we were struggling to emerge from the Great Recession, I asked the economist Roland Fryer whether he didn't feel there just weren't enough jobs to ever recover. His answer: new jobs have always turned up as the economy adapts, though you can't predict where. The question, as I see it, is whether the present is really different from the past, and whether machines are now taking over for good.
If indeed it is going to be hard to find jobs for everyone who wants one, one need only look to France for a simple near-term solution. In the US, many have been used to thinking of France as a decadent society, past its "gloire" and now falling behind the US and Germany, which still believe in old-fashioned hard work. Indeed, the official work week in France is 35 hours, not 40, and every worker is entitled to 5 weeks of paid vacation. I would contend that they actually know what's up better than we do. If all the food and all the goods we need can be produced by fewer and fewer people, is it not rational that we should enjoy life more -- work less per week and take a decent vacation? We, the people of advanced countries in the 21st century, are wealthy -- or ought to be. It seems to me that the French have got it right. Of course, all this brings in the ridiculous level of inequality in our society. Again, it was a Frenchman, Piketty, who wrote the book making this utterly clear. If the number of working hours per year were reduced in the US, businesses would hire more people to do the needed work; this would decrease unemployment and be a first step toward equalizing things as well.
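The back-of-envelope arithmetic behind that last claim: if the total annual hours of work the economy needs are held roughly fixed, a shorter work week spreads them over more workers. This is a sketch with illustrative numbers (the total-hours figure is hypothetical, and holding total work fixed is of course exactly the assumption economists dispute):

```python
# Work-sharing arithmetic: with a fixed pool of needed annual work-hours,
# fewer hours per worker means proportionally more workers.
def workers_needed(total_hours, hours_per_week, weeks_per_year):
    return total_hours / (hours_per_week * weeks_per_year)

total_hours = 1_000_000   # hypothetical pool of annual work-hours

# US-style rules: 40 h/week, ~2 weeks vacation (50 working weeks)
us = workers_needed(total_hours, 40, 50)
# French-style rules: 35 h/week, 5 weeks vacation (47 working weeks)
fr = workers_needed(total_hours, 35, 47)

print(f"increase in jobs: {fr / us - 1:.0%}")   # roughly 22% more workers
```

So, under this crude assumption, French-style rules would require about a fifth more workers for the same output of work.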
Looking further ahead to the possibility of a society that is 90% mechanized and computerized, there is a bigger problem. Martin Ford proposes in the book mentioned above that a guaranteed basic income is the best long-term solution, allowing many to get by without work. But I would ask: what would this do to a person's self-image? A few would be creative, a few would take drugs, a few could fight wars, but most people find their worth in doing a good job, earning their livelihood and raising a family. This pattern of life seems built into our genes. It seems to require life-long silver-spoon training to handle a life of leisure with any aplomb. There is a sci-fi literature on societies with no work, and its imaginings are not pretty -- mainly the result is that people get into fights over nothing. And this challenge may be dwarfed by another: global warming, which looks on track to generate intractable refugee problems. I wish my grandchildren were growing into a pleasanter world.
Thomas Riepe wrote me directly, noting how prescient an old article of Stanislaw Lem is:
Concerning the automatisation issue, I guess Lem's story turns out to become kind of correct: http://www.newyorker.com/magazine/1981/10/12/phools
My former colleague Alan Yuille also wrote me, pointing out a startling even earlier precedent: John Maynard Keynes, no less, predicted in 1931 that the world was becoming so wealthy that a 15-hour work week would suffice to keep our economy going and should be expected to become the norm (see Essays in Persuasion, Norton, 1963, pp. 358-373).
Moreover, he directed me to the analysis of employment carried out by his colleague Uday Karmarkar in the UCLA School of Management. Summarizing the Foundations and Trends article entitled "The U.S. Information Economy" (vol. 6, 2012), Karmarkar, Apte and Nath divide the economy into 4 super-sectors:
Dear David, in your latest blog 'The Dismal Science and the future of work' you say 'the next 50 years will see the growth of a nearly completely automated society that requires only minimal work from the large majority of its citizens'. I wish that were true, but I fear quite the opposite. As the environment deteriorates, humans are being increasingly involved in elementary labour, for instance in pollination! In the report below, it says 'For specific tasks mainly allocated to women and children -- especially the labour-intensive cross-pollination -- wages are paid that are substantially below the official state or zonal minimum wages'.
It is no brave new world we are hurtling towards!
Best wishes. Shiva.
"Almost half a million Indian children are working to produce the cottonseed that is the basis for our garments and all the other textile products that we use. ... Children below 14 -- of which two-thirds are girls - are employed in the seed fields on a long-term contract basis through loans extended to their parents by local seed producers, who have agreements with the large national and multinational seed companies. Children are made to work 8 to 12 hours a day and are exposed to poisonous pesticides used in high quantities in cottonseed cultivation. Most of the children working in cottonseed farms belong to poor Dalit ('outcaste'), Adivasi (tribal) or Backward Castes families. ..." (Report by the India Committee of the Netherlands, www.indianet.nl/english)
Added in October: I came across a set of charts in the June 2015 Atlantic entitled "Are We Truly Overworked?" that gives quite relevant data on hours worked and seems to me to substantiate Keynes' view that we simply don't need to work so long every week to achieve our current level of prosperity. The most striking figure, for me, is that in the period 1950-2012, workers in Germany decreased their average total hours worked per year by 991 hours! In the US, by contrast, we decreased our annual total by merely 200 hours. In fact, in the US, men with college degrees worked 2.5 hours more per week in 2010 than in 1988. What a twisted effect of wealth and progress this is!
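Spread over the working year, those Atlantic figures are even more striking; here is the quick conversion (my arithmetic, assuming roughly 50 working weeks per year):

```python
# Convert the 1950-2012 declines in annual hours worked (Atlantic figures)
# into weekly terms, assuming about 50 working weeks per year.
weeks = 50
germany_decline = 991   # hours per year
us_decline = 200        # hours per year

print(f"Germany: about {germany_decline / weeks:.0f} fewer hours per week")
print(f"US:      about {us_decline / weeks:.0f} fewer hours per week")
```

That is, the average German work week shrank by nearly 20 hours -- roughly half of a 1950 work week -- while ours shrank by about 4.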
Added in November: Alan Yuille just sent me this link to an article in the Guardian newspaper sounding the same alarm.
First, some background on India for those who are not avid India-philes. The origins of the caste system are shrouded in mystery and hotly debated, but what is clear is that they were codified in the last centuries BCE in the "rules of Manu" (the Manusmrti). This is a long treatise that can be found online in English translation here. It lays down the Dharma, the rightful rules of conduct, for each of the four varnas (the major groups of castes) -- the Brahmins, Kshatriyas, Vaishyas and Shudras -- and their relation to the outcastes, especially the Candalas, who must live outside the village and do jobs despised by the Hindus. Each varna is divided into multiple jatis, these being the effective castes, each assigned a specific occupation and marrying (and eating) only within the jati. Your jati is inherited from your parents and is yours for life. Manu assigns specific punishments for anyone who violates the rules, often demoting them all the way to untouchable status. It prescribes such heavy sentences as cutting off the tongue, or pouring molten lead into the ears, of the Shudra who recites or hears the Veda! The castes are nearly linearly ordered in status, like the castes of Huxley's Brave New World, whose members were named alphas, betas, gammas, deltas and epsilons. The 200 million untouchables in India today, its epsilons, are prohibited from entering Hindu temples, and even their gaze is believed to defile. One of their notable occupations today is manual "scavenging" (cleaning latrines), considered the ultimate defiling task. And indeed, since they have no pumps nor even gloves or protective gear, descending into pits to clean them by hand causes many diseases, which do defile them. Attempting to change their occupation has led, in many villages, to hideous retribution from higher castes.
Such a rigidly structured society has, of course, drawn strong reactions. Nietzsche approved: "Close the Bible and open the Manu Smriti. It has an affirmation of life, a triumphing agreeable sensation in life and that to draw up a lawbook such as Manu means to permit oneself to get the upper hand, to become perfection, to be ambitious of the highest art of living." (quoted in Wikipedia's article on the Manusmrti). Today, the right-wing Hindu nationalist party, the BJP, is in power, but Indian politics has always been complex (e.g. some states have been ruled by coalitions of Brahmins and Dalits), so it has tempered its message to gain a majority. But many members of the ruling coalition advocate forcing all Muslims and Christians to convert or leave the country and have demolished mosques inconveniently located where temples once stood. They view the Manusmrti as sacred literature; "smrti" is just shy of "sruti", literature such as the Vedas given directly by the gods.
On the other side is B. R. Ambedkar, a Dalit who was the principal author of the Indian constitution and the first Law Minister of independent India under Nehru. His influence in India, and especially among Dalits, can hardly be overestimated. He converted to Buddhism and urged all Dalits to follow him to escape the tyranny of the Hindu castes. He expounded his criticism as early as 1936 in a famous undelivered speech entitled "Annihilation of Caste", in which he recognized that he was disputing the very core of Hindu beliefs. Here is a short quote:
You cannot build anything on the foundation of caste. You cannot build up a nation. You cannot build up a morality. Anything that you will build on the foundation of caste will crack and will never be a whole. ... Caste may be bad. Caste may lead to conduct so gross as to be called man's inhumanity to man. All the same, it must be recognized that the Hindus observe Caste not because they are inhuman or wrong-headed. They observe Caste because they are deeply religious. People are not wrong in observing Caste. In my view, what is wrong is their religion, which has inculcated this notion of Caste. If this is correct, then obviously the enemy you must grapple with is not the people who observe Caste, but the Shastras which teach them this religion of Caste. Criticizing and ridiculing people for not inter-dining or inter-marrying, or occasionally holding inter-caste dinners and celebrating inter-caste marriages, is a futile method of achieving the desired end. The real remedy is to destroy the belief in the sanctity of the Shastras [DM: Hindu scriptures].
Some more relevant quotes from Ambedkar: "My social philosophy may be said to be enshrined in three words: liberty, equality and fraternity. Let no one however say that I have borrowed my philosophy from the French Revolution. I have not. My philosophy has its roots in religion and not in political science. I have derived them from the teachings of my master, the Buddha." And: "In Hinduism everyone is unequal, but some are more unequal than others."
Clearly, there is a major conflict between liberal values and some of the ancient and deeply rooted Hindu values, with passionate Indian adherents on both sides. This is the context in which the fight at IIT-Madras is taking place. There are two student groups: the Vivekananda Study Circle, believing in traditional Hindu values, and the Ambedkar Periyar Study Circle, advocating Dalit rights. The first complained of hate speech by the second, and the administration responded by "de-recognizing" the latter. My friend Shiva (incidentally, born a Brahmin, converted to Buddhism), who works tirelessly for Dalit rights, wrote me about this, sending me a petition to sign. I suggested I might be more helpful writing the Director at IIT-Madras, because I have stayed in their Guest House and given talks there. So I wrote this:
Dear Dr. Bhaskar Ramamurthi,
Although, as a foreigner, I acknowledge that it is difficult to understand the complexities of local disputes, I write as a long-term friend of many distinguished academics in India and especially in Chennai. In addition, I have several times enjoyed the hospitality of IITM's guest house and have had the honor of giving a number of talks at your Institute. I am in your and your colleagues' debt for this warm reception.
But, all this said, I have strong ideas about the importance of free speech and especially the importance of allowing students to discuss vital and difficult issues that confront society today. I have also become increasingly aware, during my nearly 50 years of visiting India, of the deep social struggles that quite possibly are coming to a head as India takes a central role in the world. For all these reasons, I was deeply shocked that the Ambedkar Periyar Study Circle was "derecognised". I believe campuses must allow open discussion of divisive issues even when it offends some people so that all aspects of an issue are out in the open. Today's youth are tomorrow's leaders and one wants them to think deeply about the direction in which we are all headed.
On a more personal note, I see many similarities between India's Dalit problems and the African-American problems that have rocked the US since its beginnings. For this reason, I personally take Dr. Ambedkar as one of my heroes.
sincerely yours,
David Mumford
Perhaps inadvisably, I agreed to let Shiva send my letter to others. To my amazement, it was reprinted in the Times of India and The Hindu (the major South Indian paper) -- and triggered a deluge of criticism. Allow me to paraphrase some of the criticisms and make some replies.
The soundest defense of the de-recognition was based on arguing that a genuine, deep and passionate love for India, for its ancient glories and wonderful achievements, is being torn apart by advocating, as Ambedkar did, de-sanctifying the shastras. I was told by quite a few people that I should read the books of Rajiv Malhotra, an Indian-American living in Princeton. He argues in his book Breaking India that three foreign groups have worked to undermine true Indian identity: Muslim invaders, the Christian missionaries and the Communists preaching Marx and Mao. Specifically, the Christianizers have now been joined by NGOs and think-tank preachers bearing liberal ideas with no understanding of the true meaning of Hinduism. He argues in Being Different: An Indian Challenge to Western Universalism that India's spiritual traditions and, specifically, its dharma (as I said, meaning the rules of correct and moral behavior) are threatened and that India should return to an earlier, purer form of Hinduism and purge these foreign influences. And what about the things criticized by liberal thinkers? On his website, he downplays these, saying: "Caste, dowry, child marriage, sati, poverty, and illiteracy: Many of these phenomena certainly existed in earlier Hindu society, but in a different form, perhaps milder and not so rigid, and usually not consistently or homogeneously over time." This idea that you can have castes and yet not oppress anyone goes back to Vivekananda himself. He believed the caste system to be an integral part of Hinduism but argued for people, even Muslims and foreigners (see Prabuddha Bharata, April 1899), being able to join a caste or form new ones by virtue of their abilities:
To the non-Brahmin castes I say, wait, be not in a hurry. Do not seize every opportunity of fighting the Brahmin, because, as I have shown, you are suffering from your own fault. Who told you to neglect spirituality and Sanskrit learning? What have you been doing all this time? Why have you been indifferent? Why do you now fret and fume because somebody else had more brains, more energy, more pluck and go, than you? Instead of wasting your energies in vain discussions and quarrels in the newspapers, instead of fighting and quarrelling in your own homes -- which is sinful -- use all your energies in acquiring the culture which the Brahmin has, and the thing is done. Why do you not become Sanskrit scholars? Why do you not spend millions to bring Sanskrit education to all the castes of India? That is the question. The moment you do these things, you are equal to the Brahmin. That is the secret of power in India. (in The Future of India, Collected Works, Vol. III, p. 292)

Will the scavenger who learns Sanskrit then be allowed in the temple? He acknowledged this would take many generations, but one is tempted to reply 'dream on'.
So the issue is whether to take the Manusmrti literally or to adapt its precepts to modern times. This is an issue that has caused endless fights in all religions. I would argue that reading their founding documents with absolute literalism is always a trap. Not only does the meaning of words change, but the boundary between allegorical and metaphorical stories on the one hand and the description of literal empirical facts on the other shifts as mankind itself changes. The texts are reinterpreted in each generation. Galileo is no longer considered to contradict the received truth in the Bible. Sufis read the Quran differently from the Sunni legal scholars of Sharia. Personally, I find many Indian texts and many of their stories inspiring, though not the Manusmrti. Who knows -- Ambedkar might have found common ground with Vivekananda in a Hinduism that treasured certain shastras but considered others pertinent only if grossly reinterpreted. The quality that comes and goes in all religions is tolerance. Ashoka's reign of the Maurya empire, the Umayyad Caliphate in Spain, Akbar's Mughal court, and Roger Williams' Rhode Island are treasured instances where very different religions co-existed peacefully. Unfortunately, intolerance seems to be the default position.
Another group of respondents threw in my face the case of Prof. Subramanian Swamy, saying it was hypocritical of me to ask for freedom of speech in Chennai when it was being denied in Cambridge. Swamy had been invited to teach an economics course at Harvard but, after a stormy faculty meeting, was dis-invited. The context is this: Swamy is an extreme right-wing politician who had written an op-ed piece in the Mumbai newspaper DNA in which, besides calling for the destruction of 301 mosques in retaliation against Muslim terrorism, he proposed to "make Sanskrit learning compulsory and singing of Vande Mataram mandatory, and declare India as Hindu Rashtra in which only those non-Hindus can vote if they proudly acknowledge that their ancestors are Hindus. Re-name India as Hindustan as a nation of Hindus and those whose ancestors are Hindus ... Enact a national law prohibiting conversion from Hindu religion to any other religion." etc. Not exactly a tolerant guy! Whether Harvard should have allowed him to speak in deference to the principle of freedom of speech or ban him for propagating hate speech was controversial. Before jumping to one conclusion, note that most of the US press supported Charlie Hebdo's right to print cartoons highly offensive to Muslims after the cartoonists were murdered. Even if the cartoons were considered part of a political discussion by sophisticated Parisians, they were clearly hate speech to Muslims. It seems that in the West the distinction between allowed and prohibited public expression is inconsistently drawn. We try, but, as in Chennai, emotions intrude.
Finally, one of my closest lifelong friends, C. S. Seshadri, wrote me giving a clear-headed review of Indian politics but also light-heartedly questioning what "All men are created equal" can really mean. "Men" of course refers to men and women: that is not the issue. The issue, as I see it, is that our genes are not equal: some babies are born with horrible deficiencies and some with unique gifts (take having perfect pitch as a simple example). My belief is that what we all share equally is the sense of "me" -- that is, of being conscious of being alive -- and of having or not having opportunities to carry out our universal and natural drives. (Do other mammals have all these? That's a good question.) Note that Jefferson did not use the word "happiness" alone. No, he wrote "pursuit of happiness", meaning having as reasonable a share of opportunity as the prosperity of one's society permits, and not being humiliated for some irrelevant attribute. I certainly realize that this is an ongoing struggle in the US, but this does not mean it is hypocritical to be equally troubled at the degree to which it is denied to virtually all Dalits in India.
Book and journal publishing have been rocked by two major changes during my lifetime. The first was the takeover of smallish niche publishers by their CFOs, subsequent mergers, and the entry into this business of private equity firms. The second was the expansion of the internet to a state where it can provide instant availability of whole libraries everywhere at your fingertips.
Let me start with the first. In the 50's my first wife worked for Houghton Mifflin, reading (and usually rejecting) submitted fiction. In those days, it was typical for an author to form a life-long relationship with a specific editor who would see him or her through the ups and downs of their creative muse and become an intimate friend. This sleepy world is nicely captured in J. L. Carr's satire Harpole & Foxberrow, General Publishers. This is also the world in which the greatest mathematicians of the day (including Hilbert, Einstein, Courant, Caratheodory, Hecke, etc.) could write, in 1923, a letter of appreciation to Ferdinand Springer for saving the then-leading journals Mathematische Annalen and Mathematische Zeitschrift from bankruptcy. This letter is displayed in the sidebar. There was at that time a partnership between authors and specialized publishing firms that understood their needs and tried to serve them while doing business. Klaus recalled this spirit, describing his meeting with Ferdinand Springer sometime in the 1960's in these words:
One day my phone rang: "Springer here, please come to my office." Ferdinand Springer, the legendary publisher, did not usually deal with junior members of the staff nor had I been formally introduced to him. I went to his office unsure what this all meant. His personal secretary kindly advised that I should listen and quietly excuse myself when the 'audience' was over. On entering his office I was greeted warmly as the new mathematics editor. Mathematics was one of Springer's favorite programs. He then proceeded to explain the raison d'etre of a publisher: to facilitate the work of the authors by taking away the burdensome aspects of editing, producing, and most importantly distributing their work widely. He made it very clear that these added values were the justification of a publisher's existence.
His fierce loyalty to authors and editors is confirmed by another story. When Ferdinand Springer sought to leave the occupied city of Berlin after World War II to rescue his family, he was stopped at a military control post. The commanding Russian officer demanded an explanation. Springer identified himself as a publisher of scientific books and journals (in his mind that was explanation enough) whereupon the officer commanded, "Tell me the names of the editors of such and such journal!" Springer had retained the names of Russian scientists and editors on the masthead of the journals they had served, despite the war. As he recited these names, the officer suddenly interrupted, "That's me, and I am honored to meet you." He provided Springer with free passage which allowed him to rejoin his family.
Klaus went on to nearly single-handedly rejuvenate Springer-Verlag's mathematical program, bringing it back to its pre-WWII status as the leading math publisher in the world. He introduced the Lecture Notes series and got to know most of the leading mathematicians of his generation, often soliciting new books from the world's top experts. But things changed: in the late 70's, the CFO was made the director with the final say, and Klaus and Alice resigned in protest. In Springer's own self-published history, Klaus's role was erased. At the same time, the small math publishers were being swallowed up or their math series discontinued (van Nostrand, Wiley-Interscience, Benjamin, etc.). One saw prices of the leading journals go sky-high, and prices of later editions of older books were raised to match those of the newest books. Circulation took second place to quarterly profits, often based only on library sales. Klaus and Alice continued to seek a position where the traditional values of publishing were respected, moving to the Swiss publisher Birkhauser until it was swallowed by Springer, then to Harcourt Brace Jovanovich until it was bought by General Cinema, and finally striking out on their own as A K Peters.
The buyout and merger mania in the pursuit of higher profits, and the abandonment of "service", continued. A controlling interest in Springer itself was bought by the privately held publishing and mass-media conglomerate Bertelsmann in 1999. When they put Springer on the market in 2002, a group of us at the Beijing ICM made a last-ditch attempt to appeal to the Mohn family, who owned Bertelsmann, for an alternate solution. A letter signed by the Presidents of the IMU, ICIAM, EMS and the math societies of Germany, France, Canada and the US was sent to Dr. Mohn, recalling the partnership of Springer and the math community and asking him to consider the formation of a not-for-profit foundation to continue this partnership. The letter is reproduced in the sidebar. Subsequently, Springer has been sold three times to private equity firms: in 2003, to the British investors Cinven and Candover, who acquired and merged Kluwer Academic Publishers and BertelsmannSpringer; next to the private equity firm EQT Partners and the Government of Singapore Investment Corp.; and again in 2013, to yet another private equity firm, BC Partners. Only Mitt Romney seems to have missed the boat. If you think a large part of our professional life is not mortgaged to capitalists, perhaps you have spent too much time thinking only about theorems. Private equity buys a firm for one and only one reason: they believe they can squeeze more profits out of its operations, i.e. out of us mathematicians (and our societies and libraries). As Klaus put it in a piece entitled "A Vanishing Dream", on which he was working a few weeks before his death:
Alice and I feel that we have lived a dream to preserve and provide a service that was once considered worthwhile. I mean "publishing as a service". ... That this concept (with few exceptions of small individual publishers) is widely lost is no secret but what bothers me intellectually is the fact that publishing companies can be run financially successfully without an intellectual mission and without thought to optimize sales (by numbers of copies) or to produce well-edited and designed books. They compensate these shortcomings by optimizing the bottom line through skimping on editorial and production cost and offsetting revenue loss from smaller per-title sales (by number) by inflating prices.
Let's talk about the second huge change in our professional life: the internet. It was not clear to me, at least, in the early 1990's how the internet would change our working lives beyond speeding up communication, replacing some types of letters by emails. What opened my eyes was when Philippe Tondeur proposed that the math community could and should digitize the entire corpus of mathematical books and journals and make them available to all and sundry: a World Mathematical Library. Wow -- was this really possible? Of course, its practicality is obvious now and Google has gone even further, seeking to digitize all written material. From this, it's only a small step to ask: why put math on paper at all? If something is on the web (and not password protected), anyone can get it and either read it on the screen or print it out if they prefer.
Full of enthusiasm for this brave new world, Peter Michor and I worked to involve the IMU. We set up its Committee on Electronic Information and Communication (CEIC) that, we hoped, would help mobilize the mathematical community in navigating this transition. Now I realize how naive this was, not because the early dreams were unrealizable but because human nature is complicated and fast action was needed to stay ahead of aggressive publishers. A big meeting of all the groups doing digitization of math was organized in Washington DC, where the various obstacles were discussed and it was proposed that the IMU could serve as an umbrella group coordinating the half dozen initiatives that had been started. But it was a case of "all Chiefs and no Indians": none of the digitizers wanted to cooperate if this meant modifying their ongoing efforts. I had two chances to talk at length with John Ewing, then Executive Director of the AMS, but his conservatism made him very reluctant to consider any radical change in the math publishing business model. The AMS was at that time dependent on the traditional model and John was building up its 100 million dollar nest egg. On the CEIC, John's deep knowledge of copyright complexities stymied all pro-active initiatives that might have been taken then. It was not long before the commercial publishers asserted that their copyrights blocked wide electronic sharing of older articles and found a new source of revenue in these older articles that they had previously thought were worthless. Springer has locked up its back issues in "Springer Link". Note how different this is from the idea of a library where everything published is available for nothing. In yet another twist, "open access" journals with exorbitant per-article charges (e.g. 3000 euros!) are now proliferating.
More recently Springer realized that even books out of copyright could generate new revenue and offered authors the "benefit" of keeping their books in print indefinitely by voluntarily extending copyright to infinity. Actually, you can get nearly all math books free online at the rogue Russian "Library Genesis", with websites libgen.in and gen.lib.rus.ec (most of my books are there -- help yourself). Which is better: lunch-money royalties once a year or wider free distribution of your books?
Let's speculate on what an internet-based, professionally controlled working environment might look like:
Mathematicians, by nature, want to concentrate on their work and resist worrying about the mechanics of communicating their results to their colleagues. But business models for publishing are changing rapidly in this digital age, and whether the ultimate control rests in our hands, the hands of the professional community, or in the hands of financial concerns who shift money from sector to sector following the scent of profit is something we ought to be aware of. I hope that the new pro-active CEIC, the great interest shown at the Seoul ICM in three panels on the impact of the internet on mathematical publishing, and the AMS's introduction of online journals all indicate that the whole community is coming to grips with this choice.
With regard to your most recent blog post, which I of course completely agree with: in order to drum up support you might also want to mention the decline in book quality over the last 30 years - digital reprints, poor typography and paper quality, little or no editing and copyediting, etc. Some of this is no doubt due to various social and economic factors, but the commercial publishers' desire for profits has also played a role, I think.
I can live with this quality. The post WWII Dover reprints were low quality too but allowed poor grad students to buy priceless treasures. Furniture is now all veneer, very rarely solid hard wood.
He is certainly right that the fact that libraries throw so much money at toll-gating publishers is the main impediment to change in scholarly communication.
But as regards this group, it is mainly the lack of concrete, small-step forward projects that seems to be the problem.
Yes, maybe there are no small steps. I had a conversation with Stuart Shieber, a Professor of Computer Science at Harvard who started their 'Digital Access to Scholarship at Harvard' repository, which exploits loopholes in copyright law to allow its faculty to post all their papers. He suggested that something like a phase transition would take place when a majority of scholars became aware of being exploited. All it takes is mathematicians refusing to submit papers to commercial journals and demanding that their professional societies and university presses expand their publications. Your paper appearing in Inventiones should be an embarrassment, like wearing a mink coat, not an honor.
1. My book (DM: Outer Circles, a great book) is being published by CUP (which calls itself the oldest publisher in the world), and my editor is David Tranah. Their books are not cheap; they do have to avoid losses. In return they spend effort making sure I am not violating copyrights, correct my LaTeX, know how to insert colored pictures (some borrowed from Indra's Pearls -- thank you David), use an attractive font, etc.
2. There is the effort by Mathematical Sciences Publishers (MSP), initiated by Kirby, to publish math books, especially in geometry and topology, more cheaply and efficiently. Their books still need a wider circulation and display, in particular in our own library here. There are also the AMS and Princeton U Press among the 'nonprofit' and noncommercial math publishers. More editors of prestigious journals, like Inventiones, should do as the editors of the former Topology journal did: resign and restart their journals under a noncommercial publisher. This is the rationale of MSP, and it has worked.
The big profits in math are made with Calc books; millions have been made by Stewart. The most successful of these are pushed and marketed by big companies. How many sales reps have come to your office? Mathematicians too like to make money, when they can!
3. Most math papers (at least those I have looked for), esp those published in Annals and seemingly most other good journals are accessible from the internet now, some requiring an agreement with our library to do so. (DM: it would be nice if everyone used the arXiv though, saving time wasted in google searches.)
4. Here is an example of outrageous pricing. Many of us believe Lars Ahlfors' text, Complex Analysis, is still the best intro, modulo some updating. Yet McGraw Hill is now selling it, the 3rd edition from 1979, for the outrageous price of about $250, and it is reprinted with poor typography. (Fortunately students seem to get the Chinese edition for a few dollars.) Although I have frequently complained to McGraw Hill, it has no effect. I even offered to update the book, but that offer too was declined.
PS: Klaus was also a great supporter of the late, great Geometry Center. He published the important Word Processing in Groups (David Epstein, et al), and the Thurstonesque videos.
Great post! This is a point of view expressed increasingly by various scholars, and one that I definitely support.
I am, however, afraid that it is going to be extremely difficult to change the culture among authors, because we rely on traditional journals for one important reason: which journals we get published in is used to gauge our worth as researchers and scientists. I definitely think this is wrong, but we will probably not have much luck changing the publishing culture without changing this fact. I am working on an idea that I hope can contribute to this.
For now, publishing platforms such as http://www.sjscience.org/, https://thewinnower.com/, https://peerj.com/, and https://www.scienceopen.com/ are very inspiring.
I replied: The advent of electronic publication has been changing our documentation practices for a quarter of a century but, compared to what has been happening in the last few years, the change in those earlier years was a sort of analytic continuation of the previous era. During that period, journals have created electronic versions and some e-only journals have been created at the initiative of learned societies or groups of mathematicians, sometimes with scientific success, for example in probability theory. But this and the advent of freely accessible preprint archives have not deeply affected the definition of what a mathematical journal is. During that period also, we have heard visionaries explain how wonderful it would be if everything were freely accessible on the web, in Open Access (OA). It was also the time when electronic access allowed publishers to bundle their journals more and more, selling them in large interdisciplinary batches. This was the major change in the business model and it has succeeded because of many librarians' desire to maximise the number of publications to which their users have access. The financial and scientific drawbacks of bundling for the end users were not immediately apparent but now they are.
Apart from the beautiful dream of having everything freely accessible, one of the motivations of the OA movement has been to fight the financially predatory behaviour of some publishers, based in large part on the bundling technique. Indeed, we mathematicians have criticised the publishers' bundling as much because of its cost as because of the deleterious effect it has on the average quality of publications. It essentially annihilates the influence of readers' judgment and, in particular, the moderating effect which that judgment has had on the creation of new journals.
In the last few years we have entered a new phase. The day of OA has finally dawned; it is supported by everyone and policymakers have been convinced that publicly funded research should be freely accessible for all as quickly as possible. But now that OA is no longer a dream, we must cope with the problems of reality: how to build an economically sustainable publication and retrieval system providing OA but also all the necessities of science, such as the creation of an organised corpus of validated and cross-referenced results in their final form, as our libraries have been providing for centuries, and the preservation of this corpus for the distant future. All these activities are essential for our work and that of our successors and have a cost, which however must not be grossly exaggerated to satisfy shareholders' greed. We must also build a system which is scientifically sustainable. This may seem provocative, since OA is supposed to boost research, but in reality it has its dangers. For example, the multiplication of freely accessible, potentially useful documents competing for our attention makes visibility more and more important. With the growing greed of institutions for visibility they tend to rely more and more on evaluation tools which measure visibility and not true scientific value (admittedly a lot harder to measure but so much more important!) and when applied to promotions, hiring and grants, this leads to an ecosystem of science which many, and not only mathematicians, consider to be disastrous.
If we do not react strongly to this trend, in less than ten years we will be evaluated by ratings agencies which will induce universities to invest more in subject A and less in subject B because A has a greater impact factor and therefore will increase the visibility of the university (such agencies already exist: see Academic Analytics, http://www.academicanalytics.com). This behaviour is of course not new but it will become much more systematic and, well, 'scientific'. The more access is free, the more visibility becomes a merchandise: a new object of greed. Also, if we do not help our libraries to adapt their role of preservation and help in the access to documentation, we may find ourselves in the hands of private academic data-mining companies which will, for a fee of course, do for us and our students what librarians do now (only it will be on the basis of bibliometry). Major publishers are already investing in such companies. In the process, our libraries will disappear (see Odlyzko's article, http://de.arxiv.org/pdf/1302.1105.pdf). The extent to which they are ignored in discussions on OA is truly amazing.
I believe that we need to consider overhauling the system in its totality: publishing ideas and knowledge, organising their accessibility, preserving them forever and evaluating the quality of research.
It is not enough to experiment with new business models, for example e-journals, which can offer OA because they are managed by dedicated volunteers and supported by a generous institution or association. These experiments are useful but if they remain just that, they compete not only with commercial publishers, which is often a part of their purpose, but also with the academic publishers, which are so precious to us and which are severely handicapped in this time of change because they do not often have the means, financial or otherwise, to make the necessary investments, while the big publishers do. These experiments also contribute to the proliferation of journals and increase the need for time spent refereeing by researchers. There are already voices advising the replacement of referees, who are so difficult to find, by statistics of downloads. Do we really want that?
We do need new business models, and experiments, but they must be compatible with a general vision. For example, few experiments in mathematics involve some measure of scientific control by the readers, as the old subscription system did. It seems that for them the OA economy is entirely a supply-side economy, à la Reagan. From this point of view, OA is the Universal Bundle of publishing and so it has the defects of bundles, but worse! It is claimed that, as in the pre-OA era, the scientific quality is guaranteed by the refereeing system, but in an OA world, where it is so easy to publish, while the editorial committees and the refereeing system are of course still necessary, they are no longer sufficient. We need other regulatory systems to prevent the proliferation of journals and papers which are published for the sake of publishing, a practice encouraged by bibliometry.
There are many new propositions for the funding of OA publication. The least imaginative, which is a brutal and thoughtless adaptation of the classical 'readers pay the costs' system to OA, is the 'Authors Pay the Costs (APC)' system, cleverly named 'Article Processing Charge' by the publishers. In spite of protestations to the contrary, it will create a documentation bubble and, in addition, puts the researcher under the scientific control of some funding authority and in need of spending even more time on funding requests. It is also, in the end, quite expensive with the charges now requested. Unfortunately, it is supported by an energetic lobby of publishers and some scientists outside mathematics, who are used to it for historical reasons (colour pictures in the life sciences) and who see nothing wrong with it. Most mathematicians reject it as an outrage to freedom and a threat to the quality of publications but some others disagree and believe that it is a viable model provided the financing system is tightly controlled and not driven by greed. I believe that in this regard the honest ones will unfortunately serve to justify predatory behaviours. More acceptable alternatives, where institutions such as libraries pay for services (refereeing, editing, attributing compatible metadata, preparing for long-term preservation, etc.) rendered by the publishers, are being tried. Being more original, they are less well understood by politicians but are developing well, in particular in the humanities (see Freemium, http://en.wikipedia.org/wiki/Freemium), and should be studied by us.
In the new ecosystem, journals and documentation portals should have the support of a reasonable number of libraries or other public institutions, based on a positive judgment of users on their quality. This support should be conditional on the acceptance by the journal of a charter of good practice covering all the aspects of publishing.
This would help to pare down mediocre journals and in particular keep out 'predatory publishers' (see http://en.wikipedia.org/wiki/Predatory_open_access_publishing) in which the APC guarantees no serious refereeing or long-term preservation but just posting on some website. If the Gold APC system prevails, and there is no enforcement of a charter of good practice, in a short time it will be extremely difficult to distinguish predatory, mildly predatory, not-too-serious and serious journals. By serious of course I do not mean those with a high impact factor but those who do their job seriously, from refereeing to metadata and contribution to a stable corpus and ensuring very long-term preservation of what they publish.
The support could take the form of 'crowd funding' or subscriptions by consortia of libraries or institutions, again based on scientific judgment by competent people and not usage statistics. In this spirit, the French National network of Mathematics Libraries (RNBM) has negotiated a national subscription to all the journals of the EMS Publishing House and is negotiating a similar one with the French Mathematical Society. These subscriptions are funded by the CNRS but they could just as well be funded by a consortium of universities. Alternatively, universities could devote an irreducible part of their budget to the support of academic journals offering OA, chosen by researchers. This should be encouraged by governments as 'good practice' for universities.
Other good practices concern the balance between in-depth scientific judgment (which has its dangers) and bibliometry (which is dangerous by nature).
In the new ecosystem, hiring and promotion committees as well as grants committees will make explicit in writing how much of their decision is based on an in-depth scientific judgment of the work and how much on the reputation or the impact factor of the journals in which it is published.
Libraries will make explicit in writing how much of their decision to buy, subscribe or unsubscribe is based on an in-depth scientific judgment by competent scientists and how much on the reputation or the usage statistics or impact factors of the journals.
Concerning the problem of referees, I propose that some institution (for example, the EMS) should organise rather large groups of journals (to preserve anonymity) which would every year publish a list of referees found particularly meritorious by the editors. Those distinguished in this way could use it in their CV as a valuable recognition.
In conclusion, the notion, implicit in the discourse of many advocates of the APC, that we need to replace our existing system by OA and APC journals and that 'there is no alternative' is part of the intoxication propagated by the APC lobby. The definition of the future system of mathematical documentation is, I think, still, in part and for a little while, in the hands of mathematicians, but not through an explosion of new journals, which would guarantee a chaotic transition. It is, rather, through a daily discipline: in the decision to publish; in the decision to post articles, pre- and post-refereeing, on a free-access archive with a long life expectancy (remember that a lot of our research is already in OA thanks to these archives); in the time taken to referee; in the evaluation of research by reading papers and not by the reputation of journals; in supporting the journals with good practices, in particular for corpus-building and copyright freedom; in fighting energetically the author-pays system and the misuse of bibliometry; in helping the libraries to make the transition; and in working to convince academic authorities that it is in their interests, and also a part of their remit, to support such a system instead of following the lures of visibility merchants and giving them money cut from the documentation budgets (it has happened). Of course one of our main collective tools to do this is the learned societies which represent us, but our individual actions are crucial.
I think it is important to frame the problems a little bit differently. I think two trends are being somewhat confused here and I'd like to recall some ideas of Pittman. My original blog concerns the transformation of smallish publishers with a mission into giants controlled by financial firms and I concentrated on bringing to light how much, esp. at Springer, has changed. Quite different from this is the simple fact that the quantity of mathematical research going on today is surely at least 10x that of a century ago, probably more. Here I don't mean counting journal articles, so many of which are junk, but actual new results a researcher needs to know to be at the "frontier". Any growth of this size changes the nature of any enterprise. How do we achieve efficient communication so the people who need to hear of such and such a result do hear about it?
The second problem gets mixed up with the first through the proliferation of junk journals but even without these, it is a huge problem. It causes harmful specialization and fragmentation of research, where many mathematicians narrow their focus to tiny sub-sub-areas. I remember Pittman giving me an earful about the need for new "review" journals that would provide help, organizing the current state of an area so all practitioners were up to speed and certifying what had now been proven. Teissier mentions a few times that librarians can be a help to point users in good directions. Maybe so in France -- this has never been my experience. The level of expertise needed to organize well the body of research in any area is extremely high, too high for generalists. And then there are the unexpected links between distant areas, arguably the most important to bring to light. (Talking about this makes me feel quite old! I have tried, for example, to see where the "meat" is in the efflorescence of Voevodsky's vast vision, so far unsuccessfully.) But I believe Pittman's ideas should be pursued and extensive reviews are one thing that is needed. Nothing has taken the place of the 18th/19th century encyclopedias.
On April 19, I received an email from Dick Palais voicing strong support for the views above and in which he stated "I am attaching some material concerning a related matter, the phenomenon of what I call "citation scamming" which tends to exacerbate the problems that David discusses." This is indeed a major issue with commercial publishers and the material he sent me is here.
Now I've always had a complex attitude towards ART that started with my sister Daphne and brother-in-law Charles DuBack being artists and watching them struggle with evolving tastes and fashions and their own muses. Then my oldest son Steve became an artist, my second son Peter became a photographer, I married an artist, Jenifer, whose sister Mimo, second son Andrew and his wife Heather are all artists, and finally Steve married the artist Inka Essenhigh -- you get the picture. Of course we collect a lot of art -- "friends and family" we call it, so I follow prices, galleries, reviews a tiny bit. I'm aware, especially after reading Seven Days in the Art World, of some of the bizarre aspects of the scene. In another direction, I have found striking parallels between the history of art and the history of math going back to 1800 at least (Here's a lecture I gave on this). These are two fields that are not dependent on language and so can express the zeitgeist more directly. In yet another direction, both the Paris school of Jean-Michel Morel and my own research on the statistics of images have led us to synthesize images and we have noticed how naturally some kinds of abstract art emerge.
Dan's project had emerged from a serendipitous meeting on a trans-continental flight with an unorthodox publisher, Bob Feldman of Parasol Press, who has created beautiful portfolios of many great artists, so we were in very exalted company. Sol LeWitt is perhaps the best point of reference. Bob had always wondered if math could be made into art and Dan had likewise wondered if art could be made out of math. So this tube full of many types of paper and drawing instruments (no copper plate) arrives in the mail and we spread them out on the dining room table. Thank God my wife knows art materials; after I play with charcoal a bit, I find I can make believe I am talking to a class and writing on a blackboard. My contribution was a startling identity that arose studying moduli space, most peculiar in having the number 13 appear in it. As I said in the accompanying blurb, the only numbers bigger than 2 that are likely to appear in a math article are usually page numbers. This one has also the merit that it has been used by string theorists.
Now the plot thickens. The portfolio combined my print with those of 9 other mathematicians, physicists and computer scientists, rendered as aquatints that inverted the colors, now white on black like chalk on a blackboard. This is apparently an awfully hard process to master, especially with thin lines scratched on the paper. But Harlan and Weaver succeeded and the lot is being sent around with the title Concinnitas from art gallery to art gallery: Zurich, Seattle, Portland, Yale. In the sidebar I have put thumbnails of the 10 aquatints for your edification. A panel discussion was arranged at the Yale Art Gallery where I met the full cast of characters. Amazingly, a couple of hundred people showed up to hear the discussion. That's where I heard how challenging the aquatint process was and had the pleasure of meeting Bob Feldman. And we also learned from Yale professor Asher Auel that, like artists with different favorite paints, there are three types of chalk with which mathematicians can make quite different sorts of lines.
Most of the panel discussion, however, centered on the question -- "Is it Art?". In fact, it was reviewed in Scientific American! I had just seen upstairs in the museum a quite wonderful wall done by Sol LeWitt made from panels with all permutations of two curved arcs, butting up to each other. What he found was that the serendipitous pairings on adjacent panels created a spider web of contours that, for me anyway, "worked" as an entry point for math into art. I was not so sure that any such unanticipated magic emerged from our scrawls, raising them above the status of fetishistic objects for the layman. Still confused about what is art, my wife and I went the next day to stay in NYC with two people in the thick of it -- my son Steve and his wife Inka. Steve told me: "read Tom Wolfe's The Painted Word". And I did. What an eye opener. The whole history of 20th century art began to make sense. If you haven't cracked this slim volume, let me reproduce the quote that sets him off -- from Hilton Kramer in the April 28, 1972 Times (reviewing a show at Yale in fact):
Realism does not lack its partisans, but it does rather conspicuously lack a persuasive theory. And given the nature of our intellectual commerce with works of art, to lack a persuasive theory is to lack something crucial -- the means by which our experience of individual works is joined to our understanding of the values they signify.
He goes on to detail the many theories shilled by art critics that supported all the isms of 20th century art. Now formulas began to seem more plausible as grist for this mill. Minimalism? Conceptual Art? Urban graffiti? Surely there's a place there somewhere for formulas. All it needs is its own unique persuasive theory! Maybe Dan and Bob's projects will have legs.
How should one view such speculation? My view of history in general, not just protohistory, is that it is always an exercise in Bayesian inference. We never have full knowledge of any past part of space-time. Even in our own lifetimes, we rely on faulty and selective memories in reconstructing events. Scholars have the illusion when they are relying only on primary sources that they are not making significant inferences, but I believe they are mistaken. Of course primary sources are infinitely better than secondary ones, but everyone has built up their personal prior on human behavior and human culture and uses this to expand the meager sources that survive into a full-blown reconstruction of some events. Indeed, Rushdie quotes his Cambridge Professor Hibbert saying "You must never write history until you can hear the people speak". Of course this is also the fundamental reason why histories of the same event written at various times in later centuries differ so much.
My personal experience reading Archimedes for the first time illustrates my bias: after getting past his specific words and the idiosyncrasies of the mathematical culture he worked in, I felt an amazing certainty that I could follow his thought process. I knew how my mathematical contemporaries reasoned and his whole way of doing math fit hand-in-glove with my own experience. I was reconstructing a rich picture of Archimedes based on my prior. Here he was working out a Riemann sum for an integral, here he was making the irritating estimates needed to establish convergence. I am aware that historians would say I am not reading him for what he says but am distorting his words using my modern understanding of math. I cannot disprove this but I disagree. I take math to be a fixed set of problems and results, independent of culture just as metallurgy is a fixed set of facts that can be used to analyze ancient swords. When, in the same situation, I read in his manuscript things that people would write today (adjusting for notation), I feel justified in believing I can hear him "speak".
Getting back to the Pythagorean rule, I think the first task is to ask why ancient peoples were led to study triangles. I think there are two interconnected and quite convincing reasons. One is that the value of a field depends on its area, and for buying and selling and inheriting and taxing farms, the numerical value of this area is indispensable. Another is that as towns grew and became cities, the most convenient shape for buildings and for the street plan was a rectangle. In the first case, the natural method is to break the field up into approximate rectangles or right triangles. A right triangle is half a rectangle, and a rectangle can be divided into two right triangles by its diagonal. So you need to be able to lay out perpendicular lines and recognize when one corner of a triangle is a right angle, or when a quadrilateral is a rectangle. In other words, the rulers of all ancient kingdoms needed skilled land measurers and master builders who knew some basic facts from geometry. This does not mean they required the Pythagorean rule, but it suggests how useful it would be.
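The surveyor's test for a right angle is just the converse of the Pythagorean rule: a triangle with sides a, b, c (c the longest) has a right angle opposite c exactly when a² + b² = c². A minimal sketch of this arithmetic (the function names are mine, purely for illustration; whether ancient surveyors used a 3-4-5 rope triangle in exactly this way is of course conjectural):

```python
def is_right_triangle(a, b, c, tol=1e-9):
    """Converse of the Pythagorean rule: a triangle has a right angle
    iff the squares of its two shorter sides sum to the square of the
    longest side (up to a numerical tolerance)."""
    a, b, c = sorted((a, b, c))
    return abs(a * a + b * b - c * c) < tol

def right_triangle_area(a, b):
    """A right triangle with legs a and b is half an a-by-b rectangle."""
    return a * b / 2

# The classic 3-4-5 rope triangle yields a right angle; a 2-3-4 one does not.
print(is_right_triangle(3, 4, 5))   # True
print(is_right_triangle(2, 3, 4))   # False
print(right_triangle_area(3, 4))    # 6.0
```

Cutting a field into such half-rectangles and summing the areas is all the land measurer needs.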
In Mesopotamia we are unbelievably lucky that records made on clay tablets, unlike records made on paper, papyrus, birch bark or string, are nearly permanent. Fire, for instance, makes clay more permanent instead of destroying it. We have a nearly three-millennium record of clay tablets (and tokens) from Mesopotamia from which its cultural history can be reconstructed. Denise Schmandt-Besserat has used this data to construct a very convincing story of the origin of writing in third-millennium BCE Mesopotamia, starting from clay tokens, then clay envelopes and finally cuneiform on solid clay tablets. Essentially, her theory says it all started from the need to say "Mr. so-and-so owns such-and-such". Their highly sophisticated place-value base-60 arithmetic seems to have originated from the need for unified central accounting (perhaps in Ur III) covering goods and labor, which had been measured with many units often related by multiples such as 4, 5, 6, 10, 12, etc. Remarkable accounting tablets survive with detailed entries of labor and goods: see Richard Mattessich's book "The Beginnings of Accounting".
How about the measurement of land? The following wonderful paean to the Goddess Nisaba, who received literacy and numeracy as a wedding present from Enlil and passed it down to human beings, is found on one Babylonian tablet:
Nisaba, woman sparkling with joy,
Righteous woman, scribe, lady who knows everything:
She leads your fingers on the clay,
She makes them put beautiful wedges on the tablets,
She makes them sparkle with a golden stylus,
A 1-rod reed and a measuring rope of lapis lazuli,
A yardstick, and a writing board which gives wisdom:
Nisaba generously bestowed them on you.
The "1-rod reed" and the "measuring rope" are the basic tools of the surveyor, here praised on a par with writing. Many "deed" tablets survive with plans of fields and measurements. But, to my way of thinking, the most impressive demonstration of their knowledge of Pythagoras is the tablet MS 3049 in the Schøyen collection, which appears above in the sidebar. Here the authors calculate the distance in a gateway through a thick wall from, e.g., the inner left bottom corner to the outer right top corner. Now Pythagoras is ostensibly a theorem about triangles -- but really it describes distances in Cartesian coordinates in 2 dimensions. Iterating it, one gets the distance in \( \mathbb R^n \) as the square root of the sum of the squares of each coordinate change: $$ d(\vec x,\vec y) = \sqrt{\sum_{i=1}^n (x_i-y_i)^2}$$ The great importance of Pythagoras's theorem is this corollary. And here, from Uruk in Babylonia, sometime in the 17th century BCE, we find it used in 3-space.
On the left, a translation of the tablet from J. Friberg, "A Remarkable Collection of Babylonian Mathematical Texts"; on the right, his diagram of the calculation. Note that the numbers are in sexagesimal (so, for example, "6 40" means 6*60+40=400), "heap them" means add, "let eat itself" means square it, and "likeside" is its square root.
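The sexagesimal convention in the caption is easy to mechanize; here is a small sketch in Python (the function name is mine):

```python
def from_sexagesimal(digits):
    """Interpret a list of base-60 digits, most significant first, as an integer."""
    value = 0
    for d in digits:
        value = value * 60 + d
    return value

# the caption's example: "6 40" means 6*60 + 40 = 400
assert from_sexagesimal([6, 40]) == 400
```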
An aside: another tablet, Plimpton 322, which is in the sidebar to the home page of the site, is often used as evidence of the Mesopotamians' knowledge of the Pythagorean theorem. It contains a list of pairs \( (s,d) \) where \( d^2-s^2 \) is the square of an exact sexagesimal number \( \ell \) (i.e. one of finite length) -- otherwise said, "Pythagorean triples" \( (s, \ell, d) \). As the tablet lists these for triangles with angles steadily decreasing from about 44 degrees to 32 degrees, it has been thought to be an equivalent of a table of sines, or perhaps a manual for earthworks giving simple distances that could be laid out by surveyors. However, Eleanor Robson has proposed instead (in the American Math Monthly, Oct. 2001) that it was simply a table of reciprocal pairs \( (x,1/x) \) (now missing because the tablet broke) together with their sums and differences reduced to sexagesimally simple forms to simplify the work of setting problems, i.e. a teacher's manual. Its only hint of relating to triangles is that the heading of the column of \( d \)'s is labelled "diagonal". But Schøyen 3049 explicitly uses the Pythagorean rule twice.
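To see what is special about the Plimpton 322 pairs, one can search for triples \( (s, \ell, d) \) with \( d^2 - s^2 = \ell^2 \) and \( \ell \) "regular", i.e. with no prime factors other than 2, 3 and 5, so that it has a finite sexagesimal reciprocal. A sketch (the function names are mine, and this makes no claim about how the scribes actually worked):

```python
import math

def is_regular(n):
    """True if n has no prime factors other than 2, 3, 5 (finite in base 60)."""
    for p in (2, 3, 5):
        while n % p == 0:
            n //= p
    return n == 1

def plimpton_style_triples(limit):
    """All (s, l, d) with s < d < limit, d^2 - s^2 = l^2 and l regular."""
    triples = []
    for d in range(2, limit):
        for s in range(1, d):
            l = math.isqrt(d * d - s * s)
            if l * l == d * d - s * s and is_regular(l):
                triples.append((s, l, d))
    return triples

# the first row of Plimpton 322 corresponds to the triple (119, 120, 169)
assert (119, 120, 169) in plimpton_style_triples(170)
```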
Who were the people who came up with this -- arguably the first "non-trivial" fact in mathematics? We know that there were scribal schools in which apprentices were trained in the three R's -- reading, (w)riting and (a)rithmetic -- all highly skilled professions at the time. (Aside: besides arithmetic in base 60 being quite a challenge, the script, like contemporary Japanese, was a mixture, in this case of Sumerian logograms and the Akkadian syllabary, hence another major challenge.) Bins of hundreds of discarded student tablets, many with errors, survive! Students in these schools became scribes working as bureaucrats, accountants, surveyors or teachers. But I contend that some scribes must have been mathematical geniuses too, or the Pythagorean rule could not have been discovered. Should we think of them as the world's first mathematicians? There is some controversy here. For Eleanor Robson, all this work was oriented to engineering, administrative and instructional needs -- measuring and designing canals, earthworks, etc. -- and she asserts that thinking of these scribes as mathematicians is a misguided anachronism that ignores the society in which they lived.
Perhaps this is just a reflection of the age-old tension between pure and applied mathematics. Many engineers have been mathematical geniuses. You don't have to be a professional mathematician to be a mathematical genius, and it does seem a stretch to call anyone from that time a mathematician. Following Hibbert's dictum, let's imagine a brilliant civil servant whose day job was measuring fields or construction sites and writing tablets with associated plans, but whose imagination was caught by these geometric diagrams and who then played with how these diagrams constrained lengths and areas (one might think of Einstein in the Swiss patent office). But how was the rule found? This is the real mystery. Jens Høyrup in his book "Lengths, Widths, Surfaces: A Portrait of Old Babylonian Algebra and Its Kin" proposes, in connection with his analysis of tablet Db_{2} 146, that the Babylonians discovered a version of the famous Xian Tu diagram that appears in Chinese manuscripts of the Early Han dynasty (see below) and is also shown in the sidebar. The key to this diagram is to inscribe one square inside another at the angle that makes the gaps in the four corners all equal to the given triangle. Unfortunately, no trace of such a diagram has been found on a tablet. However, the case where the inner square is oriented at 45° is found on tablet BM 15285, shown in the sidebar. And once you conceive of this diagram, there are many ways to prove the rule. Høyrup, analyzing very carefully the exact words on tablet Db_{2} 146, proposes one in his book, p.259, figure 67. Here's my favorite, with \( A,B,C \) denoting the sides of the white triangles in the four corners:
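Spelled out in modern algebra (my paraphrase of what the diagram shows): the tilted square of side \( C \) sits inside the outer square of side \( A+B \), leaving four right triangles with legs \( A \) and \( B \) in the corners. Comparing the two ways of computing the area gives

$$ (A+B)^2 \;=\; C^2 + 4\cdot\tfrac12 AB \quad\Longrightarrow\quad A^2 + 2AB + B^2 = C^2 + 2AB \quad\Longrightarrow\quad A^2 + B^2 = C^2. $$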
This begins to feel like a considerable speculative leap. But since Schøyen 3049 makes it unmistakable that somehow they found the result, I think we have to entertain such a speculation. But then did other cultures discover the result independently? Not necessarily: if we accept that Pythagoras's theorem and the accompanying geometry were very useful for taxes and building, it is only natural that knowledge of it would spread to nearby civilizations with which Mesopotamia had regular trade. Master builders and surveyors would be in demand and some would likely migrate. Both Egypt and the Indus Valley culture flourished at much the same time and so might have learned of the latest technology from Babylon. Sadly, in both cases we have much sparser remains from which to deduce what they knew. From Egypt, the so-called "Scorpion Macehead" shows the pharaoh seeding the fields adjacent to the Nile after its flood and is dated c.3000 BCE. To reconstruct the fields, "rope stretchers" were employed, and paintings testify that knotted ropes were their principal tools. It is widely believed that they used the 3-4-5 triangle to lay out right angles for construction purposes. But the only evidence for this is problem 1 in the Berlin Papyrus 6619, where the equations \( x^2 + y^2 = 100, y/x=3/4 \) are solved. According to a recent review, "Traditions and myths in the historiography of Egyptian mathematics" by Annette Imhausen (in the Oxford Handbook of the History of Mathematics, p.791), judging from the mathematical papyri that have survived, it is doubtful that they knew the statement of the Pythagorean rule in general. Moreover, structures such as the great pyramid of Giza were built about 800 years before the above tablets were written. My guess is that, in the Old Kingdom, squares were laid out by using ropes to ensure that all sides were equal and both diagonals were equal.
It's also plausible that the technique of laying out right triangles by a rope with knots at spaces 3, 4 and 5 could have been transmitted from Babylon during the Middle Kingdom while its theoretical background was not.
As for the Indus Valley culture, we have about 3700 inscriptions containing about 400 symbols, but this is no help as they are still untranslated. But there are Sumerian descriptions of trade with a place in the East called "Meluhha", often identified with the Indus Valley, and identical clay seals are found in the Indus Valley and in Mesopotamia. Their cities were laid out with very regular rectangular street plans, indicating their need for skilled surveying (as does the universal concern with fields). What makes the possibility of transmission of the full Pythagorean rule to the Indus Valley a bit more plausible, however, is how the theorem crops up very explicitly in the Indian Vedic period, in the Sulba Sutra of Baudhayana, usually dated c. 800 BCE. Here the rule is used not for laying out fields, streets or buildings but for laying out sacrificial fire altars. The Vedic invaders of Northwest India are thought to have occupied the Indus Valley during the late periods of the Indus Valley culture and then spread East. How they interacted or interbred with the natives in this land, and what, if anything, they picked up from them, is the subject of great controversy. Regardless of where you stand on these sensitive issues, it is startling to find in the Vedic Sutras not only the Pythagorean rule but the basic geometric constructions with ropes used in Mesopotamia and Egypt (and likely the Indus Valley): see the figure above in the sidebar. If you put the Sulba Sutras next to a book on the geometry in the Mesopotamian tablets, the similarities are stunning. You might wonder why area was important to the Vedic peoples. There is a simple ritual reason: if a sacrifice did not achieve its aim, it was repeated, increasing the area of the altar by a ratio \( (n+1)/n \) for increasing values of \( n \). If you use Pythagoras's rule, this is easy to do with ropes. We also find, a bit later, very sophisticated accounting used in the Maurya empire.
All in all, it seems a reasonable speculation that a good deal of math was transmitted via the Indus Valley people to the Vedic Indians.
How about China? A key problem with the history of Chinese math is that mathematics and mathematicians never held an important place in Chinese culture. Math was a tool for low level bureaucrats and, in many dynasties, was not even part of the imperial exams. Astronomy and its sister, Astrology, held a somewhat higher place. But these were not esteemed as much as writing poetry and essays on Confucian ideals. After the massive burning of ancient documents and the burying alive of recalcitrant mandarins in the Qin dynasty, the Han dynasty scholars were able to reconstruct much of the ancient dynastic histories and Confucian manuscripts but only the final state of the math, not its history. Nonetheless, in what they reconstructed the Pythagorean rule emerges full blown. It occupies a full chapter in the main Han dynasty treatise, the "Nine Chapters on the Mathematical Art" (Jiu Zhang Suan Shu) and the proof using the Xian Tu appears in somewhat garbled form in the surviving late Zhou manuscript "Zhou Bi Suan Jing" (sometimes translated as the "Arithmetical Classic of the Gnomon").
Was this rule, as well as the use of Gaussian elimination and negative numbers to solve systems of linear equations, all discovered in the burst of creative activity in the Han dynasty? Chinese culture had expanded and built sophisticated societies with elaborate governments, earthworks, etc. for over a thousand years preceding the Qin. Confucius had lived three centuries earlier, as had scientifically inclined philosophers like Mo Tzu. Although there is no direct evidence, it seems much more likely that Pythagoras's rule had been discovered sometime in the Zhou dynasty (1046-256 BCE, often subdivided into the Zhou proper, then the Spring and Autumn period and finally the Warring States period). It also seems unlikely that its statement was transmitted from the Middle East in these early times: the culture of the Middle Kingdom has its own very distinct writing and founding myths. It seems most likely to me that another unsung mathematical genius discovered it in China in the early first millennium BCE.
Enough speculation. My central point is, first, that early math was applied math, embedded in practical tasks, especially accounting and surveying. Secondly, the algorithms in these fields can be transmitted to other cultures by their practitioners -- bureaucrats, scribes and master builders -- just as well as by the experts who first formulated them. But thirdly, for a few of these experts, the math they uncovered took on a life of its own: they pushed things to a deeper level, and their discoveries, such as the Pythagorean rule, should be celebrated as much as the discoveries of metals and wheels. I think it is not anachronistic to call those experts mathematicians, and I suspect they felt not unlike what my colleagues feel today when they find something new.
On Jan.10, Dick Palais wrote me the following comments:
The Pythagoras topic is one that has interested me for a long time (for reasons that will appear below). What motivates me to write is a desire to convince you of my long-held view that Pythagoras' Rule should be considered fairly superficial rather than "deep", and that certainly, when properly approached, it is intuitively and mathematically almost obvious. Let me first give a quick outline and then add some further detail and discussion. Let's start by accepting:
Area Principle for Convex Polygons. The areas of similar convex polygons are proportional to the squares of the lengths of corresponding sides. In particular, the areas of similar right triangles are proportional to the squares of the lengths of their hypotenuses.
Thus, given a right triangle \( T' \), there is a positive constant \( k \) such that, for any triangle \( T \) similar to \( T' \) and having a hypotenuse of length \( c \), the area of \( T \) equals \( k c^2 \). Now think of the hypotenuse of \( T \) as its base, and drop the perpendicular from the vertex of its right angle onto its base. This divides \( T \) into two non-overlapping triangles similar to \( T \) (and so to \( T' \)) whose hypotenuse lengths are the lengths, \( a \) and \( b \), of the two shorter sides of \( T \). Hence, by the additivity of area, \( k a^2 + k b^2 = k c^2 \), and dividing by \( k \) gives Pythagoras' Rule.
I suppose one might object that the Area Principle is not "obvious" or elementary, but at least the special case for right triangles is an immediate consequence of the proposition that "corresponding sides of similar triangles are proportional", which I think most would agree is elementary. In fact, using the above notation, let \( a', b', c' \) be the three side lengths of a triangle \( T' \), so for \( T \) similar to \( T' \) with side lengths \( a,b,c \) we have \( {a\over a'} = {b\over b'} = {c\over c'} \). Now (by definition) the areas of \( T \) and \( T' \) are respectively \( A(T) = {1\over 2} ab \) and \( A(T') = {1\over 2} a'b' \), so $$A(T) = A(T') {A(T)\over A(T')} = {1\over 2}(a'b') {a\over a'} {b\over b'} =\\ {1\over 2}(a'b') ({c\over c' })^2 = {1\over 2} \left({a'b' \over c'^2} \right) c^2 = k c^2$$ where \( k \) is the constant \( {1\over 2} \left({a'b' \over c'^2} \right) \), proving the special case of the Area Principle.
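The argument can be sanity-checked numerically for a concrete triangle; here is a throwaway sketch in Python for the 3-4-5 case:

```python
# Check the area argument numerically for the 3-4-5 right triangle.
# The altitude from the right angle splits it into two triangles similar
# to the whole, whose hypotenuses are the legs a = 3 and b = 4.
a, b, c = 3.0, 4.0, 5.0
k = (0.5 * a * b) / c**2                              # area = k * hypotenuse^2
assert abs(k * a**2 + k * b**2 - k * c**2) < 1e-12    # additivity of area
```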
Now, while I think the above is all pretty "obvious", I do not mean that I discovered it on my own (though I wish I had). What is true is that Einstein, in his autobiography, says that this is the proof he discovered on his own, at age twelve, when his uncle challenged him to come up with a proof of Pythagoras' Rule. And what I strongly suspect is also true is that many different ancient geometers from various cultures in the far past stumbled upon this argument, and perhaps you should take the above into account in your Bayesian analysis, particularly in view of your remark: "I think there are two interconnected and quite convincing reasons. One is that the value of a field depends on its area and for buying and selling and inheriting and taxing farms, the numerical value of this area is indispensable."
I agree this is a great proof. It is essentially in Euclid's Book VI, specifically VI.8 and VI.19, though Euclid doesn't explicitly note that it reproves the Pythagorean rule. I just doubt that this was the route by which it was first discovered, being a pretty abstract approach. It strongly depends on the general notion of similarity as well as the quadratic dependence of area on size. This level of abstraction first appears in Greek math. Depending on how much abstraction one likes, one can also argue that rotations must be linear and that compact one-parameter subgroups of \( GL(2,\mathbb R) \) leave invariant a quadratic form. This is rather like Grothendieck's belief that when you have a sufficiently high level understanding of math, the concrete classical results will fall out.
Dick replied: "Gee, I felt it was more concrete and natural than say the one involving the Xian Tu diagram. Chacun à son goût I guess." He is using "natural" the way all professional mathematicians do -- namely natural from their higher perspective. I think the Xian Tu is more elementary: you just shuffle blocks around.
Marius Kempe wrote me drawing my attention to Christopher Cullen's article in the AMS Notices. Cullen argues on the basis of the slim surviving source material from the Han dynasty that it is unlikely that the Pythagorean rule was known any earlier:
"A manuscript of a pre-canonical mathematical text recently found in a tomb of the second century B.C. does not make use of the gougu (=Pythagorean) relation, even where one might have expected to find it, in problems relating to sawing a square beam out of a round log. So it seems that we have at least a rough fix on when gougu thinking began in China."
His conclusion: the rule was unknown in China before roughly the first century BCE. I feel this is exactly the main problem with standard historical scholarship. Why assume that the surviving sources describe well the full state of knowledge at that point in space-time? Specifically, why assume that the above-mentioned manuscript, the Suan Shu Shu, found in a tomb, contains a comprehensive survey of contemporary mathematical knowledge? We know that the Qin leaders systematically burned older manuscripts and killed mandarins, and that this caused a big problem for the Han dynasty in rebuilding their culture. Getting down to the brass tacks of his argument, Cullen translates the key passage in the possibly much older Zhou Bi Suan Jing on p.786 of his article in order to refute the admittedly creative translations that aim to show it contains a description of the Xian diagram that establishes the Pythagorean rule. But his translation is equally mysterious to me because he gives no diagram to explain it. If he had translated it so as to give a more pedestrian sense to the passage, he might have an argument. Such details aside, I feel the strongest argument for earlier Chinese knowledge of the rule is the long history of their culture, their cities and their civil works. It would seem bizarre if they failed to find this rule, one that was widely used in all other cultures that attained this level.
As a rule, I do not like the trend that says Euclid knew algebra because we can interpret geometry algebraically today. Yet we can revisit the past and say, based on what we know now, we can also do this...
It is easy to read far too much into the Sanskrit of Brahmagupta for example. Then people say Brahmagupta defined a negative number as a positive number subtracted from zero. That may be the case, yet Brahmagupta did NOT say it!
"Pythagoras ostensibly is a theorem about triangles." Yet the circle on the hypotenuse equals the sum of the circles on the other two sides. From triangles to squares is one step, yet why has nobody written about the beauty of from triangles to circles? Yes, area, land and tax lead to a preference for squares on sides, not any of the other shapes that follow the same principle.
"Math was a tool for low level bureaucrats " Same in Greece and India. Arithmetica was number theory not the application of numbers to objects. (Arithmoi vs arithmos?) In India practical math was called logistics and associated with engineering.
The Chinese did NOT have negative numbers that mean a number is less than zero. They had opposite units that were both counted with the same numbers. So the red and black rods counted opposite things such as debts and assets or income and expense. As in India, numbers of objects jumped to the opposite unit upon reaching zero-India or empty-China.
The West fell for the trap of numbers less than zero only because of a focus on the right hand side of the number line and religious baggage associated with zero infinity and less than zero, which challenged the Church.
As for the comment of Roy Smith on this post, just two days ago I created this applet http://tube.geogebra.org/student/m1242381 which Steiner wrote about and Euclid proved in III P36.
As for Roy's last question proportion is at the heart of universal mathematics - not just because of Thales. I did a word count to compare additive words with proportional words in Newton's Principia. The results are at http://www.jonathancrabtree.com/mathematics/master-key-unlocks-science-addition-proportion/.
Now I must finish my applet showing Pythagoras with circles...http://tube.geogebra.org/student/m1256107
A couple of replies: first, the Vedic Indians needed to scale up complex shaped altars so that their area increased by a factor \( (n+1)/n \) and these shapes were sometimes approximations to circles (not exact because they were made of rectangular bricks). So this extension of the Pythagorean rule would not surprise them.
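The scaling step can be checked with a quick computation (a sketch in Python; the function name is mine, and I am ignoring how \( \sqrt n \) itself was laid out with ropes): a right triangle with legs \( s \) and \( s/\sqrt n \) has hypotenuse \( s\sqrt{(n+1)/n} \), the side of the enlarged altar.

```python
import math

def scaled_side(s, n):
    """Side of a square whose area is (n+1)/n times that of a square of side s,
    obtained as the hypotenuse of a right triangle with legs s and s/sqrt(n)."""
    return math.hypot(s, s / math.sqrt(n))

s, n = 1.0, 3
# the new square's area exceeds the old by exactly the ratio (n+1)/n
assert abs(scaled_side(s, n)**2 - (n + 1) / n * s**2) < 1e-12
```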
Second, if you can add, subtract and multiply mixed expressions of positive and negative numbers, it seems a quibble to say that because you made them red and black, you did not think of them as part of one number system.
So John and I agreed and wrote the obituary below. Since the readership of Nature is more or less entirely made up of non-mathematicians, it seemed as though our challenge was to try to make some key parts of Grothendieck's work accessible to such an audience. Obviously the very definition of a scheme is central to nearly all his work, and we also wanted to say something genuine about categories and cohomology. Here's what we came up with:
Alexander Grothendieck
David Mumford and John Tate
Although mathematics became more and more abstract and general throughout the 20th century, it was Alexander Grothendieck who was the greatest master of this trend. His unique skill was to eliminate all unnecessary hypotheses and burrow into an area so deeply that its inner patterns on the most abstract level revealed themselves -- and then, like a magician, show how the solution of old problems fell out in straightforward ways now that their real nature had been revealed. His strength and intensity were legendary. He worked long hours, transforming totally the field of algebraic geometry and its connections with algebraic number theory. He was considered by many the greatest mathematician of the 20th century.
Grothendieck was born in Berlin on March 28, 1928 to an anarchist, politically activist couple -- a Russian Jewish father, Alexander Shapiro, and a German Protestant mother, Johanna (Hanka) Grothendieck -- and had a turbulent childhood in Germany and France, evading the Holocaust in the French village of Le Chambon, known for protecting refugees. It was here, in the midst of the war, at the (secondary school) Collège Cévenol, that he seems to have first developed his fascination for mathematics. He lived as an adult in France but remained stateless (on a "Nansen passport") his whole life, doing most of his revolutionary work in the period 1956-1970 at the Institut des Hautes Études Scientifiques (IHES) in a suburb of Paris, after it was founded in 1958. He received the Fields Medal in 1966.
His first work, stimulated by Laurent Schwartz and Jean Dieudonné, added major ideas to the theory of function spaces, but he came into his own when he took up algebraic geometry. This is the field where one studies the locus of solutions of sets of polynomial equations by combining the algebraic properties of the rings of polynomials with the geometric properties of this locus, known as a variety. Traditionally, this had meant complex solutions of polynomials with complex coefficients, but just prior to Grothendieck's work, André Weil and Oscar Zariski had realized that much more scope and insight was gained by considering solutions and polynomials over arbitrary fields, e.g. finite fields or algebraic number fields.
The proper foundations of the enlarged view of algebraic geometry were, however, unclear and this is how Grothendieck made his first, hugely significant, innovation: he invented a class of geometric structures generalizing varieties that he called schemes. In simplest terms, he proposed attaching to any commutative ring (any set of things for which addition, subtraction and a commutative multiplication are defined, like the set of integers, or the set of polynomials in variables x,y,z with complex number coefficients) a geometric object, called the Spec of the ring (short for spectrum) or an affine scheme, and patching or gluing together these objects to form the scheme. The ring is to be thought of as the set of functions on its affine scheme.
To illustrate how revolutionary this was, a ring can be formed by starting with a field, say the field of real numbers, and adjoining a quantity \(\varepsilon\) satisfying \(\varepsilon^2 = 0\). Think of \(\varepsilon\) this way: your instruments might allow you to measure a small number such as \( \varepsilon = 0.001 \) but then \( \varepsilon^2 = 0.000001\) might be too small to measure, so there's no harm if we set it equal to zero. The numbers in this ring are \(a+b\cdot\varepsilon\) with real a,b. The geometric object to which this ring corresponds is an infinitesimal vector, a point which can move infinitesimally but to second order only. In effect, he is going back to Leibniz and making infinitesimals into actual objects that can be manipulated. A related idea has recently been used in physics, for superstrings. To connect schemes to number theory, one takes the ring of integers. The corresponding Spec has one point for each prime, at which functions have values in the finite field of integers mod p and one classical point where functions have rational number values and that is 'fatter', having all the others in its closure. Once the machinery became familiar, very few doubted that he had found the right framework for algebraic geometry and it is now universally accepted.
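The ring of dual numbers described here can be made concrete in a few lines of Python; the following sketch (the class name is mine) simply encodes the rule \( \varepsilon^2 = 0 \):

```python
class Dual:
    """Elements a + b*eps of the dual numbers, with eps^2 = 0."""
    def __init__(self, a, b=0.0):
        self.a, self.b = a, b

    def __add__(self, other):
        return Dual(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # (a1 + b1*eps)(a2 + b2*eps) = a1*a2 + (a1*b2 + b1*a2)*eps, since eps^2 = 0
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)

eps = Dual(0.0, 1.0)
sq = eps * eps
assert (sq.a, sq.b) == (0.0, 0.0)       # eps squares to exactly zero
p = Dual(1.0, 3.0) * Dual(2.0, 5.0)     # (1 + 3*eps)(2 + 5*eps) = 2 + 11*eps
assert (p.a, p.b) == (2.0, 11.0)
```

Incidentally, the same trick underlies modern forward-mode automatic differentiation, where the coefficient of eps tracks a derivative.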
Going further in abstraction, Grothendieck used the web of associated maps -- called morphisms -- from a variable scheme to a fixed one to describe schemes as functors and noted that many functors that were not obviously schemes at all arose in algebraic geometry. This is similar in science to having many experiments measuring some object from which the unknown real thing is pieced together or even finding something unexpected from its influence on known things. He applied this to construct new schemes, leading to new types of objects called stacks whose functors were precisely characterized later by Michael Artin.
His best known work is his attack on the geometry of schemes and varieties by finding ways to compute their most important topological invariant, their cohomology. A simple example is the topology of a plane minus its origin. Using complex coordinates \((z,w)\), a plane has four real dimensions, and taking out a point, what's left is topologically a three-dimensional sphere. Following the inspired suggestions of Grothendieck, Artin was able to show, with algebra alone, that a suitably defined third cohomology group of this space has one generator -- that is, the sphere lives algebraically too. Together they developed what is called étale cohomology at a famous IHES seminar. Grothendieck went on to solve various deep conjectures of Weil, develop crystalline cohomology and a meta-theory of cohomologies called motives with a brilliant group of collaborators whom he drew in at this time.
In 1969, for reasons not entirely clear to anyone, he left the IHES where he had done all this work and plunged into an ecological/political campaign that he called Survivre. With a breathtakingly naive spirit (that had served him well doing math) he believed he could start a movement that would change the world. But when he saw this was not succeeding, he returned to math, teaching at the University of Montpellier. There he formulated remarkable visions of yet deeper structures connecting algebra and geometry, e.g. the symmetry group of the set of all algebraic numbers (known as its Galois group Gal \( (\overline{\mathbb Q}/\mathbb Q) \) ) and graphs drawn on compact surfaces that he called 'dessins d'enfants'. Despite his writing thousand-page treatises on this, still unpublished, his research program was only meagerly funded by the CNRS (Centre National de la Recherche Scientifique), and he accused the math world of being totally corrupt. For the last two decades of his life he broke with the whole world and sought total solitude in the small village of Lasserre in the foothills of the Pyrenees. Here he lived alone in his own mental and spiritual world, writing remarkable self-analytic works. He died nearby on Nov. 13, 2014.
As a friend, Grothendieck could be very warm, yet the nightmares of his childhood had left him a very complex person. He was unique in almost every way. His intensity and naivety enabled him to recast the foundations of large parts of 21st century math using unique insights that still amaze today. The power and beauty of Grothendieck's work on schemes, functors, cohomology, etc. is such that these concepts have come to be the basis of much of math today. The dreams of his later work still stand as challenges to his successors.
The sad thing is that this was rejected as much too technical for their readership. Their editor wrote me that 'higher degree polynomials', 'infinitesimal vectors' and 'complex space' (even complex numbers) were things at least half their readership had never come across. The gap between the world I have lived in and that even of scientists has never seemed larger. I am prepared for lawyers and business people to say they hated math and not to remember any math beyond arithmetic, but this!? Nature is read only by people belonging to the acronym 'STEM' (= Science, Technology, Engineering and Mathematics) and in the Common Core Standards, all such people are expected to learn a hell of a lot of math. Very depressing.
Added on Dec. 28
Well, Nature magazine really wanted to publish some obit of Grothendieck and wore us down until we agreed to a severely stripped-down re-edit. The obit is coming out, I believe, in the Jan. 15 issue, and copyright prevents me from putting it here. The whole issue of trying to bridge the gap between the mathematician's world and that of other scientists or of lay people is a serious one, and I believe mathematicians could try harder to find bridges. An example is Gowers's work on bases in Banach spaces: when he received the Fields Medal, no one, to my knowledge, used the example of musical notes to explain Fourier series, and thus bases of function spaces, to the general public.
In the case of our obit, I had hoped that the inclusion of the unit 3-sphere in \( \mathbb{C}^2 - (0,0) \) would be fairly clear to most scientists and so could be used to explain Mike Artin's breakthrough that \( H^3_{\acute{e}tale}(\mathbb{A}^2-(0,0)) \ne (0) \). No: excised by Nature. I had hoped that the "web of maps" was an excellent metaphor for the functor represented by an object in a category and gave one the gist. No: excised by Nature. I had hoped that the "symmetry group of the set of all algebraic numbers" might pass muster to define this Galois group. No: excised by Nature. To be fair, they did need to cut down the length and they didn't want to omit the personal details.
The essential minimum, I thought, for a Grothendieck obit was to make some attempt to explain schemes and say something about cohomology. To be honest, the central stumbling block for explaining schemes was the word "ring". If you haven't taken an intro to abstract algebra, where to begin? The final draft settled on mentioning in passing three examples -- polynomials (leaving out the frightening phrase "higher degree"), the dual numbers and finite fields. We batted about Spec of the dual numbers until something approaching an honest description came out, using "very small" and "infinitesimal distance". As for finite fields, in spite of John's discomfort, I thought the numbers on a clock made a decent first exposure. OK, \( \mathbb{Z}/12 \mathbb{Z} \) is not a field, but what faster way to introduce finite rings than saying "numbers that are added like the hours on a clock -- 7 hours after 9 o'clock is not 16 o'clock, but 4 o'clock"? We then describe characteristic p as a "discrete" world, in contrast to the characteristic 0 classical/continuous world. In another direction, we also added the clause "inspired by the ideas of the French mathematician Jean-Pierre Serre", an acknowledgement of their extraordinary collaboration.
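For readers who like to experiment, both the clock example and the dual numbers can be played with in a few lines of code. This is only an illustrative sketch of the two rings mentioned above: the hours-on-a-clock ring \( \mathbb{Z}/12\mathbb{Z} \) (which has zero divisors, hence is not a field, unlike \( \mathbb{Z}/7\mathbb{Z} \)) and the dual numbers with \( \epsilon^2 = 0 \); representing a dual number as a pair is my own choice here, not anything from the obit.

```python
# Clock arithmetic: the ring Z/12Z. Hours add and multiply "mod 12",
# and 3 * 4 = 12 = 0 on the clock, so Z/12Z has zero divisors and is
# not a field.
zero_divisors_mod12 = [(a, b) for a in range(1, 12) for b in range(1, 12)
                       if (a * b) % 12 == 0]
assert (3, 4) in zero_divisors_mod12

# By contrast Z/7Z (7 prime) is a finite field: every nonzero element
# has a multiplicative inverse.
for a in range(1, 7):
    assert any((a * b) % 7 == 1 for b in range(1, 7))

# The dual numbers a + b*eps with eps^2 = 0 model an "infinitesimal";
# here a dual number is just the pair (a, b).
def dual_mul(x, y):
    """(a + b*eps)(c + d*eps) = ac + (ad + bc)*eps, since eps^2 = 0."""
    a, b = x
    c, d = y
    return (a * c, a * d + b * c)

eps = (0, 1)
assert dual_mul(eps, eps) == (0, 0)   # eps is nonzero, yet eps^2 = 0
```

The last line is exactly the "very small" phenomenon: \( \epsilon \) is not zero, but it is so small that its square vanishes.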
The whole thing is a compromise and I don't want to say Nature is foolish or stupid not to allow more math. The real problem is that such a huge and painful gap has opened up between mathematicians and the rest of the world. I think that Middle and High School math curricula are one large cause of this. If math was introduced as connected to the rest of the world instead of being an isolated exercise, if it was shown to connect to money, to measuring the real world, to physics, chemistry and biology, to optimizing decisions and to writing computer code, fewer students would be turned off. In fact, why not drop separate High School math classes and teach the math as needed in science, civics and business classes? If you think about it, I think you'll agree that this is not such a crazy idea.
We've been having a lot of trouble with scientists, in particular life scientists. They are teaching calculus by radically dumbing it down: e.g. no trig, half a page on the chain rule, ... and very weak exams. This is being pushed by the Dean of LS, ostensibly so that math-phobic students are not turned off science. The people in charge seem to be ecologists, and they don't believe in any math that's not what they use. I suspect these students will be in real trouble when they take physics. I also suspect the readers of Nature think they know all important math and get upset if it's hinted that there's important math they haven't even heard of.
A sad story. How much math do biologists need? I would argue first of all that oscillations are a central part of every science plus engineering/economics/business (arguably excluding computer science), and one needs the basic tools for describing them -- sines and cosines, all of trig of course, Euler's formula \( e^{ix} = \cos(x) + i\sin(x) \) and especially Fourier series. And, of course, modeling a system by the path of a state vector in some \( \mathbb{R}^n \), often with a PDE, is also ubiquitous. For example, surely all ecologists have studied the Lotka-Volterra equation (wolf and rabbit population cycles). Algebra is more of a mixed bag. Splines are much more useful than polynomials for engineers, finite fields arise mostly in coding applications and I doubt that the abstract idea of a ring is ever needed. But polynomials and varieties have been used in Sturmfels' algebraic statistics and, as Lior Pachter noted (see below), are very effectively used in modeling genome mutation. But evolutionary genomics is one community within biology, and John and I figured we needed to throw into the obit a rough definition of a ring.
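Since the Lotka-Volterra equation comes up here (and earlier in this post), a minimal simulation may be worth including. The parameter values below are made up purely for illustration; the only point is the cyclical predator-prey behavior.

```python
def lotka_volterra(x0, y0, a=1.0, b=0.1, c=1.5, d=0.075, dt=0.01, t_max=40.0):
    """Integrate dx/dt = a*x - b*x*y, dy/dt = -c*y + d*x*y with RK4.
    x = prey (rabbits), y = predators (foxes/wolves). Parameters are
    illustrative, not fitted to any real population."""
    def f(x, y):
        return a * x - b * x * y, -c * y + d * x * y
    xs, ys = [x0], [y0]
    x, y = x0, y0
    for _ in range(int(t_max / dt)):
        k1 = f(x, y)
        k2 = f(x + dt / 2 * k1[0], y + dt / 2 * k1[1])
        k3 = f(x + dt / 2 * k2[0], y + dt / 2 * k2[1])
        k4 = f(x + dt * k3[0], y + dt * k3[1])
        x += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        y += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        xs.append(x)
        ys.append(y)
    return xs, ys

xs, ys = lotka_volterra(10.0, 5.0)
# Both populations stay positive and the prey population keeps cycling:
assert min(xs) > 0 and min(ys) > 0
peaks = sum(1 for i in range(1, len(xs) - 1) if xs[i - 1] < xs[i] > xs[i + 1])
assert peaks >= 2
```

Plotting `xs` against `ys` traces out the closed orbits of the classic predator-prey cycle.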
The laundry list of differences between biology and math that I aired above can be overwhelming. Real contact between the subjects will be difficult to foster, and it should be acknowledged that it is neither necessary nor sufficient for the science to progress. But wouldn't it be better if mathematicians proved they are serious about biology and biologists truly experimented with mathematics?
David Mumford asks on his blog, "Can one explain schemes to biologists?" in the context of his and John Tate's obituary for Grothendieck being prepared for publication in Nature. He offers a first draft obit which was rejected as too technical, along with a lament about the chasm between math and other scientific fields. Their draft introduces Grothendieck's field of algebraic geometry as follows: "This is the field where one studies the locus of solutions of sets of polynomial equations by combining the algebraic properties of the rings of polynomials with the geometric properties of this locus, known as a variety."
I find it surprising that someone who has worked at the interface of mathematics, applied math and biology for so long was taken aback by Nature's reception. Of course this is too technical!
So, I wanted to try to take up Mumford's challenge.
Here's my first draft. Comments welcome.
Algebraic geometry is about solving equations. Not fancy equations involving trigonometric functions and exponentials, but ordinary, garden-variety equations involving \( x, x^2, x^3 \) and so on. Here's one: \( 3x^2 + 4x = 5x^3 + 6 \). Moving everything to the left side, we can write this as \( -6 + 4x + 3x^2 - 5x^3 = 0 \). The thing on the left is a polynomial, that is, a sum of terms, each one a multiple of some pure power of x. Let us call the polynomial \( f(x) \). So, deciding that we'll move everything to the left side, we study equations like \( f(x) = 0 \).

We can also study polynomials in two variables, such as \( g(x,y) = x^2 - y^2 \). In this case, the equation \( g(x,y) = 0 \) can be solved by factoring: \( x^2 - y^2 = 0 \) means \( (x+y)(x-y)=0 \), so the solutions are either \( y=x \) or \( y=-x \). That is, any point that lies on either the line \( y=x \) or \( y=-x \) (or both) gives a solution: indeed the point (4,-4) is a solution since \( 4^2-(-4)^2 = 0 \). The set of solutions looks like a giant, infinite X shape. Some polynomials cannot be factored. For example, if we put \( h(x,y) = x^2 + y^2 - 2 \) then \( h(x,y) = 0 \) means \( x^2 + y^2 = 2 \), and the set of solutions looks like a circle. Note that the giant, infinite X had two pieces, corresponding to the factors of \( g(x,y) \), while the circle has one piece, corresponding to the fact that we could not factor \( h(x,y) \).

We could also consider solving more than one equation simultaneously. For example, if we try to solve \( g(x,y) = 0 \) and \( h(x,y) = 0 \), this means that we need to find a point that lies both on the giant, infinite X and on the circle \( h(x,y) = 0 \). In total, there are four such points: (1,1), (1,-1), (-1,1), and (-1,-1). So algebraic geometry tries to characterize what the set of solutions to some number of polynomial equations looks like.
Polynomials are interesting mathematical objects because, like numbers, you can multiply them. Think of a polynomial j with one variable, say, as a rule which transforms a number x into a new number \( j(x) \). Then given two polynomials, j and k, we can define the product jk by the rule which transforms x into the number \( j(x)k(x) \), i.e. the product of \( j(x) \) and \( k(x) \). In mathematical language, we say that they form a ring. You can also add them: \( j+k \) transforms x into the sum of \( j(x) \) and \( k(x) \). And like with numbers, you get distributivity and other nice properties. In fact, above we found that certain polynomials could be factored, while others could not. This is analogous to the fact that certain whole numbers can be factored, e.g. \( 6 = 2\times 3 \), while other "prime" numbers such as 5 cannot. This algebraic property (being prime or composite) is reflected in the geometry of the solution space: the prime polynomial \( h(x,y) \) had one component in the geometry of its solution set (a circle) while the composite polynomial \( g(x,y) \) had two (the two lines which cross). Algebraic geometry is the study of this interplay. For example, note here that both g and h were "degree-two" polynomials, since terms like \( x^2 \) or \( y^2 \) involve the multiplication of two things, like an x with an x or a y with a y, and two is the maximum number required by any term in the polynomial. When we considered the simultaneous set of solutions to g and h, we found four points. Here we meet a demonstration of a mathematical theorem in algebraic geometry: Bezout's theorem says that the number of points equals the product of the degrees, and indeed here we have \( 4 = 2\times 2 \).
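The count of four intersection points is easy to verify mechanically. A small sketch in pure Python, using nothing beyond the two polynomials above: on \( g = 0 \) we have \( y = \pm x \), and substituting into \( h = 0 \) gives \( 2x^2 = 2 \), i.e. \( x = \pm 1 \).

```python
def g(x, y):
    return x**2 - y**2      # the giant, infinite X

def h(x, y):
    return x**2 + y**2 - 2  # the circle of radius sqrt(2)

# On g = 0 we have y = x or y = -x; substituting into h = 0 gives
# 2*x^2 = 2, so x = 1 or x = -1. Enumerate the candidates:
points = sorted({(x, s * x) for x in (1, -1) for s in (1, -1)})
assert len(points) == 4     # Bezout: 2 x 2 = 4 points
assert all(g(x, y) == 0 and h(x, y) == 0 for x, y in points)
print(points)               # [(-1, -1), (-1, 1), (1, -1), (1, 1)]
```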
A function is a machine which takes as input a point in some space and has as output a number. For example, our polynomials g and h are functions on the plane, since the inputs \( (x,y) \) are points in the plane. The output, such as \( g(2,3) = 2^2 - 3^2 = -5 \), is always a number. Recall that the set of points to which g assigns the number zero formed a giant X. What if we wanted to talk about functions on that X itself? That is, what if we were interested in assigning a number to each point on that X? In algebraic geometry, we often want to do such a thing. In order to study a space, you might study how it appears inside other spaces (such as the X in the plane) and you might study how other spaces appear inside it (such as the four points inside the X). Now here is one way to consider a function on X. Start with a function on the plane and restrict your inputs to points which lie on X. For example, we could apply the function \( h(x,y) \) to points \( (x,y) \) that lie on X (which is to say, points with \( g(x,y)=0 \) ). That's fine, but then you soon realize that sometimes two different functions on the plane restrict to the same function on X. For instance, if we compare h and \( h+g \), then on the plane they are different but on X they are the same (since on X, g is zero, so \( h+g \) equals \( h+0 \), which equals h). After we impose this notion of sameness, we get a new "ring" of functions, and in general these rings can have interesting properties. For example, consider the function \( j(x,y) = x+y \) as a function on X. Note j is equal to zero along the line from northwest to southeast, but j is nonzero on the other line. Therefore j is not the "zero" function which assigns zero to every point. Likewise, the function \( k(x,y) = x-y \) is nonzero, but is equal to zero along the line from southwest to northeast. Now note that together on X we have \( jk = 0 \).
The product of these two nonzero functions is zero when considered as functions on X. This is a very different phenomenon from what we are used to with numbers. With numbers, if the product of two numbers is zero, then one of them must be zero (possibly both). The lesson is that functions can be multiplied just like polynomials. Sometimes, the ring that they define can be interesting in novel ways, such as having the product of two nonzero objects being zero.
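The zero-divisor phenomenon just described can be seen concretely by evaluating j and k at sample points of X (here just integer points on the two lines):

```python
# Sample points of X, the union of the lines y = x and y = -x.
X_points = [(t, t) for t in range(-3, 4)] + [(t, -t) for t in range(-3, 4)]

def j(x, y):
    return x + y   # vanishes on the line y = -x only

def k(x, y):
    return x - y   # vanishes on the line y = x only

# Neither j nor k is the zero function on X ...
assert any(j(x, y) != 0 for x, y in X_points)
assert any(k(x, y) != 0 for x, y in X_points)
# ... yet their product vanishes at EVERY point of X, because
# j*k = x^2 - y^2 = g, and g is zero on X by definition.
assert all(j(x, y) * k(x, y) == 0 for x, y in X_points)
```

So the ring of functions on X really does contain nonzero elements whose product is zero, unlike the ordinary numbers.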
Recap: we can learn about the geometry of the space of solutions of some polynomial equations by studying their algebraic properties. The relationship between factoring and having multiple components was one example. Bezout's theorem was another. Functions on a space organize into an algebraic structure called a ring, since you can multiply them, and these rings can be more exotic than the rings formed by numbers or by plain old polynomials.
Now here is the crucial insight: we free ourselves from geometry and simply describe a space with its ring of functions. The plane would be described by the polynomials in two variables (omitting, always, the fancy trigonometric functions and such, in the land of algebra). The X would be described by that ring but with h and \( h+g \) thought of as the same, i.e. with g being identified with the zero function. The one-dimensional line would be described by polynomials in one variable. Even a single point can be described in this way! A function on a point must assign to that point a number, so the ring of functions is the ordinary ring of numbers, where multiplication is the usual multiplication. So we may think of each space as providing a generalization of the algebraic structure of ordinary numbers: each space is defined by (or defines) a ring of functions. This construction gives many interesting objects -- the so-called "affine schemes," but algebraic geometry contains yet more.
A scheme is a space described locally by a ring of functions. To give a flavor for what this might mean -- particularly the word "locally" -- consider a space which looks more like a Q than an X. Near where the tail of the Q meets the circle, there is a crossing which looks like a miniature X. What that means is that we should be able to zoom in our perspective and describe the points near the crossing as we would describe the X itself. Now in truth giving a notion of "nearness" can be subtle. Up until this point, we haven't relied on distances. For instance, we could have made all the same essential conclusions above using \( x^2 + y^2 - 50 \) instead of \( x^2 + y^2 - 2 \) and the circle h described would have been five times as large. Thus a scheme is a set of points equipped with a notion of nearness, such that on each "small" region a ring of functions is given. Further, these rings of functions must be compatible when considering the overlap of two regions. What this means is that if a small region A is contained in both region B and in region C, then the functions on A can be considered as restrictions of functions on B or as restrictions of functions on C.
That's about it. You take your geometric object (if you have a notion of geometry), look at a "small region" and describe the object by some equations. These equations tell you what the space of functions on the object is (e.g. which polynomials to consider "the same"). And you do this on enough small regions so that the whole object is described. If you want to free yourself from geometry entirely, you must provide a set with a notion of "nearby points," and give a ring (of functions) for each such neighborhood.
In fact, the only thing left to specify here is what we meant at the start by a "number." That is, we have to decide on the set of functions on a point! This choice determines which "numbers" we are allowed to use in our polynomial expressions. We might have meant the real numbers, the rational numbers, complex numbers, the whole numbers, or -- and here it gets deep very fast -- something more exotic. The only thing we really require is that whatever we decide a "number" is, we ensure that multiplication is associative and commutative, like for ordinary numbers.
Why do all this? Having "algebraicized" the problem completely, we see the power of this approach when geometry breaks down. For example, if you plot the solutions to the equation \( y^2 - x^3 = 0 \) you see a pointy object which doesn't have a tangent line at the origin (0,0). So certain geometric constructions are off limits. However, this kind of space poses no problems in scheme theory. The ring of functions is simply obtained: for example, you need to set two polynomials equal to each other if they differ by a multiple of \( y^2 - x^3 \), since that is zero on the space.
Obviously, Tate and Mumford were not afforded this much space by Nature, and just as obviously, without constraints they could communicate these ideas, too -- and better. Whatever its origin, the challenge was a good one. Did I meet it?
Eric, this is certainly a simple introduction to some of the ideas needed to explain schemes. But I think that it also illustrates why mathematicians are often unsuccessful in explaining their ideas to other scientists. The reason is that it seems to me to suffer from the mathematician's compulsion to always be 100% precise and complete, defining every concept used. All scientists know what a function is already, and the idea of restricting a function to some smaller set does not need to be spelled out in such detail. The second issue is that mathematicians, when giving examples, tend to start with trivial examples instead of going for an example that best illustrates the core idea. In your case, I think the equation \( x^2-y^2=0 \) is just too simple, and emphasizing reducible varieties seems to me just distracting. In the version Nature accepted, John and I use your third example, the circle -- a variety certainly known to all scientists -- and say "Algebraic geometry is the field that studies the solutions of sets of polynomial equations by looking at their geometric properties. For instance a circle is the set of solutions of \( x^2+y^2 = 1 \) and in general such a set of points is called a variety." I think the trick is to bootstrap the math on things scientists know, simplifying definitions (Stewart's maxim "Lie a little") and getting to some core non-trivial motivating example if possible.
"Explain Grothendieck's mathematics in everyday language? That seems difficult to me ..." Michel Demazure, himself a mathematician and a student of Alexandre Grothendieck in the 1960s, knows what he is talking about. "On the one hand, there is a mathematical history of which he is fully a part; on the other, his approach is very personal and, in a certain way, unique. Grothendieck rebuilt algebraic geometry, but he never wrote down an algebraic equation. He never looked at a particular object; the framework mattered more to him than the objects inside it. All mathematicians have objects in their heads, but his were not the same as everyone else's."
But frankly, I was quite disappointed by their struggle to say something meaningful about what schemes and functors are. They start, as John and I finally did, with a circle, but discuss how one can look at the integer and rational solutions of the equation of a circle as well as the real and complex solutions. This leads them to the following passage, where schemes and functors are strangely conflated. I'm not sure why they say a set of equations could have no solutions -- what happened to the Nullstellensatz? I guess they meant the variety has no points rational over the ground field.
Now, mathematics being the land of freedom, there is no reason not to consider the solutions of an equation, or of a system of equations, over any of the kinds of numbers mentioned above. This considerably enriches the variety of varieties ...
And this is where Grothendieck comes in. Recall that a variety is a geometric object representing the solutions of a system of equations. But there are cases where the system has no solution, so that the corresponding variety has no points. It cannot be drawn as a geometric figure. But can one still study it? Grothendieck's idea was to generalize the notion of a variety by way of its algebraic properties, "ignoring" the points: "Grothendieck does not concern himself with points; he deliberately forgets them," explains the French mathematician Jean-Michel Kantor. "His reasoning amounts to saying: even if I have an equation without solutions, I want to be able to study this object; so I will gather together a whole series of varieties, without knowing whether they have points, and I will construct a more general object that includes all possible cases."
This more general object is called a "scheme". The interest of schemes is that they enlarge the framework of algebra while preserving its most important properties. Schemes make it possible to treat within the same framework the world of whole numbers and that of continuous magnitudes, answering questions raised by Diophantus 1,800 years ago. Thus, with schemes, our circle can be studied over the whole numbers just as well as over the reals or some other type of numbers.
To start, suppose \( A \) is an \( n\times n \) matrix (real or complex, it doesn't matter). Of course, its powers are given by the usual formula, here for the cube: $$ (A^3)_{i,\ell} = \sum_{j,k} A_{i,j} \cdot A_{j,k} \cdot A_{k,\ell}. $$ One can think of this in a new way: let \( S=\{1,2,\cdots,n\} \). Then \( (i,j,k,\ell) \) is a discrete path in the set \( S \) from \( i \) to \( \ell \), and the matrix coefficients of the power are sums of terms, one for each path from its row index to its column index.
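This path-sum reading of matrix multiplication is easy to check numerically. A quick sketch with a random \( 3\times 3 \) matrix in pure Python:

```python
import random

n = 3
random.seed(1)
A = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]

def matmul(P, Q):
    return [[sum(P[i][j] * Q[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]

A3 = matmul(matmul(A, A), A)   # A^3 by ordinary matrix multiplication

def cube_by_paths(i, l):
    """(A^3)_{i,l} as a sum over all discrete paths i -> j -> k -> l
    through the set S = {0, ..., n-1}."""
    return sum(A[i][j] * A[j][k] * A[k][l]
               for j in range(n) for k in range(n))

for i in range(n):
    for l in range(n):
        assert abs(A3[i][l] - cube_by_paths(i, l)) < 1e-12
```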
Let's make the problem harder: instead of powers of a matrix, let's consider a 1-parameter group of matrices obtained by exponentiating a fixed matrix \( H \), namely \( U_t = e^{tH} \). Then one might expect that the matrix coefficients of \( U_t \) are sums (or better integrals) over continuous paths in \( S \). \( S \) being discrete, a path in \( S \) means a sequence of constant intervals interspersed with jumps, like a frog jumping on lily pads.
This is a finite version of what Feynman introduced in his path integral formalism for quantum mechanics. Note that in quantum mechanics the vectors in the space \( \mathbb R^n \) on which A operates are called the states of the system. (In QM, the states must actually be complex vectors, not real.) Feynman was dealing not with a matrix giving a linear operator on \( \mathbb R^n \) but with operators on a Hilbert space \( \mathcal H \). In the simplest case, \( \mathcal H \) could be \( L^2(\mathbb R) \), the states are then complex valued functions on \( \mathbb R \) and \( U_t \) could be an integral operator given by convolution with a kernel \( K(x,y,t) \). Then his goal was to write \( K(x,y,t) \) as a sum over all paths in \( \mathbb R \) from \( x \) to \( y \) of an expression involving the path and \( t \). He thinks of these as paths of an underlying classical particle moving in \( \mathbb R \). Of course, the set of paths is an infinite dimensional manifold and then to sum over all paths one needs a measure on the set of these paths with respect to which one can integrate. Finding the appropriate measure is one problem and showing the integrand he needs is in some sense integrable turned out to be even harder.
I want to develop his approach for finite dimensional \( U_t \) where everything is quite elementary. This is useful because finite dimensional quantum systems have come into prominence in the last decades as the setting for quantum computing. And the path integral formalism is the right one to use when you treat the interaction of this elementary system with the external world from which it can never be totally insulated.
Start by fixing a large integer \( N \). Then: $$ \begin{align*} (U_t)_{a,b} &= \left((U_{t/N})^N\right)_{a,b} \\ &= \sum_{a=k_0,k_1,\cdots,k_N=b} \prod_{i=1}^{N} (U_{t/N})_{k_{i-1},k_i} \end{align*} $$ Now if \( N \gg 0 \), \( U_{t/N}=e^{tH/N} \) is approximately equal to \( I+(t/N)H \). Thus if at some \( i \), \( k_{i-1}=k_i \), the term in the product is near 1, while otherwise it is a bounded number divided by \( N \), hence very small. From this we see that the more jumps the sequence \( k_i \) makes, the smaller the corresponding term in the product. So let \( J \) be the number of jumps and consider the sparser sequence of values \( a=k_0, k_1, \cdots, k_J=b \) where now \( k_{i-1} \ne k_i \) for all \( i \). The jumps take place at particular 'times' \( \ell_i/N \) and, setting \( \ell_0 = 0 \) and \( \ell_{J+1} = N \), we reformulate the above expression as: $$ \approx \sum_{J=0}^{\infty} \left( \frac{t}{N}\right)^J \!\! \sum_{\begin{array}{c} a=k_0\ne k_1\ne \cdots \ne k_J=b \\ 1 \le \ell_1 < \cdots <\ell_J \le N \end{array}} \prod_{i=1}^{J} H_{k_{i-1},k_i} \cdot e^{\sum_{i = 0}^{J} \frac{t(\ell_{i+1}-\ell_{i})}{N} H_{k_i,k_i}} $$ It shouldn't be hard to quantify the approximation error here, but let's skip this and pass quickly to the limit as \( N\rightarrow \infty \), where the expression becomes exact again. This leaves the \( k \) sequence alone, but now the \( \ell_i/N \)'s are replaced by intermediate times \( t_i \) in the interval \( [0,t] \) where the jumps take place, the sum over the \( \ell \)'s is replaced by an integral over the \( t \)'s, and you take into account the constant needed when the sum over the \( \ell \)'s is viewed as a Riemann sum for the integral over the \( t \)'s. What comes out (with \( t_0 = 0 \) and \( t_{J+1} = t \)) is: $$ =\sum_{J=0}^{\infty} \underset{\begin{array}{c} a=k_0 \ne k_1 \ne \cdots \ne k_J=b \\ 0 < t_1 < \cdots < t_J < t \end{array}}{\sum\int} \prod_{i=1}^{J} H_{k_{i-1},k_i} \cdot e^{\sum_{i = 0}^{J} (t_{i+1}-t_i) H_{k_i,k_i}} dt_1\cdots dt_J $$ Note that the integrand is bounded by a constant to the power \( J \) and the integral is over a simplex with volume \( t^J/J! \), hence we get convergence of the sum over \( J \). Going a step further, let \( X \) be the path space of piecewise constant functions \( f:[0,t]\rightarrow \{1,2,\cdots,n\} \) with a finite number of jumps. \( X \) breaks up into pieces \( X_J \) according to the number of jumps, these into pieces depending on the sequence \( \vec k \) of values of \( f \), and finally what remains are simplices in \( \mathbb R^J \). We have the euclidean measure on these components, hence a finite measure \( \mu_X \) on \( X \). We may write a point of \( X \) as a pair of vectors \( (\vec k, \vec t) \) describing its values and jump times. Let \( X(a,b) \) be the paths that begin at \( a \) and end at \( b \). Then we have the final theorem:

Theorem. For any \( n \times n \) matrix \( H \), the matrix entries of \( e^{tH} \) are given by: $$ \left(e^{tH}\right)_{a,b} = \int_{X(a,b)} \prod_{i=1}^{J} H_{k_{i-1},k_i} \cdot e^{\sum_{i = 0}^{J} (t_{i+1}-t_i) H_{k_i,k_i}} d\mu(\vec k, \vec t) $$
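The theorem can be checked numerically for a small matrix. The sketch below is my own verification, not part of the original argument: it takes a made-up \( 2\times 2 \) real matrix \( H \), computes \( e^{tH} \) by its Taylor series, and estimates the path integral by summing over the number of jumps \( J \), sampling each simplex by Monte Carlo. For \( n=2 \) the jump sequence starting at \( a \) is forced to alternate, so there is only one sequence per \( J \).

```python
import math
import random

random.seed(0)
H = [[0.2, 0.5],
     [0.3, -0.1]]   # made-up test matrix
t = 1.0
n = 2

def expm(H, t, terms=40):
    """e^{tH} by the Taylor series (fine for a small matrix)."""
    M = [[float(i == j) for j in range(n)] for i in range(n)]  # (tH)^0 = I
    R = [row[:] for row in M]
    for m in range(1, terms):
        M = [[sum(M[i][j] * H[j][k] * t / m for j in range(n))
              for k in range(n)] for i in range(n)]            # (tH)^m / m!
        R = [[R[i][k] + M[i][k] for k in range(n)] for i in range(n)]
    return R

def jump_sum(a, b, J_max=8, samples=20000):
    """Path-integral estimate of (e^{tH})_{a,b}: sum over the number of
    jumps J, Monte Carlo over the simplex 0 < t_1 < ... < t_J < t."""
    total = 0.0
    for J in range(J_max + 1):
        ks = [(a + i) % 2 for i in range(J + 1)]   # forced alternation
        if ks[-1] != b:
            continue
        if J == 0:
            total += math.exp(t * H[a][a])         # no jumps: pure dwell
            continue
        amp = 1.0
        for i in range(1, J + 1):
            amp *= H[ks[i - 1]][ks[i]]             # off-diagonal jump factors
        vol = t**J / math.factorial(J)             # simplex volume
        acc = 0.0
        for _ in range(samples):
            ts = [0.0] + sorted(random.uniform(0, t) for _ in range(J)) + [t]
            expo = sum((ts[i + 1] - ts[i]) * H[ks[i]][ks[i]]
                       for i in range(J + 1))      # diagonal dwell terms
            acc += math.exp(expo)
        total += amp * vol * acc / samples
    return total

E = expm(H, t)
for a in range(n):
    for b in range(n):
        assert abs(E[a][b] - jump_sum(a, b)) < 0.02
```

The terms fall off like \( \|H\|^J t^J/J! \), so truncating at \( J=8 \) is already far more accurate than the Monte Carlo noise here.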
I'm not sure one can convince college teachers of this but this result fits easily into the curriculum of undergrad linear algebra courses!
There is a definite reason why this way of writing \( U_t \) is important, which we now sketch. Consider a quantum computer or any other quantum effect that is modeled by a finite dimensional space. The space is now always a complex vector space with a Hermitian inner product, that is, a finite dimensional Hilbert space \( \mathcal H_\text{fin} \). But the rest of the world always intrudes to some extent, and this is modeled by a tensor product \( \mathcal H_\text{fin} \otimes \mathcal H_\text{heat} \). The second factor is another Hilbert space, usually referred to as a heat bath because it is often assumed to be in thermodynamic equilibrium. The evolution will be described by a joint Hamiltonian operator \( H = H_\text{fin} \otimes I_\text{heat} + I_\text{fin} \otimes H_\text{heat} + H_\text{inter} \) where \( H_\text{inter} \) is the interaction term. Then the system evolves according to \( U_t = e^{itH} \) (imaginary powers here -- note that this doesn't affect anything we did before).
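The joint Hamiltonian \( H_\text{fin} \otimes I_\text{heat} + I_\text{fin} \otimes H_\text{heat} + H_\text{inter} \) is easy to build explicitly for two 2-level systems via the Kronecker product. The matrix entries below are made-up toy values; the only point is the tensor-product construction and that a sum of Hermitian terms is again Hermitian.

```python
def kron(A, B):
    """Kronecker (tensor) product of two square matrices."""
    na, nb = len(A), len(B)
    return [[A[i // nb][j // nb] * B[i % nb][j % nb]
             for j in range(na * nb)] for i in range(na * nb)]

# Toy 2-level system and 2-level "heat bath" (entries are made up):
H_fin = [[1.0, 0.2], [0.2, -1.0]]
H_heat = [[0.5, 0.1], [0.1, -0.5]]
I2 = [[1.0, 0.0], [0.0, 1.0]]
sx = [[0.0, 1.0], [1.0, 0.0]]       # a simple coupling operator
H_inter = kron(sx, sx)              # interaction: sx (tensor) sx

HA = kron(H_fin, I2)                # H_fin acting on the first factor
HB = kron(I2, H_heat)               # H_heat acting on the second factor
H = [[HA[i][j] + HB[i][j] + 0.1 * H_inter[i][j]
      for j in range(4)] for i in range(4)]

# The joint Hamiltonian is Hermitian (real symmetric here):
assert all(H[i][j] == H[j][i] for i in range(4) for j in range(4))
```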
A classic 1963 paper of Feynman and Vernon showed how to describe the perturbation of the finite system that is caused by its coupling with the heat bath. Start by describing the evolution of \( \mathcal H_\text{fin} \) by integrating over paths \( f(t) \in X \). For each fixed path \( f \), the effect of coupling \( \mathcal H_\text{fin} \) to the heat bath is to add to its native Hamiltonian the term: $$ H_f:\mathcal H_\text{heat} \rightarrow \mathcal H_\text{heat}, \text{ where } \langle y, H_f(t)\, x \rangle = \langle f(t)\otimes y,\ H_\text{inter}(f(t) \otimes x)\rangle.$$ For simple models of heat baths, this \( H_f \) looks like adding an external force field to the heat bath, and it turns out that the composite Hamiltonian can be integrated by an explicit formula. Using this formula, we find that the evolution of the joint system can be described by the path integral in \( \mathcal H_\text{fin} \).
But what happens now is that states of the finite system get entangled with the heat bath, and this is not a useful description. We need to 'trace out' the heat bath in order to describe its effects on the finite system. This is done by retreating a bit from describing the system by a single state and accepting that we need to describe it as a mixed state. A mixed state is a probabilistic combination of many states described by a density matrix. If the mixture is made up of a set of vectors \( \vec x^{(a)} \in \mathbb C^n \), each with a probability \( p(a) \), so that \( \sum_a p(a)=1 \), then one defines the density matrix describing this mixed state by the Hermitian matrix: $$ \rho_{i,j} = \sum_a p(a) \bar x_i^{(a)} x_j^{(a)}.$$ It's a sad fact of life that any system entangled with the world needs to be described by these \( \rho \)'s and is never 'pure' anymore. What Feynman and Vernon did was to show that at least the density matrix of the finite system coupled to a heat bath could be described by path integrals for the finite system if one adds a factor called the influence function determined by the heat bath. Because we're dealing with density matrices, we need to integrate over not one but two piecewise constant paths \( (f,f') \) of the finite system. Let the integrand in the above theorem be denoted by \( H(f) \) for a path \( f \in X \). For the simplest case of a two state finite system, with values \( \{+1,-1\} \) now, we get $$ \begin{align*} (\rho_{\mathcal H_\text{fin}}(t))_{a,b} &= \iint (\rho_{\mathcal H_\text{fin}}(0))_{f(0),f'(0)}\, {\mathcal F}(f,f')\, H(f) \overline{H(f')}\, d\mu(f)\, d\mu(f') \\ \log({\mathcal F}(f,f')) &= \iint_{ 0 < r < s < t } \Big( iL_1(s-r)(f(s)-f'(s))(f(r)+f'(r))\\ & \qquad - L_2(s-r)(f(s)-f'(s))(f(r)-f'(r)) \Big)\, dr\, ds \end{align*} $$ where \( L_1, L_2 \) are determined by the temperature, the coupling and the frequency spectrum of the heat bath. No one ever said physics was easy.
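The density matrix construction \( \rho_{i,j} = \sum_a p(a) \bar x_i^{(a)} x_j^{(a)} \) can be illustrated in a few lines, using a made-up ensemble of three states of a 2-level system:

```python
# A mixed state: normalized vectors x^(a) in C^2 with probabilities p(a).
states = [((1 + 0j, 0j), 0.7),
          ((0j, 1 + 0j), 0.2),
          ((2**-0.5 + 0j, 2**-0.5 + 0j), 0.1)]
assert abs(sum(p for _, p in states) - 1) < 1e-12   # probabilities sum to 1

rho = [[sum(p * x[i].conjugate() * x[j] for x, p in states)
        for j in range(2)] for i in range(2)]

# rho is Hermitian and has trace 1 (for normalized state vectors):
assert all(abs(rho[i][j] - rho[j][i].conjugate()) < 1e-12
           for i in range(2) for j in range(2))
assert abs(sum(rho[i][i] for i in range(2)) - 1) < 1e-12
```

A pure state corresponds to a single vector with probability 1; the off-diagonal entries of \( \rho \) are exactly what the coupling to the heat bath tends to destroy.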
Riemann called the terms involving the \(\rho_k\) the oscillating terms because if \( \rho_k = 0.5 + i\omega_k \) and we pair the conjugate roots \( \pm \omega_k \), then $$ \sum_k x^{\rho_k-1} = 2\sum_k \cos(\omega_k \log(x)) / \sqrt{x}.$$ Thus Riemann showed that the logs of the primes show periodic behavior. Let's start from scratch and ask if we find periodic behavior in the logs of the smallest primes or, as they get larger, clusters of primes.
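As a quick sanity check, the pairing of conjugate zeros can be verified numerically: for \( \rho = 0.5 \pm i\omega \), the two terms \( x^{\rho - 1} \) really do combine into \( 2\cos(\omega\log x)/\sqrt{x} \).

```python
import math

# For a pair of zeros rho = 1/2 +/- i*omega, the two complex terms
# x^(rho - 1) sum to the real oscillation 2*cos(omega*log(x))/sqrt(x).
omega = 14.134725   # approximately the first zero of zeta
for x in (2.0, 10.0, 100.0):
    pair = x**complex(-0.5, omega) + x**complex(-0.5, -omega)
    real_form = 2 * math.cos(omega * math.log(x)) / math.sqrt(x)
    assert abs(pair - real_form) < 1e-9
```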
The ratios of the lowest primes 2, 3, 5, 7, 11 are roughly 1.5, 1.67, 1.4, 1.57, which all cluster around 1.55. But then 13/11 is only about 1.18. To fix this, after 10 we shift from single primes to prime pairs, replacing each pair by the even number in the middle, getting the new sequence:
2, 3, 5, 7, 12 for (11,13), 18 for (17,19), 23?, 30 for (29,31), 37?, 42 for (41,43), 47?.
Skipping the isolated primes 23 and 37, the ratios are now 1.5, 1.67, 1.4, 1.71, 1.5, 1.67, 1.4. If you make a linear fit to the logs, you find the sequence is approximated by $$ 1.27\cdot (1.557)^n \approx 1.98, 3.08, 4.80, 7.47, 11.64, 18.12, 28.22, 43.94, \cdots $$ Hmm: not bad. Also note that we ignored prime powers, which explains why the prime 5, dragged down by 4, became 4.80 and the prime 7, dragged up by 8 and 9, became 7.47. Even more startling, this power law would come from a periodic term in the log-prime density of the form \( \cos(2\pi \log(x)/\log(1.557)) \), and \( 2\pi/\log(1.557) = 14.185... \), which is very close to the first zero of Riemann's zeta, namely 14.1347...! In other words, the basic idea behind Riemann's periodic terms is indeed apparent in these small primes. This is especially startling because the convergence of the explicit formula is very slow: there are very many slowly oscillating terms beyond the first one, so there is no compelling reason why the lowest \( \omega_k \) should nail these primes this well.

Let's go back to the explicit formula and change coordinates to \( y=\log(x) \). Again writing the zeros as \( 0.5+i.\omega_k \), where \( \omega_k \) is real under the Riemann hypothesis, being careful with the deltas and summing only over \( k \) with \( \omega_k > 0 \), you get: $$ \sum_{p, n} \tfrac{\log(p)}{p^{n/2}} \delta_{\log(p^n)}(y) = e^{y/2} - 2\sum_k \cos(y\omega_k) - \tfrac{1}{e^{y/2}(e^{2y}-1)} $$ Note that instead of thinning out logarithmically as the primes do, the logs of primes now get dense at an exponential rate. After weighting the prime powers as shown, they still have density \( e^{y/2} \), the first term on the right. But after that we get oscillations. Curiously, an immense amount of work has been done on very large primes and very large zeta zeros, while this formula for small values of \( y \) doesn't seem to have been looked at. Let's first look graphically at the small log-prime-powers, weighted as in this formula.
The horizontal axis is log scale, the filled circles are the logs of the primes up to 50, the dots the prime powers. The solid line is the convolution of the weighted sum of deltas as above with a Gaussian with standard deviation 0.1. The line of hatch marks is its approximation with the above explicit formula but using only ONE zero of zeta and the vertical lines are its peaks where the cosine equals -1. Note that 23 and 37 are being ignored and will require the next zero of zeta as will separating 5 and 7 from adjacent prime powers.
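The power-law fit above is easy to reproduce; here is a small Python sketch (a plain least-squares fit to the logs, not necessarily the exact fit used for the figures):

```python
import math

# The shifted sequence: single small primes, then midpoints of prime pairs,
# skipping the isolated primes 23, 37, 47.
seq = [2, 3, 5, 7, 12, 18, 30, 42]

# Least-squares linear fit of log(seq) against n = 1..8 gives the power law.
n = list(range(1, len(seq) + 1))
logs = [math.log(s) for s in seq]
nbar = sum(n) / len(n)
lbar = sum(logs) / len(logs)
slope = sum((ni - nbar) * (li - lbar) for ni, li in zip(n, logs)) \
        / sum((ni - nbar) ** 2 for ni in n)
ratio = math.exp(slope)                  # the common ratio
prefac = math.exp(lbar - slope * nbar)   # the prefactor

print(ratio, prefac)          # roughly 1.557 and 1.27
print([round(prefac * ratio ** k, 2) for k in n])

# The corresponding oscillation frequency, compared to the first zeta zero:
print(2 * math.pi / math.log(ratio))     # about 14.2, vs 14.1347...
```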
How many of the zeta zeros are hidden in the primes up to 53? Let's sample the interval [0,4] in the log-prime line discretely, so we have deltas that are functions, and take the discrete cosine transform. We find chaos in the high frequencies, but the terms \( \cos(\pi \log(p) (k-1)/4) \) for \( 1 \le k \le 50 \) seem to be coherent and give us oscillating terms whose discrete frequencies correspond to $$ \omega = 14.1, 20.0, 25.0, 30.4, 32.9, 37.6 \pm 0.4 $$ quite close to the true zeros 14.1, 21.0, 25.0, 30.4, 32.9 and 37.6. Below is the low frequency part of the DCT:
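The peak-hunting itself is easy to reproduce. Here is a rough Python sketch: it scans a plain cosine transform of the weighted deltas over a continuous frequency range, rather than using the exact DCT conventions behind the plot, so the numbers come out slightly differently:

```python
import math

# Prime powers p^n with log(p^n) <= 4, weighted by log(p)/p^(n/2)
# (the weights from the explicit formula).
limit = math.exp(4.0)
primes = [p for p in range(2, int(limit) + 1)
          if all(p % q for q in range(2, int(math.sqrt(p)) + 1))]
points = []   # pairs (y, weight) with y = log(p^n)
for p in primes:
    pn = p
    while pn <= limit:
        points.append((math.log(pn), math.log(p) / math.sqrt(pn)))
        pn *= p

# Cosine transform c(w) = sum of weight * cos(w * y); by the explicit
# formula it should dip near each zeta zero w = omega_k.
def c(w):
    return sum(wt * math.cos(w * y) for y, wt in points)

# Scan around the first zero and report the local minimum.
ws = [10 + 0.01 * k for k in range(801)]        # w in [10, 18]
wmin = min(ws, key=c)
print(wmin)   # should land close to the first zeta zero 14.1347...
```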
Can we find the oscillations in larger primes? The simple answer is that they get drowned in the exponentially increasing density of log-primes. Extending the above plot to higher primes, one finds that the slope of the large exponential function erases the local minima apparent for the small primes. There are several ways to find them however. One can simply subtract the mean density \( e^{y/2} \) or one can convolve the weighted sum of deltas with a suitable filter that kills the average. An engineer knows how to form filters that not only do this but also pick out some range of frequencies. This can be used to find the oscillations caused by all the zeros of zeta.
Let's stick to the simplest case. If we want to kill a constant term and suppress higher frequencies, a simple way is to convolve with the second derivative of the Gaussian. But we want to kill \( e^{y/2} \), so we need to first premultiply by \( e^{-y/2} \), then convolve with the second derivative and finally multiply back by \( e^{y/2} \). In one step, this amounts to convolving with: $$ (y^2 - \sigma^2)\cdot e^{-\frac{1}{2\sigma^2}(y-\frac{\sigma^2}{2})^2}. $$ For \( \sigma = 0.2 \), the value we will use, this kernel looks like this:
If we use this filter and convolve the weighted sum of deltas at the logs of all prime powers up to 3 million, we finally get: There are peaks, for example, around 1.9 million and around 3 million. I don't know if anyone has noticed this extra density of primes around these values. Note that we are not looking at one precise value but at a range, e.g. 1.75 million to 2.1 million, and comparing it to a dip before and after. One wonders whether Gauss noticed this during his numerical exploration of \( \pi(x) \).
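As a sanity check on the filter itself, one can verify numerically that the kernel above really does annihilate \( e^{y/2} \). Here is a sketch (discretizing the convolution on a fine grid; the overall normalization is irrelevant since we only care about getting zero):

```python
import math

sigma = 0.2

def kernel(u):
    # (u^2 - sigma^2) * exp(-(u - sigma^2/2)^2 / (2 sigma^2))
    return (u * u - sigma * sigma) * math.exp(
        -(u - sigma ** 2 / 2) ** 2 / (2 * sigma ** 2))

# Discretize on a fine grid and convolve with f(y) = e^{y/2}:
h = 0.001
us = [k * h for k in range(-2000, 2001)]        # u in [-2, 2], i.e. +-10 sigma
y = 1.0
out = sum(kernel(u) * math.exp((y - u) / 2) * h for u in us)
raw = sum(abs(kernel(u)) * math.exp((y - u) / 2) * h for u in us)
print(abs(out) / raw)   # essentially zero: the filter kills e^{y/2}
```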
Barry asked me whether the fact that the lowest zeros of zeta show themselves in the very smallest primes extends to Dirichlet L-series too. It does. The simplest case is the mod 4 series, giving the sign +1 to primes congruent to 1 mod 4, and -1 to those congruent to 3 mod 4. In fact, just as the lowest zero of Riemann's zeta is close to \( 2\pi \) divided by the log of the ratio of the two lowest primes 3 and 2, the lowest zero of the L-series (6.02) is close to \( \pi \) divided by the log of the ratio of 5 and 3 (6.15). This is because 5 and 3 are the two lowest odd primes and they have opposite residues mod 4, hence should differ by \( \pi \), not \( 2\pi \), in the oscillation caused by this zero.
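The numerology here takes two lines to check (14.1347 and 6.0209 are the known lowest zeros of zeta and of the mod 4 L-function):

```python
import math

# Zeta: lowest zero vs. 2*pi / log(3/2), from the two lowest primes.
print(2 * math.pi / math.log(3 / 2))   # about 15.5, vs 14.1347...

# Mod 4 L-function: lowest zero vs. pi / log(5/3), from the two lowest
# odd primes, which have opposite residues mod 4 (hence pi, not 2*pi).
print(math.pi / math.log(5 / 3))       # about 6.15, vs 6.0209...
```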
Here's the plot, convolving the signed and weighted sum of deltas with a Gaussian of standard deviation 0.2:
Note how we have negative peaks at 3, 7 and the pair [19 23], all congruent to 3 mod 4, and positive peaks at 5, the pair [13 17] and the pair [37 41], all congruent to 1 mod 4. The vertical lines are half periods of the lowest-frequency L-function oscillating term.
In a nutshell, the reason for the usefulness of algebra is this: life is full of situations where several numbers are needed to describe a situation; these numbers vary from one situation to another, but in each case the numbers have a fixed arithmetical relationship to each other that doesn't vary. Writing this relationship as an equation gives you a clearer grasp of all these situations, much as having the right word in your vocabulary can help you grasp immediately new situations described by this word: in both cases, your mind learns a structure that will fit many situations in the future. An equation can be thought of as a quantitative metaphor. Those who never internalize the equation are condemned to dredge up isolated rules every time similar situations come their way.
The simplest case is that in any trip, distance travelled is the product of the time the trip takes and the speed of travel. Going by plane, 3000 miles from NYC to SF equals 6 hours times 500 miles per hour; a 2 mile walk is 40 minutes (2/3 of an hour) times a typical walking pace of 3 miles per hour. We write this: $$d=s.t$$ using simple abbreviations for distance, speed and time. Clearly, if s and t are known, the formula tells us what d is. But algebra tells us that we can also play the game using: $$s=d/t \text{ or } t=d/s$$ so that if we know d and t, we get out s, etc. The rules of algebra show how a numerical relationship of one kind can be used in multiple ways. Once you get the hang of thinking in terms of a formula, the formula becomes a much clearer way of describing a situation than an awkward long sentence. It becomes the natural way of grasping how numbers fit together. But before this happens, you need to see a lot of meaningful instances -- and schools, all too often, just drill the student in abstract formulas with no real-world meaning.
It is in financial matters that most of us need to grasp numerical relationships more clearly and where formulas can help a lot and give us the power not to have to accept blindly everything told to us by 'experts' (who are usually salesmen). A spreadsheet is a terrific stepping-stone for some: to use these efficiently, you enter formulas into cells that calculate a new value from values in other cells. The spreadsheet is not merely a set of numbers but a whole web of numerical relationships.
A major high school topic, pretty much always taught without showing any relevance to the real world, is the theory of polynomials. This is sad because polynomials are exactly what you need to understand paying off loans. Your average student wants a car and may be able to get one on time. But, for instance, if they charge a bad credit risk teen-ager 1.33% interest per month (16% APR) on a 5 year loan, he would do well to know that his total cost works out to be about 50% more than he would pay if he had the cash.
His high school class can give him the confidence to "do the math" himself and not rely on others with their own agendas. The first step is to assign abbreviations to the numbers involved. Use C for the cost, P for the monthly payment, r for the rate of interest. Then one month's interest increases the loan from C to \( C.(1+r) \) and one payment decreases it to \( C.(1+r)-P \). Repeating this for the second month, the balance owed becomes \( (C.(1+r)-P).(1+r)-P \). Seems like a mess only a math nerd would love. But use the rules of algebra and it becomes a quadratic polynomial in the number (1+r): $$ C.(1+r)^2 - P.(1+r) - P.$$ If you go on for, say, 4 months, the balance owed will be this polynomial: $$ C.(1+r)^4 - P.\left( (1+r)^3 + (1+r)^2 + (1+r) + 1\right).$$ We're not giving a lecture here, just hoping to show how algebra can be useful. So let's just say -- if you use the stuff taught in every Algebra II class and pursue what we have started, you'll wind up easily seeing that, if you need to pay off the loan in 5 years (60 months), your payment P will be $$ P = C.\frac{r}{1-(1+r)^{-60}}.$$ In the example above, make r = 0.0133 and work out his total cost, 60P, on a hand-held calculator, and you get about 1.5 times the cost C of the car.
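The closed form and the month-by-month recursion it came from are easy to check against each other in a few lines of Python (the cost C below is a made-up figure; only the ratio matters):

```python
C = 20000.0        # hypothetical cost of the car
r = 0.0133         # monthly interest rate (about 16% APR)
months = 60

# Closed form for the monthly payment:  P = C * r / (1 - (1+r)^-60)
P = C * r / (1 - (1 + r) ** -months)

# Check against the recursion: each month the balance accrues one month's
# interest, then one payment is subtracted.
balance = C
for _ in range(months):
    balance = balance * (1 + r) - P
print(round(balance, 6))   # essentially 0: the loan is exactly paid off

# Total paid relative to the cash price:
print(60 * P / C)          # about 1.46, i.e. roughly "50% more"
```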
The formula above, though it might show up in a New Yorker cartoon surrounded by white-coated scientists, reveals, when you play with it, an essentially simple relationship between interest rates, loans and payments. The majority of real-life mathematicians work on real problems like this and not on abstract stuff in ivory towers.
A nationwide discussion, verging on a political fight, is going on right now for and against the Common Core State Standards in Math (CCSS-M) and the involvement of the Department of Education. As we see it, the CCSS-M have considerably upped the ante in abstract math but have also opened the option of introducing 'modeling', a code word for math that might relate to the real world as students know it. All K-12 math can be enlivened and made relevant, exciting even, to students by dipping into the vast array of applications that math has to real life. Our message: math, properly taught, need not turn you off.
For a long time, I thought algebra was a natural language for everyone who had had a decent middle or high school math teacher and I thought that probably around half of all high school graduates had the gist of it down. Then Dave Wright, Caroline Series and I wrote Indra's Pearls and I gave copies to quite a few non-mathematical friends. Essentially all of them loved the pictures but, when reading the text, got stuck on Chapter One where we tried to write a gentle introduction to the arithmetic of complex numbers. I began to understand why no general circulation magazine or newspaper will ever print a formula.
Where did math class lose this large group of students? An image that has gone viral shows a blackboard on which a student has written "Dear Algebra, Please stop asking us to find your X. She's never coming back and don't ask Y." Sigh.
Very nice blog post ... you describe where we want to arrive in mathematics education very well with the idea of an equation as a "quantitative metaphor," a "natural way of grasping how numbers fit together." The question is how to get there.
An important part of the problem that interests me (by no means the entire solution) is grasping the complex relationship between fluency and conceptual understanding. One of the great divides in mathematics education is between those who pay lip service to one or the other of these. Those of us who love algebra bring to it a certain native fluency that we are sometimes not even aware of; it frees us to look at complex algebraic expressions with confidence and equanimity. What do we do with students who don't have that confidence? Some dry practice is necessary, but it should be practice in simple things deeply understood, like the relationship between the three equations resulting from \( d = rt \).
This starts in elementary school, with the relationship between multiplication and division facts. A first step would be a truce between the warring camps there: yes, kids need to know their facts cold, but they don't need to memorize every single one as a separate fact: if you know 3 × 5 = 15, then you know 15 ÷ 5 = 3 and 15 ÷ 3 = 5, not to mention 5 × 3 = 15.
Arithmetic is a great seed bed for algebra. This was brought home to me forcefully when I watched one of the Tea Party anti-Common Core videos complaining about the complicated way an elementary classroom was dealing with 9 + 6 = 15. The teacher decomposed the 6 as 1 + 5, then put the 1 with the 9 to make a 10, then added the 5 to get 15. "Why don't they just do it the old-fashioned way?" scoffed the narrator: just memorize the facts. No sense of irony here: the narrator clearly had no idea that if you understand why 9 + 6 = 15, then you understand why 19 + 6 = 25, 29 + 6 = 35, etc., and also why 8 + 6 = 14 and 7 + 6 = 13. It struck me like a lightning bolt that the narrator actually *did* memorize all these as separate facts. But if you start kids out with a flexible understanding of arithmetic, then they are more likely to appreciate your formula for the total amount paid on a loan (by the way, you ignored the time value of money there, but never mind).
On the other side, we have people expressing horror at the very idea of memorizing anything at all. And yet, which one of us has not experienced the pleasure of memorizing a poem, a formula, how a spectral sequence works? Why deny kids that pleasure?
Procedural fluency is very much informed by understanding (as in the 9 + 6 example), and understanding is very much informed by procedural fluency (as in the ability to see \( r = d/t \) right away from \( d = rt \)); they are deeply intertwined in ways I have never quite figured out how to articulate.
I agree, Bill: procedural fluency (sometimes through memorization) and understanding have to grow side by side. The part that I feel is often overlooked, though, is that using letters for numbers is a strange and challenging idea for many, and that concrete examples using abbreviations are a natural stepping stone to the doubly abstract x and y. It would seem natural to me to introduce very simple formulas with abbreviations at least a year before x and y. I hope there will be good sources of concrete examples available to teachers. Zalman Usiskin sent me a preprint of a paper of his forthcoming in Mathematics Teaching in the Middle School.
I agree that the post is a very nice description of why equations (and functions and expressions) are so useful. And I can't stress enough how much I agree with "The question is how to get there.''
How to get there is not easy. As Bill's examples show, these mathematical ways of thinking start early in arithmetic and require nurturing throughout middle and high school.
Your example of monthly payments on a loan is a perfect example of just how subtle the ideas are and how they need to be built up over time. You claim that "The first step is to assign abbreviations to the numbers involved.'' My colleagues at EDC and I have used this example for decades in our own CME high school curriculum, and in our high school teaching before that (I know that you're familiar with all this history), and the step of writing down the relationships in precise algebraic language is somewhere near the midpoint of a long development that is preceded by carefully orchestrated numerical calculations, an introduction to functions and recursively defined functions, and experiments with a spreadsheet and later with a CAS. Once the basic algebraic relationships are in place, there are a host of other sophisticated ideas that need to be in place before one can get the closed form for the monthly payment.
A more detailed description of how all this might be developed is in chapter 2 of the NCTM monograph "Reasoning and Sense Making in Algebra.'' The PARCC Content Frameworks uses the example to describe this habit of using precise language as an (essential, in my experience) intermediate step between numerical examples and algebraic generalization:
"Capturing a situation with precise language can be a critical step toward modeling that situation mathematically. For example, when investigating loan payments, if students can articulate something like, 'What you owe at the end of a month is what you owed at the start of the month, plus 1/12 of the yearly interest on that amount, minus the monthly payment,' they are well along a path that will let them construct a recursively defined function for calculating loan payments."
Our curriculum develops the idea over three courses, introducing refinements as the kids develop more sophisticated tools. I took some time this morning to cull out the relevant lessons, attached here.
In elementary algebra, students learn to define functions recursively and then to experiment with the monthly payment context using a spreadsheet.
In advanced algebra, they build a recurrence to calculate the balance on a loan, month by month, and they analyze the computational complexity of the resulting function (modeled in a CAS), using simple algebra to make it more efficient.
Later, they learn several techniques for resolving recurrences of different types and hence derive a formula for the monthly payment on a loan (using, for example, geometric series or affine transformations). A real source of excitement here is that they finally get to understand some things they noticed empirically before, like the fact that, interest and term held constant, the monthly payment is a linear function of the cost of the car.
So, many of us agree with you that this kind of investigation is exciting to many students, at many levels of sophistication. I've used the monthly payment investigation with high school juniors and seniors who got by their previous courses by the skins of their teeth.
But I'm convinced that it's not the context that's the source of excitement. It's the challenge, the ability to see into the structure of a problem, and the chance to use mathematics to exploit that structure. I've seen HS kids get just as excited trying to figure out which integers can be written as a sum of two squares.
Although I agree with much that Al says, I diverge from him where he says "the step of writing down the relationships in precise algebraic language is somewhere near the midpoint of a long development that is preceded by carefully orchestrated numerical calculations, an introduction to functions and recursively defined functions and experiments with a spreadsheet and later with a CAS (…)". Numerical calculations and spreadsheets, sure, but why teach the general concept of function and recursion first? This seems to me the point of view of a pure mathematician -- that you cannot understand an idea until you have a general definition for it. I would put it the other way around: you cannot understand the general idea of a recursive function until you have seen some motivating examples. After working with numbers in spreadsheets, a recursive formula with abbreviations is not a big step. I want to stick to my guns: show real examples first, trusting that the concrete context allows the teacher to explain easily the arithmetic in the formula. After enough examples are seen, then one might introduce general functions and general recursion rules. A confession: this is how my mind works, and Al's approach is a stumbling block for me in reading many math books.
Here is the attachment he sent me: CME Pages, three sections from three books I guess, leading up to the formula for the monthly payment in my blog. It is hard to judge the pedagogy from such fragments, but if you have a look at the very beginning, note that the general idea of a function, using the letter f, has been introduced earlier and that an unmotivated linear recursive function is recalled (I hope it was motivated by approximating data). I suspect that this section is probably the first place where the student sees that a recursion is actually needed. My interest in this example was that here is a financial situation which might motivate studying polynomials.
Nov. 13. I received the following from Ulf Persson, a student of mine many decades ago who teaches math at Chalmers University in Sweden and often writes me long emails. I think he typifies how most mathematicians react to discussions of curricula:
It is true that when it comes to counting, until recently at least, counting meant for most people counting money. Even uneducated people in a store usually could count in their heads quickly and accurately; this is no longer the case, I suspect, as there is no longer the need: you just push the figures in on the keys and out it comes. I guess most people think of this as progress. When I attended elementary school there were a lot of word problems to the effect that you buy so and so many shirts at such and such a price and add a percentage or what, and with other costs you compute the profit (or as we say in Swedish 'vinst' (gain, from winning), as 'profit' is a dirty word indeed). I found it extremely boring. You saw through the structure once and after that it was just mindless repetition. Such real life connections are supposed to motivate people, or at least give the exercises a certain meaning. I doubt whether it works at all. When it comes to, say, celestial mechanics, which of course can be thought of as applied mathematics, it was different. The idea that you could compute the paths of planets, the flattening of the earth (which is very hard, as I realized when I tried it myself; I wonder how Newton did it), the height of tides (likewise quite difficult, the standard explanations are not quantitative) or the period of the precession, is very exciting. Here it helps that it has 'real life' applications, just as in optics, where the fact that the lines obeying certain reflection and refraction laws are supposed to be light rays, not just formal laws for the behavior of certain lines, stimulates your imagination. When I was considering the problem of measuring heights of lunar mountains from their shadows cast on a perspectively distorted sphere, I began to feel the need for systematic trigonometry. It is not just a question of pure versus applied; some applications stimulate your imagination, others simply kill it.
I must say that I am a bit skeptical about stimulating the imagination of mathematically recalcitrant children by giving them something from which they may tangibly gain, such as their personal financial well-being. It certainly would not have worked with me; this does not mean that it would not work with others, we are all different supposedly. On the other hand we are less unique than we vainly believe, and many of our deficiencies are shared with substantial fractions of the population, much to our consolation. If it is just the case of one single formula, their response would just be: why not simply have it implemented in some application so you can just plug in the values? If they are truly to be made to love formulas, it involves them setting up formulas for themselves for slight variations of the problem. But to do so I suspect there must be some additional intellectual stimulation beyond finding facts relevant to their financial situation. It is like 'like meeting like'. Problems in astronomy, mechanics, optics etc. very much have the same flavor as problems in mathematics, making for the possibility of transfer, which is the ultimate goal of education. It is also why a physical intuition is useful mathematically and the other way round. Financial accounting is something different; it does not even have to do with making money. I recall a story which was spread around in my childhood about some indifferent student who was doing very well, being asked why. 'I am buying the stuff for a dollar apiece and selling it for four, and on those four percent I am doing quite well' was his reply. My point is that there has to be something else to kick in for people to love a formula. Of course I have no empirical experience to draw on, but am resorting to the cheap and flexible and thus time-honored method of introspection.
How did I learn to count? I remember the occasion very vividly. Unfortunately I do not remember my precise age at the time. I was up north at my grandparents' farm, helping, no doubt very ineptly, to put hay on some stacks along with my father. I asked, or was told, that trettio and trettio was sextio (thirty and thirty was sixty), and knowing beforehand for some reason that tre and tre was sex (three and three was six), everything fell into place, and I recall thinking to myself that things hang together. Obviously it made a deep impression on me, and I have often afterwards recalled the episode in my mind, with the danger of elaborating on it. As I have grown older and wiser I have been able to make more sophisticated interpretations of it. Now I think that it was then that I realized that you could count not only haystacks and cows, but numbers themselves. This is indeed a very potent idea which unfolds an entire new world, which I instinctively embraced (and I subsequently became, for my age, quite adept at doing arithmetical operations in my head, impressing my mother with the clever method of multiplying by splitting numbers up into factors and rearranging them; in short, I fell in love with numbers). This anecdote also shows the advantage of the illustrative example over the general theory. It is much more instructive to be led to a flight of fancy than to mechanically decode and reduce. A case on which we seem to be in total agreement. Learning to count, you never thought of the commutative and associative laws; they were internalized. Thus mathematics is not a question of following rules, as it is so often presented.
What is clear in Ulf's story is that he is a born mathematician: bored with any example worked more than once, wanting really challenging real world problems and picking up on the idea that numbers have a life of their own, not merely as numbers of cows. This is, of course, why teaching math is so frustrating -- bore one student, confuse others. Placing students in different tracks is great but not always possible. My emphasis on finance and accounting is due to this being a topic that most kids from 7th grade on really want to master. The student in Ulf's story who was 'doing quite well on those four percent' (actually 300%) certainly illustrates this. Of course you need to present multiple variations of each class of problems, and teaching that you can transform a formula from one setting to another is a central goal, but not an easy one for the weaker students.