The Climate Change Scandal and the Importance of Good Code
_Vicki’s note: I decided to take down my previous post because, upon reflection, it wasn’t consistent with my website content, and I can see it leading to long discussions of the type that I don’t necessarily want to have on this blog._
_Instead, today I have a guest post from Mr. B (wooo!) on the importance of valid code and, more generally, of continuing to learn new methods and processes to be more efficient on the job._
_As an analyst, I sometimes look at code, be it in SAS or SQL, from past projects. What I’ve found to be most indicative of good code (and of any past project, even one unrelated to programming) is documentation of past tasks._ This is 100 times more important if you are leading global warming studies that could be the cause of billions of dollars spent on policy implementations. I know little about Fortran, but I still think this is an interesting take on the scandal that most news people don’t talk about (and not just because Mr. B peels oranges for me on a daily basis).
You might have heard about the big Climate Gate scandal of CRU’s emails and software being leaked to the public over the past two weeks. Everyone, from Jon Stewart to Fox News, is highlighting passages from these emails and pointing out damning quotes which seem to show that global warming is a myth.
However, what most people don’t hear about is the software that crunched the temperature data to produce the climate models that scientists use to make their predictions. This software is written in an almost-defunct programming language called Fortran, and it shows.
Fortran is a general-purpose programming language developed at IBM in the 1950s for mathematical computations. It became the language of choice in many fields, such as engineering, scientific research, and economic research. This was a great leap forward in innovation and made computations feasible on a scale that had never been possible by other means.
If it’s so great, why isn’t it used now?
- Fortran does not impose any discipline on the programmer, and programs often end up with very little structure. This makes it difficult to reason about the logic and correctness of the program.
- Fortran is needlessly verbose: accomplishing something seemingly simple can require a page of code.
- There are limited ways of abstracting away complexity.
FORTRAN—the “infantile disorder”—, by now nearly 20 years old, is hopelessly inadequate for whatever computer application you have in mind today: it is now too clumsy, too risky, and too expensive to use. — Edsger W. Dijkstra
FORTRAN’s tragic fate has been its wide acceptance, mentally chaining thousands and thousands of programmers to our past mistakes. — Edsger W. Dijkstra
So why do people keep using it? The usual reasons are:
- A lot of legacy code is already written in Fortran, so we’re collectively stuck with it and might as well learn it.
- It’s there, it works, get used to it.
- I already know it, and it is good enough for me.
This type of thinking is bad for each new generation of economists and scientists.
Why was the climate study code written in Fortran? Probably because whoever wrote it already knew Fortran and was too lazy to learn something new, using the three reasons above as an excuse.
The second issue goes beyond Fortran being inefficient: the code also lies. Here is some actual code from the project.
The moral of the story is, if you have important code that affects how people will live their lives by basing their decisions on it, make sure it is easy to understand and lets you concentrate on the problem you’re trying to answer, instead of fighting you and your colleagues with technical issues unrelated to the goal. Additionally, make sure you don’t lie about it.
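To make that a bit more concrete, here is a minimal sketch of what "easy to understand and honest" could look like. It is written in Python rather than Fortran, it is not taken from the CRU code, and every name in it (`temperature_anomalies`, `apply_adjustment`, the sample numbers) is made up for illustration. The idea is simply that each step has a name and a docstring, and that any adjustment to the data has to carry a stated reason and keeps the raw values next to the adjusted ones.

```python
from statistics import mean


def temperature_anomalies(readings_c, baseline_c):
    """Return each reading minus the mean of the baseline period (degrees C)."""
    baseline_mean = mean(baseline_c)
    return [reading - baseline_mean for reading in readings_c]


def apply_adjustment(values, adjustment, reason):
    """Add a per-value adjustment and return an auditable record.

    Hypothetical helper: the adjustment is never applied silently. The
    caller must give a non-empty reason, and the raw values are returned
    alongside the adjusted ones so a reader can compare the two.
    """
    if len(values) != len(adjustment):
        raise ValueError("values and adjustment must have the same length")
    if not reason.strip():
        raise ValueError("an adjustment needs a documented reason")
    adjusted = [v + a for v, a in zip(values, adjustment)]
    return {"raw": list(values), "adjusted": adjusted, "reason": reason}


if __name__ == "__main__":
    # Made-up sample numbers, for illustration only.
    anomalies = temperature_anomalies([13.0, 11.5], [10.0, 12.0, 14.0])
    record = apply_adjustment(
        anomalies, [0.0, 0.25], "hypothetical sensor recalibration"
    )
    print(record)
```

Nothing here is clever, and that is the point: a colleague who has never seen the code can tell what was computed, what was changed, and why.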