Advances in Pencil Science: Why Python?

Friday, January 25, 2008

Why Python?

Other people have other reasons, but mine is succinctly summarized by Paul Prescod in his article On the Relationship between Python and Lisp. Here's the core of the argument:

Paul Graham says that a language designed for "the masses" is actually being designed for "dufuses". My observation is that some of the smartest people on the planet are dufuses when it comes to programming, (or in some cases just dynamic programming languages) and I am pleased to share a language (and code!) with them.

I usually spend a big chunk of my day in Python. But most professional programmers will not be able to do that until Python is a dominant language. In the meantime they must switch back and forth between Python and the language of their day-job. Python is designed to make that easy. During the day they can pound out accounting code and at night switch to hacking distributed object oriented file systems. Every intuitively named function name makes it that much easier to "swap" Python back into your brain. After we have been away from a language for a while, we are all dufuses, and every choice made in favor of the duffers actually benefits even high-end programmers.

I get paid to share my code with "dufuses" known as domain experts. Using Python, I typically do not have to wrap my code up as a COM object for their use in VB. I do not have to code in a crappy language designed only for non-experts. They do not have to learn a hard language designed only for experts. We can talk the same language. They can read and sometimes maintain my code if they need to.

From a purely altruistic point of view, it feels good to me to know that I am writing code (especially open source code) that can become a lesson for a high school student or other programming beginner. I like to give programming classes to the marketing departments at the companies where I work.

Even hard-core geeks get a certain aesthetic pleasure out of using something that feels minimal and simple because most of the complexity has been hidden away or removed.

Code is a communication medium. The machine is not the only reader. Python is the only language explicitly designed with both beginners and experts in mind. This has direct benefits for the transition from beginner to expert, and it also has direct benefits for development collaboration among groups with distinct expertise.

Here is an example. I have never taken a compiler course and I still consider code compilation to be deep magic, though not as much as I did before Python began boosting my sophistication. Nevertheless, I can understand and appreciate the following.

# romanNumerals.py
#
# Copyright (c) 2006, Paul McGuire
#

from pyparsing import *

def romanNumeralLiteral(numeralString, value):
    return Literal(numeralString).setParseAction(replaceWith(value))

one         = romanNumeralLiteral("I",1)
four        = romanNumeralLiteral("IV",4)
five        = romanNumeralLiteral("V",5)
nine        = romanNumeralLiteral("IX",9)
ten         = romanNumeralLiteral("X",10)
forty       = romanNumeralLiteral("XL",40)
fifty       = romanNumeralLiteral("L",50)
ninety      = romanNumeralLiteral("XC",90)
onehundred  = romanNumeralLiteral("C",100)
fourhundred = romanNumeralLiteral("CD",400)
fivehundred = romanNumeralLiteral("D",500)
ninehundred = romanNumeralLiteral("CM",900)
onethousand = romanNumeralLiteral("M",1000)

numeral = ( onethousand | ninehundred | fivehundred | fourhundred |
       onehundred | ninety | fifty | forty | ten | nine | five |
       four | one ).leaveWhitespace()

romanNumeral = OneOrMore(numeral).setParseAction( lambda s,l,t : sum(t) )

print(romanNumeral.parseString("XLII")[0]) # prints 42

Try doing that in a dozen or so lines of C++ or Java, mate, so that a random reader could get half a clue as to what was happening! Yes of course the "import" statement hides a great deal of magic, but that's the whole point, see?

14 comments:

David B. Benson said...: Here is what it looks like in Standard ML (except that I don't know how to produce (what I consider to be) the proper indenting):

val romanNumeralLiteralsWithValues =
[
("I",1),
("IV",4),
("V",5),
("IX",9),
("X",10),
("XL",40),
("L",50)
]
fun arabicize input = Int.toString(greedyParse romanNumeralLiteralsWithValues (op+) 0 input)
(*example*)
val () = print("XLII ~~> "^(arabicize "XLII")^"\n")

which at the end prints

XLII ~~> 42

on a separate line.; 25 January, 2008
Michael Tobis said...: David, I can't deny that is even nicer as far as it goes. Does it extend neatly to validation?; 25 January, 2008
David B. Benson said...: Michael --- You'll have to explain (or give a link to) what you mean by "validation" in this context. Otherwise I don't know how to answer.; 25 January, 2008
Michael Tobis said...: I mean, the natural extension is to validate that the string is a valid Roman numeral, or else throw an exception.

Python's exception handling is another thing of beauty by the way.

Of course, Fortran wins for sheer minimalism: it usually either ignores the exception or gives you a mysterious segfault...; 25 January, 2008
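As a rough illustration of what "validate, else throw an exception" could look like in Python, here is a minimal sketch using only the standard library. The regular expression encodes the usual subtractive-notation rules for 1 to 3999, which is an assumption about what counts as "valid"; the name `roman_to_int` is made up for this example and is not part of pyparsing.

```python
import re

# Assumed definition of "valid": standard subtractive notation, 1..3999.
_ROMAN = re.compile(
    r"M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})"
)

_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_int(text):
    """Return the integer value of text, or raise ValueError if malformed."""
    if not text or not _ROMAN.fullmatch(text):
        raise ValueError(f"not a valid Roman numeral: {text!r}")
    total = 0
    # Pair each character with its successor (a space pads the last one).
    for ch, nxt in zip(text, text[1:] + " "):
        value = _VALUES[ch]
        # A smaller numeral before a larger one is subtracted (IV, XC, ...).
        total += -value if _VALUES.get(nxt, 0) > value else value
    return total


for candidate in ("XLII", "IIII", "IVIV"):
    try:
        print(candidate, "->", roman_to_int(candidate))
    except ValueError as err:
        print("rejected:", err)
```

The caller decides where to catch the `ValueError`; the conversion function itself only reports the problem.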
Anonymous said...: Beware! The pyparsing bug bites deep!

Have fun with it!; 26 January, 2008
David B. Benson said...: Yes, greedyParse does some of that: If the parse cannot continue to the end it raises the exception Domain.
What it does not do is check that the roman numerals occur in numerically non-increasing order nor that forms such as "IV" are not repeated.

I opine that Python borrowed exceptions from Standard ML (or maybe Java(?)).

I'll just mention that developing the SML version required exactly three compiles and no run-time executions: the first compile showed that I forgot that strings have size, not length (which arrays have); the second compile printed the answer, but I forgot the new-line character. The third produced the answer shown.; 26 January, 2008
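The source of greedyParse is not shown in the thread, so the following Python sketch is only a guess at the behaviour described: take the longest matching literal at each position, fold the values together, and raise if the input cannot be consumed to the end. The names `greedy_parse` and `arabicize` mirror the Standard ML ones but are hypothetical, and `ValueError` stands in for the Domain exception.

```python
from operator import add

# Same (literal, value) table as the Standard ML example above.
ROMAN_LITERALS = [("I", 1), ("IV", 4), ("V", 5), ("IX", 9),
                  ("X", 10), ("XL", 40), ("L", 50)]


def greedy_parse(table, combine, initial, text):
    """Consume text left to right, always taking the longest matching literal.

    Hypothetical stand-in for greedyParse; raises ValueError where the
    Standard ML version is said to raise Domain.
    """
    longest_first = sorted(table, key=lambda pair: len(pair[0]), reverse=True)
    result, pos = initial, 0
    while pos < len(text):
        for literal, value in longest_first:
            if text.startswith(literal, pos):
                result = combine(result, value)
                pos += len(literal)
                break
        else:
            # No literal matched at this position.
            raise ValueError(f"cannot parse {text!r} at position {pos}")
    return result


def arabicize(text):
    return str(greedy_parse(ROMAN_LITERALS, add, 0, text))


print("XLII ~~> " + arabicize("XLII"))
# Accepted without complaint: nothing checks ordering or repetition.
print("IVIV ~~> " + arabicize("IVIV"))
```

As noted in the comment above, a parser of this shape rejects unknown characters but happily accepts strings such as "IVIV".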
Michael Tobis said...: Mine was even easier. I just lifted it from the pyparsing website.

Python exceptions are **vastly** more useful and convenient than Java exceptions.

In Java you have to declare every checked exception that can be raised further down the call stack. In Python you just catch the ones you intend to handle at the place you handle them. They propagate up the stack until caught, or as a last resort they terminate the Python process.

Exceptions are sufficiently lightweight that loop terminations are (usually? always?) handled as exceptions.; 26 January, 2008
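A small sketch of both points, with made-up function names (`parse_depth`, `read_config`, `main`, `countdown`). The first half shows an exception passing through intermediate callers that declare nothing; the second half spells out by hand what a `for` loop does at the level of the iterator protocol, where exhaustion is signalled by `StopIteration`. Plain `while` loops and `break` do not involve exceptions, so "usually" is closer than "always".

```python
def parse_depth(text):
    # No try/except here: a ValueError from int() simply passes through.
    return int(text)


def read_config(raw):
    # Nor here; intermediate callers declare nothing.
    return {"depth": parse_depth(raw)}


def main(raw):
    try:
        return read_config(raw)
    except ValueError:
        # Handled only at the level that knows what to do about it.
        return {"depth": 0}


def countdown(n):
    """Manual equivalent of `for x in range(n, 0, -1)`."""
    items = []
    it = iter(range(n, 0, -1))
    while True:
        try:
            items.append(next(it))
        except StopIteration:
            break  # this is how a for loop learns the iterator is exhausted
    return items


print(main("3"))
print(main("three"))
print(countdown(3))
```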
David B. Benson said...: Aha. Then copied from ML's exceptions, which appear to be almost identical.

However, as a matter of style, in ML exceptions should be actually exceptional and not used routinely for terminations, etc. Of course, nothing enforces this other than a sense of clean, clear programming.; 26 January, 2008
Anonymous said...: Hi,

Though I like Python, I don't think this is a very strong case for its superior readability over C++.

Using templates you could write a similar parsing library and your code would look along these lines:

#include "NiftyParsing.h"

NiftyParsing::Literal romanNumeralLiteral(const string& numeralString, const int value)
{
    return NiftyParsing::Literal(numeralString).setParseAction( NiftyParsing::ReplaceWith<int, value>() );
}

one = romanNumeralLiteral("I",1);
four = romanNumeralLiteral("IV",4);
//etc.

NiftyParsing::SomeClass numeral = ( onethousand | ninehundred | fivehundred | fourhundred |
    onehundred | ninety | fifty | forty | ten | nine | five |
    four | one ).leaveWhitespace();

OneOrMore romanNumeral(numeral, ParseAction<int, Summation>());

cout << romanToInteger("XLII") << endl;

Validation could be done as easily as in Python, though, regardless of the language, you would not really use exceptions (whose behaviour in C++ is very similar to Python's).

Cheers

P.S.
If you don't like the NiftyParsing:: scoping, you could import the symbols into your namespace.; 27 January, 2008
Michael Tobis said...: D, OK, fair enough as far as it goes.

On the other hand, has someone actually written NiftyParsing? If not, your argument is somewhat theoretical. PyParsing is under 4 KLOC and written by a hobbyist. The advantage of Python is that a very wide array of such libraries exists, and that it's such fun to create them that people do so voluntarily.

All that said, what fascinates me most about the story is that most practicing computational scientists, certainly in the climate and weather domains, don't find the sort of elegance we are discussing interesting at all.

It's how to make the case for OOP to them that really matters, and it's for that reason that I display this approach to roman numerals.

I make very little headway in convincing them that such matters are other than an irrelevant curiosity, although the relevance of parsing to modeling complex systems to me is obvious.

I think a working prototype may be more valuable than any proposals or verbiage I can pull together.; 27 January, 2008
Anonymous said...: Hi Michael,

There might well be such a library available. I don't know.
Since C++ is also quite popular, there is also a myriad of libraries available. One project to provide an industrial standard library is Boost (http://www.boost.org). They have a parser, but I am not sure if it is as "nifty" as what pyparsing does.
Another library to do numerical (OO style) computing on par with Fortran is Blitz (http://www.oonumerics.org/blitz/).

I have started to play around with f2py to see if I can get the OSS model plasim (http://www.mi.uni-hamburg.de/Theoretische-Meteorologie.6.0.html) "ported" to Python. I don't want to rewrite the whole model, so I start by wrapping core functionality into Python and only replacing the higher level code.
I hope that bit by bit this will allow me to develop a framework where people can quickly and easily inject (or replace) Python code.

We'll see.

I believe you can fascinate at least some of the computational scientists by showing them how much more fun and productive it can be to write OO code.

When I came into the group I am in at the moment, I brought with me some of that enthusiasm. Now people who previously wrote Perl + Fortran suites tell me they feel "insecure" if they can't write unit-tests along with their (Python + C++) code:)

Cheers; 27 January, 2008
Michael Tobis said...: I haven't worked with them myself, but as I understand it C++ templates (Blitz, etc.) are an impractical solution to a class of problem that dynamic languages actually solve.

I have heard of compile times for very high level physics models in C++ in the hours, often to fail because the *compiler* ran out of core!; 27 January, 2008
Anonymous said...: Oh, I am sure there are cases where C++ is not a good solution. I am not arguing for C++ instead of Python. They both have their value.
What I was saying is that you can write readable OO code in C++ and you also get a lot of libraries for C++. One reason I am using C++ at work is that we have to produce operational applications and the static typing gives us some additional security. For, say, climate modelling that would not necessarily be an issue. Speed can be more of a problem. But that's where inter-language calls come in handy.

Cheers; 27 January, 2008
Michael Tobis said...: Some poor sod showed up here looking for how to do int.tostring in Python; in case it comes up again for somebody, str() will convert anything to a string representation in Python.; 25 March, 2008
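For anyone arriving with the same question, a short sketch of `str()` on built-in values and on a user-defined class. The `Celsius` class is invented for the example; defining `__str__` is how a class controls what `str()` returns.

```python
class Celsius:
    """Made-up class showing how __str__ customizes str()."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __str__(self):
        return f"{self.degrees} C"


print(str(42))           # integer to string
print(str(3.5))          # float to string
print(str([1, 2]))       # containers work too
print(str(Celsius(21)))  # uses Celsius.__str__
```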
