Advances in Pencil Science

Friday, April 12, 2013

Re: "SEND ME YOUR DATA - PDF IS FINE," SAID NO ONE EVER

http://www.caitlinrivers.com/1/post/2013/04/send-me-your-data-pdf-is-fine-said-no-one-ever-how-to-share-your-data-effectively.html

There's a serious problem with the current state of shared data - it is almost completely unusable! Here are some ideas for sharing more effectively.


Wednesday, February 29, 2012

CAM3 fails on namelist read

CAM3 won't run at TACC out of the box,. It will fail to read the namelist file, as the nodes can't read from stdin. This may apply to other HPC setups as well.

Here's my fix.

Copy the following text to a file called patchCAM


diff -ru cam1/models/atm/cam/src/control/runtime_opts.F90 ../CAM3/cam1/models/atm/cam/src/control/runtime_opts.F90
--- cam1/models/atm/cam/src/control/runtime_opts.F90 2005-03-08 11:06:36.000000000 -0600
+++ ../CAM3/cam1/models/atm/cam/src/control/runtime_opts.F90 2012-02-15 17:57:05.000000000 -0600
@@ -1065,7 +1065,15 @@
!
! Read in the camexp namelist from standard input
!
- read (5,camexp,iostat=ierr)
+! read (5,camexp,iostat=ierr)
+
+
+ open(16,file='namelist',iostat = ierr)
+ write(6,*) 'READING NAMELIST'
+ read(16, camexp)
+ write(6,*) 'FINISHED READING NAMELIST'
+
+
if (ierr /= 0) then
write(6,*)'READ_NAMELIST: Namelist read returns ',ierr
call endrun
diff -ru cam1/models/lnd/clm2/src/main/controlMod.F90 ../CAM3/cam1/models/lnd/clm2/src/main/controlMod.F90
--- cam1/models/lnd/clm2/src/main/controlMod.F90 2005-04-08 13:27:10.000000000 -0500
+++ ../CAM3/cam1/models/lnd/clm2/src/main/controlMod.F90 2012-02-16 15:26:58.000000000 -0600
@@ -369,7 +369,14 @@
fsurdat=lsmsurffile
finidat=lsminifile
#else
- read(5, clmexp, iostat=ierr)
+ ! read (5,camexp,iostat=ierr)
+
+
+ ! open(16,file='namelist',iostat = ierr)
+ write(6,*) 'READING NAMELIST 2'
+ read(16, clmexp, iostat = ierr)
+ write(6,*) 'FINISHED READING NAMELIST 2'
+
if (ierr /= 0) then
if (masterproc) then
write(6,*)'error: namelist input resulted in error code ',ierr


Go to the directory above cam1 where you unpacked the source and issue


patch -p0 < patchCAM

Wednesday, January 18, 2012

Friday, August 12, 2011

Fixing Excel text exports in OS X

Excel exports .tsv and .csv with a broken linefeed character.

To fix: copy this file to "fixcel.py"


# fixcel.py

from sys import argv
infnam, outfnam = argv[1:]
inf = file(infnam)
indata = inf.read()
inf.close()
outdata = ""
for ch in indata:
if ord(ch) == 13:
outdata += "\n"
else:
outdata += ch
outf = file(outfnam,"w")
outf.write(outdata)
outf.close()
#

Then invoke with:

python fixcel.py oldfilename newfilename

Thursday, December 9, 2010

Wordify

This code can count to a million in words. It could probably be refactored so it was shorter and could count to any number. If you were patient enough.

The "-b" flag puts "and" in the places a Brit would put it. It's optional but preferred.
I have random stopping points to increase the modest entertainment value of it running.

Can you recursively get wordify up to the quintillions? How should the "-b" flag be handled?

Can you do this in French?


from string import ascii_lowercase as alphabet
from sys import argv

units = [""] + "one two three four five six seven eight nine".split()
teens = "ten eleven twelve thirteen fourteen fifteen sixteen seventeen eighteen nineteen".split()
tenties = 2 * [""] + "twenty thirty forty fifty sixty seventy eighty ninety".split()


def name(number,terminate = True,continued = False,brit=False):
assert type(number) == int
assert number > 0 and number < 1001
if number == 1000: return "one thousand"
hundreds,residual = divmod(number,100)
tens,ones = divmod(residual,10)
result = ""
if hundreds:
result += units[hundreds] + " hundred"
if residual and hundreds:
result += " "
if brit and residual and (hundreds or continued):
result += "and "
if (tens == 1):
result += teens[ones]
if terminate:
result += "."
else:
if tens:
result += tenties[tens]
if ones: result += "-"
if ones:
result += units[ones]
if terminate:
result += "."
return result

def wordify(number,brit=False):
assert type(number) == int
assert number > 0 and number < 1000000
thousands,residual = divmod(number,1000)
result = ""
if thousands:
result = name(thousands,False,brit=brit) + " thousand"
if residual:
result += " "
result += name(residual,continued = True,brit=brit)
else:
result += "."
else:
result = name(number)
return result

if __name__ == "__main__":

from time import sleep
from random import randint

brit = "-b" in argv

sleeper = 30
for i in range(1,1000000):
if i == sleeper:
sleep(3)
sleeper += randint(3000,90000)
print wordify(i,brit=brit)



output:


one.
two.
three.
four.
five.
six.
seven.
eight.
nine.
ten.
eleven.
twelve.
...
nine hundred and ninety-nine thousand nine hundred and ninety-six.
nine hundred and ninety-nine thousand nine hundred and ninety-seven.
nine hundred and ninety-nine thousand nine hundred and ninety-eight.
nine hundred and ninety-nine thousand nine hundred and ninety-nine.


That's as far as it goes so far. Go fix it.

Up to 11

I managed to confuse myself for hours on #11 trying to be elegant. But I really like this solution:


grid = [[0]*23]*3 + [[int(x) for x in line.split()]+[0,0,0] for line in
'''08 02 22 97 38 15 00 40 00 75 04 05 07 78 52 12 50 77 91 08
49 49 99 40 17 81 18 57 60 87 17 40 98 43 69 48 04 56 62 00
81 49 31 73 55 79 14 29 93 71 40 67 53 88 30 03 49 13 36 65
52 70 95 23 04 60 11 42 69 24 68 56 01 32 56 71 37 02 36 91
22 31 16 71 51 67 63 89 41 92 36 54 22 40 40 28 66 33 13 80
24 47 32 60 99 03 45 02 44 75 33 53 78 36 84 20 35 17 12 50
32 98 81 28 64 23 67 10 26 38 40 67 59 54 70 66 18 38 64 70
67 26 20 68 02 62 12 20 95 63 94 39 63 08 40 91 66 49 94 21
24 55 58 05 66 73 99 26 97 17 78 78 96 83 14 88 34 89 63 72
21 36 23 09 75 00 76 44 20 45 35 14 00 61 33 97 34 31 33 95
78 17 53 28 22 75 31 67 15 94 03 80 04 62 16 14 09 53 56 92
16 39 05 42 96 35 31 47 55 58 88 24 00 17 54 24 36 29 85 57
86 56 00 48 35 71 89 07 05 44 44 37 44 60 21 58 51 54 17 58
19 80 81 68 05 94 47 69 28 73 92 13 86 52 17 77 04 89 55 40
04 52 08 83 97 35 99 16 07 97 57 32 16 26 26 79 33 27 98 66
88 36 68 87 57 62 20 72 03 46 33 67 46 55 12 32 63 93 53 69
04 42 16 73 38 25 39 11 24 94 72 18 08 46 29 32 40 62 76 36
20 69 36 41 72 30 23 88 34 62 99 69 82 67 59 85 74 04 36 16
20 73 35 29 78 31 90 01 74 31 49 71 48 86 81 16 23 57 05 54
01 70 54 71 83 51 54 69 16 92 33 48 61 43 52 01 89 19 67 48'''.split('\n')
] + [[0]*23]*3

import operator

print max([reduce(operator.mul, [grid[y+n*d[0]][x+n*d[1]] for n in (0,1,2,3)])
for x in xrange(0,20) for y in xrange(3,23)
for d in ((0,1),(1,0),(1,1),(-1,1))])


The halo of zeros is very clever and completely works around what made mine confusing, and despite everything, inelegant:



linelen = len(array[0])
for line in array:
assert len(line) == linelen

numlines = len(array)
offset = 4

directions = {
"row":(0,linelen-offset+1,numlines,(0,1)),
"column":(0,linelen,numlines-offset+1,(1,0)),
"slash":(offset-1,linelen-offset+1,numlines-offset+1,(-1,1)),
"backslash":(0,linelen-offset+1,numlines-offset+1,(1,1))
}

maxprod = None
valmax = 0

for direction in directions.keys():
x0, xlen, yrg, delta = directions[direction]
vec0 = [(x0 + i * delta[0] ,i * delta[1]) for i in range(offset)]
for y in range(yrg):
vec1 = [(item[0]+y,item[1]) for item in vec0]
for x in range(xlen):
vec = [(item[0],item[1]+x) for item in vec1]
vals = [array[item[0]][item[1]] for item in vec]
prod = reduce(mul,vals,1)
if prod > maxprod:
valmax = vals
maxprod = prod

print valmax, maxprod

Sunday, December 5, 2010

Puzzle 5

Profiting from my overwrought example 3


from bf7 import *
from sys import argv

if __name__ == "__main__":
if len(argv) == 1:
upto = 10
else:
upto = int(argv[1])
allprimes = primesupto(upto)

result = 1
for prime in allprimes:
factor = prime
while factor < upto:
result *= prime
factor *= prime
print result


For me it would have been less work to just write

print 2 * 2 * 2 * 2 * 3 * 3 * 5 * 7 * 11 * 13 * 17 * 19

Best solution, for those who can remember grade school:



def gcd(a, b):
while(b != 0):
a, b = b, a%b
return a

def lcm(a,b):
return a * b / gcd(a, b)

print reduce(lcm, range(2, 21))