Advances in Pencil Science
Wednesday, January 18, 2012
Friday, August 12, 2011
Fixing Excel text exports in OS X
Excel exports .tsv and .csv with a broken linefeed character.
To fix: copy this file to "fixcel.py"
# fixcel.py
from sys import argv
infnam, outfnam = argv[1:]
inf = file(infnam)
indata = inf.read()
inf.close()
outdata = ""
for ch in indata:
if ord(ch) == 13:
outdata += "\n"
else:
outdata += ch
outf = file(outfnam,"w")
outf.write(outdata)
outf.close()
#
Then invoke with:
python fixcel.py oldfilename newfilename
Thursday, December 9, 2010
Wordify
This code can count to a million in words. It could probably be refactored so it was shorter and could count to any number. If you were patient enough.
The "-b" flag puts "and" in the places a Brit would put it. It's optional but preferred.
I have random stopping points to increase the modest entertainment value of it running.
Can you recursively get wordify up to the quintillions? How should the "-b" flag be handled?
Can you do this in French?
output:
That's as far as it goes so far. Go fix it.
The "-b" flag puts "and" in the places a Brit would put it. It's optional but preferred.
I have random stopping points to increase the modest entertainment value of it running.
Can you recursively get wordify up to the quintillions? How should the "-b" flag be handled?
Can you do this in French?
from string import ascii_lowercase as alphabet
from sys import argv
units = [""] + "one two three four five six seven eight nine".split()
teens = "ten eleven twelve thirteen fourteen fifteen sixteen seventeen eighteen nineteen".split()
tenties = 2 * [""] + "twenty thirty forty fifty sixty seventy eighty ninety".split()
def name(number,terminate = True,continued = False,brit=False):
assert type(number) == int
assert number > 0 and number < 1001
if number == 1000: return "one thousand"
hundreds,residual = divmod(number,100)
tens,ones = divmod(residual,10)
result = ""
if hundreds:
result += units[hundreds] + " hundred"
if residual and hundreds:
result += " "
if brit and residual and (hundreds or continued):
result += "and "
if (tens == 1):
result += teens[ones]
if terminate:
result += "."
else:
if tens:
result += tenties[tens]
if ones: result += "-"
if ones:
result += units[ones]
if terminate:
result += "."
return result
def wordify(number,brit=False):
assert type(number) == int
assert number > 0 and number < 1000000
thousands,residual = divmod(number,1000)
result = ""
if thousands:
result = name(thousands,False,brit=brit) + " thousand"
if residual:
result += " "
result += name(residual,continued = True,brit=brit)
else:
result += "."
else:
result = name(number)
return result
if __name__ == "__main__":
from time import sleep
from random import randint
brit = "-b" in argv
sleeper = 30
for i in range(1,1000000):
if i == sleeper:
sleep(3)
sleeper += randint(3000,90000)
print wordify(i,brit=brit)
output:
one.
two.
three.
four.
five.
six.
seven.
eight.
nine.
ten.
eleven.
twelve.
...
nine hundred and ninety-nine thousand nine hundred and ninety-six.
nine hundred and ninety-nine thousand nine hundred and ninety-seven.
nine hundred and ninety-nine thousand nine hundred and ninety-eight.
nine hundred and ninety-nine thousand nine hundred and ninety-nine.
That's as far as it goes so far. Go fix it.
Up to 11
I managed to confuse myself for hours on #11 trying to be elegant. But I really like this solution:
The halo of zeros is very clever and completely works around what made mine confusing, and despite everything, inelegant:
grid = [[0]*23]*3 + [[int(x) for x in line.split()]+[0,0,0] for line in
'''08 02 22 97 38 15 00 40 00 75 04 05 07 78 52 12 50 77 91 08
49 49 99 40 17 81 18 57 60 87 17 40 98 43 69 48 04 56 62 00
81 49 31 73 55 79 14 29 93 71 40 67 53 88 30 03 49 13 36 65
52 70 95 23 04 60 11 42 69 24 68 56 01 32 56 71 37 02 36 91
22 31 16 71 51 67 63 89 41 92 36 54 22 40 40 28 66 33 13 80
24 47 32 60 99 03 45 02 44 75 33 53 78 36 84 20 35 17 12 50
32 98 81 28 64 23 67 10 26 38 40 67 59 54 70 66 18 38 64 70
67 26 20 68 02 62 12 20 95 63 94 39 63 08 40 91 66 49 94 21
24 55 58 05 66 73 99 26 97 17 78 78 96 83 14 88 34 89 63 72
21 36 23 09 75 00 76 44 20 45 35 14 00 61 33 97 34 31 33 95
78 17 53 28 22 75 31 67 15 94 03 80 04 62 16 14 09 53 56 92
16 39 05 42 96 35 31 47 55 58 88 24 00 17 54 24 36 29 85 57
86 56 00 48 35 71 89 07 05 44 44 37 44 60 21 58 51 54 17 58
19 80 81 68 05 94 47 69 28 73 92 13 86 52 17 77 04 89 55 40
04 52 08 83 97 35 99 16 07 97 57 32 16 26 26 79 33 27 98 66
88 36 68 87 57 62 20 72 03 46 33 67 46 55 12 32 63 93 53 69
04 42 16 73 38 25 39 11 24 94 72 18 08 46 29 32 40 62 76 36
20 69 36 41 72 30 23 88 34 62 99 69 82 67 59 85 74 04 36 16
20 73 35 29 78 31 90 01 74 31 49 71 48 86 81 16 23 57 05 54
01 70 54 71 83 51 54 69 16 92 33 48 61 43 52 01 89 19 67 48'''.split('\n')
] + [[0]*23]*3
import operator
print max([reduce(operator.mul, [grid[y+n*d[0]][x+n*d[1]] for n in (0,1,2,3)])
for x in xrange(0,20) for y in xrange(3,23)
for d in ((0,1),(1,0),(1,1),(-1,1))])
The halo of zeros is very clever and completely works around what made mine confusing, and despite everything, inelegant:
linelen = len(array[0])
for line in array:
assert len(line) == linelen
numlines = len(array)
offset = 4
directions = {
"row":(0,linelen-offset+1,numlines,(0,1)),
"column":(0,linelen,numlines-offset+1,(1,0)),
"slash":(offset-1,linelen-offset+1,numlines-offset+1,(-1,1)),
"backslash":(0,linelen-offset+1,numlines-offset+1,(1,1))
}
maxprod = None
valmax = 0
for direction in directions.keys():
x0, xlen, yrg, delta = directions[direction]
vec0 = [(x0 + i * delta[0] ,i * delta[1]) for i in range(offset)]
for y in range(yrg):
vec1 = [(item[0]+y,item[1]) for item in vec0]
for x in range(xlen):
vec = [(item[0],item[1]+x) for item in vec1]
vals = [array[item[0]][item[1]] for item in vec]
prod = reduce(mul,vals,1)
if prod > maxprod:
valmax = vals
maxprod = prod
print valmax, maxprod
Sunday, December 5, 2010
Puzzle 5
Profiting from my overwrought example 3
For me it would have been less work to just write
print 2 * 2 * 2 * 2 * 3 * 3 * 5 * 7 * 11 * 13 * 17 * 19
Best solution, for those who can remember grade school:
from bf7 import *
from sys import argv
if __name__ == "__main__":
if len(argv) == 1:
upto = 10
else:
upto = int(argv[1])
allprimes = primesupto(upto)
result = 1
for prime in allprimes:
factor = prime
while factor < upto:
result *= prime
factor *= prime
print result
For me it would have been less work to just write
print 2 * 2 * 2 * 2 * 3 * 3 * 5 * 7 * 11 * 13 * 17 * 19
Best solution, for those who can remember grade school:
def gcd(a, b):
while(b != 0):
a, b = b, a%b
return a
def lcm(a,b):
return a * b / gcd(a, b)
print reduce(lcm, range(2, 21))
Puzzle 4
What's the largest palindrome number that is the product of two three digit numbers.
This time I have one of the better solutions:
Most started from the factors to see if they built palindromes. Much better to go the other way; far fewer tests (the 94th palindrome succeeded, so less than 9400 calculations. The cast to string for the palindrome test is easy, but is it efficient?
The most interesting items on the discussion were the Pytho one-liner:
max(map(lambda s: bool(s) and max(s) or 0, [(filter(lambda n: n == int(str(n)[::-1]), map(lambda n: i*n, range(999, 99, -1)))) for i in range(999, 99, -1)]))
and the pen-and-paper solution
This time I have one of the better solutions:
def nextpal(low,high,reversed=False):
assert low < high
if reversed:
candidate = high
final = low
increment = -1
else:
candidate = low
final = high
increment = 1
while candidate != final:
if str(candidate) == str(candidate)[-1::-1]:
yield candidate
candidate += increment
def getfactor(low,high,target):
for test in range(low,high):
if not target % test:
residual = target/test
if residual in range(low,high):
return test, residual
return None
if __name__ == "__main__":
from time import time
start = time()
pal = nextpal(10000,1000000, reversed=True)
while True:
target = pal.next()
result = getfactor(100,1000,target)
if result:
print target
print result
break
end = time()
print end - start
Most started from the factors to see if they built palindromes. Much better to go the other way; far fewer tests (the 94th palindrome succeeded, so less than 9400 calculations. The cast to string for the palindrome test is easy, but is it efficient?
The most interesting items on the discussion were the Pytho one-liner:
max(map(lambda s: bool(s) and max(s) or 0, [(filter(lambda n: n == int(str(n)[::-1]), map(lambda n: i*n, range(999, 99, -1)))) for i in range(999, 99, -1)]))
and the pen-and-paper solution
You can also do this with pen and paper. You have a number:
(100a + 10b + c)(100d + 10e + f)
Which is a palindrone. This factors out to:
100cd + 10ce + cf +
1000bd + 100be + 10bf +
10000ad + 1000ae + 100af
Assuming the first digit is 9, then cf must be equal to nine.
Also, both a and d must then be equal to nine. The only ways
to make the last digit of the product of two integers 9 would
be if those integers were either of:
1 and 9
3 and 3
7 and 7
So, both numbers must start with 9, end with either 1, 3, 7,
or 9, and one of them must be divisible by 11. The only
numbers divisible by 11 from 900 - 1000 are:
902
913
924
935
946
957
968
979
990
You can discard all of those that do not end in either 1, 3,
7, or 9, and you are left with:
913
957
979
So now the presumed answer is either:
(900 + 10 + 3)(900 + 10x + 3)
(900 + 50 + 7)(900 + 10x + 7)
(900 + 70 + 9)(900 + 10x + 1)
Factoring all those out, you get:
810000 + 9000x + 2700 + 9000 + 100x + 30 + 2700 + 30x + 9
824439 + 9130x
Now, for the first digit 824439 + 9130x to be 9, x must be 9
(if x were 8, then 824439 + 9130x = 897479, and the first
digit is 8). And so you have 913 * 993, which is the answer.
You can factor the others out to see if they produce a bigger
answer, which they don't.
Saturday, December 4, 2010
Puzzle 3 without cheating
I am about 2800 times slower than the cheat I downloaded. But I get the right answer, and in a tolerable time (30 seconds). Pure brute forcing this seems like it would take much longer.
Update:
Best answer, wow!
Runs in 0.7 msec. I feel like a complete idiot now. Oh well, it was interesting anyway.
from sys import argv
from math import sqrt
from time import time
# PRIME GENERATOR
def nextprime():
""" generate next prime """
myprimes = [2]
candidate = 3
yield 2
while True:
factors = factorize(candidate,myprimes)
if len(factors.keys()) == 1:
myprimes.append(candidate)
yield candidate
candidate += 2
def primesupto(upto):
""" collect primes from the generator"""
result = []
primes = nextprime()
p = None
while not p or p < upto:
if p:
result.append(p)
p = primes.next()
return result
def factorize(target,allprimes=None,verbose=False):
""" get factors of an integer
this works but is very slow if the primes aren't cached """
factors = {}
upto = int(sqrt(target))
residual = target
prime = nextprime()
if allprimes:
prime = allprimes.__iter__()
while True: # get factors and multiplicities up to sqrt(N)
try:
factor = prime.next()
except StopIteration:
break
if factor > upto:
break
while not residual % factor:
residual /= factor
factors[factor] = factors.get(factor,0) + 1
if residual > 1: # if anything is left it is the sole prime factor > sqrt(N)
factors[residual] = 1
return factors
if __name__ == "__main__":
target = int(argv[1]) # get target from command line
upto = sqrt(target)
starttime = time()
allprimes = primesupto(upto)
print max(factorize(int(argv[1]),allprimes).keys())
endtime = time()
print endtime - starttime
Update:
Best answer, wow!
from time import time
from sys import argv
n = int(argv[1])
d = 3
starttime = time()
while (d * d < n):
if n % d == 0: n /= d
else: d += 2
endtime = time()
print n
print endtime - startime
Runs in 0.7 msec. I feel like a complete idiot now. Oh well, it was interesting anyway.
Subscribe to:
Posts (Atom)