Now, as a simple, fun exercise. Let's replace the monkey with a snake and write a Python script that tries to generate the Shakespeare quote "Live, Love, and let be". To make things a bit easier, we will ignore the uppercase and the commas for now and only search for the string "live love and let be".
Here is the code, in all of its simian glory:
import random
#live love and let be len==20
def generator(strlen):
sigma='abcdefghijklmnopqrstuvwxyz '
result = ''
for i in range(strlen):
result = result + sigma[random.randrange(27)]
return result
def score_function(target_str,gen_str):
score = 0.0
for i in range(len(target_str)):
if target_str[i] == gen_str[i]:
score = score + 1
return score / len(target_str)
def main():
target = 'live love and let be'
gen = generator(20)
best_score = 0
curr_score = score_function(target, gen)
while curr_score < 1:
if curr_score > best_score:
print gen,curr_score
best_score = curr_score
gen = generator(20)
curr_score = score_function(target, gen)
main()
I let this run for a few minutes on an Intel i3 with 2GB ram, and the highest match I saw was a 0.4.
Let's go ahead and break down the algorithm at a higher level, and then delve into it line by line. We basically want to generate a random string of characters of some length using the generator function. We then want to see how well that generated string of characters compares to our target Shakespeare quote by using the score function. Score returns a value between 0 and 1. 0 meaning none of the characters matched, and 1 meaning it was a direct hit. We then use our main routine to generate and score strings until we find a perfect match, all the while providing updates on what the longest match so far was.
I'm working through a book that has this problem in it. I have what you a few more little bits for myself.
ReplyDeleteQuestion: can this be improved to keep the letters that are correct? So where target_str[i] == gen_str[i], the letter is kept there and future iterations don't change that letter, only the others?