Sunday, December 13, 2009

Intro to Computer Science Lesson Plan: Numeric Search


In this lab we have a number puzzle challenge that can be solved with a simple program. The lab works to give students a sense of the significance of round-off error and limited numeric precision.

Here is the puzzle: a music composer who would like to experiment with musical scales different from the traditional 12-note chromatic scale. This chromatic scale is essentially the notes derived from the series of eleven perfect fifths each with ratio 3:2 (or, equivalently think of a string fretted at length 2/3). This ratio series causes a peculiar tone issue, due to a round-off error sometimes called the Pythagorean Comma. This error in unavoidable since 2/3 raised to any power can not equal an octave, which must be an integral power of 1/2. The traditional 12-note scale produces a relative error in tone over 7 octaves of around 1.36%, as can be seen by the following calculation.

>>> (3.0/2.0)**12 / (2.0/1.0)**7
1.013643264770507812

The challenge is to find an alternative "chromatic scale" with more notes (derived by a longer series of perfect fifths) that produces less relative error over some number of octaves. Determine a method to systematically search for a) the precise number of notes in the scale and b) the precise number of octaves that will make the error smaller than the 1.3% c) finally, find the smallest such error possible (given restriction of double precision arithmetic.)

Saturday, December 12, 2009

Intro to Computer Science Lesson Plan: Machine Learning


As noted I wanted to bring "interesting" CS topics to my freshman class. Here is a relatively easy lab lesson that presents a 'linear algebraic' method for movie recommendation. The main idea is that each user and movie is described by a vector of hidden attributes values. How well a user likes a given movie is just the dot product of vectors. To find these vectors we do a computation based on a simplified steepest descent to SVD. I adapted an idea from Simon Funk's blog. Here is the lab. For our class we found that "Dark Knight" and "Napoleon Dynamite" had the extreme hidden attribute values.

------------------------------------------------------------------

We can determine recommendations by computing hidden attribute values that describe the tastes of each user, and how well a movie can be described by this attribute. Let us work on computing the most significant such value, one value for each user and one value for each movie. We can begin by letting all these values be equal to 1.0 as follows.

numUsers = len(allRatings)
numMovies=30
userVal= [1.0]* numUsers
movieVal = [1.0]*numMovies

We are now ready to try to learn the hidden values by using the actual rating values (submitted in advance by each student). We will try to update them in rounds. In each update round we choose one actual rating at random. To do this choose a user and a movie at random and check that the entry in the allRatings matrix is not 0. We will compute the error by comparing the predictedRating to the actualRating stored in our dataset mldata.py.

predictedRating = movieVal[movie] * userVal[user]
error = actualRating – predictedRating

As it turns out, we can always improve the situation my changing the userVal according to the error and the movieVal as follows:

userVal[user] += error * movieVal[movie] * smallnum

We explained why this is true in lecture (steepest descent). Similarly, we can improve the movieVal according to the error and the userVal as follows:

movieVal[movie] += error * userVal[user] * smallnum

We can put the whole update routine in a function as follows where we’ve added a dampening factor so that the updates change things in a smooth way:

def update(user, movie, actualRating):
predictedRating = movieVal[movie] * userVal[user]
error = actualRating - predictedRating
userVal[user] += error * movieVal[movie] * 0.0001
movieVal[movie] += error * userVal[user] * 0.0001

Now simply call this update function over and over on random users and random movies (being sure that there is an actual rating) and (with luck) the numbers will converge quickly to the hidden values we have been seeking.

Intro To Computer Science Lesson Plan : Visualizing Chaos via Orbit Diagrams of Logistic Map and the Henon Map



Last Fall I taught a class to incoming Freshman to introduce "interesting" ideas from the field of Computer Science. I tried to create some labs that would engage the students without being too technical or too dumbed-down. One successful lab had the students take a well-known chaotic function known as the logistic equation x(n+1) = r*x(n)*(1-x(n)) and asked them to try to form a visualization. Here are some of the results of the students.

Here are project results from CS0 students: Daniel Hagerstrand, Cody Mays, and Scott Resnick



Here is basic python code that generates the graphic for the orbit diagram of the logistic equation:



from Tkinter import *
import random
root = Tk()
canvas = Canvas(width=1000, height=750, bg='white')
canvas.pack(expand=YES, fill=BOTH)
text = canvas.create_text(50,10, text="Chaos Test")

def drawcircle(canv,x,y,radius,color):
canvas.create_oval(x-radius,y-radius,x+radius, y+radius,width=0,fill=color)

# Feigenbaum bifurcation values.
# b1 = 3, b2 = 3.449490…,
# b3 = 3.544090…, b4 = 3.556441…, b5 = 3.568759…, and b6 = 3.569692

r=2.4
while r < 4:
for pix in range(50):
x = random.random()
for i in range(100):
x = r*x*(1-x)
drawcircle(canvas,x*1000,(r-2.4)*(750/1.6),1,'blue')
r+=0.005
root.mainloop()























Here is basic python code that generates the graphic for the orbit diagram of the Henon Map:



from Tkinter import *
import random
root = Tk()

def drawcircle(canv1,x,y,radius,color):
canv1.create_oval(x-radius,y-radius,x+radius, y+radius,width=0,fill=color)

canvas = Canvas(width=1400, height=800, bg='white')
canvas.pack(expand=YES, fill=BOTH)
text = canvas.create_text(100,10, text=" Orbit Diagram of the Henon Map")


def HenonMap(a,b,x,y):
return y + 1.0 - a *x*x, b * x

# Map dependent parameters
a = 1.4
b = 0.3
iterates = 100000

xtemp = 0.1
ytemp = 0.2

for pix in xrange(0,iterates):
xtemp, ytemp = HenonMap(a,b,xtemp,ytemp)
drawcircle(canvas,400*xtemp+750,1000*ytemp+400,1,'blue')

root.mainloop()