Review of TUIC: Enabling Tangible Interaction on Capacitive Multi-touch Displays(November 25, 2013)

Download: Review (100K)
LaTeX Source Files (60K)

I reviewed “TUIC: Enabling Tangible Interaction on Capacitive Multi-touch Displays” (citation below) for my course on human-computer interaction.

TUIC is a method for providing tangible input to unmodified capacitive-touch screens by simulating finger-touch events. In my review, I list the advantages and disadvantages of the technology as reported in the paper, then provide some possible improvements.

N. Yu, L. Chan, S. Lau, S. Tsai, I. Hsiao, D. Tsai, L. Cheng, F. Hsiao, M.Y. Chen, P. Huang, and Y. Hung. TUIC: Enabling Tangible Interaction on Capacitive Multi-touch Displays. Proceedings of ACM CHI. 2011.
Original Paper
ACM Digital Library

IASTED Technology for Education and Learning (TEL) 2013(November 13, 2013)

The Virtual Cell, from

I presented my paper, "The Design of Multiplayer Simulations in Online Game-Based Learning", at the TEL 2013 Conference in Los Angeles, CA (Marina del Rey) on 12 November 2013.

New LF Hardware, Commodity Webcams(October 18, 2013)

A light field camera using commodity webcams
Profile picture of light field camera

Here, some hardware! As described in the Intel proposal.


The cameras are commodity 640×480 USB webcams. The 3mm plywood mounts (SVG) were cut on a laser cutter.

Under construction:
Assembly of light field camera

Projected Homography(October 8, 2013)

Download: Arduino: Read Ambient Light (2K)
Arduino: Read Ambient Light from Four Sensors (2K)
Arduino: Detect Calibration Scene, Output Calibration (4K)
Python: Produce the Calibration Scene (4K)
Python: Perform a Projected Homography (8K)
Image used in Demonstrations (32K)

Bouncing the calibration sequence off a mirror

This project’s aim is to replicate the results of J. Lee’s projective homography demonstration (citation at bottom).

The original project used costly fiber optic hardware to deliver a binary positional code to each of four light sensors (one at each corner of a rectangular surface, later extended to arbitrary surfaces) via a projector showing a series of calibration images (a.k.a. "structured light" or "gray-coded binary" patterns).

The approach for this project will be:

  • fiber optic cabling sourced from cheap TOSLINK (optical audio) cables
  • commodity light sensors (<1 USD each, i.e. "ambient light photodiodes")
  • simple binary calibration images (gray-coded binary images add more fidelity, but aren't as simple to implement)
  • decoupling of the automation aspects (this project will read in a binary code, compute the position of the surface, then the user will manually input these positions to the homography program)
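The binary calibration idea in the third bullet can be sketched in Python. This is an illustration with hypothetical helper names (in the real rig, the bits arrive one per projected scene): the code is the bits of the sensor's normalized position, most significant first, behind an always-on sync bit.

```python
def encode_position(x, width, bits=7):
    # The code a sensor at pixel x would observe: a leading sync bit
    # (scene #1 is all white), then one bit per halving scene.
    code = 1
    frac = x / float(width)
    for _ in range(bits):
        frac *= 2
        bit = int(frac)
        frac -= bit
        code = (code << 1) | bit
    return code

def decode_position(code, width, bits=7):
    # Invert the encoding; the answer is exact to width / 2**bits pixels.
    value = code & ((1 << bits) - 1)  # drop the sync bit
    return width * value / float(1 << bits)
```

With 7 halving scenes and a 1024-pixel-wide projection, a decoded position lands within 1024/128 = 8 pixels of the true sensor position.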

First order of business: acquire ambient light sensors and see if we can get a reading out of them:
An arduino reading an ambient light sensor
Pictured: 10K resistor, 472 (4.7nF) capacitor, Arduino.

const int LED = 13;
const int AMBIENT = A0;

void setup()
{
  Serial.begin(9600);
  pinMode(LED, OUTPUT);
  digitalWrite(LED, HIGH);
  delay(500); // ms
  digitalWrite(LED, LOW);
}

void loop()
{
  // analogRead returns an int 0-1023
  // and takes ~0.1 ms to complete
  int measurement = analogRead(AMBIENT);
  // units are AREF / 1024
  // when AREF = 5V, each unit is ~4.9mV
  // The AREF voltage can be changed via analogReference()
  unsigned int decimillivolts = measurement * 49;
  Serial.println(decimillivolts);
  delay(500); // ms
}

With our preliminary step complete, we move on to the more interesting problem: How to interface a fiber optic cable with our cheap sensor?

Cutting the fiber is tricky: each cleave yielded inconsistent results, so I just kept cutting until things looked "good enough". Heat might have helped, but anecdotes from a Google search suggested otherwise. Ditto for cutting the tops (producing a flat surface) of the ambient light sensors.

I chose superglue (non-gel) to adhere the cleaved cable to the light sensor for two reasons:

  • Diffusion: The imperfections in the cleaved fiber optic as well as the cleaved light sensor motivated a diffuse interface that would reduce inter-sensor variability (four were constructed).
  • Cost: Easy to obtain, I wouldn’t need to make a trip to a specialty store.

The superglue worked out well enough that no other adhesives were experimented with.

Finally, each light sensor is wrapped to prevent stray light (anything not arriving through the fiber optic) from triggering it.
A fiber optic cable is attached to the ambient light sensor

At this point I immediately noticed that the output from the sensors was cut by a factor of 10-12 (3.1V down to 0.28V). I wrote a small procedure to accumulate light readings over time to compensate for the greater effect of noise on the output:

const int LED = 13;
const int AMBIENT = A0;

// integrate over a fluorescent bulb's cycle. Usually 100
// or 120 Hz. analogRead() takes ~0.1 ms.
const unsigned int MEASURES = 120;
unsigned int measurements[MEASURES];
const unsigned int FIRST = 8; // block size for the first pass
const unsigned int FIRST_SHIFT = 3; // 120 >> 3 = 15 blocks
const unsigned int SECOND = 15;

void setup()
{
  Serial.begin(9600);
  pinMode(LED, OUTPUT);
  digitalWrite(LED, HIGH);
  delay(500); // ms
  digitalWrite(LED, LOW);
  // we could zero out measurements[]... nah
}

void loop()
{
  Serial.println("# measure");
  // analogRead returns an int 0-1023
  for (unsigned int i=0; i < MEASURES; i++)
  {
    measurements[i] = analogRead(AMBIENT);
  }
  Serial.println("# accumulate");
  // We will now accumulate and average, using the
  // fact that our measures are 0-1023 and
  // an unsigned int can hold 64 (or fewer) accumulated
  // values before overflowing.
  // First pass: average each block of 8 in place.
  for (unsigned int i=0; i < (MEASURES >> FIRST_SHIFT); i++)
  {
    unsigned int block = i << FIRST_SHIFT;
    for (unsigned int j=1; j < FIRST; j++)
    {
      measurements[block] += measurements[block + j];
    }
    measurements[block] >>= FIRST_SHIFT;
  }
  // Second pass: average the 15 block averages into measurements[0].
  for (unsigned int j=1; j < SECOND; j++)
  {
    measurements[0] += measurements[j << FIRST_SHIFT];
  }
  measurements[0] /= SECOND;

  // average is now in measurements[0]
  // units are AREF / 1024
  // when AREF = 5V, each unit is ~4.9mV
  // The AREF voltage can be changed via analogReference()
  unsigned int decimillivolts = measurements[0] * 49;
  Serial.println(decimillivolts);
  delay(500); // ms
}
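A quick Python sanity check of the two-stage integer averaging (mirroring the FIRST=8, SECOND=15 constants above; this is a sketch for checking the arithmetic, not part of the firmware):

```python
def two_stage_average(samples, first=8, second=15):
    # Stage 1: integer-average each block of `first` samples.
    blocks = [sum(samples[i*first:(i+1)*first]) // first
              for i in range(second)]
    # Stage 2: integer-average the block averages.
    return sum(blocks) // second

# 120 noisy-ish 10-bit readings around 503
samples = [500 + (i % 7) for i in range(120)]
avg = two_stage_average(samples)
```

Every intermediate sum stays at or below 8 × 1023 = 8184 in the first stage and 15 × 1023 = 15345 in the second, comfortably inside a 16-bit unsigned int, which is why the Arduino version never overflows.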

Next up was a bit of elbow grease; I made three more, tested them, then installed them in the four corners of a plywood panel:
A surface with four embedded fiber optic cables

I then set up a laptop and projector. In order to test the sensitivity of each sensor, I wrote a small web page that I could move around the screen:
Using a projector to test the readings of each sensor

<style>
	.binary {
		height: 33%;
	}
	.on {
		background-color: white; /* assumed; calibration used #FFFFFF */
	}
	.half {
		background-color: #777;
	}
	.off {
		background-color: black;
	}
</style>
<div class="binary on"></div>
<div class="binary half"></div>
<div class="binary off"></div>

...and used this to calibrate (find the expected output of) each sensor when exposed to #FFFFFF from the projector.
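Once each sensor's full-white level is known, classifying a reading as "white" or "black" is a simple threshold. A sketch (the 50% cutoff is my assumption, not a value from the original code):

```python
def classify(reading, white_level, cutoff=0.5):
    # Returns 1 if the sensor is seeing the white half of a scene.
    # white_level is the reading observed under full #FFFFFF.
    return 1 if reading >= cutoff * white_level else 0
```

Per-sensor calibration matters here: the fiber couplings vary, so each sensor gets its own white_level rather than one global threshold.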

Now that we have our rig completely set up, we can code a calibration sequence:
Part of the calibration sequence

And use the readings from the sensors to detect the surface in the projected view:
Seeing if we found the four corners of the surface

The output of the calibration reader is a bit unnecessary, but fun. I chose 8 scenes for each calibration sequence (one vertical sequence and one horizontal sequence, for a total of 16 scenes), with black screens between each.

A horizontal calibration scene. It's a binary pattern of alternating white and black stripes of equal width; at each step, the stripes subdivide.
Pictured: Horizontal calibration sequence showing calibration images #2-#5. #1 is a white image, and #6 & #7 are painful to view (seriously).

First, I detect when the all-white opening scene is seen by every sensor. Then, for each calibration scene, a bit is appended to that sensor's binary number. When finished I have two 8-bit (0-255) numbers (one horizontal, one vertical) for each sensor:
An eight-bit binary number representing the position of the sensor in the projected view
Pictured: One of eight outputs ("151") of the calibration code. The most significant bit (Here labeled #1) will always be "on", since scene #1 is an all-white image. Each bit encodes whether or not white was detected at each step in the 8-step calibration sequence.
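The bit accumulation just described amounts to a shift-and-or per scene; e.g. the "151" in the photo decodes as shown in this sketch:

```python
def code_from_scenes(scene_hits):
    # scene_hits: one 0/1 per calibration scene, scene #1 (all white) first.
    code = 0
    for hit in scene_hits:
        code = (code << 1) | hit
    return code

code_from_scenes([1, 0, 0, 1, 0, 1, 1, 1])  # -> 151, i.e. 0b10010111
```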

We take our encoded position and convert it to screen coordinates for our pyOpenGL application (non-working example):

def parse_bitwise_positions(y_0, y_1, y_2, y_3, x_0, x_1, x_2, x_3):
  """Our values range from 128-255.
  The binary representation of the value [i.e. bin(176) is '0b10110000']
  denotes its position in the scene. The MSB (here the 7th) denotes
  whether our sensor is on the projected surface or not (if this bit is
  not present we can not continue, hence we check that all inputs are
  greater than 2**7).
  The next bit (here the 6th bit) tells us whether we are in the top or
  bottom half of the screen, 0 is top, 1 is bottom. This halves our
  position space. The following bits continue to halve the position
  space until we run out of precision, at which point we are confident
  we know the encoded position.
  """
  # coerce our string input to ints
  (y_0, y_1, y_2, y_3, x_0, x_1, x_2, x_3) = (int(y_0), int(y_1),
      int(y_2), int(y_3), int(x_0), int(x_1), int(x_2), int(x_3))
  top_left = Point(0, 0)
  top_rght = Point(0, 0)
  bot_rght = Point(0, 0)
  bot_left = Point(0, 0)
  for i in (y_0, y_1, y_2, y_3, x_0, x_1, x_2, x_3):
    if i < 2**MSB:
      print "Bad sensor input!"
      return (0, 0) + (0, height) + (width, height) + (width, 0)

  for i in range(MSB):
    if y_0 & 2**i == 0:
      top_left.y += height / float(2**(MSB-i))
    if y_1 & 2**i == 0:
      top_rght.y += height / float(2**(MSB-i))
    if y_2 & 2**i == 0:
      bot_rght.y += height / float(2**(MSB-i))
    if y_3 & 2**i == 0:
      bot_left.y += height / float(2**(MSB-i))
    if x_0 & 2**i == 0:
      top_left.x += width / float(2**(MSB-i))
    if x_1 & 2**i == 0:
      top_rght.x += width / float(2**(MSB-i))
    if x_2 & 2**i == 0:
      bot_rght.x += width / float(2**(MSB-i))
    if x_3 & 2**i == 0:
      bot_left.x += width / float(2**(MSB-i))

  return (bot_left.x, bot_left.y) + (top_left.x, top_left.y) + (top_rght.x, top_rght.y) + (bot_rght.x, bot_rght.y)

def draw_quad(x_0, y_0, x_1, y_1, x_2, y_2, x_3, y_3):
  gl.glBindTexture(gl.GL_TEXTURE_2D, texture)
  gl.glBegin(gl.GL_QUADS)
  gl.glTexCoord(1, 0)
  gl.glVertex(x_0, y_0)
  gl.glTexCoord(0, 0)
  gl.glVertex(x_1, y_1)
  gl.glTexCoord(0, 1)
  gl.glVertex(x_2, y_2)
  gl.glTexCoord(1, 1)
  gl.glVertex(x_3, y_3)
  gl.glEnd()
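Drawing a textured quad lets OpenGL do the interpolation; if you instead want the full 3×3 projective transform implied by the four correspondences, you can solve for it directly. A self-contained sketch (not from the original code; a direct linear transform with pure-Python Gaussian elimination for the 8×8 system):

```python
def solve(A, b):
    # Gaussian elimination with partial pivoting on the augmented matrix.
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / float(M[col][col])
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / float(M[r][r])
    return x

def homography(src, dst):
    # Solve for H (with h22 fixed to 1) mapping each src (x, y) to the
    # corresponding dst (u, v): 8 equations, 8 unknowns.
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = solve(A, b)
    return [h[0:3], h[3:6], h[6:8] + [1.0]]

def apply_h(H, x, y):
    # Homogeneous multiply followed by the perspective divide.
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)
```

For this rig, src would be the four corners of the image and dst the four decoded sensor positions (or vice versa, depending on which direction you warp).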

Et voilà:
And the output! A projected homograph

The interesting property of this system is that it works with any size of OpenGL window (so long as the first calibration image hits all sensors), with the projector upside down or turned on its side, or even with a mirror introduced in the path of the projection (!):
Bouncing the calibration sequence off a mirror
A projected homograph bounced off a mirror
Pictured: The image aligned with the sensors on the projection surface. Below: the output of the laptop to the projector.

J. Lee, P. Dietz, D. Aminzade, and S. Hudson. "Automatic Projector Calibration using Embedded Light Sensors", Proceedings of the ACM Symposium on User Interface Software and Technology, October 2004.

Headless Rendered OpenGL Scenes(September 22, 2013)

Download: Headless .obj Renderer (6K)
.obj Renderer (6K)
.obj (Stanford Rabbit) used in Demonstration (400K)

Running the renderer

Output of the headless renderer. The stanford rabbit.

Recent versions of PyOpenGL (>3.0.2) have support for software rendering using OSMesa. This makes rendering with OpenGL on a headless node much easier!

The dependencies for these examples (you may need to download pyopengl 3.0.2 directly):
$ sudo apt-get install python-opengl python-opencv libosmesa6

I plan to use this system for a Light Field simulator, and decided to load in and render .obj files. First up: loading in obj files:

import pygame
from OpenGL import GL as gl

def MTL(filename):
  contents = {}
  mtl = None
  for line in open(filename, "r"):
    if line.startswith('#'): continue
    values = line.split()
    if not values: continue
    if values[0] == 'newmtl':
      mtl = contents[values[1]] = {}
    elif mtl is None:
      raise ValueError, "mtl file doesn't start with newmtl stmt"
    elif values[0] in ('map_d', 'map_Ks', 'map_Bump'):
      # other texture maps: record the filename, but don't load them
      mtl[values[0]] = values[1]
    elif values[0] == 'map_Kd':
      # load the texture referred to by this declaration
      mtl[values[0]] = values[1]
      surf = pygame.image.load(mtl['map_Kd'])
      image = pygame.image.tostring(surf, 'RGBA', 1)
      ix, iy = surf.get_rect().size
      texid = mtl['texture_Kd'] = gl.glGenTextures(1)
      gl.glBindTexture(gl.GL_TEXTURE_2D, texid)
      gl.glTexParameteri(gl.GL_TEXTURE_2D, gl.GL_TEXTURE_MIN_FILTER,
        gl.GL_LINEAR)
      gl.glTexParameteri(gl.GL_TEXTURE_2D, gl.GL_TEXTURE_MAG_FILTER,
        gl.GL_LINEAR)
      gl.glTexImage2D(gl.GL_TEXTURE_2D, 0, gl.GL_RGBA, ix, iy, 0,
        gl.GL_RGBA, gl.GL_UNSIGNED_BYTE, image)
    else:
      # everything else (Kd, Ka, Ks, ...) is a list of floats
      mtl[values[0]] = map(float, values[1:])
  return contents
class OBJ:
  def __init__(self, filename, swapyz=False):
    """Loads a Wavefront OBJ file. """
    self.vertices = []
    self.normals = []
    self.texcoords = []
    self.faces = []
    material = None
    for line in open(filename, "r"):
      if line.startswith('#'): continue
      values = line.split()
      if not values: continue
      if values[0] == 'v':
        v = map(float, values[1:4])
        if swapyz:
          v = v[0], v[2], v[1]
        self.vertices.append(v)
      elif values[0] == 'vn':
        v = map(float, values[1:4])
        if swapyz:
          v = v[0], v[2], v[1]
        self.normals.append(v)
      elif values[0] == 'vt':
        self.texcoords.append(map(float, values[1:3]))
      elif values[0] in ('usemtl', 'usemat'):
        material = values[1]
      elif values[0] == 'mtllib':
        self.mtl = MTL(values[1])
      elif values[0] == 'f':
        face = []
        texcoords = []
        norms = []
        for v in values[1:]:
          w = v.split('/')
          face.append(int(w[0]))
          if len(w) >= 2 and len(w[1]) > 0:
            texcoords.append(int(w[1]))
          else:
            texcoords.append(0)
          if len(w) >= 3 and len(w[2]) > 0:
            norms.append(int(w[2]))
          else:
            norms.append(0)
        self.faces.append((face, norms, texcoords, material))
    self.gl_list = gl.glGenLists(1)
    gl.glNewList(self.gl_list, gl.GL_COMPILE)
    for face in self.faces:
      vertices, normals, texture_coords, material = face
      if material:
        mtl = self.mtl[material]
        if 'texture_Kd' in mtl: # use diffuse texmap
          gl.glBindTexture(gl.GL_TEXTURE_2D, mtl['texture_Kd'])
        else: # just use diffuse colour
          gl.glColor(*mtl['Kd'])
      gl.glBegin(gl.GL_POLYGON)
      for i in range(len(vertices)):
        if normals[i] > 0:
          gl.glNormal3fv(self.normals[normals[i] - 1])
        if texture_coords[i] > 0:
          gl.glTexCoord2fv(self.texcoords[texture_coords[i] - 1])
        gl.glVertex3fv(self.vertices[vertices[i] - 1])
      gl.glEnd()
    gl.glEndList()


model = OBJ('/path/to/scene.obj')

Great! Now let's make sure our OpenGL calls from Python work (this will only run on a machine with graphics enabled; if your machine is headless, skip to the next code listing):

from OpenGL import GL as gl, GLU as glu, GLUT as glut
import sys
# Setup etc. repurposed from:
name = 'obj_viewer'
model = None
width, height = 2816/4, 2112/4

def main():
  global model
  if len(sys.argv) < 5:
    raise BaseException("usage: %s cam_x cam_y cam_z file.obj" % sys.argv[0])
  glut.glutInit(sys.argv)
  glut.glutInitDisplayMode(glut.GLUT_DOUBLE | glut.GLUT_RGB | glut.GLUT_DEPTH)
  glut.glutInitWindowSize(width, height)
  glut.glutCreateWindow(name)
  model = OBJ(sys.argv[4])
  gl.glEnable(gl.GL_DEPTH_TEST)
  gl.glEnable(gl.GL_LIGHTING)
  gl.glEnable(gl.GL_LIGHT0)
  lightZeroPosition = [40.,4.,40.,1.]
  lightZeroColor = [0.8,1.0,0.8,1.0] # green tinged
  gl.glLightfv(gl.GL_LIGHT0, gl.GL_POSITION, lightZeroPosition)
  gl.glLightfv(gl.GL_LIGHT0, gl.GL_DIFFUSE, lightZeroColor)
  gl.glLightf(gl.GL_LIGHT0, gl.GL_CONSTANT_ATTENUATION, 0.1)
  gl.glLightf(gl.GL_LIGHT0, gl.GL_LINEAR_ATTENUATION, 0.05)
  fov_y = 55.
  aspect = float(width) / height
  near = 1.
  far = 300.
  gl.glMatrixMode(gl.GL_PROJECTION)
  glu.gluPerspective(fov_y, aspect, near, far)
  gl.glMatrixMode(gl.GL_MODELVIEW)
  glu.gluLookAt(
      float(sys.argv[1]), float(sys.argv[2]), float(sys.argv[3]),
      0, 0, 0,
      0, 1, 0)
  glut.glutDisplayFunc(display)
  glut.glutMainLoop()

def display():
  gl.glClear(gl.GL_COLOR_BUFFER_BIT | gl.GL_DEPTH_BUFFER_BIT)
  gl.glCallList(model.gl_list)
  glut.glutSwapBuffers()

if __name__ == '__main__': main()

Note: for this last example you will need to run the code with the environment variable PYOPENGL_PLATFORM set to "osmesa":
$ PYOPENGL_PLATFORM=osmesa python

...and finally we can render without needing a graphics card:

import cv, numpy
from OpenGL import GL as gl, GLUT as glut, GLU as glu, arrays, platform
import sys
# Invoke like:
# $ PYOPENGL_PLATFORM=osmesa python
# Matti Kariluoma Sep 2013
width, height = 2816/2, 2112/2
model = None

def setup(model_obj_filename):
  global model
  ctx = platform.OSMesaCreateContext(gl.GL_RGBA, None) # requires PYOPENGL_PLATFORM=osmesa
  buf = arrays.GLubyteArray.zeros((height, width, 4))
  p = arrays.ArrayDatatype.dataPointer(buf)
  assert(platform.OSMesaMakeCurrent(ctx, p, gl.GL_UNSIGNED_BYTE, width, height))
  model = OBJ(model_obj_filename)
  color = (
      int('b0', 16) / 255.0,
      int('f0', 16) / 255.0,
      int('f0', 16) / 255.0,
      )
  gl.glClearColor(color[0], color[1], color[2], 1.0) # light blue background
  gl.glEnable(gl.GL_DEPTH_TEST)
  gl.glEnable(gl.GL_LIGHTING)
  gl.glEnable(gl.GL_LIGHT0)
  return ctx, buf

def teardown(ctx):
  platform.OSMesaDestroyContext(ctx)

def lights():
  lightZeroPosition = [10., 4., 10., 1.]
  lightZeroColor = [1.0, 1.0, 0.8, 1.0] # yellow tinged
  gl.glLightfv(gl.GL_LIGHT0, gl.GL_POSITION, lightZeroPosition)
  gl.glLightfv(gl.GL_LIGHT0, gl.GL_DIFFUSE, lightZeroColor)
  gl.glLightf(gl.GL_LIGHT0, gl.GL_CONSTANT_ATTENUATION, 0.1)
  gl.glLightf(gl.GL_LIGHT0, gl.GL_LINEAR_ATTENUATION, 0.05)

def camera(dx=0, dy=0, dz=0):
  fov_y = 55.
  aspect = float(width) / height
  near = 1.
  far = 300.
  gl.glMatrixMode(gl.GL_PROJECTION)
  glu.gluPerspective(fov_y, aspect, near, far)
  gl.glMatrixMode(gl.GL_MODELVIEW)
  gl.glTranslate(0. + dx, 0. + dy, 0. + dz)

def render():
  gl.glClear(gl.GL_COLOR_BUFFER_BIT | gl.GL_DEPTH_BUFFER_BIT)
  gl.glCallList(model.gl_list)
  gl.glFinish() # wait for the software renderer to finish

def save_render(filename):
  ia = gl.glReadPixels(0, 0, width, height, gl.GL_BGR, gl.GL_UNSIGNED_BYTE)
  im = cv.CreateImageHeader((width, height), cv.IPL_DEPTH_8U, 3)
  cv.SetData(im, ia)
  cv.SaveImage(filename, im)

def main():
  if len(sys.argv) < 5:
    raise BaseException("usage: %s cam_x cam_y cam_z file.obj" % sys.argv[0])
  gl_context, offscreen_buffer = setup(sys.argv[4])
  lights()
  camera(float(sys.argv[1]), float(sys.argv[2]), float(sys.argv[3]))
  render()
  save_render('render.png') # output filename (arbitrary)
  teardown(gl_context)

if __name__ == '__main__': main()

COSYNE 2013(March 24, 2013)

Deciding to break away from the usual, I attended the Computational Neuroscience conference COSYNE 2013 in Salt Lake City, UT. The take-aways that were the most compelling for me:

  • Performing experiments is tremendously difficult! Imagine attaching probes to individual neurons, or reducing interfering factors when training rats to react to certain stimuli. The most "out-there" method for me was optogenetics, the use of laser light to stimulate neurons in genetically-modified mice.
  • Dr. Anthony Movshon's talk, where he provided evidence that using anything simpler than a primate does not produce compelling enough results. He suggested that researchers investigate mouse lemurs instead, a very small primate with a sophisticated brain.
  • The discussion during the "Large-Scale Neuronal Simulations" workshop, where Dr. Chris Eliasmith introduced me to the philosophical idea of a "hyper-Turing machine" and directed me to the work of Dr. Marcin Miłkowski, particularly his soon-to-be released "Explaining the Computational Mind". Having been in Computer Science for so long, it had never even occurred to me to study any "non-Turing" computational models!

LFManip in Your Browser(January 5, 2013)


Here, a version of LFmanip that doesn’t require an executable! As described in the Intel proposal.

Wheat Information Project (WHIP)(November 18, 2012)

Live site:

Sometime during March 2011, I was asked if I could do some work on a web view of NDSU's varietal trial publications, which are sent out yearly to farmers and available as PDFs. I threw together a demo of search results with some jQuery polish using static data, and we were off!

Like all great projects, we needed some data entry. We hired a student to do this data entry (copying rows of numbers from PDFs), bullet dodged. Now that we were actually using a database, I tried to rope in our department’s systems administrator, but he wisely resisted. He did highly recommend the Django project (Python), and that’s how our framework was chosen. Since we couldn’t get the sysadmin, we hired an additional web developer, and later an artist.

After some steady progress, and talks with the administrators of, we were ready to put the site up live. By this point, the other developers had finished their tasks and gone back to school, and I was doing the work during my off-hours.

After a while, we decided to tweak the search results, but found the legacy code (only a year old!) to be getting in the way. Since summer 2012 was approaching, we decided to hire a handful of undergraduates to help with the polish and the rewrite of the search result backends.

With the fall semester underway, we're back to a sole developer, performing a few bug fixes now and again, with the exception of a new method for calculating our LSD results, using R behind the scenes as we originally intended… turns out the Python LSD I wrote wasn't as robust as we'd like.


  • Wiersma, Ransom, Kariluoma. “Wheat Information Project (WHIP): Another Web-Based Variety Trials Database.” ASA Section: Biometry and Statistical Computing. ASA International Annual Meeting, San Antonio TX, October 16-19 2011.
  • Wiersma. Conference talk. Wheat Production Forum, Moorhead MN, October 25-26 2011.
    Conference Program.
  • Ransom, Wiersma. Break-out session. Prairie Grains Conference, Grand Forks ND, December 7-8 2011.
  • Ransom. Casselton ND. 2011.
  • Wiersma. NWREC. December 2011.
  • Wiersma. Prairie Grains Conference, Grand Forks ND, December 12-13 2012.

Seminar Presentation: Intelligent Systems(October 14, 2012)

Download: Presentation (600K)
LaTeX Source Files (300K)

I presented the paper “Comparing forgetting algorithms for artificial episodic memory systems” (citation below) during my seminar on intelligent systems.

In the paper, the authors describe a simulation where an agent must remember transaction histories in order to perform well, and such memory space is limited. They then compare a number of processes for forgetting (deleting) previous transactions in order to maximize the agent’s performance.

Nuxoll, Andrew, Dan Tecuci, Wan Ching Ho, and Ningxuan Wang. “Comparing forgetting algorithms for artificial episodic memory systems.” In Proc. of the Symposium on Human Memory for Artificial Agents. AISB, pp. 14-20. 2010.
Original Paper

Intel Proposal(August 13, 2012)

Download: Proposal (6M)
LaTeX Source (300K)

Intel passed on this one, 'tis a shame. I planned to eventually implement most of what's in there (minus the Kinect bit) on my own time, and this would have been a great stimulus! Oh well, file it away.

Unipolar, Optoisolated Stepper Driver(July 31, 2012)

Download: PCB (svg) (300K)
Arduino Code (3K)

Schematic of the unipolar optoisolated stepper driver

Assembled PCB of the unipolar optoisolated stepper driver

Pictured: An optoisolated unipolar stepper driver. (The computer is separated from the motor's power supply by an air gap; the control signals are transmitted over an LED / photodiode pair.)

There is one caveat: if the power source is high-current, the "thicker" traces will need a layer of solder tinned onto them:
Underside of the PCB of the unipolar optoisolated stepper driver, showing the tinned traces for high current

I’ve also written a test suite for the Arduino, enjoy:

int LED = 13;
int A_1 = 8;
int A_2 = 9;
int B_1 = 10;
int B_2 = 11;
int DELAY_MS = 200;

void single_step(int cycles)
{
  // Apply power to one winding at a time.
  // 1a 1000
  // 1b 0010
  // 2a 0100
  // 2b 0001
  for (int i=0; i < cycles; i++)
  {
    digitalWrite(A_1, HIGH);
    delay(DELAY_MS);
    digitalWrite(A_1, LOW);
    digitalWrite(A_2, HIGH);
    delay(DELAY_MS);
    digitalWrite(A_2, LOW);
    digitalWrite(B_1, HIGH);
    delay(DELAY_MS);
    digitalWrite(B_1, LOW);
    digitalWrite(B_2, HIGH);
    delay(DELAY_MS);
    digitalWrite(B_2, LOW);
  }
}

void double_step(int cycles)
{
  // Apply power to two windings at a time.
  // ~= 1.4x torque over single-stepping.
  // 1a 1100
  // 1b 0011
  // 2a 0110
  // 2b 1001
  digitalWrite(B_2, HIGH);
  for (int i=0; i < cycles; i++)
  {
    digitalWrite(A_1, HIGH);
    delay(DELAY_MS);
    digitalWrite(B_2, LOW);
    digitalWrite(A_2, HIGH);
    delay(DELAY_MS);
    digitalWrite(A_1, LOW);
    digitalWrite(B_1, HIGH);
    delay(DELAY_MS);
    digitalWrite(A_2, LOW);
    digitalWrite(B_2, HIGH);
    delay(DELAY_MS);
    digitalWrite(B_1, LOW);
  }
  digitalWrite(B_2, LOW);
}

void half_step(int cycles)
{
  // Combine single and double stepping to take half-steps.
  // ~= 2x angular resolution over single or double stepping.
  // 1a 11000001
  // 1b 00011100
  // 2a 01110000
  // 2b 00000111
  digitalWrite(A_1, HIGH);
  for (int i=0; i < cycles; i++)
  {
    digitalWrite(A_2, HIGH);
    delay(DELAY_MS);
    digitalWrite(A_1, LOW);
    delay(DELAY_MS);
    digitalWrite(B_1, HIGH);
    delay(DELAY_MS);
    digitalWrite(A_2, LOW);
    delay(DELAY_MS);
    digitalWrite(B_2, HIGH);
    delay(DELAY_MS);
    digitalWrite(B_1, LOW);
    delay(DELAY_MS);
    digitalWrite(A_1, HIGH);
    delay(DELAY_MS);
    digitalWrite(B_2, LOW);
    delay(DELAY_MS);
  }
  digitalWrite(A_1, LOW);
}

void blink_led(int times)
{
  for (int i=0; i < times; i++)
  {
    digitalWrite(LED, HIGH);
    delay(DELAY_MS);
    digitalWrite(LED, LOW);
    delay(DELAY_MS);
  }
}

void setup()
{
  pinMode(LED, OUTPUT);
  digitalWrite(LED, LOW);
  pinMode(A_1, OUTPUT);
  digitalWrite(A_1, LOW);
  pinMode(A_2, OUTPUT);
  digitalWrite(A_2, LOW);
  pinMode(B_1, OUTPUT);
  digitalWrite(B_1, LOW);
  pinMode(B_2, OUTPUT);
  digitalWrite(B_2, LOW);
}

void loop()
{
  half_step(6 / 2);
}

erfinv (erf, erfc) in Javascript(May 30, 2012)

I wanted to try out the calculations from the Python LSD implementation in a browser instead, and was frustrated by the lack of an erfinv function! So, I wrote one:

// Matti Kariluoma May 2012 
function erfc(x)
{
	var z = Math.abs(x)
	var t = 1.0 / (0.5 * z + 1.0)
	var a1 = t * 0.17087277 + -0.82215223
	var a2 = t * a1 + 1.48851587
	var a3 = t * a2 + -1.13520398
	var a4 = t * a3 + 0.27886807
	var a5 = t * a4 + -0.18628806
	var a6 = t * a5 + 0.09678418
	var a7 = t * a6 + 0.37409196
	var a8 = t * a7 + 1.00002368
	var a9 = t * a8
	var a10 = -z * z - 1.26551223 + a9
	var a = t * Math.exp(a10)
	if (x < 0.0)
		a = 2.0 - a
	return a
}

function erf(x)
{
	return 1.0 - erfc(x)
}

function erfinv(y)
{
	var z, x
	if (y < -1.0 || y > 1.0)
	{
		alert("input out of range!")
		return 0
	}
	if (y == -1.0)
		return Number.NEGATIVE_INFINITY
	else if (y == 1.0)
		return Number.POSITIVE_INFINITY
	else if (y < -0.7)
	{
		var z1 = (1.0 + y) / 2.0
		var z2 = Math.log(z1)
		z = Math.sqrt(-z2)
		var x1 = 1.641345311 * z + 3.429567803
		var x2 = x1 * z + -1.624906493
		var x3 = x2 * z + -1.970840454
		var x4 = 1.637067800 * z + 3.543889200
		var x5 = x4 * z + 1.0
		x = -x3 / x5 // note: negative
	}
	else if (y < 0.7)
	{
		z = y * y
		var x1 = -0.140543331 * z + 0.914624893
		var x2 = x1 * z + -1.645349621
		var x3 = x2 * z + 0.886226899
		var x4 = 0.012229801 * z + -0.329097515
		var x5 = x4 * z + 1.442710462
		var x6 = x5 * z + -2.118377725
		var x7 = x6 * z + 1.0
		x = y * x3 / x7
	}
	else // 0.7 <= y < 1.0
	{
		var z1 = (1.0 - y) / 2.0
		var z2 = Math.log(z1)
		z = Math.sqrt(-z2)
		var x1 = 1.641345311 * z + 3.429567803
		var x2 = x1 * z + -1.624906493
		var x3 = x2 * z + -1.970840454
		var x4 = 1.637067800 * z + 3.543889200
		var x5 = x4 * z + 1.0
		x = x3 / x5 // note: positive
	}
	// refine with two steps of Newton's method
	x = x - (erf(x) - y) / (2.0/Math.sqrt(Math.PI) * Math.exp(-x*x))
	x = x - (erf(x) - y) / (2.0/Math.sqrt(Math.PI) * Math.exp(-x*x))
	return x
}

SMS++(May 12, 2012)

Download: Report (1M)
Code (6K)
IJITCS Paper (300K)

Update 21 Apr 2013: We’ve been published in the International Journal of Information Technology and Computer Science (IJITCS)!

Juan Li, Justin Anderson, Matti Kariluoma, Kailash Joshi, Prateek Rajan, Pubudu Wijeyaratne, "iText – SMS-based Web Services for Low-end Cell Phones", IJITCS, vol. 5, no. 5, pp. 22-28, 2013. DOI: 10.5815/ijitcs.2013.05.03

Also available on GitHub:

For my computer networks class, I convinced our group to do a project involving the SMS protocol. We agreed to do some sort of SMS service, where you'd message a particular number or email address (it's not obvious that you can message an email address!) using a small command language, and access web services such as email, calendar, etc.

We ended up with web search and email (through Google's OAuth and a registration website we cooked up), and managed to do a live demo in class during our report. We used a Nokia N900 and two webservers, which aren't live anymore unfortunately, shit's expensive.

Seminar Presentation: Educational Media(January 29, 2012)

Download: Presentation (300K)
LaTeX Source Files (60K)

I presented the paper “Empirical evidence for the existence and uses of metacognition in computer science problem solving”, citation below. The paper details the result of a small (n=10) experiment into the methods that computer science undergraduates use during domain problem solving.

Parham, Jennifer, Leo Gugerty, and D. E. Stevenson. “Empirical evidence for the existence and uses of metacognition in computer science problem solving.” In Proceedings of the 41st ACM technical symposium on Computer science education, pp. 416-420. ACM, 2010.

Least Significant Difference (LSD) in Python(October 4, 2011)

The Least Significant Difference (LSD) is a measure of how much of a difference between means must be observed before one can say the means are significantly different. e.g. the means

  • 10.2
  • 10.5
  • 11.0

are not significantly different from one another if their LSD is 1.0, but they are significantly different if their LSD is 0.1.
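The code below ultimately computes LSD = t × sqrt(2 · MSE / k). For intuition, here is that final formula by itself, with illustrative numbers (the t-value is taken from a table rather than computed):

```python
from math import sqrt

t_value = 2.0452   # two-sided 5% critical t for 29 degrees of freedom
mse = 0.5          # mean squared error of the model (illustrative)
k = 10             # observations per treatment
lsd = t_value * sqrt(2.0 * mse / k)
# two treatment means further apart than `lsd` (~0.65 here)
# differ significantly at the 5% level
```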

# coding=utf8
# Python routines to calculate the LSD of a balanced data set.
# Taken line-by-line from the agricolae project
# ( ,
# for R 
# (
# Matti Kariluoma Sep 2011 

from math import sqrt, pi, cos, sin, exp
from scipy.special import erfinv

def qnorm(probability):
	"""A reimplementation of R's qnorm() function.
	This function calculates the quantile function of the normal
	distribution.
	Required is the erfinv() function, the inverse error function.
	"""
	if probability > 1 or probability <= 0:
		raise BaseException # TODO: raise a standard/helpful error
	else:
		return sqrt(2) * erfinv(2*probability - 1)
def qt(probability, degrees_of_freedom):
	"""A reimplementation of R's qt() function.
	This function calculates the quantile function of the student's t
	distribution.
	This algorithm has been taken (line-by-line) from Hill, G. W. (1970)
	Algorithm 396: Student's t-quantiles. Communications of the ACM, 
	13(10), 619-620.
	Currently unimplemented are the improvements to Algorithm 396 from
	Hill, G. W. (1981) Remark on Algorithm 396, ACM Transactions on 
	Mathematical Software, 7, 250-1.
	"""
	n = degrees_of_freedom
	P = probability
	t = 0
	if (n < 1 or P > 1.0 or P <= 0.0):
		raise BaseException #TODO: raise a standard/helpful error
	elif (n == 2):
		t = sqrt(2.0/(P*(2.0-P)) - 2.0)
	elif (n == 1):
		P = P * pi/2
		t = cos(P)/sin(P)
	else:
		a = 1.0/(n-0.5)
		b = 48.0/(a**2.0)
		c = (((20700.0*a)/b - 98.0)*a - 16.0)*a + 96.36
		d = ((94.5/(b+c) - 3.0)/b + 1.0)*sqrt((a*pi)/2.0)*float(n)
		x = d*P
		y = x**(2.0/float(n))
		if (y > 0.05 + a):
			x = qnorm(P*0.5)
			y = x**2.0
			if (n < 5):
				c = c + 0.3*(float(n)-4.5)*(x+0.6)

			#c = (((0.05*d*x-5.0)*x-7.0)*x-2.0)*x+b+c
			c1 = (0.05*d*x) - 5.0
			c2 = c1*x - 7.0
			c3 = c2*x - 2.0
			c4 = c3*x + b + c
			c = c4
			#y = (((((0.4*y+6.3)*y+36.0)*y+94.5)/c-y-3.0)/b+1.0)*x
			y1 = (0.4*y+6.3)*y + 36.0
			y2 = y1*y + 94.5
			y3 = y2/c - y - 3.0
			y4 = y3/b + 1.0
			y5 = y4*x
			y = y5
			y = a*(y**2.0)
			if (y > 0.002):
				y = exp(y) - 1.0
			else:
				y = 0.5*(y**2.0) + y
		else:
			#y = ((1.0/(((float(n)+6.0)/(float(n)*y)-0.089*d-0.822)
			#    *(float(n)+2.0)*3.0)+0.5/(float(n)+4.0))*y-1.0)
			#    *(float(n)+1.0)/(float(n)+2.0)+1.0/y
			y1 = float(n)+6.0
			y2 = y1/(float(n)*y)
			y3 = y2 - 0.089*d - 0.822
			y4 = y3 * (float(n)+2.0) * 3.0
			y5 = 1.0 / y4
			y6 = y5 + 0.5/(float(n)+4.0)
			y7 = y6*y - 1.0
			y8 = y7 * (float(n)+1.0)
			y9 = y8 / (float(n)+2.0)
			y10 = y9 + 1.0/y
			y = y10
		t = sqrt(float(n)*y)
	return t

def LSD(response_to_treatments, probability):
	"""A stripped-down reimplementation of LSD.test from the agricolae
	package. (
	Calculates the Least Significant Difference of a multiple comparisons
	trial, over a balanced dataset.
	"""
	trt = response_to_treatments
	#model = aov(y~trt)
	#df = df.residual(model)
	# df is the residual Degrees of Freedom
	# n are factors, k is responses per factor (treatments)
	n = len(trt) 
	k = len(trt[0]) # == len(trt[1]) == ... == len(trt[n]) iff we have a balanced design
	degrees_freedom_of_error = (n-1)*(k-1)
	treatment_means = {}
	for i in range(n): # n == len(trt)
		total = 0.0
		for j in range(k):
			total += float(trt[i][j])
		treatment_means[i] = total/k
	block_means = {}
	for j in range(k):
		total = 0.0
		for i in range(n):
			total += float(trt[i][j])
		block_means[j] = total/n
	grand_mean = sum(treatment_means.values()) / float(n)
	# TODO: what is the difference between type I and type III sums of squares?
	SSE = 0.0
	for i in range(n): # n == len(trt)
		for j in range(k):
			SSE += (float(trt[i][j]) - treatment_means[i] - block_means[j] + grand_mean)**2.0
	#print "SSE: %f\n" % (SSE)
	mean_squares_of_error = SSE / degrees_freedom_of_error
	#print "MSE: %f\n" % (mean_squares_of_error)
	Tprob = qt(probability, degrees_freedom_of_error)
	#print "t-value: %f\n" % (Tprob)
	LSD = Tprob * sqrt(2.0 * mean_squares_of_error / k)

	return LSD
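To make the steps above concrete, here is a small worked example on a hypothetical balanced trial (3 treatments in 4 blocks, numbers invented for illustration). The critical t-value is taken from a table rather than computed with qt, so the sketch stands alone:

```python
from math import sqrt

# hypothetical balanced trial: 3 treatments (rows) x 4 blocks (columns)
trt = [
	[10.0, 12.0, 11.0, 13.0],
	[14.0, 15.0, 13.0, 16.0],
	[ 9.0, 10.0,  8.0, 11.0],
]
n, k = len(trt), len(trt[0])
df_error = (n - 1) * (k - 1)  # 6 residual degrees of freedom

treatment_means = [sum(row) / k for row in trt]
block_means = [sum(trt[i][j] for i in range(n)) / n for j in range(k)]
grand_mean = sum(treatment_means) / n

# residual sum of squares for a randomized complete block design
SSE = sum(
	(trt[i][j] - treatment_means[i] - block_means[j] + grand_mean) ** 2
	for i in range(n) for j in range(k)
)
MSE = SSE / df_error

t_crit = 2.447  # tabled two-tailed t for alpha = 0.05, df = 6
LSD = t_crit * sqrt(2.0 * MSE / k)
# any two treatment means that differ by more than LSD are significantly different
```

With these numbers the treatment means are 11.5, 14.5, and 9.5, so every pair of treatments differs by more than the computed LSD.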

Hill, G. W. (1970) Algorithm 396: Student’s t-quantiles. Communications of the ACM, 13(10), 619-620.
Hill, G. W. (1981) Remark on Algorithm 396, ACM Transactions on Mathematical Software, 7, 250-1.

RenameAll.exe in Python(June 1, 2011)

Download: (1K)

Someone wanted to use the tools I wrote for the cardboard bookscanner on a non-windows platform… et voilà, the same functionality in Python (2.7.3):

"""
Matti Kariluoma Jun 2011

A python rendition of RenameAll.exe from the Cardboard Bookscanner
project, Jan 2010.
This script loads a directory of images, then renames them all such
that the first half and the second half are interspersed. The input
directory is expected to contain 2n or 2n+1 images, where the first n
images are of the right-hand side of a book, and the last n are the
left-hand side of the book.
"""

import os, sys, glob
import cv

images_filename = sorted(glob.glob1(sys.argv[1], "*.[jJ][pP][gG]")
	+ glob.glob1(sys.argv[1], "*.[jJ][pP][eE][gG]"))


NUM_PIC = len(images_filename)

n = 0

for filename in images_filename:
	image = cv.LoadImage(sys.argv[1]+"/"+filename)
	if (n < NUM_PIC / 2):
		#sprintf(temp, "%06d-a.jpg", n+1)
		cv.SaveImage(sys.argv[1]+"/%06d-a.jpg" % (n+1), image)
	else:
		#sprintf(temp, "%06d-b.jpg", n+1-NUM_PIC/2)
		cv.SaveImage(sys.argv[1]+"/%06d-b.jpg" % (n+1-NUM_PIC/2), image)
	n += 1

Truth be told, I only wrote the tools so windows users could play along too; I imagined non-windows users would write some shell one-liners.

MICS 2011(April 9, 2011)

A colleague recently presented a paper at the MICS 2011 Conference in Duluth, MN. Since I was the first to google the knapsack problem for him (among other minor things), I became second author!

Laser Overdrive(April 8, 2011)

Download: EMC Configuration (5K)
Sample gcode (1K)
Gcode conversion script (3K)
Gcode concatenation script (1K)

Apparently, owning a laser cutter opens up a whole new world when it comes to building things. I just couldn’t resist the allure!
The warning label on the back of a laser cutter

I purchased a 40W engraver/cutter from Full Spectrum Engineering during spring break (Mar 11-18). It finally arrived on the 6th of April. I stayed up until the wee hours of the morning configuring, aligning, and getting a few test cuts done:
A laser cutter cutting out a piece of 3mm plywood

I do not regret opting out of the Windows-only USB printer interface for the machine; I’ve gotten great results using only the parallel port, EMC2, and Inkscape. Configuration files for the 40W FSE Laser under EMC2.

The only hiccup so far is the manual massaging of the gcode that the gcodetools Inkscape extension spits out, but that should be easy to remedy. A sample massaged gcode file.
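The massaging amounts to simple text filtering. Here’s a minimal sketch of the idea (the function name and the exact words stripped are my own illustration, not the actual conversion script linked above): drop the parenthesized comments and tool/spindle words, plus blank lines:

```python
import re

def massage_gcode(lines):
	"""Hypothetical clean-up pass over gcodetools output."""
	out = []
	for line in lines:
		line = re.sub(r"\([^)]*\)", "", line)    # strip (comments)
		line = re.sub(r"\b[TS]\d+\b", "", line)  # strip tool/speed words
		line = line.strip()
		if line:
			out.append(line)
	return out

print(massage_gcode(["G1 X10.0 Y5.0 (engrave)", "T1 M6", "G0 Z2.0"]))
# -> ['G1 X10.0 Y5.0', 'M6', 'G0 Z2.0']
```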

An Overdrive Circuit(February 22, 2011)

Last summer I slapped together a workbench project into an enclosure and gave it to a friend; he liked how it sounded. Eventually it broke down, so I offered to design a PCB to replace the circuitry.

The Overdrive circuit, next to the unit it replaces:
An Overdrive circuit populating a PCB
Design files: PCB, Silk

Wireless DDR Pad for the PS3(January 20, 2011)

A friend needed help hooking up his PS3 USB DDR pad to his PS3 while it was emulating a PS2 game.

The original pad used a USB connection, but was not a ‘Sony blessed’ interface. The first step was to acquire a PS3 controller to sacrifice. We removed the plastic shell and plastic interface board, then I cooked up the following PCB to interface with it:
A PCB To interface with a PS3 controller
Design files: PCB, Silkscreen

Next, we cracked the USB Pad, removed some extraneous plastic and circuitry, and attached our “controller-within-a-controller”:
A PS3 controller connected to a USB dance pad
The completed PS3 controller within a USB dance pad

cURL + gzip (Beating a Dead Horse)(October 14, 2010)

Download: Source (11K)

I wasn’t entirely satisfied with my previous attempt at HTTP/POST gzip-compression, so I wrote a more elegant solution (no XMLRPC libraries this time).

A function to perform gzip-compression on a string:

// see zlib's compress.c:compress() function for the original of this
// modified function
// dest      : destination buffer, malloc'd already
// destLen   : size of the malloc'd buffer
// source    : the uncompressed text
// sourceLen : the size of the uncompressed text
// this function returns an error code from the zlib library.
// upon return, dest contains the compressed output, and destLen 
// contains the size of dest.
int string_gzip (char *dest, unsigned long *destLen, char *source, unsigned long sourceLen)
{
  char *header;
  z_stream stream;
  int err;

  // write a minimal 10-byte gzip header by hand
  sprintf(dest, "%c%c%c%c%c%c%c%c%c%c", gz_magic[0], gz_magic[1],
          Z_DEFLATED, 0       , 0,0,0,0      , 0        , OS_CODE);
        //          , flags   , time         , xflags   ,        );
  header = dest;
  dest = &dest[10]; // skip ahead of the header
  *destLen -= 10;   // update our available length

  stream.next_in = (Bytef *)source;
  stream.avail_in = (uInt)sourceLen;
#ifdef MAXSEG_64K
  /* Check for source > 64K on 16-bit machine: */
  if ((uLong)stream.avail_in != sourceLen)
    return Z_BUF_ERROR;
#endif
  stream.next_out = (Bytef *)dest;
  stream.avail_out = (uInt)*destLen;
  if ((uLong)stream.avail_out != *destLen)
    return Z_BUF_ERROR;

  stream.zalloc = Z_NULL;
  stream.zfree = Z_NULL;
  stream.opaque = Z_NULL;

  // a negative windowBits instructs zlib not to write a zlib header
  err = deflateInit2( &stream, Z_DEFAULT_COMPRESSION, Z_DEFLATED,
                      -MAX_WBITS, DEF_MEM_LEVEL, Z_DEFAULT_STRATEGY );
  if (err != Z_OK)
    return err;

  err = deflate(&stream, Z_FINISH); // Z_FINISH, since we have no more
                                    // input coming
  if (err != Z_STREAM_END) {
    deflateEnd(&stream);
    return err == Z_OK ? Z_BUF_ERROR : err;
  }
  *destLen = stream.total_out;

  err = deflateEnd(&stream);
  dest = header;   // put the header back on
  *destLen += 10;  // update length of our data block
  return err;
}

Using this function to make a gzip-compressed HTTP/POST using libcURL:

    CURL *curl;
    CURLcode res;
    char contentLengthBuf[64];

    char *data = readXMLInput();
    unsigned long dataLength = strlen(data);
    unsigned long compressedDataLength = dataLength*1.1 + 22; // see zlib's compress.c
    char *compressedData = calloc(compressedDataLength, 1);
    string_gzip(compressedData, &compressedDataLength, (char*)data, dataLength);

    curl = curl_easy_init();
    struct curl_slist *header_list = NULL;

    curl_easy_setopt(curl, CURLOPT_URL, "");
    // set the "Accept-Encoding: " header
    curl_easy_setopt(curl, CURLOPT_ENCODING, "gzip");
    // set the "Content-Encoding: ", "Content-Length: ", "Content-Type: " headers
    header_list = curl_slist_append(header_list, "Content-Encoding: gzip");
    header_list = curl_slist_append(header_list, "Content-Type: text/xml");
    sprintf(contentLengthBuf, "Content-Length: %lu", compressedDataLength);
    header_list = curl_slist_append(header_list, contentLengthBuf);
    curl_easy_setopt(curl, CURLOPT_HTTPHEADER, header_list);

    // Now specify the POST data
    curl_easy_setopt(curl, CURLOPT_POSTFIELDS, compressedData);
    curl_easy_setopt(curl, CURLOPT_POSTFIELDSIZE, compressedDataLength);

    // Perform request
    res = curl_easy_perform(curl);

    curl_slist_free_all(header_list);
    curl_easy_cleanup(curl);

XML-RPC; Gzip(September 21, 2010)

Download: Source (8K)

Above is the source code for a simple* extension of ulxmlrpcpp that allows Gzip compression of XML-RPC.

The simplest* method to perform a Remote Procedure Call (RPC) from one language to another is XML-RPC.

The XML-RPC specification is dead simple; using XML, construct a procedure call, with the parameters of the call as children. Then, send this XML document using HTTP/POST to the resource server, where the request is processed and an XML response is sent back containing the procedure’s result.
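For a concrete picture of the wire format, Python 3’s standard library can generate it. Here’s what a call to a hypothetical add(2, 3) method looks like (the method name is made up for illustration):

```python
import xmlrpc.client

# serialize a call to a hypothetical add(2, 3) method
body = xmlrpc.client.dumps((2, 3), methodname="add")
print(body)  # a <methodCall> document with two <int> params

# the server-side decode round-trips it
params, method = xmlrpc.client.loads(body)
assert method == "add" and params == (2, 3)
```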

The problem I’m concerned with: using a Java Server (JavaMOO), construct an XML-RPC server that can communicate with a client written in C/C++, and do so efficiently enough to run a real-time virtual world.

This hardly seems like anything to write home about; in fact, I had no problem finding Java and C++ implementations of XML-RPC, and was performing RPCs within the hour.

The drama came when I considered efficiency, and hence compression. The XML-RPC specification is unclear about (if not outright against) compressing the XML document, which is a shame. Gzip compression is available as a vendor extension for the Java implementation I was using (Apache’s ws-xmlrpc). I found no such capabilities in any C/C++ implementations, so I wrote one up.

*: Nothing is ever simple.

JavaMOO, javamoo-legacy(August 28, 2010)

I’ve been tasked with some basic research.

Now that we have two versions of JavaMOO (backstory), it’d be nice to know if one is more capable than the other.

javamoo-legacy has the benefit of the doubt: we know it works in practice, and we have many usage tests backing up the majority of its functionality. JavaMOO has the benefit of being a professional project: plugins, post-build-time configuration, and all the trappings of a polished product.

Here’s a plan of action:

  • Familiarize with JavaMOO, compile
  • Write a client that communicates with both servers (RMI)
  • Perform some simple tasks like object creation, database queries, etc.
  • Draw conclusions, reformulate & reiterate experiment

A Simple Game Client(August 14, 2010)

Download: Source (2M)

Screenshot of a simple ogre client

Nothing too exciting here. A C++/Ogre/Blender effort; controls are w,a,s,d for movement, ‘b’ to drop a box, and ‘r’ to load the auxiliary map texture.

A Salaryman(August 8, 2010)

I’ve been hired by WoWiWe Instruction Co.

The job description:

Using the existing JavaMOO server, build a “World Development Tool” to aid content experts (grad students) in the rapid development of educational games.

The project must be delivered before December 31st, 2010, when my employment terminates. This sort of finite-length employment makes a lot of sense to me; it allows me to plan for the future (graduate school) while building my job experience. Much like an internship, but better paying.

A Simple Equalization Circuit(July 8, 2010)

A friend’s birthday is coming up, and two of us have gotten together to give him a great present.

A guitar with two on-board speakers

A guitar’s the gift, but we can’t just give the birthday boy a used guitar! We decided to put a speaker into the body of the guitar, but in testing (before drilling) we found that the bass strings overtook the treble. We had two choices: a simple equalizer, or two speakers.

After we chose to use two speakers, we tried a few things to get the desired effect. The simplest approach, the approach used in mass-produced dual-speaker systems (think plastic boombox), is to use a filtering capacitor for the bass speaker, and feed the plain signal into the treble speaker.

This approach was dismissed for two reasons:

  • We needed to decrease the volume of the bass in both speakers
  • We were halfway towards an op-amp buffered equalizer circuit already, as we needed to amplify the guitar to drive the speakers

Our second point became moot in the end, as we ended up using three op-amps to make the 1x buffer/ 2x equalizer, and an additional two op-amps to drive the speakers themselves. We could have used the fourth op-amp as an adder and fed the output to one speaker; however, the clarity would have suffered, so we went ahead with a two-speaker design.

An equalization circuit employing a TL074 and two LM386 ICs
The equalization circuit implemented

Software Testing, A Story(June 11, 2010)

Sometimes the best way to test a piece of software is to throw a bunch of kids at it.

Some background: I didn’t really graduate. I was missing one class, Database studies. Dr. Brian Slator was kind enough to offer me a few alternatives in order to graduate on time, and I chose to work on the development of a graduate’s project from x years ago. The graduate, Ben Dischinger, completed his Master’s Defense and delivered a newer version of his software, JavaMOO, earlier this year.

At the time, and still, our development was in the older version of JavaMOO (denote javamoo-legacy). JavaMOO is a server architecture that allows clients to connect via Java’s Remote Method Invocation (RMI) interface to a database (Derby, MySQL) on the server. JavaMOO is intended for the development of educational games, and Dr. Slator has one such (simple) game with a few hard-to-track bugs.

Since the server is threaded to allow higher user-throughput, a database concurrency error was suspected to be the cause, hence the credit substitute for the Database requirement for graduation. I brought myself up-to-speed throughout last semester, and had cracked a few unrelated bugs while testing out the system.

At semester end, Dr. Slator had offered extra credit to one of his classes for volunteers to come into the lab and stress-test the server, to check our progress and help reveal any new bugs. We’d made steady progress as usual since the semester end, and yesterday we had another round of testing. Dr. Slator had an opportunity to give a group of high-school-age students a one-hour demonstration of our lab’s work, and we jumped on the opportunity to show off, and test, our educational game.

The kids did a great job, they managed to crash all three of our servers.

Mill Resurrection(June 9, 2010)

Dan‘s gone! He left the 18th of last month, but not before dropping off a present:

One trashed CNC Mill

A “D&M 4 Machining Center,” an educational model. This last week I’ve been cleaning the mill, and imagine my surprise when I pop the back to find this:

Modular circuit board design!?

Although the front controls and computer control port were toasted, the power supply was functioning, and after sending a 5V probe into the logic input of one of the motor controllers, we find that the Mill is, in fact, in operating condition! It’s now “feasible” to attempt some repair:

  • New controls:

X,Y, and Z momentary toggle switches

  • New circuit board to translate the controls to 5V logic (using 555 timers, two per axis, each pair allows two duty cycles per axis):

Circuit to send logic +5V or +0V to three controller boards

On an up-note, while researching the ICs on the motor controller boards, I found that they have a boolean state for half- or full-stepping (!). The makers of the controller boards even put a small rocker switch on-board to toggle the stepping-mode:

8-dip toggle switch/ rocker block

Graduation(May 15, 2010)

@ 10:00 May 15th 2010,

I confer upon you the title “Bachelor of Arts.”

This means I have time to work on my own projects now, right?

Sidetrack: Edge-finding and Line Representations(March 2, 2010)

So, you want to take two images, slightly offset in space from one another, and combine them into a larger image.

This is an old problem, and seems “solved” if you consider the Photoshop plugins, etc. that produce such panoramas. Most don’t realize that these tools are less than perfect; for instance, the final product can be greatly improved by removing the lens distortion from each image before attempting a panoramic merge.

But we don’t want to make a panorama with our images, we just want to do all the steps (correctly) up to the point where we’d try and combine them, i.e. perform translations/rotations/scaling to make the images lie on the same plane, so that each image’s only difference is being offset in that plane (only x and y offset, no z offset, no rotations in space, etc. )

Consider a model of the Large Camera Array Dan and I have been using. Our intention is to have the cameras lined up so that each of the camera’s sensors lie in the same plane, so that the only information being added by each adjacent camera is an image x units to the left. This is impractical in reality: each camera is slightly offset from the others, some even have tilted lenses causing them to look slightly down, etc. as we’ve observed in the data we’ve collected.

In order to find the transformations that will rotate/scale/translate the images so that they all lie in the same plane, we must make a set of assumptions about our problem to have any hope of implementing a solution. (The alternative being trying all such rotations in all three axis of freedom, etc. until the images are planar, an algorithm that would take prohibitively long [years] to complete.)

We can assume that all such rotations take place from (have origin at) the center of the image, since our images were taken with a camera and lens system. We can also assume the maximum displacements in the x,y,z plane, based on measurements of our physical system. We can additionally put maximum bounds on our rotations, but we may not want to in order to allow for flexibility in camera orientations (say landscape, portrait mixes).
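Under these assumptions, the per-image transform is a rotation and scale about the image center plus a translation. A minimal sketch of applying such a transform to a single point (pure Python, my own illustration, not lfmanip code):

```python
from math import cos, sin, radians

def transform_point(p, center, theta_deg, scale, tx, ty):
	"""Rotate by theta and scale about `center`, then translate by (tx, ty)."""
	th = radians(theta_deg)
	# move the point into center-relative coordinates
	x, y = p[0] - center[0], p[1] - center[1]
	# rotate, scale, move back, then translate
	xr = scale * (x * cos(th) - y * sin(th)) + center[0] + tx
	yr = scale * (x * sin(th) + y * cos(th)) + center[1] + ty
	return (xr, yr)
```

Finding the (theta, scale, tx, ty) per image that makes all images planar is then a bounded search over these four parameters, rather than an unbounded search over all rigid motions in space.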

The goal all of this is working toward is to be able to calibrate an array of cameras, or a collection of images from one camera moved through space between shots, so as to put their output on the same plane. In the case of an array of cameras, this calibration need only be computed once (or every once in a while) in order to better fit the model of being perfectly planar.

If such an algorithm is successful, it would enable a hastily constructed or ad-hoc array to provide useful Light Field representations, and improve the output for constructed arrays as well.

Currently, our lfmanip only concerns itself with translations in the image’s x,y planes (not the real-world x,y plane, mind you). If we knew the rotations and scaling between each image, we could line up our images in a much better way.

To tackle this problem, I’ve decided to attempt to reduce each image to a line representation, a vector image. Given each image as a set of lines, it will be remarkably easier to compare two sets of lines and try and decide which set of transformations will make them most similar.

I’ll be graduating with my Bachelor’s of Arts in Mathematics/Computer Science this spring, so I’ve decided to tackle the problem of mathematically describing the “optimal” way to prepare an image for edge-detection for my capstone research project. The hope is that by working through the nit-picky parts of this process, I’ll be able to extract a set of lines from a given image, particularly those lines that are most helpful in deciding how an image has been rotated/scaled/translated with respect to a previous image.

It’s likely such a best approach does not exist or is computationally infeasible. Whatever the case, I’m optimistic that by slowly adding in more assumptions about the problem, I’ll be able to narrow the search space to the point where an algorithm can be written to do the heavy lifting.

lfmanip w/ Threading(January 31, 2010)

Download: Windows (5M)
Linux 32 64 (2M)
Mac (15M)

This release fixes some issues with images in portrait mode, and ups the maximum number of allowed images to 500.

I’m still cranking away on the correction of rotations and almost-perfect matching. It turns out almost-perfect matching is just a matter of rewriting the way lfmanip stores/compares the already collected feature points, but the rotation issues have turned into an ugly monster.

I’ve enlisted the help of a mathematics professor to help with the rotation issues, and he assures me that it will not be a trivial problem to solve.

Cardboard Book Scanner(January 11, 2010)

Dan and I tried to build a cheap book scanner. Enjoy the Cardboard Book Scanner.

A book scanner made of cardboard and duct tape

Next Steps / lfmanip(January 9, 2010)

Download: Windows (5M)
Linux 32 64 (2M)
Mac (15M)

This is an updated version of lftextures. lfmanip makes use of the OpenCV libraries and the SURF algorithm to generate feature points, which are then used to match features between images, allowing us to automatically align the images in the application, rather than requiring the user to do so beforehand.

This application also saves the resulting offsets into a file “offsets.txt” in the image directory, and will automatically load the offsets from this file if it is found. (Therefore subsequent viewings do not require the alignment procedure.)

Currently the code isn’t very fancy; it must match the same feature in all images or else it’ll fail. It’s also a bit slow: it takes approx. 4-12 seconds per image to calculate all of the SURF points. I haven’t polished the code base either, so if a segmentation fault or crash happens, be sure to e-mail me with a short description of what happened.

To use other datasets with this program, just drag and drop a folder containing your data onto lfmanip.exe and it’ll load those images; otherwise it will use the hard-coded “lf/” directory.

In the first view, use the drawn arrows to page through the images, and look for a suitable feature. In the provided dataset, the tower on the top of the image looks suitable. Drag a selection box around the feature you think will work, at this point lfmanip should freeze. Watch the command window for output.

lfmanip and its selection mode

If lfmanip finds the same feature in all images, it’ll write its findings to “offsets.txt” in your image directory and switch to viewing mode. If it does not, it’ll draw red dots where every feature was found.

At any point, (except when processing a selection) you may hit the ‘S’ key to switch between the two viewing modes.

Otherwise, the controls are on-screen and are the same as lftextures‘.

Coming soon:

  • The ability to match almost-perfect features
  • Automatic image rotation for wobbly pictures

OpenGL and libjpeg: Using Software Libraries in C(December 27, 2009)

This is a PSA for anyone who’d ever want to load jpegs into an OpenGL program (in c)…

Loading textures into OpenGL is a pain. The API expects the image information to be presented to it in binary form, in one of the following layouts:

  • {Luminance | Red | Green | Blue} for single valued streams
  • {RA | RG | GA | etc.} for double valued streams
  • {RGB | RGA | etc.} for triple valued streams
  • {RGBA} for quad valued streams


The first pain a user faces is getting a binary stream loaded. While the rest of OpenGL is calling blackbox functions, the user must write a file opening routine:

FILE *fp = fopen("texture.raw", "rb");
// must be a power of two in both dimensions
int height = 512;
int width = 512;
unsigned char *image = malloc(width * height * 3);
fread(image, 1, width * height * 3, fp); // raw RGB, no header to skip
fclose(fp);
GLuint texture;
glGenTextures(1, &texture);
glBindTexture(GL_TEXTURE_2D, texture);
glTexImage2D(GL_TEXTURE_2D, 0, 3, width, height, 0, GL_RGB, GL_UNSIGNED_BYTE, image);

But then it’s not always so obvious how to convert a given image into a raw format, how to skip past the header if it’s an almost-raw format, etc.

Most users want to use a common format that’s easy to work with. Since lftextures uses digital still cameras as its input, loading the jpegs directly seemed like a fantastic idea.

Fantastic until you see the code:

FILE *fd;
unsigned char *image;
int i, width, height, depth;
unsigned long location = 0;
struct jpeg_decompress_struct cinfo;
struct jpeg_error_mgr jerr;
JSAMPROW row_pointer[1];

fd = fopen("texture.jpg", "rb");
cinfo.err = jpeg_std_error(&jerr);
jpeg_create_decompress(&cinfo); // required before any other use of cinfo
jpeg_stdio_src(&cinfo, fd);
jpeg_read_header(&cinfo, TRUE);
cinfo.scale_num = 1;
cinfo.scale_denom = SCALE;
jpeg_start_decompress(&cinfo); // output dimensions are valid only after this
width = cinfo.output_width;
height = cinfo.output_height;
depth = cinfo.num_components; // should always be 3
image = (unsigned char *) malloc(width * height * depth);
row_pointer[0] = (unsigned char *) malloc(width * depth);
/* read one scan line at a time */
while (cinfo.output_scanline < cinfo.output_height) {
	jpeg_read_scanlines(&cinfo, row_pointer, 1);
	for (i = 0; i < (width * depth); i++)
		image[location++] = row_pointer[0][i];
}
jpeg_finish_decompress(&cinfo);
jpeg_destroy_decompress(&cinfo);
fclose(fd);

Then you’ll want to load this into a texture; the code is the same as above except we’ll use the GL_TEXTURE_RECTANGLE_ARB extension to allow for any sized image:

GLuint texture;
glGenTextures(1, &texture);
glBindTexture(GL_TEXTURE_RECTANGLE_ARB, texture);
glTexImage2D(GL_TEXTURE_RECTANGLE_ARB, 0, depth, width, height, 0, GL_RGB, GL_UNSIGNED_BYTE, image);

Last and most importantly; to use the libjpeg library you need to include it:
#include <jpeglib.h>
and compile like so:
gcc -ljpeg this.c


Making of lftextures(December 26, 2009)

Prologue: The Need
Dan Reetz thought to himself, “I have this great idea for a large camera, but I need to have the tools to use the camera before I build it.” Dan had been collecting the cameras and other parts for awhile before I’d met him. Once he told me about his idea, we decided that the project was within reach and went for it.

Our plan was to design a program that could display the additional information captured in such a fashion; and to do so we needed some data. So we built a “slide-type” camera, and made a set of overlapping images of a few different scenes for an initial dataset.
Dan's Original Slide Camera

Step 1: The Method
At the workstation, Photoshop is fired up. Dan’s intuition is to pick the object we want to focus on, align all images so that feature is centered in each image, and then take the mean of all the images. Success! We’ve refocused the scene! We also experimented with the mode, median, etc., and found the mean and median to look the best.
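The same procedure, sketched in code (my own toy version, not the lftextures source): shift each image by its per-image offset so the chosen feature coincides, then average the overlapping pixels:

```python
def refocus(images, offsets, width, height):
	"""Average grayscale images (nested lists) after shifting each by (dx, dy)."""
	total = [[0.0] * width for _ in range(height)]
	count = [[0] * width for _ in range(height)]
	for img, (dx, dy) in zip(images, offsets):
		for y, row in enumerate(img):
			for x, v in enumerate(row):
				X, Y = x + dx, y + dy
				if 0 <= X < width and 0 <= Y < height:
					total[Y][X] += v
					count[Y][X] += 1
	# mean of whatever landed on each output pixel
	return [[total[y][x] / count[y][x] if count[y][x] else 0.0
		for x in range(width)] for y in range(height)]

print(refocus([[[1.0, 3.0]], [[5.0, 7.0]]], [(0, 0), (0, 0)], 2, 1))  # -> [[3.0, 5.0]]
```

Objects at the depth matched by the offsets reinforce each other in the mean; everything else blurs out, which is exactly the refocusing effect.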

Step 2: Emulation
My task now was to perform the same tasks in a manner that was repeatable on any suitable dataset.

First our requirements:

  • We need to display 12 or so images at once
  • We need the user to have interactive control

These lead me to choose a graphical language of some sort. I decided to use OpenGL (used in 3D games and applications) and the GLUT toolkit (OpenGL extensions for keyboard, mouse, window, etc. control). I chose them for two reasons: cross-compatibility, and familiarity.

Using OpenGL allowed me to take advantage of the blending modes and the ability to set the transparency of an object to perform the “mean” of the images without an intermediate processing step.

A problem was therefore introduced: I would want to make big rectangles and display (texture) the images on them for speed and efficiency reasons, but OpenGL expects texture dimensions to be in powers of two, meaning we couldn’t use any input image we wanted without preprocessing them first. (I actually built an initial attempt that sampled the image and made thousands of little “pixel” rectangles, but this method was very slow and memory intensive, i.e. over 2GB of RAM)

Of course, others wanted to texture arbitrary images as well, and the problem had been solved quite a long time ago when graphics cards began to support the GL_TEXTURE_RECTANGLE_ARB extension; this allows textures to be any dimension, among other things.

At this point:

  • We could perform our target procedure
  • We had reduced the CPU and memory footprint

All that remained was to implement the steps to calculate the refocused scene. By a stroke of luck our initial attempt worked very well; at first we didn’t realize that its success was due to the careful manner in which we had taken the pictures.

Step 3: Fire-fighting
We now had a working program, but it wouldn’t perform correctly on some of our data, confusing and frustrating us.
Images before alignment

It didn’t take us long to root out the problem; once we had the data open in a viewer we noticed that some datasets were aligned very well while others were not. We quickly did a manual alignment in Photoshop on one of the problem datasets et voilà!
Images after alignment

Epilogue: Onward
The core of the application hasn’t changed since. It is a major pain to manually align each dataset, however, so I’ve begun work on a rewrite that aligns the images in the program, instead of relying on the user to do so beforehand.

Now that we’d completed our tool, we could see what our plenoptic camera was seeing. We now had a flurry of questions to answer:

  • Why is the plane of focus so shallow? (1-2 cm!?)
  • Can we retain sharpness on the edges of our output?
  • How can we improve the resolution of our compositions?
  • What other sorts of operations can we perform?

And additionally:

  • How many of our limitations are caused by our software?
  • Are some of our problems caused by our hardware?

So we march on.

DIY Camera Array Parts 1 & 2 Up(December 19, 2009)

Dan and I have just put the finishing touches on two new Instructables:

First, a primer on Computational Photography: Computational Photography Primer
Second, a quick tutorial on using a single camera to make a refocusable image: Using one Camera to Make Refocusable Pictures

A plenoptic camera using one camera

Using Your Own Images in lftextures(December 18, 2009)

Edit: Newer versions of this application still use the hard-coded “lf” directory, but only if no other directory is given. To use a different directory, just drag and drop the folder of pictures onto the “lfmanip.exe” icon.

I’ve gotten a lot of e-mails asking how to put your own data into lftextures, or how to put the other available data in.

The version I released uses a hard-coded directory, “lf”. Just replace all the pictures inside the ‘lf’ directory with your own data and the program will automatically load them all, in alphabetical order (maximum of 22 images).

A Windows folder with the lf directory highlighted

Introducing lftextures(December 4, 2009)

Download: Windows (3M)
Linux 32 64 (3M)
Mac (5M)
Datasets: Path (2M), Landscape (3M)

Here we have available some binaries of our simple light field viewer, lftextures. When Dan and I first tried to capture a light field, trying to manipulate the data was a real pain. How were we to know if our data was any good if we didn’t have a simple test to show we had captured the scene? We decided to implement the easiest operation, digital refocusing. Here’s the product:

An example of digital refocusing

The controls are simple, use the mouse wheel or Home/End keys to scroll through the scene; the application refocuses as you scroll.

Scroll up = move to background, Scroll down = move to foreground.

  • Mouse wheel, or Home/End keys: Scroll through scene
  • Click and drag mouse: Rotate the viewing plane
  • Arrow keys: Move the viewing plane
  • Page Up/ Page Down: Zoom in/out from the viewing plane
  • R: Reset the viewing pane
  • P: Print the screen (files are dropped in the directory lftextures is in, i.e. the working directory)
  • O: Turn off/on the on-screen display (The text in the main window)
  • X: Speed up/down the rate of scrolling
  • L: Brighten/Darken the scene
  • M: Switch render modes

And finally, a few macros (for making movies):

  • I: Print screen and Scroll up once
  • U: Print screen and Scroll down once
  • Y: Print screen and rotate away
  • T: Print screen and rotate towards

Once you’re satisfied scrolling through the scene, try hitting “M” until the on-screen display says “M: 2, 3D” then click and drag the scene back and forth:

An example of a 3D light field representation