Senior Thesis

For our senior thesis, my team is working on determining the readability of colored text. My job has been to create the web app that stores our images, collects results from users, and organizes the results so that they can be analyzed by our tools written in MATLAB, and later by Weka (a machine learning tool).

The web app is written in Rails and is integrated into many parts of our workflow.

The workflow:

  • Jordan generates images using the Python Image class. The images are generated with colors drawn uniformly at random from the RGB color space (which defines a color by its amounts of red, green, and blue).

  • I take the images and upload them to our web app. Our database stores the images, as well as the corresponding foreground and background colors.

  • I then divide them into groups of 8 images and create an experiment from each group. An experiment contains 8 images and 3 signpost images. I will attempt to discuss the signposts in more detail in a future post, but they are used to help calibrate our results across experiments.

  • Using the gem Turkee, I can launch these experiments in groups to Mechanical Turk. We pay users 3-5 cents for each survey they complete.

  • As results come in, we download them and generate a CSV file of all comparisons. Ayaka feeds these comparisons into a MATLAB implementation of a model called MLSD. This converts our binary comparisons to a linear scale where each image is given a single value representing its location on a line. I will try to go into more detail on this in a future post.

  • We can then feed these values back into the database, giving us a single value for each image. From here we generate an Attribute-Relation File Format (ARFF) file that we can feed into Weka. Our goal here is to train a model that can accurately predict the readability value we previously assigned.
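
To make the bookkeeping in the steps above a little more concrete, here is a minimal Python sketch of three of them: drawing random foreground and background colors, splitting the stimuli into groups of 8, and writing an ARFF file. This is an illustration only, not the project's code (the real web app is written in Rails). The function names, the record layout, the attribute names, and the choice to drop an incomplete final group are all assumptions made for the example, and the signpost images and the scaling step are left out.

```python
import random


def random_color(rng):
    """Draw one color uniformly at random from the 24-bit RGB cube."""
    return tuple(rng.randrange(256) for _ in range(3))


def make_stimuli(n, seed=0):
    """Return n hypothetical stimulus records (foreground and background color)."""
    rng = random.Random(seed)
    return [
        {"id": i, "fg": random_color(rng), "bg": random_color(rng)}
        for i in range(n)
    ]


def group_into_experiments(stimuli, size=8):
    """Split stimuli into consecutive groups of `size`.

    Assumption: an incomplete final group is dropped rather than padded.
    """
    full = len(stimuli) - len(stimuli) % size
    return [stimuli[i:i + size] for i in range(0, full, size)]


def to_arff(stimuli, scores, relation="readability"):
    """Render stimuli plus one numeric score per image as ARFF text."""
    names = ["fg_r", "fg_g", "fg_b", "bg_r", "bg_g", "bg_b", "readability"]
    lines = [f"@RELATION {relation}", ""]
    lines += [f"@ATTRIBUTE {name} NUMERIC" for name in names]
    lines += ["", "@DATA"]
    for s in stimuli:
        row = list(s["fg"]) + list(s["bg"]) + [scores[s["id"]]]
        lines.append(",".join(str(v) for v in row))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    stimuli = make_stimuli(20)
    experiments = group_into_experiments(stimuli)
    print(len(experiments), "experiments of", len(experiments[0]), "images")
    # Placeholder scores; in the real pipeline these come from the scaling step.
    scores = {s["id"]: 0.0 for s in stimuli}
    print(to_arff(stimuli[:2], scores))
```

Each data row holds the six color channels followed by the readability value, so Weka can treat the last attribute as the value to predict.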

It has taken a while to get all our pipeline components working the way we want, but now that everything is well tested, we should be able to start launching a large number of experiments in the near future and have results soon after that.
