E15 on the iPhone

November 2nd, 2008


E15 on the iPhone from Kyle Buza on Vimeo.
 

Well, not exactly. I simply ported the “WWW data access” technique that E15 uses to get web content.

E15 was always intended to be an application for the re-contextualization of web content: all of those pieces of text, data, and images that we see when we use the browser. A lot of work went into figuring out the best way to go about grabbing all of this data, and placing it within an environment that is potentially more sloppy and playful than the browser. Over the course of the past few months, I’ve found myself faced with the same task on two separate occasions, for applications other than E15. A major revelation came over the summer when I was working with Jamie Zigelbaum in the Tangible Media Group. During that time, we were trying to figure out a way to obtain all of this WWW data and place it into a different software architecture that wasn’t E15. It was then that I proposed a different solution to the existing E15 approach, one that relied much more heavily on custom web services, deployed to provide extremely elegant and simple interfaces to the content. This approach worked so well that I replaced the existing E15 mechanism with it. What once took 30-40 lines of Python can now be done in 3. What I’m most fond of is the resulting elegance and readability.

I’ve recently started another project that needs to grab WWW data in this way, with one piece running on the iPhone. I stripped out the implementation from E15 and stuffed it into a simple iPhone application. I’d like to say that it worked straight away, but the iPhone appears to have difficulties (only randomly, of course) with my use of NSMachPort to initiate the asynchronous download and web service access. The application is simple: it just makes a request to my custom web service to find out where to find some Flickr images, and downloads them. While waiting for the iPhone to get the image data, I render small circles that show pending requests.

Pendulum Physics

October 30th, 2008

A few days ago, I was looking for something that would give me a nice effect using swinging pendulum physics. Being lazy, I much prefer to use an existing solution than to spend time re-inventing the wheel. Sadly, all of the source code variants I found were either only half implemented or in a language that wasn’t C. So, failing to cheat my way out of it, I was forced to learn about the fourth-order Runge-Kutta method for solving the relevant differential equations, and subsequently write my own.

Here’s to hoping that someone looking for an rk4 solver in C can take advantage of the time I spent beating my own version into submission. Here’s the source. It’s sloppy and undocumented. But at least it spits out the right values.
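For readers who only want the shape of the method, here is a minimal Python sketch of a fourth-order Runge-Kutta step applied to a simple pendulum. It is not the C source linked above. The function names, the values of G and LENGTH, and the choice of an undamped pendulum (theta'' = -(g/L) sin(theta)) are all assumptions made for illustration.

```python
import math

G = 9.81       # gravitational acceleration in m/s^2 (assumed value)
LENGTH = 1.0   # pendulum length in metres (assumed value)


def pendulum_derivs(state):
    """Return (dtheta/dt, domega/dt) for an undamped simple pendulum."""
    theta, omega = state
    return (omega, -(G / LENGTH) * math.sin(theta))


def rk4_step(state, dt, derivs=pendulum_derivs):
    """Advance the state by one time step dt with classic fourth-order Runge-Kutta."""
    def shifted(base, slope, scale):
        # base + scale * slope, component by component
        return tuple(b + scale * s for b, s in zip(base, slope))

    k1 = derivs(state)
    k2 = derivs(shifted(state, k1, dt / 2.0))
    k3 = derivs(shifted(state, k2, dt / 2.0))
    k4 = derivs(shifted(state, k3, dt))
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def energy(state):
    """Mechanical energy per unit mass; constant for the exact solution."""
    theta, omega = state
    return 0.5 * (LENGTH * omega) ** 2 + G * LENGTH * (1.0 - math.cos(theta))


def simulate(theta0, steps, dt=0.01):
    """Release the pendulum from rest at angle theta0 and integrate `steps` steps."""
    state = (theta0, 0.0)
    for _ in range(steps):
        state = rk4_step(state, dt)
    return state
```

A quick sanity check for a solver like this is to compare energy() before and after a long run; with a small time step the two values should stay very close.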

Blobs by OpenCV and CoreVideo

October 22nd, 2008

I was recently in need of a clean and simple mechanism for doing realtime blob detection using OpenCV. One technique that would at least get me halfway there would be to use something like OpenFrameworks, which contains a sample project for this purpose. However, when writing applications that have no reason to run on anything other than a particular OS (in my case, OS X), it makes sense to leverage as many of the native OS APIs as possible for performance. In addition, I was also drawn to the possibility of being able to take the video frames and pre-process them with some CoreImage filters before sending them to OpenCV.

The right answer is to use CoreVideo in conjunction with OpenCV to accomplish this task. Unfortunately, I found a number of discussions on the web asking how to do precisely this, apparently with little success. To be honest, CoreVideo still scares me a bit, but, according to Apple, it’s the best way to deal with video on OS X. I suspect the main reason I couldn’t find an existing solution to a relatively simple problem is the complexity of the CoreVideo APIs, compounded by the fact that OpenCV can be challenging to compile for OS X.

Surprisingly, this task turned out to be incredibly simple. In fewer than 200 lines of Objective-C, I’ve got a CoreVideo-based OS X application that does blob detection using OpenCV. My performance concerns were also justified, as it uses ~10% CPU to accomplish the same task that the OpenFrameworks sample does with nearly 80%. The only downside is that even though I’ve been able to inject OpenCV into the CoreVideo pipeline and obtain some great results, there doesn’t appear to be any way to insert CoreImage into this pipeline as well. The benefit of CoreImage filters, of course, is that they run on the GPU. This turns out to be a double-edged sword, however, as the only way to get the pixels back from the GPU (or wherever Apple keeps them) is to copy them back out. And copying is sloooow.

jit.gb bang

October 12th, 2008

After approximately three years of procrastination, I finally rolled up my sleeves and put together jit.gb, a GameBoy emulator for Max/MSP, similar in nature to the jit.atari2600 and jit.intellivision externals I made so long ago (apparently around the same time I made that website… sigh). I want to thank the people that continue to ask me for things like jit.nes and jit.gb, because it reminds me that they’re used, and that I should stop slacking. In addition, I’d also like to apologize to those people for having it take so long. It’s the first external I’ve written using Max 5, which is rapidly growing on me.

I tried to make this a number of times, using other emulators, but it was Gambatte that turned out to be the easiest to integrate. Sound, keyboard input… it’s all there. But there’s one last bug in the sound handling that’s keeping me up at night. Soon.

Update (11/07/08): I’m lazy. Here’s the pre-alpha OS X version for those that can’t wait.

E15 : Custom OpenGL

October 7th, 2008

Very much in the spirit of the animation subsystem, I spent the past week building a similar mechanism to allow E15 developers to write custom chunks of pure OpenGL directly within the E15 core. The implementation was easy enough to get up and running, but it took me a bit longer to get all of the bugs worked out of my first functional demo: Conway’s Game of Life running as a user-defined and Python-controllable E15 module. One FBO, three shaders, and a lot of texture swapping.

Clearly, a simulation like this shouldn’t be running in Python, especially because it can be made to run directly on the GPU. In E15, Python shouldn’t really know about OpenGL or FBOs; all that it should have access to are the more general properties of visual objects like position, rotation and scale. With this new architecture, E15 is now entering an entirely new generation of possibility. I’m excited.

IMPs and Animations

October 3rd, 2008

Over the course of the past year, the idea of adding animation support to E15 has been a subject of much discussion. At the end of the day, the main issue lies within the way E15 leverages Python.

Today, many applications leverage the Python language. Some use it to implement the application itself (e.g. NodeBox), while others use things like SWIG to provide function bindings to applications written in C/C++, so that developers can write applications in Python that would have otherwise required C/C++. E15, in contrast, uses Python as a mechanism to provide interactive, procedural, REPL-style interaction. Interpreted languages have been used in this way before, most notably in Pad++, a so-called zoomable interface. Pad++ uses a Tcl interpreter to serve essentially the same purpose as Python in E15. As a result, Python is insufficient for handling application aspects such as scene rendering, but is acceptable as a mechanism for creating new objects and handling certain input types.

So, although E15 contains an embedded Python interpreter, its role is relegated to a form of abstracted object management, as opposed to fine-grained object manipulation and control. In other words, it doesn’t make sense for an E15 user to write Python code to directly manipulate the x, y, z coordinates of an object frequently, because that should be done by the application engine, which is currently written in Objective-C (a C/C++ wrapper). Clearly, this aspect makes animations difficult. How can users be allowed to create fast, powerful animations without forcing Python to do something it’s not really designed to do?

After many months of deliberation, I think I’ve finally found a solution I’m comfortable with. This solution allows users to write their animations not in Python, but in pure C. This allows animations to run fast, without having to first be evaluated by the Python interpreter. So, animations are written in C, and attaching animations to Python-created objects can be done in Python. The only question that remains is how to integrate with the E15 scene rendering engine so that animations run without substantial overhead. The solution relies on a technique called implementation pointer (IMP) caching.

The basic idea behind IMP caching is the understanding that Objective-C is a wrapper around C, and that there’s a lot of additional “thinking” that has to go into an Objective-C method invocation as opposed to a C function call. In C, a function calls another function at a known address. In Objective-C (and many other languages), these addresses aren’t known ahead of time; they have to be looked up at run time from the class hierarchy. It would certainly be nice if we could get some Objective-C methods to be invoked as quickly as C functions, without the additional overhead. It turns out that this is possible. Objective-C allows developers to directly access the address of a method’s implementation within the class hierarchy, which, of course, is dangerous under certain circumstances, but I’ll leave those discussions to the purists. All I want to do is invoke methods fast, and IMP caching allows me to do so.
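E15 does this in Objective-C, but the underlying idea (resolve the method once, then keep calling the saved reference) can be illustrated with a loose Python analogy. The sketch below is not E15 code and is not IMP caching itself: the Animation class and both run functions are hypothetical, and Python’s bound-method lookup only stands in for Objective-C’s message dispatch.

```python
class Animation(object):
    """Hypothetical stand-in for an object whose method is called once per frame."""

    def __init__(self):
        self.x = 0.0

    def step(self, dt):
        self.x += dt


def run_uncached(anim, frames, dt):
    # The lookup of `anim.step` is repeated on every iteration.
    for _ in range(frames):
        anim.step(dt)


def run_cached(anim, frames, dt):
    # Resolve the method once, then call the saved reference directly.
    step = anim.step
    for _ in range(frames):
        step(dt)
```

Both loops leave the object in the same state; the cached version simply skips the repeated lookup. The same caveat applies as in Objective-C: if the method is replaced after the reference has been saved, the saved reference still points at the old implementation.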

For those interested in the details, you can’t beat the treatment found here.

Combinatorial arrangements

August 24th, 2008

My recent trip to Minneapolis brought me by the studio of algorithmic art pioneer Roman Verostko. During my visit, he posed an interesting question about his Upsidedown Mural installation at the Fred Rogers Center. Because each drawing in the mural can be viewed two ways, each rotated 180 degrees from the other, he had long been wondering how to write an algorithm that would iterate over every possible combination of images in the mural by only rotating a single image at a time. After a number of iterations, I think I’ve finally found a solution (which, of course, doesn’t preclude it from being trivial, or just plain wrong). 

In my implementation, I use a bit string to represent a collection of drawings (there are 11 in the mural itself). In this string, each position represents the orientation of a single image. The goal is to iterate over all 2^11 = 2048 possible orientation combinations, with the requirement that the Hamming distance between consecutive bit strings is 1. Noticing that algorithmically generating bit strings with this requirement is relatively easy for short bit strings of length 1 to 4, I chose to decompose each longer string into a collection of shorter strings of length 2 and 3. The set of possibilities is then recursively generated from this collection of smaller bit strings that are chained together. For example, a 13-bit string would simply be decomposed as 2 2 3 3 3.

Strings of length 2 work well in this capacity, because we can continuously cycle through the set without breaking the Hamming distance requirement:

00 01 11 10 -> 00 01 11 10 -> 00 01 11 10 -> 00 01 11 10 -> 00 01 11 10 ...

Bit strings of length 3 can be produced using the algorithm proposed by Martini as follows:

000 001 011 010 110 100 101 111

Unfortunately, this poses a problem when we reach “111”, as we cannot simply cycle back around to “000” and continue as in the 2-bit case (the Hamming distance between 111 and 000 is 3). All that needs to be done is to traverse this list backwards when needed:

000 001 011 010 110 100 101 111 -> 111 101 100 110 010 011 001 000 ...
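The back-and-forth traversal can be sketched in a few lines of Python. This only illustrates the chaining idea and is not the C program linked from this post: the names TWO_BIT, THREE_BIT and chained are invented here, the first chunk is assumed to vary fastest, and for simplicity every chunk (including the 2-bit ones, which could equally well wrap around) is reversed at the end of a sweep.

```python
TWO_BIT = ["00", "01", "11", "10"]
THREE_BIT = ["000", "001", "011", "010", "110", "100", "101", "111"]


def chained(chunks):
    """Yield one entry per chunk list, concatenated, changing one bit per step."""
    if not chunks:
        yield ""
        return
    head, rest = chunks[0], chunks[1:]
    forward = True
    for tail in chained(rest):
        # Sweep the head list forwards, then backwards, flipping direction
        # each time the rest of the string advances by one step.
        for piece in (head if forward else reversed(head)):
            yield piece + tail
        forward = not forward


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))
```

For the 11 drawings, chained([TWO_BIT, THREE_BIT, THREE_BIT, THREE_BIT]) produces 4 x 8 x 8 x 8 = 2048 strings, and hamming() can be used to confirm that each consecutive pair differs in exactly one position.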

Piecing together these steps, I’ve written an algorithm that does the job. Here’s what the generated set for a bit string of length 5 looks like:

The C program I wrote to generate these sets can be found here, and a text file containing the generated set of 2^11 bit strings is here. I’m sure there’s a more formal way to go about producing this result, so if anyone has any insight, Roman and I are all ears.
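One candidate for a more formal route, offered as a possible answer and not as the algorithm described above, is the binary reflected Gray code, in which the i-th string is i XOR (i >> 1). The Python sketch below is not the linked C program, and the function names are made up for illustration. It enumerates all 2^n orientation strings so that consecutive strings differ in exactly one position.

```python
def gray_codes(n):
    """Yield all 2**n bit strings of length n in binary reflected Gray code order."""
    for i in range(2 ** n):
        yield format(i ^ (i >> 1), "0{}b".format(n))


def differs_by_one(a, b):
    """True if two equal-length bit strings differ in exactly one position."""
    return sum(x != y for x, y in zip(a, b)) == 1


if __name__ == "__main__":
    # 11 drawings in the mural, so 2**11 = 2048 orientation strings.
    codes = list(gray_codes(11))
    print(len(codes))
    print(all(differs_by_one(a, b) for a, b in zip(codes, codes[1:])))
```

For n = 2 this yields 00 01 11 10, the same cycle as the 2-bit case above. The last string of the sequence also differs from the first by a single bit, so the whole sequence can wrap around without breaking the Hamming distance requirement.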

 

RISD Workshop/Presentation

August 1st, 2008

 

I just got back from a 2-day trip to RISD for an E15 presentation. It was part of John Caserta’s Summer Fusion Arts Workshop with students from around the globe. The emphasis was on exploring serendipitous experiences in the Library, thinking more about how things like digital archives can maintain and accommodate these types of “fortuitous discovery” experiences.

I wrote up my presentation notes with some additional images and thoughts on the buzamoto wiki. I want to thank everyone who came out, as I had a great time.

A few images from the E15 scripts I ran during my presentation:

Below is a short movie demonstrating my recent E15 iPhone integration, enabling 3D navigation functionality with an image collection obtained from YouTube video thumbnails. It also shows E15 playing live video from YouTube, something I’ve never quite gotten around to showing in anything other than static images on my Flickr account and E15:Web.


When all of the excitement about the beta iPhone developer program was happening a few months ago, I must’ve been busy writing my thesis or something, because I really didn’t take much notice. About a week and a half ago, I decided I really wanted to build some iPhone support into E15, so I applied to the developer program, expecting to get rejected as so many people had been. Then, two days later, I got accepted. Excellent.

Although I really haven’t been looking, I don’t think I’ve seen any iPhone applications that use multi-touch for 3D navigation, so I thought I’d see if I could come up with a simple mapping suitable for navigation in E15, effectively turning the iPhone into a device for remote navigation of a 3D space. The iPhone multi-touch interface provides additional degrees of freedom that can be exploited for navigating spaces that are not inherently 2D. The diagram below shows the mapping I came up with, using the pinching gesture to navigate in the z direction, and parallel two-fingered gestures for rotation about the x and y axes (rotation about z doesn’t seem to be very interesting at the moment). In the diagram, x runs horizontally, and y runs vertically. Solid lines denote the directions that are manipulated by each gesture.

After sketching these up, I wrote an iPhone application that detects the relevant gestures and communicates with a running E15 instance on another machine, sending this gesture information over the network. I added E15 support to handle this incoming data, and built some infrastructure for invoking user-defined Python callbacks in response to actions received from the iPhone (e.g. tapping and shaking).
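To make the gesture detection concrete, here is a rough Python sketch of how a two-finger movement might be split into a pinch (navigation in z) or a parallel drag (rotation about x and y). It is not the iPhone application’s code, which is not shown in this post. The function name, the tuple it returns, and the threshold value are all assumptions, and how a drag is turned into rotation angles is deliberately left out.

```python
import math


def classify_two_finger_gesture(start_a, start_b, end_a, end_b, threshold=10.0):
    """Return (kind, dx, dy, dz) for a two-finger movement.

    Points are (x, y) tuples in screen units. `threshold` is an assumed
    minimum change, in the same units, before a gesture is reported.
    """
    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    # Pinch: the distance between the two fingers changes.
    spread = dist(end_a, end_b) - dist(start_a, start_b)

    # Parallel drag: the midpoint between the fingers moves.
    mid_dx = ((end_a[0] + end_b[0]) - (start_a[0] + start_b[0])) / 2.0
    mid_dy = ((end_a[1] + end_b[1]) - (start_a[1] + start_b[1])) / 2.0
    travel = math.hypot(mid_dx, mid_dy)

    if abs(spread) >= threshold and abs(spread) >= travel:
        return ("pinch", 0.0, 0.0, spread)
    if travel >= threshold:
        return ("drag", mid_dx, mid_dy, 0.0)
    return ("none", 0.0, 0.0, 0.0)
```

In a setup like the one described, the pinch value would drive movement along z and the drag values would drive the two rotations, with the result sent over the network to the running E15 instance.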

Grabbing video from YouTube

June 18th, 2008

I recently found out that QuickTime on my new laptop (running the same OS and QuickTime versions as my desktop) simply refuses to play 3gp files. This is unfortunate because YouTube videos are currently streamed into E15 using this format. So, I started looking into other ways of directly grabbing YouTube content.

The main difficulty here is that E15 tries to walk the line between the browser and the desktop, which means I simply cannot look to some browser plugin or bookmarklet for assistance. I need to do things procedurally, from outside of the browser. The first step is to find a way to procedurally download YouTube video files (traditionally in .flv format). After some digging, I wrote a compact Python script that downloads both flv and mp4 files from YouTube to the desktop, given a YouTube video URL like the following:

http://youtube.com/watch?v=ZlWaTAZUxUQ

Here’s the condensed version, which should be passed the YouTube video URL and output file name as parameters. For example:

$ python ytgrabber.py http://youtube.com/watch?v=ZlWaTAZUxUQ g.mp4


import sys
import urllib.request

# YouTube video grabber.
# Currently supports mp4 and flv download types.

if __name__ == "__main__":
  yturl = sys.argv[1]

  # Grab the html sitting at the specified url
  yturldata = urllib.request.urlopen(yturl)
  yturltxt = yturldata.read().decode("utf-8", "replace")

  # The parameters needed for the download sit between these two markers.
  lidx = yturltxt.find("&video_id=")
  ridx = yturltxt.find("&title=", lidx)

  # The output file name.
  videoname = sys.argv[2]

  # The video data sits at this url.
  resulturl = "http://www.youtube.com/get_video.php?" + yturltxt[lidx:ridx]

  if videoname.endswith(".mp4"):
    # Ask for the mp4 version of the video.
    resulturl = resulturl + "&fmt=18"
  elif videoname.endswith(".flv"):
    pass
  else:
    print("Unsupported file extension. .flv and .mp4 only!")
    sys.exit(1)

  # Video data is binary, so write it in binary mode.
  with open(videoname, "wb") as ytfile:
    ytfileref = urllib.request.urlopen(resulturl)
    ytfile.write(ytfileref.read())

A more verbose version can be downloaded here.