Fault-tolerant computing

As a first step toward writing my own simulation code while also doing something useful, a few days ago I started writing a program to explore failure, and recovery from failure, in a distributed computation. By failure I mean the case where one of the computation units goes down. My test system is N harmonic oscillators on N nodes (or N processes on a shared-memory machine). Read More …

The Intersection of Productivity and Joy (over the past months)

Since we last spoke a couple of months ago, I have had a hell of a time personally: I moved house, travelled from one location to another far too often, went through a major and very stressful financial crisis, and had some rough times with friends that rocked the emotional boat. Although there are many things which didn’t Read More …