The Cat Detector

Using a Microsoft Kinect v1 to keep from stepping on your cat while playing VR games

In the many years we’ve lived together, my 22-year-old cat and I have worked out a system to prevent me from stepping on her. Whenever she walks anywhere near my feet or my chair, she lets out a quick meow so I know she’s there and can watch my feet.

That system worked great until the last few years, when I got big into VR with headsets such as the HTC Vive, Oculus Rift, and Samsung Odyssey Windows Mixed Reality Headset. I’m not exactly a small person, and we had a couple of close calls when the cat came to inform me of a lack of food or water while I was stomping around, shooting at robots.

Fortunately, I still had an old Kinect sensor lying around, and I used it to put together an easy solution that didn’t involve locking the cat away in a bedroom while I play VR games in the living room. I downloaded the SDK from Microsoft here, then installed it and plugged my Kinect into my PC. The Kinect is shockingly easy to work with even now, years after it fell out of favor, and I can’t recommend it enough to anyone who wants to tinker with basic 3D app development.

The included sample apps had some very cool functionality, but none of them would find cats in real time. On the plus side, detecting when my cat was nearby was a simpler problem than one might think at first glance. I could have looked for movement, but sometimes the cat will hold still for hours, blending almost invisibly into the carpet. I was hoping to avoid having to train a neural network to spot cats, since at the time I was still unfamiliar with modern AI development. I decided to take a first shot at the problem by having the Kinect take a depth snapshot of my room, then warning me when it detects two or more contiguous blobs that differ from the initial snapshot (one blob is always me, so a second one means the cat is in the room too).
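The baseline-comparison idea can be sketched in a few lines. The real app works on Kinect depth frames in C#, but the core is just a per-pixel difference against the saved snapshot; here is a minimal Python sketch using plain nested lists as stand-in depth frames (the names, the toy frames, and the noise threshold are all hypothetical):

```python
# Hypothetical noise threshold: ignore depth changes smaller than this,
# since the sensor's readings jitter a little frame to frame.
DEPTH_THRESHOLD_MM = 50

def diff_mask(baseline, frame, threshold=DEPTH_THRESHOLD_MM):
    """Return a 2D boolean mask: True wherever the live frame's depth
    differs from the baseline snapshot by more than the threshold."""
    return [
        [abs(f - b) > threshold for f, b in zip(frow, brow)]
        for frow, brow in zip(frame, baseline)
    ]

# Toy example: a flat baseline, then something 0.5 m closer in one spot.
baseline = [[2000, 2000, 2000],
            [2000, 2000, 2000]]
frame    = [[2000, 1500, 2000],
            [2000, 1500, 2000]]

mask = diff_mask(baseline, frame)  # True only in the middle column
```

Everything that survives this mask is "new" relative to the empty-room snapshot, which is what the blob-counting step operates on.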

The implementation was pretty straightforward. I took some sample code from the SDK that captures depth snapshots, then changed one of the buttons to save a baseline snapshot to memory. Then I added a function to the live video stream that filters out the parts of the live stream that match the baseline, and used a simple flood-fill algorithm to figure out which of the differing blobs are contiguous and count them.

I cringed and loaded up the app, and was shocked to find that it served my purposes perfectly on the first try! I didn’t go out of my way to make the code efficient, but it was simple enough that it barely used any CPU, and the first-shot algorithm did the job more than adequately. Since I’m not selling it or even putting it on a store in binary form, all that was left was to add some wav files of myself shouting “INTRUDER ALERT! INTRUDER ALERT!” and code to play them when multiple entities are walking around the room.
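One detail worth getting right is not re-triggering the wav on every single frame while the cat wanders around. A small Python sketch of that trigger logic, with a cooldown; the class name, the cooldown length, and the wav filename are all hypothetical, and the actual sound playback (shown here as a commented-out Windows `winsound` call) would be whatever the real app uses:

```python
import time

ALERT_COOLDOWN_S = 5.0  # hypothetical: don't repeat the warning too often

class IntruderAlarm:
    """Fires at most once per cooldown period when extra blobs appear."""

    def __init__(self, cooldown=ALERT_COOLDOWN_S, clock=time.monotonic):
        self.cooldown = cooldown
        self.clock = clock              # injectable for testing
        self._last_fired = float("-inf")

    def update(self, blob_count):
        """Call once per frame with the current blob count.
        Returns True when the alert sound should play: two or more
        blobs means someone besides me is in the room."""
        now = self.clock()
        if blob_count >= 2 and now - self._last_fired >= self.cooldown:
            self._last_fired = now
            # Here the real app plays the wav; on Windows this could be:
            # winsound.PlaySound("intruder_alert.wav", winsound.SND_ASYNC)
            return True
        return False
```

Wiring it in is just `alarm.update(count_of_blobs)` once per depth frame, and the cooldown keeps the shouting from turning into a siren.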

Feel free to check it out yourself!  Source code is available here: (MIT license)