Did some deep research on ChatGPT the other day about camera and hand tracking upgrades and learned some things.

For instance, I learned that “Active IR depth sensors can falter in bright sunlight due to IR interference.” I always thought my Kinect tracking was environment-proof because it still tracked bodies in the dark—ironically, the opposite lighting condition can defeat its depth sensing.

Also, I learned that the Azure Kinect “has effectively been succeeded by Orbbec’s ‘Femto’ line”. I can get my hands on a new, unused Orbbec Femto Mega for $695. The fact that every other third-party seller lists these cams for hundreds of dollars more does have me a bit sus—pause—unpause 🌈.

Like its predecessor, the Azure Kinect, the Femto Mega has no built-in hand-state processing. And since I could technically use any camera for the RGB stream anyway, that omission leaves a gaping hole in my case for buying one. The question becomes how I’d implement an alternate hand-tracking solution, and thankfully, GPT had suggestions.

The low-hanging-fruit path forward seems to be Nuitrack’s SDK, whose Hand Tracker module attempts to classify “click” events—aka indicators of closed hands. This reminds me of how the Azure Kinect Examples for Unity package also attempts to classify hand states. Emphasis on “attempts”—I went back to the V2 for a reason. Add the fact that Nuitrack’s SDK is proprietary and requires a paid license, and this option really doesn’t seem like my best bet.
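
Just to pin down what a “click” classifier even is, here’s a minimal sketch of the kind of heuristic I’d probably roll myself. This is not Nuitrack’s API, just fingertip geometry against a threshold I’d have to tune, and it assumes some hand-landmark model is already handing me wrist, fingertip, and knuckle positions.

```python
import numpy as np

def is_hand_closed(wrist, fingertips, finger_bases, curl_ratio=0.8):
    """Flag a closed-hand 'click': a finger counts as curled when its tip
    sits closer to the wrist than its own base knuckle does.

    wrist:        (3,) position of the wrist joint
    fingertips:   (N, 3) fingertip positions
    finger_bases: (N, 3) matching base-knuckle positions
    curl_ratio:   tuning threshold (a guess; would need calibration)
    """
    tip_dist = np.linalg.norm(fingertips - wrist, axis=1)
    base_dist = np.linalg.norm(finger_bases - wrist, axis=1)
    curled = tip_dist < curl_ratio * base_dist
    # Call it a click when most fingers are curled (the thumb is unreliable).
    return int(np.count_nonzero(curled)) >= 3
```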

ChatGPT’s genius idea is to use RTMPose for body tracking, crop ROI images around each wrist joint, run MediaPipe Hands on those crops, and finally map the resulting 2D joints and hand landmarks to depth data via “per-camera intrinsics/extrinsics” (whatever that is). My simpler idea would be to skip RTMPose’s body processing and instead handle wrist-joint tracking with the depth camera itself. Either way, this sounds like a fun and bold experiment to try!
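
Here’s my rough sketch of that pipeline so future me remembers what “intrinsics” means: fx/fy are the color camera’s focal lengths in pixels and cx/cy its principal point, which the camera SDK reports, and together they let you back-project a pixel plus its depth into a 3D point. Assumptions in this sketch: the depth frame is registered to the color frame and stored in millimeters, the wrist pixel comes from whatever body tracker I end up using, and crop_size is a name I made up.

```python
import numpy as np
import mediapipe as mp

mp_hands = mp.solutions.hands

def hand_points_3d(rgb_frame, depth_frame, wrist_px, fx, fy, cx, cy, crop_size=256):
    """Crop an ROI around the wrist pixel, run MediaPipe Hands on the crop,
    then back-project each 2D landmark to a 3D camera-space point."""
    h, w, _ = rgb_frame.shape
    ux, uy = wrist_px

    # Clamp a square crop around the wrist so it stays inside the frame.
    x0 = int(np.clip(ux - crop_size // 2, 0, w - crop_size))
    y0 = int(np.clip(uy - crop_size // 2, 0, h - crop_size))
    crop = np.ascontiguousarray(rgb_frame[y0:y0 + crop_size, x0:x0 + crop_size])

    # In real use I'd create the Hands object once, not per frame.
    with mp_hands.Hands(static_image_mode=True, max_num_hands=1) as hands:
        result = hands.process(crop)  # expects an RGB uint8 image
    if not result.multi_hand_landmarks:
        return None

    points = []
    for lm in result.multi_hand_landmarks[0].landmark:
        # Landmark coords are normalized to the crop; map back to the full frame.
        u = float(np.clip(x0 + lm.x * crop_size, 0, w - 1))
        v = float(np.clip(y0 + lm.y * crop_size, 0, h - 1))
        z = depth_frame[int(v), int(u)] / 1000.0  # assumed mm -> meters
        # Pinhole back-projection: pixel + depth -> camera-space XYZ.
        points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points
```

As I understand it, the extrinsics part would only matter if the color and depth images weren’t already registered to each other, or if I needed the points in some room/world coordinate frame; it’s just the rotation and translation between the two cameras.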


GitHub just so happened to recommend another Unity Runtime Inspector asset to me. This one didn’t have a complex install process; it was just a simple .unitypackage. I couldn’t get the Runtime Inspector prefab to work, although I did get the Runtime Hierarchy working. Either way, this asset doesn’t include an animation curve editor. I probably should have looked through the source or the issues page before wasting time testing it. Still, I imagine that once I get my own runtime animation curve editor working, it would be cool to integrate it into that repo.


I made some improvements to my media resizing logic by adding a maximum long-side length for images and videos. Now, rectangular media that fit within the smallSideMaxLength I established but are still too wide or tall for comfort get detected and resized properly. I also copied the image-processing logic from my resize-existing-media script into my convert-media script, so I can drop the former and just run pnpm convert-media --force to reprocess the files in my /media directory. While checking whether my readme needed updating, I also noticed and deleted some obsolete test script files. Lastly, I added a max-width CSS property to videos in the generated site so they no longer overflow the screen width on mobile.
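
For my own notes, my sizing rule boils down to something like this. It’s just a sketch of the math: the real script runs under Node rather than Python, small_side_max stands in for my smallSideMaxLength, and long_side_max is my made-up name for the new cap.

```python
def target_size(width, height, small_side_max, long_side_max):
    """Scale down so the short side fits small_side_max AND the long side
    fits long_side_max, preserving aspect ratio. Never upscale."""
    short_side, long_side = sorted((width, height))
    scale = min(small_side_max / short_side, long_side_max / long_side, 1.0)
    return round(width * scale), round(height * scale)

# A wide panorama that already fits the short-side cap but not the long one:
# target_size(4000, 1000, small_side_max=1080, long_side_max=1920) -> (1920, 480)
```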


Tags: unity hand-tracking depth chatgpt quartz gamedev workflow animation-curve mediapipe orbbec