ThermoHands

The first benchmark for evaluating egocentric 3D hand pose estimation using RGB, depth, NIR, and thermal imaging.

Sep 2023 - Jun 2024

Our official GitHub repo

Abstract

In this work, we present ThermoHands, a new benchmark for thermal image-based egocentric 3D hand pose estimation, aimed at overcoming challenges such as varying lighting conditions and obstructions (e.g., handwear). The benchmark includes a multi-view and multi-spectral dataset collected from 28 subjects performing hand-object and hand-virtual interactions under diverse scenarios, accurately annotated with 3D hand poses through an automated process. We introduce a new baseline method, THEFormer, utilizing dual transformer modules for effective egocentric 3D hand pose estimation in thermal imagery. Our experimental results highlight THEFormer’s leading performance and affirm thermal imaging’s effectiveness in enabling robust 3D hand pose estimation in adverse conditions.
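To picture the dual-transformer idea at a high level, the sketch below assumes a per-frame spatial transformer over backbone patch features followed by a temporal transformer across frames, regressing 3D joints per frame. It is an illustrative skeleton only, not the official THEFormer implementation; all module names, feature sizes, and layer counts are hypothetical.

```python
# Illustrative sketch only, not the official THEFormer code: it assumes a
# spatial transformer over per-frame patch tokens followed by a temporal
# transformer across frames. All names, sizes, and the backbone feature
# dimension are hypothetical; positional embeddings are omitted for brevity.
import torch
import torch.nn as nn

class DualTransformerBaseline(nn.Module):
    def __init__(self, num_joints=21, dim=256, feat_dim=768):
        super().__init__()
        self.patch_proj = nn.Linear(feat_dim, dim)  # project backbone patch features
        spatial_layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
        temporal_layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.spatial = nn.TransformerEncoder(spatial_layer, num_layers=4)    # attends within a frame
        self.temporal = nn.TransformerEncoder(temporal_layer, num_layers=2)  # attends across frames
        self.head = nn.Linear(dim, num_joints * 3)                           # regress 3D joints

    def forward(self, patch_feats):
        # patch_feats: (batch, frames, patches, feat_dim) features from each thermal frame
        b, t, p, _ = patch_feats.shape
        x = self.patch_proj(patch_feats)               # (b, t, p, dim)
        x = self.spatial(x.reshape(b * t, p, -1))      # spatial attention within each frame
        x = x.mean(dim=1).reshape(b, t, -1)            # pool patch tokens into one token per frame
        x = self.temporal(x)                           # temporal attention across the clip
        return self.head(x).reshape(b, t, -1, 3)       # (batch, frames, num_joints, 3)
```

Splitting attention into a spatial stage and a temporal stage keeps the cost of each stage manageable compared with full spatio-temporal attention over every patch in every frame.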

Dataset Details

We use the Head-mounted Sensor Platform to construct the first dataset of egocentric hand actions captured simultaneously by RGB-D, NIR, and thermal cameras.

Dataset scene setup of ThermoHands
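To make the multi-spectral structure concrete, the snippet below sketches how one synchronized sample could be loaded. The directory layout, file names, and annotation format shown are placeholders; the authoritative structure is documented in our GitHub repo.

```python
# Hypothetical loading sketch: the actual directory layout, file names, and
# annotation format are documented in the official repo; paths here are placeholders.
from pathlib import Path
import numpy as np
import cv2

def load_sample(root: str, subject: str, action: str, frame_id: int):
    """Load one synchronized multi-spectral frame (RGB, depth, NIR, thermal)."""
    base = Path(root) / subject / action
    rgb = cv2.imread(str(base / "rgb" / f"{frame_id:06d}.png"))                                  # 8-bit color
    depth = cv2.imread(str(base / "depth" / f"{frame_id:06d}.png"), cv2.IMREAD_UNCHANGED)        # 16-bit depth
    nir = cv2.imread(str(base / "nir" / f"{frame_id:06d}.png"), cv2.IMREAD_UNCHANGED)            # near-infrared
    thermal = cv2.imread(str(base / "thermal" / f"{frame_id:06d}.png"), cv2.IMREAD_UNCHANGED)    # thermal image
    joints = np.load(base / "gt" / f"{frame_id:06d}.npy")  # hypothetical (hands, joints, 3) 3D annotation
    return {"rgb": rgb, "depth": depth, "nir": nir, "thermal": thermal, "joints": joints}
```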

The 2.5D hand poses are semi-automatically annotated by running OpenPose on the RGB-D images, then refined through an optimization step before being projected into the thermal image frame.

Ground truth annotation of the ThermoHands dataset
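The final projection step is a standard rigid transform followed by pinhole projection. The sketch below assumes a calibrated thermal intrinsic matrix K and an RGB-D-to-thermal extrinsic (R, t); both are placeholders for the actual calibration shipped with the dataset.

```python
# Minimal pinhole-projection sketch, assuming calibrated thermal intrinsics K
# and an RGB-D-to-thermal extrinsic transform (R, t). All values are placeholders.
import numpy as np

def project_to_thermal(joints_3d, K, R, t):
    """Project Nx3 joints (in the RGB-D camera frame) onto the thermal image.

    joints_3d: (N, 3) 3D joint positions
    K:         (3, 3) thermal camera intrinsic matrix
    R, t:      rotation (3, 3) and translation (3,) from the RGB-D to the thermal frame
    Returns:   (N, 2) pixel coordinates in the thermal image.
    """
    cam = joints_3d @ R.T + t      # transform into the thermal camera frame
    uv = cam @ K.T                 # apply intrinsics
    return uv[:, :2] / uv[:, 2:3]  # perspective divide to pixel coordinates
```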

We also benchmark HTT (Wen et al., 2023) and our thermal-imaging baseline, THEFormer, on this dataset; more details can be found in our paper.

Performance of HTT on our four spectrum modalities
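For reference, 3D hand pose baselines such as these are commonly compared using the mean per-joint position error (MPJPE). The snippet below shows that computation in its simplest form and is not tied to the exact evaluation protocol (alignment, joint set) used in the paper.

```python
# MPJPE sketch: a common 3D hand pose metric; the paper's exact protocol may differ.
import numpy as np

def mpjpe(pred, gt):
    """Mean per-joint position error, in the same units as the inputs.

    pred, gt: (num_frames, num_joints, 3) arrays of 3D joint positions.
    """
    return float(np.linalg.norm(pred - gt, axis=-1).mean())
```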