Name: | Description: | Size: | Format:
---|---|---|---
 | | 3.79 MB | Adobe PDF
Authors
Advisor(s)
Abstract(s)
With continuous advancements in immersive technology, enhancing human-computer interaction (HCI) in virtual environments has become a relevant topic. Commercial virtual reality (VR) systems are equipped with handheld controllers, which limit the naturalness of user interactions. Immersive technologies benefit from the senses of presence, immersion, and embodiment to augment the user experience.
This project addresses these aspects by developing a computer vision-based dynamic gesture recognition system with full-body pose tracking, integrated into a virtual reality experience. Unlike most existing applications, which use only body tracking, our system combines body and hand tracking and includes a gesture recognition module. To demonstrate the effectiveness of the approach, we propose a scenario in a 3D immersive virtual environment where the user engages in a game of "rock-paper-scissors" against the system, aiming to outperform it.
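To make the scenario concrete, below is a minimal sketch of the round-resolution rules the game relies on. The actual game runs inside Unity 3D, so the `play_round` helper and the randomly choosing system opponent are illustrative assumptions, not the thesis implementation.

```python
# Illustrative rock-paper-scissors round logic; the real opponent and
# scoring live in the Unity 3D experience, not in this Python sketch.
import random

BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def play_round(user_gesture: str) -> str:
    """Resolve one round against a randomly choosing system opponent."""
    system_gesture = random.choice(list(BEATS))
    if user_gesture == system_gesture:
        return f"draw (both chose {user_gesture})"
    if BEATS[user_gesture] == system_gesture:
        return f"user wins ({user_gesture} beats {system_gesture})"
    return f"system wins ({system_gesture} beats {user_gesture})"

print(play_round("rock"))
```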
This project uses the Mediapipe framework as the body pose tracking mechanism to extract the user's articulation coordinates. The obtained data is processed by a long short-term memory (LSTM) deep neural network (DNN) to classify dynamic gestures. The Unity 3D game engine is used to render the avatar and to present the immersive experience to the user.
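The abstract does not specify the network architecture or the feature layout, so the following is only a plausible sketch of the described pipeline. It assumes MediaPipe's Holistic solution (which tracks the body and both hands in one pass), an illustrative 30-frame gesture window, and a small two-layer Keras LSTM with a three-class softmax for rock, paper, and scissors; all sizes and names here are assumptions.

```python
# A minimal sketch of the landmark-extraction + LSTM pipeline described
# above. SEQ_LEN, the layer sizes, and the three-class output are assumed.
import cv2
import numpy as np
import mediapipe as mp
import tensorflow as tf

SEQ_LEN = 30                        # assumed frames per gesture window
N_FEATURES = (33 + 21 + 21) * 3     # pose + two hands, (x, y, z) each

def extract_landmarks(results):
    """Flatten body and hand landmarks into one feature vector per frame."""
    def flat(lms, count):
        if lms is None:                      # landmark set not detected
            return np.zeros(count * 3)
        return np.array([[p.x, p.y, p.z] for p in lms.landmark]).flatten()
    return np.concatenate([
        flat(results.pose_landmarks, 33),
        flat(results.left_hand_landmarks, 21),
        flat(results.right_hand_landmarks, 21),
    ])

# Illustrative LSTM classifier for dynamic gestures.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(SEQ_LEN, N_FEATURES)),
    tf.keras.layers.LSTM(64, return_sequences=True),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(3, activation="softmax"),  # rock/paper/scissors
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])

# Online loop: accumulate a fixed-length window of frames, then classify.
holistic = mp.solutions.holistic.Holistic()
cap = cv2.VideoCapture(0)
window = []
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    results = holistic.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    window.append(extract_landmarks(results))
    if len(window) == SEQ_LEN:
        probs = model.predict(np.expand_dims(window, 0), verbose=0)[0]
        print("predicted gesture class:", int(np.argmax(probs)))
        window = []          # reset; the thesis may use a sliding window
cap.release()
```

Holistic is assumed here because the abstract emphasises combining body and hand tracking; the thesis may equally use MediaPipe's separate Pose and Hands solutions.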
To validate the developed system, experimental tasks were conducted with eight participants, each of whom played the game five times. The system was validated quantitatively by measuring the online classification accuracy, while the subjective sense of realism, presence, involvement, and system usability were evaluated using the iGroup Presence Questionnaire (IPQ).
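The abstract does not detail how the online accuracy was computed; a minimal reading, assuming each round logs the predicted gesture alongside the gesture the participant actually performed, is sketched below (the `online_accuracy` helper and the example log are hypothetical).

```python
# Hypothetical online-accuracy measure over logged game rounds.
def online_accuracy(rounds):
    """rounds: iterable of (predicted_gesture, performed_gesture) pairs."""
    rounds = list(rounds)
    correct = sum(pred == actual for pred, actual in rounds)
    return correct / len(rounds) if rounds else 0.0

# Eight participants x five rounds would yield 40 such pairs; a toy log:
log = [("rock", "rock"), ("paper", "scissors"), ("scissors", "scissors")]
print(f"online accuracy: {online_accuracy(log):.2%}")
```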
Results show that tracking both body and hands simultaneously is feasible, with potential applications ranging from rehabilitation to sports. Regarding the integration of pose tracking for controlling the avatar in the 3D immersive VR environment, the results indicate that the experience has a positive impact on users in terms of realism and presence, as evidenced by a 73.44% score on the IPQ.
However, users reported a mismatch between their own movements and the avatar's during the experience. The performance of the gesture recognition model did not match the accuracy achieved during the offline validation and testing phases. This lack of generalisation is attributed to the limited number of training samples and the low variability of the recorded gestures, as the training dataset included only a single individual.
Overall, the biggest challenge of the project was integrating the pose data to control the avatar. Nevertheless, the results demonstrate the feasibility of combining computer vision-based gesture interaction and full-body tracking in a VR experience.
Description
Keywords
Computer vision; Pose tracking; Skeleton tracking; Gesture classification; Immersive reality