Part-aware Prototypical Graph Network for One-shot Skeleton-based Action Recognition (*Best Student Paper*)

Overview of our framework. The cascaded embedding module extracts part-based representations with a two-stage graph network. In the first stage, a body GCN computes initial context-aware features for all joints. The second stage performs part-level modelling: we first generate multiple part graphs according to a set of rules, then feed the representations sampled by the part graphs into a series of part GNNs to compute part representations. The attentional part fusion module highlights important parts via a class-agnostic attention mechanism and generates part-aware prototypes. The matching module outputs the class label of the query based on the cosine distance between the part-aware prototypes of the query and the support examples.
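
The sketch below is a minimal PyTorch rendering of this pipeline, not the authors' implementation: the class names, feature dimensions, placeholder adjacency, fully connected part subgraphs, and pooling choices are all illustrative assumptions; only the overall flow (body GCN → part graphs → part GNNs → class-agnostic attention fusion) follows the description above.

```python
# Minimal sketch of the described pipeline (assumptions noted inline);
# module names, dimensions, and part-partition rules are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BodyGCN(nn.Module):
    """Stage 1: context-aware joint features over the whole skeleton graph."""
    def __init__(self, in_dim, out_dim, adj):
        super().__init__()
        self.register_buffer("adj", adj)        # (J, J) normalised body adjacency
        self.proj = nn.Linear(in_dim, out_dim)

    def forward(self, x):                       # x: (B, J, in_dim)
        return F.relu(self.proj(self.adj @ x))  # aggregate neighbours, then project

class PartGNN(nn.Module):
    """Stage 2: a small GNN over one part graph (assumed fully connected here)."""
    def __init__(self, dim, joint_ids):
        super().__init__()
        self.joint_ids = list(joint_ids)        # joints sampled by this part graph
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):                       # x: (B, J, dim)
        part = x[:, self.joint_ids]             # (B, |part|, dim)
        ctx = part.mean(dim=1, keepdim=True)    # one round of message passing
        return F.relu(self.proj(part + ctx)).mean(dim=1)  # (B, dim) part feature

class PartAwarePrototypeNet(nn.Module):
    """Cascaded embedding + attentional part fusion -> part-aware prototype."""
    def __init__(self, in_dim, dim, adj, parts):
        super().__init__()
        self.body = BodyGCN(in_dim, dim, adj)
        self.part_gnns = nn.ModuleList([PartGNN(dim, p) for p in parts])
        self.attn = nn.Linear(dim, 1)           # class-agnostic part attention

    def forward(self, x):                       # x: (B, J, in_dim)
        h = self.body(x)                                         # body level
        parts = torch.stack([g(h) for g in self.part_gnns], 1)   # (B, P, dim)
        w = torch.softmax(self.attn(parts), dim=1)               # highlight parts
        return (w * parts).sum(dim=1)                            # (B, dim) prototype

# Toy usage: 25 joints (NTU-style), 3-D coordinates, two assumed parts.
J = 25
adj = torch.eye(J)                              # placeholder adjacency
net = PartAwarePrototypeNet(3, 64, adj, parts=[range(13), range(13, 25)])
prototypes = net(torch.randn(4, J, 3))          # (4, 64)
```

Note that a single attention head scores every part with shared weights, independent of the action class, which is what makes the fusion class-agnostic: the same scoring applies to both support and query embeddings.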

Abstract

In this paper, we study the problem of one-shot skeleton-based action recognition, which poses unique challenges in learning transferable representations from base classes to novel classes, particularly for fine-grained actions. Existing meta-learning frameworks typically rely on body-level representations in the spatial dimension, which limits their ability to capture subtle visual differences in the fine-grained label space. To overcome this limitation, we propose a part-aware prototypical representation for one-shot skeleton-based action recognition. Our method captures skeleton motion patterns at two distinctive spatial levels: one for global contexts among all body joints, referred to as the body level, and the other attending to local spatial regions of body parts, referred to as the part level. We also devise a class-agnostic attention mechanism to highlight important parts for each action class. Specifically, we develop a part-aware prototypical graph network consisting of three modules: a cascaded embedding module for our dual-level modelling, an attention-based part fusion module to fuse parts and generate part-aware prototypes, and a matching module to perform classification with the part-aware representations. We demonstrate the effectiveness of our method on two public skeleton-based action recognition datasets: NTU RGB+D 120 and NW-UCLA.
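
As a small companion to the matching module described above, the following hedged sketch performs one-shot classification by cosine similarity between a query's part-aware prototype and one support prototype per novel class; the function name and tensor shapes are assumptions, not the paper's interface.

```python
import torch
import torch.nn.functional as F

def match_query(query_proto, support_protos):
    """One-shot matching sketch: pick the class whose support prototype is
    closest to the query under cosine distance.
    Assumed shapes: query_proto (dim,), support_protos (C, dim)."""
    sims = F.cosine_similarity(query_proto.unsqueeze(0), support_protos, dim=-1)
    return int(sims.argmax())                   # index of the predicted class
```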

Publication
In 2023 IEEE 17th International Conference on Automatic Face and Gesture Recognition (FG)
