TY - JOUR
T1 - Student behavior recognition for interaction detection in the classroom environment
AU - Li, Yating
AU - Qi, Xin
AU - Saudagar, Abdul Khader Jilani
AU - Badshah, Abdul Malik
AU - Muhammad, Khan
AU - Liu, Shuai
N1 - Publisher Copyright: © 2023 Elsevier B.V.
PY - 2023/8
Y1 - 2023/8
N2 - With the development of multimedia technologies, surveillance videos and other multimedia data have received widespread attention in several fields. Surveillance videos can monitor students' learning status in real time. However, current action recognition methods for teaching have limitations. First, ethical and privacy concerns surrounding AI in education make public datasets on student behavior scarce. Therefore, after summarizing seven typical student behaviors in the classroom, course videos were collected from a smart classroom to build a student behavior dataset. Compared with existing student behavior recognition datasets, the proposed dataset is distinguished by cluttered backgrounds, crowded scenes, and occlusions. Second, relational reasoning in existing methods struggles to distinguish between students' body parts and small objects in a cluttered background; different relational features interact little, so their complementarity goes unexploited, resulting in poor interaction recognition performance. To address this, an attention-based relational reasoning module strengthens the interactive representation between small objects and human body parts. In addition, because relational features are complementary, this study constructs a relational feature fusion module that models human-to-human interaction relationships built upon the supporting human's body parts and the surrounding context. Finally, the reconstructed features and human-appearance features are fused to achieve accurate interactive action recognition. An experimental comparison between the proposed model and current mainstream algorithms on the generated student behavior dataset verified that the proposed model achieves state-of-the-art performance in action recognition.
AB - With the development of multimedia technologies, surveillance videos and other multimedia data have received widespread attention in several fields. Surveillance videos can monitor students' learning status in real time. However, current action recognition methods for teaching have limitations. First, ethical and privacy concerns surrounding AI in education make public datasets on student behavior scarce. Therefore, after summarizing seven typical student behaviors in the classroom, course videos were collected from a smart classroom to build a student behavior dataset. Compared with existing student behavior recognition datasets, the proposed dataset is distinguished by cluttered backgrounds, crowded scenes, and occlusions. Second, relational reasoning in existing methods struggles to distinguish between students' body parts and small objects in a cluttered background; different relational features interact little, so their complementarity goes unexploited, resulting in poor interaction recognition performance. To address this, an attention-based relational reasoning module strengthens the interactive representation between small objects and human body parts. In addition, because relational features are complementary, this study constructs a relational feature fusion module that models human-to-human interaction relationships built upon the supporting human's body parts and the surrounding context. Finally, the reconstructed features and human-appearance features are fused to achieve accurate interactive action recognition. An experimental comparison between the proposed model and current mainstream algorithms on the generated student behavior dataset verified that the proposed model achieves state-of-the-art performance in action recognition.
KW - Action recognition
KW - Human-to-object interaction
KW - Intelligent education
KW - Relational reasoning
KW - Smart classroom
KW - Surveillance
UR - https://www.scopus.com/pages/publications/85162894014
U2 - 10.1016/j.imavis.2023.104726
DO - 10.1016/j.imavis.2023.104726
M3 - Article
AN - SCOPUS:85162894014
SN - 0262-8856
VL - 136
JO - Image and Vision Computing
JF - Image and Vision Computing
M1 - 104726
ER -