TY - GEN
T1 - Context-Aware Recognition of Elevator Buttons Using a Sequential Training Methodology
AU - Ghosh, Arpan
AU - Joo, Kyeong Jin
AU - Giraldo, Gilberto Galvis
AU - Kuc, Tae Yong
N1 - Publisher Copyright: © 2024 ICROS.
PY - 2024
Y1 - 2024
N2 - In this paper, we present a sequential training methodology aimed at improving the recognition of elevator buttons using the YOLOv5 object detection model. The methodology is structured into three distinct phases. In the first phase, we generate a synthetic dataset where elevator buttons, cropped from their original context, are placed on random image backgrounds. This phase is designed to help the model learn to identify buttons independently of their surroundings, ensuring a foundational understanding of button features without contextual distractions. In the second phase, we augment the cropped button dataset by applying various transformations such as random flips, rotations, and scaling. These augmentations increase the diversity and robustness of the training data, allowing the model to generalize better to variations in button appearances. The final phase involves training the model on images of full elevator panels. This step is crucial for helping the model understand the contextual placement and spatial relationships of the buttons within the panel, which is essential for accurate detection in real-world scenarios. Additionally, we enhance the real-time video input exposure to improve visibility under varying lighting conditions, addressing common challenges faced in practical applications. For post-processing, we integrate a Channel and Spatial Reliability Tracker (CSRT) to maintain button-tracking consistency in video sequences. This tracker helps ensure that once a button is detected, its position is reliably followed across frames, improving the overall accuracy and reliability of the system. This comprehensive approach, which combines the use of synthetic data, extensive data augmentation techniques, and contextual training on full panel images, aims to better simulate real-world scenarios. As a result, the proposed methodology significantly enhances the robustness and reliability of the YOLOv5 model in recognizing elevator buttons under diverse conditions.
KW - Context-Aware
KW - CSRT tracker
KW - Data augmentation
KW - Object detection
KW - Sequential training
KW - Synthetic data
KW - YOLOv5
UR - https://www.scopus.com/pages/publications/85214409928
U2 - 10.23919/ICCAS63016.2024.10773121
DO - 10.23919/ICCAS63016.2024.10773121
M3 - Conference contribution
AN - SCOPUS:85214409928
T3 - International Conference on Control, Automation and Systems
SP - 596
EP - 601
BT - 2024 24th International Conference on Control, Automation and Systems, ICCAS 2024
PB - IEEE Computer Society
T2 - 24th International Conference on Control, Automation and Systems, ICCAS 2024
Y2 - 29 October 2024 through 1 November 2024
ER -