TY - GEN
T1 - TrackNetV2
T2 - 1st International Conference on Pervasive Artificial Intelligence, ICPAI 2020
AU - Sun, Nien En
AU - Lin, Yu Ching
AU - Chuang, Shao Ping
AU - Hsu, Tzu Han
AU - Yu, Dung Ru
AU - Chung, Ho Yi
AU - Ik, Tsi Ui
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2020/12/3
Y1 - 2020/12/3
N2 - TrackNet, a deep learning network, was proposed to track high-speed and tiny objects such as tennis balls and shuttlecocks from videos. To overcome low image quality issues such as blur, afterimages, and short-term occlusion, several consecutive images are input together to detect a flying object. In this work, TrackNetV2 is proposed to improve the performance of TrackNet in several respects, especially processing speed, prediction accuracy, and GPU memory usage. First, the processing speed is improved from 2.6 FPS to 31.8 FPS. The performance boost is achieved by reducing the input image size and re-engineering the network from a Multiple-In Single-Out (MISO) design to a Multiple-In Multiple-Out (MIMO) design. Then, to improve the prediction accuracy, a comprehensive dataset from diverse badminton match videos is collected and labeled for training and testing. The dataset consists of 55563 frames from 18 badminton match videos. In addition, the network is composed of not only VGG16 and upsampling layers but also U-Net. Last, to reduce GPU memory usage, the data structure of the heatmap layer is remodeled from a pixel-wise one-hot encoding 3D array to a real-valued 2D array. To reflect the change of the heatmap representation, the loss function is redesigned from an RMSE-based function to a weighted cross-entropy-based function. An overall validation shows that the accuracy, precision, and recall of TrackNetV2 reach 96.3%, 97.0%, and 98.7%, respectively, in the training phase and 85.2%, 97.2%, and 85.4% in a test on a brand-new match. The 3-in, 3-out version of TrackNetV2 can reach a processing speed of 31.84 FPS. The dataset and source code of this work are available at https://nol.cs.nctu.edu.tw:234/open-source/TrackNetv2/.
AB - TrackNet, a deep learning network, was proposed to track high-speed and tiny objects such as tennis balls and shuttlecocks from videos. To overcome low image quality issues such as blur, afterimages, and short-term occlusion, several consecutive images are input together to detect a flying object. In this work, TrackNetV2 is proposed to improve the performance of TrackNet in several respects, especially processing speed, prediction accuracy, and GPU memory usage. First, the processing speed is improved from 2.6 FPS to 31.8 FPS. The performance boost is achieved by reducing the input image size and re-engineering the network from a Multiple-In Single-Out (MISO) design to a Multiple-In Multiple-Out (MIMO) design. Then, to improve the prediction accuracy, a comprehensive dataset from diverse badminton match videos is collected and labeled for training and testing. The dataset consists of 55563 frames from 18 badminton match videos. In addition, the network is composed of not only VGG16 and upsampling layers but also U-Net. Last, to reduce GPU memory usage, the data structure of the heatmap layer is remodeled from a pixel-wise one-hot encoding 3D array to a real-valued 2D array. To reflect the change of the heatmap representation, the loss function is redesigned from an RMSE-based function to a weighted cross-entropy-based function. An overall validation shows that the accuracy, precision, and recall of TrackNetV2 reach 96.3%, 97.0%, and 98.7%, respectively, in the training phase and 85.2%, 97.2%, and 85.4% in a test on a brand-new match. The 3-in, 3-out version of TrackNetV2 can reach a processing speed of 31.84 FPS. The dataset and source code of this work are available at https://nol.cs.nctu.edu.tw:234/open-source/TrackNetv2/.
KW - Badminton
KW - deep learning
KW - heatmap
KW - shuttlecock tracking
UR - http://www.scopus.com/inward/record.url?scp=85100047208&partnerID=8YFLogxK
U2 - 10.1109/ICPAI51961.2020.00023
DO - 10.1109/ICPAI51961.2020.00023
M3 - Conference contribution
AN - SCOPUS:85100047208
T3 - Proceedings - 2020 International Conference on Pervasive Artificial Intelligence, ICPAI 2020
SP - 86
EP - 91
BT - Proceedings - 2020 International Conference on Pervasive Artificial Intelligence, ICPAI 2020
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 3 December 2020 through 5 December 2020
ER -