TY - GEN
T1 - ShuttleSet
T2 - 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2023
AU - Wang, Wei Yao
AU - Huang, Yung Chang
AU - Ik, Tsi Ui
AU - Peng, Wen Chih
N1 - Publisher Copyright:
© 2023 ACM.
PY - 2023/8/6
Y1 - 2023/8/6
N2 - With the recent progress in sports analytics, deep learning approaches have demonstrated the effectiveness of mining insights into players' tactics for improving performance quality and fan engagement. This is attributed to the availability of public ground-truth datasets. While there are a few available datasets for turn-based sports for action detection, these datasets severely lack structured source data and stroke-level records since these require high-cost labeling efforts from domain experts and are hard to detect using automatic techniques. Consequently, the development of artificial intelligence approaches is significantly hindered when existing models are applied to more challenging structured turn-based sequences. In this paper, we present ShuttleSet, the largest publicly-available badminton singles dataset with annotated stroke-level records. It contains 104 sets, 3,685 rallies, and 36,492 strokes in 44 matches between 2018 and 2021 with 27 top-ranking men's singles and women's singles players. ShuttleSet is manually annotated with a computer-aided labeling tool to increase the labeling efficiency and effectiveness of selecting the shot type with a choice of 18 distinct classes, the corresponding hitting locations, and the locations of both players at each stroke. In the experiments, we provide multiple benchmarks (i.e., stroke influence, stroke forecasting, and movement forecasting) with baselines to illustrate the practicability of using ShuttleSet for turn-based analytics, which is expected to stimulate both academic and sports communities. Over the past two years, a visualization platform has been deployed to illustrate the variability of analysis cases from ShuttleSet for coaches to delve into players' tactical preferences with human-interactive interfaces, which was also used by national badminton teams during multiple international high-ranking matches.
AB - With the recent progress in sports analytics, deep learning approaches have demonstrated the effectiveness of mining insights into players' tactics for improving performance quality and fan engagement. This is attributed to the availability of public ground-truth datasets. While there are a few available datasets for turn-based sports for action detection, these datasets severely lack structured source data and stroke-level records since these require high-cost labeling efforts from domain experts and are hard to detect using automatic techniques. Consequently, the development of artificial intelligence approaches is significantly hindered when existing models are applied to more challenging structured turn-based sequences. In this paper, we present ShuttleSet, the largest publicly-available badminton singles dataset with annotated stroke-level records. It contains 104 sets, 3,685 rallies, and 36,492 strokes in 44 matches between 2018 and 2021 with 27 top-ranking men's singles and women's singles players. ShuttleSet is manually annotated with a computer-aided labeling tool to increase the labeling efficiency and effectiveness of selecting the shot type with a choice of 18 distinct classes, the corresponding hitting locations, and the locations of both players at each stroke. In the experiments, we provide multiple benchmarks (i.e., stroke influence, stroke forecasting, and movement forecasting) with baselines to illustrate the practicability of using ShuttleSet for turn-based analytics, which is expected to stimulate both academic and sports communities. Over the past two years, a visualization platform has been deployed to illustrate the variability of analysis cases from ShuttleSet for coaches to delve into players' tactical preferences with human-interactive interfaces, which was also used by national badminton teams during multiple international high-ranking matches.
KW - badminton dataset
KW - machine learning
KW - sports analytics
KW - stroke-level records
UR - http://www.scopus.com/inward/record.url?scp=85164887421&partnerID=8YFLogxK
U2 - 10.1145/3580305.3599906
DO - 10.1145/3580305.3599906
M3 - Conference contribution
AN - SCOPUS:85164887421
T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
SP - 5126
EP - 5136
BT - KDD 2023 - Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
PB - Association for Computing Machinery
Y2 - 6 August 2023 through 10 August 2023
ER -