ShuttleSet: A Human-Annotated Stroke-Level Singles Dataset for Badminton Tactical Analysis

Wei Yao Wang, Yung Chang Huang, Tsi Ui Ik, Wen Chih Peng

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review



With recent progress in sports analytics, deep learning approaches have proven effective at mining insights into players' tactics to improve performance quality and fan engagement, a success attributed to the availability of public ground-truth datasets. Although a few datasets exist for action detection in turn-based sports, they severely lack structured source data and stroke-level records, which require costly labeling by domain experts and are hard to detect automatically. Consequently, the development of artificial intelligence approaches is significantly hindered when existing models are applied to more challenging structured turn-based sequences. In this paper, we present ShuttleSet, the largest publicly available badminton singles dataset with annotated stroke-level records. It contains 104 sets, 3,685 rallies, and 36,492 strokes from 44 matches played between 2018 and 2021 by 27 top-ranking men's singles and women's singles players. ShuttleSet is manually annotated with a computer-aided labeling tool that makes labeling more efficient and effective; for each stroke, annotators select the shot type from 18 distinct classes and record the corresponding hitting location and the locations of both players. In the experiments, we provide multiple benchmarks (i.e., stroke influence, stroke forecasting, and movement forecasting) with baselines to illustrate the practicability of using ShuttleSet for turn-based analytics, which is expected to stimulate both the academic and sports communities. Over the past two years, a visualization platform has been deployed to illustrate the variability of analysis cases from ShuttleSet, letting coaches delve into players' tactical preferences through human-interactive interfaces; it has also been used by national badminton teams during multiple international high-ranking matches.
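To make the stroke-level structure described in the abstract concrete, the following is a minimal sketch of how one annotated stroke might be represented, together with the averages implied by the reported totals. The class and field names (`Stroke`, `shot_type`, `hit_xy`, `hitter_xy`, `opponent_xy`), the integer encoding of the 18 shot classes, and the coordinate convention are all assumptions made for illustration; they are not taken from the released dataset.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# The abstract states 18 distinct shot classes; their names are not listed
# there, so this sketch uses a hypothetical integer index 0..17.
NUM_SHOT_TYPES = 18


@dataclass(frozen=True)
class Stroke:
    """One annotated stroke (hypothetical field names, not the real schema)."""
    shot_type: int                     # assumed index in [0, NUM_SHOT_TYPES)
    hit_xy: Tuple[float, float]        # hitting location (coordinates assumed)
    hitter_xy: Tuple[float, float]     # location of the player who hits
    opponent_xy: Tuple[float, float]   # location of the other player

    def __post_init__(self) -> None:
        # Reject shot types outside the 18 classes mentioned in the abstract.
        if not 0 <= self.shot_type < NUM_SHOT_TYPES:
            raise ValueError(f"shot_type must be in [0, {NUM_SHOT_TYPES})")


def dataset_averages(matches: int, sets: int, rallies: int,
                     strokes: int) -> Dict[str, float]:
    """Derive simple per-unit averages from the dataset totals."""
    return {
        "sets_per_match": sets / matches,
        "rallies_per_set": rallies / sets,
        "strokes_per_rally": strokes / rallies,
    }


# Totals as reported in the abstract.
averages = dataset_averages(matches=44, sets=104, rallies=3685, strokes=36492)

if __name__ == "__main__":
    for name, value in averages.items():
        print(f"{name}: {value:.2f}")
```

With the totals from the abstract, this works out to roughly 2.4 sets per match, 35.4 rallies per set, and 9.9 strokes per rally.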

Original language: English
Title of host publication: KDD 2023 - Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
Publisher: Association for Computing Machinery
Number of pages: 11
ISBN (Electronic): 9798400701030
State: Published - 6 Aug 2023
Event: 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2023 - Long Beach, United States
Duration: 6 Aug 2023 to 10 Aug 2023

Publication series

Name: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining


Conference: 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2023
Country/Territory: United States
City: Long Beach


  • badminton dataset
  • machine learning
  • sports analytics
  • stroke-level records

