Crowdsourcing Detection of Sampling Biases in Image Datasets

Xiao Hu, Haobo Wang, Anirudh Vegesana, Somesh Dube, Kaiwen Yu, Gore Kao, Shuo Han Chen, Yung Hsiang Lu, George K. Thiruvathukal, Ming Yin

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

18 Scopus citations

Abstract

Despite many exciting innovations in computer vision, recent studies reveal a number of risks in existing computer vision systems, suggesting that the results of such systems may be unfair and untrustworthy. Many of these risks can be partly attributed to the use of a training image dataset that exhibits sampling biases and thus does not accurately reflect the real visual world. Being able to detect potential sampling biases in a visual dataset prior to model development is thus essential for mitigating fairness and trustworthiness concerns in computer vision. In this paper, we propose a three-step crowdsourcing workflow that brings humans into the loop to facilitate bias discovery in image datasets. Through two sets of evaluation studies, we find that the proposed workflow can effectively organize the crowd to detect sampling biases both in datasets that are artificially created with designed biases and in real-world image datasets that are widely used in computer vision research and system development.

Original language: English
Title of host publication: The Web Conference 2020 - Proceedings of the World Wide Web Conference, WWW 2020
Publisher: Association for Computing Machinery, Inc
Pages: 2955-2961
Number of pages: 7
ISBN (Electronic): 9781450370233
State: Published - 20 Apr 2020
Event: 29th International World Wide Web Conference, WWW 2020 - Taipei, Taiwan
Duration: 20 Apr 2020 → 24 Apr 2020

Publication series

Name: The Web Conference 2020 - Proceedings of the World Wide Web Conference, WWW 2020

Conference

Conference: 29th International World Wide Web Conference, WWW 2020
Country/Territory: Taiwan
City: Taipei
Period: 20/04/20 → 24/04/20

Keywords

  • crowdsourcing
  • image dataset
  • sampling bias
  • workflow design
