TY - GEN
T1 - Visual knowledge transfer among multiple cameras for people counting with occlusion handling
AU - Weng, Ming-Fang
AU - Lin, Yen-Yu
AU - Tang, Nick C.
AU - Liao, Hong-Yuan Mark
PY - 2012/12/26
Y1 - 2012/12/26
N2 - We present a framework to count the number of people in an environment where multiple cameras with different angles of view are available. We consider the visual cues captured by each camera as a knowledge source, and carry out cross-camera knowledge transfer to alleviate the difficulties of people counting, such as partial occlusions, low-quality images, and cluttered backgrounds. Specifically, this work distinguishes itself with the following contributions. First, we overcome the variations of multiple heterogeneous cameras with different perspective settings by matching the same groups of pedestrians captured by these cameras, and present an algorithm for accomplishing cross-camera correspondence. Second, the proposed counting model is composed of a pair of collaborative regressors. While one regressor measures people counts from the features extracted from intra-camera visual evidence, the other recovers the resulting residual by taking the conflicts among inter-camera predictions into account. The two regressors are elegantly coupled, and jointly lead to an accurate counting system. Additionally, we provide a set of manually annotated pedestrian labels on the PETS 2010 videos for performance evaluation. Our approach is comprehensively tested in various settings and compared with competitive baselines. The significant improvement in performance demonstrates the effectiveness of the proposed approach.
KW - correspondence estimation
KW - people counting
KW - transfer learning
UR - http://www.scopus.com/inward/record.url?scp=84871375682&partnerID=8YFLogxK
U2 - 10.1145/2393347.2393411
DO - 10.1145/2393347.2393411
M3 - Conference contribution
AN - SCOPUS:84871375682
SN - 9781450310895
T3 - MM 2012 - Proceedings of the 20th ACM International Conference on Multimedia
SP - 439
EP - 448
BT - MM 2012 - Proceedings of the 20th ACM International Conference on Multimedia
T2 - 20th ACM International Conference on Multimedia, MM 2012
Y2 - 29 October 2012 through 2 November 2012
ER -