QMSampler: Joint Sampling of Multiple Networks with Quality Guarantee

Hong-Han Shuai, De Nian Yang, Chih Ya Shen, Philip S. Yu, Ming Syan Chen

Research output: Contribution to journal › Article › peer-review

Abstract

Because Online Social Networks (OSNs) have become increasingly important in the last decade, they have motivated a great deal of research on Social Network Analysis (SNA). Currently, SNA algorithms are evaluated on real datasets obtained from large-scale OSNs, which are usually sampled by Breadth-First-Search (BFS), Random Walk (RW), or some variations of the latter. However, none of the released datasets provides any statistical guarantees on the difference between the sampled datasets and the ground truth. Moreover, all existing sampling algorithms only focus on sampling a single OSN, but each OSN is actually a sampling of a complete social network. Hence, even if the whole dataset from a single OSN is sampled, the results may still be skewed and may not fully reflect the properties of the complete social network. To address the above issues, we have made the first attempt to explore the joint sampling of multiple OSNs and propose an approach called Quality-guaranteed Multi-network Sampler (QMSampler) that can jointly sample multiple OSNs. QMSampler provides a statistical guarantee on the difference between the sampled real dataset and the ground truth (the perfect integration of all OSNs). Our experimental results demonstrate that the proposed approach generates a much smaller bias than any existing method. QMSampler has also been released as a free download.
Original language: English
Pages (from-to): 90-104
Number of pages: 15
Journal: IEEE Transactions on Big Data
Volume: 4
Issue number: 1
State: Published - Mar 2018
