Modern GPGPUs support the concurrent execution of multiple tasks with different runtime characteristics and resource utilization. An efficient execution and resource management policy has been shown to be a critical performance factor when tasks with different runtime behavior execute concurrently. Previous policies either assign equal resources to disparate tasks or allocate resources based on static or standalone behavior profiling. Treating tasks equally cannot utilize system resources efficiently, while standalone profiling ignores the interference among concurrently running tasks and can mischaracterize task behavior. This paper addresses these drawbacks and proposes a heterogeneity-aware Selective Bypassing and Mapping (SBM) policy that manages both computing and cache resources for multiple tasks in a fine-grained manner. The lightweight runtime profiling of SBM characterizes the disparate behavior of concurrently executing tasks and selectively applies suitable cache management and workgroup mapping policies to each task. Compared with previous coarse-grained policies, SBM achieves an average performance improvement of 138% and up to 895%. Compared with the state-of-the-art fine-grained policy, SBM achieves an average performance improvement of 58% and up to 378%.