Heterogeneous computing has become popular in the past decade. Many frameworks have been proposed to provide a uniform way to program for accelerators, such as GPUs, DSPs, and FPGAs. Among them, an open and royalty-free standard, OpenCL, is widely adopted by the industry. However, many OpenCL-enabled accelerators and the standard itself do not support preemptive multitasking. To the best of our knowledge, previously proposed techniques are not portable or cannot handle ill-designed kernels (the codes that are executed on the accelerators), which will never ever finish. Tis paper presents a framework (called CLPKM) that provides an abstraction layer between OpenCL applications and the underlying OpenCL runtime to enable preemption of a kernel execution instance based on a software checkpointing mechanism. CLPKM includes (1) an OpenCL runtime library that intercepts OpenCL API calls, (2) a source-to-source compiler that performs the preemption-enabling transformation, and (3) a daemon that schedules OpenCL tasks using priority-based preemptive scheduling techniques. Experiments demonstrated that CLPKM reduced the slowdown of high-priority processes from 4.66x to 1.52-2.23x under up to 16 low-priority, heavy-workload processes running in the background and caused an average of 3.02-6.08x slowdown for low-priority processes.