Pattern-based weight pruning of CNNs has proven to be an effective model-compression technique. In this paper, we first show how to select hardware-friendly pruning pattern sets that generalize across models. We then propose a progressive pruning framework that produces more globally optimized pruning results. Moreover, to the best of our knowledge, this is the first work to address pruning of the first layer of a CNN, which is also the most sensitive layer, through a two-stage pruning strategy. Experimental results show that the proposed framework achieves 2.25x computation reduction and 2x model-size reduction while minimizing accuracy loss.
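To make the general idea concrete, the sketch below illustrates generic pattern-based kernel pruning: every 3x3 kernel is constrained to one binary mask ("pattern") from a small candidate set, chosen to retain the most kernel magnitude. This is only an illustrative sketch under assumed conventions; the pattern set, the helper name `prune_with_patterns`, and the selection heuristic shown here are hypothetical and are not the pattern sets or the progressive/two-stage framework proposed in this paper.

```python
import numpy as np

# Illustrative sketch of generic pattern-based pruning (not this paper's method).
# A "pattern" is a fixed binary mask over a 3x3 kernel; every kernel in a
# convolutional layer is forced to match one pattern from a small set.

# Hypothetical example pattern set: each pattern keeps 4 of the 9 weights.
PATTERNS = [
    np.array([[0, 1, 0],
              [1, 1, 1],
              [0, 0, 0]], dtype=np.float32),
    np.array([[0, 0, 0],
              [1, 1, 1],
              [0, 1, 0]], dtype=np.float32),
]

def prune_with_patterns(weights, patterns=PATTERNS):
    """Prune a conv weight tensor of shape (out_ch, in_ch, 3, 3).

    For each 3x3 kernel, pick the pattern that preserves the largest
    L2 norm of the remaining weights, then zero out the other weights.
    """
    pruned = np.empty_like(weights)
    for o in range(weights.shape[0]):
        for i in range(weights.shape[1]):
            kernel = weights[o, i]
            # Choose the pattern retaining the most kernel energy.
            best = max(patterns, key=lambda p: np.linalg.norm(kernel * p))
            pruned[o, i] = kernel * best
    return pruned

if __name__ == "__main__":
    w = np.random.randn(8, 4, 3, 3).astype(np.float32)
    w_pruned = prune_with_patterns(w)
    kept = np.count_nonzero(w_pruned) / w_pruned.size
    print(f"fraction of weights kept: {kept:.2f}")  # roughly 4/9 here
```

Because every surviving kernel shares one of a few fixed shapes, the sparsity is regular enough for compilers or hardware to exploit, which is what makes such patterns "hardware-friendly" relative to unstructured pruning.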