This paper presents a two-dimensional editing system that combines the advantages of hand gestures and real blocks for manipulating virtual objects. The system uses an RGB-D sensor to capture color and depth images of the hands and the real blocks. The raw images are first segmented, and a contour-based algorithm then performs hand gesture and shape recognition: the contours are extracted and reduced to eliminate redundant points, after which a set of feature points is collected and used to recognize the hand gestures and the shapes of the real blocks. Users can interact with the virtual objects in two ways: 1) manipulating the real blocks to control them; and 2) using hand gestures as editing operations to edit them. The system supports basic editing operations on the virtual objects, such as 'move', 'scale', 'rotate', and 'copy'. Experimental results indicate that our system achieves a high recognition rate, and a user study reveals that it is intuitive and entertaining.
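
To make the contour-reduction step concrete, the following is a minimal sketch and not the implementation used in the system. The abstract does not name the reduction algorithm, so the sketch assumes Ramer-Douglas-Peucker simplification as a stand-in. The names `reduce_contour`, `classify_block`, and `dense_square`, the tolerance `epsilon`, and the vertex-count rule for shape recognition are all hypothetical and chosen only for illustration.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify_polyline(points, epsilon):
    """Ramer-Douglas-Peucker simplification of an open polyline (assumed technique)."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > max_dist:
            max_dist, index = d, i
    if max_dist <= epsilon:
        # Every interior point is redundant at this tolerance.
        return [first, last]
    left = simplify_polyline(points[:index + 1], epsilon)
    right = simplify_polyline(points[index:], epsilon)
    return left[:-1] + right


def reduce_contour(contour, epsilon):
    """Reduce a closed contour given as a point list without a repeated endpoint.

    The contour is split at the point farthest from its first point, and each
    half is simplified separately. The first point is always kept, so the
    result is cleanest when the contour starts at a corner.
    """
    if len(contour) < 4:
        return list(contour)
    start = contour[0]
    far = max(
        range(len(contour)),
        key=lambda i: math.hypot(contour[i][0] - start[0], contour[i][1] - start[1]),
    )
    first_half = simplify_polyline(contour[:far + 1], epsilon)
    second_half = simplify_polyline(contour[far:] + [start], epsilon)
    return first_half[:-1] + second_half[:-1]


def classify_block(vertices):
    """Hypothetical shape rule: name a block by its number of feature points."""
    names = {3: "triangle", 4: "quadrilateral"}
    return names.get(len(vertices), "unknown")


def dense_square(side):
    """Hypothetical test data: a square outline sampled at unit steps."""
    bottom = [(x, 0) for x in range(side)]
    right = [(side, y) for y in range(side)]
    top = [(x, side) for x in range(side, 0, -1)]
    left = [(0, y) for y in range(side, 0, -1)]
    return bottom + right + top + left


if __name__ == "__main__":
    contour = dense_square(10)  # 40 sampled points
    feature_points = reduce_contour(contour, epsilon=0.5)
    print(len(contour), "->", len(feature_points), classify_block(feature_points))
```

In this sketch the surviving vertices play the role of the feature points. A real segmented contour from an RGB-D frame would be noisier, so `epsilon` would need tuning, and hand gesture recognition would likely require richer features than a vertex count.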