Traditional photography projects a 3D scene onto a 2D image without recording the depth of each local region, which prevents users from changing the focus plane of a photograph once it has been taken. To tackle this problem, Ng et al. presented a light-field camera that records all focus planes of a scene and synthesizes refocused images using ray tracing. Nevertheless, the captured photographs are of low resolution because the image sensor is divided into subcells. Levin et al. embedded a coded aperture in the camera lens and recovered depth information from the blur patterns in a single image. However, the coded aperture blocks around 50% of the incoming light, so their system requires a longer exposure time when taking pictures. Liang et al. also embedded a coded aperture in the camera lens but captured the scene with multiple exposures. Their method produces high-quality depth maps yet is not suitable for hand-held devices. More recently, the Microsoft Kinect estimates depth directly using infrared light, but it works only in indoor environments.