Deep neural networks (DNNs) have achieved outstanding accuracy in machine learning applications. However, the number of parameters and the computational cost of DNNs have grown dramatically. A systolic array of multiply-and-accumulate units (MACs) is a widely used architecture for accelerating the numerous matrix multiplication operations in DNNs. In this paper, both timing error prediction and approximate computing are leveraged to relax the timing constraints of the MACs. Voltage underscaling is then applied to further enhance the energy efficiency of the systolic array. In our experiments, the proposed approximate systolic array achieves a 36% energy reduction with only a 1% accuracy loss on CIFAR-10 image classification.