In this paper, we propose a scale-aware framework for on-road object detection in infrared (IR) images, addressing the prediction inconsistencies that arise in today's state-of-the-art detectors, which adopt pyramid architectures with multi-level prediction. The proposed framework uses a scale-based attention mechanism to assign responsibilities to the individual feature levels. With this design, each feature level focuses on detecting a certain range of object scales, thereby minimizing conflicts among the predictions in the final result. Compared to the Scaled-YOLOv4 baseline, our method achieves better performance on the FLIR dataset without sacrificing FPS. Experimental results on RGB image-based object detection datasets further show that the proposed method also yields solid improvements when applied to RGB images.
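To give intuition for the scale-responsibility idea, the sketch below shows one common way such an assignment can be realized: each pyramid level is given a range of object scales it is responsible for. The level names and pixel thresholds here are illustrative assumptions, not the paper's exact formulation.

```python
import math

# Hypothetical scale ranges (in pixels) per pyramid level -- assumed values
# for illustration only; the actual framework learns attention-based
# responsibilities rather than using hard cutoffs.
SCALE_RANGES = {
    "P3": (0, 64),
    "P4": (64, 128),
    "P5": (128, 256),
    "P6": (256, 512),
    "P7": (512, math.inf),
}

def responsible_level(box_w, box_h):
    """Return the pyramid level whose scale range covers this object.

    The object's scale is taken as the geometric mean of its width and
    height, a common convention in multi-level detectors.
    """
    size = math.sqrt(box_w * box_h)
    for level, (lo, hi) in SCALE_RANGES.items():
        if lo <= size < hi:
            return level
    return "P7"  # fall back to the coarsest level for very large objects
```

For example, a 100x80 box has scale sqrt(8000) ≈ 89.4 px and would be handled by P4 under these assumed ranges, so only that level's predictions compete for it.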