Motivated by emerging Internet of Things (IoT) applications, wireless systems face the challenge of providing accurate indoor localization for massive numbers of IoT devices. Although received signal strength indicator (RSSI)-based fingerprinting can provide accurate localization with low system requirements, it still suffers from multipath and fading effects. To resolve this, we propose a beam domain-based fingerprinting localization approach that leverages the spatial features available in multiple-antenna systems to improve localization accuracy. Specifically, we use the beam domain receive power map (BDRPM), an RSSI-based map that captures important spatial fingerprint features of the environment, for localization. To learn the environmental fingerprints from BDRPMs and to perform localization, we propose a deep-learning approach based on a 2D convolutional neural network and an auto-encoder structure. We evaluate the proposed approach through practical simulations. The results show that our approach provides highly accurate localization, is robust to environmental changes, and outperforms the reference schemes in the literature.
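To make the notion of a beam-domain power map concrete, the following is a minimal illustrative sketch and not the construction used in this work. It assumes a narrowband channel matrix between two uniform linear arrays and uses unitary DFT codebooks as the beam sets; the function names `dft_codebook` and `beam_domain_power_map`, the array sizes, and the dB floor are all hypothetical choices made for illustration.

```python
import numpy as np


def dft_codebook(n):
    """Unitary n x n DFT matrix; each column is treated as one beam (assumed codebook)."""
    idx = np.arange(n)
    return np.exp(-2j * np.pi * np.outer(idx, idx) / n) / np.sqrt(n)


def beam_domain_power_map(H):
    """Hypothetical beam-domain power map of a channel matrix H (n_rx x n_tx).

    Entry (i, j) is the received power in dB when receive beam i and
    transmit beam j are used. Only magnitudes are kept, so the map is
    RSSI-like and carries no phase information.
    """
    n_rx, n_tx = H.shape
    F_rx = dft_codebook(n_rx)
    F_tx = dft_codebook(n_tx)
    # Project the antenna-domain channel onto the receive and transmit beams.
    beam_channel = F_rx.conj().T @ H @ F_tx
    power = np.abs(beam_channel) ** 2
    # Small floor (assumed) avoids log of zero for empty beam pairs.
    return 10.0 * np.log10(power + 1e-12)


def single_path_channel(n_rx, n_tx, rx_beam, tx_beam):
    """Toy single-path channel aligned with one DFT beam pair (illustration only)."""
    a_rx = dft_codebook(n_rx)[:, rx_beam] * np.sqrt(n_rx)
    a_tx = dft_codebook(n_tx)[:, tx_beam] * np.sqrt(n_tx)
    return np.outer(a_rx, a_tx.conj())


if __name__ == "__main__":
    H = single_path_channel(n_rx=8, n_tx=16, rx_beam=2, tx_beam=5)
    bdrpm = beam_domain_power_map(H)
    peak = np.unravel_index(np.argmax(bdrpm), bdrpm.shape)
    print("map shape:", bdrpm.shape, "strongest beam pair:", peak)
```

Under these assumptions, the resulting 2D array can be treated like a single-channel image, which is why an image-style model such as a 2D convolutional network is a natural fit for learning from such maps.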