In a wireless mesh network (WMN), mesh access points (MAPs) are linked through a wireless backhaul network consisting of mesh points (MPs), each equipped with multiple radios that operate on multiple non-overlapping channels in parallel. MPs must establish designated links that satisfy both a common channel constraint and an interference constraint, which conflict with each other. Yen and Dai proposed a game-theoretic radio resource allocation scheme for WMNs that outperforms centralized and greedy approaches when each node has only two radios; when nodes have more than two radios, however, the centralized and greedy approaches perform better. This study therefore applies reinforcement learning to extend that work so that the approach remains effective when more than two radios are available per node. We aim to maximize the number of operative designated links in the backhaul network subject to the common channel constraint and the interference constraint, and we tackle this problem with multi-agent deep Q-learning (DQL). We conduct simulations to compare the proposed approach with the game-theoretic approach. The results show that the proposed DQL algorithm outperforms the game-theoretic approach in dense networks where each MP has more than two radios, while the game-theoretic approach outperforms the proposed DQL algorithm in sparse networks.
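To make the objective concrete, the following is a minimal sketch of how the number of operative designated links could be counted for a given channel assignment. It is an illustration under stated assumptions, not the formulation used in this study: we assume a link satisfies the common channel constraint when both endpoints have a radio tuned to the link's channel, and violates the interference constraint when a conflicting link uses the same channel. The function `operative_links` and the `radios`, `link_channel`, and `conflicts` structures are hypothetical names introduced only for this example.

```python
from itertools import combinations


def operative_links(links, radios, link_channel, conflicts):
    """Return the designated links that satisfy both constraints.

    links        -- list of (u, v) node pairs (the designated links)
    radios       -- dict: node -> set of channels its radios are tuned to
    link_channel -- dict: link -> channel selected for that link
    conflicts    -- set of frozenset({link_a, link_b}) pairs assumed to
                    interfere when both use the same channel
    All of these structures are assumptions made for illustration.
    """
    # Common channel constraint (assumed form): both endpoints must have
    # a radio tuned to the channel selected for the link.
    candidates = [
        link for link in links
        if link_channel.get(link) in radios[link[0]]
        and link_channel.get(link) in radios[link[1]]
    ]

    # Interference constraint (assumed form): two conflicting links on the
    # same channel block each other.
    blocked = set()
    for a, b in combinations(candidates, 2):
        if frozenset((a, b)) in conflicts and link_channel[a] == link_channel[b]:
            blocked.update((a, b))

    return [link for link in candidates if link not in blocked]


# Toy backhaul: a chain of four MPs, A - B - C - D.
links = [("A", "B"), ("B", "C"), ("C", "D")]
radios = {"A": {1}, "B": {1, 2}, "C": {2, 3}, "D": {1}}
link_channel = {("A", "B"): 1, ("B", "C"): 2, ("C", "D"): 1}
conflicts = {frozenset({("A", "B"), ("B", "C")})}

result = operative_links(links, radios, link_channel, conflicts)
print(len(result), result)
```

In this toy case the link between C and D fails the assumed common channel constraint because C has no radio on channel 1, while the two conflicting links use different channels and therefore both remain operative. The quantity the study seeks to maximize corresponds to `len(result)` under these assumptions.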