Mobile Edge Computing (MEC) is a promising technique in the 5G era for improving the Quality of Experience (QoE) of online video streaming, because caching selected content at the edge reduces backhaul transmission. However, fully supporting the low-latency demand of live video streaming still requires solving the joint user association and video quality selection problem under the limited resources of MEC. We show that this optimization problem is a non-linear integer program, for which a globally optimal solution cannot be obtained in polynomial time. In this paper, we formulate the problem and derive a closed-form solution expressed in terms of Lagrange multipliers. The search for the optimal multipliers is formulated as a Multi-Armed Bandit (MAB) problem, and we propose a Deep Deterministic Policy Gradient (DDPG) based algorithm that exploits the supply-demand interpretation of the Lagrange dual problem. Simulation results show that the proposed approach achieves significant QoE improvement over baseline methods, especially when wireless resources are scarce and the number of users is large.
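To make the supply-demand interpretation of a Lagrange dual concrete, the following minimal sketch treats the multiplier of a single shared bandwidth constraint as a price: each user picks the quality level that maximizes utility minus price times bitrate, and the price rises when total demand exceeds capacity and falls otherwise. This is a generic dual subgradient update on a toy problem, not the DDPG-based algorithm of this paper; the bitrates, the logarithmic utility, the step size, and all function names are assumptions introduced only for illustration.

```python
import math

# Hypothetical bitrates (Mbps) of three video quality levels.
BITRATES = [1.0, 2.5, 5.0]


def utility(weight, bitrate):
    """Assumed concave QoE model: weighted log of the bitrate."""
    return weight * math.log(1.0 + bitrate)


def best_quality(weight, price):
    """User's response to a price: maximize utility minus bandwidth cost."""
    return max(
        range(len(BITRATES)),
        key=lambda q: utility(weight, BITRATES[q]) - price * BITRATES[q],
    )


def price_search(weights, capacity, step=0.05, iters=200):
    """Projected subgradient ascent on the dual variable (the price)."""
    price = 0.0
    choices = []
    for _ in range(iters):
        # Demand side: every user reacts to the current price.
        choices = [best_quality(w, price) for w in weights]
        demand = sum(BITRATES[q] for q in choices)
        # Supply side: raise the price if demand exceeds capacity,
        # lower it otherwise, and keep it non-negative.
        price = max(0.0, price + step * (demand - capacity))
    return price, choices


if __name__ == "__main__":
    print(price_search([1.0, 1.0, 1.0], capacity=100.0))
    print(price_search([1.0, 1.0, 1.0], capacity=6.0))
```

With ample capacity the price stays at zero and every user selects the highest quality; with scarce capacity the price becomes positive and pushes users to lower qualities. Because the quality choices are discrete, this simple update may oscillate around the market-clearing price rather than converge.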