It is generally impossible to store all weights into an MLP accelerator because of limited on-chip SRAM capacity. However, the performance can still be improved if a portion of weights are allocated in faster SRAM. In this paper, we first present an analytical method for performance evaluation under different weight allocation approaches. We then propose an ILP-based on-chip weight allocation strategy that can maximize the overall performance. Experiment results show that the proposed strategy constantly outperforms several trivial heuristic methods over a large set of various MLP models, MLP accelerator configurations, and on-chip SRAM capacities.