This article studies an infinite-capacity multi-server queueing system in which the servers are unreliable and may fail at any time. To conserve energy while delivering reliable service, a controllable repair policy is introduced: failed servers are sent to the repair facility only when the number of failed servers in the system reaches a preset threshold. The system is modeled as a quasi-birth-and-death process, and its stability condition is examined. The rate matrix is computed approximately, and the steady-state distribution is obtained by a matrix-analytic approach. Closed-form expressions for important system characteristics are presented. A cost model is constructed to determine the optimal repair policy, the optimal service rate, and the optimal repair rate. Three heuristic algorithms are employed to solve the optimization problem. Numerical results are provided to compare the efficiency of two methods.
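
As a rough illustration of the matrix-analytic step mentioned above, the sketch below approximates the rate matrix of a generic level-independent quasi-birth-and-death process by successive substitution and checks the standard mean-drift stability condition. It is not the authors' procedure: the block names `A0` (one level up), `A1` (within a level) and `A2` (one level down), the function names, and the single-server test case are all assumptions made for the example.

```python
import numpy as np


def qbd_rate_matrix(A0, A1, A2, tol=1e-12, max_iter=100_000):
    """Approximate the minimal nonnegative solution R of
    A0 + R A1 + R^2 A2 = 0 by successive substitution, starting from R = 0.

    Assumed (hypothetical) notation: A0 holds transitions one level up,
    A1 transitions within a level, A2 transitions one level down.
    """
    A0 = np.asarray(A0, dtype=float)
    A1 = np.asarray(A1, dtype=float)
    A2 = np.asarray(A2, dtype=float)
    A1_inv = np.linalg.inv(A1)
    R = np.zeros_like(A0)
    for _ in range(max_iter):
        # Fixed-point form of the quadratic equation: R = -(A0 + R^2 A2) A1^{-1}
        R_next = -(A0 + R @ R @ A2) @ A1_inv
        if np.max(np.abs(R_next - R)) < tol:
            return R_next
        R = R_next
    raise RuntimeError("rate matrix iteration did not converge")


def is_stable(A0, A1, A2):
    """Mean-drift condition: pi A0 e < pi A2 e, where pi is the stationary
    vector of the generator A = A0 + A1 + A2 (assumed irreducible)."""
    A0 = np.asarray(A0, dtype=float)
    A1 = np.asarray(A1, dtype=float)
    A2 = np.asarray(A2, dtype=float)
    A = A0 + A1 + A2
    n = A.shape[0]
    # Solve pi A = 0 together with the normalization pi e = 1 (least squares).
    M = np.vstack([A.T, np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(M, b, rcond=None)
    return bool(pi @ A0.sum(axis=1) < pi @ A2.sum(axis=1))


# Sanity check on the simplest case: an M/M/1 queue written as a QBD with
# 1x1 blocks (arrival rate 1, service rate 2), where R equals the traffic intensity.
lam, mu = 1.0, 2.0
A0 = np.array([[lam]])
A1 = np.array([[-(lam + mu)]])
A2 = np.array([[mu]])
R = qbd_rate_matrix(A0, A1, A2)
stable = is_stable(A0, A1, A2)
```

Successive substitution converges only linearly; it is used here because it is short and easy to verify, and faster schemes such as logarithmic reduction could be substituted without changing the interface.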