As fabrication process exploits even deeper submicron technology, global interconnect delay is becoming one of the most critical performance obstacles in system-on-chip (SoC) designs nowadays. Recent years latency-insensitive system (LIS), which enables multicycle communication to tolerate variant interconnect delay without substantially modifying pre-designed IP cores, has been proposed to conquer this issue. However, imbalanced interconnect latency and communication back-pressure residing in an LIS still degrade system throughput. In this paper, we present a throughput optimization technique with minimal queue insertion. We first model a given LIS as a quantitative graph (QG), which can be further compacted using the proposed techniques, so that much bigger problems can be handled. On top of QG, the optimal solution with minimal queue size can be achieved through integer linear programming based on the proposed constraint formulation in an acceptable runtime. The experimental results show that our approach can deal with moderately large systems in a reasonable runtime and save about 28% of queues compared to the prior art.