Conventional recommendation systems infer user preferences from users' interaction histories with items. Conversational recommendation systems (CRSs) have been proposed to improve on this by combining recommendations with conversations, eliciting preferences directly from users. However, existing CRS models still have shortcomings, such as failing to use dialogue records for recommendation. In this research, a method was proposed for multi-round conversational recommendation with heterogeneous questions. The model comprised a hierarchical reinforcement learning framework and introduced methods for effectively incorporating user feedback into online recommender updates. Experiments were conducted on several datasets to verify the model's effectiveness; the model outperformed the current state-of-the-art method. Finally, the proposed scenario was more realistic than other conversational recommendation scenarios and provided richer explanations.