Air pollution has become a global concern, and scientists are working to mitigate it. Experimental studies on air pollution prediction have reported good results, but few have integrated weather forecast information with the drift properties of air pollution. In this work, we propose a novel wind-sensitive attention mechanism combined with a long short-term memory (LSTM) neural network model to predict PM2.5 concentrations, taking into account how wind direction and speed influence spatial–temporal changes in PM2.5 concentrations in neighbouring areas. An LSTM neural network first makes preliminary PM2.5 predictions from neighbouring pollution, and the attention mechanism is applied to these predictions. We then apply an ensemble learning method based on eXtreme Gradient Boosting (XGBoost) to combine the preliminary predictions with weather forecasts and produce second-phase PM2.5 predictions. The experiment is conducted using PM2.5 data and weather forecast data. Our results show that the proposed method outperforms other methods in predicting PM2.5 concentrations, including the multi-layer perceptron, support vector regression, a plain LSTM neural network, and the extreme gradient boosting algorithm.
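
To make the idea of wind-sensitive attention concrete, the following is a minimal sketch and not the formulation used in this work, which the abstract does not specify. It assumes that each neighbouring station is scored by its wind speed multiplied by the cosine of the angle between the wind's direction of travel and the bearing from that station to the target, and that the scores are normalised with a softmax. The function names `wind_attention_weights` and `attend`, and all input values, are hypothetical.

```python
import numpy as np


def wind_attention_weights(bearings_deg, wind_dir_deg, wind_speed):
    """Softmax attention weights over neighbouring stations (assumed form).

    bearings_deg: direction of travel from each neighbour to the target (degrees).
    wind_dir_deg: direction the wind is blowing towards at each neighbour (degrees).
    wind_speed:   wind speed at each neighbour (any consistent unit).
    """
    bearings = np.radians(np.asarray(bearings_deg, dtype=float))
    wind_dir = np.radians(np.asarray(wind_dir_deg, dtype=float))
    speed = np.asarray(wind_speed, dtype=float)

    # Alignment is +1 when the wind blows straight at the target
    # and -1 when it blows straight away from it.
    scores = speed * np.cos(wind_dir - bearings)

    # Subtract the maximum before exponentiating for numerical stability.
    exp_scores = np.exp(scores - scores.max())
    return exp_scores / exp_scores.sum()


def attend(neighbour_pm25, weights):
    """Attention-weighted PM2.5 value from the neighbouring stations."""
    return float(np.dot(weights, np.asarray(neighbour_pm25, dtype=float)))


# Illustrative inputs only: three neighbours, one of them upwind of the target.
weights = wind_attention_weights(
    bearings_deg=[90.0, 180.0, 270.0],
    wind_dir_deg=[90.0, 0.0, 90.0],
    wind_speed=[3.0, 3.0, 3.0],
)
context = attend([40.0, 10.0, 10.0], weights)
```

Under these assumptions, the station whose wind carries pollution towards the target receives most of the weight, so `context` is dominated by its reading. In a two-phase pipeline such as the one described above, a value like this would be one input among several, alongside the LSTM outputs and the weather forecast features.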