TY - JOUR
T1 - Toward explainable and advisable model for self-driving cars
AU - Kim, Jinkyu
AU - Rohrbach, Anna
AU - Akata, Zeynep
AU - Moon, Suhong
AU - Chen, Yi-Ting
AU - Darrell, Trevor
AU - Canny, John
PY - 2021/12
AB - Humans learn to drive through both practice and theory, for example, by studying the rules, while most self-driving systems are limited to the former. Being able to incorporate human knowledge of typical causal driving behavior should benefit autonomous systems. We propose a new approach that learns vehicle control with the help of human advice. Specifically, our system learns to summarize its visual observations in natural language, predict an appropriate action response (e.g., “I see a pedestrian crossing, so I stop”), and predict the controls accordingly. Moreover, to enhance the interpretability of our system, we introduce a fine-grained attention mechanism that relies on semantic segmentation and object-centric RoI pooling. We show that our approach of training the autonomous system with human advice, grounded in a rich semantic representation, matches or outperforms prior work in terms of control prediction and explanation generation. Our approach also results in more interpretable visual explanations by visualizing object-centric attention maps. We evaluate our approach on a novel driving dataset with ground-truth human explanations, the Berkeley DeepDrive eXplanation (BDD-X) dataset.
KW - advisable AI
KW - eXplainable AI
KW - self-driving vehicles
DO - 10.1002/ail2.56
M3 - Letter
VL - 2
SP - 1
EP - 13
JO - Applied AI Letters
JF - Applied AI Letters
SN - 2689-5595
IS - 4
ER -