Selected publications

Conference paper: Multi-modal fusion transformer for visual question answering in remote sensing (2022)