Evaluation Server


We host our evaluation server on Dynabench. You must register before you can submit prediction files or models for evaluation. Follow this link to sign up.

Prediction files must be in JSON Lines (.jsonl) format, with one JSON object per line in the following structure:

{
"uid" : question id,
"answer" : str
}
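
As an illustration of the format above, here is a minimal Python sketch that serializes predictions to JSON Lines. The helper names (`to_jsonl_lines`, `write_predictions`), the example uids, and the output file name are hypothetical and not part of the official tooling; whether `uid` should be an integer or a string is also an assumption, so match whatever type the dataset's question ids use.

```python
import json


def to_jsonl_lines(predictions):
    """Turn (uid, answer) pairs into a list of JSON strings, one per prediction.

    `predictions` is any iterable of (uid, answer) pairs. The answer is cast
    to str because the format above asks for a string answer.
    """
    return [json.dumps({"uid": uid, "answer": str(answer)})
            for uid, answer in predictions]


def write_predictions(predictions, path):
    """Write one JSON object per line to `path` (hypothetical file name)."""
    with open(path, "w", encoding="utf-8") as f:
        for line in to_jsonl_lines(predictions):
            f.write(line + "\n")


if __name__ == "__main__":
    # Made-up uids and answers, for illustration only.
    example = [(1, "yes"), (2, "2")]
    write_predictions(example, "predictions.jsonl")
```

Each output line is a complete JSON object, so the file can be checked by parsing every line with `json.loads` before uploading.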

You can also download sample predictions produced by the majority baseline model from the following links: AVQA val & test, AdVQA val & test.

Coming soon...


Evaluation Code


We follow the same evaluation protocol as VQA. The official evaluation code is available here.
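
For readers unfamiliar with the VQA protocol, the per-question accuracy can be sketched in Python as follows. This is a simplified illustration, not the official code: it assumes the predicted and human answers are already normalized strings, whereas the official evaluation also normalizes punctuation, articles, and number words before comparing. The function name `vqa_accuracy` is hypothetical.

```python
def vqa_accuracy(predicted, human_answers):
    """VQA-style accuracy for one question.

    For each human annotator in turn, leave that annotator out, count how
    many of the remaining answers match the prediction, and score
    min(1, matches / 3). The final accuracy is the mean of these scores.
    Assumes answers are already normalized (an assumption of this sketch).
    """
    if not human_answers:
        return 0.0
    scores = []
    for i in range(len(human_answers)):
        others = human_answers[:i] + human_answers[i + 1:]
        matches = sum(1 for answer in others if answer == predicted)
        scores.append(min(1.0, matches / 3))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Made-up annotations: 2 of 10 annotators answered "yes".
    humans = ["yes"] * 2 + ["no"] * 8
    print(vqa_accuracy("yes", humans))
    print(vqa_accuracy("no", humans))
```

Under this scheme an answer given by at least four of the ten annotators always scores 1, while rarer answers receive partial credit. The overall score is the mean of the per-question accuracies.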