Our evaluation server is hosted on Dynabench. You must register before you can submit prediction files or models for evaluation. Follow this link to sign up.
Prediction files must be in JSONL format, with one JSON object per line following the structure below:
{
"uid" : question id,
"answer" : str,
}
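A minimal sketch of writing such a file in Python is given here for illustration. The function name `write_predictions`, the output file name `predictions.jsonl`, and the example question ids and answers are all hypothetical. The sketch also assumes question ids are integers; use whatever id type the dataset files actually provide.

```python
import json


def write_predictions(predictions, path):
    """Write (uid, answer) pairs to a JSONL file, one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for uid, answer in predictions:
            # "answer" must be a string, so cast defensively.
            record = {"uid": uid, "answer": str(answer)}
            f.write(json.dumps(record) + "\n")


# Hypothetical question ids and answers, for illustration only.
example_predictions = [(1, "yes"), (2, "2")]
write_predictions(example_predictions, "predictions.jsonl")
```

Each line of the resulting file is a standalone JSON object, so the file can be checked by parsing it line by line with `json.loads` before uploading.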
You can also download sample predictions from the majority baseline model at the following links: AVQA val & test, AdVQA val & test.
Coming soon...