JMMMU Leaderboard
Homepage | Dataset | HF Leaderboard | arXiv (coming soon) | GitHub
"Which LMM is expert in Japanese subjects?" Welcome to the leaderboard of JMMMU
We introduce JMMMU (Japanese MMMU), a multimodal benchmark that can truly evaluate LMM performance in Japanese.
JMMMU consists of 720 translation-based (Culture Agnostic) questions and 600 brand-new (Culture Specific) questions, for a total of 1,320 questions, expanding the size of the existing culture-aware Japanese benchmark by more than 10x.
Submit on JMMMU Benchmark
Introduction
We do not recommend including results obtained through extensive prompt engineering, since it is important to prevent performance hacking and to better reflect real-world use cases. For more details, please refer to the lmms-eval code base and the upcoming paper.
- Obtain the result JSON file from the lmms-eval code base.
- If you want to update an existing model's performance by uploading new results, please ensure that 'Revision Model Name' matches the name shown in the leaderboard. For example, to modify LLaVA-OV 7B's performance, fill in 'LLaVA-OV 7B' in 'Revision Model Name'.
- Please provide the correct link of your model's repository for each submission.
- After clicking 'Submit Eval', you can click 'Refresh' to obtain the latest result in the leaderboard.
Note: An example of a submitted JSON file is available at this URL: result.json.
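Before uploading, it can help to confirm locally that the result file parses as JSON. The following is a minimal sketch, not part of the leaderboard or lmms-eval: the helper name `load_result_file` is hypothetical, and the check that the top level is a JSON object is an assumption about the file layout rather than a documented requirement.

```python
import json
from pathlib import Path


def load_result_file(path):
    """Parse a result JSON file, raising ValueError with a readable message on failure.

    Hypothetical helper for a local sanity check; the leaderboard may apply
    different or stricter checks on upload.
    """
    file_path = Path(path)
    if not file_path.is_file():
        raise ValueError(f"No such file: {file_path}")
    try:
        with file_path.open(encoding="utf-8") as handle:
            data = json.load(handle)
    except json.JSONDecodeError as err:
        raise ValueError(f"{file_path} is not valid JSON: {err}") from err
    # Assumption: the result file holds a JSON object at the top level.
    if not isinstance(data, dict):
        raise ValueError(f"{file_path} should contain a JSON object at the top level")
    return data
```

Calling `load_result_file("result.json")` returns the parsed content, so its top-level keys can be compared by eye against the example file linked above.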
Submit Example
If you want to upload LLaVA-OV 7B's result to the leaderboard, you need to:
- Select LMM in 'Model Type'.
- Fill in 'LLaVA-OV 7B' in 'Model Name' if this is your first submission for the model (you can leave 'Revision Model Name' blank).
- Fill in 'LLaVA-OV 7B' in 'Revision Model Name' if you want to update an existing result (you can leave 'Model Name' blank).
- Fill in 'https://huggingface.co/lmms-lab/llava-onevision-qwen2-7b-ov' in 'Model Link'.
- Fill in '7B' in 'Model size'.
- Upload results.json.
- Click the 'Submit Eval' button.
- Click 'Refresh' to see the updated leaderboard.
To check whether the submission is successful, you can click the 'Logs' button. If the message 'Success! Your submission has been added!' appears, the submission is successful.
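The form fields in the example above can be summarized as a small pre-submission checklist. This is a minimal sketch only: the `Submission` class and `check_submission` function are hypothetical and do not reflect how the leaderboard validates input. The field names mirror the form labels, and the rule that at least one of 'Model Name' or 'Revision Model Name' is filled in is inferred from the steps above.

```python
from dataclasses import dataclass


@dataclass
class Submission:
    # Hypothetical container; field names mirror the form labels.
    model_type: str
    model_link: str
    model_size: str
    model_name: str = ""           # used for a first-time submission
    revision_model_name: str = ""  # used when updating an existing entry


def check_submission(sub):
    """Return a list of problems; an empty list means nothing obvious is missing."""
    problems = []
    # Assumption: at least one of the two name fields must be filled in.
    if not sub.model_name.strip() and not sub.revision_model_name.strip():
        problems.append("Fill in 'Model Name' or 'Revision Model Name'.")
    if not sub.model_type.strip():
        problems.append("Select a 'Model Type'.")
    if not sub.model_link.strip():
        problems.append("Provide the 'Model Link' of the model's repository.")
    if not sub.model_size.strip():
        problems.append("Fill in 'Model size'.")
    return problems


first_time = Submission(
    model_type="LMM",
    model_link="https://huggingface.co/lmms-lab/llava-onevision-qwen2-7b-ov",
    model_size="7B",
    model_name="LLaVA-OV 7B",
)
print(check_submission(first_time))
```

For an update, the same values would be passed with `revision_model_name="LLaVA-OV 7B"` in place of `model_name`, and that name has to match the entry shown in the leaderboard.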