qwen3-vl-235b-a22b

Input: $75/M
Output: $300/M
Context: 2M
Max output: 30K
qwen3-vl-235b-a22b is a multimodal model that combines strong text generation with visual understanding of images and videos. Its Instruct variant is tuned for instruction following on general multimodal tasks. It excels at recognizing real-world and synthetic categories, 2D/3D spatial grounding, and long-form visual comprehension, and it achieves competitive results on multimodal benchmarks.
New
Commercial use