Content

Qwen-VL is the visual multimodal version of Qwen (short for Tongyi Qianwen), the large model series proposed by Alibaba Cloud. Qwen-VL accepts images, text, and bounding boxes as inputs, and outputs text and bounding boxes.
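
Because bounding boxes appear in the output alongside ordinary text, a caller usually has to pull them out of the generated string. The sketch below shows one way that could look. It rests on assumptions not stated above: that a box is written as `<ref>label</ref><box>(x1,y1),(x2,y2)</box>` and that coordinates are normalized to a 0 to 1000 grid. The names `Box`, `GRID`, and `parse_boxes` are hypothetical helpers for illustration, not part of any Qwen-VL API, and the markup should be checked against the model's own documentation.

```python
import re
from typing import List, NamedTuple


class Box(NamedTuple):
    """One labelled region, in pixel coordinates of the source image."""
    label: str
    x1: float
    y1: float
    x2: float
    y2: float


# Assumed markup: <ref>label</ref><box>(x1,y1),(x2,y2)</box>
BOX_PATTERN = re.compile(
    r"<ref>(.*?)</ref>\s*<box>\((\d+),(\d+)\),\((\d+),(\d+)\)</box>"
)

# Assumed normalization: coordinates are expressed on a 0..1000 grid.
GRID = 1000


def parse_boxes(text: str, width: int, height: int) -> List[Box]:
    """Extract labelled boxes from model output and rescale them to pixels."""
    boxes = []
    for match in BOX_PATTERN.finditer(text):
        label = match.group(1).strip()
        x1, y1, x2, y2 = (int(value) for value in match.groups()[1:])
        boxes.append(
            Box(
                label,
                x1 * width / GRID,
                y1 * height / GRID,
                x2 * width / GRID,
                y2 * height / GRID,
            )
        )
    return boxes
```

Under those assumptions, `parse_boxes(reply, image_width, image_height)` returns one `Box` per tagged region and an empty list when the reply contains no box markup, so plain-text answers pass through without special handling.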

Link

https://fal.ai/models

Summary

Alibaba Cloud has introduced Qwen-VL, a visual multimodal version of its Qwen large model series. Qwen-VL takes images, text, and bounding boxes as inputs and produces text and bounding boxes as outputs.