+++
title = "Vision-Language Models"
description = "Advances in models that combine vision and language understanding."
+++
- Vision-Language Pretraining
- Multimodal Reasoning
Relevant Papers:
- VLTP: Vision-language guided token pruning for task-oriented segmentation
- TaskCLIP: Extend large vision-language model for task oriented object detection
Key research and applications in vision-language AI.