Tag
multimodal
8 verified claims carry this tag. Each claim lists its primary sources (most have two or more) and an HMAC-SHA256 signature.
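The 16-hex-character IDs below suggest a truncated HMAC-SHA256 digest over each claim's text. A minimal sketch of how such an ID could be derived, assuming a keyed HMAC over the UTF-8 claim text truncated to 64 bits — the key, serialization, and truncation length are illustrative assumptions, not the database's actual scheme:

```python
import hmac
import hashlib

# Placeholder signing key for illustration only; the real key is not known.
SECRET_KEY = b"example-signing-key"

def claim_id(claim_text: str) -> str:
    """Compute a truncated HMAC-SHA256 signature over the claim text.

    Truncating the hex digest to 16 characters (64 bits) matches the
    length of the IDs shown in this listing; this is an assumption.
    """
    digest = hmac.new(SECRET_KEY, claim_text.encode("utf-8"),
                      hashlib.sha256).hexdigest()
    return digest[:16]

# Deterministic: the same claim text always yields the same ID.
print(claim_id("GPT-4o released on: 2024-05-13."))
```

Because HMAC is keyed, only the holder of the signing key can produce valid IDs, which lets readers verify that a claim record was issued by the database rather than forged.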
CLIP (Contrastive Language-Image Pretraining) introduced in paper: Learning Transferable Visual Models From Natural Language Supervision (Radford et al., 2021).
85a3ca745eaf4ee0 · 2 sources · 100% confidence
GPT-4o released on: 2024-05-13.
bd065b91ca6e880b · 1 source · 100% confidence
Llama 3.2 (multimodal release including 11B and 90B vision models) released on: 2024-09-25.
e27816c692a28ce9 · 2 sources · 100% confidence
CLIP introduced in paper: Learning Transferable Visual Models From Natural Language Supervision (Radford et al., 2021).
bcdef949cc6d3644 · 2 sources · 100% confidence
DALL·E 2 released on: 2022-04-06.
0b0e64476bd25bd6 · 2 sources · 100% confidence
Flamingo introduced in: Alayrac et al. 2022 — DeepMind few-shot vision-language model.
72ea74efc723bd06 · 2 sources · 100% confidence
Llama 4 released on: 2025-04-05 by Meta — Scout and Maverick released; Behemoth previewed as still in training.
d5ce871dc69e7b04 · 2 sources · 100% confidence
GPT-4 with vision (GPT-4V) announced on: 2023-09-25 by OpenAI.
e15378ee47c08761 · 2 sources · 100% confidence