visionMicrosoft

Phi-4 Multimodal

Microsoft's compact multimodal model handling text, image and audio understanding.

microsoft/phi-4-multimodal-instruct

Test it live

Live playgroundPhi-4 Multimodalin-browser demo

Test Phi-4 Multimodal live

Send a message and watch the model respond. Multi-turn — it remembers the conversation.