Issue #30: Vision model added to Phi-3 family
Microsoft introduces Phi-3-vision at Microsoft Build
Welcome to Issue #30 of One Minute AI, your daily AI news companion. This issue discusses the new Phi-3-vision model announced by Microsoft.
Microsoft adds a vision model to the Phi-3 family
At this year's Microsoft Build, Microsoft unveiled Phi-3-vision, the first multimodal model in the Phi-3 series. The model accepts both text and images as input, enabling it to analyze real-world images and to extract and interpret text within them. With 4.2 billion parameters and a context length of 128,000 tokens, it is designed for broad commercial and research use in English.
The Phi-3-vision model supports general-purpose AI systems and applications that require both visual and text input, particularly:

- memory- and compute-constrained environments
- latency-sensitive scenarios
- general image understanding
- optical character recognition (OCR)
- understanding of charts and tables
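For readers who want to try the model, a minimal sketch of the chat-prompt format Phi-3-vision expects is below. The template (numbered `<|image_k|>` placeholders inside a `<|user|>` turn) follows the publicly documented model card and is an assumption on our part; the Build announcement itself does not specify it.

```python
# Sketch of the Phi-3-vision chat-prompt format (assumed from the
# public model card, not stated in the announcement). Each attached
# image is referenced by a numbered <|image_k|> placeholder.

def build_prompt(question: str, n_images: int = 1) -> str:
    """Return a Phi-3-vision-style prompt referencing n_images images."""
    image_tags = "".join(f"<|image_{k}|>\n" for k in range(1, n_images + 1))
    return f"<|user|>\n{image_tags}{question}<|end|>\n<|assistant|>\n"

print(build_prompt("What does this chart show?"))
```

In practice, a prompt like this is paired with the actual image(s) and passed to the model, e.g. via Hugging Face `transformers` using `AutoProcessor` and `AutoModelForCausalLM` with `trust_remote_code=True` (again an assumption about tooling, not part of the announcement).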
Want to help?
If you liked this issue, help spread the word and share One Minute AI with your peers and community.
You can also share feedback with us, or AI news you’d like to see featured, by joining our chat on Substack.