Mistral AI adds Flux image generation and web search to Le Chat, launches Pixtral large

2 months ago 2
ARTICLE AD BOX

Mistral AI adds web search and image generation to its Le Chat AI assistant, while introducing a new visual model that performs well on industry benchmarks.

Le Chat users can now access current web content through integrated web search and create images using Black Forest Labs' Flux Pro model. In addition, the assistant processes documents and images using Mistral's new Pixtral Large model.

The company also added a canvas interface that allows users to edit generated content directly in the chat window. Users can write documents, create presentations, and edit code without generating new responses.

With the integration of Pixtral Large, Le Chat can now analyze complex PDF documents, including graphics, tables, diagrams, and formulas. These new features are initially being rolled out as a free beta on the startup's "Le Chat" platform.

Ad

THE DECODER Newsletter

The most important AI news straight to your inbox.

✓ Weekly

✓ Free

✓ Cancel at any time

Pixtral Large shows competitive performance in visual tasks

The new Pixtral Large model, built on Mistral Large 2, shows good results in visual benchmarks. It scored 69.4 percent on MathVista, a test of mathematical reasoning with visual data, outperforming both GPT-4o and Gemini 1.5 Pro, according to the company.

Mistral says Pixtral Large also outperforms Claude 3.5 Sonnet, Gemini 1.5 Pro, and GPT-4o in analyzing diagrams and documents (ChartQA and DocVQA) and in real-world use cases (MM-MT-Bench).

 Pixtral Large leads in DocVQA and AI2D, and performs competitively against Gemini-1.5 Pro and GPT-4o in all benchmarks.Pixtral Large performs particularly well in document analysis (DocVQA: 93.3%). Mathematical problem-solving (Mathvista: 69.4 %) is also ahead of top models from much larger companies, such as Google's Gemini-1.5 Pro. | Image: Mistral

The model combines a 123 billion parameter multimodal decoder with a one billion parameter vision encoder. It can process up to 30 high-resolution images at once with a 128K context window.

In addition to Le Chat, Mistral AI offers Pixtral Large under two licenses on Hugging Face: a research license for academic use and a commercial license for business applications.

The company is also updating its Mistral Large language model with improved long-context understanding and more precise function calling. The updated model is available through Mistral's API and will soon come to Google Cloud and Microsoft Azure.

Recommendation

Read Entire Article
LEFT SIDEBAR AD

Hidden in mobile, Best for skyscrapers.