🤖 Apple's first AI model revealed

Also: Google's Gemini AI inside iPhones?

Welcome, AI enthusiasts

Apple has just unveiled MM1, its family of multimodal AI models, showcasing capabilities that stand toe-to-toe with leading models like GPT-4V and Google Gemini. Apple is also in negotiations to integrate Google’s Gemini AI engine into the iPhone. Adding to the excitement, Elon Musk's xAI has open-sourced its AI chatbot Grok. Let's dive in!

In today’s insights:

  • Apple reveals MM1, its first family of multimodal LLMs

  • Apple is in talks to build Google’s Gemini AI engine into the iPhone

  • xAI open sources Grok

Read time: 5 minutes

🗞️ LATEST DEVELOPMENTS

Source: the-decoder

Evolving AI: Apple quietly published a new paper unveiling MM1, a new family of multimodal AI models that can compete with GPT-4V and Google Gemini.

Key Points:

  • Apple engineers published a research paper about Multimodal Large Language Models (MLLMs).

  • The paper outlines how they built a family of MLLMs of up to 30B parameters called MM1.

  • MM1 displays impressive image captioning, visual question answering, and natural language inference.

Details:

Even though Apple hasn't officially shipped an AI model in its products yet, a new research paper sheds light on the company's progress in multimodal AI. The paper introduces MM1, a model that stands out for its ability to process and understand both images and text. This places it in the same league as, or perhaps even ahead of, notable models like GPT-4V and Google Gemini, especially in tasks that involve visual information. What's truly impressive is that MM1 achieves this despite being smaller in size. It's particularly adept at making sense of images, answering questions based on what it 'sees,' and tackling problems that require analyzing multiple images. A big part of its success comes from a component known as a visual encoder, which, together with high-quality training data, plays a crucial role in boosting the model's performance. Moreover, MM1 was trained on a mix of different types of data, which turns out to be a game-changer. This approach is especially effective when the examples to learn from are scarce, highlighting the value of diverse data in improving the model's ability to learn.

Why This Matters:

It’s interesting to see how open Apple has been in sharing its research with the broader AI community. Apple, a company famous for its secrecy, published a paper with a staggering amount of detail on its multimodal foundation model. The researchers state that “in this paper, we document the MLLM building process and attempt to formulate design lessons, that we hope are of use to the community.” Exactly how MM1 models will be implemented in Apple’s products remains to be seen. The published examples of MM1’s capabilities hint at Siri becoming a lot smarter when she eventually learns to see.

Source: REUTERS/Aly Song/File Photo

Evolving AI: Apple is in talks to build Google's Gemini AI engine into the iPhone.

Key Points:

  • Apple wants to use Google's Gemini AI for new iPhone features.

  • Apple also explored potential collaboration with OpenAI.

  • The collaboration aims to enhance iPhone capabilities with generative AI for creative tasks.

Details:

Tech giants Apple and Google are possibly on the verge of joining forces. Apple, in search of a formidable ally in AI, has shown interest in Google's Gemini. Such a partnership could significantly enhance iOS 18 on the iPhone by adding generative AI features that let users generate images and compose essays. However, the precise details of this collaboration, including the terms and how it will be branded, remain undecided. An official announcement might be made at Apple's Worldwide Developers Conference. As the competition in AI heats up, Apple's choice of partner could very well shape the direction of future technological developments.

Why This Matters:

This potential collaboration between Apple and Google may signal a new era of AI integration in consumer technology, blending Apple's hardware excellence with Google's AI innovation. Beyond technical synergies, it raises questions about market dynamics, privacy, and the future of AI-powered interfaces. Will this partnership lead to a renaissance in smartphone capabilities, or will it spark further regulatory concerns? Only time will tell.

Source: Jaap Arriens/NurPhoto/Getty Images

Evolving AI: On March 11th, Elon Musk said xAI would open source its AI chatbot Grok, and now an open release is available on GitHub.

Key Points:

  • Musk's xAI released Grok's model weights and architecture.

  • Grok, a 314-billion-parameter model, surpasses its open-source rivals.

  • The open sourcing aligns with Musk's broader business and ideological strategies.

Details:

True to his word, billionaire multi-company leader Elon Musk’s startup xAI today made its first large language model (LLM) Grok open source. A company blog post explains that this open release includes the “base model weights and network architecture” of the “314 billion parameter Mixture-of-Experts model, Grok-1.” The post adds that the model is from a checkpoint last October and hasn’t undergone fine-tuning “for any specific application, such as dialogue.” Grok was open sourced under the Apache License 2.0, which permits commercial use, modification, and distribution, though it grants no trademark rights and comes with no warranty or liability. In addition, anyone redistributing it must reproduce the original license and copyright notice and state the changes they’ve made.
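For a sense of scale, here is a rough back-of-the-envelope sketch of how much storage the raw weights of a model this size would need. Only the 314-billion parameter count (and MM1's 30 billion) comes from the stories above. The precisions listed are illustrative assumptions, not the format xAI actually shipped, and the helper name `weight_size_gb` is made up for this example.

```python
# Back-of-the-envelope estimate of weight storage for a given parameter count.
# Assumption: every parameter is stored at the same precision, and 1 GB = 1e9 bytes.

# Illustrative precisions only; the source does not say which one Grok-1 uses.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "int8": 1}

# Parameter counts quoted in this newsletter.
MODELS = {
    "Grok-1": 314_000_000_000,
    "MM1 (largest)": 30_000_000_000,
}


def weight_size_gb(num_params: int, dtype: str) -> float:
    """Approximate size of the weights alone, in gigabytes (hypothetical helper)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    for name, num_params in MODELS.items():
        for dtype in BYTES_PER_PARAM:
            size = weight_size_gb(num_params, dtype)
            print(f"{name} at {dtype}: about {size:,.0f} GB")
```

At 2 bytes per parameter, 314 billion parameters works out to roughly 628 GB for the weights alone, before counting any memory needed to actually run the model. That is one reason an open release of this size is far from a download-and-run experience on consumer hardware.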

Why This Matters:

The open sourcing of Grok is also clearly a helpful ideological stance for Musk in his lawsuit against, and general criticism of, OpenAI. He sued the company recently, accusing his former company of abandoning its “founding agreement” to operate as a non-profit. OpenAI has defended itself, at least in the court of public opinion, by releasing emails indicating Musk was aware of, and possibly supportive of, its move toward proprietary, for-profit technology.

💡 Tip of the Day

It’s not a tip, but rather a request from our side today. We'd like to ask you a few questions to understand why you're interested in AI and what you hope it will bring you. The process is quick and easy, and your feedback will shape our next steps. Thank you for sharing your responses!

🎯 SNAPSHOTS

Direct links to relevant AI articles.

🤖 Mercedes: Mercedes is trialing humanoid robots for ‘low skill, repetitive’ tasks.

🥼 Medical Technology: Doctors are turning medical generative AI into a booming business.

📈 Trending AI Tools

  • 🤖 DoNotPay - uses AI to help you fight big corporations, protect your privacy, find hidden money, and beat bureaucracy (link)

  • 📜 Policies by AI - generates a privacy policy and terms of service for your website (link)

  • 🎥 Kapwing - video editing through the power of artificial intelligence (link)

  • 🎧 Listnr - AI audio tool encompassing text-to-speech, speech-to-text, and voice cloning features (link)

  • 🔊 LALAL.AI - AI audio tool that excels at stem splitting, letting users extract individual components from audio or video (link)

  • 🎨 StarryAI - a prominent AI image generation tool (link)

  • 🖼️ Topaz Labs AI - AI-powered image and video enhancement tools (link)
