🌟 OpenAI’s new AI model ‘Strawberry’
Also: Microsoft's 'too dangerous' AI voice generator
Welcome, AI enthusiasts
OpenAI is working on a reasoning technology now called 'Strawberry,' which was previously known as Q*. Microsoft has developed an AI voice generator that it deems too dangerous to release, and Meta is set to release its largest Llama 3 model, weights included, on July 23. Let’s dive in!
In today’s insights:
OpenAI’s new AI model ‘Strawberry’
Microsoft has developed an AI voice generator so realistic that it’s deemed too dangerous to release
Meta to release largest Llama 3 model on July 23
Read time: 4 minutes
🗞️ LATEST DEVELOPMENTS
Evolving AI: OpenAI is reportedly working on a new AI model code-named 'Strawberry'
Key Points:
OpenAI's Strawberry aims to improve AI's reasoning capabilities.
The project is closely guarded within OpenAI; its approach reportedly resembles Stanford's STaR method.
Strawberry could help AI perform complex, long-term tasks autonomously.
Details:
According to a Reuters report, OpenAI is working on a project called "Strawberry", previously known as Q* or Q-Star. The goal is to significantly enhance the reasoning abilities of the company's AI models. Details about Strawberry are closely guarded, but it builds on techniques like fine-tuning and iterative self-training. This is in line with OpenAI's vision that AI agents that first reason logically and then take action represent the next level of technology. According to the Reuters source, Strawberry is being specifically tested to take over tasks from software and machine learning engineers.
Why It Matters:
Experts believe that Strawberry combines large language models with planning algorithms, reinforcement learning, and longer computation times during application to create AI systems that can understand better and think more independently. If OpenAI makes Strawberry a success, AI agents could take on a far larger role in everyday work, with a huge economic impact.
Evolving AI: Microsoft’s new AI voice generator achieves human-level accuracy, but its risks are too great to release publicly.
Key Points:
Microsoft’s VALL-E 2 text-to-speech (TTS) generator achieves human parity, requiring only a few seconds of audio to replicate a voice.
Researchers are withholding VALL-E 2 from public release due to potential misuse risks, such as voice identification spoofing.
While the technology offers benefits for speech-impaired individuals, ethical considerations are paramount.
Details:
Microsoft has developed VALL-E 2, an AI voice generator capable of producing speech with human-level accuracy. The model needs only a few seconds of audio to convincingly mimic a person's voice, surpassing previous benchmarks. Despite its potential benefits, such as aiding those with speech impairments, Microsoft is not releasing VALL-E 2 to the public due to concerns about misuse, including fraud and impersonation. This cautious approach is shared by other AI leaders, like OpenAI, which also restricts certain voice technologies and has developed tools to detect deepfakes. The future of VALL-E 2 remains uncertain as ethical considerations take precedence.
Why It Matters:
Speech and voice are clearly the next big battleground for generative AI, and a number of companies are working hard to produce models that can understand and replicate natural voice patterns. Although it has released audio samples, Microsoft considers VALL-E 2 too risky for public release because of potential misuse like voice spoofing. This cautious approach aligns with the wider industry’s concerns, as seen with OpenAI’s restrictions on its voice technology.
Evolving AI: Meta Platforms plans to release the largest version of its open-source language model Llama 3 on July 23, according to an employee.
Key Points:
Meta to launch the largest version of Llama 3, a 405-billion-parameter, multimodal AI model, on July 23.
The model processes both images and text, enhancing versatility.
Despite initial objections, Meta will release the model's weights openly.
Details:
Meta Platforms is set to release its largest Llama 3 model yet on July 23. The new model has 405 billion parameters and can process both images and text. Unlike previous versions, which only generated text, it can reportedly create new images from a mix of images and text. Initially, there were concerns about releasing the model's weights for financial and safety reasons. However, Meta has decided to make the weights openly available. Weights are the numerical values a model learns during training. They are what let it make predictions, and having them lets developers use and improve the model without needing extensive training resources. Without weights, the model is just an empty framework and can't perform useful tasks.
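To make the "empty framework" point concrete, here is a toy sketch in Python. It is not Llama 3 or anything from Meta: the tiny linear model, the `predict` function, and every number in it are made up purely for illustration. The idea is that the code is the architecture, while the weights are the learned numbers that make it useful.

```python
# Toy illustration only: a two-input linear "model".
# The function is the architecture; the weights are the learned numbers.

def predict(weights, bias, features):
    """Return the weighted sum of the inputs plus a bias."""
    return sum(w * x for w, x in zip(weights, features)) + bias

# Architecture without learned weights: placeholder zeros,
# so every input produces the same useless output.
empty_weights, empty_bias = [0.0, 0.0], 0.0

# Hypothetical "released" weights, as if someone had already paid
# to train the model to compute y = 2*x1 + 3*x2 + 1.
released_weights, released_bias = [2.0, 3.0], 1.0

inputs = [[1.0, 1.0], [2.0, 0.5]]
without_weights = [predict(empty_weights, empty_bias, x) for x in inputs]
with_weights = [predict(released_weights, released_bias, x) for x in inputs]

print("without weights:", without_weights)
print("with weights:   ", with_weights)
```

A model like Llama 3 follows the same principle at a vastly larger scale: anyone who downloads the released weights can skip the training step and go straight to running or fine-tuning the model.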
Why It Matters:
The release of Llama 3’s 405B-parameter model, complete with weights, marks a significant milestone in AI accessibility and innovation. It will be a watershed moment: the community gains open-weight access to a GPT-4-class model. That will change the calculus for many research efforts and grassroots startups. There is so much research potential that can be unlocked with such a powerful backbone. Expect a surge in builder energy across the ecosystem!
💡 Tip of the Day
This is a great interview with Microsoft’s CTO, Kevin Scott, on AI. If you only have 10 minutes, we highly recommend watching from 46:13 to 56:57. He discusses building products with the mindset that the underlying models will improve, provides valuable insights on ‘what’s to come’ from someone who truly understands, and shares a personal story about AI's potential to improve the world.
On the latest episode of Training Data @BillCoughran and I spoke to @Microsoft CTO @kevin_scott who has led their AI strategy for the past seven years.
Kevin describes himself as a “short-term pessimist, long-term optimist” and he sees the scaling trend as durable for the… x.com/i/web/status/1…
— Pat Grady (@gradypb)
5:17 PM • Jul 9, 2024
🎯 SNAPSHOTS
Direct links to relevant AI articles.
🛒 Amazon: Amazon’s AI shopping assistant rolls out to all users in the US.
🤖 AI in practice: Experiment finds AI boosts creativity individually — but lowers it collectively
💡 NATO: NATO releases revised AI strategy.
🔉 Suno & Udio: AI music companies Suno and Udio hire elite law firm for copyright battle with major labels.
📈 Trending AI Tools
🚀 Evolving AI’s Prompt Hub - The world's #1 ChatGPT Prompt Hub, featuring prompts that consistently produce great results (link)
🖌️ AI Logo Maker - A tool to generate custom logos (link)
🔍 DuckDuckGo AI - A privacy-centric search engine with AI chat. (link)
📚 Enago Read - A tool to review literature for researchers with summarization and organization (link)
📽️ ModelScope Text-to-Video - Generate videos from text-based prompts (link)
🎨 Animated Drawing - A tool to bring children's drawings to life. (link)