🚀 Microsoft adds voice, vision and o1 to Copilot

Also: OpenAI unveils Realtime API for realistic app conversations


Welcome, AI enthusiasts

Exciting developments: Microsoft unveils new Copilot features, offering more personalized interactions, voice responses, and enhanced screen-reading capabilities. Meanwhile, OpenAI’s new Realtime API lets developers integrate six lifelike AI voices into their apps. And Oracle’s latest investment highlights Malaysia’s role in the growing cloud infrastructure boom, fueled by skyrocketing AI demand. Let’s dive in!

In today’s insights:

  • Microsoft adds voice, vision and o1 to Copilot

  • OpenAI unveils Realtime API for realistic app conversations

  • Oracle invests $6.5 billion in Malaysia cloud facilities

Read time: 5 minutes

🗞️ LATEST DEVELOPMENTS

Evolving AI: Microsoft has launched new Copilot features, enabling more personalized interactions, voice responses, and enhanced screen-reading capabilities.

Key Points:

  • Copilot Vision analyzes what you see on-screen and responds to queries.

  • Think Deeper offers step-by-step problem-solving.

  • Copilot Voice introduces conversational AI with voice feedback.

  • Personalization tailors Copilot's suggestions to user preferences.

Details:

Microsoft's Copilot Vision provides a new way to engage with content viewed in Microsoft Edge, allowing users to ask questions about the images and text on their screens. The ‘Think Deeper’ feature uses enhanced models to tackle complex problems with step-by-step reasoning, which suits comparison tasks and math challenges. Copilot Voice introduces four new synthetic voices for natural-language conversations with spoken responses, and it can detect the user's tone to adjust its replies. Lastly, Copilot personalization learns from past interactions to offer tailored suggestions, though it remains unavailable in the EU for now.

Why It Matters:

These are exciting updates to Copilot. But what exactly do they mean for users, and what do they look like? Let’s have a look.

Copilot Vision

Copilot Voice

Copilot - Personalized Discover Cards

The fastest way to build AI apps

We’re excited to introduce Writer AI Studio, the fastest way to build AI apps, products, and features. Writer’s unique full-stack design makes it easy to prototype, deploy, and test AI apps – allowing developers to build with APIs, a drag-and-drop open-source Python framework, or a no-code builder, so you have flexibility to build the way you want.

Writer comes with a suite of top-ranking LLMs and has built-in RAG for easy integration with your data. Check it out if you’re looking to streamline how you build and integrate AI apps.

Evolving AI: OpenAI's latest Realtime API lets developers integrate six AI voices into their apps, enhancing user interaction with realistic speech.

Key Points:

  • OpenAI’s Realtime API introduces six new AI voices for app integration.

  • Developers can fine-tune GPT-4o on as few as 100 images to improve its performance on visual tasks.

  • New features reduce API costs, improve latency, and optimize smaller models.

Details:

At its DevDay event, OpenAI introduced the Realtime API, which gives developers six AI voices to use in their apps. These voices are different from ChatGPT’s, and developers cannot use third-party voices due to legal rules. OpenAI showed how a travel app could use the API to help users plan London trips and suggest restaurants in real time. The API can also be used for phone calls, with developers deciding whether to disclose that an AI voice is being used. Other updates include fine-tuning GPT-4o with as few as 100 images for better performance on visual tasks, and prompt caching, which cuts the cost of repeated input by 50%. Smaller models like GPT-4o mini can also be improved using the outputs of larger models.
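To put the prompt caching figure in perspective, here is a rough Python sketch of the arithmetic. It assumes the 50% discount applies only to the part of a prompt that is served from cache, so the overall saving depends on how much of each request is repeated. The function name and the per-token price are hypothetical and used purely for illustration.

```python
def input_cost_usd(total_tokens: int, cached_tokens: int,
                   price_per_million_usd: float,
                   cached_discount: float = 0.5) -> float:
    """Estimate the input cost of one request.

    Assumption: cached tokens are billed at (1 - cached_discount) of the
    normal rate, and all other input tokens are billed at the full rate.
    """
    if not 0 <= cached_tokens <= total_tokens:
        raise ValueError("cached_tokens must be between 0 and total_tokens")
    uncached_tokens = total_tokens - cached_tokens
    billable_tokens = uncached_tokens + cached_tokens * (1 - cached_discount)
    return billable_tokens * price_per_million_usd / 1_000_000


# Hypothetical example: a 10,000-token prompt where 8,000 tokens repeat
# from earlier requests, priced at an illustrative $2.50 per million tokens.
full_price = input_cost_usd(10_000, 0, 2.50)
with_cache = input_cost_usd(10_000, 8_000, 2.50)
saving = 1 - with_cache / full_price
print(f"Without caching: ${full_price:.4f}")
print(f"With caching:    ${with_cache:.4f}")
print(f"Overall saving:  {saving:.0%}")
```

Under these assumptions the overall saving only approaches 50% when nearly the whole prompt is repeated, which is why apps with long, stable system prompts stand to benefit most.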

Why It Matters:

OpenAI’s DevDay made one thing clear to AI app developers: the future holds limitless potential. With the Realtime API and other innovative advancements, developers can now create more advanced, intuitive, and impactful AI applications that will reshape industries and enhance lives.

Source: Askar Karimullin / Alamy Stock Photo

Evolving AI: Oracle's investment signals Malaysia's growing role in the cloud infrastructure boom, driven by AI demand.

Key Points:

  • Oracle to invest $6.5 billion for a public cloud region in Malaysia.

  • The move follows major tech firms investing in Southeast Asia’s cloud infrastructure.

  • The cloud region will support Malaysia's modernization, innovation, and AI adoption.

Details:

Oracle’s $6.5 billion investment aims to establish the company’s first cloud region in Malaysia, and it is one of the largest tech investments in the country. This venture, part of Southeast Asia's ongoing cloud infrastructure growth, promises significant upgrades for Malaysian organizations, enabling advanced data, AI, and analytics solutions. Oracle’s customers, including government agencies and banks, will benefit from localized cloud services. This will be Oracle's third facility in Southeast Asia, adding to its existing locations in Singapore.

Why It Matters:

Oracle's expansion highlights Southeast Asia's emerging importance in the tech ecosystem, particularly in AI and cloud services. As demand for digital infrastructure grows, Malaysia is becoming a key hub for global tech players looking to expand their cloud and AI capabilities across the region.

🎯SNAPSHOTS

Direct links to relevant AI articles.

🚀 Nvidia released its new AI model to rival GPT-4.

👀 Anthropic hires OpenAI co-founder Durk Kingma.

🎧 Meta will expand production of mixed reality headsets in Vietnam.

🎨 Pinterest rolls out Gen AI tools to advertisers.

💡 Tip of the Day

It's been almost a year since Pika introduced its 1.0 text-to-video AI platform, and in that time, the competition has stepped up in a big way. Rivals like Runway and Luma AI, which unveiled its Dream Machine 1.5 in August, have surged ahead, offering more realistic visuals and advanced effects, leaving Pika playing catch-up.

But that changes now. After what seemed like a long silence, Pika has just announced the launch of Pika 1.5, an upgraded model that pushes the boundaries of creativity. This new version boasts jaw-dropping, physics-defying special effects, cleverly dubbed "Pikaffects," that can morph imagery in mind-bending ways, turning subjects into wildly flexible versions of themselves. Have a look for yourself below.

📈 Trending AI Tools

  • 📈 Danelfin - AI-powered stock picking and analysis (link)

  • 🎨 G-Prompter - A tool to create prompts for image generation with customizable styles (link)

  • 🤖 Singlebase - All-in-one backend platform for AI+ apps (link)

  • 🩺 HeyDoc - A chat bot for personalized medical assistance and guidance (link)

  • 🎥 Ubique - A tool for creating personalized video messages for sales outreach using voice and face cloning (link)
