News

GPT-4o just got outplayed — by a smaller model.

Taledy AI News

12 Apr 2025 — 3 min read

Welcome back to Taledy AI, your go-to resource for staying up to date with the latest advancements in AI, one edition at a time.

This week? Things got spicy.

DeepSeek is teaching models how to think better than GPT-4o. OpenAI gave ChatGPT a memory upgrade. Meta just casually dropped a 2 trillion parameter tease. And NVIDIA? They've got a model that can switch reasoning on and off.

Let's get into it ⬇️

Latest Video:

Discover how to use Manus to bulk-create YouTube videos.

📈 The Latest in AI Signals

🚀 OpenAI Levels Up ChatGPT (Again)

ChatGPT now remembers everything. Literally.

🤔 What’s New:

Memory is persistent across chats
Better personalization, long-term context
You control what it remembers or forgets

🚫 Who Doesn't Get It Yet?

EEA, UK, Switzerland, Norway, Iceland, Liechtenstein

🔮 Why It Matters: We’re inching toward AI personal assistants that truly know you. The more you use it, the smarter it becomes. Creepy or cool? You decide.

Siri's Makeover Incoming: Apple is finally updating Siri for fall 2025. Expect smarter responses and deeper personalization.

📊 ByteDance Gets Logical: ByteDance introduces Seed-Thinking-v1.5 - a MoE model built for STEM reasoning. It's their first real leap into serious LLM territory.

🔥 HEADLINER: DeepSeek Reinvents Reinforcement

DeepSeek just dropped DeepSeek GRM — a generative reward model that critiques itself during inference. Think: "This AI doesn’t just answer. It provides explanations, evaluates its own responses, and adjusts its performance dynamically.

🔎 Method: SPCT (Self-Principled Critique Tuning)

Rejective Fine-Tuning (RFT): Uses 1.07M instruction + 186K tough rejective samples
Online RL via GRPO: Reward +1 if its answer matches ground truth, -1 if not. KL penalty? 0.08.
Inference-Time Sampling: Get multiple self-critiques, then use a Meta-RM to filter out bad takes.

🔢 Results

RewardBench: From 86.0% → 90.4% with meta-filtering
PPE Preference: 64.7% → 67.2%
Overall: Single-pass 69.5% → 72.8% with 32-sample voting
Outperforms: Nemetron 4340BR, rivals GPT-4o

🔮 Why It Matters: This flips the script. Instead of scaling to 671B+ models, DeepSeek shows you can train smaller, smarter AIs that teach themselves. It’s like AI doing code reviews on its own brain.

🎉 New Model Watch: NVIDIA's Nemetron Ultra 253B

NVIDIA just launched a 253B parameter model that beats bigger beasts like DeepSeek R1 on most tasks, and it runs on a single 8x H100 node.

🎯 Highlights:

Built on LLAMA 3.1
Uses Neural Architecture Search (NAS)
Reasoning On/Off toggle
- Math500: 97% with reasoning
- LiveCodeBench: 29% → 66.3%

🌊 Key Feature: Max sequence length of 131,072 tokens, long context window heaven.

🔮 Why It Matters: You don't need trillion-parameter monsters. Smart architecture + selective reasoning = efficient power. It's open-source with commercial use.

🚀 AI Tool of the Day: Fresh Picks

Taledy.com: Create viral short clips from YouTube videos or your own uploads.

👨‍🎨 Mind-Blowing AI: DreamActor-M1

From ByteDance comes DreamActor-M1: an AI that animates static images into full-body video sequences.

Smooth motion
Realistic expressions
Rivals Runway's Act-One

Perfect for content creators who want motion without mocap. 🎥

🚀 Rumor Radar

Meta teased a 2 TRILLION parameter "LLaMA 4 Behemoth" model
OpenAI’s GPT-4.1 + GPT-O3 expected to drop this week
DeepSeek R2 chatbot might already be in testing — SPCT-powered?
Google’s Gemini Live now has real-time camera + screen input AI

📢 From Taledy

MCPs are taking over the world. Are you familiar with them, or should I create a tutorial for that next time? Let me know in the comments or by replying to this email.

📊 TL;DR Recap