transcript
Speaker 1:
[00:01] This is the Everyday AI Show, the Everyday Podcast where we simplify AI and bring its power to your fingertips. Listen daily for practical advice to boost your career, business, and everyday life.
Speaker 2:
[00:15] If you're a heavy ChatGPT user, your life just got a lot better this week. That's because OpenAI had their biggest and best week since probably December of 2024. That was their whole Shipmas event where they first shipped thinking models, Sora, Canvas, Projects, and more. But this week, they even outshipped Anthropic, which has been on a tear over the past month. So from user-friendly agents to the world's best image model, to now the world's best overall AI model, OpenAI cleaned house and stole back the narrative from Anthropic. OpenAI released so much this week, actually, that we had to leave one of my personal favorites off of our weekly top 7 AI features list, and that one was ChatGPT being released in Google Sheets, by the way. All right. But there was a lot more news aside from OpenAI when it came to big new features in AI this week, from an awe-inspiring design tool that makes you an instant art director to a new way to work inside of Gemini. There was a ton happening this week that you might have missed. And if you did miss it, don't worry, because that's what this new Friday show on Everyday AI is all about. It is our Friday features, highlighting seven of the biggest features that you might have missed, that you can use today. All right, so let's first tell you what you're going to learn on today's show, shall we? So stick with us for the next 25-ish minutes, and you're going to learn why OpenAI had its biggest week in years. You're going to know how Google is changing how businesses can collaborate inside of Gemini. And I'm like, about time. I've been wanting this for years. And you're going to find out the small print that most people missed on the new ChatGPT Agent announcement. All right, let's get into it. Welcome to Everyday AI. If you're brand new here, sweet. Welcome. My name is Jordan.
And while we do this every day, Everyday AI is a daily, unedited, unscripted livestream podcast and free daily newsletter, helping business leaders like you and me make sense of all the AI updates. I tell you what's worth it, what's not, and how to use it, so you can grow your company and career. In return, you just got to do a couple things. One, if you're listening to the podcast, do me a favor: make sure to subscribe and rate the show. I'd really appreciate it. And make sure you go to youreverydayai.com to sign up for our free daily newsletter. We're going to be recapping the highlights from today's stories. So in case you're out walking your dog or your mom's dog, or you're walking your mom, or the dog's walking you, I don't know, if you miss anything, it's all going to be in the newsletter. All right, so let's take a look at what is going on. My gosh, a ton happening this week. So if you are listening on the podcast, nothing super visual, but I am going to be sharing my screen here. You can always find the video version on our website for free, youreverydayai.com. All right, first one. This one looks so good. This is the new Claude Design by Anthropic. So here's what it is. It is a new AI-powered design tool from Anthropic Labs that turns text prompts into prototypes, slide decks, UIs, and marketing one-pagers. So yeah, this one technically has been out a week, right? But the show comes out on Fridays, and this was literally announced like an hour after the podcast was up, and I'm like, really? So it might seem like, wait, hasn't this been out? Well, yeah, it has, but it is that good that I had to make sure to talk about it on our Friday show. So here's the cool thing. It can read a company's code base and design files, any type of file, to automatically apply brand colors, typography, design components, et cetera. So who has access to this? Right now it is a research preview, and it's available to paid Claude subscribers.
It's also included in existing Claude subscriptions, but you're gonna hit your quota pretty quick. All right. Here's the thing most people were kind of confused about: it's actually kind of a new interface altogether, right? So you're gonna go to your normal claude.ai, and then on the left side you're going to find a thing called Design at the end. And then it's not gonna look like your normal Claude on the web, and it's not gonna behave like it either. Just FYI, so you know that. But I personally like it. It's very interactive. It walks you through things step by step. It asks you a series of questions depending on what you are trying to design. Here's why it's useful. Well, it lets non-designers, so, you know, you could be just a marketer, right? And normally you're working with a design person. Maybe you used to have a design person and you don't anymore. You're a founder, you're a product manager, whatever it is. This lets you ship so many different kinds of branded assets without even tying up a design team, right? Or maybe, yeah, maybe you do have a design team and you need something on brand, but you don't want to distract them. And that's huge, because the big thing is it does maintain that brand consistency automatically once a design system is loaded. So yeah, you can load an actual design system. I'm not going to bore you with what that is, but it's essentially this combination of your colors, your fonts, and how you use them in coordination, right? It's a lot more than that, but that's just an example of how you can bring your entire visual brand identity and then create a host of different visuals. I mean, you can even create animations. You can create short videos. I was messing around with that a little bit. So really cool. So who's going to find this valuable?
I mean, designers, anyone who's exploring certain concepts and wanting to take a direction farther, product managers and founders who need working prototypes before engineering. That's huge, right? And it's going to walk you through to make sure that you are thinking of everything in your app or whatever it is. And then marketing and sales teams who are trying to produce on-brand decks, one-pagers, whatever. Downside to this: in true Anthropic fashion, if you look at it the wrong way, your limits are gone. I am on the $200 a month Max plan and I had to stop, right? Because I'm like, okay, I don't know if I'm going to have enough limits left to do a show on this, if I end up doing a show on it. On Wednesdays, we do our AI at Work shows. I did probably like five or six tests and I'm like, I'm almost out, on a $200 plan. So yeah, I'm not exaggerating. On a $20 plan? Good luck. I do hope that Anthropic changes this a little bit, but I'll tell you this: do a lot of your work first in the second tool and then bring it into Claude Design. Speaking of second tool, or second feature update, let's get to that. So yeah, before you bring anything into Claude Design, first start it in GPT Image 2 from ChatGPT. This is a crazy, an absolutely crazy drop. Let me first tell you what's new here. So this is OpenAI's most capable image generation model yet. It has stronger editing, better layouts, near-perfect text rendering, and more reliable instruction following. It can handle multilingual text, full infographics, slides, maps, and manga. And it can generate up to eight coherent images from a single prompt. This is, you could call it, their third-generation flagship model. You had GPT Image 1, that was last April. You had GPT Image 1.5 in December, and then this, GPT Image 2. So who has access? Well, everyone, even people on the free tier, right? That's the good thing about OpenAI and ChatGPT.
Yeah, you can get more utility sometimes on a free plan, I kid you not, a free plan on ChatGPT, than you can get on the basic $20 a month plan on Claude. You know, Gemini and Copilot are great with limits, but OpenAI is obviously really, really good with limits. So here's the thing. It is the thinking mode that you are going to want. So if you are on the free plan, you'll still be able to create literally world-class images. There are limits on the free plan, obviously. But you're really going to want to be on a paid plan to be able to take advantage of its ability to, well, it's a world model, right? So it thinks about things. It can understand gravity. It can really keep that character consistency, right? Like if one shot is head-on in the sun and the shadow is behind, and then you change the camera angle, it's not going to move the shadow. The shadow is going to stay in the same place, or where it should be, respectively. Just incredibly, incredibly impressive. And the typography is production grade, right? That's one of the biggest reasons why it's useful. Even their GPT Image 1.5 was okay-ish, right? But you still ran into some issues with text rendering. I think Nano Banana Pro was one of the biggest steps forward, where it's like, okay, you can actually use text that's created in an AI image, right? So whether you're trying to create signs, or images that would normally have signage in the background, or a poster, et cetera, a lot of times you'd have to go clean it up in another program. Not anymore with GPT Image 2. It is really, really good. So, like I said, you do have to be on a paid plan to take advantage of that thinking. It is available in the API as well. So here's why it's useful. Well, aside from the typography finally being production grade, which really opens up what you can use it on. UI text is big: labels, signs, multi-word strings, whatever it is.
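Since it's available in the API, here's a rough sketch of what a request body might look like in Python. To be clear: the model identifier "gpt-image-2", the size strings, and the aspect-ratio mapping below are my assumptions, extrapolated from how the earlier GPT Image models worked in the images endpoint, so check OpenAI's current docs before relying on any of it.

```python
import json

# Hypothetical model name; earlier models were "gpt-image-1" / "gpt-image-1.5",
# so "gpt-image-2" is a guess, not a confirmed identifier.
MODEL = "gpt-image-2"

# Assumed size strings for a few aspect ratios at 2K-class resolution.
SIZES = {
    "1:1": "2048x2048",
    "3:1": "3072x1024",
    "1:3": "1024x3072",
}

def build_image_request(prompt: str, aspect: str = "1:1", n: int = 1) -> dict:
    """Build the JSON body for an image generation call (shape assumed)."""
    if aspect not in SIZES:
        raise ValueError(f"unsupported aspect ratio: {aspect}")
    return {
        "model": MODEL,
        "prompt": prompt,
        "size": SIZES[aspect],
        "n": n,  # the release notes say up to eight coherent images per prompt
    }

payload = build_image_request("a storefront sign that reads 'Everyday AI'", aspect="3:1")
print(json.dumps(payload, indent=2))
```

You'd POST a body like that to the images endpoint with your API key; again, treat the parameter names as a sketch, not gospel.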
But now it also supports 2K resolution, which is huge. And you can do aspect ratios from 3:1 to 1:3 and everything in between. And it's roughly twice as fast as the original GPT Image 1. So who's going to find this useful? Well, marketers, content creators, or anyone producing slides and infographics. That's something I don't think has been talked about enough with this model. And it's available in Codex as well, so I'm creating a lot of slide decks in Codex. If you've used NotebookLM, their slide decks, which are amazing, are powered by Nano Banana, which I think has been one of the most underutilized AI tools since it came out. So this can create the same thing. It won't create a PDF by default; you have to save the images individually. But I think I actually created a skill inside of Codex to do that for me. So just what it can do visually is amazing. For our live stream audience here watching, I'm showing some of the images on my screen. One thing I really liked about what OpenAI did here is, on their blog post, they did a classic mode, right? Which is just all the text and images laid out how they would normally be. Then they did this image mode, which I thought was really cool. So they essentially took a blog post with all this information and put it into GPT Image 2. Talk about dogfooding, and obviously the images are amazing. Let me just go ahead and call this out right now. So why should you use this along with Claude Design? Well, number one, again, good luck using Claude Design. Hopefully it'll get better, but even on a $200 plan, you're going to run out of limits very quickly. So you should do a lot of the iterating, some of the basic prototyping, in ChatGPT first with GPT Image 2. If you're on a paid plan, you're not really going to run out of limits, right? Unless you're doing it around the clock on the $20 a month plan.
But if you're just doing 10, 20 images a day, you're fine on the paid plan. But I would start doing these things first inside GPT Image and then take them elsewhere, because the jump in quality is absolutely outrageous. So I'm actually trying to bring up here, let's bring up. AI moves too fast to follow, but you're expected to keep up. Otherwise, your career or company might lag behind while AI-native competitors leap ahead. But you don't have 10 hours a day to understand it all. That's what I do for you. But after 700-plus episodes of Everyday AI, the most common question I get is, where do I start? That's why we created the Start Here series, an ongoing podcast series of more than a dozen episodes you can listen to in order. It covers the AI basics for beginners and sharpens the skills of AI champions pushing their companies forward. In the ongoing series, we explain complex trends in simple language that you can turn into action. There are three ways to jump in. Number one, go scroll back to the first one, in episode 691. Number two, tap the link in your show notes at any time for the Start Here series. Or you can just go to starthereseries.com, which also gives you free access to our inner circle community, where you can connect with other business leaders doing the same. The Start Here series will slow down the pace of AI so you can get ahead. Let's see, let's see, let's see. I'm trying to bring up, there we go, the image. Let's bring this up here for our live stream audience. So it absolutely crushed everyone. Okay, so we have the arena, right? So the arena is just blind taste tests: you put in a text prompt, you get two images, and you choose which one's best. This set a record. So it's obviously in first place, but it set a record for the biggest gap. I'm looking at the scores right now, and it's actually bigger than it was prior. It is a 236-point gap, right? Between GPT Image 2 and then Nano Banana 2.
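To put a 236-point gap in perspective: arena leaderboards are Elo-style ratings, and the standard Elo formula turns a rating gap into an expected head-to-head win rate. A quick back-of-the-napkin sketch, where the 236 and 40-point figures come from the leaderboard and the formula itself is just textbook Elo:

```python
def expected_score(gap: float) -> float:
    """Standard Elo expected score for the higher-rated player, given a rating gap."""
    return 1.0 / (1.0 + 10.0 ** (-gap / 400.0))

# The record-setting gap between first and second place
print(f"236-point gap: {expected_score(236):.0%} expected win rate")
# The gap that covers second place through fifth place
print(f"40-point gap:  {expected_score(40):.0%} expected win rate")
```

That works out to roughly an 80% expected win rate across the 236-point gap, versus only about 56% across the 40-point spread separating second through fifth place, which is why the record matters.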
And for context, right? From second place to fifth place, that's a 40-point gap, right? So everything's been in the, you know, 1100s, 1200s, and GPT Image 2 shot up to 1500. It is amazing, right? And so don't just think of images, right? Because you're probably thinking, okay, what can I use this for? Well, here's the thing. If you are on a paid ChatGPT plan, and if you have memory enabled, here's what you should do. Well, first, make sure to tell ChatGPT to go check out the release, right? Sometimes it won't, and it will rely on its training data, and it'll be like, oh, you know, there is no GPT Image 2, right? So first send it the blog post, copy and paste it in, and then say, hey, based on everything you know about me, what are seven specific ways that I can use this new image model to improve my work, to make our product better, to make our marketing more effective, to bring life to our blog? Anything, just do that, right? I think sometimes, if you are not a marketer at heart, if you're not a content creator at heart, if you're not a photographer at heart, it can sometimes be a struggle to find true business value in this. So just ask ChatGPT, and I guarantee you, you're going to get some pretty good ideas. All right, let's move on to our next update. On this one, well, there's a handful from Codex. Some of these are very fresh, but I want to talk first about Codex Chronicle. Okay, so do you remember, like a year and a half ago, when Microsoft announced this thing called Recall for Windows Copilot+ PCs, and everyone went absolutely nuts, right? Everyone's like, oh, this is a huge security concern, right? Where essentially the system just takes screenshots of everything on your screen intermittently, and then it remembers that, and then you can talk to it and ask it things. Well, guess what? Codex just released that via this new product called Chronicle.
So it's not available everywhere right now. It is, obviously, an opt-in research preview, but I'm going to go ahead and read how OpenAI describes it. They said: Chronicle augments Codex memories with context from your screen. When you prompt Codex, those memories can help it understand what you've been working on with less need for you to restate context. Chronicle is available as an opt-in research preview in the Codex app on macOS. Then it's just saying, be careful of prompt injection. You have to give it certain access. So it's going through all the normal, take-proper-measures stuff. It also gives some examples of just how much easier things get. So without Chronicle, they give an example. Using Codex, you say, why is this failing? And then Codex responds: I do not know what "this" refers to yet. Send the failing command, error text, screenshot, or GitHub Actions link and I can debug it. But if you do have Chronicle enabled and you say the same thing, why is this failing? It first says: that depends on what "this" refers to, so I will inspect your recent screen context before guessing. Then it uses the Chronicle skill, and then it kind of says, oh, I see what's wrong now. Now, essentially, this thing can go through your usage faster; OpenAI does say that. But I've been using it and I haven't been noticing a huge jump in my rate limits. I'm also on the $200 a month plan for ChatGPT. And yeah, you can use Codex with any ChatGPT account. I probably skipped over some of this because I did a Codex show last, no, was that last, oh my gosh, this is how you know I'm tired. I'm like, that was last week. No, that was two days ago. All right. So make sure you go check that one out, episode 761, if you haven't already. So essentially, Codex was originally built for coders, but now it's great for just everyday work. So if you like Claude Code, if you like Claude Cowork, you're going to love Codex. Go use it immediately. It is that good. All right.
A couple other new updates in Codex worth talking about. These came from Andrew at OpenAI, and, which is crazy, OpenAI shipped so much that they didn't even put out a blog post or anything on these. It was literally just Andrew's tweet. And then I went into my Codex and I'm like, wait, all of these things are live and absolutely no one's talking about it. All right. So, some other new updates in the Codex app. Obviously GPT-5.5, we're going to get to that later. But there's a new browser control skill, which is really cool. There are new skills for Google Sheets and Google Slides. There's operating-system-wide dictation, an auto-review mode, and more. So yeah, Codex just keeps pumping out the great updates. So yeah, aside from everything that ChatGPT did this week, some pretty good updates there from Codex. All right. Speaking of useful updates, this one here is from Google Gemini. This is what people have been wanting from Gemini for like two years but haven't gotten. All right. So here's what's new. This is called Gemini Workspace Intelligence. So here's what it is. It is a new semantic layer that was announced at the Cloud Next conference this week in Vegas, and it maps your emails, chats, files, collaborators, and active projects into shared context for Gemini-powered agents. So it gives Google Gemini continuous awareness across Gmail, Drive, Calendar, Chat, Docs, Sheets, and Slides, rather than treating each query as a blank slate. It also adds an Ask Gemini in Chat, Drive projects, an AI inbox in Gmail, and prompt-based generation in Sheets, Docs, and Slides. So this was announced just this week, and changes are rolling out now. So you might not get it now. That's the downside with anything from Google or Microsoft, right? Their ecosystems are obviously large, and their enterprise user bases are large. So you might have access to this today, or you might not for a couple of weeks. But the good thing is this is for Workspace customers.
Let me explain. So many things that Google shipped inside of Gemini over the past year and a half were not available for Google Workspace. So if you had a Gmail account, right? If you were paying for Google Gemini for your personal Gmail, you got them, but a lot of the updates that came out in 2024 and 2025 were not for Workspace. So if you, like millions of businesses, use Google as your email provider, you use Google Drive, right? You essentially choose whether you're a Microsoft shop or a Google shop when it comes to all your office stuff, your email, et cetera. And so many things you could not get inside of Workspace. So this is huge. This is huge. This is essentially a version of a vector database of your whole company that just follows you around everywhere. All right. I cannot emphasize, well, I'm not going to blow smoke too soon, but if this works as advertised, I cannot emphasize how big this is going to be if your organization is a heavy Google shop. Some of you have Google Workspace, but you still end up using other things for whatever reason, right? Or maybe you're using ChatGPT or Claude, even though you have access to Gemini. But this is going to be one of those where you might want to relook at that decision, because this is going to be so powerful. Like I said, all of those different pieces following you around everywhere. So this is much more like how Copilot, when it works well, operates across the entire operating system. This is bringing that same functionality that I think a lot of people have wanted. So versions of this have been available, right? As an example, you could go into Google Drive and chat with certain Google Drive folders, whether it was a Doc, Sheet, Slides, et cetera. You could do it that way. But now this is something that literally just kind of follows you around, which is really cool. So here's why it's useful.
It's going to cut down the time that workers spend stitching together scattered information from tabs, email threads, and chats before starting their work. There are some great one-shot generation features. There's prompt-based spreadsheet population that's claimed to be nine times faster than manual entry. There are infographics that you can make from your business data, fully editable decks from company templates. There's also a new "match my voice" button that makes Gemini-generated writing mimic the user's actual patterns. That's really cool. So who's going to find this useful? Well, like I said, anyone on a heavy Gemini Workspace team. Also, if you're in sales, HR, project management, right? Any type of job where you're routinely having to work with a lot of the different Google files. Some people, if they're more heavy in data, maybe they're just spending more time in Sheets and not in Slides a lot, or maybe they're not even spending a lot of time in Google Docs. But if you are someone that is using all of those different Google products, this is going to be extremely helpful. Also worth noting, for organizations still evaluating Google Workspace versus Microsoft 365: Google also announced a rapid enterprise migration offering targeting those Microsoft 365 customers. And yeah, the reason they did that is, I think, for those Copilot customers that aren't finding success and maybe want to move a little bit faster, this is a great option. Although, actually, Microsoft Copilot has been really good, I think, the past quarter. They've shipped a lot of meaningful updates: their co-work feature that we talked about, rolling out more Anthropic models. It's actually been, I think, on the rise. All right. Speaking of Anthropic, that is our next AI feature that I hope you didn't miss. So here's what's new, and it sounds small, but this is not small. Ready? So here's what's new.
It is Claude's new live artifacts. So live artifacts are dashboards and data trackers that Claude builds inside of Cowork sessions, and it keeps them alive after the conversation closes. So you're like, okay, what does that mean? Well, essentially, you can connect anything, right? So Claude has all of these different apps, or connectors. You can build it once. You can build a dashboard, right? It brings in your most important CRM updates, your most important emails, KPIs from a third-party analytics tool that Claude supports, all these different data sources. You connect it once, and it is going to update automatically. So obviously, this caused a little bit of a hiccup in the stock market, right? Because there are a lot of big enterprise SaaS companies where that's exactly what they do, right? All these business intelligence tools, this is what they do. So now you can almost kind of vibe code your own version of Tableau or whatever, some of the more Power BI-type tools. So maybe your organization is struggling to really use some of those more enterprise business intelligence tools, or maybe you don't have enough people on your team that can actually take advantage of them. Now, don't get me wrong, this is not a one-to-one replacement. Not at all. But if your company is using a lot of common SaaS products that have connections inside of Claude, this one's huge. So this was just announced. It's available on all plans, but free users cannot create live artifacts; you can just create normal artifacts. It also works inside the Cowork workspace. All right. And here's why it's useful. It turns one-off Claude outputs into persistent working systems.
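To make the "connect once, updates automatically" idea concrete, here's roughly what a live artifact is doing for you under the hood, sketched in Python with made-up source names. None of these fetchers are real Claude connectors; they're stand-ins for whatever CRM, inbox, or analytics tool you'd actually wire up:

```python
from datetime import datetime, timezone

# Hypothetical stand-ins for connected data sources; a real live artifact
# would pull these through Claude's connectors instead of local functions.
def fetch_crm_updates():
    return {"new_leads": 4, "deals_closed": 1}

def fetch_inbox_summary():
    return {"unread_important": 7}

def fetch_analytics_kpis():
    return {"weekly_visitors": 12850}

SOURCES = {
    "crm": fetch_crm_updates,
    "inbox": fetch_inbox_summary,
    "analytics": fetch_analytics_kpis,
}

def refresh_dashboard() -> dict:
    """One refresh cycle: pull every source into a single timestamped snapshot."""
    snapshot = {"refreshed_at": datetime.now(timezone.utc).isoformat()}
    for name, fetch in SOURCES.items():
        snapshot[name] = fetch()
    return snapshot

print(refresh_dashboard())
```

The point of the feature is that you never write or schedule this loop yourself: you describe the dashboard once, and Claude keeps running the equivalent of refresh_dashboard() for you after the conversation ends.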
This one for me, even though I think OpenAI cleaned up this week, this might be one of the ones I use most often, to tell you the truth, because this is something I'm continually doing. I've used all kinds of business intelligence dashboards, and for whatever reason, either they don't give me what I want, I don't like how they look, or they're just terribly expensive. I'm a small business. Some of these are $500, $800 a month. I'm like, I'm not going to pay that. It's overkill. I'm barely going to use 1% of the capabilities. I just want these 10 platforms. I just want all of that data. I want it there, and I want to be able to easily see and understand it. So now you can kind of do that in these new live artifacts from Claude. So, pretty cool. All right. So next, here's where it gets juicy: workspace agents. Sorry, my nose itches. All right, this is a tricky one, because there's some fine print here that I referenced. But before I get to it, let me first tell you what workspace agents are. So this is brand new from ChatGPT. They are shared agents that handle complex tasks and long-running workflows across tools and teams. They are positioned as the next step beyond custom GPTs. They're powered by Codex, which is really cool, and they run in the cloud, so they can continue working even when the user is away. And they can be scheduled to run automatically or deployed in Slack to respond to incoming requests. So here are some of the downsides and some of the caveats. This is only available on team plans. All right, so that's ChatGPT Business, Enterprise, Edu, and Teachers plans, and then admins can enable agents via role-based controls. Here's the other thing. It's only free, slash included, until May 6th. All right, so this one's interesting, because at that point, OpenAI says that they're rolling out a credit-based pricing system. So, not sure what that means.
Not sure if that means, oh, you'll have 100 agent runs a week included, or if it's strictly on top. I'm guessing there's going to be some inclusion, just like, oh, you used to get 20 deep research runs a month on a paid plan. So maybe it's like that, but right now you're not paying extra. It is unlimited. Actually, which was fun last night, I had Codex, the program, building me agents in the browser. It was the most meta thing without being meta, which was fun. It built me a handful of agents that I was trying out. So that is caveat one: there is going to be some credit-based pricing system that starts on May 6. Not really sure what that is. All right. The other thing that you need to keep in mind: no one's really sure what's going to happen to GPTs. All right. Let me see. I'm trying to remember if it's on this page or the other ones. Let's see. Here we go. Yeah. So there's an editor's note, right? The fine print. It says: GPTs will remain available while teams test workspace agents with their workflows. Soon, we'll make it easy to convert GPTs into workspace agents. So does that mean GPTs are going away? I don't know. It seemed like a very vague footnote down there, because they start out by saying it's the next evolution of GPTs. Then they talk about GPTs remaining available while teams test workspace agents. And then they say soon they'll make it easy to convert GPTs into workspace agents. So I'm not sure, right? I'll reach out to some of my friends at OpenAI, and I'll actually be there this weekend. So yeah, maybe I'll see if I can connect with them and get some answers. Oh, by the way, I will be in San Francisco Sunday through Wednesday. So yeah, hit me up. I always keep my LinkedIn in the show notes. If you're going to be around in San Francisco, hit me up. A couple of people already have. Just trying to chat AI with a bunch of people out there. All right. So what's going to happen to GPTs? I don't know.
But let's talk a little bit more about the actual agents, because they are really cool. So here's kind of the breakdown. You build once and you can scale across your team. So you can create an agent once and then share it across your workspace, so teams follow the same workflows and best practices. This is from OpenAI's release. They said: work that runs itself. You can run agents on a schedule to handle tasks like reviewing leads, summarizing support requests, or generating reports. You can keep work moving across your tools. Agents use your tools to gather information and take action, like updating tickets, editing documents, or sending messages, without step-by-step guidance. And then they say: automation with control. Admins define permissions, approvals, and monitoring, so teams can automate workflows with oversight. So I'll say this. If you're already using Codex, there's not a lot of additional value with the ChatGPT agents, except that you can use them in the team environment, right? So essentially, like with GPTs, there's a GPT store or directory for your team account. Same thing here. But overall, Codex is a little more powerful. It's the same thing, more or less, right? There are some differences, some intricacies. Codex, as an example, is way more powerful because you have more control. It can use the terminal on your computer. The downside is you can't share those agents with your team, and also your computer has to be on for them to run. So the two benefits with the new ChatGPT workspace agents are, well, your computer doesn't have to be on, they run in the cloud, and they can be shared across your team. So, friendlier for non-technical people as well. Again, some people are intimidated by Codex, but you shouldn't be. All right. And then moving on to our last one, and this is the big one we have.
Whether it's for one week, one month, or one quarter, I'm not sure, but we have a new most powerful model in the world. And I'm talking about GPT-5.5. The Spud is now, I don't know, out of the oven. So this was the rumored model we've been hearing about for a long time from OpenAI. It was codenamed Spud. It is newly pre-trained. So that means it's not a version of their previous model, right? It's not like a better version of 5.4. Technically, it is a brand-new pre-trained model. So here's what it is. It is a fully retrained agentic model targeting agentic coding, computer use, knowledge work, and early scientific research. The benchmarks are obviously really good: 82.7 on Terminal-Bench. And here's the one that, man: 84.9 on GDPval. You guys know I love GDPval. Anyways, that means it ties or beats expert humans, as graded by expert judges who don't know which output is human and which is AI. So it wins or ties about 85% of the time. And right now, it ships in two different variants. So you have GPT-5.5, and then you have GPT-5.5 Pro. All right. So it is already available, right? It came out yesterday, so tons, you know, tons of time already that I've been playing with it. Really good. Noticeably, noticeably better, especially the lower version, not having to juice the thinking all the way up or go all the way to Pro. Super helpful. So it came out to ChatGPT. It's also available in Codex and on all the team plans as well. All right. So here's why it's useful. Well, it matches GPT-5.4 in per-token latency while scoring higher on nearly every single benchmark. So you get a big efficiency gain without a speed trade-off, right? Which normally doesn't happen a whole lot. And it uses significantly fewer tokens to complete the same Codex task versus GPT-5.4. There's also improved ambiguity handling, and it more often infers missing steps and tool interactions without detailed user prompting.
So who's going to find this valuable? Well, anyone. If you're a power ChatGPT user like myself, you're going to find it especially valuable. Software engineers running long-horizon coding tasks, refactors, debugging inside of Codex, right? I'm actually re-auditing a lot of my old vibe-coding projects and finding some amazing things. But here's what I think is the bigger unlock: the combination of the new model, GPT-5.5, with Images 2. Why do I say that? One of the big knocks against OpenAI slash Codex has been that it's terrible at front end, right? Any front-end design, that's why people are always like, oh, I'm using Claude Code for this, Claude is just better, it's better, it's better. And one of the reasons people always give is, well, ChatGPT's front end stinks, and it has, right? So it's not any better by default, but the cheat code now, inside of Codex as an example, or just inside of normal ChatGPT, where you can code and render things in Canvas, is to use the image mode first. Just say: render a beautiful front-end analytics dashboard that tracks my personal life, here are the things I want, here's some inspiration. It will create you a front end with Images 2, which is amazing at creating front ends. And then you just say, okay, Codex, go do this. By just inserting one more step, OpenAI solved, I think, one of the biggest issues they were facing, at least from developers, software engineers, vibe coders, people that preferred Claude. I always said, if you wanted something fast and pretty, use Claude Code. If you wanted something done correctly, or if you cared about accuracy, at that point I would say use GPT-4.5 or 5.4 Pro, or use Codex, whatever.
But now, at least on the front-end side, that knock is kind of gone if you just put in that extra step. So here is what OpenAI says about the new model: "We're releasing GPT-5.5, our smartest and most intuitive-to-use model yet, and the next step toward a new way of getting work done on a computer. GPT-5.5 understands what you're trying to do faster and can carry more of the work itself. It excels at writing and debugging code, researching online, analyzing data, creating documents and spreadsheets, operating software, and moving across tools until a task is finished. Instead of carefully managing every step, you can give GPT-5.5 a messy, multi-part task and trust it to plan, use tools, check its work, navigate through ambiguity, and keep going. The gains are especially strong in agentic coding, computer use, knowledge work, and early scientific research, areas where progress depends on reasoning across context and taking action over time. GPT-5.5 delivers this step up in intelligence without compromising on speed. Larger, more capable models are often slower to serve, but GPT-5.5 matches GPT-5.4's per-token latency in real-world serving while performing at a much higher level of intelligence. It also uses significantly fewer tokens to complete the same Codex tasks, making it more efficient as well as more capable." Here's the other thing that's definitely worth noting: Artificial Analysis. I talk about it now and then on the show. Let me bring it up here. Why aren't all the models showing? All right, everyone, today the screen sharing is just giving me difficulties. All right, there we got it. So Artificial Analysis is essentially an aggregator of all of these third-party benchmarks. So you don't have to look at ten different AI benchmarks; it gives you one combined score across the most important ones, and they have a weighted way of putting these together.
And essentially, you get an Intelligence Index score. So right now, and I've been following Artificial Analysis's Intelligence Index since they started it, something has happened that I never recall seeing: one model family holds the top three spots. There are different variants of GPT-5.5. In this view they have extra high, they have high, and, what do we have here? Oh, that's 5.4. Okay, so then they have 5.4 extra high and then 5.5 medium. So essentially the scores are 60, 59, 57, 57; those are the top four models. The last two, GPT-5.5 medium and GPT-5.4 extra high, are tied with Gemini 3.1 and the Opus 4.7 Max that just came out. But regardless, OpenAI now has the number one, number two, and tied-for-number-three models in the world. And about that 60 versus 57: their top model scored a 60, and the next models, Opus and Gemini 3.1, are at 57, a three-point gap. You might be like, okay, that's nothing. It's actually a pretty big jump, right? It's not something a Sonnet 4.7 is going to close overnight. Or maybe we'll see what Google does; they have their I/O conference in a couple of weeks here, and I'm sure we're going to see something from them, so they may jump ahead. But it's extremely, extremely impressive. So anyone that's been saying, oh, OpenAI's models fell off, well, maybe they did for a short while. We saw those stories over the last couple of months: OpenAI had a code red, and then they dropped all of what they called side quests to focus on their core product. Well, mission accomplished, because they just had some of the biggest updates probably of the year. I mean, Workspace agents, huge. Images 2, the best image model, which broke records for the biggest gap between number one and number two. And then the most powerful AI model in the world, all in one week. So yeah, Anthropic's got to be sweating a little bit.
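For a rough picture of what an aggregated index like this does, here's a minimal sketch of a weighted benchmark average. Artificial Analysis describes combining many third-party benchmarks with weights; the benchmark names, scores, and weights below are illustrative placeholders I made up, not their actual methodology or data.

```python
# Hedged sketch: combining several benchmark scores into one index
# via a weighted average, renormalizing over whatever benchmarks a
# given model actually has scores for.

def intelligence_index(scores, weights):
    """Weighted average of benchmark scores (0-100 scale).

    scores:  dict of benchmark name -> score for one model
    weights: dict of benchmark name -> relative weight
    Benchmarks missing a score are skipped and the remaining
    weights are renormalized so the index stays on the same scale.
    """
    common = [b for b in weights if b in scores]
    total_weight = sum(weights[b] for b in common)
    if total_weight == 0:
        raise ValueError("no overlapping benchmarks with nonzero weight")
    return sum(scores[b] * weights[b] for b in common) / total_weight

# Illustrative weights and scores only.
weights = {"agentic_coding": 0.4, "terminal_bench": 0.3, "gdpval": 0.3}
model_a = {"agentic_coding": 62, "terminal_bench": 82.7, "gdpval": 84.9}
print(intelligence_index(model_a, weights))
```

The point of a scheme like this is exactly what the show describes: instead of eyeballing ten leaderboards, you collapse them into one number, which is why a three-point gap at the top can represent a meaningful spread across many underlying benchmarks.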
Google, you know they're going to be cooking something up for their big I/O conference here in a couple of weeks. But regardless, there are going to be a lot of new AI features coming over the next couple of weeks, and I'm going to be here, obviously, helping you through it. So that's all: our seven fresh AI updates that you can use today. I hope this new Friday features show was helpful. If so, let me know. Make sure you subscribe to the podcast, and then go to youreverydayai.com and sign up for the free daily newsletter. Thank you for tuning in. Hope to see you back tomorrow and every day for more Everyday AI. Thanks, y'all.
Speaker 1:
[46:04] And that's a wrap for today's edition of Everyday AI. Thanks for joining us. If you enjoyed this episode, please subscribe and leave us a rating. It helps keep us going. For a little more AI magic, visit youreverydayai.com and sign up to our daily newsletter so you don't get left behind. Go break some barriers and we'll see you next time.