title Aravind Srinivas & Edwin Chen: The $1B Bootstrap, Apple's AI Edge, and Benchmarks | TWiAI E10

description This week we sit down with Aravind Srinivas and Edwin Chen on This Week in AI Episode 10. Aravind is the co-founder and CEO of Perplexity, whose revenue has grown from $100M to $500M on the back of Perplexity Computer. Edwin is the co-founder and CEO of Surge AI, the data training company teaching frontier models how to think, which quietly bootstrapped past $1B in revenue without ever raising a dollar.

This Week In AI is made possible by:
PayPal Open - One Platform for all Business: https://paypalopen.com/

Timestamps:
00:00 Cold open
01:00 Welcome & intro to Aravind Srinivas and Edwin Chen
02:40 Why Perplexity Comet and Perplexity Max are taking off
05:25 Edwin on Surge AI and why "data labeling" is the wrong term
10:47 Tim Cook steps down — what Apple's new CEO should do
13:56 Owning your agent loops: why Apple wins
21:20 "The iPhone is not getting disrupted by AI at all"
23:09 $242B raised in Q1 2026 and the late-stage capital flood
23:55 Edwin on bootstrapping Surge past $1B without raising
25:09 The ChatGPT "one weird trick" story and clickbait models
30:58 Claude Code as a loss leader to dominate token collection
33:30 Are we in the endgame for coding?
35:35 Autocomplete to auto-diff to auto-outcomes
38:54 The death of the no-code movement
41:34 30% headcount growth, 5x revenue, the efficiency playbook
45:41 AI in Hollywood, Gal Gadot, and $70M movies that should cost $200M
50:29 "People don't buy models, they buy products"
57:20 Specialization vs commoditization — what actually accrues value
58:00 "LM Arena is a cancer on AI"
63:34 Perplexity's heuristics for measuring user intent
65:41 Model Council, Jensen Huang, and orchestrating frontier models
71:23 Most impressive AI experiences — WhisperFlow, Grok on X, Claude Design, and WHOOP
78:29 Hiring at Surge AI and Perplexity

Subscribe to This Week in AI on Apple: https://thisweekinai.ai/apple
Subscribe to This Week in AI on Spotify: https://thisweekinai.ai/spotify

Follow Jason:
X: @jason
LinkedIn: /jasoncalacanis

Follow Oliver: https://x.com/oliverkorzen

Check out all our partner offers: https://partners.launch.co/

Links Mentioned on the Show:
Perplexity: https://perplexity.ai/
Surge AI: https://surge.ai/
Perplexity Comet: https://comet.perplexity.ai/
Perplexity Max: https://perplexity.ai/max
Claude Code: https://claude.com/product/claude-code
Claude Design: https://claude.com/claude-design
OpenAI Codex: https://openai.com/codex
GitHub Copilot: https://github.com/features/copilot
Cursor: https://cursor.com/
LM Arena: https://lmarena.ai/
Kimi K2: https://www.kimi.com/
DeepSeek: https://www.deepseek.com/
Apple Silicon: https://www.apple.com/mac/m4/
WhisperFlow: https://wisprflow.ai/
WHOOP: https://www.whoop.com/

pubDate Thu, 23 Apr 2026 15:19:23 GMT

author Jason Calacanis

duration 4780000

transcript

Speaker 1:
[00:00] Can these other companies keep up with Claude Code, Cursor, Codex, GitHub Copilot?

Speaker 2:
[00:06] We're almost still sort of at the beginning of all of this progress that can still be made. Someone else could catch up.

Speaker 1:
[00:12] Are we in the end game when it comes to coding?

Speaker 2:
[00:15] I don't think we're anywhere close. Coding is very different from playing Go, in that coding is completely open-ended.

Speaker 3:
[00:21] The space of possibilities in coding is endless. It's limited purely by your imagination.

Speaker 1:
[00:25] Are the LLMs gonna get commodified?

Speaker 2:
[00:28] I still don't think that the AI models themselves are gonna get commodified, even if they all have the same degree, the same level of knowledge. People will naturally sometimes just want to talk to different models, depending on their mood, even for the same topic.

Speaker 1:
[00:39] What do you think about where the value will start to accrue?

Speaker 3:
[00:42] I believe that the value is in the application layer. One of the main reasons some of them went out of business is that they couldn't build an application. The pure API model doesn't work.

Speaker 2:
[00:51] I think there's a difference between optimizing for what the humans want, what the real users want, and what makes their lives better, as opposed to optimizing for clicks and engagement.

Speaker 1:
[01:00] Thanks to our friends at PayPal, the exclusive sponsor for This Week in AI. Try the payment and growth platform that's trusted by millions of customers worldwide. PayPal Open. Start growing today at paypalopen.com. All right, everybody, welcome back to This Week in AI, episode 10. This is the new roundtable that I've been doing in order to get smarter about AI and keep up with an industry that is moving so fast that every month is probably a year in any other industry. Keeping up with it is incredibly hard. That's the point of this podcast. We'll talk about whatever's happening in our industry with experts, the people who are actually building the future. And we've got an amazing roundtable this week. Aravind Srinivas is with us. He is the co-founder and CEO of Perplexity AI. Started out with AI-powered search and answers. People became addicted to that. And then released something called Perplexity Computer, in addition to a number of other really great products. And per the FT, the Financial Times, your revenue has grown from 100 million to 450 million, apparently. I don't know if you've confirmed that or not.

Speaker 3:
[02:12] But you've had quite a run.

Speaker 1:
[02:14] You did confirm it. OK.

Speaker 3:
[02:15] 500 million was what we confirmed a week or two ago.

Speaker 1:
[02:19] Amazing. So this is hyper growth at its best. And is it Perplexity Computer that's really driven this? Is it that interface?

Speaker 3:
[02:27] That's right.

Speaker 1:
[02:28] Why? Why has this product become such a hit in your mind? I use it every day. I love it. I love the Model Council. I've been using the Perplexity Comet browser and won't shut up about it. But you can download this incredible app. You have Perplexity Computer. Why is it catching people's attention? What are people using it for?

Speaker 3:
[02:48] I think it makes agents really simple. That's the core reason. It's the most intuitive interface to be a manager of several agents, to essentially orchestrate several different agents, and you don't need to think about whether it runs locally or on the cloud, or about setting it up. There's no onboarding required, no onboarding pain. There's no need to bring API keys. It all works intuitively in the same interface that you're used to asking people to do stuff for you. And it connects to hundreds of connectors that are valuable. It puts all the models in one harness, one agentic harness. So you don't have to feel a vendor lock-in to Claude or GPT. You can just be guaranteed the best model will do whatever it's supposed to do. And people love that. People are using it for a lot of deep and wide research and browser automation, data analysis, so many tasks, building dashboards, building web apps.

Speaker 1:
[03:55] Yeah, we've started using it internally. We obviously had a fascination with OpenClaw. We're still iterating on that open-source project. We've tried Claude Cowork, and then the team just started loving Perplexity Computer to the point at which I had to upgrade to the $200 a month account.

Speaker 3:
[04:11] That's awesome.

Speaker 1:
[04:12] So you got me on the hook. And we're using it.

Speaker 3:
[04:14] That's awesome. I'll be happy to do any customer support for you guys. So please feel free to ping me if you have run into any issues.

Speaker 1:
[04:22] We will ping your customer support line. We're having good success with back office functions. You know, we have a venture capital firm, we've got accounting and we've got legal documents, we do due diligence. So we've been writing the scope of work, you know, the standard operating procedure, the best practice for, say, doing due diligence on startups. And now we're going into the Perplexity Computer product and trying to figure out, hey, which sections can we actually give to it? And how well does it work? You know? And it's been quite impressive. Also joining us today, Edwin Chen is here. He is the founder and CEO of Surge AI. They're doing data labeling for all these frontier models, founded in 2020. And this has become an incredible space. You have clients from OpenAI to Google, Anthropic, Microsoft, Meta. From what I understand, 130 employees, approximately 50,000 expert contractors. Some of this might need to be updated, I'm not sure, Edwin, because like I said earlier in our opening, things are moving really fast. But there's been a lot of brouhaha on the internet about expert networks as well. Is this a great business? Is it a terrible business? They're very fast growing businesses, which always gets people wringing their hands. We have an investment in Micro1, which I think is a contemporary of yours. Tell us a little bit about the business, why it's important, and then maybe address the criticism of the industry, I don't know if it's a backlash, that we saw in the last week or two.

Speaker 2:
[06:00] I would start by saying I actually hate the terminology data labeling, because when you talk about data labeling, you think about people doing incredibly simple things like labeling images of cats and dogs and drawing bounding boxes around cars. I think what we do is actually so much more complex than that. I often think of what we're doing as building a school for AGI. We have all these incredibly smart physicists, like Harvard professors, Princeton graduate students, Stanford computer science PhDs, and what they're doing is they're cross-examining these models and probing them to figure out when they make mistakes. Then when they make a mistake, they're going in and teaching them all these incredibly advanced things. I think it's one of the most profound things that we can do for AI. It even goes beyond teaching. I often think about what we're doing as raising these models, not just to be correct, not just to produce the answer to a question, but to think and to have certain kinds of values, and to have wisdom and taste and all that.

Speaker 1:
[07:04] Is there an industry term that has evolved this from data labeling? Because I agree with you, it's much more than that. When you hire PhDs or lawyers or CPAs to do this data training, is it data training? What's the right term in the industry, or does it need one?

Speaker 2:
[07:22] Again, I often think about this with either a parenting or an education analogy, where what we're doing is going beyond teaching them facts, going beyond just teaching them, oh, this is a Wikipedia page and here's a correct answer. Instead, we're trying to teach them creativity and taste. Personally, the terminology I like is teaching the models, AI teaching, because I think it actually goes beyond training as well. You're also measuring them and all that.

Speaker 1:
[07:55] How much are these, collectively, the frontier models spending on this? It seems like billions of dollars a year, yeah? Yeah.

Speaker 2:
[08:02] I think the crazy thing is I actually think it pales in comparison to their compute budgets. I think they should be spending a lot more.

Speaker 1:
[08:08] Yeah, fair enough. These are true experts getting paid 100 bucks an hour, 200 bucks an hour, from what I understand, in very specific fields, because a lot of the data on the web that could be crawled has obviously been crawled. How does it work? Just take us through it mechanically, and then we'll get into this amazing doc that we have today. We've got four or five great subjects we're going to chop up. But just for the audience to understand, how does it work? Do you take the queries that people gave a thumbs down to when they were using an LLM, and does that get routed? If somebody is not happy with a Perplexity query, does it get routed to you to fix? Or do you just say, hey, let's just hire this group of attorneys to take these important cases in the world and annotate them in an intelligent fashion? How does the data work?

Speaker 2:
[08:59] Yeah. There's actually a bunch of different ways it can work. Probably the most canonical way it works is, okay, so you have this expert mathematician. In the course of their normal research, they're trying to prove some new theorem. They will just interact with the models as if they're doing their normal research. Hey, try to solve this problem, or try to explain this concept to me. They keep on doing that until they find a failure from the model. Again, this is why I often think about it as cross-examining the model. You're talking to them until you find this very, very interesting failure. Sometimes there are ways of accelerating finding that failure. We may do various things on our end where we're asking our data science team to find loss patterns in the models that point to the failures. Or sometimes frontier labs will send us certain kinds of queries where they sense that users are unhappy. It's not always the case that the user is right. Users are often wrong. They thumbs-down things for all kinds of reasons. So we still need to make sure that the model actually failed there. So what we do is we'll verify that the model failed, and if so, we'll teach it the correct answer. There are all these different ways of coming up with these almost broken gaps in the model's reasoning. It might be us finding it ourselves, it might be through various kinds of analyses that we do, it might be through user conversations. We basically take those conversations and then we teach it the right answer.

Speaker 1:
[10:23] Yeah, it's becoming a great job for people. All right, listen, topic one: the industry was taken aback by Tim Cook deciding to transition out of the CEO role. This is an important thing for us to discuss here because Apple has a huge opportunity in AI, for a number of reasons, and they named John Ternus as CEO. Cook's going to be CEO until September 1st, then he'll move up to Executive Chairman. He's still going to work on industry relations, but Ternus has been there now for 25 years, and he worked on a lot of very important hardware products while at the company. I guess the take I'm most interested in hearing, Aravind, from you and also from Edwin: Apple Silicon has arguably been one of their great success stories. They got off of Intel, and then when the OpenClaw thing came out, or Kimi, DeepSeek, a bunch of open source models, people said, hey, where can I run these? If you don't have an NVIDIA rack, people started pulling together Mac Studios with 128 gigs of RAM, 512 gigs of RAM. And they also have Siri. So you have Siri, you have Silicon, and you have a system, an operating system. So three big S's there. What should the new CEO do? What would you advise them to do, Aravind, if you were working with them, with this incredible group of assets? Because they don't have a language model to speak of. They've worked on some open source projects. They've got a dysfunctional, broken Siri that everybody wants to throw out the window of their car when they try to use it. But it does feel like they are positioned well. What are your takes?

Speaker 3:
[12:17] I actually think the M series chips, a project that was led by John Ternus, the current CEO, are one of their underrated assets. I think people really underestimate what it takes to build a powerful chip. At this moment in time, it is even better on the benchmarks than DGX Spark, at least for local inference of LLMs that can be hosted locally. The open-source models, you might have seen Kimi K2.6 that launched recently, I think yesterday, that seems to be doing even better than Opus and GPT on some of the Terminal Bench and agentic suite benchmarks. So I do think these models, like Qwen 3.6, Kimi K2.6, are getting to a point where they can be competitive with the frontier, but they could also potentially run on one of your MacBooks or on auxiliary hardware like Mac Minis. Especially the M6 chips are going to be even better. They've already secured a lot of the fab capacity in advance for the 2nm chips for the next year or two. So they should go deeper on this. And I think you have the perfect leader for that. And Tim has set up the company well, so that Apple Silicon as a bet has paid off for multiple years into the future. So if agent loops start running locally, that's the CPU compute. All that stuff doesn't need to be centralized on servers. You get to own your agent loops, what data your agent accesses on your local system, local files, local apps, messages, emails, notes, photos. All that can stay private. And the orchestration loop can run locally. And the model orchestrating them could also potentially run locally. And which company is best positioned to profit from all this? I think it's Apple. So they're actually in a pretty good spot.

Speaker 1:
[14:22] Yeah, this is an incredible vision, if you think about it, Edwin, because frontier models are expensive. Now, they are the frontier models, so they tend to be ahead. But as you learn if you're a Perplexity user, and it's just picking the best model for you, for nine out of 10 queries that most people do, they actually don't care what model is used.

Speaker 3:
[14:41] Exactly, this vision is compatible with frontier models coexisting together. Like, this orchestrator can still ping a subagent that relies on a frontier model. It could use your own API key, or it could use a Perplexity centralized version, doesn't matter. But the key thing is the loops start running locally on your hardware, the agent loop itself, the recurring processes like event triggers. Like, we could have a trigger that says, every time Jason texts me about an issue on Perplexity Max, make sure to alert my support team about it. I could set up a lot of loops like this that just don't need to run on any server, and then it starts to be my own personal computer, or my own agent that I own, and the hardware device that's best suited for this is Apple's ecosystem.
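[Editor's illustration] The locally running trigger loop Aravind describes can be sketched in a few lines. This is a hypothetical illustration, not Perplexity's actual implementation: the function names, the simulated message feed, and the trigger rule are invented for the example; a real version would hook into OS-level notification APIs.

```python
def watch_messages():
    # Hypothetical stand-in for an OS-level message feed; here we just
    # simulate a stream of incoming texts on the local device.
    yield {"from": "Jason", "text": "Hitting an issue on Perplexity Max"}
    yield {"from": "Ana", "text": "Lunch tomorrow?"}

def matches_trigger(msg):
    # The trigger from the example: Jason texting about a Perplexity Max issue.
    return msg["from"] == "Jason" and "issue" in msg["text"].lower()

def alert_support(msg):
    # Stand-in for notifying a support team (email, webhook, etc.).
    return f"ALERT support: {msg['from']} reported: {msg['text']}"

def run_agent_loop():
    # The loop itself runs entirely on local hardware; no server round-trip
    # is needed to evaluate the trigger against local data.
    alerts = []
    for msg in watch_messages():
        if matches_trigger(msg):
            alerts.append(alert_support(msg))
    return alerts

print(run_agent_loop())
```

The point of the sketch is that both the trigger evaluation and the data it reads stay on-device; only the optional subagent call would ever leave the machine.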

Speaker 1:
[15:36] Edwin, what are your thoughts here on the power of the silicon and what the new CEO should do, given, hey, they're kind of starting from zero in terms of any kind of product that's facing the, they're a massive customer base. What would you do? Do you have the same vision as Aravind, or a different one?

Speaker 2:
[15:56] So I think I would say two things. One is, historically, a lot of people have thought that LLMs were going to be commodities. Like, at the end of the day, every model is going to be intelligent to some level, and they're going to be interchangeable. What I believe, and what we've been starting to see over the past year, is that actually every model has a different personality. Like, you interact with ChatGPT, and it just feels very different from the type of conversation, the personality, the type of taste that you get from Claude or Gemini. And so if they're not a commodity, and I really don't think they're going to be, you really, really, really need your own foundation model, because AI is just going to be so important to the future, and to the kind of feel that you want your products to have, that you really are going to need your own foundation model. Otherwise you're just going to be relying on somebody else's taste, somebody else's sophistication. So I really do think that the base foundation model is going to be incredibly important for Apple, and they really need to be able to do it themselves. I think that others...

Speaker 1:
[16:59] Do they need to own a model, Edwin, on that first point? Do you think that they should either buy a model company or just start a group and maybe fork an existing open source one? Because they have an image one that they work on that's open source. What do you think they should do in terms of building models? And then you can definitely go on to your second point.

Speaker 2:
[17:18] Oh, yeah. I mean, I definitely think they really need to build their own. Because if you don't build your own, you're relying on somebody else's taste and personality for how an AI should behave. And, like, you know, Apple obviously has always had such a strong vision for what their products and what their design should be that they can't just outsource it to somebody else. Like, sure, they may be able to do it temporarily, just to play around with all these different concepts and with what AI products may look like on an Apple device. But if they really, really want to own their future, they're going to need their own. They're going to need to infuse their own values into the way they want these AI systems to behave.

Speaker 1:
[17:57] Seems clear to me that they are going to put the whole company behind this. And one has to wonder. Like, Steve Jobs started the Apple Silicon movement, I think it was 2008, 2009, when they made the decision, and they came out with the first products, like, eight, nine years later. This was a very significant strategic effort. But I don't think that they had in mind, like, oh, this is going, at that time, there were no large language models. Nobody even knew that this product would exist. But man, talk about serendipity and making a great bet, huh, Aravind? If you think about Steve Jobs' legacy historically, he kind of saw around a corner, or maybe two corners at once, this impossible task.

Speaker 3:
[18:40] Yeah, I think Apple Silicon is like a very fortuitous bet. Like, it wasn't necessarily a bet they made to build hardware for LLMs. But the hardware got increasingly more and more powerful. The neural engine is very capable. The MLX compiler is like really, really good. And they have a lot of expertise in building these things now. And not just that, it's not just about Mac Studios or Minis. Consider the fact that if you do want an ecosystem of compute, right, like you want to wear an Apple glass, let's say, in the future, you want to parse whatever you're seeing and you want to start asking questions about it. And all that pairs seamlessly with some auxiliary hardware you have at home, but it's all running as a pseudo desktop server. So you're able to pair all the compute in one family of devices. That's the kind of magical experience you can provide to a consumer without draining the battery on the device itself. All that stuff hasn't been converted into a real consumer experience yet. But it feels to me that even if other people build all these consumer AI devices, they're eventually going to lose to Apple, because they have the chips advantage, they've already secured the capacity for years in advance, they have the OS, they have the ecosystem lock-in, and they have all your personal context and you trust them to handle it in the most privacy-conscious way.

Speaker 1:
[20:15] This is a key point. There's the privacy. I'm coming back to it because you mentioned it earlier. As a corporation, you want to own your agent loops, the agentic knowledge that your organization is building. It's essentially your entire business, and to feed it into another LLM, well, that might seem to some people like, okay, no big deal, until all your secrets are now being used by your competitors because you just did the training on the next Claude model, yeah?

Speaker 3:
[20:43] Exactly. This is also why I think even if they're using a different model, I guess the news is they're working with Gemini, they will host it on their own silicon, they will customize it to their own needs, they'll be doing a lot of custom post-training for that. So my sense is that even though a lot of people have opinions on how bad or good Siri is, they have the ability to take time and do things the way they want to, because they have a lot of advantages as a brand that people truly trust. And the ecosystem lock-in is underrated, and the auxiliary hardware devices, that chip advantage, all this is really underrated right now. Because, and here's my opinion, I haven't said this before: the iPhone is actually not getting disrupted by AI at all. In fact, the better AI works, the more the iPhone essentially becomes your digital passport. It has your wallets, it has all your cards, it has your passes, it has your health records, you connect with other human beings through it, you do FaceTimes, you do calls, you have your photos of precious moments in your life. All these are things that are truly personal to you and have no connection to AI. Yeah. That's why they can actually afford to move slow.

Speaker 1:
[22:14] Yeah. It does seem to me, Edwin, that it's the privacy plus the silicon plus, oh my gosh, my photos are here, but I don't want my photos up on OpenAI, all due respect to ChatGPT. But even with Google, I'm like, do I really, I turned off syncing and took my photos off Google. I'm like, I don't think I want those in the Google Cloud right now. I much prefer to keep all my kids' photos, and maybe I'm a weirdo, on my local device, and I trust Apple to not train their next image model on my kids' images, etc. Yeah?

Speaker 2:
[22:49] Yeah, exactly. I mean, especially because these models are so powerful that when they're trained on certain pieces of data, they just end up regurgitating it. I think it's really, really important that people should wonder about where their data comes from.

Speaker 1:
[23:01] Yeah. The next story. Edwin, you were talking on our group chat there about the massive amount of late stage capital, and we are in a really interesting moment in time in venture capital. We've all been in the industry for a while. But the amount of money and the velocity of the money coming into this space is extraordinary. AI companies raised $242 billion in Q1 of 2026. I'm assuming that number includes the giant $100 billion raised by OpenAI. But your company, Surge, Edwin, you waited to raise money. I think you hit like a billion dollars in revenue before you did your first round. Maybe talk a little bit about why you went with the bootstrap model for as long as you did, and then what impact all this money being dumped on founders is going to have on the industry. And then Aravind, I'm going to go to you just to talk about how you manage your treasury, because you've also been a beneficiary of this.

Speaker 2:
[24:01] Yeah. So, I mean, first of all, we've actually never raised. So, we're still-

Speaker 1:
[24:04] You've never raised?

Speaker 2:
[24:05] So, we're still happily bootstrapped, and growing in, I think, the best way possible. So, I'm actually really, really happy that we've never raised. And yeah, to your point, I think it's never been easier to raise money in AI. And that's kind of the problem. Like, when you raise a billion dollars, you get all these growth targets from your investors that incentivize volume over quality. You get all this board pressure to spend your time optimizing for your next fundraise instead of the product that you're building. Like, all of my friends who are CEOs of other companies, they're like, oh yeah, I have to spend the next few weeks just prepping a board deck. And they're always jealous of the fact that I don't have to do the same thing. And I think the problem is, I've heard that for some post-training teams, their goal is not to make their model more intelligent. The goal of all these post-training teams is actually just to get their companies to a billion users. And if that's your North Star, and yeah, it's a North Star that happens when you raise gazillions of dollars, is it a surprise that your model starts trying to whisper in your ear and starts clickbaiting you? Like, it's kind of funny. A couple of weeks ago, I was chatting with ChatGPT. I was asking it, you know, give me some tips for what to do in Tokyo. And it ended a response to me with, hey, by the way, do you want to hear about one weird trick that nobody else knows? And it was just shocking to me, because we have this superintelligent model and it just sounds like a 2002 tabloid. And the problem is, this is what happens when you have all these different incentives that don't align with what you were originally trying to build. So yeah, I think we chose a completely different path, and I think we're really happy that we did.

Speaker 1:
[25:41] Super impressive. And you hit a billion dollars of revenue bootstrapping, which I'm trying to think of another company that's done that in our industry and I can't. I don't know if you can, Edwin. Have you heard of one who's hit a billion in revenue without raising venture capital? I mean, I know people who have been incredibly judicious about raising, but that's a true first for me. I don't think I've ever heard that.

Speaker 2:
[26:07] Yeah. I mean, I think it's really important because I actually really do think that AI is just so important for our future that it kind of needs to be shielded from the typical Silicon Valley growth hack playbook.

Speaker 1:
[26:18] The only ones I can think of, MailChimp was famous for that, and Patagonia, but that's not in our industry. And there's also Zoho, that other company that didn't raise a ton of money to do this. Aravind, you've got a war chest you've raised. You've raised from some of the most important companies and investors in the world. Do you think we're seeing unnatural acts? That's the term I use for the phenomenon Edwin's talking about. I saw it up close and personal when I was in the publishing space, and people would chase clickbait. They would do all kinds of unnatural acts to try to get their page views up, and BuzzFeed and Business Insider would be these canonical examples of just lunacy. How do you stay grounded? Then also, you're in competition with people, so if you don't raise, then there's capital as a weapon, as I saw up close and personal with Uber versus Lyft versus Sidecar. Travis was an absolute monster when it came to raising money, and if you invested in one, you couldn't invest in the other. That's gone away here a bit. But what are your thoughts on this issue?

Speaker 3:
[27:27] Huge kudos to Edwin for doing what he did. I think not raising capital and getting to a billion in revenue is very, very hard, not just in terms of business building, like, you know, just financial health, but also convincing other people to come join your venture. They're all looking at valuations. They want validation from the rest of the world. Like, to convince really good employees to come join you when you don't yet have a working business, they want to see validation from somebody else, which could be a reputed venture capitalist. So trying to build a team without that is very hard. So kudos to him. I think one thing that founders need to take away from the success of companies like Surge is the need to be more disciplined. You can raise money for sure, as long as you truly know that you're spending it the right way. Like, Elon raises a lot of money, but he knows exactly what to do with it. xAI has raised a lot of money, but it's being spent on building data centers. He's known for being very judicious about the allocation of capital. So you can be both: you can have the discipline of bootstrapped founders, but you can also have the ambition of the most successful founder in the history of capitalism, Elon. So if you can figure out a way to be both, you could be far more successful. So that's my takeaway. It doesn't have to be a dichotomy between staying bootstrapped forever or raising endlessly with very undisciplined capital allocation. I think you should just be a very good capital allocator, and you should have a clear plan for why you need money. Then you should continue to have the discipline of a bootstrapped founder even if you have a gigantic war chest.

Speaker 1:
[29:24] How have you managed that you've raised? How much have you raised to date? I think it's been public.

Speaker 3:
[29:28] I think we've raised around 2 billion cumulatively. We haven't raised since August of last year. Our goal is actually to advance further on the revenue progress we've been making since the beginning of the year and try to become profitable. Unlike a model company, we don't have to spend a lot on compute, particularly on training. We do a lot of post-training, but we don't do any pre-training, so we have no excuse not to be profitable. For companies in the coding application layer, gross margins on revenue are actually negative. We don't have that problem. Gross margins on all the revenue we make are positive, highly positive in the case of Max users, the $200-a-month plan. So our goal is to just keep growing the top line, stay disciplined, not spend more on payroll or infra, and become profitable as soon as possible. And when that happens, we don't think more capital leads to a meaningful change in our destiny. That probably should become the blueprint for application layer companies: run the company efficiently, with the discipline of bootstrapped founders like Edwin, and keep growing top-line revenue.

Speaker 1:
[30:47] Having the unit economics dialed in, Aravind, is critically important. We've seen some other folks who supposedly don't, and I think the coding space is the number one example. They're just losing money.

Speaker 3:
[31:00] That's correct.

Speaker 1:
[31:01] Yeah. On some percentage of their users, or maybe the entire user base in aggregate, they lose money. Yeah.

Speaker 3:
[31:08] That's right. As far as I know. It could change, but today that is the case. And this is not because of the product. It's actually because of the frontier labs subsidizing tokens in the form of a subscription plan. So even though Claude Code is $200 a month, the amount of tokens you can consume on Claude Code is actually worth more than the $200 a month you pay. They're running it as a loss leader to dominate token collection, in order to take all those tokens and make their models even better. So if you're an application layer company competing with Codex and Claude Code in coding, it's pretty difficult for you to have any positive gross margins.

Speaker 1:
[31:51] Edwin, I would assume that part of that is whoever has the most tokens consumed, specifically in coding, will have the best model because you'll have the most reinforcement learning and all that usage from developers is going to create a signal for your model, yeah?

Speaker 2:
[32:10] Yeah, I think there are a lot of interesting signals you can learn from, and the more your models reason, oftentimes that just leads to better responses.

Speaker 1:
[32:19] Can these other companies keep up with Claude Code, Cursor, Codex, GitHub Copilot? Are they going to keep up, or do you think we've hit this acceleration where Claude's going to run away with it, Edwin?

Speaker 2:
[32:33] I don't think there's anything inherently preventing any other companies from catching up. Certainly that data is valuable, but we're still almost at the beginning of all the progress that can still be made, so I think someone else could catch up. We'll have to see.

Speaker 1:
[32:50] Do you do data labeling for all this Fortran and COBOL? Is that part of the desire of these companies, to get you to find these old graybeards to explain to you how these AS/400s and microcomputers from the 60s and 70s actually work? Is that a big business for you?

Speaker 2:
[33:10] It is. I mean, the coding landscape is just so huge that we have to be part of every aspect of it. So yeah, it's every single language. It is front-end design and back-end design. It is the correctness of the algorithms, the efficiency of the algorithms, but also the quality and the beauty of the front-end designs that they create. So it's just a wide landscape that we have to be part of every aspect of.

Speaker 1:
[33:34] Are we in the endgame when it comes to coding? It is a finite set of data, Edwin, so it would seem to me there will be diminishing returns at some point. Are we 96% of the way there or 99% of the way there? Self-driving is apparently 98%, 99% of the way there, with the edge cases remaining. How would you contextualize that game, if we made it a game, in terms of being perfectly solved? Chess got perfectly solved. They believe No Limit Hold'em has been almost perfectly solved. And PLO, we'll see if that eventually becomes perfectly solved. I think they're on the way.

Speaker 2:
[34:13] I don't think we're anywhere close. One of the things I often think about is that coding is very different from playing Go, in that coding is completely open-ended. You could literally create any program in the world. It doesn't have a single solution or a single end state, whereas a game of Go ends with one person winning and the other person losing. One analogy I often think about is: imagine you took Jeff Dean and gave him a thousand years to learn more about coding and to explore the world, and also to learn about poetry and mathematics and physics and history and artistry and all of that. He'd be able to incorporate all of these principles into what he builds. Software is about building globe-spanning infrastructure, it's about designing rocket ships. There's almost an infinite ceiling to what coding is capable of, so I really think we're just one percent of the way there.

Speaker 1:
[35:07] Where do you stand, Aravind? Do you think we're getting close to solving the game of coding, and everyone will just be able to make quality code, or do you think we're producing a lot of slop with a lot of attack vectors? We've obviously covered the various attack vectors that AI is helping identify with Methos, but what's your take on where we are in terms of solving the game of coding?

Speaker 3:
[35:36] I think the framing should be around what solving means, right? Maybe think of it as paradigms. Cursor, GitHub Copilot: that's autocomplete. You're trying to complete a few lines of code, but you are largely the one writing code. Claude Code and Codex, as command-line interfaces, you can almost think of as auto-diff. You're looking at the diff, the new lines of code added and the existing lines of code subtracted. You're not autocompleting anymore; you're operating at a different abstraction of changes. The next paradigm is just going to be auto-outcomes. You're not even going to look at the diffs. You're not going to read any line of code. You're going to look at the outcome, then ask for changes, and keep iterating. That's clearly the next thing. And Elon talks about it. I think he talked about it in one of the xAI all-hands livestreams, where he said you're going to just output the binary.

Speaker 1:
[36:44] Which is a wild thing to think about.

Speaker 3:
[36:46] So I still feel we haven't hill-climbed on capability yet. I think we're still very early in what it means to solve coding entirely. Of course, there are problem-solving skills, like the ability to connect dots across different things that Edwin was talking about. I almost imagine: what is Jeff Dean, or someone like Jeff, coding now? I heard Linus is also using AI. How do these people do things today that they were not able to do before? All these things are very interesting to think about. But also, fundamentally, if AIs can move you to the level of working at outcomes and binaries, then what you think of as coding also changes, from just inspecting lines of code and things like that.

Speaker 1:
[37:41] I think that's the most exciting part. I always say startups are where you can see these trends before anything else. It's like Santa Monica with yoga and fresh food and farmers markets. Everything interesting starts with the hippies in Santa Monica and Venice and then goes east, if it winds up making it past there. I feel like startups are the same thing. Startups now will have two or three people. They'll never add the fourth employee, the fifth employee that they thought they were going to add, and they're shipping code faster than I've ever seen, and they're doing their go-to-market and their customer acquisition using things like Perplexity Computer, and they're producing code at such an alarming rate, Edwin. I actually think we're going to see a significant amount of job creation as more people realize, I don't need a developer to start a company. We had a false start. There were people using scripting and, oh God, what was it called? Before vibe coding, there was a term for this. The tools were almost like WYSIWYG editors. Gosh, what was the name of it? Do you know, Aravind, what people used to do where they would, in effect, vibe code?

Speaker 3:
[39:00] No code?

Speaker 1:
[39:01] No code. Thank you. The no code movement, which was such a false start, but I used to have people pitch the accelerator, and I'd be like, oh, who built this? I did. I'm like, okay, whatever. I thought you were the salesperson from Salesforce who started their own company. Yeah, but I just figured out how to use no code. And no code's just gone now, right? It's just totally replaced.

Speaker 3:
[39:20] It's fairly limited in what it can do, in terms of the possible set of things you can do, because it was built with certain intentionality, certain deterministic behavior, a certain level of hard coding. Obviously, it can't cover all the combinatorial possibilities of models that can just generate code on the fly and do whatever you ask them to do. This also goes back to the point Edwin made earlier, which is that the space of possibilities in coding is endless. It's limited purely by your imagination. The kind of structures and worlds you can build inside Minecraft is endless, so as a game, Minecraft is even more complicated than Go is, and Minecraft is just one game that AI can code. The world is full of infinite possibilities. That's why solving coding means you have something that is truly general-purpose intelligence.

Speaker 1:
[40:18] What are you encouraging your developers to use internally? How much more productive are they this year when you look at it?

Speaker 3:
[40:29] Yeah, so largely it's two camps, Codex or Claude Code. I have been trying to understand why one is preferred over the other, and it keeps changing, but I can share the rough understanding I have today. For Swift UI and Rust, people like Codex; it seems to be better there. For frontend development and full-stack development, people like Claude Code, especially if you want frontend design work done. This also goes to the point Dario made in one of the podcasts recently about models commoditizing. Actually, what's happening is models are specializing, and even within a specialty like coding, there are specializations in which aspects of coding each frontier lab is actually good at. Which is also why we wanted to build a product like Computer: when models start specializing deeply, orchestration of what each one is individually skilled at becomes valuable. So yeah, we're largely in these two camps. Headcount-wise, we've remained flat since the beginning of the year, and over a period of one year, from last year, same time, to now, we have grown roughly just 30 percent. I want to remain this efficient, and I want our company to be an example for many other founders in the future: build sub-500-person companies that can make several billions in revenue. I think that's the way to go, because you want that force multiplier. You want your designers to write code, you want your business professionals to do data analysis, you want your sales reps to make their own presentation decks and do data analysis of the customer. You want them to do the bug triaging; you don't need a program manager or a project manager to be an intermediary there. It just vertically integrates your company even more.

Speaker 1:
[42:35] Everybody is adding skills, which is the opposite of what we lived in for 20 years. Hey, pick a specialty, be an expert, Edwin, was the advice. Now, if you're a salesperson and you can redesign the landing page for the demos and you think you have a better idea, you can just vibe code it and send it, and the dev team is like, okay, whatever. Or if you're the chief revenue officer, you don't have to go to the data analysis group; you can just dump your spreadsheets into Perplexity Computer and just rip. How are you using this? You heard the two camps. And Aravind, just to put a pin in it: 30% headcount growth, 5x revenue growth. That's significant if you think just about efficiency. And Edwin, you have 130 people, at least in my research, somewhere around that number. And if you're over a billion in revenue, it doesn't take a genius to figure out how efficient you are right now. So how do you think about efficiency and company building?

Speaker 2:
[43:35] Yeah, so I absolutely agree with Aravind. I really strongly believe that, historically, there's been this incentive for companies to grow as much as possible, as quickly as possible, and I think people always underestimate the bureaucracy and the politics and the communication complexity that creates. Very, very few people actually want to be running a 5,000-person company. You're no longer spending your entire day playing with your product and talking to your users; you're just spending your entire day being a corporate CEO who's managing the company. So I absolutely agree with Aravind on that front. And to your point, one of the things I love about things like Claude Design, for example: it used to be the case that if one of our frontend developers, or even somebody on our operations team, wanted to prototype a new interface or a new landing page for these experts that come in, they would need to write down their ideas, send them to a designer, wait for the designer to sketch something up, which might take a couple of days, and then maybe it didn't quite look like what they wanted. It would just be this long iteration cycle. Now you can just talk to Claude Design. It spits out something pretty amazing within 15 minutes, 10 minutes, five minutes, and you can iterate so much faster. You actually get to see, this is what the vision looks like, and maybe I don't like it now that I see it, so you change the idea and go somewhere completely different. And Claude Design is online all the time, unlike a designer who sleeps eight hours a night.
And so I think it makes the product development process faster, but also, for the person on the operations team or for the engineer, they just get to own something end to end and see their vision fleshed out, as opposed to delegating parts of it. So yeah, I'm really, really bullish.

Speaker 1:
[45:43] I was at the Breakthrough Prize this weekend, you know, Yuri Milner's science prize, and I was talking to Wonder Woman, Gal Gadot, the actress. There was a director there, Darren Aronofsky, and we were just talking about how AI is impacting Hollywood. You start to think about the unique roles everybody had. There were people who were storyboard artists. There were people who wrote scripts. There were directors like Akira Kurosawa or Spielberg who would draw their own storyboards, or Ridley Scott from Alien and Gladiator, who was known for drawing his own cels. He would draw all of these interesting images that he would then give to a cinematographer. Now, with AI, the people who are writing screenplays and the producers are all coming together, and anybody can do almost anything: write dialogue, do the backgrounds, draw all these cels. And the cross-disciplinary nature of that leads to innovation, Aravind. If you've ever met somebody who had expertise in multiple areas, whether it was computer science and art, they can just make breakthroughs that other people can't. And she was telling me there's a Bitcoin movie coming out, Killing Satoshi, with only a $70 million budget. It would have cost $200 million, but they're just shooting the actors on a gray screen, and everything in the background is being built by AI. So all they had to do was write a great script and have the best actors perform it, and now they can build the movie having had them on a sound stage for only 20 days or whatever it happens to be. Think about what happens in that industry. Now you can make three movies for the cost of one. A really interesting moment in time, yeah?

Speaker 3:
[47:41] Yeah. I mean, Steve Jobs used to tell this story, I think in a lecture he gave, about how most of the work in producing a movie is in the pre-production phase: getting the story right, getting the core story and storytelling right. That was his biggest learning at Pixar. He would sit with the team and go through the whole storyboard, and if it didn't make sense: go redo it, go redo it. We're not going to make the movie until this is so good. This was the single lesson he learned from Walt Disney: you cannot make a bad story succeed no matter how well you produce the actual movie, but even if you don't do a great job on production values, a good story will win.

Speaker 1:
[48:37] You see that with independent films, right? You can see some incredible independent film where you're like, yeah, it's a little rough around the edges, but man, a great performance based on a great story, based on great dialogue. You get all that right, and that is the-

Speaker 3:
[48:50] That's true in products too. Products often work if, on one or two ideas, you just hit it out of the park.

Speaker 1:
[49:00] I think you did that with Comet, by the way. You were the first to drop a browser, and that's when I first started communicating with you. I was like, this browser is unbelievable. I don't know if you've used it, Edwin, but it was the first where you could be like, hey, here's what's on my page, let's work with that, whatever that happens to be. It was the first time you kind of let ChatGPT, or whatever model you're using, out on the real web. And man, that was a major breakthrough. Now, obviously, with hooks and integrations, it's getting to the next level. I wanted to talk a little bit about the commoditization of large language models and the creation of small language models, SLMs, I guess, is the industry term, or VSLMs, verticalized ones. And I think, Aravind, you have the belief that we're starting to hit some form of commoditization, or you're wondering where the value is going to accrue. Is it going to accrue to the harness, to the wrapper? Is it going to accrue to the core model? So maybe you could explain your best estimation of what's going to happen in the next year or two, in terms of people loading Kimi or DeepSeek, or not even knowing which model they're using, and then what a harness is and how people should think about harnesses and the impact they're going to have.

Speaker 3:
[50:28] People don't buy models, they buy products, right? Fundamentally, at the end of the day, consumers pay for services. And pure model companies basically don't exist anymore. Anthropic is playing in the application layer as much as in the model layer. Whatever The Information reported, I forget the exact figure, but 30 to 40%, at least 30%, of the revenue is coming from applications. So that shows you that you have to be an application layer player, whether you build models or not, and the money is in the applications. If you have a model, obviously, you can vertically integrate it with the application, and you can build custom harnesses for your models and train your models to be good on your own harness. That's an advantage you have. But the disadvantage is you have to always ensure you have the best model, all the time. And that's serious, serious competition. It's truly a game you can only play if you have at least tens of billions of dollars in cash to spend on compute. And it's not just about the cash you have; you also have to secure compute capacity, and compete with all the other players trying to secure compute capacity years in advance, and power capacity now, and the hyperscalers need to be invested in you. It's a game you can only play at the highest level. That's also why there are only four or five players playing there.

Speaker 1:
[52:03] And maybe fewer in the future. Maybe we'll see some of them consolidate, or some people get out of that business if they don't feel they can compete. Edwin, where do you think this winds up? You obviously are helping people train their models. These are your customers, so you're rooting for them, you're helping them build. But there are also people saying it really is the application layer. Where is the value going to accrue? Whether it's Google's suite of products in their browser, Apple's suite of products as we talked about in the first topic of the show, Perplexity Computer, Claude, Cowork, OpenClaw, all of these different front ends, harnesses: the orchestration level is becoming more important. How do you think about the orchestration level yourself, and are the LLMs going to get commodified?

Speaker 2:
[52:50] I still don't think that the AI models themselves are going to get commodified. A big part of that is because I think so often about their personalities, or the specializations that Aravind mentioned earlier. I'll give an example. Even if I ask a fairly simple question today, like, I don't know, who was Abraham Lincoln, I get a very different response from ChatGPT versus Claude, for example, or ChatGPT versus Gemini. It's the same with your friends. Sometimes I will ask them certain questions, and even if they all have the same level of intelligence, the same degree, the same level of knowledge, sometimes I just want the quick, snappy answer from one of my friends. I just enjoy talking to them more when I'm in a certain mood. Sometimes I want the really well-researched, really insightful thing, but I know it's going to take me five minutes to get the answer from that friend, and sometimes I'm just too busy to talk to them. In the same way, even for fairly similar tasks, like front-end coding versus back-end coding, or different languages, people will naturally sometimes want to talk to different models, depending on their mood, even for the same topic. So I really don't think that AI models will commoditize.

Speaker 1:
[54:07] I'm wondering, Aravind, if you have the other side of it, where so much of my work is happening in Perplexity Computer and OpenClaw, and I'm like, we need to have these skills, this soul file, these memory files, local on our hard drives. And before we use any language model, we're like, hey, here's the context. This is how I like to work. This is how I like my answers. And I've now had to explain to four or five different models: I like concise answers. I like you to just solve the problem, not give me updates on your thinking. OpenClaw became so verbose recently, in the latest version, I wanted to kill myself. It was like, OK, the user wants me to do this. OK, I'm going to do this. OK, I'm going to do this. I was like, no, no, just give me the steak when it's perfectly cooked. I don't want you to explain all the steps in cooking the steak. So what do you think about where the value will start to accrue?

Speaker 3:
[55:02] I believe that the value is in the application layer. There were a lot of model companies, and one of the main reasons some of them went out of business is that they couldn't build an application. The pure API model doesn't work, because you cannot build a model that's so much better than the rest. No one's able to maintain that much of a significant lead. The only time there was a significant lead in the model layer was when GPT-4 existed, and it took a year for anybody else to catch up. After that, the gap has usually been months, I would say. And even between open source and frontier, I think the gap is six months to a year at this point.

Speaker 1:
[55:45] Is that what you feel, six months to a year? Do you feel the same way, Edwin? What would you say the gap is, open source to frontier?

Speaker 2:
[55:50] Yes, in terms of raw model intelligence, I agree.

Speaker 3:
[55:54] So I think commoditization and specialization are not necessarily mutually exclusive. Some models are clearly getting specialized. Claude models are clearly very good at agentic coding, code execution, and agent orchestration, as are OpenAI models. Google's models are very good at multimodal stuff, because they have a lot of data in multimodality that nobody else has. Grok's models are very good at being unfiltered and unconstrained, and that has its own-

Speaker 1:
[56:30] Unhinged.

Speaker 3:
[56:31] Unhinged. Yeah. Unhinged.

Speaker 1:
[56:33] I think they call it unhinged mode, right? I think that's their-

Speaker 3:
[56:36] Yeah. That speaks a lot to how you shape the values of the model. What do you train it on? What are the fundamental ground truths it assumes are true, or at least has been trained on? That's all not commodity. These characteristics of how these models behave and what they're good at are not commodity. What is commodity is if they're all hill-climbing on LMArena or Terminal-Bench or Gentix V or Humanity's Last Exam; these are all academic benchmarks. If all of them are hill-climbing on these benchmarks, because that's the stuff you publish to researchers to show you are at the frontier, that part is commodity, because open source is also doing that. Some qualities will be specializations, a lot of academic benchmarks will be commodity, and it will be up to the model trainer, the product builder, the application layer owner to take what is commodity and shape it in a way that matters for the use cases they own. You can only survive, you can only have value accruing to you, if you actually own a certain bunch of workflows and have a bunch of loyal, high-retaining customers. Because that's the only way you collect unique tokens, unique data, that you alone can harness, and keep improving on those capabilities.

Speaker 1:
[57:59] How do you think about these LMArenas of the world, the benchmarks, Humanity's Last Exam, Edwin? Because you're helping folks with training, obviously, you're a key player in this. What's your take on LMArena and people optimizing for these benchmarks today?

Speaker 2:
[58:16] I really do think that LMArena is just this terrible cancer on AI. You basically have a random niche subset of the population. I think people don't realize that it's so niche; they think it's a representative set of users, but no, it's a random niche subset of the population that just wants free access to models, and they have endless time to wait for LMArena to spin endlessly before they glance at the responses for two seconds and then click their favorite. What basically happens with LMArena is that models that completely hallucinate beat out models that answer correctly, as long as they have a bunch of pretty formatting that catches the eye of this random niche subset of the population. And the problem is that it's such a visible benchmark. Everybody knows about it in the industry. It's so visible that you have all these VPs, all these CEOs, all these companies that basically have entire teams purely dedicated to hacking it. It's pretty well known within the industry that, yeah, once you have a data science team analyzing the weird, idiosyncratic preferences of this niche population, you can just hack it. And so companies do it, even though their own researchers agree that it simply makes their models worse. So personally, I was really hoping it would completely die out after Meta showed last year how easy it is to hack. But somehow, somehow, it's still here.

Speaker 1:
[59:35] And this speaks to, I guess, what Warren Buffett would say: show me an incentive and I'll show you an outcome. Or Goodhart's law: when a measure becomes a target, it ceases to be a good measure, because everybody starts optimizing for benchmarks. Yeah, Edwin?

Speaker 2:
[59:53] Exactly.

Speaker 1:
[59:54] And what should we be measuring then? How should we benchmark the industry and the models that are being built? Is there a better way to do this?

Speaker 2:
[60:03] So I think the way to really do it is to think about how real humans are using these models in real life. For example, a lot of what we do is simply run human evaluations. We take models, like Claude Code or Gemini, and we actually ask the software engineers themselves: go use this in the real world. Go use it for your actual day-to-day job. They ask their actual queries, and then we measure whether or not it actually helps them. So not only was it correct, not only would it pass this set of unit tests, but was the web page it created something you would actually want to launch to your users? Did it make great recommendations for your A/B test, for the metrics you're trying to optimize, that you actually believe in? As opposed to just playing this benchmark game. A lot of these benchmarks, I think people don't realize, are just very contrived. The prompts themselves are things that no real user would ever ask. And because they're often auto-evaluated, they're purely measured on whether the response matches a certain string. In the real world, that's not what we're looking for. In the real world, you care about things like the creativity of responses, the design of the web page, and so on.

Speaker 1:
[61:19] If you think about it, Google was measuring bounce-back rate at some point. Somebody searches for, hey, what time is the Knicks game today? They go to a web page. Do they come back and click on the second and third result? If they did, you didn't give them the right result. And I remember talking to Larry Page about this 20 years ago, and he was like, eventually, Jason, we're just gonna tell you what time the Knicks game starts. That's eventually what's gonna happen. The computer will just know. And that's a very weird thing to think about. Google's whole existence was: don't come back to the website for that query. How quickly can we get you off of our website? Whereas other people, Disney Corporation, ESPN, were saying, hey, when we get you to the website, how long can we keep you on the website? Can we keep enticing you? Meta, obviously, with Instagram and Facebook: how long can we keep your session going? YouTube: how long can we keep your session going? Two very different North Stars.

Speaker 2:
[62:23] Yeah, I think there's a difference between optimizing for what the humans want, what the real users want, and what makes their lives better, as opposed to optimizing for clicks and engagement.

Speaker 1:
[62:34] Yeah. And AI, at its best, Aravind, should just be solving your problem. And do you track that? Like, did I solve the problem or not with Perplexity Computer? Did I solve the problem or not with Model Council? If people don't know Model Council, you can explain it a bit. That's, to me, the ultimate test: do I keep querying you, and am I happy with the answer? Yeah.

Speaker 3:
[62:59] Yeah. Yeah. So, I mean, to the Larry Page thing you said: eventually the AI, sorry, the computer should tell you what time the game starts. You don't have to click on a bunch of links. That's precisely why we built Perplexity. That was the problem: going from links to answers. So, yeah, we do track that. For example, there's a very simple heuristic: if a user asks a question and then there's a follow-up question like, "But no, I meant...", that means you didn't do a good job on the first question. Initially, we started with just heuristics, because every day we get a lot of queries, so it's hard to run a lot of LLM compute on all of them and filter them. But now we don't care. We have small language models that can just run over a lot of query logs and filter threads where the user clearly had to clarify the prompt again in order to get a better answer, which means you did not understand the user intent well in the first prompt itself. And this is another Larry Page philosophy thing: even if the user's prompt wasn't detailed enough, your job is still to give them a good answer. You should treat a user prompt as intent, not as the actual descriptive prompt. This is a very different product design philosophy from ChatGPT, where, at least in the beginning, they used to tweet stuff like, you're not a high-taste tester if you can't tell why this model is better than the previous model. No. The model should speak for itself. Users should feel it, and you shouldn't blame it on their prompting capabilities.
We took the Google philosophy of "the user is never wrong": even if their prompt was bad, even if their prompt was incorrectly phrased, it's on the AI to disambiguate, understand, and reformulate it, expand the prompt, search as much as possible, give as much information as the user wants, and ask a clarifying question at the end about whether they're happy with that or looking for something else.
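[Editor's note: the follow-up-clarification heuristic described above can be sketched roughly like this. This is purely illustrative; the pattern list, function names, and thread format are assumptions, not Perplexity's actual implementation.]

```python
import re

# Sketch of the heuristic: if a user's follow-up message in a thread starts
# with a correction phrase, treat the first answer as having missed the intent.
# Every pattern and name here is hypothetical.
CLARIFICATION_PATTERNS = [
    r"^no[,.]?\s",
    r"^i meant\b",
    r"^that's not what\b",
    r"^actually[,.]?\s",
]

def missed_intent(thread: list[str]) -> bool:
    """Return True if any follow-up in the thread reads as a clarification."""
    followups = thread[1:]  # everything after the first user query
    return any(
        re.search(pat, msg.strip().lower())
        for msg in followups
        for pat in CLARIFICATION_PATTERNS
    )

# Example: the second message corrects the first, so the thread is flagged.
thread = ["what time is the knicks game", "no, I meant the playoff game"]
print(missed_intent(thread))  # True
```

In practice, as Aravind notes, a regex pass like this would only be the cheap first filter; a small language model over the flagged query logs does the finer-grained judgment.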

Speaker 1:
[65:05] It seems like this was the big innovation with Claude 4.6 and now 4.7. Yeah, Edwin, when you ask it a very simple question, it kind of threads out and thinks, well, what are your next five questions? Or what did you really mean by that? And it tries to reason about it and give you an answer that's much more comprehensive than you could ever have imagined, yeah?

Speaker 2:
[65:29] Yeah, I mean, I think Claude has always been very, very good at the planning stage, where it formulates its plan in advance, then executes it, and then backtracks whenever something is going wrong. So, yeah, I think that's what you're seeing.

Speaker 1:
[65:43] My favorite tool is Model Council. I am super addicted to it. Aravind, I don't know, do you use it, Edwin? Do you use Perplexity's Model Council ever? Or do you have, I mean, you might have your own for your own benchmarking, I guess, internally. But have you used it before? Any thoughts on putting the models up against each other and knowing the diffs? Yeah.

Speaker 2:
[66:06] I am constantly doing that myself just because, I mean, it's almost like a lot of what we do in our day-to-day work or like a lot of our experts do, they're just constantly comparing the models. So I personally do it a ton.

Speaker 1:
[66:18] Which is multiple windows open.

Speaker 2:
[66:19] So I have a special app that I built to do it myself. I mean, and then our experts have a different kind of app. But Aravind will have to give me a demo of Model Council.

Speaker 1:
[66:30] Model Council is spectacular. You give it your query, it threads it out to whatever three or four models you want, and then it will tell you, here's where they disagree, and here's who you should trust. How popular is this now?

Speaker 3:
[66:43] It's pretty popular among the Max users. Actually, Jensen asked me to build that feature. Jensen said he really loves asking different models the same thing. The way he said he would do it is, he would ask Perplexity one question, he would ask Claude, he would ask ChatGPT, and Gemini. Then he would look at all the AIs, see what each of them says, and compare in his head. I was like, hey, Perplexity has all these models in one app. Maybe we can just do that within the app itself, so you don't have to open four apps, read all of them, and then figure out what the differences are. What if we built that feature natively? And we'd be the only app-layer company that could do this, because other app-layer companies have an incentive to just push their own model. So we built it, and he was pretty happy about it. And we rolled it out. Obviously, it's expensive: you're going to ask four or five different models each question. And it's not just aggregating the answers. The orchestrator looks at them and tells you exactly where they differ, where they agree, and what you should truly take away. So there's a synthesis layer that's actually running with a frontier model, too. I like to use it a lot for health queries, because with health queries, the evidence is there on the Internet, but how the models interpret that evidence in terms of the prompt you asked often differs. Some models are very risk-conscious. Some models are actually risk-seeking in terms of what they recommend. So you want both modes of behavior there, and then an analysis. I think that's super useful. I also like asking Model Council what it thinks about different stocks. Like, what do you think of Tesla? Can Tesla be worth $10 trillion in the next 10 years? Or which Mag 7 stock could actually be worth 10x in the next 10 years from where it is today?
I like asking all these different models simultaneously. I think Council is a really cool feature. And Council is just a skill inside Computer, too, so you can use it in Computer.
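[Editor's note: the fan-out-plus-synthesis pattern Aravind describes can be sketched as below. All names are hypothetical, and the `ask` callables stand in for real model APIs; this is a sketch of the general orchestration pattern, not Perplexity's actual implementation.]

```python
from concurrent.futures import ThreadPoolExecutor

# Model Council-style orchestration sketch: fan the same query out to several
# models in parallel, then have one frontier model synthesize where they
# agree, where they differ, and what to take away.
def council(query: str, models: dict, synthesizer) -> str:
    # Query every model concurrently; this is the expensive part mentioned above.
    with ThreadPoolExecutor() as pool:
        answers = dict(zip(models, pool.map(lambda ask: ask(query), models.values())))
    # The synthesis layer sees all answers side by side and writes the comparison.
    transcript = "\n\n".join(f"[{name}]\n{ans}" for name, ans in answers.items())
    return synthesizer(
        f"Question: {query}\n\n{transcript}\n\n"
        "Explain where these answers agree, where they differ, and what to trust."
    )

# Toy usage with stub models in place of real model APIs:
models = {"model_a": lambda q: "Yes.", "model_b": lambda q: "Probably not."}
report = council("Can Tesla be worth $10T?", models, synthesizer=lambda p: p)
print("model_a" in report and "model_b" in report)  # True
```

The key design choice is that the synthesizer receives all the answers at once rather than summarizing them one by one, which is what lets it surface disagreements explicitly.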

Speaker 1:
[69:03] Oh, really? I didn't realize that. And that's the next thing I need. I need an AI to sit next to me while I'm working. I guess this is what Microsoft Copilot was supposed to be. And just tell me, hey, dummy, there's a quick key for that. Hey, dummy, Perplexity Computer has that built in. I need a Clippy that just tells me when I'm doing something on my computer that I'm an idiot and there's a faster way to do it. That would be pretty helpful at this point in time.

Speaker 3:
[69:33] I think that's a feature Apple is best positioned to ship, because you need full access to your screen. You're not going to trust a server-side AI company to do that.

Speaker 1:
[69:45] You ever think about building an operating system? I know this sounds insane. Have you thought about that because you built a browser and you built a computer? What's the difference between Perplexity Computer and Comet Browser? Chrome became an OS. Have you thought about just building an OS that people can boot?

Speaker 3:
[70:02] I think it's certainly an interesting idea. But fundamentally, Jason, everything's about distribution. If you build an OS, you need to get it distributed on actual hardware devices, which means you need an OEM that wants to distribute it for you. If you actually read the contracts Microsoft signed with the hardware OEMs, it will make Google look like an angel.

Speaker 1:
[70:28] That was a big part of the antitrust case back in the day.

Speaker 3:
[70:32] It's still pretty terrible. There's a reason even Chrome OS never picked up. They were largely only able to get adoption at the level of schools and banks and governments.

Speaker 1:
[70:46] I ran our firm on Chromeboxes for maybe two years. The one thing that broke it was Zoom. The ascent of Zoom just made it impossible, because when you did a Zoom call in a browser window, it sucked, and Zoom never really solved that. It was just like, this is never going to work for our team. But people loved it, because at work you could stay focused. You wouldn't have iPhoto popping up. You wouldn't have Apple Music. People just loved the restraint of it. As we wrap here, I want you guys to think about the most impressive AI experience you've had in the last couple of months: a tool, a product, you can shout anybody out. I'll kick us off. I have become addicted to WhisperFlow. I don't know if you guys are using WhisperFlow, but it is the greatest speech-to-text I've ever experienced in my life, and I had given up on the category of speech-to-text because I was just so disappointed in Siri's ability to take dictation. It just never worked. It could never do my last name. I mean, Aravind, it would never be able to figure out yours. It can't figure out Calacanis, but my God, WhisperFlow is so amazing. And then I got a foot pedal for 20 bucks off of Amazon. This morning at my computer, I was using Model Council, and I would press down on my pedal on my Windows machine and just talk and keep talking and keep talking. Because the more you give a model, the better the response. The rambling long prompt is so much better than a short one and having to go back and forth playing ping pong with questions. Then I just lift up my foot pedal, bang, text right in there. It is a life-changing experience. Just that stupid foot pedal and this incredible piece of software. Do you have one, Aravind or Edwin?

Speaker 3:
[72:37] We can name our own products, right? Is that the purpose of the question?

Speaker 1:
[72:41] I mean, you can, but sure, if you want to get one that you love.

Speaker 3:
[72:45] You're asking me for something. Yeah. I would say...

Speaker 1:
[72:50] Feel free to give a shout out to your own.

Speaker 3:
[72:51] But the goal is something interesting. Sure. I'm a big-time user of Perplexity Computer. I love using it for financial research, company research, internal data analysis, all that, sure. But I guess I personally thought the Grok integration inside X has improved considerably in recent times. The explain-with-Grok button on tweets: I don't always understand some of the jokes, and it does a good job explaining them, so I think it's pretty good.

Speaker 1:
[73:24] It's exceptional, especially when you're catching up to a pop culture moment or a breaking news story. Somebody's like, oh my God, I guess the Trump and Iran thing is over. I'm like, I can't keep up. So I hit that Grok button, and it explains the whole context so well.

Speaker 3:
[73:40] Yeah. When I'm on my desktop, I can just use Comet to do that for me. But when I'm on the X mobile app, you have the explain-with-Grok button. It's pretty good. It's a very beautifully done integration, so kudos to them. It actually wasn't good before.

Speaker 1:
[74:00] It was not good before. It was lacking.

Speaker 3:
[74:02] So it definitely improved tremendously. The other thing I'm impressed by is the Gemini Flash model, Gemini 3 Flash. It just seems insanely fast for its capability. It's probably the model that hits the best sweet spot in terms of speed and intelligence. That's a truly amazing piece of engineering they've accomplished.

Speaker 1:
[74:33] It is wicked fast. Edwin, what do you got? What are you obsessing over on the weekends or at nights in the AI space?

Speaker 2:
[74:40] So I actually am a really, really big fan of Claude Design. I think it's really well designed and opinionated. And it's almost like, OK, yeah, maybe this is where I see the Instagram founders' touch. So I think it's a great product. I mean, it still has some bugs, some little annoyances that have been a little frustrating for me. But I'm very, very optimistic about it.

Speaker 1:
[74:58] And Claude Design, yeah, I think that came out last week. I mean, Claude is just releasing stuff at a pace that none of us can keep up with. I know it's out; I just haven't had a chance to play with it yet.

Speaker 2:
[75:10] Yeah, I thought, maybe this is where it would be fun to learn a little bit more about design. So yeah, I had a lot of fun playing around with it last week. But I would say one other place where models just kind of blew my mind: I got my blood work done a couple of months ago, and I tried talking to my doctor, and my doctor just gave me generic recommendations and rushed me out the door. So I took some photos of it and uploaded them to all the different models. And honestly, they gave me some great recommendations, and I feel like I've been feeling so much better since I've been following them. So I thought that was a pretty amazing experience.

Speaker 1:
[75:44] That is an interesting thing. I don't know if you guys are on Team Whoop or Team Oura, or what you use, or Function or Superpower; there are so many of these great things out there. But Whoop now, I'm trying to get my sleep dialed in so I can give a good performance on podcasts and when I'm meeting with founders, and trying to hit certain strain goals, which is a good thing, like stressing your body in a good way. And it now has an AI built into it that's tuned on your health. It was terrible six months ago. Then just in the last couple of weeks, it was like, wow, you did a really great job skiing. This is your sixth day in a row of skiing. You might want to take a day off. You could consider some hydration beverages. And you're at 7,000 feet of elevation, so you're going to need to drink more water and get more rest. I was like, whoa. And then it's like, I know you just took a long flight to Japan. Here's how you should reset. This was another breakthrough moment. It's like, oh, I know you're in a different time zone. Here's how that's going to affect your sleep. Would you like a sleep plan? And I was like, would I like a sleep plan to get over jet lag? You bet I would. Really well done. Very verticalized, incredibly fast, using my data. Yeah, incredible. You have a vitamin D issue, Edwin? Is that your issue? Like everybody? It seems like nobody gets enough vitamin D in our industry.

Speaker 2:
[77:07] I do take a lot of vitamin D; yeah, that was one of the recommendations. But I've been hearing a lot about the Whoop.

Speaker 1:
[77:12] Oh, you use the Whoop too? Yeah.

Speaker 2:
[77:14] I'll have to check it out.

Speaker 1:
[77:15] You should check it out. Do you use any of these, Aravind?

Speaker 3:
[77:20] I use Apple Watch. But yeah, I do have my blood work done pretty much every month.

Speaker 1:
[77:28] Every month? Whoa, that's obsessive.

Speaker 3:
[77:33] Well, that's because I had some conditions and I needed to be monitored. But now I'm fine. I'm pretty fine on all the vitals. But yeah, the things I go low on at times are vitamin D and zinc, because of some diet stuff. So it's good to know, it's good to know ahead of time. And I think the proactive intelligence is definitely very important. So we built that into Computer: you can connect all your health stuff, like Apple Health, Whoop, or Function Health, Bevel. You can put in your lab results, everything, and set it up. We are very serious about making Computer work really well for personal health use cases.

Speaker 1:
[78:20] I think it's going to be incredible. All right, gentlemen, I know that you've got very efficient teams, but at some point you must need to hire a person for something specific. We get a lot of people listen to the pods. So Edwin, anybody you're searching for, I know on the expert side, you must be constantly looking for experts. So where can people find more information if they want to be an expert or go work at your company?

Speaker 2:
[78:41] Yeah, so just go to our website, SurgeHQ.AI or email me personally. I love reading applications.

Speaker 1:
[78:47] Amazing. Aravind, what are you looking for? What do you need? How can we help you with keeping the product train running? It's been doing really well. Yeah.

Speaker 3:
[78:57] We're going deeper on the enterprise, so people interested in sales roles, definitely feel free to apply. For engineers watching this: we're hiring full-stack engineers, so definitely feel free to apply as well. And yeah, very excited to see who's interested in us.

Speaker 1:
[79:12] Absolutely. And this has been Episode 10 of This Week in AI. Go to thisweekinai.ai and sign up for our newsletter. We're going to have a paid newsletter. We're tracking every single company that's getting invested in, and we're going to be sending out reports on every one of these seed-stage and Series A companies. So you're going to want to sign up and get the free email. And then I think June 1st we're going to launch the paid version, with even more granular details for people who are in the industry. And we'll see you next time. Bye-bye.