Hello, and welcome to Decoder! I’m Alex Heath, deputy editor at The Verge and author of the Command Line newsletter. I’m hosting our Thursday episodes while Nilay is out on parental leave.

Today, we’re talking about how AI is changing the way we use the web. If you’re like me, you’re probably already using apps like ChatGPT to search for things, but lately I’ve become very interested in the future of the web browser itself.

That brings me to my guest today: Perplexity CEO Aravind Srinivas, who is betting that the browser is where more useful AI will get built. His company just released Comet, an AI web browser for Mac and Windows that’s still in an invite-only beta. I’ve been using it, and it’s very interesting.

Aravind isn’t alone here: OpenAI is working on its own web browser, and then there are other AI-native web browsers out there like Dia. Google, meanwhile, may be forced to spin off Chrome if the US Department of Justice prevails in its big antitrust case. If that happens, it could provide an opening for startups like Perplexity to win market share and fundamentally change how people interact with the web.

In this conversation, Aravind and I also discussed Perplexity’s future, the AI talent wars, and why he thinks people will eventually pay thousands of dollars for a single AI prompt.

I hope you enjoy this conversation as much as I did.

This interview has been lightly edited for length and clarity.

Alright, Aravind, before we get into Comet and how it works, I actually want to go back to our last conversation in April for my newsletter Command Line. We were talking about why you were doing this, and you told me at the time that the reason we’re doing the browser is, “It might be the best way to build agents.”

That idea has stuck with me since then, and I think it’s been validated by others and some other recent launches. But before we get into things, can you just expand on that idea: Why do you think the browser is actually the route to an AI agent?

Sure. What is an AI agent? Let’s start from there. A rough description of what people want out of an AI agent is something that can actually go and do stuff for you. It’s very vague, obviously, just like how an AI chatbot is vague by definition. People just want it to respond to anything. The same thing is true for agents. It should be able to carry out any workflow end to end, from instruction to actual completion of the task. Then you boil that down to what does it actually need to do it? It needs context. It needs to pull in context from your third-party apps. It needs to go and take actions on those third-party apps on your behalf.

So you need logged-in versions of your third-party apps. You need to access your data from those third-party apps, but do it in a way where it doesn’t constantly ask you to auth again and again. It doesn’t need your permission to do a lot of the things. At the same time, you can take over and complete things when it’s not able to, because no AI agent is foolproof, especially at a time when reasoning models are still far from perfection.

So you want this one interface that the agent and the human can both operate in the same manner: their logins are actually seamless, client-side data is easy to use, and controlling it is pretty natural, and nothing’s going to truly be damaging if something doesn’t work. You can still take over from the agent and complete it when you feel like it’s not able to do it. What is that environment in which this can be done in the most straightforward way without creating virtual servers with all your logins and having users worry about privacy and stuff like that? It’s the browser.

Everything can live on the client side, everything can stay secure. It only accesses the information it needs to complete the task, in literally the same way you access those websites yourself, so you get to understand what the agent is doing. It’s not a black box. You get full transparency and visibility, you can just stop the agent when you feel like it’s going off the rails and complete the task yourself, and you can also have the agent ask for your permission to do anything. That level of control, transparency, and trust in an environment we’ve been used to for multiple decades — the browser, such a familiar front end for introducing the new concept of AI going and doing things for you — is why it makes perfect sense for us to reimagine the browser.

How did you go about building Comet? When I first opened it, it felt familiar. It felt like Chrome, and my understanding is that it’s built on Chromium, the open-source substrate of Chrome that Google maintains, and that allows you to have a lot of easy data importing.

I was struck when I first opened it that it only took one click to basically bring all my context from Chrome over to Comet, even my extensions. So, why decide to go that route of building Comet on Chromium versus doing something fully from scratch?

First of all, Chromium is a great contribution to the world. Most of what they did — reimagining tabs as processes, the way they’ve gone about security and encryption, and just the performance, the core back-end performance of Chromium as an engine, the rendering engines that they have — is all really good. There’s no need to reinvent that. At the same time, it’s an open-source project, so it’s easy to hire developers for Perplexity who can work on the Comet browser, especially when it’s something built on open standards, and we want to continue contributing to Chromium also.

So we don’t want to just consume Chromium and build a product out of it; we actually want to give back to the ecosystem. That’s natural. And the second thing is, it’s the dominant browser right now. Chrome, and if you include Edge — which is also a Chromium fork — DuckDuckGo, and Brave, they’re all Chromium forks; only Safari is based on WebKit. So it’s actually the dominant browser, and there’s no need to reinvent the wheel here.

In terms of UI, we felt like it would be better to retain the most familiar UI people are already used to, which honestly is the Chrome UI. And Safari is a slightly different UI and some people like it, some people do not, and it’s still a much smaller share of the market. And imports need to work, otherwise you’re going to be like, ‘Oh, this is not working, oh, that thing doesn’t have all my personal contacts, I’m missing out on it. I don’t want to go through the friction of logging into all the apps again.’

I think that was very important for us for the onboarding step, which is not only onboarding you as a human but also onboarding the AI. Because the moment you’re already logged into all the third-party apps you were logged into on Chrome, with the exact same security standards, the agent gets access to that on your client and can immediately show you the magic of the product.

And the agent is seeing it, but you, Perplexity, are not. You’re not using all of the Chrome data I instantly bring over to train on me or anything like that?

No. The agent only sees it when you ask a relevant prompt. For example, ‘Based on what I’ve ordered on Amazon in the last month, recommend me some new supplements’ or, ‘Go and order the magnesium supplement that I’ve already ordered frequently on Amazon.’ The agent only sees that for that one singular prompt and doesn’t actually store your entire Amazon history on our servers, and you can always ensure that your prompts get deleted from our servers.

We can even choose not to look at the prompts, even for fine-tuning purposes. Let’s say we want to make our agents better in the aggregate — users have done Amazon shopping queries, so let’s go and make it better at that. We don’t even need to look at that data if you choose not to retain your prompt. So that’s the level of privacy and security we want to offer.

At the same time, the frontier intelligence is all on the server side. This is one of the main reasons Apple is struggling to ship Apple Intelligence on iOS or macOS or whatever — I think there’s generally an expectation that everything needs to live on the client side. That’s not necessary for privacy. You can still be pretty secure and private with frontier intelligence on the server. So that’s the architecture we brought to Comet.

We are talking now a couple of weeks or so after Comet came out, and it’s still invite-only — or I think it’s also restricted to your premium tier, your $200-a-month tier — but you’ve been tweeting a lot of examples of how people have been using it. They’ve been using it to make Facebook ads, do FedEx customer support chat, run their smart home accessories, make Facebook Marketplace listings, schedule calendar meetings — there’s been a lot of stuff that you’ve shown.

Unsubscribing from spam emails, which is a favorite use case of a lot of people.

So maybe that’s the one. But I was going to say, what has been the main use case you’ve seen so far that people are finding with Comet?

Actually, while these are the more glamorous use cases, I would say the boring dominant one is always invoking the sidecar and having it do stuff for you on the webpage you’re on. Not necessarily just simple summarization, but more complex questions. Let’s say I’m watching Alex Heath’s podcast with Zuckerberg or something and I want to know specifically what he said about a topic, and I want to take that and send it as a message to my teammates on Slack.

I think that’s the thing, you can just invoke the assistant on the site and do it instantly. It’s connected to your Gmail, your calendar. It’s also able to pull the transcript from the YouTube video. It has fine-grained access, and it’s immediately able to retrieve the relevant snippet. I can even ask it to play it from that exact timestamp instead of going through the entire transcript, like whatever I want. That is the level of advantage you have.

It almost feels like you should never watch a YouTube video standalone anymore unless you have a lot of time on your hands, and it’s fantastic. And people use it for LinkedIn. Honestly, searching over LinkedIn is very hard. It doesn’t have a working search engine, basically. So the agent figures out all these shortcuts — like how we’d figure out using these filters, people search, connection search — and it gives you recruiting power that was never possible before. I would say it’s better than using LinkedIn Premium.

I’m glad you brought up the sidecar because, for people who haven’t tried it or seen it, that is the main way Comet diverges from Chrome: you’ve got this AI assistant orchestration layer that sits on the side of a webpage that you can use to interact with the page and also just go off and do things.

That interface suggests you see the web as being less about actually browsing — you just said no one really has time to watch a YouTube video — and more about an action interface. What I’m wondering is: is the browsing part of the browser becoming less meaningful in the world of AI?

I think people are still going to watch YouTube videos for fun or exploration. But when I’m actually landing on a video — you do a lot of intellectual stuff, so it’s not always fun to watch the entire thing — I like watching specific things in the video. And also, by the way, when I’m in the middle of work, I can’t be watching The Verge podcast. I want to instantly know what Zuckerberg might have said in your video about their cluster or something, and then on the weekend, I can go back and watch the entire thing. I might have a lot more time on my hands, so it’s not actually going to stop the regular browsing.

I actually think people are going to scroll through social platforms or watch Netflix or YouTube even more, I would say, because they’ll have more time on their hands. The AI is going to do a lot of their work. It’s just that they would choose to spend that time on entertainment more than on intellectual work or intellectual browsing. Or if people derive entertainment from intellectual stuff — intellectual entertainment — I think that’s fine, too.

Reading books, all these things are fine — reading blog posts that you otherwise wouldn’t get time to read when you’re in the middle of work. These are the kinds of ways in which we want the browser to evolve: people launch a bunch of Comet assistant jobs — tasks that take a few minutes to complete in the background — and they’re chilling and scrolling through X or whatever social media they like.

Your tagline for Comet is enabling people to “Browse at the speed of thought.” I find that there’s actually a very steep learning curve to understanding what it can do.

By the way, Alex, I want to make one point. There was some article, either from The Verge or somewhere else, that Google was trying to use Gemini to predict the maximal engagement time on a YouTube video and show the ad around that timestamp. Perplexity, in the Comet browser, is using AI to do exactly the opposite — to save your time, to get you to the exact timestamp you want on a fine-grained basis and not waste your time. So often people ask, why would Google not do this and that? The incentives are completely different here.

And I want to get into that and I have a lot of business model questions about Comet because it is also very compute intensive for you and expensive to run, which you’ve talked about. But to my point about the learning curve and making it approachable, how do you do that? Because when I first opened it, it’s kind of like I don’t know what I can do with this thing. I mean, I go to your X account and I see all the things you’re sharing. But I do think there’s going to be a learning curve that the people building these products don’t necessarily appreciate.

No, no, I appreciate that, and it’s been the case for me, myself, as a user: even though it’s fun to build all these agent use cases, it takes a while to stop doing things the usual way and start using the AI more, which includes even basic things like what reply you type onto an email thread. Even though Google has these automatic suggested replies, I don’t usually like them, and it doesn’t often pull context from outside Gmail to help me do that. Or checking on unread Slack messages: I usually just open Slack as a tab and try to scroll through those 50 or 100 channels I’m on, clicking each of those channels, reading all the unread messages. It takes time to actually train myself to use Comet. So what we plan to do is publish a lot of the early use cases as educational material and have it be widely accessible.

I think it’s going to go through the same trajectory that chatbots had. In the beginning, when ChatGPT was launched, I’m sure not a lot of people knew how to use it. What are all the ways in which you could take advantage of it? In fact, I still don’t think people really… It’s not really a widespread thing. There are some people who really know how to use these AI tools very well, but most people have used them just once or twice a week, and they don’t actually use them in their day-to-day workflows.

The browser is going to go through a similar trajectory, but on the other hand, the one use case that’s been very natural, very intuitive that you don’t even have to teach people how to use this is the sidecar. It’s just picked up so much that I feel like it’ll be so intuitive. It’ll almost be like, without the sidecar, why am I using the browser anymore? That’s how it’s going to feel.

It does quickly make the traditional chatbot, the Perplexity or ChatGPT interface, feel a little arcane when you have the sidecar with the webpage.

Exactly, a lot of people are using ChatGPT for… You’re on an email and you want to know how to respond, so you copy-paste a bunch of context. You go there, you ask it to do something, and then you copy-paste it back. You edit it finally in your Gmail box, or you do it in your Google Sheets or Google Docs. Comet is just going to feel much more intuitive. You have it right there on the side and you can do your edits, or you’re using it to draft a tweet, or Elon Musk posts something and you want to post a funny response to that. You can literally ask Comet, ‘Hey, draft me a funny reply tweet to that,’ and it’ll automatically have it ready for you. All you have to do is click the post button.

All that stuff is definitely going to reduce the number of times you open another tab and keep asking the AI. And firing up jobs right from your current website to go pull up relevant context for you, and having it just come back and push-notify you when it’s ready — that feels like another level of delegation.

Where is Comet struggling based on the early data you’ve seen?

It’s definitely not perfect yet for long-horizon tasks, something that might take 15 minutes or something. I’ll give you some examples. Like I want a list of engineers who have studied at Stanford and also worked at Anthropic. They don’t have to be currently working at Anthropic, but they must have worked at Anthropic at least once. I want you to give me an exhaustive list of people like that ported over to Google Sheets with their LinkedIn URLs, and I want you to go to ZoomInfo and try to get me their email so that I can reach out to them. I also want you to bulk draft personalized cold emails to each of them to reach out to for a coffee chat.

I don’t think Comet can do this today. It can do parts of it, so you still have to be the orchestrator stitching them together. I’m pretty sure six months to a year from now, it can do the entire thing.

You think it happens that quickly?

I’m betting on progress in reasoning models to get us there. Just like how in 2022, we bet on models like GPT-4 and Claude 3.5 Sonnet arriving and making the hallucination problem in Perplexity basically nonexistent when you have a good index and a good model. I’m betting on the fact that in the right environment of a browser with access to all these tabs and tools, a sufficiently good reasoning model — slightly better, maybe GPT-5, maybe Claude 4.5, I don’t know — could get us over the edge where all these things are suddenly possible, and then a recruiter’s work worth one week is just one prompt: sourcing and reach-outs. And then you’ve got to do state tracking.

It’s not just about doing this one task, but you want it to keep following up, keep track of their responses. If some people respond, go and update the Google Sheet, mark the status as responded or in progress, follow up with those candidates, sync with my Google Calendar, and then resolve conflicts and schedule a chat, and then push me a brief ahead of the meeting. Some of these things should be proactive. It doesn’t even have to be a prompt.

That’s the extent to which we have an ambition to make the browser into something that feels more like an OS, where these are processes that are running all the time. And it’s not going to be easy to do all this today, but in general, we have been successful at identifying the sweet spots — things that are currently on the edge of working — nailing those use cases, getting the early adopters to love the product, and then riding the wave of progress in reasoning models. That’s been the strategy.

I’m not sure if it’s just the reasoning models, or the product’s just early, or I haven’t figured out how to use it correctly. My experience—

It’s not like I’m saying everything will work out of the box with a new model. You really have to know how to harness the capabilities, have the right evals, version-control the prompts, and do any post-training of auxiliary models, which is basically our expertise. We are very good at these things.

I would say — and I’ll caveat that I haven’t spent weeks with it yet — that based on my early experience, I would describe it as a little brittle or unpredictable in terms of the success rate. I asked it to take me to the booking page for a very specific flight that I wanted and it did it. It took me to the page and it filled in some stuff, whereas the normal Perplexity or ChatGPT interface would just take me to the webpage. It actually took me a little bit further. It didn’t book it, but it took me further, which was good.

But then I asked it to “Create a list of everyone who follows me on X that works at Meta,” and it gave me one person, and I know for a fact there are many more than that. Or for example, I said, “Find my last interview with the CEO of Perplexity,” and it said it couldn’t, but then it showed a source link to the interview — so the answer and the sources contradicted each other. I see some brittleness in the product, and I know it’s early, but I’m just wondering: is all of that just bugs, or is it anything inherent in the models or the way you’ve architected it?

I can take a look at it if you can share the link with me, but I would say the majority of the use cases that we ourselves advertised are things that are expected to work. Now, will it always work 100 percent of the time in a deterministic way? No. Are we going to get there in a matter of months? I think so, and you have to time yourself so you’re not exactly waiting for the moment when everything works reliably. You want to be a little early, you want to be a little edgy, and I think there are some people who just love feeling like they’re part of the ride, too.

The majority of users are going to wait until everything works stably, so that’s why we think the sidecar is already a value add for those kinds of people — they don’t have to use the agents that much. They can use the sidecar, they can use Gmail, they can use calendar connectors, they can use all those LinkedIn search features, YouTube, or just basic stuff like searching over your own history. These are things that already work well, and this is already a massive value add over Chrome. And once several minutes’ worth of long-horizon tasks start working reliably, that’s going to make it feel like more than just a browser. That’s when you make it feel like an OS. You want everything in that one container, and you’ll feel like the rest of the computer doesn’t even matter.

We started this conversation talking about how you think the browser gives you the context to create an actually useful agent, and there’s this other technical path that the industry is looking at and getting excited about, which is MCP, the Model Context Protocol. At a high level, it’s just this orchestration layer that lets an LLM talk to Airtable, Google Docs, whatever, and do things on your behalf, in the same way that Comet is doing that in the sidecar.

You’re going at this problem through the browser and through the logged-in state of the browser that you talked about and that shortcut, while a lot of people — Anthropic and others, OpenAI — are looking at MCP as maybe the way that agents actually get built at scale. I’m curious what you think of those two paths, and are you just very bearish on MCP or do you think MCP is for other kinds of companies?

I’m not extremely bearish on MCP. I just want it to mature more, and I don’t want to wait. I want to ship agents right now. I feel like AI as a community, as an industry has just been talking about agents for the last two years and no one’s actually shipped anything that worked. And I got tired of that and we felt like the browser is a great way to do that today.

MCP is definitely going to be a contributing factor to the field in the next five years. There are still a lot of security issues they need to figure out there. Having your authentication tokens communicated from your client to an MCP server, or from a remote MCP server to another client — all these things are pretty risky today, way more risky than just having your persistent logins on your client in the browser. The same issues exist with OpenAI’s Operator, which tries to create server-side versions of all your apps.

I think there are going to be some good MCP connectors that we’ll definitely integrate with — Linear or Notion, and I guess GitHub has an MCP connector. So whenever it makes sense to use those over an agent that just opens these tabs and scrolls through them and clicks on things, we’re going to use that. But it’s always going to be bottlenecked by how well these servers are maintained and how you orchestrate these agents to use the protocol in the right way. It doesn’t solve the search problem on those servers, by the way. You still have to go and figure out what data to retrieve.

You described it as the orchestration layer. It’s not the orchestration layer; it’s just a protocol for communicating between servers and the client, or between one server and another. It’s still not solving the problem of reasoning — knowing what information to extract, knowing what actions to take, chaining together different steps, retrying things when they don’t work. Whereas the browser is basically something that’s been designed for humans to actually operate in, and extracting a DOM and knowing what actions to take seems to be something that these models, the reasoning models, are pretty good at.

So we are going to do a hybrid approach and see what works best. In the end, it has to be fast, it has to be reliable, and it has to be cheap. So if MCP lets us do that better than the browsing agent, then we’ll do that. There’s no dogmatic mission here.

At The Verge, we care a lot about the way our website looks and feels — the art of it, the visual experience — and with all this agent talk and it all collapsing into browsers, I’m curious what you think happens to the web and to sites that devote a lot to making themselves actually interesting to browse. Does the web just become a series of databases that agents are crawling through MCP or whatever, and this entire economy of the web goes away?

No. I actually think if you have a brand, people are going to be interested in knowing what that brand thinks, and it might go to you, the individual, or it might go to Verge, or it might go to both. It doesn’t matter. So even within Verge, I might not be interested in articles written by some other people. I might be interested in specific people who have data content or something. So I think the brand will play an even bigger role in a world where both AIs and humans are surfing the web, and so I don’t think it’s going to go away. Maybe the traffic for you might not even come organically. It might come through social media. Let’s say you publish a new article, some people might come click on it through Instagram or X or LinkedIn. It doesn’t matter.

And whether it would be possible for a new platform to build traffic from scratch by just doing the good old SEO tricks, I’m actually bearish on that. It’s going to be difficult to create your own presence by just playing the old playbook. You’ve got to build your brand through a different manner in this time period, and the existing ones who are lucky enough to already have a big brand presence, they have to maintain the brand also with a different playbook, not just doing SEO or traditional search engine growth tactics.

On Comet as a business: it’s very compute-intensive and it’s still invite-only. I imagine you wish you could just throw the gates open and let anyone use it, but it would melt your servers or your AWS bills, right? So how do you scale this thing? Not only in the product sense — so it becomes a thing normal people can easily use and get over that learning curve we talked about — but also just the business of it. You’re not profitable, you’re venture-backed, you have to make money one day, you have to be profitable. How do you scale something like this that is actually even more compute-intensive than a chatbot?

I think if the reliability of these agents gets good enough, you could imagine people paying usage-based pricing. You might not be part of the max subscription tier of $200 a month or anything, but there’s one task you really desperately want to get done and you don’t want to spend three hours doing it, and as long as the agent actually completes it and you’re satisfied with the success rate, you’ll be okay with trusting the agent and paying an advance fee of $20 for the recruiting task I described — give me all the Stanford alumni who worked at Anthropic.

I think that is a very interesting way of thinking about it — otherwise it’s going to cost you a lot more time, or you have to hire a sourcing consultant, or you have to hire a full-time sourcer whose only job is that. If you value your time, you’re going to pay for it.

Maybe let me give you another example. You want to put an ad on Meta, Instagram, and you want to look at ads done by similar brands, pull that, study that, or look at the AdWords pricing of a hundred different keywords and figure out how to price your thing competitively. These are tasks that could definitely save you hours and hours and maybe even give you an arbitrage over what you could do yourself, because AI is able to do a lot more. And at scale, if it helps you to make a few million bucks, does it not make sense to spend $2,000 for that prompt? It does, right? So I think we’re going to be able to monetize in many more interesting ways than chatbots for the browser.

It’s still early, but the signs of life are already there in terms of what kinds of use cases people have. And if you map-reduce your cognitive labor in bulk to an AI that goes and does it reliably, it almost becomes like your personal AWS cluster with natural-language-described tasks. We have to execute on it, but if we do execute on it, and if the reasoning models continue to work well, you could imagine something that feels more like Claude Code for life. And Claude Code is a product that people are paying $1,000 a month for because, even though it’s expensive, it maybe helps you get a promotion faster — you’re getting more work done, your salary goes up, and it feels like the ROI is there.

Are you betting so much on the browser for the next chapter of Perplexity because the traditional chatbot race has just been completely won by ChatGPT? Is Perplexity as it exists today going away and the future of it is just going to be Comet?

I wouldn’t say that I’m betting on it because the chatbot race is over. Let me decouple the two things. The chatbot race does seem like it’s over, in the sense that it’s very unlikely people will think of another product for day-to-day chat. From the beginning, we never competed in that market. We were always competing on search. We were trying to reimagine search in a conversational style. Yes, every chatbot has search integrations. Some people like that, some people still like the more search-like interface that we have, so we never wanted to go after that market and we are not competing there either. Google is trying to catch up, Grok’s trying to catch up, Meta’s trying to catch up, but all of that feels like wasted labor to me at this point.

But the way I would phrase it is the browser is bigger than chat. It’s a more sticky product, and it’s the only way to build agents. It’s the only way to build end-to-end workflows. It’s the only way to build true personalization, memory, and context. So it’s a bigger prize, in my opinion, than trying to nail the chat game, especially in a market that’s so fragmented. And it’s a much harder problem to crack, too, in terms of intelligence, how you package it, how you context-engineer it, how you deal with all the shortcomings at the current moment, as well as the end-user-facing UX — which could be the front end, the back end, the security, the privacy, and all the other bugs that you get to deal with when working with a much more multifaceted product like the browser.

Do you think that’s why OpenAI is going to be releasing a browser? Because they agree with that?

I don’t know if they are. I’ve read the same leaks that you have, and it was very interesting that it came two hours after we launched. You also made another point about Perplexity being ignored and Comet being the next thing. I don’t see it that way, because you cannot build a browser without a search engine. A lot of people praised the Comet browser because it doesn’t feel like just another browser. You know why? One of the main reasons is, of course, we have the sidecar and we have the agent and all that, but the default search is Perplexity. And we made it in a way where even if you have an intent to navigate, it’ll understand that.

It’ll give you four or five links if it feels like it’s a navigational query, and it’ll give you images pretty quickly. It’ll give you a very short answer, too, so you can combine informational queries, navigational queries, and agent queries in one single search box. That is only doable if you’re actually working on the search problem, which we’ve been working on for the last two and a half years. So I would say I don’t see them as two separate things. Basically, you cannot build a product like Chrome without building Google. Similarly, you cannot build a product like Comet without building Perplexity.

So is there a Comet standalone mobile app and a standalone Perplexity app?

Yeah, there will be standalone apps for both. Some people are going to use the standalone Comet app just like how they use Chrome or Safari, and that’s okay. They probably won’t use it only that way, because it’s going to have an AI that you can talk to on every webpage, including in voice mode, actually. But sometimes you still just want to navigate and get to a website quickly. I just want to go and browse The Verge without actually having any question in my mind — that’s fine. And I could go to Perplexity and have all the other things the app has, like Discover feeds and Spaces and just quick, fast answers without the web interface. That’s fine, too.

We are going to support a packaged version of the Comet browser within the Perplexity app, just like how the Google app still supports navigation like Chrome does. By the way, both the Google app and the Chrome app are WebKit apps on iOS. Similarly, both the Google app and the Chrome app are Chromium apps on Android. We’ll have to follow the same trajectory.

Speaking of competition, I’m curious what you think of Dia, what The Browser Company has done. They released it around the same time as you, they’re moving in this direction as well. Obviously they’re a smaller startup, but they got a lot of buzz with Arc, their original browser, and now seem to be betting on the same idea that you have with Comet. I’m curious if you’ve gotten to try it or how you think it will stack up against Comet.

I haven’t tried it myself. I’ve seen what other people have said. I think they have some interesting ideas on the visuals, on the front end. And if I were them, I would’ve just tried it in the same browser they had instead of going and trying to build distribution on a new one. But yeah, it’s interesting. We are definitely going to study every product out there. Our focus, though, goes more toward Chrome. It is the big brother. And the way I think about it is, even if I take 1 percent of Chrome users and get them to set their default as Comet, that’s a massive, massive win for us — and a massive loss for them, too, by the way, because any ad revenue lost is massive at that scale.

Is word of mouth the main way you’re going to grow Comet or are you looking for distribution partnerships beyond that?

In the beginning, we’re going to do more word-of-mouth growth. It’s very powerful. It’s worked out well for us in the past with Perplexity itself, and we’re going to try to follow the same trajectory here. And luckily, we already have an installed base of 30 to 40 million people on Perplexity. So even if we get a good chunk of those people to try out Comet and convert some of the people who tried it into setting it as their default, it’ll already be a massive victory without relying on any distribution partnerships.

And then we’re obviously going to try seeing how to convert that progress into a partnership like Google has with a bunch of people. I just want to caveat that by saying it’s going to be extremely hard. We’ve spoken about this in the past where Google makes sure every Android phone has Google Chrome as a default browser and you cannot change that.

You lose a lot of money if you change that. And Microsoft makes sure every Windows laptop comes with Edge as the default browser. Again, you cannot change that. You will lose a lot of money if you change that. Now the next step is: okay, let them be the default browser — can you at least have your app as part of the Android or Windows build? You still cannot change that easily. Especially on Windows, it’s basically impossible to convince large OEMs to change that. So they have all these agreements that are locked in for several years, and you’re working with companies that plan for the device they’re shipping two years in advance.

That’s their moat, in some sense. It’s not even the product, it’s not even exactly the distribution itself — it’s more in the legalities of how they crafted these agreements, which is why I’m happy that the DOJ is at least looking into Google. And we’ve made a list of recommendations on that, and I hope something happens there.

Yeah, it may force a spinoff of Chrome, which would be really interesting and would reset things. There are a lot of people who think Apple should buy you. And Eddy Cue, one of their top execs, actually had some pretty nice things to say about you on the stand during the Google trial, and said that you guys had talked about working together. Obviously you can’t talk about something that hasn’t been announced yet, especially with Apple, but what do you make of that and Apple?

I mean, first, I’m honored by Eddy mentioning us in the trial as a product that he likes and that he’s heard from his circles that people like. I would love to work with Apple on integrations with Safari or Siri or Apple Intelligence. It’s the one product that almost everybody loves using, or it’s a status symbol — everybody wants to graduate to using an Apple device.

So I’m pretty sure that we share a lot of design aesthetics in terms of how we do things and how they do things. At the same time, my goal is to make Perplexity as big as possible. It’s definitely possible that this browser is so platform-agnostic that it can benefit Android and iOS ecosystems, Windows and Mac ecosystems, and we can be pretty big on our own just like Google was. Of course, Google owns Android, but you could imagine they would’ve been pretty successful if they just had the best search engine and the best browser and they didn’t actually own the platform either.

I and others also reported that Mark Zuckerberg approached you about potentially joining Meta and working on his reboot of their AI efforts. What was Zuck’s pitch? I’m curious. Tell me.

Zuck is awesome. He’s doing a lot of awesome things, and I think Meta has such a sticky product. It’s fantastic, and we look at that as an example of how it’s possible to build a large business without having any platform yourself.

Were you shocked by the numbers that Zuck is paying for top AI researchers? These nine-figure compensation offers. I think a lot of them are actually tied to Meta stock needing to increase for those numbers to be paid, so it’s actually pretty contingent on the business and not just guaranteed payouts — but still huge numbers.

Yeah, huge. And definitely, I was surprised by the magnitude of the numbers. Seems like it’s needed at this point for them, but at the same time, Elon and xAI have shown you don’t need to spend that much to train models competitive with OpenAI and Anthropic. So I don’t know if money alone solves every problem here.

You do need to have a team that works well together, has a proper mission alignment and milestones, and in some sense, failure is not an option for them. The amount of investment is so big and I feel like the way Zuck probably thinks is, ‘I’m going to get all the people, I’m going to get all the compute and I’m going to get all the milestones set up for you guys, but now it’s all on you to execute and if you fail, it’s going to look pretty bad on me so you better not fail.’ That’s probably the deal.

What are the second order effects to the AI talent market, do you think, after Zuck’s hiring spree?

I mean, it’s definitely going to feel like a transfer market now, right? Like the NBA or something. There are going to be a few individual stars who have so much leverage. And one thing I’ve noticed is Anthropic researchers are not the ones getting poached.

Mostly. He has poached some, but not as many.

Yeah. So it does feel like that’s something labs need to work on: truly aligning people on one mission, so that money alone is not the motivator for them. And if your company’s doing well, the stock is going up, and you feel dopamine from working there every day — you’re encountering new kinds of challenges, you feel a lot of growth, you’re learning new things, and you’re getting richer along the way, too — why would you want to go?

Do you think strongly about getting Perplexity to profitability to be able to control your own destiny, so to speak?

Definitely, it’s inevitable. We want to do it before the IPO, and we think we can IPO in 2028 or ’29. I would like to IPO, by the way, just to be clear. I don’t want to stay private forever like some companies have chosen to do. Even though staying private gives you advantages in M&A and decision-making power, I do think the publicity and marketing you get from an IPO, and the fact that people can finally invest in a search alternative to Google, make it a pretty massive opportunity for us.

But I don’t think it makes sense to IPO before hitting $1 billion in revenue and some profitability along the way. So that’s definitely something we want to get to in the next three or four years. But I don’t want to stunt our own growth — I still want to be aggressive and try new things today.

Makes sense. So, you launched Perplexity right around when ChatGPT first launched, and it’s crazy that it’s already been just over three years. It’s wild to think that everything we’ve talked about has happened in barely three years. So maybe this is an impossible question, but I want to leave you with it: if you look out three years from now — you just talked about the IPO, which is interesting — what does Perplexity look like?

I hope it becomes the one tool you think of when you want to actually get anything done. And it has a lot of deep connection to you because it synchronizes with all your context and proactively thinks on your behalf and truly makes your life a lot easier.

Alright, we’ll leave it there. Aravind, thanks.

Questions or comments about this episode? Hit us up at decoder@theverge.com. We really do read every email!
