He Built an AI Audience Simulator. It’s the Future of Customer Research. - Ep. 49 with Michael Taylor

Michael Taylor has perfected the art of getting AI to speak in tongues. He’s taught it to mimic the voices of your customers—so you can see how they would respond before you ship. Michael is the creator of Rally, a market research tool that lets you simulate an audience of AI personas. He built a simulator that lets us A/B test Every’s headlines on an audience that mimics the real Hacker News audience. It’s become a part of my writing workflow, and I love it because you test your assumptions quickly, cheaply, and without any of the risks of putting something out into the world. Besides Rally, Michael co-authored a book on prompt engineering for O’Reilly, and he writes a column for Every about managing AI tools like you would people. In a past life, he founded a growth marketing agency which he grew to 50 people and sold in 2020. One of the reasons I’m drawn to Michael’s work is because he has a tinkerer’s mindset. He’s always exploring the limits of what a new technology can do, and what he’s into today, everyone else will likely discover six months later. We spent an hour talking about using language models to judge your work, best practices for assessing an AI’s performance, and Michael’s flow inside Cursor. He also demos Rally live on the show, testing three different potential headlines for an Every article. If you found this episode interesting, please like, subscribe, comment, and share! Want even more? Sign up for Every to unlock our ultimate guide to prompting ChatGPT here: https://every.ck.page/ultimate-guide-to-prompting-chatgpt. It’s usually only for paying subscribers, but you can get it here for free. 
To hear more from Dan Shipper:
- Subscribe to Every: https://every.to/subscribe
- Follow him on X: https://twitter.com/danshipper

Timestamps:
- Introduction: 00:01:32
- AI can simulate human personalities with remarkable precision: 00:04:30
- How Michael simulated a Hacker News audience: 00:08:15
- Push AI to be a good judge of your work: 00:15:04
- Best practices to run evals: 00:19:00
- How AI compresses years of learning into shorter feedback loops: 00:23:01
- Why prompt engineering is becoming increasingly important: 00:27:01
- Adopting a new technology is about risk appetite: 00:44:49
- Michael demos Rally, his market research tool: 00:47:10
- The AI tools Michael uses to ship new features: 00:54:53

Links to resources mentioned in the episode:
- Michael Taylor: @hammer_mt
- Join the waitlist for Rally, Michael’s synthetic market research tool: https://askrally.com/
- The book Michael co-authored on prompt engineering: [Prompt Engineering for Generative AI](https://www.oreilly.com/library/view/prompt-engineering-for/[redacted card]/)
- The column Michael writes for Every: Also True for Humans
- Michael’s article on personas of thought: “I Asked 100 AI Agents to Judge an Advertisement”
- Michael’s article on building a Hacker News simulator: “I Created a Hacker News Simulator to Reverse-engineer Virality”

Published: Feb 26, 2025
Uploaded: Jun 13, 2026
File type: Podcast
Queried: 00
Source: share.transistor.fm

Full transcript


AI-generated transcript with timestamped sections.

0:00-1:17

[00:00] You're thinking about using AI to do simulations of people? If it can roleplay as one character, why not multiple? You can have it be a whole focus group for you. So basically you typed in a prompt and then it spun up an audience and then it asked the audience that question. So like, which of these three would you click on? So that's cool. But also in the sidebar, you can see like each individual like simulated person, like what they said. You can test things and kind of look at the assumptions that you're making before you actually put it out there in the world. [00:30] I think people may not like even really understand the full weight of what you're saying. My workflow now is, you know, when we do an announcement or like I'm writing a post or whatever, I will figure out like what I want to tweet or what the headline should be or whatever. And you have a little MVP that I've been using and I can just like put it in there and get some wisdom of the crowds. And I love it. This is so valuable. Before, you'd have to like throw it out into the world and that costs time and money. And if you don't have an audience, like it's really hard if you don't know the people it's going to be going to. So yeah, I love this thing. [01:00] cool, you can just like automate everything. You know, you'll always have the right answer to every question and you're going to make amazing content like every single time or whatever. I think what's interesting about this is there's so many different variables to it. I think it's not going to do the thing that people think where it's like, oh, we're just going to turn into like zombies and everyone's marketing messages are going to be the same because the space of possibilities is really big.

1:30-3:04

[01:31] Michael, welcome to the show. [01:33] Yeah, good to be here, Dan. Good to have you. So for people who don't know, you describe yourself as a recovering agency owner. So you had a marketing agency that you grew to 50 people, and then you sold it in 2020. And then you just dove into AI stuff. You wrote the prompt engineering book for O'Reilly, and you are also a regular columnist on Every. You have an amazing column called Also True for Humans. I'm psyched to have you. [02:00] Yeah, it's good to be here. And I've been watching a bunch of these episodes and just kind of thinking about how other people use AI. So it'd be interesting to see like how I differ. [02:09] Amazing. So, um, I'm excited to have you. I think like, [02:14] One of the reasons I really loved your work is you have a little bit of a, you have a tinkerer's mindset and you're just always like playing with new things and getting psyched about them and like exploring the limits of what the current technology can do. I think that's why you're a really good prompt engineer and why you think about it so much. I think you're really good at like building little workflows for yourself to automate things. And I'm just, um, yeah. [02:37] I'm always into or interested in what people like you are currently like playing around with. Cause I think like what you're into right now, people, other people are going to be into in like six months or a year or something like that. Um, and the thing that you're working on that I'm most excited about is, um, [02:53] You're thinking about using AI to do simulations of people. [02:58] And I wonder if you let's start there. Lay the groundwork for me. Explain what that is and why you think that's interesting. And then let's talk about what you're doing with that.

3:05-4:28

[03:05] Yeah, so I studied economics and then went into a career in marketing, but I always was interested in this idea of being able to predict behavior. I think that's what got me into economics, because with microeconomics, you could somewhat predict behavior and then get into growth marketing that way because you could run A-B tests and you could see how to predict behavior. [03:35] got into AI, I would start messing around that stuff too. And I use roleplay a lot, right? And that's pretty obvious. That's the number one prompt engineering tactic that everyone tries first is, as a researcher in this field, give me an answer. And it actually really does change the answer. And because the training data is so wide, it can roleplay as almost any character. And the logical [04:05] And you can have it be a whole focus group for you. You can have it be an entire audience. And then you can test things and look at the assumptions that you're making and see in a risk-free environment whether your idea would work or not before you actually put it out there in the world. Right. I want to stop there because I think people may not even really understand the full weight of what you're saying.

4:35-6:08

[04:35] it will like make the same, a lot of the same decisions, like very high overlap as like a real person who has that personality. So, um, one thing that I did early on, this is, [04:45] maybe like a year ago-ish. It was like when GPT-4 was still good, which should tell you, like, it feels like forever ago. Yeah, it's like 100 years ago now. [04:54] But one thing that I did was, because I was sort of, I sort of knew about this research too, and I was like kind of into it, and I... [05:02] So I got GPT-4 to look at my tweets and then based on my tweets, write a personality profile for me. And then I gave that personality profile to another GPT-4 and I was like, be this person and adopt this personality. [05:17] And then I had it take a personality test as me. [05:21] So I took the personality test, then it took the personality test to see what the overlap and the scores were. And the overlap was extremely high. And then I was like, I wonder if... [05:32] Someone who knows me really well in my life would be able to do what GPT-4 just did, just based on knowing me. And so I had my girlfriend at the time and my mom... [05:42] pretend to be me and take a personality test and try to think about what I would pick. And GPT-4 is better than both of them, just from my Twitter account, which is crazy. It's really wild. Yeah, maybe they don't read enough of your tweets, I guess. [05:57] Yeah, it's really not that GPT-4 is good. It's just an indictment of my ex-girlfriend. [06:03] I mean, yeah, I've done the same thing. Very early on, I was doing a lot of writing and I was like,

6:12-7:54

[06:12] And then actually I did do it and people couldn't tell. And people would ask me like, hey, Mike, I really like that thing you posted on LinkedIn. And I was like, you're going to have to remind me what it was. [06:24] I'm too busy and important and popular now to even know what I tweeted or what I put on LinkedIn. Exactly. It must be how famous people feel when they have someone managing their account. But yeah, I always thought, oh, it's actually surprisingly good at this. [06:39] And, and then I saw it like, like you said, I saw a bunch of these studies, there was one that jumped out at me, I have to find the link, but they basically said that, if you do a two hour interview with someone, and you take the transcript and then create a personality profile, you get 80% accuracy in big five personality traits, like personality tests, with the decisions they make in economic games, you know, how they vote, as well. [07:09] crazy thing is that like 80% is already pretty amazing, but it's actually as accurate as if you interview the same person again, the second time. So, so there's always drift. Like when you interview, like if I interview you today and I interview you next month, uh, next month Dan is only going to agree with 80% of what Dan said today. Um, so virtual Dan is like as good as that. I contain multitudes, Mike. Yeah. Um, yeah. And I think, and, and where you're going with that is [07:40] If you're a founder and you want to interview your customers or know what your customers want, and it's in an industry that you're not as familiar with, a good place to start is just...

7:54-9:44

[07:54] talking to ChatGPT or talking to Claude, which I've done. It really works. It's crazy how well it works just from basic prompting. If you want to try to... You're thinking about, I want to build a product in X industry. Let me go figure out how that kind of person would think and what their day is like and all that kind of stuff. It's really, really good. But you're taking it a step further. Tell us about what you've been building behind the scenes. Yeah. Yeah. So this all kind of stemmed from a post I did for you guys, actually, [08:24] prompt I was using, where I said, okay, let's generate a bunch of people who are relevant personas who could answer this question first and then query and then like kind of fill in the blanks of like what that person would say in answer to the question. So I had that, I was doing that all in one prompt. But then I wanted to automate it. So I wrote a script. And then I think we were talking, you were like, help me solve a debate. I think you liked the, you're trying to describe [08:52] How should we describe Every? A meta media company or a multimodal media company? Was the question. And Kate, our editor in chief was like, I don't like meta. And I was like, but I like it. So we were like, well, let's test the wisdom of the crowds. Let's see what would happen. And then you built the script, you put it into the script. [09:14] Yeah, I ran the script. And then the thing that kind of convinced me to get working on this a bit more was that you changed your mind. It's like very rare that I see a CEO change their mind about anything. So I was like, OK, there might be something here, you know. And and then I was I was using it for everything. I was like, oh, I'm I'm actually changing my mind, too. And it was an unexpected result. Like I didn't I didn't expect I thought people would be really skeptical.

9:44-11:24

[09:44] the AI responses. But I think because each AI is answering itself, it gives the reasons. And then when you summarize those reasons, then you can kind of look at... [09:55] them and go, oh, I agree with those reasons and therefore I can change my mind without being, you know, totally, uh, too stuck in my ways. You know, I don't know, like what, how is that? Like, is that kind of a similar thought you had at the time? Or basically, I mean, like, I think, um, yeah, [10:10] You know, my basic way of being as a CEO or writer or whatever is like, [10:16] As I talk, I'm like paying attention to how people respond. And I'll like get a sense for like, ooh, this little thing I just tried, like worked really well, you know? And then that's how I write headlines or that's how I figure out ideas and all that kind of stuff is I'm just like constantly trying stuff. And so that feedback was very important to me. And that's why I love, you know, using X or whatever is because like I can just like put something out and like lots of people will interact with it or not or like doing sales meetings, the same thing, all that kind of stuff. [10:46] And so this feels like an extension of that where I can take way more risks, you know? And I can also like, it's a little more quantitative. It's a little bit more like... [10:59] I can pit one thing versus another. Whereas in real life, people remember what you just said. So you can't just wind back the clock and be like, what if I had said it another way? How would you respond? It's hard to do a fair test. And so I was immediately like, wow, that makes a lot of sense. And also, I do think having the thoughts of the model is really helpful. Because they were like,

11:29-13:12

[11:29] was saying. And I was like, I don't know, Kate, like I'm a, I'm the CEO. Like I don't think of meta when I think of it or whatever. Um, but I, you know, other people were thinking and I was like, okay, well that, that makes a lot of sense. And then, and then I changed my mind and, um, and yeah, like it is, um, [11:43] I will do anything to get this to be part of our bundle because I love this product. I think it's so cool. My workflow now is when we do an announcement or I'm writing a post or whatever... [11:58] I will use Spiral, which is our content repurposing platform, to figure out what I want to tweet or what the headline should be or whatever. And then I'm just pinging you or you have a little demo or little MVP that I've been using. And I can just put it in there and get some wisdom of the crowds. And I love it. And what's interesting to me, too, is... [12:20] Yeah. [12:21] I think where people are going, like where people's heads might be going is like, okay, cool. You can just like automate everything and like, you know, you'll always have the right answer to every question and you're going to make like super, super like crazy, amazing content like every single time or whatever. [12:34] And, [12:35] I think what's interesting about this is there's so many different variables to it. So like, for example, one variable is what is the audience? And you can spin up different audiences with different like demographics and different viewpoints and personalities or whatever. So we have an Every audience and then we have like a Hacker News audience that we can test and all that kind of stuff. And then also the results that you get are extremely dependent on the question that you ask. [12:58] And so if I'm pitting one thing versus another, it's not like going and finding what is the best possible way to phrase this. It's just like pitting one thing versus another. And maybe there would be a way to automatically...

13:12-15:00

[13:12] try tens of thousands of messages or whatever. But even then, you start to realize that the [13:18] the space of possibilities is so huge. It's so huge that anything... You can only ever test a small portion of it and different people are going to start in different places in the landscape. And so yes, these tools are super powerful. And it makes my process way more efficient to have it available. But I think it's not going to do the thing that people think where it's like, oh, we're just going to turn into zombies and everyone's marketing messages are going to be [13:48] The space of possibilities is really big. [13:50] Yeah, exactly. There's two things combined, right? The space that you're testing it in and the personas you're using. Because a Hacker News audience is going to be very different from your Twitter audience or whatever. But yeah, that was something that we did a lot of in the agency. We were testing tons of creative ideas with Facebook ads. And you could just never have enough budget. Even the biggest brands don't have enough budget to test everything. [14:20] in which you can play in, but it's still your duty to play. You still have to come up with good ideas. And AI can help you come up with good ideas too. I tend to use it in the brainstorming phase, and then I use it in the testing phase, but I'm in the middle. [14:35] I see quite often it's better at judging than it is at finalizing the copy. [14:42] And this is something I saw a lot of as a prompt engineer, like working over the past few years, is just that LLMs are actually pretty good at judging the results of tasks, even if they can't do the task very well themselves, which kind of led me down this path in some respects as well.
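
The audience test described in this part of the conversation (spin up personas, ask each one which option they would click, then tally the votes and read the reasons) can be sketched roughly as follows. This is a minimal illustration, not how Rally or Michael's script is actually built: `run_audience_test`, `fake_ask`, the persona names, and the headlines are all hypothetical, and `fake_ask` is a placeholder standing in for a real model call.

```python
def run_audience_test(personas, options, ask):
    """Ask every persona to pick one option; tally votes and keep the reasons.

    `ask` is any callable (persona, options) -> (choice, reason). In practice
    it would wrap an LLM call that roleplays the persona (an assumption here).
    """
    votes = {option: 0 for option in options}
    reasons = {option: [] for option in options}
    for persona in personas:
        choice, reason = ask(persona, options)
        if choice not in votes:
            continue  # ignore answers that do not match any offered option
        votes[choice] += 1
        reasons[choice].append(f"{persona}: {reason}")
    return votes, reasons


def fake_ask(persona, options):
    """Placeholder for a model call: picks deterministically from the name."""
    choice = options[len(persona) % len(options)]
    return choice, "placeholder reason"


personas = ["skeptical backend engineer", "startup founder", "ML researcher"]
headlines = ["Headline A", "Headline B", "Headline C"]
votes, reasons = run_audience_test(personas, headlines, fake_ask)
for headline in headlines:
    print(headline, votes[headline], reasons[headline])
```

Keeping the per-persona reasons next to the tally mirrors the point made in the conversation: the summarized reasons are what make the result persuasive, not just the vote count.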

15:00-16:30

[15:00] That's really interesting. I want to push you there because I've had experiences with like Claude, for example, where I'm like putting in like an essay or whatever. And I'm like, grade this essay. And it just like always gives you a B plus the first time or an A minus the first time. And then like, if you make some revisions, it always moves up to an A or whatever. So like, where are the tasks where it is good at judging? And where are the tasks where it's kind of going to give you like that kind of like, oh, it's an A minus or whatever. And here's, you know, and the next turn, it's an A, you know? [15:30] Yeah, it's really bad at grading. So you can't grade things on a Likert scale very easily. You can't give it stars out of five. It tends to always overestimate the middle. So it's almost like being too nice to you. Everything's like a four out of five. [16:00] something that is a 1 or a 0. So I'll have some criteria of... [16:06] For me, a good article is one that is as concise as possible. And so I have a one and zero on the concise scale. Is this concise? I'll have another one which is, does this have a compelling hook? And I'll build up maybe 20 of these for any task that I'm automating. And then I'll run my testing

16:36-18:08

[16:36] that score together, it just kind of gives me an aggregate score that is much more reliable. When you run it again, it's not changing as much. [16:46] That's interesting. So what about like, so first of all, I like the idea of breaking it up into subtasks, but then like within a specific subtask, like the hook. [16:55] is it still only going to be able to do like zero or one, like good hook or bad hook? Or is it, could you do like zero to five and would it be able to do that? [17:02] Yeah, it's not as good when you go zero to five. So what I would try and do is break that classification down into subclassifications. So I would think about, okay, what is a good hook? A good hook grabs your attention and a good hook maybe name drops or references something famous or credible. A good hook does this. [17:32] so you can give it two different articles that you've written. You say... [17:37] What is different between these two articles? [17:40] And then you look at the differences and then some of those will then become judge criteria. That's a whole other tool that would be super useful is like just recursively making like a like recursively creating a template classifier thing that like takes a bunch of examples of good and bad and then uses that to create like the most detailed possible rubric for another LLM to use. You know? Yeah. Yeah. I did think about doing that. I worried it is like way too far down the rabbit hole. I don't know. It might be.
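
The scoring approach Michael describes (many yes/no criteria instead of one star rating, added up into an aggregate) might look something like the sketch below. Everything named here is an assumption for illustration: `aggregate_score`, the two criteria, and `fake_judge`, which is a crude keyword heuristic standing in for an LLM judge that answers one binary question at a time.

```python
def aggregate_score(text, criteria, judge, runs=1):
    """Score `text` against binary criteria and average the results.

    criteria: dict mapping a short name to a yes/no question.
    judge:    callable (text, question) -> bool; a real one would call a model.
    runs:     how many times to ask each question, to smooth out variance.
    """
    per_criterion = {}
    for name, question in criteria.items():
        passes = sum(1 for _ in range(runs) if judge(text, question))
        per_criterion[name] = passes / runs
    overall = sum(per_criterion.values()) / len(per_criterion)
    return overall, per_criterion


def fake_judge(text, question):
    """Placeholder judge (not a real model): simple rules per question."""
    if "concise" in question:
        return len(text.split()) <= 30
    if "hook" in question:
        return text.strip().endswith("?")
    return False


criteria = {
    "concise": "Is this concise?",
    "hook": "Does this have a compelling hook?",
}
overall, detail = aggregate_score(
    "Why do CEOs never change their minds?", criteria, fake_judge
)
print(overall, detail)
```

Splitting a fuzzy criterion such as "good hook" into further yes/no sub-questions, as suggested above, only means adding more entries to `criteria`.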

18:10-19:41

[18:10] people actually care about the quality of their writing for most places. Yeah. Maybe not for writing, but like in general, I think like there's, I feel like there's got to be use cases for that, but it may not, it may not be like a today thing. Like it might be like a, in a year, people will be more sensitive to it kind of thing, you know? Yeah. I was doing a lot for prompt engineering, like with my clients. And then, and then once you have like a list of these criteria, then you can build a metric and then, and then you can use like a prompt [18:40] because then you can improve the accuracy of the classifier. [18:45] And then once the classifier is much better, then you can improve the accuracy of the generation task as well. So it depends on how deep you want to go on this, but this is something I spend a lot of time thinking about. I want to go deep. I mean, the thing I'm thinking about, we're talking about making evals, basically. [19:03] And I'm just thinking about Cora, our email tool, and trying to improve those summaries and then trying to figure out what is a good summary. And we're basically doing this where like, [19:13] Every week I go in and look at the summaries that Cora generates for me with Kieran, who's the GM of Cora. And I literally rewrite my inbox for him personally. That's good. That's usually what I try to get the clients to do. Yeah. And just sort of hope that over time, we'll have enough examples and enough rewrites and enough pulling out all those principles that it starts to work. And it's really fun. It's really interesting.

19:43-21:13

[19:43] I love all of the like... [19:46] mapping all those little things that I know that are not explicit that I just sort of know, you know, because I'm like, this summary is wrong. Or like, here's how I would summarize this. But like, and I'm, I can do it, but I can't explain it until I look at it. And then I'm like, well, this is the rule I'm following, you know. And it's, it's really fun. [20:04] Sometimes you don't know the rules you're following as well. You mostly don't know the rules you're following. That's part of the fun. [20:09] Yeah, once you see it, then you're like, "Oh, I get it." So I was trying to make a Tinder-like interface when I'm doing this, where I'm like, "One or zero, one or zero." Yeah, yeah, yeah. It's like the eye doctor. Yeah, exactly. Better or worse. I'll walk you through how to make quick evals for this. [20:39] analysis. You can do this in a Jupyter notebook or get the engineers to do it, or do it manually. It'd take a long time. But you just get, you compare, just ask, say, Claude or GPT-4, what is the difference between the rewritten and the original? And then you take all of those differences and then summarize and say, what are the main differences between a good email and a bad email? [21:07] And then you ask it for bullet points, and then you have each bullet point then becomes your evaluation criteria.
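
The quick-eval recipe just described (ask a model what changed between each original and its rewrite, summarize the differences as bullet points, and treat each bullet as an evaluation criterion) could be wired together along these lines. This is a sketch under assumptions: the function names and prompt wording are illustrative, and `llm` is any callable that takes a prompt string and returns text; `fake_llm` below returns a canned reply so the example runs without a model.

```python
def diff_prompt(original, rewritten):
    """Prompt asking what changed between one original and its rewrite."""
    return (
        "What is the difference between the rewritten and the original?\n\n"
        f"ORIGINAL:\n{original}\n\nREWRITTEN:\n{rewritten}"
    )


def summary_prompt(differences):
    """Prompt asking for the recurring differences as bullet points."""
    notes = "\n\n".join(differences)
    return (
        "What are the main differences between a good email and a bad email? "
        "Answer as bullet points.\n\n" + notes
    )


def parse_criteria(bullet_text):
    """Turn a bulleted reply into a list of criterion strings."""
    criteria = []
    for line in bullet_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(("-", "*")):
            item = stripped.lstrip("-* ").strip()
            if item:
                criteria.append(item)
    return criteria


def derive_criteria(pairs, llm):
    """pairs: list of (original, rewritten) texts; llm: prompt -> reply text."""
    differences = [llm(diff_prompt(orig, new)) for orig, new in pairs]
    return parse_criteria(llm(summary_prompt(differences)))


def fake_llm(prompt):
    """Placeholder model: always returns the same two bullets."""
    return "- Leads with the action item\n- Stays under three sentences"


pairs = [("long rambling summary", "short summary with the ask first")]
print(derive_criteria(pairs, fake_llm))
```

Each returned string can then be phrased as a yes/no question for a judge model, which is what makes the criteria checkable one at a time.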

21:14-22:53

[21:14] Yeah. And then you can look at the accuracy of the evaluation criteria because now you have, for every one of your rewritten emails, you have whether it scored one or zero on that evaluation criteria, whether it was present or not. So then you can look at the false positives, false negatives. [21:44] My LLM judge said it had a good hook, so it was wrong. So that's like a false positive. So that would mark the score of the classifier down. So yeah, that's what you do. And you would do this in a Jupyter notebook? Or what's the format that you keep all the emails and examples in? Yeah, I would love to use a tool to do this. And I have tried a lot of them. And I find the abstractions change so often with AI that I keep going back to just no framework, no tool. [22:14] just Jupyter Notebook. I'm doing it in Cursor and I'm just letting Claude YOLO create the interface, save everything locally in CSVs and stuff. It's probably not the best way to do it. [22:28] Um, yeah, I'm, I'm like, uh, [22:30] I'm like the sort of guy that just doesn't really care that much about the elegance of my code. And I just want to kind of get the job done. I've always been like that. And now everyone is catching up to us. Because now everyone's just like, well, I let the LLM do it and I press accept all. And I've always basically YOLO'd all my code. I used to have to handwrite it and now I don't. Yeah, I was YOLOing code back when it was Stack Overflow.
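
Checking the judge itself, as described at the start of this section, comes down to comparing its one-or-zero verdicts with human labels for the same criterion and counting the disagreements. A minimal sketch follows; `judge_confusion` and the sample labels are made up for illustration and are not data from the episode.

```python
def judge_confusion(human_labels, judge_labels):
    """Compare an LLM judge's binary verdicts with human labels.

    Both arguments are equal-length sequences of 1/0 (or True/False) for a
    single criterion, e.g. "has a good hook". The human label is treated as
    ground truth.
    """
    if len(human_labels) != len(judge_labels) or not human_labels:
        raise ValueError("need two non-empty label lists of equal length")
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for human, judge in zip(human_labels, judge_labels):
        if judge and human:
            counts["tp"] += 1
        elif judge and not human:
            counts["fp"] += 1  # judge said "good hook", human disagreed
        elif human:
            counts["fn"] += 1  # judge missed something the human approved
        else:
            counts["tn"] += 1
    counts["accuracy"] = (counts["tp"] + counts["tn"]) / len(human_labels)
    return counts


# Hypothetical labels for five emails on one criterion.
human = [1, 0, 1, 0, 1]
judge = [1, 1, 1, 0, 0]
print(judge_confusion(human, judge))
```

A low accuracy on one criterion is a signal to reword that criterion's prompt before trusting the aggregate score built on top of it.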

22:55-24:27

[22:55] And just copy someone else's code in there and just see if it runs. [23:00] Um, you know what this also reminds me of is like, um, [23:04] I frequently all the time, like we run a media company. So like all the time I'm giving feedback to different people. That's like the same on their writing or like the videos they make or the tweets or whatever. And like, I just want someone to like make a rule thing and then like make a bot that just like grades this stuff and gives all that feedback because it's like so repetitive. And what's been really interesting is, um, [23:27] We have this writer, Alex, who started writing with us over the last month or two. And he writes Context Window, which is our Sunday email. And he writes all of the like, here's what happened in AI, here's all the new releases, here's some analysis or whatever. And then also sometimes like, when new models come out, like I always write a big piece. So like, you know, I'm going to write a big piece. [23:47] OpenAI launched Deep Research this week. And so I wrote that and we collaborated on that. And the first time we collaborated, he's a junior writer, right? And so he's got a ton of great ideas, but the first draft he did of that... [24:05] of the first piece we did together, which was, it was some other, it was some other new model release. I can't remember, but it was the first draft he did like, [24:13] I was like, oh, this is not good. No shade to Alex. It doesn't feel like you, right? Yeah. Not just not like me, because his byline's on it and stuff, but it had a lot of...

24:28-26:16

[24:28] It had a lot of the information that was supposed to be in there, but it was also like... [24:34] It had a lot of information that shouldn't be in there, and then the actual sentences themselves, they just weren't that good. And there's a lot of reasons and a lot of mistakes and a lot of whatever. [24:43] And so what I did was after he gave me the draft, we didn't have a lot of time, so I just rewrote it. And honestly, that kind of thing where it's a new product release, it doesn't take me that long to write. It took me like 45 minutes to write the whole thing once all the information is there. And then what I had him do is take his draft and then take my draft and throw it into o1 pro and be like, what's the difference? [25:07] And... [25:08] Like, [25:09] The next time, like when we wrote this deep research piece, I had him write the first draft and it was like... [25:14] a thousand times better the first time. And I was like, what the f***? Because before all of these tools, to get from where he was to what I saw him deliver would have taken like a thousand articles. He would have to write a thousand articles or like a year or two of like really grinding. And it was just like the next time it was great. And I was just like, this is crazy. It's absolutely crazy. And it's because like he's sort of every time I edit him, he like throws in the difference [25:44] rule book and like, um, is constantly as he's writing, going back with o1 or, or, you know, any of these models and having it, um, remind him of some of these things and write little portions that he struggles with and all that kind of stuff. And it's so much better. It's crazy. [25:57] Yeah, and I think that's the thing that people are missing is that, yes, AI is like a threat to people's jobs. And, you know, nobody really knows what's going to happen as we get like, you know, PhD level AIs doing everything. But then at the same time, they're also helping us learn a lot faster.

26:16-28:09

[26:16] So if you're actually using these tools and you have kind of these, like you're keeping rule books, right, of what is a good article, like according to the Every Style Guide, right, then you have that opinion, that strong opinion based on, you know, kind of the collective wisdom of the publication, right? [26:46] essentially like what I did at the agency as well. Like we would write these SOPs, we called them like standard operating procedures. And we had like a Google drive full of hundreds of them. And it was like, here's how I do this. Here's how I do that. Yeah. That's the thing that like, you know, I don't know, a year ago, like the really popular thing to say was like, oh, like prompt engineering is going to be dead or whatever. And that the opposite has turned out to be the case. Like it has become more and more important. And I think what, [27:12] I think the mistake that people made when they said that is, um, they, what they were saying is the gap between what you intend and what the model does is going to get [27:22] smaller and smaller. [27:25] So the gap between your objective and what it does is smaller and smaller. But the thing that people miss is that everybody has different objectives. Objectives are these big, high dimensional things that are really hard to express. [27:38] And [27:39] And people who are better at detailing their particular objective are... [27:45] prompting and that's like what it is you know um and so uh like detailed prompts are not going away because people have very detailed like things they need to get out for the to tell the model what to do um and i just think that that's that's like a really interesting i love i love the ways that we tend to misjudge new technologies and um

28:09-29:47

[28:09] And this is, I think, a really great example. Yeah, I think it stems from the fact that a lot of the people who are early in AI come from machine learning or engineering backgrounds or data science. And while obviously that's a huge benefit and they did really well even very early on in AI to get some of these systems working, they only really worked on things that were well specified. [28:39] the tool is already briefed, right? Like, like the product manager took the messy world, uh, outside and like crammed it into a PRD for you. Right. So, uh, so like, you don't, you don't realize how messy the world is, I think when you're an engineer. Um, and, and I say that as someone who, you know, was a business person and now is a full-time engineer. Um, and I can see both sides of it, but, um, and, and, and so they are completely right. Um, that like for well-specified tasks, you don't need to do prompt engineering anymore. You can, [29:09] use DSPy, but so much of the world is completely unspecified. And people didn't even know what the specification might be, like we talked about earlier. You don't know what is a good [29:20] article until you see it. And then you're like, oh, that's something that I feel allergic to. Totally. And I think that's the really interesting thing is so much of the world is not specified. And it's really interesting to see these companies like OpenAI going down the RL route with getting better at tasks where you can specify the end result in every step, verify the end result in every step, which will do some interesting things. But like,

29:48-31:30

[29:48] So much of the world is outside of the realm of what can be specified. And I think people just sort of miss that or assume that eventually it will all be specified. And I think that it won't. Specifications tend to be too... [30:05] low dimensional to express like all the stuff that's going on. Um, there's some, some things always have to be sort of inexplicit is, is my, my sort of philosophical view. And I think the [30:18] It is the first time that we have had tools that can deal with things that are inexplicit. We've never had that before. Any kind of computer, you have to be able to specify. It has to be exact. It has to be mathematical and logical, basically. And LLMs are like, here's an example. Just classify it based on these examples or follow the examples or whatever, which is a good way to do that. [30:42] talk about something or point to something that is inexplicit. Why is that example a good example? You can probably boil it down to rules, but there's something there that is still kind of unexpressed or there's a lot more richness in an example than the rule that you use to describe the example. And I love that. I love all that stuff. It makes me so happy. Yeah. I mean, I do wonder how far it will go, if we're going to keep seeing this exponential [31:12] Inescov. Like even if it [31:15] Even if it does tail off, I think we'll have enough to keep us busy for 50 years. Even with the models just stopped improving today. So I'm not worried about that. But yeah, I always think about this...

31:30-33:07

[31:30] uh, this thing, uh, people talk about, um, billiards or snooker pool. Um, uh, you can, you can calculate, uh, where a ball is going to hit, um, for like the first bounce. Uh, but like by the third or fourth bounce, you'd need like more time than, uh, has ever passed in the entire universe to calculate where it's going to hit. And I think it's, there's something in that, you know, you know, I, I first read that in, uh, in one of the Nassim Taleb books, I can't remember if it's Fooled by Randomness or the Black Swan or whatever. And I love that because the reason [32:00] because, [32:01] you have to start to take into account the pull of gravity of every like object in the universe in order to like keep moving. [32:09] Because everything pulls on everything else. And so after one hit, it's not that much pull, but after the fifth, it's a lot. [32:16] Same thing for personality prediction. It's like, if you can get 80% of what I would do, you might be able to predict the next thing I do. But once we get five moves out, things are wildly diverging. I want to take one minute away from this episode to introduce you to our sponsor, LTX Studio. I think storytelling is one of the most essential skills of the AI age. You can bring your stories to life with just a few words, [32:39] storyline, settings, all according to your style and your specifications. LTX Studio is helping storytellers visualize their stories in entirely new ways. Two of the most magical parts of LTX are in their character generation and storyboarding. Here's how it works. If you type in a description and maybe add a headshot, LTX generates unique dynamic characters, each with their own distinct look and personality. Remember the old days of struggling with storyboards? LTX makes it simple. If you need to map out a bustling debate in ancient Greece

33:09-34:42

[33:09] lays out and expands on your vision shot by shot. Better yet, it suggests new angles and shots you might not have considered. First-person perspectives, wide angles, close-ups, you've got it all. I can switch between six different rendering styles for the characters and settings. Whether you want ultra-realistic characters or cartoon-style art, it's all just one click away. The AI revolution is just starting, but if one thing's clear, it's that it's not replacing human creativity. It's expanding it. So if you've ever had a story in your head with no way to bring it to life, [33:38] Start with LTX Studio. It might just be the creative partner you've always needed. Check out the link in the episode description for more details. And now, [33:45] Back to the episode. Yeah, but essentially, like, you're not going to fit the gravitational pull of every object in the universe in the prompt. Yeah, exactly. Probably not. Yeah, you can't. It doesn't matter how good the models get. You know, like, it doesn't matter how good we get at pre-training or whatever, you know. Like, it's never going to be able to completely specify. Like, it's always going to be a simplification of reality, right? It's always a model. [34:15] know either, but my feeling about this stuff is, um, [34:21] We tend to collapse our view of ourselves and the world. [34:26] to reflect the tools that we have at our disposal. So as an example, [34:34] When we only had like... [34:37] watches and telescopes and calculus, like,

34:42-36:26

[34:42] we tended to think of the human mind and the universe as sort of like a, sort of like a mechanical clock, you know? Yeah. And sort of operating according to Newtonian mechanics, which it sort of does if you look at it that way, but also if you look at it a different way, it does not work like that at all. But we were completely convinced that like for, for hundreds of years, basically that that's how everything worked. And we just needed to like, um, [35:09] We just needed to do a better job of specifying or finding the underlying rules or principles by which the clock of our minds or the clock of the universe worked. And then Einstein came along and he was like, well, it's not really like that at all from this other perspective or maybe this larger perspective or whatever. And I think something similar happens with language models where like... [35:29] you look at a thing that can do a lot of tasks that you're used to being able to do. And you're immediately like, it's going to do everything I do because like, [35:37] you just sort of shrink your sense of self and sense of the world and sense of what you can do to like that thing. At least initially, it tends to like hide all the complexity of, of who you are and what you can do and like all that kind of stuff. But I think over time, [35:53] we will discover that there's like a lot of things about us that, um, [35:58] it's not able to do even if it's operating at phd or post phd levels um [36:04] We just can't see what that is right now, but I think that it's there. And that's another example of ways that new technologies cause us to see things differently and mistakes we tend to make. Yeah, I really think... I said this before on the show, but I think there's this false peak thing with AI where every time you get to a new level, you're like...

36:26-38:08

[36:26] I can see the peak right there. And once I get there, that's going to be it. It's going to take over everything. And then you get to that peak and there's, [36:32] There's another horizon that opens up. And I think that that's true, too. Like, humans are way more flexible and powerful than I think we give ourselves credit for in a lot of ways. And not to say that this technology is not amazing and powerful. And I totally think that, too. But there's, you know, it's, I think, holding both is important. [36:53] Yeah. Do you know the bulls**t jobs book? The hypothesis that something like 40% of all the jobs in the economy are basically unnecessary? No. Like admin jobs. So he uses the example of a parcel being delivered to a military base. And 50 years ago, the guy who was delivering the parcel, like a package from Amazon, whatever it is, would just take it straight up to the soldier. [37:23] and then the soldier would sign for it. And today, the guy has to go and put it in a holding space, and then there's an office manager of the holding space who signs for it, fills in a form, and then they take it. There's someone else who goes, and then they sign it off. And then they have HR training. And basically, when you add all these extra people in, the cost is inflated by 40%, 50%. But the same task is still being done. [37:53] improvement in the efficiency of parcel delivery. And I don't know if I fully agree with his thing, but the interesting germ of a thought in my head when I read that book was, "Oh,

38:09-40:00

[38:09] Um, [38:10] We already have universal basic income. It's just that they're admin jobs. Maybe that's controversial. You see what Elon is doing in some of his companies or in the government now and ripping out a lot of these jobs that actually don't seem to make a measurable difference. We'll see what happens if he still keeps getting away with it. [38:40] It's an interesting hypothesis. If you believe that, then essentially all of the gains from the internet [38:47] And all the productivity gains essentially went to creating this extra 40% float. And then AI might just make that 80% or 90%. That's interesting. I don't think I agree with that fully. I think the interesting devil's advocate, and I think you're definitely not arguing that you agree with everything about that book. But I think the interesting devil's advocate is like, why? Like one way to think about it is like, [39:11] We make all those sorts of rules because like... [39:14] We just want to waste things, basically. But usually, there is at least some organizational reason why. There's some risk management thing where it's like, well, if the parcel gets delivered directly to the soldier, then XYZ bad thing happens, and so we need to create more process or whatever. [39:33] And usually, I think the interesting thing about human organization is that we've needed process or rules in order to facilitate collaboration between large groups of people. That system of rules just creates a lot of bureaucracy and middle management and things that everyone complains about and doesn't really like. Even the people who are doing it, they don't really like it. But we need that in order to coordinate masses of people.

40:00-41:32

[40:00] And I think one possibility for AI is that it... [40:06] it, [40:07] removes the need for so much, so many like layers and rules and processes, because like the AI can do a lot of the like coordination stuff that usually would require lots of middle managers or, and that kind of stuff. I think another way that I like sort of agree with one of the conclusions of, of this sort of like hypothesis is like when I go to, you know, get my, get my shirt tailored in Brooklyn, like they don't take credit cards. [40:36] you know, [40:36] It takes a long time for people to adopt new technologies. Even we are not getting as much out of this stuff as we possibly could. There's so much stuff that I'm sure is latent in the tool. And so pure productivity, pure efficiency... [40:58] it takes a really long time to filter through society. And it's not like... One of the things that I always say when I talk to people at big companies is like... [41:08] I think AI is a really good test of what would happen if we had magic powers. We would just basically do nothing. Maybe we would create a working group where some people at the company would spend some of their time figuring out what to do with our new magic casting, magic spell casting abilities. But we probably then wouldn't even... It would take them a year and they wouldn't actually figure anything out. And so, yeah, I think...

41:33-43:02

[41:33] I think we tend to feel or believe that just because we have a new capability that it just sort of like gets integrated like super fast in society. And that's definitely not true. Yeah, I think where there is some kind of insight here is that a lot of the way that we organize people is based on kind of like organizational scar tissue, I guess, is how I think about it. [42:03] stupid and therefore now we have a form right my dad is always like one guy tried to blow himself up with his shoes in like 2006 yeah and like now we have to take our shoes off like every time and it's like yeah that is that is true like the shoe bomber he ruined it for everybody and he didn't even succeed and we're like afraid to tear down the fence in case something comes through you know so like i think it does have a purpose but i think it's like an emotional purpose um and uh and like [42:33] maybe like we, it makes us feel good that we have these protections in place, even if they're not actually doing something measurable. Right. Yeah. So, so I actually think that AI will increase them, right. Because the only reason we have like to take your example, the only reason we have the TSA scanning your bags at an airport, but not at a train station is that like, it's too inconvenient to do it at a train station. Like you couldn't, you couldn't do this in the subway in New York. Right. The whole thing would shut down. Maybe they should.

43:03-45:01

[43:03] I don't know. We've had some issues recently. But as it becomes more convenient to make you fill in forms and to do administrative tasks, I feel like eventually that's just going to balloon. And you'll have my AI talking to the government AI and checking everything's okay across a thousand different forms. As long as I don't have to do it, I'm fine. [43:28] Yeah, that's true. It's still a win, I guess, if it's automated. [43:33] I agree with is like, um, I, [43:35] actually the limits of the technology right now are not even the limits of what's technologically possible. Like there's, you know, you can't, [43:44] Like GPT-4, GPT-3.5 could have taught you how to make a nuclear bomb, but it doesn't because like OpenAI made it so it wouldn't do that, you know? And so... [43:53] there's all sorts of ways in which the technology is limited by our, like, our, our tolerance for risk and everyone's tolerance for risk is different. So every organization has a different approach to like what, what risks they're willing to take or whatever. And, [44:08] To some degree, that's what makes baseball. I think that's actually good. Yeah, they can find the right answer collectively by saying, oh, that feels wrong. When they've released it completely openly, they get backlash. Or if they're too conservative, then they fall behind and it becomes like a... [44:25] a real race. Exactly. And it also provides opportunities for startups because like what OpenAI can do is just different from what like... [44:32] you and I can do. Every government in the world is like watching them like a hawk, you know, and, um, and we're just, no one's really watching us because it's small enough that like no one cares. And so we can take more risks and, and find out more good things to do with it than, than, you know, a bigger company can. So, you know, another, another way in which I think people were wrong at the outset of, of AI stuff is to be like, oh, incumbents are just going to win. 
And it's like, incumbents always get up and, and, and it's about risk taking.

45:02-46:46

[45:02] And it's not their fault. It's not like they're stupid. It's just massively difficult to take real risks when you have tons and tons of customers in a big organization and all this pressure. It's just not a good environment for that. [45:15] Yeah, it's really rare that they manage to do something good. And it's not through lack of trying, right? [45:23] They've all read the disruption stuff, right? They'll see it happening and you can see them trying their best to do it. But I would say actually they're doing, I think the scorecard is pretty good. I know people talk about Google and falling behind and all this stuff, but I would say big companies are innovating faster. [45:45] than they ever have. There was one example that really jumped out in my mind where I went to a conference. It was two months after ChatGPT came out. It was a B2B enterprise. It was all Fortune 500 companies. They had NASA there talking about AI and stuff like this. [46:15] of their hands shot up. [46:17] And I was like, this is like two months after ChatGPT. It's pretty amazing. And just contrast that with like, I was working in growth marketing, like growth hacking with my agency. That was a growth hacking agency. We started in 2014. And it was, it took like four years before our first Fortune 500 client like Googled the term growth hacking. Like we were number one for growth hacking on Google. And yeah, it was like four years before we got our first enterprise client.

46:47-48:20

[46:47] You know, it took years or even the internet, you know, years and years and years and years. Yeah, that's interesting. I do think they're getting better at it because they've been. [46:55] They've been learning their lesson in a lot of ways. Yeah. So I want to talk about Cursor. But before we do that, I want to bring this whole conversation back around to Rally because we haven't shown people Rally. And so if you feel like showing a demo, I think it'd be really fun. I just think it's the coolest piece of software. [47:10] Yeah, so... [47:13] We talked about the script earlier about how I was A/B testing your headlines, the multimodal media versus metamedia. And what I've done in the past couple of weeks is put a basic user interface on it. And I have that live now actually, with a waitlist, but it's at askrally.com. [47:43] I run this. And I've got maybe 20 of my friends checking it out now. So we could try having a look on the call as well. [47:52] That's great. Yeah, let's do it. So basically, for people who are listening instead of watching, so like on the Rally screen, at the top, you see something that says like "Your audiences" and you can pick an audience. So like millennials, manufacturing CEOs, there's an audience in there, general population. There's an audience in there that's like Hacker News audience, dog lovers in Dallas, Every audience. That's my test audience I always use because it's the first thing I think of. Perfect. Let's do Hacker News. So like if we select Hacker News,

48:21-49:59

[48:21] And then it's like, what would you like to ask the audience? There's a text box. So it's sort of like Who Wants to Be a Millionaire, you know, but like, but like, you know, for real, anyone can do it. You don't have to know Regis Philbin. So do you have a question or should we use, I mean, I can come up with one. Should we use the Metamedia one? [48:38] Yeah, well, why don't we do like, maybe do like a headline test. Do you have like a couple of headlines for the latest posts? Yeah, totally. We just did a new launch where we said like Every now includes Cora, which is our email app. And basically as part of the Every bundle, like you pay one price, you get access to Cora and all the other stuff we make. So the current headline is the Every bundle now includes Cora, which I don't really like that much. The Every bundle now includes Cora. [49:08] whether or not we should call it a bundle or just like, [49:11] the Every subscription or, or Every. People don't say like the Prime bundle, they just say Prime. Yeah. Um, membership, like, yeah. What? Membership, like Every membership is an interesting one. Um, so like maybe a B would be like, Every now includes Cora, um, manage your emails with AI. [49:34] Thank you. [49:36] and then you like uh the capitalizations on each one yeah we usually do um all caps basically but like i don't think the caps like make that much that's a really that'd be a good A/B test is do the caps make a difference on what people click on yeah we can do that i used to test stuff like that in the agency yeah yeah um and then the next the last one would be like um

49:59-51:28

[49:59] Every now includes Cora, the most human way to manage your inbox. So Every now includes Cora, the most human way to manage inbox. Yeah, my wife always complains about the loud sound of my typing because I'm like a heavy typer. You're probably hearing that now. [50:29] like, I type with purpose. I always think about that. There you go. I need to use that line. Cool. So, you know, we have the three different examples, and this is just like sending a prompt to ChatGPT. So we're going to kind of tell it like what we want out of this. And in this case, we're going to say, you know, which of these... [50:53] headlines would you click on? Probably like, I want to know if they would open it. Like, cause it's going to go to their inbox. [51:01] Yeah, OK, these are email subject lines for the publication every week. [51:11] Which of these headlines would you open? [51:15] Okay, so we'll just keep it simple there. And then I'm going to put it in voting mode, which basically just gives it a tally at the top as well. Oh, cool. I don't think I've seen voting mode. That must be an interesting trick. Yeah, I just added it.
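
A rough sketch of what a "voting mode" tally like the one described here could look like. This is not Rally's implementation, which the episode does not show. It assumes a hypothetical `Persona` type, made-up persona names, and a keyword-overlap stand-in (`pick_subject_line`) in place of the real language model call, so the example runs offline.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Persona:
    """Hypothetical simulated audience member (names and fields are assumptions)."""
    name: str
    interests: set


def pick_subject_line(persona, options):
    """Stand-in for a model call: choose the option that shares the most
    words with the persona's interests. A real tool would prompt an LLM here."""
    def score(option):
        words = {w.strip(",.:").lower() for w in option.split()}
        return len(words & persona.interests)
    return max(options, key=score)


def run_vote(audience, options):
    """Ask every persona the same question and tally the answers."""
    votes = [(p.name, pick_subject_line(p, options)) for p in audience]
    tally = Counter(choice for _, choice in votes)
    return tally, votes


# The three subject lines tested in the episode.
options = [
    "The Every Bundle Now Includes Cora",
    "Every Now Includes Cora: Manage Your Emails With AI",
    "Every Now Includes Cora: The Most Human Way to Manage Your Inbox",
]

# Illustrative personas only; not data from Rally.
audience = [
    Persona("skeptical_engineer", {"human", "inbox"}),
    Persona("startup_founder", {"ai", "emails"}),
    Persona("indie_hacker", {"human"}),
]

tally, votes = run_vote(audience, options)
for option, count in tally.most_common():
    print(count, option)
```

Keeping the per-persona `votes` next to the aggregate `tally` mirrors the demo, where the tally sits at the top and each simulated person's answer stays visible in the sidebar.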

51:30-53:13

[51:30] Yeah, thank you, Claude. That's the other thing I love about this era of AI tools is everyone's just shipping so fast. You're just like, oh, yeah, I added that in an hour, like two days ago. I had the idea and I just did it. It's like, I love that. [51:43] Yeah, yeah. No, it's definitely really appealing. The problem is like when you have to refactor everything. Yeah. And then the UI gets messy and the whole thing is like, yeah. So, OK, so basically you typed in a prompt and then basically it spun up an audience and then it... [52:01] asked the audience that question. So which of these three would you click on? And then, as you can see, it basically returned the results. And the results are, "Every now includes Cora, the most human way to manage your inbox" is by far the most clicked on or opened one, which is really interesting. And "the most human way to manage your inbox" is the one that we've been going with. So that's cool. But also in the sidebar, you can see each individual simulated person, what they said. [52:31] human way to manage your inbox stands out to me because it's just a more personal approach, which resonates with my values about technology and its impact on our lives. So that's what one of these AI... [52:41] people said about why they clicked on it. And then you also have this like summary thing where it's like, [52:46] It summarizes what everyone says. So the feedback indicates a strong preference for the subject "Every now includes Cora, the most human way to manage your inbox." The choice resonates due to its emphasis on balancing technology with a human touch, which aligns with the values of the individual. So like, I think that's really interesting. And I think we would get different results, for example, like, if we, if we change the audience, or another thing that I know you're building, which I really want is like,

53:13-54:38

[53:13] I don't want it to be testing them against each other. I want them tested in the context of someone's inbox and the kind of emails they usually have to see which one of them... [53:23] I clicked more or opened more. Um, [53:26] And, uh, and I'm sure I know that that's coming, but like, yeah, this, like, this is so valuable. It's like, uh, before you'd had to like throw it out into the world and that costs time and money. And if you don't have an audience, like it's really hard. Or if you don't know the people who are, it's going to be going to like whatever. So yeah, I love this thing. Yeah, exactly. And then once you, once you have like a sense of, uh, whether you're asking the right questions or, um, you know, if you, if you really disagree with the result or you really agree with it, you can then go and validate. [53:56] with real tests, right? So, you know, it just kind of gives you more options and more, you know, room to navigate these different creative decisions. Right. Obviously, like disclaimer, this could be wrong. It's just, it's another piece of information, you know? [54:11] Exactly. Yeah, it's another opinion. And, you know, this is like still pretty early as well. So like we're working on getting all the evals set up and, you know, doing the rigorous testing against like real human responses. So, you know, that's that's coming to but but yeah, that's like a long road. And yeah, there's a lot of a lot of really cool stuff you can do like, like the thing I'm working on right now I showed you earlier today. It's like a hacker news simulator as well.

54:41-56:13

[54:41] today and then you just like, [54:43] insert your headline in there and just see if they, if they click on that one. I love it. So yeah, we'll see how that goes. Yeah. That's the best. Um, so, um, [54:54] You're using Cursor for all this stuff. Tell me about that. Yeah, so I've tried a bunch of them. I tried Windsurf and heavy copy and paste initially. Early adopter of GitHub Copilot. And the reason I like Cursor and I keep coming back to it is just that you can hop quite quickly between very low level and very high level. [55:21] So my typical flow is I'll actually talk through an idea. [55:28] I'll record myself just blabbing about the thing I want to build because I don't really know what I want to build yet. Like in a voice note? Yeah, quite often it would be just record it. I'll use Descript. [55:45] just because it's what I have on my computer for recording videos. So I'll just record. It doesn't really matter what you record it with, but then I'll transcribe that. [55:54] So you can upload that to Google Gemini, like the AI studio, and it's pretty good at transcription. So I tend to do that. And then I'll have a transcript of me blabbing. And then I'll stick that into usually Claude and get a PRD, like a product requirements document.

56:15-57:46

[56:15] I'll edit that quite a lot. And that's where I think a lot of the prompting. Claude versus o1? Yeah, I tend to use Claude still over o1. [56:24] o1 Max is, sorry, they changed all the names, but like the, the, the, the $200 version. Um, I do use that. I'm using that more and more, but I still find Claude is like more creative. Yeah. That's the only way I can describe it. Um, I use o1 to, uh, settle, um, problems. Like it's like the, the big daddy that comes in and just like fixes the problem. Like when, when [56:54] that PRD and I'll run the Cursor agent and let it just completely go nuts and create any files it wants, do whatever it likes. And then I'll look back and see what I have and if it works. And I'll change a few things based on the general vibe. But when it gets stuck, which is actually pretty often, then that's when I jump out to o1 in some cases and say, hey, here's the context, [57:24] and then get the answer back. But you say you're editing the PRD that comes out of Claude before you put it into Cursor. You're not just YOLOing it. Tell me about that. Because I always... [57:37] One of my failure modes when I do this stuff is sometimes I just YOLO it because I'm just like too lazy, you know, and it... [57:42] it kind of gets f***ed up and... [57:45] Yeah, tell me about that.

57:46-59:16

[57:46] Yeah, I found that quite a lot. It's actually similar to what we talked about earlier with the evals, where what I'm looking for is what do I strongly agree with and what am I allergic to? So I'm scanning mostly. I'm not fully reading in a lot of detail, but I'm just looking for things. It's actually similar to if an employee sent me something and I'd be like, I'm going to scan it and just see if there's anything that really upsets me. And if not, then it's fine to go ahead. [58:16] especially when it's a new feature that I haven't really thought through, then I'm like, oh, I see that it's tried to do this, and that feels really wrong to me, so I'm going to change my whole approach. Actually, that was why DeepSeek was really good, actually. I used that a fair bit over the past week, but just the ability to see the thinking [58:38] and just see, "Oh, it's thinking wrong. I must have got something wrong in my prompt. It's trying to create a new file, but I already have that file. I just needed to put it into the context." So it's really about refining [58:49] It's just me figuring out what I actually want. And then once I'm fairly happy with that, then it's much easier to build the stuff. [58:58] Interesting. What do you think of o3? Are you using it at all? [59:01] I did try o3-mini a bunch today and I would say it's like an impressive model. It's surprisingly fast, which was like my first impression for a thinking model. Which makes a big difference. Like it's not trivial that it's fast.

59:31-1:01:00

[59:31] have in one day. So then I'm like browsing Twitter and I'm like talking about politics. So this is like, I look up like 30 minutes later, like, oh yeah. I have the same thing. Like I did that this morning with deep research. Like I posed it this like really big, deep research question. It was like a page of stuff. And then I like went out and like, like got coffee and like, and just, I knew it was working. It was so weird. [1:00:01] Maybe just enjoy yourself. How about that? I did hear someone say that the other day. They were like, oh, it's really nice to have regular breaks now with AI. It was before I was thinking the whole day. But no, I mean, it reminds me a lot of my first internship. It was in the UK government and the benefits, like government benefits, like Social Security is the equivalent. [1:00:31] which is a very old language. And we'd write the code and then we'd send it off to a central server and it would take a couple hours to run. So I basically did no work that whole internship. I was like, "Sorry, my code's running." I think there's an XKCD comic that makes fun of this as well. My code's compiling. But yeah, it feels like we're back in those times again. Back to the future. Yeah. I've got to find a way to be productive. I was thinking about maybe I should start

1:01:01-1:02:33

[1:01:01] seeming cursor. So like I'll have like, you know, I'll have like, um, my, my, my blog, uh, writing in Cursor like on, on one, uh, one tab and I'll be creating blogs and like talking about those. And then, and then like, well, well, you know, one's working in the background and the other thing. [1:01:17] I like that. I was doing that a lot with Devin when I was using it. I would have four Devins at once doing different things. And then I would be like, if you have four of them, every minute, there's something it needs help on. And so you're just constantly tending it. And it reminded me a lot of when I was in college, I used to play online poker. And I was never that good. But good players can play four tables at once or whatever. And so sometimes I would do that because whatever, [1:01:47] It felt like that, where it's like you're doing all these different things all at once, but... [1:01:52] Uh, you don't, it, it, each one only requires divided attention so you can get it all done. And it's, I think that's so cool. Yeah. It's like those people, the chess grandmasters you see in Washington Square Park and they're playing like four games at once. Yeah. It does feel like that, but, but yeah, I wonder how much we're really getting done when we do that. You know, I don't know. Well, I mean, to some degree, like running a company is like that. It's very similar, you know, um, where I'm constantly just like, I'll spend like three [1:02:22] go on to the next thing or whatever. So I just think more people are going to be working that way. [1:02:28] with all of its benefits and all of its trade-offs and we're gonna have to like figure out how to manage that you know?

1:02:33-1:04:06

[1:02:33] Yeah. What I really want to try and get good at is doing it while I'm on a call. Obviously not this call. But I find quite often, maybe this is just me, but I'll be on a call and it could have taken five minutes. The actual relevant part of it was five minutes. And I don't even need to take notes anymore because AI is listening and taking notes for me. So that's the thing. I haven't quite got [1:03:03] it yet. But yeah, that's the dream. I can be managing Cursor in the background while I'm also [1:03:10] Yeah. Also, it's just like, I'll have my agent call your agent and like, we don't even need to get on the phone, you know? [1:03:16] Yeah. I, do you know, one of the things, one of the really out there ideas that I would love is maybe I'll just do it for fun on the weekend. But like, I'd love like a negotiation, like settlement kind of agent where I give, I give it like my red lines and, and it's like a trusted third party. [1:03:40] And then it just goes back and forth for our virtual negotiation. And then it comes back and says, we've settled at this price. It's the best you could get. I think it's cool. And then it's like, you have to also decide how much you want to fund the agent. How many compute cycles does the agent get to get to a result? Yeah, yeah, yeah. This was a big thing with the early agents, like AgentGPT. I don't know if you ever played with that back in the day. I'd say back in the day, it's like two years ago.
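
The negotiation agent idea above (private red lines, a trusted third party, and a budget for how long the agent may run) can be sketched as a toy loop. Everything here is an assumption made for illustration: the `RedLine` type, the `mediate` function, the fixed concession `step`, and the use of `max_rounds` as a stand-in for a compute budget. A real agent would negotiate with language model calls, and this sketch makes no claim to find the best possible price.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RedLine:
    """Hypothetical private instructions one party gives the mediator."""
    limit: float    # the price this party will not cross
    opening: float  # the first offer made on its behalf


def mediate(buyer: RedLine, seller: RedLine, max_rounds: int,
            step: float = 0.25) -> Optional[float]:
    """Go back and forth until the offers cross or the budget runs out.

    Returns the settled price, or None if no deal was reached within
    max_rounds (the stand-in for how much compute the agent is funded with).
    """
    bid, ask = buyer.opening, seller.opening
    for _ in range(max_rounds):
        if bid >= ask:
            # Offers have crossed: settle at the midpoint.
            return round((bid + ask) / 2, 2)
        # Each side concedes a fraction of the distance to its own red line,
        # so neither offer can ever pass that party's limit.
        bid = min(buyer.limit, bid + step * (buyer.limit - bid))
        ask = max(seller.limit, ask - step * (ask - seller.limit))
    return None


# Illustrative numbers only.
buyer = RedLine(limit=120.0, opening=80.0)
seller = RedLine(limit=100.0, opening=150.0)

print(mediate(buyer, seller, max_rounds=20))  # enough budget to settle
print(mediate(buyer, seller, max_rounds=3))   # budget too small: None
```

Because each offer is clamped to its owner's red line, any settlement falls between the seller's limit and the buyer's limit, and a pair of red lines that do not overlap never settles no matter how large the budget is.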

1:04:09-1:05:34

[1:04:09] It's still going. [1:04:10] But yeah, I remember seeing a talk by those guys and they had... [1:04:15] It really stuck out to me because they had LLM judges in their CI pipeline. So whenever they pushed code to that repo, they had AI agents checking the code and making sure it didn't break and running tests. [1:04:31] At the time, that was kind of radical. Now there's products that do that. But they said they were spending $200 every time they pushed a commit. [1:04:41] That's crazy. That's crazy. Now that same commit probably costs $2. Totally. You probably lost. Yeah. I do think it makes sense. There is a case to be made that you should just be spending as much money as possible in AI because... [1:04:59] We're almost like definitely being too conservative. [1:05:01] Totally. Agreed. Well, that is a great place to leave it. This is a great conversation. I'm so glad we finally did this. I'm so excited for Rally. I'm so excited to get to work with you. If people are looking to find you and your work online, where can they find you? [1:05:16] Yeah, on Twitter, I'm hammer underscore MT. So it's probably the easiest place to find me. And then the book that I did for O'Reilly was Prompt Engineering for Generative AI. [1:05:27] It's on Amazon. It's got like an armadillo on the front. So it's easy to spot. Awesome. Thanks for joining. Cool. Thanks, man.

1:05:57-1:06:25

[1:05:57] about ChatGPT. Every episode is a roller coaster of emotions, insights, and laughter that will leave you on the edge of your seat. [1:06:06] craving for more. It's not just a show, it's a journey into the future with Dan Shipper as the captain of the spaceship. [1:06:13] So do yourself a favor. Hit like, smash subscribe, and strap in for the ride of your life. [1:06:19] And now, without any further ado, let me just say, Dan, I'm absolutely hopelessly in love with you.
