Guest: Martin Davidson
In the seventh episode of Asynchronous and Unreliable, Anne and Martin Davidson of Tollens.AI discuss the current state of AI coding for high efficiency, high resilience software. It's scarier than you think (in every sense - wait for the end).
Join host Anne Currie in this episode of "Asynchronous and Unreliable," where she talks with veteran technologist Martin Davidson about his journey from telecommunications to AI-driven software development. Discover how AI looks set to reshape the software industry, including by building high-quality, production-ready code and redefining testing, design, and organizational structures.
- Martin Davidson’s career evolution from high-resilience telco software to AI and machine learning
- The concept of oracle-driven development for AI-generated code quality
- Practical experimentation with rewriting libraries and building emulators in Rust using AI
- The extension of traditional unit testing to fuzz testing, differential testing, and other AI-powered validation
- The importance of defining success criteria ("what good looks like") upfront in AI projects
- Parallelization strategies in AI and software architectures: from agents to cores and teams
- Organizational implications of AI-driven productivity increases and automation
- Future outlook for legacy software companies amid AI disruptions
- The cultural and economic impacts of AI on software engineering careers and industry stability
- The analogy of AI as an unstoppable "Terminator" in software automation
Anne Currie (00:00) So hello and welcome to Asynchronous and Unreliable, a new weekly podcast where we discuss the most interesting ideas and concepts in tech. I'm your host, Anne Currie, co-author of O'Reilly's Building Green Software, The Cloud Native Attitude, and author of the science fiction Panopticon series. And I also offer consultancy and workshops as Strategically Green. And for our guest today, we have veteran technologist Martin Davidson, who doesn't like being called a veteran, but we have to accept that we both are now. We've had long careers in tech, we should own that. He is currently the engineering head at AI startup Tollens.ai. He's also worked for 30 years in the field of high efficiency, high resilience software, specifically the software that powers telcos. And he's also the author of my favorite AI and software development Substack. So Martin, do you want to introduce yourself?
Martin Davidson (00:52) Yeah, of course. Thanks, Anne. That's very kind of you. Yeah. When did I start? I started out in the early 90s. The first software job I had was working for Hydroelectric, as it was at the time. And they had all this scanner information from their systems, water levels and dams and power stations, all that kind of stuff. I wrote some code, which I spent several summers writing this database application on Windows, which would store the data. And then you could graph it and show stuff. So they could use it to predict demand curves and things. It's the kind of thing that nowadays you do perhaps in an afternoon with Grafana and a bit of Postgres, how things change. But then I got a proper job working for a company called Data Connection. I was really lucky there because I got to work on NetMeeting, which is the precursor to what we're using right now, audio video conferencing, much smaller and lower resolution. But it worked over a 56k modem. I tell you this doesn't work over a 56k modem.
Anne Currie (01:48) If only it would. Mm-hmm.
Martin Davidson (01:48) Yeah. And then I was part of the Windows development team working on remote desktop, which was an amazing opportunity to actually do something which impacted an awful lot of people in a very positive way, I think. Then I spent a lot of time after that working on highly available, highly scalable communication software.
Martin Davidson (02:11) doing stuff that was used for communications with submarines and then moving on to messaging systems. So very high scale telephony systems, very high reliability. So the words five nines are imprinted somewhere inside here because I spent so much time thinking about that. More latterly I've got very much involved in AI and looking at the potential of what AI can do and how it's really transforming our industry. I first actually started using AI back in the mid 90s, when one of the other projects we tried at Hydro was using AI to predict what the demand curve would be like for the power grid. Spoilers: it didn't work.
Martin Davidson (02:49) But it got me really interested in how neural networks work and understanding that. And it's been interesting watching and following along on that journey. Just over the past few years, it's been a wild ride of how much code it can write now and how good the code is it can write and where that takes us. And for me, I like building things. So this is fantastic. I've now got a tool that will allow me to build lots of things. And I don't have to sit and learn the syntax of Rust or remember how to do this or spend ages looking stuff up in API books or whatever. So that's great. And I think one of the things that I've found with the AI is the amazing coding ability. And it's one of these things where I think if you drive it and you steer it correctly, it can be the most amazing tool possible and create really good quality software. But equally, if you just let it go for it, it produces this terrible slop and some really terrible code. So it's opened up how wide the spread of code quality can be. And I think what we're trying to do with Tollens, and what our goal is, is to take the knowledge and the experience we have about how to build good quality, high quality software and provide that as an extra harness around development tools like Claude Code or Codex or whatever you happen to be using, so that everyone can benefit from this and everyone can start creating amazing quality code and solving the problems they have. It's almost like, in a way, it's fulfilling the whole promise of personal computing from way back in the late 70s and 80s: that you get your computer and you can make it solve the problems that you have in the way you would like them to be solved. And I think that's really exciting.
Anne Currie (04:29) So I'm going to stop you there and roll you all the way back because we need to provide a little bit more context for this. So you and I first spent any time together 30 years ago, when you were working on the Microsoft NetMeeting product, on secondment there from the UK for the company that we both worked for, and I was working on Microsoft Exchange. That was our first experience of really, really high impact, efficient, high resilience software that's meant to be used by millions and millions of people in a way that is resilient and reliable and just works. So that has very much steered your perspective in the years since: good software works, it works all the time. What's the name of your Substack? How do you get it? I love it. I think it's a really good way of finding out what's possible at the moment and what the issues and problems are with AI and the software development lifecycle.
Martin Davidson (05:37) Yeah, so the Substack has a geeky name. So it's 0x4d44, which is ASCII for MD, which are my initials. I probably could have thought of something better, but it appealed to me, which probably tells you all you need to know. And yeah, the Substack has been an interesting thing to write.
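For readers who want to check the name: a minimal sketch in Rust that reads the two bytes of 0x4d44 as ASCII characters. The function name `ascii_pair` is made up for this illustration.

```rust
/// Interprets a 16-bit value as two ASCII bytes, high byte first.
/// (`ascii_pair` is a hypothetical helper, not something from the show.)
fn ascii_pair(value: u16) -> String {
    let bytes = value.to_be_bytes(); // 0x4d44 becomes [0x4d, 0x44]
    bytes.iter().map(|&b| b as char).collect()
}

fn main() {
    // 0x4d is 'M' and 0x44 is 'D' in ASCII.
    println!("{}", ascii_pair(0x4d44));
}
```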
Martin Davidson (06:06) Because I don't know about you, I'm sure you find this too. I find that writing stuff helps solidify the thinking that's going on in my brain. And really, I started out writing the Substack for me and if it was useful for other people, that's a bonus, right? That's great. But it's become part of my life, the discipline of every week I want to process all the thinking that's going on in my head and write an article about it. And that's basically what I do. So, yeah.
Anne Currie (06:37) And basically this is documenting your journey, trying to get your head around what the situation is for getting AI to write good quality software? And as you learn and you try and you trial and error it, every week you're writing down what you've learned and what progress you've made, is that it?
Martin Davidson (06:56) There's a few topics I cover. I think there's very much how it changes the mechanics of how we're going to develop software and what the software development process looks like in the future. And how some of the things that we hold dear and true, like code review, for example, right? Code review made a lot of sense in a human world. Does code review make a lot of sense in the AI world? Does human code review of AI code make any sense? And I think probably the answer to that, if I was Claude I'd say the uncomfortable truth is, because Claude says things like that. But I think the reality probably is that human code review, we're getting to the point where it has very little value. There are better ways to spend your time. And it's all about what's the return on the investment? It's the good old example that if you have 32 independent inputs, and they can be on or off, that's 2 to the 32. So it's four billion combinations. Well, you only live for about two billion seconds. If you're going to test all of those extensively, you've got half a second for each of them, starting from the moment you're born till the moment you die. I'm halfway through my life, so I've got even less time to go and do these. And I do like sleeping and eating and things as well. You very quickly realize that you can't comprehensively test the full surface space of software. So it's always a compromise. Quality is always this kind of thing where you're trying to figure out what's the thing that will give me the best return. And also, there's the precursor to all that, which is, well, what does quality mean? Because quality is not a fixed level. You could say, what I need is fantastic diagnostics from this product in order to be able to debug it. That's what high quality is for me as a support engineer. But as a customer, you might say, I want it to run really well on this battery powered device and not use up lots of disk space. So that's a competing quality perspective from the end user versus the support engineer.
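The arithmetic behind the 32-input example can be checked with a short sketch. The function name and the two-billion-second budget are just the round figures from the conversation turned into code, not a precise lifespan.

```rust
/// Seconds available per combination if `inputs` independent on/off inputs
/// are all tried within `budget_seconds`. (Hypothetical helper for illustration.)
fn seconds_per_combination(inputs: u32, budget_seconds: f64) -> f64 {
    let combinations = 2f64.powi(inputs as i32);
    budget_seconds / combinations
}

fn main() {
    let lifetime_seconds = 2.0e9; // the rough "two billion seconds" figure
    println!("combinations for 32 inputs: {}", 1u64 << 32);
    println!(
        "seconds per combination: {:.2}",
        seconds_per_combination(32, lifetime_seconds)
    );
    // Every extra input halves the time available for each combination.
    println!(
        "with 40 inputs: {:.4}",
        seconds_per_combination(40, lifetime_seconds)
    );
}
```

2^32 is a little over four billion, so the budget comes out at just under half a second per combination, and each additional input halves it again.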
Anne Currie (08:55) Yeah. So again, I'm going to slightly break in and roll you back a little bit to this, which is the experimentation you've been doing, which I really like, and I think it's great. So you decided about a year ago, 18 months ago, that you were going to really look at: could AI write high quality software? And so you've been trialing that by building a whole load of stuff, really difficult software to build and software that you can test and find out whether or not it works, so that you can actually validate whether or not the AI is producing good quality software. So tell us a little bit about some of the things that you've been building in order to try that out and to find out.
Martin Davidson (09:53) Yeah, the starting point of all this really was last September when GPT-5 came out, because that was a massive step change. And if I look at my GitHub repo, I can see the massive step change in terms of the commit history. So I started out trying to re-implement some of the stuff that I had implemented or been part of the team implementing previously in my career. And discovered that actually, it was really quite easy to implement some of these things these days, which was interesting in itself.
Anne Currie (10:22) Yeah. Things that have taken big teams of expensive software engineers, man years, literal years to produce.
Martin Davidson (10:32) Yes, absolutely. I mean, it's completely incredible. There was a thing over Christmas where I wanted to build a thing where Claude, because I live near the hills and I go out for a walk in the hills every morning, I thought it'd be great if Claude could phone me when it got stuck on something and I could talk to it. And I thought, how am I going to do that? And I ended up settling on using Linphone, which is a SIP client. But in order to do that, I needed a SIP stack on my end to talk to it.
Martin Davidson (11:02) I just wrote a SIP stack. Actually, I wrote three. Well, I didn't write them, Codex did.
Anne Currie (11:05) So what's a SIP stack?
Martin Davidson (11:08) So SIP is the low-level protocol that you use for telecoms. So there's a decent chance that this call is going over a SIP stack. So it's basically just the way that if you make a telephone call, your voice ends up tunneled over this protocol called SIP.
Anne Currie (11:16) This is probably running on top of a SIP stack somewhere, is it?
Martin Davidson (11:29) There's actually also RTP as well, which is for the actual packets of the media. SIP handles the signaling. So it says, I'd like to talk to Blah, and they agree which codecs to use. So codecs are how you convert, compress the audio and the video to send it over the network to use less bandwidth. So SIP stacks are the kind of thing that we do.
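To make the signaling step concrete, here is a rough sketch of the kind of message a SIP stack sends to start a call: an INVITE request whose body lists the codecs on offer. The body format is SDP, which is not named in the conversation, and every address, tag, port and the function name `build_invite` are placeholders made up for this illustration. A real stack also handles responses, retransmission, dialogs and much more.

```rust
/// Builds a minimal SIP INVITE whose SDP body offers the given audio codecs.
/// Each codec is (RTP payload type, "name/clock-rate[/channels]").
/// All addresses, tags and ports below are illustrative placeholders.
fn build_invite(from: &str, to: &str, codecs: &[(u8, &str)]) -> String {
    // The media line lists payload types; rtpmap lines say what each one means.
    let payload_types: Vec<String> = codecs.iter().map(|(pt, _)| pt.to_string()).collect();
    let mut sdp = String::new();
    sdp.push_str("v=0\r\n");
    sdp.push_str("o=- 0 0 IN IP4 192.0.2.1\r\n");
    sdp.push_str("s=-\r\n");
    sdp.push_str("c=IN IP4 192.0.2.1\r\n");
    sdp.push_str("t=0 0\r\n");
    sdp.push_str(&format!("m=audio 49170 RTP/AVP {}\r\n", payload_types.join(" ")));
    for (pt, name) in codecs {
        sdp.push_str(&format!("a=rtpmap:{} {}\r\n", pt, name));
    }

    // Headers, a blank line, then the body. Content-Length counts body bytes.
    format!(
        "INVITE sip:{to} SIP/2.0\r\n\
         Via: SIP/2.0/UDP 192.0.2.1:5060;branch=z9hG4bK-example\r\n\
         Max-Forwards: 70\r\n\
         From: <sip:{from}>;tag=example\r\n\
         To: <sip:{to}>\r\n\
         Call-ID: example-call-id@192.0.2.1\r\n\
         CSeq: 1 INVITE\r\n\
         Contact: <sip:{from}>\r\n\
         Content-Type: application/sdp\r\n\
         Content-Length: {len}\r\n\
         \r\n\
         {sdp}",
        to = to,
        from = from,
        len = sdp.len(),
        sdp = sdp
    )
}

fn main() {
    let invite = build_invite(
        "alice@example.com",
        "bob@example.com",
        &[(0, "PCMU/8000"), (111, "opus/48000/2")],
    );
    print!("{}", invite);
}
```

The far end answers with its own SDP listing the codecs it accepts, and that exchange is how the two sides agree on a codec before any RTP media flows.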
Anne Currie (11:48) It's not a trivial thing. Writing a SIP stack used to be a multi million pound project of many years of high quality software engineers. I think my husband Jon who's on this podcast as well, I think he's managed teams of SIP stack developers in the past. It's not a cheap thing to do.
Martin Davidson (12:11) No, it's an incredible thing. That was one example. I put together a DOS 386, actually now up to a Pentium, and Windows emulator so I can run all the old games that I used to want to play when I was younger. That's the kind of thing where there's DOSBox-X and there's PCem and there's 86Box. There's a whole variety of people who've written things like this before. But none of them is quite like the way that I've glued it together. And none of them is easy to configure and none of them is in Rust. This is the other thing, the language that I'm using for all these things is Rust, because Rust is a fantastic language that removes a whole class of bugs. I mean, the amount of hours I'm sure you've spent, and I've spent, chasing scribblers and memory leaks and concurrency issues, and they're just gone. You don't have to worry about them anymore. So.
Anne Currie (13:18) And this is Rust, not AI. So there we're talking about Rust in and of itself is a really good, you've told me, actually, I think I've mostly got this initially from you, but then repeated over and over again. Rust goes very well with AI for multiple reasons. Actually, Rust is just a very good modern language that has the efficiency of C, but because it stops you introducing a lot of bugs, the compiler will not let you write the code that does that. So the bugs that are really expensive to find are removed, taken off the table before you even get to the point of having a compiled executable.
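A small sketch of what "the compiler will not let you write the code that does that" looks like for one class of bug, data races. Handing the same plain `&mut usize` to several spawned threads is rejected at compile time, so shared mutable state has to be wrapped in something like `Arc<Mutex<_>>`. The function name `parallel_count` is made up for this illustration.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Increments one shared counter from several threads.
/// (`parallel_count` is a hypothetical example function.)
fn parallel_count(threads: usize, increments: usize) -> usize {
    // Arc gives shared ownership across threads; Mutex guards the mutation.
    let counter = Arc::new(Mutex::new(0usize));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..increments {
                    // The lock is held only for this one statement.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("{}", parallel_count(4, 1000));
}
```

Because every increment goes through the lock, the total is always the number of threads times the number of increments. The unsynchronized version, which in C would compile and occasionally lose updates, does not get as far as an executable.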
Martin Davidson (14:00) Yeah. And the trade off is it's much harder to write the code upfront, but it's got this fantastic compiler and ecosystem, and the compiler is really good at helping you figure out what's gone wrong. Claude or Codex or whatever compiles it, gets these really rich error messages, and then knows where to go and can iterate. And the agentic frameworks are a bit like a dog with a bone. They'll just happily go and go until the code compiles and the tests all run and everything's happy. We get tired as humans and we get fed up and we want to go and do something else. And we're more prone to giving up, I think. Rust, I can't remember what the stats are, but there's something I read that a Rust engineer in the US could command about one and a half times the salary of a C or a Python developer because it's hard and it's a very useful skill. So yeah, so what else have I done? I ported the Opus codec. It's the modern version of MP3. It's about half the bit rate for the same audio quality. It has a companion codec called SILK, which is really good for voice, which is a lower bit rate. So I ported both of those from the official C versions to Rust. So that's on crates.io.
Anne Currie (15:08) All right.
Anne Currie (15:26) So the story here is that it's not just you writing a couple of lines of Rust code using AI. This is your substack and all your journey at the moment is about actually rewriting or writing from scratch existing libraries that are good, that run out there and work and are not noddy. These are things that people have literally spent years doing and you know what good looks like because you have been part of the teams writing these things for years and years. You are a veteran software engineer with all the experience of writing things. You're a lot less white haired than I am. It's obviously being a veteran software engineer has taken a lot less toll on you.
Martin Davidson (16:13) How many here? That's the problem.
Anne Currie (16:15) You know what good looks like. You are not some enthusiastic junior person. You really painfully know what good looks like. We've been doing this for decades in the highest end of software that just has to work over and over and over again. And so to me, you are probably the only person who I'm judging all of my stuff at the moment on: gosh, actually, I trust Martin. And I trust your background and I trust you know what good looks like. And you're saying it looks pretty damn good.
Martin Davidson (16:53) I find it quite terrifying because it upends the whole world model that you've had for all these years. You think, how is this possible? But the other thing I think in all of this as well is that the way that I approach development has turned on its head slightly. You go back 30 years and testing was nowhere near as advanced as it is these days. Even then, a lot of the testing would happen as you wrote the code. So you'd write the code and test it. And by the time you got to the end, it's all working. Whereas I view it very differently now, because that's obviously not what's happening with Claude or Codex. And if you go down that path of just saying, go write some stuff, go and do this, a single short prompt, it's not going to work very well. So I like this idea of oracle-driven development because it has ODD, I think, a good acronym. There's something strange about it. Never mind. This is why I do this. I'm not on the stage. Yeah. But when you're coming up with these things, you start with the oracles. How are you going to test this?
Anne Currie (18:00) So oracle is a, you've leaped ahead to something that's quite a niche concept within AI development at the moment, which is the word oracle, which is confusingly not the same as the company Oracle, it's more like the Oracle of Delphi. It's a way of saying what good looks like, isn't it? Is that what it is?
Martin Davidson (18:37) Yeah, exactly. It's recognizing that it is a way of defining what good looks like, but it has some imperfections. No oracle is perfect. So it's why you can't have a single oracle generally. That would be a bad idea because you need multiple of them to cover different dimensions.
Anne Currie (18:56) Well, the whole of classical literature is based on the idea that the oracle did not really provide you very useful information. So hopefully a Greek tragedy will not result from the application of ODD, oracle-driven development.
Martin Davidson (19:02) Yeah, let's cross our fingers. So you start with these oracles.
Anne Currie (19:13) So give me an example of an oracle, something that you would say is the definition of what the good looks like.
Martin Davidson (19:16) Yeah, so. But if I talk about the PicoM, the Raspberry Pi PicoM emulator, where I started from was, well, one oracle is the data sheet for the chip. That defines everything that the chip can do. Another oracle is what I've got sitting on my desk here, which is a debug probe, and this is the Raspberry Pi Pico that's being tested. So I've got a hardware oracle there. And I can actually go and test my code and run the same thing on the real thing and see what happens.
Anne Currie (19:26) Yeah.
Martin Davidson (19:48) QEMU is an emulator which emulates the cores, the ARM cores or the RISC-V cores that are on these things. So you can emulate the instruction set and check that. You've got these sets of oracles that you're going through. And once you've got those in place, then I spend a lot of time getting the design right and doing a lot of design. We'll go through the design, me and an agent team.
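The idea of several imperfect oracles covering different dimensions can be sketched in a few lines. This is only an illustration under stated assumptions: `add8` stands in for one emulated instruction, `SpecTable` for worked examples copied from a data sheet, and `Reference` for an independent source of truth such as real hardware or QEMU. None of these names come from Martin's code.

```rust
/// Unit under test: an 8-bit add that wraps and reports carry.
fn add8(a: u8, b: u8) -> (u8, bool) {
    a.overflowing_add(b)
}

/// An oracle judges one observed result and explains any disagreement.
trait Oracle {
    fn name(&self) -> &'static str;
    fn check(&self, a: u8, b: u8, got: (u8, bool)) -> Result<(), String>;
}

/// Stand-in for a data sheet: (a, b, expected sum, expected carry) examples.
struct SpecTable(Vec<(u8, u8, u8, bool)>);

/// Stand-in for hardware or QEMU: an independent reference computation.
struct Reference;

impl Oracle for SpecTable {
    fn name(&self) -> &'static str {
        "spec table"
    }
    fn check(&self, a: u8, b: u8, got: (u8, bool)) -> Result<(), String> {
        for &(ta, tb, sum, carry) in &self.0 {
            if (ta, tb) == (a, b) && got != (sum, carry) {
                return Err(format!("{}+{}: expected {:?}, got {:?}", a, b, (sum, carry), got));
            }
        }
        Ok(()) // Inputs the table does not mention are outside this oracle's view.
    }
}

impl Oracle for Reference {
    fn name(&self) -> &'static str {
        "reference model"
    }
    fn check(&self, a: u8, b: u8, got: (u8, bool)) -> Result<(), String> {
        let wide = a as u16 + b as u16;
        let expected = ((wide & 0xff) as u8, wide > 0xff);
        if got == expected {
            Ok(())
        } else {
            Err(format!("{}+{}: expected {:?}, got {:?}", a, b, expected, got))
        }
    }
}

/// Runs every input past every oracle and collects the disagreements.
fn disagreements(oracles: &[Box<dyn Oracle>], inputs: &[(u8, u8)]) -> Vec<String> {
    let mut out = Vec::new();
    for &(a, b) in inputs {
        let got = add8(a, b);
        for oracle in oracles {
            if let Err(why) = oracle.check(a, b, got) {
                out.push(format!("[{}] {}", oracle.name(), why));
            }
        }
    }
    out
}

fn main() {
    let oracles: Vec<Box<dyn Oracle>> = vec![
        Box::new(SpecTable(vec![(255, 1, 0, true), (1, 2, 3, false)])),
        Box::new(Reference),
    ];
    for line in disagreements(&oracles, &[(255, 1), (1, 2), (200, 100)]) {
        println!("{}", line);
    }
}
```

The spec table only has an opinion on the cases it lists, while the reference model covers every input but could share a misunderstanding with the code under test. That is the sense in which no single oracle is enough.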
Anne Currie (20:15) So what you're saying there is that what you've learned the hard way, through trial and error over the past year when you've been building this ridiculous number of things like SIP stacks and Windows emulators, all kinds of things, the tough lesson you've learned is that almost the first thing you have to define is: what are you going to say good looks like? Before you do anything else, you have to say, how am I going to know if this AI is doing the right thing? Because I can't know what it's doing. So I'm going to have to examine it from the outside and I need to understand up front how I'm going to know whether it was right or not. And after that, before I do anything else, if you're working on something where there's no way of telling or it's very hard to tell what good looks like, it's probably not a great AI option. What do you think about that?
Martin Davidson (21:13) I think for any software project, you have to have some idea of what good looks like. And I think sometimes it's harder and sometimes it's easier. But as a human, you're going to do a software project. At some stage, you've got to decide whether it's good enough to ship or you're done. So you still need to know what your definition of good is at that point. The advantage, and I think, in the old human world, you would start the project and worry about that down the road, which was always an error. Actually, better projects were the ones that thought about how you're going to test it and define what good looked like upfront. But I think in the AI world, that becomes utterly critical because otherwise you're basically saying to the engineering team of Claudes or whatever, go and create this thing. I'm not going to tell you what I really want. I'm not going to define what good looks like or your success criteria. Just go and build something. And then they don't know. Whereas once you've defined something, the AIs are tenacious, and they will be able to iterate against that and just keep going until they've solved it. And so yeah, you get your oracles in place. You have a really good design. And then you launch your agents.
Talking about design, design changes as well, because there's a kind of reverse Conway thing going on, because the old world was that your code mirrored your organizational structure. Well, in the new world, your organizational structure is fixed. It's like lots of agents and you want to parallelize. So then you need to say, what does that mean my code needs to look like so that I can achieve that? So for example, I will decompose things into more crates in Rust, more decomposition, more seams where things can work in parallel than there would have been in the old world, because then I can parallelize more. And what I want is I want the agents to be able to work without tripping over each other. Because you can either run them all in the same repo, and then they will trip over each other, and they have good old fights of reverting each other's work. Or you can run them all in worktrees, and that's great because they don't trip over each other. And then you have the merges. And merges are really difficult. And the AI will sometimes be a bit slapdash with the merge. It's like, oh, this is a bit difficult to do. I'll just throw all that away. I don't want to merge again. Yeah.
Anne Currie (23:25) Ha. They are like a lazy developer.
Martin Davidson (23:31) Yeah, sometimes. Claude 7 has got this habit at the moment of saying, I've done 90% of the task, would you like me to schedule a follow up to finish it off in a week's time. No, I'd like you to finish 100% of the task now. It told me the other day, it was too late in the evening to keep doing things because it was tired and it might make a mistake. Yeah, I don't believe you.
Anne Currie (24:00) Yeah, it is bizarre, isn't it? It is quite like managing a strange... Well, I mean, it's learned from what everybody's saying, but it's quite interesting. What you're saying there, which is not something I've heard before, or I've not thought about it in this way, is the need to break up tasks so they're parallelizable, so you can run more well-defined tasks using lots of different agents all at the same time. In some way, that's an oddly coincidental parallel to the difference between GPUs and CPUs. It's like GPUs are smaller, simpler, more well-defined, more parallelizable tasks.
Martin Davidson (24:38) Yeah, you're right. It's a similar kind of idea, isn't it? Maybe it's just like Amdahl and multi-cores on CPUs, how do you parallelize work across lots of cores? In some ways it's just like when you manage a team and you're thinking: person A can do that and person B can do this and C, and this is how it will all fit together. And it just goes a lot faster. And you can go as wide as you want, because the restriction on the width of the team is me, not the AIs. They can parallelize it as much as they want. So I've got myself into some pickles at some points where, like on the DOS emulator, I would have three separate machines all working on different bits of it at the same time. And then there was the merge, this terrible, terrible merge. You just think, what have I done? I've done a bad thing. I made my life really difficult for a day. But yeah, so you've got the change in oracle-driven development, you've got the change in the way that you do design. And then I view UTs now, UTs still have value, but I view them as locks. So once you've got the UT in place, you've got your code and everything is good, you get the UTs then, and the UTs lock the main code. So that if anything moves around in the main code, anything gets refactored or whatever, you find it out through UT. UT is now your oracle for saying we're staying consistent, we're not changing. Claude and Codex still have a propensity to refactor things from time to time. I don't really want you to do that. So I'd like to know if you've done that. And that's where the UTs feed.
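A "UT as lock" might look something like the sketch below: the tests are not hunting for bugs, they record what the code does today, edge cases included, so that an unrequested refactor which changes behaviour fails loudly. The function `parse_pair` and its cases are invented for this illustration.

```rust
/// Existing behaviour to lock down: parse "key=value", trimming spaces,
/// rejecting empty keys. (`parse_pair` is a hypothetical example.)
fn parse_pair(line: &str) -> Option<(String, String)> {
    let (key, value) = line.split_once('=')?;
    let key = key.trim();
    if key.is_empty() {
        return None;
    }
    Some((key.to_string(), value.trim().to_string()))
}

fn main() {
    println!("{:?}", parse_pair(" retries = 3 "));
}

#[cfg(test)]
mod lock_tests {
    use super::parse_pair;

    fn pair(k: &str, v: &str) -> Option<(String, String)> {
        Some((k.to_string(), v.to_string()))
    }

    /// Each assertion pins one currently observed behaviour in place.
    #[test]
    fn behaviour_is_locked() {
        assert_eq!(parse_pair("a=b"), pair("a", "b"));
        assert_eq!(parse_pair(" a = b "), pair("a", "b"));
        assert_eq!(parse_pair("a=b=c"), pair("a", "b=c")); // splits on the first '=' only
        assert_eq!(parse_pair("a="), pair("a", "")); // empty values are allowed
        assert_eq!(parse_pair("=b"), None); // empty keys are not
        assert_eq!(parse_pair("no separator"), None);
    }
}
```

If an agent later "tidies" the parser to split on the last `=` or to reject empty values, `cargo test` reports it straight away, which is the consistency check described above.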
Anne Currie (26:26) So basically, the value is still there that you want to be testing small sub chunks as well as system wide tests. It's not that different in some ways from having developers do it. Developers who are refactoring the code all the time will break things. And you just need to know that they're broken so that you can fix them again.
Martin Davidson (26:45) Yeah. And that's fine. It's absolutely fine to break things. You just need to know so you can fix them. In the old world where you never change anything and you're always making these little pragmatic fixes. We've all been there. You're near the end of release and you think, I really should refactor this, but I'm far too scared. I'll just make this terrible fix. And then it's just tech debt that gets carried forward and the code gets bad to use. It's not a good place to be. And then once you're through all that, or maybe in parallel with that, you start fuzz testing. So.
Anne Currie (27:08) Absolutely. Yeah. Okay.
Martin Davidson (27:22) And there's two types of fuzz. There's fuzz and there's differential fuzz. So you're either just naturally fuzzing your own code, or you can run differential fuzzing where you're comparing two different things, an oracle and your code. So you can take the fuzz stream of stuff you're trying and run it on the hardware and run it on your emulator and you can see what happens.
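A toy version of differential fuzzing, as a sketch only: the same stream of pseudo-random inputs goes to an obviously correct reference (standing in for the hardware oracle) and to an optimized implementation (standing in for the emulator), and the first disagreement is reported. All names here are invented, and the tiny xorshift generator is just a dependency-free stand-in for a real fuzzing engine.

```rust
/// Reference oracle: count set bits the slow, obvious way.
fn popcount_reference(mut x: u32) -> u32 {
    let mut n = 0;
    while x != 0 {
        n += x & 1;
        x >>= 1;
    }
    n
}

/// Implementation under test: the branch-free bit-twiddling version.
fn popcount_fast(x: u32) -> u32 {
    let x = x - ((x >> 1) & 0x5555_5555);
    let x = (x & 0x3333_3333) + ((x >> 2) & 0x3333_3333);
    let x = (x + (x >> 4)) & 0x0f0f_0f0f;
    x.wrapping_mul(0x0101_0101) >> 24
}

/// Minimal xorshift32 generator so the sketch needs no external crates.
struct XorShift32(u32);

impl XorShift32 {
    fn next_u32(&mut self) -> u32 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 17;
        x ^= x << 5;
        self.0 = x;
        x
    }
}

/// Feeds the same inputs to both sides; returns the first input they disagree on.
fn differential_fuzz(seed: u32, iterations: u32) -> Option<u32> {
    let mut rng = XorShift32(seed.max(1)); // xorshift must not start at zero
    for _ in 0..iterations {
        let input = rng.next_u32();
        if popcount_fast(input) != popcount_reference(input) {
            return Some(input);
        }
    }
    None
}

fn main() {
    match differential_fuzz(0xdead_beef, 100_000) {
        Some(input) => println!("mismatch on input {:#010x}", input),
        None => println!("no mismatch found"),
    }
}
```

A failing input is immediately a reproducible test case, which is a large part of what makes the technique useful against a hardware or emulator oracle.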
Anne Currie (27:40) Yeah. So fuzz testing, it's a well known concept in security, which is that you put all the possible things that could go through an interface through it to see what happens. Because in the old days, particularly with C, it was about trying to make sure that there weren't any scribblers in there, which could then be exploited. So you're doing the same thing. You're just saying, look, with AI, you're doing AI driven fuzz testing presumably.
Martin Davidson (28:12) Yeah, absolutely. And that finds a whole pile of stuff. You've got mutation testing. So in Rust land, you've got cargo-mutants. Mutation testing is really about testing how good your UTs are.
Martin Davidson (28:33) So you change your code and you see if the tests find the bug that you've introduced. And if the tests don't find the bug that you've introduced, you're thinking, my tests aren't very good, so I need to improve my tests. So mutation testing is cool for that. There is obviously coverage. And coverage with llvm-cov in Rust comes in a variety of different forms. There's the traditional line coverage, which is not terrible but not brilliant, then region, function and branch coverage. And branch coverage, I think, is the most interesting one because you're looking to see if each branch is hit. And the follow on from branch is a thing called MC/DC testing, multi-condition decision checking, which is where you're looking at all the different things that go into a conditional. So if you've got A is less than five and B is greater than six, you test all the different combinations of that and you check what happens through the code, through all the paths through that.
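A sketch tying the two ideas together, using the condition from the conversation. MC/DC is usually expanded as modified condition/decision coverage, and in that usual definition each condition has to be shown to change the outcome on its own, which for two conditions takes three test vectors rather than all four combinations. The mutant below is written by hand to show what a mutation tool generates automatically. All names are invented for this illustration.

```rust
/// Decision under test, taken from the example in the conversation.
fn decision(a: i32, b: i32) -> bool {
    a < 5 && b > 6
}

/// A hand-written mutant: `<` has become `<=`.
fn decision_mutant(a: i32, b: i32) -> bool {
    a <= 5 && b > 6
}

/// Three vectors give MC/DC for two conditions: starting from "both true",
/// each condition is flipped alone and the outcome flips with it.
const MCDC_VECTORS: [(i32, i32, bool); 3] = [
    (4, 7, true),  // a < 5 true,  b > 6 true
    (5, 7, false), // only a < 5 flipped, outcome changes
    (4, 6, false), // only b > 6 flipped, outcome changes
];

/// Returns true if the function gives the expected outcome on every vector.
fn passes(f: fn(i32, i32) -> bool) -> bool {
    MCDC_VECTORS.iter().all(|&(a, b, expected)| f(a, b) == expected)
}

fn main() {
    println!("original passes: {}", passes(decision));
    println!("mutant passes:   {}", passes(decision_mutant));
}
```

Because the vectors sit on the boundaries (5 and 6), the mutant disagrees on the second one and is caught. A suite that only tried values such as 0 and 100 would pass both versions, which is exactly the weakness mutation testing is meant to expose.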
Anne Currie (29:22) Yeah, because that, we all know that's a big source of bugs. It always used to be a huge source of bugs.
Martin Davidson (29:22) And MC/DC testing is... Yeah. And it's really expensive to do. It is the kind of thing you would never do. We saved it for aircraft control systems and things, but we're knocking on the door of that becoming a possibility. And there's a whole lot of work going on in the Rust community to add MC/DC testing into the compiler. So suddenly you're in this world where all these tests are suddenly available to you and
Martin Davidson (29:57) You've got to watch that you don't end up with theatrical UTs, which are just hitting lines for the sake of it, not actually testing anything. But equally, you ask yourself, well, if I've got 5% of that, but I've got 300 times as many tests as I had before, now I'm probably OK with that. Yeah.
Anne Currie (30:15) Then does it matter? Yeah, it's an interesting one. So fundamentally, I remember there was a paper that came out a few months ago now from OpenAI saying that they'd implemented a C compiler in Rust, a new C compiler in Rust. And it was 100,000 lines of code. And there was a lot of debate in my household about how many lines of code that was and how many unit test lines of code that would be. And I remember talking to Jon who, for 30 years, has been managing teams writing the highest quality software, and he would say, well, it will be at least 10x the number of lines of test code to the actual number of lines of code. And then he said, but I wonder what Martin would say, what ratio he would use now that he's doing his AI trialing. And I asked you, and you said, it's 100, it's a 100x difference. I'd have 100 lines of test code using AI for every line of actual code.
Martin Davidson (31:25)
It varies, I find. I think that was true when I said that at the time. So the emulator, the ratio is ridiculous because there's thousands of opcode tests. So that's really easy to build those out. Some of the more recent stuff, I've kind of found that actually, I think increasingly the oracles are more useful, the fuzz testing and the differential testing, they are more useful in finding the interesting things than just building endless UTs. I do have, I built one of the very first things I built, I don't know, probably back when o1 came out, which is like nearly a year and a half ago. Is that right? Is that all it is? That's probably all it is. Wow. Feels like a lot of time ago. But.
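As a rough illustration of the differential testing with random inputs that Martin describes, here is a minimal Python sketch. A slow but obviously correct reference acts as the oracle, and a second implementation is compared against it on randomly generated inputs. The bit-counting functions and the name `differential_test` are assumptions chosen for the example, not anything from Martin's repos.

```python
import random

def reference_popcount(x: int) -> int:
    # Oracle: slow but obviously correct.
    return bin(x).count("1")

def candidate_popcount(x: int) -> int:
    # Implementation under test (Kernighan's bit-clearing loop).
    count = 0
    while x:
        x &= x - 1  # clears the lowest set bit
        count += 1
    return count

def differential_test(candidate, reference, trials: int = 10_000, seed: int = 0):
    """Return the first input on which the two implementations disagree, or None."""
    rng = random.Random(seed)  # fixed seed so a failure can be reproduced
    for _ in range(trials):
        x = rng.getrandbits(64)
        if candidate(x) != reference(x):
            return x
    return None

mismatch = differential_test(candidate_popcount, reference_popcount)
print("no mismatch found" if mismatch is None else f"mismatch on input {mismatch}")
```

The point of the design is that nobody has to write expected values by hand: the oracle supplies them, so the number of cases is limited only by machine time.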
Anne Currie (31:58)
Yeah.
Yeah, actually, I'm going to ask you a question because it's quite dense amounts of stuff. So I'm trying to process some of the things you say. But when you say that your fuzz testing and your mutation testing is now more valuable than your UTs, my guess is that's because you've now kind of maxed out. You're getting diminishing returns on how many unit test lines you have per line of underlying code. But that's presumably at a level which is far, far higher than we would normally be expecting to see in code. I mean, for high quality code, maybe 10 to one, Jon was saying a few years back. Now you're talking about AI code, because you can, you're up to 100. Are you saying that like 100 to 1 is kind of good, and then at 200 to 1 you start to see diminishing returns?
Martin Davidson (33:13)
I think it very much depends on the app. So the Pico emulator, I think, is maybe somewhere between one to one and two to one. It's not that hard. But I think I've increasingly moved away from thinking UTs are a way of finding bugs, because they're not a great way of finding bugs, I find. But they're a great way of locking the current position.
Anne Currie (33:23)
Really? Yeah.
They're a good way of identifying regressions.
Martin Davidson (33:41)
Yeah, exactly. That's what they are. They are a regression tool where really the differential testing or the fuzz testing or a kind of higher-level functional testing or another thing which we haven't talked about is kind of exploratory testing. So you can set Claude up or Codex up. Some of my repos have a skill in them and we've
Anne Currie (33:59)
Mm.
Martin Davidson (34:09)
changed the API interface. I've got this app, a Tauri app, which is TypeScript on the front end, Rust in the back end. And the whole thing is set up so that Codex or Claude can drive it. And they've got the skill alongside, which tells them what they can do and whatnot. And they'll just go and explore it. You can sit and watch them. They'll try and do, there's a database behind it. So they'll try and do SQL prompt injections. And there's all this stuff going on. And you start watching it. That's quite interesting. I never thought of testing that. Well, that's cool.
And I leave them running overnight and they produce a nice HTML report with a whole lot of pictures in it for me the next morning. And I have a look through that and I think, there are some things here we need to fix and there's some interesting stuff and I'm glad that worked. And that's another thing I've got on all of this as well is, it's like building the infrastructure around it so that you can be more like the team lead. You get the AI tools to produce the reports for you so that when you want to, you can go and look and see what's been going on and then decide what, if anything, you need to do or just get a warm fuzzy feeling that, yeah, this all seems to be working quite well.
Anne Currie (35:14)
So it's interesting. If I'm understanding you correctly here, we're talking about fundamentally, we've leaped from the initially whatever it was a year or so, year, two years ago, when you first started this, you really just didn't know whether AI could be used to write good quality code. And this is all about good quality code that has to work. This is production quality and scale production quality code. Could it do it?
and you did a whole load of playing around, worked with it. I know in the early days, because I think nearly a year ago now, we had a coffee and were talking about this and where you were on it. In the early days, there was a lot of difference between the models. Some of the models were better, some of the models were worse, but actually even a year ago, you were saying they've all started to reach a kind of minimum bar for writing pretty damn decent quality code. And then after that, you moved on to, well, how do you define it? How do you ensure the quality? How do you test it? And it sounds like how do you test it and then lock in the good stuff that you've got? Because otherwise, they have a tendency to regress. Maybe they're your kind of slightly dangerous maintenance engineers.
Martin Davidson (36:31)
I think that's perhaps a little bit harsh. It's more, I guess there's a paranoia on my side of like, we're in a good position, I want to lock that. It's like, I don't know, I would be very good on these kinds of things, these quiz shows, where it's: do you want to lock in your winnings? Yeah. I'm happy to answer a couple of questions, but I want to lock in what I've got. I don't want to keep gambling.
Martin Davidson (36:57)
Because it's that thing with software, isn't it? In the old days, when you did a large refactor of whatever, you knew there was going to be pain coming. It was good, and would be better in the long run, and it might be a necessary evil that you had to go through. But you knew you were going to break some things.
I think I just don't like that kind of surprise. And if there's a way of easily avoiding it and it's relatively cheap, why wouldn't I? I mean, it's interesting. If you go back on the journey two years ago, I guess, I've been doing this, trying to get AI to write code for ages, even starting with the old Copilot stuff, which seems so archaic these days anyway.
Anne Currie (37:26)
Mm. Mm.
Martin Davidson (37:44)
You've gone from the situation of it providing kind of suggestions and doing a bit of auto completing things. And I think the thing I've always tried is, I think I'm greedy. I just want to keep saying, get the biggest bite that I can. What's the biggest thing I can ask you to build? And will you be able to do it? And can we make that work? I mean, if
Anne Currie (38:07)
and make it actually have something that genuinely works and is production ready. That is your mindset on this, isn't it?
Martin Davidson (38:14)
Yeah, and you land it. And I think things like the Opus Codec, that has landed. The emulator is still in a lot of flux because I keep wanting to add more things. But it reliably plays the games I want to play. So from my point of view, I guess it's very high quality because it does what I want it to do. It's like a lot of these, I guess the thing with all of these is
One of the things I was interested in was sort of the 80-20 thing, right? With AI, it's really easy to get going. And it is harder to actually get it finished. But can you put the agentic framework around it so that it can finish it without me needing to be involved? And in addition to being greedy, I'm also quite lazy. I don't want to be sitting there babysitting and telling it what to do. I want it to do it itself. I want to go to bed and come back in the morning and be done.
That was what management was all about, wasn't it? You have a
Correct, I think. For me, it never worked like that. I'm not sure I would ever have been brave enough to say, go do this thing. But it's that same kind of, if you can find the ways to put the scaffolding around the AI such that it can just go and run.
Martin Davidson (39:33)
Edmond, who's one of the other founders of Tollens, one of the observations he made very early on was that it's not like with humans. With humans, if you've got a new person on the team, you can influence them. You can help them learn and they can do things differently. The AI doesn't. You can't change it. You've got to learn. It's on you to adapt. And that's an interesting thing.
Anne Currie (39:50)
Yeah.
Martin Davidson (39:56)
Because I think it's very, to me sometimes it feels like it's a grassy field with lots of ruts in it. And there are these ruts. And if you go in the rut that the AI has provided, life is fine. Life is good. When you want to go somewhere else, life can be a bit tricky.
And I see that with people who say, it doesn't write code the way I would like it to be written. But it writes pretty decent code. At the end of the day, how many people have sat and looked through a .cod list of the assembler that comes out of the compiler? Of all the times I've done that, I've looked at it and thought, this is quite rubbish. It's not brilliant. But clearly, it's absolutely fine. And nobody cares anymore.
Anne Currie (40:38)
Yeah.
Martin Davidson (40:39)
It's good enough. And I think that's the world we're moving to with code, that you just won't need to look at it. I mean, the Opus Codec, I didn't look at any of the code for that. I have no idea what it's like. It could all be gobbledegook, but I have a thing and it works. And I've encoded a lot of music. All the music I listen to while I'm working, I was encoding with my codec. This is kind of weird. And it passes all the official tests. There's an IETF, I can't remember. There's some.
Anne Currie (41:14)
I'm going to stop there and say,
I really love that. That you're listening to music that you're encoding with your own codec. Yeah, I wrote that codec.
Martin Davidson (41:21)
Yeah. AI-generated music, encoded with my codec. Yeah. I know it's bonkers, isn't it? I mean, it just doesn't.
There's two things about it that are weird. One is, first of all, how quickly it seems normal and secondly.
Anne Currie (41:39)
Is it just normal to you?
Martin Davidson (41:42)
Yeah, but I think as humans we're very quick at adapting and then we're very quick at getting, and I think the second thing is when you step back you realize just how strange it is.
Anne Currie (41:45)
Yeah, we are, that's true.
Mmm.
Martin Davidson (41:58)
I don't think I ever expected to be at this point, and it's almost like for me is someone I was reading over the weekend talked about software brain. There's a set of people who just want to build things and. And I feel a bit like that. I feel it's a bit like a drug. I come and I sit here and I'm building stuff and I'm really happy and time just goes.
Anne Currie (42:11)
Mm.
Martin Davidson (42:24)
Since last September, all I've been doing is building stuff and it's been awesome. Absolutely amazing.
Anne Currie (42:30)
So that's interesting. You still feel like, even though you're no longer typing the code or even looking at the code, no code review, no looking at the code, all you're doing is making sure that the test framework is in place to push it in the right direction and make sure that it actually reaches the goal that you wanted it to achieve, you still feel like you built that code. You don't feel like you aren't the owner of the product, the work product there, from your language.
Martin Davidson (43:00)
Yeah, I feel like we kind of did it together. I think one of the things that helps me is I was a team lead and a kind of manager for a very long time. And this is like when you first move into that role where your team don't do things the way you would do them. And you realize you don't have time to go and check everything and have everything converted into your way you would like it.
Anne Currie (43:09)
No, yeah.
Martin Davidson (43:24)
And you end up developing proxies and ways of, a sense of smell for what's good and what's not, and which stones to look under, all that kind of stuff. And it's exactly the same when you're working with all these agents. It's like, it's my team. They'll go and do things. I trust them to do things. I need to provide them with guidance to make sure that they're doing things and following the right processes and things. Exactly. Absolutely. Yeah. Yeah. and we'll have
Anne Currie (43:37)
Mm.
Yeah. Yeah. Trust but verify.
Martin Davidson (43:51)
And I think the thing that I find about it as well is you learn so much. Like on the emulator projects, I've learned nuances and weirdnesses about x86 and how you do performance optimization. On the emulator thing, the initial naive implementation was there's two cores and a whole lot of peripherals. So you execute one core, you execute the code for the second core, you step the peripherals and just loop on that.
And you realize that doesn't scale very well. So you kind of then think, well, I've got like 20 cores on my PC. Can I not put them on different things? And you're like, oh, OK, right. So we'll put them on different cores. And then you think, well, how do you arrange it? And you end up with a barrier. So they all run for a bit. And then they all wait for everyone to arrive at the barrier. You sync, and you run again. And the cost of the barrier, well, how expensive is the barrier? And the barrier turns out to be, in what we've got, about 400 nanoseconds, just to go through and get everybody synced. And that's a long time. It's not a long time, but it is a long time in this context.
And then you think, right, well, you've got all these cores, and what do you do? So someone arrives at the barrier early. Should they sleep? That's what you naively think. But you can't sleep, because the resume time for a sleep in Windows is measured in microseconds. So the barrier is only 400 nanoseconds, and you've run for, I don't know, 700, 800 nanoseconds. So if you're going to sleep for like 10 microseconds, that's terrible, you can't do that, so you just have to spin. You discover things like that and you think, that's really rather interesting. It makes sense, but I wouldn't have learned that otherwise.
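The numbers Martin quotes can be put into a small worked calculation. This Python sketch is a deliberately simplified model and an assumption on my part, not a measurement from his emulator: it supposes every cycle pays for the run slice, the barrier, and (if a core slept) the wake-up delay, and asks what share of the cycle is useful work. The function name `useful_fraction` is hypothetical.

```python
def useful_fraction(run_ns: float, barrier_ns: float, wake_ns: float = 0.0) -> float:
    """Share of each cycle spent running emulated code, in a simplified model
    where every cycle pays the run slice, the barrier, and any wake-up delay."""
    return run_ns / (run_ns + barrier_ns + wake_ns)

RUN_NS = 800            # "700, 800 nanoseconds" of work between barriers
BARRIER_NS = 400        # barrier cost quoted in the conversation
SLEEP_WAKE_NS = 10_000  # "like 10 microseconds" to resume from a sleep

spin = useful_fraction(RUN_NS, BARRIER_NS)
sleep = useful_fraction(RUN_NS, BARRIER_NS, SLEEP_WAKE_NS)
print(f"spinning: {spin:.0%} useful, sleeping: {sleep:.0%} useful")
```

Under these assumptions spinning keeps about two thirds of each cycle useful, while a 10 microsecond wake-up would drop that to roughly 7%, which is why the waiting core burns CPU instead of sleeping.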
Anne Currie (45:26)
Yeah.
So we're talking about, you're talking about seeing things differently.
Martin Davidson (45:31)
Seeing, yeah, seeing, in terms of how the software world is changing, yeah, it's a very different world we're in now. I think the other interesting question is where does all this go? Where does it end up?
Anne Currie (45:46)
Mmm.
Martin Davidson (45:49)
And I feel at the moment, there's like, even if it stopped now, and AI made no more progress as of today, there's still huge catch up for the industry. And there's a whole separate topic there of, is it actually possible for legacy software companies to pivot to this?
Anne Currie (46:10)
Mm.
Martin Davidson (46:11)
I was talking to some folk yesterday and we were talking about a team and the 30 people in this team and it sort of dawned on us that, if you can make people... So the data from the labs is coming at the moment. Anthropic and Google will claim that their engineers are between 10 and 100 times more productive. So let's go to the low end. So if you were a team of 30 and you made them 10 times more productive, well, then you only need three people. You don't need 30, you need three.
But actually, if you go back and you think of those 30 people, the general rule of thumb is something like communication overhead is 30 percent, keeping everybody in sync and everybody talking. So actually, of your team of 30 people, what's that? Nine of them are just doing communication overhead. So actually, you've got roughly 20 people and we make them 10 times more productive. So you only need two, because you don't have the communication overhead anymore. You've got two people rather than 30.
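The back-of-the-envelope sum above can be written out explicitly. This is a minimal Python sketch of that arithmetic under Martin's stated assumptions (a team of 30, 30 percent communication overhead, a 10x productivity gain). The function `people_needed` is hypothetical and the model is intentionally crude. Note that 30 percent of 30 is 9, which leaves 21 productive heads, a figure Martin rounds to 20.

```python
def people_needed(team_size: int, overhead_pct: int, speedup: int) -> float:
    """Heads needed if communication overhead vanishes and each remaining
    person is `speedup` times more productive (a deliberately crude model)."""
    productive_heads = team_size * (100 - overhead_pct) / 100
    return productive_heads / speedup

naive = 30 / 10                       # ignoring overhead: 3 people
adjusted = people_needed(30, 30, 10)  # 21 productive heads divided by 10
print(f"naive: {naive:.1f}, overhead-adjusted: {adjusted:.1f}")
```

Either way the answer lands at two or three people, which is the point being made: the reduction comes from the productivity multiplier and from removing the coordination cost together.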
How does that play out? The middle managers who are above that, how does a middle manager rank themselves? Well, they rank themselves based on the size of their team, the number of people that report to them generally and what the boss thinks and peers think but the size of their team. So where's the incentive for them to adopt AI really wholeheartedly? They're doing themselves out of a job, and you as an IC, where's the incentive for you as well? Because you are
Anne Currie (47:09)
Yes.
Yeah.
Martin Davidson (47:36)
potentially getting rid of all your colleagues. We're social creatures, humans. It's nice having a team, working with other people. There's no team I've been on where I thought, you know what? I'm going to learn a skill that means all my colleagues can go away. It doesn't work like that. And then you go to who gains the spoils from this. Because at the one end, a developer becomes 10 times more productive.
Anne Currie (47:40)
Mm.
Yep.
No, absolutely not.
Martin Davidson (48:05)
And all the benefit goes to the company. The company gets all the benefit of that, and the developer gets no more money. At the other end, you've got all the benefit goes to the individual, because they're 10 times more productive. They do their work on a Monday morning, and kick back for the rest of the week. But what happens in the middle? And maybe you end up in the middle, but if you become five times more productive, is your company going to pay you five times more? They're not.
Anne Currie (48:07)
Mm. Yeah.
Yeah, it's not. Well,
it depends, doesn't it? If you have a unique skill that they can't get elsewhere, yeah, if it's your skill that is making them five, making you five times more productive, and there's nobody else has that skill. But it doesn't sound like that's the case here.
Martin Davidson (48:46)
Well, I think what's interesting here is that there's a very different set of skills that you need to go fast with AI from the ones you needed to write code. I think management experience really helps. I think you need to be good at juggling plates and organization. It sort of started to dawn on me that all those hours I spent playing Civilization might not be wasted.
Anne Currie (48:58)
Hmm.
Alright!
Martin Davidson (49:18)
All right. It's great.
I do have a habit of rewriting history in a positive way, my wife tells me. Yeah, that's probably the ultimate case. But it's
I think the skills that I use now are the desire to learn, I think, and architecture and design skills.
Martin Davidson (49:44)
And I'm not a great Rust developer. I was too late to the game to be a great Rust developer. I can write C, but I don't think at this point I'm ever going to write any more C code in my life. Certainly not professionally, only maybe just for a bit of fun.
Anne Currie (50:01)
Yeah, for a bit of fun. C will be written.
Martin Davidson (50:12)
So if you're someone who liked the art of writing code, that's a difficult position, I think, to find yourself in. Whereas for me, it was always a kind of means to an end. I like building a thing. And that was kind of why leading a team was good. Exactly. You lead a team, and you can build bigger things and faster. It's great. And this is kind of the ultimate of that, in a way.
Anne Currie (50:16)
Mm.
Yeah. Yeah. Me too. Yeah.
Martin Davidson (50:39)
It is in the back of my mind that you create something like Tollens and, and, you sort of, am I doing myself out of my potential future? But I don't, that's not a reason for not doing something and I think
learning and adapting and things is a thing that I find really interesting. So it doesn't worry me.
Yeah, but I still wonder where all this ends up. And I think I'm old enough and I think we're both old enough that we remember companies like DEC and Sun and ICI and the huge disruption that happened in the early 90s and then again in the early 2000s where there were a lot of layoffs in the software industry. And if you've only been involved in it for the past 20 years, you probably think it's a wonderfully stable industry that just only goes in one direction. I do wonder what's going to happen. Because, going back to what we were saying, I think it's nigh on impossible for legacy companies to pivot. You look at what happened, Jack Dorsey and Block, they sacked half their staff at the start of this year. Go and ask Claude, give it the organizational structure of a company with a thousand people, an engineering company with a thousand people in it, and ask what it looks like AI first, and Claude will give you like 200, 250 people back.
Anne Currie (52:16)
Yeah.
Yeah, it's a strange thing, isn't it? There's been much speculation amongst folk, older, more experienced folk in the tech industry like us, tech veterans, that through our lifetime, we might see basically the beginning, we were in at the beginning of the software engineering industry, really. It didn't start that long before we started at work. And we might not still be in the workplace when it kind of disappears, it may be in 10, 20, 30 years' time, but we will probably be alive at the point at which really software engineers don't exist anymore. And it's very interesting for a really huge industry to rise and just disappear potentially within a single human lifetime. It's quite interesting, isn't it? Working lifetime, potentially.
Martin Davidson (53:24)
It is.
When I was a kid, we used to go on holidays to the north of England. You'd go around and see all the old mill buildings and all these things that were falling down. And you think, wow, it's amazing, the decay. And I remember when I was in Microsoft in Redmond and you looked around and you saw all the Microsoft buildings. I think they were up to something like building 31 or something when we were there. All the people who lived
Anne Currie (53:53)
Yeah.
Martin Davidson (53:56)
nearby because that was where they worked, and you thought, at some point Microsoft will fold because that's the natural curve, and what happens? What will, you know, kids in 50 years' time, drive past here like I did on my family holidays and think, what was here before, how did all this end?
And you're right, it feels like we're closer to that at this point than we ever have been because Microsoft doesn't need all those engineers anymore.
Anne Currie (54:32)
There are a lot more engineers at Microsoft than there were in the days when we were there.
Martin Davidson (54:37)
Yeah, absolutely.
I mean, it's massive now, isn't it? It's a massive company.
Anne Currie (54:42)
Yeah.
All the engineers working on Exchange every Friday, we stood at the bottom of the stairs and had a beer. It wasn't a huge number. I mean, now it's just not how the software industry works. There are thousands and thousands and thousands of people involved in these projects.
Martin Davidson (55:01)
Yes, and a lot of these things become bureaucratic over time, and you get all these little fiefdoms appearing and people with specialist roles.
Anne Currie (55:07)
Yeah.
Martin Davidson (55:19)
Yeah, it just becomes inefficient. And I think, from everything I've read, the natural order is that that's just what happens with companies over time: they become bureaucratic and inefficient, and eventually they fail and they're replaced by somebody else. And that's just what happens. So Microsoft has done phenomenally well lasting as long as it has because that's very unusual.
Some companies reinvent themselves. I look at IBM and HP. They're consultancy companies, essentially. So they've survived. They're shadows of what they once were. I think, in many ways it's good because one of the other things, going back to what we were talking about earlier, one of the things that I find sad is how a lot of software we use day to day hasn't improved over the past 30 years.
Anne Currie (55:46)
Yes.
Martin Davidson (56:12)
Word is orders of magnitude slower than it was 30 years ago. And it really doesn't do anything else. And how can that be? Because no one's focused on the real end user experience. It's about building new features and hopefully trying to make some money off them. Amazon Search is terrible. OK, there's presumably a reason for this, but I don't know.
Anne Currie (56:15)
Yeah, effectively, yes. Yeah.
Mm.
Martin Davidson (56:37)
With Windows, 30 years ago, I used Windows 3.1 and I wondered what happened when it ran out of memory. So I wrote a program that just leaked memory and it was terrible. Lots of bits randomly stopped working and things. And thanks to the wonders of AI, I now have the chance to experience what happens with Windows 11 when it runs out of memory because it will run an awful lot of tests and will run out of memory. And Windows 11 behaves exactly as Windows 3.1 did. Things stop working. Just random things.
Anne Currie (56:55)
yeah.
Martin Davidson (57:04)
Windows Explorer, bits of it will stop working. Like the start menu will stop, you can still, task manager won't display any icons. It's just bizarre. Here we are 30 years later, and it's exactly the same. No one has bothered trying to solve this problem. So that's sort of, I think, what's the word, enshittification of that. How things get worse over time. Sorry, I don't know if I'm allowed to say that word on this podcast.
Anne Currie (57:27)
yeah.
You're allowed
because it's my podcast and you can swear. Although I don't think enshittification is swearing because it is a word in and of itself coined by Cory Doctorow to describe things getting worse because really people aren't paying attention to continuing making them better anymore.
Martin Davidson (57:34)
Okay, bye.
Is it?
Yeah, the incentive has gone away from making it better. And I might be deluded here because this isn't the way the world works, but I am kind of optimistic and I would like to try and make things better. And one of my motivations for Tollens is to reverse that so that we actually just start using tools which build better things for us naturally. And it doesn't have to be a human cognitive load of doing that.
Anne Currie (58:16)
Yeah.
Yeah.
Martin Davidson (58:22)
So.
Anne Currie (58:22)
Well, and that really ties together. It's one of the reasons why you and some of your colleagues from many, actually all of your colleagues, I think from Tollens are going to be on the podcast at some point, because the main thing that's the drive that me and Sara and Sarah particularly and Jon and everybody who's appearing on the podcast wants is software to just get better. And I think software engineering has lent very hard on the fact that hardware has got so much better and has lent on Moore's law for most of our career. And so, you'll get something which is as good as it was 20 years ago, 30 years ago, but it requires 1000 times more hardware to produce what is basically more resources, more CPU cycles to produce something which is functionally pretty much equivalent to what it was doing 30 years ago. And that's just such a waste because software is so powerful. It's so much more powerful than hardware.
Martin Davidson (59:24)
Yeah, you're right. I mean, it's. And Jon was sort of touching on this, I think, when he was on recently and about that. In many cases, it might be the right call, because optimizing software, making it better and what not requires a lot of people time. So maybe you're making the right decision there. Maybe it's the right decision to write things in Python rather than the lower level languages.
Yeah, but I think to me it feels like AI changes that calculus. Python. Yeah, you don't need to use Python anymore. You can just do everything right because you're not coding it. You just use the best. You can.
Anne Currie (59:52)
So yes That's what I like about it.
Yeah. Use English. Use the English language.
Martin Davidson (1:00:10)
you can get it to optimize performance. It understands all these things. And part of the reason I think as well that we have some really slow software is a lot of people don't understand what's going on at a low level. I think you, way back, do I remember this right, that you did a sort of very low level course in MetaSwitch days of nuts and bolts of what's going on in a processor and things.
Anne Currie (1:00:23)
Mm.
I did actually, yeah, I used to have to do that because, oddly, even high-end engineers didn't necessarily understand how does the stack work. And actually, if you're debugging problems, you really needed to understand that back then, 20 odd years ago, 25 years ago, you had to understand how the stack works because that was the cause usually of all your scribbling bugs that are the bugs that you were hitting in the field in production that didn't turn up with all your unit tests and everything because it required you to have a certain amount of, because they were just very unusual and therefore you needed millions of operations going through with all the randomness of real users and then all those scribbles appear because people are using stacks wrong. And yeah, absolutely.
Martin Davidson (1:01:17)
Yeah,
it's amazing. And it's understanding the cost of a data copy. So one of the things when we did the remote desktop client is there's the decryption, obviously has to move some data, but there are no other data copies in the mainline path. Everything is then from this buffer, because it's too slow. And yeah, and we have all these languages where people will just merrily do things which results in endless
Anne Currie (1:01:24)
Mmm.
This is too slow! Too slow!
Martin Davidson (1:01:46)
copying of data. And Rust is beautiful as well, because Rust has this thing called PGO, profile-guided optimization, I think, where what you're doing is you run your code and you look at where all your hot functions are and then you put them all together in the same code pages so that all the hot functions sit next to each other, so that they sit in the L1 cache of the processor ideally and you're not constantly thrashing that cache, because what's the hot function here? Oh, I need to go and get this other page and put it in here. Because caches operate at the line size, you have to have particular chunks of stuff. Understanding why that's important and the gains it can give you. I can't remember, it might have been on the Opus codec, but on one of them, it gave me like a 40% gain. Just wow.
Anne Currie (1:02:36)
And as Jon (this is either the last Jon podcast or the one that's coming out next week) talks about, no matter how clever you are, no matter how experienced you are as an expert in code performance, you can't deduce that. You have to suck it and see. It's incredibly time consuming, loads and loads of trial and error.
Martin Davidson (1:03:01)
Yeah, I mean,
there's where is it back here? Michael Abrash's book. Zenith. This was my Bible of performance. So Michael Abrash worked on Quake and getting the performance of the Quake engine up and did some amazing things of being able to use the two pipelines in the Pentium in parallel. It's really interesting. Well, I find it really interesting to read about. I'm not sure everyone but
Martin Davidson (1:03:32)
his mantra was exactly what you're saying there. You need to test, you need to measure it. And again, that's where the Opus Codec that I've got, the performance of it is about the same as the performance of the C one. It's a little bit slower in some areas and a little bit faster in others. I didn't do that work. I got Claude and Codex. We came up with a list of possible interesting sounding things to go and explore.
And then I left them doing it. I gave them, here's a benchmark so you can test it. Here's a set of tests. Take this idea, create a work tree, go and test it in the work tree. Bench it, see if it makes it better. If it does, let's run all the tests, make sure it works. If some of the tests fail, let's see if we can fix them. If we can and it all looks good, we'll commit it. If we can't fix them, we'll just throw that work tree away. Or if it's not performing, we'll throw the work tree away and we'll move on to the next one. And they're happy to do that for like 24 or 48 hours just tootling away doing that.
And you come back and think, the performance is now an awful lot better than it was. And it's that whole, it's a kind of combination of laziness thing and putting it in the right sort of harness, and going to these remarkable, amazing things.
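The overnight loop Martin describes (take an idea, try it in isolation, benchmark it, run the tests, then commit or throw it away) can be sketched in a few lines. This Python sketch is an assumption-laden illustration rather than his actual harness: it does not touch git work trees or a real codec, the step where the agents try to fix failing tests is left out, and every name in it (`optimise`, `toy_bench`, the three idea functions) is made up for the example.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")

def optimise(
    baseline: T,
    ideas: Iterable[Callable[[T], T]],
    bench: Callable[[T], float],
    passes_tests: Callable[[T], bool],
) -> tuple[T, list[str]]:
    """Try each idea on a scratch copy; keep it only if it is faster and the
    tests still pass, otherwise throw the attempt away."""
    best, best_cost, log = baseline, bench(baseline), []
    for idea in ideas:
        candidate = idea(best)  # stands in for "create a work tree"
        cost = bench(candidate)
        if cost >= best_cost:
            log.append(f"{idea.__name__}: discarded (not faster)")
        elif not passes_tests(candidate):
            log.append(f"{idea.__name__}: discarded (tests fail)")
        else:
            best, best_cost = candidate, cost  # stands in for "commit"
            log.append(f"{idea.__name__}: kept")
    return best, log

# Toy stand-ins: a "build" is just a dict of hypothetical settings.
def unroll_loops(cfg): return {**cfg, "unroll": True}
def skip_checks(cfg): return {**cfg, "checks": False}
def bigger_buffers(cfg): return {**cfg, "buffer_kb": 512}

def toy_bench(cfg) -> float:
    cost = 100.0
    cost -= 20 if cfg.get("unroll") else 0
    cost -= 30 if not cfg.get("checks", True) else 0
    cost += 5 if cfg.get("buffer_kb", 64) > 64 else 0
    return cost

def toy_tests(cfg) -> bool:
    return cfg.get("checks", True)  # removing checks breaks correctness here

best, log = optimise({"checks": True},
                     [unroll_loops, skip_checks, bigger_buffers],
                     toy_bench, toy_tests)
print(best)
print("\n".join(log))
```

The benchmark and the test suite together play the role of "what good looks like": an idea that is fast but wrong, or correct but slower, never reaches the main line, which is what makes it safe to leave the loop running unattended.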
Anne Currie (1:04:49)
It's interesting.
Terminator. I like all the Terminator movies, but the first one is my favourite. And there is oddly enough, there is one line in it, which I think about all the time. It's not like AI will destroy the world or anything. It's the line about how you can't escape the Terminator because it never gets tired. It never gives up. It never gets distracted. It's what it does. It's all it does. And that kind of
Martin Davidson (1:05:13)
Yeah.
Anne Currie (1:05:19)
That is a really good description of what you're trying to do with setting up a kind of North Star, what good looks like, and then saying that's what you need to be going for. It's a bit like in the Terminator movies: the Terminator was given a very simple task, kill Sarah Connor, and just keep doing that, kill every Sarah Connor in the world. It's like
Martin Davidson (1:05:21)
Yeah.
Yeah.
Anne Currie (1:05:46)
So there was the vagueness, like which Sarah Connor it was. A bit of a lack of specificity there, but they didn't know. So it was a good goal. It was given a really good goal and executed it literally.
Martin Davidson (1:05:59)
Yeah, you're absolutely right. Slightly terrifying parallel, isn't it? But yeah. It's
It is, it's not like anything you've ever really seen before, is it? That's the
I thought from the beginning. Well, we're going to find out that this has all been predicted by a whole lot of sci-fi stuff. Although then the question is, is it sort of
Martin Davidson (1:06:28)
Is there just a big loop going on here? Because Claude and all these things, they're all trained on all of our sci-fi stories. They're just going, this is how it's meant to go. The Terminator film said this was what was meant to happen.
--- Break ---
Anne Currie (1:06:44)
So this might sound a little implausibly found-footage, but when I came to edit this podcast, that last statement by Martin was the last thing that appeared in his recording, although we actually talked for another 10 minutes about somewhat more cheerful results for enterprises from using AI. So I'm going to have to cut off the video at this point because that was the end of him.
Not the end end, just the end of this recording. But he will be back on the podcast, and maybe we will actually cover some of the more positive things we said after that. But thank you very much for listening. I think this was a good episode. I really enjoyed it. I really learned a lot, to be honest, that I wasn't expecting. So we'll definitely have Martin back on again, probably with his two other colleagues. But thank you very much for listening. And please do subscribe. It really cheers me up if you subscribe. I really feel that. Hopefully, see you all again on a future episode of the Asynchronous and Unreliable podcast.