WELCOME and OPENING KEYNOTE - 9/18/14. Live captioning by Norma Miller, whitecoatcaptioning.com

>> ALEX: Hello, welcome!

>> Welcome to Strangeloop. I'm really glad to see all of you here. I'm glad so many people made it to the party last night; hopefully you had a good time. I wanted to mention a few housekeeping things. We have a mobile app, which is EventJoy; that's been up on the slide loop earlier. We do have WiFi on the open network, and we're going to be monitoring issues; we're hoping to make that as good as we can. If you can refrain from downloading iOS 8 or other things, I'd appreciate it. I want to thank all of our sponsors, in particular our platinum sponsors, Cerner and Staples Innovation Lab, and we've been showcasing all of the other sponsors on the slides today. If anybody signed up for the Cardinals game tickets for tomorrow night, we're going to distribute all of those tomorrow from the registration desk, so please stop by and get those during the day. And we do have extras, so if you want to go and didn't sign up, that's fine, or if you want to bring somebody, we do have a number of extra tickets left over.

>> Tonight after the conference is the unsessions, and those are back in Union Station, from 7:00 to 10:00. The schedule is on the website and also in the mobile app, and if you are staying at the Hilton Ballpark, there will be a shuttle from 8:30 to 10:30 to take people from Union Station back to the Hilton Ballpark, just like we did last night with the shuttles to Union Station.

So I wanted to say just a few words. Strangeloop was created with this idea of mixing people from a lot of different communities and seeing what sort of cross-pollination would happen and what we could get out of that. And that's both, you know, academic versus industry, and across language communities, and client and server, and all sorts of different kinds of communities. For the last year we've been putting an increasing focus on creating a more diverse program of speakers and a more diverse audience. So this year we did an experiment called the diversity scholarships, to try to create opportunities for more people who might want to attend Strangeloop but were unable to, to remove the obstacle of cost, and see how that would change things.

>> So I donated 75 tickets and seeded the fund with $15,000. In the end, we managed to raise an additional $59,000, I threw in another 15 tickets, and we were able to support 90 diversity scholars coming here to the conference. [applause]

>> So I'm thrilled to have everybody here, and I hope we can expand that next year. One of the slides earlier was on the diversity scholarship sponsors; if you happen to see anybody from those companies here in attendance, please thank them, and please continue supporting us in the future. One rough number that I can share is that the number of women attendees is 12%, as opposed to 6% last year. So great progress, and I hope that we continue to make progress on that.

>> So one final question I wanted to ask was, why are you here?
Why travel, you know, thousands of miles to come here to another city to see these talks, when they're all going to be released on video in a few days anyway? I can only assume that it's really because you want to be in close proximity to other people who share ideas, or have ideas that are different from yours, and people you can learn from. So I would really encourage you to reach out and talk to other people around you -- talk to that person in the elevator, or the one you're standing next to in the lunch line -- or go check out a session on something that you wouldn't normally go see. Expand your horizons, and take those things back to where you live and do some of that there, too, so hopefully it's an expanding idea that we're putting forth here. So with no other introduction, I would like to introduce our first keynote speaker, who's the co-inventor of Erlang, has obviously written some excellent books, and is a very dynamic speaker. He did the keynote at Lambda Jam last year for me, and showed up in shorts and sandals and did a very dynamic, fun talk, and everybody loved it, so I'm very happy to have him here at Strangeloop this year to do the first keynote: Joe Armstrong.

>> Thank you. "The Mess We're In" Presenter: Joe Armstrong.

>> JOE ARMSTRONG: Well, thank you, Alex. About two and a half weeks ago, it was a Friday, I had a bad day. A really bad day. I had to prepare a lecture that I was going to give on Sunday. It was to the Commercial Users of Functional Programming, and I very foolishly put up a title, "How To Make Money From Functional Programming." People thought that because I'd invented Erlang, and because WhatsApp, which was written in Erlang, sold for 19 billion dollars, that I knew how to make money. And here I was with this silly title. I'd taken Friday off to be at home, because I can't get any work done at work -- I get interrupted all the time -- and I started out in Open Office, and a little prompt came up and said, "There's a new version of Open Office," and I thought, oh, great, a new version; it will be better. So I clicked on this little thing, and what happened? I installed Open Office, and I had put some images in my slides, and suddenly all the images vanished, and I wasn't really pleased about that. So I Googled a bit, and it said, ooh, Open Office can lose images under certain circumstances. This was reported several years ago and the bug is still there. So I thought, oh, golly, what do I do now? Well, Apple had very kindly broken Keynote for me as well -- broken my workflow. I have a MacBook Pro, a big one, 17 inch or whatever it is, heavy, and I do all my slides there and put them in Dropbox, and I have a small 11-inch one, and when I've finished my work I close the lid there and open the lid here, and it syncs up. So Apple had very kindly destroyed my workflow, and I wasn't very happy about this. So I thought, I know what I'll do, I'll do my slide show in HTML; they're really nice. But there's one problem with them: people like to get PDF copies of the slides afterwards, and most of the slideshow tools that people have done in HTML don't produce decent PDF. So if there's anybody who's written one out there, please make it produce decent PDF. I Googled a bit and I found a project that said you can make nice slideshows in HTML and produce decent PDF, so I downloaded this program and I followed the instructions, and it said I didn't have Grunt installed. Now, I'm an old guy, so I didn't really know what Grunt was.
My Grunt files weren't right or something. So I Googled a bit and I found out what Grunt was. Well, I still don't really know what it is, but I downloaded this thing and installed Grunt, and it said Grunt was installed. Then I ran the script that was going to make my slides, and it said, "unable to find local grunt." But I'd just installed Grunt! So I turned to Twitter and tweeted, "I'm having a really bad day here: unable to find the local grunt." And Twitter is very helpful, because people said, "Well, your Grunt path is incorrect; set your Grunt path." So I gave up. So that was it.

Right, so, we'll get to the title of the talk in a while. I've been programming since, golly, a long time ago -- 1965, something like that. That's quite a long time, and programming actually goes back longer than that. The first computer program that is remotely reminiscent of the way we write programs today ran on June 21, 1948, and it was written by Tom Kilburn. Tom Kilburn is the world's first programmer -- I know they say Ada Lovelace was the first programmer, but she didn't program a machine like the ones we know today. This is Tom Kilburn holding a thing called Williams memory. It could store -- what is it, 64 32-bit words, right? That's the total storage, and it was a cathode ray tube; as the storage changed, you could see on the display what was going on. And this is the first program that was ever written. That was it. And it ran on -- I forget the date -- in June, 1948. The nice thing about that program is that it was probably the last program that was totally correct. I mean, there was just one program in the entire world. This was the first machine where you put the program into memory, and it could change the instructions, and then they were executed, and it computed -- I think it checked the first 38 prime numbers or something like that -- and they were all very pleased that it worked. Of course, they didn't realize what they'd done, and I don't think Kilburn thought about it -- just imagine, he was programmer No. 1. Right?

And why am I talking about that? Well, let's go roughly halfway between 1948 and today. 1985: this is when I started work on Erlang, and at the time, a computer looked like this. This was a really powerful, fantastic, super-duper machine. A typical machine here has 256 kilobytes of memory -- isn't that a lot? -- and an 8 megahertz clock speed. That's pretty cool; you could do amazing things with that. Yeah, it was great. So if we look at today's computers, this is a typical laptop: 8 gigabytes of memory, 4 cores running at 2.5 gigahertz, a big solid state disk, and so on. Now, in 1985, that machine would boot in, you know, 60 seconds, and this machine is a thousand or ten thousand times faster, so it should boot in 60 milliseconds, right? How many of your machines boot in 60 milliseconds? All right, so something's gone wrong. When did it go wrong? So that's the title of my lecture: the mess we're in. I'm going to look at the things that I think are going wrong, talk a little bit about why they're going wrong, and then talk about the physical limits of computation -- I'm going to relate the speed of computers to some physical quantities, so we can see roughly what ballpark we're in -- and then I'm going to suggest some ways we might be able to clean up this mess. OK? I think software is actually getting worse and worse with time. It's not that we can't do amazing things with computers -- we can. But when they don't work, we can't understand why they don't work.
Go back 30 years: if you had a program that didn't work, you'd go and look at it. Now, if you've got a problem, you don't look at it. You ask Google, and it says, try this and it will make it work, and you try it and it doesn't make it work. I'm going to explain why that is, because I've been thinking about it a bit. Right. So, my generation of programmers -- I think we should get congressional medals for creating employment, because we have created billions of man-hours of maintenance work for people in the future. We are the job creators of the future. All that stuff we wrote years ago, you know -- these poor sods will be scratching their heads and going, "What the hell does this stuff do?" And it's terrible. So what went wrong? I'm going to talk about that, and about what the laws of physics say, because I used to be a physicist, so I can tell you about black hole computers and things like that. If we could make them, they would be quite good.

OK, so when I was preparing this, I thought, oh, I know, I'll show pictures of Dante's Inferno, you know, the levels of hell and all that kind of stuff, and I'll write down the seven deadly sins. I was writing this on the underground, and by the time I'd gotten off the underground I'd written down 25, and had some trouble ordering them. If you look at these -- I mean, one thing I notice is that code I write now, I can't understand a week later. Do you do that? Am I unique in that? You know, I write some code and I really understand it, and a week later I can't understand it. How many people do that? Oh, good, I'm not alone. I thought it was only me. I think that's because your brain works in two different ways. When you're working on a program, you're like cached into the program, and you see no need to explain it to anyone, because it's bloody obvious, so you don't write any documentation. And when you've cached it out, you wish there were some documentation, because you can't understand it at all. So this is very difficult. The answer to this is comments, and if you look at these sins -- no comments in the code, you can't understand it, no specification, very obscure, it's not beautiful -- it's all about shifting your brain from the mode you're in when you're writing code to the mode you need to be in when you're explaining how the code works.

Right. So has anybody heard of comments? Does anybody put comments in their code? Right. Does anybody put no comments at all in their code, in modules? Right. Yes. Yes, I've done that. A lot. Robert Virding, who developed Erlang with me, is famed for the one comment I think he ever wrote. Singular. All his code had one comment in it: a single line that said, "And now for the tricky bit." I thought it was quite good. Now, comments are really good, so I would advise you to write comments. And write big comments. Right? Really big, really extensive comments. Now, how many people write really big comments? Right. Well, I was going to say, there's another word for a really big comment. What's that? No -- a book. How many authors have we got here? Well done. Thank you very much. Because if you haven't got a book, you don't know how the bloody program works, right? So a book is just a big comment. And it's difficult at first.
Just write your comments so they get bigger and bigger and bigger, then pull them out and stick them in a book, and once you publish the book you'll be rich and famous and they'll invite you to conferences and you can talk about your book. Well -- forget the rich and famous bit. You can talk about your book anyway. Right, so we've done all these deadly sins, and there's a load more. Today we've got this stuff called legacy code to deal with. What's legacy code? Well, that's dead programmer stuff. I mean, not only are there no comments in it, you can't ask the people who wrote it, because they're dead. And it's written in languages that nobody understands, written in Cobol, and there's no specification, and yet it works beautifully. And management thinks that modifying legacy code is cheaper than a total rewrite. I can tell you, I've looked at some legacy code, and sometimes changing one line of legacy code is equally as difficult as totally rewriting the entire thing, but management doesn't believe that. So what do they think you should do? Leave it in the machine and don't mess with it. Right.

I'm now going to talk a little bit about complexity, because I used to be a physicist. Here are some numbers you can keep in the back of your head. The mass of the earth is 6 × 10^27 grams; the number of atoms in the earth is about 10^50; and the number of atoms in the universe is somewhere between 10^78 and 10^82. It's a very big number. So let's solve a little equation: 2^K = 10^50. We don't need Mathematica for that. We can do it in Erlang. ... Right: 2^166 is the number of atoms in the planet, and if you divide 166 by 32, you get 5.2; round that up and you've got 6. Six 32-bit integers in C, and the number of states they can possibly be in is the same as the number of atoms in the earth. Right. So if you wanted to test your program by computing all combinations, it's going to take a long time. Don't ask me about Javascript, because you only need three variables in Javascript: the number of states that three Javascript variables can have is greater than the total number of atoms on the planet. Just think about that.

Right. So a computer is a finite state machine, right? It's got state -- but how many states has it got? Well, it's got an awful lot of states. State crossed with event gives you a new state. My new laptop here has 250 gigabytes of flash memory on it, so the number of states it can be in is 2 to the 2 × 10^12; the number of atoms in the universe is only about 2^260, and that means you need 2 to the -- whatever that number is; it's unreadable -- universes to find two computers that are in the same state, right? So this gives you a very good explanation of why your programs don't work, and why Googling for fixes behaves the way it does. Because what happens is, you download something onto your machine and you do something and it doesn't work, so you Google, and you find something that says, ooh, I had exactly the same problem as this -- do this. And then there are like 10 replies after that saying, gee, thanks, that's really great. So you perform the magic spell and it doesn't work. And you Google again -- I had exactly the same problem, same story -- and you do it again and again and again, and then finally it works and you don't understand why. So why did it work for this guy and not for me? Well, it's because our machines were in different states when we performed those operations, and the number of possible states of the machine is that whacking great big number, and I'm just not going to find somebody who's got a machine that's in the same state as mine. It's going to be in a different state, and that's why it's not going to work.
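That state-counting arithmetic is easy to reproduce in an Erlang shell. Here is a minimal sketch of the calculation described above -- my reconstruction, not the actual demo from the talk:

```erlang
1> K = 50 * math:log(10) / math:log(2).  % solve 2^K = 10^50 for K
166.09640474436813
2> K / 32.                               % how many 32-bit integers is that?
5.190512648261504
3> math:pow(2, 6 * 32).                  % state count of six 32-bit integers
6.277101735386681e57
```

Rounding 5.19 up gives six 32-bit integers, whose 2^192 ≈ 10^57 states already exceed the roughly 10^50 atoms in the earth -- and 192 bits is also exactly three 64-bit Javascript numbers, which is the Javascript claim above.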
Right. So what are we going to do about this? Well, there's all this mathematical stuff -- proving programs correct, scary stuff -- but that can only deal with programs that are very small, not programs of the size we have today. Another thing we have to deal with is failures. Computer systems fail. They're just going to fail. We have to deal with it; we can't ignore it. And to handle failures, we need to go into territory that is unusual. To handle failures, you need two computers -- you might need 10 computers, or 50 computers. You can't handle failure on one computer: if your program is executing on one computer and it fails, that's not good. If you replicate something on two computers, and the chance of one of them failing is 1 in 1,000, then the chance of both of them failing is 1 in 1,000,000. In order to do that, you need to understand distributed programming and parallel programming and concurrent programming. If you're running programs on two machines, you are running distributed programs; if they're running at the same time, you are running concurrent programs. This leads you into places you don't want to go. It's quite easy to make things that are not fault tolerant and not scalable, but if you want to write things that are fault tolerant and scalable, there are some very good books that can help you. I hope they've got it outside. This tells you how to do it. Systems should self-repair with time; they should be more like biological systems.

People say they don't like Erlang -- it's got curly brackets. But notation does matter. Roman numerals were very nasty: imagine if you did prime number arithmetic or RSA in Roman numerals. I thought that would be fun. Languages do matter, but the problem is that in 1985, I think, all programmers knew shell scripts and make and C, so there was sort of a lingua franca. We could talk to each other; all programmers could talk to each other. Well, actually there wasn't one -- well, there was: make, shell scripts, C. Now we don't have these common languages that we can talk to each other in. You know, there's doobie-doobie-do, Fortran, and Ruby. Funny -- when I tweeted that my Grunt files didn't work, someone tweeted, "Joe's gone over to the dark side." So I had to Google "uninstalling Grunt." So when I learned to program, you could choose between these three languages. And now -- well, I don't know how many there are. When I first made the slide I found there were 676, and then I found another site that said 2,500. I don't know what most of them do. That's very difficult if you're a beginner: which language do you choose to start with? There are so many. So we don't have this lingua franca to start with. I checked on the net, and I found this -- let me see, the question is: is there a rake equivalent in Python? There's paver, invoke, shovel. Well, I don't know. Really, who cares? Another thing, at work, for Ericsson, with Erlang: a guy says there's a script that does that, something that invoked a bake file or a BitBake file, and it was taking a rather long time, so I stopped it after a while.
It had downloaded 46,000 files. Hang on, I've got 3 files -- what's happening? Do you really need to recompile the entire Linux system and rebuild it? Oh, yeah, we do. And I actually could have used Grunt or something and it would have been much easier. So without Google, programming would be impossible. How many of you can program without an internet connection for more than five minutes? The rest can't. It's terrible. It's terrible.

Right. And then we've got this dichotomy between efficiency and clarity. You know, to make something clearer, you add a layer of abstraction, and to make it more efficient, you remove a layer of abstraction. So go for the clarity bit. Wait ten years and it will be a thousand times faster; you want it a million times faster, wait 20 years. Everything doubles in speed every year. Yeah, just wait a bit. Names -- I don't like names; I'll talk about names. Names are very imprecise. Unless we can agree on the meaning of names, we get in a mess, because we can't talk to each other. I'll say much more about that later.

I want to get on to this bit now. How am I doing for time? Yes, I'm doing all right. What does physics say about computation? I sort of got into this when I was looking at the manual page for Erlang, and it said, "generates a unique reference," and then it says, well, values will reoccur after approximately 2^82 calls. 2^82 calls is about 10^25. Remember that number, 10^25. And remember the number of atoms. Quite a big number. So here are some bits -- I'll just remind you of some laws of physics. Causality: a cause must always precede its effect, right? Something happens, and then something happens later because of it. And we know things have happened because rays of light or sound or something propagate to us; we don't know something's happened until we get a ray of light, something that's conveying the information to us. Now, a lot of systems actually break the laws of physics. This notion that you can have consistent data in two different places breaks the laws of physics. If I've got two computers and I say to one, hello, the value of X is 5 -- can I assume that it knows the value of X is 5? Well, no, because I don't know if that message got there. So I want it to send a confirmation. It sends me a message back: yeah, I know that X is 5. Can that computer now assume that I know it got there? Well, no, it can't, because it doesn't know that its message got there. So it won't know until I send it another message saying it got there -- and so on forever; no finite number of acknowledgments settles it. That's the Byzantine generals' problem. You can't have consistent replicated data on two different systems. Two-phase commit breaks the laws of physics. So it's not very good.

And then physics is all about concepts like simultaneity, which depends upon the time that light takes to reach things. So if we forget that fact, especially when we're building distributed systems, we'll get into big problems. You shouldn't write systems that violate the laws of physics. Yeah. That's a slide. Entropy -- another law of physics. The second law says entropy always increases. What does that mean? It means if you've got a load of dice and chuck them all up in the air, they're not all going to land with 6 upwards. Things get more and more disorganized as time goes forward. And there are some fundamental limits to the speed of computation. Here's a couple of physicists. Who knows who the guy on the right is? No, left -- left as you look at it. Who's that guy? Max Planck, yes. So this is Planck's law.
It relates energy to frequency -- and the hairy guy, that's Mr. Einstein. So Planck said E = hν: energy is frequency times Planck's constant. From that you get what's called the Bremermann limit, which is the quantum limit on how fast a kilogram of matter can change state -- how fast it can vibrate. So if you've got a kilogram of matter, it can vibrate that quickly. If you look at quantum mechanics, you find all these fun numbers. The Bremermann limit is the fastest clock rate you can get out of matter; it comes from quantum mechanics and it's 1.36 × 10^50 hertz per kilogram. Then there's the amount of computation you can get out of energy: you can do about 10^33 operations per second out of a joule of energy. You can store about 2 × 10^43 bits per kilogram, whatever that was. And the Landauer limit is the minimum amount of energy it takes to change one bit of information. These come from quantum mechanics, and from them you can work out that the minimum is about a zeptojoule -- and, let me see, we're some 27 orders of magnitude above the smallest amount of energy you could use, which is not very good.

So now I want to build a really fast computer. How do you build a really fast computer? You squash the components -- you put more and more components into the box. We limit the weight to 1 kilogram and squash more and more stuff into it, so ultimately it becomes a 1-kilogram black hole. That's the ultimate computer. It will actually operate at 10^51 operations per second, right? And it's got a size of 10^-27 meters, but there's a problem: it lasts for 10^-21 of a second. Right, and it emits data as -- thank you -- Hawking radiation, via quantum entanglement. There's a kind of fun picture from Scientific American: you've got your quantum computer, you drop your particles into it, the computation takes place inside the black hole, and through quantum entanglement the quanta on the outside pick up spin or whatever, and you get a measurement out. So you've got a thing that can measure -- you know, you've got to get this 10^51 operations per second going, and then you've got to read all this stuff out with quantum entanglement, and, yeah, I don't think it's going to work for a while. I think we've got to learn a lot more before we can make this thing work. If you're interested in that kind of thing, there's a very nice Scientific American article on the physical limits of computation.

So why do you want to know these numbers? Well -- oh, wait a minute, I had a summary of these. To summarize: a 1-kilogram computer can ultimately do 10^51 operations per second and store 10^31 bits, and a conventional computer can do 10^9 operations per second and store 10^12 bits. So you see there's a big gap between what's physically possible and what we can do today. Of course, the ultimate computer isn't a 1-kilogram black hole; it's the entire universe behaving as a supercomputer. Actually, according to this article, the entire known universe has performed 10^23 operations since it was booted -- so when the universe was booted, a few years ago, it's now performed 10^23 operations. So who's interested in this number? Cryptographers: if it takes more than 10^23 operations to crack your code, then even with a quantum computer you're going to need several universes to crack it. Right. OK, so that was the physical limits of computation.
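For reference, here are the limits being cited, written out as formulas. This is a reconstruction from the standard physics, not a transcription of the slides; in particular, the 10^33 operations-per-second-per-joule figure matches what is usually called the Margolus-Levitin bound:

```latex
\begin{align*}
E &= h\nu && \text{(Planck relation)}\\
\nu_{\max} &= \frac{mc^{2}}{h} \approx 1.36\times10^{50}\ \mathrm{Hz/kg} && \text{(Bremermann limit)}\\
R_{\max} &= \frac{4E}{h} \approx 6\times10^{33}\ \mathrm{ops/s\ per\ joule} && \text{(Margolus--Levitin)}\\
E_{\min} &= kT\ln 2 \approx 2.9\times10^{-21}\ \mathrm{J\ at\ }300\,\mathrm{K} && \text{(Landauer limit)}
\end{align*}
```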
Let me see -- yeah, we're on time. So what can we do? How can we sort out all this mess? Now I'm going to go into uncharted territory. Well, I want to build the entropy reverser. This is a device that -- can you imagine, I tried to find a picture of a big sausage machine, where you put sausage meat in and you turn the handle. We put all the programs into it, we turn the handle, and a smaller number of programs comes out. You throw away all the other programs. And that breaks the second law of thermodynamics. The trouble with software, you see, is that complexity increases with time. We start with one program, and it becomes two programs. We want to reverse that process, and this is a problem I've been thinking about for many, many years. I'll show you some of the conclusions I came to.

OK, so there are all sorts of problems with files and systems: they mutate all the time, they grow in entropy, disks are absolutely huge, and there are all these problems with naming. Naming is horrible. If you've got a file or something, what file name should it have? What directory should I put it in? Can I find it later? I write about three modules a day, so in a year I'll write about 1,000 Erlang modules, and I've been doing that for 25 years, 30 years. So I've got about 25,000 Erlang files that I've written on my machine, plus everything I've downloaded. When I checked, I had something like 85,000 Erlang modules on my machine. Is that 85,000 different things? No, probably not -- probably more like 5,000 different things. I'd like to put them in, turn a handle, and bring out the reduced set of modules.

The first thing I want to do is abolish names and places. Right, so to talk about things, you've got to have names, or you have to have references to them, or some kind of name. If you've got just a paragraph like that -- cup of tea, he sat down, and butterflies -- it's difficult to refer to it. OK, it's paragraph number 46,000-something, or paragraph, I don't know, 412 from some book, but that's rather difficult to follow up. But we could just compute a name. So if you were talking to your friend and you said, I've read this really great book, and he said, yeah, what was it called, and you said 789150ad143..., he would know exactly what you're talking about. No problem at all. If you believe in SHA-1, you know. Do we all believe in SHA-1? Yes, it's like a religion. And, well, it's good enough for this.

Right. OK, so we've got this number thing. How do we find the thing if we've got its number? Well, URIs -- they're bad things. An address, www-dot-something -- and why is that bad? Well, you need DNS, and the host might be unavailable when you want the thing. And if the content changes, the reference still points at it; well, you can cache it, but you don't know how long to cache it for. If we went away from URIs to hashes, you'd just say, "get me that thing." The hash would be embedded in some other document, in a file; it would just be a link you'd click on in your browser or something. The nice thing about a content-addressable store is that you haven't said where the thing is, OK? And you don't need any form of security. I mean, a man in the middle could change the content, but you can validate it when you get it. You just say, go get that thing; you get this blob back; you compute its hash, and it's the same as what you asked for, and therefore a man in the middle cannot have changed it -- and you don't need secure software or anything like that for that reason. Nobody can damage the content. You have no problem with changing names. You can cache it forever. Right. So that's very nice.
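The whole scheme fits in a few lines of Erlang. This is a minimal sketch under made-up names -- a cas module backed by an in-memory map -- not code from the talk; crypto:hash(sha, Content) is the standard OTP call for SHA-1:

```erlang
-module(cas).
-export([store/2, fetch/2]).

%% The name of a blob is the SHA-1 of its content, so identical
%% content always gets the same name, wherever it lives.
store(Content, Store) when is_binary(Content) ->
    Key = crypto:hash(sha, Content),
    {Key, maps:put(Key, Content, Store)}.

%% Fetching re-hashes the blob: if a man in the middle had changed
%% it, the pattern match on Key would fail -- no extra security needed.
fetch(Key, Store) ->
    Content = maps:get(Key, Store),
    Key = crypto:hash(sha, Content),
    {ok, Content}.
```

Storing the same content twice yields the same key, which is exactly the condensing property the talk is after.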
So then the only question is: how do you find that thing? Well, there are algorithms like Chord and Kademlia. You take the IP addresses of machines, in the right-hand column, and you compute the SHA-1 checksum of each; that's in the left-hand column. These are now in sorted order, and you ask, OK, on which machine am I going to find the paragraph whose SHA-1 checksum is 4-something? You just look in that list. So I send a message to those machines and say, excuse me -- they're also going to have a table like that, for the machines they know about, OK? -- and they send back to this machine an updated list, and we refine that down and down until we narrow it to a small number of machines for which the hash of their IP address is very close to the hash of the thing you're looking for. OK, that's called Kademlia. It's combining, sort of, the ideas from git and BitTorrent -- BitTorrent trackers and the DHTs that are used for them are based on things like Kademlia and Chord. So when you combine the two, you get git-torrent. I got very excited a few months ago; I said, oh, if we can combine git and torrent! I immediately went in and registered the domain name gitorrent. There's no such thing as a new idea.

Let's start making the condenser. Well, the first bit's easy: find all identical files. Right. So how do we do that? For each file F on the planet, do: let C be the content of F, let Key be sha1(C), and store {Key, C} in a distributed global file store. That will condense all identical copies of a thing to a single copy, which will reduce the total number of files. Right. And because it's a distributed computation -- everybody does it on their own machine -- it will run very quickly. If we were to set up a computation like that, we would condense them. That's easy to do.

The next bit is much more difficult, and that is: find files that are similar to a given file. This is a problem that I've been thinking about for 20 years or so, and I've made embarrassingly small progress on it. I have in the back of my head this idea of a thing that helps me, and I've built one or two of them, and some of them work and some of them don't. The idea is, it's rather like Twitter. You have a little box, and when you have an idea, you type something into the box. I've done this. You have a little box, and there's a little icon, Sherlock Holmes, at the bottom. You press the Sherlock Holmes button, and the idea is that it will find, among all the files that I'm interested in, the most similar thing to what I've just put in this box. Once it's found them, it makes a list of them in order, and I can look at them and make a decision: is this actually a new idea? That would be great if I had a new idea -- that would be fantastic, you know. Or is it an idea that I've had before and just forgotten about, or an idea that somebody else has had that I just don't know about? And then, of course, what do I want to do now? If it's a new idea, it's fine. If it's similar to an old idea, I might want to edit the old idea and merge the two together, so slowly we can start to condense the amount of information.

So there are various ways of doing this. One which I'm currently playing with is called least compression difference. It's kind of nice. How do you know if two things are similar? If A is similar to B, then you compress A and work out the size of it, and then you concatenate A and B and compress A concatenated with B. If B is identical to A, then the compressed A-plus-B should be only a tiny bit bigger than the compressed A, because the compressor can find all these similarities.
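That test is a one-liner around any off-the-shelf compressor. Here is a minimal sketch in Erlang using zlib as the compressor -- my illustration, with made-up module and function names, not code from the talk:

```erlang
-module(lcd).
-export([difference/2]).

%% Least compression difference: how much does compressing B along
%% with A cost beyond compressing A alone? A result near zero means
%% B adds almost no new information, i.e. B is very similar to A.
difference(A, B) when is_binary(A), is_binary(B) ->
    CA  = byte_size(zlib:compress(A)),
    CAB = byte_size(zlib:compress(<<A/binary, B/binary>>)),
    CAB - CA.
```

Ranking every stored file by lcd:difference(File, NewIdea) and sorting in ascending order would give the "most similar first" list that the Sherlock Holmes button needs.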
That's true for small data structures, but it's not true for big data structures, and for programs it's worse: two programs can be essentially isomorphic, but with all the variable names changed. There are other complications. The trouble is -- you know -- whoops, the algorithm. Wait, I had the algorithm here. Oh, no, that's find-all-identical-files again: for all files on the planet, compute this thing, using least compression difference. It takes, like, minutes to go through, so if anybody wants something to work on for the next 40 years and can figure this out, please go ahead, because we really need to reduce the complexity of everything we've been doing.

Right, so we've made all this mess. Dijkstra said that computing was about controlling complexity, and we've failed miserably. The only way to make modular, composable systems is to make them out of small units which we can validate, and then connect them together. So I think we know how to do it, in a way. We need to reverse entropy -- GitHub is sort of cloning off things and getting bigger and bigger, and we need mechanisms to make things smaller and smaller. Quantum mechanics does set these upper bounds on what is possible, so when we're figuring out the complexity of algorithms, we need to be aware of that, and we also need to be aware that the state space of what we're doing is enormous. That's why, if we relate the complexity of a program to the number of atoms on the planet, we see that a very, very small program has as many possible states as there are atoms on the planet -- it's just enormously complex. We need to abolish names and replace them with hashes, and we need to set up distributed global hash tables and things like that.

And then we need to make low-power computers. We need to make them carbon-neutral, powered by solar panels and things like that. We need to get down the energy of computation. Computers are becoming a big environmental threat; they're using more energy than air traffic and things like that. And this is something -- well, we need both air traffic and computers, but one is capable of going down to very low energy and one is not. Computers can be made to operate with low power, and we need to do so with a degree of urgency. So that's what we've got to do. We've got to clean up the mess we've made. Thank you.

>> Do we have questions, or -- two minutes. Right, I actually finished -- oh, I'm late. Two minutes late. Yeah, any questions?

>> AUDIENCE MEMBER: Did you say how Erlang solves the problem?

>> JOE ARMSTRONG: Oh, it doesn't. How does Erlang solve this problem? Of course it doesn't. But it is easy to program these distributed hash tables in it now. ... OK.

AUDIENCE MEMBER: [inaudible]

>> Sorry, that was a statement, not a question. He said that something wasn't symmetric between A and B. OK, is it coffee break, or --

>> Yes.

>> OK, thank you.

>> ALEX: Thanks, everyone. We'll get started at 10:00 with the next sessions, both in here and in all the other four ballrooms. [break]