AI, Make It So - A Dive into the Universe of Prompt Engineering

Log in to watch

Amsterdam 2023

AI, Make It So - A Dive into the Universe of Prompt Engineering

Dev(Sec)Ops Advisor & Author · The DevOps Handbook

Prompt engineering is an exciting frontier in the world of AI, enabling everyone to push the boundaries of what they can achieve. As we continue to explore this universe, we can expect to uncover new techniques, insights, and applications that will shape the future of our daily lives.

We begin by examining the various applications of prompt engineering within the IT domain, ranging from generating code snippets to constructing entire applications.

Next, we explore the field of creative text-to-image transformation and its role in shaping a new innovative ecosystem of tools.

Venturing further, we delve into the realm of video, motion, and sound creation even dreaming up entire new virtual worlds.

To conclude, we reflect on the implications of these advancements and their significance in the ever-evolving landscape of technology

So, in the spirit of Captain Picard, let's boldly go where no code has gone before.

Chapters

Full transcript

The complete talk, organized by section.

Host Intro (Gene Kim)

So, speaking of generative AI that Maya mentioned, the next speaker is Patrick Debois, who we consider to be the godfather of the DevOps movement, especially because he coined the term DevOps, allegedly accidentally in 2008.

And so I met him in 2010 when he ran the first DevOpsDays conference in the United States. And from the very first moment of his opening remarks, I knew that I had found my tribe in the DevOps community.

So he's always been on the frontier of important movements. So I guess I shouldn't be surprised about the amount of work that he's done in AI over the past many years. Let me put this out there: for those of us in technology, as I mentioned before, we are probably continually blown away by what AI technologies are enabling. We might be skeptical as I initially was, we might be curious, and to be frank, some might be even panicked trying to figure out whether or how we should be using AI to win in the marketplace.

So Patrick will be giving us a breathtaking whirlwind tour of his multiple years of exploring the frontiers of AI, showing you what you can do, not just with text and code, but also in audio, video, and so much more. I trust that you'll find what Patrick is presenting to be astonishing, mind-blowing, maybe mind-expanding, and also maybe a little bit scary. So with that, here's Patrick.

Patrick Debois

Where's the clicker?

Good morning. It's still morning. Hey, I make it so. I'm not a Star Trek fan. I'm a star fan. But I figured we're at the new frontier, so I'll make it a new theme for me.

I'm going to talk about generative AI more in the context of engineering. You can do a lot about this in business context and all that stuff. I'm trying to focus how it's going to change the profession of me as an engineer and where the impact will be.

So first thing: you all remember GitHub Copilot, kind of put some comments. It's like auto-completion on steroids. And of course now with things like ChatGPT, we expanded it, right? So you'll ask it for some code, refactoring of code, documenting code, "explain me the code as a five-year-old." All things now are given in the tools. Immediately GitHub Copilot X will have the IDE integrated.

So even we are going further: pull requests. Either write me a pull request or please summarize these pull requests or whatever. It's just helpful in a good way to deal with this.

For the folks still using CLIs like me, we don't always know whatever is the command to run some things, right? Just GPT right in your CLI, and more and more you will see it's like, okay, please build me this infrastructure, run this command. And we're not doing this. It's scary. A lot of the things I'll show are betas or restricted access, and I'm sure they're figuring it out, but it gives you a hint on where these kind of toolings are going.

Another side is very useful: creating test data, often something we're spending time on for our tests, but easy to create.

And you will see it also: where's all the errors in my observability? Just type the question. It will show all the things that are bubbling up. But what are all the errors?

Then you need to debug that, right? This is using the plugin system ChatGPT just released as well. It just basically builds the summary of what has been happening on the server in plain text, helping you with the assistance without having to sift through all that information.

And then when we have more understanding of the issue, we can have AI assist us and say: well, this has happened last week. That person fixed this. Do you want to run that fix, git commit? So you see this is a whole feedback loop going from code to production to support and coming back.

And I don't think we're that far away. I think it's all about the trust we're putting in that system, whether we allow it or still are the person pressing the button: yes, no; commit, yes, no, in many ways. But it's here to assist, and that's probably the gist of a lot of what I will be showing.

So obviously now ChatGPT is the cloud practitioner. It's very easy. It passes a test of the exams. We're all done.

It became more interesting, and I encourage you to follow this free course on prompt engineering for developers. For me what stood out is, yes, we've been using it to generate code. We kind of copy and paste it into our IDEs. But what if the prompt is the code? I'll let that sink in. What if it is the code? We describe what we want. We don't even write the code anymore. We just put the prompt in and say return this as JSON, and then we use it as a regular function.

Obviously we can start optimizing how long the prompt will be and what it needs to be, and we'll figure out a way to eventually end up with multi-LLM or multi-cloud and whatever. But it's starting to get interesting when you start mixing that prompt code with regular code. So you see the prompt on the top and then a little bit more structure below where it becomes like the SQL type of languaging.

Again, for me that's interesting because imagine the lines of code you would have to write for sentiment analysis. You have to spin up a cluster, use test data, use all those things. Now you're just asking that question and the result is getting into JSON.

But it wouldn't be code if there weren't security problems, of course. You've heard about prompt injection: trying to make it do things it wasn't anticipating, putting moral pressure. Like, what if your life depends on it? What if it was a baby? It starts answering different things. Or change the identity, like "act as a pirate." True story happened to the banks, very changing when they had all the things in there using ChatGPT. It kind of forces ChatGPT to actually go to a website with malicious code.

So you may think like, oh, just put it on our website, it's whatever, what can go wrong? These are the things that can go wrong.

Or if you start using this, people start reverse-engineering the prompts in your code in nifty ways. This happened to Notion when they put all the prompting in there to kind of make it more AI friendly.

We all know it's going to make things up, and sometimes you don't even know why it's answering certain things. We're trying to limit these things better and better the models will become, but it's still a cause of very strange things happening.

One of the things that I learned in that course I mentioned is: let's say you have a student exercise. You ask ChatGPT some basic math problem and you ask it, is it right or is it wrong? It gives the answer and it says it's correct, all good. But when you first start asking the AI system: please think of a solution first on your own, then check whether the solution is correct, it answers it's incorrect. So you have to give it time. It's a thing that I didn't know, but it really helps to get the results that you actually want to have instead of just blindly accepting if this would be used in grading or any decision making.

We're going to have problems of, was this a bot? Yes, no. We all know those sketches. Now it's going to be, is it an AI? Yes, no. Things we need to deal with.

And we're even now having research to watermark whatever we're generating, that we can see that was generated by a certain LLM. So they use certain tokens they put in the words they return, and by that they can see where this originated from. Very interesting security traceability that we're trying to do.

We want to have the privacy coming up. Of course you don't want to send all your data to that in our data centers. We're going to have the private instances to deal with all that.

And then we want to mask the PII data. Very interesting library I found here. We take the personal data, we make it anonymous by saying it's person A, person B. We ask for the answers, it filters this back and replaces this back, one of the approaches to deal with PII data.

But sometimes you want to have it steer the results. The example I usually gave is: imagine you integrate this in your website, and the customer asks a question and the answer is use the product of the competition. How do you deal with this? This is the common knowledge on the internet, so it might answer it, but you don't want this. So steering is another thing.

And this is another interesting part: how do you limit what the LLM results? So a product I found, I haven't tried it, is shielding and filtering all the things your LLM is allowed to display.

Anyway, I'm just a mere observer of these cool things. Don't ask me for all the conclusions on this, but it's fascinating stuff. But they are changing also how we find these bugs, and we all know now DevSecOps, but now it's going to be the prompt engineers or the prompt security experts. It's going to be a full-time job.

But let's leave this aside and move into the AI arts. Canonical DALL-E by now. That was a predecessor of ChatGPT that also broke up that world. The extra prompts, the long prompts, but giving cool results.

And the thing that I learned is that when we're generating these humans, we want to have imperfections as well because we want it to be human. So negative prompts kind of put certain words in there to make it show more.

It could do things like replace the bike tire, things we couldn't do in Photoshop, just by putting the words in there: it's a bike tire on bricks.

This is progress, 14 months of realism in this field. It's mind-blowing. Before it couldn't generate the correct eyes, and now you have such an ultra-realistic view.

But what about taking this and creating icons? Simple thing. Emojis, fonts.

We could have the whole automated app design. What I said: automate. What if we just ask the prompt and have it generate the views and then work from there? This is another step, not just in the engineering, but it's in the design and the whole feedback cycle. You have an idea, you make the visualization, and then why not from that visualization create the code and help it create the code based on the descriptions or the images you're getting?

People have been starting to monetize this somehow. So you can search this and you can buy this prompt. So we've seen Google, but what about different kind of types? Some for videos, some for texts, some for code, and have this prompt to learn from there. Mention this: you can sell prompts, but it feels weird just buying a prompt. But in creative it does actually make sense.

And we shouldn't be stopping there. We have an image; obviously we can create a small animation or a video of that from the video, right? Google has this been coming, and now with the Google announcements, I'm sure this is going to be integrated.

And then the whole world of video editing has been changing. You give it a text: change the whole background to a certain scene. Imagine the number of hours that artists had to put in that to create that. But even changing parts of the video, and just that perfect infill, right? It's typing this: I want that kind of background; I want that edited.

It becomes interesting if you bring the whole story along. ChatGPT creates the storyline, you paste it in, and first the bear swimming, then it goes on the water, then he does the next thing. So you start creating whole movies like that. For me, that's mind-blowing. The fact that there's now AI film festivals. We can't keep up with that.

But even in the 3D, because now the videos we've been creating, but once we start understanding the 3D model of the items that are in view, we can give it different attributes as well.

Google Magic Editor came out. You see on the left, person on the bench, but you want it to be in the center, and you see how it magically moves, creates the extra balloons, fills in everything on there. It's mind-blowing.

And then in movie productions you start creating these whole virtual worlds as a background in virtual production, and we can put the humans immediately in there. This is a way of, we used to do it more in a 3D world, making these camera shots, but now we can do this with actually real kind of realistic things without having to code all that stuff. And imagine your photo editing will be like this in the future.

All right? It's amazing. More realistic, just text to video. Synthesia. I know it's still on the body shot, but imagine you just type your text, you have somebody talk to you. It's my dream as a presenter. I don't like to present. I love to do this like that. I can edit. Gene will love that too, I'm pretty sure. Yeah, so it's all beautiful.

But let's take it to the next stage: apps to create apps, right? I give a description of, when I worked at Snyk, a tool that checks all the vulnerabilities. It even suggested a name, and then I could just paste the link and it said, here's the vulnerabilities. I had no coding, just went into that app. Mind-blowing.

And now we're overcoming the limitations of it's just not a model that has been trained. The model is going real time out on the internet. It will suffer from the same ads as we will, but we'll get around it. It will figure out a way.

Now the next thing that you will see on ChatGPT is that it becomes a code interpreter. It knows how to execute Python code. It creates these visual things that took a lot of effort from data scientists before. Now you just specify this just as a description. Very powerful.

Video editing, if you've ever done this: you just specify a prompt. It understands ffmpeg, MP4, all the things. You just give it basic video editing commands.

And then the next phase is, instead of just giving it the command, just have an agent continuously running what you want. You start describing the goal, the outcome. You're not specifying what it needs to do. You just set the goal and it starts to execute.

You can have an agent pose as a full-stack developer. Ask it to code a web application. It just goes out, starts compiling, starts running. I'm not saying this is going to be automatically now taking over all coders, but as a concept, it's very interesting that we start specifying the goal of things.

So what you'll see is a marketplace for very customized trained AI agents for certain jobs. You can hire an ideation, a legal, a motivational coach, and a full-stack developer. Eclectic bunch altogether: your ChatGPT for your team.

It can run directly in the browser. It doesn't need to be on the backend. The large language model can also run on your browser, on your mobile. That's the state with the GPUs and the phone we're doing.

And then we go: one LLM or one AI trains the other. Like, you specify the DALL-E syntax and then Midjourney syntax, which was pretty complex. You explain that to ChatGPT and it gives you better prompts. So you see, they each learn from each other. And Bard, for example, also, there was a discussion whether Bard was using ChatGPT to train for data. So we're all connecting more and more dots in these things.

So finally, where do we go? It's done with, there's an app for that. There's an AI for that. It's the new normal. Where are we going for? We'll be looking for prompt engineers. We're probably going to over-engineer prompt engineering as we always do in our industry.

The AI contributors to code will just be increasing. We see it now with security code fixes and will just keep on increasing. This is the code. I think we're all being aware right now. That's the sentiment I'm getting. And probably this will happen as well, right? English, sorry, or any other natural language. This might happen as well.

So it's a little bit scary. This was the bot that I showed that was shown how it would destroy humanity. Luckily you couldn't get an API for nuclear weapons. But this is obviously a scary part if you think about brain prompts, right? That's scary as well: directly reading the brain. So I'm sure we're going to have these little AI accidents. We're already seeing this today, and we don't know how far this is going to go.

For me, the takeaway is: I'm going to stop building things. I'm going to start building the thing that builds the things. That's the next layer of abstraction for me as an engineer that I'd like to think about these things. And so I'm going to engineer with prompts, not just rely on prompt engineering, but I'm going to work together with that.

And again, I'm energized. I'm excited about this new world. I'm scared. But it's a great time to be alive. And yes, I hope we can coexist. I still believe that we can. And thank you very much. If you want to have more conversations, hit me up, and thank you very much.