Rise of the Machines: Security Automation at Twitter
In 2009, multiple security incidents at Twitter resulted in an investigation by the Federal Trade Commission (FTC). As part of its 2010 decision, the FTC instructed Twitter to form and maintain an effective information security program. By 2012, Twitter had exploded with hundreds of millions of Tweets sent every day and a rapidly growing engineering force. The amount of new code being written quickly outpaced the security team, leading them to consider ways of reducing their workload by automating tools and processes.
Security automation at Twitter started with a desire to automate a single static analysis tool. From there we started to see more opportunities to write code that prevents security vulnerabilities, instead of finding vulnerabilities manually. This talk will cover that journey, our philosophy for unobtrusive continuous security, the simple yet effective tools we used, and the general approach I believe works for multiplying impact through automated security.
Full transcript
Justin Collins
My name is Justin Collins. I currently work at SurveyMonkey as an application security engineer, but I'm going to be talking about my previous job at Twitter.
And speaking of Twitter, if you would like to tweet about me, my handle is @presidentbeef. Everywhere on the internet, you can find me as presidentbeef. And no, I will not tell you the story behind the name.
So almost exactly three years ago, some colleagues and I gave a talk at AppSec USA, basically the same talk. So this is a sequel, a reboot, a remake. That is why I went with "Rise of the Machines," which, as you know, is the third Terminator movie. However, if you're a really big Terminator fan, I used pictures from several of the movies, so I apologize.
And I put Nick Green on this slide, along with Alex and Neil, because Nick actually did a lot of the work behind this as well.
So I'm going to give you the punchlines up front. First is: build things that are secure by default. And then once you have that secure default, it makes it really easy to detect when people are not doing the default, just via automated tests.
Second thing: don't fix vulnerabilities, prevent them. So if you didn't know, you're in a security talk. I'm going to be talking about vulnerabilities. A lot of times in the security world, we spend a lot of time fixing vulnerabilities. We're filling in the holes in the wall. But really, what we need to do is just build a better wall.
Those are the punchlines. If you came in before them, you can now leave. All right.
Because there are a number of people in the room, how many people have heard in the last couple of days that Twitter is a unicorn?
Okay. If I adjust that for the people who never raise their hands, that's a decent number. I never raise my hand.
So is Twitter a unicorn? I think, of course, if you look at, well, it was a startup, and now they're making $1.5 billion a year. Of course, again, user growth. Yes, in that sense, I guess it is a unicorn.
However, today I want to suggest that it's not as unicorny as you may think, and probably you have a lot in common with Twitter and the problems that we had there. And the solutions that we came up with are just as applicable to you and just as easy for you to implement.
All right. We're going to go backwards a little bit. 2009, 2010, this was Twitter's logo. There wasn't even a bird yet. It was just the bubble letters. And around this time, something happened. You may remember. The FTC got concerned about some things, and it actually took several years to go from the incident to the FTC investigating to the FTC making a decision and essentially an order against Twitter.
So what happened? Well, this is what the FTC said happened. This is what they said Twitter did wrong.
First of all, Twitter didn't give employees email accounts. I know this is mind-blowing, because now you can get Google for Business for $5 a month and everybody gets Gmail in your company. But at the time, they said, "Just use your personal email." Remember, they were a startup, right?
Next thing the FTC said was, "Your admin passwords don't have complexity requirements." You can have anything you want as your password. That's not good.
The login page for everyone is also the admin login page. If you log into an account, and it happens to be an admin account, you're in the admin pages. This is all going to make sense in a few minutes.
Next up, no limit on how many times you can try to log into an account. Remember, it's just go to twitter.com, try to log in. No limit on the number of tries.
You didn't have to change your password, even if you were an admin. Once you logged in as an admin, you had full admin rights. There was no separation of roles. If you were an admin, you had access to everything.
Also, as you'd expect, there were no IP restrictions on admin logins because it was just the twitter.com login. So of course, there were no restrictions on that.
And finally, every employee gets to be an admin. Yes.
So you might see where this is going. It doesn't take a lot of analysis to figure out what's going on here, but this is what happened.
First incident: employee password was brute-forced. Remember, it's just the twitter.com login, and there was no limit on how many times you could try to log in. And according to the person who did this, they didn't even know it was an employee account. They just saw this person was commenting a lot, and they kind of had a cool handle, so they went after them and brute-forced it.
The password was the word "happiness." All lowercase, dictionary word. Trivial to brute force, especially when there's no limit on the number of tries.
So what happened? Well, they got in, they reset some passwords, and then they actually posted those passwords on a forum.
So Fox News tweeted out, "Breaking. Bill O'Reilly is gay." Probably not a real tweet.
My personal favorite, Rick Sanchez from CNN: "I am high on crack right now. Might not be coming into work today." "I might come in, I might not. We'll see. We'll see how it goes."
And then the one that got the FTC's interest: Barack Obama's account. I can't believe someone got access to the Barack Obama account and then tweeted out essentially a phishing link to go fill out this survey. And gas prices were really high at the time, so it was like, "Win some free gas." Amazing. That's not what I would've tweeted if I got access to that account.
Second incident: attacker gains access to an employee's email account. Remember, these are not company accounts. These are personal accounts. So this could've been AOL, Hotmail, MSN. Who knows? They could've been using anything.
Attacker gets access to it. They're going through the emails. They find two passwords. But the passwords are old, right? But somehow, based on those two passwords, they guess the current password. So you can imagine maybe there was a pattern to those passwords, right? We don't know what that password was, but you can kind of imagine.
And in this case, they got in, they kind of poked around, maybe reset a couple passwords.
So the FTC is a legal organization, right? And they said, "Well, you have terms of service and a privacy policy that say you care about security and you take reasonable steps to protect the confidentiality of people's data. However, as you saw in that list, very simple things you're not doing. That's not reasonable."
So their order, of course, you don't have to read the whole thing, but the order is essentially that Twitter must maintain an information security program for the next 20 years. And if they mess up, they get another 20 years.
And I noticed this recently: the safeguards that they put in place need to be appropriate to Twitter's size and complexity. So remember, the incident happened right at the beginning of 2009.
If we look at the number of tweets per day as a proxy for the size and complexity of Twitter, in 2009 it was... Well, if you go back even farther, in 2008, we're talking like tens of thousands of tweets a day. 2009, you're talking around 50 million tweets a day. And you can see this, the line goes up quite dramatically. So by 2012, you're over 300 million tweets a day.
And nowadays, I don't know the number now, but relatively recently it was half a billion a day. So every two days, there's a billion new tweets in the world.
And you can imagine the engineering organization scaled up with this. In fact, Twitter tends to be very proud of the fact that half of its employees are engineers. That's a lot of engineers. That's a lot of code being written. And a very small security team, right? As is typical.
So I want to tell you a little bit about me. Main fact is that I'm a human. I am not a cyborg up here in front of you. So I get tired. I get distracted. I get hungry. I get annoyed doing the same thing over and over again. I have to go home at night.
But if we talk about machines, they don't get tired. They don't get annoyed. They don't get bored. They don't forget to do things. They have a list of things to do, and they just keep doing it. They're also scalable. You can just kind of make more of them. And they're fast, right?
So we should use machines. And we're at a DevOps conference, I think we get the idea.
So this is the cycle I see when you're using a tool, especially security tools. First you run the tool, and then you wait. You browse YouTube and watch cat videos. And then the results come. You go through the results, and you have to figure out which results actually matter and which ones don't matter. And then you have to go fix those issues. And then the next step is you do it all over again, and again, and again.
And I don't know about you, but personally, I get really bored with that, and I get tired, and I don't want to do it. But machines love doing stuff over and over again. They are awesome at repetitive tasks. In fact, if you know about how computers work and the optimizations that are in place, if you do the same thing over and over again, they love it, right? They can optimize it amazingly.
So three years ago, like I said, we gave a talk. We talked about what we were doing at the time. And we had these sort of philosophical statements. And I included them here again because if you're at the planning stage of automation, these are really good things, I think, to keep in mind.
And I went through them again to see, do I still agree with these? And I do. But these are philosophies, not laws. So they're just suggestions.
Number one: getting the right information to the right people.
Have you ever gotten a 200-page PDF from a security team? That is not the right information. It's in there probably somewhere, but it's way too much information. So the goal is to get just the right information to the right people.
And who are the right people? Probably the people who wrote the code. And if they're not at the company anymore, the team that maintains the code. Those are the right people. Those are the people who should have the information.
And when you're talking about security, it's good for the security team to know, but it doesn't make any sense for the security team to just be the only ones who know the information. It doesn't work.
I think I don't even have to spend two seconds on this. We know finding bugs earlier is better. In security, this is like that times five, because once something's out in the wild, things are really bad. It's not just that we're losing revenue, but people's accounts are getting compromised, something like that.
If you see a mistake, find out a way to not do that again. Sorry, some of these are rather simplistic, but good things to keep in mind.
Analyzing things from many angles. This basically means we all know there's no silver bullet. If you have a bunch of bullets, you can combine them, and it does a much better job. So don't stick with one tool. Try to use several tools of different kinds to find as much stuff as you can. Of course, in security, we're mostly talking about vulnerabilities.
You want to let people prove you wrong in the sense of having some kind of feedback loop. So you tell someone there's a problem, they should feel free to tell you, "No, there's not." And you should feel free to say, "Yes, there really is." And they can feel free to say, "Well, I really don't think there is."
If you can't have that conversation, then people are just going to be really mad at you all the time because at some point, you're going to tell them something that's not true, and they're going to go, "I know this isn't true." So make sure that that is available.
Help people help themselves. Make sure the information is available. Make sure that when you tell them something's broken, whether that's a vulnerability or a test or whatever, that the information is right there for them to fix it.
If we're talking about tests failing, where did the test fail? What test failed? What was the failure? What was the expected result? Same thing for security.
Of course, automate the dumb work. If you recall, running tools, that first part, run the tool, that's the dumb part. That's the "I'm pressing a button." Automate that.
And finally, I think it's important to build tools for yourself, for your company, and just go after that one thing that you really care about. That thing that you know is a problem, just go after that. And then go after more things. But focus on one thing at a time and keep it tailored to your environment and what you need.
Okay. Let's talk about SADB.
So we presented SADB, like I said, three years ago, and I like to think that it's become a little bit legendary. People don't remember what we talked about necessarily, but they remember there was a picture of a bee that was crying.
Alex Smolen put this together, and we make fun of people now for coming up with logos for vulnerabilities and stuff, but it works. People remember. Everyone remembers Heartbleed. It's a bleeding heart. Ugh.
So SADB. Where did SADB come from? Well, there's a tool called Brakeman, which is an open source security tool for Ruby on Rails. And what was happening? Well, Twitter started using it, and then they hired me because I wrote it.
And this was the cycle. I showed it before. You ran it, then you waited, not very long, a couple of minutes, then you got the results. You had to go through the results. Then you went and fixed stuff, and then you did it again. And my colleague, Nick Green, this is what he was doing.
And so what's the obvious thing? Automate this.
So this is the idea. Code goes up. Code goes into wherever your tools are running: Brakeman, whatever else. Tools run. The results go down to some central location. In this case, we built SADB. And then at that point, you get your dashboards.
I didn't even say what SADB stands for, did I? It's the Security Automation Dashboard. That's SADB.
And then it can decide what to do with those results. Is there a new vulnerability? Are there no new vulnerabilities? Well, I don't need to tell anybody. That's not good information. If there's something new or something fixed, I should tell someone. And in the case of SADB, that just meant send an email. If possible, send an email to the person who wrote the code that introduced the problem. If not, just email it to the security team.
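The decision described here (stay quiet when nothing changed, email the author when something is new, fall back to the security team otherwise) can be sketched as a comparison between two tool runs. This is a minimal illustration in Python, not SADB's actual code: the `Finding` fields, the fingerprint idea, and the placeholder address are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Finding:
    """One tool result. All field names here are hypothetical."""
    fingerprint: str             # stable ID for the same finding across runs
    message: str
    author_email: Optional[str]  # None when the author cannot be determined


SECURITY_TEAM = "security-team@example.com"  # placeholder address


def plan_notifications(previous, current):
    """Return (recipient, text) pairs for findings that changed between two runs."""
    old = {f.fingerprint: f for f in previous}
    new = {f.fingerprint: f for f in current}
    notes = []
    # New findings go to the author if known, otherwise to the security team.
    for fp in sorted(new.keys() - old.keys()):
        finding = new[fp]
        notes.append((finding.author_email or SECURITY_TEAM, f"NEW: {finding.message}"))
    # Fixed findings are reported to the security team.
    for fp in sorted(old.keys() - new.keys()):
        notes.append((SECURITY_TEAM, f"FIXED: {old[fp].message}"))
    # Unchanged findings produce nothing: no news, no email.
    return notes
```

An empty return value means nobody gets an email, which is the "that's not good information" case from the talk.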
This is kind of what it looked like. It was a dashboard. You get the idea. This is really old, and I don't work at Twitter anymore, as I mentioned at the beginning, so I can't get newer information.
This big cliff is when they started using Brakeman, and they were manually going in and running it and fixing things. And the spikes tend to be new versions of Brakeman, not huge numbers of new things introduced.
But the main thing is, actually, you can see we started using Brakeman, or they, I wasn't even there at the time, kind of end of 2011, beginning of 2012. But SADB wasn't built until, I don't even remember when, middle of 2012, maybe. So this is actually historical data. We went back and ran the tool over all of the commits on master all the way back to November of 2011. And so having a dashboard really gives you that nice insight.
All right. This is what it looks like. This is actually a different part. This is running Bundler Audit. And something I'm really ashamed to say is that vulnerabilities would come out for Rails, and then we'd go around and have to remember who's using Rails and what versions they are on. And we did that several times before I finally said, no, I really need to put this into SADB so that we're not doing this over and over again.
This is the view for one application. This is the view for one vulnerability and which applications are affected by it. Very useful.
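The "one vulnerability, which applications are affected" view boils down to a lookup against an inventory of applications and their dependency versions. The sketch below is an assumption-laden illustration in Python: the inventory, the app names, and the single "fixed in" version are made up, and it is not how Bundler Audit or SADB worked internally. Real advisories often list several patched release lines, which this does not handle.

```python
def parse_version(text):
    """Turn '3.2.11' into (3, 2, 11) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def affected_apps(inventory, package, fixed_in):
    """List apps whose pinned version of `package` is older than `fixed_in`.

    `inventory` maps app name -> {package name: version string}.
    Only plain dotted numeric versions are supported in this sketch.
    """
    threshold = parse_version(fixed_in)
    return sorted(
        app
        for app, deps in inventory.items()
        if package in deps and parse_version(deps[package]) < threshold
    )


# Made-up inventory for illustration only.
INVENTORY = {
    "app-one": {"rails": "3.2.10"},
    "app-two": {"rails": "3.2.13"},
    "app-three": {"sinatra": "1.3.0"},
}
```

With an inventory like this kept up to date automatically, the question "who is on a vulnerable Rails?" becomes one function call instead of a round of asking teams.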
But how many of you right now are like, "Okay, I'm going to ask if Twitter is going to open source SADB, because I want to use it"? Is anyone thinking that?
Thank you for being honest, sir. Usually, lots of people are thinking that.
But I think that's the wrong thought. The thought should be what comes out of it. Because we built this thing. SADB was not impressive technologically, trust me. But it led to a change in our thinking, and that really was the legacy of it.
And really, as cool as SADB was, the most effective thing we did was this. We had a code review system, and depending on your team, you'd either use Review Board or Gerrit. And there were hooks to run tools before the code actually got to the review. So you'd run a command, it would push it up, tools would run, and then it would create the review for you.
Simplest thing we did, but most effective, was we had tools that just ran regular expressions on the code changes and looked for patterns that we suspected as being bad, and then automatically commented on the review.
So if you're going to build something, I would actually suggest building this, and then maybe think about SADB.
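As a rough idea of what "regular expressions on the code changes" can look like, here is a small Python sketch that scans only the added lines of a unified diff and produces review comments. The patterns and messages are hypothetical examples, not the list Twitter used, and posting the comments to Review Board or Gerrit is left out because that depends on your review system.

```python
import re

# Hypothetical patterns; a real list would be tailored to your own codebase.
SUSPECT_PATTERNS = [
    (re.compile(r"\bhtml_safe\b"), "html_safe bypasses output escaping; is this input trusted?"),
    (re.compile(r"\beval\s*\("), "eval on dynamic input is risky; is there a safer alternative?"),
]


def review_comments(diff_text):
    """Scan added lines of a unified diff; return (file, line_no, comment) tuples."""
    comments = []
    current_file = None
    new_line_no = 0
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            # Header naming the new version of the file, e.g. "+++ b/path".
            path = line[4:]
            current_file = path[2:] if path.startswith("b/") else path
        elif line.startswith("@@"):
            # Hunk header, e.g. "@@ -10,2 +12,3 @@": take the new-file start line.
            match = re.search(r"\+(\d+)", line)
            new_line_no = int(match.group(1)) if match else 0
        elif line.startswith("+"):
            for pattern, message in SUSPECT_PATTERNS:
                if pattern.search(line[1:]):
                    comments.append((current_file, new_line_no, message))
            new_line_no += 1
        elif line.startswith("-"):
            pass  # removed lines do not advance the new-file line counter
        else:
            new_line_no += 1  # context line
    return comments
```

Only added lines are checked, so nobody gets nagged about suspect code they are deleting. False positives are expected; the comment is a prompt for a conversation, not a verdict.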
So I'm going to go through the process for this. And this is really what we started doing after we had SADB in place and we started thinking for every problem we ran across, how do we automate this so it doesn't happen again?
First, find a problem. Good step to solving problems is finding the problem first. So is there something that keeps happening? Some vulnerability that keeps coming up? Something that keeps breaking?
Do you have a situation where you have to opt into security? For example, you're making an API call, you have to pass in a flag to make it secure. Well, in the security world, that's no good. You have to be secure by default.
And is there repetitive work? I mentioned going and finding out what versions of Rails everyone is using. In fact, we didn't even really have a list of everyone who was using Rails. So that's repetitive work. That's a problem to solve.
If you can, solve it with code. Can you write a library that is safe by default and then have people use that?
As I mentioned earlier, once you make it safe by default, then it's super easy to enforce the use of it, because if anyone's not doing the default, there must be some flag they're passing in or some call they're making that's not safe. And now you can write that test I mentioned earlier, so when it goes up for code review, you know about it.
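To make the "safe by default, then detect the opt-out" idea concrete, here is a minimal Python sketch. The function `render_comment` and its `allow_raw_html` flag are invented for illustration; the point is only that the unsafe path requires a visible flag, so a trivial pattern can find every place someone turned safety off.

```python
import html
import re


def render_comment(text, allow_raw_html=False):
    """Escape by default; callers must opt OUT of safety explicitly."""
    return text if allow_raw_html else html.escape(text)


# Because the unsafe path needs a visible flag, a simple pattern finds it.
OPT_OUT = re.compile(r"allow_raw_html\s*=\s*True")


def find_opt_outs(source):
    """Return 1-based line numbers where the safe default is switched off."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if OPT_OUT.search(line)
    ]
```

Run against the changed files in a code review, `find_opt_outs` tells you exactly which lines deserve a second look. The reverse design, where safety needs a flag, would require detecting the absence of something, which is much harder to do with a simple check.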
Next, can you detect it statically?
Did I miss one? Two and then four. All right. We're paying attention. I'm not a machine. I make mistakes.
Can you detect the problem statically? Of course, I'm a big fan of static analysis. Is there a code pattern that you can detect? And it doesn't have to be perfect. Keep in mind, it doesn't have to be perfect. You're not writing a commercial tool here. You're just trying to find suspect code. And then during the code review, alert about it. That's it.
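When a regular expression is too noisy, a small syntax-tree check is often the next step up. The sketch below uses Python's standard `ast` module to flag calls that pass `shell=True`, purely as an example of "find suspect code, don't try to be perfect". It is an assumption chosen for illustration; it is not how Brakeman works, and the talk's codebase was Ruby, not Python.

```python
import ast


def find_shell_true(source, filename="<review>"):
    """Return line numbers of calls that pass the literal keyword shell=True."""
    hits = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        for keyword in node.keywords:
            if (
                keyword.arg == "shell"
                and isinstance(keyword.value, ast.Constant)
                and keyword.value.value is True
            ):
                hits.append(node.lineno)
    return sorted(hits)
```

Unlike a plain text search, this ignores the pattern when it appears in comments or strings. It still misses cases such as `shell=flag` where the value is computed, which is acceptable for a check whose only job is to raise a question during review.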
If you can't do that, maybe you can detect it dynamically. Maybe there's a config setting you want to check, some header you want to check that it's being set right. So write some Selenium tests, write a crawler to go out to your page once it's in production, and just check that the specific thing you're looking for is there or not.
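A dynamic check of this kind can be very small. The Python sketch below takes the response headers from whatever crawler or HTTP client you already use and reports which expected security headers are missing or look wrong. The list of required headers and the expected substrings are a hypothetical policy, not a recommendation from the talk.

```python
# Header name (lowercase) -> substring expected in its value.
# This policy is a made-up example; adjust it to what your site should send.
REQUIRED_HEADERS = {
    "strict-transport-security": "max-age=",
    "content-security-policy": "default-src",
    "x-content-type-options": "nosniff",
}


def missing_security_headers(headers):
    """Return sorted names of required headers that are absent or lack the expected value."""
    # HTTP header names are case-insensitive, so normalize before comparing.
    seen = {name.lower(): value for name, value in headers.items()}
    return sorted(
        name
        for name, expected in REQUIRED_HEADERS.items()
        if expected not in seen.get(name, "")
    )
```

Keeping the check as a pure function over a dictionary makes it easy to unit test and to reuse from a crawler, a Selenium run, or a scheduled job.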
Finally, browsers have all kinds of security features, and they're coming out with more all the time. Once your code is out the door, once your website is running, this gives you an opportunity to protect your users while they're using your site.
Content Security Policy, probably the most flexible and powerful of these. I'm not going to go through them because I don't have a lot of time. But that allows you to control what resources run on your page.
Strict Transport Security makes sure people are using SSL.
Public Key Pinning makes sure that once they're using SSL, they're using your certificate for SSL, not someone else's.
And finally, brand-new thing, Subresource Integrity basically allows you to take a hash of your resources, like your minified JavaScript, then you throw it up on a CDN, and you tell the browser, "Hey, when you get this resource from the CDN, it better match this hash I gave you. If not, don't even load it."
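For Subresource Integrity, the value the browser compares against is a base64-encoded cryptographic digest of the file, prefixed with the algorithm name. A minimal Python sketch of computing it and emitting a script tag follows; the function names and any URL you pass in are illustrative assumptions, and the URL is inserted as-is without escaping.

```python
import base64
import hashlib


def sri_value(content, algorithm="sha384"):
    """Return an integrity value of the form '<algorithm>-<base64 digest>' for the given bytes."""
    digest = hashlib.new(algorithm, content).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def script_tag(url, content):
    """Build a script tag that asks the browser to verify the fetched file."""
    return (
        f'<script src="{url}" integrity="{sri_value(content)}" '
        f'crossorigin="anonymous"></script>'
    )
```

The hash has to be computed from the exact bytes you upload to the CDN, so this step fits naturally into the build or deploy script that produces the minified file.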
So these are things that you can do that basically give you some automation on the victim side, on the client side.
All right. So to recap: secure by default, and then that allows you to detect people not using the default via your tests.
And I've seen this over and over again where the default is usually unsafe, because someone wrote the call, and then later on they realized that wasn't safe. So then they said, "Oh, but I don't want to break stuff, so I'll add a new flag." Well, it's much easier to detect a flag that makes it unsafe than the absence of a flag making it unsafe.
Anyway. So make it safe by default, then detect it via tests, via static analysis, via grep, whatever it takes, and just have it run as part of your code review, part of your CI, whatever.
And we really want to move, as a security person, and I think as an industry, and we're at a DevOps conference, probably you already realize this, but spending your time fixing things that are already broken, you're just never going to catch up. Instead, you want to prevent those things from being broken in the first place, prevent things from being vulnerable in the first place. And that's just a much better way to go.
If I haven't convinced you yet, I've lost you.
All right, so then back to the beginning. Is Twitter a unicorn? Well, we made some really stupid mistakes. It's really hard to imagine a company making those same mistakes again, not even providing company email accounts for people.
But you can imagine it happening. You're a startup, and you have no money. You don't want to spend any money, and you think, "I'll fix it in the future." And that's exactly what happened with Twitter, except it got too big, and events happened, and the FTC got involved.
So we want to avoid that.
Secondly, the things we built are not so amazing that you cannot build them. They're very simple. Like I said, the most effective thing was running regular expressions against code changes. That gave us insight into what people were doing. It gave us an opportunity to say, "Oh, whoa, don't do that." And it gave us the opportunity to put defaults in place that were safe.
All right. That is the end of the talk.