Travelopia – How to Get Machine Learning Right and Make Data Work Harder

Virtual US 2022

Sree Balakrishnan

Technology Director - Innovation & Product · Travelopia

Simon Case

Data Studio Lead · Equal Experts

At Travelopia, we offer personalised travel experiences from our high-end travel brands. It’s been a challenging market for a couple of years. During that time, we’ve been on our own journey, to make greater use of our data and better understand our guests.

Our initial attempt at using machine learning didn’t work out, because we were fixated on the technology. By creating a steel thread from prototype to end-user, and focusing on one use case at a time, we were able to dramatically improve the quantity and quality of data insights that we could use to power our business.

We’d like to share with you how we’ve changed Travelopia with machine learning. It’s not been about platforms, technology, or scale. It’s been about what our data definitions should be, what a team should look like, and how we should use data insights across our organisation.

Full transcript

Sree Balakrishnan

All right. Good morning, good afternoon, good evening. Fantastic to be in the DevOps summit. I am Sree Balakrishnan, one of the technology directors at Travelopia. Recently we worked with Equal Experts and Simon on getting a machine learning project done. We call it a machine learning project, and we thought it would be an interesting case study that some of you could take learnings from. We recently presented this in Scotland; it was well received. Hopefully we will be able to share some bits and learnings from it in this conference that you can take back to your work. Simon, do you want to introduce yourself?

Simon Case

Yeah, thanks, Sree. Hi, I'm Simon Case, and I head up Equal Experts' data service. I love this piece of work. It's a really nice piece of work applying what we call MLOps. I'm sure you've all heard that term. Sree's got a great story to tell, so I'm going to hand over to him.

Sree Balakrishnan

Right, excellent. Usually this is in person and more interactive, but let's see. I'm assuming you will all be asking a lot of interesting questions. We will post the slides. Let's go to the second slide, Simon.

Let's run through some of our use cases. A bit of introduction about Travelopia: Travelopia is a group holding company, and we have multiple brands that are popular across the US and UK. We do experiential travel in a variety of categories. We have ships that take people to Antarctica; we have boats that take you to the backwaters. We have boutique luxury travel, private tailor-made travel, adventure travel, and whatnot.

In total as a Travelopia group, we take travelers to more than 150 destinations. More than 500,000 guests are welcomed every year. So that's tons of data for us to analyze and understand what they can do with us now and later, and I'll explain more about the value of so many guests and machine learning in our context. We have more than 2,000 colleagues across the world, and we are building back after the pandemic to a larger team. Long story short. Next slide, please.

It all started with a bit of history. It is a bit of boring history for the engineers, but I think sometimes history is important to understand why we did what we did. Travelopia is owned by a private equity company, and private equity companies generally get the top four or top five consulting firms to come and audit and suggest what we should be doing in terms of technology. We were not different. There was a large consulting firm that came in, understood what we were doing well, what the challenges were, and what areas we could do better.

One of the areas our consulting firm identified was the use of data. You saw the previous slide: we have 500,000 guests coming in. As a travel company, we get guests through four channels. The first is PPC, which is paid Google campaigns. The second is organic, the third is email campaigns, and the fourth is repeat customers. The less we spend on Google PPC, the healthier it is for our EBITDA. Most companies of our type are constantly looking for ways to nurture the guests they already have.

That's where the opportunity came from. If you have 500,000 guests traveling every year over the last five years, there is quite a lot of data. Of course, it's not big data, but it's a sufficient amount of data for us to understand the patterns, the travel factors, what we have done for these guests, and what more we can do for them. Collect data from the various systems, analyze it, use machine learning models, and predict what we could offer guests next: that's the recommendation the large consulting firm came up with.

I like to think of the journey as iteration one and iteration two. Iteration one was the journey before Simon's team and I started looking at it. Iteration one was focused on technology. We assumed we had big data, so we started building a data lake and databases. We used best-in-class Amazon databases like columnar DBs and whatnot, assuming that we had big data. I'll talk later about why I'm saying that.

It was extremely technology-focused. The focus was on building machine learning models and then saying, "You should use the models," not on business adoption or business use cases. Between me and Simon, we categorize this as iteration one: big bang, large team, multiple promises, classical waterfall. Build everything, and then finally magic. As expected, of course, it didn't go well.

That's when we got together and said, "We're spending a lot of money. We're putting in a lot of engineering effort. How do we now pivot from here?" It doesn't make sense to continuously spend money on the same thing when we don't see the output. I'll explain what we mean by output.

There was a very nice idea in terms of the need to use the data, but the project was not internalized by the internal team. Technology kept pushing things to the businesses. Remember Travelopia is a holding company, and you have 10 different brands, which means you have 10 different marketing directors. Each one was thinking about data and usage of data very differently.

Technology pushing to the business by saying, "Hey, we have a great machine learning model," did not help, because the marketers could not understand how the model was predicting. Take the repeat model as an example. We said we had a great model that predicts repeat behavior. Imagine one wrong prediction. Very often in machine learning you will have accuracy of 50%, 60%, 70%, say 90%. Imagine we have accuracy of 90%: the 10% we get wrong should still be explainable. In iteration one, we couldn't explain why the model was spitting out this customer's preference as XYZ versus ABC.
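To make the explainability point concrete, here is a minimal sketch of a prediction that a marketer could question line by line. It assumes a simple logistic score; the feature names, weights, and bias are hypothetical and are not Travelopia's repeat model.

```python
import math

# Hypothetical weights for a transparent repeat-propensity score (assumptions).
WEIGHTS = {
    "past_trips": 0.8,
    "years_since_last_trip": -0.6,
    "opened_last_email": 0.5,
}
BIAS = -1.0


def explain(features):
    """Return (probability, contributions ranked by absolute size)."""
    # Each feature's contribution is its weight times its value, so the
    # prediction can be broken down into named, additive parts.
    contributions = {name: WEIGHTS[name] * features[name] for name in WEIGHTS}
    logit = BIAS + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-logit))
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return probability, ranked


guest = {"past_trips": 3, "years_since_last_trip": 1, "opened_last_email": 1}
probability, ranked = explain(guest)
for name, value in ranked:
    print(f"{name}: {value:+.2f}")
print(f"repeat probability: {probability:.2f}")
```

With a breakdown like this, a surprising prediction can be traced to the feature that drove it, which is the conversation the marketers were missing in iteration one.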

That's when we got together as a group and said, "Let's do it." Luckily, COVID also came in handy, to be honest, because COVID forced us to think about a better way to do it. Can we do it faster, leaner, better, cheaper? All those questions started, and that's when we ended up pivoting it.

Comparing iteration two to iteration one: the earlier version had a large team with a set of data engineers whose job was to cleanse the data, a separate machine learning team, and a separate business team. The first thing we did in iteration two was say, "It doesn't make sense to silo the data team, the machine learning team, and the business team." In the past, this meant three different conversations. The business people had separate conversations, the machine learning people had separate conversations, and the data people had separate conversations. It was very siloed, and the outcomes used to take much longer.

In iteration two, we said, "Why don't we put together some product managers who can try to place the score, a bunch of data engineers, a bunch of data scientists, and a couple of QAs?" Six or seven members, like a SWAT team, came together to deliver outcomes. Instead of focusing on the machine learning model, which was the focus in version one, we started saying, "Let's really double down on the business outcomes."

We spent a lot of time thinking through with the business and the brand: what exactly would you enjoy from an outcome standpoint? If we give a certain type of data, how do you start using it? Even before building a machine learning model, we spent days together with the business and said, "Imagine we give you predictive analytics on who could potentially repeat. First of all, what would you do with that data?" That meant the marketing team could start sharing their thoughts on customer journey, email or not, and so on.

We completely changed our approach: forget technology, let's focus on outcome. A data lake sometimes tends to be a bit too much because it can be a whole lot of data. We said, "Why don't we selectively build data pipelines where needed?" Of course, behind the scenes we might be using a data lake, but the focus again was not just the data lake. We said, "We will use whatever technology we need, but to generate a particular model or outcome, we will use a data pipeline that will handle anything needed for it," not separate siloed things doing data pipelines and models.

In the same team, one member was doing the data pipeline and another member was doing the data science. The interaction and exchanges between these two were a lot easier, and the self-organizing squad was the focus.

In iteration one, we were quite dependent on GUI tools, graphical user interface tools. Again, a lot of proprietary big mammoth tools solve a lot of your developer headaches in starting a new project. But one of our learnings was that if you are not able to do CI/CD, if you are not able to check in something, do some testing on code, and push it to production faster, the cycle time is way too long, because you are again using a siloed tool that is not integrated into the development family.

We called this: instead of a prescribed architecture, let's have an evolving architecture. One evolution was: can we have command-line tools that can be integrated with Jenkins? As soon as the code or the data gets updated, the trigger kicks in automatically, does all the validation, pushes it to a machine learning model script, and then immediately kicks off the next activity. With the GUI tool, after all the scheduling, I think we ended up taking eight to nine hours of bulk processing early in the morning. Many times with the eight-hour processes, it suddenly stops and you don't know why it failed. In the CLI world, if something breaks in between, you can see exactly where in the logs and say, "This is an area where we need edge-case scenarios or enough test cases to ensure that the incoming data is cleansed even before the tools are running."
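As an illustration of the validation step described here, the following is a minimal sketch of a command-line style check that a CI job could run before any model script. The column names, the rules, and the function names `validate_rows` and `run` are assumptions made for this example; the talk does not describe the actual checks.

```python
import csv
import io
import logging

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("validate")

REQUIRED = ("guest_id", "brand", "booking_value")  # hypothetical columns


def validate_rows(rows):
    """Return a list of readable problems; an empty list means the data is clean."""
    problems = []
    for line_no, row in enumerate(rows, start=2):  # line 1 is the CSV header
        for col in REQUIRED:
            if not (row.get(col) or "").strip():
                problems.append(f"line {line_no}: missing {col}")
        value = (row.get("booking_value") or "").strip()
        if value:
            try:
                if float(value) < 0:
                    problems.append(f"line {line_no}: negative booking_value")
            except ValueError:
                problems.append(f"line {line_no}: booking_value is not a number")
    return problems


def run(stream):
    """Log every problem and return a process-style exit code (0 = clean)."""
    problems = validate_rows(csv.DictReader(stream))
    for problem in problems:
        log.error(problem)
    return 1 if problems else 0


SAMPLE = "guest_id,brand,booking_value\n1,A,1200\n,B,abc\n"
problems = validate_rows(csv.DictReader(io.StringIO(SAMPLE)))
exit_code = run(io.StringIO(SAMPLE))
```

In a real job the script would end with something like `sys.exit(run(sys.stdin))`, so that a non-zero exit code stops the pipeline and the log lines point at the exact rows that need cleansing.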

No disrespect to GUI tools. I'm sure a lot of people would have loved using GUI tools, but we were not successful using GUI tools, and we had to go back to CLI tools.

The most important point was that we had to agree with the executive sponsor, the business leader, like the marketing head, on iterative development. This definitely was challenging. I've told this story many times to Simon. The new ML project was essentially, "New data engineer, new data scientist, come to our cloud platform, start writing scripts, and start figuring out how quickly we can get things out." That was a bit scary, but executive sponsorship mattered.

When we moved away from the waterfall model to the agile model, one of my comments was that iteration one was quite bulky: we used this tool, we had that tool. I was saying there is a smarter, cheaper, better way to do it, so why are we doing it in such a bloated way? But it was perceived by the executive management as the new hairdresser coming in and saying the previous hairdresser had done a completely bad job. That was not my intention. We were approaching it purely from an engineering and outcomes standpoint: can we generate outcomes so that the business can adopt them? That was the focus, versus, "Here is great engineering, here is a great data lake, here is a great columnar DB." We focused less on that and more on business adoption and successful use cases. After the first iteration, when we were able to showcase it, the executive adoption of the approach started getting better.

The slide here is a quick view of what happened in iteration one and iteration two. Iteration one had roughly a 40-member team. We built three models for about two brands, with very little adoption or buy-in from the business. It did make people aware of CRM journeys, customer journeys, customer mappings, and whatnot. I won't say it was a complete disaster, but from a technology standpoint, for the money we spent, the value was low.

Iteration two needed just six people. The team was small and slick, and we were able to do 10 models for roughly four or five brands in about nine to 12 months. The first model took time for the team to settle in; I remember the first model took about four to five months. From then on, we were able to deliver a new model every two to three weeks in production. The first one was tough, not because it was complex or crazy technology, but because it was my first machine learning project, my team's first machine learning project, and the business's first machine learning project, so there was a lot of learning involved. Once we figured out the method to the madness, we were able to move much faster.

As a side effect, in iteration one the cloud cost used to be close to a million dollars. The slide mentions that cloud cost went down from $1 million to $100,000, another example of using just-in-time command-line-based tools and the benefits we started to see. More importantly, we measured the benefits for the business. Using the customer journeys based on the machine learning models we built, we were able to prove almost 21% incremental overall business because of the models.

This is one of my favorite slides. The most pleasing thing to share is that the business was 100% using it. That's even better.

Simon Case

Thanks, Sree. This is over to me now in terms of the models. For the people here who are interested in the models and technical details, just before I go to that, I love that story. It's such a good story about why being lean and being agile and applying DevOps techniques makes a difference. Starting with that steel thread and using that to create your paved road so that you can spin out more models is a really good example.

Onto the models part, which is really exciting: what are they doing? What's the machine learning bit? The first one the team created is what they call the in-market model. What was that about? Basically, you've got data spread through a whole bunch of channels, as Sree already mentioned: PPC, pay-per-click, social media, all this sort of stuff. What the business noticed was that there is a lot of this traffic. Can they make the most of it? What can they do to understand better what their customers want? What does this traffic look like? What insights can we derive from it?

The in-market model is an example of this. Fundamentally, what it does is look at people who are visiting the site or who have recently visited the site and work out how interested they are in buying right now, what we call their propensity to buy. Are they interested in buying or just browsing? Is there some way you can tell that?

Then, if we think they are interested in buying, what is it they want to do? What sort of thing do they want to purchase? There are lots of really interesting challenges around that. The typical stuff you see: you've got to know who's browsing first off, and you only know that about 30% of the time. But once we have this information, once we know, "Okay, we have somebody here, they look like they are interested in buying, they look like they are interested in these sorts of products," then that information can get sent to the marketing team. We can tell them what sorts of products they are interested in. Are they interested in this particular yacht, for example? Are they interested in this particular destination? What time are they interested in? When are they looking to go on holiday?

All those sorts of things mean the marketing team now has very specific data. From raw data that is very difficult for people to work with, they get very specific enriched data. They know more about the customer, and it's immediately actionable. This can be used by marketing teams to provide those nudges that make the difference. Perhaps make an offer to them. Perhaps contact them directly to close the deal.

Sree Balakrishnan

One more point on in-market. Initially, maybe look at the UI. If the customer is hanging out on a careers page, for example, we definitely know that he's not an active customer, so we score minus. We could even start with a simple mathematical model. It doesn't need to be anything complex. We can model it with simple maths: someone who spends X amount of time on X number of pages scores higher; someone who did not spend time on the relevant pages scores lower. Let's see what the marketing team is doing with that information, and then we can always use a model to make the prediction more accurate.
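The simple scoring described here could be sketched as follows. This is only an illustration: the `Visit` fields, the point values, and the caps are assumptions chosen for the example, not the in-market model the team built.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    seconds_on_site: float
    product_pages: int   # pages about trips, ships, destinations
    careers_pages: int   # pages that suggest a job seeker, not a guest


def in_market_score(visit: Visit) -> float:
    """Hand-written score: more time and product pages raise it, careers pages lower it."""
    score = 0.0
    score += min(visit.seconds_on_site / 60.0, 10.0)  # 1 point per minute, capped at 10
    score += 2.0 * min(visit.product_pages, 5)        # 2 points per page, capped at 5 pages
    score -= 5.0 * visit.careers_pages                # penalty per careers page
    return score


browser = Visit(seconds_on_site=600, product_pages=4, careers_pages=0)
job_seeker = Visit(seconds_on_site=120, product_pages=0, careers_pages=3)

print(in_market_score(browser))     # 18.0
print(in_market_score(job_seeker))  # -13.0
```

Because every number is visible, the marketing team can critique the rules directly, and a learned model can replace the hand-set weights later if it proves more accurate.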

Instead of getting lost in the machine learning model, what we were thinking is: what is the cheapest, fastest, best way to give something to the brand so that they can critique it? Critique in some ways is good, because we learn what is right and what's wrong. Critique from a brand or marketer is a blessing. It's feedback. They might come back and say, "Hey, your model is not refined enough." We work on getting the model better. Or, "What you're doing doesn't seem to be giving the outcome." Okay, fine. What else were you looking for? We can start having those engaging conversations.

We focused on how to get this done in the shortest possible time rather than waiting for the perfect model. Just a tip: you don't wait for a perfect start. Get started, and get some input from the marketers so that you can build the right models.

Simon Case

That's a really good point. It's definitely a thing in machine learning where people can get blinded by the beauty of their models. A more complex model might feel better if you are a data scientist creating the model, but actually it might not be so robust in practice. It might be more overtuned to the training data rather than the ongoing stuff.

Simple models are great. I've mentioned to you before, Sree, that people quite often sidle up to me and say, "I'm not sure I've been doing machine learning, but I've done a thing and it's got linear regression in it. Does that count?" I say, "Yeah, that's fine. Start simple. Use the algorithm that works for you if that's giving you the results that you need. Maybe it is a linear relationship, in which case you've done the perfect thing." Don't feel afraid because your model is simple. You're not using deep learning or anything like that. You don't need to. You can use very simple ones to get good results.

Getting a simple model into practice first, getting it into play so that the users can give you feedback on how it's working for them and what it's doing for them, was so important in the work that you guys did. It was really working with the people to understand what they needed.

The second model I'm going to talk about is called the repeater. This is about people who are not actively engaged at the moment, but who might be interested in products at a later stage. It tries to work out whether they are likely to buy from us again, what sorts of things they are interested in, when they are likely to buy, and all those sorts of things. It looks at the previous behavior of a customer, trying to work out a good time to engage and what sort of offer you should provide them with. It scores when they are likely to buy again, and it looks at other customers to understand what sort of products they might be interested in.

The really interesting thing about this model was that there were actually two models developed to address this problem. One model was very data driven, very classic machine learning. The other model was heuristic: set rules. You probably look at the data a little bit to work out what the rules are, but fundamentally somebody is saying, "I think this is the rule." They tried both out. The best one was heuristic. If the heuristic model worked the best, that's fine. If you've tested it and you're doing it experimentally, then that's fine. Perfect. Please go ahead and do that. Don't be afraid. Always experiment.
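The "try both and measure" approach can be sketched like this. Everything in the example is an assumption made for illustration: the rule, the single-feature learned threshold standing in for a "classic machine learning" model, and above all the toy data, which is invented and says nothing about which approach works better on real guests.

```python
# Each row: (past_trips, months_since_last_trip, repeated). Invented toy data.
TRAIN = [
    (3, 6, True), (1, 30, False), (2, 12, True),
    (1, 8, False), (4, 20, True), (1, 40, False),
]
HOLDOUT = [
    (2, 10, True), (1, 5, False), (3, 30, False), (5, 18, True),
]


def accuracy(predict, rows):
    """Share of rows where predict(trips, months) matches the known outcome."""
    hits = sum(1 for trips, months, repeated in rows if predict(trips, months) == repeated)
    return hits / len(rows)


def heuristic(trips, months):
    """Hand-written rule: a repeat guest who travelled within the last two years."""
    return trips >= 2 and months <= 24


def fit_threshold(rows):
    """Data-driven stand-in: pick the recency cut-off with the best training accuracy."""
    candidates = sorted({months for _, months, _ in rows})
    return max(candidates, key=lambda t: accuracy(lambda _trips, m: m <= t, rows))


threshold = fit_threshold(TRAIN)


def learned(trips, months):
    return months <= threshold


# Both candidates are judged on the same held-out rows, never on training data.
results = {
    "heuristic": accuracy(heuristic, HOLDOUT),
    "learned": accuracy(learned, HOLDOUT),
}
print(results)
```

The design point is the shared holdout set: whichever candidate scores better there goes forward, whether it is a rule someone wrote down or a fitted model.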

I want to finish by saying what mattered most in both of these models was that we were working closely with the sales and marketing teams from very early on to understand how they would be used. There's a thing that can happen sometimes with machine learning where there's a team that creates a model and they are a long way off from the people who are using it, and that never works. Understand the customer journey of the users, work closely with them, and keep the team bounded in their ambitions. That's the right way to start. Don't start too ambitious. Anything else you want to finish with, Sree?

Sree Balakrishnan

Simplify the aspects of a machine learning project. Because I came from a non-machine-learning background, when I first heard of it, I felt like a complete dumb guy and I didn't think I would ever do machine learning. That's when I first came to you, Simon. I clearly remember coming like a kid and saying, "Simon, teach me. What is this machine learning all about?" You simplified it for me at that time.

Some of you are already doing it, fantastic. I hope you get some learning. If you're starting for the first time, think of it as three different boxes, not three different teams. The first part is the data pipeline. To be honest, you don't even need data pipelines; you can start with CSV. Don't bother too much about the data pipeline at the beginning.

The second part is, once you have clean data, spending a lot of time to ensure you can take it to the machine learning model. In the starting phase, maybe don't even start with a machine learning model. Start with simple mathematical calculations, summations, or simple logic. Define how you would do any of this prediction without a model. Start with that.
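A first version along these lines, a CSV and a few summations with no pipeline and no model, might look like the sketch below. The column names and the sample bookings are made up for the example.

```python
import csv
import io
from collections import defaultdict

# Stand-in for an exported CSV file; the rows are invented.
BOOKINGS_CSV = """guest_id,brand,booking_value
g1,brand_a,1200
g2,brand_a,800
g1,brand_b,2500
g3,brand_b,400
g1,brand_a,900
"""


def summarise(stream):
    """Count trips and sum spend per guest using plain summations."""
    totals = defaultdict(lambda: {"trips": 0, "spend": 0.0})
    for row in csv.DictReader(stream):
        guest = totals[row["guest_id"]]
        guest["trips"] += 1
        guest["spend"] += float(row["booking_value"])
    return dict(totals)


summary = summarise(io.StringIO(BOOKINGS_CSV))

# Rank guests by number of trips, then by spend: a crude "who might repeat" list.
ranked = sorted(
    summary,
    key=lambda g: (summary[g]["trips"], summary[g]["spend"]),
    reverse=True,
)
print(ranked)  # ['g1', 'g2', 'g3']
```

A list like this is already something the marketing team can react to, which is the point of starting before any model exists.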

Number three is how the actual end user, in this case the marketing team or sales team, is going to consume what you are doing. For DevOps people, the first part is how we get the data, organize the data, and cleanse the data. The second is the data scientists, the people who have Python and R skills. The third part is that you also need a CRM experience manager or a customer experience manager to define what that journey looks like, especially in our case. The third can be replaced with one of your applications or a team.

So think of it as boxes and start building a squad to fill in all these areas. Don't make them siloed. Let them talk every day. Use the principle of agility and self-organizing. There will be some failures on the way.