Each year, brands budget huge sums of money for sports team sponsorships, but few methods exist to validate those expenditures. Peter Lenz and Peter Ibarra join Caroline Allen for a conversation on how brands and sports teams can use data and insights to measure sponsorship effectiveness and ensure that their advertising is hitting a home run.




Caroline Allen, Peter Lenz, and Peter Ibarra from Dstillery

CAROLINE ALLEN:  Welcome back to the DS Without the BS podcast where we help demystify data science, AI, and machine learning for marketers. I’m Caroline Allen, Content Manager here at Dstillery, a predictive marketing intelligence company and host of our channel. Today we’re rejoined by the dynamic duo, Peter Lenz, Senior Geospatial Analyst and Peter Ibarra, Senior Analyst on our data science and analytics team. Last time we spoke to them about data and political predictions. With the start of baseball season, we thought it would be a good opportunity to talk sports. Each year, corporate brands budget huge sums of money for sports team sponsorships, but few methods exist to validate those expenditures. Thanks for joining us.

PETER LENZ: You’re welcome.

PETER IBARRA: No problem.

ALLEN: So, let’s take it back a couple of years. When I first came to Dstillery, you guys were working on a paper where you were looking at MLB baseball stadiums and trying to understand how brands could validate their sponsorships. Can you tell us about the work that you did?

LENZ: Sure, so you have to step back a little bit first and realize that Dstillery is a company in New York City and New York City is a baseball town. You walk around the office, you see various bobble-heads sitting around on people’s desks, and jerseys, and shirts for all different sports teams. We love baseball. I mean, how could we not? This is New York City, we have the two-time World Series champion Mets, there’s also some other team in the Bronx, I forget the name.

IBARRA: I think they’re triple-A.

LENZ: Yeah, something like that.

IBARRA: Something like that.

LENZ: And so it’s a topic of discussion here, a lot. And the genesis of this project was one of those talks, and literally Pete and I were wondering who the heck goes to Spring Training games. So, we were speculating about this and joking about it, and then I think the light bulb went off over both our heads at the same time. Hey, we collect lots of data. We can factually answer this question. So we quickly mapped out a bunch of stadiums from the Grapefruit League, which is one of the Spring Training leagues, in our system, and collected just a little bit of data and it worked. And from there we very quickly were able to go from understanding who goes to Spring Training games at a minority of stadiums, to profiling every single MLB team across an entire season using geo-signals.

IBARRA: Yeah, so as Lenz was talking about, we wanted to kind of figure out what we could do with an entire season’s worth of data. I know we both had conversations like “Alright, well could we figure out a way to just go to the MIT Sloan Sports Analytics Conference?” And that’s kind of when we discovered that there was the business of sports track, and I think as we were going through and looking at some of the initial Spring Training data, you know, given that we work for advertisers and different brands, that’s kind of where the idea came from. Can we start to measure the impact that in-stadium sponsorships have on a fan’s experience? So if we were to start understanding from April, May, June throughout the months, do the Bank of America suites at Yankee Stadium actually start to impact people’s behaviors when they’re outside of that experience at the ballpark? And that’s kind of where our idea was: can we come up with something, an initial metric to understand and start to quantify the actual impact of these multi-million dollar sports sponsorships?

LENZ: So what was really crazy to us when we started to do the research for this and look for prior work is that apparently nobody had done this before us. People spend insane amounts of money to sponsor sports teams, but then use extremely simple methods to try and measure whether that expenditure was worth it. And to us that seemed bizarre; we work in an industry where people worry about a tenth of a penny on an ad, and here we have people who are spending tens of millions of dollars on sponsorships and basically never actually looking to see whether that was worth it or not.

IBARRA: Yeah, I think most of the time it was how many people are in attendance, so ticket sales, and I think the other side, you know the dual sponsorship with the broadcast is how many viewers are actually watching these games. That’s how they’re quantifying the impact of their sponsorship.

ALLEN: So they’re looking at viewers or attendance as a whole; they’re not looking at, as we like to say, those subpopulations, the profiles of people who are actually attending and whether they align with their brand or not.

LENZ: It blew our minds how little math there is behind the thinking for these things. So we decided to make it better.

ALLEN: So, you talked about how brands were not using the best metrics to measure their sponsorships. Take us through the process of how you collected this information, what that data looks like, and the amount of time it took to do that.

IBARRA: Yeah, so I mean as we kind of touched on previously, I think Spring Training was kind of like our test to get an idea of how we could potentially do something like this for the six month season that goes on, right? And so the first step was I think this geographic component of basically geo-fencing each of the stadiums. And I know obviously with all 30 stadiums we went through the first step of actually making sure we had the proper latitude, longitude and then drawing a custom geography around each. Because if you’re going to a game in Yankee Stadium for example, you’re in a very densely populated area so you want that geo-fence to be right around the stadium. You go to some other places like from my home town if I go to Angel Stadium it’s within a gigantic parking lot where people are going to be there early for the games, well I don’t know how many people, but there’s going to be some people there early for the games to do some barbecuing or tailgating and stuff. And that’s kind of where we took a little bit more leeway to expand it out, to capture some of that audience that we know are there that are going to be there for the games. So that’s kind of like the first test of going through and seeing for each stadium, kind of drawing that custom geography to ensure that the people that we’re getting are the ones that are actually going to the games and not all the noise that could be around it.
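The geo-fence test Ibarra describes can be illustrated with a standard ray-casting point-in-polygon check. This is only a minimal sketch, not Dstillery's implementation: the function name, the fence coordinates, and the sample points are all hypothetical, and it treats latitude and longitude as flat coordinates, which is a reasonable approximation only for areas as small as a stadium and its parking lot.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (latitude, longitude) in decimal degrees


def in_geofence(point: Point, fence: List[Point]) -> bool:
    """Ray-casting test: return True if `point` lies inside polygon `fence`."""
    lat, lon = point
    inside = False
    n = len(fence)
    for i in range(n):
        lat1, lon1 = fence[i]
        lat2, lon2 = fence[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (lat1 > lat) != (lat2 > lat):
            # Longitude at which this edge crosses the point's latitude.
            cross_lon = lon1 + (lat - lat1) * (lon2 - lon1) / (lat2 - lat1)
            if lon < cross_lon:
                inside = not inside
    return inside


# Hypothetical fences (made-up rectangles, not real stadium outlines).
# A tight fence for a ballpark in a dense neighborhood:
TIGHT_FENCE = [(40.000, -74.000), (40.000, -73.998),
               (40.002, -73.998), (40.002, -74.000)]
# A looser fence that also takes in a surrounding parking lot:
LOOSE_FENCE = [(39.998, -74.002), (39.998, -73.996),
               (40.004, -73.996), (40.004, -74.002)]

seat = (40.001, -73.999)         # inside both fences
parking_lot = (40.003, -73.997)  # inside the loose fence only

print(in_geofence(seat, TIGHT_FENCE))
print(in_geofence(parking_lot, TIGHT_FENCE))
print(in_geofence(parking_lot, LOOSE_FENCE))
```

The two fences mirror the trade-off in the conversation: the tight polygon keeps out neighborhood noise, while the loose one captures tailgaters at the cost of a larger footprint.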

LENZ: It was a nightmare to draw the geo-fence for Fenway because it’s right in the middle of a city. The second component of this was temporal. So we only wanted to collect people who were at baseball games. Baseball stadiums are used for all sorts of other things when they’re not playing games. You could have a flea market, you could have a hockey game going on in what is normally a baseball stadium. We don’t want those people, so we had to actually go out and take the baseball schedule and convert it into code that told our system when to turn on and off data collection for these stadiums. We were looking at specific locations at specific times. So we used Dstillery’s data collection that we use every day in our system; we took these geo-fences and these times and we collected devices that showed up at stadiums during those times. Now you might think we got lots and lots of devices per stadium, but that’s not true. Compound probability says that all these different things have to happen before we collect a device. You have to be in that stadium, so that limits us to say, 50,000 people for a game. Those people have to have their phones out; those phones have to be looking at apps or websites that we are connected to. You have to see an ad that has to come through our bid stream. All these different things have to happen before we see a device, and so at a game that lasts several hours and has 50,000 people at it, we were seeing 30 to 40 devices at a time. So, for every game we’re collecting a relatively small amount of data. The great thing about baseball is it’s played almost every day, so there were lots and lots of chances to collect data times 30 teams, or technically times 15 because every game has two teams at it. And we collected that data and we let it collect for a long time. It took us an entire baseball season to collect data, which also made it very scary at the beginning when we were setting up the experiment. We had to create the geo-fences and the temporal aspect at the beginning and get it right once, because if we messed up data collection, that’s it; it’s not like you can get MLB to start the season over for you. If we messed it up, we’d have to wait an entire year for the next baseball season to start over again, so that was fun.
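The temporal side, converting a game schedule into on/off collection windows, might look something like the sketch below. The one-hour lead time, the four-hour duration, the function names, and the sample dates are all assumptions made for illustration; the conversation does not say what values or code were actually used.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

Window = Tuple[datetime, datetime]  # (collection start, collection end)


def collection_windows(first_pitches: List[datetime],
                       lead: timedelta = timedelta(hours=1),
                       duration: timedelta = timedelta(hours=4)) -> List[Window]:
    """Turn scheduled first-pitch times into collection windows.

    `lead` (time before first pitch) and `duration` (time after it)
    are assumed values, not figures from the study.
    """
    return [(pitch - lead, pitch + duration) for pitch in first_pitches]


def during_game(seen_at: datetime, windows: List[Window]) -> bool:
    """Keep a device sighting only if it falls inside some game window."""
    return any(start <= seen_at <= end for start, end in windows)


# Hypothetical schedule with a single afternoon game.
schedule = [datetime(2015, 4, 6, 13, 5)]
windows = collection_windows(schedule)

print(during_game(datetime(2015, 4, 6, 14, 30), windows))  # mid-game sighting
print(during_game(datetime(2015, 4, 7, 10, 0), windows))   # next-morning sighting
```

A sighting counts toward the stadium audience only when it passes both filters: inside the geo-fence and inside a game window.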

IBARRA: (laughing) Yeah, especially the time where I thought I almost deleted all the data.

LENZ: Yes (laughs). He didn’t, spoiler alert, he didn’t.

IBARRA: Good news was it didn’t happen, but man I was terrified for a good 20 minutes.

ALLEN: So that was in 2015, it’s 2018 now. How has the technology evolved since then, besides the fact that we have more data coming into our systems?

IBARRA: Well I think the biggest thing is the evolution of kind of some of the techniques and methodologies that we used for this thing. I mean, at the time we had been tasked on a few different projects to kind of do like these one-off events whether it’s Coachella or maybe like a raceway was having a big NASCAR event that weekend. And that’s kind of what this started from and I think the biggest change has been the fact that we’ve been able to turn a lot of these methodologies that we do into actual products that it’s not Lenz and I collecting this stuff and doing it manually. It’s things that are happening basically every day on thousands of models.

LENZ: So, it took us a year from conceptualizing this idea to writing a report for 30 locations. Every night, our Dscover Maps product does around three and a half million locations across thousands of audiences. This went from a one-off experiment to a product that powers real decision making at companies all over America, every day.

ALLEN: Some of our listeners may not know what Dscover Maps is, can you explain that?

LENZ: Certainly. So Dscover Maps is part of our Insights Portal. It allows us to take our audiences, and remember every audience at Dstillery is a behavior. It’s something that someone has consciously decided to do. We collect samples of the devices’ behavior and we project those samples into the real world using the digital signals and the geo-signals that we collect. So we can say things like “People who go to Shea …  Citi Field” … I’m an old school man, I still think of Shea Stadium. “People who go to Citi Field are more likely to have behavior food than people who go to New Yankee Stadium who are more likely to have behavior bar”.

ALLEN: And using Insights, so a lot of brands use data to kind of find new audiences, but this is really talking about how brands can use data to really listen and ingest information to make broader decisions for their company.

IBARRA: Yeah, absolutely. And I think that one of the cool things about this product other than the fact that it’s just giving you kind of like the raw numbers behind it. I think the cool thing is that it’s actually visualizing a lot of this stuff that Peter was just talking about, is that you can actually see on a map “Oh, this area of the country can be into surfing, whereas this area of the country is into skiing” and you can actually see where that’s happening. I think the interesting thing here is that it allows kind of large companies to zoom in on these micro-geographies and on the flip-side, it takes like some of these smaller companies that may not have access to national profiles of the customers that they care about. But they can kind of plug in and see it immediately and I think that the ability for it to scale for both large companies and small ones … the part that I like about it anyways is that’s what makes it so powerful is that if I’m a small company in Texas for example, I can see who my customers are and see where they would be if it was on a national level and I think that’s pretty cool.

LENZ: You can use that for other purposes, for instance if a company is expanding we can use these same technologies to tell them not just what parts of the country to expand to but even where they should specifically be locating retail locations. We have knowledge not only on say, an actual brand, but because we have millions of points of interest in the United States. Those points of interest are other stores and other brands that we are collecting using our data. We can tell you how your competitors are doing in areas as well. So if you are a store that sells yarn, we can understand where the other stores that sell yarn are, and also where the consumers who are purchasing yarn are. And pull that all together to build a model of where you should be locating your physical location, if anywhere.

ALLEN: And did you find anything that’s kind of interesting or an out-of-the-box audience when you were looking at the MLB study?

IBARRA: Yeah, I think there was a few of them. I know a lot of what a team is also trying to do in getting some of these sponsorships is trying to show the validity of their audience to whatever the brand may be. And I think that one of the ones that we were, when we were doing this is obviously car brands can be a big sponsor of various MLB clubs. And the obvious one that immediately popped up was Detroit. But then when we were looking at some of these things, I think another one was like Tampa and that’s not something we would have expected, but it’s something where if I’m a brand like Ford, I want to invest dollars but I want to make sure that I’m getting the most value for that. If I’m doing it in Detroit, I’m going to be competing probably with a lot of other carmakers. But I can find some of these other teams that would bring that same similar value, but maybe I don’t have to compete with as many brands so I can have that sponsorship value, but pay less for it. I think that was one. There’s some tech stuff that we saw in San Francisco but then also popped up in some other towns that were really surprising for us. And so, being able to do that for both teams and then on the other side even for brands, like brands want to do this but they want to be smart about it. Those were some of the cool findings that we were able to discover throughout this study.

ALLEN: So, I think a lot of us are going to be wondering what did you find out? Do sponsorships work? Are they a worthwhile expense for companies, or not?

IBARRA: I think there’s two things, really. First is that this data can absolutely be used as a way to develop a metric to measure that. But the second thing was that sponsorships do work, especially if a brand can be smart about truly integrating it in with the fan experience at the game. I think it was actually Bank of America, we were actually able to see people start to visit bankofamerica.com throughout the season as they were going to Yankee games, and I think that’s pretty impressive on two levels. One, you have a lot of people going to the Yankee games, but two, Bank of America is a national brand. It’s one that a lot of customers are going to go to on a daily basis. So to really see the increase in the propensity of it, you have to see a fairly significant amount of people take that action, and so the fact that we were able to see that at a location like Yankee Stadium was pretty impressive, because I can only imagine how much they spent for that type of access.
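One simple way to put a number on an "increase in the propensity" is a lift ratio: the share of stadium-goers who visit the sponsor's site divided by the share of a comparison audience who do. The sketch below is an assumed formulation, not the metric from the study, and the function name and counts are invented purely to show the arithmetic.

```python
def lift(exposed_visitors: int, exposed_total: int,
         baseline_visitors: int, baseline_total: int) -> float:
    """Ratio of the exposed group's visit rate to the baseline visit rate.

    A value above 1.0 means the exposed group (e.g. devices seen at games)
    visits the sponsor's site more often than the baseline group.
    """
    if exposed_total <= 0 or baseline_total <= 0:
        raise ValueError("audience sizes must be positive")
    if baseline_visitors <= 0:
        raise ValueError("baseline needs at least one visitor to compare against")
    # (exposed_visitors / exposed_total) / (baseline_visitors / baseline_total),
    # cross-multiplied to keep the arithmetic in integers until the final division.
    return (exposed_visitors * baseline_total) / (exposed_total * baseline_visitors)


# Made-up counts: 30 of 1,000 stadium devices visited the site,
# versus 200 of 10,000 devices in a comparison audience.
print(lift(30, 1000, 200, 10000))  # 3% vs 2% visit rate -> 1.5
```

As Ibarra points out, a national brand already has a high baseline rate, so moving this ratio noticeably above 1.0 takes a fairly large number of fans.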

LENZ: That’s why it’s important for brands and teams. For brands, it tells you the money I’m spending isn’t wasted. People look up to sports teams, they care about them, and if you become involved with them, you get a halo effect of being involved with that sports team. And for the teams it’s valuable because they have to go out and sell those sponsorships. And any number you have that shows that this really works, you can charge for that. It’s great for the bottom line of both parties here.

IBARRA: I’d like to add onto that as well is when you can find out more information about how a fan of a team is starting to interact with who you are as a brand, you can get more creative in how to maximize that partnership.

ALLEN: And we want to make sure to hit on the topic of transparency as well and to talk further on how we protect consumer privacy and how we collect our data in general.

LENZ: So the first way that we’re protecting consumer privacy here is we simply never collect anything that is personally identifiable information (PII). It just never even enters the walls of the building. We don’t have it; we can’t abuse something that we don’t have. Two, even taking the anonymized data that we do have, we take a lot of steps to ensure that we clean it up even more. For instance, we take the device IDs that we have and the location data that we collect, and you can never see anywhere in our system raw device ID and raw lat/long information in the same place. We hash one or the other in different parts of the database, and those two different silos can never be connected back. So we protect data by not having it, and what we do have we make sure is anonymized even further. There are parts of the system where we use hashing algorithms, which are a way to take a value, some kind of identifier, and change it in a way that it can’t be tracked back to the original identifier. The data science team, those of us who play with the data, aren’t allowed to even know what algorithms get used to hash, so that we can never be tempted to try and even figure out how to connect certain parts of the system to other parts of the system. There’s all sorts of different technological ways that we’re protecting the data.
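To make the hashing idea concrete, here is a minimal sketch using a keyed hash (HMAC-SHA256 from Python's standard library). The choice of algorithm, the key names, and the sample device ID are all assumptions for illustration; as Lenz says, the actual algorithms are deliberately not disclosed, even to the data science team.

```python
import hashlib
import hmac


def keyed_hash(value: str, key: bytes) -> str:
    """One-way keyed hash of an identifier (HMAC-SHA256, hex-encoded)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical secrets, one per silo. In practice these would be held
# by a separate system, out of reach of the analysts using the data.
BEHAVIOR_SILO_KEY = b"hypothetical-key-for-behavior-silo"
LOCATION_SILO_KEY = b"hypothetical-key-for-location-silo"

device_id = "example-device-id-123"  # made-up identifier

behavior_token = keyed_hash(device_id, BEHAVIOR_SILO_KEY)
location_token = keyed_hash(device_id, LOCATION_SILO_KEY)

# The same device gets unrelated tokens in each silo, so rows from the
# two silos cannot be joined by anyone who lacks both keys.
print(behavior_token != location_token)
```

Within one silo the token is stable, so behavior can still be aggregated per device; across silos the tokens do not match, which is the separation Lenz describes.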

ALLEN: Baseball season kicked off a couple of weeks ago, I know we’re in New York, the Mets are kind of on fire right now.

LENZ: Hell yeah.

ALLEN: Who do you guys think are … let’s get a prediction out of you guys before we end this. Who do you think is going to win the World Series this year?

LENZ: Let’s go Mets!

IBARRA: I don’t think it’s going to be the Mets, I’ll put it that way. I look forward to their collapse in the next two months. It’s going to happen, and I can’t wait for his reaction about it.

LENZ: The art of being a Mets fan is the art of being disappointed.

IBARRA: (laughs) I like the Astros. I mean, they’re a really good team and obviously they won it last year but I think it’s going to be them again this year.

LENZ: But it’s going to be the Mets.

IBARRA: It will not be the Mets.

LENZ: It’s gonna be the Mets.

ALLEN: Well, I guess we will have to check back in … when is the World Series this year, this fall?

IBARRA: October.

ALLEN: October. Well, we have a long ways to go but we’ll check back with you guys to see how that prediction pans out for you. Well, I think that’s it for this episode. If you guys want more information on the work we’re doing here at Dstillery, or if you’re interested in seeing how audience insights can help you make more strategic business decisions, sign up for our free Insights Portal that Peter Lenz was talking about. That’s insights.dstillery.com. Pete and Pete are also happy to chat about any of the topics that we covered; you can shoot them an e-mail, correct?

LENZ: Especially the Mets.

ALLEN: Yeah, if you want to talk Mets, Peter Lenz is your guy. Shoot them an e-mail, that’s plenz@dstillery.com, and pibarra@dstillery.com. It’s also going to be in the transcript below. On our website at dstillery.com, you can find their whitepaper on MLB but also a lot of other articles that are helping marketers understand AI, machine learning, and data science. Follow us on Facebook, facebook.com/dstillery.intelligence or hit us up on Twitter or Instagram @dstillery. Don’t forget that’s Dstillery without the “I”: d-s-t-i-l-l-e-r-y. Talk to you soon.