We talk a lot about how to use data, but thought we'd step back and go over some fundamental questions marketers have about data. In the tenth 2 Guys and Some Data podcast, we discuss what data to capture, how to capture it, and of course the basic tenets behind using it effectively.
(Podcast Transcript)
Allen: Hey to all the data-driven marketers out there looking for new ways to reach unique prospects and better engage audiences. This is the tenth, yes, tenth podcast for the 2 Guys and Some Data series, giving you the nitty-gritty advice you need to actually make more money. I'm Allen Abbott,
Larry: and I'm Larry Kavanagh,
Allen: and today we're gonna talk about ... wait for it, data. Yes, we talk a lot about how to use data, but thought that today we'd step back and go over some fundamental questions, about what data to capture, how to grab it, and of course the basic tenets behind using it effectively.
So Larry, there's so much data out there, and so many different types, how do we classify this data in a way that's meaningful to marketers?
Larry: Well, there's actually a whole science about classifying data. In fact, there's a type of data about data, called metadata. So I think the point you make, about how to classify data in a way that's relevant to marketers, is really the germane point here. So, from a marketing perspective we usually think about data in three types. There's first-party data, second-party data, and third-party data. So, let's start with first-party data.
So first-party data is information that a business directly collects from a consumer. One of the advantages of first-party data is the business has really, inside of their own control, the accuracy of that data. So for example, if the business is collecting an email address, the business could actually send something to that email address and see that it doesn't bounce, that it's actually a good and valid email address. And so, it's a great source of information, where you can control the accuracy, and it's also free to collect. You don't have to pay somebody else for your own first-party data, it's just yours.
Allen, what kind of data do people collect? What can businesses collect that's first-party data?
Allen: Well, the traditional starting point is transactional data, which is still extremely powerful and still works incredibly well in predicting behavior. But there's also customer service data, there's data from primary research, there's data from social media, and there's lots, and lots, and lots, and lots of web browsing data. What are some of the limitations, though, of first-party data?
Larry: Well, while you can be sure of the accuracy of it, you're really only seeing just a tiny piece of what that consumer is about; you're seeing how they interact with you. For the most part, first-party data is really only applicable to people who have purchased from you; your own customers, or at least people who have inquired from you. It's also not very broad; you're not seeing, sort of, all of the United States, you're just seeing that, sort of, narrow piece of who you have. Now, the reason why there is second-party data is because of those limitations of first-party data.
So, second-party data is something kinda confusing-
Allen: Okay, so let me stop you a minute there. It seems very fuzzy, and some people, I think, even doubt the existence of second-party data, sort of like the Loch Ness Monster and Sasquatch. Exactly what is it, and where do you get it?
Larry: Well, it definitely exists. In fact, I would guarantee you that about every marketer listening to this has used second-party data, and is actively using it, just may not realize it.
Second-party data is collected the exact same way as first-party data; it's a business that is directly collecting information from a consumer. The difference is, if I am a company and I'm buying data, or renting data, or trading data with another company that directly collected that data, that's second-party data. I know who collected it, I know how they collected it. You know, a great example would be anyone listening to this who uses a Facebook Custom Audience. Those Facebook Custom Audiences are based on data that consumers have given directly to Facebook, and you can rent that audience, you can advertise to that audience, but you're actually using second-party data when you're doing it that way. So Allen, what are some other examples of second-party data?
Allen: Well I think there's opportunities to get to second-party data through creative partnerships. If you think about a company like Fitbit, and think about potential partners for them ... Title Nine; so non-competitive in terms of the products that they sell, but very similar audiences. And also, someone like Brooks Brothers could partner with Esquire Magazine; where, again, non-competitive but probably audiences with very similar interests. What are some of the limitations though of second-party data?
Larry: Well, sort of by definition, if you really wanted to get a complete view of your customer, or even fill in some pieces of your customer base, or your prospect base, you'd have to contact a thousand different second parties who had that, what to them is first-party data, but to you would be second-party data. And just the expense and the difficulty of trying to make trades, like you talked about with Fitbit and the cataloger, Title Nine, are tough to do. And so, hence the reason why third-party data exists, is to overcome the limitations of second-party data.
So, third-party data is data collected by some business, where they go out and they make those arrangements with a bunch of first-party providers, but they bring all that data together and compile it. And so, it gives you a much broader view of people, not just your own customer base but also people who are not your customers. Third-party data will encapsulate really all of the households in the United States. There is third-party data available about all of the households in the United States. And so, it gives you the opportunity to have, sort of in one place, a broader view of your customers. But Allen, what are some other examples of third-party data?
Allen: I think some often used third-party data are things like credit score, or home value, or homeowner versus renter. And it's readily available from data aggregators, compilers. The database cooperatives, that have been around now for 25 years or so, that fuel a lot of the list rental audiences, are a very good example of that also. But there is an issue to consider with third-party data, in terms of, it's not unique, it's not proprietary, which would lead me to be very cautious about providing my first-party data to aggregators and compilers, because you're giving away something that is proprietary to you and often for really not that much value.
Larry: You're right on. I mean, I'm the only one who knows my first-party data, unless I give it away, sell it, to a third-party aggregator, in which case everybody can know my third-party data. And so, there are times when it's gonna make sense to do that. There's certain businesses where it certainly makes sense to do that, but there's a reason why, for example, Facebook isn't gonna just sell me the data they have on their members. They wanna keep that data internal and control it. It's a big part of their value. And I think, as data has grown, businesses today have sort of lost sight of the fact that, or maybe even never realized, that their data is valuable, and they're giving it away in more ways than they ever thought before.
Allen: Okay, so now it's time for our trivia question, but I'm gonna first quote from Stephen Hawking, from a speech he gave at the Cantab Capital Institute for the Mathematics of Information, in November. And Mr. Hawking said, "In a dazzlingly complex world, you have to be able to discern the meaning in the mess. We are in a figurative and literal sense awash with what we call data. What we're only now fully realizing is twofold; the sheer quantity of data in any given domain, and the tools we need to make use of the information encoded in it. The power of information only comes from the sophistication of the insights which that information lends itself to. The purpose of using information in this context, is to drive new insight. For example, I may have taken great pleasure in talking to you about hairy black holes, as I did earlier this year, but the question is, just how hairy are they? What are the implications of the knowledge I believe I have now gained?"
Larry: My answer is, I have no idea. But that wasn't our trivia question. Our trivia question is ... You know, Hawking brings up a great point; we are awash in data, and if you think about where all this data is coming from, a lot of it is actually first-party data. But I'll get into that more in our next part of this. So Allen, the question for you, though, is, how much data will people have created by 2025?
Allen: Okay, that's a good question, and I'm gonna pause a little bit and think about that, multitask, while we go to the next section, and think about an answer to that one.
Larry: Okay Allen, now that we've defined the different types of data that are available to businesses, let's talk about how to integrate them in a sensible fashion.
Allen: I think everyone agrees that data integration is a necessity, and on a practical level integration is really storing and collecting data from disparate sources. There are just so many sources of data, so many channels, so many devices, and we want all this stuff to end up in one place. But there's really another level of question here, which is, what does the data integration really mean? And what it really means is, is putting it into a context and ascribing a value to it.
So, there's a lot written about the complete customer view, and that's all well and good, but there's a question of cost. So, how actionable is data versus how much does it cost? And just to give an extreme example ... if I wanted to find out a lot of stuff about Larry I could hire a private investigator, and I could pay the private investigator to learn a lot about Larry and come back with a report, and I'd have a lot of information about Larry that, you know, some of which might be useful or none of which might be useful, but I know it would be very expensive. And as an extreme, that's not what we're going here for.
What we really wanna do, is identify the data that has an ROI. You know, what data that we collect is meaningful enough, and it's predictive enough, to be of value. So, it's not about collecting everything, it's about collecting data that is actionable, that has an ROI, and then using it to further the interests of your business. But a question for you Larry, how do we link all this data together? You know, it sounds hard.
Larry: Well, before I get to that, I just wanna say that when you do go out and get a quote from a PI on collecting all that data about me, let my wife know, she'll give it to you for half price. We all come out ahead.
All right, so your PI example is a great example, because what's inherent in that is that you're trying to find out about individuals. And so, when we're taking this first-party, second-party, third-party data, the question is, how do we link it all back to an individual level, in order to be able to make marketing decisions? So, the ability to even think about stuff this way is really a fairly new concept. You know, you go back 200 years ago, there were no marketing databases. You might have had a handwritten customer list, but the shipping business wasn't trying to link that data to the handwritten customer list of a wool merchant. They were separate, there was no, sort of, thought to try to link them together.
You know, in looking at this, really the first real key here, first real thing I could find that allowed, at least in the United States, allowed for the ability to try to link someone's data together, was actually the social security number. You know, developed back in the 30's, the social security number was originally intended to link your payroll check into an account that then, when it came time to get social security, you could receive money; you know, they knew how to pay. It then became, really just a universal government ID, and it was adopted by a lot of other types of businesses as an ID. In fact, when you go to a doctor's office today, go to a dentist's office, they usually ask you for your social security number. They're using it as a way of linking your records.
It's not illegal for them to do that, and in fact you don't have to provide it; there's no requirement for you to provide it. But it is, sort of like a common business practice that they use for linking. Banks, this is required. Banks require your social security number when you open a banking account, and that is a requirement by the government, and it allows them to try to track down money. But outside of banks and government, you're not required to provide your social security number, and really, for the most part people tend not to, except for some reason in medical situations.
So retailers never really had the option. Merchants, businesses that sell directly to consumers, whether they be ... Really, whatever type of business they are. Travel. People tend not to give up social security numbers quite so easily. And so, what became sort of the primary key, so to speak, or the way from marketing perspective you link people together, for a long time was based on your address, based on where you live. So, some combination of your name and address. And people would form a match key around that, but that became the way you would ... You know, take that example I gave from a couple hundred years ago, where the shipping merchant was trying to link to the wool merchant. You know, for a long time that key was name and shipping address.
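The name-and-address match key described above can be sketched in a few lines. This is a minimal illustration, not how any particular vendor builds keys: the field names, the tiny abbreviation table, and the choice of last name, first initial, street, and five-digit postal code are all assumptions made for the example.

```python
import re

# Hypothetical abbreviation table; real address standardization is far richer.
STREET_ABBREVIATIONS = {"street": "st", "avenue": "ave", "road": "rd", "drive": "dr"}


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse runs of whitespace."""
    cleaned = re.sub(r"[^a-z0-9\s]", "", text.lower())
    return " ".join(cleaned.split())


def match_key(first_name: str, last_name: str, street: str, postal_code: str) -> str:
    """Build an illustrative match key from a name and a shipping address."""
    street_words = [STREET_ABBREVIATIONS.get(w, w) for w in normalize(street).split()]
    parts = [
        normalize(last_name),
        normalize(first_name)[:1],   # first initial only (an assumption)
        "".join(street_words),       # street with spaces removed
        normalize(postal_code)[:5],  # first five characters of the postal code
    ]
    return "|".join(parts)


# The same person as two separate businesses might have recorded them.
shipping_record = match_key("Jane", "Doe", "12 Harbor Street", "02108")
wool_record = match_key("JANE", "doe", "12 Harbor St.", "02108-1234")
print(shipping_record == wool_record)
```

A key this simple would also merge two household members who share a first initial, which is one reason production matching tends to be more elaborate than this sketch.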
Today though, that world is changing fast. Two keys that have emerged over the last, really 10 years, as a way of linking that disparate data are, your email address and your cell phone number. And cell phone number is, you know, people will give up that cell phone number in a heartbeat, not like they give up social security numbers. And the nice thing about a cell phone number as a linking device, as opposed to an email address, I might have five, six, 10 email addresses. I've had my cell phone number for a lot of years, can't even think about how many. How long have you had your cell phone number?
Allen: Long time.
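As a rough sketch of using a cell phone number as the linking key: the record layout, source names, and the assumption of US-style ten-digit numbers are all hypothetical, and the phone numbers and email addresses are made up.

```python
import re
from collections import defaultdict


def normalize_phone(raw: str) -> str:
    """Keep digits only; assume US numbers and drop a leading country code 1."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    return digits


def link_by_phone(records: list[dict]) -> dict[str, list[dict]]:
    """Group records from different sources under one normalized phone number."""
    linked = defaultdict(list)
    for record in records:
        linked[normalize_phone(record["phone"])].append(record)
    return dict(linked)


# Hypothetical records: one person with two email addresses, plus one other person.
records = [
    {"source": "web_orders", "email": "Pat@Example.com", "phone": "(555) 010-0199"},
    {"source": "call_center", "email": "pat.work@example.org", "phone": "+1 555 010 0199"},
    {"source": "loyalty_app", "email": "sam@example.com", "phone": "555-010-0142"},
]

linked = link_by_phone(records)
for phone, matches in linked.items():
    emails = sorted({m["email"].strip().lower() for m in matches})
    print(phone, [m["source"] for m in matches], emails)
```

The first two records carry different email addresses but collapse onto the same phone number, so linking on email alone would have treated them as two people.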
Larry: It's a great linking device. And so I would say, you know, if you're a business and you're not collecting cell phone number, you should, because it's becoming sort of the linking standard across the board. So, that's how you link it all together, but really, I guess the question is next, what's the point? Why do you get all this together? So, you know me, I'm a data modeler, and we're talking about marketing use of data. So, you know, it's trying to predict a response rate, trying to predict likelihood that someone is going to take some action.
It might be to compile how valuable a particular type of customer is. It might be to measure what I spend to market to a customer, versus what I can get back from the customer. It's really, I think, it might be to help me figure out what's the right way to communicate to them, what sort of advertising medium to use to communicate with them. It's really about, at the end of the day, for the marketing use of this data, is about optimizing marketing spend relative to, at the end of the day really, gross margin that a business can produce from their customers. But Allen, what have you seen about how people historically use those?
Allen: Well I think first-party data has historically been used to predict the behavior of existing customers. So, basic RFM, which is the standard for, going back 50 years, for segmenting a customer file as a, you know, this person bought three plus times from us, and they last bought 30 days ago, and they spent $200 a pop; high likelihood that this person is going to purchase from us again. And that's a very basic use of it, that has driven the direct response industry for a very long time.
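A minimal sketch of that kind of RFM (recency, frequency, monetary) segmentation follows. The cutoffs simply reuse the numbers from the example above (three or more orders, a purchase within 30 days, $200 per order); they are illustrative assumptions rather than industry-standard thresholds, and the `Customer` fields and segment labels are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Customer:
    customer_id: str
    last_purchase: date
    order_count: int
    total_spend: float


def rfm_segment(customer: Customer, today: date, recent_days: int = 30,
                min_orders: int = 3, min_avg_order: float = 200.0) -> str:
    """Label a customer by how many of the three RFM cutoffs they meet."""
    recency_days = (today - customer.last_purchase).days
    avg_order = customer.total_spend / customer.order_count if customer.order_count else 0.0
    hits = sum([
        recency_days <= recent_days,         # recency
        customer.order_count >= min_orders,  # frequency
        avg_order >= min_avg_order,          # monetary
    ])
    return {3: "high", 2: "medium"}.get(hits, "low")


today = date(2024, 6, 30)
customers = [
    Customer("A", date(2024, 6, 1), 3, 600.0),   # recent, frequent, $200 per order
    Customer("B", date(2023, 1, 15), 1, 40.0),   # lapsed, single small order
    Customer("C", date(2024, 6, 20), 4, 200.0),  # recent and frequent, small orders
]
for c in customers:
    print(c.customer_id, rfm_segment(c, today))
```

Counting how many cutoffs are met is only one simple way to combine the three dimensions; scoring each dimension by quintile is another common choice.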
Amazon, certainly, has made good use of first-party data to actually offer customers products that they believe they will be interested in based on what they've bought before, and they've taken that science to another level. It's not only that these people are likely to buy again, it's, but here's what they might likely purchase. Second-party data really has two uses; one is that you can use it to augment your own CRM data, but you can also use it to just, sort of broaden the audience that you can appeal to, especially if you're in the kind of partnership that we discussed before, where you're trading information with somebody who has a similar audience. And then, third-party data has typically been used to predict the behavior of prospects, and the likelihood of a prospect to purchase or to take some kind of action. So that's the history lesson, but Larry, has this changed?
Larry: It sure has. So, as you ... And I think you nailed it. You know, first-party data historically was only about your own customers, but there's really becoming sort of like two types of first-party data that you can collect today. One type of first-party data, the historical type, is when a business asks for information from a consumer, and receives that information back. So they ask for an email address, they receive an email address back.
The reason, though, why we have this explosion in data, that we were referring to on our trivia question, is, there's now the ability to collect information about what a consumer does when using a business' website, or when using just other services. So for example, if you have a cell phone, there's now data that can collect your location; where are you. If you think about the internet of things, you know, all of a sudden there's like a whole new class of this ... It is first-party data, in that it's being collected from a consumer, but it's being collected about a consumer's behavior rather than being questions that are directly asked. And that's, like I say, what's really causing the explosion of data.
What that's led to, is the ability to have first-party data not become just something you can use to understand more about people who have already bought from you, but it's become a source of prospecting; it's become a way of reaching new consumers, consumers who have not yet transacted with you. And I think that's the, you know, an incredibly important shift, in that we can now use our own first-party data to find prospects, in a way we never could before.
So, Allen, since I was referencing this explosion of first-party data, it's time to get back to the trivia question. Have you had enough time to think about how much data, and like I say, it's primarily this user collected, or collected from user data, is gonna be around in 2025?
Allen: Well I did some back of the envelope calculations Larry, and I came up with an answer of 2.47 to the 27th power squared terabytes. Is that close?
Larry: Actually no, but it took me a minute. You're off by about 20 orders of magnitude, but in the bigger direction. It's actually 163 zettabytes of data, is the estimate, and a zettabyte is a sextillion bytes. You know, you think about billion, trillion, quadrillion, quintillion, sextillion; each one is three orders of magnitude above the last. You did 10 to the 27th, which is way above that, and then you squared it, and so you went into, I don't know, the year 3020. But, I appreciate that you actually had an envelope to use.
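For anyone who wants to check the envelope math, here is a small sketch. It takes a zettabyte as 10^21 bytes and a terabyte as 10^12 bytes (the SI definitions). How to read the spoken guess is an assumption: the code tries it both as 2.47 x 10^27 terabytes and as that figure squared.

```python
import math

BYTES_PER_TERABYTE = 10**12   # SI terabyte
BYTES_PER_ZETTABYTE = 10**21  # SI zettabyte

# The 163 zettabyte estimate for 2025, expressed in terabytes.
estimate_tb = 163 * BYTES_PER_ZETTABYTE // BYTES_PER_TERABYTE


def orders_of_magnitude_apart(guess: float, actual: float) -> float:
    """Return how many powers of ten separate a guess from the actual value."""
    return math.log10(guess / actual)


# Two assumed readings of the guess, both in terabytes.
guess_plain_tb = 2.47e27
guess_squared_tb = guess_plain_tb ** 2

gap_plain = orders_of_magnitude_apart(guess_plain_tb, estimate_tb)
gap_squared = orders_of_magnitude_apart(guess_squared_tb, estimate_tb)

print(f"163 ZB = {estimate_tb:,} TB")
print(f"gap without squaring: {gap_plain:.1f} orders of magnitude")
print(f"gap with squaring:    {gap_squared:.1f} orders of magnitude")
```

Under these assumptions the guess overshoots by roughly 16 orders of magnitude without the squaring and roughly 44 with it, so the exact gap depends heavily on how the guess is read.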
All right, so that's it for this episode. Thanks for listening to two guys ramble about the types of data you can collect, how you can integrate it, and how you can use it. If you found this topic interesting, you may find our blog, Integrating Consumer Data for an Individualized Marketing Approach, interesting. You can find it, and more resources, at navistone.com/blog. Again, that's navistone.com/blog. We'll be back in a few weeks to talk about why marketers also need to be data storytellers.
Allen: Well, thanks for coming, I'm Allen Abbott,
Larry: and I'm Larry Kavanagh,
Allen: and have a good day.