This is our Deep Dive Into Local from October 30th, 2017. In our Deep Dive series, we take a closer look at one thing in local that caught our attention and deserves a longer discussion.
If you have a special topic you would like us to discuss for the Deep Dive in Local, please reach out to us. If you would like to be on one or the other of our segments, reach out and send us the topic and your availability.
If you are interested in sponsoring this weekly show, please let us know as well.
Mike: Hi. Welcome back for part two of our discussion of the local ecosystem graph recently released by Nyagoslav and Darren at Whitespark. In part one, we discussed the graph itself, its history, as well as some of the top data providers. Why don’t we move into key sites as you identified them? Define what you think the role of those key sites is, as well as if any of them surprised you in terms of having more influence than you thought or less influence than you thought.
Nyagoslav: So I think one thing that we missed mentioning in the first part was how the infographic was actually structured or how it was divided. So we created a few categories. We basically categorized different participants in the ecosystem in a few different types of categories. So obviously one of the categories is the primary data aggregators, then you have obviously the core search engines: Google Maps, Bing Maps, Apple Maps, you have all the other business directories. And then we decided to include the special category called “key sites” in the ecosystem. So this category includes six platforms. These platforms serve both as data aggregators. So they both send out data to other participants in the ecosystem and receive data from other places, or, like for example from user-generated content or proprietary data or whatever. So Mike, your question is, as far as I understand, related to the key sites in the ecosystem?
Mike: Right, so firstly who are they? And secondly, from your point of view, have they changed over time or have some of them surprised you with their influence?
Nyagoslav: Sure. So the sites that we branded as key sites were Facebook, Yellow Pages, Dun & Bradstreet, Yelp, Foursquare, and CityGrid. So CityGrid is basically a network. The main site of the CityGrid network is Citysearch.
So in terms of if we got surprised by anything when we were doing the research, I wouldn’t say so, because we…I would say, in our daily work we realized that the influence of these sites is pretty significant. I would say what surprised me in particular was how likely it is that if there is a completely rogue user-generated listing on Foursquare that has nothing, except for, like, let’s say a couple of check-ins, no additional metadata related to it, probably just a business name existing there and the city and obviously the lat-long coordinates of the business, this listing could actually get populated on Bing. So Bing would generate an entirely separate business record just based on this rogue listing coming from Foursquare. Which obviously means that Bing, you know, either they’re desperate for data or they trust Foursquare a lot, or their mechanism of clustering business data is as bad as Google’s circa 2010, maybe, or something.
Mike: Yes, business clustering is a big problem. An interesting thing about Foursquare I’ve seen is that they feed data to Uber, and there’s a whole raft of new geo-focused local businesses and apps: Uber, Lyft, and all these delivery services, etc., the various technical services that they provide, that, obviously your chart doesn’t take a look at. But I’ve found, for example, having a problem at Foursquare caused a problem at Uber which caused people to get dropped off at the wrong place or the wrong, you know, spot which I thought was interesting. There’s a whole ‘nother research project here in figuring out how this data flows into the world of apps and geo-focused local products.
Nyagoslav: I remember that a few years ago, Foursquare moved from using Google Maps data to using OpenStreetMap. I wonder if it’s at all…I mean, I’m not sure about Uber, and I’ve used it probably once in my life. I’m not sure what kind of metadata they use.
Mike: So Uber has been using Google Maps but geolocating Foursquare locations over it. In other words, they don’t use Google’s data for geolocation. So if something is incorrect at Foursquare location for whatever reason, it shows up improperly at Google from a data point of view, but they do use Google Maps. In other words, when I fix the map at Google, it gets fixed at Uber but if I didn’t fix the listing at Foursquare, we still had a problem, which was interesting to me.
Nyagoslav: I see. This is very interesting, and I expect that… Anyway, so back to your question about the importance of these key sites, I would say that obviously Yellow Pages has historically been one of the most important non-data aggregator players in the local search ecosystem, so nothing has changed there, I would say. Yellow Pages both collects data from a myriad of places and distributes data to a myriad of business directories.
So the second site that is super important and is not a data aggregator is obviously Yelp. So Yelp usually has content kind of agreements or arrangements with a number of important platforms including Yellow Pages and including Bing, Apple Maps. So these, you know, these content exchange relationships are actually related to the reviews that Yelp provides to these business directories or search engines. But the thing is that in many cases when reviews are provided and when no listing predates the submission of these reviews or the feeding of these reviews to the endpoint, this endpoint would actually generate a new listing directly, usually. So in this sense, Yelp is pretty influential, because having a listing there, especially a listing that features reviews, would almost certainly cause this data to be distributed to a pretty significant number of additional platforms including some very influential ones.
Mike: Right, one of my learnings with Yelp was around Apple mapping. I was having problems with an Apple map. I fixed it first at Google Maps, which fixed it at Yelp because Yelp uses Google Maps, and then I changed the address at Yelp to reflect the new address, and Apple updated that content almost instantly. So this… the flow went from Google Maps to Yelp to Yelp data correction to Apple, and normally Apple takes a long time because they rely a lot on TomTom. But in terms of a POI, repositioning a POI on a new street, I was able to fix it by fixing it at Yelp and, prior to that, fixing it at Google. So there’s this whole other layer that goes on in this stuff that I find interesting, because it’s just another layer of virtual reality that sort of sits below the data, the business listing level, and that does give us some clue about the importance of these flows, right?
Nyagoslav: For sure.
Darren: I want to make two points about the data in the infographic. So one, we don’t necessarily know all the relationships. So if you mouse over Foursquare and it says it only feeds two sites, Foursquare could be giving data to many other sites but we don’t have any record of it. So it doesn’t make it onto the infographic. So it doesn’t mean that Foursquare only feeds two sites, it just means that those are the only ones we’re aware of. And another thing is…
Mike: Just to note on Foursquare, they are making a big play on data and, you know, they basically separated their app from their data. And I would suggest that of these, they have the…in fact, like in Europe, for example, they have much more active and relevant data than Yelp. And so around the world, I think Foursquare is a data source player that to some extent is underrated, as it were.
Darren: Yeah, I agree, and I think it’s going to grow and in the next version of this, we’ll try to dig up more information on Foursquare. But, I feel like it’s not reflective of how important Foursquare is. When you mouse over, it’s a small segment. It’s only got two lines, right?
And then another thing is, like, you look at Yellow Pages, we see relationships where Yellow Pages apparently is giving data to all of these other sites, but sometimes Yellow Pages doesn’t have a direct relationship. We think that sometimes other sites are just scraping Yellow Pages. We see the data flow, and so we don’t know if they have a direct relationship or if they’re just kind of scraping the data.
Mike: So that’s an interesting historical point. When Google first got going, they went to the Yellow Pages and said, “Give us your data.” They said, “We can’t but we can structure our site in such a way that you can scrape it.” It was illegal for them to give the data up, given their various legal ties and relationships, but they structured their site in such a way that Google was able to scrape it. So this goes way back. So I think that’s generally true, and I think it has some historical basis as well.
So, you know, obviously another thing you pointed out in the article was just how some of these sites…and I think you pointed this out in the last conversation, but maybe go into a little more detail because we’re a little bit lower down on this chain. A lot of these sites may buy primary data but then update it with their own new data and then not buy the primary data again. And so, which of these are important? Where do you see that happening? What’s going on there, I guess? Does that make sense?
Nyagoslav: You know, I read your question that you sent us, but I was not really sure what…
Mike: So some of the sites that you noted, I can’t remember the exact names of them, you said that they might buy primary data but then they would move into their own data cycle, Cylex, for example, right, where they would then update their own data with…their own listings with original information and then not share it back. Cylex was an example. They might do scraping, they might do it some other way. Which of these sites are important, do you think, and who’s doing it, why are they doing it, and which ones should businesses worry about?
Nyagoslav: So we couldn’t identify some business directory, for example, that is of primary importance that does something like this, but for example, we have a confirmation from TripAdvisor that they did do something like this. So they originally put data just once, as far as I remember, from…I’m not sure about that. They put just one data feed, just once, and afterwards they have been working together with each of the properties in order to keep the data up to date. So from my experience and from my discussions with some of the sites, apparently, as Darren mentioned, Cylex do that. Hotfrog do something a bit different. So basically they have their own crawlers and they basically scrape data directly from the websites of the businesses. It’s very interesting, by the way. I don’t know if you have seen listings on Brownbook that have been auto-generated, not claimed and verified ones. So you can see that on many of the auto-generated listings, that…basically on all the listings on Brownbook, Brownbook keeps the information about when the listing has been edited in the past. And for almost all of the auto-generated listings, the date of when the listing was set up, created, is, as far as I remember, sometime in October 2008. So it means that Brownbook actually pulled a data feed from somewhere, I don’t know where, at that point of time. You know, they just generated millions of listings in that one day or two days, and from then on, they just relied mostly on users to actually keep the data up to date.
Mike: And are any of these sites, from your point of view, particularly important to be worrying about? I mean, do they hit your top 10 or your top 20?
Darren: TripAdvisor, yes. So TripAdvisor doesn’t really seem to take a feed, but they have direct relationships. And I think the reason they don’t take a feed is because the big hotel properties, they all have a direct relationship with TripAdvisor, and if you were a small independent, well, you’re going to go and make your own listing on TripAdvisor or a user will generate the listing because they want to leave you a review, and so that happens. So TripAdvisor doesn’t need to buy data, it’s just coming to them all the time.
Mike: So I have a question for you about attribution. I know you guys do a lot of citation campaigns. Have you ever… I find attribution to these secondary and tertiary sites to be…I find very little evidence that they, in the end, have much impact, other than possibly contributing to your rank at Google, maybe, right? But obviously the ideal site is one that not only contributes to your rank at Google but also…and contributes to a consistency at Google, but also sends you business, right?
Mike: So I’m curious if, as part of your citation product, you ever have experimented with UTM campaign codes on a per-directory basis, and then gone back and seen if any of these sites are actually generating any local business.
Darren: I haven’t. We haven’t done that, I don’t think. And so it’s like, Yellow Pages, maybe a trickle. Yelp, a trickle. Something that’s very industry-specific, I think yes. So like Avvo, for example, if you’re talking about lawyers, you’re going to get…actually, I know lawyers that are spending a fortune because Avvo is driving so much business for them. So they have paid accounts on Avvo and Lawyers.com, and they actually get a decent amount of leads from that, and so some of these industry-specific ones. But we haven’t done any specific tracking on it, but there is some value in it beyond just your rankings. And it really depends on the site, but, you know, if you’re going to tofindlocal.com or myhuckleberry.com, I really don’t think you’re getting many leads from sites like that, you know. But, the point is I guess it’s an extra link because, especially if we’re talking about links to location pages, it’s very hard to build links to those. So it’s one way to get another link and another mention, and I think the benefit is…because it’s also extremely low cost to get listed on these sites, and so the benefit is that it’s an SEO play. And so I’d say there’s probably 10 that drive traffic and the rest of them are all just specifically for SEO.
Mike: Well, I would suggest as a product upgrade for you that you develop some sort of UTM code tracking, obviously maybe the top 10. You have a unique code, but then after that, you standardize on everything else and we then look over time to see what’s going on, so…
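The scheme Mike describes here, a unique UTM code for the top directories and one standardized code for everything else, could be sketched roughly as follows. This is a minimal illustration, not anything Whitespark has built: the `tag_listing_url` function, the shortlist in `TOP_DIRECTORIES`, and the `citation` / `local-listings` parameter values are all assumptions made up for the example.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse

# Hypothetical shortlist; the conversation only says "maybe the top 10".
TOP_DIRECTORIES = {"yelp", "yellowpages", "facebook", "foursquare", "avvo"}


def tag_listing_url(website, directory, top=TOP_DIRECTORIES):
    """Return the website URL with UTM parameters for one directory listing."""
    key = directory.strip().lower()
    # Tracked directories get their own source; the rest share one bucket.
    source = key if key in top else "other-directories"
    parts = urlparse(website)
    # Keep any query parameters the URL already carries.
    query = dict(parse_qsl(parts.query))
    query.update(
        {
            "utm_source": source,
            "utm_medium": "citation",          # assumed label
            "utm_campaign": "local-listings",  # assumed label
        }
    )
    return urlunparse(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    print(tag_listing_url("https://example.com/locations/buffalo", "Yelp"))
    print(tag_listing_url("https://example.com/locations/buffalo", "tofindlocal"))
```

The tagged URL would be the one submitted as the website link on each directory, so that visits arriving from a listing can be grouped by source in an analytics tool. It only captures click-throughs, not phone calls or direction lookups.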
Mary: I think you have to go back to one of the things that’s always been a challenge with local search, is that a lot of times people are just looking for one bit of information about a business. They’re looking for your phone number. They don’t care where they find it. They just want your phone number. Or they want to see where you are with directions. And I think that a lot of places, unless you go through all the effort of setting up call tracking at every one of these places, you really can’t measure.
Darren: Yes, don’t do that.
Mary: Yes, you just can’t measure it. And that a lot of what we’re doing with local search is just trying to be discovered, that our information is discovered and we can’t measure when it is discovered.
Darren: Right. Citations play into prominence, right? The more mentions of your business, generally, the higher your prominence score will be at Google. So that’s why people invest in them, and they do seem to have an impact.
Mary: And I’m also concerned these days, too, about finding the best places that I can get a good citation and people are going to review me there, or people are already leaving reviews there. So I want to make sure that I have a really good, complete listing.
Darren: Those reviews really play into the attribution, because if you have a listing on yellowpages.com, for example, it could drive traffic, but it won’t because you’re buried on page 40. But if you start getting reviews on Yellow Pages, then your listing moves from page 40 to page 1, you actually will start getting traffic. So that actually plays a huge part. So actually, from tracking it, like, well, do you have reviews? How many reviews? Do you rank well within the site? If you rank well within the site, you’re going to actually get traffic from it. If you don’t rank well within the site, then you probably won’t, and this is why the lawyers are spending the money to get that prominent placement.
Mike: Yes, there’s a certain dialectic, too. If you’re ranking well within the site and Google can associate that rank at Yelp or at YP or at Healthgrades with your listing, and that individual page ranks, that also could impact your rank at Google. So there is that.
So I’m just curious, so has this research and has this…I know you just brought out a bunch of new packages at Whitespark, has it impacted which packages you deliver at which level and how has it influenced how you’ve developed this new product mix you’re using?
Darren: It did influence our packages for sure. We now have three packages. So we have our essentials, which are basically the top tier. So we’ve got the four primary data aggregators, plus Facebook, Yelp, Yellow Pages, and there’s one more. And then we have the essentials plus, which is 13 sites, which is basically that whole top core, and then we have our comprehensive package. Our comprehensive before used to be 50 sites. We’ve scaled that back down to 35 sites. So we’ve kind of got it like, if you just need to be on the really most important sites, then that’s the essential. Essentials plus is, you know, the most important sites plus some of those other important sites. And comprehensive is for the person that just wants to go hog wild, you know. They want to spend the money, they want to make sure that it’s cleaned up across most of the entire local search ecosystem. That’s why we have that third option.
Mary: Thinking of getting cleaned up, in Nyagoslav’s article he mentioned that he couldn’t figure out a way to update Factual. Have you been able to figure out anything with Factual since then?
Darren: No, Factual is the worst. They are totally the worst. We have a direct API connection. We submit an update. It’s like, maybe six months later. But the interesting thing about that is, like, I’ve talked to people that have this, like, trusted partner status with Factual — SIM Partners, etc. They’re like, “yeah, no. We’re in the same boat.” No one can get data updated on Factual. Factual, they don’t seem to care about data quality, or else they’re just completely overwhelmed with all of the update requests coming in, but their database is a mess and it’s really hard to get it cleaned up.
Mike: I think the reason is that they’re moving towards higher value monetization in terms of attribution and geofencing, and I think that basic data is not that interesting to them, nor that profitable.
Darren: Exactly. And honestly, maybe in the next ecosystem they get moved off as a primary data aggregator.
Mike: That would be my bet. They come off and Foursquare actually takes over a more important role as a primary data source, particularly worldwide but in the United States as well.
Mary: So another thing that kind of shocked me is when I used a credit card in a cab and they emailed me a receipt without asking for my email address. So obviously the credit card companies are getting deep into data here. How do you see them fitting into this whole ecosystem?
Darren: That’s an interesting question. I don’t know. I didn’t really consider that. I think that the reason that you got that receipt was because you’d given it at some other point, and so they were able to associate your email address with that credit card on some platform. And it may not have been provided by the credit card company, it was just some intermediary that…
Mike: Could have been Square, for example, being used as a payment service.
Darren: Yes, exactly. I think that’s maybe what happened there, but I don’t know. It’s interesting.
Mike: I see Google just bought a bunch of credit card data, and I just saw today that American Express is partnering with Acxiom to predict intent, right? So what they’re doing is they’re saying… So I think the credit cards are using it more for enhanced targeting rather than geotargeting, and in this enhanced targeting, they are going to take a dataset of people like you that have recently bought a bike with similar sort of traits and then predict that you’re likely to buy a bike, kind of deal. So I see that’s where there’s a lot more money in that than basic geocoding of this stuff, and I see it’s happening more and more. Well, Facebook, Google, and American Express, and Acxiom, all these guys, are doing it and they’re going to leverage their data for profit every which way to Sunday. And in this country, your data is the toilet paper on which these guys wipe, right? So anyways. So I’m going to give you… Nyagoslav, how would you summarize this from a small business perspective and from a large, multi-location perspective? How would they use this chart for their benefit?
Nyagoslav: Okay, so it’s a bit difficult. I would say that from a business point of view — if a business wants to use the data, it might be a bit complicated for them to figure out what exactly is going on, or at least…
Mike: So the short answer is hire you? That’s my short answer. Anyways, go ahead, sorry for interrupting.
Nyagoslav: That is the short answer, but, you know, I think one of the achievements I’m very happy with with this version of the infographic as compared to older versions of the infographic is that right now if a small business owner, for example, looks at the infographic, they can figure out what is what. So they can figure out that the data aggregators are data aggregators, and when they click on them, they can see, you know, they have so many arrows. So logically, it means these guys are more important than the rest, right? Whereas in the older version of the ecosystem, because there was a lot of, you know, interlinking and crossing of lines, and stuff like this, so it was a bit more difficult for people. So everybody knows that Google is the most important platform, right? So, you know, the overall design of the older versions was that you have Google in the center and everything is around Google. So…but this everything is so much that it can become overwhelming for a business owner, whereas right now in the infographic, you know, it has been designed in such a nice way that it just shows what’s the most important and where you actually need to start from, so…
Darren: I think one huge take-away from the infographic is knowing that…okay, I’ve found an incorrect listing on MapQuest. So if I go to MapQuest and I fix that listing, where else do I need to fix it? Otherwise, you’re playing a game of whack-a-mole where, you know, you fixed it and then that listing pops back up six months later. And so the infographic, you can mouse over MapQuest and see that it’s getting data from these other sites, and so you know you have to go and clean it up on the other sites. And so that’s one sort of actionable thing you can use the infographic for. It’s like, as you’re doing clean-up, you know where you need to also clean up if you want to make sure that it stays cleaned up.
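The clean-up idea Darren describes, following a listing's data back to every site that feeds it, amounts to walking a directed graph upstream. A rough sketch is below. The edges are placeholders invented for the example (`site_a`, `aggregator_1`, and so on) and are not relationships taken from the infographic; the `upstream_sources` helper is likewise hypothetical.

```python
from collections import deque

# Placeholder edges for illustration only: site -> sites it receives data from.
# These are NOT the relationships shown in the infographic.
RECEIVES_FROM = {
    "site_a": ["site_b", "aggregator_1"],
    "site_b": ["aggregator_1", "aggregator_2"],
    "aggregator_2": ["site_a"],  # a cycle, to show it is handled
}


def upstream_sources(site, receives_from):
    """List every site that directly or indirectly feeds data to `site`."""
    seen = {site}
    order = []
    queue = deque(receives_from.get(site, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited, or the starting site itself
        seen.add(current)
        order.append(current)
        # Keep walking upstream from this source.
        queue.extend(receives_from.get(current, []))
    return order


if __name__ == "__main__":
    # Fixing a bad listing on site_a means also checking each of these.
    print(upstream_sources("site_a", RECEIVES_FROM))
```

Because the known relationships are incomplete, as noted earlier in the conversation, a list produced this way is a starting checklist and not a guarantee that the listing stays fixed.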
Mike: Great summary. I think with that, we’re probably at our limit. We’ve gone almost an hour between the two interviews. So thank you very much for joining us, and we’ll talk to you next year.
Darren: All right, thanks for having us.
Mary: Thanks, Nyagoslav. Thanks, Darren.
Nyagoslav: Thanks, guys.
Mike: All right, bye bye.