Bill Slawski with Jason Barnard at SEOcamp Paris 2019
Bill Slawski talks with Jason Barnard about patents and entities in search since 1999.
Patents are really easy (or so says Bill Slawski): they identify a problem, they tell you about the prior art used to solve the problem, tell you why that’s insufficient, then they provide a solution. We also talk about patent writing styles and the Ernest Hemingway of patents. We look at Google Maps as a knowledge graph and traffic cop. Onto why many of the best employees moved from Microsoft and Yahoo to Google (the reason is not what I thought). Remembering dates, names and patents is like doing a jigsaw puzzle.
Along the way, we work back through the history of entities in search, starting in 2019 and going right back as far as 1998 (Sergey Brin).
This interview / conversation was recorded at 7 am in a baker’s shop in Saint Denis near Paris and has an amazing backing track of coffee, bread and the locals chatting in Arabic and watching TV (don’t worry, it doesn’t ruin the listening experience, it truly makes it better 🙂).
Jason Barnard
SEO is AEO! Welcome to the show, Bill Slawski.
Bill Slawski
Thank you Jason.
Jason Barnard
Right. Lovely to meet you. For the listeners, please excuse us, there’s quite a lot of noise behind, some people talking Arabic. We’re at a boulangerie, a bakery in the middle of Saint Denis in France. It’s seven o’clock in the morning after SEO camp. This is absolutely brilliant because the coffee machine keeps going off. People keep coming in to chat to the guy. That’s just setting the scene cause we’re having a good laugh here. Yeah, Bill?
Bill Slawski
It’s a good atmosphere. I love going to breakfast in the morning at bakeries.
Jason Barnard
And this is a bit of a different bakery than you get in San Diego, yeah?
Bill Slawski
It’s not too much different.
Jason Barnard
Oh! Right, okay.
Bill Slawski
There were a few like this, yeah.
Jason Barnard
Brilliant stuff! So you don’t feel too, kind of, away from home, you’re feeling very much at home.
Bill Slawski
And this reminds me more of what I used to go to when I lived in Virginia.
Bill Slawski
It is the Red Truck Bakery. The owner, the baker owned a red truck and used to cater events. He’d drive up in a red truck and hand out bread and pastries.
Jason Barnard
Yeah, so that Red Truck Bakery, he thought long and hard about the name of his bakery.
Bill Slawski
Yeah, he had a red truck and parked down from events.
Jason Barnard
I was stuck in Lawrence yesterday and he was looking for examples for something and there was a lemon tree right next to him. So, all the examples were lemons. It was brilliant, but we all have tendencies to do that. We look around, and my idea is the name of the street or swimming pool.
Bill Slawski
When I started my website, SEO by the Sea, I was standing in the second floor window in an office in Havre de Grace, Maryland watching sails bouncing up and down on the Chesapeake Bay.
Jason Barnard
Okay, and so SEO by the Sea is your blog. And your company is Go Fish Digital, so that’s all terribly sea oriented.
Bill Slawski
It’s not my company, it’s a company with a couple friends who I met at a meet up ten years ago or so. I was speaking on named entities in 2007.
Jason Barnard
That sounds terribly probable, yeah.
Jason Barnard
And when, sorry?
Jason Barnard
Okay, so ten years before I even knew what they were. Brilliant stuff. Let’s get on to something a bit more professional.
Jason Barnard
You look at the patents, we all know that, Bill Slawski, the patents guy. I’m very thankful to you as I said earlier on, that you read them so that I don’t have to. I assumed it was just because you’re a lawyer and that’s kind of the connection, but in fact it’s not, is it? Can you tell me?
Bill Slawski
I was an undergraduate English major and one of the professors I really appreciated the advice of used to teach us, taught me a class in deconstruction of literature. And so, it was the idea of reading something, looking at all the parts, everything that made it what it was and combine that with the homework we used to do in law school which is taking judicial opinions, breaking them down into 9 different parts or types of things. It’s a habit I developed from years of school. To read something, break it down in parts. Patents are really easy. They identify a problem, they tell you about the prior art involved that is used to solve it. And why that’s sometimes insufficient, then they tell you *We have a solution, and here’s what it is, here’s where we’re going.* And that’s what a patent does.
Jason Barnard
Brilliant, it does seem very simple. I mean, I look at them and I just go *I can’t think through all this stuff.* But, if you think of all the structure of it, it becomes much easier. And your combination of English and law is absolutely perfect.
Jason Barnard
And you’re looking at specific papers by specific people because you recognize their style?
Bill Slawski
Some of them write differently than others. Some of them are very predictable. They, you know, understand the difference between an F. Scott Fitzgerald and an Ernest Hemingway. They write in very different styles. Ernest Hemingway writes in a, what you would call, an iceberg style, where you just see the very top of what he’s writing. Most of it’s under the water. It’s sort of assumed. He expects you to know things.
Bill Slawski
William Faulkner is a stream of consciousness writer and just spouts out lots, and lots, and lots of thoughts, endless streams of words that paint pictures of things. So he’s not hiding everything under the water like Hemingway does. Same thing with writers of patents. Some of them will very straightforwardly tell you, these are the advantages of following the process in this patent. Then a big list of ten, twelve, fifteen things. This is why you should do it. Not all of them do that.
Jason Barnard
Okay. Brilliant stuff, I’m learning all about patents. Somebody out there writes patents in an Ernest Hemingway kind of a way. Oh, that’s brilliant. I love that. So, I mean, I’ve been reading your articles for a while. And you gave a talk at yesterday’s SEO camp. It was actually brilliant, I mean, the whole SEO camp, in fact, was brilliant, yeah?
Bill Slawski
It was fun, yeah.
Jason Barnard
Yeah, you enjoyed it. That was a bit of a leading question ’cause you wouldn’t really say no, could you? You listed seventeen patents in the beginning, but you actually talked about five or six.
Bill Slawski
I think there’s ways of providing evidence, or proof, or provenance of something being more likely true than not. One of them is when a patent is released and there are lots of related patents. On the same topic, the same subject, not exactly the same thing, but they work well together. And it’s as if somebody had a lot of thoughts that were combined together towards some idea. So, the first one of those that I listed of the seventeen was one that’s sort of the root of local search in Google. The fact that they gain information about local entities, and that’s not how most people talk about local search. The local entities from sources like local directories or data aggregators or enterprise websites. And, if the facts about those local entities are consistent from one source to another to another, it’s more likely than not that that information is correct. It’s true.
Bill Slawski
There’s sort of, like, corroborating evidence based upon consistency across the web.
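Editor's note: the corroboration idea Bill describes can be illustrated with a small sketch. This is not Google's method and is not taken from any patent. The source names, the phone numbers, and the `corroborate` and `normalize` helpers are all made up for illustration. The sketch only shows the general principle: the more sources agree on a fact about a local entity, the more confidence you can place in it.

```python
from collections import Counter


def normalize(value):
    """Lowercase and collapse whitespace so trivially different spellings match."""
    return " ".join(value.lower().split())


def corroborate(claims):
    """Return the most common value for a fact and the share of sources that agree.

    `claims` maps a (hypothetical) source name to the value that source reports.
    """
    counts = Counter(normalize(v) for v in claims.values())
    value, votes = counts.most_common(1)[0]
    return value, votes / len(claims)


# Hypothetical phone number claims for one local business.
phone_claims = {
    "local_directory": "555-0100",
    "data_aggregator": "555-0100",
    "business_website": "555-0100",
    "old_listing": "555-0199",
}

value, agreement = corroborate(phone_claims)
print(value, agreement)
```

Here three of the four made-up sources agree, so the sketch reports that value with an agreement share of 0.75. A real system would presumably also weigh how trustworthy each source is, which this sketch ignores.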
Jason Barnard
Yeah, I love the word consistency.
Bill Slawski
Which is something they’re doing with entities, with attributes, and properties of those entities. If they can be consistent across the web, they’re more likely to use that information in answers in search results.
Jason Barnard
Yeah, the name of the guy escapes me. Now at Amazon bought a machine learning company, and he was saying that Google Maps is a great example of a functioning knowledge graph.
Jason Barnard
And yeah, the idea that with all the data that they’re getting in and the people fact checking, they can actually answer queries that have never been asked before, in terms of directing people to somewhere. Is that fair?
Bill Slawski
Google acquired a company called ZipDash. And ZipDash developed the idea of using GPS data from different devices to track traffic.
Jason Barnard
Okay. When did they acquire them?
Bill Slawski
It was in the mid-2000s.
Jason Barnard
Okay right. So they had time to get it right.
Bill Slawski
So I think to a degree, Apple is doing that too. So you’re driving along, some place you’ve been to before. You turn on your navigation and Google Maps navigator will tell you that there’s a traffic delay. To give you a chance to turn off the road and take an alternate route. Which is really convenient. I use Google Maps navigation for places I’ve gone to before. I know how to get to, just because of that feature.
Jason Barnard
Yeah, no, I understand, I mean, I do too and I’ve got so used to it that when it gets it wrong I get really annoyed with it. I was with Hugo, the colleague I was playing music with, who you met. And we were in Lyon, they said don’t leave now ’cause it will take at least two hours. But we said *Ah, we’ll leave anyways.* We just wanted to get going, and we used Google Maps. We went through this incredibly tortuous little route. We could see the traffic jams on the flyovers all around. The same car was behind us the whole way. And I kind of figured he must be looking at Google Maps and Google has pushed us through this route that presumably couldn’t get blocked up, and then it pushes other people to another one.
Bill Slawski
It’s interesting how Google Maps will route you around problems and I’m used to driving to the same places using Maps, different routes almost every time. And I’m assuming it’s because they’ve identified what they think is the best route. I have a little holder for my phone that I can put right on top of the dash. So I can see where it indicates you can turn off and it’ll be three minutes slower. Or four minutes faster.
Jason Barnard
Oh, no that’s brilliant.
Jason Barnard
I remember you’ve got an old car. That’s right, isn’t it?
Bill Slawski
No, no, it’s not that old. It’s a 2004.
Jason Barnard
Oh god, I can’t remember why I thought you said you had this really old jalopy. Oh, okay. I’m misremembering so, I better be careful. How far can we take our idea that Google Maps is a knowledge graph?
Bill Slawski
I refer to businesses as local entities. It makes a lot of sense to refer to them as that. There are ways to think about Maps as very similar to the web, in that Googlebot crawls the web, and Google Maps has Street View cars that crawl the Earth, and map things, and so on. You can make a lot of analogies between both types of search, and I consider there being at least a few different types of search going on at Google. One is a web based crawling of links and redirects and so on. And trying to understand the words that are in text on pages and being able to index those. There’s an inverted index of terms on the web that you may try to match with queries, which is very different from a knowledge base, a knowledge graph type of approach where you’re looking for specific entities and you’re asking questions about properties and values, attributes related to those. And, Google, maybe, crawls that layer of the web and visits schema and tries to understand how everything’s connected.
Jason Barnard
Brilliant stuff. Oh crumbs, I’ll have to listen back to this thing to actually get all of this in my brain. You said inverted queries. What was that?
Bill Slawski
An inverted index.
Jason Barnard
Oh, an inverted index, I’m sorry.
Bill Slawski
An inverted index is Google understands where all the terms are. On pages, it indexes those. Where this is interesting is, there’s another inverted index of phrases on the web. In 2003, Anna Patterson, somebody who most people don’t know about, she built the largest search engine in the 21st century.
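Editor's note: for readers who have not met the term, here is a minimal sketch of an inverted index. It maps each term to the set of pages that contain it, so a query can be answered by looking up terms instead of scanning every page. The page names, the page text, and the `build_inverted_index` and `search` functions are invented for illustration and say nothing about how Google's index is actually built.

```python
from collections import defaultdict


def build_inverted_index(pages):
    """Map each lowercase term to the set of page ids that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for term in text.lower().split():
            index[term].add(page_id)
    return index


def search(index, query):
    """Return the page ids that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results


# Hypothetical pages, keyed by a made-up page id.
pages = {
    "page1": "red truck bakery in virginia",
    "page2": "bakery near paris",
    "page3": "red truck for sale",
}

index = build_inverted_index(pages)
print(sorted(search(index, "red truck")))
```

A phrase index of the kind Bill mentions next would presumably store multi-word phrases as keys rather than single terms. This sketch only handles single terms and treats a query as "all of these words".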
Jason Barnard
Oh, really?
Bill Slawski
It was called Recall. It was a demo search engine to enable people to search the Internet Archive and all the different monthly versions of pages. Okay, so it indexed billions of pages. Google acquired the demo that she had of Recall on the Internet Archive and then hired her a couple weeks later.
Jason Barnard
Brilliant, so ’cause yeah we were talking, this is very slightly related, there’s a page with the Google graveyard of all the stuff it’s bought and killed off. I love that page.
Bill Slawski
Well, Google’s acquired a lot of businesses too. It’s acquired some for the technology they’ve produced and it’s developed some of that technology. Like ZipDash, I mentioned. With traffic, right? Grouptivity, which had social circles like Google+. They used the phrase *fail fast*.
Jason Barnard
Oh, okay, brilliant. Which you were bound to say if you’re going to fail, fail fast, get rid of it, and move on to something else.
Jason Barnard
Yeah. Glad I understood that. I’m feeling very clever now. We were actually not supposed to be talking about that stuff. You were talking about the patents yesterday, about extracting data from the web. I mean, you and I both cite [00:14:17] a lot. How well is Google doing now at that? Because I kind of get this dream that they’re doing incredibly well. Are they?
Bill Slawski
I’m not sure because I really have little to compare it to. I see some things that Bing is doing, and I mentioned this to you yesterday. They’re very dysfunctional. Or I mentioned to somebody yesterday.
Jason Barnard
Yeah, might’ve been Hugo, he looks a bit like me.
Bill Slawski
So, Microsoft Research developed stuff that they don’t necessarily share with Bing. There’s a huge knowledge graph that Microsoft Research developed that got abandoned and nobody picked it up until maybe a year and a half after it was abandoned. Microsoft’s developed an entity database, an API, that picks up on that concept database, that knowledge graph, that Bing had developed. Bing also had an object level search in 2007.
Jason Barnard
Oh, really? I don’t know how you remember all these dates and names. Go on.
Bill Slawski
So, their object level search was, they expressed it, they showed it off and there was an academic search with papers and publishers and so on that they were collecting information about who the publisher was, who the authors were, what the papers were, what the titles were. It was all fact type stuff.
Bill Slawski
They were treating the papers as entities. Okay, so they did the same type of thing with products. MSN product search which was an entity based type thing, it was a knowledge graph. It was an object level search.
Jason Barnard
So, none of this is new?
Jason Barnard
Ah! I thought I was coming in right at the beginning, but I’m not.
Bill Slawski
But this is Microsoft Research, or Microsoft Research Asia developing some of this stuff. And then having it out there, having Bing not developing it, not doing anything with it. Which is, like, *why are you so dysfunctional?*
Jason Barnard
Yeah, do you think they’ve kind of missed the boat? They’ve missed opportunities by being dysfunctional?
Bill Slawski
I see a lot of people who were Microsoft researchers now working at Google.
Jason Barnard
And they go to Google because Google pays more or because Google’s more interesting or because Google moves forward faster?
Bill Slawski
Google’s employing people.
Jason Barnard
Oh, right. Because Google needs the raw talent.
Bill Slawski
Microsoft was laying people off. Google was employing people.
Jason Barnard
Oh, right! Okay, oh it’s as simple as that.
Jason Barnard
So I was looking for the complicated solution and in fact the simple one is the best.
Bill Slawski
And the same happened with Yahoo. You have people like Andrei Broder who was a brilliant researcher. And he was producing stuff for Yahoo. He was the one who came up with different levels of intent behind searches. Like, informational, transactional, and navigational. So he’s now at Google.
Jason Barnard
But I mean like all these people, like these great ideas at Microsoft. Microsoft let them go and Yahoo let them go and Google picked them up.
Jason Barnard
And then we all think it comes from Google.
Bill Slawski
There’s a lot of cross pollination of ideas between search engines and different people performing different jobs in search engines. Like, the search engineers might have backgrounds in information retrieval, and some might have backgrounds in user interfaces. And they work together on projects and they learn from each other.
Jason Barnard
Brilliant stuff, I mean this conversation was supposed to be about extracting knowledge, but it’s actually turned into patents and the history of entities in search from 2007 was the first date I remember. It probably goes back before that.
Bill Slawski
It probably goes back before that. Google Maps was the first patent from the annotation framework set of seventeen I showed. That was 2005.
Jason Barnard
Okay. So now not only do you remember the names, you remember the numbers of these things as well. That’s a brilliant memory you’ve got. I’m jealous.
Bill Slawski
You ever do jigsaw puzzles?
Jason Barnard
Yeah, badly.
Bill Slawski
Okay, so one of the strategies behind building a jigsaw puzzle is you look for the corner pieces. Remembering the dates of patents, when certain things happen, helps you put everything into perspective in the bigger framework.
Jason Barnard
Okay, yeah. You remember the date and then hook it onto the information.
Bill Slawski
Right, so PageRank came out in 1998. In 1999, Sergey Brin came out with a patent, or provisional patent, it never got granted. It involved an algorithm he called DIPRE, which has something to do with understanding patterns and relationships between things. He had a list of five books, and he said okay, the book title, the author of the book, the publisher, the length of the book. He said okay if we crawl the web, find these five books, on a site, we’ll want to crawl all the other books on the same site and make a big index. And then, once we’re done with that site, we’ll look for those five books on other sites. And he chose five books that he thought were representative of a good corpus, a good starting point. Three of them were science fiction books.
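Editor's note: the bootstrapping idea Bill describes (start from a few known books, learn how a site presents them, then use that pattern to collect more books) can be sketched roughly as below. This is a heavily simplified illustration, not the algorithm from the patent. The sample listing, the single seed pair, and the `learn_patterns` and `apply_patterns` functions are assumptions made for this note, and the books are not the five seeds Bill refers to.

```python
import re


def learn_patterns(seeds, documents):
    """Collect the short text found between title and author for each known pair."""
    patterns = set()
    for title, author in seeds:
        regex = re.escape(title) + r"(.{1,20}?)" + re.escape(author)
        for doc in documents:
            for match in re.finditer(regex, doc):
                patterns.add(match.group(1))
    return patterns


def apply_patterns(patterns, documents):
    """Use each learned separator to pull (title, author) pairs out of every line."""
    found = set()
    for middle in patterns:
        regex = re.compile(r"^(.+?)" + re.escape(middle) + r"(.+)$")
        for doc in documents:
            for line in doc.splitlines():
                match = regex.match(line.strip())
                if match:
                    found.add((match.group(1), match.group(2)))
    return found


# Hypothetical book listing from one made-up site, one book per line.
documents = [
    "Foundation, by Isaac Asimov\n"
    "Dune, by Frank Herbert\n"
    "Neuromancer, by William Gibson"
]

# One illustrative seed pair that we already know to be correct.
seeds = [("Foundation", "Isaac Asimov")]

patterns = learn_patterns(seeds, documents)
found = apply_patterns(patterns, documents)
print(patterns)
print(sorted(found))
```

From the one seed, the sketch learns the separator `", by "` and then recovers all three books in the listing. The next step Bill describes, taking the enlarged set of books to other sites and repeating, would amount to calling these two functions in a loop with `found` as the new seeds.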
Jason Barnard
Oh, that shows where he’s coming from.
Jason Barnard
So, it goes back to 1999. That’s the earliest date we had. We started in 2017 and moved all the way back to 1999, when Sergey Brin was starting with his books, including science fiction. Wow, brilliant, amazing stuff. And that’s great, I think we can end it there. We’ve gone right back to the beginning. Thank you very much Bill. SEO is AEO! Thank you, Bill!
Bill Slawski
Thank you, Jason.
Jason Barnard
Brilliant stuff, man!
Further reading: Patents and entities in search since 1999