How a Search Engine Might Identify and Rank Authors in Search Results
Bill Slawski, March 21, 2013
In the past couple of years, we've seen Google add a level of social activity and awareness to its search results that was missing in the past. They've developed metadata approaches that enable authors to connect their Google Accounts with the web pages they write or contribute to. They've also set up a way for businesses to connect their sites to Google profiles as publishers.
Bing seems to be trying to broaden their social reach as well, though their approaches have been different. Chances are that methods similar to what Bing is exploring will be used by Google in the future as well.
When Google launched Search Plus Your World (SPYW) results, it let Google share information from social networks (primarily Google Plus) with people you are connected to. While performing searches, you might see, as a social search result, relevant pages that someone you're connected to either shared or endorsed with a +1.
We see similar social annotations appearing in Bing when we're logged into our Facebook accounts. See: How Buddies and People Who Know are Selected for Bing Social Searches.
Another feature that we see in Google is that some search results might display Author Badges even when we aren't logged into Google. Those badges can show up when someone who created a page adds author information metadata to that page, and connects their Google Profile to the page or to a profile on the site where it appears.
Interestingly, Google is also showing author badges for content on other sites that might be from the same author, even where authorship markup hasn't been added to those sites. (For example, Google is showing author badges for a couple of my sites where I haven't set up authorship markup, but which I have linked to from my Google Profile.)
Bing Profile Pictures in Search Results
This week, Bing also started showing profile pictures in their search results when someone isn't logged into either Facebook or Bing. These results appear to be similar to the authors that Google shows. Bing doesn't offer the kind of authorship markup that Google offers. So how does Bing find these pictures and profiles?
AJ Kohn, of Blind Five Year Old, published a thoughtful analysis of these profile pictures from Bing search results, in the post People Snippets on Tuesday. AJ notes regarding these images:
The new faces (for the most part) showing up in Bing search results are not authorship snippets per se but are people snippets derived from entities. It’s about who the content is about rather than who created the content.
It appears that Bing is doing some creative data mining of pages within its index to identify images of people to display next to pages in search results. Given that not everyone will set up authorship markup for Google, it's likely that Google will also have to do something similar if they want to associate authors with content on the Web that doesn't use authorship markup.
There are a lot of people on the Web who publish a lot of content in formats that might make it difficult to add the kind of authorship markup that Google has made available. For instance, consider an academic or industrial researcher who has a profile page and a list of whitepapers available as PDF files on the Web. Google's authorship markup really isn't very easy to apply to those pages and that content, even if it might not be difficult to point a link to the profile pages for that author from a Google profile.
Below is an image of Bing results that include my picture. There's another using the same profile picture in the search below for a different profile page for me. And a little lower is a link to my profile at MySpace that shows a completely different profile image for me, which is the profile picture I used at MySpace. These aren't links to content I created as much as they are links to pages about me, such as profile pages.
A pending Microsoft patent application, published at the USPTO in December, describes how they might use a data mining approach to identify authors on the Web. The patent application is:
Discovering Expertise Using Document Metadata in Part to Rank Authors
Invented by Aninda Ray and Dmitriy Meyerzon
US Patent Application 20120310928
Published December 6, 2012
Filed: June 1, 2011
Expertise mining features are provided based in part on the use of an expertise mining algorithm and expertise mining queries. A method of an embodiment operates to provide an expanded feedback query based in part on search results using an expertise mining query and a number of author-ranking heuristics used to rank authors and/or co-authors (e.g., primary authors, secondary authors, etc.) as part of an expertise mining operation.
A search system of an embodiment includes an author ranker component to rank authors based in part on an expertise mining query and author-ranking heuristics, and a query expander component to provide expanded queries as part of identifying relevant search results. Other embodiments are also disclosed.
How Bing May be Finding Authors on Different Topics
There's a multi-step process that Bing might use to find authors on the Web, and to display images of those authors in search results.
The first step involves doing some focused crawling on the Web to locate authors who might have some kind of expertise on a particular topic.
As an example, if someone is searching for expertise on a new mobile phone running the Windows Mobile Phone 7 operating system (the example is from the patent, which is why it features a Microsoft product), they might type "Windows Mobile Phone 7 expert" into a search engine to find experts on that topic.
Search results that are returned for these queries might be limited to certain types of results as well, such as:
- Design plans,
- White papers,
- Curriculum vitae,
- Published (white) paper lists,
- Citation lists,
- Patent applications.
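To picture that restriction, here is a minimal sketch of a filter that keeps only results of the listed types. It assumes each result already carries a type label; the `Result` class, the `doc_type` field, and the label strings are hypothetical names for illustration, since the patent excerpt quoted here doesn't say how document types are detected.

```python
from dataclasses import dataclass

# Document types from the list above. The label strings are illustrative
# assumptions, not names taken from the patent filing.
EXPERTISE_DOC_TYPES = {
    "design_plan",
    "white_paper",
    "curriculum_vitae",
    "paper_list",
    "citation_list",
    "patent_application",
}


@dataclass
class Result:
    url: str
    doc_type: str  # hypothetical label assigned by some upstream classifier


def filter_expertise_results(results):
    """Keep only results whose type suggests authored, expertise-bearing content."""
    return [r for r in results if r.doc_type in EXPERTISE_DOC_TYPES]


results = [
    Result("https://example.com/cv.pdf", "curriculum_vitae"),
    Result("https://example.com/shop", "product_page"),
    Result("https://example.com/wp.pdf", "white_paper"),
]
kept = filter_expertise_results(results)
print([r.url for r in kept])
```

In this toy run, the product page is dropped and the CV and white paper are kept.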
The patent filing uses an example of a search engineer to describe this process, which is probably a good choice. I've searched for more information on more than a couple of inventors on search related patents, and many of them have one or more profile pages, often at a university page or an industry page, and these link to things like whitepapers that they've written.
These profiles aren't likely to have something like Google's authorship markup on them, and adding markup to PDF documents involves making changes at the server level rather than to those PDF documents themselves.
When authors are located through this focused crawling for different topics, the search engine might expand queries to learn more about these specific authors, including identifying other profile pages for them, and possibly other things that they've written.
The kinds of information looked for might include things such as:
- User profiles,
- User expertise summaries.
Let's say that Robert Example is one of the "experts" identified during that original search. The search engine might work at uncovering more information by exploring search results where the author's name is added to words (or tokens, as they might be called) from the original query. So it might search for [Robert Example windows] or [Robert Example windows 7] or [Robert Example Windows Mobile Phone 7] and so on.
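The expansion described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `expand_queries` is made up, the choice to build queries from growing prefixes of the original query is a simplification (the patent example also pairs the name with other token combinations), and dropping the word "expert" is my own assumption.

```python
def expand_queries(author, original_query, stop_tokens=("expert",)):
    """Build expanded queries by adding the author's name to query tokens.

    Assumption: tokens are combined as growing prefixes of the original
    query, and generic words such as "expert" are dropped first.
    """
    tokens = [t for t in original_query.split() if t.lower() not in stop_tokens]
    expanded = []
    for end in range(1, len(tokens) + 1):
        expanded.append(" ".join([author] + tokens[:end]))
    return expanded


queries = expand_queries("Robert Example", "Windows Mobile Phone 7 expert")
for q in queries:
    print(q)
```

Each expanded query could then be run to look for additional profile pages and documents tied to the same author.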
The purpose behind doing this is to be able to provide searchers with search results that might include links to profile pages for authors as well as results that might contain related pages from those authors. The related pages not only help by providing results that answer a query, but they also reinforce the expertise that might be cited in a profile about an author.
The patent provides a lot of details on how they might not only identify authors and author profile pages, but also content created by those authors, and how different authors might be ranked in determining which ones to show for different topics evidenced in queries.
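As a rough illustration of what ranking authors could look like, the sketch below scores each author by how often they appear on matching documents, counting a primary (first-listed) author more than a secondary one. The weights, the `rank_authors` name, and the document format are all assumptions for illustration; they are not the heuristics from the patent, which this post doesn't spell out.

```python
from collections import defaultdict

# Hypothetical weights: a first-listed (primary) author counts more than
# a secondary author. These values are assumptions, not from the patent.
PRIMARY_WEIGHT = 1.0
SECONDARY_WEIGHT = 0.5


def rank_authors(documents):
    """Return (author, score) pairs sorted from highest to lowest score."""
    scores = defaultdict(float)
    for doc in documents:
        for position, name in enumerate(doc["authors"]):
            scores[name] += PRIMARY_WEIGHT if position == 0 else SECONDARY_WEIGHT
    # Sort by descending score, breaking ties alphabetically by name.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


docs = [
    {"title": "Paper A", "authors": ["Robert Example", "Jane Sample"]},
    {"title": "Paper B", "authors": ["Jane Sample"]},
    {"title": "Paper C", "authors": ["Robert Example"]},
    {"title": "Paper D", "authors": ["Ann Other", "Robert Example"]},
]
ranking = rank_authors(docs)
print(ranking)
```

A real system would presumably combine many more signals, but the idea of turning document metadata into a per-author score is the same.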
Why This is Important
Bing is showing us that authorship markup like that developed by Google isn't strictly necessary for identifying the authors of pages on the Web, or for learning which authors might be experts on specific topics. It doesn't seem like Bing has completely refined this process yet, but they seem to be working on it. While the Google authorship markup approach probably makes it easier for a search engine to learn about authors, and to bring a social element to showing content from some authors to people they are connected to in Google's SPYW approach, not everyone is going to adopt or use Google Authorship.
Setting up authorship is recommended because Google does appear to be working towards a day when content on the Web is tied to its author. That could mean that a reputation score associated with certain authors might help boost rankings for pages written by authors with good reputation scores for different topics.
But there are likely going to be many pages on the Web that don't have authorship markup, and an approach like the one that appears to be in development from Bing shows how authors can sometimes be identified when they have profile pages, and they write about topics that they could be considered experts on.
So Bing might be able to use a data mining process like this to find authors that might be shown in search results as having an expertise in fishing, in interior design, in designing parts for hybrid cars, and in many other industries covering many different topics.
Showing off your expertise on the Web through profile pages at different sites such as LinkedIn, Google Plus, and many others is often a good idea in that it helps show your credibility and the things you have an interest in. Under Bing's data mining approach, it may also land you in search results with a picture.
The Microsoft patent doesn't mention including pictures from profiles in search results, but the way those pictures are often appearing next to profile pages in Bing's results is a clear sign that Bing is interested in making authors and profiles stand out.