Search engines look at hundreds of factors associated with websites to determine their rankings. One factor that may play a role, given the number of patents and comments about it, is the sentiment of texts.

For reference, here are some Google patents and related comments on the importance or use of sentiment classification:

Sentiment detection as a ranking signal for reviewable entities (link to Google Patent)

https://googleblog.blogspot.com/2010/12/being-bad-to-your-customers-is-bad-for.html

Internal Google System for Large-Scale Sentiment Analysis for News and Blogs

https://patents.google.com/patent/US7987188B2/en

One reason search engines may look at sentiment is to estimate how much an entity or brand can be trusted. We know from the latest Google Quality Raters Guidelines that Google treats reputation as an important factor.

In this blog post we will not speculate about the possible mechanisms, but rather just look at the data to see what story it tells.

Analysis

Let us first look at the concept of sentiment itself. A sentence can be labelled as positive if it has a positive meaning, e.g.:

I am happy.

Today was a good day.

This product is great.

Or it can be a negative one:

She was so disappointed.

They were tired.

The service and food in the restaurant were terrible.

To determine the sentiment of millions of sentences from webpages, we employed a deep learning model (an LSTM neural net).

Deep learning models, when trained on millions of labelled sentences, are pretty good, arguably better than humans, at assessing the sentiment of texts.

We decided to examine the sentiment of two types of text in ranked pages: their titles and their body text.

The sentiment of a webpage was calculated by determining the sentiment of each sentence on the page separately and averaging those values to get the final webpage sentiment. We considered around 200,000 ranked pages with more than 10 million sentences.
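
As a rough illustration of the averaging step described above, here is a minimal Python sketch. The `score_sentence` function is a hypothetical stand-in based on a toy word list, not the LSTM model used in the analysis. The sentence splitter is deliberately naive, and the 0 to 1 score scale is an assumption made for the example.

```python
import re
from statistics import mean

# Hypothetical stand-in for the LSTM classifier: a toy word list, for illustration only.
POSITIVE = {"happy", "good", "great"}
NEGATIVE = {"disappointed", "tired", "terrible"}


def score_sentence(sentence: str) -> float:
    """Return a toy sentiment score in [0, 1]; 0.5 means neutral."""
    words = re.findall(r"[a-z']+", sentence.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos + neg == 0:
        return 0.5
    return pos / (pos + neg)


def split_sentences(text: str) -> list[str]:
    """Naive split after '.', '!' or '?'; a real pipeline would use a proper tokenizer."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def page_sentiment(text: str) -> float:
    """Average the per-sentence scores to get one score for the whole page."""
    sentences = split_sentences(text)
    if not sentences:
        return 0.5  # assumption: treat an empty page as neutral
    return mean(score_sentence(s) for s in sentences)


page = "This product is great. They were tired. Today was a good day."
print(page_sentiment(page))
```

Because every sentence gets equal weight, a long page with a few strongly negative sentences can still come out mildly positive overall. Weighting by sentence length would be an alternative design, but it is not what the analysis describes.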

Website Texts Sentiment vs. Rankings

The results for sentiment of website texts are as follows:

The R squared statistic is 0.92*.

The chart tells us that higher-ranked websites seem to have a more positive sentiment. The effect amounts to a difference in sentiment of around 1.0% between ranking positions 1 and 10.

* The results should be evaluated taking into account that sentiment may be related to many other independent variables that were not included in the model, which can introduce bias (so-called omitted-variable bias).
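
For readers who want to see how an R squared figure of this kind can be computed, here is a minimal Python sketch. It assumes a straight-line least-squares fit of average sentiment against ranking position, which the post does not spell out. The sentiment values in `example_sentiment` are made up for illustration and are not our data.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit y = intercept + slope * x; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys):
    """Share of the variance in ys explained by the fitted line."""
    slope, intercept = linear_fit(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Ranking positions 1..10 and made-up average sentiment per position (illustrative only).
ranks = list(range(1, 11))
example_sentiment = [0.612, 0.611, 0.609, 0.610, 0.608, 0.607, 0.607, 0.605, 0.606, 0.604]

slope, intercept = linear_fit(ranks, example_sentiment)
print("slope:", slope)
print("R squared:", r_squared(ranks, example_sentiment))
```

A negative slope here means sentiment falls as the ranking position number increases, which is the direction of the trend described above. A high R squared on averaged values says the averages line up well. It says nothing about how much individual pages scatter around them.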

Website Titles Sentiment vs. Rankings

We continue the analysis by looking at the sentiment of website titles vs. their rankings:

Again, the trend is the same: higher-ranked pages have more positive sentiment, although R squared is somewhat lower (0.66).

It is quite interesting to find such a relationship, and we have thought internally about various possible mechanisms through which sentiment could act via other important ranking factors.

For example, on average we may like reading negative content less than positive content. Thus we may stop reading negative content marginally earlier than positive content.

This may then translate to a slightly higher bounce rate for negative texts compared with positive texts. If bounce rate is a negative factor for rankings, this channel could contribute to the observed effect. We found at least 10 other interesting indirect effect “channels”, and the reader may speculate about their own explanations for the observed effect of higher-ranked pages having a more positive sentiment than lower-ranked pages.
