Google’s Fabricated Summaries and This Website

Talking about the way Google works now requires some backstory and, unfortunately, I find this information very dry. You might, too. So I apologize for the next few paragraphs, but I promise they lead somewhere. You just need to trust me a little.

On a search results page where Google shows ten blue links, it can sometimes be difficult to see any difference between its presentation now and how that same page would have looked ten or twenty years ago. Sometimes you will see many more ads and occasionally there are features like “People Also Ask”. But if you are simply looking at web page results, it is hard to see a difference: there is a title, a URL path, a date, and a summary of the page.

A search results page gives the impression of being a semi-neutral index of webpages related to a query, ranked mostly by their quality and relevance, almost as if you had asked a librarian for books on a specific topic and they had returned with ten volumes of substantial authority. These days, it is anything but.

There is an entire industry, composed mostly of charlatans, dedicated to boosting the ranking of clients’ pages within these results. Sometimes, this can be for nominally ethical reasons; a local restaurant’s own webpage should arguably rank more highly than its Facebook page or Skip the Dishes listing, for example. But this work, no matter how honest its intent, is still manipulation, and it has created a somewhat adversarial relationship between Google and search engine optimization experts.

One thing Google has begun to do to combat misbehaviour is generate page titles and descriptions itself instead of using the ones provided by the webpage. Its criteria for doing so are vague, like much of Google’s documentation; the company often says this is because it is trying to avoid tipping off those who may abuse its systems. Any time a machine performs this kind of task, there is a risk of introducing untraceable inaccuracies. And here is where this story becomes about the little website you are reading now.

I found my 2017 review of the then-new base model iPad today as I was looking for something related to WWDC announcements this year, after I searched Google for “ipad ram site:pxlnv.com”. The page title appeared in the result as written on the page, but the description was clipped from a section of the article where I specifically describe RAM. And then I noticed something extra weird. Here is a reproduction of that result as it appeared on Google (emphasis theirs):

https://pxlnv.com › blog › 2017-ipad

The 2017 iPad – Pixel Envy

Jul 17, 2017 — Both sizes of iPad Pro also pack 4 GB of RAM. Perhaps the most noticeable difference, though, is that the iPad Pro line is where Apple’s …

Pros and cons: Very fast custom processors ⋅ Pretty great value ⋅ Greater storage options ⋅ View full list

The description has been pulled from what Google believes is a more relevant section of this review. More notable, though, is how Google has generated a “pros and cons” list for this post, and it is wrong.

In the review, I used the phrase “very fast custom processors” only in the context of discussing the iPad Pro line (emphasis added):

[…] Both sizes of iPad Pro represent Apple’s ideals of what tablet-based computing should look like: responsive, big, wide colour displays, very fast custom processors, lots of RAM, and Apple Pencil support.

The phrase “greater storage options” appears twice in the review: once in the context of the iPad Pro, and once as a speculative enhancement of the base iPad (I should edit my writing better):

[…] The Pro models also come with more speakers, newer Touch ID sensors, better cameras — at the expense of a bump — have Apple’s Smart Connector for accessories, and are available with greater storage options. […]

[…]

[…] At some point, the base iPad could conceivably get a newer Touch ID sensor, better cameras, and greater storage options. […]

Google presents both of those phrases as qualities I attributed to the 2017 base model iPad, but I did not say that about either. (The third phrase, “pretty great value”, is cited correctly in context.) I did not make a list of “pros and cons” anywhere in my review; neither word appears anywhere in its text. But most upsetting is that Google does not make it apparent anywhere on this results page that it is responsible for this description, not me.

If a person quoted me as saying I thought two benefits of the 2017 base model iPad were its “greater storage options” and “very fast custom processors” based on this review, you would question their reading comprehension. If this citation were in something as simple as a high school paper, the student would be docked marks for taking statements out of context. Yet Google is able to do this in an entirely automated way without clearly identifying it as such, and it is supposed to be okay? No, thanks.

It appears this “pros and cons” list is generated for several reviews, though I have not found documentation for it. I had not noticed it until recently, though it is possible it was rolled out as part of last year’s batch of product review ranking updates. I am not sure exactly what causes it to appear for some reviews and not others. It often generates a fair summary of key points, but that does not compensate for the times it gets things wrong in an invisible way.

I do not monitor my Google search results, nor do I take any particular steps to optimize for them. I have set up my website with Google Search Console to check for broken links and the like. That tool is supposed to help me, as an administrator, manage the way Google indexes my website. I can find nowhere to report this inaccurate summary.

It is not so bothersome that I will spend any more time working on it. The most concerning aspect of this — the thing that got me to write this piece — is how it is not clear from these results that Google is responsible for the contents of these titles and descriptions, nor that they may differ from those specified by the author. This is dishonest and I feel cheated as a search user.