There has long been talk of Google search results getting worse.
Whenever I heard people make this claim, I tended to dismiss it as a feeling.
Now, in a comprehensive study titled "Is Google Getting Worse? A Longitudinal Investigation of SEO Spam in Search Engines", researchers Janek Bevendorff, Matti Wiegmann, Martin Potthast, and Benno Stein set out to address these growing concerns about the apparent decline in search result quality.
The research team, spread across Leipzig University, Bauhaus-Universität Weimar, and ScaDS.AI in Germany, embarked on a year-long empirical analysis to delve into how SEO-affiliated content affects the performance of three major search engines: Google (proxied by Startpage), Bing, and DuckDuckGo.
I think this is super interesting - and there are also some takeaways on which SEO tactics actually work!
Main conclusion: Google is infected with SEO spam
Overall, the main conclusion is this:
The search results show a disproportionate amount of SEO spam content, including affiliate links. And it is getting harder to tell real content from spam, especially with AI-generated content. Even though search engines update their algorithms to fight SEO spam, the improvements are only temporary.
The main takeaways outlined by the research team are:
- Search Engines Struggle with SEO Spam: All examined search engines, including Google (via Startpage), Bing, and DuckDuckGo, have significant challenges in filtering out highly optimized SEO spam, particularly within their product review search results.
- Prevalence of Affiliate Content: A vast majority of the high-ranking product review pages in the search results use affiliate marketing strategies. This is disproportionate to the overall presence of affiliate marketing on the web.
- Amazon Associates Domination: The Amazon Associates program is the most frequently used affiliate network in the product review content discovered in search engine results, indicating a potential imbalance favoring Amazon's ecosystem.
- Inverse Relationship: There is an observable inverse relationship between the use of affiliate marketing and the complexity of the content. Pages with a higher count of affiliate links typically have less complex text, which could indicate lower-quality content.
- Content Quality Concerns: The study finds that the text complexity and quality of top-ranking pages might be declining, with trends suggesting simplified, repetitive, and potentially AI-generated content.
- Dynamic Adversarial SEO: The continuous adjustments by search engines to combat SEO spam portray an ongoing, dynamic adversarial relationship. SEO practitioners adapt rapidly to algorithm changes, making permanent improvements difficult to maintain.
- Temporary Effects of Updates: Search engine updates do affect the rankings of SEO-optimized content temporarily, demonstrating momentary improvements in search quality. However, these gains are usually short-lived as SEO tactics evolve.
- Increasing Content and Link Spam: The search engines display vulnerability to large-scale spam campaigns, especially those involving extensive affiliate links which blur the line between quality content and spam.
Which SEO tactics and methods actually work, according to the study
This may not be what the researchers intended with their study, but if we look into their findings, we can actually dissect which practices work from an SEO perspective:
- Keyword Optimization: The research indicated that higher-ranking pages seem to have a strategic use of keywords, especially in headings and titles. This suggests that careful keyword optimization might influence visibility in search rankings.
- Content Structure: The study showed that pages that rank better are often more structured with a proper use of headings and shorter URL paths, which could signal higher content quality to search engines.
- Text Complexity: Interestingly, the research observed that top-ranking pages tended to have lower text complexity, which might suggest that simpler, more straightforward content could be more favorable in search rankings. This is in line with what we covered earlier on readability's effect on SEO.
- Frequent Content Updates: The study noted fluctuations in rankings in response to search engine updates. This suggests that continually updating content to align with the latest SEO trends could be a tactic to maintain or improve rankings.
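To make the text-complexity point above concrete: a common readability proxy is the Flesch Reading Ease score, where higher values mean simpler text. The study's exact complexity measure is not reproduced here; this is a minimal illustrative sketch using a rough vowel-group heuristic for syllable counting.

```python
import re

def count_syllables(word: str) -> int:
    # Rough heuristic: count groups of consecutive vowels (at least 1).
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_reading_ease(text: str) -> float:
    # Flesch Reading Ease: 206.835 - 1.015*(words/sentences)
    #                              - 84.6*(syllables/words).
    # Higher scores indicate simpler, easier-to-read text.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))

simple = "This phone is good. It is cheap. Buy it now."
dense = ("Notwithstanding considerable heterogeneity in evaluation "
         "methodology, longitudinal quality assessments remain inconclusive.")
print(flesch_reading_ease(simple) > flesch_reading_ease(dense))  # True
```

Short, punchy product-review copy scores far higher than dense academic prose, which matches the study's observation that top-ranking pages skew toward low complexity.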
Not many surprises with regard to common best practices among skilled SEO professionals, but it is always nice to get them re-confirmed.
The methods of the search result study
The researchers evaluated a robust dataset comprising 7,392 product review queries and their corresponding search results.
The study centered on the substantial amount of search engine-optimized (SEO) but low-quality content, particularly relating to product reviews, which were postulated to be influenced by SEO spam techniques, including affiliate marketing.
The research showed that when you search online, the top results you see are more often the kind of pages that have been specially made to appear high in searches (SEO-optimized), rather than being naturally popular or useful.
So, there's more of this kind of content than you'd expect compared to the vast amount of other stuff on the internet.
To figure out if the search results were full of pages designed to rank high (SEO-optimized), the researchers looked at several signs on these pages.
Here's how they did it:
- Checking Words and Structure: They looked at the text and structure of web pages. If a page had a lot of repetition or was super easy to read but didn't offer much real information, it was probably designed to show up high in search results.
- Counting Affiliate Links: They counted how many times a page linked to products where the page owner might earn money if you buy something (these are called affiliate links). More of these links could mean the page was made more for making money than for being helpful.
- Long-Term Tracking: They watched what kind of pages showed up in search results over a year. This way, they could see if the same types of designed pages kept appearing or if they went away after search engines updated their systems.
- Comparing Search Engines to a Baseline: They compared what search engines like Google, Bing, and DuckDuckGo showed with what a basic search tool found when looking through a huge set of typical web pages. The basic tool didn't try to guess what's important; it just showed pages based on simple rules. If the big search engines showed more specially designed pages than the basic tool, then it meant those pages were probably being pushed higher on purpose.
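The affiliate-link counting step above can be sketched in a few lines. Amazon Associates links identify the partner through a `tag` query parameter in the URL, so a simple detector can look for that. This is only a toy illustration of the idea, not the study's actual classifier, and the example URLs and tag value are made up.

```python
import re
from urllib.parse import urlparse, parse_qs

def count_amazon_affiliate_links(html: str) -> int:
    # Count <a href="..."> links that look like Amazon Associates links:
    # an amazon.* host plus a 'tag' query parameter carrying the partner ID.
    hrefs = re.findall(r'href="([^"]+)"', html)
    count = 0
    for href in hrefs:
        parsed = urlparse(href)
        if "amazon." in parsed.netloc and "tag" in parse_qs(parsed.query):
            count += 1
    return count

page = '''
<a href="https://www.amazon.com/dp/B0TEST?tag=examplesite-20">Best blender</a>
<a href="https://www.amazon.com/dp/B0OTHER">Plain product link</a>
<a href="https://example.com/review">Independent review</a>
'''
print(count_amazon_affiliate_links(page))  # 1
```

A page stuffed with dozens of such links, combined with low text complexity, is exactly the signal combination the researchers used to flag likely SEO spam.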
While the methodology adopted in the study provides valuable insights, it may have some limitations or potential areas of inaccuracy, which include:
- Rapidly Changing SEO Tactics: Since SEO strategies continually evolve, some of the patterns and tactics the researchers identified as indicators of SEO optimization might become outdated quickly. What was identified as SEO spam during the study may become a standard or accepted practice or may be replaced by new techniques that the study did not account for.
- Limited Scope of SEO Indicators: The study relies on specific features to gauge SEO optimization, such as text complexity and affiliate link counts. However, SEO encompasses a broader range of tactics, some of which are more subtle and may not have been captured by the features analyzed.
- False Positives in Spam Detection: The methods for identifying SEO spam could inadvertently label legitimate, high-quality content as spam. The criteria they used might not always align with actual content quality, potentially misrepresenting the presence of SEO spam in search results.
- Sample Limitation of Query Set: The study focused on product review queries, which may not be representative of all types of queries users might enter. Different types of content or queries could experience different levels of SEO optimization and spam, so the study's findings might not apply to all search scenarios.
- Baseline Comparison May Not Reflect Average User Experience: The use of a basic search tool (BM25 on ClueWeb22) as a baseline for comparison assumes that this simpler system would present an accurate portrayal of non-SEO-optimized content distribution. However, this baseline might not accurately reflect the average user's search experience or needs, leading to skewed comparisons.
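For context on that baseline: BM25 is a classical lexical ranking function that scores documents purely on term frequency, document length, and how rare each query term is across the collection - no popularity signals, no link analysis. The sketch below uses the conventional defaults k1 = 1.2 and b = 0.75; the toy documents are invented and are not from ClueWeb22.

```python
import math
from collections import Counter

def bm25_rank(query: str, docs: list[str], k1: float = 1.2, b: float = 0.75):
    # Rank documents for a query with the classical BM25 formula and
    # return them sorted by descending score.
    tokenized = [d.lower().split() for d in docs]
    avgdl = sum(len(d) for d in tokenized) / len(tokenized)
    n = len(docs)
    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        score = 0.0
        for term in query.lower().split():
            df = sum(1 for d in tokenized if term in d)
            idf = math.log((n - df + 0.5) / (df + 0.5) + 1)
            freq = tf[term]
            # Length normalization penalizes longer-than-average documents.
            score += idf * (freq * (k1 + 1)) / (
                freq + k1 * (1 - b + b * len(doc) / avgdl))
        scores.append(score)
    return [doc for _, doc in sorted(zip(scores, docs), reverse=True)]

docs = [
    "best blender review with honest testing notes",
    "blender blender blender buy now best deal",
    "how to cook pasta at home",
]
ranking = bm25_rank("best blender", docs)
print(ranking[0])  # the keyword-stuffed page ranks first here
```

Note how even this naive baseline rewards keyword repetition, which is part of why comparing the big engines against it says something about how much *additional* SEO-optimized content they surface.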
The difference between ‘SEO Spam’ and regular SEO
In their study, the researchers make a distinction between what they classify as "SEO Spam" and more acceptable, general forms of search engine optimization (SEO).
General SEO
General SEO refers to the legitimate techniques that webmasters use to make their content more visible in search engine results.
These practices are meant to help search engines understand the content so that it can be more easily found by people looking for that information. It includes optimizing website structure, improving page loading speed, making sure content is relevant to search queries, and so forth.
The goal of general SEO is to enhance user experience and match users' search intent with quality content.
SEO Spam
On the other hand, SEO Spam, as identified in this study, describes aggressive and often deceptive practices that aim to manipulate search engine rankings primarily for financial gain, without necessarily providing value or relevant content to users.
This includes stuffing pages with excessive affiliate links, keyword stuffing (overusing certain phrases to try to get more visibility), creating low-quality content that is over-optimized for search engines with the aim of ranking highly rather than being useful, and other tactics that prioritize ranking over user benefit.
The researchers see SEO Spam as a subset of SEO that detracts from the value of search results due to its exploitative nature, whereas general SEO is seen as enhancing the value of search results by improving content visibility in a user-centric way.
The investigation aimed at identifying and measuring the prevalence of SEO Spam within the top search results, which they suggest is an indication of declining search quality, while recognizing that not all SEO should be categorized as spam.