Google Removes 100-Result Search Parameter, Sparking Debate Over AI Training Data Access

BigGo Community Team

Google has quietly removed a search parameter that let users view 100 results on a single page, capping each page at 10 results. While this might seem like a minor interface change, it has sparked significant discussion in the tech community about its impact on AI systems and website visibility.

The removal of the num=100 parameter has raised questions about how AI companies gather training data and whether they should rely on Google's search results in the first place. Many community members view this as an expected move rather than a surprising development.

Search Result Limitations:

  • Previous limit: 100 results per page via num=100 parameter
  • New limit: 10 results per page (hard limit)
  • Impact: 86% of websites saw decreased impressions according to Search Engine Land
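To make the change concrete, here is a minimal Python sketch of what it means for a tool that wants the top 100 results for one query. It assumes the publicly visible `q`, `num`, and `start` query parameters; the helper names `single_page_url` and `paginated_urls` are hypothetical, and the code only builds URLs without sending any requests.

```python
from urllib.parse import urlencode

BASE = "https://www.google.com/search"


def single_page_url(query, num=100):
    """Old approach: one URL asking for `num` results (no longer honored)."""
    return f"{BASE}?{urlencode({'q': query, 'num': num})}"


def paginated_urls(query, total=100, per_page=10):
    """Current situation: one URL per page of 10, stepping the `start` offset."""
    return [
        f"{BASE}?{urlencode({'q': query, 'start': offset})}"
        for offset in range(0, total, per_page)
    ]


urls = paginated_urls("common crawl")
print(len(urls))  # 10
```

Under these assumptions, the same 100 results now take ten page loads instead of one, which is where the added cost for rank trackers and scrapers comes from.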

Technical Solutions Already Exist

The tech community quickly pointed out that alternative solutions are readily available. Common Crawl, an open repository of web data, provides one such alternative for companies seeking comprehensive web content. Several developers noted that building custom web crawlers isn't particularly complex, suggesting that AI companies will likely develop their own search systems within months.
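As a rough illustration of why developers describe basic crawling as approachable, the sketch below shows the core loop of a breadth-first crawler using only the Python standard library. It is a simplified example, not any company's implementation: the names `LinkExtractor`, `extract_links`, and `crawl` are hypothetical, and an in-memory dictionary of pages stands in for real HTTP fetches.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkExtractor(HTMLParser):
    """Collects absolute, fragment-free URLs from <a href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            absolute, _fragment = urldefrag(urljoin(self.base_url, href))
            self.links.append(absolute)


def extract_links(html, base_url):
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


def crawl(pages, seed, limit=100):
    """Breadth-first traversal over `pages`, a dict of URL -> HTML."""
    seen, queue, order = {seed}, deque([seed]), []
    while queue and len(order) < limit:
        url = queue.popleft()
        order.append(url)
        for link in extract_links(pages.get(url, ""), url):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order


# Toy "web" used in place of network requests.
pages = {
    "https://example.com/": '<a href="/a">A</a> <a href="b#top">B</a>',
    "https://example.com/a": '<a href="/">Home</a> <a href="/b">B</a>',
    "https://example.com/b": "",
}
print(crawl(pages, "https://example.com/"))
```

A production crawler would also need real fetching, rate limiting, robots.txt handling, and deduplication at scale, so this covers only the easy part the commenters were referring to.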

However, the discussion revealed that collecting pages and ranking them are very different challenges. While crawling web pages is straightforward, ranking results effectively remains Google's core strength. Some community members questioned whether ranking matters as much for AI systems with large context windows as it does for human users browsing search results.

Questioning the Original Claims

Community discussion challenged several assumptions in the original reporting. Multiple users pointed out that major AI companies like OpenAI use Bing for search functionality, while Claude reportedly uses Brave Search. This suggests that the impact on AI training pipelines may be less severe than initially claimed.

"I thought OpenAI was using Bing. Gemini obviously will use Google but to them the restriction does not apply. Claude says it uses Brave."

The community also noted that major AI vendors typically operate their own crawling systems rather than relying on Google's search interface, making the parameter removal less significant for established players.

Alternative Data Sources for AI Companies:

  • Common Crawl: Open web crawling repository
  • Bing Search API: Reportedly used by OpenAI
  • Brave Search: Reportedly used by Anthropic's Claude
  • Custom crawlers: OpenAI's OAI-SearchBot (search) and GPTBot (training data)
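Because these vendor crawlers identify themselves by user agent, site owners can treat them separately in robots.txt. The sketch below uses Python's standard `urllib.robotparser` to show how such rules are evaluated. The robots.txt content and the `allowed` helper are made up for illustration and are not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: block the training crawler, allow the search crawler.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: *
Disallow: /private/
"""


def allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if `user_agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(allowed("GPTBot", "https://example.com/article"))         # False
print(allowed("OAI-SearchBot", "https://example.com/article"))  # True
```

Rules like these only affect crawlers that choose to honor robots.txt, and they are unrelated to how many results Google displays per page.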

Market Opportunities Emerge

The change has created potential business opportunities for search technology experts. Former Google search engineers and similar specialists could capitalize on the growing demand for independent search infrastructure as AI companies seek alternatives to Google's ecosystem.

The discussion highlighted that while Google's ranking algorithms took years to develop, the fundamental technologies for crawling and indexing are well understood. This suggests that determined companies with sufficient resources could build competitive alternatives, though replicating Google's authority and pattern recognition capabilities would require significant investment.

The community response indicates that while Google's move affects some systems, the tech industry is already adapting with alternative approaches and solutions.

Reference: Google just cut off 90% of the internet from AI - no one's talking about it