PulseAugur

AI crawlers ignore robots.txt and attempt to scan a database

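Several AI-driven web crawlers, including Anthropic's Claude and OpenAI's GPT bot, were observed ignoring robots.txt directives and attempting to scan a database. These bots, along with others from Baidu, Amazon, Meta, and Yandex, were all blocked by the server administrator. The administrator expressed frustration, saying that these large companies are trying to steal resources and that a simultaneous surge of such bots can render a server unusable, citing a recent incident on their PieFed server.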

Impact: AI crawlers are actively scraping data, which may affect server resources and data privacy on small platforms.

Ranking rationale: This is a user complaint about AI crawlers, not a direct release or announcement from a frontier lab.

Read on Mastodon (fosstodon.org) →

AI-generated summary · Google Gemini · from 1 source.

Sources [1]

  1. Mastodon — fosstodon.org TIER_1 English(EN) · [email protected] ·

    In the past 24 hours, these spiders have ignored my robots.txt file and tried scanning the database anyway. They were all blocked. Claude (thousands more attempts than the others), Baiduspider, Amazonbot, Bytespider, gptbot, Meta-ExternalAgent, YandexBot, ChatGPT, ByteSpider, CommonCrawl …
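
The post does not say how the crawlers were blocked. As one possible illustration, a server could reject requests whose User-Agent header contains one of the names listed above. The sketch below is an assumption, not the administrator's actual setup: the function names `is_blocked_agent` and `status_for_request` are hypothetical, and the tokens are copied from the post, so the User-Agent strings that real crawlers send may differ.

```python
import re

# Tokens copied from the quoted post (assumption: each one appears
# somewhere in the crawler's User-Agent header). Matching ignores case,
# so "Bytespider" and "ByteSpider" collapse into a single entry.
BLOCKED_AGENT_TOKENS = (
    "claude",
    "baiduspider",
    "amazonbot",
    "bytespider",
    "gptbot",
    "meta-externalagent",
    "yandexbot",
    "chatgpt",
    "commoncrawl",
)

# One compiled alternation; re.escape keeps the hyphen and any other
# special characters literal.
_BLOCKED_PATTERN = re.compile(
    "|".join(re.escape(token) for token in BLOCKED_AGENT_TOKENS),
    re.IGNORECASE,
)


def is_blocked_agent(user_agent):
    """Return True if the User-Agent string contains a blocked token."""
    if not user_agent:
        return False
    return _BLOCKED_PATTERN.search(user_agent) is not None


def status_for_request(headers):
    """Hypothetical helper: choose an HTTP status from request headers."""
    user_agent = headers.get("User-Agent", "")
    return 403 if is_blocked_agent(user_agent) else 200
```

User-Agent matching only stops crawlers that identify themselves honestly. A crawler that spoofs a browser string would pass this check, so a real deployment would likely combine it with rate limiting or IP-based rules.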