Pulse

last 48h

[50/245] 97 sources

What AI is actually talking about — clusters surfacing on Bluesky, Reddit, HN, Mastodon and Lobsters, re-ranked to elevate originality and crush noise.

TOOL · Mastodon — mastodon.social English(EN) · 7h · MASTO

🤖 New Platform Uses Cryptographic Invisibility to Protect AI-Built Applications 📝 Atsign’s AI Architect applies cryptographic protections to agentic s... https:

Atsign has launched a new platform called AI Architect that uses cryptographic invisibility to secure AI-driven applications. This technology aims to protect AI agents and their associated applications from unauthorized access and manipulation. The platform is designed to enhance the security posture of AI systems by embedding cryptographic protections directly into their architecture. AI

IMPACT Enhances security for AI applications by integrating cryptographic protections, potentially reducing risks associated with AI agent manipulation.
TOOL · Mastodon — fosstodon.org 日本語(JA) · 10h · MASTO

Naoki Kuramoto, Professor at Tohoku University and Chairman of the University Entrance Examination Society, who is knowledgeable about university entrance exams, said, "Strict identity verification is essential for fair entrance exams, including facial and fingerprint recognition... / "Is a biometric authentication system necessary for 'impersonation countermeasures' after AI-generated photos bypass Kindai University's entrance exam?" https://htn.to/vr6a7yqCym #incident #AI #crime #generativeAI #

A professor from Tohoku University, Naoki Kuramoto, has raised concerns about the necessity of strict identity verification methods, such as facial or fingerprint recognition, for fair university entrance exams. This discussion is prompted by an incident where AI-generated photos bypassed initial identity checks at Kindai University. The situation highlights the growing challenge of preventing impersonation in academic settings due to advancements in AI technology. AI

IMPACT Highlights the need for enhanced identity verification systems in educational institutions to counter AI-driven impersonation tactics.
TOOL · r/cursor English(EN) · 10h · REDDIT

Can a fake Sentry issue trick your coding agent into running a malicious npm package?

A new attack campaign targets coding agents like Cursor and Claude Code by exploiting unauthenticated Sentry error logs. Attackers create fake Sentry issues that prompt the agent to run a malicious npm package disguised as a diagnostic tool. While one agent successfully identified and blocked the typosquatted package, the vulnerability highlights concerns about the security of agent inputs and execution permissions. AI

IMPACT Highlights potential security risks for AI coding assistants, necessitating robust input validation and permission controls.
TOOL · r/StableDiffusion English(EN) · 10h · REDDIT

How to bypass Ideogram 4's "Image blocked by safety filter" for swimwear/beachwear (Understanding the filter mechanics)

Users on Reddit are discussing how to bypass Ideogram AI's safety filters, which often block images of swimwear and beachwear. The issue appears to stem from specific trigger words in the prompt rather than image analysis. By describing the scene and persona instead of explicitly naming clothing items like 'bikini,' users can generate appropriate images without triggering the filter. AI

IMPACT Workarounds for AI safety filters may become more common as users seek to generate specific content.
TOOL · Mastodon — mastodon.social 日本語(JA) · 10h · MASTO

Microsoft's 73 GitHub repositories disabled due to malware compromising AI users' credentials - GIGAZINE https://www.yayafa.com/2818682/ # AgenticAi # AI # ArtificialGeneralIntelligence # Arti

Microsoft has disabled 73 GitHub repositories due to a malware attack that targeted AI users. The malware was designed to steal user credentials, compromising accounts that interacted with AI-related tools. This incident highlights the security risks associated with AI development and usage. AI

IMPACT Highlights security vulnerabilities in AI development tools and user credentials.
TOOL · Mastodon — fosstodon.org English(EN) · 11h · MASTO

｢ using a VPN connection with an IP address that is in or near the target’s usual hometown, requesting a password reset for the account, and then choosing to ch

Hackers have exploited Meta's AI support assistant to gain unauthorized access to Instagram accounts. The attackers used a VPN to mask their location, then initiated a password reset and interacted with the AI chatbot to complete the process. This method allowed them to seize control of user accounts. AI

IMPACT Highlights a new vulnerability in AI-powered customer support systems, potentially impacting user account security across platforms.
TOOL · Mastodon — fosstodon.org Čeština(CS) · 12h · MASTO

An AI chatbot as customer support sounds great. It never sleeps, doesn't take holidays, answers (almost) immediately, and the company doesn't have to deal with the fact that a person on the line occasionally raises an eyebrow.

Meta's AI customer support chatbot was recently tricked into helping users reset their Instagram account access. While AI offers benefits like 24/7 availability, this incident highlights its naivety in handling sensitive processes. The AI's susceptibility to social engineering suggests caution when deploying it for critical functions like identity verification or account access. AI

IMPACT Highlights the need for robust security and human oversight in AI customer support systems to prevent social engineering attacks.
TOOL · Mastodon — sigmoid.social English(EN) · 12h · MASTO

🛡️ # ClawPatrol — a security firewall for # AI agents, from the folks at # Deno It sits between your agents and prod, parses their traffic at the wire, and gate

Deno has released ClawPatrol, an open-source security firewall designed to protect AI agents. This tool acts as an intermediary, inspecting traffic and enforcing custom rules to prevent unauthorized actions. ClawPatrol addresses the risk of API key exposure and accidental or malicious modifications to production environments by parsing agent communications. AI

IMPACT Provides a security layer for AI agents, mitigating risks associated with API key management and prompt injection.
RESEARCH · Mastodon — fosstodon.org English(EN) · 13h · [2 sources] · MASTO

🤖 Doctors and NHS could be sued for mistakes made by AI tools, report warns Medical Protection Society calls for law to be overhauled to help medics avoid liabi

A report from the Medical Protection Society suggests that doctors and the NHS could face lawsuits for errors made by AI tools. The society is advocating for an overhaul of current laws to shield medical professionals from liability when AI systems make mistakes. This raises significant questions about accountability and regulation in the use of AI within healthcare. AI

IMPACT Potential for new legal frameworks governing AI in healthcare, impacting adoption and liability for medical professionals and institutions.
TOOL · Mastodon — mastodon.social English(EN) · 17h · MASTO

🤖 OpenAI’s Lockdown Mode is trying to solve the prob... 📝 OpenAI’s move t... https://www. csoonline.com/article/4182650/ openais-lockdown-mode-is-trying-to-solv

OpenAI has introduced a new "Lockdown Mode" feature aimed at preventing its AI models from generating harmful or inappropriate content. This feature is designed to address concerns about the potential misuse of AI and to ensure safer interactions with the technology. The move comes as AI safety and responsible development remain critical areas of focus for the company and the broader industry. AI

IMPACT Enhances safety measures for AI interactions, potentially influencing user trust and adoption of AI tools.
RESEARCH · r/singularity English(EN) · 22h · REDDIT

Financial Times: New AI espionage powers trigger Putin camera scare | Russia paused surveillance system after killing of Iran’s Supreme Leader exposed how AI can be used on CCTV data to target enemies

Russia has reportedly paused its advanced surveillance system following the targeted killing of Iran's Supreme Leader, highlighting concerns about AI's potential for espionage. The incident revealed how AI can be leveraged with CCTV data to identify and target individuals. This development has raised alarms about the misuse of AI in surveillance and its implications for national security and individual privacy. AI

IMPACT Highlights the dual-use nature of AI, prompting governments to reassess surveillance capabilities and potential misuse for targeted operations.
SIGNIFICANT · Mastodon — fosstodon.org English(EN) · 1d · [2 sources] · MASTO

📰 Apple unveils Siri AI makeover as Tim Cook bids farewell The technology giant also revealed a series of new child safety features amid widespread scrutiny ove

Apple has announced a significant AI-driven upgrade to its Siri voice assistant, integrating more advanced capabilities and a conversational tone. This revamped Siri is set to be released in the fall as part of a broader iOS and iPadOS update. Alongside the AI enhancements, Apple also introduced new child safety features, addressing recent concerns. AI

IMPACT This AI-powered Siri aims to make voice interactions more natural and capable, potentially increasing user reliance on voice commands for complex tasks.
RESEARCH · r/ClaudeAI English(EN) · 1d · REDDIT

Anthropic changed their privacy policy today and there's a specific clause that every Claude user needs to know about

Anthropic has updated its privacy policy, set to take effect on July 8, 2026, which allows the company to share user conversation data with law enforcement based on its own internal "good faith belief" without requiring a court order. This new policy removes the previous requirement for legal process and external oversight, raising concerns about potential false positives, especially for creative writing or personal expression that could be misinterpreted by automated classifiers. Users will not be notified if their data is disclosed, and there is no described appeals process. AI

IMPACT Raises significant privacy concerns for AI users and may impact creative expression due to potential misinterpretation of content by automated systems.
TOOL · Mastodon — fosstodon.org English(EN) · 1d · [8 sources] · HNMASTO

Microsoft's open source tools were hacked to steal passwords of AI developers https:// techcrunch.com/2026/06/08/micr osofts-open-source-tools-were-hacked-to-st

Microsoft has temporarily disabled dozens of its open-source projects on GitHub following a security breach. Hackers reportedly injected malware into these tools, which are used by AI developers, to steal user passwords and credentials. This incident marks a second breach of Microsoft's open-source projects in recent weeks, raising concerns about the security of software supply chains. AI

IMPACT Compromised AI development tools could disrupt workflows and expose sensitive data, potentially slowing down AI project development.
RESEARCH · Mastodon — mastodon.social English(EN) · 1d · MASTO

Apple Says Its New Google-Infused AI Is All About Privacy https://gizmodo.com/apple-says-its-new-google-infused-ai-is-all-about-privacy-2000768997 # Tech # AI #

Apple has announced its new AI features, branded as "Apple Intelligence," which will integrate AI capabilities across its operating system. Notably, these features will leverage AI models from both Apple and OpenAI, with a focus on user privacy. The company emphasized that user data will not be stored or accessed by Apple or its partners, and requests will be anonymized. AI

IMPACT This integration could significantly boost AI adoption by making advanced AI features accessible and user-friendly across Apple's vast ecosystem.
RESEARCH · Mastodon — fosstodon.org English(EN) · 1d · MASTO

Miasma Worm: il supply chain attack che ha colpito 73 repository Microsoft su GitHub Un worm auto-replicante chiamato Miasma ha compromesso 73 repository Micros

A sophisticated supply chain attack, dubbed Miasma, has compromised 73 Microsoft repositories on GitHub, including critical ones for Azure and MicrosoftDocs. This self-replicating worm, a variant of Mini Shai-Hulud, exploits trust in development ecosystems rather than technical vulnerabilities, making malicious updates indistinguishable from legitimate ones. A particularly concerning aspect is its detonation vector, which leverages AI development tools to automatically execute malicious payloads when a developer clones and opens an infected repository. AI

IMPACT Introduces a novel attack vector where AI development tools become unwitting conduits for malware execution, posing a new risk to software supply chains.
SIGNIFICANT · 404 Media English(EN) · 1d · [3 sources] · MASTO

Microsoft Hacked to Deliver Malware to Claude and Gemini Users

Microsoft has disabled over 70 of its GitHub repositories, including those related to Azure and AI coding agents, following a security incident. Hackers had previously compromised a Microsoft development tool, pushing malicious code that could steal user credentials when accessed through AI coding assistants like Claude Code and Gemini CLI. This action, which involved a coordinated shutdown of repositories by GitHub staff, highlights a significant supply chain attack vector impacting users of these AI tools. AI

IMPACT Highlights a new supply chain attack vector targeting users of AI coding assistants, potentially impacting enterprise security.
TOOL · Mastodon — mastodon.social English(EN) · 3h · MASTO

🤖 Check Point warns of... 📝 Check Point has... https://www. csoonline.com/article/4182898/ check-point-warns-of-ransomware-linked-attacks-exploiting-outdated-vp

Check Point has identified a new ransomware campaign targeting outdated VPN protocols. These attacks are linked to ransomware operations and exploit vulnerabilities in older VPN systems. The cybersecurity firm is warning organizations to update their VPN infrastructure to prevent potential breaches. AI
TOOL · Mastodon — fosstodon.org Deutsch(DE) · 14h · MASTO

🚨 Recently encountered exploits in LiteLLM during a project – the popular open-source AI Gateway. CVE-2026-42271 allows logged-in users command execution.

A critical vulnerability has been discovered in LiteLLM, an open-source AI gateway. CVE-2026-42271 allows authenticated users to execute commands on the server, while a second, unauthenticated exploit also exists. Users are strongly advised to update LiteLLM immediately or restrict access to prevent potential security breaches. AI

IMPACT Critical vulnerabilities in AI gateways like LiteLLM could expose sensitive data and systems, necessitating immediate patching for operators.
TOOL · Mastodon — mastodon.social Italiano(IT) · 14h · MASTO

⚠️ GitHub Repositories Linked to Microsoft Targeted: Malware Targets AI Developers. Supply Chain Security is Increasingly Crucial. #Cybersecurity #

Malware has been discovered targeting AI developers through GitHub repositories associated with Microsoft. This highlights the increasing importance of supply chain security in the software development process. The discovery underscores the need for vigilance against threats that exploit development environments. AI

IMPACT Highlights critical vulnerabilities in the AI development supply chain, necessitating enhanced security measures for developers and platforms.
TOOL · Mastodon — fosstodon.org English(EN) · 14h · MASTO

🔥 رائج 📢 Macos 27 Golden Gate debuts at WWDC 2026 with AI, safety and UI changes - شبكة تواصل الإخبارية 🔗 https:// news.google.com/rss/articles/C BMiU0FVX3lxTE9

Apple's upcoming macOS 27, codenamed "Golden Gate," is set to launch at WWDC 2026. The new operating system will feature significant advancements in artificial intelligence, enhanced safety protocols, and a redesigned user interface. This release marks a major step forward in Apple's integration of AI into its core products. AI

IMPACT Enhances user experience and productivity through integrated AI features in a major operating system.
TOOL · Mastodon — fosstodon.org (CA) · 15h · [2 sources] · MASTO

The Evil Side - Anthropic LLM ATT&CK Navigator https:// elladodelmal.com/2026/06/anthr opic-llm-att-navigator.html # LLM # Anthropic # Cybercrime # ATTACK # M

A new tool, the Anthropic LLM ATT&CK Navigator, has been developed to map the potential attack vectors and vulnerabilities associated with Anthropic's large language models. This navigator aims to provide a structured way to understand and visualize the threat landscape surrounding these AI systems, likely for cybersecurity professionals and researchers. AI

IMPACT Provides a new framework for cybersecurity professionals to assess risks associated with LLMs.
TOOL · Mastodon — fosstodon.org 日本語(JA) · 15h · MASTO

📝 The 'Paradox of Trust' Questions the Vulnerability of Development Culture - Microsoft's 73 Repository Breach Highlights Authentication Crisis in Open Source Ecosystem. Microsoft's 73 GitHub repositories were compromised by malware, leading to the theft of credentials via AI development tools. What are the structural risks of a society dependent on open source that this incident reveals? 🔗 htt

Microsoft's GitHub repositories were compromised through a malicious AI development tool, leading to the theft of authentication credentials. This incident highlights the systemic risks within open-source ecosystems, particularly concerning the security of AI development tools and the broader reliance on open-source software. AI

IMPACT Compromised AI development tools pose a significant risk to the integrity and security of software supply chains.
TOOL · Mastodon — mastodon.social English(EN) · 15h · MASTO

🤖 Meet Hades: The malware that lies to AI security agents 📝 Threat actors are continuing their on... https://www. csoonline.com/article/4182707/ meet-hades-the-

A new malware strain named Hades has been identified that is specifically designed to deceive AI-powered security systems. Threat actors are employing this sophisticated malware to evade detection by AI agents, posing a new challenge to cybersecurity defenses. The development highlights an escalating arms race between malicious actors and AI security tools. AI

IMPACT This development indicates a growing sophistication in malware designed to bypass AI defenses, necessitating advancements in AI security.
TOOL · Mastodon — mastodon.social English(EN) · 15h · MASTO

Defend against frontier cyber models: Cloudflare's architecture as customer zero https://blog.cloudflare.com/frontier-model-defense/ # Security # AI # Networkin

Cloudflare is leveraging its own infrastructure to defend against advanced AI-powered cyber threats. The company is using its extensive network and security architecture as a testing ground, or "customer zero," to develop and deploy defenses against sophisticated attacks. This proactive approach aims to stay ahead of evolving cyber threats that utilize frontier AI models. AI

IMPACT Demonstrates how large infrastructure companies are applying AI to enhance cybersecurity defenses.
TOOL · Mastodon — sigmoid.social English(EN) · 1d · [6 sources] · HNMASTO

https://www. europesays.com/3049434/ Apple Intelligence Can Change Your Passwords for You When You Get Hacked # AgenticAI # AgenticArtificialIntelligence # AI #

Apple's new AI features, branded as Apple Intelligence, include the ability to automatically change user passwords when a security breach is detected. This functionality aims to enhance user security by proactively managing compromised credentials. However, the move has raised concerns about potential risks and unintended consequences associated with AI handling sensitive security information. AI

IMPACT This feature could streamline security management for users, but also introduces new potential vulnerabilities if the AI mismanages credentials.
TOOL · Wired — AI English(EN) · 1d · [4 sources] · MASTO

Meta Deletes Face-Recognition System From Its Smart Glasses App After WIRED Report

Meta has removed facial recognition code from its Meta AI app, which supports its smart glasses, following a WIRED report. The company had embedded unreleased software, internally known as NameTag, designed to identify faces captured by the glasses and compare them against a database. Despite Meta's initial claims that the feature did not exist, the code was present in millions of devices before being stripped out in a subsequent update. AI

IMPACT Meta's swift removal of dormant facial recognition code highlights ongoing privacy concerns with AI in wearable devices.
COMMENTARY · Mastodon — fosstodon.org English(EN) · 1h · [4 sources] · MASTO

Microsoft AI head calls out Anthropic for acting like Claude is conscious Microsoft AI CEO Mustafa Suleyman says it's "really, really dangerous" for Anthropic t

Microsoft AI CEO Mustafa Suleyman has criticized Anthropic for its public statements regarding Claude's consciousness. Suleyman stated that it is "really, really dangerous" for Anthropic to speculate about Claude's consciousness within its operational "constitution." He believes such speculation is misleading and potentially harmful. AI

IMPACT Raises concerns about responsible AI communication and the potential for anthropomorphism in AI models.
COMMENTARY · Mastodon — fosstodon.org English(EN) · 3h · [2 sources] · MASTO

GPT-2: Too Dangerous To Release (2019) https:// naokishibuya.github.io/blog/20 22-12-30-gpt-2-2019/ # HackerNews # GPT2 # AI # Ethics # OpenAI # Technology # Ne

In 2019, OpenAI initially withheld the full release of its GPT-2 language model due to concerns about its potential for misuse. The company cited worries that the model could be used to generate convincing fake news articles or other malicious content. This decision sparked a debate about AI safety and the ethical responsibilities of developers in releasing powerful AI technologies. AI

IMPACT Recalls past ethical considerations in AI development, highlighting the ongoing debate around responsible model deployment.
TOOL · Mastodon — fosstodon.org English(EN) · 22h · MASTO

So bad. # Microsoft # GitHub # AI https:// bsky.app/profile/tyleraking.co m/post/3mnstgaabtc2i → https:// arstechnica.com/security/2026/ 06/for-the-2nd-time-in-

Microsoft's GitHub Copilot Enterprise has been found to contain a credential-stealing malware. This is the second time in weeks that a Microsoft product has been compromised with such malicious software. The vulnerability allows attackers to steal user credentials, posing a significant security risk. AI

IMPACT Security vulnerabilities in AI-powered tools like GitHub Copilot Enterprise can erode trust and hinder adoption.
TOOL · The Guardian — AI English(EN) · 22h · [2 sources] · MASTO

Plan for AI legal assistants in England and Wales ‘cannot replace funding and staff’, lawyers say

The UK government plans to pilot AI legal assistants in England and Wales' crown courts to help reduce case backlogs. Deputy Prime Minister David Lammy will announce the initiative, which aims to save administrative time and expedite justice. However, legal professionals, including the Law Society, have cautioned that the technology should not be used to cut funding or staff, emphasizing the need for thorough evaluation and robust safeguards against AI hallucinations and fabricated case law. AI

IMPACT AI tools are being integrated into the legal system to improve efficiency, but concerns remain about their reliability and potential to replace human roles.
TOOL · Mastodon — fosstodon.org English(EN) · 23h · MASTO

Research reveals that large language models can silently corrupt documents when users delegate editing tasks. A study testing 19 LLMs found that even top models

A recent study has uncovered that large language models can unintentionally corrupt documents when tasked with editing. Researchers tested 19 LLMs, including advanced models like Gemini Pro and Claude Opus, and found that these models altered approximately 25% of content after 20 interactions. The study indicated that less capable models tend to delete content, while more sophisticated ones introduce plausible but incorrect information, with degradation increasing with larger context windows and complex file types. AI

IMPACT Highlights a critical safety concern for AI agents performing document editing, potentially impacting user trust and data integrity.
TOOL · Mastodon — fosstodon.org English(EN) · 1d · MASTO

Expanding Private Cloud Compute - Apple Security Research https:// lobste.rs/s/4xbzbk # ai # privacy # security https:// security.apple.com/blog/expand ing-pcc/

Apple has introduced a new initiative called Private Cloud Compute (PCC) to enhance the privacy and security of AI processing. This system allows AI tasks to be performed on Apple devices rather than relying on external servers. PCC is designed to process sensitive user data locally, ensuring that information is not sent to the cloud and is protected by the device's security architecture. AI

IMPACT Enhances user privacy for AI features by processing data locally on devices.
TOOL · Mastodon — fosstodon.org English(EN) · 1d · MASTO

Apple always emphasizes „security“, but now they’re giving every user a tool to generate or manipulate images using AI without making it obvious through a water

Apple is integrating AI image generation and manipulation tools into its operating system, sparking debate about transparency and security. Critics argue that the lack of clear watermarking or indicators for AI-generated content undermines Apple's stated commitment to security and user trust. This move raises concerns about the potential for misuse and the blurring of lines between authentic and synthetic media. AI

IMPACT Raises questions about the ethical implications and potential misuse of integrated AI image generation tools within mainstream operating systems.
TOOL · Mastodon — fosstodon.org English(EN) · 1d · [2 sources] · MASTO

📰 AirPods are getting a customizable EQ in iOS 27 If you've wanted to tweak your AirPods sound, you'll soon get your chance. 📰 Source: Engadget - Technology New

Microsoft's AI packages have been compromised for the second time in recent weeks, with 73 packages containing a credential-stealing malware. This malicious software activates as soon as an AI agent opens the compromised packages. The discovery highlights a recurring vulnerability in the distribution of AI-related software components. AI

IMPACT Compromised AI packages pose a direct risk to AI agents and their data, potentially disrupting operations and leading to data breaches.
TOOL · Ars Technica — AI English(EN) · 1d · [2 sources] · MASTO

For the 2nd time in weeks, Microsoft packages laced with credential stealer

Microsoft's official open-source packages have been compromised for the second time in recent weeks, with malicious code designed to steal credentials being injected into 73 packages. This code activates when developers use AI coding agents to open the packages, potentially compromising systems by stealing tokens for cloud providers like AWS, Azure, and GCP, as well as password managers and developer tools. The attack, linked to threat actor TeamPCP and using malware known as Miasma, bypasses repository build pipelines by leveraging legitimate Microsoft OIDC tokens. AI

IMPACT Compromised AI development tools and packages pose a significant risk to the security of AI projects and infrastructure.
TOOL · Mastodon — sigmoid.social English(EN) · 1d · [3 sources] · MASTO

Microsoft Hacked to Deliver Malware to Claude and Gemini Users https://www. 404media.co/microsoft-hacked-t o-deliver-malware-to-claude-and-gemini-users/ # tech

Microsoft has taken down numerous GitHub repositories related to its Azure and AI coding agents following a data breach. Hackers planted malware within these repositories, which, when opened by users of AI coding tools like Claude and Gemini, would harvest their credentials. Cybersecurity researchers and Microsoft have confirmed this incident, highlighting a significant security vulnerability. AI

IMPACT This incident highlights security risks for users of AI coding tools, potentially impacting trust and adoption.
TOOL · Mastodon — mastodon.social English(EN) · 1d · MASTO

The real danger is the constant push to replace human workers with AI, all driven by corporate greed to raise profit margins and eliminate the bottom line (whic

Meta's AI support bot for Instagram has been exploited by attackers to gain unauthorized access to user accounts. The exploit involved tricking the bot into changing account email addresses, allowing hackers to take over high-profile accounts, including those associated with the White House and Sephora. Meta has since issued an emergency patch to address the vulnerability. AI

IMPACT Exploited AI systems highlight critical security risks in customer service automation, potentially slowing enterprise adoption.
TOOL · r/ClaudeAI English(EN) · 1d · REDDIT

Tested Claude, GPT-4o, Grok, and Gemini on disclosure under pressure — Claude was the most consistent

A recent probe compared Anthropic's Claude against GPT-4o, Grok, and Gemini, focusing on their consistency in disclosing reservations when presented with false premises or requests for confidence without evidence. Claude demonstrated remarkable stability, consistently surfacing reservations in most test cases, even under pressure. In contrast, GPT-4o showed significantly more divergence, and Claude was the only model to maintain its stance across various pressure tactics, sometimes explicitly identifying the pressure itself. The study also noted Claude's tendency to utilize protocol tools proactively, unlike Gemini. AI

IMPACT Demonstrates Claude's enhanced reliability in maintaining consistent responses, potentially influencing user trust and adoption in sensitive applications.
TOOL · LessWrong (AI tag) English(EN) · 1d · BLOG

How to reduce capability degradation from off-model SFT

Researchers explored methods to mitigate capability degradation in AI models when using off-model supervised fine-tuning (SFT) for safety. They found that while off-model SFT can suppress capabilities, these abilities may not be permanently lost. By incorporating a small amount of on-model data after off-model SFT, or by strategically mixing data distributions, they could recover model capabilities without significantly reintroducing undesirable behaviors. AI

IMPACT New techniques may allow for safer AI models without sacrificing performance, potentially accelerating the deployment of advanced AI systems.
RESEARCH · Email — Mindstream English(EN) · 1d · BLOG

70 AI leaders, one shared fear

Over 70 AI leaders, including OpenAI's Sam Altman and Anthropic's Dario Amodei, have signed an open letter to Congress urging the implementation of mandatory screening and recordkeeping for synthetic nucleic acids. This measure aims to prevent the misuse of advanced AI in creating bioweapons, drawing a parallel to pharmaceutical prescription logging. The signatories believe that increased traceability will deter malicious actors and help prevent future pandemics. AI

IMPACT Establishes a precedent for AI labs to proactively engage with policymakers on safety and regulatory measures.
TOOL · Mastodon — fosstodon.org English(EN) · 1d · [3 sources] · MASTO

# GitHub disabled over 70 # Microsoft repositories after detecting a Miasma worm infection that compromised contributor accounts to execute malicious code. The

GitHub has taken down over 70 Microsoft repositories due to suspected infections by the Miasma worm. The worm compromised contributor accounts, allowing it to execute malicious code and target CI/CD pipelines. The attackers aimed to exfiltrate cloud secrets and developer tool configurations. AI

IMPACT Compromised CI/CD pipelines and exfiltrated cloud secrets highlight the growing threat of AI-powered attacks on development infrastructure.
RESEARCH · Mastodon — fosstodon.org English(EN) · 1d · MASTO

https:// winbuzzer.com/2026/06/08/micro soft-tightens-cloud-controls-after-unit-8200-inquiry-xcxwbn/ Microsoft has tightened human-rights controls for national-

Microsoft has implemented stricter human rights oversight for its cloud services following allegations of surveillance by Israel's Unit 8200. The company is now enforcing new vetting procedures for national security-related cloud projects. This move aims to address concerns about potential misuse of its technology for surveillance purposes. AI

IMPACT This policy change may affect how AI and cloud services are deployed for national security purposes, influencing future ethical guidelines.
TOOL · Mastodon — fosstodon.org English(EN) · 1d · MASTO

Curated index of publicly disclosed # GenAI & agentic-AI security incidents. Every entry is cross-mapped to OWASP LLM Top 10, OWASP Agentic Top 10, NIST AI RMF,

A new index catalogs publicly disclosed security incidents related to generative AI and agentic AI systems. Each incident is cross-referenced with established security frameworks like the OWASP LLM Top 10, OWASP Agentic Top 10, NIST AI RMF, and MITRE ATLAS. This resource aims to provide a structured overview of AI-specific security vulnerabilities and threats. AI

IMPACT Provides a structured resource for understanding and mitigating AI-specific security risks.
RESEARCH · r/ClaudeAI English(EN) · 1d · REDDIT

An active attack is planting backdoors inside Claude Code right now. If you use npm, your credentials may already be compromised.

A sophisticated malware campaign, dubbed Miasma by Microsoft, has targeted developers by compromising 32 npm packages under the `@redhat-cloud-services` umbrella. This attack plants backdoors in developer tools like Claude Code and VS Code, silently exfiltrating credentials for cloud services, code repositories, and more. The malware is designed to persist even after package uninstallation and can wipe user directories if access is revoked, making it a significant threat to software supply chain security. AI

IMPACT This sophisticated supply chain attack highlights critical vulnerabilities in developer tools and platforms, potentially impacting the security of AI development and deployment.
TOOL · LessWrong (AI tag) English(EN) · 1d · BLOG

Coverage-driven alignment - What ‘Teaching Claude Why’ can borrow from AV verification

A recent post suggests that AI alignment training could be improved by adopting coverage-driven verification methods, similar to those used in autonomous vehicle (AV) development. Anthropic found that teaching Claude alignment principles through pretraining was more effective than solely relying on reinforcement learning. The author proposes that AI researchers could benefit from AV developers' systematic approach to identifying and addressing edge cases, potentially by using and refining explicit coverage maps to ensure robust alignment. AI

IMPACT Adopting systematic verification methods could lead to more robust and reliable AI alignment, crucial for advanced AI systems.
RESEARCH · Mastodon — fosstodon.org 日本語(JA) · 1d · [2 sources] · MASTO

Google introduces memory-saving technology "QAT" for local AI execution on smartphones and laptops in Gemma 4, Gemma 4 E2B operates with only 0.84GB of memory – GIGAZINE https://www.yayafa.com/2817796/ # AgenticAi # AI # ArtificialGen

Anthropic has reportedly developed a new AI model named "Mythos," which is expected to significantly impact cybersecurity defenses. Meanwhile, Google has introduced a memory-saving technique called QAT for its Gemma 4 model, enabling it to run on devices with as little as 0.84GB of RAM. AI

IMPACT New AI models and optimization techniques could lead to more capable cybersecurity tools and broader accessibility of AI on consumer devices.
COMMENTARY · Mastodon — fosstodon.org English(EN) · 1h · MASTO

Does it not seem like making apple and google the ones that decide the photos that are the gatekeepers of whether it is actually child nudity means they will be

The use of AI by Apple and Google to detect child nudity in photos raises concerns about privacy and surveillance. Critics question whether these tech giants should be the arbiters of such sensitive content, given their existing data collection practices. This approach could lead to widespread scanning and storage of personal images. AI

IMPACT Raises questions about the ethical implications and potential for overreach in AI-powered content moderation by major tech platforms.
COMMENTARY · r/Anthropic English(EN) · 1h · REDDIT

Apparently doing anything remotely scientific is too dangerous for Fable

A Reddit user expressed concern that Anthropic's safety measures might hinder scientific progress. The user shared a screenshot of a message from Anthropic's AI assistant, Claude, which refused to engage in a hypothetical scenario involving scientific research due to safety protocols. This has sparked discussion among users about the balance between AI safety and the pursuit of knowledge. AI

IMPACT Raises questions about the potential for AI safety measures to inadvertently restrict scientific inquiry and innovation.
COMMENTARY · Mastodon — fosstodon.org English(EN) · 2h · MASTO

Indirect # PromptInjection remains a fundamental security challenge for # AI https:// brave.com/blog/indirect-prompt -injection/ # cybersecurity # Mozilla # Cot

Indirect prompt injection, a persistent security vulnerability in AI systems, continues to pose a significant challenge. This method allows malicious actors to manipulate AI models into performing unintended actions by embedding hidden instructions within seemingly benign data. Addressing this issue is crucial for maintaining the security and reliability of AI applications. AI

IMPACT Indirect prompt injection remains a significant security hurdle, requiring ongoing research and development of robust defenses to ensure AI system integrity.