Brief

[50/324] 223 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.
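
The feed does not publish its scoring formula. As a rough illustration only, the four named signals could be combined as a weighted sum scaled by an exponential time decay. The weights, the 24-hour half-life, and the function name `score_story` below are all assumptions, not the feed's actual method.

```python
# Hypothetical weights; the feed does not document its real formula.
WEIGHTS = {"authority": 0.4, "cluster": 0.3, "headline": 0.3}
HALF_LIFE_HOURS = 24.0  # assumed half-life for the time-decay term


def score_story(authority, cluster, headline, age_hours):
    """Combine three signals in [0, 1] and decay the result with age.

    Returns a score between 0 and 100.
    """
    for value in (authority, cluster, headline):
        if not 0.0 <= value <= 1.0:
            raise ValueError("signals must be in [0, 1]")
    base = (WEIGHTS["authority"] * authority
            + WEIGHTS["cluster"] * cluster
            + WEIGHTS["headline"] * headline)
    # The score halves every HALF_LIFE_HOURS hours.
    decay = 0.5 ** (age_hours / HALF_LIFE_HOURS)
    return 100.0 * base * decay
```

Under these assumptions, a story with perfect signals starts at the top of the scale and loses half its score for every day of age.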

TOOL · HN — claude cli stories English(EN) · 3mo

Show HN: Skill that lets Claude Code/Codex spin up VMs and GPUs

A new open-source skill for AI coding agents such as Claude Code and Codex lets them spin up virtual machines and GPUs, streamlining the deployment and management of AI development environments. The project aims to provide a next-generation development experience for those working with AI-powered coding assistants.

IMPACT Potentially streamlines AI development workflows by simplifying VM and GPU provisioning for coding agents.
- cloudrouter.dev
- Skill
- Claude
- Codex
TOOL · HN — AI startup stories English(EN) · 4mo

Launch HN: Modelence (YC S25) – App Builder with TypeScript / MongoDB Framework

Modelence, an AI startup, has launched an open-source full-stack framework designed for both human developers and AI coding agents. The framework uses TypeScript for type safety and MongoDB for flexible schema management, aiming to streamline app development by handling boilerplate tasks like authentication and database setup. An integrated app builder lets users generate applications from prompts, and the team plans to introduce a DevOps agent for production monitoring and error resolution.

IMPACT Simplifies AI-driven application development by providing a unified framework and backend infrastructure.
- Claude Agent SDK
- MongoDB
- YC S25
- Modelence
- Eduard
- TypeScript
RESEARCH · HN — AI startup stories English(EN) · 4mo

Apple buys Israeli startup Q.ai

Apple has acquired the Israeli AI startup Q.ai for nearly $2 billion, aiming to bolster its capabilities in audio processing and machine learning. The startup, founded in 2022, specializes in technologies that can interpret whispered speech and enhance audio in noisy environments. This acquisition is Apple's second-largest to date and follows previous AI-focused feature integrations in products like AirPods and the Vision Pro headset.

IMPACT Strengthens Apple's AI hardware and audio capabilities, potentially impacting future product development and competition in the AI race.
- Apple
- Q.ai
- Reuters
- AirPods
- Vision Pro
- The Financial Times
- Beats Electronics
- Aviad Maizels
- PrimeSense
- Kleiner Perkins
- Yonatan Wexler
- Avi Barliya
- GV
TOOL · HN — AI startup stories English(EN) · 4mo

Launch HN: AgentMail (YC S25) – An API that gives agents their own email inboxes

AgentMail, a new API service from Haakam, Michael, and Adi, provides dedicated email inboxes for AI agents, aiming to streamline autonomous task completion. The service addresses limitations found in existing email platforms like Gmail, offering features such as programmatic inbox creation, advanced semantic search, and usage-based pricing. Early adopters are already using AgentMail for tasks like data conversion, negotiation, and sourcing model training data.

IMPACT Enables more autonomous AI agents by providing a robust, dedicated communication channel, potentially streamlining workflows and data sourcing.
- Adi
- Gmail
- YC S25
- Clawdbots
- Rails
- AgentMail
- Michael
TOOL · HN — AI infrastructure stories English(EN) · 4mo

Show HN: ShapedQL – A SQL engine for multi-stage ranking and RAG

ShapedQL has been introduced as a new SQL engine designed to optimize multi-stage ranking and Retrieval-Augmented Generation (RAG) processes. This tool aims to streamline complex data operations within AI applications. The announcement was made via a Show HN post, indicating a focus on community feedback and developer adoption.

IMPACT Potentially improves efficiency for AI systems relying on RAG and complex ranking.
- ShapedQL
TOOL · HN — claude cli stories English(EN) · 4mo

Show HN: A fast CLI and MCP server for managing Lambda cloud GPU instances

A new open-source command-line interface (CLI) and MCP server have been released to manage cloud GPU instances from Lambda. The tool, developed by Strand-AI, lets users control GPU infrastructure directly via terminal commands or let AI assistants like Claude manage these resources. It offers features such as starting, stopping, and listing instances, alongside automatic notifications for instance availability across Slack, Discord, and Telegram.

IMPACT Simplifies cloud GPU management for AI developers and researchers using AI assistants.
- Telegram
- Homebrew
- GitHub
- Strand-AI
- Claude
- Slack
- Discord
SIGNIFICANT · 36氪 (36Kr) 中文(ZH) · 5mo · [5 sources]

South Korea's May trade data shows chip exports remain strong

Nvidia is reportedly acquiring assets from AI chip startup Groq for approximately $20 billion, marking its largest deal to date. This acquisition aims to integrate Groq's low-latency inference technology into Nvidia's AI factory architecture. While Nvidia is licensing Groq's intellectual property and hiring key personnel, Groq will continue to operate as an independent company, with its cloud business unaffected.

IMPACT Accelerates Nvidia's AI inference capabilities and potentially broadens its custom chip offerings.
- South Korea
- Nvidia
- Groq
- Jensen Huang
- OpenAI
- Cisco
- Jonathan Ross
- Disruptive
- Donald Trump Jr.
- 1789 Capital
- Blackrock
- Neuberger Berman
- Samsung
- Mellanox
- Altimeter
TOOL · HN — AI infrastructure stories English(EN) · 5mo

Show HN: I open-sourced my Go and Next B2B SaaS Starter (deploy anywhere, MIT)

A developer has open-sourced a comprehensive B2B SaaS starter kit built with Next.js 16 and Go 1.25. The kit includes features like enterprise-grade authentication, multi-tenancy, role-based access control, and billing integration. It also incorporates AI capabilities such as RAG pipelines with vector embeddings and an OCR service for document data extraction.

IMPACT Provides a pre-built foundation for developers to quickly integrate AI features like RAG and OCR into their SaaS products.
- TypeScript
- Tailwind CSS
- shadcn/ui
- Radix UI
- TanStack Query
- react-hook-form
- Stytch
- Polar.sh
- Recharts
- PostgreSQL
- pgvector
- OpenAI API
- Mistral AI
- Cloudflare R2
- Next.js
- Docker
- Zod
- SQLC
TOOL · HN — MCP stories English(EN) · 6mo

Show HN: MCPShark – Traffic Inspector for Model Context Protocol

MCPShark is a newly released traffic inspector designed for the Model Context Protocol (MCP). This tool allows developers to observe and debug MCP traffic, including requests, responses, and tool usage, between their editor or LLM client and MCP servers. It also offers optional "Smart Scan" checks to identify potentially risky tool configurations.

IMPACT Provides developers with enhanced visibility and debugging capabilities for LLM interactions via the Model Context Protocol.
TOOL · HN — AI infrastructure stories English(EN) · 6mo

Microsoft won't let me pay a $24 bill, blocking thousands in Azure spending

A software engineer detailed their frustrating experience attempting to resolve a $24 Azure billing issue that prevented them from spending thousands on new services. Despite numerous attempts through official channels, including a custom-built PowerShell application, Microsoft's support system created a loop where paying the invoice required support, but support required a paid plan that couldn't be purchased due to the outstanding invoice. The engineer expressed disbelief at the company's inability to accept payment, while other users shared similar anecdotes and suggested alternative approaches like contacting sales.

IMPACT Highlights potential friction points in cloud provider billing and support systems that could impact enterprise adoption of AI infrastructure.
TOOL · HN — AI startup stories English(EN) · 7mo

Y Combinator Startup brings brainrot to developers' IDEs

Clad Labs has launched a new platform designed to orchestrate multiple AI coding agents, including Claude Code, Cursor, and OpenAI Codex. The tool allows developers to spin up teams of parallel agents, manage their work in isolation, and merge changes seamlessly. It also offers analytics to track coding habits and productivity alongside entertainment usage.

IMPACT Enables developers to leverage multiple AI coding agents simultaneously, potentially streamlining workflows and improving productivity.
- Cursor
- Chad
- Claude Code
- OpenAI Codex
- Clad Labs
TOOL · HN — AI startup stories English(EN) · 8mo

Launch HN: Extend (YC W23) – Turn your messiest documents into data

Extend, a Y Combinator-backed startup, has launched a production-ready platform designed to transform messy documents into structured data. The service uses specialized vision models for accurate parsing and extraction, offering features like confidence scoring, multiple processing modes, and an optimization agent called Composer Agent that refines schemas automatically. Extend aims to streamline document workflows for AI teams, enabling faster development and deployment of data processing pipelines.

IMPACT Accelerates enterprise adoption of AI for document processing by simplifying data extraction and workflow automation.
TOOL · HN — AI startup stories English(EN) · 8mo

Launch HN: Webhound (YC S23) – Research agent that builds datasets from the web

AI startup Webhound has launched a research agent designed to automate the creation of web-scraped datasets based on natural language prompts. The agent, initially built on Claude 4 Sonnet, was re-engineered using Gemini 2.5 Flash and a multi-agent system to significantly reduce costs and improve reliability. This new architecture includes specialized agents for planning, searching, critiquing, and validating data, along with a text-based browser for efficient extraction.

IMPACT Automates complex data collection tasks, potentially lowering the barrier for data-driven research and analysis.
- Claude 4 Sonnet
- Gemini 2.5 Flash
- Retool
- Appsmith
- Superblocks
- UI Bakery
- BudiBase
- Shopify
- Hacker News
- Figma
- arXiv
- Webhound
- YC S23
TOOL · HN — MCP stories English(EN) · 8mo · [2 sources]

Show HN: AI-powered web service combining FastAPI, Pydantic-AI, and MCP servers

A developer has created an open-source AI-powered web service that integrates FastAPI for APIs, Pydantic-AI for agent construction, and Model Context Protocol (MCP) servers for tools. The service allows users to query information from sources like Hacker News and web search, presenting ranked trend cards with summaries. It supports various local LLM configurations and is containerized with Docker for production deployment.

IMPACT Provides a template for building production-ready AI services with modular components and local LLM support.
- MCP
- OpenAI
- FastAPI
- Pydantic-AI
- Hacker News
- Ollama
- LMStudio
- vLLM
- GitHub
- Model Context Protocol (MCP)
- Docker
TOOL · HN — AI infrastructure stories English(EN) · 9mo

Launch HN: Recall.ai (YC W20) – API for meeting recordings and transcripts

Recall.ai has launched a new Desktop Recording SDK designed to simplify the integration of meeting recording capabilities into other applications. This SDK addresses the complexities of capturing high-quality audio and video, including speaker identification and clean video compositing, without requiring a bot to be present in the meeting. The company aims to provide developers with a robust infrastructure solution, drawing on their experience powering recording features for over 2000 companies and overcoming significant technical challenges in reliability and efficiency.

IMPACT Simplifies AI integration for meeting analysis tools by providing a reliable recording infrastructure.
- AWS
- ChatGPT
- Hubspot
- Clickup
- Recall.ai
- Notion
RESEARCH · HN — AI infrastructure stories English(EN) · 9mo

Nvidia results show spending on A.I. infrastructure remains robust

Nvidia's latest financial results indicate a continued strong demand for AI infrastructure, with significant revenue generated from its AI chip sales. The company's performance highlights the ongoing substantial investment in hardware necessary to support the rapidly expanding AI sector. This robust spending suggests that the development and deployment of advanced AI models remain a top priority for many organizations.

IMPACT Confirms that the demand for AI hardware remains strong, suggesting continued investment in AI development and deployment.
- AI
- Nvidia
TOOL · HN — AI infrastructure stories English(EN) · 9mo · [2 sources]

Show HN: Smooth – Faster, cheaper browser agent API

Smooth has launched a new serverless browser agent API designed for reliability, speed, and cost-efficiency, claiming to be 7x cheaper and 5x faster than existing solutions. The API aims to simplify web automation tasks for developers by handling complexities like instant browser spin-up and CAPTCHA solving. Separately, ContextFort has introduced a tool to provide visibility and control over AI coding agents like Cursor and Claude Code, addressing security concerns about agents accessing sensitive files and credentials on developer machines.

IMPACT New tools emerge to enhance AI agent capabilities and address security concerns in development workflows.
TOOL · HN — MCP stories English(EN) · 9mo

Launch HN: April (YC S25) – Voice AI to manage your email and calendar

April, a new voice-controlled AI assistant, has launched on the App Store to manage emails and calendars. The application allows users to dictate replies, summarize messages, and reschedule meetings hands-free. It uses Deepgram for speech-to-text and Eleven Labs for text-to-speech, with custom servers for Google integration. The developers are focusing on low latency and natural interaction, while also considering user feedback on safety features like a 'safe mode' for non-destructive operations.

IMPACT Potentially streamlines daily productivity for users by enabling hands-free management of communications and schedules.
- Akash
- SF
- April
- Gmail
- HN
- App Store
- Deepgram
- Eleven Labs
- Google
- Berkeley
TOOL · HN — AI infrastructure stories English(EN) · 9mo

Launch HN: Skope (YC S25) – Outcome-based pricing for software products

Skope, a new billing system, has launched to support outcome-based pricing for software products, particularly targeting the burgeoning AI market. The platform allows companies to charge customers only when their software delivers a specific result, aligning incentives and reducing buyer risk. Skope aims to simplify the implementation of this pay-per-performance model, which was previously challenging to manage at scale.

IMPACT Enables new pricing models for AI products, potentially accelerating adoption by reducing upfront risk for buyers.
- Stripe Billing
- Metronome
- Langfuse
- Stripe
- Skope
- Helicone
RESEARCH · HN — AI infrastructure stories English(EN) · 9mo

The U.S. grid is so weak, the AI race may be over

The rapid expansion of AI is creating a significant bottleneck in the United States due to the limitations of its power grid, contrasting sharply with China's robust energy infrastructure. While U.S. AI growth is hampered by debates over data center power consumption and grid stability, China has proactively addressed this by overbuilding its power capacity over decades. This strategic oversupply allows China to integrate AI data centers as a means to absorb excess energy, a situation unimaginable in the U.S. where grids often operate with minimal reserve margins, leading to concerns about the sustainability of AI development.

IMPACT AI development in the US faces a critical bottleneck due to power grid limitations, potentially hindering growth compared to China's energy-secure infrastructure.
- Tech Buzz China
- Fortune
- X
- Goldman Sachs
- McKinsey
- Stifel Nicolaus
- S&P 500
- Deloitte
- Ohio
- David Fishman
- Germany
- California
- India
- Texas
- Rui Ma
TOOL · HN — AI startup stories Deutsch(DE) · 9mo

Launch HN: Cyberdesk (YC S25) – Automate Windows legacy desktop apps

Cyberdesk, a startup founded by Mahmoud and Alan, has launched a new tool designed to automate repetitive tasks within legacy Windows desktop applications. Their approach uses a deterministic computer use agent that learns workflows from natural language instructions, offering a more reliable alternative to traditional Robotic Process Automation (RPA) scripts. The agent can self-correct based on screen state and only resorts to expensive AI models when unexpected anomalies occur, making it both robust and cost-effective for industries like healthcare and accounting.

IMPACT Automates legacy desktop applications, potentially improving efficiency and reducing errors in industries reliant on older software.
TOOL · HN — MCP stories English(EN) · 10mo

Show HN: Mcp-use – Connect any LLM to any MCP

The mcp-use framework has been released, enabling developers to build applications that can connect to various large language models like ChatGPT and Claude. This framework allows for the creation of MCP Servers and MCP Apps, with SDKs available in TypeScript and Python. It also includes an MCP Inspector for testing and debugging, and a cloud deployment option for production environments.

IMPACT Enables developers to build cross-platform applications for multiple LLMs, potentially streamlining AI agent development.
- TypeScript
- mcp-use
- ChatGPT
- Claude
- Python
- GitHub
- Manufact MCP Cloud
TOOL · HN — machine learning stories English(EN) · 10mo

PHP-ORT: Machine learning inference for the web

A new infrastructure project called PHP-ORT aims to bring machine learning inference capabilities directly to PHP, the server-side language used by a significant portion of the web. This development seeks to empower millions of PHP developers to integrate AI features into their applications without relying on external services or switching programming languages. PHP-ORT provides a core Tensor API, a high-performance math library, and integrates with ONNX for direct inference, promising significant speedups.

IMPACT Enables millions of PHP developers to integrate ML inference directly into their web applications, potentially democratizing AI capabilities at scale.
- SSE2
- NEON
- AVX512
- SSE4.1
- RISCV64
- PHP-ORT
- ONNX
- AVX2
- CUDA
- WASM
TOOL · HN — AI infrastructure stories English(EN) · 10mo

Show HN: Improving search ranking with chess Elo scores

ZeroEntropy has developed specialized AI models, including rerankers and embeddings, designed for production systems that prioritize speed and accuracy over generalist models. Their offerings, such as zembed-1 and zerank-2, aim to provide lower latency and higher accuracy for applications like Retrieval Augmented Generation (RAG). These models are available for integration into existing stacks and can be deployed on cloud platforms like AWS and Azure, with a focus on security and compliance standards.

IMPACT Offers specialized, low-latency AI models that could improve performance for specific RAG and search ranking tasks.
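
The summary does not describe how Elo scores enter the ranking. As a generic illustration, the standard chess Elo update can turn pairwise "document A is more relevant than document B" judgments into a ranking. The function names, the K-factor of 32, and the starting rating of 1000 are assumptions, and this is not presented as ZeroEntropy's method.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(ratings, winner, loser, k=32.0):
    """Shift rating points from loser to winner after one comparison."""
    delta = k * (1.0 - expected_score(ratings[winner], ratings[loser]))
    ratings[winner] += delta
    ratings[loser] -= delta


def rank_documents(doc_ids, comparisons, initial=1000.0):
    """Rank documents from (winner, loser) pairwise relevance judgments."""
    ratings = {doc_id: initial for doc_id in doc_ids}
    for winner, loser in comparisons:
        update_elo(ratings, winner, loser)
    return sorted(doc_ids, key=lambda doc_id: ratings[doc_id], reverse=True)
```

Each update moves the same number of points in both directions, so the total rating across documents stays constant; only the relative order changes.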
TOOL · HN — AI startup stories English(EN) · 11mo

Show HN: Open source alternative to Perplexity Comet

BrowserOS has launched as an open-source browser designed for the AI era, integrating AI agents that can automate web tasks through natural language commands. It prioritizes user privacy and offers extensive customization by supporting over 11 AI providers, including popular options like Anthropic Claude, Google Gemini, and OpenAI, as well as local models. The browser is built on a Chromium fork, ensuring compatibility with existing Chrome extensions and offering a user-friendly experience for both general users and developers.

IMPACT This browser aims to streamline AI agent integration for web automation, potentially simplifying workflows for users and developers interacting with various LLMs.
- LM Studio
- BrowserOS
- Perplexity
- Comet
- GitHub
- Moonshot Kimi
- Anthropic Claude
- Google Gemini
- OpenAI
- OpenRouter
- Ollama
- macOS
- Windows
- Chrome
- Linux
TOOL · HN — machine learning stories English(EN) · 12mo

Show HN: Glowstick – type level tensor shapes in stable rust

Glowstick is a new Rust crate designed to enhance tensor manipulation by integrating shape checking directly into the type system. This approach aims to make tensor operations safer and more intuitive, particularly for developers working with machine learning frameworks. The project, currently in its pre-1.0 phase, offers features like dynamic dimension support and improved error messages, with plans to align with ONNX operations.

IMPACT Provides a type-safe approach to tensor manipulation in Rust, potentially improving developer experience and reducing errors in ML workflows.
- Tensor
- Candle
- ONNX
- Burn
- Rust
TOOL · HN — AI infrastructure stories English(EN) · 13mo

Launch HN: Tinfoil (YC X25): Verifiable Privacy for Cloud AI

Tinfoil, a startup founded by researchers from MIT and Cloudflare, has launched a new service designed to provide verifiable privacy for AI workloads hosted in the cloud. The platform uses secure enclave technology, particularly NVIDIA's confidential computing capabilities on GPUs, to ensure that neither Tinfoil nor the cloud provider can access sensitive data processed by AI models. This approach aims to enhance AI privacy by replacing trust with provable security, enabling more complex AI applications that require private data.

IMPACT Enables more sensitive AI applications by providing verifiable privacy for cloud-hosted models.
- FHE
- Sigstore
- Tinfoil
- MIT
- NVIDIA
- Microsoft Research
- Cloudflare
- Tor
- Llama
- Deepseek R1
- TLS
TOOL · HN — AI startup stories English(EN) · 13mo

Launch HN: ParaQuery (YC X25) – GPU Accelerated Spark/SQL

ParaQuery, a new startup, has launched a GPU-accelerated Spark and SQL data processing solution. The platform aims to offer cost and performance benefits over existing solutions like Google BigQuery. ParaQuery leverages NVIDIA's RAPIDS technology to enhance traditional data processing tasks, which the founder notes are often mistakenly believed to be limited to AI and graphics.

IMPACT Enhances data processing efficiency, potentially lowering costs for AI workloads that rely on large datasets.
- ParaQuery
- Spark
- CUDA
- Google
- SQL
- BigQuery
- NVIDIA
- RAPIDS
TOOL · HN — AI startup stories English(EN) · 13mo

Launch HN: Exa (YC S21) – The web as a database

Exa has launched Websets, a new search engine that uses embeddings and agentic workflows to provide precise results from the web, presented in a database-like table format. The service aims to combat the decline in search quality by performing extensive embedding searches and then using LLMs to verify each result against complex queries. While the process can take significant time, Exa believes the accuracy and detailed verification are worth the wait, offering an alternative to traditional keyword-based search.

IMPACT Offers a novel approach to web search by leveraging embeddings and LLMs for enhanced accuracy and structured data retrieval.
- Exa
- Jeff
- LLM
- Websets
- Will
- Google
TOOL · HN — machine learning stories English(EN) · 13mo

OCaml's Wings for Machine Learning

Raven is a new ecosystem of OCaml libraries designed for numerical computing, machine learning, and data science. It aims to provide type-safe alternatives to popular Python libraries such as NumPy, JAX, and PyTorch. The project includes modules for n-dimensional arrays, automatic differentiation, tokenization, neural networks, dataframes, and plotting, with the goal of building a robust scientific computing environment.

IMPACT Provides a type-safe alternative for AI development in OCaml, potentially attracting developers seeking stronger guarantees.
- PyTorch
- OCaml
- Raven
- NumPy
- JAX
- Matplotlib
- Jupyter
- MirageOS
TOOL · HN — AI startup stories English(EN) · 13mo

Show HN: Morphik – Open-source RAG that understands PDF images, runs locally

Morphik has launched an open-source Retrieval-Augmented Generation (RAG) system designed for developers to integrate complex context into AI applications. The system aims to simplify the process by offering a unified solution for storing, representing, and searching unstructured and multimodal data, addressing the limitations of traditional RAG pipelines that struggle with visually rich documents. Morphik provides features like multimodal search, fast metadata extraction, and integrations with tools such as Google Suite and Slack, with a free tier available for users.

IMPACT Simplifies multimodal data integration for AI applications, potentially reducing development complexity and infrastructure costs.
- Morphik
- Slack
- Confluence
TOOL · HN — AI infrastructure stories English(EN) · 13mo

Show HN: We Put Chromium on a Unikernel (OSS Apache 2.0)

A new open-source project offers sandboxed Chrome browsers that can be run as Docker containers or on Unikraft unikernels. This setup is designed for browser automation, web agents, and testing AI agents that interact with the web. The unikernel implementation provides features like automated standby mode with state snapshotting and extremely fast cold restarts, enabling low-latency event handling.

IMPACT Enables developers to build and test AI agents that require controlled browser environments.
- Playwright
- Chrome DevTools
- Kraft CLI
- Chromium
- Unikernel
- Unikraft
- Puppeteer
- Docker
TOOL · HN — AI startup stories English(EN) · 14mo

Launch HN: mrge.io (YC X25) – Cursor for code review

AI startup mrge has launched a new platform designed to streamline code reviews for development teams. The tool connects to GitHub repositories and uses AI to analyze code changes within a secure, ephemeral sandbox environment. It aims to assist human reviewers by identifying potential bugs and providing context, inspired by productivity tools like Linear and Superhuman.

IMPACT Aims to accelerate code merging and reduce bugs by leveraging AI for code review, potentially improving developer productivity.
- mrge.io
- Cursor
- GitHub
- Linear
- Superhuman
- Cal.com
- Better Auth
- n8n
TOOL · HN — AI infrastructure stories English(EN) · 14mo

Show HN: ActorCore – Stateful serverless framework that runs anywhere

ActorCore, an open-source framework for AI agents, has been released, offering stateful serverless execution that aims to be significantly cheaper than existing sandbox solutions. It leverages WebAssembly and V8 isolates for near-zero cold starts and can be deployed across various platforms. The framework supports multiple AI models and provides granular security controls, with options for self-hosting or using a managed cloud service.

IMPACT Provides a cheaper and faster infrastructure for running AI agents, potentially lowering operational costs for AI applications.
- V8
- ActorCore
- WebAssembly
- Pi
- Claude Code
- Codex
- OpenCode
- Rivet Cloud
- Daytona
TOOL · HN — AI infrastructure stories English(EN) · 14mo

Show HN: Python at the Speed of Rust

The blog post "Python at the Speed of Rust" introduces a new approach to Python performance by leveraging Rust. It details how to integrate Rust code into Python projects, aiming to achieve significant speedups for computationally intensive tasks. The author demonstrates practical methods for this integration, offering a way to enhance existing Python applications without a complete rewrite.

IMPACT Offers a method for developers to significantly accelerate Python code, potentially benefiting AI/ML workloads that rely on Python.
- Python
- Rust
TOOL · HN — machine learning stories English(EN) · 14mo

SeedLM: Compressing LLM Weights into Seeds of Pseudo-Random Generators

Researchers have developed SeedLM, a novel post-training compression technique for large language models that uses pseudo-random generator seeds to encode model weights. This method aims to reduce the high runtime costs associated with LLMs by generating weight matrices on the fly during inference, thereby decreasing memory access and improving speed for memory-bound tasks. SeedLM achieves this by trading compute for fewer memory accesses and notably does not require calibration data, generalizing well across diverse tasks and maintaining accuracy comparable to FP16 baselines even at significant compression levels.

IMPACT This compression technique could significantly reduce the deployment costs and increase the inference speed of large language models.
- IEEE Visualization
- SeedLM
- LLMs
- Llama3 70B
- Llama 2
- FP16
- Meta
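
The idea of storing a seed in place of the weights can be sketched in a few lines. This is a deliberately simplified toy, not the method from the SeedLM work: it assumes one pseudo-random basis vector per block, one unquantized coefficient, and Python's `random` module as the generator. All function names are hypothetical.

```python
import random


def basis_from_seed(seed, length):
    """Regenerate the same pseudo-random vector from a seed."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(length)]


def compress_block(weights, num_seeds=256):
    """Search seeds for the basis vector that best fits the block.

    Returns (seed, coeff) such that coeff * basis approximates weights.
    """
    best = None
    for seed in range(num_seeds):
        basis = basis_from_seed(seed, len(weights))
        # Least-squares optimal scale for this basis vector.
        coeff = (sum(w * b for w, b in zip(weights, basis))
                 / sum(b * b for b in basis))
        err = sum((w - coeff * b) ** 2 for w, b in zip(weights, basis))
        if best is None or err < best[2]:
            best = (seed, coeff, err)
    return best[0], best[1]


def decompress_block(seed, coeff, length):
    """Rebuild the approximate block at inference time from the seed."""
    return [coeff * b for b in basis_from_seed(seed, length)]
```

Only the seed and the coefficient need to be stored per block; the block itself is recomputed on demand, which is the compute-for-memory trade the summary describes.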
TOOL · HN — machine learning stories English(EN) · 14mo

Show HN: OCR pipeline for ML training (tables, diagrams, math, multilingual)

A developer is creating a versatile OCR pipeline designed to extract structured data from complex educational materials for machine learning training. The system, which supports multilingual text, mathematical formulas, tables, and diagrams, aims for 90-95% accuracy on academic datasets. It generates AI-ready outputs in JSON or Markdown, including semantic annotations for visual content, and is built using various tools like Google Vision API and OpenAI API. The project's public release has been delayed due to the developer's academic commitments but is expected once the system is finalized.

IMPACT This tool could streamline the creation of specialized datasets for ML training, particularly in academic and research contexts.
TOOL · HN — MCP stories English(EN) · 14mo

Show HN: Cursor IDE now remembers your coding prefs using MCP

Daniel from Zep has developed an integration for the Cursor IDE that provides persistent memory across coding sessions. This system uses Zep's open-source Graphiti framework and the Model Context Protocol (MCP) to store and retrieve user preferences, project specifications, and coding standards. The goal is to enhance the AI-assisted IDE by allowing it to remember crucial context without constant user input, adapting in real time to changes in frameworks or standards.

IMPACT Enhances AI coding assistants by providing persistent memory, potentially improving developer workflow and reducing repetitive context setting.
RESEARCH · HN — AI infrastructure stories English(EN) · 14mo

FOSS infrastructure is under attack by AI companies

AI companies are aggressively crawling open-source infrastructure, causing significant outages and disruptions for projects like SourceHut, KDE GitLab, and GNOME. These AI scrapers often disregard robots.txt and mimic legitimate user agents, making it difficult to implement effective defenses. As a result, some projects have resorted to implementing challenging proof-of-work systems to block these bots, which can also impact legitimate users.

IMPACT AI data scraping practices are straining open-source infrastructure, potentially hindering collaboration and development.
- SourceHut
- Drew DeVault
- KDE GitLab
- GNOME GitLab
- Anubis
- Anthropic
- OpenAI
- Alibaba
SIGNIFICANT · Databricks Blog English(EN) · 15mo · [170 sources]

MCP Marketplace Brings Real-Time Intelligence to Agentic Applications

Multiple open-source projects and platforms are emerging to standardize AI agent interactions through the Model Context Protocol (MCP). These initiatives aim to enable AI agents to access real-time data, external tools, and complex workflows via a unified interface. Key developments include command-line clients for MCP, frameworks for representing agents as MCP servers, and cloud-hosted solutions for integrating various data sources and services.

IMPACT Standardization around MCP is likely to accelerate the development and integration of AI agents, enabling more complex and interconnected AI systems.
- Klavis AI
- JigsawStack
- Agent MCP Studio
- MCP
- BuyWhere
- AWS Health
- Cursor
- Coinopai
- Hyperbrowser
- Apify
- USDC
- Claude Desktop
- Stripe
- AI agents
- Databricks
- Model Context Protocol (MCP)
- AgentPay
- Moody's
- You.com
- OpenAI
- Cotality
- WorkOS
- Anthropic
- LastMile AI
- Stytch
TOOL · HN — AI infrastructure stories English(EN) · 15mo

Show HN: ArchGW – An open-source intelligent proxy server for prompts

ArchGW, an open-source intelligent proxy server, aims to simplify the development of agentic AI applications. It centralizes essential middleware functions such as agent routing, orchestration, safety guardrails, and model agility, allowing developers to focus on core product logic. Built on Envoy and backed by LLM research, ArchGW supports various languages and AI frameworks, offering features like low-latency orchestration, zero-code capture of agentic signals, and moderation hooks. AI

IMPACT Simplifies agentic AI development by centralizing core middleware functions, potentially accelerating production deployment.
- Envoy
- Anthropic
- ArchGW
- Plano
- OpenAI
TOOL · HN — AI infrastructure stories English(EN) · 15mo

Show HN: Agents.json – OpenAPI Specification for LLMs

Wildcard AI has introduced agents.json, an open specification designed to help AI agents interact more effectively with APIs. This new standard builds upon the existing OpenAPI specification by adding structured contracts, including concepts like 'flows' and 'links', to optimize for LLM understanding and execution of API calls. The goal is to simplify the process for developers integrating AI agents with web services, enabling more reliable and scalable agent interactions. AI

IMPACT Simplifies API integration for AI agents, potentially accelerating the development and deployment of agent-based applications.
- Wildcard Bridge
- Operator
- Wildcard AI
- agents.json
- OpenAPI
- LLM
- OpenAI
TOOL · HN — AI infrastructure stories English(EN) · 15mo

Show HN: Globstar – Open-source static analysis toolkit

DeepSource has open-sourced Globstar, a static analysis toolkit designed for creating custom code quality and security checkers. The toolkit leverages tree-sitter for parsing code and utilizes AI assistants like ChatGPT and Claude to generate complex queries, simplifying the process for developers. Globstar offers both YAML and Go interfaces, supporting over 20 languages with plans to add C/C++ support. AI

IMPACT Simplifies the creation of custom code quality and security checkers by leveraging AI for query generation.
- YAML
- Globstar
- ChatGPT
- Claude
- tree-sitter
- Semgrep
- Comby
- C++
- DeepSource
RESEARCH · HN — AI startup stories English(EN) · 16mo

Intel ruined an Israeli startup it bought for $2B–and lost the AI race

Intel has effectively dismantled Habana Labs, an Israeli AI chip startup it acquired for $2 billion, marking a significant failure in its attempt to compete with Nvidia. Despite initial optimism and a deal with Amazon for its Gaudi chips, Intel's internal issues and integration problems led to key personnel departing and the cancellation of next-generation products like Falcon Shores. This outcome represents a rare misstep for Habana's founder, Avigdor Willenz, who has a history of successful ventures in the semiconductor industry. AI

IMPACT Highlights the intense competition and challenges in the AI hardware market, potentially impacting the supply chain for AI model training.
- Intel
- Habana Labs
- Nvidia
- Amazon
- Gaudi
- LLMs
- Falcon Shores
- Avigdor Willenz
- Marvell
- Annapurna Labs
- Astera Labs
- Mobileye
- Nervana
- Raja Koduri
- AMD
TOOL · HN — AI infrastructure stories English(EN) · 16mo

Show HN: BrowserAI – Run LLMs directly in browser using WebGPU (open source)

BrowserAI is an open-source project enabling large language models to run directly within a web browser using WebGPU for accelerated performance. Because all processing occurs locally, user data never leaves the device, there are no inference server costs, and the models can be used offline. The SDK supports multiple engines and popular models, offering features like text generation, speech recognition, text-to-speech, and audio source separation. AI

IMPACT Enables privacy-focused, low-cost AI applications by running models directly in the user's browser.
- Llama-3.2-1b-instruct
- BrowserAI
- WebGPU
- LLM
- Kokoro-TTS
- Gemma
- Transformers
- Whisper-tiny-en
- Flare
- Demucs
- MLC
TOOL · HN — AI infrastructure stories English(EN) · 17mo

Show HN: Anyshift.io – Terraform "Superplan"

Anyshift.io has introduced a "Superplan" for Terraform, aiming to simplify cloud infrastructure management. This new offering is designed to streamline the deployment and maintenance of cloud resources, potentially reducing complexity for developers and operations teams. The platform focuses on enhancing the user experience for managing infrastructure as code. AI

IMPACT Niche tooling improvement; minimal industry-wide impact.
- Terraform
- Anyshift.io
RESEARCH · HN — AI infrastructure stories English(EN) · 17mo

Executive order on advancing United States leadership in AI infrastructure

The White House has issued an executive order aimed at bolstering U.S. leadership in AI infrastructure. The order focuses on expanding access to computing resources, developing AI talent, and promoting responsible AI innovation. It also emphasizes the importance of international collaboration and the development of safety standards for AI technologies. AI

IMPACT This executive order aims to solidify U.S. leadership in AI by focusing on infrastructure and talent, potentially accelerating domestic AI development and deployment.
- United States
- White House
TOOL · HN — AI infrastructure stories English(EN) · 17mo

Show HN: Free TCG Proxy Manager for Magic, Yugioh, and Pokemon

A developer has created a free tool to generate custom-printed trading card proxies for games like Magic: The Gathering, Yugioh, and Pokemon. The tool utilizes an AI upscaling model from Replicate to enhance card image quality for casual play. The project is built using Rails 8 and deployed with Kamal 2, leveraging Hetzner for affordable cloud compute and self-hosting services like Meilisearch and OpenObserve instead of relying on PaaS providers. AI

IMPACT Demonstrates a practical application of AI upscaling models for niche creative projects, potentially lowering the barrier for custom content creation.
RESEARCH · HN — AI startup stories Suomi(FI) · 17mo

Vultr Raises $333M at $3.5B Valuation

Vultr, a cloud computing provider focused on AI workloads, has secured $333 million in funding at a $3.5 billion valuation. The investment round was led by existing investor Thoma Bravo. The company plans to use the funds to expand its global infrastructure and enhance its AI-specific offerings. AI

IMPACT Expansion of Vultr's infrastructure could lower costs and increase accessibility for AI development and deployment.
- Thoma Bravo
- Vultr
TOOL · HN — AI infrastructure stories English(EN) · 18mo

Show HN: Hyperbrowser – Scalable Browser Infrastructure for AI Apps

Hyperbrowser is a new open-source project designed to provide scalable browser infrastructure specifically for AI applications. It aims to streamline the development and deployment of AI-powered web experiences by offering robust backend support. The project is available for developers to explore and contribute to. AI

IMPACT Provides a new infrastructure option for developers building AI applications.
- Hyperbrowser
- AI