Brief

last 24h

[50/180] 223 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

TOOL · r/LocalLLaMA English(EN) · 21h

New MLX LM Server From Apple

Apple has released MLX LM Server, a new tool designed to enhance the performance of large language models on Mac hardware. It leverages the M5 chip's neural accelerators for faster prompt processing and employs continuous batching to manage multiple requests concurrently. For extremely large models, the server supports distributed inference across multiple Macs using Thunderbolt RDMA. AI

IMPACT Enhances LLM inference capabilities on Apple hardware, potentially improving local AI development and deployment.
TOOL · Mastodon — fosstodon.org English(EN) · 5h

Google Fi just made overseas travel less painful with these upgrades and perks. Google Fi’s huge roaming upgrade includes faster 5G and a massive price cut.

Google Fi has announced significant upgrades to its international roaming services, including faster 5G speeds and reduced prices for data usage abroad. These enhancements aim to make international travel more convenient and affordable for its users. The changes are part of Google Fi's ongoing efforts to improve its global connectivity offerings. AI

IMPACT Minimal direct impact on AI operators; primarily a consumer telecom service improvement.
- Google Fi
- Google
TOOL · arXiv cs.NE (Neural & Evolutionary) English(EN) · 1d

OpenOpt: An Open-Source SRAM Optimizer Based on Equivalent Circuit Model

Researchers have developed OpenOpt, an open-source framework for optimizing SRAM architecture and transistor sizing. This framework utilizes equivalent circuit models to achieve significant simulation speedups while maintaining high accuracy for read/write delays and power consumption. The system integrates various optimization algorithms and has demonstrated substantial improvements in static noise margin, area, and peak power. AI
- OpenOpt
- SRAM
- FreePDK45
- MOEA/D
TOOL · arXiv cs.CL English(EN) · 3d

AlignFed: Alignment-Aware Asynchronous Federated Fine-Tuning for Large Language Models in Heterogeneous Edge Environments

Researchers have introduced AlignFed, a new framework designed for asynchronous federated fine-tuning of large language models (LLMs) in edge environments. This approach addresses challenges like data privacy, resource heterogeneity, and non-IID data by enabling collaborative model adaptation without raw data exposure. AlignFed utilizes a multi-stage semantic alignment mechanism to mitigate model drift and aggregation fairness issues, aiming for stable and efficient LLM optimization in complex edge settings. AI

IMPACT Enables more efficient and privacy-preserving LLM adaptation on distributed edge devices.
- Large Language Models
- AlignFed
TOOL · Anthropic SDK (TypeScript) — Releases Svenska(SV) · 4d · [3 sources]

bedrock-sdk: v0.30.0

Anthropic has released two minor updates to its TypeScript SDK for Amazon Bedrock. Versions v0.30.0 and v0.30.1 were pushed to GitHub, with the former being a prerequisite for the latter. These updates likely contain bug fixes or minor improvements to the SDK's functionality. AI

IMPACT Minor update to an SDK, unlikely to have significant industry-wide impact.
TOOL · Hacker News — AI stories ≥50 points English(EN) · 1mo

Anthropic says OpenClaw-style Claude CLI usage is allowed again

OpenClaw has updated its integration with Anthropic's Claude models, allowing direct API access and the reuse of Claude CLI logins. This update enables features like prompt caching and the 1 million token context window for Claude Opus 4.7. Additionally, OpenClaw now automatically handles image and PDF understanding capabilities when using Anthropic's models. AI
TOOL · Hacker News — AI stories ≥50 points English(EN) · 1mo

Scan your website to see how ready it is for AI agents

A new tool called 'Is It Agent Ready?' allows website owners to scan their sites for compatibility with AI agents. The tool checks for adherence to emerging standards related to discoverability, content accessibility, bot access control, protocol discovery, and commerce. It provides recommendations for improvement, such as publishing a valid robots.txt file with AI bot rules and sitemap directives. AI
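The robots.txt recommendation can be illustrated with a small check. This is a minimal sketch using Python's standard `urllib.robotparser`, not the scanner's actual logic; the bot name `ExampleAIBot`, the `example.com` URLs, and the helper `load_rules` are hypothetical placeholders.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt with one AI-bot rule group, a default group,
# and a sitemap directive.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /private/

User-agent: *
Allow: /

Sitemap: https://example.com/sitemap.xml
"""


def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt content that is already in memory."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# The named bot is blocked under /private/ but allowed elsewhere.
blocked = rules.can_fetch("ExampleAIBot", "https://example.com/private/page")
allowed = rules.can_fetch("ExampleAIBot", "https://example.com/docs")

# Sitemap URLs declared in the file (requires Python 3.8+).
sitemaps = rules.site_maps()
```

A site owner could run a check like this against their own file to confirm that each AI crawler they care about sees the access rules they intended.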
TOOL · Hacker News — AI stories ≥50 points English(EN) · 1mo

Moving a large-scale metrics pipeline from StatsD to OpenTelemetry / Prometheus

This article details the migration of Airbnb's large-scale metrics pipeline from StatsD to OpenTelemetry and Prometheus. The move was driven by the need for a more robust and scalable solution to handle the increasing volume of data. The new system leverages OpenTelemetry for data collection and Prometheus with vmagent for storage and querying, improving observability and performance. AI
TOOL · HN — claude-code stories English(EN) · 2mo

Launch HN: Relvy (YC F24) – On-call runbooks, automated

Relvy, a startup from the Y Combinator (YC) Fall 2024 batch, has launched its on-call runbook automation product. The platform aims to streamline incident response by providing automated runbooks. This launch targets engineering and operations teams. AI

IMPACT Niche tooling improvement; minimal industry-wide impact.
- Y Combinator
- Relvy
TOOL · HN — claude-code stories English(EN) · 2mo

Nanocode: The best Claude Code that $200 can buy in pure JAX on TPUs

A new project called Nanocode has been released, aiming to provide a high-performing Claude Code solution for $200. The project is built using JAX and is optimized for TPUs, suggesting a focus on efficient and powerful execution. AI

IMPACT Offers a cost-effective solution for code generation tasks, potentially lowering the barrier to entry for developers.
- Claude Code
- Nanocode
- JAX
TOOL · HN — anthropic stories English(EN) · 2mo

Show HN: Orloj – agent infrastructure as code (YAML and GitOps)

Orloj has released an open-source infrastructure-as-code platform for managing multi-agent AI systems. The tool allows developers to define agents, tools, models, memory, and other components using YAML and GitOps principles. Orloj aims to provide a declarative stack for building, operating, governing, and observing complex agentic systems, treating them like traditional software infrastructure. AI

IMPACT Provides a structured framework for deploying and managing complex multi-agent AI systems, potentially simplifying development and operations.
- Orloj
- GPT-4o
- OpenAI
- GitOps
- YAML
TOOL · HN — AI infrastructure stories English(EN) · 2mo

Launch HN: Kita (YC W26) – Automate credit review in emerging markets

Kita, a startup founded by Carmel and Rhea, has launched a new product designed to automate credit review for lenders in emerging markets. The system uses vision language models (VLMs) to process diverse and often unstandardized financial documents, a task that current OCR and document AI tools struggle with. Kita's platform extracts structured financial data, detects fraud, and verifies information through cross-document checks and historical data, aiming to improve the speed and accuracy of underwriting. AI

IMPACT Automates document-heavy underwriting processes, potentially increasing lending efficiency and access in emerging markets.
- WhatsApp
- Kita
- Carmel
- Rhea
- Philippines
- Mexico
- Indonesia
- South Africa
- US
- VLM
TOOL · HN — AI infrastructure stories English(EN) · 2mo

Launch HN: IonRouter (YC W26) – High-throughput, low-cost inference

IonRouter has launched a new inference service designed for high throughput and low cost, utilizing its proprietary IonAttention engine. This engine is capable of multiplexing multiple models on a single GPU, enabling rapid model switching and real-time traffic adaptation. The service supports various open-source models and fine-tunes, offering per-second billing and minimal cold start times, making it suitable for applications like robotics and real-time video analysis. AI

IMPACT Offers a potentially more cost-effective and performant inference solution for deploying various open-source and fine-tuned models.
- FastGen
- Flux Schnell
- Black Forest Labs
- Qwen3.5-122B-A10B
- Cumulus
- GPT-OSS-120B
- Wan2.2
- ZhiPu AI
- IonRouter
- IonAttention
- NVIDIA
- Grace Hopper
- Qwen2.5-7B
- LoRA
- GLM-5
- EAGLE
- Kimi-K2.5
- MoonShot AI
- MiniMax-M2.5
TOOL · HN — anthropic stories English(EN) · 2mo

Show HN: Axe – A 12MB binary that replaces your AI framework

Axe is a new command-line interface tool designed to manage and execute AI agents, drawing inspiration from Unix philosophy for focused, composable functionality. It allows users to define agents with specific skills using TOML files, enabling them to be chained together or triggered by standard system tools like cron or git hooks. Axe supports multiple LLM providers, including Anthropic and OpenAI, and offers features such as persistent memory, sub-agent delegation, and structured JSON output for scripting. AI

IMPACT Provides a more modular and composable approach to integrating LLM agents into existing workflows.
- Anthropic
- Unix
- TOML
- AWS Bedrock
- Ollama
- OpenAI
TOOL · HN — AI infrastructure stories English(EN) · 3mo

Launch HN: Sentrial (YC W26) – Catch AI agent failures before your users do

Sentrial has launched a new platform designed to detect and alert users about failures in AI agents before they impact end-users. The service aims to provide a proactive monitoring solution for AI-driven applications. This tool focuses on identifying issues within AI agent workflows, offering a layer of reliability for businesses integrating these technologies. AI

IMPACT Provides a monitoring solution to improve the reliability of AI agents in production environments.
- Sentrial
TOOL · HN — AI infrastructure stories English(EN) · 3mo

Show HN: Klaus – OpenClaw on a VM, batteries included

Klaus has launched a hosted OpenClaw offering that runs the AI agent on a VM, positioned as an employee for businesses. The service aims to simplify the integration of AI features, allowing companies to deploy agents within minutes. It covers use cases including calendar management, email triage, travel booking, and meeting preparation, with tiered pricing plans and a managed rollout option for full-service deployment. AI

IMPACT Accelerates business adoption of AI agents for operational tasks, potentially reducing manual labor and increasing efficiency.
- Ag Startup Engine
- Open Chair Advisory
- Klaus
- OpenClaw
- AgentMail
- Link11
- Clayton Farms
- Jupe
- SUSE
TOOL · HN — AI infrastructure stories English(EN) · 3mo · [2 sources]

Launch HN: IonRouter (YC W26) – High-throughput, low-cost inference

IonRouter has launched a new inference stack called IonAttention, designed to multiplex models on a single GPU for high throughput and low cost, compatible with NVIDIA Grace Hopper. Separately, RunAnywhere has released RCLI, an on-device voice AI for macOS that runs inference locally on Apple Silicon using their proprietary MetalRT engine, offering features like local RAG and VLM capabilities. AI

IMPACT These launches offer new options for optimizing AI inference costs and performance, both in cloud and on-device environments.
- Grace Hopper
- IonRouter
- IonAttention
- NVIDIA
- RunAnywhere
- Apple Silicon
- MetalRT
- MoonShot AI
- MiniMax
- GPT-OSS
- llama.cpp
TOOL · HN — AI infrastructure stories English(EN) · 3mo

Show HN: Open-Source Article 12 Logging Infrastructure for the EU AI Act

A new open-source TypeScript library has been released to help developers comply with Article 12 of the EU AI Act. This library automatically records AI inferences as tamper-evident logs, chaining entries with SHA-256 hashes and ensuring a minimum retention period. It is designed for Node.js applications using the Vercel AI SDK and aims to provide a more robust auditing solution than standard logging practices. AI

IMPACT Provides a technical solution for AI developers to meet new EU compliance mandates for high-risk systems.
- Mastodon
- Vercel AI SDK
- S3
- EU AI Act
- TypeScript
- Node.js
- SHA-256
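The hash-chaining idea in the summary above can be sketched briefly. The library itself is TypeScript; this Python sketch only illustrates the general technique and is not its API. The names `GENESIS`, `entry_hash`, `append`, and `verify`, and the record layout, are assumptions made for illustration.

```python
import hashlib
import json

# Hypothetical sentinel used as the "previous hash" of the first entry.
GENESIS = "0" * 64


def entry_hash(prev_hash: str, payload: dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(log: list, payload: dict) -> None:
    """Append a record whose hash commits to everything before it."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"prev": prev, "payload": payload,
                "hash": entry_hash(prev, payload)})


def verify(log: list) -> bool:
    """Recompute the chain; any edited or reordered entry breaks it."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev:
            return False
        if entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True
```

Because each hash covers the previous one, changing an old record invalidates every later hash, which is what makes tampering evident. Retention and storage are separate concerns that this sketch does not cover.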
TOOL · HN — AI infrastructure stories English(EN) · 3mo

Show HN: I built a zero-browser, pure-JS typesetting engine for bit-perfect PDFs

A developer has created VMPrint, a novel typesetting engine that operates without a browser, utilizing pure JavaScript for PDF generation. This engine treats document layout as a deterministic spatiotemporal simulation, where elements are autonomous actors negotiating geometry. The system is designed for high-volume report generation, collaborative editors, and print-on-demand services, offering a more efficient and reliable alternative to browser-based solutions. AI

IMPACT Offers a more efficient, browser-less PDF generation solution for developers, potentially reducing infrastructure costs for high-volume document creation.
- Deno Deploy
- Figma
- VMPrint
- Lambda@Edge
- PDF
- Node.js
- TypeScript
- Cloudflare Workers
- JavaScript
TOOL · HN — claude cli stories English(EN) · 3mo

Show HN: OpenSwarm – Multi‑Agent Claude CLI Orchestrator for Linear/GitHub

OpenSwarm is a new command-line interface tool designed to orchestrate multiple AI agents for autonomous code-related tasks. It can integrate with various AI models, including Anthropic's Claude, OpenAI's GPT and Codex, and local open-source models. The tool aims to automate workflows such as picking up issues from platforms like Linear, running code review pipelines, and maintaining long-term memory through databases like LanceDB. AI

IMPACT Enables more complex, multi-agent autonomous workflows for code development and issue resolution.
- llama.cpp
- LMStudio
- Ollama
- Codex
- GPT
- Claude
- Linear
- OpenSwarm
- LanceDB
- Discord
- GitHub
TOOL · HN — claude cli stories English(EN) · 3mo

Show HN: Strava for Claude Code

A new tool called Strava for Claude Code has been released, designed to help developers track their usage and costs associated with AI models like Claude. The tool provides metrics on token consumption, iteration speed, and daily usage streaks, aiming to foster a competitive environment among AI-powered developers. It emphasizes privacy by only sending aggregated usage data, not the content of prompts or code, to its local telemetry service. AI

IMPACT This tool could encourage more efficient and competitive AI development by providing usage and cost-tracking metrics for developers.
TOOL · HN — claude cli stories English(EN) · 3mo

Show HN: Skill that lets Claude Code/Codex spin up VMs and GPUs

A new open-source skill has been developed that lets AI coding agents such as Claude Code and Codex spin up virtual machines and GPUs. It is designed to streamline the process of deploying and managing AI development environments. The project aims to provide a next-generation development experience for those working with AI-powered coding assistants. AI

IMPACT Potentially streamlines AI development workflows by simplifying VM and GPU provisioning for coding agents.
- Skill
- Claude
- Codex
- cloudrouter.dev
TOOL · HN — AI startup stories English(EN) · 4mo

Launch HN: Modelence (YC S25) – App Builder with TypeScript / MongoDB Framework

Modelence, an AI startup, has launched an open-source full-stack framework designed for both human developers and AI coding agents. The framework utilizes TypeScript for its type safety and MongoDB for flexible schema management, aiming to streamline app development by handling boilerplate tasks like authentication and database setup. An integrated app builder allows users to generate applications from prompts, with plans to introduce a DevOps agent for production monitoring and error resolution. AI

IMPACT Simplifies AI-driven application development by providing a unified framework and backend infrastructure.
- MongoDB
- Modelence
- YC S25
- Claude Agent SDK
- Eduard
- TypeScript
TOOL · HN — AI startup stories English(EN) · 4mo

Launch HN: AgentMail (YC S25) – An API that gives agents their own email inboxes

AgentMail, a new API service from Haakam, Michael, and Adi, provides dedicated email inboxes for AI agents, aiming to streamline autonomous task completion. The service addresses limitations found in existing email platforms like Gmail, offering features such as programmatic inbox creation, advanced semantic search, and usage-based pricing. Early adopters are already using AgentMail for tasks like data conversion, negotiation, and sourcing training data for models. AI

IMPACT Enables more autonomous AI agents by providing a robust, dedicated communication channel, potentially streamlining workflows and data sourcing.
- AgentMail
- Michael
- Adi
- Gmail
- YC S25
- Clawdbots
- Rails
TOOL · HN — AI infrastructure stories English(EN) · 4mo

Show HN: ShapedQL – A SQL engine for multi-stage ranking and RAG

ShapedQL has been introduced as a new SQL engine designed to optimize multi-stage ranking and Retrieval-Augmented Generation (RAG) processes. This tool aims to streamline complex data operations within AI applications. The announcement was made via a Show HN post, indicating a focus on community feedback and developer adoption. AI

IMPACT Potentially improves efficiency for AI systems relying on RAG and complex ranking.
- ShapedQL
TOOL · HN — claude cli stories English(EN) · 4mo

Show HN: A fast CLI and MCP server for managing Lambda cloud GPU instances

A new open-source command-line interface (CLI) and MCP server has been released to manage cloud GPU instances from Lambda. The tool, developed by Strand-AI, allows users to directly control GPU infrastructure via terminal commands or enable AI assistants like Claude to manage these resources. It offers features such as starting, stopping, and listing instances, alongside automatic notifications for instance availability across Slack, Discord, and Telegram. AI

IMPACT Simplifies cloud GPU management for AI developers and researchers using AI assistants.
- Homebrew
- Slack
- Claude
- Strand-AI
- Telegram
- GitHub
- Discord
TOOL · HN — AI infrastructure stories English(EN) · 5mo

Show HN: I open-sourced my Go and Next B2B SaaS Starter (deploy anywhere, MIT)

A developer has open-sourced a comprehensive B2B SaaS starter kit built with Next.js 16 and Go 1.25. The kit includes features like enterprise-grade authentication, multi-tenancy, role-based access control, and billing integration. It also incorporates AI capabilities such as RAG pipelines with vector embeddings and an OCR service for document data extraction. AI

IMPACT Provides a pre-built foundation for developers to quickly integrate AI features like RAG and OCR into their SaaS products.
- Next.js
- TypeScript
- Tailwind CSS
- shadcn/ui
- Radix UI
- TanStack Query
- react-hook-form
- Stytch
- Polar.sh
- Recharts
- PostgreSQL
- pgvector
- OpenAI API
- Mistral AI
- Cloudflare R2
- Docker
- SQLC
- Zod
TOOL · HN — MCP stories English(EN) · 6mo

Show HN: MCPShark – Traffic Inspector for Model Context Protocol

MCPShark is a newly released traffic inspector designed for the Model Context Protocol (MCP). This tool allows developers to observe and debug MCP traffic, including requests, responses, and tool usage, between their editor or LLM client and MCP servers. It also offers optional "Smart Scan" checks to identify potentially risky tool configurations. AI

IMPACT Provides developers with enhanced visibility and debugging capabilities for LLM interactions via the Model Context Protocol.
TOOL · HN — AI infrastructure stories English(EN) · 6mo

Microsoft won't let me pay a $24 bill, blocking thousands in Azure spending

A software engineer detailed their frustrating experience attempting to resolve a $24 Azure billing issue that prevented them from spending thousands on new services. Despite numerous attempts through official channels, including a custom-built PowerShell application, Microsoft's support system created a loop where paying the invoice required support, but support required a paid plan that couldn't be purchased due to the outstanding invoice. The engineer expressed disbelief at the company's inability to accept payment, while other users shared similar anecdotes and suggested alternative approaches like contacting sales. AI

IMPACT Highlights potential friction points in cloud provider billing and support systems that could impact enterprise adoption of AI infrastructure.
TOOL · HN — AI startup stories English(EN) · 7mo

Y Combinator Startup brings brainrot to developers' IDEs

Clad Labs has launched a new platform designed to orchestrate multiple AI coding agents, including Claude Code, Cursor, and OpenAI Codex. The tool allows developers to spin up teams of parallel agents, manage their work in isolation, and merge changes seamlessly. It also offers analytics to track coding habits and productivity alongside entertainment usage. AI

IMPACT Enables developers to leverage multiple AI coding agents simultaneously, potentially streamlining workflows and improving productivity.
- Cursor
- Chad
- OpenAI Codex
- Clad Labs
- Claude Code
TOOL · HN — AI startup stories English(EN) · 8mo

Launch HN: Extend (YC W23) – Turn your messiest documents into data

Extend, a Y Combinator-backed startup, has launched a production-ready platform designed to transform messy documents into structured data. The service utilizes specialized vision models for accurate parsing and extraction, offering features like confidence scoring, multiple processing modes, and an optimization agent called Composer Agent to refine schemas automatically. Extend aims to streamline document workflows for AI teams, enabling faster development and deployment of data processing pipelines. AI

IMPACT Accelerates enterprise adoption of AI for document processing by simplifying data extraction and workflow automation.
TOOL · HN — AI startup stories English(EN) · 8mo

Launch HN: Webhound (YC S23) – Research agent that builds datasets from the web

AI startup Webhound has launched a research agent designed to automate the creation of web-scraped datasets based on natural language prompts. The agent, initially built on Claude 4 Sonnet, was re-engineered using Gemini 2.5 Flash and a multi-agent system to significantly reduce costs and improve reliability. This new architecture includes specialized agents for planning, searching, critiquing, and validating data, along with a text-based browser for efficient extraction. AI

IMPACT Automates complex data collection tasks, potentially lowering the barrier for data-driven research and analysis.
- Superblocks
- Webhound
- YC S23
- Claude 4 Sonnet
- Gemini 2.5 Flash
- Retool
- Appsmith
- UI Bakery
- BudiBase
- Shopify
- Figma
- Hacker News
- arXiv
TOOL · HN — MCP stories English(EN) · 8mo · [2 sources]

Show HN: AI-powered web service combining FastAPI, Pydantic-AI, and MCP servers

A developer has created an open-source AI-powered web service that integrates FastAPI for APIs, Pydantic-AI for agent construction, and Model Context Protocol (MCP) servers for tools. The service allows users to query information from sources like Hacker News and web search, presenting ranked trend cards with summaries. It supports various local LLM configurations and is containerized with Docker for production deployment. AI

IMPACT Provides a template for building production-ready AI services with modular components and local LLM support.
- OpenAI
- MCP
- GitHub
- vLLM
- LMStudio
- Ollama
- FastAPI
- Pydantic-AI
- Hacker News
- Docker
- Model Context Protocol (MCP)
TOOL · HN — AI infrastructure stories English(EN) · 9mo

Launch HN: Recall.ai (YC W20) – API for meeting recordings and transcripts

Recall.ai has launched a new Desktop Recording SDK designed to simplify the integration of meeting recording capabilities into other applications. This SDK addresses the complexities of capturing high-quality audio and video, including speaker identification and clean video compositing, without requiring a bot to be present in the meeting. The company aims to provide developers with a robust infrastructure solution, drawing on their experience powering recording features for over 2000 companies and overcoming significant technical challenges in reliability and efficiency. AI

IMPACT Simplifies AI integration for meeting analysis tools by providing a reliable recording infrastructure.
- Hubspot
- Recall.ai
- Notion
- ChatGPT
- Clickup
- AWS
TOOL · HN — AI infrastructure stories English(EN) · 9mo · [2 sources]

Show HN: Smooth – Faster, cheaper browser agent API

Smooth has launched a new serverless browser agent API designed for reliability, speed, and cost-efficiency, claiming to be 7x cheaper and 5x faster than existing solutions. The API aims to simplify web automation tasks for developers by handling complexities like instant browser spin-up and CAPTCHA solving. Separately, ContextFort has introduced a tool to provide visibility and control over AI coding agents like Cursor and Claude Code, addressing security concerns about agents accessing sensitive files and credentials on developer machines. AI

IMPACT New tools emerge to enhance AI agent capabilities and address security concerns in development workflows.
TOOL · HN — MCP stories English(EN) · 9mo

Launch HN: April (YC S25) – Voice AI to manage your email and calendar

April, a new voice-controlled AI assistant, has launched on the App Store to manage emails and calendars. The application allows users to dictate replies, summarize messages, and reschedule meetings hands-free. It utilizes Deepgram for speech-to-text and Eleven Labs for text-to-speech, with custom servers for Google integration. The developers are focusing on low latency and natural interaction, while also considering user feedback on safety features like a 'safe mode' for non-destructive operations. AI

IMPACT Potentially streamlines daily productivity for users by enabling hands-free management of communications and schedules.
- HN
- Akash
- SF
- Eleven Labs
- Deepgram
- App Store
- April
- Berkeley
- Gmail
- Google
TOOL · HN — AI infrastructure stories English(EN) · 9mo

Launch HN: Skope (YC S25) – Outcome-based pricing for software products

Skope, a new billing system, has launched to support outcome-based pricing for software products, particularly targeting the burgeoning AI market. The platform allows companies to charge customers only when their software delivers a specific result, aligning incentives and reducing buyer risk. Skope aims to simplify the implementation of this pay-per-performance model, which was previously challenging to manage at scale. AI

IMPACT Enables new pricing models for AI products, potentially accelerating adoption by reducing upfront risk for buyers.
- Stripe Billing
- Metronome
- Stripe
- Langfuse
- Helicone
- Skope
TOOL · HN — AI startup stories Deutsch(DE) · 9mo

Launch HN: Cyberdesk (YC S25) – Automate Windows legacy desktop apps

Cyberdesk, a startup founded by Mahmoud and Alan, has launched a new tool designed to automate repetitive tasks within legacy Windows desktop applications. Their approach uses a deterministic computer use agent that learns workflows from natural language instructions, offering a more reliable alternative to traditional Robotic Process Automation (RPA) scripts. The agent can self-correct based on screen state and only resorts to expensive AI models when unexpected anomalies occur, making it both robust and cost-effective for industries like healthcare and accounting. AI

IMPACT Automates legacy desktop applications, potentially improving efficiency and reducing errors in industries reliant on older software.
TOOL · HN — MCP stories English(EN) · 10mo

Show HN: Mcp-use – Connect any LLM to any MCP

The mcp-use framework has been released, enabling developers to build applications that can connect to various large language models like ChatGPT and Claude. This framework allows for the creation of MCP Servers and MCP Apps, with SDKs available in TypeScript and Python. It also includes an MCP Inspector for testing and debugging, and a cloud deployment option for production environments. AI

IMPACT Enables developers to build cross-platform applications for multiple LLMs, potentially streamlining AI agent development.
- Claude
- TypeScript
- Python
- GitHub
- Manufact MCP Cloud
- mcp-use
- ChatGPT
TOOL · HN — machine learning stories English(EN) · 10mo

PHP-ORT: Machine learning inference for the web

A new infrastructure project called PHP-ORT aims to bring machine learning inference capabilities directly to PHP, the server-side language used by a significant portion of the web. This development seeks to empower millions of PHP developers to integrate AI features into their applications without relying on external services or switching programming languages. PHP-ORT provides a core Tensor API, a high-performance math library, and integrates with ONNX for direct inference, promising significant speedups. AI

IMPACT Enables millions of PHP developers to integrate ML inference directly into their web applications, potentially democratizing AI capabilities at scale.
- PHP-ORT
- ONNX
- AVX2
- CUDA
- WASM
- NEON
- RISCV64
- AVX512
- SSE4.1
- SSE2
TOOL · HN — AI infrastructure stories English(EN) · 10mo

Show HN: Improving search ranking with chess Elo scores

ZeroEntropy has developed specialized AI models, including rerankers and embeddings, designed for production systems that prioritize speed and accuracy over generalist models. Their offerings, such as zembed-1 and zerank-2, aim to provide lower latency and higher accuracy for applications like Retrieval Augmented Generation (RAG). These models are available for integration into existing stacks and can be deployed on cloud platforms like AWS and Azure, with a focus on security and compliance standards. AI

IMPACT Offers specialized, low-latency AI models that could improve performance for specific RAG and search ranking tasks.
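The headline refers to chess Elo scores, which the summary does not describe. As background, here is a minimal sketch of the standard Elo update applied to pairwise document preferences. It is not ZeroEntropy's method; the function names, the starting rating of 1500, and the K-factor of 32 are conventional or hypothetical choices.

```python
def expected_score(r_a: float, r_b: float) -> float:
    """Standard Elo expectation that A beats B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def update(ratings: dict, winner: str, loser: str, k: float = 32.0) -> None:
    """Shift both ratings by K times the winner's surprise."""
    delta = k * (1.0 - expected_score(ratings[winner], ratings[loser]))
    ratings[winner] += delta
    ratings[loser] -= delta


def rank(doc_ids: list, comparisons: list,
         initial: float = 1500.0, k: float = 32.0) -> list:
    """Order documents by Elo rating after replaying pairwise outcomes.

    `comparisons` is a list of (winner, loser) pairs, for example
    judgments of which of two documents better answers a query.
    """
    ratings = {d: initial for d in doc_ids}
    for winner, loser in comparisons:
        update(ratings, winner, loser, k)
    return sorted(doc_ids, key=lambda d: ratings[d], reverse=True)
```

For instance, `rank(["a", "b", "c"], [("b", "a"), ("b", "c"), ("c", "a")])` replays three judgments and returns the documents ordered from highest to lowest rating.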
TOOL · HN — AI startup stories English(EN) · 11mo

Show HN: Open source alternative to Perplexity Comet

BrowserOS has launched as an open-source browser designed for the AI era, integrating AI agents that can automate web tasks through natural language commands. It prioritizes user privacy and offers extensive customization by supporting over 11 AI providers, including popular options like Anthropic Claude, Google Gemini, and OpenAI, as well as local models. The browser is built on a Chromium fork, ensuring compatibility with existing Chrome extensions and offering a user-friendly experience for both general users and developers. AI

IMPACT This browser aims to streamline AI agent integration for web automation, potentially simplifying workflows for users and developers interacting with various LLMs.
- Windows
- Chrome
- BrowserOS
- Perplexity
- Comet
- GitHub
- Moonshot Kimi
- Anthropic Claude
- Google Gemini
- OpenAI
- OpenRouter
- Ollama
- LM Studio
- macOS
- Linux
TOOL · HN — machine learning stories English(EN) · 12mo

Show HN: Glowstick – type level tensor shapes in stable rust

Glowstick is a new Rust crate designed to enhance tensor manipulation by integrating shape checking directly into the type system. This approach aims to make tensor operations safer and more intuitive, particularly for developers working with machine learning frameworks. The project, currently in its pre-1.0 phase, offers features like dynamic dimension support and improved error messages, with plans to align with ONNX operations. AI

IMPACT Provides a type-safe approach to tensor manipulation in Rust, potentially improving developer experience and reducing errors in ML workflows.
- Burn
- ONNX
- Candle
- Rust
- Tensor
TOOL · HN — AI infrastructure stories English(EN) · 13mo

Launch HN: Tinfoil (YC X25): Verifiable Privacy for Cloud AI

Tinfoil, a startup founded by researchers from MIT and Cloudflare, has launched a new service designed to provide verifiable privacy for AI workloads hosted in the cloud. The platform utilizes secure enclave technology, particularly NVIDIA's confidential computing capabilities on GPUs, to ensure that neither Tinfoil nor the cloud provider can access sensitive data processed by AI models. This approach aims to enhance AI privacy by replacing trust with provable security, enabling more complex AI applications that require private data. AI

IMPACT Enables more sensitive AI applications by providing verifiable privacy for cloud-hosted models.
- MIT
- Tinfoil
- TLS
- FHE
- Sigstore
- Deepseek R1
- Llama
- Tor
- Cloudflare
- Microsoft Research
- NVIDIA
TOOL · HN — AI startup stories English(EN) · 13mo

Launch HN: ParaQuery (YC X25) – GPU Accelerated Spark/SQL

ParaQuery, a new startup, has launched a GPU-accelerated Spark and SQL data processing solution. The platform aims to offer cost and performance benefits over existing solutions like Google BigQuery. ParaQuery uses NVIDIA's RAPIDS technology to speed up traditional data processing tasks; the founder notes that GPUs are often mistakenly assumed to be useful only for AI and graphics. AI

IMPACT Enhances data processing efficiency, potentially lowering costs for AI workloads that rely on large datasets.
- Google
- ParaQuery
- Spark
- SQL
- BigQuery
- NVIDIA
- RAPIDS
- CUDA
TOOL · HN — AI startup stories English(EN) · 13mo

Launch HN: Exa (YC S21) – The web as a database

Exa has launched Websets, a new search engine that uses embeddings and agentic workflows to provide precise results from the web, presented in a database-like table format. The service aims to combat the decline in search quality by performing extensive embedding searches and then using LLMs to verify each result against complex queries. While the process can take significant time, Exa believes the accuracy and detailed verification are worth the wait, offering an alternative to traditional keyword-based search. AI

IMPACT Offers a novel approach to web search by leveraging embeddings and LLMs for enhanced accuracy and structured data retrieval.
- Google
- Exa
- Websets
- Will
- Jeff
- LLM
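
The two-stage flow described in the Exa item above (a broad embedding search, then a per-result check against the query's constraints) can be sketched roughly as follows. This is a minimal illustration, not Exa's API: the function names, the toy 2-d vectors standing in for embeddings, and the `founded_after_2020` predicate standing in for an LLM verification call are all assumptions made for the example.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def search_then_verify(query_vec, docs, verify, k=3):
    # Stage 1: rank documents by embedding similarity and keep the top k.
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    candidates = ranked[:k]
    # Stage 2: keep only the candidates the verifier accepts.
    return [d for d in candidates if verify(d)]

# Hypothetical toy data: 2-d vectors stand in for real embeddings.
DOCS = [
    {"name": "Acme Robotics", "vec": [0.9, 0.1], "founded": 2021},
    {"name": "Old Robotics Co", "vec": [0.8, 0.2], "founded": 1998},
    {"name": "Bakery Blog", "vec": [0.1, 0.9], "founded": 2022},
]

def founded_after_2020(doc):
    # Stand-in for an LLM check of one result against the query's constraints.
    return doc["founded"] > 2020

# Each surviving result becomes one row of the table-like output.
rows = search_then_verify([1.0, 0.0], DOCS, founded_after_2020, k=2)
for row in rows:
    print(row["name"], row["founded"])
```

Verification runs once per candidate, which is one plausible reason such a pipeline is slow when the checker is an LLM: a larger `k` improves recall but multiplies the number of verification calls.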
TOOL · HN — machine learning stories English(EN) · 13mo

OCaml's Wings for Machine Learning

Raven is a new ecosystem of OCaml libraries designed for numerical computing, machine learning, and data science. It aims to provide type-safe alternatives to popular Python libraries such as NumPy, JAX, and PyTorch. The project includes modules for n-dimensional arrays, automatic differentiation, tokenization, neural networks, dataframes, and plotting, with the goal of building a robust scientific computing environment. AI

IMPACT Provides a type-safe alternative for AI development in OCaml, potentially attracting developers seeking stronger guarantees.
- Jupyter
- Matplotlib
- PyTorch
- JAX
- MirageOS
- Raven
- NumPy
- OCaml
TOOL · HN — AI startup stories English(EN) · 13mo

Show HN: Morphik – Open-source RAG that understands PDF images, runs locally

Morphik has launched an open-source Retrieval-Augmented Generation (RAG) system designed for developers to integrate complex context into AI applications. The system aims to simplify the process by offering a unified solution for storing, representing, and searching unstructured and multimodal data, addressing the limitations of traditional RAG pipelines that struggle with visually rich documents. Morphik provides features like multimodal search, fast metadata extraction, and integrations with tools such as Google Suite and Slack, with a free tier available for users. AI

IMPACT Simplifies multimodal data integration for AI applications, potentially reducing development complexity and infrastructure costs.
- Morphik
- Slack
- Confluence
TOOL · HN — AI infrastructure stories English(EN) · 13mo

Show HN: We Put Chromium on a Unikernel (OSS Apache 2.0)

A new open-source project offers sandboxed Chrome browsers that can be run as Docker containers or on Unikraft unikernels. This setup is designed for browser automation, web agents, and testing AI agents that interact with the web. The unikernel implementation provides features like automated standby mode with state snapshotting and extremely fast cold restarts, enabling low-latency event handling. AI

IMPACT Enables developers to build and test AI agents that require controlled browser environments.
- Chromium
- Kraft CLI
- Unikraft
- Unikernel
- Playwright
- Puppeteer
- Chrome DevTools
- Docker
TOOL · HN — AI startup stories English(EN) · 14mo

Launch HN: mrge.io (YC X25) – Cursor for code review

AI startup mrge has launched a new platform designed to streamline code reviews for development teams. The tool connects to GitHub repositories and uses AI to analyze code changes within a secure, ephemeral sandbox environment. It aims to assist human reviewers by identifying potential bugs and providing context, and its design is inspired by productivity tools like Linear and Superhuman. AI

IMPACT Aims to accelerate code merging and reduce bugs by leveraging AI for code review, potentially improving developer productivity.
- mrge.io
- GitHub
- Linear
- Superhuman
- Better Auth
- Cal.com
- n8n
- Cursor