Brief

[50/3923] 224 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.
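
The feed does not publish its scoring formula. As a rough illustration only, the sketch below shows one way the four named signals could be folded into a 0–100 score: a weighted sum of the three quality signals, multiplied by an exponential time decay. The weights, the 24-hour half-life, and the function name `score_story` are all assumptions made for this example.

```python
# Hypothetical weights; the brief does not say how the signals are combined.
WEIGHTS = {"authority": 0.4, "cluster_strength": 0.35, "headline_signal": 0.25}
HALF_LIFE_HOURS = 24.0  # assumed half-life for the time-decay term


def score_story(authority, cluster_strength, headline_signal, age_hours):
    """Combine three signals in [0, 1] with exponential time decay.

    Returns a score between 0 and 100.
    """
    signals = {
        "authority": authority,
        "cluster_strength": cluster_strength,
        "headline_signal": headline_signal,
    }
    for name, value in signals.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    if age_hours < 0:
        raise ValueError("age_hours must be non-negative")

    # Weighted sum of the quality signals (weights sum to 1).
    base = sum(WEIGHTS[name] * value for name, value in signals.items())
    # The score halves every HALF_LIFE_HOURS.
    decay = 0.5 ** (age_hours / HALF_LIFE_HOURS)
    return 100.0 * base * decay
```

With these assumed settings, a story that maxes out all three signals scores 100 when fresh and half that after one half-life.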

TOOL · Hugging Face Blog English(EN) · 8mo · [5 sources]

VibeGame: Exploring Vibe Coding Games

Google AI has introduced Vibe Coding XR, a new workflow designed to simplify the creation of interactive XR experiences. This system leverages Gemini's capabilities with the open-source XR Blocks framework to translate natural language prompts into functional, physics-aware WebXR applications for Android XR devices. The goal is to accelerate prototyping by allowing creators to quickly test intelligent spatial experiences without extensive coding knowledge, with applications deployable in under 60 seconds. Google plans to demonstrate Vibe Coding XR at ACM CHI 2026. AI
- Android XR
- ACM CHI 2026
- Hugging Face
- Replit
- RPython
- Google AI
- Gemini
- XR Blocks
- WebXR
TOOL · OpenAI News English(EN) · 8mo

Buy it in ChatGPT: Instant Checkout and the Agentic Commerce Protocol

OpenAI has launched "Instant Checkout" within ChatGPT, enabling users to purchase products directly from merchants without leaving the chat interface. This feature is powered by the newly released Agentic Commerce Protocol, an open standard co-developed with Stripe. Initially available for U.S. users to buy from Etsy and soon Shopify merchants, the protocol aims to facilitate seamless AI-driven commerce by allowing AI agents, people, and businesses to collaborate on purchases. OpenAI is open-sourcing the protocol to encourage broader adoption and integration by developers and merchants. AI
TOOL · HN — AI startup stories English(EN) · 8mo

Launch HN: Webhound (YC S23) – Research agent that builds datasets from the web

AI startup Webhound has launched a research agent designed to automate the creation of web-scraped datasets based on natural language prompts. The agent, initially built on Claude 4 Sonnet, was re-engineered using Gemini 2.5 Flash and a multi-agent system to significantly reduce costs and improve reliability. This new architecture includes specialized agents for planning, searching, critiquing, and validating data, along with a text-based browser for efficient extraction. AI

IMPACT Automates complex data collection tasks, potentially lowering the barrier for data-driven research and analysis.
- Superblocks
- Appsmith
- Budibase
- UI Bakery
- YC S23
- Claude 4 Sonnet
- Gemini 2.5 Flash
- Retool
- Webhound
- arXiv
- Hacker News
- Figma
- Shopify
TOOL · HN — MCP stories English(EN) · 9mo · [2 sources]

Show HN: AI-powered web service combining FastAPI, Pydantic-AI, and MCP servers

A developer has created an open-source AI-powered web service that integrates FastAPI for APIs, Pydantic-AI for agent construction, and Model Context Protocol (MCP) servers for tools. The service allows users to query information from sources like Hacker News and web search, presenting ranked trend cards with summaries. It supports various local LLM configurations and is containerized with Docker for production deployment. AI

IMPACT Provides a template for building production-ready AI services with modular components and local LLM support.
- Hacker News
- OpenAI
- MCP
- GitHub
- vLLM
- LM Studio
- Ollama
- FastAPI
- Pydantic-AI
- Docker
- Model Context Protocol (MCP)
TOOL · Hugging Face Blog English(EN) · 9mo

Tricks from OpenAI gpt-oss YOU 🫵 can use with transformers

Hugging Face has released a guide detailing techniques to optimize the performance of large language models using the Transformers library. The blog post, inspired by OpenAI's open-source contributions, focuses on practical methods for accelerating inference and training. It covers strategies such as quantization, efficient attention mechanisms, and optimized kernels to help developers achieve faster results with their models. AI
TOOL · HN — AI infrastructure stories English(EN) · 9mo

Launch HN: Recall.ai (YC W20) – API for meeting recordings and transcripts

Recall.ai has launched a new Desktop Recording SDK designed to simplify the integration of meeting recording capabilities into other applications. This SDK addresses the complexities of capturing high-quality audio and video, including speaker identification and clean video compositing, without requiring a bot to be present in the meeting. The company aims to provide developers with a robust infrastructure solution, drawing on their experience powering recording features for over 2000 companies and overcoming significant technical challenges in reliability and efficiency. AI

IMPACT Simplifies AI integration for meeting analysis tools by providing a reliable recording infrastructure.
- AWS
- Recall.ai
- Notion
- ChatGPT
- HubSpot
- ClickUp
TOOL · Together AI blog English(EN) · 9mo

Together AI welcomes Mahadev Konar as SVP for Infrastructure Engineering

Together AI has appointed Mahadev Konar as its new SVP of Infrastructure Engineering to bolster its GPU cloud services. Konar, a key figure in Apache Hadoop's development and formerly VP of Infrastructure at Instacart, will lead efforts to enhance the reliability, performance, and scalability of Together AI's platform. The company aims to provide AI-native startups with a robust infrastructure, enabling them to focus on product development rather than managing complex GPU environments. AI

IMPACT Strengthens Together AI's infrastructure capabilities, potentially improving scalability and reliability for AI startups using their platform.
- Vercept
- Together AI
- Mahadev Konar
- Apache Hadoop
- Instacart
- Hedra
- SCB 10X
- ElevenLabs
- Cohere
- Zoom
- Infrastructure Engineering
TOOL · Hugging Face Blog English(EN) · 9mo

SAIR: Accelerating Pharma R&D with AI-Powered Structural Intelligence

SandboxAQ has introduced SAIR, a new AI platform designed to accelerate pharmaceutical research and development. SAIR leverages AI-powered structural intelligence to analyze complex biological data, aiming to speed up the discovery of new drugs and therapies. The platform is expected to enhance the efficiency of R&D processes within the pharmaceutical industry. AI
TOOL · Hugging Face Blog English(EN) · 9mo

Make your ZeroGPU Spaces go brrr with ahead-of-time compilation

Hugging Face has introduced ahead-of-time (AOT) compilation for its ZeroGPU Spaces, enabling faster inference speeds. This optimization technique compiles models before deployment, reducing latency and improving the overall user experience for those running models without dedicated GPUs. The feature aims to make AI model deployment more accessible and efficient on their platform. AI
TOOL · HN — AI infrastructure stories English(EN) · 9mo · [2 sources]

Show HN: Smooth – Faster, cheaper browser agent API

Smooth has launched a new serverless browser agent API designed for reliability, speed, and cost-efficiency, claiming to be 7x cheaper and 5x faster than existing solutions. The API aims to simplify web automation tasks for developers by handling complexities like instant browser spin-up and CAPTCHA solving. Separately, ContextFort has introduced a tool to provide visibility and control over AI coding agents like Cursor and Claude Code, addressing security concerns about agents accessing sensitive files and credentials on developer machines. AI

IMPACT New tools emerge to enhance AI agent capabilities and address security concerns in development workflows.
TOOL · HN — MCP stories English(EN) · 9mo

Launch HN: April (YC S25) – Voice AI to manage your email and calendar

April, a new voice-controlled AI assistant, has launched on the App Store to manage emails and calendars. The application allows users to dictate replies, summarize messages, and reschedule meetings hands-free. It uses Deepgram for speech-to-text and ElevenLabs for text-to-speech, with custom servers for Google integration. The developers are focusing on low latency and natural interaction, and are considering user feedback on safety features such as a "safe mode" restricted to non-destructive operations. AI

IMPACT Potentially streamlines daily productivity for users by enabling hands-free management of communications and schedules.
- Deepgram
- App Store
- ElevenLabs
- Google
- HN
- Akash
- SF
- Berkeley
- Gmail
- April
TOOL · HN — AI infrastructure stories English(EN) · 9mo

Launch HN: Skope (YC S25) – Outcome-based pricing for software products

Skope, a new billing system, has launched to support outcome-based pricing for software products, particularly targeting the burgeoning AI market. The platform allows companies to charge customers only when their software delivers a specific result, aligning incentives and reducing buyer risk. Skope aims to simplify the implementation of this pay-per-performance model, which was previously challenging to manage at scale. AI

IMPACT Enables new pricing models for AI products, potentially accelerating adoption by reducing upfront risk for buyers.
- Stripe
- Stripe Billing
- Metronome
- Langfuse
- Helicone
- Skope
TOOL · HN — AI startup stories English(EN) · 9mo

Launch HN: Channel3 (YC S25) – A database of every product on the internet

Channel3, a startup founded by George and Alex, has launched an API designed to provide developers with a comprehensive database of internet products. The service addresses the difficulty of accessing clean, structured product data from various retailers, which is often protected by bot detection. Channel3 uses computer vision and LLMs to identify, normalize, and de-duplicate product listings across multiple vendors, offering a unified API for developers to integrate product recommendations and affiliate monetization into their applications. The platform supports text and image-based searches, provides product details like price and specifications, and aims to facilitate developer earnings through commissions. AI

IMPACT Enables developers to integrate product search and affiliate monetization into applications using AI-powered data processing.
- Exa
- Python
- TypeScript
- OpenSearch
- AWS S3 Vectors
- Cloudflare Vectorize
- Bing
- Tavily
- George
- Alex
- YC S25
- Channel3
TOOL · HN — AI startup stories English(EN) · 10mo

Launch HN: Cyberdesk (YC S25) – Automate Windows legacy desktop apps

Cyberdesk, a startup founded by Mahmoud and Alan, has launched a new tool designed to automate repetitive tasks within legacy Windows desktop applications. Their approach uses a deterministic computer use agent that learns workflows from natural language instructions, offering a more reliable alternative to traditional Robotic Process Automation (RPA) scripts. The agent can self-correct based on screen state and only resorts to expensive AI models when unexpected anomalies occur, making it both robust and cost-effective for industries like healthcare and accounting. AI

IMPACT Automates legacy desktop applications, potentially improving efficiency and reducing errors in industries reliant on older software.
TOOL · Replit blog English(EN) · 10mo

Introducing App Storage – building apps with images, video, and PDFs just got easier

Replit has introduced App Storage, a new object storage solution designed to simplify the hosting and saving of large files like images, videos, and documents within applications. This feature integrates seamlessly with Replit's Agent capabilities, allowing users to build apps that handle diverse file types with built-in authentication and database connections for permission management. App Storage is intended for a wide range of applications, from client portals and recipe apps to document management systems and online course platforms, offering SDKs for both JavaScript and Python. AI

IMPACT Simplifies development for AI-powered applications that handle large media files.
- Replit
- App Storage
TOOL · HN — MCP stories English(EN) · 10mo

Show HN: Mcp-use – Connect any LLM to any MCP

The mcp-use framework has been released, enabling developers to build applications that can connect to various large language models like ChatGPT and Claude. This framework allows for the creation of MCP Servers and MCP Apps, with SDKs available in TypeScript and Python. It also includes an MCP Inspector for testing and debugging, and a cloud deployment option for production environments. AI

IMPACT Enables developers to build cross-platform applications for multiple LLMs, potentially streamlining AI agent development.
- ChatGPT
- Claude
- TypeScript
- Python
- GitHub
- Manufact MCP Cloud
- mcp-use
TOOL · HN — machine learning stories English(EN) · 10mo

PHP-ORT: Machine learning inference for the web

A new infrastructure project called PHP-ORT aims to bring machine learning inference capabilities directly to PHP, the server-side language used by a significant portion of the web. This development seeks to empower millions of PHP developers to integrate AI features into their applications without relying on external services or switching programming languages. PHP-ORT provides a core Tensor API, a high-performance math library, and integrates with ONNX for direct inference, promising significant speedups. AI

IMPACT Enables millions of PHP developers to integrate ML inference directly into their web applications, potentially democratizing AI capabilities at scale.
- SSE2
- ONNX
- AVX2
- CUDA
- WASM
- NEON
- RISCV64
- AVX512
- SSE4.1
- PHP-ORT
TOOL · Together AI blog English(EN) · 10mo

Together Evaluations: Benchmark Models for Your Tasks

Together AI has launched Together Evaluations, a new platform designed to help developers benchmark large language models for specific tasks. The service allows users to define custom benchmarks and utilize leading open-source LLMs as judges to assess model response quality. This approach aims to provide a faster and more flexible alternative to manual labeling or rigid automated metrics, with an early preview now available. AI

IMPACT Enables developers to more efficiently select and integrate the best LLMs for their specific applications.
TOOL · OpenAI News English(EN) · 10mo

Model ML is helping financial firms rebuild with AI from the ground up

Model ML, a company co-founded by Chaz Englander, is developing AI infrastructure tailored for the financial services industry. Their platform utilizes purpose-built agents and applications to automate complex workflows, significantly reducing the time required for tasks like quarterly earnings summaries. This automation allows financial professionals to shift their focus from routine work to higher-value, judgment-based roles, prompting a re-evaluation of organizational structures to become AI-native. AI
TOOL · HN — AI infrastructure stories English(EN) · 11mo

Show HN: Improving search ranking with chess Elo scores

ZeroEntropy has developed specialized AI models, including rerankers and embeddings, designed for production systems that prioritize speed and accuracy over generalist models. Their offerings, such as zembed-1 and zerank-2, aim to provide lower latency and higher accuracy for applications like Retrieval Augmented Generation (RAG). These models are available for integration into existing stacks and can be deployed on cloud platforms like AWS and Azure, with a focus on security and compliance standards. AI

IMPACT Offers specialized, low-latency AI models that could improve performance for specific RAG and search ranking tasks.
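
The summary does not explain how Elo scores enter the ranking. As a generic illustration, the sketch below applies the standard chess Elo update to pairwise document preferences and ranks documents by their final rating. It is not ZeroEntropy's method; the function names, the starting rating of 1000, and the K-factor of 32 are assumptions chosen for the example.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expectation that A beats B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(ratings, winner, loser, k=32.0):
    """Shift rating points from loser to winner (k is an assumed K-factor)."""
    delta = k * (1.0 - expected_score(ratings[winner], ratings[loser]))
    ratings[winner] += delta
    ratings[loser] -= delta


def rank_documents(doc_ids, comparisons, initial=1000.0):
    """Rank documents from (winner, loser) pairwise preference judgments.

    `comparisons` is an iterable of (preferred_doc, other_doc) tuples, for
    example produced by a judge comparing two documents for one query.
    """
    ratings = {doc_id: initial for doc_id in doc_ids}
    for winner, loser in comparisons:
        update_elo(ratings, winner, loser)
    return sorted(doc_ids, key=lambda doc_id: ratings[doc_id], reverse=True)
```

For instance, `rank_documents(["a", "b", "c"], [("a", "b"), ("a", "c"), ("b", "c")])` orders the documents as a, b, c.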
TOOL · Hugging Face Blog English(EN) · 11mo

Migrating the Hub from Git LFS to Xet

Hugging Face is transitioning its model and dataset hosting platform, the Hugging Face Hub, away from Git Large File Storage (LFS) to Xet, a new version control system designed for large files. This move aims to improve performance and scalability for managing the vast amounts of data associated with AI models. The migration process is expected to be gradual, with users being notified and guided through the transition. AI
- Hugging Face
- Git LFS
- Xet
TOOL · HN — AI startup stories English(EN) · 11mo

Show HN: Cactus – Ollama for Smartphones

Cactus has released an open-source AI engine designed for mobile devices and wearables, prioritizing low latency and reduced RAM usage. The engine supports multimodal capabilities, including speech, vision, and language models, with an option to fall back to cloud-based models. It features NPU acceleration for energy efficiency and offers OpenAI-compatible APIs for integration into various applications. AI

IMPACT Enables on-device AI processing, potentially reducing reliance on cloud services and improving user privacy for mobile applications.
- OpenAI
- Cactus
- Ollama
- Gemma
TOOL · HN — AI startup stories English(EN) · 11mo

Show HN: Open source alternative to Perplexity Comet

BrowserOS has launched as an open-source browser designed for the AI era, integrating AI agents that can automate web tasks through natural language commands. It prioritizes user privacy and offers extensive customization by supporting over 11 AI providers, including popular options like Anthropic Claude, Google Gemini, and OpenAI, as well as local models. The browser is built on a Chromium fork, ensuring compatibility with existing Chrome extensions and offering a user-friendly experience for both general users and developers. AI

IMPACT This browser aims to streamline AI agent integration for web automation, potentially simplifying workflows for users and developers interacting with various LLMs.
- OpenRouter
- BrowserOS
- Perplexity
- Comet
- GitHub
- Moonshot Kimi
- Anthropic Claude
- Google Gemini
- OpenAI
- Ollama
- LM Studio
- macOS
- Windows
- Chrome
- Linux
TOOL · Hugging Face Blog English(EN) · 11mo

Three Mighty Alerts Supporting Hugging Face’s Production Infrastructure

Hugging Face has detailed its infrastructure alerting system, emphasizing its role in maintaining production stability. The system is designed to provide timely notifications for critical issues, enabling rapid response and minimizing downtime. This approach ensures the reliability of their platform, which hosts a vast number of AI models and datasets. AI
TOOL · HN — AI infrastructure stories English(EN) · 11mo

Show HN: Octelium – FOSS Alternative to Teleport, Cloudflare, Tailscale, Ngrok

Octelium has released a new open-source, self-hosted platform designed for secure access and deployment. It functions as a unified zero-trust solution, offering capabilities such as a remote access VPN, ZTNA, an alternative to ngrok and Cloudflare Tunnel, an API gateway, and an AI gateway. The platform supports identity-based access control and can be used for deploying containerized applications and managing homelab infrastructure. AI

IMPACT Provides a self-hosted gateway for AI LLM providers, potentially enabling more control and customization for AI deployments.
- BeyondCorp
- Google BeyondCorp
- OpenVPN Access Server
- Cloudflare Access
- Twingate
- Pi-hole
- Elasticsearch
- ClickHouse
- Ollama
- Apigee
- Kong Gateway
- Octelium
- Teleport
- Cloudflare
- Tailscale
- Ngrok
- WireGuard
- QUIC
TOOL · Hugging Face Blog English(EN) · 11mo

Transformers backend integration in SGLang

Hugging Face has integrated its Transformers library with SGLang, an open-source language model serving system. This integration allows developers to leverage Hugging Face's extensive model hub directly within SGLang for more efficient model deployment and inference. The collaboration aims to simplify the process of serving large language models, making advanced AI capabilities more accessible to a wider range of users and applications. AI
TOOL · HN — machine learning stories English(EN) · 12mo

Show HN: Glowstick – type level tensor shapes in stable rust

Glowstick is a new Rust crate designed to enhance tensor manipulation by integrating shape checking directly into the type system. This approach aims to make tensor operations safer and more intuitive, particularly for developers working with machine learning frameworks. The project, currently in its pre-1.0 phase, offers features like dynamic dimension support and improved error messages, with plans to align with ONNX operations. AI

IMPACT Provides a type-safe approach to tensor manipulation in Rust, potentially improving developer experience and reducing errors in ML workflows.
- Tensor
- ONNX
- Candle
- Rust
- Burn
TOOL · Together AI blog English(EN) · 12mo · [2 sources]

Introducing Together Code Sandbox & Together Code Interpreter: SOTA code execution for AI

Together AI has launched two new products, Together Code Sandbox and Together Code Interpreter, aimed at improving the execution of AI-generated code. Together Code Sandbox offers customizable virtual machine environments for building development tools and agentic workflows, featuring rapid VM startup and scaling capabilities. Together Code Interpreter provides a simpler API for session-based Python code execution within these secure sandboxes, designed for straightforward use cases. AI

IMPACT Accelerates development cycles for AI coding products by providing scalable and secure execution environments.
TOOL · Together AI blog English(EN) · 12mo

Together Code Interpreter: execute LLM-generated code seamlessly with a simple API call

Together AI has launched Together Code Interpreter (TCI), an API designed to securely execute code generated by large language models. This tool addresses the limitation of LLMs being unable to run the code they produce, enabling developers to integrate and test code within agentic workflows. TCI creates sandboxed environments for code execution, returning results that can be fed back to LLMs for iterative improvement and richer user responses. The interpreter has also shown promise in accelerating reinforcement learning operations by automating code evaluation and unit testing during model training. AI

IMPACT Enables LLMs to execute code, potentially accelerating agentic workflows and improving model training through automated evaluation.
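
The execute-and-feed-back loop described above can be sketched locally. This is not the Together Code Interpreter API: it runs the generated code in a child Python process as a stand-in, which provides no real sandboxing, and the names `run_generated_code` and `feedback_message` are hypothetical.

```python
import subprocess
import sys


def run_generated_code(code, timeout_s=5.0):
    """Run a code string in a separate Python process and capture its output.

    NOTE: a plain subprocess is NOT a security sandbox. It only stands in
    for the isolated environment a hosted interpreter would provide.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return {"ok": False, "stdout": "", "stderr": "timed out"}
    return {
        "ok": proc.returncode == 0,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }


def feedback_message(result):
    """Format an execution result as text to send back to the model."""
    if result["ok"]:
        return "Execution succeeded. Output:\n" + result["stdout"]
    return "Execution failed. Error:\n" + result["stderr"]
```

In an agentic workflow, the string from `feedback_message` would be appended to the conversation so the model can revise its code on the next turn.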
TOOL · HN — AI infrastructure stories English(EN) · 13mo

Launch HN: Tinfoil (YC X25): Verifiable Privacy for Cloud AI

Tinfoil, a startup founded by researchers from MIT and Cloudflare, has launched a new service designed to provide verifiable privacy for AI workloads hosted in the cloud. The platform utilizes secure enclave technology, particularly NVIDIA's confidential computing capabilities on GPUs, to ensure that neither Tinfoil nor the cloud provider can access sensitive data processed by AI models. This approach aims to enhance AI privacy by replacing trust with provable security, enabling more complex AI applications that require private data. AI

IMPACT Enables more sensitive AI applications by providing verifiable privacy for cloud-hosted models.
- NVIDIA
- Tinfoil
- MIT
- Microsoft Research
- Cloudflare
- Tor
- Llama
- DeepSeek R1
- TLS
- FHE
- Sigstore
TOOL · Hacker News — AI stories ≥50 points English(EN) · 13mo · [2 sources]

Show HN: HelixDB – A graph database built on object storage

HelixDB has launched as an open-source platform designed to consolidate multiple database types for AI applications. It aims to eliminate the need for separate databases for application logic, relational data, vectors, and graphs by offering a unified graph and vector data model that also supports KV, document, and relational formats. The platform includes a CLI for local instance management and a "helix chef" tool that can bootstrap projects and even build applications from a single description with the help of coding agents. AI

IMPACT Consolidates multiple data stores, potentially simplifying AI application development and agent integration.
- HelixDB
- AI
- OpenCode
- Codex
- Claude Code
- Rust
- TypeScript
TOOL · HN — AI startup stories English(EN) · 13mo

Launch HN: ParaQuery (YC X25) – GPU Accelerated Spark/SQL

ParaQuery, a new startup, has launched a GPU-accelerated Spark and SQL data processing solution. The platform aims to offer cost and performance benefits over existing solutions like Google BigQuery. ParaQuery leverages NVIDIA's RAPIDS technology to enhance traditional data processing tasks, which the founder notes are often mistakenly believed to be limited to AI and graphics. AI

IMPACT Enhances data processing efficiency, potentially lowering costs for AI workloads that rely on large datasets.
- CUDA
- Google
- RAPIDS
- SQL
- BigQuery
- NVIDIA
- ParaQuery
- Spark
TOOL · Replit blog English(EN) · 13mo · [2 sources]

Introducing Replit Auth: add secure login to your app

Replit has launched Replit Auth, a new service designed to simplify the integration of user login and management into applications. This feature allows developers to add secure authentication, including social sign-in options, with minimal effort by simply including it in their Replit Agent prompts. Replit Auth leverages existing infrastructure for enterprise-grade security and provides tools for managing user data and accounts directly within the Replit Workspace. AI

IMPACT Simplifies development for AI-powered applications by abstracting away complex authentication processes.
- Clearout
- HackerOne
- Replit
- Replit Auth
- Stytch
- Replit Agent
- Firebase
TOOL · HN — AI startup stories English(EN) · 13mo

Launch HN: Exa (YC S21) – The web as a database

Exa has launched Websets, a new search engine that uses embeddings and agentic workflows to provide precise results from the web, presented in a database-like table format. The service aims to combat the decline in search quality by performing extensive embedding searches and then using LLMs to verify each result against complex queries. While the process can take significant time, Exa believes the accuracy and detailed verification are worth the wait, offering an alternative to traditional keyword-based search. AI

IMPACT Offers a novel approach to web search by leveraging embeddings and LLMs for enhanced accuracy and structured data retrieval.
- LLM
- Exa
- Websets
- Will
- Jeff
- Google
TOOL · Together AI blog English(EN) · 13mo

From AWS to Together Dedicated Endpoints: Arcee AI's journey to greater inference flexibility

Arcee AI has migrated its specialized small language models (SLMs) from AWS to Together Dedicated Endpoints, seeking improved cost, performance, and operational agility. The company focuses on training efficient models under 72 billion parameters for specific tasks like coding and general text generation. Arcee AI also developed Arcee Conductor, an inference routing system that directs queries to the most suitable model, including third-party options like GPT-4.1 and Claude 3.7 Sonnet, to optimize cost and performance. AI

IMPACT Enables more cost-effective deployment of specialized AI models for enterprise tasks.
TOOL · HN — machine learning stories English(EN) · 13mo

OCaml's Wings for Machine Learning

Raven is a new ecosystem of OCaml libraries designed for numerical computing, machine learning, and data science. It aims to provide type-safe alternatives to popular Python libraries such as NumPy, JAX, and PyTorch. The project includes modules for n-dimensional arrays, automatic differentiation, tokenization, neural networks, dataframes, and plotting, with the goal of building a robust scientific computing environment. AI

IMPACT Provides a type-safe alternative for AI development in OCaml, potentially attracting developers seeking stronger guarantees.
- JAX
- Matplotlib
- Jupyter
- MirageOS
- OCaml
- Raven
- NumPy
- PyTorch
TOOL · Hugging Face Blog English(EN) · 13mo · [3 sources]

Five Big Improvements to Gradio MCP Servers

Hugging Face has released significant updates to its Gradio MCP (Model Context Protocol) servers, enhancing their capabilities for LLM deployment. These improvements focus on boosting performance and user experience, allowing developers to add capabilities to their large language models more effectively. The updates include new features and optimizations designed to streamline the process of building and managing MCP servers for LLM applications. AI
TOOL · HN — AI startup stories English(EN) · 13mo

Show HN: Morphik – Open-source RAG that understands PDF images, runs locally

Morphik has launched an open-source Retrieval-Augmented Generation (RAG) system designed for developers to integrate complex context into AI applications. The system aims to simplify the process by offering a unified solution for storing, representing, and searching unstructured and multimodal data, addressing the limitations of traditional RAG pipelines that struggle with visually rich documents. Morphik provides features like multimodal search, fast metadata extraction, and integrations with tools such as Google Suite and Slack, with a free tier available for users. AI

IMPACT Simplifies multimodal data integration for AI applications, potentially reducing development complexity and infrastructure costs.
- Confluence
- Slack
- Morphik
TOOL · HN — AI infrastructure stories English(EN) · 14mo

Show HN: We Put Chromium on a Unikernel (OSS Apache 2.0)

A new open-source project offers sandboxed Chrome browsers that can be run as Docker containers or on Unikraft unikernels. This setup is designed for browser automation, web agents, and testing AI agents that interact with the web. The unikernel implementation provides features like automated standby mode with state snapshotting and extremely fast cold restarts, enabling low-latency event handling. AI

IMPACT Enables developers to build and test AI agents that require controlled browser environments.
- Playwright
- Puppeteer
- Chrome DevTools
- Unikernel
- Kraft CLI
- Unikraft
- Chromium
- Docker
TOOL · Hugging Face Blog English(EN) · 14mo

17 Reasons Why Gradio Isn't Just Another UI Library

Gradio is a Python library designed to simplify the creation of user interfaces for machine learning models. It allows developers to quickly build interactive demos and share them with others. The library offers features like pre-built UI components, easy integration with popular ML frameworks, and the ability to deploy applications with a single command. AI
TOOL · HN — AI startup stories English(EN) · 14mo

Launch HN: mrge.io (YC X25) – Cursor for code review

AI startup mrge has launched a new platform designed to streamline code reviews for development teams. The tool connects to GitHub repositories and uses AI to analyze code changes within a secure, ephemeral sandbox environment. It aims to assist human reviewers by identifying potential bugs and providing context, inspired by productivity tools like Linear and Superhuman. AI

IMPACT Aims to accelerate code merging and reduce bugs by leveraging AI for code review, potentially improving developer productivity.
- GitHub
- mrge.io
- Cursor
- Linear
- Superhuman
- Better Auth
- Cal.com
- n8n
TOOL · HN — AI infrastructure stories English(EN) · 14mo

Show HN: ActorCore – Stateful serverless framework that runs anywhere

ActorCore, an open-source framework for AI agents, has been released, offering stateful serverless execution that aims to be significantly cheaper than existing sandbox solutions. It leverages WebAssembly and V8 isolates for near-zero cold starts and can be deployed across various platforms. The framework supports multiple AI models and provides granular security controls, with options for self-hosting or using a managed cloud service. AI

IMPACT Provides a cheaper and faster infrastructure for running AI agents, potentially lowering operational costs for AI applications.
- Codex
- OpenCode
- Daytona
- Rivet Cloud
- Claude Code
- ActorCore
- WebAssembly
- V8
- Pi
TOOL · HN — AI infrastructure stories English(EN) · 14mo

Show HN: Python at the Speed of Rust

The blog post "Python at the Speed of Rust" introduces a new approach to Python performance by leveraging Rust. It details how to integrate Rust code into Python projects, aiming to achieve significant speedups for computationally intensive tasks. The author demonstrates practical methods for this integration, offering a way to enhance existing Python applications without a complete rewrite. AI

IMPACT Offers a method for developers to significantly accelerate Python code, potentially benefiting AI/ML workloads that rely on Python.
- Python
- Rust
TOOL · Hugging Face Blog English(EN) · 14mo

Hugging Face and Cloudflare Partner to Make Real-Time Speech and Video Seamless with FastRTC

Hugging Face and Cloudflare have announced a partnership to integrate Hugging Face's FastRTC technology with Cloudflare's network. This collaboration aims to enhance real-time communication applications by improving the performance and scalability of speech and video streaming. The integration is expected to provide developers with more robust tools for building seamless interactive experiences. AI
TOOL · HN — machine learning stories English(EN) · 14mo

SeedLM: Compressing LLM Weights into Seeds of Pseudo-Random Generators

Researchers have developed SeedLM, a novel post-training compression technique for large language models that utilizes pseudo-random generator seeds to encode model weights. This method aims to reduce the high runtime costs associated with LLMs by generating weight matrices on-the-fly during inference, thereby decreasing memory access and improving speed for memory-bound tasks. SeedLM achieves this by trading compute for fewer memory accesses and notably does not require calibration data, generalizing well across diverse tasks and maintaining accuracy comparable to FP16 baselines even at significant compression levels. AI
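
To make the idea concrete, here is a minimal sketch of seed-based weight compression in plain Python. It is an illustration, not the SeedLM implementation: the generator (Python's `random` module), the single unquantized coefficient per block, and the function names `basis`, `compress_block`, and `decompress_block` are all assumptions made for this example. Each block of weights is stored as a seed plus a scale factor, and the block is regenerated from the seed at load time.

```python
import random


def basis(seed, n):
    """Regenerate the pseudo-random vector for a seed (deterministic per seed)."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


def compress_block(weights, num_seeds=256):
    """Search seeds for the vector that, scaled by one coefficient, best fits the block.

    Returns (seed, coeff, squared_error). Assumption: one float coefficient per
    block; a real scheme would likely use several quantized coefficients.
    """
    best = None
    for seed in range(num_seeds):
        u = basis(seed, len(weights))
        uu = sum(x * x for x in u)
        # Least-squares scale factor: projection of the weights onto u.
        coeff = sum(w * x for w, x in zip(weights, u)) / uu
        err = sum((w - coeff * x) ** 2 for w, x in zip(weights, u))
        if best is None or err < best[2]:
            best = (seed, coeff, err)
    return best


def decompress_block(seed, coeff, n):
    """Rebuild the approximate block from its seed and coefficient."""
    return [coeff * x for x in basis(seed, n)]


if __name__ == "__main__":
    block = [0.5, -0.25, 0.125, 0.75]
    seed, coeff, err = compress_block(block)
    print("seed:", seed, "coeff:", coeff, "squared error:", err)
    print("reconstructed:", decompress_block(seed, coeff, len(block)))
```

Only the seed and coefficient need to be kept for each block, which is the compute-for-memory trade the summary describes: the weights are recomputed when needed instead of being read from memory.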

IMPACT This compression technique could significantly reduce the deployment costs and increase the inference speed of large language models.
- IEEE Visualization
- Llama 2
- FP16
- Meta
- LLMs
- SeedLM
- Llama3 70B
TOOL · HN — machine learning stories English(EN) · 14mo

Show HN: OCR pipeline for ML training (tables, diagrams, math, multilingual)

A developer is building an OCR pipeline that extracts structured data from complex educational materials for machine learning training. The system supports multilingual text, mathematical formulas, tables, and diagrams, and aims for accuracy of 90-95% or higher on academic datasets. It generates AI-ready outputs in JSON or Markdown, including semantic annotations for visual content, and is built on tools including the Google Vision API and the OpenAI API. The public release has been delayed by the developer's academic commitments but is expected once the system is finalized. AI

IMPACT This tool could streamline the creation of specialized datasets for ML training, particularly in academic and research contexts.
TOOL · Hugging Face Blog English(EN) · 14mo

Journey to 1 Million Gradio Users!

Gradio, a popular open-source Python library for building machine learning interfaces, has surpassed one million users. The platform facilitates the creation of web UIs for AI models, enabling developers to easily share and demo their work. This milestone highlights the growing demand for accessible tools in the AI development community. AI
TOOL · HN — machine learning stories English(EN) · 14mo

Show HN: Hatchet v1 – A task orchestration platform built on Postgres

Hatchet v1, a task orchestration platform, has been released for managing background tasks, AI agents, and durable workflows at scale. It uses Postgres as its durability layer, which is meant to simplify self-hosting, and provides automatic retries, real-time monitoring, and multi-language support. The platform is available as a cloud service or for self-hosting, targeting applications where reliability and scalability are critical. AI

IMPACT Provides a scalable infrastructure for running AI agents and complex workflows.
- Temporal
- Hatchet
- Postgres
- AI agents
- DBOS
TOOL · Hugging Face Blog English(EN) · 14mo

How Hugging Face Scaled Secrets Management for AI Infrastructure

Hugging Face has detailed its approach to managing sensitive information like API keys and credentials across its AI infrastructure. The company implemented a robust secrets management system to ensure security and compliance as its operations grew. This system allows for secure storage, distribution, and rotation of secrets, which is crucial for maintaining the integrity of AI models and services. AI
TOOL · HN — MCP stories English(EN) · 14mo

Show HN: Cursor IDE now remembers your coding prefs using MCP

Daniel from Zep has developed an integration for the Cursor IDE that provides persistent memory across coding sessions. The system uses Zep's open-source Graphiti framework together with the Model Context Protocol (MCP) to store and retrieve user preferences, project specifications, and coding standards. The goal is to let the AI-assisted IDE remember crucial context without constant user input and adapt in real time to changes in frameworks or standards. AI

IMPACT Enhances AI coding assistants by providing persistent memory, potentially improving developer workflow and reducing repetitive context setting.