This article details methods for automating the web interfaces of large language models such as ChatGPT and Gemini without API keys. The author explains the trade-off: the web UI is free but manual, whereas APIs are scriptable but cost money. The guide uses Selenium with undetected-chromedriver to enter text programmatically, handle special characters and newlines, and upload files. It highlights specific challenges, such as contenteditable divs and custom textareas, and describes a workaround for Gemini's file upload mechanism that intercepts the browser's click event.
IMPACT Enables automated workflows for users who want to leverage LLMs without incurring API costs.
RANK_REASON Article describes a technical method for using existing tools to automate web interfaces, rather than a new product or release.
AI-generated summary · Google Gemini · from 1 source.