We Ran 4 Claude Code Dialogs for 28 Hours. Here's What the Memory Layer Caught (and Missed).
A developer ran a 28-hour experiment with four concurrent Claude Code sessions that communicated solely through a shared filesystem. The sessions, named compass, Soul, V5, and nautilus-core, collaborated on tasks such as contract negotiation and error detection. The experiment suggests that filesystem-based contracts work well for agent communication, especially when agents have a stake in the relationship and deadlines are visible in their prompts. A key finding was a large gap between reported and actual test results: behind a reported "22/22 tests passing," 11 of the 22 tests were actually failing because a Python module initialization file was missing.
IMPACT: Demonstrates a novel, low-overhead method for multi-agent AI collaboration and reliability testing.
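
The missing-initialization-file failure described in the summary can be illustrated with a small check. This is a minimal sketch, not the experiment's actual tooling: it assumes the missing file was an `__init__.py` (the summary does not name it), and the helper name `dirs_missing_init` is hypothetical. It lists directories that contain Python source files but no `__init__.py`, which is one way such a gap could be flagged before trusting a reported pass count.

```python
from pathlib import Path


def dirs_missing_init(root):
    """Return directories under root (root included) that contain .py files
    but no __init__.py. Assumption: the project relies on regular packages,
    so a missing __init__.py can make modules or tests unimportable."""
    root = Path(root)
    # Check the root itself, then every subdirectory in a stable order.
    candidates = [root] + sorted(p for p in root.rglob("*") if p.is_dir())
    missing = []
    for directory in candidates:
        has_python = any(
            f.is_file() and f.suffix == ".py" for f in directory.iterdir()
        )
        if has_python and not (directory / "__init__.py").exists():
            missing.append(directory)
    return missing
```

Calling `dirs_missing_init("tests")` on a hypothetical `tests` directory returns a list of `Path` objects to inspect. Whether a missing `__init__.py` actually breaks test collection depends on the test runner and its import configuration, so an empty result does not by itself confirm a reported test count.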