JSON string errors caused by 4-bit quant or KV Cachce quant?
Users on the r/LocalLLaMA subreddit are encountering JSON parsing errors, specifically "syntax error while parsing value - invalid string: missing closing quote; last read." This issue appears to be linked to the context size growing large, potentially during extended coding sessions. The errors are suspected to be caused by 4-bit quantization or KV cache quantization methods. AI
IMPACT Potential usability issue for local LLM deployments.