Is there a quant of Granite 30B I can run in 12GB of VRAM/32GB of RAM?
A user on the r/LocalLLaMA subreddit is looking for a quantized version of the Granite 30B model that can run on a system with 12GB of VRAM and 32GB of RAM. The user hopes such a version exists, which points to a wider need for model deployment options that fit on consumer hardware.
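For a rough sense of which quantization levels could fit, a back-of-the-envelope estimate is parameters times bits per weight, divided by 8 to get bytes. The sketch below is only that arithmetic. It assumes exactly 30 billion parameters (taken from the "30B" in the name) and a uniform bit width, and it ignores the KV cache, activations, and runtime overhead, so real files and real memory use will be somewhat larger. The helper names `quant_size_gb` and `placement` are made up for illustration.

```python
def quant_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (10^9 bytes) at a uniform bit width."""
    return n_params * bits_per_weight / 8 / 1e9


def placement(size_gb: float, vram_gb: float, ram_gb: float) -> str:
    """Classify where the weights alone could live (overhead not counted)."""
    if size_gb <= vram_gb:
        return "fits in VRAM"
    if size_gb <= vram_gb + ram_gb:
        return "needs VRAM + RAM offload"
    return "too large"


N_PARAMS = 30e9  # assumption: "30B" means 30 billion parameters
VRAM_GB = 12     # from the question
RAM_GB = 32      # from the question

for bits in (16, 8, 5, 4, 3, 2):
    size = quant_size_gb(N_PARAMS, bits)
    print(f"{bits:>2} bits/weight: ~{size:.2f} GB -> {placement(size, VRAM_GB, RAM_GB)}")
```

Under these assumptions, a 4-bit quant comes to about 15 GB of weights, which is more than 12GB of VRAM but well within VRAM plus system RAM, so it would rely on splitting layers between GPU and CPU.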