Resk-Security has released resk-logits, an open-source Python library designed to prevent Large Language Model (LLM) jailbreaks by filtering at the logits layer. Instead of scanning output after it has been generated, as traditional methods do, this approach blocks potentially harmful tokens before they are sampled. The library uses a GPU-accelerated Aho–Corasick algorithm to match against more than 10,000 disallowed patterns in under a millisecond.
AI IMPACT Provides a faster and more robust method for LLM safety by filtering at the logits layer, potentially improving security against jailbreaks.
RANK_REASON Release of a new open-source library for LLM safety.
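To make the idea of logits-layer filtering concrete, here is a minimal sketch of the general technique. It is not the resk-logits API: the names `TokenAhoCorasick`, `mask_banned`, and `greedy_decode` are hypothetical, patterns are assumed to be sequences of token IDs, and everything runs in pure Python on the CPU rather than on a GPU. The automaton tracks how much of any disallowed pattern the output so far has matched, and any token that would complete a pattern has its logit set to negative infinity before the next token is chosen.

```python
import math
from collections import deque


class TokenAhoCorasick:
    """Aho-Corasick automaton over token-ID sequences (illustrative only)."""

    def __init__(self, patterns):
        self.goto = [{}]          # per-state transitions: token -> next state
        self.fail = [0]           # failure links
        self.terminal = [False]   # True if reaching this state completes a pattern
        for pattern in patterns:
            if pattern:           # ignore empty patterns
                self._insert(pattern)
        self._build_failure_links()

    def _insert(self, pattern):
        state = 0
        for tok in pattern:
            if tok not in self.goto[state]:
                self.goto[state][tok] = len(self.goto)
                self.goto.append({})
                self.fail.append(0)
                self.terminal.append(False)
            state = self.goto[state][tok]
        self.terminal[state] = True

    def _build_failure_links(self):
        # Breadth-first traversal; depth-1 states keep the root as failure link.
        queue = deque(self.goto[0].values())
        while queue:
            state = queue.popleft()
            for tok, child in self.goto[state].items():
                f = self.fail[state]
                while f and tok not in self.goto[f]:
                    f = self.fail[f]
                self.fail[child] = self.goto[f].get(tok, 0)
                # A state is also terminal if a pattern ends at its suffix state.
                if self.terminal[self.fail[child]]:
                    self.terminal[child] = True
                queue.append(child)

    def step(self, state, tok):
        """Return the state reached from `state` after reading `tok`."""
        while state and tok not in self.goto[state]:
            state = self.fail[state]
        return self.goto[state].get(tok, 0)


def mask_banned(logits, state, automaton):
    """Copy `logits`, setting to -inf every token that would complete a pattern."""
    masked = list(logits)
    for tok in range(len(masked)):
        if automaton.terminal[automaton.step(state, tok)]:
            masked[tok] = -math.inf
    return masked


def greedy_decode(step_logits, automaton):
    """Pick the highest-scoring allowed token at each step."""
    state, output = 0, []
    for logits in step_logits:
        masked = mask_banned(logits, state, automaton)
        tok = max(range(len(masked)), key=masked.__getitem__)
        output.append(tok)
        state = automaton.step(state, tok)
    return output
```

This sketch checks every vocabulary entry in a Python loop at each step, which would be far too slow for a real vocabulary; a production implementation would presumably vectorize the lookup on the GPU. It also does not handle the case where every token is masked.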
- Aho–Corasick algorithm
- CUDA
- Hugging Face
- LLM
- mistralai/Mistral-7B-v0.1
- PyTorch
- resk-logits
- Resk-Security
- RTX 4090
AI-generated summary · Google Gemini · from 1 source.