Anthropic has a new way to protect large language models against jailbreaks

AI firm Anthropic has developed a new line of defense against a common kind of attack called a jailbreak. A jailbreak tricks large language models (LLMs) into doing something they have been trained not to do, such as helping somebody create a weapon. Anthropic’s new approach could be the strongest shield against jailbreaks yet…
