Anthropic Redeploys Claude Fable 5 with Cybersecurity Safeguards and Jailbreak Framework
AI Safety | Fri Jul 03 2026 | 1 source
Anthropic redeployed Claude Fable 5 globally alongside a cybersecurity classifier and a draft AI jailbreak severity framework.
Analysis
[Anthropic] redeployed Claude Fable 5 globally [1]
- relaunched to all users worldwide
- deployed alongside a cybersecurity safety classifier
- the classifier detects and blocks dangerous cybersecurity use
[Fable 5 Safety Classifier] introduced a four-tier cybersecurity dual-use classification system [1]
- Prohibited use is blocked
- High-risk dual use is blocked
- Low-risk dual use is allowed for defensive purposes
- Clearly defensive use is allowed
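The four tiers above amount to a small lookup table from tier to outcome. A minimal Python sketch of that mapping follows. The names `Tier`, `POLICY`, `is_allowed`, and `summarize` are hypothetical and do not come from Anthropic. The source does not describe how the real classifier assigns a request to a tier, so that step is left out.

```python
from enum import Enum


class Tier(Enum):
    """The four tiers listed in the summary (labels are illustrative)."""
    PROHIBITED = "prohibited use"
    HIGH_RISK_DUAL_USE = "high-risk dual use"
    LOW_RISK_DUAL_USE = "low-risk dual use"
    CLEARLY_DEFENSIVE = "clearly defensive use"


# Tier -> allowed? taken directly from the four bullets.
# Assumption: "allowed for defensive purposes" is modeled as plain "allowed".
POLICY = {
    Tier.PROHIBITED: False,
    Tier.HIGH_RISK_DUAL_USE: False,
    Tier.LOW_RISK_DUAL_USE: True,
    Tier.CLEARLY_DEFENSIVE: True,
}


def is_allowed(tier: Tier) -> bool:
    """Return True if a request in this tier would be allowed."""
    return POLICY[tier]


def summarize() -> list[str]:
    """Render one 'tier: outcome' line per tier, in declaration order."""
    return [
        f"{tier.value}: {'allowed' if is_allowed(tier) else 'blocked'}"
        for tier in Tier
    ]
```

Calling `summarize()` returns one line per tier, which makes it easy to see that the block/allow boundary falls between high-risk and low-risk dual use.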
[AI Jailbreak Severity Framework] released a draft jailbreak severity framework co-developed with Glasswing [1]
- describes jailbreak risk levels in standardized terminology
- supports consistent communication between AI developers and governments
- aims to foster discussion across academia, industry, civil society, and government
- feedback accepted at [email protected]
[HackerOne Program] launched a cyber jailbreak reporting program for Fable 5 [1]
- allows security researchers to report potential cyber jailbreaks
- Anthropic reviews and responds to submissions
- aims to establish standards for enabling defensive use while preventing misuse