Why A.I. Safety Controls Are Not Very Effective
When companies like Anthropic, Google and OpenAI build their artificial intelligence systems, they spend months adding ways to prevent people from using their technology to spread disinformation, build weapons or hack into computer networks. But recently, researchers in Italy discovered that they could break through these protections with poetry. They used poetic language to trick 31 A.I. systems into ignoring internal safety controls. When they began a prompt with elaborate verse and metaphor