Persuading AI to Comply with Objectionable Requests

Lennart Meincke and colleagues from the Wharton School published an interesting paper titled “Call Me A Jerk: Persuading AI to Comply with Objectionable Requests.” They tested whether Large Language Models (LLMs) could be tricked by common human persuasion techniques such as authority or liking.

Because the leading LLMs have been trained on nearly everything that has been published, their internal “probabilities” may be influenced by these techniques. Indeed, many literary works and movies rely on them, as do many articles on the web.

Unsurprisingly, LLMs are influenced by these techniques.

It is an easy-to-read paper. Highly recommended.

Meincke, Lennart, Dan Shapiro, Angela Duckworth, Ethan R. Mollick, Lilach Mollick, and Robert Cialdini. “Call Me A Jerk: Persuading AI to Comply with Objectionable Requests.” SSRN Scholarly Paper No. 5357179. Social Science Research Network, July 18, 2025. https://doi.org/10.2139/ssrn.5357179.

The blog of content protection

A site managed by Eric Diehl
