
Lawrence Chan


I'd personally like to see more well-thought-out 1) AI governance projects and 2) longtermist community building projects that are more about strengthening the existing community as opposed to mass recruitment.

Newbie fund manager here, but:

I strongly agree that governance work along these lines is very important; in fact, I'm currently working on governance full time instead of technical alignment research. 

Needless to say, I would be interested in funding work that aims to buy time for alignment research. For example, I did indeed fund this kind of AI governance work in the Lightspeed Grants S-process. But since LTFF doesn't currently do much, if any, active solicitation of grants, we're ultimately bottlenecked by the applications we receive.

I think it's very impressive! It's worth noting that Cicero won a small-scale press Diplomacy tournament in the past, playing under the name Franz Broseph: https://www.thenadf.org/tournament/captain-meme-runs-first-blitzcon/. There's also commentated footage of a human vs. all-Cicero-bots game here:


That being said, they built quite a complicated, specialized AI system (i.e., they did not take an LLM and finetune it into a generalist agent that can also play Diplomacy); see the sketch after this list:

  • First, they train a dialogue-conditional action model by behavioral cloning on human data to predict what other players will do.
  • Then they do joint RL planning to get action intentions for the AI and the other players, using the outputs of the conditional action model and a learned dialogue-free value model. (They also regularize this plan with a KL penalty toward the action model's output.)
  • They also train a conditional dialogue model by finetuning a small LM (a 2.7b BART) to map intents + game history → messages.
  • They train a set of filters to remove hallucinations, inconsistencies, toxicity, etc. from the output messages before sending them to other players.
  • The intents are updated after every message. At the end of each turn, they output the final intent as the action.
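
To make the control flow concrete, here's a minimal toy sketch of how these pieces could fit together. This is my own illustrative reconstruction, not Meta's code: all names are made up, and random stubs stand in for the trained networks. It's only meant to show the shape of the pipeline: plan against the BC policy with a KL penalty, generate a message conditioned on the current intent, filter it, and repeat until the final intent is output as the turn's action.

```python
import numpy as np

# Toy sketch of a Cicero-style pipeline. All names are hypothetical
# stand-ins; random stubs replace the trained models.

rng = np.random.default_rng(0)
N_ACTIONS = 5  # toy action space

def bc_action_model(dialogue, game_state):
    """Dialogue-conditional action model (behavioral cloning on human
    games): returns a distribution over actions. Stub: random logits."""
    logits = rng.normal(size=N_ACTIONS)
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()

def value_model(game_state, action):
    """Dialogue-free learned value model used during planning. Stub."""
    return rng.normal()

def kl_regularized_plan(dialogue, game_state, lam=1.0):
    """Planning step: maximize expected value minus a KL penalty toward
    the BC policy (the regularization described in the list above).
    Uses the closed form p ∝ anchor · exp(q / lam)."""
    anchor = bc_action_model(dialogue, game_state)
    q = np.array([value_model(game_state, a) for a in range(N_ACTIONS)])
    logits = np.log(anchor) + q / lam
    policy = np.exp(logits - logits.max())
    return policy / policy.sum()

def dialogue_model(intent, history):
    """Finetuned small LM mapping intent + game history -> message. Stub."""
    return f"[message proposing joint action {intent}]"

def passes_filters(message):
    """Stack of filters (hallucination, inconsistency, toxicity, ...).
    Trivial stand-in check here."""
    return "[" in message

def play_turn(game_state, history, n_exchanges=3):
    dialogue = list(history)
    intent = None
    for _ in range(n_exchanges):
        # Re-plan after every message, so intents stay up to date.
        policy = kl_regularized_plan(dialogue, game_state)
        intent = int(np.argmax(policy))
        msg = dialogue_model(intent, dialogue)
        if passes_filters(msg):
            dialogue.append(msg)
    # At the end of the turn, the final intent becomes the action.
    return intent

print(play_turn(game_state={}, history=[]))
```

Note that the planning step here uses the simple closed-form solution for a single player's KL-regularized best response; the real system does iterated joint planning over all players' policies, which I'm eliding.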

I do expect someone to figure out how to avoid all these dongles and do it with a more generalist model in the next year or two, though. 



I think people who are freaking out about Cicero more so than about foundation-model scaling/prompting progress are wrong; this is not much of an update on AI capabilities, nor an update on Meta's plans (they had been publicly working on Diplomacy for over a year). I don't think they introduce any new techniques in this paper either?

It is an update upwards on the competence of this team at Meta, a slight update upwards on the capabilities of small LMs, and probably an update upwards on the amount of hype and interest in AI.

But yes, this is the sort of thing that you'd see more of in short timelines rather than long.