
Crossposted to LessWrong.

This is the second post in this sequence and covers Conjecture. We recommend reading our brief introduction to the sequence for added context on our motivations, who we are, and our overarching views on alignment research.

Conjecture is a for-profit alignment startup founded in late 2021 by Connor Leahy, Sid Black and Gabriel Alfour, which aims to scale applied alignment research. Based in London, Conjecture has received $10 million in funding from venture capitalists (VCs), and recruits heavily from the EA movement. 

We shared a draft of this document with Conjecture for feedback prior to publication (and include their response below). We also requested feedback on a draft from a small group of experienced alignment researchers from various organizations, and have invited them to share their views in the comments of this post. We'd like to invite others to share their thoughts in the comments, or anonymously via this form.

Key Takeaways

For those with limited knowledge and context on Conjecture, we recommend first reading or skimming the About Conjecture section. 

Time to read the core sections (Criticisms & Suggestions and Our views on Conjecture) is 22 minutes. 

Criticisms and Suggestions

  • We think Conjecture’s research is low quality (read more). 
    • Their posts don't always make their assumptions clear or state what evidence base they have for a given hypothesis, and the evidence they do present is frequently cherry-picked. We also think their bar for publishing is too low, which decreases the signal-to-noise ratio. Conjecture has acknowledged some of these criticisms, but not all (read more).
    • We make specific critiques of examples of their research from their initial research agenda (read more).
    • There is limited information available on their new research direction (cognitive emulation), but from the publicly available information it appears extremely challenging and so we are skeptical as to its tractability (read more).
  • We have some concerns with the CEO’s character and trustworthiness because, in order of importance (read more):
    • The CEO and Conjecture have misrepresented themselves to external parties multiple times (read more);
    • The CEO’s involvement in EleutherAI and Stability AI has contributed to race dynamics (read more);
    • The CEO previously overstated his accomplishments in 2019 (when an undergrad) (read more);
    • The CEO has been inconsistent over time regarding his position on releasing LLMs (read more).
  • We believe Conjecture has scaled too quickly before demonstrating they have promising research results, and believe this will make it harder for them to pivot in the future (read more).
  • We are concerned that Conjecture does not have a clear plan for balancing profit and safety motives (read more).
  • Conjecture has had limited meaningful engagement with external actors (read more):
    • Conjecture lacks productive communication with external actors within the TAIS community, often reacting defensively to negative feedback and failing to address core points (read more);
    • Conjecture has not engaged sufficiently with the broader ML community; we think they would receive valuable feedback by engaging more. We’ve written more about this previously (read more).

Our views on Conjecture

  • We would generally recommend working at most other AI safety organizations over Conjecture, given their history of low-quality research, the leadership team’s lack of research experience (and thus capacity for mentorship), and concerns with the CEO’s character and trustworthiness (read more). [1]
  • We would advise Conjecture to avoid unilateral engagement with important stakeholders and strive to represent their place in the TAIS ecosystem accurately because they have misrepresented themselves multiple times (read more).
  • We do not think that Conjecture should receive additional funding before addressing key concerns because of the reasons cited above (read more).
  • We encourage TAIS and EA community members and organizations to reflect on the extent to which they want to legitimize Conjecture until these concerns are addressed (read more).

About Conjecture

Funding

Conjecture received (primarily via commercial investment) roughly $10 million in 2022. According to them, they’ve received VC backing from Nat Friedman (ex-CEO of GitHub), Patrick and John Collison (co-founders of Stripe), Daniel Gross (investor and cofounder of a startup accelerator), Andrej Karpathy (ex-OpenAI), Sam Bankman-Fried, Arthur Breitman and others. We are not aware of any later funding rounds, but it’s possible they have raised more since then.

Outputs

Products

Verbalize is an automatic transcription model. This is a B2C SaaS product and was released in early 2023. Our impression is that it's easy to use but no more powerful than existing open-source models like Whisper, although we are not aware of any detailed empirical evaluation. We do not think the product has seen commercial success yet, as it was released recently. Our estimate is that about one third of Conjecture’s team are actively working on developing products.

Alignment Research

Conjecture studies large language models (LLMs), with a focus on empirical and conceptual work. Mechanistic interpretability was a particular focus, with outputs such as the polytope lens, sparse autoencoders and an analysis of the SVD of weight matrices, as well as work more broadly seeking to better understand LLMs, such as simulator theory.

They have recently pivoted away from this agenda towards cognitive emulation, which is reminiscent of process-based supervision. Here is a link to their full research agenda and publication list. Due to their infohazard policy (see below), some of their research may not have been publicly released.

Infohazard policy

Conjecture developed an infohazard policy in their first few months and shared it publicly to encourage other organizations to publish or adopt similar policies. They say that while many actors were “verbally supportive of the policy, no other organization has publicly committed to a similar policy”.

Governance outreach

We understand that CEO Connor Leahy does a lot of outreach to policymakers in the UK and to capabilities researchers at other prominent AI companies. He’s also appeared on several podcasts (1, FLI (1,2,3,4), 3, 4, 5) and been interviewed by several journalists (1, 2, 3, 4, 5, 6, 7, 8).

Incubator Program

Adam Shimi ran an incubator called Refine in 2022, whose purpose was to create new independent conceptual researchers and help them build original research agendas. Based on Adam’s retrospective, it seems like this project wasn’t successful at achieving its goals and Adam is now pursuing different projects.

Team

The Conjecture team started with 4 employees in late 2021 and has grown to at least 22 employees now (according to their LinkedIn), with most employees joining in 2022.

Their CEO, Connor Leahy, has a technical background (with 2 years of professional machine learning experience and a Computer Science undergraduate degree) and partially replicated GPT-2 in 2019 (discussed in more detail below). Their Chief of Staff has experience with staffing and building team culture from her time at McKinsey, and similar experience at Meta. Their co-founder Gabriel Alfour has the most relevant technical and scaling experience as the CEO of Marigold,[2] a firm performing core development on the Tezos cryptocurrency infrastructure with over 30 staff members.

Two individuals collectively publishing under the pseudonym janus published simulator theory, one of Conjecture's outputs that we understand the TAIS community to have been most favorable towards. They left Conjecture in late 2022. More recently, many researchers working on mechanistic interpretability left the team after Conjecture's pivot towards cognitive emulation. Those departing include Lee Sharkey, the lead author on the sparse autoencoders post and a contributor to the polytope lens post.

Conjecture in the TAIS ecosystem

Conjecture staff are frequent contributors on the Alignment Forum and recruit heavily from the EA movement. Their CEO has appeared on a few EA podcasts (including several times on the FLI podcast). Some TAIS researchers are positive about their work. They fiscally sponsor two TAIS field-building programs, MATS and ARENA, in London (where they are based).

Their team also spent a month in the Bay Area in 2022 (when many TAIS researchers were visiting through programs like MLAB, SERI MATS and on independent grants). Conjecture made an effort to build relationships with researchers, decisionmakers and grantmakers, and were actively fundraising from EA funders during this period. 3-4 Conjecture staff regularly worked out of the Lightcone Offices, with a peak of ~11 staff on a single day. The largest event run by Conjecture was an EA Global afterparty hosted at a Lightcone venue, with a couple hundred attendees, predominantly TAIS researchers. 

Criticisms and Suggestions

Low quality research

General thoughts on Conjecture’s research

We believe most of Conjecture’s publicly available research to date is low quality. It is hard to find an accurate reference class for Conjecture’s work, as members have prioritized releasing small, regular updates. We think the bar of a workshop research paper is appropriate: it has a lower bar for novelty than a conference paper while still requiring original technical work. We don’t think Conjecture’s research (combined) would meet this bar.[3]

As we discuss below, Conjecture does not present their research findings in a systematic way that would make it accessible for others to review and critique. Conjecture’s work often consists of isolated observations that are not built upon or adequately tested in other settings. 

Our suggestions: We recommend Conjecture focus more on developing empirically testable theories, and also suggest they introduce an internal peer-review process to evaluate the rigor of work prior to publicly disseminating their results. Conjecture might also benefit from having researchers and reviewers work through (although not rigidly stick to) the Machine Learning Reproducibility Checklist.

The team's lack of prior track record or experience in alignment and ML research[4]

These limitations may in part be because Conjecture is a young organization with a relatively inexperienced research team, a point they have readily acknowledged in retrospectives and when criticized on research quality. Conjecture's leadership has a relatively limited alignment research track record. By contrast, at an organization like ARC, Paul Christiano had a clear track record of producing useful conceptual insights (e.g. Iterated Distillation and Amplification) as well as practical advances (e.g. Deep RL From Human Preferences) prior to ARC’s founding.[5] We're not aware of any equally significant advances from any key staff members at Conjecture (including those who have left).

However, taking their youth and inexperience into account, we still think their research is below the bar for funding or other significant support. When we take into account the funding that Conjecture has (at least $10M raised in their last round), we think they are significantly underperforming standard academic research labs (see our discussion on this in the Redwood post; we are significantly more excited about Redwood’s research than Conjecture’s).

Our suggestions: We believe they could significantly improve their research output by seeking out mentorship from more experienced ML or alignment researchers, and recommend they do this in the future.

Initial research agenda (March 2022 - Nov 2022)

Conjecture’s initial research agenda focused on interpretability, conceptual alignment and epistemology. Based on feedback from Conjecture, they are much more excited about their new research direction in cognitive emulation (discussed in the following section). However, as an organization's past track record is one of the best predictors of their future impact, we believe it is important to understand Conjecture's previous approach.

Our understanding is that Conjecture was pursuing a hits-based strategy. In general, we are excited by hits-based approaches. Even if they don't succeed, rigorous negative results can save future researchers from going down dead-ends. We've generally not found their preliminary findings to significantly update our views, although some researchers have found those results useful.[6]

To Conjecture's credit, they acknowledged a number of mistakes in their retrospective. For example, they note that their simulators post was overinvested in, and "more experienced alignment researchers who have already developed their own deep intuitions about GPT-like models didn’t find the framing helpful." However, there are several issues we identify (such as lack of rigor) that are not discussed in the retrospective. There are also issues discussed in the retrospective where Conjecture leadership comes to the opposite conclusion to us: for example, Conjecture writes that they "overinvested in legibility and polish" whereas we found many of their posts to be difficult to understand and evaluate.

We believe three representative posts, which Conjecture leadership were excited by as of Q3 2022, were: janus’s post on simulators, Sid and Lee's post on polytopes, and their infohazard policy. These accomplishments were also highlighted in their retrospective. Although we find these posts to have some merit, we would overall assess them as having limited impact. Concretely, we would evaluate Redwood's Indirect Object Identification or Causal Scrubbing papers as both more novel and scientifically rigorous. We discuss the infohazard policy, simulators and polytopes posts in turn below.

Their infohazard policy is a fairly standard approach to siloing research, and is analogous to structures common in hedge funds or classified research projects. It may be positive for Conjecture to have adopted such a policy (although it introduces risks of concentrating power in the CEO, discussed in the next section), but it does not provide any particular demonstration of research capability.

The simulators and polytopes posts are both at an exploratory stage, with limited empirical evidence and unclear hypotheses. Compared to similar exploratory work (e.g. from the Alignment Research Center), we think Conjecture doesn’t make their assumptions clear enough and has too low a bar for sharing, reducing the signal-to-noise ratio and diluting standards in the field. When they do provide evidence, it appears to be cherry-picked.

Their posts also do not clearly state the degree of belief they have in different hypotheses. Based on private conversations with Conjecture staff, they often appear very confident in their views and the results of their research despite relatively weak evidence. In the simulators post, for example, they describe sufficiently large LLMs as converging to simulators capable of simulating “simulacra”: different generative processes that are consistent with the prompt. The post ends with speculative claims, stated fairly confidently, that take the framing to an extreme (e.g. that if the AI system adopts the “superintelligent AI persona” it will actually be superintelligent).

We think the framing was overall helpful, especially to those newer to the field, although it can also sometimes be confusing: see e.g. these critiques. The framing had limited novelty: our anecdotal impression is that most researchers working on language model alignment were already thinking along similar lines. The more speculative beliefs stated in the post are novel and significant if true, but the post does not present any rigorous argument or empirical evidence to support them. We believe it’s fine to start out with exploratory work that looks more like an op-ed, but at some point you need to submit your conjectures to theoretical or empirical tests. 

Our suggestions: We encourage Conjecture to explicitly state their confidence levels in written output and make clear what evidence base they do or do not have for a given hypothesis (e.g. conceptual argument, theoretical result, empirical evidence).

New research agenda (Nov 22 - Present)

Conjecture now has a new research direction exploring cognitive emulation. The goal is to produce bounded agents that emulate human-like thought processes, rather than agents that produce good output but for alien reasons. However, it’s hard to evaluate this research direction as they are withholding details of their plan due to their infohazard policy. Several commenters have asked questions about the proposal, including requests for a concrete research path, the strategic assumptions behind the agenda, and more details to help readers evaluate the agenda’s viability. Conjecture has so far not addressed those comments.[7]

On the face of it, this project is incredibly ambitious, and will require huge amounts of effort and talent. Because of this, details on how they will execute the project are important to understanding how promising this project may be. 

Our suggestions: We encourage Conjecture to share more technical detail unless there are concrete infohazards they are concerned about; in that case, we would suggest sharing details with a small pool of trusted TAIS researchers for external evaluation.

CEO’s character and trustworthiness

We are concerned by the character and trustworthiness of Conjecture's CEO, Connor Leahy. Connor has demonstrated a lack of attention to rigor and has engaged in risky behavior, and he, along with other staff, has demonstrated an unwillingness to take external feedback.

Connor is clearly a highly driven individual, who has built a medium-sized organization in his early twenties. He has shown a willingness to engage with arguments and change his mind on safety concerns, for example delaying the release of his GPT-2 replication. Moreover, in recent years Connor has been a vocal public advocate for safety: although we disagree in some cases with the framing of the resulting media articles, in general we are excited to see greater public awareness of safety risks.[8]

The character of an organization’s founder and CEO is always an important consideration, especially for early-stage companies. We believe this consideration is particularly strong in the case of Conjecture:

  1. Conjecture engages in governance outreach that involves building relationships between government actors and the TAIS community, and there are multiple accounts of Conjecture misrepresenting themselves.
  2. As the primary stakeholder & CEO, Connor will be responsible for balancing incentives to develop capabilities from stakeholders (see below). 
  3. Conjecture's infohazard policy has the consequence of heavily centralizing power in the CEO (even more so than at a typical tech company). The policy mandates that projects are siloed, and staff may be unaware of the details (or even the existence) of significant fractions of Conjecture's work. The CEO is Conjecture's "appointed infohazard coordinator" with "access to all secrets and private projects" – and thus is the only person with full visibility. This could substantially reduce staff's ability to evaluate Conjecture's strategy and provide feedback internally. Additionally, without full information, staff may not know if Conjecture is contributing to AI risk.[9] We are uncertain to what degree this is a problem given Conjecture's current level of internal secrecy.

Conjecture and their CEO misrepresent themselves to various parties

We are generally worried that Connor will tell the story that he expects the recipient to find most compelling, making it challenging to confidently predict his and Conjecture's behavior. We have heard credible complaints of this from their interactions with funders. One experienced technical AI safety researcher recalled Connor saying that he would tell investors that they are very interested in making products, whereas the predominant focus of the company is on AI safety.

We have heard that Conjecture misrepresent themselves in engagements with government, presenting themselves as experts with stature in the AIS community when in reality they do not have this standing. We have heard reports that Conjecture's policy outreach is decreasing goodwill with policymakers. We think there is a reasonable risk that Connor and Conjecture’s actions may be unilateralist and prevent important relationships from being formed by other actors in the future.

Unfortunately we are unable to give further details about these incidents as our sources have requested confidentiality; we understand this may be frustrating and acknowledge it is difficult for Conjecture to substantively address these concerns. We encourage individuals to talk to others in this space to draw their own conclusions about Conjecture's impact here.[10] 

Our suggestions: We recommend Connor be more honest and transparent about his beliefs, plans and Conjecture’s role in the TAIS ecosystem. We also recommend that Conjecture introduce a strong, robust governance structure. For example, they could change their corporate charter to implement a "springing governance" structure such that voting power (but not financial equity) shifts to an independent board once they cross a certain valuation threshold.[11] (see below).

Contributions to race dynamics

We believe that Connor Leahy has contributed to increasing race dynamics and accelerating capabilities research by founding EleutherAI, which in turn supported the creation of Stability AI. EleutherAI is a community research group focused on open-source AI research, founded in 2020. Under Connor's leadership, their plan was to build and release large open-source models to allow more people to work on important TAIS research that is only possible on pretrained LLMs. At the time, several members of the TAIS community, including Dan Hendrycks (founder of CAIS), privately warned Connor and EleutherAI that it would be hard to control an open-source collective.

Stability AI

Stability AI brands themselves as an AGI lab and has raised $100M to fund research into and training of large, state-of-the-art models including Stable Diffusion.[12] The addition of another AGI focused lab is likely to further exacerbate race dynamics. Stability is currently releasing the majority of the work they create as open-source: this has some benefits, enabling a broader range of researchers (including alignment researchers) to study these models. However, it also has significant drawbacks, such as making potential moratoriums on capabilities research much harder (if not impossible) to enforce. To our knowledge, Stability AI has not done much algorithmic advancement yet.

EleutherAI was pivotal in the creation of Stability AI. Our understanding is that the founder of Stability AI, Emad Mostaque, was active on the EleutherAI Discord and recruited much of his initial team from there. On the research side, Stability AI credited EleutherAI as supporting the initial version of Stable Diffusion in August 2022, as well as their most recent open-source language model release StableLM in April 2023. Emad (in Feb 2023) described the situation as: “Eleuther basically split into two. Part of it is Stability and the people who work here on capabilities. The other part is Conjecture that does specific work on alignment, and they're also based here in London.”

Stability AI continues to provide much of EleutherAI’s compute and is a sponsor of EleutherAI, alongside Nat Friedman (who also invested in Conjecture). Legally, Stability AI directly employed key staff of EleutherAI in a relationship we believe was similar to fiscal sponsorship. We understand that EleutherAI have recently transitioned to employing staff directly via their own non-profit entity (Connor and Emad sit on the board).

EleutherAI

EleutherAI is notable for having developed open-source LLMs such as GPT-NeoX. In the announcement post in February 2022, they claimed that "GPT-NeoX-20B is, to our knowledge, the largest publicly accessible pretrained general-purpose autoregressive language model, and we expect it to perform well on many tasks."

We do not think that there was much meaningful alignment output from EleutherAI itself during Connor’s tenure – most of the research published is capabilities research, and the published alignment research is of mixed quality. On the positive side, EleutherAI’s open-source models have enabled some valuable safety research. For example, GPT-J was used in the ROME paper and is widely used in Jacob Steinhardt’s lab. EleutherAI is also developing a team focusing on interpretability, their initial work includes developing the tuned lens in a collaboration with FAR AI and academics from Boston and Toronto.

Connor’s founding and management of EleutherAI indicates to us that he was overly optimistic that rapidly growing a community of people interested in language models and attracting industry sponsorship would translate into meaningful alignment research. We see EleutherAI as having mostly failed at its AI safety goals, and as instead having accelerated capabilities via its role in creating Stability AI and Stable Diffusion.

In particular, EleutherAI's supporters were primarily interested in gaining access to state-of-the-art LLM capabilities with limited interest in safety. For example, the company Coreweave provided EleutherAI with compute and then used their models to sell a LM inference API called GooseAI. We conjecture that the incentive to please their sponsors, enabling further scale-up, may have contributed to EleutherAI's limited safety output.

We feel more positively about Conjecture than early-stage EleutherAI given Conjecture's explicit alignment research focus, but are concerned that Connor appears to be bringing a very similar strategy to Conjecture as to EleutherAI: scaling before producing tangible alignment research progress, and attracting investment from external actors (primarily VC investors) whose opposing incentives Conjecture may not be able to withstand. We would encourage Conjecture to share a clear theory of change that includes safeguards against these risks.

To be clear, we think Conjecture's contribution to race dynamics is far less than that of OpenAI or Anthropic, both of which have received funding and attracted talent from the EA ecosystem. We would assess OpenAI as being extremely harmful for the world. We are uncertain on Anthropic: they have undoubtedly contributed to race dynamics (albeit less so than OpenAI), but have also produced substantial safety research. We will discuss Anthropic further in an upcoming post, but in either case we do not think that AGI companies pushing forward capabilities should exempt Conjecture or other organizations from criticisms.

Overstatement of accomplishments and lack of attention to precision

In June 2019, Connor claimed to have replicated GPT-2 while he was an undergraduate. However, his results were inaccurate and his 1.5B parameter model was weaker than even the smallest GPT-2 series model.[13] He later admitted to these mistakes, explaining that his metric code was flawed and that he commingled training and evaluation datasets. Additionally, he said that he didn’t evaluate the strength of his final model, only one halfway through training. He said the reason he did this was because “I got cold feet once I realized what I was sitting on [something potentially impressive] and acted rashly.”[14] We think this points to a general lack of care about making true and accurate claims.

We don’t want to unfairly hold people’s mistakes from their college days against them – many people exaggerate or overestimate (intentionally or not) their own accomplishments. Even a partial replica of GPT-2 is an impressive technical accomplishment for an undergraduate, so this project does attest to Connor's technical abilities. It is also positive that he admitted his mistake publicly. However, overall we do believe the project demonstrates a lack of attention to detail and rigor. Moreover, we haven’t seen signs that his behavior has dramatically changed.

Inconsistency over time regarding releasing LLMs

Connor has changed his stance more than once regarding whether to publicly release LLMs. Given this, it is difficult to be confident that Conjecture's current approach of defaulting to secrecy will persist over time.

In July 2019, Connor released the source code used to train his replica along with pretrained models comparable in size to the already released GPT-2 117M and 345M models. The release of the training code seems hasty, enabling actors with sufficient compute but limited engineering skills to train their own, potentially superior, models. At this point, Connor was planning to release the full 1.5B parameter model to the public, but was persuaded not to.[15] In the end, he delayed releasing the model until Nov 13 2019, a week after OpenAI released their 1.5B parameter version, on his personal GitHub.

In June 2021, Connor changed his mind and, as part of the team at EleutherAI (see discussion above), argued that releasing large language models would be beneficial to alignment. In Feb 2022, EleutherAI released an open-source 20B parameter model, GPT-NeoX. Their stated goal, endorsed by Connor in several places, was to "train a model comparable to the biggest GPT-3 (175 billion parameters)" and release it publicly. Regarding the potential harm of releasing models, we find Connor's arguments plausible – whether releasing open-source models closer to the state-of-the-art is beneficial or not remains a contested point. However, we are confident that sufficiently capable models should not be open-sourced, and expect strongly pro-open-source messaging to be counterproductive. We think EleutherAI made an unforced error by not at least making some gesture towards publication norms (e.g. they could have pursued a staggered release giving early access to vetted researchers).

In July 2022, Connor shared Conjecture’s Infohazard Policy. This policy is amongst the most restrictive at any AI company – even more restrictive than what we would advocate for. To the best of our knowledge, Conjecture's Infohazard Policy is an internal policy that can be overturned by Connor (acting as chief executive), or by a majority of their owners (of whom Connor as a founder will have a significant stake). Thus we are hesitant to rely on Conjecture’s Infohazard Policy remaining strictly enforced, especially if subject to commercial pressures.

Scaling too quickly

We think Conjecture has grown too quickly, from 0 to at least 22 staff between 2021 and 2022. During this time, they have not had what we would consider to be insightful or promising outputs, making them analogous to a very early-stage start-up. This is a missed opportunity: their founding team and early employees include some talented individuals who, given time and the right feedback, might well have been able to identify a promising approach.

We believe that Conjecture’s basic theory of change for scaling is:

1) they’ve gotten good results relative to how young they are, even though the results themselves are not that insightful or promising in absolute terms, and

2) the way to improve these results is to scale the team so that they can test out more ideas and get more feedback on what does and doesn’t work.

Regarding 1), we think that others of similar experience level – and with substantially less funding – have produced higher-quality output. Concretely, we are more excited about Redwood’s research than Conjecture’s (see our criticisms of Conjecture’s research), even though we have been critical of Redwood’s cost-effectiveness to date.[16] Notably, Redwood drew on a similar talent pool to Conjecture, largely hiring people without prior ML research experience.

Regarding 2), we disagree that scaling up will improve their research quality. In general, the standard lean-startup advice is to keep your team small while you are finding product-market fit or, in Conjecture's case, developing an exciting research agenda. We think it’s very likely Conjecture will want to make major pivots in the next few years, and rapid growth will make it harder for them to pivot. With growing scale, more time will be spent on management, and it will be easier for people to get locked into the wrong project or for dynamics to emerge where people defend their pet projects. We can't think of examples where scale-up has taken place successfully before finding product-market fit.

This growth would be challenging to manage in any organization. However, in our opinion alignment research is more challenging to scale than a traditional tech start-up due to the weaker feedback loops: it's much harder to tell if your alignment research direction is promising than whether you've found product-market fit.

Compounding this problem, their founding team of Connor, Sid and Gabriel has limited experience in scaling research organizations. Connor and Sid's experience primarily comes from co-founding EleutherAI, a decentralized research collective: their frustrations with that lack of organization are part of what drove them to found Conjecture. Gabriel has the most relevant experience, as CEO of Marigold (discussed above).

Conjecture appeared to have rapid scaling plans, but their growth has slowed in 2023. Our understanding is that this slow-down is primarily due to them being unable to raise adequate funding for their expansion plans.

Our suggestions for Conjecture:

  • Freeze hiring of junior staff until they identify scalable research directions that they and others in the alignment community are excited by. Conjecture may still benefit from making a small number of strategic hires that can help them manage their current scale and continue to grow, such as senior research engineers and people who have experience managing large teams.
  • Consider focusing on one area (e.g. technical research) and keeping other teams (e.g. product and governance) lean, or even consider whether they need them.
  • While we don’t think it’s ideal to let go of staff, we tentatively suggest Conjecture consider whether it might be worth making the team smaller to focus on improving their research quality, before growing again.

Unclear plan for balancing profit and safety motives

According to their introduction post, they think being a for-profit company is the best way to reach their goal because it lets them “scale investment quickly while maintaining as much freedom as possible to expand alignment research.” We think this could be challenging in practice: scaling investment requires delivering results that investors find impressive, as well as giving investors some control over the firm in the form of voting shares and, frequently, board seats.

Conjecture has received substantial backing from several prominent VCs. This is impressive, but since many of their backers (to our knowledge) have little interest in alignment, Conjecture will be under pressure to develop a pathway to profitability in order to raise further funds.

Many routes to developing a profitable AI company have significant capabilities externalities. Conjecture’s CEO has indicated they plan to build "a reliable pipeline to build and test new product ideas" on top of internal language models. Although this seems less bad than the OpenAI model of directly advancing the state-of-the-art in language models, we expect demonstrations of commercially viable products using language models to lead to increased investment in the entire ecosystem – not just Conjecture.

For example, if Conjecture does hit upon a promising product, it would likely be easy for a competitor to copy them. Worse, the competitor might be able to build a better product by using state-of-the-art models (e.g. those available via the OpenAI API). To keep up, Conjecture would then have to either start training state-of-the-art models themselves (introducing race dynamics), or use state-of-the-art models from competitors (and ultimately provide revenue to them).

Conjecture may have good responses to this. Perhaps there are products which are technically intricate to develop or have other barriers to entry making competition unlikely, and/or where Conjecture's internal models are sufficient. We don’t have reason to believe Verbalize falls into this category as there are several other competitors already (e.g. fireflies.ai, otter.ai, gong.io). We would encourage Conjecture to share any such plans they have to simultaneously serve two communities (for-profit VCs and TAIS), with sometimes conflicting priorities, for review with both sets of stakeholders.

Our impression is that they may not have a solid plan here (but we would invite them to share their plans if they do). Conjecture was trying to raise a series B from EA-aligned investors to become an alignment research organization. This funding round largely failed, causing them to pivot to focus more on VC funding. Based on their past actions, we think it’s likely that they will eventually hit a wall with regards to product development and decide to focus on scaling language models to get better results, contributing to race dynamics. In fairness to Conjecture, we would consider the race risk of Conjecture to be much smaller than that of Anthropic, which operates at a much bigger scale, is scaling much more rapidly, and has had more commercial success with its products.

It's not uncommon for people and organizations who conceive of or present themselves as AI-safety focused to end up advancing capabilities much more than safety. OpenAI is perhaps the most egregious case of this, but we are also concerned about Anthropic (and will write about this in a future post). These examples should make us suspect that by default Conjecture's for-profit nature will cause it to advance capabilities, and we should demand a clear and detailed plan for avoiding this before being convinced otherwise.

Our suggestions: In addition to sharing their plans for review, we recommend that Conjecture introduce robust corporate governance structures. Our understanding is that Conjecture is currently structured as a standard for-profit start-up with the founders controlling the majority of voting shares and around a third of the company owned by VCs. This is notably worse than OpenAI LP, structured as a "capped-profit" corporation with non-profit OpenAI, Inc. the sole controlling shareholder.[17] One option would be for Conjecture to implement a "springing governance" structure in which given some trigger (such as signs that AGI is imminent, or that their total investment exceeds some threshold) its voting shares become controlled by a board of external advisors. This would pass governance power, but not financial equity, to people who Conjecture considers to be a good board – rather than being controlled wholly by their founding team.

Limited meaningful engagement with external actors

Lack of productive communication between TAIS researchers and Conjecture staff

We know several members of the EA and TAIS community who have tried to share feedback privately with Conjecture but found it very challenging. When negative feedback is shared, members of the Conjecture team sometimes do not engage meaningfully with it, missing the key point or reacting defensively. Conjecture leadership will provide many counter-arguments, none of which address the core point, or are particularly strong. This is reminiscent of the Gish gallop rhetorical technique, which can overwhelm interlocutors as it’s very difficult to rebut each counter-argument. Some Conjecture staff members also frequently imply that the person giving the criticism has ulterior motives or motivated reasoning.

It can be hard to hear criticism of a project you are passionate about and have invested considerable time in, so it’s natural that Conjecture staff are defensive of their work.

Our suggestions: We recommend Conjecture staff and especially leadership make an effort to constructively engage in criticism, seeking to understand where the critique is coming from, and take appropriate steps to correct misunderstandings and/or resolve the substance of the critique.

Lack of engagement with the broader ML community

Conjecture primarily disseminates their findings on the Alignment Forum. However, many of their topics (particularly interpretability) are at least adjacent to active research fields, such that a range of academic and industry researchers could both provide valuable feedback on Conjecture's research and gain insights from their findings.

Conjecture is not alone in this: as we wrote previously, we also think that Redwood could engage further with the ML community. Conjecture has not published any peer-reviewed articles, so we think they would benefit even more than Redwood from publishing their work and receiving external feedback. 

Our suggestions: We recommend Conjecture focus on developing what they consider to be their most insightful research projects into a conference-level paper, and hiring more experienced ML research scientists or advisors to help them both effectively communicate their research and improve rigor.

Our views on Conjecture

We are genuinely concerned about Conjecture’s trustworthiness and how they might negatively affect the TAIS community and its efforts to reduce risk from AGI. These are the main changes we call for, in rough order of importance.

We would generally recommend working at most other AI safety organizations above Conjecture[1]

We think Conjecture needs to address key concerns before we would actively recommend working there. We expect it to be rare that an individual would have an offer from Conjecture but not have access to other opportunities that are better. In practice, many organizations end up competing for the same, relatively small pool of the very top candidates. Our guess is that most individuals who could receive an offer from Conjecture would be likely to receive offers from non-profits such as Redwood, CAIS and FAR; from alignment teams at Anthropic, OpenAI and DeepMind; or from academics such as Stuart Russell, Sam Bowman, Jacob Steinhardt or David Krueger. Note we would not in general recommend working at capabilities-oriented teams at Anthropic, OpenAI, DeepMind or other AGI-focused companies.

Conjecture seems relatively weak for skill building, since their leadership team is relatively inexperienced and also stretched thin due to Conjecture's rapid scaling. We think people could pursue roles which provide better mentorship, like being a research assistant or PhD student in academia, or working in an ML engineering position in an applied team at a major tech company. These are generally close to capabilities-neutral, and can make individuals vastly more productive. We think these paths can absorb a relatively large amount of talent, although we note that most AI/ML fields are fairly competitive. 

We also don’t generally recommend people pursue independent research, as we believe it’s a poor fit for most people. If someone feels their only good options are to do independent research or work at Conjecture, we feel somewhat ambivalent between these two options. 

We could imagine Conjecture being the best option for a small fraction of people who (a) are excited by their current CoEm approach, (b) can operate independently in an environment with limited mentorship, and (c) are confident they can withstand internal pressure (if there is a push to work on capabilities).

In general, we think that the attractiveness of working at an organization connected to the EA or TAIS communities makes it more likely for community members to take jobs at such organizations, even when this results in a lower lifetime impact than alternatives. Conjecture's sponsorship of TAIS field-building efforts may also lead new talent, unfamiliar with Conjecture's history, to have a more positive impression of them.

We would advise Conjecture to take care when engaging with important stakeholders and represent their place in the TAIS ecosystem accurately

We are concerned that Conjecture has misrepresented themselves to various important stakeholders, including funders and policymakers. We think there is a reasonable risk that Connor and Conjecture’s outreach to policymakers and media is alarmist and may decrease the credibility of x-risk. These unilateral actions may therefore prevent important relationships from being formed by other actors in the future. This risk is further exacerbated by Connor’s unilateralist actions in the past, Conjecture’s overall reluctance to take feedback from external actors, and their premature and rapid scaling.

We do not think that Conjecture should receive additional funding before addressing key concerns

We have substantial concerns with the organization’s trustworthiness and the CEO’s character. We would strongly recommend that any future funding from EA sources be conditional on Conjecture putting in place a robust corporate governance structure to bring them at least on par with other for-profit and alignment-sympathetic firms such as OpenAI and Anthropic.

Even absent these concerns, we would not currently recommend Conjecture for funding due to the lack of a clear impact track record despite a considerable initial investment of $10 million. To recommend funding, we would want to see both improvements in corporate governance and some signs of high-quality work that the TAIS community are excited by.

Broadly, we are in agreement with the status quo here: so far Conjecture has largely been unsuccessful in fundraising from prominent EA funders, and where they have received funding it was for significantly less than their initial asks.

We encourage TAIS and EA community members to consider to what extent they want to legitimize Conjecture until Conjecture addresses these concerns

Conjecture has several red flags and a weak track record for impact. Although the TAIS and EA community have largely refrained from explicit endorsements of Conjecture (such as funding them), there are a variety of implicit endorsements. These include tabling at EA Global career fairs, Lightcone hosting Conjecture events and inviting Conjecture staff, field-building organizations such as MATS and ARENA working with Conjecture as a fiscal sponsor,[18] as well as a variety of individuals in the community (mostly unaware of these issues) recommending Conjecture as a place to work.

To clarify, we think individuals should still read and engage with Conjecture's research where they judge it to be individually worth their time. We also welcome public debates involving Conjecture staff, such as the one between Paul Christiano and Gabriel Alfour. Our goal is not to shun Conjecture, but to avoid giving them undue influence until their research track record and governance structure improves.

We recognize that balancing these considerations can be tricky, which is why our main recommendation is to encourage people to spend time actively reflecting on how they want to engage with Conjecture in light of the information we present in this post (alongside other independent sources).

Appendix

Communication with Conjecture

We shared a draft of this post with Conjecture to review, and have included their full response below (as they indicated they would post it publicly). We thank them for their engagement and made several minor updates to the post in response; however, we disagree with several key claims made by Conjecture in their response. We describe the changes we made, and where we disagree, in the subsequent section.

Conjecture’s Reply

Hi,

Thank you for your engagement with Conjecture’s work and for providing us an opportunity to share our feedback.

As it stands, the document is a hit piece, whether intentional or not. It is written in a way such that it would not make sense for us to respond to points line-by-line. There are inaccuracies, critiques of outdated strategies, and references to private conversations where the details are obscured in ways that prevent us from responding substantively. The piece relies heavily on criticism of Connor, Conjecture CEO, but does not attempt to provide a balanced assessment: there are no positive comments written about Connor along with the critiques, and past mistakes he admitted to publicly are spun as examples of “low-integrity” behavior. Nuanced points such as the cost/benefit of releasing small open source models (pre-Chinchilla) are framed as “rash behavior,” even when you later write that you find Connor’s arguments “plausible.” Starting from this negative frame does not leave room for us to reply and trust that an object-level discussion will proceed.

We also find it surprising to see that most of the content of the piece is based on private discussions and documents shared between Conjecture, ~15 regrantors, and the FTX Future Fund team in August 2022. The piece does not disclose this context. Besides the fact that much of that information is outdated and used selectively, the information has either been leaked to the two anonymous authors, or one of the authors was directly involved in the regranting process. In either case, this is a violation of mutual confidentiality between Conjecture and regrantors/EA leadership involved in that channel.

We don’t mind sharing our past plans and discussions now and would be happy to publish the entire discussions from the Slack channel where those conversations took place (with consent of the other participants). However, it is a sad conclusion of that process that our openness to discussing strategy in front of regrantors formed the majority set of Bay Area TAIS leadership opinions about Conjecture that frame us as not open, despite these conversations being a deeper audit than pretty much any other TAIS organization.

We’d love to have a productive conversation here, but will only respond in detail if you reframe this post from a hit piece to something better informed. If your aim is to promote coordination, we would recommend asking questions about our plans and beliefs, focusing on the parts that do not make sense to you, and then writing your summary. Conjecture’s strategy is debatable, and we are open to changing it - and have done so in the past. Our research is also critiqueable: we agree that our research output has been weak and have already written about this publicly here. But as described above, this post doesn’t attempt to engage with Conjecture’s current direction.

Going further, if the aim of your critique is to promote truth-seeking and transparency, we would gladly participate in a project about creating and maintaining a questionnaire that all AI orgs should respond to, so that there is as little ambiguity in their plans as possible. In our posts we have argued for making AI lab’s safety plans more visible, and previously ran a project of public debates aimed at highlighting cruxes in research disagreements. Conjecture is open to our opinion being on the record, so much so that we have occasionally declined private debates with individuals who don’t want to be on record. This decision may contribute to some notion of our “lack of engagement with criticism.”

As a meta-point, we think that certain strategic disagreements between Conjecture and the Bay Area TAIS circles are bleeding into reputational accusations here. Conjecture has been critical of the role that EA actors have played in funding and supporting major AGI labs historically (OAI, Anthropic), and critical of current parts of the EA TAIS leadership and infrastructure that continue to support the development of superintelligence. For example, we do not think that GPT-4 should have been released and are concerned at the role that ARC’s benchmarking efforts played in safety-washing the model. These disagreements in the past have created friction, and we’d hazard that concerns about Conjecture taking “unilateral action” are predicted on this.

Instead of a more abstract notion of “race dynamics,” Conjecture’s main concern is that a couple of AI actors are unabashedly building superintelligence. We believe OpenAI, Deepmind, and Anthropic are not building superintelligence because the market and investors are demanding it. We believe they are building superintelligence because they want to, and because AGI has always been their aim. As such, we think you’re pointing the finger in the wrong direction here about acceleration risks.

If someone actually cares about curtailing “the race”, their best move would be to push for a ban on developing superintelligence and strongly oppose the organizations trying to build it. Deepmind, OpenAI, and Anthropic have each publicly pushed the AI state of the art. Deepmind and OpenAI have in their charters that they want to build AGI. Anthropic’s most recent pitch deck states that they are planning to train an LLM orders of magnitude larger than competitors, and that “companies that train the best 2025/26 models will be too far ahead for anyone to catch up in subsequent cycles,” which is awfully close to talking about DSA. No one at the leadership of these organizations (which you recommend people work at rather than Conjecture) have signed FLI's open letter calling for a pause in AI development. Without an alignment solution, the reasonable thing for any organization to do is stop development, not carve out space to continue building superintelligence unimpeded.

While Conjecture strongly disagrees with the strategies preferred by many in the Bay Area TAIS circles, we’d hope that healthy conversations would reveal some of these cruxes and make it easier to coordinate. As written, your document assumes the Bay Area TAIS consensus is superior (despite being what contributed largely to the push for ASI), casts our alternative as “risking unilateral action,” and deepens the rift.

We have a standing offer to anyone to debate with us, and we’d be very happy to discuss with you any part of our strategy, beliefs about AI risks, and research agenda.

More immediately, we encourage you to rewrite your post as a Q&A aimed at asking for our actual views before forming an opinion, or at a minimum, rewrite your post with more balance and breathing room to hear our view. As it stands, this post cleaves the relationship between part of the TAIS ecosystem and Conjecture further and is unproductive for both sides.

Given the importance of having these conversations in the open, we plan to make this reply public.

Thanks for your time and look forward to your response,

Conjecture C-Suite

Brief response and changes we made

Conjecture opted not to respond to our points line-by-line and instead asked us to rewrite the post as a Q&A or “with more balance and breathing room to hear our view.” While we won’t be rewriting the post, we have made changes to the post in response to their feedback, some of which are outlined below. 

Conjecture commented that the tone of the post was very negative, and in particular that there was a lack of positive comments written about Connor. We have taken that feedback into consideration and have edited the tone to be more neutral and descriptive (with particular attention to the section on Connor). Conjecture also noted that Connor admitted to some of his mistakes publicly. We had previously linked to Connor’s update post on the partial GPT-2 replication, but we edited the section to make it clearer that he did acknowledge his mistake. They also pointed out that we framed the point on releasing models as “rash behavior” even while later writing that we find Connor’s arguments “plausible.” We’ve changed this section to be clearer.

They say “this post doesn’t attempt to engage with Conjecture’s current direction.” As we write in our section on their cognitive emulation research, there is limited public information on their current research direction for us to comment on.

They believe that “most of the content of the piece is based on private discussions and documents shared between Conjecture, ~15 regrantors, and the FTX Future Fund team in August 2022.” This is not the case: the vast majority (90+%) of this post is based on publicly available information and our own views which were formed from our independent impression of Conjecture via conversations with them and other TAIS community members. We think the content they may be referring to is:

  1. One conversation that we previously described in the research section regarding Conjecture's original research priorities. We have removed this reference. 
  2. One point providing quantitative details of Conjecture's growth plans in the scaling section; we have removed those details.
  3. The section on how Conjecture and their CEO represent themselves to other parties. This information was not received from those private discussions and documents. 

They say they wouldn’t mind “sharing our past plans and discussions now and would be happy to publish the entire discussions from the Slack channel where those conversations took place (with consent of the other participants).” We welcome and encourage the Conjecture team to share their past plans publicly. 

They note that "Conjecture is open to our opinion being on the record, so much so that we have occasionally declined private debates with individuals who don’t want to be on record. This decision may contribute to some notion of our 'lack of engagement with criticism.'" This is not the basis for our comment on their lack of engagement. They mentioned they have “a standing offer to anyone to debate with us”. We appreciate the gesture, but do not have the capacity to engage in something as in-depth as a public debate at this time (and many others who have given feedback don’t either).

Conjecture points out the role “EA actors have played in funding and supporting major AGI labs historically (OAI, Anthropic)”, that our “document assumes the Bay Area TAIS consensus is superior … casts our alternative as 'risking unilateral action'”, and that “these disagreements in the past have created friction, and we’d hazard that concerns about Conjecture taking 'unilateral action' are predicted on this.” We outline our specific concerns on unilateralist action, which don’t have to do with Conjecture’s critiques of EA TAIS actors, here. Examples of disagreements with TAIS actors that they cite include:

  • Conjecture is critical of the role EA actors have played in funding/supporting major AGI labs.
  • Conjecture disagrees with EA TAIS leadership who continue to support the development of AGI.
  • They don’t think GPT-4 should have been released.
  • They are concerned that ARC’s benchmarking efforts might have safety-washed GPT-4.

We are also concerned about the role that EA actors have played, and potentially continue to play, in supporting AGI labs (we will cover some of these concerns in our upcoming post on Anthropic). We think that Conjecture’s views on ARC are reasonable (although we may not agree with them). Further, many other EAs and TAIS community members have expressed concerns on this topic, and about OpenAI in particular. We do not think holding this view is particularly controversial or something that people would be critical of. Views like this did not factor into our critique. 

Finally, they propose that, rather than critiquing them, we should push for a ban on AGI and oppose the organizations trying to build it (OpenAI, DeepMind and Anthropic). While we agree that other labs are concerning, that doesn’t mean that our concerns about Conjecture are erased. 

 

Notes

Changelog

Note: Significant changes are listed with an asterisk (*). Places where we changed our views or recommendations are marked with a caret (^). 

We've added footnotes signposting all edits for clarity. 

As of June 16 2023: 

  • ^Updated and enhanced our recommendation on working at Conjecture. We changed the top-line recommendation to be more precise, added more detail on the types of roles we would be more excited by, and added some notes on people who might find it a good fit to work with Conjecture.
  • *Added a subsection on the lack of the team's track record.
  • Adjusted the section discussing the appropriate bar for Conjecture’s research to be clearer.
  • Added a specific example of a governance structure Conjecture could follow to our recommendations.
  • Modified the section on misrepresentation and encouraged people to speak to others to validate our account and draw their own conclusions.
  • Added specific open questions regarding the CoEm agenda that have been raised by others.
  • Made various small grammar and language edits to make the post clearer and flow better.

  1. ^

    [June 15 2023] Edited to make the language more nuanced and add more explicit examples and comparisons. 

  2. ^

    Gabriel Alfour is still listed as the CEO on Marigold's website: we are unsure if this information is out of date, or if Gabriel still holds this position. We also lack a clear understanding of what Marigold's output is, but spent limited time evaluating this.

  3. ^

    [June 14 2023] This paragraph was edited for clarity

  4. ^

    [June 16 2023] We added the section on the lack of the team's track record

  5. ^

    [June 16 2023] In terms of the explicit comparison with ARC, we would like to note that ARC Theory's team size is an order of magnitude smaller than Conjecture's. Based on ARC's recent hiring post, it appears the theory team consists of just three individuals: Paul Christiano, Mark Xu and Jacob Hilton. If ARC had a team ten times larger and had spent close to $10 million, then we would indeed be disappointed if there were not more concrete wins.

  6. ^

    [June 14 2023] Added this paragraph to explain our position on hits-based agendas. 

  7. ^

    [June 14 2023] Added specific open questions people have with the CoEm agenda. 

  8. ^

    In particular, Connor has referred to AGI as "god-like" multiple times in interviews (CNN, Sifted). We are skeptical that this framing is helpful. 

  9. ^

    Employee retention is a key mechanism by which tech companies have been held accountable: for example, Google employees' protest over Project Maven led to Google withdrawing from the project. Similarly, the exodus of AIS researchers from OpenAI to found Anthropic was partly fueled by concerns that OpenAI was contributing to AI risk.

  10. ^

    [June 15 2023]: We added a note encouraging people to speak to others and draw their own conclusions

  11. ^

    [June 15 2023]: We added a specific example of a governance structure Conjecture could follow to our recommendations

  12. ^

    Stable Diffusion is a state-of-the-art generative image model with performance comparable to OpenAI’s DALL-E. It is open-source and open-access: there are no restrictions or filters of the kind a company like OpenAI might apply, which means people can use the model for abusive purposes (such as deepfakes).

  13. ^

    Connor reports a WikiText2 perplexity of 43.79 for his replica. This is considerably worse than the 18.34 perplexity achieved by GPT-2 1.5B on this dataset (reported in Table 3 of Radford et al.), and substantially worse than the 29.41 perplexity achieved by even the smallest GPT-2 model (117M parameters). It is slightly worse than the previously reported state of the art prior to the GPT-2 paper, 39.14 (reported in Table 2 of Gong et al.). Overall, it’s a substantial accomplishment, especially for an undergraduate who built the entire training pipeline (including data scraping) from scratch, but it is far from a replication.
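    For readers unfamiliar with the metric, perplexity is (roughly) the exponentiated average negative log-likelihood per token on held-out text, so lower is better; as a sketch, using the standard definition (exact reported values also depend on tokenization and evaluation details):

    $$\mathrm{PPL}(x_{1:N}) = \exp\!\left(-\frac{1}{N}\sum_{i=1}^{N}\log p_\theta\!\left(x_i \mid x_{<i}\right)\right)$$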

  14. ^

    Here is the full text from the relevant section of the article: “model is not identical to OpenAI’s because I simply didn’t have all the details of what they did … [and] the samples and metrics I have shown aren’t 100% accurate. For one, my metric code is flawed, I made several rookie mistakes in setting up accurate evaluation (let train and eval data mix, used metrics whose math I didn’t understand etc), and the model I used to generate the samples is in fact not the final trained model, but one about halfway through the training. I didn’t take my time to evaluate the strength of my model, I simply saw I had the same amount of hardware as OpenAI and code as close to the paper as possible and went with it. The reason for this is a simple human flaw: I got cold feet once I realized what I was sitting on and acted rashly.”

  15. ^

    This was in part due to conversations with OpenAI and Buck Shlegeris (then at MIRI)

  16. ^

    Redwood and Conjecture have received similar levels of funding

  17. ^

    Anthropic has a public benefit corporation structure, and is reported to include a long-term benefit committee of people unaffiliated with the company who can alter the composition of its board. Overall we have too little information to judge whether this structure is better or worse than OpenAI’s, but both seem better than being a standard C-corporation.

  18. ^

    Conjecture has been active in running or supporting programs aimed at AI safety field-building. Most notably, they ran the Refine incubator, and are currently fiscally sponsoring ARENA and MATS for their London-based cohort. We expect these programs are net-positive overall, and are grateful that Conjecture is contributing to them. However, the sponsorship may have a chilling effect: individuals may be reluctant to criticize Conjecture if they want to be part of these sponsored programs. It may also make attendees more likely than they otherwise would be to work for Conjecture. We would encourage ARENA and MATS to find a more neutral fiscal sponsor in the UK to avoid potential conflicts of interest. For example, they could hire staff members using employer-of-record services such as Deel or Remote. If Conjecture does continue fiscally sponsoring organizations, we would encourage them to adopt a clear legal separation between Conjecture and the fiscally sponsored entities, along with a conflict-of-interest policy to safeguard the independence of those entities.

Comments (83)

I personally have no stake in defending Conjecture (In fact, I have some questions about the CoEm agenda) but I do think there are a couple of points that feel misleading or wrong to me in your critique. 

1. Confidence (meta point): I do not understand where the confidence with which you write the post (or at least how I read it) comes from. I've never worked at Conjecture (and presumably you didn't either) but even I can see that some of your critique is outdated or feels like a misrepresentation of their work to me (see below). For example, making recommendations such as "freezing the hiring of all junior people" or "alignment people should not join Conjecture" requires an extremely high bar of evidence in my opinion. I think it is totally reasonable for people who believe in the CoEm agenda to join Conjecture, and while Connor has a personality that might not be a great fit for everyone, I could totally imagine working with him productively. Furthermore, making a claim about how and when to hire usually requires a lot of context and depends on many factors, most of which an outsider probably can't judge. 
Given that you state early on that you are an experienced member of ... (read more)

I'm not very compelled by this response.

It seems to me you have two points on the content of this critique. The first point:

I think it's bad to criticize labs that do hits-based research approaches for their early output (I also think this applies to your critique of Redwood) because the entire point is that you don't find a lot until you hit.

I'm pretty confused here. How exactly do you propose that funding decisions get made? If some random person says they are pursuing a hits-based approach to research, should EA funders be obligated to fund them?

Presumably you would want to say "the team will be good at hits-based research such that we can expect a future hit, for X, Y and Z reasons". I think you should actually say those X, Y and Z reasons so that the authors of the critique can engage with them; I assume that the authors are implicitly endorsing a claim like "there aren't any particularly strong reasons to expect Conjecture to do more impactful work in the future".

The second point:

Your statements about the VCs seem unjustified to me. How do you know they are not aligned? [...] I haven't talked to the VCs either, but I've at least asked people who work(ed) at Conjecture.

Hmm, it... (read more)

Good comment, consider cross-posting to LW?

9
mariushobbhahn
10mo
1. Meta: maybe my comment on the critique reads stronger than intended (see comment with clarifications) and I do agree with some of the criticisms and some of the statements you made. I'll reflect on where I should have phrased things differently and try to clarify below.

2. Hits-based research: Obviously results are one evaluation criterion for scientific research. However, especially for hits-based research, I think there are other factors that cannot be neglected. To give a concrete example, if I was asked whether I should give a unit under your supervision $10M in grant funding or not, I would obviously look back at your history of results but a lot of my judgment would be based on my belief in your ability to find meaningful research directions in the future. To a large extent, the funding would be a bet on you and the research process you introduce in a team and much less on previous results. Obviously, your prior research output is a result of your previous process but especially in early organizations this can diverge quite a bit. Therefore, I think it is fair to say that both a) the output of Conjecture so far has not been that impressive IMO and b) I think their updates to early results to iterate faster and look for more hits actually is positive evidence about their expected future output.

3. Of course, VCs are interested in making money. However, especially if they are angel investors instead of institutional VCs, ideological considerations often play a large role in their investments. In this case, the VCs I'm aware of (not all of which are mentioned in the post and I'm not sure I can share) actually seem fairly aligned for VC standards to me. Furthermore, the way I read the critique is something like "Connor didn't tell the VCs about the alignment plans or neglects them in conversation". However, my impression from conversation with (ex-) staff was that Connor was very direct about their motives to reduce x-risks. I think it's clear that product

On hits-based research: I certainly agree there are other factors to consider in making a funding decision. I'm just saying that you should talk about those directly instead of criticizing the OP for looking at whether their research was good or not.

(In your response to OP you talk about a positive case for the work on simulators, SVD, and sparse coding -- that's the sort of thing that I would want to see, so I'm glad to see that discussion starting.)

On VCs: Your position seems reasonable to me (though so does the OP's position).

On recommendations: Fwiw I also make unconditional recommendations in private. I don't think this is unusual, e.g. I think many people make unconditional recommendations not to go into academia (though I don't).

I don't really buy that the burden of proof should be much higher in public. Reversing the position, do you think the burden of proof should be very high for anyone to publicly recommend working at lab X? If not, what's the difference between a recommendation to work at org X vs an anti-recommendation (i.e. recommendation not to work at org X)? I think the three main considerations I'd point to are:

  1. (Pro-recommendations) It's rare for people to do thi
... (read more)

Hmm, yeah. I actually think you changed my mind on the recommendations. My new position is something like:
1. There should not be a higher burden on anti-recommendations than pro-recommendations.
2. Both pro- and anti-recommendations should come with caveats and conditionals whenever they make a difference to the target audience. 
3. I'm now more convinced that the anti-recommendation of OP was appropriate. 
4. I'd probably still phrase it differently than they did but my overall belief went from "this was unjustified" to "they should have used different wording" which is a substantial change in position.
5. In general, the context in which you make a recommendation still matters. For example, if you make a public comment saying "I'd probably not recommend working for X" the severity feels different than "I collected a lot of evidence and wrote this entire post and now recommend against working for X". But I guess that just changes the effect size and not really the content of the recommendation. 

6
Rohin Shah
10mo
:) I'm glad we got to agreement! (Or at least significantly closer, I'm sure there are still some minor differences.)

We appreciate your detailed reply outlining your concerns with the post. 

Our understanding is that your key concern is that we are judging Conjecture based on their current output, whereas, since they are pursuing a hits-based strategy, we should expect in the median case for them to not have impressive output. In general, we are excited by hits-based approaches, but we echo Rohin's point: how are we meant to evaluate organizations if not by their output? It seems healthy to give promising researchers sufficient runway to explore, but $10 million and a team of twenty seems on the higher end of what we would want to see supported purely on the basis of speculation. What would you suggest as the threshold where we should start to expect to see results from organizations?

We are unsure where else you disagree with our evaluation of their output. If we understand correctly, you agree that their existing output has not been that impressive, but think that it is positive they were willing to share preliminary findings and that we have too high a bar for evaluating such output. We've generally not found their preliminary findings to significantly update our views, wher... (read more)

5
mariushobbhahn
10mo
Meta: Thanks for taking the time to respond. I think your questions are in good faith and address my concerns, I do not understand why the comment is downvoted so much by other people.

1. Obviously output is a relevant factor to judge an organization among others. However, especially in hits-based approaches, the ultimate thing we want to judge is the process that generates the outputs to make an estimate about the chance of finding a hit. For example, a cynic might say "what has ARC-theory achieve so far? They wrote some nice framings of the problem, e.g. with ELK and heuristic arguments, but what have they ACtUaLLy achieved?" To which my answer would be, I believe in them because I think the process that they are following makes sense and there is a chance that they would find a really big-if-true result in the future. In the limit, process and results converge but especially early on they might diverge. And I personally think that Conjecture did respond reasonably to their early results by iterating faster and looking for hits.

2. I actually think their output is better than you make it look. The entire simulators framing made a huge difference for lots of people and writing up things that are already "known" among a handful of LLM experts is still an important contribution, though I would argue most LLM experts did not think about the details as much as Janus did. I also think that their preliminary research outputs are pretty valuable. The stuff on SVDs and sparse coding actually influenced a number of independent researchers I know (so much that they changed their research direction to that) and I thus think it was a valuable contribution. I'd still say it was less influential than e.g. toy models of superposition or causal scrubbing but neither of these were done by like 3 people in two weeks.

3. (copied from response to Rohin): Of course, VCs are interested in making money. However, especially if they are angel investors instead of institutional VCs, ide

We appreciate you sharing your impression of the post. It’s definitely valuable for us to understand how the post was received, and we’ll be reflecting on it for future write-ups.

1) We agree it's worth taking into account aspects of an organization other than their output. Part of our skepticism towards Conjecture – and we should have made this more explicit in our original post (and will be updating it) – is the limited research track record of their staff, including their leadership. By contrast, even if we accept for the sake of argument that ARC has produced limited output, Paul Christiano has a clear track record of producing useful conceptual insights (e.g. Iterated Distillation and Amplification) as well as practical advances (e.g. Deep RL From Human Preferences) prior to starting work at ARC. We're not aware of any equally significant advances from Connor or other key staff members at Conjecture; we'd be interested to hear if you have examples of their pre-Conjecture output you find impressive.

We're not particularly impressed by Conjecture's process, although it's possible we'd change our mind if we knew more about it. Maintaining high velocity in research is... (read more)

1
mariushobbhahn
10mo
I'll only briefly reply because I feel like I've said most of what I wanted to say.

1) Mostly agree but that feels like part of the point I'm trying to make. Doing good research is really hard, so when you don't have a decade of past experience it seems more important how you react to early failures than whether you make them.

2) My understanding is that only about 8 people were involved with the public research outputs and not all of them were working on these outputs all the time. So the 1 OOM in contrast to ARC feels more like a 2x-4x.

3) Can't share.

4) Thank you. Hope my comments helped.

5) I just asked a bunch of people who work(ed) at Conjecture and they said they expect the skill building to be better for a career in alignment than e.g. working with a non-alignment team at Google. 

We've updated the recommendation about working at Conjecture. 

Some clarifications on the comment:
1. I strongly endorse critique of organisations in general and especially within the EA space. I think it's good that we as a community have the norm to embrace critiques.
2. I personally have my criticisms of Conjecture, and my comment should not be seen as "everything's great at Conjecture, nothing to see here!". In fact, my main criticisms, of the leadership style and of CoEm not being the most effective thing they could do, are also represented prominently in this post. 
3. I'd also be fine with the authors of this post saying something like "I have a strong feeling that something is fishy at Conjecture, here are the reasons for this feeling". Or they could also clearly state which things are known and which things are mostly intuitions. 
4. However, I think we should really make sure that we say true things when we criticize people, quantify our uncertainty, differentiate between facts and feelings and do not throw our epistemics out of the window in the process
5. My main problem with the post is that they make a list of specific claims with high confidence, and I think that is not warranted given the evidence I'm aware of. That's all.

You can see this already in the comments where people without context say this is a good piece and thanking you for "all the insights".

FWIW I Control-F'd for "all the insights" and did not see any other hit on this page other than your comment. 

EDIT 2023/06/14: Hmm, so I've since read all the comments on this post on both the EAF and LW[1], and I don't think your sentence was an accurate paraphrase for any of the comments on this post?

For context, the most positive comment on this post is probably mine, and astute readers might note that my comment was process-oriented rather than talking about quantity of insights.

  1. ^

    (yes, I have extremely poor time management. Why do you ask?)

3
mariushobbhahn
10mo
The comment I was referring to was in fact yours. After re-reading your comment and my statement, I think I misunderstood your comment originally. I thought it was not only praising the process but also the content itself. Sorry about that. I updated my comment accordingly to indicate my misunderstanding.  The "all the insights" was not meant as a literal quote but more as a cynical way of saying it. In hindsight, this is obviously bound to be misunderstood and I should have phrased it differently. 
6
Linch
10mo
Thanks for the correction! I also appreciate the polite engagement. As a quick clarification, I'm not a stickler for exact quotes (though e.g. the APA is), but I do think it's important for paraphrases to be accurate. I'll also endeavor to be make my own comments harder to misinterpret going forwards, to minimize future misunderstandings. To be clear, I also appreciate the content of this post, but more because it either brought new information to my attention, or summarized information I was aware of in the same place. (rather than because it offered particularly novel insights). 

Could you say a bit more about your statement that "making recommendations such as . . . . 'alignment people should not join Conjecture' require an extremely high bar of evidence in my opinion"?

The poster stated that there are "more impactful places to work" and listed a number of them -- shouldn't they say that if they believe it is more likely than not true? They have stated their reasons; the reader can decide whether they are well-supported. The statement that Conjecture seems "relatively weak for skill building" seems supported by reasonable grounds. And the author's characterization of the likelihood that Conjecture is net-negative is merely "plausible." That low bar seems hard to argue with; the base rate of for-profit companies without any known special governance safeguards acting like for-profit companies usually do (i.e., in a profit-maximizing manner) is not low.

Maybe we're getting too much into the semantics here but I would have found a headline of "we believe there are better places to work at" much more appropriate for the kind of statement they are making. 
1. A blanket unconditional statement like this seems unjustified. Like I said before, if you believe in CoEm, Conjecture probably is the right place to work for. 
2. Where does the "relatively weak for skill building" come from? A lot of their research isn't public, a lot of engineering skills are not very tangible from the outside, etc. Why didn't they just ask the many EA-aligned employees at Conjecture about what they thought of the skills they learned? Seems like such an easy way to correct for a potential mischaracterization. 
3. Almost all AI alignment organizations are "plausibly" net negative. What if ARC evals underestimates their gain-of-function research? What if Redwood's advances in interpretability lead to massive capability gains? What if CAIS's efforts with the letter had backfired and rallied everyone against AI safety? This bar is basically meaningless without expected values. 

Does that clarify where my skepticism comes from? Also, once again, my arguments should not be seen as a recommendation for Conjecture. I do agree with many of the criticisms made in the post. 

Why are you doing critiques instead of evaluations? This seems like you're deliberately only looking for bad things instead of trying to do a balanced investigation into the impact of an organization. 

This seems like bad epistemics and will likely lead to a ton of not necessarily warranted damage to orgs that are trying to do extremely important work. Not commenting on the content of your criticisms of Redwood or Conjecture, but your process. 

Knowing there's a group of anonymous people who are explicitly looking to find fault with orgs feels like an instance of EA culture rewarding criticism to the detriment of the community as a whole. Generally, I can see that you're trying to do good, but your approach makes me feel like the EA community is hostile and makes me not want to engage with it. 

JWS
11mo

I want to second what TheAthenians is asking here. I think there's plenty that can be valuable from evaluations (or even critiques) like this, and there's a lot that you've (Omega) posted here and in the Redwood Critique which is useful to know. In particular:

  1. Startups in AI alignment are likely to be over-indexed in driven, technically capable people, but not those with extensive experience in how to run organisations[1]
  2. Navigating the pressure between gaining funding by promising exciting capabilities to funders and stopping contributing to 'race dynamics'[2] is a hard problem that these orgs need to be aware of, and have structures against them to prevent this
  3. Other labs outside of the EA/LW 'ingroup' may have done comparable/better research, or provided better outputs, and might be a better use of future AI Safety funding[3]
  4. I think it's a very good sign that you've shared both drafts so far with Redwood and Conjecture. No notes here, good job :)

However, after reading this post and reflecting, I can't help but agree with the somewhat sceptical mood of most(?) commenters here? In particular:

  • You don't say why you're doing this series, either in this post, or the Redwood one, or
... (read more)

Thank you for raising this point – you’re right that we don’t explain why we are writing this series, and we will update the sequence description to be more transparent on that point. The reasons you suggest are basically correct.

With increased attention to TAIS there are many people trying to get into TAIS roles. Without significant context on organizations, new entrants to the field will tend to go to TAIS organizations based on their prominence caused by factors such as total funding, media coverage, volume of output, etc. Much of the discussion we have observed around TAIS organizations, especially criticisms of them, happens behind closed doors in conversations that junior people are usually not privy to. We wish to help disseminate this information more broadly to enable individuals to make a better informed decision.

We are concerned “that the attractiveness of working at an organization that is connected to the EA or TAIS communities makes it more likely for community members to take jobs at such organizations even if this will result in a lower lifetime impact than alternatives. Conjecture's sponsorship of TAIS field building efforts may also lead new talent, who are unfami... (read more)

1
JWS
10mo
Thanks for your thoughtful response here (and elsewhere). I definitely think you're acting in good faith (again, I think sharing your evaluations with the labs beforehand and seeking information/clarification from them is really big evidence of this), and I have appreciated both posts even if I was more on the critical side for this one. I'm sorry that you've found the response to this post difficult and I apologise if I contributed to that unfairly. I look forward to you continuing the series (I think with Anthropic?). 

On the object level I don't think we actually disagree that much - I'm very much in agreement with your sentiment on organisational governance, and how much this has been shown to be crucial in both the EA and AI spaces over the past year. I think allowing critical evaluations of AI Safety from inside the field is important to making sure the field stays healthy. I agree that many can achieve good impact working outside of explicitly EA-aligned organisations, and to not give those organisations a 'pass' because of that affiliation - especially if they are in the early stages of their career. And I agree that rate at which Conjecture has scaled at will likely lead to organisational problems that may impact the quality of their output. 

So on reflection, perhaps why my reaction to this post was more mixed than my reaction to the Redwood Post was is because you made some very strong and critical claims about Conjecture but the evidence you presented was often vague or a statement of your own beliefs.[1] So, for example, numerous concerns about Connor's behaviour are stated in the article, but I don't have much to update on apart from "The authors of Omega interpret these events as a sign of poor/untrustworthy character", and if I don't share the same interpretation (or to the same degree),[2] our beliefs can't converge any more unless further evidence/context of those claims is provided. The same goes for technical assessments about the quality of Co
3
Omega
9mo
Hi JWS, Just wanted to let you know that we've posted our introduction to the series. We hope it adds some clarity to the points you've raised here for others. 

Thanks for highlighting this potential issue. We'd like to clarify that our intention is to evaluate both the positives and negatives. In retrospect, calling our posts "critiques" may have given the wrong impression: although it's consistent with historical usage of the word[1] it does tend to carry a negative connotation. Ultimately our evaluation of Conjecture ended up fairly negative, exacerbating this impression: we expect future posts in the series on organizations where we have a more mixed evaluation to have a greater mix of positives and negatives.

You are right that overall we focus more on negatives than positives. We believe this is justified since organizations are already incentivized to make the positive case for themselves, and routinely do so in public announcements as well as private recruitment and funding pitches. By contrast, there is little reward from highlighting negatives. Indeed, we're publishing this anonymously (foregoing any credit we could get from bringing these issues to attention) in order to protect against retaliation.

Our goal is not to make the EA community more hostile, and we're certainly sorry that this post made you want to engage less wit... (read more)

You are right that overall we focus more on negatives than positives. We believe this is justified since organizations are already incentivized to make the positive case for themselves, and routinely do so in public announcements as well as private recruitment and funding pitches.

As a potential consumer of your critiques/evaluations, I would prefer that you distribute your focus exactly to the degree that you believe it to be warranted in light of your independent impression of the org in question, rather than try to rectify a possible imbalance by deliberately erring in the opposite direction.

9
Omega
10mo
Hi Pablo, thanks for your comment. We want to clarify that we aren't trying to balance the critiques in a certain way; it just so happens that the organizations that are next on our list will have a greater mix of positives and negatives.

The CEO has been inconsistent over time regarding his position on releasing LLMs

I find this to be a pretty poor criticism, and its inclusion makes me less inclined to accept the other criticisms in this piece at face value.

Updating your beliefs and changing your mind in light of new evidence is undoubtedly a good thing. To say that doing so leaves you with concerns about Connor's "trustworthiness and character" seems not only unfair, but also creates a disincentive for people to publicly update their views on key issues, for fear of this kind of criticism.  

Changing your mind in the face of new evidence is certainly commendable. In this case, we were highlighting that Connor has switched from confidently holding one extreme position (founding an organization dedicated to open-source output with all research conducted in public) to the opposite extreme (founding an organization with one of the most restrictive non-disclosure policies) without any substantial new evidence, and with little in the way of a public explanation.

In particular, we wanted to highlight that Conjecture may in the future radically depart from their current info-hazard policy. To the best of our knowledge, the info-hazard policy has no legal force: it is a policy maintained at the discretion of Conjecture leadership. Given Connor has previously radically changed his mind without corresponding extreme changes in the world, we should not be surprised if a major change of strategy occurs again. As such, we would suggest viewing their info-hazard policy as a short-term stance, not a long-term commitment. This isn't necessarily a bad thing – we'd like Conjecture to share more details on their CoEm approach, for example. However, since the info-hazard policy has been rep... (read more)

may in the future radically depart from their current... policy

Organisations do this all the time, often without much prior reason to think they will (e.g. OpenAI re non-profit->for profit, and open-source->closed source). Saying that updating beliefs (imo in the right direction regarding x-risk) is bad because it makes this more likely is cynical and unfair.

2
Guy Raveh
11mo
Changing your mind is good; being indecisive and inconsistent because you're just not really certain (instead of just communicating your uncertainty) is bad. I have no idea which of those the CEO of Conjecture did.

Yes, unfortunately I've also been hearing negatives about Conjecture, so much so that I was thinking of writing my own critical post (and for the record, I spoke to another non-Omega person who felt similarly). Now that your post is written, I won't need to, but for the record, my three main concerns were as follows:

1. The dimension of honesty, and the genuineness of their business plan. I won't repeat it here, because it was one of your main points, but I don't think that it's a way to run a business, to sell your investors on a product-oriented vision for the company, but to tell EAs that the focus is overwhelmingly on safety.

2. Turnover issues, including the interpretability team. I've encountered at least half a dozen stories of people working at or considering work at Conjecture, and I've yet to hear of any that were positive. This is about as negative a set of testimonials as I've heard about any EA organisation. Some prominent figures like Janus and Beren have left. In the last couple of months, turnover has been especially high - my understanding is that Connor told the interpretability team that they were to work instead on cognitive emulations, and most of them left. Much... (read more)

I currently work at Conjecture (this comment is in my personal capacity). Without commenting on any of your other points I would like to add the data point that I enjoy working here and I've grown a lot in personal and professional capacity while being here. 6/6 of colleagues I asked said they did too. 

Another data point: I worked for Conjecture until recently, and I broadly agree with Jonathan's assessment. It is a pretty impressive group of people and I enjoyed working with them. Work was occasionally quite intense but that is par for the course for such a young organisation that's moving incredibly fast in such a challenging field.

I would recommend working for Conjecture, especially for anyone located in Europe who wants to work in alignment.   

  • Most staff thinking AGI ruin >60% likely and most expecting AGI in <7 years, and tweeting it.
    • i.e. including non-researchers - it at-least makes one wonder about groupthink.

I would expect a lot of selection effects on who goes to work for Conjecture; similarly I wouldn't find it concerning if non-researchers at GreenPeace had extreme views about the environment.

I actually would find this at least somewhat concerning, because selection bias/selection effects are my biggest worry with smart people working in an area. If a study area is selected based upon any non-truthseeking motivations, or if people are pressured to go along with a view for non-truthseeking reasons, then it's very easy to land into nonsense, where the consensus is based totally on selection effects, making them useless to us.

There's a link to the comment by lukeprog below on the worst case scenario for smart people being dominated by selection effects:

One marker to watch out for is a kind of selection effect.

In some fields, only 'true believers' have any motivation to spend their entire careers studying the subject in the first place, and so the 'mainstream' in that field is absolutely nutty.

Case examples include philosophy of religion, New Testament studies, Historical Jesus studies, and Quranic studies. These fields differ from, say, cryptozoology in that the biggest names in the field, and the biggest papers, are published by very smart people in leading journals and look all very normal and impressive but those entire fields are so incredibly screwed by the se

... (read more)

Why can’t non-research staff have an opinion about timelines? And why can’t staff tweet their timelines? Seems an overwhelmingly common EA thing to do.

I don’t think the issue is that they have an opinion, rather that they have the same opinion - like, ‘all the researchers have the same p(doom), even the non-researchers too’ is exactly the sort of thing I’d imagine hearing about a cultish org

8
RyanCarey
10mo
I feel like you're putting words into my mouth a little bit there. I didn't say that their beliefs/behaviour were dispositively wrong, but that IF you have longer timelines, then you might start to wonder about groupthink. That's because in surveys and discussions of these issues even at MIRI, FHI, etc there have always been some researchers who have taken more mainstream views - and non-research staff usually have more mainstream views than researchers (which is not unreasonable if they've thought less about the issue).

(cross-posted to LessWrong)

I agree with Conjecture's reply that this reads more like a hitpiece than an even-handed evaluation.

I don't think your recommendations follow from your observations, and such strong claims surely don't follow from the actual evidence you provide. I feel like your criticisms can be summarized as the following:

  1. Conjecture was publishing unfinished research directions for a while.

  2. Conjecture does not publicly share details of their current CoEm research direction, and that research direction seems hard.

  3. Conjecture told the government they were AI safety experts.

  4. Some people (who?) say Conjecture's governance outreach may be net-negative and upsetting to politicians.

  5. Conjecture's CEO Connor used to work on capabilities.

  6. One time during college Connor said that he replicated GPT-2, then found out he had a bug in his code.

  7. Connor has said at some times that open source models were good for alignment, then changed his mind.

  8. Conjecture's infohazard policy can be overturned by Connor or their owners.

  9. They're trying to scale when it is common wisdom for startups to try to stay small.

  10. It is unclear how they will balance profit and altruistic moti

... (read more)
8
Omega
10mo
Regarding your specific concerns about our recommendations:

1) We address this point in our response to Marius (5th paragraph).

2) As we note in the relevant section: “We think there is a reasonable risk that Connor and Conjecture’s outreach to policymakers and media is alarmist and may decrease the credibility of x-risk.” This kind of relationship-building is unilateralist when it can decrease goodwill amongst policymakers.

3) To be clear, we do not expect Conjecture to have the same level of “organizational responsibility” or “organizational competence” (we aren’t sure what you mean by those phrases and don’t use them ourselves) as OpenAI or Anthropic. Our recommendation was for Conjecture to have a robust corporate governance structure. For example, they could change their corporate charter to implement a "springing governance" structure such that voting equity (but not political equity) shifts to an independent board once they cross a certain valuation threshold. As we note in another reply, Conjecture’s infohazard policy has no legal force, and therefore is not as strong as either OpenAI or Anthropic’s corporate governance models. As we’ve noted already, we have concerns about both OpenAI and Anthropic despite having these models in place: Conjecture doesn’t even have those, which makes us more concerned.

[Note: we edited point 3) for clarity on June 13 2023]
5
D0TheMath
10mo
1. My response would be a worse version of Marius’s response. So just read what he said here for my thoughts on hits-based approaches for research.

2. I disagree, and wish you’d actually explain your position here instead of being vague & menacing. As I’ve said in my previous comment: This is because they usually talk about the strongest case for x-risk when talking to reporters, and somehow get into the article, and then have the reporter speak positively about the cause. You’ve also said that some people think conjecture may be decreasing goodwill with policy makers. This announcement seems like a lot of evidence against this. Though there is debate on whether its good, the policy-makers are certainly paying lip-service to AI alignment-type concerns. I also want to know why would I trust such people to report on policy-makers opinions. Are these some Discord randos or parliament aides, or political researchers looking at surveys among parliament leaders, or deepmind policy people, or what? In general I reject that people shouldn’t talk to the government if they’re qualified (in a general sense) and have policy-goals which would be good to implement. If policy is to work its because someone did something. So its a good thing that Conjecture is doing something.

3. It is again really weird that you pull out OpenAI as an org with really strong corporate governance. Their charter is a laughing stock, and their policies did not stop them from reformat into a for-profit company once Sam, presumably, or whoever their leaders were at the time, saw they could make money. I don’t know anything about Anthropic’s corporate governance structure. But I also don’t know much about Conjecture. I know at one point I tried to find Anthropic’s board of directors, and found nothing. But that was just a bunch of googling. Conjecture’s infohazard policy not having legal force is bad, but not as bad as not having an infohazard policy in the first place. It seems like OpenAI and A

Shouldn’t Conjecture get some credit for their public communications about stopping frontier AI development? This piece seems to assume technical AI Safety and alignment are the only way forward, as evidenced by recommending working at major labs trying to make AGI. These reviews would be more useful if they took a broader perspective. I mean, regardless of how much better their papers are in the meantime, does it seem likely to you that those labs will solve alignment in time if they are racing to build bigger and bigger models?

Omega does acknowledge the value of public communications efforts:

Moreover, in recent years Connor has been a vocal public advocate for safety: although we disagree in some cases with the framing of the resulting media articles, in general we are excited to see greater public awareness of safety risks.[5]

I think they don't emphasize this more in this piece because of their concerns about Connor/Conjecture's particular style of communications:

We think there is a reasonable risk that Connor and Conjecture’s outreach to policymakers and media is alarmist and may decrease the credibility of x-risk.

...

^ In particular, Connor has referred to AGI as god-like multiple times in interviews (CNN, Sifted). We are skeptical if this framing is helpful.

9
Greg_Colbourn
10mo
What if alarmism - raising the alarm - is actually the appropriate response? The current situation is highly alarming imo.
4
Will Aldred
10mo
I can see where you're coming from. However, I think it's worth noting that "raise the alarm" isn’t straightforwardly the appropriate response to "the situation is alarming, according to my inside view," for unilateralist's curse-type reasons. (I imagine this has been discussed in more depth elsewhere.)
4
Greg_Colbourn
10mo
What is the appropriate response? (This is not a rhetorical question; I want to know). There may be some risk of alarmism being negative, but I don't think there is much risk of it being "net" negative, given the default no-action path is we all just get killed in a few years. Also it's ironic that the EA community talks a lot about the unilateralist's curse, yet is arguably responsible for the worst cases of it (supporting DeepMind, OpenAI, and Anthropic, and thus kicking off and accelerating the race to uncontrollable AGI).
-1
Sharmake
10mo
Basically, yes. This isn't to say that we aren't doomed at all, but contrary to popular beliefs of EAs/rationalists, the situation you gave actually has a very good chance, like >50%, of working, for a few short reasons:

1. Vast space for instrumental convergence/instrumental goals aren't incentivized in current AI, and in particular, the essentially unbounded instrumental goal AI is very bad for capabilities relative to more bounded instrumental goals. In general, the vaster the space for instrumental convergence, the worse the AI performs. This is explained well by porby's post below, and I'll copy the most important footnote here: https://www.lesswrong.com/posts/EBKJq2gkhvdMg5nTQ/instrumentality-makes-agents-agenty

2. Instrumental convergence appears to be too weak, on its own, to generate substantial conclusions that AI will doom us, barring more assumptions or more empirical evidence, contrary to Nick Bostrom's Superintelligence. This is discussed more in a post below, but in essence the things you can conclude from it are quite weak, and they don't imply anything near the inferences that EAs/rationalists made about AI risk. https://www.lesswrong.com/posts/w8PNjCS8ZsQuqYWhD/instrumental-convergence-draft

Given that instrumental convergence/instrumental goals have been arguably a foundational assumption that EAs/rationalists make around AI risk, this has very, very big consequences, almost on par with discovering a huge new cause like AI safety.
4
Holly_Elmore
10mo
Does this really make you feel safe? This reads to me as a possible reason for optimism, but hardly reassures me that the worst won’t happen or that this author isn’t just failing to imagine what could lead to strong instrumental convergence (including different training regimes becoming popular).
-1
Sharmake
10mo
Basically, kind of. The basic issue is that instrumental convergence, and especially effectively unbounded instrumental convergence is a central assumption of why AI is uniquely dangerous, compared to other technologies like biotechnology. And in particular, without the instrumental convergence assumption, or at least with the instrumental convergence assumption being too weak to make the case for doom, unlike what Superintelligence told you, matters a lot because it kind of ruins a lot of our inferences of why AI would likely doom us, like deception or unbounded powerseeking and many more inferences that EAs/rationalists did, without high prior probability on it already. This means we have to fall back on our priors for general technology being good or bad, and unless one already has high prior probability that AI is doomy, then we should update quite a lot downwards on our probability of doom by AI. It's probably not as small as our original prior, but I suspect it's enough to change at least some of our priorities. Remember, the fact that the instrumental convergence assumption was essentially used as though it was an implicit axiom on many of our subsequent beliefs turning out to be not as universal as we thought, nor as much evidence of AI doom as we thought (indeed even with instrumental convergence, we don't actually get enough evidence to move us towards high probability of doom, without more assumptions.) is pretty drastic, as a lot of beliefs around the dangerousness of AI rely on the essentially unbounded instrumental convergence and unbounded powerseeking/deception assumptions. So to answer your question, the answer depends on what your original priors were on technology and AI being safe. It does not mean that we can't go extinct, but it does mean that we were probably way overestimating the probability of going extinct.
9
Greg_Colbourn
10mo
Even setting aside instrumental convergence (to be clear, I don't think we can safely do this), there is still misuse risk and multi-agent coordination that needs solving to avoid doom (or at least global catastrophe).
1
Sharmake
10mo
I agree, but that implies pretty different things than what is currently being done, and still implies that the danger from AI is overestimated, which bleeds into other things.
4
Holly_Elmore
10mo
I guess my real question is “how can you feel safe accepting the idea that ML or RL agents won’t show instrumental convergence?” Are you saying AIs trained this way won’t be agents? Because I don’t understand how we could call something AGI that doesn’t figure out its own solutions to reach its goals, and I don’t see how it can do that without stumbling on things that are generally good for achieving goals. And regardless of whatever else you’re saying, how can you feel safe that the next training regime won’t lead to instrumental convergence?
-3
Sharmake
10mo
Not especially. If I had to state it simply, it's that massive space for instrumental goals isn't useful today, and plausibly in the future for capabilities, so we have at least some reason to not worry about misalignment AI risk as much as we do today. In particular, it means that we shouldn't assume instrumental goals to appear by default, and to avoid overrelying on non-empirical approaches like your intuition or imagination. We have to take things on a case-by-case basis, rather than using broad judgements.

Note that instrumental convergence/instrumental goals isn't a binary, but rather a space, where more space for instrumental goals being useful for capabilities is continuously bad, rather than a sharp binary of instrumental goals being active or not active. My claim is that the evidence we have is evidence against much space for instrumental convergence being useful for capabilities, and I expect this trend to continue, at least partially as AI progresses.

Yet I suspect that this isn't hitting at your true worry, and I want to address it today. I suspect that your true worry is this quote below:

And while I can't answer that question totally, I'd like to suggest going on a walk, drinking water, or in the worst case getting mental help from a professional. But try to stop the loop of never feeling safe around something. The reason I'm suggesting this is because the problem with acting on your need to feel safe is that the following would happen:

1. This would, if adopted, leave us vulnerable to arbitrarily high demands for safety, possibly crippling AI use cases, and as a general policy I'm not a fan of actions that would result in arbitrarily high demands for something, at least without scrutinizing it very heavily, and would require way, way more evidence than just a feeling.

2. We have no reason to assume that people's feelings of safety or unsafety actually are connected to the real evidence of whether AI is safe, or whether misalignment risk of A

I'm starting to think that the EA community will scare themselves into never taking any action at all. I don't really feel like going over this point-for-point, because I think this post demonstrates a much greater failure at rationality. The short answer is: you're doing cost/benefit analysis wrong. You're zooming in on every possible critique, and determining it's an organization that shouldn't have talent directed toward it. Every alignment organization right now has relatively poor results. But the strategy for what to do with that isn't funneling talent into ones that have slightly better results, but encouraging the spread of talent among many different approaches.

Here's a more detailed response to these comments:

… You're zooming in on every possible critique, and determining it's an organization that shouldn't have talent directed toward it.

We chose to write up critiques we felt were directly relevant to our end-line views (our goal was not to write up every possible critique). As we explain in another comment: “We believe that an organization should be graded on multiple metrics. Their outputs are where we would put the most weight. However, their strategy and governance are also key. The last year has brought into sharp relief the importance of strong organizational governance.”

We are supportive of the EA community pursuing a diversified research agenda, and individual organizations pursuing hits based agendas (we talk about that more in the first couple paragraphs of this comment). However, we do think that choosing the right organizations can make a difference, since top candidates often have the option for working at many organizations. 

This is because we don’t agree that every alignment organization right now has relatively poor results. Here are some examples of results we find impressive, and organizations we think wo... (read more)

I definitely agree with this, and I'm not very happy with the way Omega solely focuses on criticism, at the very least without any balanced assessment.

And given the nature of the problem, some poor initial results should be expected, by default.

1
Omega
11mo
Responding to both comments in this thread, we have written a reply to TheAthenians's comment which addresses the points raised regarding our focus on criticism.

Interestingly, the reception at LessWrong is more critical (13 net karma at 66 votes), compared to here (113 net karma at 80 votes). 

Possibly overly meta, but my pet theory here is that Conjecture, or at least the culture as presented by Conjecture's public writings, is more culturally rationalist than approximately every other AI Safety org, so perhaps there's some degree of culture clash or mood affiliation going on here.

5
zchuang
10mo
I think it's also that Conjecture is explicitly short-timelines and high-p(doom), which means a lot of the criticisms implicitly feel like criticisms of rationalists.
4
Will Aldred
10mo
Interesting theory. Though I'm a little surprised by it, given that Conjecture is based in London (i.e., not SF/Berkeley), and given that Connor has talked about not vibing with Rationalist culture (1:07:37–01:14:15 of this podcast episode). Additionally, it's not clear that the EAF vs LW karma differential for this Conjecture piece will turn out to be an outlier within this critique series. Omega's Redwood critique currently stands at 338 EAF karma, but just 1 LW karma (though, oddly, there were only 12 votes over at LW, which weakens the comparison). But perhaps this actually makes the meta analysis more interesting? Consistently large karma differentials in this series might point to a more general cultural divide—for example, in how criticism is viewed—between EAs and Rationalists. (And a better understanding of these communities' cultures might help inform culture-shaping interventions.)
1
Omega
10mo
We posted the Redwood post to LW several weeks after the EA Forum version, which might explain its low karma on LW.

Thanks for writing this! RE: We would advise against working at Conjecture

We think there are many more impactful places to work, including non-profits such as Redwood, CAIS and FAR; alignment teams at Anthropic, OpenAI and DeepMind; or working with academics such as Stuart Russell, Sam Bowman, Jacob Steinhardt or David Krueger. Note we would not in general recommend working at capabilities-oriented teams at Anthropic, OpenAI, DeepMind or other AGI-focused companies.

Additionally, Conjecture seems relatively weak for skill building [...] We expect most ML engineering or research roles at prominent AI labs to offer better mentorship than Conjecture. Although we would hesitate to recommend taking a position at a capabilities-focused lab purely for skill building, we find it plausible that Conjecture could end up being net-negative, and so do not view Conjecture as a safer option in this regard than most competing firms.

I don't work in AI safety and am not well-informed on the orgs here, but I did want to comment on this, as the recommendation might benefit from some clarity about who the target audience is.

As written, the claims sound something like:

  1. CAIS et al., alignment teams at Anthro
... (read more)

Hi Bruce, thanks for this thoughtful comment. We think Conjecture needs to address key concerns before we would recommend working there, although we could imagine Conjecture being the best option for a small fraction of people who (a) are excited by their current CoEm approach, (b) can operate independently in an environment with limited mentorship, and (c) are confident they can withstand internal pressure (if there is a push to work on capabilities). As a result of these (and other) comments in this thread, we will be updating our recommendation regarding working at Conjecture.

That being said, we expect it to be rare that an individual would have an offer from Conjecture but not have access to other opportunities that are better than independent research. In practice, many organizations end up competing for the same, relatively small pool of the very top candidates. Our guess is that most individuals who could receive an offer from Conjecture could pursue one of the paths outlined above in our replies to Marius, such as being a research assistant or PhD student in academia, or working in an ML engineering position on an applied team at a major tech company (if not at more promising places like those we discuss in the original post). We think these positions can absorb a fairly large amount of talent, although we note that most AI/ML fields are fairly competitive.

Great post, thank you for writing it in such detail and with such care and clarity; I also appreciate the blunt and very concrete takeaways!

Quick note: Footnotes do not work for me (Chrome on macOS). Not sure if you made a mistake somewhere or if it's an EAF bug, but I thought I should mention it in case it's true for other people as well, so the relevant people can fix it.

2
Omega
11mo
Thank you for the kind words, Linch! We've fixed the footnotes; it seems there was an issue when we converted from Markdown to the EA Forum Docs editor.
2
NunoSempere
11mo
+1, I also appreciated this post. Leaving a note explicitly since I left a somewhat critical comment in the last post and then two more here.

As someone who is exploring a transition to full-time TAIS work, I appreciate this series of posts and other efforts like it. Having a detailed public critique and "adversarial summary" of multiple organizations will make any future due-diligence process (by me or others) much quicker and easier.

That said, I am pretty unconvinced of the key points of this post.

  • I suspect some of the criticism is more convincing and more relevant to those who share the authors' views on specific technical questions related to TAIS.
  • My own personal experience (very limited, detailed below) engaging with Conjecture and their public work conflicts with the claim that it is low quality and that they react defensively to criticism.
  • Some points are vague and I personally do not find them damning even if true.

Below, I'll elaborate on each bullet.

Criticism which seems dependent on technical views on TAIS

I agree with Eliezer's views on the difficulty and core challenges posed by the development of AGI.

In particular, I agree with his 2021 assessment of the AI safety field here, and List of Lethalities 38-41.

I realize these views are not necessarily consensus among EAs or the TAIS community, and I don't want to l... (read more)

Is there a canonical discussion of what you call "race dynamics" somewhere? I can see how proliferating firms and decentralized communities would "mak[e] potential moratoriums on capabilities research much harder (if not impossible) to enforce", but it's less clear to me what that means for how quickly capabilities advance. Is there evidence that, say, the existence of Anthropic has led to increased funding for OpenAI?

In particular, one could make the opposite argument—competition, at least intra-nationally, slows the feedback cycle for advancing capabilities. For example, a lot of progress in information technology seems to have been driven by concentration of R&D into Bell Labs. If the Bell monopoly had been broken up sooner, would that have accelerated progress? If some publicly-funded entity had provided email and internet search services, would Google have reached the same scale?

Meanwhile, training leading-edge models is capital intensive, and competing firms dilute available funding across many projects. Alternative commercial and open-source models drive potential margins down. Diminished prospects for monopoly limit the size and term of bets that investors are willing to make.

I don't know which way the evidence actually falls, but there seems to be a background assumption that competition, race dynamics, and acceleration of progress on capabilities always go hand in hand. I'd be very interested to read more detailed justifications for that assumption.

Thanks for writing the post.

I know the sequence is about criticisms of labs, but I personally think I would get more value if the post focused mainly on describing what the lab is doing, with less evaluation of the organization, because readers can form their own opinions given an informative description. To use more technical language, I would be more interested in a descriptive post than a normative one.

My high-level opinion is that the post is somewhat more negative than I would like. My general view of Conjecture is that it's one of the few AI safety labs established outside of the Bay Area and the US.

As a result, Conjecture seems to have significantly boosted London as an AI safety hub, which is extremely valuable because London is much more accessible than the Bay Area for Europeans interested in AI safety.

Our impression is that it's easy to use but no more powerful than existing open-source models like Whisper, although we are not aware of any detailed empirical evaluation.

Nitpick: I haven't tried it, but per <https://platform.conjecture.dev/>, it seems like it has diarization, i.e., the ability to distinguish between different speakers. Whisper doesn't have this built-in, and from what I recall, getting this to work with external libraries was extremely annoying.

> The piece relies heavily on criticism of Connor, Conjecture CEO, but does not attempt to provide a balanced assessment: there are no positive comments written about Connor along with the critiques

Could you elaborate more on your answer to this? I'm left a bit uncertain about whether there are redeeming aspects of Connor/Conjecture. For example, I might be inclined to count switching from capabilities to alignment as worth a lot of brownie points.

8
Omega
11mo
Hi Nuno, thanks for the question. Not sure if I am fully answering it, so feel free to let us know if this doesn't address your point.

We added this paragraph as a result of Conjecture's feedback, which accounts for his change of mind:

We also modified this paragraph:

We haven't attempted to quantify or sum up the positives and negatives in a more quantitative way that might make it easier to judge our perception of this. On net we are still fairly concerned. Some things that might update us positively in favor of Connor are:

* Seeing a solid plan for balancing profit and safety motives
* Hearing about significant improvements in Conjecture's and Connor's representation of themselves to external parties
* Connor acknowledging his role in the formation of Stability AI and its contribution to race dynamics
1
NunoSempere
11mo
Cheers

This post claims, in two places, that Conjecture staff have been known to react defensively to negative feedback. When I first read the post, I was skeptical about this point i) being true, and ii) pointing to something substantial, if true. However, now that I've seen janus's response to this post over at LessWrong, I'm less skeptical. (Context: janus are former Conjecture staff.)

9
Linch
10mo
I also consider Conjecture's official reply to be rather defensive, but I guess it could just be cultural differences.

This currently has +154 karma on EA Forum and only +24 on LW, with similar exposure on each site, so I think it's fair to say that the reception is positive here and negative on LW. Maybe it's worth thinking about why.

4
Will Aldred
10mo
(There's a thread with some discussion on this higher up.)

We think there is a reasonable risk that Connor and Conjecture’s outreach to policymakers and media is alarmist and may decrease the credibility of x-risk.


One question that seems worth considering in this context: couldn't a more "alarmist" tone in Conjecture/Connor's outreach make more moderate calls for regulation sound more reasonable by comparison?
This is arguably what happened with Linus Pauling during the nuclear non-proliferation activism of the 1950s and '60s.

(Pauling sued the Eisenhower administration for riskin... (read more)

1
Arthur Conmy
10mo
On a macro level, you could consider extreme AI safety asks followed by moderate asks to be an example of the door-in-the-face technique (which has a psychological basis and seems to have replicated).

Thanks for this analysis!

I remember being surprised by the number of times Connor Leahy (Conjecture's CEO) used the word "stupid" in this episode of the FLI podcast: 42 times (search for "stupid") in 1 h 5 min 5 s, i.e. once every 93.0 s (= (60^2 + 5*60 + 5)/42). I do not know whether this says something about Connor's character, but it jumped out at me. Here are the first 13 occurrences of "stupid" (emphasis mine):

bad people most bad people are like shockingly stupid like truly genuinely

shockingly unintelligent especially like terrorists they're like and stu

... (read more)
-19
Tomas B.
10mo