All of jacquesthibs's Comments + Replies

So, I don't think it's accurate to say EAs have made absolutely no effort on this front.

Thanks for the comment. I’m aware of the situations you mentioned and did not say that EA had not previously put effort into things. In fact, my question is essentially “Has EA given up on politics (perhaps because things went poorly before)?”

Also, note that I am not exactly suggesting pushing for left-wing things. Generally, remedying the situation may need to go beyond trying to get one person into elected office. In fact, I think such a bet would be unambitious and fail to meet the moment.

1
John Huang
My interest is in transformative reforms and the question of "how could we do politics better?" How could decision-making be improved? Politics is not just about funding some electoral candidate.

In my opinion, and I think this has been well known for decades, electoral democracy has woeful limitations on good decision-making. This opinion is shared by numerous political scientists. Since the 1950s, Downs's paradox has suggested that it has never been rational, in a self-interested sense, to vote. The following 60 years of research have also demonstrated the woeful incompetence of voters. This latest cycle demonstrates it too: Donald Trump was elected and his supporters were surprised by his policies, with many immigrant and Gen Z voters quickly supporting, and then opposing, Trump. Anybody "in the know" understood what a second Trump term was going to be about. Any surprise is a result of incompetent decision-making.

The status quo of EA seems to be trying to get in on the rat race of electoral politics, to persuade these ignorant voters. Yet propaganda has never been the strength of EA. I therefore claim that most potential future campaigning efforts will have mediocre results. The typical fascist is always going to have an easier time. It's easy to use a tried and true tactic: scapegoat a minority (i.e. trans people and immigrants), blame them for all our problems, and use them as a vehicle to take power. Fascism takes advantage of our tribal instincts, whereas something like EA demands a rationality that is too expensive to transmit through mass propaganda.

Is there something better out there? In my opinion yes, and it's called sortition. The premise is simple. Instead of demanding everyone participate in politics, you draw a random sample. With fewer participants, you can now focus resources on the sample. Imagine you want to select a president or some other leadership role for government. You could use sortition to construct

We’re working on providing clarity on:

“What AI safety research agendas could be massively sped up by AI agents? What properties do they have (e.g. easily checkable, engineering > conceptual ...)?”

I’ll strongly consider putting out a post with a detailed breakdown and notes on when we think it’ll be possible. We’re starting to run experiments that will hopefully inform things as well.

4
calebp
Do you have a list of research questions that you think could easily be sped up with AI systems? I suspect that I'm more pessimistic than you are due to concerns around scheming AI agents doing intentional research sabotage, but I do think that the affordances of AI agents might make some currently intractable agendas more tractable.

Just a quick note, I completely understand where you guys are coming from and just wanted to share the information. This wasn’t intended as a call-out or anything. I trust you guys and appreciate the work you do!

Ok, but the message I received was specifically saying you can’t fund for-profits and that we can re-apply as a non-profit:

"We rejected this on the grounds that we can't fund for-profits. If you reorganize as a non-profit, you can reapply to the LTFF in an future funding round, as this would change the application too significantly for us to evaluate it in this funding round.
Generally, we think it's good when people run for-profits, and other grant makers can fund them."

We will reconsider going the for-profit route in the future (something we’ve thought a lot about), but for now have gotten funding elsewhere as a non-profit to survive for the next 6 months.

4
calebp
Sorry, I agree this message is somewhat misleading - I'll ask our ops team to review this.

In case this is useful to anyone in the future: LTFF does not provide funding for for-profit organizations. I wasn't able to find mentions of this online, so I figured I should share.

I was made aware of this after being rejected today for applying to LTFF as a for-profit. We updated them 2 weeks ago on our transition into a non-profit, but it was unfortunately too late, and we'll need to send a new non-profit application in the next funding round.

2
calebp
Thanks. We should probably try to display this on our website properly. We have been able to fund for-profits in the past, but it is pretty difficult. I don't think the only reason we passed on your application was that it's for-profit, but that did make our bar much higher (this is a consequence of US/UK charity law and isn't a reflection on the impact of non-profits/for-profits). By the way, I personally think that your project should probably be a for-profit, as it will be easier to raise funding, users will hold you to higher standards, and your team seems quite value-aligned.

We put out a proposal for automating AI safety research on Manifund. We got our first $10k. I figured I'd share it here in case you or someone you know would like to fund our work! Thanks!

Coordinal Research: Accelerating the research of safely deploying AI systems.

Project summary

What are this project's goals? How will you achieve them?

Coordinal Research (formerly Lyra Research, merging with Vectis AI) wants to accelerate the research of safe and aligned AI systems. We're complementing existing research in these directions through two key approaches... (read more)

Are you or someone you know:

1) great at building (software) companies
2) deeply invested in AI safety
3) open to talking about an opportunity to work together on something

If so, please DM me with your background. If someone comes to mind, also DM me. I am thinking about ways to build companies that can fund AI safety work.

Yeah, apologies; I thought I had noted that, but I only mentioned the iOS app. There are a few that exist, but I think the ones I've seen are only Mac-compatible at the moment, unfortunately. There has to be a Windows or Linux one...

I’m still getting the hang of it, but I’ve primarily been using it when I want to brainstorm project ideas that I can later pass off to an LLM as context on what I’m working on, or when I want to reflect on a previous meeting. I’ll probably turn it on about once per week while I’m walking to work and ramble about a project in case I think of something good. (I also sometimes use it to explain the project spec or small adjustments I want my AI coding assistant to make.)

Sometimes I’ll use the Advanced Voice Mode or normal voice mode from ChatGPT ... (read more)

1
CB🔸
I tried to get superwhisper, but it's only available on Mac and iPhone, so in the end I couldn't use it (I'll look into other speech-to-text tools)
1
CB🔸
Ok, interesting. Do you think that if I were to use it, I would save significant time writing reports, social media posts, or even EA Forum comments?

Yeah, I think most of the gains we've gotten from AI have been in coding and learning. Many of the big promises have yet to be met; it's definitely still a struggle to get it to work well for writing (in the style we'd want it to write) or to get AI agents working reliably, which limits the useful applications.

I quickly wrote up some rough project ideas for ARENA and LASR participants, so I figured I'd share them here as well. I am happy to discuss these ideas and potentially collaborate on some of them.

Alignment Project Ideas (Oct 2, 2024)

1. Improving "A Multimodal Automated Interpretability Agent" (MAIA)

Overview

MAIA (Multimodal Automated Interpretability Agent) is a system designed to help users understand AI models by combining human-like experimentation flexibility with automated scalability. It answers user queries about AI system components by iteratively ... (read more)

I just saw this; thanks for sharing! Yup, some of these should be able to be solved quickly with LLMs.

I'm exploring the possibility of building an alignment research organization focused on augmenting alignment researchers and progressively automating alignment research (yes, I have thought deeply about differential progress and other concerns). I intend to seek funding in the next few months, and I'd like to chat with people interested in this kind of work, especially great research engineers and full-stack engineers who might want to cofound such an organization. If you or anyone you know might want to chat, let me know! Send me a DM, and I can send you ... (read more)

Hey everyone, in collaboration with Apart Research, I'm helping organize a hackathon this weekend to build tools for accelerating alignment research. This hackathon is very much related to my effort in building an "Alignment Research Assistant."

Here's the announcement post:

2 days until we revolutionize AI alignment research at the Research Augmentation Hackathon!

As AI safety researchers, we pour countless hours into crucial work. It's time we built tools to accelerate our efforts! Join us in creating AI assistants that could supercharge the very research w... (read more)

4
Yonatan Cale
Hey :) Looking at some of the engineering projects (which are closest to my field): I'm guessing Claude 3.5 Sonnet could do these things, probably using one prompt for each (or perhaps even all at once). Consider trying, if you haven't yet; you might not need any humans for this. Or if you already did, then oops and never mind! Thanks for saving the world!

We're doing a hackathon with Apart Research on the 26th. I created a list of problem statements for people to brainstorm off of.

Pro-active insight extraction from new research

Reading papers can take a long time and is often not worthwhile. As a result, researchers might read too many papers or almost none. However, there are still valuable nuggets in papers and posts. The issue is finding them. So, how might we design an AI research assistant that proactively looks at new papers (and old) and shares valuable information with researchers in a naturally consumab... (read more)
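To make the "proactive insight extraction" idea concrete, here's a minimal sketch of the kind of pipeline I have in mind. It assumes the arXiv Atom API plus the feedparser and openai packages, and the model name, query, and researcher-interest profile are placeholder assumptions, so treat it as illustrative rather than a working product.

```python
# Rough sketch (not production code): poll arXiv for new papers and have an
# LLM pull out only the nuggets relevant to a researcher's stated interests.
# Assumes OPENAI_API_KEY is set; query, model name, and interests are placeholders.
import feedparser
from openai import OpenAI

ARXIV_QUERY = (
    "http://export.arxiv.org/api/query?"
    "search_query=cat:cs.LG+AND+all:interpretability&sortBy=submittedDate&max_results=5"
)
RESEARCHER_INTERESTS = "scalable oversight, evals, automated interpretability"

client = OpenAI()

def digest_new_papers() -> list[str]:
    feed = feedparser.parse(ARXIV_QUERY)
    digests = []
    for entry in feed.entries:
        prompt = (
            f"Researcher interests: {RESEARCHER_INTERESTS}\n\n"
            f"Paper title: {entry.title}\n"
            f"Abstract: {entry.summary}\n\n"
            "In 2-3 bullet points, list only the findings this researcher would "
            "care about. If nothing is relevant, reply with 'skip'."
        )
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": prompt}],
        )
        note = response.choices[0].message.content.strip()
        if note.lower() != "skip":
            digests.append(f"{entry.title}\n{note}")
    return digests

if __name__ == "__main__":
    for digest in digest_new_papers():
        print(digest, "\n")
```

The interesting design questions are all in the filtering and delivery (how to learn each researcher's profile, and how to surface digests without adding noise), which this sketch deliberately leaves out.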

As an update to the Alignment Research Assistant I'm building, here is a set of shovel-ready tasks I would like people to contribute to (please DM if you'd like to contribute!):

Core Features

1. Set up the Continue extension for research: https://www.continue.dev/

  • Design prompts in Continue that are suitable for a variety of alignment research tasks and make it easy to switch between these prompts
  • Figure out how to scaffold LLMs with Continue (instead of just prompting one LLM with additional context); a rough sketch of the general idea follows after this list
    • Can include agents, search, and more
  • Test out models to
... (read more)
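As a starting point for the scaffolding bullet above, here's a toy draft-critique-revise sketch of what "scaffolding instead of a single prompt" could look like. It ignores Continue's own wiring entirely; the model name and the way project context gets loaded are placeholder assumptions.

```python
# Toy illustration of scaffolding: one model call drafts an answer from project
# context, a second call critiques it, and a third revises. All names here are
# placeholders, not Continue's API.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o-mini"

def ask(prompt: str) -> str:
    response = client.chat.completions.create(
        model=MODEL, messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content

def scaffolded_answer(question: str, project_context: str) -> str:
    draft = ask(f"Project context:\n{project_context}\n\nQuestion: {question}\nAnswer:")
    critique = ask(f"Question: {question}\nDraft answer:\n{draft}\n\nList any mistakes or gaps.")
    return ask(
        f"Question: {question}\nDraft:\n{draft}\nCritique:\n{critique}\n\n"
        "Rewrite the draft, fixing the issues raised."
    )
```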
4
jacquesthibs
We're doing a hackathon with Apart Research on the 26th. I created a list of problem statements for people to brainstorm off of.

Pro-active insight extraction from new research

Reading papers can take a long time and is often not worthwhile. As a result, researchers might read too many papers or almost none. However, there are still valuable nuggets in papers and posts. The issue is finding them. So, how might we design an AI research assistant that proactively looks at new papers (and old) and shares valuable information with researchers in a naturally consumable way? Part of this work involves presenting individual researchers with what they would personally find valuable and not overwhelming them with things they are less interested in.

How can we improve the LLM experience for researchers?

Many alignment researchers will use language models much less than they would like to because they don't know how to prompt the models, it takes time to create a valuable prompt, the model doesn't have enough context for their project, the model is not up-to-date on the latest techniques, etc. How might we make LLMs more useful for researchers by relieving them of those bottlenecks?

Simple experiments can be done quickly, but turning them into a full project can take a lot of time

One key bottleneck for alignment research is transitioning from an initial 24-hour simple experiment in a notebook to a set of complete experiments tested with different models, datasets, interventions, etc. How can we help researchers move through that second research phase much faster?

How might we use AI agents to automate alignment research?

As AI agents become more capable, we can use them to automate parts of alignment research. The paper "A Multimodal Automated Interpretability Agent" serves as an initial attempt at this. How might we use AI agents to help either speed up alignment research or unlock paths that were previously inaccessible?

How can we nudge research toward better objectives (age

I've created a private discord server to discuss this work. If you'd like to contribute to this project (or might want to in the future if you see a feature you'd like to contribute to) or if you are an alignment/governance researcher who would like to be a beta user so we can iterate faster, please DM me for a link!

Yes, I’ve talked to them a few times in the last 2 years!

Hey everyone, my name is Jacques, I'm an independent technical alignment researcher (primarily focused on evaluations, interpretability, and scalable oversight). I'm now focusing more of my attention on building an Alignment Research Assistant. I'm looking for people who would like to contribute to the project. This project will be private unless I say otherwise.

Side note: I helped build the Alignment Research Dataset ~2 years ago. It has been used at OpenAI (by someone on the alignment team), (as far as I know) at Anthropic for evals, and is now used as t... (read more)

6
jacquesthibs
As an update to the Alignment Research Assistant I'm building, here is a set of shovel-ready tasks I would like people to contribute to (please DM if you'd like to contribute!):

Core Features

1. Set up the Continue extension for research: https://www.continue.dev/
  • Design prompts in Continue that are suitable for a variety of alignment research tasks and make it easy to switch between these prompts
  • Figure out how to scaffold LLMs with Continue (instead of just prompting one LLM with additional context)
    • Can include agents, search, and more
  • Test out models to quickly help with paper-writing
2. Data sourcing and management
  • Integrate with the Alignment Research Dataset (pulling from either the SQL database or Pinecone vector database): https://github.com/StampyAI/alignment-research-dataset
  • Integrate with other apps (Google Docs, Obsidian, Roam Research, Twitter, LessWrong)
  • Make it easy to view and edit long prompts for project context
3. Extract answers to questions across multiple papers/posts (feeds into Continue)
  • Develop high-quality chunking and scaffolding techniques
  • Implement multi-step interaction between researcher and LLM
4. Design Autoprompts for alignment research
  • Creates lengthy, high-quality prompts for researchers that get better responses from LLMs
5. Simulated Paper Reviewer
  • Fine-tune or prompt an LLM to behave like an academic reviewer
  • Use OpenReview data for training
6. Jargon and Prerequisite Explainer
  • Design a sidebar feature to extract and explain important jargon
  • Could maybe integrate with some interface similar to https://delve.a9.io/
7. Set up an automated "suggestion-LLM"
  • An LLM periodically looks through the project you are working on and tries to suggest *actually useful* things in the side-chat. It will be a delicate balance to make sure not to share too much and cause a loss of focus. This could be custom for the researcher, with an option only to give automated suggestions post-research
2
jacquesthibs
I've created a private discord server to discuss this work. If you'd like to contribute to this project (or might want to in the future if you see a feature you'd like to contribute to) or if you are an alignment/governance researcher who would like to be a beta user so we can iterate faster, please DM me for a link!
1
defun 🔸
Have you talked with someone from Ought/Elicit? It seems like they should be able to give you useful feedback.

GPT-2 was trained in 2019 with an estimated 4e21 FLOP, and GPT-4 was trained in 2023 with an estimated 8e24 to 4e25 FLOP.

Correction: GPT-2 was trained in 2018 but partially released in February 2019. Similarly, GPT-4 was trained in 2022 but released in 2023.
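For a rough sense of scale (taking the quoted estimates at face value), that range corresponds to roughly a 2,000x increase in training compute at the low end (8e24 / 4e21 = 2,000) and about 10,000x at the high end (4e25 / 4e21 = 10,000) over those four to five years.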

1
OscarD🔸
Thanks, fixed. I was basing this off of Table 1 (page 20) in the original but I suppose Leopold meant the release year there.

For instance (and to their credit), OpenAI has already committed 20% of their compute secured to date to solving the problem of aligning superintelligent AI systems.

lol


I'm currently trying to think of project/startup ideas in the space of d/acc. If anyone would like to discuss ideas on how to do this kind of work outside of AGI labs, send me a DM.

Note that Entrepreneurship First will be running a cohort of new founders focused on d/acc for AI.

I shared the following as a bio for EAG Bay Area 2024. I'm sharing this here if it reaches someone who wants to chat or collaborate.

Hey! I'm Jacques. I'm an independent technical alignment researcher with a background in physics and experience in government (social innovation, strategic foresight, mental health and energy regulation). Link to Swapcard profile. Twitter/X.

CURRENT WORK

  • Collaborating with Quintin Pope on our Supervising AIs Improving AIs agenda (making automated AI science safe and controllable). The current project involves a new method allowi
... (read more)

Another data point: I got my start in alignment through the AISC. I had just left my job, so I spent 4 months skilling up and working hard on my AISC project. I started hanging out on EleutherAI because my mentors spent a lot of time there. This led me to do AGISF in parallel.

After those 4 months, I attended MATS 2.0 and 2.1. I've been doing independent research for ~1 year and have about 8.5 more months of funding left.

3
Remmelt
I did not know this. Thank you for sharing all the details! It's interesting to read about the path you went through: AISC --> EleutherAI --> AGISF --> MATS 2.0 and 2.1 --> Independent research grant. I'll add it as an individual anecdote to our sheet.

More information about the alleged manipulative behaviour of Sam Altman

Source

Update, board members seem to be holding their ground more than expected in this tight situation:

My current speculation as to what is happening at OpenAI

How do we know this wasn't their best opportunity to strike if Sam was indeed not being totally honest with the board?

Let's say the rumours are true, that Sam is building out external orgs (NVIDIA competitor and iPhone-like competitor) to escape the power of the board and potentially go against the charter. Would this 'conflict of interest' be enough? If you take that story forward, it sounds more and more like he was setting up AGI to be run by external companies, using OpenAI as a fundraising bargai... (read more)

2
jacquesthibs
Update, board members seem to be holding their ground more than expected in this tight situation:

Quillette founder seems to be planning to write an article regarding EA's impact on tech:

"If anyone with insider knowledge wants to write about the impact of Effective Altruism in the technology industry please get in touch with me claire@quillette.com. We pay our writers and can protect authors' anonymity if desired."

It would probably be impactful if someone in the know provided a counterbalance to whoever will undoubtedly email her to disparage EA with half-truths/lies.

To share another perspective: As an independent alignment researcher, I also feel really conflicted. I could be making several multiples of my salary if my focus was to get a role on an alignment team at an AGI lab. My other option would be building startups trying to hit it big and providing more funding to what I think is needed.

Like, I could say, "well, I'm already working directly on something and taking a big pay-cut so I shouldn't need to donate close to 10%", but something about that doesn't feel right... But then to counter-balance that, I'm constantly worried that I just won't get funding anymore at some point and would be in need of money to pay for expenses during a transition.

4
calebp
Fwiw my personal take (and this is not in my capacity as a grantmaker) is that building up your runway seems really important, and I personally think that it should be a higher priority than donating 10%. My guess is that GWWC would suggest dropping your commitment to say 2% as a temporary measure while you build up your savings.

I've also started working on a repo in order to make Community Notes more efficient by using LLMs.

Don't forget that we train language models on the internet! The more truthful your dataset is, the more truthful the models will be! Let's revamp the internet for truthfulness, and we'll subsequently improve truthfulness in our AI systems!!

I shared a tweet about it here: https://x.com/JacquesThibs/status/1724492016254341208?s=20

Consider liking and retweeting it if you think this is impactful. I'd like it to get into the hands of the right people.

If you work at a social media website or YouTube (or know anyone who does), please read the text below:

Community Notes is one of the best features to come to social media apps in a long time. The code is even open source. Why haven't other social media websites picked it up yet? If they care about truth, this would be a considerable step forward; notes like “this video is funded by x nation” or “this video talks about health info; go here to learn more” are simply not good enough.

If you work at companies like YouTube or know someone who... (read more)

2
jacquesthibs
I've also started working on a repo in order to make Community Notes more efficient by using LLMs.
5
Ian Turner
One may infer that they do not care about truth, at least not relative to other considerations.
2
jacquesthibs
Don't forget that we train language models on the internet! The more truthful your dataset is, the more truthful the models will be! Let's revamp the internet for truthfulness, and we'll subsequently improve truthfulness in our AI systems!!
2
jacquesthibs
I shared a tweet about it here: https://x.com/JacquesThibs/status/1724492016254341208?s=20 Consider liking and retweeting it if you think this is impactful. I'd like it to get into the hands of the right people.

Attempt to explain why I think AI systems are not the same thing as a library card when it comes to bio-risk.

To focus on a less extreme example, I’ll be ignoring the case where AI can create new, more powerful pathogens faster than we can create defences, though I think this is an important case (some people just don’t find it plausible because it relies on the assumption that AIs will be able to create new knowledge).

I think AI Safety people should make more of an effort to walk through the threat model, so I’ll give a quick first try:

1) Library.... (read more)

I'm working on an ultimate doc on productivity that I plan to share and make easy to use, specifically for alignment researchers.

Let me know if you have any comments or suggestions as I work on it.

Roam Research link for easier time reading.

Google Docs link in case you want to leave comments there.

From what I understand, Amazon does not get a board seat for this investment. Figured that should be highlighted. Seems like Amazon just gets to use Anthropic’s models and maybe make back their investment later on. Am I understanding this correctly? 

As part of the investment, Amazon will take a minority stake in Anthropic. Our corporate governance structure remains unchanged, with the Long Term Benefit Trust continuing to guide Anthropic in accordance with our Responsible Scaling Policy. As outlined in this policy, we will conduct pre-deployment tests

... (read more)

I would, however, not downplay their talent density.

1
Zhijing Jin
Good idea! Just made the other post to reach a wider audience!

Thanks for sharing. I think the above are examples of things people often don't think of when trying new ways to be more productive. Instead, the default is trying out new productivity tools and systems (which might also help!). Environment and being in a flux period can totally change your behaviour in the long term; sometimes, it's the only way to create lasting change.

Answer by jacquesthibs

When I was first looking into being veg^n, I became irritated by the inflated reviews at veg^n restaurants. It didn’t take me long to apply a veg^n tax; I started to assume a restaurant’s food was one star below its average rating. It made me more distrustful of veg^ns too.

I think using virtue ethics is the right call here: just be truthful.

Is someone planning on doing an overview post of all the AI Pause discussion? I’m guessing some people would appreciate it if someone took the time to make an unbiased synthesis of the posts and discussions.

3
Will Aldred
According to the debate week announcement, Scott Alexander will be writing a summary/conclusion post.

Are you or any other EA lawyer still doing this?

Either way, I’m seeking advice to figure out how I can save money on taxes once I move to the UK (I’m from Canada) and receive funding for my independent AI Safety research. I’ll be going to the UK on a Youth Mobility visa. I’m wondering if it’s possible for me to set up something so that I can save tax on ‘business’ expenses (office space, laptop, monitor, etc.).

I’m happy to pay if someone can help with this (otherwise I will reach out to non-EA lawyers).

Would newer people find it valuable to have some kind of 80,000 hours career chatbot that had access to the career guide, podcast notes, EA forum posts, job postings, etc, and then answered career questions? I’m curious if it could be designed to be better than just a raw read of the career guide or at least a useful add-on to the career guide.

Potential features:

  • It could collect your conversation and convert most of it into an application for a (human) 1-on-1 meeting.
  • You could have a speech-to-text option to ramble all the things you’ve been thinking of.
  • ???

If anyone from 80k is reading this, I’d be happy to build this as a paid project.
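For concreteness, here's a minimal sketch of the retrieval step such a chatbot would need: embed chunks of the career guide (and podcast notes, forum posts, job postings), then pull the most relevant chunks into the prompt for each question. The corpus, chunking, and model names below are placeholder assumptions, not a real design.

```python
# Sketch of retrieval-augmented answering over career-guide chunks.
# Assumes the openai and numpy packages; chunks and model names are placeholders.
import numpy as np
from openai import OpenAI

client = OpenAI()
EMBED_MODEL = "text-embedding-3-small"
CHAT_MODEL = "gpt-4o-mini"

def embed(texts: list[str]) -> np.ndarray:
    response = client.embeddings.create(model=EMBED_MODEL, input=texts)
    return np.array([item.embedding for item in response.data])

# Pretend corpus: in practice these would be chunks of the career guide,
# podcast notes, forum posts, and job postings.
chunks = [
    "Career capital matters most early in your career...",
    "Personal fit is a major multiplier on your impact...",
]
chunk_vectors = embed(chunks)

def answer(question: str, top_k: int = 2) -> str:
    q_vec = embed([question])[0]
    # Cosine similarity between the question and every chunk.
    scores = chunk_vectors @ q_vec / (
        np.linalg.norm(chunk_vectors, axis=1) * np.linalg.norm(q_vec)
    )
    context = "\n\n".join(chunks[i] for i in np.argsort(scores)[-top_k:])
    response = client.chat.completions.create(
        model=CHAT_MODEL,
        messages=[
            {"role": "system", "content": "Answer career questions using the excerpts provided."},
            {"role": "user", "content": f"Excerpts:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content
```

The speech-to-text rambling and the "convert the conversation into a 1-on-1 application" features would sit on top of this same loop.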

Would be great to have someone who is exceptional at convincing high net worth individuals to donate for specific causes. I’m sure some people in the AI Safety community would find that valuable given the large funding gap despite the exceptional amount of attention the field is receiving. I’m sure other cause areas would also find it valuable.

EDIT: I’ve gotten a few disagree-votes, which is totally fine! Though, I’m curious why some people disagree. Is it because they wouldn’t find this interesting, because they don’t think it would be appropriate for the podcast, or something else?

Thanks for all your work, JJ! Good luck with whatever you end up doing next!

(Note in case anybody else wants to pick up where AISS left off: This is a bit unfortunate for me given that not having an org to sponsor work visas in the UK might affect my decision to move there. We had talked about AISS trying to do the work to get that set up in the next 1-2 years.)

I still have plans to set up a research org in the UK

In this framework, I propose that under "amount of money to donate" we bundle considerations relating to taxes, weakness of will and uncertainty as well as financial investment.

One thing I think is missing from the "how much you should donate" section above is a discussion about what kind of job the person is doing. Should the percentage be the same for someone doing Earning to Give vs someone working on a direct cause area?

1
Denis
IMHO this is a very personal, case-by-case calculation. A person will donate what they can rather than just a fixed percentage. But this can depend on many factors, not just jobs / income, but also expenditures (do they have kids? are they paying off college loans? a mortgage? ...) and potential risks (what if they lose their job? what if one of the kids gets sick? ...). That said, I believe there is a huge opportunity to maximise the "what they can donate" with a more structured approach. Today we have a very simplistic all-or-nothing donation model. For every dollar or euro you have, you either donate it (and lose it forever) or you don't donate it at all. I believe there could also be a happy medium; I've started a draft post on that ...

I recently sent in some grant proposals to continue working on my independent alignment research. It gives an overview of what I'd like to work on for this next year (and more really). If you want to have a look at the full doc, send me a DM. If you'd like to help out through funding or contributing to the projects, please let me know.

Here's the summary introduction:

12-month salary for building a language model system for accelerating alignment research and upskilling (additional funding will be used to create an organization), and studying how to... (read more)

I gave a talk about my Accelerating Alignment with LLMs agenda about a month ago (which is basically a decade in AI-tools time). Part of the agenda is covered (publicly) here.

I will maybe write an actual post about the agenda soon, but would love to have some people who are willing to look over it. If you are interested, send me a message. I am currently applying for grants and exploring the possibility of building an org focused on speeding up this agenda, while avoiding spreading myself too thin.
