I want to publicly state my intentions going into my CEEALAR stay, to reflect on later and observe any drift that occurs for better or worse. I want to be accountable to the community, since in some sense I'm an expenditure of community resources. If you have any feedback or advice, please drop it below! 

I'm admitted to CEEALAR from January to July 2021 (six months is the maximum tourist-visa stay for a US citizen).

weights to my time

I'm forecasting the following weights assigned to my time.

  1. 60% on writing Coq until I complete either 4 or 5 volumes of Software Foundations. I completed volume 1 last year, and I might skip volume 4. (For a sense of what this work looks like, see the sketch just after this list.)
  2. 15% Haskell/TypeScript projects. (If I go back to industry after this, I'd like it to be a Haskell job; moreover, I want to get better at web dev, since I dumped a lot of time into ML and can barely hold a website together.)
  3. 5% remote college work from back home: a physics course and tutoring for the math department. (The course is pure credentialism, i.e. I'm not actually learning the material, just optimizing for the A, while the tutoring is to deepen my mastery of calc 3 and graph theory.)
  4. 5% writing nonfiction on the following subjects:
    1. infodemics (including organizing a panel for EA Philly and then producing transcripts for the EA Forum)
    2. my thinking about freedom/autonomy and how it bears on moral patienthood and AI safety (a response to When is Unaligned AI Morally Valuable, a response to Fun Theory, etc.)
    3. the situation of proof assistants & formal verification in AI safety as I forecast it (tentatively called "Going Long on Formal Verification")
  5. 5% producing and hosting the Technical AI Safety Podcast.
  6. 5% commitments to projects I was doing anyway, like this weekly Twitch stream and this group, as well as attending lectures and community events.
  7. 5% applying to colleges. (I have a 3.9/4.0 community college GPA, and I want to do elite undergrad when I get back to the States.)
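For a sense of what the Software Foundations work looks like, here is a minimal sketch of my own (not an excerpt from the book) in the spirit of a Logical Foundations exercise: a proof by induction over the naturals, machine-checked by Coq.

```coq
(* Toy theorem in the style of a Logical Foundations exercise:
   adding zero on the right is the identity. The name and statement
   are my own illustration, not taken from the book. *)
Theorem add_0_r_sketch : forall n : nat, n + 0 = n.
Proof.
  intros n. induction n as [| n' IHn'].
  - (* n = 0 *)    reflexivity.
  - (* n = S n' *) simpl. rewrite IHn'. reflexivity.
Qed.
```

The later volumes scale this style of reasoning up from arithmetic to programming-language metatheory and verified programs.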

After I complete Software Foundations, I'll either 1. pivot to replicating ML papers (right now I'm especially interested in this, soon to be covered in TASP episode 1, but as I cover more papers for TASP I might discover something else I'd like to replicate), or 2. roll my first proof assistant.

After the last college deadline passes, I'll reapportion that time to something else, to be determined.

possible disruption to these weights

  1. I get admitted to AI Safety Camp
  2. I get the CHAI internship
  3. I boost the podcast above 1-2 episodes per month, or I'm simply wrong about how long 1-2 episodes per month takes.

In each of these cases I think I'll reduce the time I spend on Software Foundations, even if it means not getting to the end of the book before I have to go back to the States.

how formal am I being with time management? 

I haven't decided yet whether I want the cognitive overhead of a timekeeping app, clocking in and out for each project, etc., but the data could be valuable. You might ask, "What's the point of declaring weights if you won't be keeping score numerically?", which is a reasonable question. Is it an awful answer to say that I'm just using my intuition to approximate the weights?

I do have formal task management in Jira: multiple kanban boards collated into a single todo list with due dates. I think this is the right step, as it has negligible cognitive overhead.

hours in absolute terms

A lot of this work is intellectually demanding. I think 40 hours/week should be fine, and I should cut myself off around 50/week to avoid burnout. One complete day off per week to sit around and recharge seems like the right move. I'm open to being wrong; I'm open to finding that 30-35 hours per week is where I'm most effective.

what could go horribly wrong? 

I could be drastically miscalculating my motivation, interests, and stamina. I don't know what to do to guard against this besides

  1. revising the weights only if I'm faced with drastic motivation/interests misalignment.
  2. going easy on myself in hours per week terms if I'm facing stamina droughts.
  3. remembering to exercise and take my meds.
  4. resting a lot right now during my last month in the States before I leave, because I want to be refreshed and ready to go when I arrive in Blackpool.

what are my blindspots? 

can anybody identify any? 

why I'm doing this

my vision and ambitions: 

  • 0-3 years after leaving CEEALAR: enough mastery over formalization & type theory that I can automate some agent foundations results and/or some of Vanessa Kosoy's research agenda. Professional and academic interests will be competing for time: professionally, I could reach the Haskell skill level of a MIRI engineer, or academically, I could complete an undergrad and position myself for grad school. 
  • 3-7 years after leaving CEEALAR: I'm designing new proof assistants for ML systems. Optimistically, if folks like CHAI have discovered sets of proof obligations by then, I'll be able to implement tools that make satisfying those obligations commercially viable.

Comments

I also completed Software Foundations Volume 1 last year, and have been kind of meaning to do the rest of the volumes but other things keep coming up. I'm working full-time so it might be beyond my time/energy constraints to keep a reasonable pace, but would you be interested in any kind of accountability buddy / sharing notes / etc. kind of thing?

Maybe! I'm only aiming for a steady stream of 2-3 chapters per week. Be in touch if you're interested: I'm re-reading the first quarter of PLF, since a new version was published in the time since I knocked it out.

Thanks for sharing that; it sounds like an interesting plan.

A while ago I was trying to think about potential ways to have a large impact via formal verification (after reading this post). I didn't give it much attention, but it looks like neither others nor I see a case for this career path being highly impactful, though I'd love to be proven wrong. I would appreciate it if you could elaborate on your perspective on this. I should probably mention that I couldn't find a reference to formal verification in agent foundations (but I didn't really read it), and Vanessa seemed to reference it as a tangential point, but I might be wrong about both.

I'm interested in formal verification from a purely mathematical point of view. That is, I think it's important for math (but I don't think that formalizing [mainstream] math is likely to be very impactful outside of math). Additionally, I am interested in ideas developed in homotopy type theory, because of their connections to homotopy theory, rather than because I think it is impactful.

Thanks for the comment. I wasn't aware of your and Rohin's discussion on Arden's post. Did you flesh out the inductive alignment idea on LW or the Alignment Forum? It seems really promising to me.

Today I want to jot down notes more substantive than "wait until I post 'Going Long on FV' in a few months".

FV in AI Safety in particular

As Rohin's comment suggests, both aiming proofs about properties of models at today's type theories and aiming tomorrow's type theories at ML face two classes of obstacles: 1. is it possible? 2. can it be made competitive?

I've gathered that there's a lot of pessimism about 1, in spite of MIRI's investment in type theory and in spite of the word "provably" in CHAI's charter. My personal expected path to impact as it concerns 1. is "wait until theorists smarter than me figure it out", and I want to position myself to worry about 2.
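To make obstacle 1 concrete at a toy scale, here is what a machine-checked proof obligation about a small model component can look like in today's Coq. This is entirely my own illustration; the clip function and its bound property are assumptions of the sketch, not anything drawn from MIRI's or CHAI's work.

```coq
(* Toy sketch of a "proof obligation" discharged in a present-day
   proof assistant: a clipping function whose output provably never
   exceeds its bound. Illustrative only. *)
Require Import Coq.Arith.PeanoNat.

Definition clip (bound x : nat) : nat := Nat.min x bound.

Theorem clip_bounded : forall bound x, clip bound x <= bound.
Proof.
  intros bound x. unfold clip. apply Nat.le_min_r.
Qed.
```

The open questions are exactly 1 and 2 above: whether the properties safety researchers actually care about can be stated and proved like this at all, and whether doing so can be made cheap enough to be competitive.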

I think there's a distinction between theories and products, and I think programmers need to be prepared to commercialize results. There's a fundamental question: should we expect that a theory's competitiveness can be improved by one or more orders of magnitude through engineering effort, or will engineering effort only provide improvements of less than an order of magnitude? I think a lot depends on how you feel about this.

Asya: 

While I agree that proof assistants right now are much slower than doing math proofs yourself, verification is a pretty immature field. I can imagine them becoming a lot better such that they do actually become better to use than doing math proofs yourself, and don't think this would be the worst thing to invest in.

Asya may not have been speaking about AI safety here, but my basic thinking is that if less primitive proof assistants end up drastically more competitive, and at the same time there are opportunities to convert results in verified ML into tooling, expertise in this area could gain a lot of leverage.

FV in other paths to impact

Rohin: 

it is plausibly still worthwhile becoming an expert on formal verification because of the potential applications to cybersecurity. (Though it seems like in that case you should just become an expert on cybersecurity.)

It's not clear to me that grinding FV directly is as wise as, say, CompTIA certifications. From the expectation that FV pays dividends in advanced cybersec, we cannot conclude that FV is relevant to early stages of a cybersec path. 

Related: Information security careers for GCR reduction. I think the software safety standards in a wide variety of fields have a lot of leverage over outcomes.

I was speaking about AI safety! To clarify, I meant that investments in formal verification work could in part be used to develop those less primitive proof assistants.
