All of sbowman's Comments + Replies

Why not offer a multi-million / billion dollar prize for solving the Alignment Problem?

My colleagues have often been way too nice about reading group papers, rather than the opposite. (I’ll bet this varies a ton lab-to-lab.)

Why not offer a multi-million / billion dollar prize for solving the Alignment Problem?

I like the TruthfulQA idea/paper a lot, but I think incentivizing people to optimize against it probably wouldn't be very robust, and non-alignment-relevant ideas could wind up making a big difference. 

Just one of several issues: The authors selected questions adversarially against GPT-3—i.e., they oversampled the exact questions GPT-3 got wrong—so, simply replacing GPT-3 with something equally misaligned but different, like Gopher, should yield significantly better performance. That's really not something you want to see in an alignment benchmark.

Yeah that's a good point. Another hack would be training a model on text that specifically includes the answers to all of the TruthfulQA questions. The real goal is to build new methods and techniques that reliably improve truthfulness over a range of possible measurements. TruthfulQA is only one such measurement, and performing well on it does not guarantee a significant contribution to alignment capabilities. I'm really not sure what the unhackable goal looks like here.
Managing COVID restrictions for EA Global travel: My plans + request for other examples

Related question: EAG now requires you to have a lateral flow test result within 48h of the start of the event. Am I correct in understanding that lateral flow tests in the UK are often DIY kits, where you don't get any formal documentation of the results? If so, does anyone know what kind of documentation/evidence the EAG staff will be looking for?

3 · Amy Labenz · 10mo
Hi sbowman, thanks for asking! We are following the honor system for testing, and will not be requiring proof of the test results. When you arrive, we will ask you to confirm that you have taken one in the last 48 hours and that the result was negative. As for the tests themselves, a DIY kit would work. We describe some details in our COVID protocol:

  • The conference venue cannot facilitate onsite testing so please take a test before you arrive and stay at your accommodation if you have a positive result.
  • UK residents can order free test kits online or collect them from a local UK pharmacy. (From 4 October, you’ll need a collect code to pick up test kits from a pharmacy.)
  • Attendees arriving from abroad can order a Day 2 test kit in advance and have it delivered to their hotel address.
  • Attendees who cannot afford a test, or are unable to order one in advance, can collect a test from the EA Global team at the following times:
      • Thursday 28 October, 11:00 am - 7:00 pm: The Grubstreet Author, Milton Street, London, EC2Y 9BH
      • Friday 29 October, 3:30 pm - 5:00 pm: The Brewery, 52 Chiswell Street, London, EC1Y 4SD

Please don’t hesitate to reach out to [] with any other questions!
Managing COVID restrictions for EA Global travel: My plans + request for other examples

This is great!

  • Why the Randox test in particular?
  • Does it seem viable to use the Day 2 test as the US return test? (I'll only be there Thursday to Monday, so a test on Saturday satisfies both requirements, if there's no other catch.)
1. I used Randox because of social proof and price. Randox was listed by British Airways on this website and was the cheapest option listed there.
2. I don't recommend using Day 2 tests for the return trip. Day 2 tests have to be mailed back to the lab, so it takes some days to get results. For the return flight, you need results back before you fly, so I would worry about not getting them back in time.
2 · Charles He · 10mo
I don't know if this is right but this is what I did:

  • I ordered 2x Randox Day 2 tests. I'm staying 6 days and I will try to use one test as my Day 2 and the other as the pre-flight test to return to Canada. My guess is that the "Official Date" of the test is determined by online registration, so I can use the tests this way and take them on the required dates.
  • It seems like I can speak to others/authorities during my trip, so a mistake on the pre-flight return test is not too costly.
  • I chose Randox because of this post, because I saw it positively mentioned on some random Reddit post, and because the site looks legit, with detailed information (e.g. 1-2 pages with "common mistakes" and instructions).
sbowman's Shortform

Naïve question: What's the deal with the cheapest CO2 offset prices?

It seems, though, that the current price of credible offsets is much lower than the social cost of carbon, and possibly so low that just buying offsets starts to look competitive with GiveWell top charities.

I'm not an expert on this. (I run an offsetting program for a small organization, but that takes about 4h/year. Otherwise I don't think about this much.) I'm also not anywhere near advocating that we should sink tons of money into offsets. But this observation strikes me as unintuitive ...

3 · Neel Nanda · 1y
I am also confused about the general question, but I found this intervention interesting to think about. It seems like the legitimacy of this comes down to the elasticity of demand for coal in India (basically, if someone buys 1 ton less of coal, will someone else buy that same ton for a lower price, or will coal producers make one ton less?). I couldn't find any data on elasticity of demand for coal in India, but this paper estimates it for China as 0.3 to 0.7, which is maybe an OK proxy? And I don't know if it's reasonable to model the elasticity of demand for coal and for other fossil fuels as the same (e.g., it would be terrible if not buying 1 ton of coal reduced total coal produced by only 0.2 tons, but buying 1 ton of oil increased total oil produced by 0.8 tons). Overall it feels non-obvious to me whether it's legit, though I lean towards "probably, but about half as effective as a naive calculation suggests."
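
A rough sketch of the elasticity reasoning in the comment above, under stated assumptions. In a simple single-market supply-and-demand model, removing one ton of demand cuts equilibrium output by Es / (Es + |Ed|) tons, where Ed is the price elasticity of demand and Es is the price elasticity of supply. The 0.3 to 0.7 demand range is the China estimate cited above; the supply elasticity of 0.5 is purely an illustrative assumption (it appears in neither the comment nor the paper), and the function name is hypothetical.

```python
def net_reduction_fraction(supply_elasticity: float, demand_elasticity: float) -> float:
    """Tons of output avoided per ton of demand removed.

    Standard partial-equilibrium result for a small demand shift:
    fraction = Es / (Es + |Ed|). Assumes one competitive market and
    locally constant elasticities.
    """
    es = supply_elasticity
    ed = abs(demand_elasticity)  # accept demand elasticity with either sign
    if es < 0 or es + ed == 0:
        raise ValueError("need a non-negative supply elasticity and Es + |Ed| > 0")
    return es / (es + ed)


# Illustrative assumption only; not a figure from the comment or the cited paper.
ASSUMED_SUPPLY_ELASTICITY = 0.5

for ed in (0.3, 0.7):  # demand elasticity range cited for China
    frac = net_reduction_fraction(ASSUMED_SUPPLY_ELASTICITY, ed)
    print(f"|Ed| = {ed}: about {frac:.2f} tons avoided per ton not bought")
```

With that assumed supply elasticity, the fraction falls between roughly 0.4 and 0.65, which is in the neighborhood of the "about half as effective" guess. A different supply elasticity would move the result, so this shows the shape of the calculation and is not an estimate.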
Is shareholder activism worth it?

Update: It seems like the new VOTE ETF could be significantly cheaper and significantly higher-impact than the alternatives I'd mentioned, though still not overtly EA-oriented. Any thoughts?

My quick notes, in tweet form:

How to run a high-energy reading group

Riffing on this, there's an academic format that I've seen work well that doesn't fit too neatly into this rubric:

At each meeting, several people give 15-30m critical summaries of papers, with no expectation that the audience looks at any of the papers beforehand. If the summaries prompt anyone in the audience to express interest or ask good questions, the discussion can continue informally afterward.

This isn't optimized at all for producing new insights during the meeting, but I think it works well in areas (like much of AI) where (i) there's an extremely...

I agree, this is a common format I've experienced in academia. For what it's worth, I've found that it sometimes evolves into unnecessary criticisms of the paper, and sometimes the criticisms aren't really correct (i.e. the author isn't there to defend the method and perhaps the presenter hasn't quite understood the paper or reasoning themselves). I've started to believe that this reading group format might actually contribute to why a lot of PhD students feel so frozen/overwhelmed when writing papers of their own... they watch perfectly fine papers get ritually dunked on once a week, and then those criticisms get embedded into their inner critic and sabotage their writing progress! :-)
RyanCarey's Shortform

This is probably overstated—at most major US research universities, tenure outcomes are fairly predictable, and tenure is granted in 80-95% of cases. This obviously depends on your field and your sense of your fit with a potential tenure-track job, though.

That said, it is much easier to do research when you're at an institution that is widely considered to be competitive/credible in your field and subfield, and the set of institution...

Estimation of probabilities to get tenure track in academia: baseline and publications during the PhD.

Academic here:

  • Essentially all of these numbers vary wildly across subfields, across countries, and on other assumptions like how prestigious the labs are that you're considering. Judging based on numbers from physics, or from US PhDs overall, could leave you off by an order of magnitude or more. They also vary significantly over time.
  • The populations in PhD programs vary a lot from field to field as well, and how you fit relative to those populations will help tilt the odds. Being intrinsically motivated and a good English writer (the two things
...
Is shareholder activism worth it?

Thanks, Wayne!

This looks like a good starting point for further research, but it's hard to take much that's actionable from this without more background in finance. Is there anything you'd take away as advice to a smallish-scale individual investor?

Is shareholder activism worth it?

Thanks! This is helpful, and nudging me away from this approach.

Do you know of any good primers to get a better sense of how/when these levers get used on socially relevant issues?

The levers of corporate governance are pretty limited. The corporate form is designed to limit the extent to which minority owners of common stock can intervene in corporate operations. As a result, most proxy proposals of social concern pertain to public disclosures (e.g. of environmental impact, of lobbying expenditures, of pay equity data, etc.) or to the appointment of sympathetic board members/removal of unsympathetic board members. These are nowhere near a majority of total proxy proposals, but they're a sizable percentage of total shareholder proposals (most of which do not pass). For more detail on this, see this comment.
Is shareholder activism worth it?

Hrm, this is useful context, but I think you may be getting at a different issue. For the mutual funds that I'm looking at, they seem to be viewing shareholder activism as a potential avenue to have prosocial (ESG) impact on the companies that they invest in, such that their activism strategy likely increases fees a bit without impacting returns either way.

Ah, sorry, I must have misread your original question. Here are my top-level takes on the question you did, in fact, ask:

1) I see some of these Calvert funds have done well over the past few years, but I'm sufficiently convinced by some form of the efficient markets hypothesis to be skeptical that their above-market returns will continue to exceed their fees over the medium-to-long-term.

2) While I do think that shareholder activism through the proxy process can occasionally yield important, positive changes in the corporate world, the levers of corporate governance are limited enough that I very seriously doubt that the money you're spending on Calvert's fees is doing more good maintaining your investments in those funds than it would do if it were donated to, say, Malaria Consortium's SMC program.

3) It's important to remember the actual comparative here. It's not Calvert vs. an evil money manager; it's probably Calvert vs. BlackRock, which has been loudly pushing its portfolio companies to be more conscientious about their impact on the climate. Of course, on account of its scale, BlackRock's mutual fund and ETF offerings will be much, much cheaper than comparable funds offered by Calvert, and also on account of its scale, BlackRock controls a far larger number of shareholder votes than Calvert does.

Ultimately, you have to ask yourself: How often do BlackRock and Calvert vote in different directions on issues that I think are genuinely high-impact (assuming Calvert always casts a socially optimal vote, which I also doubt)? And: How often are Calvert's votes likely to decide those shareholder elections (when BlackRock is voting the other way)? And: How often are my Calvert fund shares likely to make the difference in whether Calvert decides those shareholder elections favorably? If you were to try to model that, I suspect you'd find that investing through Calvert isn't much at all