All of nikvetr's Comments + Replies

Careers Questions Open Thread

Working at a ritzy quant firm shouldn't impact your competitiveness for PhD programs too much (could even improve it), and if you're getting $1M+ / 5y E2G-worthy offers halfway through ugrad (and have already published!), you'll probably still be able to get comparable offers if you decide to e.g. master out. So in that regard, it probably doesn't matter too much which path you take, since neither precludes reinvention.

If it were me, I'd take the bird in hand and work in the quant role... but if I felt myself able to make more meaningful "direct" contributi... (read more)

Careers Questions Open Thread

Maybe so! Might just be that the career questions are a bit too targeted (partner also has had trouble getting advice on how best to leverage her tissue engineering / veterinary background to serve animal welfare, e.g. working directly with researchers using animal models vs. developing in vitro meat in a more wet bench role). Was just curious to get an outside view, especially from a more "value-aligned" group than might be found in your typical career center or through existing mentors etc. Thank you for your response!

Careers Questions Open Thread

I'd second the Ng Coursera course -- very straightforward and easy to follow for those lacking technical backgrounds! Which may be a plus or a minus, depending on your desired rigor.

Careers Questions Open Thread

(removed for privacy + inappropriateness)

1 Ben_West 1y: My experience with bioinformatics is almost exclusively on the industry side, and more the informatics than the bio. With that caveat, a few thoughts: My experience is that the highest earning positions are not "sexy" (in the way I think you are using the term). I recall one conference I attended in which the speaker was describing some advanced predictive algorithm, and a doctor in the back raised their hand and said "this is all nice but I can't even generate a list of my diabetic patients so could you start with that please?" This might also address your question "how easy is it to, say, break into industry data science for anthropology graduates with experience in computational stats methods development?" – I think it depends very much on what you mean by "data science". A lot of the most successful bioinformatics companies' products are quite mundane by academic standards: alerting clinicians to well-known drug-drug interactions, identifying patients based on well validated reference ranges for lab tests, etc. My impression is that getting a position at one of these places is approximately similar to getting any other programming job. If you are looking for something more academic though, the requirements are different. A problem I suspect you will run into is that methods development requires (often quite large) data sets. I get the sense from your brief bio that you aren't interested in doing any wet lab work, meaning that if you were to work on, say, cultured meat, you would need a data set from some collaborator. If I were you, I might try to resolve this first. I know GFI has an academic network you can join and you could message people there about the existence of data sets. Also, you might be interested in OpenPhil's early career GCBR funding [https://www.openphilanthropy.org/focus/global-catastrophic-risks/biosecurity/open-philanthropy-project-early-career-funding-global-catastrophic-biological-risks]. Even if you don't need funding, they might be
5 HStencil 1y: It sounds like you're doing some awesome work, and these are great questions, but I very seriously doubt you will be able to get good answers to them from anyone without domain expertise in your field, so this may not be the best place to look. I personally have some very cursory exposure to biostatistics and health data science (definitely less than you), but I imagine I have significantly more familiarity with the area, especially in the U.S., than most people on the EA Forum, and I have zero clue about the answers to your questions.
An Effective Altruist Message Test

Sure! Though unfortunately most of the stuff comes from scattered lectures, workshops, discussions, book chapters, seminars, papers, etc. But for intro multilevel Bayesian regression in R/STAN I'd say John Kruschke's "Doing Bayesian Data Analysis: A Tutorial with R, JAGS, and Stan" and Richard McElreath's "Statistical Rethinking: A Bayesian Course with Examples in R and Stan" would be really solid (Richard also has his course lectures up on youtube if you prefer that, though I found his book super readable, so much so that when I took t... (read more)

An Effective Altruist Message Test

Ah, gotcha. But re: code review, even the most beautifully constructed chains can fail, and how you specify your model can easily cause things to go kabloom even if the machine's doing everything exactly how it's supposed to. And it only takes a few minutes to drag your log files into something like Tracer and do some basic peace-of-mind checks (and others, e.g. examine bivariate posterior distributions to assess nonidentifiability wrt your demographic params). More sophisticated diagnostics are scattered across a few programs but don't take too long to run e... (read more)

0 Peter Wildeford 5y: Do you have any good textbooks or educational resources to learn these kinds of techniques?
0 Michael_S 5y: Sounds interesting. Would love to take a look when you get a chance to provide the links.
An Effective Altruist Message Test

Of course (though wheel reinvention can be super helpful educationally), but there are great free public R packages that interface to STAN (I use "rethinking" for my hierarchical Bayesian regression needs but I think RStan would work, too), so going with someone's unnamed, private code isn't necessary imo. How much did the survey cost (was it a lot longer than the included Google Doc, then? e.g. did you have screening questions to make sure people read the paragraph?). And model+mcmc specification can have lots of fiddly bits that can easily lead us astray, I'd say.

1 Michael_S 5y: Yeah, the survey was a lot longer. Typically general public surveys will cost over 10 dollars a complete, so getting 1200 cases for a survey like this can cost thousands of dollars. I agree that model specification can be tricky, which is a reason I felt it well worth it to use the proprietary software I had access to that has been thoroughly vetted and code reviewed and is used frequently to run similar analyses rather than trying to construct my own. I did not make sure people read the paragraph. I discussed the issue a bit in my discussion section, but one way a web survey might understate the effect is if people would pay closer attention and respond better to a friend delivering the message. OTOH, surveys do have some potential vulnerability to the Hawthorne effect, though that didn't seem to express itself in the donations question.
An Effective Altruist Message Test

Ah, I guess that's better than no control, and presumably paying attention to a paragraph of text doesn't make someone substantially more or less generous. Did you fit a bunch of models with different predictors and test for a sufficient improvement of fit with each? Might do to be wary of overfitting in those regards maybe... though since those aren't focal, Bayes tends to be pretty robust there, imo, so long as you used sensible priors.

"I used a multilevel model to estimate the effects among those with and without a bachelor's degree. So, the bachelo... (read more)

1 Michael_S 5y: No; I did not fit multiple models. Lasso regression was used to fit a propensity model using the predictors. Using bachelor's vs. non-bachelor's has advantages in interpretability, so I think this was the right move for my purposes. I did not spend an exorbitant amount of time investigating diagnostics, for the same reason I used a proprietary package that has been built for running these tests at a production level and has been thoroughly code reviewed. I don't think it's worth the time to construct an overly customized analysis.
An Effective Altruist Message Test

Ah, interesting! What package? I've never heard of something like that before. Usually in the cold, mechanical heart of every R package is the deep desire to be used and shared as far as possible. If it's just someone's personal interface code, why not use something more publicly available? Can you write out your basic script in pseudocode (or just math/words?)? Especially the model and MCMC specification bits?

1 Michael_S 5y: Sure, in an ideal world, software would all be free for everyone; alas, we do not live in such a world :p. I used the proprietary package because it did exactly what I needed and doesn't require writing STAN code or anything myself. I'd rather not re-invent the wheel. I felt the tradeoff of transparency for efficiency and confidence in its accuracy was worth it, especially since I wouldn't be able to share the data either way (such are the costs of getting these questions on a 1200 person survey without paying a substantial amount). But the basic model was just a multilevel binomial model predicting the dependent variable using the treatments and questions asked earlier in the survey as controls.
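
Since the exact specification was never shared, the following is only one plausible way to write down a "multilevel binomial model" of the kind described above. The grouping by bachelor's degree comes from Michael_S's description elsewhere in the thread; every symbol, the logit link, and the normal hierarchical prior are assumptions made for illustration, not the author's actual model.

```latex
% Hypothetical notation: y_i is respondent i's binary outcome, T_{it} = 1 if
% respondent i received treatment message t, x_i holds the control questions,
% and g[i] indicates the education group (bachelor's degree or not).
\begin{align*}
y_i &\sim \mathrm{Bernoulli}(p_i) \\
\operatorname{logit}(p_i) &= \alpha_{g[i]} + \sum_{t} \beta_{t,\,g[i]}\, T_{it} + x_i^{\top}\gamma \\
\beta_{t,g} &\sim \mathrm{Normal}(\mu_t, \sigma_t)
\end{align*}
```

Under this sketch, the shared prior on the group-specific treatment effects is what lets one education group's estimate borrow strength from the other.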
An Effective Altruist Message Test

Yep, and alongside it, of course, the raw data!

An Effective Altruist Message Test

Yay for Bayesian regression (binomial, I'm guessing? You re-binned your attitude and donations responses? I think an ordered logit would be more appropriate here and result in less of a loss in resolution, or even a Dirichlet, but then you'd lose yer ordering)! Those posteriors look decently tight, though I do have some questions!

I'm a little confused on what your control was, exactly. You have both points and distributions in your posterior plots, but you don't have any control paragraph blurb in your Google Doc questionnaire. How did you evaluate your con... (read more)
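
To make the ordered-logit suggestion above concrete, here is a minimal sketch of how a cumulative (ordered) logit turns one linear predictor into probabilities for every response category, and what is discarded when the scale is re-binned into a binary outcome. The 5-point scale, the cutpoints, and the effect size of 0.4 are invented for illustration and do not come from the survey.

```python
import numpy as np

def ordered_logit_probs(eta, cutpoints):
    """Category probabilities under a cumulative logit model.

    P(Y <= k) = logistic(c_k - eta) for each cutpoint c_k; the category
    probabilities are the differences between successive cumulative values.
    """
    cutpoints = np.asarray(cutpoints, dtype=float)
    cum = 1.0 / (1.0 + np.exp(-(cutpoints - eta)))
    cum = np.concatenate(([0.0], cum, [1.0]))
    return np.diff(cum)

# Hypothetical cutpoints for a 5-point attitude scale (assumption).
cutpoints = [-1.5, -0.5, 0.5, 1.5]

p_control = ordered_logit_probs(0.0, cutpoints)  # linear predictor without message
p_treated = ordered_logit_probs(0.4, cutpoints)  # hypothetical treatment shift

# Re-binning into "top two categories vs. the rest" keeps only one number
# per group and drops the shape of the rest of the distribution.
binary_control = p_control[-2:].sum()
binary_treated = p_treated[-2:].sum()

print(np.round(p_control, 3), np.round(p_treated, 3))
print(round(binary_control, 3), round(binary_treated, 3))
```

The ordered model uses all four cutpoints at once, which is the "less of a loss in resolution" point; the binary version is equivalent to keeping a single cutpoint.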

1 Michael_S 5y: Yup, binomial. The respondents in a treatment were each shown a message and asked how compelling they thought it was. The control was shown no message. Yeah; the plots are the predicted values for those given a particular treatment, and the Average Treatment Effect is the difference from the control. I did not include every control used in the provided questionnaire. There was a mix of demographic/attitudinal/behavioral questions asked in the survey that I also used. These controls, particularly previous donations, were important for decreasing variance. I used a multilevel model to estimate the effects among those with and without a bachelor's degree. So, the bachelor's estimate borrows power from those without a degree, reducing problems with overfitting. These models used STAN, which handles these multilevel models well. Convergence was assessed with Gelman-Rubin statistics.
2 Peter Wildeford 5y: It would be cool to provide the code, for both learning and verification purposes.
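
The original analysis code was not shared, so as a stand-in for the convergence check mentioned above, here is a minimal NumPy sketch of the classic (non-split) Gelman-Rubin statistic for a single parameter. The simulated chains are made up for illustration, and Stan's own reported R-hat uses a more refined variant, so treat this as a teaching sketch rather than what was actually run.

```python
import numpy as np

def gelman_rubin(chains):
    """Classic potential scale reduction factor for one parameter.

    chains: array of shape (m, n) holding m chains of n post-warmup draws.
    """
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    W = chains.var(axis=1, ddof=1).mean()   # mean within-chain variance
    B = n * chain_means.var(ddof=1)         # between-chain variance
    var_hat = (n - 1) / n * W + B / n       # pooled posterior variance estimate
    return np.sqrt(var_hat / W)

# Simulated draws (assumption): four chains sampling the same distribution,
# then the same chains with one of them stuck in a different region.
rng = np.random.default_rng(0)
mixed = rng.normal(0.0, 1.0, size=(4, 1000))
stuck = mixed + np.array([[0.0], [0.0], [0.0], [3.0]])

rhat_mixed = gelman_rubin(mixed)
rhat_stuck = gelman_rubin(stuck)
print(rhat_mixed, rhat_stuck)
```

Values close to 1 suggest the chains agree with each other; values well above 1 flag chains that have not mixed. As noted earlier in the thread, a good value does not by itself rule out a misspecified model.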