I think this is, to a significant extent, definitionally impossible with longtermist interventions, because the 'long-term' part excludes having an empirical feedback loop quick enough to update our models of the world.
For example, if I'm curious about whether malaria net distribution or vitamin A supplementation is the more 'cost-effective' of the two, I can fund both interventions, run RCTs, and then model the resulting impact according to some metric like the DALY. This isn't cast-iron evidence, but it is at least causally connected to the result I care about.
For interventions that target the long-run future of humanity, this is impossible. We can't run counterfactuals of the future or past, and I at least can't wait 1000 years to see the long-term impact of certain decisions on the civilizational trajectory of the world. Thus, no longtermist intervention can really get empirical feedback on the parameters of action, and each must mostly rely on subjective human judgement about them.
To their credit, the EA Long-Term Future Fund says as much on their own web page:
"Unfortunately, there is no robust way of knowing whether succeeding on these proxy measures will cause an improvement to the long-term future."
For similar thoughts, see Laura Duffy's thread on empirical vs reason-driven EA.