
I suspect this is true for some other fields, but medical research relies very heavily on the arbitrary p < 0.05 threshold for concluding whether interventions in RCTs are effective or not.

If there is good reason to use a different threshold than p < 0.05, switching to that new threshold would improve biomedical research. With the US National Institutes of Health alone spending \$37.3 billion annually on biomedical research, improving the efficiency of biomedical research seems to have very high expected value, despite it not being 'neglected'.

A lower threshold (e.g. p < 0.01) would mean that we conclude that fewer interventions are effective. This would reduce our false positive rate but increase our false negative rate.

A higher threshold (e.g. p < 0.1) would mean that we conclude that more interventions are effective. This would reduce our false negative rate but increase our false positive rate.
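This tradeoff is easy to see in simulation. Below is a minimal sketch (all parameters hypothetical: a standardized effect size of 0.5, 50 participants per arm, a simple z-test) that runs many simulated RCTs for a no-effect exposure and a real-effect exposure, then computes the false positive and false negative rates at a few candidate thresholds:

```python
# Hypothetical simulation: how the p-value threshold trades off
# false positives against false negatives. Effect size, sample size,
# and trial counts are illustrative assumptions, not real data.
import random
import statistics
import math

random.seed(0)

def two_sample_p(a, b):
    """Approximate two-sided p-value from a z-test on the difference in means."""
    na, nb = len(a), len(b)
    se = math.sqrt(statistics.variance(a) / na + statistics.variance(b) / nb)
    z = (statistics.mean(a) - statistics.mean(b)) / se
    # Two-sided p-value from the standard normal CDF
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def simulate(effect, n_trials=2000, n=50):
    """Run n_trials simulated RCTs; return the p-value of each."""
    ps = []
    for _ in range(n_trials):
        control = [random.gauss(0, 1) for _ in range(n)]
        treated = [random.gauss(effect, 1) for _ in range(n)]
        ps.append(two_sample_p(treated, control))
    return ps

null_ps = simulate(effect=0.0)   # exposure with no true effect
real_ps = simulate(effect=0.5)   # exposure with a real effect

for alpha in (0.01, 0.05, 0.1):
    fpr = sum(p < alpha for p in null_ps) / len(null_ps)
    fnr = sum(p >= alpha for p in real_ps) / len(real_ps)
    print(f"alpha={alpha}: false positive rate={fpr:.3f}, false negative rate={fnr:.3f}")
```

Under the null, the false positive rate tracks the threshold itself (roughly 1% of null trials fall below p < 0.01, 5% below p < 0.05, and so on), while the false negative rate shrinks as the threshold is loosened.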

This can make it seem as though no threshold can be 'better' than any other.

But that is hard to accept when we look at extremes: surely a p < 0.99 threshold is worse than a p < 0.05 threshold.

I think a better threshold would simply be one that is less arbitrary. Given that the current threshold is essentially 100% arbitrary, just loosely rooting a new threshold in empirical evidence would make it better than the current one.

Here is how I think this can be achieved:

• Decide, based on theoretical reasoning, that Exposure A is certain to have no effect on Outcome X, then repeatedly run RCTs of Exposure A on Outcome X to obtain a distribution of p-values.
• Decide, likewise, that Exposure B is certain to have an effect on Outcome X, then repeatedly run RCTs of Exposure B on Outcome X to obtain a distribution of p-values.
• Compare the p-values obtained under Exposure A and Exposure B, and pick a threshold that best separates the two.
• Optionally, repeat this for many different exposures and outcomes to pick a more robust threshold.

(Obviously, it is impossible to be literally 100% certain, on theoretical grounds alone, that an exposure does or does not have an effect. But I think this exercise is still useful even without that certainty, because the bar for selecting a less arbitrary threshold is so low.)
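The calibration step above can be sketched in simulation. This is a toy version with hypothetical numbers throughout (in reality the p-values would come from actual RCTs of Exposures A and B, not from a random number generator): it generates p-values for a no-effect and a real-effect exposure, then scans candidate thresholds for the one minimising total misclassification (false positives plus false negatives).

```python
# Toy version of the proposed calibration exercise. Exposure A is
# assumed to have no effect, Exposure B a real effect (hypothetical
# effect size 0.5, 50 participants per arm); we then pick the
# threshold that minimises total error across the two.
import math
import random
import statistics

random.seed(1)

def rct_p_value(effect, n=50):
    """One simulated RCT: two-sided z-test p-value, treated vs control."""
    control = [random.gauss(0, 1) for _ in range(n)]
    treated = [random.gauss(effect, 1) for _ in range(n)]
    se = math.sqrt(statistics.variance(control) / n + statistics.variance(treated) / n)
    z = (statistics.mean(treated) - statistics.mean(control)) / se
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

ps_a = [rct_p_value(effect=0.0) for _ in range(2000)]  # Exposure A: no effect
ps_b = [rct_p_value(effect=0.5) for _ in range(2000)]  # Exposure B: real effect

def total_error(threshold):
    """False positive rate on A plus false negative rate on B."""
    fpr = sum(p < threshold for p in ps_a) / len(ps_a)
    fnr = sum(p >= threshold for p in ps_b) / len(ps_b)
    return fpr + fnr

# Scan a grid of candidate thresholds for the minimiser
best = min((a / 100 for a in range(1, 100)), key=total_error)
print(f"threshold minimising total error: {best:.2f}")
```

Note that the 'best' threshold this produces depends on the assumed effect size, the sample size, and how one weighs false positives against false negatives; equal weighting is just one simple choice, used here for illustration.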