There are several different things going on in this post. First, there is an implicit empirical claim about the variability in inferences drawn from alternative statistical tests applied to the same data. I think this claim is incorrect. Usually, different statistical tests of the same null applied to a given data set produce the same inference. If they do not, that is a signal that the evidence provided by the data is not very strong, which is something that should be reported.
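To make the point about agreement between tests concrete, here is a minimal sketch that applies two different tests of the same null (equal means in two groups) to the same simulated data: a large-sample z-test and a permutation test. Everything in it is an assumption made for illustration, including the function names, the sample sizes, the effect size, and the seeds. It illustrates the claim but is not evidence for it.

```python
import random
from statistics import NormalDist, mean, variance


def z_test_pvalue(x, y):
    """Two-sided large-sample z-test of equal means (normal approximation)."""
    se = (variance(x) / len(x) + variance(y) / len(y)) ** 0.5
    z = (mean(x) - mean(y)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def permutation_pvalue(x, y, n_perm=2000, seed=0):
    """Two-sided permutation test of equal means, reshuffling group labels."""
    rng = random.Random(seed)
    observed = abs(mean(x) - mean(y))
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(x)]) - mean(pooled[len(x):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


def simulate(shift, n=40, seed=1):
    """Hypothetical data: two normal samples whose means differ by `shift`."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [rng.gauss(shift, 1.0) for _ in range(n)]
    return x, y


def same_inference(p1, p2, alpha=0.05):
    """True if both tests land on the same side of the alpha threshold."""
    return (p1 < alpha) == (p2 < alpha)


# A strong effect: both tests should reject comfortably.
x, y = simulate(shift=1.5)
p_z, p_perm = z_test_pvalue(x, y), permutation_pvalue(x, y)
print(f"z-test p = {p_z:.4f}, permutation p = {p_perm:.4f}, "
      f"agree = {same_inference(p_z, p_perm)}")
```

Rerunning with a small `shift` and different seeds is a quick way to see where the two p-values straddle the threshold, which is exactly the weak-evidence case that should be reported.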
Second, there is an implicit call for what one might label informed Bayesianism. Jeff wants a machine that allows the reader to apply his or her own prior to the data and then obtain the implied posterior distribution of the relevant parameters. In the absence of such a machine (and, in fact, there are Bayesian software packages that could function as one), one might call for, and Jeff might be happy with, applied Bayesian analyses that present results conditional on a set of thoughtfully chosen informative priors rather than, as is common at present, on a single analytically convenient non-informative prior.
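As a rough illustration of what presenting results under several informative priors could look like, here is a minimal sketch using the textbook Beta-Binomial conjugate update. The data (14 successes in 20 trials), the four priors, and their labels are all hypothetical, chosen only to show the mechanics. This is not Jeff's proposal or any particular package's interface.

```python
# Hypothetical data, made up for illustration: 14 successes in 20 trials.
successes, trials = 14, 20

# Hypothetical Beta(a, b) priors that different readers might hold
# about the underlying success rate.
priors = {
    "flat": (1.0, 1.0),
    "skeptical": (2.0, 8.0),
    "optimistic": (8.0, 2.0),
    "centered": (5.0, 5.0),
}


def beta_binomial_posterior(a, b, k, n):
    """Posterior Beta parameters after observing k successes in n trials."""
    return a + k, b + (n - k)


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


posterior_means = {}
for label, (a, b) in priors.items():
    post_a, post_b = beta_binomial_posterior(a, b, successes, trials)
    posterior_means[label] = beta_mean(post_a, post_b)
    print(f"{label:>10}: prior mean {beta_mean(a, b):.2f} "
          f"-> posterior mean {posterior_means[label]:.2f}")
```

With these made-up numbers the posterior means run from about 0.53 under the skeptical prior to about 0.73 under the optimistic one. A spread like that is what a reader would want to see before deciding how much the conclusion depends on the prior.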
Indeed, one can imagine having all sorts of rhetorical fun with meta-priors over priors, or even obtaining evidence on the distribution of priors in the population of researchers and using that to guide the choice of what is presented. I would argue that this is essentially what we already do as readers of scholarly articles, in an informal way that I like to call casual Bayesianism. We take the (almost always) classical statistical (or frequentist, if you prefer) evidence and informally use it to update our informal prior to produce an informal posterior. I would further argue that formalizing this process generally does not pass a cost-benefit test.
Contra Jeff, I think the main danger in most empirical work is not deliberate manipulation of what is presented by researchers (other than, perhaps, choosing regression specifications at the margin to get the t-statistic from 1.85 to 2.05) but rather coding errors in generating the analysis data or in doing the analysis itself, combined with, on occasion, not getting the standard errors right. It is those problems that keep me awake at night. Well, not really, but I do worry about them.