Uh... replicating experiments is exactly what scientists do to confirm conclusions. In fact, scientists aim for their studies to be replicable — meaning that another researcher could perform a similar investigation and obtain the same basic results. When a study cannot be replicated, it suggests that our current understanding of the study system, or our methods of testing it, is insufficient.
First and foremost, I appreciate the data and experiments provided by Brulosophy. I just think that, as with all scientific literature, you need to view their results through the proper lens. Data is data, but conclusions based on probabilities are not absolutes.
Their choice of a significance threshold of p < 0.05 is a blessing and a curse. Their typical xBmt is not powerful enough to resolve anything but the most clear-cut differences. You can trust that a threshold that strict will screen out the majority of false positives, but this isn't data from a particle collider or a pharmaceutical trial. And even in physics and medical journals, the actual calculated p-value is always reported, and data from small experiments that trend toward a conclusion but carry wider error bars are still presented as such. This lets the reader work with the data itself, rather than just the authors' conclusions. It would be nice to see the actual p-value for each experiment, at the very least. The experimenters tend to treat their 0.05 threshold as all-or-nothing, but like all probability cutoffs, it just isn't.
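To make the "actual p-value" point concrete, here's a minimal sketch of what that number is for a triangle test (chance rate 1/3, the format these xBmts use). The panel size of 24 and the result of 11 correct picks are hypothetical numbers for illustration, not from any particular xBmt:

```python
from math import comb

def binom_p_value(correct, n, p_chance=1/3):
    """Exact one-sided binomial p-value: the probability of getting
    `correct` or more right answers out of `n` by pure guessing."""
    return sum(comb(n, k) * p_chance**k * (1 - p_chance)**(n - k)
               for k in range(correct, n + 1))

# Hypothetical panel: 11 of 24 tasters pick the odd beer out.
p = binom_p_value(11, 24)
print(f"p = {p:.3f}")  # → p = 0.140
```

So a panel like this one would be reported as "failed to reach significance," yet p ≈ 0.14 is a very different picture from p ≈ 0.9 — which is exactly why publishing the number itself matters.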
And it's not being dismissive or argumentative to state that a failure to reject the null hypothesis doesn't say much, because in these experiments that is absolutely true. Brulosophy has chosen to focus on eliminating false positives, but that necessarily comes at the expense of increasing "false negatives" (in quotes because they aren't testing for negatives). That effect is amplified significantly in small data sets. It's not a knock on the experimenters at all; it's simply how the math works out.
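Here's a rough sketch of how that math works out, again with hypothetical numbers: assume a panel of 24 and a real-but-subtle difference that lets half the tasters (rather than the chance rate of a third) genuinely pick the odd beer out. The question is how often such a panel would actually clear the p < 0.05 bar:

```python
from math import comb

def tail_prob(c, n, p):
    """P(X >= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c, n + 1))

def critical_count(n, alpha=0.05, p_chance=1/3):
    """Fewest correct answers needed for significance at level alpha."""
    c = 0
    while tail_prob(c, n, p_chance) > alpha:
        c += 1
    return c

n = 24                         # hypothetical panel size
c = critical_count(n)          # correct answers needed: 13 for n = 24
power = tail_prob(c, n, 0.5)   # chance of significance if half can truly tell
print(c, round(power, 2))      # → 13 0.42
```

In other words, even when half the panel can genuinely tell the beers apart, a panel of this size reaches significance only about 42% of the time — a "no significant difference" result is the expected outcome more often than not.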