It is standard practice in online retail to run pricing experiments with article-level randomization, i.e., by assigning different prices to different products in order to identify treatment effects. Due to customers' cross-price substitution behavior, such experiments suffer from interference bias: the observed difference between treatment groups is typically much larger than the global effect that would be realized after rolling out the tested pricing policy. We show in simulations that this bias can be as large as 100%, and report experimental data implying bias of similar magnitude. Finally, we discuss approaches to de-biasing pricing experiments, and suggest observational methods as a potentially attractive alternative to clustering.
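The interference mechanism can be illustrated with a minimal simulation under a stylized linear demand model with cross-price substitution. All names and parameter values below are illustrative assumptions, not the paper's actual setup: demand for an article falls in its own price and rises when other articles become more expensive, so a price cut applied to only half the assortment shifts demand away from the control group and inflates the naive treatment-control contrast.

```python
import numpy as np

# Toy demand model (illustrative assumptions, not the paper's setup):
#   demand_i = a - b * p_i + c * mean(p_j, j != i) + noise
# The cross-price term c > 0 models substitution: when other articles
# get cheaper, demand for article i drops.

rng = np.random.default_rng(0)
n = 10_000                  # number of articles
a, b, c = 50.0, 2.0, 1.0    # intercept, own-price and cross-price sensitivity
p0, delta = 10.0, 1.0       # baseline price and treatment price cut

def demand(prices, noise_scale=0.1):
    """Sales per article under the linear substitution model."""
    mean_others = (prices.sum() - prices) / (len(prices) - 1)
    noise = rng.normal(0.0, noise_scale, size=len(prices))
    return a - b * prices + c * mean_others + noise

# Article-level A/B test: randomize half the articles into treatment.
treated = rng.permutation(n) < n // 2
prices_exp = np.where(treated, p0 - delta, p0)
d_exp = demand(prices_exp)
naive_effect = d_exp[treated].mean() - d_exp[~treated].mean()

# Global comparison: roll the price cut out to ALL articles vs. none.
global_effect = (demand(np.full(n, p0 - delta)).mean()
                 - demand(np.full(n, p0)).mean())

print(f"naive experimental estimate: {naive_effect:.2f}")
print(f"true global effect:          {global_effect:.2f}")
```

Under these parameters the naive estimate converges to b * delta = 2.0 while the true global effect is (b - c) * delta = 1.0, i.e., the experiment overstates the effect by 100%, consistent with the magnitude reported in the abstract.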