Since Shannon’s seminal work, channel capacity has been a fundamental quantity in information theory. In general, it is defined through an optimization problem over probability measures under moment constraints. The rate-distortion function is likewise defined through an optimization over probability measures. We have observed that the optimal measure becomes discrete even when continuous measures are allowed. The same phenomenon is observed in Bayesian statistics, where the channel capacity problem is closely related to the reference prior. In this talk, the background of the problem is introduced with some examples, and its applications to communication theory are discussed.