Among the scoring methods used to assess the performance of probability forecasts, the log‐likelihood score is the most common and useful. Although the log‐likelihood score evaluates the comprehensive power of a forecast, we also need to evaluate its topical predictive power for individual aspects of seismicity, such as total numbers, occurrence times, locations, and magnitudes. For this purpose, we used conditional‐ or marginal‐likelihood functions based on the observed events. Such topical scores reveal both the strengths and the weaknesses of a forecasting model and suggest necessary improvements. We applied these scores to the probability forecasts for the devastating period of March 2011, during which the Mw 9.0 Off the Pacific Coast of Tohoku‐Oki earthquake struck. However, the evaluations did not show any of the prospective forecast models to be consistently satisfactory. We therefore undertook two additional types of retrospective forecasting experiments to investigate the reasons, including the possibility that the seismicity rate pattern changed after the M 9 mega‐earthquake. In addition, our experiments revealed a technical difficulty in the one‐day forecasting protocol adopted by the Collaboratory for the Study of Earthquake Predictability (CSEP). Results of further experiments lead us to recommend specific modifications to the CSEP protocols that lead to real‐time forecasts and their evaluations.
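To illustrate how a comprehensive log‐likelihood score can be separated into topical scores, the following minimal sketch assumes a forecast given as independent Poisson expected counts over space‐magnitude bins. This binned Poisson form, the function names, and the example numbers are assumptions made for illustration only and are not necessarily the formulation used in this study. Under that assumption, the joint score equals the sum of a marginal score for the total number of events and a conditional score for how the observed events are distributed over the bins.

```python
import math

def joint_loglik(rates, counts):
    """Comprehensive score: independent Poisson log-likelihood over all bins."""
    return sum(n * math.log(lam) - lam - math.lgamma(n + 1)
               for lam, n in zip(rates, counts))

def number_loglik(rates, counts):
    """Marginal score for the total number of events (Poisson with summed rate)."""
    total_rate = sum(rates)
    total_n = sum(counts)
    return total_n * math.log(total_rate) - total_rate - math.lgamma(total_n + 1)

def conditional_loglik(rates, counts):
    """Conditional score for the bin distribution, given the total number.

    Given the total, independent Poisson counts are multinomial with
    probabilities proportional to the forecast rates.
    """
    total_rate = sum(rates)
    total_n = sum(counts)
    ll = math.lgamma(total_n + 1)
    for lam, n in zip(rates, counts):
        ll += n * math.log(lam / total_rate) - math.lgamma(n + 1)
    return ll

# Hypothetical forecast rates (all must be positive) and observed counts.
rates = [0.5, 1.2, 0.3, 2.0]
counts = [0, 2, 0, 1]

joint = joint_loglik(rates, counts)
parts = number_loglik(rates, counts) + conditional_loglik(rates, counts)
print(f"joint = {joint:.6f}, number + conditional = {parts:.6f}")
```

Comparing the two components across models indicates whether a model loses score mainly by misjudging the overall rate or by misplacing events among bins, which is the kind of diagnosis that topical scores are meant to provide.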