This study investigates how raters of EFL written performance arrive at their decisions. It consists of a pilot study and a main study, both of which concentrate on the assessment of writing. The rationale is to identify the decision-making processes that raters follow, so that the findings can inform rater training and improve the reliability of rating. The pilot study is based on data collected during a large-scale language proficiency assessment of learners of English and German in two age groups. Raters were asked to think aloud during the rating task, and the data were then transcribed and analysed. The participants in the main study were novice raters who produced verbal protocols: 37 EFL teacher trainees took part in rater training and practised producing think-aloud protocols. They then evaluated ten compositions written by EFL learners while verbalising their thought processes. The verbal protocols served as the basis for data collection. The analysis led to the conclusion that more reliable and objective assessment of written performance is possible.