I had the idea for this post back in June, during the second ITP session. At the time, I thought I’d had an original idea. Twitter has both disabused me of that particular notion – reading Joe Kirby’s blog this morning, I discovered that other people have had similar ideas – and given me the impetus to turn my idea into a finished piece of writing, in order to jump on (or, rather, chase after) the passing “formative observations” bandwagon.
The inspiration for the idea was a discussion of comment-only marking. Black et al. cite a 1988 study by Ruth Butler:
‘Butler was interested in the type of feedback that students received on their written work. In a controlled experimental study, she set up three different ways of feedback to learners – marks, comments and a combination of marks and comments. The latter is the method by which most teachers provide feedback to their learners in the UK. The study showed that learning gains were greatest for the group given only comments, with the other two treatments showing no gains.’ 
This post rests on the assumption that teachers are no different to students in how they react to feedback. If that assumption holds, and we want to use observations to promote a growth mindset and a culture of deliberate practice, leading to genuine improvements in teaching, the implications are clear. If the results of the Butler study can be applied to the feedback that teachers receive on their lesson observations, then any feedback that includes a summative grade, even one accompanied by comments, will not improve the quality of teaching. Only comment-only feedback will do so. Unless observations are to be used as a summative data-gathering exercise, there should be no grades in teaching observations.
Additionally, I want to propose an alternative model for lesson observations. After all, in the words of the Chinese proverb, ‘It is better to light a candle than to curse the darkness.’ I take my cue in this section from Tom Boulter’s distinction between ‘pale’ and ‘pure’ AfL, in which he argues that comment-only marking is most effective when combined with clear success criteria from which, in turn, any feedback should derive. My proposal, then, involves abolishing the current ‘Outstanding’, ‘Good’, ‘Requires Improvement’ and ‘Inadequate’ grades and their associated descriptors, and replacing them with one single set of success criteria that describes the very best lesson that teachers could be reasonably expected to teach day-in, day-out. The feedback from lesson observations should then be based on these criteria: which ones represent teachers’ strengths, and which ones represent their primary areas for development.
I close with two questions. First: do you agree? Second: if not, why not? And if so, what should the success criteria include?
 C. Phillips (2013) ‘In which I reflect on day two of the Improving Teacher Programme’, https://forwardsnotbackwards.wordpress.com/2013/06/11/in-which-i-reflect-on-day-two-of-the-improving-teacher-programme/ (post dated 11 June 2013; site accessed 24 August 2013)
 J. Kirby (2013) ‘What if all observations were only formative?’, http://pragmaticreform.wordpress.com/2013/08/24/what-if-observations/ (post dated 24 August 2013; site accessed 24 August 2013)
 P. Black, C. Harrison, C. Lee, B. Marshall and D. Wiliam (2003) Assessment for Learning: Putting It Into Practice (Maidenhead, Open University Press), quote from p.43 (my emphasis)
 T. Boulter (2012) ‘AFL – from Pale to Pure’, http://thinkingonlearning.blogspot.co.uk/2012/07/afl-from-pale-to-pure.html (post dated 14 July 2012; site accessed 24 August 2013)