History of Internal Assessment
This page provides background information for teachers to see how the assessment of practical work by the IB has developed over the past fifty years to reach the current assessment model. Understanding how the IA has changed over time can perhaps help to put its current format into context. Each change has come with advantages and disadvantages and none has received universal approval from teachers or students.
Over the years there has always been some controversy about how practical work for the IB chemistry diploma programme is assessed.
A typical example is this comment from a teacher on the IB Chemistry Teacher’s Facebook site in January 2021:
"Why do students have to go through remarking every year undergoing tremendous stress and paying so much when an IB education is already expensive? Why can't they grade it objectively in the first place? It's a shame that IB coordinators spend most of their summers chasing a long paper trail. I also feel some examiners themselves don't understand the high end IAs, particularly the data processing section. IA needs a fix. Either scrap the IA altogether or allocate 20% worth questions in paper 2 for experimental work. I have seen similar concerns on 'My IB' and hope some higher-ups are actually rethinking this broken system. I wouldn't mind signing a petition for fixing the IA."
Everyone seems agreed that practical work should be an integral part of the course, as chemistry is an experimental science; the problem is what it should cover and how it can be assessed fairly. Time has always been allotted for practical work, but the way in which it counts towards the final mark has changed radically over the years.
History by decades
When the IB first started in the early 1970s there was only a very sketchy syllabus (for example “Transition elements” was literally all that teachers had to go on for the d-block elements) and teachers had to look at examination questions to second-guess the depth of treatment required. The practical component counted 24% towards the final mark to reflect the fact that approximately 25% of the allotted time for both SL (40 hours out of 150 hours) and HL (60 hours out of 240 hours) was dedicated to practical work. Teachers were expected to carry out a practical programme but no guidance was given. The marks submitted by the teacher at the end of the course for the practical component were adjusted according to the list of practicals covered by the school and how well students overall within the same school performed on the three externally assessed written examinations (Papers 1, 2 and 3). This had the advantage that teachers were completely free to devise their own practical programme and, perhaps strangely, even though internal and external assessment were not covering the same areas, very little unease was expressed about the moderation process.
A more detailed syllabus appeared in the early 1980s and the first topic on the syllabus was a list of practical techniques that students were expected to cover in the laboratory. For the first time IB Chemistry appointed a North American chief examiner. Professor Ron Ragsdale was a professor at the University of Utah and also an AP chief examiner. On marking the written exams he became concerned by evidence that some students were completely unfamiliar with some of the techniques, even though schools were listing them as having been performed by students during their practical programme. Ron was a keen admirer of Michael Faraday, particularly the way Faraday wrote his laboratory notes, so he introduced the idea that students should keep a laboratory notebook in which their raw data was recorded. Schools were required to send these notebooks to the IB as part of the internal assessment (and it was fine if they had chemical stains all over them!). The evidence that the work had actually been performed by students, rather than by a teacher in front of students, raised the standard, but it was clear that the marks given by teachers varied considerably, with some teachers giving full marks for poor work and others penalising the same error several times. In the late 1980s the same ‘student’ write-ups of five different practicals were sent to each teacher, who was asked to mark them and submit the marks to the IB. This enabled each teacher to be given a moderation factor so that the marks they were awarding their own students could be adjusted. The requirement to send in notebooks also led to many examiners complaining about garages full of them! It also became clear quite quickly that some students were taught by more than one teacher, and the information as to which teacher was teaching which student was not recorded for the IB, so this innovation did not last long.
A big review of all the IB’s science programmes took place in the early 1990s, with a new programme produced for first teaching in 1996 and first examinations in 1998. The internally assessed component changed radically. For the first time all three sciences (Physics, Chemistry and Biology) had exactly the same assessment criteria, and ten hours of the 40 h (SL) and 60 h (HL) allotted practical time were given over to a new combined Group 4 Project. This meant that the IB, rather than the separate subject Chief Examiners, was effectively responsible for setting the criteria for practical assessment. The assessment covered eight separate criteria: Planning (a), Planning (b), Data Collection, Data Processing and Presentation, Conclusion and Evaluation, Manipulative skills, Personal skills (a) and Personal skills (b). These were assessed according to some very basic descriptors; for example:
Planning (a): Defined problem(s)/research question(s); formulated hypothesis(es); selected any relevant variables.
Data Collection: Observed and recorded raw data with precision and presented them in an organised way (using a range of appropriate scientific methods/techniques).
Teachers had to decide which mark to give between 0 and 3 for each of the eight criteria to give a maximum total of 24.
0 The candidate has not reached a standard described by any of the descriptors.
1 The candidate meets all aspects of the criterion partially or a few aspects of the criterion completely.
2 The candidate meets all aspects of the criterion partially and most aspects of the criterion completely.
3 The candidate meets all aspects of the criterion completely.
For moderation each school had to submit the complete portfolios of work from a minimum of five candidates, drawn from both HL and SL. If the students were taught by more than one teacher the samples had to be from candidates of each teacher. Log books containing the raw data that accompanied the portfolios could also be requested for submission.
For each candidate in the sample there had to be two pieces of written work that supported the judgement made for each of the first five criteria (Planning (a) and (b), Data collection, Data processing and presentation and Conclusion and evaluation). No evidence was required for manipulative and personal skills (a) and (b). These pieces of work did not necessarily need to be the same for each candidate in the sample. One experiment/investigation could be used to provide evidence against several criteria (perhaps even the whole range) but more generally individual pieces of practical work were likely to be more appropriate as evidence for certain criteria than for others. This meant that the number of practical write-ups to be sent from each student varied between a minimum of two (if each write-up covered all five criteria) and a maximum of ten (if each write-up was used as evidence for only one criterion).
The Group 4 project formed part of this assessment and was particularly useful when it came to assessing Personal skills (a) (working within a team etc.). At the same time the Group 4 Practical Scheme of Work form (4PSOW) was introduced, on which teachers had to record for each student the practical programme they had completed, including the time spent and the marks received for each assessed criterion.
One unintended consequence of students addressing these specified criteria was some bemused university lecturers. They found that many of their former IB students were writing Planning (a) (concerned with defining the research question, formulating a hypothesis and selecting key variables) and Planning (b) (concerned with designing an appropriate method, controlling the variables and devising how to collect the data) as headings in their university lab reports!
A similar system was implemented for the 2001 programme, first assessed in 2003, with more clarity being given to what was expected for the Complete (C), Partial (P), Not at all (N) grading. Each criterion was defined by either two or three ‘aspects’ and teachers awarded C, P or N for each of the aspects. A grid system was then used to arrive at the final mark.
With hindsight it can be seen that this was the beginning of a ‘tick the box’ type of assessment. Students could be given a checklist to follow and a sheet on how to maximise their marks.
The next major change came for first assessment in 2009. Although the Group 4 project was retained, it was only assessed for Personal skills and it was up to the school as to how to do this. The assessed criteria were reduced to five: Design (D), Data collection and processing (DCP), Conclusion and evaluation (CE), Manipulative skills (MS) and Personal skills (PS), whilst still retaining the complete, partial, not at all levels for each of the aspects. For moderation, two examples of work supporting the marks for each of the first three criteria (each marked out of 6, with a maximum of 2 for each of the three aspects) had to be sent to the IB for selected students; the remaining two criteria were internally graded. Schools sent in a mark out of 48 (a total consistent with each of the first three criteria being counted twice, giving 36 marks, plus 6 each for MS and PS), which was then halved to give the IA component, which continued to make up 24% of the final assessment marks. Significantly, since 1996 the syllabus had had no mandatory practical techniques that had to be covered. However, as students had been required to comment on errors and uncertainties in their laboratory reports and this had not previously been on the syllabus (so no time was allocated to teach it), ‘Measurement and data processing’ now appeared as Topic 11 on the syllabus. The other big change was that students were required to show evidence of the use of ICT in their practical work, and this was recorded on the form 4PSOW. This involved data logging, software for graph plotting, spreadsheets, databases and computer modelling/simulation.
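The arithmetic of the 2009 model can be illustrated with a minimal sketch. It assumes that complete, partial and not at all are worth 2, 1 and 0 per aspect, and that D, DCP and CE each count twice while MS and PS count once; the doubling is an assumption inferred from the total of 48. The function names and the example marks are hypothetical.

```python
# Points per aspect level: complete, partial, not at all (assumed 2/1/0).
ASPECT_POINTS = {"c": 2, "p": 1, "n": 0}


def criterion_mark(levels):
    """Turn three aspect levels, e.g. "cpc", into a mark out of 6."""
    if len(levels) != 3:
        raise ValueError("each criterion has exactly three aspects")
    return sum(ASPECT_POINTS[level] for level in levels)


def ia_component(written, ms, ps):
    """Return (raw mark out of 48, halved mark out of 24).

    written maps D, DCP and CE to two aspect strings each
    (assumption: each written criterion is counted twice).
    ms and ps are single aspect strings, counted once each.
    """
    raw = sum(criterion_mark(s) for pair in written.values() for s in pair)
    raw += criterion_mark(ms) + criterion_mark(ps)
    return raw, raw / 2


# Hypothetical student, for illustration only.
written = {"D": ["ccc", "cpc"], "DCP": ["cpp", "ccc"], "CE": ["pcn", "ccp"]}
print(ia_component(written, ms="ccc", ps="cpc"))  # (40, 20.0)
```

A student awarded "complete" for every aspect would score 48, which halves to the maximum of 24.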
The next big review occurred in the first few years of the 2010s. The first teaching of the current programme started in 2014 with the first exams in 2016. The same criteria for all three sciences were retained, but the big change is that there is no formal assessment of the practical programme itself except for a few short written questions in Section A of Paper 3 (with Section B covering the options). The Group 4 project is retained but no evidence is required to show it has been completed (except for the form 4PSOW, which is retained by the school but no longer sent to the IB) and it does not need to be assessed. The practical work includes some mandatory areas such as the determination of molar mass, but the method(s) to be used are not specified, so there are no mandatory practical experiments. The component mark for the practical work has been reduced from 24% to 20% and is based solely on a ten-hour scientific investigation. This investigation may include:
• a hands-on laboratory investigation
• using a spreadsheet for analysis and modeling
• extracting data from a database and analyzing it graphically
• producing a hybrid of spreadsheet/database work with a traditional hands-on investigation
• using a simulation provided it is interactive and open-ended.
The written report of the investigation should be between 6 and 12 pages long and is the only evidence seen by the IB. It is assessed according to five criteria: Personal engagement (2), Exploration (6), Analysis (6), Evaluation (6) and Communication (4), giving a total mark out of 24, which is scaled by the IB to become 20% of the overall final assessment mark. Teachers arrive at the mark by choosing the markband which most closely aligns with the student’s performance. Examples of marked students’ work are provided in the Teacher’s Support Material, and schools still need to keep a completed 4PSOW for each class they teach but not one for individual students. Schools need to internally standardise the marks given by different teachers, and a sample of the students’ reports of their individual scientific investigations is sent to the IB for moderation.
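As a rough illustration of how the five criterion marks combine, the sketch below adds them up and converts the total into a share of the 20% weighting. It assumes simple linear scaling and ignores moderation; how the IB actually scales the mark is not described here. The function names and the example marks are hypothetical.

```python
# Maximum mark for each criterion, as listed in the text.
MAX_MARKS = {
    "Personal engagement": 2,
    "Exploration": 6,
    "Analysis": 6,
    "Evaluation": 6,
    "Communication": 4,
}


def ia_total(marks):
    """Check each mark against its maximum and return the total out of 24."""
    if set(marks) != set(MAX_MARKS):
        raise ValueError("marks are needed for exactly the five criteria")
    for name, mark in marks.items():
        if not 0 <= mark <= MAX_MARKS[name]:
            raise ValueError(f"{name} must be between 0 and {MAX_MARKS[name]}")
    return sum(marks.values())


def ia_contribution(total, weight=20.0):
    """Percentage points contributed to the final mark (assumed linear scaling)."""
    return total / sum(MAX_MARKS.values()) * weight


# Hypothetical marks, for illustration only.
marks = {
    "Personal engagement": 2,
    "Exploration": 5,
    "Analysis": 4,
    "Evaluation": 4,
    "Communication": 3,
}
total = ia_total(marks)
print(total, ia_contribution(total))  # 18 15.0
```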
Until the end of 2019 the vast majority of students still submitted hands-on experimental data as the basis of their investigation. The Covid pandemic has changed this balance completely. Many schools have been closed or have had laboratory access severely limited, so the current focus is on IAs based on secondary data from databases or simulations. The IA also assumed huge importance in determining how the final grades were awarded in May 2020, as no external assessment took place.
A new programme is currently being completed and will be first taught in 2023 with the first examinations in 2025.