Evaluation of teaching/learning software consists of two types: formative and summative. Formative methods of evaluation are used when a project's outline has been decided and work has begun on the design and development of its various parts. Formative evaluation can be deliberate, consisting of a series of methods to determine whether the project can work as planned, or it can be so ad hoc that it consists mainly of obtaining the opinions of passers-by on the visual effectiveness of a series of screens. As the first researchers of software development methods under the former CAUT (Committee for the Advancement of University Teaching)-funded grants, Hayden and Speedy (1995) found that, although formative evaluation was a project requirement, many grantees either paid it lip service or simply ran out of time before they could implement it. These authors suggest that the grantees did not understand the main purpose of such evaluation and so considered it an 'add-on'. Alexander and Hedberg, in noting the level of academic effort which goes into developing educational software, state
Given the high expectations of technology to provide more cost-effective learning and to improve the quality of the learning, together with the need to gain recognition for academics undertaking such development projects, the time has come for a re-examination of the role of evaluation in the development and implementation cycles. (Alexander & Hedberg, 1994:234)
Yet Moses and Johnson, in their review of CAUT's National Teaching Development Grants, found that '...some projects were funded despite [italics added] the proponents' lack of expertise in evaluation and of knowledge about learning theories and practices' (1995:36). Thus, although formative evaluations should occur during the process of developing a teaching program, whether it consists entirely of software or also has other components, it appears that many projects reach completion without the benefit of data which have the potential to inform and improve them.
Northrup, writing about formative evaluation for multimedia, states that it is '...an ongoing process conducted along every step of program development' (1995:24) and finds that, if a first draft or version of a product is created before a formative evaluation is conducted, then major modifications will not occur even when they appear to be required. Too much money, effort and time will have gone into the product to allow a major rework to take place. To help prevent this unfortunate situation, she offers guidelines for the development team which include the need for all the stakeholders to be involved and support for, and enforcement of, formative evaluation at all stages. She also discusses how data can be collected and used. The only aspect Northrup does not address is the recognition of students or other potential users as stakeholders. However, Biraimah (1993), Barker & King (1993), Reiser & Kegelmann (1994) and Henderson (1996) all agree that learners are stakeholders and that they should help carry out the formative evaluation in a number of ways, even if they function mainly to check for biases of gender and race or to see if the program will actually load. Indeed, Reiser and Kegelmann (1994:64) note that student evaluation of software is necessarily subjective and should be supplemented by that of subject matter experts, media specialists and administrators.
In comparison, summative evaluations can be much wider in scope. They occur when the finished product is examined and can benefit from hindsight. Thorpe, an open learning specialist, defines evaluation as '...the collection, analysis and interpretation of information about any aspect of a programme of education and training, as part of a recognised process of judging its effectiveness, its efficiency and any other outcomes it may have' (1988:5). She notes that a number of characteristics go with this definition, such as inclusiveness, the search for both intended and unintended effects and the capability of the activity to be made public. She emphasises that evaluation is not synonymous with assessment.
Teaching approach
Although both types of evaluation are important, and should be conducted at appropriate times throughout the life cycle of any educational program, they are less effective for their stated purpose when they occur in isolation from the evaluator's teaching philosophy and preferred methods. Some software may be considered more 'sophisticated' than other examples because, for instance, the screens are more visually interesting or require more student input; yet those programs may actually be examples of the reproductive/transmitting method of teaching.
Bain and McNaught examined the ways in which academic faculty view student learning. They suggest that academics hold certain views on the ways in which students learn and therefore tend to adopt one of the following teaching approaches:
a reproducing/transmitting/expository conception which tends to encourage...reproductive learning
a pre-emptive orientation...sensitive to past student learning difficulties...focuses on explanations
a conversational or transformative conception...understanding is constructed by the student with the assistance of the teacher (Bain & McNaught, 1996:56).
All three of these approaches can be found in educational software. For example, the reproducing/transmitting/expository conception appears in software which provides drill-and-practice, in the short explanation, selected readings and student input-to-exercises model used in many Web-based subjects, and in some electronic books and simulations. The pre-emptive orientation, in which the academic knows much about the learning difficulties past students have exhibited, can be found in interactive multimedia as well as in games, simulations and problem-solving courseware. The conversational approach may be found in multimedia exploration-of-a-microworld examples and in simulations and games where students interact with both software and people to construct knowledge and receive feedback on their thinking.
Barker states that
People design 'learning products' in order to meet some perceived learning or training need. We therefore define 'learning design' as the overall effects of the cognitive activity that takes place within the...design team during the conception and formulation of a learning product... produced to meet some pre-defined pedagogic requirement. (1995:87)
It is therefore hardly surprising that some software is treasured only by its developers. When its advocates leave teaching, the product is no longer used.
Approaches to evaluation
A simple question for any educational software should be, "Can this product actually teach what it is supposed to?" It is a simple question to ask, but often difficult to answer because the product may have so many beguiling features. It requires the evaluator to recognise his/her own view of the ways in which students learn, to relate that view to the learning objectives of that portion of the course and to determine how and whether those objectives are carried out in the software.
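One way to keep that question from being obscured by beguiling features is to record the check explicitly. The sketch below is purely illustrative: the learning objectives, the software features and the mapping between them are hypothetical examples, not drawn from any particular product or from the sources cited in this article. It simply shows how an evaluator might note, objective by objective, whether the software actually teaches anything.

```python
# Illustrative sketch: recording whether each learning objective is
# actually addressed by a piece of educational software. The objectives
# and the feature mapping are hypothetical, not from any cited source.

learning_objectives = [
    "identify the phases of cell division",
    "predict the outcome of a simple genetic cross",
    "interpret a pedigree chart",
]

# For each objective, the evaluator lists the parts of the software that
# address it, noting the role each part plays (presentation, practice,
# feedback). An empty list flags an objective the software never teaches,
# however attractive its other features may be.
coverage = {
    "identify the phases of cell division": [
        "mitosis animation (presentation)",
        "phase-ordering exercise (practice)",
    ],
    "predict the outcome of a simple genetic cross": [
        "Punnett square simulation (practice)",
        "worked feedback on wrong answers (feedback)",
    ],
    "interpret a pedigree chart": [],  # striking graphics, but no teaching
}

for objective in learning_objectives:
    features = coverage.get(objective, [])
    status = "; ".join(features) if features else "NOT ADDRESSED"
    print(f"{objective}: {status}")
```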
Table 1: Factors of key importance in successful interactive multimedia courseware (excerpted from Barker & King, 1993:309)

Quality of end-user interface design: Investigation shows that the designers of the most highly-rated products follow well-established rules and guidelines. This aspect of design affects users' perception of the product, what they can do with it and how completely it engages them.

Engagement: Appropriate use of audio and moving video segments can contribute greatly to users' motivation to work with the medium.

Interactivity: Users' involvement in participatory tasks helps make the product meaningful and provokes thought.

Tailorability: Products which allow users to configure and change them to meet particular individual needs contribute to the quality of the educational experience.
The technical approach to evaluation used to be very important. For example, many papers were written in the 1980s about the importance of 'debugging' software and ensuring it would run as intended. Students were said to be frustrated by technical problems and to complain that these interfered with their learning. Technical evaluations of software are still of significance, even though students of the 1990s are accustomed to computer crashes and often know how to address them. Technical difficulties often arise with products authored within an educational organisation. With some echoes of the well-known MicroSift courseware proformas, Squires and McDougall (1994) provide a helpful series of lists for technical evaluations of software.
Barker and King (1993) have developed a method for evaluating interactive multimedia courseware. They identify four factors which their research suggests are of key importance to successful products. They state that several other factors should be considered as well, although their importance is seen as somewhat less than that of the four listed in Table 1. These secondary factors are: appropriateness of multimedia mix, mode and style of interaction, quality of interaction, user learning styles, monitoring and assessment techniques, built-in intelligence, adequacy of ancillary learning support tools and suitability for single user/group/distributed use (Barker & King, 1993:309).
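One way an evaluator might operationalise such a factor list is as a simple weighted rubric. The sketch below is a minimal illustration only: Barker and King name the factors and indicate their relative importance, but the 0-4 rating scale and the numeric weights here are assumptions introduced for the example, not part of their method.

```python
# Minimal sketch of a weighted scoring rubric in the spirit of Barker and
# King's primary and secondary factors. The 0-4 rating scale and the
# weights are illustrative assumptions; Barker and King (1993) do not
# prescribe a numeric weighting scheme.

PRIMARY_FACTORS = [
    "quality of end-user interface design",
    "engagement",
    "interactivity",
    "tailorability",
]

SECONDARY_FACTORS = [
    "appropriateness of multimedia mix",
    "mode and style of interaction",
    "quality of interaction",
    "user learning styles",
    "monitoring and assessment techniques",
    "built-in intelligence",
    "adequacy of ancillary learning support tools",
    "suitability for single user/group/distributed use",
]

def weighted_score(ratings, primary_weight=2.0, secondary_weight=1.0):
    """Combine per-factor ratings (0-4) into a single score in [0, 1].

    Factors missing from `ratings` count as 0, so an unexamined factor
    drags the score down rather than being silently ignored.
    """
    total = 0.0
    maximum = 0.0
    for factor in PRIMARY_FACTORS + SECONDARY_FACTORS:
        weight = primary_weight if factor in PRIMARY_FACTORS else secondary_weight
        total += weight * ratings.get(factor, 0)
        maximum += weight * 4
    return total / maximum

# Example: a product strong on interface and engagement, weaker elsewhere.
ratings = {
    "quality of end-user interface design": 4,
    "engagement": 3,
    "interactivity": 2,
    "tailorability": 1,
    "appropriateness of multimedia mix": 2,
}
print(f"weighted score: {weighted_score(ratings):.2f}")
```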
Although Barker and King's factors do make substantial contributions to the 'look and feel' of successful products, some of them need more explanation. For example, the 'mode and style of interaction' affects how a user navigates through the product. Difficulty in choosing an appropriate navigation method may arise if the designer and the academic hold different views of the ways in which users will learn from the software.
Young (1996) points out that, if students are allowed to control the sequence and content of the instruction, they must acquire self-regulated learning strategies for the instructional experience to be successful. Young's research concerned students in seventh grade -- a learning stage which might be construed as 'naive' and at which students may hold beliefs which are poorly thought out. Lawless and Brown, in surveying research on navigation and learning outcomes, find that learners who are '...limited in both domain knowledge and metacognitive skills' (1997:126) may not benefit from a high degree of learner control and may get lost in the environment. They also note that such students may be beguiled by special features not central to the instruction and fail to acquire the information important to the section. This finding is supported by Blissett and Atkins (1993), who find that the sophistication of the multimedia environment may prevent some students from taking time to reflect on what they have just learned. Yildiz and Atkins suggest that students do not cope well with multimedia if they lack '...advance organizers or mental frameworks on which to hang the surrogate experience...they therefore had difficulty in making personal, meaningful sense of what they saw and did...' (1993:138). Laurillard (1993:30-31) notes that university students may hold naive beliefs and that university lecturers may make erroneous assumptions about their students' grasp of prerequisite concepts. Unless software is specifically designed to expose naive beliefs and support the construction of more accurate knowledge, it is likely that at least some users can navigate through a product without recognising that they hold erroneous ideas.
The importance of context
Administrators of tertiary institutions in which educational software increasingly supplies some of the teaching may hold a perception that the desired learning has taken place if the assigned work is accomplished. Ramsden argues that the context of learning is very important, and remarks that planned educational interventions can have unintended consequences, resulting in an increase in superficial learning rather than the opposite (1992:62-63). He suggests that assessment methods may have a negative effect on student learning. If these effects hold, then an outcome of multimedia teaching could be superficial learning, just as with more traditional methods.
Some multimedia proponents suggest that experiential, visually accurate, interactive software will help users attempt to solve problems in the ways that experts would. Henderson, after a careful long-term examination of mature students' work with multimedia packages, states that '...knowledge acquisition is essentially and inescapably a socio-economic-historical-political-cultural process' (1996:90) and that students' mental processes depend on context specificity. Thus, students from cultures different from that in which a software product is developed are likely to experience difficulties in using that product. Baumgartner and Payr state
Learning with software...is a social process in at least two ways: first, it takes place in a certain social situation (in the classroom, at work, at home) and is motivated by it. Secondly, any relevant learning process has as its goal the ability to cope with the social situation (professional or everyday tasks, etc.). The evaluation of interactive media has to satisfy three conditions: 1. It has to take into account the social situation in which the media are used, and must not be limited to the media themselves; 2. It has to take into account the goal of dealing with complex social situations and must not limit itself to the isolated individual learner; and 3. It must take into account the specific forms of interaction between the learner and society. These interactions range from the passive reception of static knowledge to the active design of complex situations that characterizes the 'expert'. (1996:32)
Ramsden's concern with context in traditional classrooms is thus seen to be of relevance to software developers or evaluators who want educational products which will fulfil a variety of teaching/learning needs.
Conversation
Laurillard (1993) notes that 'conversation' about one's perception of an instructional sequence is an important part of learning, and gives examples of ways in which conversation can be carried out by instructors and students face-to-face (102-104) or via intelligent tutoring systems. Blissett and Atkins (1993) advocate a strong teacher role in pursuing conversation about multimedia experiences and in promoting student reflection on learning, in part because of their finding that students may not have acquired knowledge at a deep level from the multimedia experience itself. Collis (1996), agreeing with these authors about the importance of conversation about multimedia learning, advocates the provision of computerised communication opportunities between the lecturer and students and among the students themselves. She believes that what she terms 'telelearning', even for lecturers who use reproductive/transmitting teaching methods, forces the introduction of more communication into the 'instructional balance' (Collis, 1996:299). Her suggestions of supplying IRC-type group discussion facilities and email communication with the instructor as part of each computer-mediated instructional event would offer students the opportunity to engage in conversation about what they are learning even while it happens. Learners are much more likely to stop and think about their learning when offered an opportunity to share it with others than when they are simply carried along by a multimedia experience, unanchored to the active, cognitive world.
Summary
A number of approaches to formative and summative evaluation have been touched upon above, supported by a set of references which should help beginning evaluators explore this time- and resource-intensive area further. It is no wonder that software developers may wish to avoid formative evaluation at every step of their project, especially if they have commenced it in a wave of enthusiasm, as Hayden and Speedy have noted. Designers and developers working on large-scale European projects such as DELTA (Barker & King, 1993) have had to turn away from the fun of carrying out innovative ideas and instead establish criteria whereby the work can stand up to evaluation which, as Thorpe states, is capable of being made very public. The work of software evaluation is necessary, but it is also expensive if done properly. The time and money required for this aspect -- whether formative evaluation or an evaluation to see whether a finished program is effective -- should be a budget item whenever software development or use is considered.