The Virtual Creatures Project

Pre-Evaluation and Evaluation Plan

Jennifer Tschudy, Ali Borjian, Richard Siegesmund, William Lorié

under the direction of

Decker Walker, Miriam Ben-Peretz, Rich Shavelson

Stanford University School of Education

June 3, 1997

Index:

The Nature and Purpose of this Evaluation Plan

The Activities of the Virtual Creatures Evaluation Planning Team

Pre-Evaluation

Evaluation Plans

Instruments

Sources

The Nature and Purpose of this Evaluation Plan

The Virtual Creatures (VC) research project is an interdisciplinary project conducted by the Stanford University SUMMIT Lab and sponsored by the National Science Foundation (NSF). The project seeks to create rich, interactive, cross-disciplinary, learning environments centered on biology but integrating disciplines such as mathematics, physics and chemistry. Using advanced computer technology, the VC team seeks to improve classroom practice and student achievement in secondary school science by providing three dimensional, interactive, computer models of creatures, virtual tools to enhance the study of the model, and other resources for use within six curriculum units for application at the middle school and high school levels.

Our Evaluation Team sought to provide a goals and deliverables inventory of the VC work to date and a plan for the future formative evaluations that will begin this summer. We chose as well to focus on a specific curriculum unit being developed, entitled Jumping Frog (JF). The JF Unit is the first in a series of curriculum units currently being planned by the VC team, whose intention it is to provide to students - via computer-based technologies - the resources necessary to engage in meaningful inquiry that would lead to active knowledge of various aspects of vertebrate biology. This unit seemed sufficiently articulated and testable to provide enough feedback that a formative evaluation was feasible, and we have preliminary findings on this unit that should be useful for future evaluations.

In addition, the Evaluation Planning Team has drafted four instruments to account for the concerns of teachers and students, the consequences of using JF and other units with groups of students, and elusive outcomes. The primary audience for this report is the VC project staff. It is our hope that they will find it valuable and informative in their design efforts.

The Activities of the Virtual Creatures Evaluation Planning Team

The Virtual Creatures Evaluation Planning Team engaged in the following activities:

1. Held three meetings to determine our Evaluation Planning Team goals and decide what to count as resources for our evaluation instrument development process.

2. Reviewed the records of the Virtual Creatures project as posted on the World Wide Web site. Among other resources, this site provides:

3. Met with individual members of the VC development team, including Decker Walker and Leroy Heinrichs. Reviewed a concept map of the VC database drafted by Dr. Heinrichs.

4. Attended one focus session with high school teachers and some members of the VC development team.

5. Reviewed the audio recording of a focus session with middle school teachers and some members of the VC development team.

6. Reviewed records of focus groups where students outlined resources they wanted or perceived they needed for successfully engaging in the JF unit.

7. Used VC records and research on curriculum evaluation to construct evaluation instruments for the JF and other units.


Pre-Evaluation

The Goals of Virtual Creatures

According to VC project documents, the project goals include:

Deliverable Features

These are the deliverable features of the VC project, as described in the grant proposal:

1. Bar-linkage models for studying the actions of joints and bones in locomotion

2. Compartmental models of the digestive system

3. Molecular and cellular models for studying energy production and consumption.

Pre-Evaluation Findings for Virtual Creatures

The Evaluation Planning Team found that Web-based access of resources was more limited than local access because of user-end hardware limitations. We found rich visualizations based on realistic data. Some tools and abstractions were present, but not others. The VC staff might benefit from an inventory of these findings.

Visualization Tools

Abstractions and Other Resources

The Goals of Jumping Frog

In the description of the Jumping Frog curriculum unit, the educational objectives seem limited to learning about bone and muscle. However, there was also evidence of a larger objective of learning about locomotion. To this end, Dr. Heinrichs has produced a concept map on learning about locomotion which was to guide the construction of the database underlying the Virtual Frog in general and the Jumping Frog unit in particular. According to Dr. Heinrichs's concept map, the problem of locomotion could be broken down into five major sub-categories:

It appeared to the Evaluation Planning Team that one of the desired outcomes of the high school student focus group studies was to gauge how students would want to respond to the Jumping Frog scenario. Would they see it as a bone and muscle problem, as it is described in the handout materials, or would they take the initiative to expand it into a larger problem of locomotion?

Observations on Jumping Frog

In interviews, science teachers and students immediately wanted to expand the issues of Jumping Frog to include learning about systems and interrelations among systems. They did not see this as simply a bone/joint/muscle problem. Beyond Dr. Heinrichs's five major sub-categories, students also identified issues of nutrition as affecting the physiological production of energy and the creation of force that would result in a frog jumping longer distances.

The focus groups clearly desired a greater complexity to JF than originally presented. This desire was reflected not only in the ways the focus groups wanted to have the freedom to solve the problem, but also in identifying the resource data necessary to answer factual questions and the virtual tools for testing hypotheses. These tools and materials were all anticipated by the VC staff in its original proposal to the NSF, although not all of these tools have been developed. We assume that one reason these tools are not yet present is the difficulty inherent in their development, together with the VC staff's desire to know, to the greatest degree possible, the functionality that needs to be incorporated into them. This added desire - on the part of focus groups - for greater complexity poses design and programming problems for the VC staff:

1. The VC staff will have to make decisions regarding the degree of complexity to which Dr. Heinrichs's five major sub-categories can be embraced within the single Jumping Frog unit. Is this the direction in which the VC staff wants to take Jumping Frog? Is the complexity that the focus groups desired for JF a reflection that the five other curriculum units have not yet been designed? In other words, if the five other curriculum units existed, would people be happy with a Jumping Frog unit that focused on the problem of bone and muscle?

2. Who are the audiences for Jumping Frog? A curriculum designed for middle school students needs to deal with the broad thematics of systems. A high school curriculum is more appropriate for exploring the depth of detailed data that is reflected in Dr. Heinrichs's concept map. Additionally, JF would engage high school students from biology, physics, or chemistry, so there may be a variety of ways to approach the unit and utilize it in a variety of classrooms. If there are multiple audiences, are there multiple "lessons", or multiple levels of complexity that need to be designed into the Jumping Frog Unit?

3. Finally, the project staff can consider the following. Are there computer programming limitations to be considered as requests for added complexity increase? The VC staff may need to consider what technical limitations (if any) there are in constructing JF. These limitations may place parameters around how this unit can be designed.


Evaluation Plans

The VC staff has been giving thought to conducting a formative evaluation of their project this summer. The Evaluation Planning Team thought it worthwhile to consider the staff's thoughts on evaluation plans in formulating our own. Outlined here are plans for the evaluation of the 3-year project, which is pending NSF approval and funding. Because the VC staff perceives their efforts on the first-year project as being contiguous with project efforts under further funding, we believe that these plans are relevant for the development of instruments for the evaluation this summer.

The Virtual Creatures staff has listed three major educational outcomes that it seeks to measure, and proposes these measurement instruments:

1. Students will act/talk more like scientists.

To be measured by discourse analysis and observation.

2. The resources provided will facilitate biological learning.

To be measured by testing.

3. The resources provided will facilitate higher order reasoning.

To be measured by discourse analysis and observation.

To assess these outcomes the project plans to continue working with teacher and student focus groups and rely on the following data gathering devices:

1. Tracking of user interaction (through observation).

2. Electronically generated reports.

3. Analysis of the scientific qualities of student discourse in small groups. Qualities to look for include:

4. Usability testing to measure the physical manipulability of the model and materials.

5. Performance assessment tasks for student self-evaluation and teacher evaluation of student learning.

Additionally, the Project Team is considering establishing a project advisory group with all major stakeholders. Suggested representation includes teachers, students, parents, biology teachers (secondary level), and biologists. Interim evaluation reports will utilize multimedia showing typical and exemplary interactions with materials.

Proposed Evaluation Instruments

Introduction to Proposed Evaluation Instruments

For our instrument development project, the Evaluation Planning Team chose to focus on the needs and wants of teachers and students, as expressed in the many meetings and focus group discussions conducted by the VC team, as well as some of the small-group inquiry outcomes the staff envisions. Our interest in stakeholder needs is also an interest shared by the funding agency, and a common theme of VC team meetings.

The Evaluation Planning Team has developed four instruments. The first two are open-ended checklists intended for taking account of the already expressed needs and desires of teachers and students regarding the learning of vertebrate biology, and the interests of the VC project staff. The third instrument addresses some of the student learning concerns that can be addressed by observing students engaged in the JF and other curriculum units. Finally, we present a list of questions for addressing affective outcomes related to the technologies developed, elusive outcomes, and issues of triangulation with other sources of evidence. Our team designed a student questionnaire to address some of these outcomes.

Checklist of Teacher Concerns

A number of teachers at both the high school and middle school level were interviewed by the VC project. Both sets of teachers had definite ideas on how they could incorporate the creature model into their classrooms and preexisting lessons. Some of the teachers' concerns were translated into an evaluative checklist of project and program specific features.

The high school teachers envisioned that the model would be used by student teams as an additional resource for dissection. It could be used as a review tool by students who erred during an actual dissection or as a replacement exercise for students who objected to or were absent during the dissection. The teachers felt that the model would be useful as a preparatory aid before the dissection. It could allow students to become familiar with the structures they will observe during dissection and enable the teachers to clarify points during their lessons. The teachers also thought that the model could be used as an exam to test students' understanding of the structures they observed during dissection. Lastly, the high school teachers expressed a need for further training in using the model in order to understand better its uses as a tool and complement to the textbook and other existing classroom resources.

The middle school teachers thought of the same uses for the model with respect to dissection, but also felt the model could be useful in teaching comparative anatomy. They were interested in expanding students' understanding beyond that of the creature to other animals and to humans. These teachers were also more concerned with the detail given to anatomical systems versus individual structures. Finally, the middle school teachers envisioned exams administered by the program that would record individual performances of the students.

Checklist of Student Concerns

Approximately 40 students of varying ages from differing locales in the Bay Area took part in focus groups throughout the development of the Virtual Creature project. Students had difficulty in envisioning making use of the creature for school tasks or learning modes other than those with which they were already familiar, utilizing photographs, videos and books. It was easiest for them to see the creature as an alternative to or rehearsal for dissection or as an additional resource in preparing science labs or science projects. The students thought of the program as a way to aid them in satisfactory completion of such projects, but not as a replacement for them.

When discussing the specifications of the Jumping Frog unit, students engaged in group brainstorm sessions to write a descriptive list of resources that they envisioned would aid in their solution of the problem. The highlights of this list are the elements of an evaluation instrument designed to check that the VC product contains these resources. Students felt that the ability to see organs, muscles, bones and anatomical systems separate from the whole allowed for focused, in-depth learning of that particular piece. They also commented that the ability to alter the creature anatomically combined with the ability to animate the creature before and after such changes would aid in the visualization of the causes and effects of genetic mutations, illnesses or debilitating accidents. The students were most interested in the computer model's ability to illustrate movement and motion via animation, to illustrate anatomical systems in action, and to aid in visualization of the frog functioning normally in its natural habitat.

The students also had feedback to give with regard to the interface by which to manipulate the creature. Their concerns about the ease of using the program are also reflected in the checklist. The students had varying levels of previous computer usage - from simple word processing to very interactive, exploratory game-like programs. One suggestion to alleviate a possible computer-literacy bias is to utilize interface and navigation systems that are similar in nature to those of basic word processors. The VC staff, however, needs to make judgements about the relative importance of this bias, weighed against novel proposed technologies, like force-feedback haptics.

The Student Group Activity Record

The JF unit currently being developed by the VC team is an open-ended scenario that attempts to engage student thinking about the relationships between the anatomical and physiological features of a frog in locomotion and the problem of engineering a frog so that it jumps a greater distance. The aim of this section is to provide a general framework for identifying evidence of positive outcomes indicating educationally fruitful use of the technologies the VC team wishes to provide to student communities of inquiry. The VC team may want to consider this framework in drafting its plan for a formative evaluation of any curriculum unit designed for the engagement of small groups of students in problem-solving, knowledge-building tasks. Here we derive some implications for the construction of an instrument designed to investigate systematically the outcome variables of interest in these learning environments, and we propose such an instrument.

Nira Hativa (1994) has suggested, as one design implication of her extensive 6-year study of integrated learning systems (ILSs), that designers should preplan and build into the ILS the kinds of social interaction they envision among students using it. While computer-based learning, with the growth of Internet technologies, has shifted from the functional, self-contained ILS paradigm that Hativa studied, it is important for the VC team to consider the many environments in which curriculum units such as JF would be adopted. The VC team would do well to revisit the target audience of the VC project, and attempt to describe the kind of environment, and number and kind of participants (for example, students only, teachers and students) in that setting. Because student and teacher attitudinal and consultative investigations of the JF unit have been conducted by VC researchers in small or large group settings, and because the computer resources of the typical science classroom are meager (one or two computers per room, if any), it is reasonable to assume that the JF unit may be operationalized in small group settings, if it is adopted into classroom practice. This is the rationale for considering here instruments for formative evaluations of JF as adopted by small groups of students, preferably in naturalistic or quasi-naturalistic settings.

While Hativa does not assign positive or negative values to cooperation and competition, she does note features that encourage these among students, which should be of interest to designers as they envision the kinds of learning environments they intend to promote through JF and other VC units. If the JF unit were to be used by a small group of students, the VC designers may want to consider the following factors which were found to discourage and decrease cooperation among students:

Features which discouraged competition and promoted cooperation were those factors that reduced or eliminated pressure to achieve. These included the absence of timing, the absence of numerical evaluations on the screen, the sporadic incorporation of games, and permission to use paper-and-pencil during the computer sessions.

The Evaluation Planning Team suggested three indicators of positive outcomes for the use of computer-afforded resources in evaluating the VC curricular units: that these resources are acknowledged, used extensively, and used intensively. A table for organizing the data gathered during interactions among individuals in groups of students and the resources provided by the JF curriculum package was designed and is included under the section Instruments. The proposed instrument is a way of organizing information obtained during observations of groups of students engaged in the JF unit. It might be useful to distinguish it from checklists, analytical reviews, and experiments, all identified by Levin (1991) as methods for evaluating computerized curriculum materials. The Student Group Activity Record (GAR) does not have the character of a checklist, because it does not identify a priori the elements expected of students engaged in the JF unit. Also, the GAR does not make use of control groups, so it is non-experimental. Finally, in using the GAR, expert or specialist opinion is applied to resources as they are used by students, and not to resources in themselves, so this instrument differs from analytical or open-ended reviews as described by Levin.

This instrument emphasizes the formative nature of the evaluation envisioned by the Evaluation Planning Team, in that it assumes an iterative or cyclical model of product design, where here product is meant to refer primarily but not exclusively to the JF unit. In addition the product that is being designed is also the set of resources made available by the VC in general, and other curricular scenarios. By recording the acknowledgement of resources over group time, noting when a resource is used or proposed for use, and the number of participants engaged in discussion, resource acknowledgment and resource use, the evaluators can reconstruct the task domain and begin to make judgements about the relevance of the resources they are providing, in the form in which these resources are made available. These judgements will inform the developers as to whether and how a resource can be redesigned to meet the needs of groups engaged in the JF and other curricular units.
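As a rough illustration of how GAR observations might be tallied once they are recorded, the following Python sketch keeps one entry per observed event and summarizes the entries by resource. The field names, the three event codes, and the sample session are assumptions made for this sketch; they are not taken from VC project materials, and the summary is only one of many ways the record could be read.

```python
from dataclasses import dataclass

# Assumed event codes mirroring the GAR headings; not a VC project scheme.
EVENTS = ("discussion", "use", "proposed_use")


@dataclass(frozen=True)
class Observation:
    """One hypothetical GAR entry."""
    minute: int        # minutes since the group session began (assumed unit)
    resource: str      # pre-identified resource label, e.g. "R1"
    event: str         # one of EVENTS
    participants: int  # number of students engaged at that moment


def summarize(observations):
    """Count events per resource and note first mention and peak participation."""
    summary = {}
    for obs in sorted(observations, key=lambda o: o.minute):
        if obs.event not in EVENTS:
            raise ValueError(f"unknown event code: {obs.event!r}")
        row = summary.setdefault(obs.resource, {
            "first_minute": obs.minute,  # earliest entry, since input is sorted
            "discussion": 0,
            "use": 0,
            "proposed_use": 0,
            "peak_participants": 0,
        })
        row[obs.event] += 1
        row["peak_participants"] = max(row["peak_participants"], obs.participants)
    return summary


# Made-up session data, for illustration only.
session = [
    Observation(2, "R1", "discussion", 3),
    Observation(5, "R1", "use", 2),
    Observation(9, "R2", "proposed_use", 4),
]
for resource, row in summarize(session).items():
    print(resource, row)
```

Under these assumptions, a resource that never appears in the summary was never acknowledged by the group, while the event counts and peak participation give evaluators a starting point for judging how widely and how heavily a resource was used.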

From student use or proposed use of resources, evaluators and project designers can begin to think about other resources that may afford more fruitful inquiry within the same curriculum scenario, as well as other curriculum scenarios that are better suited as frameworks in which to exploit the learning potential of a particular resource.

Moreover, the use by subject-matter experts of such an evaluation instrument has value in identifying possible misconceptions that students have about the domain of inquiry. These misconceptions should be made known to computer resource designers so that in conjunction with subject-matter specialists they might design resources in ways that challenge these misconceptions, rather than avoid or suppress them. In this light, it is even justifiable to begin with "bad" resources - resources that do not challenge common misconceptions - so that more and more of these misconceptions may be uncovered in the design-evaluation process.

Elusive Outcomes and Equity Issues in Evaluation Design

As part of an ongoing self-evaluation, the VC staff should be concerned with evaluation design and implementation. Issues about which focus questions to choose, sample population, validity, and unexpected outcomes may arise during this process.

To address these issues, the Evaluation Planning Team developed a student questionnaire designed to ensure these concerns were met. It is recommended that the Jumping Frog unit evaluation be guided by focus questions concerning: (1) the effects of the JF unit on student interest; (2) the influence of the JF unit on organization and delivery of instruction; (3) the effects of the JF unit on teachers; and (4) unintended effects that may be attributed to the unit. The list of questions provided for program developers, as well as the student questionnaire, focuses on the first and last of these concerns.

In order to be able to increase the generalizability of findings, the VC staff needs to design a sampling method in which a more heterogeneous population is sought. This is particularly important due to the fact that a large gap exists between the educational achievement of students with different ethnic and socioeconomic backgrounds. In addition, variability in students' experiences with computers should be considered. Furthermore, the school population is linguistically and culturally diverse. As more culturally and linguistically diverse students are enrolled in schools, educators should focus on providing full access, equality of instruction, and appropriate learning environments to all students. It is likely that classrooms in which the VC is implemented will have these characteristics, so it is especially important for project designers to be attentive to program features that take seriously the possibility of heterogeneous classrooms.
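One way to keep variability in computer experience in view when reading questionnaire results is to compare ratings between students reporting less and more experience. The Python sketch below assumes that the 1-5 ratings from the Student Questionnaire (experience, enjoyment, usefulness) have been coded as integers; the dictionary keys, the cut-off of 3, and the sample responses are hypothetical and serve only to illustrate the comparison.

```python
from statistics import mean


def mean_by_experience(responses, item, threshold=3):
    """Mean rating on `item` for low- and high-experience respondents.

    `threshold` is an assumed cut-off: experience ratings below it count
    as "low", all others as "high". Unanswered items are skipped.
    """
    groups = {"low": [], "high": []}
    for response in responses:
        experience = response.get("experience")
        rating = response.get(item)
        if experience is None or rating is None:
            continue
        key = "low" if experience < threshold else "high"
        groups[key].append(rating)
    # A group with no answers yields None rather than a misleading zero.
    return {key: (mean(values) if values else None)
            for key, values in groups.items()}


# Made-up responses, for illustration only.
sample = [
    {"experience": 1, "enjoyment": 2, "usefulness": 3},
    {"experience": 2, "enjoyment": 4, "usefulness": None},
    {"experience": 4, "enjoyment": 5, "usefulness": 4},
    {"experience": 5, "enjoyment": 3, "usefulness": 5},
]
print(mean_by_experience(sample, "enjoyment"))
print(mean_by_experience(sample, "usefulness"))
```

With samples as small as a single focus group, such a comparison is descriptive at best; it may nonetheless flag whether enjoyment or perceived usefulness appears to depend on prior computer experience.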

Instruments

Middle and High School Teacher Checklists

The checklists proposed below are organized in three sections: lesson-based learning, puzzles and constructions, and human-computer interface (HCI) issues. These checklists are intended to be applied to those parts of the VC curriculum components that are meant to provide for computer-assisted teacher-directed lesson-based learning, as well as computer-assisted student-directed exploratory or experimental learning settings. In addition, a list of questions about HCI issues that teachers raised is included, and intended for application to the system in those learning environments.

Checklist for concerns by High School Teachers

Lesson-based learning

Puzzles and Constructions

Human-Computer Interface Issues

Checklist for concerns by Middle School Teachers

Lesson-based learning

Puzzles and Constructions

Human-Computer Interface Issues

Student Concerns Checklist

Content

Interface

Student Group Activity Record


R1 | R2 | R3... | Discussion | Use or Proposed Use | # of Participants
   |    |       |            |                     |
   |    |       |            |                     |

Key: R1, R2, R3... are pre-identified resources, such as animation of jumping frog, 3D model, text on muscle flexion and extension.

Possible misconceptions:




General Questions for Elusive Outcomes - Focus on Students

Student Questionnaire

Age______________

Gender: Male_____ Female_____

Ethnicity_____________________

Grade Level___________________

Name of School________________________

Father's Occupation_______________________

Mother's Occupation______________________

1) How much experience do you have with computers?

1 (None)   2   3   4   5 (A lot)

Comments:

2) How often do you use a computer?

3) What computer applications do you mostly use?

__ Word Processing

__ Database

__ Spreadsheet

__ Programming

__ Other (Please specify: _________________________ )

4) To what extent did you enjoy using the Jumping Frog Program?

1 (Not at all)   2   3   4   5 (Very much)

Comments:

5) Overall, how would you rate the usefulness of the Jumping Frog Program?

1 (Not at all)   2   3   4   5 (Very much)

Comments:

6) What specific part(s) of the program did you find most useful?

7) What specific part(s) of the program was (were) not necessary?

Sources

goals.html

Hativa, N. "What You Design Is Not What You Get (WYDINWYG): Cognitive, Affective, and Social Impacts of Learning with ILS - An Integration of Findings from Six Years of Qualitative and Quantitative Studies", International Journal of Educational Research, 1994, 21(1), 81-111.

Levin, T. "Evaluating Computerized Curriculum Materials", pp. 474-477 in The International Encyclopedia of Curriculum. Arieh Lewy, ed. New York: Pergamon Press. 1991.

Posavac, E. and Carey, R. Program Evaluation: Methods and Case Studies, Fifth Edition. New Jersey: Prentice Hall. 1997.

Scardamalia, M. and Bereiter, C. "Computer Support for Knowledge-Building Communities", The Journal of the Learning Sciences, 1994, 3(3), 265-283.