Reliability and Criterion Validity for Three Potential Algebra Measures
This technical report summarizes the results of a study in which we examined the technical adequacy of three potential algebra measures for progress monitoring. One hundred thirteen students (18 of whom were receiving special education services) completed two forms of a Basic Skills probe, an Algebra Concepts probe, and a Content Analysis probe. In addition, we gathered data on criterion variables including grades, classroom assessment records, teacher ratings, and standardized test scores. We used multiple timing periods for the Basic Skills and Algebra Concepts probes to examine the efficacy of differing durations. We examined both test-retest and alternate form reliability for individual probes of all three types and for aggregated scores from two of the probe types. Criterion validity was examined using correlations between students’ probe scores and their scores from other indicators of algebra proficiency. The results of the study indicate that the Algebra Concepts probe is the most promising of the three measures investigated. It has adequate reliability and demonstrated the strongest correlations with the criterion measures. The Basic Skills probe had lower levels of reliability and more limited relations to the criterion measures (with the exception of the computation subtests of the standardized achievement tests). The Content Analysis probe had the highest levels of reliability among the three probes and moderate to moderately high correlations with several of the criterion measures. Concerns were identified about the difficulty of this probe because a large proportion of the students had scores of 10 or fewer points on the probe. Further research is needed to investigate more appropriate timing duration for the Basic Skills and Algebra Concepts probes. In the current study, the duration of the former was too short, while the duration of the latter was too long. 
The study should be replicated with additional, more diverse student populations to determine the generalizability of the findings. Finally, subsequent research should examine the effects of routine progress monitoring on the measures’ stability and sensitivity to growth.
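The reliability estimates described in this abstract are correlations between two sets of scores from the same students: two forms given in one session (alternate form) or the same probe given on two occasions (test-retest). A minimal sketch of that calculation follows, assuming a Pearson correlation; the function name `pearson_r` and the sample scores are hypothetical and are not taken from the report.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(ss_x * ss_y)


# Hypothetical scores for five students (illustration only).
form_a_session_1 = [12, 18, 9, 22, 15]
form_b_session_1 = [14, 17, 10, 20, 16]
form_a_session_2 = [13, 19, 11, 21, 15]

# Alternate form: two different forms, same session.
alternate_form = pearson_r(form_a_session_1, form_b_session_1)
# Test-retest: the same form on two occasions.
test_retest = pearson_r(form_a_session_1, form_a_session_2)
```

The same function applies to criterion validity, with probe scores in one list and a criterion measure such as a standardized test score in the other.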
Reliability and Criterion Validity of Two Algebra Measures: Translations and Content Analysis-Multiple Choice
This technical report summarizes the results of a study in which we examined the technical adequacy of two potential measures for algebra progress monitoring. Eighty-seven students (11 of whom were receiving special education services) completed two forms of a Translations measure and two forms of a Content Analysis-Multiple Choice measure during each of two data collection sessions. In addition, we gathered data on criterion variables including grades, overall grade point average, teacher ratings of student proficiency, and scores on district-administered standardized tests, as well as a measure of algebra aptitude. We examined both test-retest and alternate form reliability for both single probe scores and aggregated scores (computed by averaging two individual scores). Criterion validity was examined by computing correlations between students’ single and aggregated scores on the probes and their scores on other indicators of proficiency in algebra. The results of this study suggest that the Translations measure is more promising than the Content Analysis-Multiple Choice measure in terms of both reliability and criterion validity. The strength of the relations obtained in this study was in the low to moderate range, and the relations were not as strong as those obtained with a different sample in this district using three other algebra measures (see Project AAIMS Technical Report 2 for details of the earlier study). Both measures produced acceptable distributions that were free from floor and ceiling effects. Students had roughly similar means and standard deviations on both measures. Reliability estimates for both measures fell short of expected levels for both single probes and aggregated scores. The Translations measure produced stronger correlations than the Content Analysis-Multiple Choice measure, but did not demonstrate a level of reliability that would be acceptable for instructional decision making.
The majority of the criterion validity relations were in the low to moderate range. Aggregated scores produced improvements in the criterion validity estimates for the Translations measure, but not for the Content Analysis-Multiple Choice measure. The strongest relations were identified between the Translations measure and eighth graders’ performance on the district’s math achievement test, as well as between the Translations measure and all students’ performance on the algebra aptitude test. These two relations were in the moderate to strong range; relations between the Translations measure and the remaining criterion variables were in the low range.
Classroom Observation Data for District C: Momentary Time Sampling
This report documents the results of momentary time sampling observations conducted in District C during the fall of 2004. It identifies typical student and teacher behaviors, as well as typical instructional organization patterns and task formats in Algebra I classes in this district. We found that District C beginning algebra teachers devoted nearly equal amounts of class time to whole class and independent work. These teachers spent about one half of the time we observed talking to their students about algebra or listening to students’ questions or comments about the day’s lesson. Their students were assigned paper and pencil tasks for more than half of the observational intervals and were expected to listen to lectures or participate in discussions for slightly more than 40% of the time. The most typical student behavior was listening to teachers (or displaying some other appropriate behavior); the second most typical was an active academic response such as taking notes, working on an assignment, or answering a teacher’s question.
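Momentary time sampling records one category code at the instant each observation interval ends, and results such as "more than half of the observational intervals" are percentages of those recorded intervals. A minimal sketch of that tally follows; the function name `interval_percentages` and the category codes are hypothetical and are not the coding scheme used in the report.

```python
from collections import Counter


def interval_percentages(codes):
    """Return the percentage of observation intervals assigned to each code."""
    if not codes:
        raise ValueError("at least one observed interval is required")
    counts = Counter(codes)
    total = len(codes)
    return {code: 100.0 * count / total for code, count in counts.items()}


# Hypothetical record of eight intervals from one class period.
observed = [
    "paper_pencil", "paper_pencil", "listen_lecture", "paper_pencil",
    "listen_lecture", "discussion", "paper_pencil", "paper_pencil",
]
percentages = interval_percentages(observed)
```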
Classroom Observation Data for District B: Anecdotal Observation Results
This report documents the results of anecdotal observations conducted in District B during the fall of 2004. It describes the algebra topics addressed during our observations, the expected tasks (class activities), teacher actions, and student actions in four Algebra IA classes and two Algebra IB classes in this district. We looked at the algebra curriculum for students in these beginning algebra classes, the ways that class periods were structured in these classes, the kinds of instructional approaches that were used, and students’ responses to these instructional approaches. Students with and without disabilities were all enrolled in general education beginning algebra classes in District B; therefore, they completed the same curriculum. The two Algebra IA teachers moved through the textbook at slightly different rates, but students were exposed to basically the same content. One teacher taught both of the Algebra IB classes, and her lessons concentrated on the same topics for each class. The most common expected task varied by teacher. Teacher 1 taught one section of Algebra IA and two sections of Algebra IB. In her Algebra IA class, the most typical task was checking homework; in Algebra IB, it was leading a review. Teacher 2 taught three sections of Algebra IA, where the most prevalent expected task was teacher-led instruction. The most typical instructional approaches that we observed in District B were providing individual student assistance and modeling, as the teacher showed how to solve algebra problems or reviewed for an exam in both courses. Completing assignments was the most typical productive student action in both courses, with listening observed just as often in the Algebra IB classes. Off-task behavior was the most common nonproductive student action, and it was the most often observed student action in District B.
Alignment of Algebra Curriculum, Assessment, and Instructional Practices in District B: A Case Study of Fall 2004
Project AAIMS (Algebra Assessment and Instruction: Meeting Standards) is a federally funded project that has two objectives. The first is to examine the alignment of Algebra 1 curriculum, instruction, and assessment in general and special education. The second is to develop and validate algebra assessment tools for use in general and special education classes. This case study focuses on the first objective: it examines the alignment of algebra curriculum, instruction, and assessment for students with and without disabilities in one of the three districts participating in Project AAIMS.
Reliability and Criterion Validity of Five Algebra Measures in Districts B and C
This technical report summarizes the results of a study in which we examined the technical adequacy of five potential measures for algebra progress monitoring. One hundred three students (14 of whom were receiving special education services) completed two forms of a Basic Skills measure, two forms of an Algebra Foundations measure, one form of a Content Analysis-Constructed Response measure, two forms of a Translations measure, and two forms of a Content Analysis-Multiple Choice measure administered over two data collection sessions. Each probe data collection session was repeated to investigate the test-retest reliability of the measures. In addition, we gathered data on criterion variables including grades, overall grade point average, teacher ratings of student proficiency, and scores on district-administered standardized tests, as well as a measure of algebra aptitude. We examined both test-retest and alternate form reliability for both single probe scores and aggregated scores (computed by averaging two individual scores). Criterion validity was examined by computing correlations between students’ single and aggregated scores on the probes with their scores on other indicators of proficiency in algebra. We found that four of the five measures produced effective distributions of student scores, with no signs of floor or ceiling effects. On the Translations probe, students produced nearly as many incorrect responses as they did correct responses, suggesting a high rate of guessing on that measure. The test-retest and alternate form reliability of single probes ranged from .4 to .9, with most coefficients in the .4 to .6 range. Aggregating scores from two probes produced slight increases in the reliability of the probes, with most correlations ranging from .5 to .7. For both single probes and aggregated scores, test-retest reliability coefficients exceeded those obtained for alternate form reliability. 
Neither the single nor the aggregated probes consistently produced reliability coefficients above the .80 level that represents a standard benchmark. Criterion validity coefficients were also lower than those obtained in previous research (Foegen & Lind, 2004). Coefficients were generally in the low range (.2 to .4); the exception to this pattern was for the Iowa Algebra Aptitude Test, which was more strongly related to the algebra progress monitoring measures (coefficients in the .3 to .5 range). The Content Analysis-Constructed Response, the Algebra Foundations, and the Content Analysis-Multiple Choice measures produced the strongest relations with the criterion measures, with lower relations obtained for the Basic Skills and Translations measures. Concerns were identified with the difficulty of scoring the Content Analysis-Constructed Response probes efficiently and accurately, which will likely limit the viability of this measure in applied settings. Issues for future research are identified.
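The aggregated scores in this study were computed by averaging two individual probe scores. One standard way to reason about why averaging two forms tends to raise reliability is the Spearman-Brown formula, which predicts the reliability of a measure lengthened by a factor k from the reliability r of a single form: k·r / (1 + (k − 1)·r). The sketch below is an illustration under that assumption; the abstract does not say the authors applied this formula, and the function names and sample scores are hypothetical.

```python
def aggregate(form_a, form_b):
    """Average two probe scores per student, as described for aggregated scores."""
    if len(form_a) != len(form_b):
        raise ValueError("both forms need one score per student")
    return [(a + b) / 2 for a, b in zip(form_a, form_b)]


def spearman_brown(single_form_r, length_factor=2.0):
    """Predicted reliability when a measure is lengthened by length_factor."""
    return (length_factor * single_form_r) / (
        1.0 + (length_factor - 1.0) * single_form_r
    )


# Hypothetical scores for three students on two forms.
aggregated = aggregate([10, 12, 7], [14, 8, 9])

# Predicted reliability of a two-probe aggregate for single-probe
# reliabilities spanning the .4 to .6 range reported above.
predicted = {r: spearman_brown(r) for r in (0.4, 0.5, 0.6)}

# Benchmark named in the abstract.
meets_benchmark = {r: value >= 0.80 for r, value in predicted.items()}
```

Under this assumption, single-probe coefficients of .4 to .6 map to roughly .57 to .75 for a two-probe aggregate, which still falls below the .80 benchmark mentioned in the abstract.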
Classroom Observation Data for District A: Anecdotal Observation Results
This report documents the results of anecdotal observations conducted in District A during the spring of 2004. It describes the algebra topics addressed during our observations, the expected tasks (class activities), teacher actions, and student actions in six different beginning algebra courses in this district. We looked at the similarities and differences in the algebra curriculum for students with and without disabilities in the different algebra courses, the ways that class periods were structured in these classes, the kinds of instructional approaches that were used in general education and special education algebra courses, and students’ responses to these instructional approaches.
A Replication Study of the Reliability, Criterion Validity and Sensitivity to Growth of Two Algebra Progress Monitoring Measures
This study served as a replication of previous work examining the reliability, validity, and sensitivity to growth of the Basic Skills and the Content Analysis-Multiple Choice probes in two Iowa school districts. One hundred five students in grades nine to twelve participated in the study. Data were gathered from February 2006 to April 2006. Over three months of data collection, students completed two Basic Skills probes and two Content Analysis-Multiple Choice probes each month. We examined the alternate form reliability and test-retest reliability for both types of probes. We found that both types of probes possessed adequate levels of reliability, with the Basic Skills probes demonstrating a higher level of reliability than the Content Analysis-Multiple Choice probes. To assess the validity of the probes, we gathered data from a variety of indicators of students’ proficiency in algebra including course grades, teachers’ evaluations of student proficiency and growth, and performance on standardized assessment instruments including the Iowa Tests of Educational Development (ITED) and the Iowa Algebra Aptitude Test (IAAT). We examined both concurrent and predictive validity. Concurrent validity was supported by moderate correlations between both types of probes and criterion measures including teachers’ evaluations of their students’ proficiency and IAAT scores. Predictive validity of both types of probes was supported by low or moderate correlations between the earliest probe scores and other indicators administered at the end of the course, including teacher ratings of growth, students’ end-of-term algebra grades, IAAT scores taken at the end of the semester, and ITED scores. We also examined the extent to which both types of probes reflect student growth and explored the relationship between student growth on the algebra measures and other indicators of growth.
We found that students who did not drop the course grew .42 and 1.52 points each week on the Basic Skills and Content Analysis-Multiple Choice probes, respectively. This result suggests that the Content Analysis-Multiple Choice probes may be more sensitive to student growth. We also found that there was a small but significant correlation between teacher ratings of growth and Content Analysis-Multiple Choice probes. No significant correlation existed between Basic Skills probes and other indicators of growth.
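Weekly growth figures such as these are slopes: the average change in probe score per week of instruction. A minimal sketch of one common way to estimate such a slope for a single student follows, assuming an ordinary least-squares line fitted to that student's scores over time. The abstract does not state which growth model the authors used, and the function name `weekly_slope` and the sample data are hypothetical.

```python
def weekly_slope(weeks, scores):
    """Least-squares slope of probe score on week number (points per week)."""
    if len(weeks) != len(scores) or len(weeks) < 2:
        raise ValueError("need at least two (week, score) pairs")
    n = len(weeks)
    mean_week = sum(weeks) / n
    mean_score = sum(scores) / n
    # Slope = sum of cross-products / sum of squared week deviations.
    cross = sum((w - mean_week) * (s - mean_score) for w, s in zip(weeks, scores))
    ss_week = sum((w - mean_week) ** 2 for w in weeks)
    if ss_week == 0:
        raise ValueError("all observations fall in the same week")
    return cross / ss_week


# Hypothetical student: one probe score in each of three monthly sessions.
weeks = [0, 4, 8]
scores = [10, 16, 22]
slope = weekly_slope(weeks, scores)  # points gained per week
```

Averaging the per-student slopes across a group would give a mean weekly growth rate comparable in form to the values reported above.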
Technical Characteristics of Two Algebra Progress Monitoring Measures: Reliability, Criterion Validity and Sensitivity to Growth
The primary purpose of this study was to investigate the reliability and validity of the Algebra Foundations and Content Analysis-Multiple Choice probes and to examine the utility of these probes in monitoring student progress over a full school year. In addition, we examined the use of a third type of probe, Basic Skills, with a small number of students. Our findings revealed that both the Algebra Foundations and Content Analysis-Multiple Choice probes possessed adequate levels of alternate form and test-retest reliability. We examined two types of validity: concurrent and predictive validity. Concurrent validity was assessed by investigating the relationship between probe scores and other indicators of proficiency in algebra including teacher proficiency ratings and standardized test scores. In general, we found probe scores were associated with standardized test scores for students in grades 9-12, but not with teacher ratings of proficiency or standardized test scores for eighth grade students. The predictive validity of the probes was assessed by examining the association between probe scores and other indicators of proficiency including teacher ratings of growth and standardized test scores. Our findings were identical to those for concurrent validity. When examining student progress over time, we found that the Content Analysis-Multiple Choice probes were more sensitive to student growth than were the Algebra Foundations probes. When investigating student progress over time by class type, we found that only 8th Grade Algebra students showed .5 unit weekly growth on both probes; Algebra 1 students had a mean slope value near this threshold (.47) on the Content Analysis-Multiple Choice probes. This result indicated that the utility of the probes for monitoring student growth may differ for students of different mathematics ability levels.
Reliability and Criterion Validity of Four Revised Algebra Measures in Districts B and C
This technical report summarizes the results of a study in which we examined the technical adequacy of four revised measures for algebra progress monitoring. The measures investigated included a Basic Skills probe, an Algebra Foundations probe, a Translations probe, and a Content Analysis-Multiple Choice probe. Revisions to the measures included the addition of a sample page prior to the initial administration of each type of probe and changes to the design templates used to create the Content Analysis-Multiple Choice measure. Seventy-eight students (6 of whom were receiving special education services) completed two forms of a Basic Skills measure, two forms of an Algebra Foundations measure, two forms of a Translations measure, and two forms of a Content Analysis-Multiple Choice measure administered over two data collection sessions. Each probe data collection session was then repeated to investigate the test-retest reliability of the measures. In addition, we gathered data on criterion variables including grades, overall grade point average, teacher ratings of student proficiency, and scores on district-administered standardized tests, as well as a measure of algebra aptitude. We examined both test-retest and alternate form reliability, as well as criterion validity, for both single probe scores and aggregated scores (computed by averaging two individual scores). We found that three of the four measures produced effective distributions of student scores, with no signs of floor or ceiling effects. On the Translations probe, students produced nearly half as many incorrect responses as they did correct responses, suggesting a high rate of guessing on that measure. The test-retest and alternate form reliability of single probes was higher than results obtained in previous studies, with correlations for most measures (except Translations) in the .6 to .8 range. 
Aggregating scores from two probes produced reliability estimates in the .7 to .8 range for all measures except the Translations measure. Criterion validity was examined by correlating students’ scores on the probes with other indicators of proficiency in algebra, including grades in algebra, teacher ratings, scores on the district’s achievement test, and scores on a standardized test of algebra aptitude. Correlation coefficients were higher than those obtained in earlier studies (Foegen & Olson, 2005; Foegen, Olson, & Perkmen, 2005). Correlation coefficients for single probes were generally in the low to moderate range (.3 to .5); small increments in the coefficients were obtained when aggregated scores were used. The strongest relations were obtained for the Basic Skills and the Content Analysis-Multiple Choice measures, followed by the Algebra Foundations measure. Students’ scores on the probes were most related to their teachers’ ratings, their grades in algebra, and their scores on the algebra aptitude test. Issues for future research are identified.