Article

Discrimination in Grading

Abstract

We report the results of an experiment designed to test for discrimination in grading in India. We recruited teachers to grade exams. We randomly assigned child "characteristics" (age, gender, and caste) to the cover sheets of the exams to ensure that there is no relationship between these observed characteristics and exam quality. We find that teachers give exams assigned to be lower caste scores that are about 0.03 to 0.08 standard deviations lower than those assigned to be high caste. The teachers' behavior appears consistent with statistical discrimination.
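
The randomization described in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure or code: the function names, the age range, and the category labels are hypothetical, and the gap is simply a difference in mean scores divided by the standard deviation of all scores.

```python
import random
import statistics


def assign_cover_sheets(exam_ids, seed=0):
    """Draw cover-sheet characteristics independently of the exam itself.

    Because each draw ignores the exam's content, the assigned characteristics
    are unrelated to exam quality by construction (hypothetical categories).
    """
    rng = random.Random(seed)
    return {
        exam_id: {
            "age": rng.choice([8, 9, 10, 11, 12]),      # assumed age range
            "gender": rng.choice(["female", "male"]),
            "caste": rng.choice(["high", "low"]),
        }
        for exam_id in exam_ids
    }


def standardized_gap(scores, sheets, attribute, group_a, group_b):
    """Mean score of group_a minus mean score of group_b, in SD units.

    Raises statistics.StatisticsError if either group is empty.
    """
    a = [s for i, s in scores.items() if sheets[i][attribute] == group_a]
    b = [s for i, s in scores.items() if sheets[i][attribute] == group_b]
    sd = statistics.pstdev(list(scores.values()))
    return (statistics.mean(a) - statistics.mean(b)) / sd
```

Under these assumptions, a call such as `standardized_gap(scores, sheets, "caste", "low", "high")` returns a negative number when exams labelled low caste receive lower scores on average, which is the direction of the 0.03 to 0.08 standard deviation difference reported above.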

... One potential issue is that the KS exam scripts include the name of the student. In theory, graders could be influenced by whether it is a male or female name, but Baird [21], Hanna and Linden [22], and Chowdhury et al. [23] find no evidence for this type of grader behaviour. 16 In addition to the key stage exams, at the end of KS 1, 2, and 3, teachers are asked to provide an assessment of whether each student is meeting the learning objectives as set out in the national curriculum. ...
... There is evidence of an important role for comparative advantage in explaining the substantial gender gap in STEM. 22 Standardised international tests tend to find a gender difference in mathematics favouring boys (often small, and non-existent in some countries), while boys tend to score significantly worse than girls in reading [35]. 23 However, decisions will likely depend on how students perceive their comparative advantage and teachers may play an important role in affecting how students think about their own abilities. ...
... 21 For example, if three individuals share the top position in a school cohort, then using the average rank for ties, each is assigned an ordinal rank of 2, and the following person in the sequence receives an ordinal rank of 4. We find that our estimates are similar if we assign ties either the highest rank or the lowest rank. 22 Papers that explore this issue include Aucejo and James [26], Delaney and Devereux [27], [28,29], Goulas et al. [30], Saltiel [31], Shi [32] and Speer [33,34]. 23 See Cavaglia et al. [36] and Delaney and Devereux [28]; [29] for a review of the literature on gender gaps in educational achievement. ...
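The tie-handling rule in the footnote above (tied students share the average of the positions they occupy) can be sketched as follows. This is an illustrative implementation, not the cited authors' code; the function name `average_ranks` and the sample scores are hypothetical.

```python
def average_ranks(scores):
    """Rank 1 is the highest score; tied scores share the mean of their positions."""
    # Indices of the students, ordered from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        # Extend `end` over the run of students tied with the one at `pos`.
        end = pos
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        # Positions are 1-based, so the run occupies pos+1 .. end+1.
        mean_rank = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


# Three students tie for the top position, a fourth follows.
print(average_ranks([90, 90, 90, 80]))  # [2.0, 2.0, 2.0, 4.0]
```

The three tied students occupy positions 1, 2, and 3, whose mean is 2, and the next student takes position 4, matching the example in the footnote.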
Article
Full-text available
We study differences in teacher evaluations of student performance relative to those measured by test scores. While much literature is concerned with estimating various types of teacher biases, we show conceptually that there is no single ‘teacher bias’ effect. Even if teachers have no group bias, teacher evaluation differences by group may systematically deviate from test score differences if the distribution of test scores differs across groups. Commonly used approaches are not equivalent and can lead to different conclusions as they target different estimands. We demonstrate our findings using Monte Carlo simulations and, using two recent UK cohort surveys, we show that these conceptual issues matter in practice when we evaluate whether teachers are likely to over‐estimate female performance in English. Finally, we use the methods to examine an issue of substantive importance, gender differences in teacher perceptions in comparative advantage in English relative to mathematics. Our findings suggest that it is unlikely that teacher misperceptions of comparative advantage by gender are an important cause of the gender gap in STEM.
... Few studies have examined the combination of caste and SES in teacher grading, even though previous research on these two demographic factors has frequently focused on them independently (Hanna & Linden, 2012;Sirin, 2005). By examining how these interconnected identities (caste and SES) impact teacher grading, this study offers a more sophisticated understanding of in-group bias in educational settings. ...
... Several empirical studies that have examined in-group bias in educational contexts have shown how teachers' social identities affect how they grade assignments. For example, Hanna and Linden (2012) found that in India, even in cases when students from lower castes performed equally academically, teachers were much more likely to give preference to those from higher castes. As a result, these students received better scores than their classmates from lower castes. ...
... These findings align with the social identity hypothesis by Tajfel & Turner (1986) that people acquire self-esteem from their group memberships and are likely to favor students who share their group identification. Moreover, the results of this study are consistent with research carried out by Hanna and Linden (2012) who discovered that low caste students were subjected to discrimination by upper caste teachers. In line with the results of the current study, teachers were seen to assign lower grades to students from lower castes than to their peers from higher castes. ...
Article
Full-text available
This article studies the extent of teachers' in-group bias in occupational expectations and grading on the basis of a student's caste and socioeconomic status. The article adopts an experimental approach and draws on data generated from 122 teachers from 19 schools in Delhi, India. The caste and socio-economic status of students were randomly assigned to a set of essays written by them such that the assigned characteristics were not related to essay quality. The results show that high caste teachers hold higher occupational expectations from their in-group category and are biased against the low caste category. For instance, high caste teachers assign 0.53 per cent or 0.019 points higher occupational expectations to high caste students and assign 5.6 per cent or 0.19 points lower occupational expectations to low caste students. The magnitude of coefficients is small but significant at 5 per cent level (P value < 0.005). In terms of marks assigned, results show that high caste teachers assign 2.36 points or 3.22 per cent higher marks when the assigned characteristic is a high caste, indicating in-group bias/favor for the same caste. The coefficient is positive and significant at 5 per cent level (P value < 0.05). In contrast, high caste teachers are shown to be biased against low caste students as they assign 2.41 points or 3.41 per cent lower marks when the assigned characteristic is a low caste. Given the ultra-competitive nature of schooling in India and the importance of grades in determining access to higher education in India, even a point disadvantage is substantial.
... In this paper, we thus aim to uncover grading bias-understood as grading differentials for students with the same achievement levels due to unequal treatment by teachers-in German secondary schools [13,17]. Existing research suggests grading bias against lower-class youth [17,[24][25][26][27][28][29][30][31] and minority [12,13,25,28,[30][31][32][33][34][35][36][37][38] as well as overweight or obese [17,[39][40][41][42] pupils. Furthermore, there is also evidence for grading bias by gender, benefiting boys or girls depending on the subject [9-12, 25, 27, 30, 35, 43-52]. ...
... Numerous studies across various contexts and subjects (mostly mathematics and languages) have found that girls receive better grades than boys for the same standardised test performance (for Israel [43]; for Denmark [52]; for Spain [30,51]; for the US [10,25]; for Czechia [11]; for Germany [12]; for Italy [49,50]; for France [9]; for Switzerland [27]; for Greece [46]; for Portugal [47]; for New Zealand [35]). However, there are also studies that do not find such a bias (for Sweden [64]; for India [26]). A study using the same data from Germany as we use here, but examining younger children, finds a grading bias in favour of girls in German and in favour of boys in mathematics [17]. ...
... Compared to the body of research on grading bias due to gender and ethnicity or migration background, there are significantly fewer studies on the influence of social background and weight status. Several studies indicate the existence of grading bias against socially disadvantaged children, more frequently observed in languages than in mathematics (for Italy [28]; for Switzerland [27]; for Germany [17]; for Spain [30,31]; for India [26]; for the US [25]). In a German survey of teachers, children, and parents it was shown that teachers subconsciously gave higher grades to children from higher social classes than was justified by their actual competence levels. ...
Article
Full-text available
We aim to uncover grading bias by gender, socio-economic status, ethnic/migration background as well as body weight in the German secondary school system. Following an intersectional approach, we test whether—controlling for ability—students receive different grades depending on (the specific combination of) ascriptive characteristics. Using data from the fourth starting cohort (SC4, 13.0.0, first survey in year 9 in 2010) of the National Educational Panel Study (NEPS) consisting of more than 14,000 ninth graders, we compute the predicted differences in grades for the different groups of students depending on whether they are a boy or a girl, whether they are obese/overweight or not, their socio-economic status (SES) and ethnic background. We rely on a grade equation approach, assuming that discrepancies between observed grades and achievement as measured in standardised tests are evidence of biased grading. We control for two different competence tests—the Domain General Cognitive Functions (DGCF) and a standardised domain-specific competence test—as objective measures of ability as well as secondary school track. Even after controlling for different personality and behavioural traits—the “big five”, the Strengths and Difficulties Questionnaire (SDQ), the Sick, Control, One, Fat and Food (SCOFF), health satisfaction and class retention—substantial differentials in grading across almost all factors and subjects remain. To account for the fact that many students may face bias on multiple grounds, we then compare the differences in predicted grades for groups with overlapping (dis)advantaging characteristics (e.g. low SES overweight Turkish boy vs a high SES non-overweight majority girl), while controlling for the objective ability measures. Significant differentials in grades are found in almost all cases, with the largest effect sizes for the subject German. 
We also compute models including all 2-way or 4-way interactions between the four axes of inequality and find the main effects largely unchanged. On the whole our findings are indicative of widespread additive intersectional effects of gender, social and ethnic origin as well as body weight on grading bias.
... We should note that various other factors could influence our outcome variable on which we do not have data, the most obvious of which are related to school-level inputs, which is discussed previously. 14 Within school-level inputs, research has also suggested the importance of teachers in the classroom for educational outcomes (Lavy & Sand, 2018), especially discussing the possibility of teacher biases leading to differences in grading (Hanna & Linden, 2012). However, there is no systematic evidence that this happens along linguistic lines, and that instead it is typically along gendered dimensions (Ferman & Fontes, 2022;Gortazar et al., 2022). ...
... Next, we do not have detailed school-level data on the institutional characteristics. This would be helpful to control for other factors at the school level that could also be influencing grade repetition, such as grading policies or teacher biases (Hanna & Linden, 2012). Third, in developing a robustness check using linguistic distance, we are able to verify that the results are not sensitive to the formulation of the language discrepancy variable. ...
... These studies identify discrimination by randomly assigning minority and majority names to tests teachers need to evaluate. Experimental evidence was found for grading discrimination against low-caste students in India (Hanna and Linden, 2012) and against students of Turkish origin in Germany (Sprietsma, 2013) but not in the Netherlands (van Ewijk, 2011). Furthermore, experimental evidence was found for ethnic discrimination in secondary school track recommendations in Germany and the Netherlands (van Ewijk, 2011; Sprietsma, 2013; Wenz and Hoenig, 2020). ...
... In our case, the estimated effect sizes are between −0.057 and −0.086 SD of the outcome variable, depending on the model specification. Previous studies have found effect sizes between −0.01 and −0.12 SD (van Ewijk, 2011; Hanna and Linden, 2012; Sprietsma, 2013; Wenz and Hoenig, 2020). Thus, our effect size lies in the middle of the distribution of effect sizes identified in other studies. ...
Article
This study examines discrimination in teacher assessments and track recommendations against Roma minority students in Hungary. We conducted a pre-registered randomized experiment among 413 primary school teachers. Participating teachers evaluated six mathematics or literacy and grammar tests with fictitious, randomized student names and recommended a high school track. Our results show mixed evidence for discrimination against Roma students: teachers do not discriminate in test evaluations but do so in high school track recommendations, though this latter effect is small. We find that contextual factors play a substantial role in discrimination in track recommendations: teachers who receive tests with fewer Roma than non-Roma names discriminate against Roma students, whereas teachers who receive tests with more Roma names do not. In the latter case, non-Roma students receive similarly low track recommendations as Roma students in both experimental conditions. The results are consistent with stereotype-based theories of discrimination.
... Emerging behavioral (Carlana, La Ferrera and Pinotti 2022; Alesina et al. 2018) and experimental studies (Gilgen and Stocker 2022; Owens 2022; Geven et al. 2021; Quinn 2020; Wenz and Hoenig 2020; Tobisch and Dresel 2017; Glock et al. 2015; Sprietsma 2013; Hanna and Linden 2012; Auwarter and Aruguete 2010) indeed document that teacher assessments might depend on student ascribed features. Likewise, previous observational research identified a residual effect of teacher bias in grading (Schuessler and Sønderskov 2023), expectations (Timmermans, Kuyper and Werf 2015), and tracking or grade retention recommendations (Batruch et al. 2023; Carlana, La Ferrera and Pinotti 2022; Salza 2022; Timmermans et al. 2018) as a function of students' ascribed characteristics, namely, gender (Marcenaro-Gutierrez, Prieto-Latorre and Sánchez Rodriguez 2023; Carlana 2019), ethnic origin (Kisfalusi, Janky and Takács 2021; Triventi 2019; Alesina et al. 2018; Botelho, Madeira and Rangel 2015), SES (Gortázar, Martínez de Lafuente and Vega-Bayo 2022), and cultural capital (Jaeger 2022; Jaeger and Møllegaard 2017). ...
... Even though the theory was initially developed to explain hiring discrimination, recently, it has been applied to studying discrimination in the educational context. For instance, Hanna and Linden (2012) find experimental evidence of statistical discrimination in grading. When asked to evaluate a series of exams with randomly assigned ascribed characteristics (gender, age, and caste), they find that teachers' bias against low-caste students decreases as the evaluation process advances. ...
Preprint
Full-text available
Abstract: Fair evaluations are fundamental for equal opportunity, with teachers as gatekeepers of academic merit in educational systems. Still, identifying their direct role in reproducing or mitigating inequalities via assessments is empirically challenging, yielding inconsistent findings on teacher bias from observational and experimental studies. We test interdisciplinary theories of status characteristics beliefs, statistical discrimination, and cultural reproduction with a pre-registered factorial experiment run on a large representative sample of Spanish pre-service teachers (n=1,717). This design causally identifies, net of true academic competence, the impact of student-ascribed status characteristics—gender, migrant and class origins—and cultural capital on teacher short- and long-term assessments, improving prior studies’ limitations regarding theory testing, confounding, and power. Findings reveal teacher bias in an immediate task of essay grading favoring girls and highbrow cultural capital signals, aligning with status characteristics and cultural reproduction theories, respectively. Concerning teachers’ long-term expectations, findings hint at statistical discrimination against boys, migrant-origin, and working-class students under uncertain information. Unexpectedly, ethnic discrimination changes from teachers favoring native origin in long-term expectations to migrant origin in essay evaluations, suggesting compensatory grading practices. These findings dig deeper into the complex roots of discrimination in teacher assessments as a mechanism underlying educational (in)equality. https://publications.jrc.ec.europa.eu/repository/handle/JRC136851
... Discrimination theories from sociology (e.g., status characteristics theory) (Melamed et al., 2019;Ridgeway, 2014), psychology (e.g., implicit bias) (Fazio et al., 2023;Greenwald and Krieger, 2006), and economics (e.g., statistical discrimination) (Botelho et al., 2015;Hanna and Linden, 2012;Arrow, 1998) offer distinct micro-level explanations for how evaluators may systematically favour or penalise individuals based on ascribed status characteristics such as class, race, or sex. In educational settings, these theories suggest that teachers may consciously or unconsciously favour students from higher SES backgrounds, due to prevailing group-based ability stereotypes. ...
Article
Full-text available
Teachers act as judges of academic merit, but unfair evaluations beyond students' true abilities may perpetuate inequality based on socioeconomic status (SES). This article investigates if student SES biases teacher grades and track recommendations, after accounting for measurement error (test scores) and omitted variables (noncognitive skills) in ability-a widespread issue that can lead to overestimating teacher bias in observational studies. Using the German NEPS panel across elementary education, we seek to identify student ability through various cognitive (standardised test scores) and noncognitive (effort) measures, along with an instrumental variable (IV) design. We also examine heterogeneity across the ability distribution to test whether teacher bias is most pronounced among low performers. First, after approximating latent student ability with the IV approach and adjusting for noncognitive skills, the residual effect of student SES drops by over 40 %. This reduction indicates that estimates of teacher bias are substantially inflated when relying solely on snapshot test scores. Second, despite this reduction, the effect size of student SES remains considerable at 6 % of a standard deviation for GPA and five percentage points for track recommendations. Third, residual SES effects are disproportionately seen among low-to-average-performing students, suggesting that high-SES parents use compensatory strategies to influence teacher assessments. We discuss the theoretical and methodological implications of our findings for estimating and interpreting teacher bias as a mechanism of educational reproduction in observational research.
... Traditional grading methods often disadvantage students from lower socioeconomic backgrounds by not separating academic abilities from noncognitive behaviors (Clark, 2014; Feldman, 2018; Gorski, 2013; Morris & McKenzie, 2024). Hanna and Linden (2012) and Tobisch and Dresel (2017) demonstrate that teachers may unconsciously bias their assessments against economically disadvantaged students, leading to lower GPAs and increased likelihood of course failures. Research underscores the need for grading practices that accurately reflect students' academic skills rather than extraneous factors (Feldman, 2018. ...
... This is particularly important for the case of teachers: optimistic teachers' expectations have been found to particularly benefit the achievement of students from minorities in the US (Jussim and Harber, 2005). More gender egalitarian teachers have been found to increase the performance and uptake of STEM by girls (Alan et al., 2018;Carlana, 2019;Ash and Maguire, 2024;Hawkins et al., 2023), and generally to be able to increase the performance of pupils through positive expectations of them (Figlio, 2005;Sprietsma, 2013;Campbell, 2015;Hanna and Linden, 2012). Research has also shown that teachers' diminished expectations of children with names associated with low socio-economic status affect student's cognitive performance (Figlio, 2005), that essays designated with either German or Turkish names were differently graded in schools in Germany (Sprietsma, 2013), and that the assessment of children's behaviour was rated as more disruptive and inattentive by teachers from a different ethnic group (Dee, 2005;Gilliam et al., 2016;Blank et al., 2016). ...
Preprint
Full-text available
Reliance on stereotypes is a persistent feature of human decision-making and has been extensively documented in educational settings, where it can shape students' confidence, performance, and long-term human capital accumulation. While effective techniques exist to mitigate these negative effects, a crucial first step is to establish whether teachers can recognize stereotypes in their professional environment. We introduce the Stereotype Identification Test (SIT), a novel survey tool that asks teachers to evaluate and comment on the presence of stereotypes in images randomly drawn from school textbooks. Their responses are systematically linked to established measures of implicit bias (Implicit Association Test, IAT) and explicit bias (survey scales on teaching stereotypes and social values). Our findings demonstrate that the SIT is a valid and reliable measure of stereotype recognition. Teachers' ability to recognize stereotypes is linked to trainable traits such as implicit bias awareness and inclusive teaching practices. Moreover, providing personalized feedback on implicit bias improves SIT scores by 0.25 standard deviations, reinforcing the idea that stereotype recognition is malleable and can be enhanced through targeted interventions.
... Note that blind grading should not be confused with anonymous grading (Hanna & Linden, 2012); in anonymous grading, assessors do not see the students' names to avoid certain biases (e.g., gender, ethnicity). ...
Article
Full-text available
Assessing exams with multiple assessors is challenging regarding inter-rater reliability and feedback. This paper presents 'checkbox grading,' a digital method where exam designers have predefined checkboxes with both feedback and associated partial grades. Assessors then tick the checkboxes relevant to a student solution. Dependencies between checkboxes ensure consistency among assessors in following the grading scheme. Moreover, the approach supports 'blind grading' by hiding the grades associated with the checkboxes, thus focusing assessors on the criteria rather than the scores. The approach was studied during a large-scale mathematics state exam. Results show that assessors perceived checkbox grading as very useful. However, compared to traditional grading-where assessors follow a correction scheme and communicate the resulting grade-more time is spent on checkbox grading, while both approaches are equally reliable. Blind grading improved inter-rater reliability for some tasks. Overall, checkbox grading might lead to a smoother process where feedback, not solely grades, is communicated to students.
... For example, in a study conducted in Israel, boys faced grade discrimination in literacy, math, and science when blind and non-blind evaluated primary school test scores were compared (Lavy, 2008). The literature review by Protivínský and Münich (2018) showed that only 2 of 13 studies did not find gendered grading bias against boys; however, the methodology of these two studies was different because teachers evaluated exams with a particular gender, age, and caste assigned (see Hanna and Linden, 2012). ...
Article
Full-text available
We studied the gender achievement gap in grades and standardised test scores in Finland, where the gender differences are largest among OECD countries. We compared the gender achievement gap in standardised test scores from PISA surveys and grades from high-quality school registers in literacy. Furthermore, we analysed how grades differ from standardised test scores by family background and students' SES composition of the schools. By using the Blinder-Oaxaca decomposition method, we explored how different characteristics between girls and boys explain gender differences in grading. Our findings indicate that boys' grades were lower than can be expected based on standardised test scores. The gender gap in grades was explained by boys' lower reading interests, effort put into schoolwork, and conscientiousness on homework. However, even adjusting for schooling characteristics and competence, boys have lower grades than test scores in schools that have low SES student composition.
... At the end of the study, all students were again tested with the same IQ test and the bloomers' scores were statistically significantly higher than other students' scores. Some recent studies present evidence that, on average, teachers hold lower expectations for students of another race/ethnicity than do teachers evaluating students of the same race/ethnicity (Cornwell, Mustard, and Van Parys 2013; Gershenson, Holt, and Papageorge 2016; Hanna and Linden 2012). ...
... Entwined with implicit biases related to socioeconomic factors, these practices can lead to unequal evaluations [38,39]. Students of lower SES are frequently graded more harshly, a pattern observed universally, irrespective of school poverty levels [40,41]. Such disparities intensify educational inequalities, significantly burdening the economically disadvantaged. ...
Article
Full-text available
This study explores the grading disparities among ninth-grade students within the American educational system, emphasizing the comparative analysis between economically disadvantaged students (indicated by Free or Reduced-Price Lunch status) and their more advantaged counterparts across urban, suburban, and rural locales. Drawing on a robust dataset of 65,017 first-time, full-time ninth graders from Arkansas, spanning the academic years 2020-21 to 2021-22, this research employs logistic regression analysis to uncover the nuanced relationships of socioeconomic status and geographical setting on course failure rates. The ninth grade is highlighted as a critical juncture in the U.S. educational trajectory, serving as a foundational year that significantly influences students' future academic and career pathways. My findings reveal that, although rural students initially present with lower failure rates, a detailed logit analysis accounting for individual and district-level characteristics demonstrates that rural ninth graders face the highest risk of course failure, especially among those with Free or Reduced Lunch (FRL) status. These results underscore the pressing need for implementing equitable grading practices and bolstering professional development for educators in rural areas to mitigate these disparities. This study contributes to the broader field of educational equity by highlighting systemic challenges and advocating for targeted interventions to support disadvantaged students, particularly in the pivotal year of ninth grade.
... Although initially developed to explain hiring discrimination, this theory has recently been applied to the educational context. Hanna and Linden (2012) find experimental evidence of statistical discrimination in grading: When asked to evaluate a series of exams with randomly assigned ascribed characteristics (gender, age, and caste), teachers rely less on them, reducing bias against low-caste students, as information about the testing instrument and grade distribution is obtained. Likewise, Botelho et al. (2015) compare teacher assessments of 8th graders across 10.6 thousand classrooms in Brazil to standardized scores (blindly marked) to study racial discrimination. ...
Article
Full-text available
Teachers are the evaluators of academic merit. Identifying if their assessments are fair or biased by student-ascribed status is critical for equal opportunity but empirically challenging, with mixed previous findings. We test status characteristics beliefs, statistical discrimination, and cultural capital theories with a pre-registered factorial experiment on a large sample of Spanish pre-service teachers (n = 1,717). This design causally identifies, net of ability, the impact of student-ascribed characteristics on teacher short- and long-term assessments, improving prior studies’ theory testing, confounding, and power. Findings unveil teacher bias in an essay grading task favoring girls and highbrow cultural capital, aligning with status characteristics and cultural capital theories. Results on teachers’ long-term expectations indicate statistical discrimination against boys, migrant origin, and working-class students under uncertain information. Unexpectedly, ethnic discrimination changes from teachers favoring native origin in long-term expectations to migrant origin in short-term evaluations, suggesting compensatory grading. We discuss the complex roots of discrimination in teacher assessments as an educational (in)equality mechanism.
... (Arrow 1998) propose instead that, without complete information on students' true ability, teachers rely on group-level characteristics (e.g., average historical, educational outcomes by SES groups) to gauge student potential, leading to biased assessments. These would become fairer as teachers get new input on the individual student (Botelho, Maderia and Rangel 2015;Hanna and Linden 2012). Thus, statistical discrimination might be particularly salient for uncertain long-term outcomes, such as educational expectations or tracking recommendations (Batruch et al. 2023). ...
Preprint
Full-text available
Teachers are academic merit gatekeepers. Yet their potential role in reproducing inequality via assessments was overlooked or not correctly identified, being 'an elephant in the classroom'. This article teases if teacher grades and track recommendations are biased by student SES or unobserved ability, leading to overestimation in prior research. Using the German NEPS panel across elementary education, we identify student ability with multiple cognitive and noncognitive composite measures and an instrumental variable design. We further assess heterogeneity along the ability distribution to test whether, according to the compensatory hypothesis, teacher bias is largest among low-performers. First, accounting for measurement error, teacher bias declines by 40%, indicating substantial overestimation in previous studies. Second, it concentrates on underperformers, suggesting high-SES parental compensatory strategies to boost teacher assessments. Thus, families and teachers might influence each other in the evaluation process. We discuss the findings' theoretical and methodological implications for teacher bias as an educational reproduction mechanism.
... In other words, teachers make an assessment without being able to observe the exit test scores. Following the literature on objective (blind testing such as the school exit test) and subjective (such as the teacher track recommendation) measures of ability (e.g., Alesina et al., 2018;Botelho et al., 2015;Burgess & Greaves, 2013;Hanna & Linden, 2012;Van Ewijk, 2011), we can form two conflicting hypotheses. The first is that the initial teacher track recommendation will be closer to the true ability of a migrant student without a school exit test score, because the test scores are relatively low for migrant students due to language barriers (Crawford, 2004). ...
Article
This paper evaluates whether educational outcomes of first-generation migrant children improved relative to those of natives after a policy change which delayed an important primary school exit test by three months. Using Dutch register data and a difference-in-differences methodology, we show that the policy change increased the academic rank of migrants relative to natives upon first enrollment. The policy change, therefore, has had an important positive effect on the educational chances of migrant children. Our analyses suggest that the results are driven by higher relative exit test scores and higher relative teacher recommendations.
... When handling large amounts of coursework to be corrected, it might be helpful to use the software to double-check one's evaluations to counteract potential biases (e.g., tendencies to grade first and last answers better or individual biases when grading; see [7] or Protivínský & Münich [15], for instance). This becomes especially important when anonymous grading is not possible. ...
Article
Evaluating text-based answers obtained in educational settings or behavioral studies is time-consuming and resource-intensive. Applying novel artificial intelligence tools such as ChatGPT might support the process. Still, currently available implementations do not allow for automated and case-specific evaluations of large numbers of student answers. To counter this limitation, we developed flexible software and a user-friendly web application that enable researchers and educators to use cutting-edge artificial intelligence technologies by providing an interface that combines large language models with options to specify questions of interest, sample solutions, and evaluation instructions for automated answer scoring. We validated the method in an empirical study and found that the software's scores agreed with expert ratings with high reliability. Hence, the present software constitutes a valuable tool to facilitate and enhance text-based answer evaluation.
- Generative AI-enhanced software for customizable, case-specific, and automated grading of large amounts of text-based answers
- Open-source software and web application for direct implementation and adaptation
... They revealed that students from reserved categories were allocated 1-4 out of 30 marks in the viva-voce examination, to 'deny entry to aspirants from marginalised groups in educational and public sector employment on the pretext that they lack even minimum merit is not a new phenomenon' (Alpha, 2021). Discrimination in grading based on caste identity and teacher bias has even been noted in the Indian school education system (Hanna & Linden, 2006; Nayak, this volume). ...
Chapter
Education, like other social institutions, has been reproducing age-old prejudices, practices and stigma through pedagogical practices. Scholars (Kumar, Economic and Political Weekly, 51, 12–15, 2016; Pathania & Tierney, Tertiary Education and Management, 24, 221–231, 2018; Chand & Karre, Contemporary Voice of Dalit, 11, 55–61, 2019) have argued that educational institutions are the reflection of caste prejudices and discrimination. Studies have highlighted subtle processes of discrimination within educational institutions. Using a semi-structured interview method, this chapter unfolds a case study of the higher education system in India’s largest state, eastern Uttar Pradesh, and examines the role of ‘caste dynamics’ in viva-voce examinations in higher education, because low-caste students face discrimination on the basis of their social position in the caste-ridden hierarchical society. The chapter attempts to answer the following questions: (a) How do educational institutions play a significant role in the (re)-production of caste in society in general, and in educational settings in particular? (b) How do prejudice and discrimination maintain caste reproduction in contemporary times? and (c) How does the educational institution act as an ideological state apparatus for the same? Finally, the chapter tries to unravel the changing nature of caste and its discriminatory role in the viva-voce examination.
... historical discrimination would explicitly inherit existing societal bias and further lead to cascading unfair decision-making in real-world applications (Fay and Williams 1993; Hanna and Linden 2012; Obermeyer et al. 2019). ...
Article
While machine learning models have achieved unprecedented success in real-world applications, they might make biased/unfair decisions for specific demographic groups and hence result in discriminative outcomes. Although research efforts have been devoted to measuring and mitigating bias, they mainly study bias from the result-oriented perspective while neglecting the bias encoded in the decision-making procedure. This results in their inability to capture procedure-oriented bias, which therefore limits the ability to have a fully debiasing method. Fortunately, with the rapid development of explainable machine learning, explanations for predictions are now available to gain insights into the procedure. In this work, we bridge the gap between fairness and explainability by presenting a novel perspective of procedure-oriented fairness based on explanations. We identify the procedure-based bias by measuring the gap of explanation quality between different groups with Ratio-based and Value-based Explanation Fairness. The new metrics further motivate us to design an optimization objective to mitigate the procedure-based bias where we observe that it will also mitigate bias from the prediction. Based on our designed optimization objective, we propose a Comprehensive Fairness Algorithm (CFA), which simultaneously fulfills multiple objectives - improving traditional fairness, satisfying explanation fairness, and maintaining the utility performance. Extensive experiments on real-world datasets demonstrate the effectiveness of our proposed CFA and highlight the importance of considering fairness from the explainability perspective. Our code: https://github.com/YuyingZhao/FairExplanations-CFA.
... Obese students and those with a low socioeconomic status are evaluated worse than healthy-weight students (Dian & Triventi, 2021;MacCann & Roberts, 2013) and those with a higher SES, respectively (Doyle et al., 2022;Westphal et al., 2016). The same is true in India for students belonging to a lower caste (Hanna & Linden, 2012). On the contrary, it was not possible to confirm the existence of discrimination in grading based on either family type (Guttmann & Boudo, 1988) or physical attractiveness (Kehle et al., 1974). ...
Article
Studies on teachers' grading suggest that school grades depend not only on students' performance but also on teachers' bias toward specific social categories. Numerous studies have tested for discrimination in grading using different strategies and focusing on multiple student characteristics. This study aims to summarise those studies by identifying (1) the methodologies used, (2) the characteristics on which discrimination is based and (3) the empirical results. We conducted a scoping review in which studies were selected blindly by the two authors. The initial search was conducted with ERIC, Education Database and PsycInfo, and 37 studies were identified. A comparison among the included studies suggests that the main strategies used are experiments and regression analysis on the difference between blind and non-blind scores, while gender, race/ethnicity and migration background are the most frequently tested characteristics. Finally, on average, studies confirmed the presence of discrimination in grading, albeit with some exceptions and, sometimes, only under specific conditions. To conclude, it is challenging to test for teachers' discrimination through grading, and to date the methodologies used have some limitations. However, on average, empirical evidence suggests that school grades are affected by teachers' bias.
... The caste identity of an individual has always been a determining factor in education (Munshi and Rosenzweig, 2006; Hanna and Linden, 2012; Hoff and Pandey, 2014), access to healthcare services (Luke and Munshi, 2007), access to public goods (Anderson, 2011) and marital choices (Munshi and Rosenzweig, 2009). The Government of India has undertaken the world's largest affirmative action program to eliminate caste-based discrimination and social exclusion, but caste continues to play a significant role in all facets of Indian society, even among the ostensibly progressive educated urban population (Banerjee et al., 2013). ...
Article
Students from marginalized backgrounds are more likely to be subject to stereotypical and deficit-oriented teacher beliefs, which may contribute to low learning outcomes and diminished wellbeing. Most of the research on this topic is conducted in the US and other high-income countries, however, and evidence from the majority world, which is home to over 85% of all children, remains scattered and disconnected. This study therefore conducts a structured review of the literature on teacher beliefs and their implications for the educational outcomes of disadvantaged students in low- and middle-income countries. Based on the findings, we develop a conceptual framework that identifies the structural and cultural determinants of teacher beliefs and the various mechanisms that link them to diminished wellbeing and educational achievement for disadvantaged students. Our review poignantly illustrates the degree of injustice and discrimination inflicted on economically disadvantaged children as they navigate education systems around the world. We also highlight examples of positive teacher beliefs and practices, however, and discuss promising avenues for future research.
Chapter
This chapter examines teacher evaluation bias as a potential driver of reversed gender disparities in academic performance in rural Philippines. Leveraging a natural experimental design, we analyze a difference-in-differences (DID) estimand between “non-blind” report-card (RC) scores and “blind” standardized test (NAT) scores for the same students. Despite controlling for individual, household, and other school characteristics, boys consistently underperform in teacher-assigned scores—a discrepancy not observed in blind assessments. We identify two mechanisms: (i) implicit stereotypes favoring girls, and (ii) the tendency for teacher evaluations to capture diligence-related behaviors where boys may fall short. This gendered biasing effect on the scores is compounded by a highly gender-skewed teacher labor market, where female teachers dominate due to limited professional alternatives for educated women in underdeveloped local economies. Our findings suggest that evaluation bias is not merely pedagogical but rooted in broader institutional and labor market structures. The chapter concludes with implications for gender-equalizing education policy and the importance of integrating educational reforms with economic development strategies aimed at diversifying employment pathways, rather than attributing disparities solely to individual teacher consciousness.
Article
The paper presents an empirical study that examines the academic performance in mathematics of fifteen-year-old students at the end of elementary school in Serbia. In conducting the analysis, the student results of the national test and their mathematics grades were utilized. The study covers a seven-year period 2013-19 and over 440 thousand students. Empirical findings affirm that girls exhibit superior math grades and higher achievements in national testing. However, the occurrence of Simpson's paradox indicates a grading bias favoring girls. The results further highlight that girls exhibit a notably higher interest in attempting to solve open-ended tasks, even when unsuccessful, in contrast to boys, who leave those tasks unanswered during tests more frequently. The qualitative component of the research involved a focus group comprising teachers. It was conducted to gain insights into the teachers' perspectives and experiences regarding the gender grading gap.
Article
The Great British Bake Off is a popular amateur cooking competition show, and its design offers an opportunity for analyzing serial position bias among expert rankings. In this paper, we use the technical challenge portion of the show to assess whether experts—in this case, the judges in the show—are susceptible to primacy or recency effects. We find that expert judges favor the first dish tasted in a blind test and that this pattern holds not only among judges of the British version of the show but also in other English‐speaking versions. We do not find evidence of a recency effect. Our results indicate that expert assessments, regularly used in markets, are vulnerable to bias even when there are no financial incentives.
Article
This research study investigates the influence of school building configurations on the likelihood of ninth-grade student course failure. We analyze data from the 2017-18 and 2018-19 school years in Arkansas, and we categorize buildings as “Non-transitional,” “Focus,” or “Traditional.” Employing logistic regression models and interaction terms, our research finds that ninth-grade students are less likely to fail in “Focus” building configurations compared to “Non-transitional” and “Traditional” building configurations. Moreover, we find economically disadvantaged ninth-grade students are less likely to fail when they attend “Focus” buildings than “Non-transitional” or “Traditional” buildings. Our findings emphasize the importance of interventions and policy adjustments to address ninth-grade students’ academic challenges, especially in transitional years and buildings, and underscore the relationship between building environments and cultures on ninth-grade achievement.
Article
We study how people change their behavior after being made aware of bias. Teachers in Italian schools give lower grades to immigrant students relative to natives of comparable ability. In two experiments, we reveal to teachers their own stereotypes, measured by an Implicit Association Test (IAT). In the first, we find that learning one’s IAT before assigning grades reduces the native-immigrant grade gap. In the second, IAT disclosure and generic debiasing have similar average effects, but there is heterogeneity: teachers with stronger negative stereotypes do not respond to generic debiasing but change their behavior when informed about their own IAT. (JEL D91, I24, J15, J45)
Article
Prior research underscores the pivotal role high school freshman grade-point averages (FGPA) play in college enrollment, with a focus predominantly on urban settings. This study broadens this perspective, employing a diverse Arkansas sample (n = 33,207), spanning rural, suburban, and urban high school students and filling a notable literature gap. Utilizing a logit analysis, we found that high school students with an A FGPA were 23% more likely to enroll in college than B FGPA peers. Those failing a course in their high school freshman year were 13% less likely to enroll in college. Among similar academic ability students, economically disadvantaged students were 15% less likely to enroll in college. Locale classifications showed no significant enrollment variations. We conclude that FGPA and socioeconomic status (SES) are stronger enrollment predictors than locale classifications, finishing with intervention recommendations for lower-SES students in exploring college options.
Article
We examine the persistence of teachers’ gender biases by following teachers over time in different classes. We find a very high correlation of gender biases for teachers across their classes. We find a substantial impact of gender bias on student performance in university admissions exams, choice of university field of study, and quality of the enrolled program. The effects on university choice outcomes are larger for girls, explaining some gender differences in STEM majors. Teachers with lower value-added are also more likely to be gender biased. (JEL I21, I23, J16, J24, J45)
Article
We analyze statistical discrimination in hiring markets using a multiarmed bandit model. Myopic firms face workers arriving with heterogeneous observable characteristics. The association between the worker’s skill and characteristics is unknown ex ante; thus, firms need to learn it. Laissez-faire causes perpetual underestimation: minority workers are rarely hired, and therefore, the underestimation tends to persist. Even a marginal imbalance in the population ratio frequently results in perpetual underestimation. We demonstrate that a subsidy rule that is implemented as temporary affirmative action effectively alleviates discrimination stemming from insufficient data. This paper was accepted by Nicolas Stier-Moses, Management Science Special Issue on The Human-Algorithm Connection. Funding: This work was supported by the Social Sciences and Humanities Research Council of Canada [Grant 430-2020-00088] and JST ERATO [Grant JPMJER2301], Japan. Supplemental Material: The online appendix and data files are available at https://doi.org/10.1287/mnsc.2022.00893 .
Article
This article argues for the need to introduce discrimination-critical performance assessment in nursing education. Assuming that diversity is one of the foundations of human society, the article outlines an action model for self-critique and strategy development.
Article
This paper estimates the impacts of attending better middle schools on the test scores, on-time graduation, self-reported socio-emotional skills, aspirations, and high school track choices of marginally admitted students. A regression discontinuity design comparing students just above and below the admission threshold to higher-achieving middle schools in Mexico shows some modest gains on externally graded tests, but adverse effects on GPA and on-time graduation. By the end of middle school, marginally admitted students feel academically inferior to their peers, obtain worse scores on measures of conscientiousness, and are more likely to shift their aspirations and subsequent schooling choices from academic to vocational programs. The results are consistent with the hypothesis that unfavourable peer comparisons, stemming from direct observation or subjective teacher assessments, can be sufficiently important to affect students’ educational trajectories.
Article
National assessments can be used to explore the strictness of teachers in grading students by comparing student grades to their scores on standardised tests. Several factors influence teacher-given grades, including student gender, school type, geographical regions, and socioeconomic status. In this paper, we used data from the Italian institute INVALSI, responsible for the organisation of national mathematics assessments, to investigate how these factors influence teachers’ grading standards. We considered a sample of 36,589 Grade 13 Italian students from 2,062 classes at 990 high schools. The relationships between the variables were analysed using hierarchical linear modelling. The findings reveal that teacher-given grades are related to student-level variables (e.g. gender, socioeconomic status, and score on the INVALSI test) and school-level variables (e.g. school type and location). When the difference between teacher-assigned grades and scores on the INVALSI test was considered, only student gender, school type, and location accounted for the gap in student achievements. Therefore, student socioeconomic status has a lower influence on their performance on the INVALSI test, suggesting that using standardised assessments might improve equity in assessment.
Article
This study attempts to trace the differential pathways that dalit and non-dalit students from comparable elite educational backgrounds traverse in their journey from college to work. While the training they receive in the university world is quite comparable, dalit students lack many advantages that turn out to be crucial in shaping their employment outcomes. Dalit students support the affirmative action policy completely, which allows them to break their traditional marginality. Our findings suggest that social and cultural capital (the overlapping of caste, class, family background and networks) matter a great deal in the urban, highly skilled, formal and allegedly meritocratic private sector jobs, where hiring practices are less transparent than appear at first sight.
Article
This paper draws on interview data to analyse the attitudes of employers/hiring managers in India's organised private sector towards the caste and community attributes of their potential employees. We focus on the role ascriptive qualities play in employer perceptions of job candidates, arguing that they persist despite a formal adherence to the importance of merit. Antagonism toward reservations, as a mechanism for promoting employment for scheduled castes, is articulated as a principled commitment to the modern virtues of competition and productivity.
Article
Scholars have documented that Black students enter kindergarten with weaker reading skills than their White counterparts and that this disparity sometimes persists through secondary school. This Black-White performance gap is even more evident when comparing students whose parents have equal years of schooling. This article evaluates how schools can positively affect this disparity by examining two potential sources for this difference: teachers and students. It provides evidence for the proposition that teachers' perceptions, expectations, and behaviors interact with students' beliefs, behaviors, and work habits in ways that help to perpetuate the Black-White test score gap.
Article
Which is more equitable, teacher-assigned grades or high-stakes tests? Nationwide, there is a growing trend toward the adoption of standardized tests as a means to determine promotion and graduation. “High-stakes testing” raises several concerns regarding the equity of such policies. In this article, the authors examine the question of whether high-stakes tests will mitigate or exacerbate inequities between racial and ethnic minority students and White students, and between female and male students. Specifically, by comparing student results on the Massachusetts Comprehensive Assessment System (MCAS) with teacher-assigned grades, the authors analyze the relative equitability of the two measures across three subject areas: math, English, and science. The authors demonstrate that the effects of high-stakes testing programs on outcomes, such as retention and graduation, are different from the results of using grades alone, and that some groups of students who are already faring poorly, such as African Americans and Latinos/Latinas, will do even worse if high-stakes testing programs are used as criteria for promotion and graduation.
Article
The present study is one of a series exploring the role of social categorization in intergroup behaviour. It has been found in our previous studies that in 'minimal' situations, in which the subjects were categorized into groups on the basis of visual judgments they had made or of their esthetic preferences, they clearly discriminated against members of an outgroup although this gave them no personal advantage. However, in these previous studies division into groups was still made on the basis of certain criteria of 'real' similarity between subjects who were assigned to the same category. Therefore, the present study established social categories on an explicitly random basis without any reference to any such real similarity. It was found that, as soon as the notion of 'group' was introduced into the situation, the subjects still discriminated against those assigned to another random category. This discrimination was considerably more marked than the one based on a division of subjects in terms of interindividual similarities in which the notion of 'group' was never explicitly introduced. In addition, it was found that fairness was also a determinant of the subjects' decisions. The results are discussed from the point of view of their relevance to a social-cognitive theory of intergroup behaviour.
Article
Empirical studies have provided evidence that discrimination exists in various markets, but they rarely allow the analyst to draw conclusions concerning the nature of discrimination. By combining data from bilateral negotiations in the Sportscard market with complementary field experiments, this study provides a framework that amends this shortcoming. The experimental design, which includes data gathered from more than 1100 market participants, provides sharp findings: (i) there is a strong tendency for minorities to receive initial and final offers that are inferior to those received by majorities, and (ii) overall, the data indicate that the observed discrimination is not due to animus, but represents statistical discrimination.
Article
Stereotype threat is being at risk of confirming, as self-characteristic, a negative stereotype about one's group. Studies 1 and 2 varied the stereotype vulnerability of Black participants taking a difficult verbal test by varying whether or not their performance was ostensibly diagnostic of ability, and thus, whether or not they were at risk of fulfilling the racial stereotype about their intellectual ability. Reflecting the pressure of this vulnerability, Blacks underperformed in relation to Whites in the ability-diagnostic condition but not in the nondiagnostic condition (with Scholastic Aptitude Tests controlled). Study 3 validated that ability-diagnosticity cognitively activated the racial stereotype in these participants and motivated them not to conform to it, or to be judged by it. Study 4 showed that mere salience of the stereotype could impair Blacks' performance even when the test was not ability diagnostic. The role of stereotype vulnerability in the standardized test performance of ability-stigmatized groups is discussed.
Article
This paper attempts to investigate the extent of caste discrimination in the Indian labour market in the case of highly qualified scientific and technical manpower. The decomposition technique has been applied to the 1981 DHTP survey data pertaining to south India. The results indicate the presence of severe caste prejudices against the scheduled castes. The empirical results indicated an earning disadvantage of 111% in the case of SC personnel. The decomposition results based on selectivity corrected wage equations also indicated 74% earnings disadvantage for the SC personnel compared to the NSC personnel. At the endowment level the uncorrected earnings equation results indicated 11% advantage for the SC personnel, whereas in the selectivity corrected version the NSC personnel enjoyed an endowment advantage of 25%. It may be possible that the observed earnings advantage to the SC is due to the reservation policy. Caste prejudice leads to discriminatory behaviour in the Indian scientific and technical labour market.
Article
Gender differences in mathematics test performance have been documented extensively, providing a fairly clear picture of the circumstances under which differences are found. Notably fewer insights have been offered as to how these differences arise, why performance differences are found on tests but not in classroom grades, or what might be done to change current patterns. Using Halpern's (1997) psychobiosocial model of cognitive development as the point of departure, this article seeks to trace how differences in socialization patterns may contribute to cognitive processing differences, which, in turn, may lead to performance differences on tests.
Article
Children's strategies in giving money to others were examined in an intergroup condition, based on a "weak" act of social categorization, and in an interpersonal condition, based on "strong" friendship choice. Over a series of trials, coins were arranged on cards so that each decision was made in a 3 X 2 matrix. Children used a Maximum Difference (relative gain) strategy to a marked degree, a Maximum Ingroup Payoff (absolute gain) to some extent, but a Maximum Joint Payoff strategy hardly at all. The Maximum Difference strategy was used as much in the "weak" intergroup condition as in the "strong" interpersonal condition, and as frequently among younger as among older children.
Article
A‐level results have a substantial impact upon candidates’ futures and it is crucial that the results are as fair as possible. Candidates’ names appear on examination scripts and some have suggested that this could produce bias in the marking. Introduction of ‘blind marking’ in A‐level examinations would be unwieldy and costly. Two experiments on blind marking were carried out: in A‐level Chemistry and A‐level English literature. In each study, presentation (and not the content) of 30 scripts was varied. Eight Chemistry A‐level examiners and 16 English literature A‐level examiners took part in the studies. Scripts were presented as blind or non‐blind, with a male or female name and ‘male’ or ‘female’ handwriting. The studies addressed the issue of possible gender bias in marking and investigated whether blind marking could overcome gender bias. It was concluded that bias was not present in the marking and therefore no support was found for the introduction of blind marking in A‐level examinations.
Article
This paper uses National Sample Survey data to examine the wage gap between higher castes and the scheduled castes/tribes in the regular salaried urban labour market. The main conclusions we draw are (a) discrimination causes 15 per cent lower wages for SC/STs as compared to equally qualified others; (b) SC/ST workers are discriminated against both in the public and private sectors, but the discrimination effect is much larger in the private sector; (c) discrimination accounts for a large part of the gross earnings difference between the two social groups in the regular salaried urban labour market, with occupational discrimination – unequal access to jobs – being considerably more important than wage discrimination – unequal pay in the same job; and (d) the endowment difference is larger than the discrimination component.
Article
This article proposes the use of a new technology to assure student anonymity and reduce bias hazards: identifying students by using bar codes. The limited finding suggests that the use of bar codes for assuring student anonymity could potentially cause students to perceive that grades are assigned more fairly and reassure teachers that they are avoiding identity bias to the greatest extent possible. The authors recommend further implementation of bar code usage to investigate whether students and faculty could achieve positive perceptions as well as whether there are any drawbacks to such use.
Article
Previous research indicates that the work of women is often devalued relative to that of men. Two experiments tested the hypothesis that such sex bias appears when judges follow ambiguous guidelines or criteria in making evaluations, but not when they follow clear evaluation guidelines. In each experiment, male and female undergraduates evaluated a performance that was attributed to either a man or woman (an intellectual test performance in Experiment I; an artistic craft object in Experiment II). Subjects followed either clear, explicit evaluation criteria or vague, ambiguous criteria. As predicted, female subjects evaluated the “female's” performance less favorably than the “male's” only when criteria were vague. In contrast, male subjects showed little evidence of sex bias, regardless of the criteria they followed. Discussion centers upon: (1) possible cognitive processes underlying the observed effects of clear criteria; and (2) potential practical applications designed to alleviate sex bias in naturalistic settings.
Article
This study investigated the nature of racial bias in black raters when rating pupils' writings. The three writings were: Negative writing—describing how the ratee was treated as a devil by an unfair teacher; Neutral writing—describing an episode the ratee saw; and Positive writing—describing how the ratee was treated as an angel by a kind teacher. Effects of pupils' race and sex on the ratings given by two groups of black raters (90 and 108 raters, respectively) to the three writings were analyzed. Results of the two-way ANOVA indicated that (a) on the negative writing raters manifested significantly high racial bias in favor of blacks against whites and orientals, (b) on the neutral writing there was little evidence to support existence of racial bias on the part of raters, and (c) on the positive writing raters' bias was significantly sex-directed rather than race-oriented. Results of the study were interpreted as an indication of raters' differential biasing effects on rating, assuming that raters espoused specific assumptions about ratees when evaluating ratees' criterion performance.
Article
The use of educational testing in the United States has been criticized for its inequitable effects on different populations of students. Many assume that new forms of assessment will lead to more equitable outcomes. Linda Darling-Hammond argues in this article, however, that alternative assessment methods, such as performance-based assessment, are not inherently equitable, and that educators must pay careful attention to the ways that the assessments are used. Some school reform strategies, for example, use assessment reform as a lever for external control of schools. These strategies, Darling-Hammond argues, are unlikely to be successful and the assessments are unlikely to be equitable because they stem from a distrust of teachers and fail to involve teachers in the reform processes. Darling-Hammond argues instead for policies that ensure "top-down support for bottom-up reform," where assessment is used to give teachers practical information on student learning and to provide opportunities for school communities to engage in "a recursive process of self-reflection, self-critique, self-correction, and self-renewal." Ultimately, then, the equitable use of performance assessments depends not only on the design of the assessments themselves, but also on how well the assessment practices are interwoven with the goals of authentic school reform and effective teaching.
Article
A number of studies have found that high status groups tend to discriminate more than low status groups. This tendency can be interpreted as reflecting either a desire to maintain a positive social identity or an application of equity. An experiment was conducted in order to examine the roles of these two factors. The independent variables were status (high vs. low), and the nature of the relation between the dimension on which status was defined and the dimension on which in-group bias was measured (relation, no relation). When the two dimensions were related, equity was expected to be relevant. Therefore it was predicted that the status differential would be reproduced through the allocations. Contrary to this hypothesis, neither the high nor the low status group displayed in-group favoritism in this condition. The authors suggest that the use of two related dimensions rather than only one, as in previous studies, is responsible for such a discrepancy. It was further predicted that when the two dimensions were unrelated, equity would be irrelevant and therefore members of both groups would display in-group bias on the new dimension-either as a means of preserving a positive social identity or in order to achieve one. This second hypothesis was confirmed.
Article
Three studies are reported which investigated the existence of sex bias in the marking of undergraduate degrees. Study 1 failed to find any evidence that females were marked less extremely than males by second markers, as has been found in previous research. Study 2 found that marker disagreements were not resolved upwards more frequently for male candidates, again contradicting the results of some previous research. Study 3 failed to find any of the expected differences between an institution using blind marking and one using non‐blind procedures. In the light of this negative evidence and of the confusing picture presented by previous research, it is concluded that there is little firm evidence for sex bias in marking. Despite this, it is likely that there will be increasing pressure to adopt blind marking in the future.
Article
47 male and 50 female undergraduates rated 1 of 4 stimulus persons on competence and intelligence. Results show that highly competent males were rated more positively than highly competent females and males of low competence lower than similar females. Ss' sex was nonsignificant. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Chapter
Shock-absorbing layers perform vital roles in the user comfort, safety and ball interaction characteristics of synthetic sports pitches. The layer typically comprises a porous composite of granulated recycled rubber bound in a polyurethane resin, compacted to form a flat continuous pad upon which the carpet is laid. A lack of published information regarding sports shockpads has prompted research at Loughborough University that aims to investigate the fundamental aspects of shockpad layers, namely their design, construction, characteristic behavior, and test methods. This paper outlines the findings of a detailed study investigating the effect of mix design variables on shockpad properties. Primary mix design variables, binder content, bulk density and rubber size distribution, were varied individually in industry standard shockpads produced using a reproducible hand-construction method. A comparison of tensile strength, ball rebound measurements and Clegg Hammer impact behavior showed marked influence of these variables over shockpad performance. The dominance of smaller sized rubber particles in a 2–6 mm rubber size produced a softer shockpad (lower Clegg Impact Values), higher tensile strength and slightly decreased ball rebound resilience (more energy absorbed). Higher binder contents increased shockpad strength but had no effect on ball rebound or Clegg Impact Values. Increasing the bulk density of shockpads increased the shockpad tensile strength and decreased Clegg Impact Values (i.e. softer). Further work is ongoing to assess the effect of other design and construction variables. Testing to assess the effect of various carpets placed above the shockpad is also ongoing to assess the whole pitch system’s performance.
Article
Schools and teachers are often said to be a source of stereotypes that harm girls. This paper tests for the existence of gender stereotyping and discrimination by public high-school teachers in Israel. It uses a natural experiment based on blind and non-blind scores that students receive on matriculation exams in their senior year. Using data on test results in several subjects in the humanities and sciences, I found, contrary to expectations, that male students face discrimination in each subject. These biases widen the female–male achievement difference because girls outperform boys in all subjects, except English, and at all levels of the curriculum. The bias is evident in all segments of the ability and performance distribution and is robust to various individual controls. Several explanations based on differential behavior between boys and girls are not supported empirically. However, the size of the difference is very sensitive to teachers' characteristics, suggesting that the bias against male students is the result of teachers', and not students', behavior.
Article
We study the role of caste and religion in India's new economy sectors, software and call-centers, by sending 3160 fictitious resumes in response to 371 job openings in and around Delhi (India) that were advertised in major city papers and online job sites. We randomly allocate caste-linked surnames across resumes in order to isolate the effect of caste on applicants' job-search outcomes. We find no evidence of discrimination against non-upper-caste (i.e. Scheduled Caste, Scheduled Tribe, and Other Backward Caste) applicants for software jobs. We do find larger and significant differences between callback rates for upper-castes and Other Backward Castes (and to a lesser extent Scheduled Castes) in the case of call-center jobs. There is no evidence of discrimination against Muslims for either of the two kinds of jobs we apply for. Overall, the evidence suggests that applicants' caste identities do not significantly affect the callback decisions of firms in these rapidly-growing sectors of the Indian economy.
Article
The study was designed to assess the effects of a student's race, dialect, and physical attractiveness on teachers' evaluations. The students were of two races, black and white; three physical attractiveness levels, high, middle, and low; and they spoke one of two dialects, Black English or Standard English. Sixty-eight white elementary school teachers listened to each student's response and rated the student in terms of personality, quality of response, and current and future academic abilities. Analysis of the results showed that all main effects and interactions were significant. Generally, black students, Black English-speaking students, and low attractive students were rated lower. The results also revealed that teachers' ratings in the different areas were highly consistent with one another. Discussion centered around the results' implications for determining the cause(s) of black children's failure in school. The results provide some support for attributing these children's academic failures to their race and dialect rather than to their actual performance.
Article
In order to investigate the effect of traditional sex roles on perceptions of performance in a leadership situation, groups of subjects worked on a task which involved the placement of dominoes into a predetermined pattern. Sex of the leader and sex of the followers were varied factorially. Since leadership is traditionally a masculine entity, subjects should have differential expectations for males and females in roles of leadership and followership. When performance level does not meet expectations, the individual should be negatively evaluated. As predicted, males are judged more harshly than females when they are leaders, but more leniently than females when they are followers. Contrary to prediction, males and females did not differ as to how much they enjoyed leadership. Implications for organizational settings are discussed.
Article
In this paper, I provide a theoretical explanation for the gender differences in education and on the labour market that are observed empirically in most OECD (Organisation for Economic Cooperation and Development) countries, including the US. Within a cheap talk model of grading, I show that biased grading in schools results in (1) boys outperforming girls in maths and sciences, (2) boys having more top and more bottom achievers in maths and sciences than girls, (3) girls outperforming boys in reading literacy, (4) female graduates enrolling in university studies more often than male graduates, (5) the predominance of female students in arts and humanities at the university, (6) the predominance of male students in maths and sciences at the university and (7) the gender wage gap on the labour market for the highly educated.
Article
We study a randomized evaluation of a merit scholarship program in which Kenyan girls who scored well on academic exams had school fees paid and received a grant. Girls showed substantial exam score gains, and teacher attendance improved in program schools. There were positive externalities for girls with low pretest scores, who were unlikely to win a scholarship. We see no evidence for weakened intrinsic motivation. There were heterogeneous program effects. In one of the two districts, there were large exam gains and positive spillovers to boys. In the other, attrition complicates estimation, but we cannot reject the hypothesis of no program effect. Copyright by the President and Fellows of Harvard College and the Massachusetts Institute of Technology.
Article
This paper examines an affirmative action program for "lower-caste" groups in engineering colleges in India. We study both the targeting properties of the program, and its implications for labor market outcomes. We find that affirmative action successfully targets the financially disadvantaged: the upper-caste applicants that are displaced by affirmative action come from a richer economic background than the lower-caste applicants that are displacing them. Targeting by caste, however, may lead to the exclusion of other disadvantaged groups. For example, caste-based targeting reduces the overall number of females entering engineering colleges. We find that despite poor entrance exam scores, lower-caste entrants obtain a positive return to admission. Our estimates, however, also suggest that these gains may come at an absolute cost because the income losses experienced by displaced upper-caste applicants are larger than the income gains experienced by displacing lower-caste students. Limited sample sizes in our preferred econometric specifications, however, prevent us from drawing strong conclusions from these labor market findings.
Article
In 1965 the authors conducted an experiment in a public elementary school, telling teachers that certain children could be expected to be “growth spurters,” based on the students' results on the Harvard Test of Inflected Acquisition. In point of fact, the test was nonexistent and those children designated as “spurters” were chosen at random. What Rosenthal and Jacobson hoped to determine by this experiment was the degree (if any) to which changes in teacher expectation produce changes in student achievement.
Article
Contestant voting behavior on the television game show Weakest Link provides an unusual opportunity to distinguish between taste-based and information-based theories of discrimination. In early rounds, strategic incentives encourage voting for the weakest competitors. In later rounds, the incentives reverse and the strongest competitors become the logical target. Controlling for other characteristics, both theories of discrimination predict that in early rounds excess votes will be made against groups targeted for discrimination. In later rounds, however, taste-based models predict continued excess votes, whereas statistical discrimination predicts fewer votes against the target group. Although players are voting strategically, evidence of discrimination is limited. There is little in the data to suggest discrimination against women and blacks. I find some patterns consistent with information-based discrimination toward Hispanics (other players perceive them as having low ability) and taste-based discrimination against older players (other players treat them with animus).
Article
We show that if firms statistically discriminate among young workers on the basis of easily observable characteristics such as education, then as firms learn about productivity, the coefficients on the easily observed variables should fall, and the coefficients on hard-to-observe correlates of productivity should rise. We find support for this proposition using NLSY79 data on education, the AFQT test, father's education, and wages for young men and their siblings. We find little evidence for statistical discrimination in wages on the basis of race. Our analysis has a wide range of applications in the labor market and elsewhere. © 2000 the President and Fellows of Harvard College and the Massachusetts Institute of Technology
Article
Concerns about potential bias in the grading of medical students at the Southern Illinois University School of Medicine led to a major institutional policy change whereby students' identities were masked during the test-grading process. The present study assessed the effect of this anonymous test grading policy by comparing the performance of men and women students and of white and African American students prior to and after adoption of the policy change. A test-passing rate was determined for each of 476 freshmen students in the comparison groups from the eight classes of 1988 through 1995. Mean test-passing rates for the four student cohorts prior to policy implementation (1988-1991) were compared with mean passing rates after the policy was implemented (1992-1995). The pre-post change in the mean test-passing rate of men was not significantly different from the pre-post change of women, and a nonsignificant effect was also found when the pre-post change in the mean test-passing rate of white students was compared with that of African American students. No significant pre-post change was found for white men, white women, African American men, or African American women. The results showed no effect of the anonymous test-grading policy, which suggests that there was no widespread gender or racial bias in the grading of freshman medical students before the change in institutional grading policy.
Article
"The focus of this research note is the migration of the Patidar community to East Africa--and remigration to Gujarat, India. The primary motive for migration of the immigrant Patidars was to work, accumulate money and return to India, claiming a higher caste status. By 1931, a sufficient number of the community had become economically affluent and were given a higher caste status by the census enumerators. This study illustrates the transient nature of Indian migration to East Africa and its impact on caste mobility."
Article
Using a general equilibrium model of credit market discrimination, I find that both taste-based discrimination and statistical discrimination have similar predictions for the intergroup differences in loan terms. The commonly held view has been that if taste-based discrimination exists, loans approved to minority borrowers will have higher expected profitability than those to majorities with comparable credit background. I show that the validity of this profitability view depends crucially on how expected loan profitability is measured. I also show that taste-based discrimination must exist if loans to minority borrowers have higher expected rates of return or lower expected rates of default loss than those to majorities with the same exogenous characteristics observed by lender at the time of loan originations. My analysis suggests that the valid method to test for taste-based discrimination should be reduced-form regressions. Empirically, I fail to find supporting evidence for the existence of taste-based discrimination.
Article
This study analyzes the effects of right-wing extremism on the well-being of immigrants based on data from the German Socio-Economic Panel (SOEP) for the years 1984 to 2006 merged with state-level information on election outcomes. The results show that the life satisfaction of immigrants is significantly reduced if right-wing extremism in the native population increases. Moreover, the life satisfaction of highly educated immigrants is affected more strongly than that of low-skilled immigrants. This supports the view that policies aimed at making immigration more attractive to the high-skilled have to include measures that reduce xenophobic attitudes in the native population.
Article
A change in the audition procedures of symphony orchestras--adoption of "blind" auditions with a "screen" to conceal the candidate's identity from the jury--provides a test for sex-biased hiring. Using data from actual auditions, in an individual fixed-effects framework, we find that the screen increases the probability a woman will be advanced and hired. Although some of our estimates have large standard errors and there is one persistent effect in the opposite direction, the weight of the evidence suggests that the blind audition procedure fostered impartiality in hiring and increased the proportion of women in symphony orchestras.
Article
This, the pioneering quantitative analysis of caste in the Indian urban labour market, examines the age-old problem of caste in the light of discrimination theory and government policy. Using a survey of workers in Delhi, the gross wage difference between ‘scheduled’ (untouchable) and ‘non-scheduled’ caste is decomposed into its ‘explained’ and ‘discrimination’ components and, from a model of occupation choice, into wage- and job-discrimination. Discrimination is found to exist, and to operate at least in part through the traditional mechanism, viz. assignment to jobs, with the scheduled castes entering poorly-paid ‘dead-end’ jobs. It is assisted by methods of recruitment based on contacts, prevalent in the manual occupation, which also cause past discrimination to carry over to the present. Its practice serves the economic interests of those who exercise a taste for discrimination.