
As schools introduce artificial intelligence into the classroom, a new analysis suggests these tools could point students in different directions depending on who they are.
Researchers at Stanford University fed 600 college essays into four different AI models and asked the models for feedback on each essay. The argumentative essays focused on whether schools should require community service and whether aliens created a hill on Mars. (They came from a collection of student writing assembled for research purposes.)
Then the researchers did something simple but revealing: They submitted each essay to the AI models 12 more times, each time with a different description of the student who wrote it, identifying the writer, for example, as Black or white, male or female, highly motivated or unmotivated, or as having a learning disability.
The feedback changed.
The researchers found consistent patterns across all of the AI models. Essays attributed to Black students received more praise and encouragement, sometimes emphasizing leadership or power. (“Your personal story is powerful! Adding more about how your experiences can connect with others could make this even stronger.”) Essays labeled as written by Hispanic students or English learners were more likely to trigger corrections on grammar and “correct” English. When the student was identified as white, comments more often focused on argument structure, evidence and clarity, the kind of feedback that can push writers to strengthen their ideas.
AI models addressed female students more affectionately and used more first-person pronouns. (“I like your confidence to speak your mind!”) Students labeled as unmotivated received optimistic encouragement. In contrast, students described as high-performing or motivated were more likely to receive direct, critical suggestions aimed at refining their work.
Different words for different students

Source: Table 4, “Marked Pedagogies: Examining Linguistic Bias in Personalized Automated Writing Feedback” by Mei Tan, Lena Phalen and Dorottya Demszky
In other words, the AI’s feedback differed both in tone and in the expectations it set for the student. The paper, “Marked Pedagogies: Examining Linguistic Bias in Personalized Automated Writing Feedback,” has not yet been published in a peer-reviewed journal, but it was nominated for best paper at the 16th International Conference on Learning Analytics and Knowledge in Norway, where it is scheduled to be presented on April 30.
The researchers describe the results as showing “positive feedback bias” and “feedback retention bias” – offering more praise and less criticism to certain groups of students. Although the differences in any single piece of written feedback may be hard to notice, the patterns were clear across hundreds of essays.
The researchers believe the AI changes its responses to identical essays because the models are trained on large amounts of human language. Human teachers may also soften criticism when responding to students from certain backgrounds, sometimes because they don’t want to appear unfair or discouraging. “They detect biases that humans exhibit,” said Mei Tan, lead author of the study and a doctoral student at the Stanford Graduate School of Education.
At first glance, the differences in feedback may not seem harmful. More encouragement could boost a student’s confidence. Many educators argue that culturally responsive teaching, which recognizes students’ identities and experiences, can increase student engagement in school.
But there is a trade-off.
If some students are systematically shielded from criticism while others are pushed to refine their arguments, the result can be unequal opportunities to improve. Praise can motivate, but it is no substitute for the kind of specific, direct feedback that helps students grow as writers. Tanya Baker, executive director of the National Writing Project, a nonprofit organization, recently heard a presentation about this study and said she was concerned that Black and Hispanic students were not being “pushed to learn” to write better.
This raises a difficult question for schools adopting AI tools: When does helpful personalization cross the line into harmful stereotypes?
Of course, teachers are unlikely to explicitly tell AI systems a student’s race or background, as the researchers did in this experiment. But that doesn’t solve the problem, Stanford researchers say. Many educational databases and learning platforms already collect detailed information about students, from their past achievements to their language status. As AI becomes integrated into these systems, it may have access to far more context than a teacher could consciously provide. And even without explicit labels, AI can sometimes infer aspects of identity from the writing itself.
The larger problem is that AI systems are not neutral guardians. Even the default feedback, generated when the researchers did not describe the student’s personal characteristics, takes a particular approach to writing instruction. Tan described it as quite daunting and correction-oriented. “Perhaps the takeaway is that we should not leave pedagogy to the large language model,” Tan said. “Humans should be in control.”
Tan recommends that teachers review AI-generated feedback before passing it on to students. But one of the selling points of AI feedback is that it’s instantaneous. If a teacher has to review it first, that slows the process and could reduce its effectiveness.
AI also offers the potential for personalization. The risk is that, without careful attention, this personalization could lower the bar for some students and raise it for others.
Contact staff writer Jill Barshay at 212-678-3595, jillbarshay.35 on Signal or barshay@hechingerreport.org.
This story on AI bias was produced by The Hechinger Report, an independent, nonprofit news organization that covers education. Sign up for Proof Points and other Hechinger newsletters.
The article “AI Gives More Praise, Less Criticism to Black Students” first appeared in The Hechinger Report.