I was writing a piece on Kristin Hannah’s book The Women recently, and as part of my research, I read The New York Times’s review of the book. ChatGPT has shown proficiency in writing in a New York Times style (allegedly because the AI was trained on Times articles). I wanted to see if it could transform my writing, which was for a high school audience, to the Times’s style. To test this out, I fed my review to ChatGPT and asked it to rewrite it as if it were an editor for the Times.
Can you tell which section below was my writing and which was GPT’s? GPTZero and other AI detectors were 87% confident in the right answer. Here’s the first passage:
Frankie, The Women’s central character, comes from an affluent American family steeped in patriarchal tradition. Her father’s prized “hero’s wall” celebrates the men of the family who’ve served in the military. Women, by contrast, are relegated to supporting roles, memorialized only in photographs taken with their husbands. Frankie’s brother, Finley, is a Navy man, but her father cannot fathom his daughter serving her country in uniform.
And the second:
The main character of The Women, Frankie McGrath, comes from an archetypical wealthy American family. The family is dead set in patriarchal traditions. Though her brother Finley McGrath serves in the Navy, Frankie’s father would not dream of a woman of the family in military service. Frankie’s father’s prized “hero’s wall” honors the men who fight for the United States. The wives get a photo on the wall with their husbands – not alone, but with their man – and nothing else.
These two passages are very similar because I had ChatGPT rewrite my article instead of creating one from scratch. I believe in the power of GPT as a copy editor, and in fact, I preferred the version it wrote, but it also didn’t sound like me. I wrote the second paragraph. Both pieces were written in the formal style of academic and news writing. However, because of how ChatGPT was trained, it often fails to write informally. In other words, ChatGPT can write a great literary paper, but it probably cannot text a friend about it. In the style of a 16-year-old texting a friend to complain about Biology, ChatGPT and I wrote:
Yo [Name], bio just straight-up humbled me 💀 that test was a whole jump scare—like, why did it feel like half the questions were in a different language?? we’re out here learning gene expression, but I’m the one getting repressed fr. might just start manifesting extra brain cells bc the grind is not sustainable rn 😭
Here’s the second writing sample:
Hey [Name] – ngl bio hard rn. Will said I should drop it but nah. High School Ybk said it’s the hardest class at the school… Learning a lot of heat stuff but don’t know if it’s worth the grind. What’d you think of that last test? – I thought I was boxed on the last prob. Meh though we’ll see the grade.
You can probably tell that I wrote the second text. (Yes, I know it is cringe-worthy, and I am sorry.) ChatGPT’s writing was much harder to pick out in the first two passages, which were written in a more formal style. Here, ChatGPT’s response used slang well and read more like someone of my generation would text, but it didn’t feel human. It lacked the quintessence of texting – not formulaic, but intentionally fluid and informal.
My “text” may seem a little too informal, but it still exemplifies the issue at the heart of ChatGPT. Can it write more polished prose than a human? Undoubtedly. Can it write just as well, just as creatively, as a human? Probably not. Again, part of this is due to model design. ChatGPT generates text in “tokens” – small bits of language that are represented as numbers (say, “ski” mapped to one number and “ing” to another). The problem with tokens is that a model cannot create new ones; its vocabulary is fixed when it is trained. Let’s say that someone creates a new symbol in physics. ChatGPT will not understand what it means, even if you can type it. This is because it understands things not conceptually, but as tokens.
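If you want to see tokens for yourself, here is a minimal sketch using OpenAI’s open-source tiktoken library. The encoding name below is just one of the library’s published vocabularies, and the exact way a word gets split is a detail of that vocabulary, not a claim about any particular ChatGPT model:

```python
# A rough sketch of tokenization using OpenAI's open-source tiktoken library.
# The encoding name and the exact splits are details of that library's
# published vocabularies, not claims about a specific ChatGPT model.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # one published vocabulary

text = "skiing"
token_ids = enc.encode(text)                   # a short list of integer IDs
pieces = [enc.decode([t]) for t in token_ids]  # the text fragment behind each ID

print(token_ids)    # the numbers the model actually works with
print(pieces)       # something like ["ski", "ing"], depending on the vocabulary
print(enc.n_vocab)  # the vocabulary size is fixed; new tokens can't be invented at run time
```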
Here’s another interesting example of its inability to understand conceptually. Try to solve this riddle. It’s the classic puzzle (adapted from Wikipedia) with one very small twist:
A farmer with a wolf, a goat, and a cabbage must cross a river by boat. The boat can carry only the farmer and three items. If left unattended together, the wolf would eat the goat, or the goat would eat the cabbage. How can they cross the river without anything being eaten?
Now, to a human, it is obvious that the farmer can carry everything in one trip. However, since ChatGPT reads this sentence as tokens rather than as words with meaning, it doesn’t notice that. Instead, it interprets the token pattern as the original riddle. In other words, since the original says the boat can carry only one item, ChatGPT presumes that is still the case despite being told otherwise. It then outputs a step-by-step solution to the original riddle: take the goat across, then the wolf, bring the goat back, and so on. But all of that is unnecessary, because the twist ChatGPT somehow misses is that the boat now carries three items. The actual answer is that the farmer puts everything in the boat and brings it across in one trip. This supports the idea that ChatGPT can’t actually reason like a human.
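If you want to try the twisted riddle yourself, here is one way to pose it through the OpenAI Python SDK. This is only a sketch: the model name is an example, and the reply you get will vary from run to run:

```python
# A sketch of posing the modified riddle to a chat model via the OpenAI Python SDK.
# The model name is only an example; responses vary between runs.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in your environment

riddle = (
    "A farmer with a wolf, a goat, and a cabbage must cross a river by boat. "
    "The boat can carry only the farmer and three items. If left unattended "
    "together, the wolf would eat the goat, or the goat would eat the cabbage. "
    "How can they cross the river without anything being eaten?"
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # example model name
    messages=[{"role": "user", "content": riddle}],
)

# In the test described above, the reply walked through the classic multi-trip
# solution instead of noticing that the boat holds everything at once.
print(response.choices[0].message.content)
```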
ChatGPT also fails to slip past writing-detection tests; GPTZero, an online AI detector, correctly flagged every ChatGPT-written piece above as AI and my writing as human. Many English professors say that they can tell the AI writing apart because it has no voice. Can you?