How to Get ChatGPT to Write an Essay: Prompts, Outlines, & More

Last Updated: June 2, 2024

This article was written by Bryce Warwick, JD, and by wikiHow staff writer Nicole Levine, MFA. Bryce Warwick is currently the President of Warwick Strategies, an organization based in the San Francisco Bay Area offering premium, personalized private tutoring for the GMAT, LSAT, and GRE. Bryce has a JD from the George Washington University Law School. This article has been fact-checked, ensuring the accuracy of any cited facts and confirming the authority of its sources. This article has been viewed 49,174 times.

Are you curious about using ChatGPT to write an essay? While most instructors have tools that make it easy to detect AI-written essays, there are ways you can use OpenAI's ChatGPT to write papers without worrying about plagiarism or getting caught. In addition to writing essays for you, ChatGPT can also help you come up with topics, write outlines, find sources, check your grammar, and even format your citations. This wikiHow article will teach you the best ways to use ChatGPT to write essays, including helpful example prompts that will generate impressive papers.

Things You Should Know

  • To have ChatGPT write an essay, tell it your topic, word count, type of essay, and facts or viewpoints to include.
  • ChatGPT is also useful for generating essay topics, writing outlines, and checking grammar.
  • Because ChatGPT can make mistakes and trigger AI-detection alarms, it's better to use AI to assist with writing than have it do the writing.

Getting ChatGPT to Write the Essay

Step 1: Create an account with ChatGPT.

  • Before using OpenAI's ChatGPT to write your essay, make sure you understand your instructor's policies on AI tools. Using ChatGPT may be against the rules, and it's easy for instructors to detect AI-written essays.
  • While you can use ChatGPT to write a polished-looking essay, there are drawbacks. Most importantly, ChatGPT cannot verify facts or provide references. This means that essays created by ChatGPT may contain made-up facts and biased content. [1] It's best to use ChatGPT for inspiration and examples instead of having it write the essay for you.

Step 2: Gather your notes. Before writing your prompt, collect the details ChatGPT will need:

  • The topic you want to write about.
  • Essay length, such as word or page count. Whether you're writing an essay for a class, college application, or even a cover letter, you'll want to tell ChatGPT how much to write.
  • Other assignment details, such as type of essay (e.g., personal, book report, etc.) and points to mention.
  • If you're writing an argumentative or persuasive essay, know the stance you want to take so ChatGPT can argue your point.
  • If you have notes on the topic that you want to include, you can also provide those to ChatGPT.
  • When you plan an essay, think of a thesis, a topic sentence for each body paragraph, and the examples you expect to present in each paragraph.
  • This can be a loose outline rather than an extensive sentence-by-sentence structure; it just needs to give a good overview of how your points relate.

Step 3: Ask ChatGPT to write the essay. Give it the topic, length, type of essay, and any points you want covered. For example:

  • "Write a 2000-word college essay that covers different approaches to gun violence prevention in the United States. Include facts about gun laws and give ideas on how to improve them."
  • This prompt not only tells ChatGPT the topic, length, and grade level, but also that the essay is personal. ChatGPT will write the essay in the first-person point of view.
  • "Write a 4-page college application essay about an obstacle I have overcome. I am applying to the Geography program and want to be a cartographer. The obstacle is that I have dyslexia. Explain that I have always loved maps, and that having dyslexia makes me better at making them."

Tyrone Showers

Be specific when using ChatGPT. Clear and concise prompts outlining your exact needs help ChatGPT tailor its response. Specify the desired outcome (e.g., creative writing, informative summary, functional resume), any length constraints (word or character count), and the preferred emotional tone (formal, humorous, etc.).

Step 4: Add to or change the essay. Because ChatGPT remembers the conversation, you can ask for changes with follow-up prompts:

  • In our essay about gun control, ChatGPT did not mention school shootings. If we want to discuss this topic in the essay, we can use the prompt, "Discuss school shootings in the essay."
  • Let's say we review our college entrance essay and realize that we forgot to mention that we grew up without parents. Add to the essay by saying, "Mention that my parents died when I was young."
  • In the Israel-Palestine essay, ChatGPT explored two options for peace: A 2-state solution and a bi-state solution. If you'd rather the essay focus on a single option, ask ChatGPT to remove one. For example, "Change my essay so that it focuses on a bi-state solution."
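Through the API, revisions work the same way they do in the chat window: send the conversation so far plus the new instruction, and ChatGPT edits the essay it already wrote rather than starting over. A sketch continuing the example above:

```python
# Keep the whole exchange in one message list so the follow-up
# instruction applies to the essay ChatGPT already produced.
messages = [{"role": "user", "content": prompt}]
first = client.chat.completions.create(model="gpt-4o", messages=messages)
essay = first.choices[0].message.content

messages += [
    {"role": "assistant", "content": essay},
    {"role": "user", "content": "Discuss school shootings in the essay."},
]
revised = client.chat.completions.create(model="gpt-4o", messages=messages)
print(revised.choices[0].message.content)
```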

Step 5: Ask for sources.

Tip: Pay close attention to the content ChatGPT generates. If you use ChatGPT often, you'll start noticing its patterns, like its tendency to begin articles with phrases like "in today's digital world." Once you spot patterns, you can refine your prompts to steer ChatGPT in a better direction and avoid repetitive content.

Using AI to Help You Write

Step 1: Generate essay topics.

  • "Give me ideas for an essay about the Israel-Palestine conflict."
  • "Ideas for a persuasive essay about a current event."
  • "Give me a list of argumentative essay topics about COVID-19 for a Political Science 101 class."

Step 2: Create an outline.

  • "Create an outline for an argumentative essay called "The Impact of COVID-19 on the Economy."
  • "Write an outline for an essay about positive uses of AI chatbots in schools."
  • "Create an outline for a short 2-page essay on disinformation in the 2016 election."

Step 3: Find sources.

  • "Find peer-reviewed sources for advances in using MRNA vaccines for cancer."
  • "Give me a list of sources from academic journals about Black feminism in the movie Black Panther."
  • "Give me sources for an essay on current efforts to ban children's books in US libraries."

Step 4: Create a sample essay.

  • "Write a 4-page college paper about how global warming is changing the automotive industry in the United States."
  • "Write a 750-word personal college entrance essay about how my experience with homelessness as a child has made me more resilient."
  • You can even refer to the outline you created with ChatGPT, as the AI bot can reference up to 3,000 words from the current conversation. For example: "Write a 1,000-word argumentative essay called 'The Impact of COVID-19 on the United States Economy' using the outline you provided. Argue that the government should take more action to support businesses affected by the pandemic."
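When assignment details need to hold across every draft in a conversation, a "system" message is a convenient place to pin them. A hedged sketch reusing the client from the earlier examples; the wording of both messages is illustrative:

```python
# The system message fixes the essay type, point of view, and length;
# the user message then only has to carry the topic.
messages = [
    {"role": "system", "content": (
        "You are helping draft a personal college entrance essay. "
        "Write in the first person and keep it under 750 words.")},
    {"role": "user", "content": (
        "Write about how my experience with homelessness as a child "
        "has made me more resilient.")},
]
draft = client.chat.completions.create(model="gpt-4o", messages=messages)
print(draft.choices[0].message.content)
```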

Step 5: Use ChatGPT to proofread your essay and format your citations.

  • One way to do this is to paste a list of the sources you've used, including URLs, book titles, authors, pages, publishers, and other details, into ChatGPT along with the instruction "Create an MLA Works Cited page for these sources."
  • You can also ask ChatGPT to provide a list of sources, and then build a Works Cited or References page that includes those sources. You can then replace sources you didn't use with the sources you did use.
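Scripted, the citation step is just another prompt. A minimal sketch, again reusing the client from above; the two source entries are invented placeholders, not real citations, and every generated entry should be verified by hand:

```python
# Replace these placeholder entries with your real source details.
sources = """\
Doe, Jane. Example Book Title. Example University Press, 2020.
Roe, Richard. "Example Article." Example Journal, vol. 1, no. 2, 2021.
"""

cite_prompt = "Create an MLA Works Cited page for these sources:\n" + sources
result = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": cite_prompt}],
)
print(result.choices[0].message.content)  # check every entry by hand
```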

Expert Q&A

  • Because it's easy for teachers, hiring managers, and college admissions offices to spot AI-written essays, it's best to use your ChatGPT-written essay as a guide to write your own essay. Using the structure and ideas from ChatGPT, write an essay in the same format, but using your own words.
  • Always double-check the facts in your essay, and make sure facts are backed up with legitimate sources.
  • If you see an error that says ChatGPT is at capacity, wait a few moments and try again.

Warnings

  • Using ChatGPT to write or assist with your essay may be against your instructor's rules. Make sure you understand the consequences of using ChatGPT to write or assist with your essay.
  • ChatGPT-written essays may include factual inaccuracies, outdated information, and inadequate detail. [3]


Thanks for reading our article! If you’d like to learn more about completing school assignments, check out our in-depth interview with Bryce Warwick, JD.

  • ↑ https://help.openai.com/en/articles/6783457-what-is-chatgpt
  • ↑ https://platform.openai.com/examples/default-essay-outline
  • ↑ https://www.ipl.org/div/chatgpt/


ChatGPT: Everything you need to know about the AI-powered chatbot


ChatGPT, OpenAI’s text-generating AI chatbot, has taken the world by storm since its launch in November 2022. What started as a tool to hyper-charge productivity through writing essays and code with short text prompts has evolved into a behemoth used by more than 92% of Fortune 500 companies .

That growth has propelled OpenAI itself into becoming one of the most-hyped companies in recent memory. And its latest partnership with Apple for its upcoming generative AI offering, Apple Intelligence, has given the company another significant bump in the AI race.

2024 also saw the release of GPT-4o, OpenAI’s new flagship omni model for ChatGPT. GPT-4o is now the default free model, complete with voice and vision capabilities. But after demoing GPT-4o, OpenAI paused one of its voices, Sky, after allegations that it was mimicking Scarlett Johansson’s voice in “Her.”

OpenAI is facing internal drama, including the sizable exit of co-founder and longtime chief scientist Ilya Sutskever as the company dissolved its Superalignment team. OpenAI is also facing a lawsuit from Alden Global Capital-owned newspapers , including the New York Daily News and the Chicago Tribune, for alleged copyright infringement, following a similar suit filed by The New York Times last year.

Here’s a timeline of ChatGPT product updates and releases, starting with the latest, which we’ve been updating throughout the year. And if you have any other questions, check out our ChatGPT FAQ here.

Timeline of the most recent ChatGPT updates


OpenAI delays ChatGPT’s new Voice Mode

OpenAI planned to start rolling out its advanced Voice Mode feature to a small group of ChatGPT Plus users in late June, but it says lingering issues forced it to postpone the launch to July. OpenAI says Advanced Voice Mode might not launch for all ChatGPT Plus customers until the fall, depending on whether it meets certain internal safety and reliability checks.

ChatGPT releases app for Mac

ChatGPT for macOS is now available for all users. With the app, users can quickly call up ChatGPT by using the keyboard combination of Option + Space. The app lets users upload files and photos, speak to ChatGPT from their desktop, and search through their past conversations.

The ChatGPT desktop app for macOS is now available for all users. Get faster access to ChatGPT to chat about email, screenshots, and anything on your screen with the Option + Space shortcut: https://t.co/2rEx3PmMqg pic.twitter.com/x9sT8AnjDm — OpenAI (@OpenAI) June 25, 2024

Apple brings ChatGPT to its apps, including Siri

Apple announced at WWDC 2024 that it is bringing ChatGPT to Siri and other first-party apps and capabilities across its operating systems. The ChatGPT integrations, powered by GPT-4o, will arrive on iOS 18, iPadOS 18 and macOS Sequoia later this year, and will be free without the need to create a ChatGPT or OpenAI account. Features exclusive to paying ChatGPT users will also be available through Apple devices .

Apple is bringing ChatGPT to Siri and other first-party apps and capabilities across its operating systems #WWDC24 Read more: https://t.co/0NJipSNJoS pic.twitter.com/EjQdPBuyy4 — TechCrunch (@TechCrunch) June 10, 2024

House Oversight subcommittee invites Scarlett Johansson to testify about ‘Sky’ controversy

Scarlett Johansson has been invited to testify about the controversy surrounding OpenAI’s Sky voice at a hearing for the House Oversight Subcommittee on Cybersecurity, Information Technology, and Government Innovation. In a letter, Rep. Nancy Mace said Johansson’s testimony could “provide a platform” for concerns around deepfakes.

ChatGPT experiences two outages in a single day

ChatGPT was down twice in one day: one multi-hour outage in the early hours of the morning Tuesday and another outage later in the day that is still ongoing. Anthropic’s Claude and Perplexity also experienced some issues.

You're not alone, ChatGPT is down once again. pic.twitter.com/Ydk2vNOOK6 — TechCrunch (@TechCrunch) June 4, 2024

The Atlantic and Vox Media ink content deals with OpenAI

The Atlantic and Vox Media have announced licensing and product partnerships with OpenAI . Both agreements allow OpenAI to use the publishers’ current content to generate responses in ChatGPT, which will feature citations to relevant articles. Vox Media says it will use OpenAI’s technology to build “audience-facing and internal applications,” while The Atlantic will build a new experimental product called Atlantic Labs .

I am delighted that @theatlantic now has a strategic content & product partnership with @openai . Our stories will be discoverable in their new products and we'll be working with them to figure out new ways that AI can help serious, independent media : https://t.co/nfSVXW9KpB — nxthompson (@nxthompson) May 29, 2024

OpenAI signs 100K PwC workers to ChatGPT’s enterprise tier

OpenAI announced a new deal with management consulting giant PwC . The company will become OpenAI’s biggest customer to date, covering 100,000 users, and will become OpenAI’s first partner for selling its enterprise offerings to other businesses.

OpenAI says it is training its GPT-4 successor

OpenAI announced in a blog post that it has recently begun training its next flagship model to succeed GPT-4. The news came in an announcement of its new safety and security committee, which is responsible for informing safety and security decisions across OpenAI’s products.

Former OpenAI director claims the board found out about ChatGPT on Twitter

On The TED AI Show podcast, former OpenAI board member Helen Toner revealed that the board did not know about ChatGPT until its launch in November 2022. Toner also said that Sam Altman gave the board inaccurate information about the safety processes the company had in place and that he didn’t disclose his involvement in the OpenAI Startup Fund.

Sharing this, recorded a few weeks ago. Most of the episode is about AI policy more broadly, but this was my first longform interview since the OpenAI investigation closed, so we also talked a bit about November. Thanks to @bilawalsidhu for a fun conversation! https://t.co/h0PtK06T0K — Helen Toner (@hlntnr) May 28, 2024

ChatGPT’s mobile app revenue saw biggest spike yet following GPT-4o launch

The launch of GPT-4o has driven the company’s biggest-ever spike in revenue on mobile , despite the model being freely available on the web. Mobile users are being pushed to upgrade to its $19.99 monthly subscription, ChatGPT Plus, if they want to experiment with OpenAI’s most recent launch.

OpenAI to remove ChatGPT’s Scarlett Johansson-like voice

After demoing its new GPT-4o model last week, OpenAI announced it is pausing one of its voices , Sky, after users found that it sounded similar to Scarlett Johansson in “Her.”

OpenAI explained in a blog post that Sky’s voice is “not an imitation” of the actress and that AI voices should not intentionally mimic the voice of a celebrity. The blog post went on to explain how the company chose its voices: Breeze, Cove, Ember, Juniper and Sky.

We’ve heard questions about how we chose the voices in ChatGPT, especially Sky. We are working to pause the use of Sky while we address them. Read more about how we chose these voices: https://t.co/R8wwZjU36L — OpenAI (@OpenAI) May 20, 2024

ChatGPT lets you add files from Google Drive and Microsoft OneDrive

OpenAI announced new updates for easier data analysis within ChatGPT . Users can now upload files directly from Google Drive and Microsoft OneDrive, interact with tables and charts, and export customized charts for presentations. The company says these improvements will be added to GPT-4o in the coming weeks.

We're rolling out interactive tables and charts along with the ability to add files directly from Google Drive and Microsoft OneDrive into ChatGPT. Available to ChatGPT Plus, Team, and Enterprise users over the coming weeks. https://t.co/Fu2bgMChXt pic.twitter.com/M9AHLx5BKr — OpenAI (@OpenAI) May 16, 2024

OpenAI inks deal to train AI on Reddit data

OpenAI announced a partnership with Reddit that will give the company access to “real-time, structured and unique content” from the social network. Content from Reddit will be incorporated into ChatGPT, and the companies will work together to bring new AI-powered features to Reddit users and moderators.

We’re partnering with Reddit to bring its content to ChatGPT and new products: https://t.co/xHgBZ8ptOE — OpenAI (@OpenAI) May 16, 2024

OpenAI debuts GPT-4o “omni” model now powering ChatGPT

OpenAI’s spring update event saw the reveal of its new omni model, GPT-4o, which has a black hole-like interface , as well as voice and vision capabilities that feel eerily like something out of “Her.” GPT-4o is set to roll out “iteratively” across its developer and consumer-facing products over the next few weeks.

OpenAI demos real-time language translation with its latest GPT-4o model. pic.twitter.com/pXtHQ9mKGc — TechCrunch (@TechCrunch) May 13, 2024

OpenAI to build a tool that lets content creators opt out of AI training

The company announced it’s building a tool, Media Manager, that will allow creators to better control how their content is being used to train generative AI models — and give them an option to opt out. The goal is to have the new tool in place and ready to use by 2025.

OpenAI explores allowing AI porn

In a new peek behind the curtain of its AI’s secret instructions , OpenAI also released a new NSFW policy . Though it’s intended to start a conversation about how it might allow explicit images and text in its AI products, it raises questions about whether OpenAI — or any generative AI vendor — can be trusted to handle sensitive content ethically.

OpenAI and Stack Overflow announce partnership

In a new partnership, OpenAI will get access to developer platform Stack Overflow’s API and will get feedback from developers to improve the performance of their AI models. In return, OpenAI will include attributions to Stack Overflow in ChatGPT. However, the deal was not favorable to some Stack Overflow users — leading to some sabotaging their answers in protest.

U.S. newspapers file copyright lawsuit against OpenAI and Microsoft

Alden Global Capital-owned newspapers, including the New York Daily News, the Chicago Tribune, and the Denver Post, are suing OpenAI and Microsoft for copyright infringement. The lawsuit alleges that the companies stole millions of copyrighted articles “without permission and without payment” to bolster ChatGPT and Copilot.

OpenAI inks content licensing deal with Financial Times

OpenAI has partnered with another European news publisher, London’s Financial Times, which it will pay for content access. “Through the partnership, ChatGPT users will be able to see select attributed summaries, quotes and rich links to FT journalism in response to relevant queries,” the FT wrote in a press release.

OpenAI opens Tokyo hub, adds GPT-4 model optimized for Japanese

OpenAI is opening a new office in Tokyo and has plans for a GPT-4 model optimized specifically for the Japanese language. The move underscores how OpenAI will likely need to localize its technology to different languages as it expands.

Sam Altman pitches ChatGPT Enterprise to Fortune 500 companies

According to Reuters, OpenAI’s Sam Altman hosted hundreds of executives from Fortune 500 companies across several cities in April, pitching versions of its AI services intended for corporate use.

OpenAI releases “more direct, less verbose” version of GPT-4 Turbo

Premium ChatGPT users — customers paying for ChatGPT Plus, Team or Enterprise — can now use an updated and enhanced version of GPT-4 Turbo . The new model brings with it improvements in writing, math, logical reasoning and coding, OpenAI claims, as well as a more up-to-date knowledge base.

Our new GPT-4 Turbo is now available to paid ChatGPT users. We’ve improved capabilities in writing, math, logical reasoning, and coding. Source: https://t.co/fjoXDCOnPr pic.twitter.com/I4fg4aDq1T — OpenAI (@OpenAI) April 12, 2024

ChatGPT no longer requires an account — but there’s a catch

You can now use ChatGPT without signing up for an account, but it won’t be quite the same experience. You won’t be able to save or share chats, use custom instructions, or access other features associated with a persistent account. This version of ChatGPT will have “slightly more restrictive content policies,” according to OpenAI. When TechCrunch asked for more details, however, the response was unclear:

“The signed out experience will benefit from the existing safety mitigations that are already built into the model, such as refusing to generate harmful content. In addition to these existing mitigations, we are also implementing additional safeguards specifically designed to address other forms of content that may be inappropriate for a signed out experience,” a spokesperson said.

OpenAI’s chatbot store is filling up with spam

TechCrunch found that OpenAI’s GPT Store is flooded with bizarre, potentially copyright-infringing GPTs. A cursory search pulls up GPTs that claim to generate art in the style of Disney and Marvel properties but serve as little more than funnels to third-party paid services, and that advertise themselves as being able to bypass AI content detection tools.

The New York Times responds to OpenAI’s claims that it “hacked” ChatGPT for its copyright lawsuit

In a court filing opposing OpenAI’s motion to dismiss The New York Times’ lawsuit alleging copyright infringement, the newspaper asserted that “OpenAI’s attention-grabbing claim that The Times ‘hacked’ its products is as irrelevant as it is false.” The New York Times also claimed that some users of ChatGPT used the tool to bypass its paywalls.

OpenAI VP doesn’t say whether artists should be paid for training data

At a SXSW 2024 panel, Peter Deng, OpenAI’s VP of consumer product, dodged a question on whether artists whose work was used to train generative AI models should be compensated. While OpenAI lets artists “opt out” of and remove their work from the datasets that the company uses to train its image-generating models, some artists have described the tool as onerous.

A new report estimates that ChatGPT uses more than half a million kilowatt-hours of electricity per day

ChatGPT’s environmental impact appears to be massive. According to a report from The New Yorker, ChatGPT uses an estimated 17,000 times as much electricity as the average U.S. household to respond to roughly 200 million requests each day.

ChatGPT can now read its answers aloud

OpenAI released a new Read Aloud feature for the web version of ChatGPT as well as the iOS and Android apps. The feature allows ChatGPT to read its responses to queries in one of five voice options and can speak 37 languages, according to the company. Read Aloud is available on both GPT-4 and GPT-3.5 models.

ChatGPT can now read responses to you. On iOS or Android, tap and hold the message and then tap “Read Aloud”. We’ve also started rolling on web – click the "Read Aloud" button below the message. pic.twitter.com/KevIkgAFbG — OpenAI (@OpenAI) March 4, 2024

OpenAI partners with Dublin City Council to use GPT-4 for tourism

As part of a new partnership with OpenAI, the Dublin City Council will use GPT-4 to craft personalized itineraries for travelers, including recommendations of unique and cultural destinations, in an effort to support tourism across Europe.

A law firm used ChatGPT to justify a six-figure bill for legal services

New York-based law firm Cuddy Law was criticized by a judge for using ChatGPT to calculate their hourly billing rate . The firm submitted a $113,500 bill to the court, which was then halved by District Judge Paul Engelmayer, who called the figure “well above” reasonable demands.

ChatGPT experienced a bizarre bug for several hours

ChatGPT users found that ChatGPT was giving nonsensical answers for several hours , prompting OpenAI to investigate the issue. Incidents varied from repetitive phrases to confusing and incorrect answers to queries. The issue was resolved by OpenAI the following morning.

Match Group announced deal with OpenAI with a press release co-written by ChatGPT

The dating app giant home to Tinder, Match and OkCupid announced an enterprise agreement with OpenAI in an enthusiastic press release written with the help of ChatGPT. The AI tech will be used to help employees with work-related tasks and comes as part of Match’s $20 million-plus bet on AI in 2024.

ChatGPT will now remember — and forget — things you tell it to

As part of a test, OpenAI began rolling out new “memory” controls for a small portion of ChatGPT free and paid users, with a broader rollout to follow. The controls let you tell ChatGPT explicitly to remember something, see what it remembers or turn off its memory altogether. Note that deleting a chat from chat history won’t erase ChatGPT’s or a custom GPT’s memories — you must delete the memory itself.

We’re testing ChatGPT's ability to remember things you discuss to make future chats more helpful. This feature is being rolled out to a small portion of Free and Plus users, and it's easy to turn on or off. https://t.co/1Tv355oa7V pic.twitter.com/BsFinBSTbs — OpenAI (@OpenAI) February 13, 2024

OpenAI begins rolling out “Temporary Chat” feature

Initially limited to a small subset of free and subscription users, Temporary Chat lets you have a dialogue with a blank slate. With Temporary Chat, ChatGPT won’t be aware of previous conversations or access memories but will follow custom instructions if they’re enabled.

But, OpenAI says it may keep a copy of Temporary Chat conversations for up to 30 days for “safety reasons.”

Use temporary chat for conversations in which you don’t want to use memory or appear in history. pic.twitter.com/H1U82zoXyC — OpenAI (@OpenAI) February 13, 2024

ChatGPT users can now invoke GPTs directly in chats

Paid users of ChatGPT can now bring GPTs into a conversation by typing “@” and selecting a GPT from the list. The chosen GPT will have an understanding of the full conversation, and different GPTs can be “tagged in” for different use cases and needs.

You can now bring GPTs into any conversation in ChatGPT – simply type @ and select the GPT. This allows you to add relevant GPTs with the full context of the conversation. pic.twitter.com/Pjn5uIy9NF — OpenAI (@OpenAI) January 30, 2024

ChatGPT is reportedly leaking usernames and passwords from users’ private conversations

Screenshots provided to Ars Technica suggest that ChatGPT is potentially leaking unpublished research papers, login credentials and private information from its users. An OpenAI representative told Ars Technica that the company was investigating the report.

ChatGPT is violating Europe’s privacy laws, Italian DPA tells OpenAI

OpenAI has been told it’s suspected of violating European Union privacy law, following a multi-month investigation of ChatGPT by Italy’s data protection authority. Details of the draft findings haven’t been disclosed, but in a response, OpenAI said: “We want our AI to learn about the world, not about private individuals.”

OpenAI partners with Common Sense Media to collaborate on AI guidelines

In an effort to win the trust of parents and policymakers, OpenAI announced it’s partnering with Common Sense Media to collaborate on AI guidelines and education materials for parents, educators and young adults. The organization works to identify and minimize tech harms to young people and previously flagged ChatGPT as lacking in transparency and privacy .

OpenAI responds to Congressional Black Caucus about lack of diversity on its board

After a letter from the Congressional Black Caucus questioned the lack of diversity in OpenAI’s board, the company responded . The response, signed by CEO Sam Altman and Chairman of the Board Bret Taylor, said building a complete and diverse board was one of the company’s top priorities and that it was working with an executive search firm to assist it in finding talent. 

OpenAI drops prices and fixes ‘lazy’ GPT-4 that refused to work

In a blog post, OpenAI announced price drops for GPT-3.5 Turbo’s API, with input prices dropping by 50% and output prices by 25%, to $0.0005 per thousand input tokens and $0.0015 per thousand output tokens. GPT-4 Turbo also got a new preview model for API use, which includes an interesting fix that aims to reduce the “laziness” that users have experienced.

Expanding the platform for @OpenAIDevs : new generation of embedding models, updated GPT-4 Turbo, and lower pricing on GPT-3.5 Turbo. https://t.co/7wzCLwB1ax — OpenAI (@OpenAI) January 25, 2024
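At the quoted prices, per-request cost is simple arithmetic; a quick sketch:

```python
# GPT-3.5 Turbo API prices quoted above: $0.0005 per 1K input tokens
# and $0.0015 per 1K output tokens.
def gpt35_cost(tokens_in: int, tokens_out: int) -> float:
    return tokens_in / 1000 * 0.0005 + tokens_out / 1000 * 0.0015

# e.g., a 2,000-token prompt that yields a 1,000-token reply:
print(f"${gpt35_cost(2000, 1000):.4f}")  # $0.0025
```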

OpenAI bans developer of a bot impersonating a presidential candidate

OpenAI has suspended AI startup Delphi, which developed a bot impersonating Rep. Dean Phillips (D-Minn.) to help bolster his presidential campaign. The ban comes just weeks after OpenAI published a plan to combat election misinformation, which listed “chatbots impersonating candidates” as against its policy.

OpenAI announces partnership with Arizona State University

Beginning in February, Arizona State University will have full access to ChatGPT’s Enterprise tier , which the university plans to use to build a personalized AI tutor, develop AI avatars, bolster their prompt engineering course and more. It marks OpenAI’s first partnership with a higher education institution.

Winner of a literary prize reveals around 5% of her novel was written by ChatGPT

After receiving the prestigious Akutagawa Prize for her novel The Tokyo Tower of Sympathy, author Rie Kudan admitted that around 5% of the book quoted ChatGPT-generated sentences “verbatim.” Interestingly enough, the novel revolves around a futuristic world with a pervasive presence of AI.

Sam Altman teases video capabilities for ChatGPT and the release of GPT-5

In a conversation with Bill Gates on the Unconfuse Me podcast, Sam Altman confirmed an upcoming release of GPT-5 that will be “fully multimodal with speech, image, code, and video support.” Altman said users can expect to see GPT-5 drop sometime in 2024.

OpenAI announces team to build ‘crowdsourced’ governance ideas into its models

OpenAI is forming a Collective Alignment team of researchers and engineers to create a system for collecting and “encoding” public input on its models’ behaviors into OpenAI products and services. This comes as a part of OpenAI’s public program to award grants to fund experiments in setting up a “democratic process” for determining the rules AI systems follow.

OpenAI unveils plan to combat election misinformation

In a blog post, OpenAI announced users will not be allowed to build applications for political campaigning and lobbying until the company works out how effective their tools are for “personalized persuasion.”

Users will also be banned from creating chatbots that impersonate candidates or government institutions, and from using OpenAI tools to misrepresent the voting process or otherwise discourage voting.

The company is also testing out a tool that detects DALL-E generated images and will incorporate access to real-time news, with attribution, in ChatGPT.

Snapshot of how we’re preparing for 2024’s worldwide elections: • Working to prevent abuse, including misleading deepfakes • Providing transparency on AI-generated content • Improving access to authoritative voting information https://t.co/qsysYy5l0L — OpenAI (@OpenAI) January 15, 2024

OpenAI changes policy to allow military applications

In an unannounced update to its usage policy , OpenAI removed language previously prohibiting the use of its products for the purposes of “military and warfare.” In an additional statement, OpenAI confirmed that the language was changed in order to accommodate military customers and projects that do not violate their ban on efforts to use their tools to “harm people, develop weapons, for communications surveillance, or to injure others or destroy property.”

ChatGPT subscription aimed at small teams debuts

Aptly called ChatGPT Team , the new plan provides a dedicated workspace for teams of up to 149 people using ChatGPT as well as admin tools for team management. In addition to gaining access to GPT-4, GPT-4 with Vision and DALL-E3, ChatGPT Team lets teams build and share GPTs for their business needs.

OpenAI’s GPT store officially launches

After some back and forth over the last few months, OpenAI’s GPT Store is finally here . The feature lives in a new tab in the ChatGPT web client, and includes a range of GPTs developed both by OpenAI’s partners and the wider dev community.

To access the GPT Store, users must be subscribed to one of OpenAI’s premium ChatGPT plans — ChatGPT Plus, ChatGPT Enterprise or the newly launched ChatGPT Team.

the GPT store is live! https://t.co/AKg1mjlvo2 fun speculation last night about which GPTs will be doing the best by the end of today. — Sam Altman (@sama) January 10, 2024

Developing AI models would be “impossible” without copyrighted materials, OpenAI claims

Following a proposed ban on using news publications and books to train AI chatbots in the U.K., OpenAI submitted a plea to the House of Lords communications and digital committee. OpenAI argued that it would be “impossible” to train AI models without using copyrighted materials, and that they believe copyright law “does not forbid training.”

OpenAI claims The New York Times’ copyright lawsuit is without merit

OpenAI published a public response to The New York Times’s lawsuit against them and Microsoft for allegedly violating copyright law, claiming that the case is without merit.

In the response , OpenAI reiterates its view that training AI models using publicly available data from the web is fair use. It also makes the case that regurgitation is less likely to occur with training data from a single source and places the onus on users to “act responsibly.”

We build AI to empower people, including journalists. Our position on the @nytimes lawsuit: • Training is fair use, but we provide an opt-out • "Regurgitation" is a rare bug we're driving to zero • The New York Times is not telling the full story https://t.co/S6fSaDsfKb — OpenAI (@OpenAI) January 8, 2024

OpenAI’s app store for GPTs planned to launch next week

After being delayed in December , OpenAI plans to launch its GPT Store sometime in the coming week, according to an email viewed by TechCrunch. OpenAI says developers building GPTs will have to review the company’s updated usage policies and GPT brand guidelines to ensure their GPTs are compliant before they’re eligible for listing in the GPT Store. OpenAI’s update notably didn’t include any information on the expected monetization opportunities for developers listing their apps on the storefront.

GPT Store launching next week – OpenAI pic.twitter.com/I6mkZKtgZG — Manish Singh (@refsrc) January 4, 2024

OpenAI moves to shrink regulatory risk in EU around data privacy

In an email, OpenAI detailed an incoming update to its terms, including changing the OpenAI entity providing services to EEA and Swiss residents to OpenAI Ireland Limited. The move appears to be intended to shrink its regulatory risk in the European Union, where the company has been under scrutiny over ChatGPT’s impact on people’s privacy.

What is ChatGPT? How does it work?

ChatGPT, developed by the tech startup OpenAI, is a general-purpose chatbot that uses artificial intelligence to generate text after a user enters a prompt. The chatbot uses GPT-4, a large language model that uses deep learning to produce human-like text.

When did ChatGPT get released?

ChatGPT was released for public use on November 30, 2022.

What is the latest version of ChatGPT?

Both the free version of ChatGPT and the paid ChatGPT Plus are regularly updated with new GPT models. The most recent model is GPT-4o .

Can I use ChatGPT for free?

Yes. In addition to the paid version, ChatGPT Plus, there is a free version of ChatGPT that only requires a sign-in.

Who uses ChatGPT?

Anyone can use ChatGPT! More and more tech companies and search engines are utilizing the chatbot to automate text or quickly answer user questions and concerns.

What companies use ChatGPT?

Multiple enterprises utilize ChatGPT, although others may limit the use of the AI-powered tool .

Most recently, Microsoft announced at its 2023 Build conference that it is integrating its ChatGPT-based Bing experience into Windows 11. Brooklyn-based 3D display startup Looking Glass utilizes ChatGPT to produce holograms you can communicate with. And nonprofit organization Solana officially integrated the chatbot into its network with a ChatGPT plug-in geared toward end users to help them onboard into the web3 space.

What does GPT mean in ChatGPT?

GPT stands for Generative Pre-trained Transformer.

What is the difference between ChatGPT and a chatbot?

A chatbot is any software or system that holds a dialogue with a person, but it doesn’t necessarily have to be AI-powered. For example, there are chatbots that are rules-based in the sense that they’ll give canned responses to questions.

ChatGPT is AI-powered and utilizes LLM technology to generate text after a prompt.

Can ChatGPT write essays?

Yes. ChatGPT can draft an essay from a prompt, as covered in detail earlier in this article, though the results may contain factual errors and are often recognizable as AI-written.

Can ChatGPT commit libel?

Due to the nature of how these models work, they don’t know or care whether something is true, only that it looks true. That’s a problem when you’re using it to do your homework, sure, but when it accuses you of a crime you didn’t commit, that may well at this point be libel.

We will see how handling troubling statements produced by ChatGPT will play out over the next few months as tech and legal experts attempt to tackle the fastest moving target in the industry.

Does ChatGPT have an app?

Yes, there is a free ChatGPT mobile app for iOS and Android users.

What is the ChatGPT character limit?

It’s not documented anywhere that ChatGPT has a character limit, but users have noted that there are some character limitations after around 500 words.
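If a long prompt does get cut off, one common workaround is to split the text into chunks and send them in sequence. A rough sketch; the chunk size here is an assumption, not a documented limit:

```python
def chunk_text(text: str, max_chars: int = 2000) -> list[str]:
    """Split text on word boundaries into chunks of at most max_chars."""
    chunks, current, length = [], [], 0
    for word in text.split():
        if current and length + len(word) + 1 > max_chars:
            chunks.append(" ".join(current))
            current, length = [], 0
        current.append(word)
        length += len(word) + 1
    if current:
        chunks.append(" ".join(current))
    return chunks

# Each chunk can then be sent as its own message, e.g. prefixed with
# "Part 1 of 3:" so ChatGPT knows more text is coming.
```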

Does ChatGPT have an API?

Yes, it was released March 1, 2023.

What are some sample everyday uses for ChatGPT?

Everyday examples include programming, scripts, email replies, listicles, blog ideas, summarization, etc.

What are some advanced uses for ChatGPT?

Advanced use examples include debugging code, explaining programming languages and scientific concepts, complex problem solving, etc.

How good is ChatGPT at writing code?

It depends on the nature of the program. While ChatGPT can write workable Python code, it can’t necessarily program an entire app’s worth of code. That’s because ChatGPT lacks context awareness — in other words, the generated code isn’t always appropriate for the specific context in which it’s being used.

Can you save a ChatGPT chat?

Yes. OpenAI allows users to save chats in the ChatGPT interface, stored in the sidebar of the screen. There are no built-in sharing features yet.

Are there alternatives to ChatGPT?

Yes. There are multiple AI-powered chatbot competitors such as Together, Google’s Gemini and Anthropic’s Claude, and developers are creating open source alternatives.

How does ChatGPT handle data privacy?

OpenAI has said that individuals in “certain jurisdictions” (such as the EU) can object to the processing of their personal information by its AI models by filling out this form. This includes the ability to make requests for deletion of AI-generated references about you. OpenAI notes, though, that it may not grant every request, since it must balance privacy requests against freedom of expression “in accordance with applicable laws.”

The web form for requesting deletion of data about you is titled “OpenAI Personal Data Removal Request.”

In its privacy policy, the ChatGPT maker makes a passing acknowledgment of the objection requirements attached to relying on “legitimate interest” (LI), pointing users toward more information about requesting an opt-out: “See here for instructions on how you can opt out of our use of your information to train our models.”

What controversies have surrounded ChatGPT?

Recently, Discord announced that it had integrated OpenAI’s technology into its bot named Clyde, and two users then tricked Clyde into providing them with instructions for making the illegal drug methamphetamine (meth) and the incendiary mixture napalm.

An Australian mayor has publicly announced he may sue OpenAI for defamation due to ChatGPT’s false claims that he had served time in prison for bribery. This would be the first defamation lawsuit against the text-generating service.

CNET found itself in the midst of controversy after Futurism reported the publication was publishing articles under a mysterious byline completely generated by AI. The private equity company that owns CNET, Red Ventures, was accused of using ChatGPT for SEO farming, even if the information was incorrect.

Several major school systems and colleges, including New York City Public Schools , have banned ChatGPT from their networks and devices. They claim that the AI impedes the learning process by promoting plagiarism and misinformation, a claim that not every educator agrees with .

There have also been cases of ChatGPT accusing individuals of false crimes .

Where can I find examples of ChatGPT prompts?

Several marketplaces host and provide ChatGPT prompts, either for free or for a nominal fee. One is PromptBase. Another is ChatX. More launch every day.

Can ChatGPT be detected?

Poorly. Several tools claim to detect ChatGPT-generated text, but in our tests, they’re inconsistent at best.

Are ChatGPT chats public?

No. But OpenAI recently disclosed a bug, since fixed, that exposed the titles of some users’ conversations to other people on the service.

What lawsuits are there surrounding ChatGPT?

None specifically targeting ChatGPT. But OpenAI is involved in at least one lawsuit that has implications for AI systems trained on publicly available data, which would touch on ChatGPT.

Are there issues regarding plagiarism with ChatGPT?

Yes. Text-generating AI models like ChatGPT have a tendency to regurgitate content from their training data.


Anatomy of an AI Essay

How might you distinguish one from a human-composed counterpart? After analyzing dozens, Elizabeth Steere lists some key predictable features.

By Elizabeth Steere



Since OpenAI launched ChatGPT in 2022, educators have been grappling with the problem of how to recognize and address AI-generated writing. The host of AI-detection tools that have emerged over the past year vary greatly in their capabilities and reliability. For example, mere months after OpenAI launched its own AI detector, the company shut it down due to its low accuracy rate.

Understandably, students have expressed concerns over the possibility of their work receiving false positives as AI-generated content. Some institutions have disabled Turnitin’s AI-detection feature due to concerns over potential false allegations of AI plagiarism that may disproportionately affect English-language learners . At the same time, tools that rephrase AI writing—such as text spinners, text inflators or text “humanizers”—can effectively disguise AI-generated text from detection. There are even tools that mimic human typing to conceal AI use in a document’s metadata.

While the capabilities of large language models such as ChatGPT are impressive, they are also limited, as they strongly adhere to specific formulas and phrasing. Turnitin’s website explains that its AI-detection tool relies on the fact that “GPT-3 and ChatGPT tend to generate the next word in a sequence of words in a consistent and highly probable fashion.” I am not a computer programmer or statistician, but I have noticed certain attributes in text that point to the probable involvement of AI. In February, I collected and quantified some of those characteristics in hopes of better recognizing AI essays and sharing those characteristics with students and other faculty members.

I asked ChatGPT 3.5 and the generative AI tool included in the free version of Grammarly each to generate more than 50 analytical essays on early American literature, using texts and prompts from classes I have taught over the past decade. I took note of the characteristics of AI essays that differentiated them from what I have come to expect from their human-composed counterparts. Here are some of the key features I noticed.

AI essays tend to get straight to the point. Human-written work, by contrast, often leads up to its subject gradually, offering personal anecdotes, definitions or rhetorical questions before getting to the matter at hand.

AI-generated essays are often list-like. They may feature numbered body paragraphs or multiple headings and subheadings.

The paragraphs of AI-generated essays also often begin with formulaic transitional phrases. As an example, here are the first words of each paragraph in one essay that ChatGPT produced:

  • “In contrast”
  • “Furthermore”
  • “On the other hand”
  • “In conclusion.”

Notably, AI-generated essays were far more likely than human-written essays to begin paragraphs with “Furthermore,” “Moreover” and “Overall.”

AI-generated work is often banal. It does not break new ground or demonstrate originality; its assertions sound familiar.

AI-generated text tends to remain in the third person. That’s the case even when asked a reader response–style question. For example, when I asked ChatGPT what it personally found intriguing, meaningful or resonant about one of Edgar Allan Poe’s poems, it produced six paragraphs, but the pronoun “I” was included only once. The rest of the text described the poem’s atmosphere, themes and use of language in dispassionate prose. Grammarly prefaced its answer with “I’m sorry, but I cannot have preferences as I am an AI-powered assistant and do not have emotions or personal opinions,” followed by similarly clinical observations about the text.

AI-produced text tends to discuss “readers” being “challenged” to “confront” ideologies or being “invited” to “reflect” on key topics. In contrast, I have found that human-written text tends to focus on hypothetically what “the reader” might “see,” “feel” or “learn.”

AI-generated essays are often confidently wrong. Human writing is more prone to hedging, using phrases like “I think,” “I feel,” “this might mean …” or “this could be a symbol of …” and so on.

AI-generated essays are often repetitive. An essay that ChatGPT produced on the setting of Rebecca Harding Davis’s short story “Life in the Iron Mills” contained the following assertions among its five brief paragraphs: “The setting serves as a powerful symbol,” “the industrial town itself serves as a central aspect of the setting,” “the roar of furnaces serve as a constant reminder of the relentless pace of industrial production,” “the setting serves as a catalyst for the characters’ struggles and aspirations,” “the setting serves as a microcosm of the larger societal issues of the time,” and “the setting … serves as a powerful symbol of the dehumanizing effects of industrialization.”


AI writing is often hyperbolic or overreaching. The quotes above describe a “powerful symbol,” for example. AI essays frequently describe even the most mundane topics as “groundbreaking,” “vital,” “esteemed,” “invaluable,” “indelible,” “essential,” “poignant” or “profound.”

AI-produced texts frequently use metaphors, sometimes awkwardly. ChatGPT produced several essays that compared writing to “weaving” a “rich” or “intricate tapestry” or “painting” a “vivid picture.”

AI-generated essays tend to overexplain. They often use appositives to define people or terms, as in “Margaret Fuller, a pioneering feminist and transcendentalist thinker, explored themes such as individualism, self-reliance and the search for meaning in her writings …”

AI-generated academic writing often employs certain verbs. They include “delve,” “shed light,” “highlight,” “illuminate,” “underscore,” “showcase,” “embody,” “transcend,” “navigate,” “foster,” “grapple,” “strive,” “intertwine,” “espouse” and “endeavor.”
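
These lexical tics are countable. As a rough illustration (not a validated detector), the Python sketch below tallies how often the verbs listed above appear per 1,000 words; the matching is deliberately naive, so the output should be read as one weak signal among many.

```python
# Rough sketch: frequency of AI-associated verbs per 1,000 words.
# The verb list is the one given above; prefix matching is naive
# ("delve" also matches "delves", "delving"), so this is a heuristic only.
import re

AI_VERBS = ["delve", "shed light", "highlight", "illuminate", "underscore",
            "showcase", "embody", "transcend", "navigate", "foster",
            "grapple", "strive", "intertwine", "espouse", "endeavor"]

def ai_verb_rate(text: str) -> float:
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    joined = " ".join(words)
    hits = sum(joined.count(verb) for verb in AI_VERBS)
    return 1000 * hits / len(words)

sample = ("The author delves into themes that underscore and illuminate "
          "the human condition.")
print(f"{ai_verb_rate(sample):.1f} marker verbs per 1,000 words")
```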

AI-generated essays tend to end with a sweeping broad-scale statement. They talk about “the human condition,” “American society,” “the search for meaning” or “the resilience of the human spirit.” Texts are often described as a “testament to” variations on these concepts.

AI-generated writing often invents sources. ChatGPT can compose a “research paper” using MLA-style in-text parenthetical citations and Works Cited entries that look correct and convincing, but the supposed sources are often nonexistent. In my experiment, ChatGPT referenced a purported article titled “Poe, ‘The Fall of the House of Usher,’ and the Gothic’s Creation of the Unconscious,” which it claimed was published in PMLA, vol. 96, no. 5, 1981, pp. 900–908. The author cited was an actual Poe scholar, but this particular article does not appear on his CV, and while volume 96, number 5 of PMLA did appear in 1981, the pages cited in that issue actually span two articles: one on Frankenstein and one on lyric poetry.

AI-generated essays include hallucinations. Ted Chiang’s article on this phenomenon offers a useful explanation for why large language models such as ChatGPT generate fabricated facts and incorrect assertions. My AI-generated essays included references to nonexistent events, characters and quotes. For example, ChatGPT attributed the dubious quote “Half invoked, half spontaneous, full of ill-concealed enthusiasms, her wild heart lay out there” to a lesser-known short story by Herman Melville, yet nothing resembling that quote appears in the actual text. More hallucinations were evident when AI was generating text about less canonical or more recently published literary texts.

This is not an exhaustive list, and I know that AI-generated text in other formats or relating to other fields probably features different patterns and tendencies. I also used only very basic prompts and did not delineate many specific parameters for the output beyond the topic and the format of an essay.

It is also important to remember that the attributes I’ve described are not exclusive to AI-generated texts. In fact, I noticed that the phrase “It is important to … [note/understand/consider]” was a frequent sentence starter in AI-generated work, but, as evidenced in the previous sentence, humans use these constructions, too. After all, large language models train on human-generated text.

And none of these characteristics alone definitively point to a text having been created by AI. Unless a text begins with the phrase “As an AI language model,” it can be difficult to say whether it was entirely or partially generated by AI. Thus, if the nature of a student submission suggests AI involvement, my first course of action is always to reach out to the student themselves for more information. I try to bear in mind that this is a new technology for both students and instructors, and we are all still working to adapt accordingly.

Students may have received mixed messages on what degree or type of AI use is considered acceptable. Since AI is also now integrated into tools their institutions or instructors have encouraged them to use—such as Grammarly, Microsoft Word or Google Docs—the boundaries of how they should use technology to augment human writing may be especially unclear. Students may turn to AI because they lack confidence in their own writing abilities. Ultimately, however, I hope that by discussing the limits and the predictability of AI-generated prose, we can encourage them to embrace and celebrate their unique writerly voices.

Elizabeth Steere is a lecturer in English at the University of North Georgia.




The College Essay Is Dead

Nobody is prepared for how AI will transform academia.


Suppose you are a professor of pedagogy, and you assign an essay on learning styles. A student hands in an essay with the following opening paragraph:

The construct of “learning styles” is problematic because it fails to account for the processes through which learning styles are shaped. Some students might develop a particular learning style because they have had particular experiences. Others might develop a particular learning style by trying to accommodate to a learning environment that was not well suited to their learning needs. Ultimately, we need to understand the interactions among learning styles and environmental and personal factors, and how these shape how we learn and the kinds of learning we experience.

Pass or fail? A- or B+? And how would your grade change if you knew a human student hadn’t written it at all? Because Mike Sharples, a professor in the U.K., used GPT-3, a large language model from OpenAI that automatically generates text from a prompt, to write it. (The whole essay, which Sharples considered graduate-level, is available, complete with references, here.) Personally, I lean toward a B+. The passage reads like filler, but so do most student essays.

Sharples’s intent was to urge educators to “rethink teaching and assessment” in light of the technology, which he said “could become a gift for student cheats, or a powerful teaching assistant, or a tool for creativity.” Essay generation is neither theoretical nor futuristic at this point. In May, a student in New Zealand confessed to using AI to write their papers, justifying it as a tool like Grammarly or spell-check: “I have the knowledge, I have the lived experience, I’m a good student, I go to all the tutorials and I go to all the lectures and I read everything we have to read but I kind of felt I was being penalised because I don’t write eloquently and I didn’t feel that was right,” they told a student paper in Christchurch. They don’t feel like they’re cheating, because the student guidelines at their university state only that you’re not allowed to get somebody else to do your work for you. GPT-3 isn’t “somebody else”—it’s a program.

The world of generative AI is progressing furiously. Last week, OpenAI released an advanced chatbot named ChatGPT that has spawned a new wave of marveling and hand-wringing, plus an upgrade to GPT-3 that allows for complex rhyming poetry; Google previewed new applications last month that will allow people to describe concepts in text and see them rendered as images; and the creative-AI firm Jasper received a $1.5 billion valuation in October. It still takes a little initiative for a kid to find a text generator, but not for long.

The essay, in particular the undergraduate essay, has been the center of humanistic pedagogy for generations. It is the way we teach children how to research, think, and write. That entire tradition is about to be disrupted from the ground up. Kevin Bryan, an associate professor at the University of Toronto, tweeted in astonishment about OpenAI’s new chatbot last week: “You can no longer give take-home exams/homework … Even on specific questions that involve combining knowledge across domains, the OpenAI chat is frankly better than the average MBA at this point. It is frankly amazing.” Neither the engineers building the linguistic tech nor the educators who will encounter the resulting language are prepared for the fallout.

A chasm has existed between humanists and technologists for a long time. In the 1950s, C. P. Snow gave his famous lecture, later the essay “The Two Cultures,” describing the humanistic and scientific communities as tribes losing contact with each other. “Literary intellectuals at one pole—at the other scientists,” Snow wrote. “Between the two a gulf of mutual incomprehension—sometimes (particularly among the young) hostility and dislike, but most of all lack of understanding. They have a curious distorted image of each other.” Snow’s argument was a plea for a kind of intellectual cosmopolitanism: Literary people were missing the essential insights of the laws of thermodynamics, and scientific people were ignoring the glories of Shakespeare and Dickens.

The rupture that Snow identified has only deepened. In the modern tech world, the value of a humanistic education shows up in evidence of its absence. Sam Bankman-Fried, the disgraced founder of the crypto exchange FTX who recently lost his $16 billion fortune in a few days, is a famously proud illiterate. “I would never read a book,” he once told an interviewer. “I don’t want to say no book is ever worth reading, but I actually do believe something pretty close to that.” Elon Musk and Twitter are another excellent case in point. It’s painful and extraordinary to watch the ham-fisted way a brilliant engineering mind like Musk deals with even relatively simple literary concepts such as parody and satire. He obviously has never thought about them before. He probably didn’t imagine there was much to think about.

The extraordinary ignorance on questions of society and history displayed by the men and women reshaping society and history has been the defining feature of the social-media era. Apparently, Mark Zuckerberg has read a great deal about Caesar Augustus, but I wish he’d read about the regulation of the pamphlet press in 17th-century Europe. It might have spared America the annihilation of social trust.

These failures don’t derive from mean-spiritedness or even greed, but from a willful obliviousness. The engineers do not recognize that humanistic questions—like, say, hermeneutics or the historical contingency of freedom of speech or the genealogy of morality—are real questions with real consequences. Everybody is entitled to their opinion about politics and culture, it’s true, but an opinion is different from a grounded understanding. The most direct path to catastrophe is to treat complex problems as if they’re obvious to everyone. You can lose billions of dollars pretty quickly that way.

As the technologists have ignored humanistic questions to their peril, the humanists have greeted the technological revolutions of the past 50 years by committing soft suicide. As of 2017, the number of English majors had nearly halved since the 1990s. History enrollments have declined by 45 percent since 2007 alone. Needless to say, humanists’ understanding of technology is partial at best. The state of digital humanities is always several categories of obsolescence behind, which is inevitable. (Nobody expects them to teach via Instagram Stories.) But more crucially, the humanities have not fundamentally changed their approach in decades, despite technology altering the entire world around them. They are still exploding meta-narratives like it’s 1979, an exercise in self-defeat.

Read: The humanities are in crisis

Contemporary academia engages, more or less permanently, in self-critique on any and every front it can imagine. In a tech-centered world, language matters, voice and style matter, the study of eloquence matters, history matters, ethical systems matter. But the situation requires humanists to explain why they matter, not constantly undermine their own intellectual foundations. The humanities promise students a journey to an irrelevant, self-consuming future; then they wonder why their enrollments are collapsing. Is it any surprise that nearly half of humanities graduates regret their choice of major?

The case for the value of humanities in a technologically determined world has been made before. Steve Jobs always credited a significant part of Apple’s success to his time as a dropout hanger-on at Reed College, where he fooled around with Shakespeare and modern dance, along with the famous calligraphy class that provided the aesthetic basis for the Mac’s design. “A lot of people in our industry haven’t had very diverse experiences. So they don’t have enough dots to connect, and they end up with very linear solutions without a broad perspective on the problem,” Jobs said. “The broader one’s understanding of the human experience, the better design we will have.” Apple is a humanistic tech company. It’s also the largest company in the world.

Despite the clear value of a humanistic education, its decline continues. Over the past 10 years, STEM has triumphed, and the humanities have collapsed. The number of students enrolled in computer science is now nearly the same as the number of students enrolled in all of the humanities combined.

And now there’s GPT-3. Natural-language processing presents the academic humanities with a whole series of unprecedented problems. Practical matters are at stake: Humanities departments judge their undergraduate students on the basis of their essays. They give Ph.D.s on the basis of a dissertation’s composition. What happens when both processes can be significantly automated? Going by my experience as a former Shakespeare professor, I figure it will take 10 years for academia to face this new reality: two years for the students to figure out the tech, three more years for the professors to recognize that students are using the tech, and then five years for university administrators to decide what, if anything, to do about it. Teachers are already some of the most overworked, underpaid people in the world. They are already dealing with a humanities in crisis. And now this. I feel for them.

And yet, despite the drastic divide of the moment, natural-language processing is going to force engineers and humanists together. They are going to need each other despite everything. Computer scientists will require basic, systematic education in general humanism: The philosophy of language, sociology, history, and ethics are not amusing questions of theoretical speculation anymore. They will be essential in determining the ethical and creative use of chatbots, to take only an obvious example.

The humanists will need to understand natural-language processing because it’s the future of language, but also because there is more than just the possibility of disruption here. Natural-language processing can throw light on a huge number of scholarly problems. It is going to clarify matters of attribution and literary dating that no system ever devised will approach; the parameters in large language models are much more sophisticated than the current systems used to determine which plays Shakespeare wrote, for example. It may even allow for certain types of restorations, filling the gaps in damaged texts by means of text-prediction models. It will reformulate questions of literary style and philology; if you can teach a machine to write like Samuel Taylor Coleridge, that machine must be able to inform you, in some way, about how Samuel Taylor Coleridge wrote.

The connection between humanism and technology will require people and institutions with a breadth of vision and a commitment to interests that transcend their field. Before that space for collaboration can exist, both sides will have to take the most difficult leaps for highly educated people: Understand that they need the other side, and admit their basic ignorance. But that’s always been the beginning of wisdom, no matter what technological era we happen to inhabit.


  • Open access
  • Published: 30 October 2023

A large-scale comparison of human-written versus ChatGPT-generated essays

Steffen Herbold, Annette Hautli-Janisz, Ute Heuer, Zlata Kikteva & Alexander Trautsch

Scientific Reports volume 13, Article number: 18617 (2023)


Subjects: Computer science, Information technology

ChatGPT and similar generative AI models have attracted hundreds of millions of users and have become part of the public discourse. Many believe that such models will disrupt society and lead to significant changes in the education system and information generation. So far, this belief is based on either colloquial evidence or benchmarks from the owners of the models—both lack scientific rigor. We systematically assess the quality of AI-generated content through a large-scale study comparing human-written versus ChatGPT-generated argumentative student essays. We use essays that were rated by a large number of human experts (teachers). We augment the analysis by considering a set of linguistic characteristics of the generated essays. Our results demonstrate that ChatGPT generates essays that are rated higher regarding quality than human-written essays. The writing style of the AI models exhibits linguistic characteristics that are different from those of the human-written essays. Since the technology is readily available, we believe that educators must act immediately. We must re-invent homework and develop teaching concepts that utilize these AI models in the same way as math utilizes the calculator: teach the general concepts first and then use AI tools to free up time for other learning objectives.


Introduction

The massive uptake in the development and deployment of large-scale Natural Language Generation (NLG) systems in recent months has yielded an almost unprecedented worldwide discussion of the future of society. The ChatGPT service, which serves as a web front-end to GPT-3.5 1 and GPT-4, was the fastest-growing service in history to break the 100-million-user milestone in January 2023 and had 1 billion visits by February 2023 2 .

Driven by the upheaval that is particularly anticipated for education 3 and knowledge transfer for future generations, we conduct the first independent, systematic study of AI-generated language content that is typically dealt with in high-school education: argumentative essays, i.e. essays in which students discuss a position on a controversial topic by collecting and reflecting on evidence (e.g. ‘Should students be taught to cooperate or compete?’). Learning to write such essays is a crucial aspect of education, as students learn to systematically assess and reflect on a problem from different perspectives. Understanding the capability of generative AI to perform this task increases our understanding of the skills of the models, as well as of the challenges educators face when it comes to teaching this crucial skill. While there is a multitude of individual examples and anecdotal evidence for the quality of AI-generated content in this genre (e.g. 4 ) this paper is the first to systematically assess the quality of human-written and AI-generated argumentative texts across different versions of ChatGPT 5 . We use a fine-grained essay quality scoring rubric based on content and language mastery and employ a significant pool of domain experts, i.e. high school teachers across disciplines, to perform the evaluation. Using computational linguistic methods and rigorous statistical analysis, we arrive at several key findings:

AI models generate significantly higher-quality argumentative essays than the users of an essay-writing online forum frequented by German high-school students across all criteria in our scoring rubric.

ChatGPT-4 (ChatGPT web interface with the GPT-4 model) significantly outperforms ChatGPT-3 (ChatGPT web interface with the GPT-3.5 default model) with respect to logical structure, language complexity, vocabulary richness and text linking.

Writing styles between humans and generative AI models differ significantly: for instance, the GPT models use more nominalizations and have higher sentence complexity (signaling more complex, ‘scientific’, language), whereas the students make more use of modal and epistemic constructions (which tend to convey speaker attitude).

The linguistic diversity of the NLG models seems to be improving over time: while ChatGPT-3 still has a significantly lower linguistic diversity than humans, ChatGPT-4 has a significantly higher diversity than the students.

Our work goes significantly beyond existing benchmarks. While OpenAI’s technical report on GPT-4 6 presents some benchmarks, their evaluation lacks scientific rigor: it fails to provide vital information like the agreement between raters, does not report on details regarding the criteria for assessment or to what extent and how a statistical analysis was conducted for a larger sample of essays. In contrast, our benchmark provides the first (statistically) rigorous and systematic study of essay quality, paired with a computational linguistic analysis of the language employed by humans and two different versions of ChatGPT, offering a glance at how these NLG models develop over time. While our work is focused on argumentative essays in education, the genre is also relevant beyond education. In general, studying argumentative essays is one important aspect to understand how good generative AI models are at conveying arguments and, consequently, persuasive writing in general.

Related work

Natural language generation

The recent interest in generative AI models can be largely attributed to the public release of ChatGPT, a public interface in the form of an interactive chat based on the InstructGPT 1 model, more commonly referred to as GPT-3.5. In comparison to the original GPT-3 7 and other similar generative large language models based on the transformer architecture like GPT-J 8 , this model was not trained in a purely self-supervised manner (e.g. through masked language modeling). Instead, a pipeline that involved human-written content was used to fine-tune the model and improve the quality of the outputs to both mitigate biases and safety issues, as well as make the generated text more similar to text written by humans. Such models are referred to as Fine-tuned LAnguage Nets (FLANs). For details on their training, we refer to the literature 9 . Notably, this process was recently reproduced with publicly available models such as Alpaca 10 and Dolly (i.e. the complete models can be downloaded and not just accessed through an API). However, we can only assume that a similar process was used for the training of GPT-4 since the paper by OpenAI does not include any details on model training.

Testing of the language competency of large-scale NLG systems has only recently started. Cai et al. 11 show that ChatGPT reuses sentence structure, accesses the intended meaning of an ambiguous word, and identifies the thematic structure of a verb and its arguments, replicating human language use. Mahowald 12 compares ChatGPT’s acceptability judgments to human judgments on the Article + Adjective + Numeral + Noun construction in English. Dentella et al. 13 show that ChatGPT-3 fails to understand low-frequent grammatical constructions like complex nested hierarchies and self-embeddings. In another recent line of research, the structure of automatically generated language is evaluated. Guo et al. 14 show that in question-answer scenarios, ChatGPT-3 uses different linguistic devices than humans. Zhao et al. 15 show that ChatGPT generates longer and more diverse responses when the user is in an apparently negative emotional state.

Given that we aim to identify certain linguistic characteristics of human-written versus AI-generated content, we also draw on related work in the field of linguistic fingerprinting, which assumes that each human has a unique way of using language to express themselves, i.e. the linguistic means that are employed to communicate thoughts, opinions and ideas differ between humans. That these properties can be identified with computational linguistic means has been showcased across different tasks: the computation of a linguistic fingerprint allows to distinguish authors of literary works 16 , the identification of speaker profiles in large public debates 17 , 18 , 19 , 20 and the provision of data for forensic voice comparison in broadcast debates 21 , 22 . For educational purposes, linguistic features are used to measure essay readability 23 , essay cohesion 24 and language performance scores for essay grading 25 . Integrating linguistic fingerprints also yields performance advantages for classification tasks, for instance in predicting user opinion 26 , 27 and identifying individual users 28 .

Limitations of OpenAI’s ChatGPT evaluations

OpenAI published a discussion of the model’s performance on several tasks, including Advanced Placement (AP) classes within the US educational system 6 . The subjects used in performance evaluation are diverse and include arts, history, English literature, calculus, statistics, physics, chemistry, economics, and US politics. While the models achieved good or very good marks in most subjects, they did not perform well in English literature. GPT-3.5 also experienced problems with chemistry, macroeconomics, physics, and statistics. While the overall results are impressive, there are several significant issues: firstly, the conflict of interest of the model’s owners poses a problem for the interpretation of the performance. Secondly, there are issues with the soundness of the assessment beyond the conflict of interest, which make it hard to assess how well the results generalize to the models’ capability to write essays. Notably, the AP exams combine multiple-choice questions with free-text answers. Only the aggregated scores are publicly available. To the best of our knowledge, neither the generated free-text answers, their overall assessment, nor their assessment given specific criteria from the judgment rubric used are published. Thirdly, while the paper states that 1–2 qualified third-party contractors participated in the rating of the free-text answers, it is unclear how often multiple ratings were generated for the same answer and what the agreement between them was. This lack of information hinders a scientifically sound judgement regarding the capabilities of these models in general, but also specifically for essays. Lastly, the owners of the model conducted their study in a few-shot prompt setting, where they gave the models a very structured template as well as an example of a human-written high-quality essay to guide the generation of the answers. This further fine-tuning of what the models generate could also have influenced the output. The results published by the owners go beyond the AP courses which are directly comparable to our work and also consider other student assessments like Graduate Record Examinations (GREs). However, these evaluations suffer from the same problems with scientific rigor as the AP classes.

Scientific assessment of ChatGPT

Researchers across the globe are currently assessing the individual capabilities of these models with greater scientific rigor. We note that due to the recency and speed of these developments, the hereafter discussed literature has mostly only been published as pre-prints and has not yet been peer-reviewed. In addition to the above issues concretely related to the assessment of the capabilities to generate student essays, it is also worth noting that there are likely large problems with the trustworthiness of evaluations, because of data contamination, i.e. because the benchmark tasks are part of the training of the model, which enables memorization. For example, Aiyappa et al. 29 find evidence that this is likely the case for benchmark results regarding NLP tasks. This complicates the effort by researchers to assess the capabilities of the models beyond memorization.

Nevertheless, the first assessment results are already available – though mostly focused on ChatGPT-3 and not yet ChatGPT-4. Closest to our work is a study by Yeadon et al. 30 , who also investigate ChatGPT-3 performance when writing essays. They grade essays generated by ChatGPT-3 for five physics questions based on criteria that cover academic content, appreciation of the underlying physics, grasp of subject material, addressing the topic, and writing style. For each question, ten essays were generated and rated independently by five researchers. While the sample size precludes a statistical assessment, the results demonstrate that the AI model is capable of writing high-quality physics essays, but that the quality varies in a manner similar to human-written essays.

Guo et al. 14 create a set of free-text question answering tasks based on data they collected from the internet, e.g. question answering from Reddit. The authors then sample thirty triplets of a question, a human answer, and a ChatGPT-3 generated answer and ask human raters to assess if they can detect which was written by a human, and which was written by an AI. While this approach does not directly assess the quality of the output, it serves as a Turing test 31 designed to evaluate whether humans can distinguish between human- and AI-produced output. The results indicate that humans are in fact able to distinguish between the outputs when presented with a pair of answers. Humans familiar with ChatGPT are also able to identify over 80% of AI-generated answers without seeing a human answer in comparison. However, humans who are not yet familiar with ChatGPT-3 are not capable of identifying AI-written answers about 50% of the time. Moreover, the authors also find that the AI-generated outputs are deemed to be more helpful than the human answers in slightly more than half of the cases. This suggests that the strong results from OpenAI’s own benchmarks regarding the capabilities to generate free-text answers generalize beyond the benchmarks.

There are, however, some indicators that the benchmarks may be overly optimistic in their assessment of the model’s capabilities. For example, Kortemeyer 32 conducts a case study to assess how well ChatGPT-3 would perform in a physics class, simulating the tasks that students need to complete as part of the course: answer multiple-choice questions, do homework assignments, ask questions during a lesson, complete programming exercises, and write exams with free-text questions. Notably, ChatGPT-3 was allowed to interact with the instructor for many of the tasks, allowing for multiple attempts as well as feedback on preliminary solutions. The experiment shows that ChatGPT-3’s performance is in many aspects similar to that of beginning learners and that the model makes similar mistakes, such as omitting units or simply plugging in results from equations. Overall, the AI would have passed the course with a low score of 1.5 out of 4.0. Similarly, Kung et al. 33 study the performance of ChatGPT-3 in the United States Medical Licensing Exam (USMLE) and find that the model performs at or near the passing threshold. Their assessment is a bit more optimistic than Kortemeyer’s, as they state that this level of performance, comprehensible reasoning, and valid clinical insights suggest that models such as ChatGPT may potentially assist human learning in clinical decision making.

Frieder et al. 34 evaluate the capabilities of ChatGPT-3 in solving graduate-level mathematical tasks. They find that while ChatGPT-3 seems to have some mathematical understanding, its level is well below that of an average student and in most cases is not sufficient to pass exams. Yuan et al. 35 consider the arithmetic abilities of language models, including ChatGPT-3 and ChatGPT-4. They find that these exhibit the best performance among the currently available language models (incl. Llama 36 , FLAN-T5 37 , and Bloom 38 ). However, the accuracy on basic arithmetic tasks is still only at 83% when considering correctness to the degree of \(10^{-3}\) , i.e. such models are still not capable of functioning reliably as calculators. In a slightly satiric yet insightful take, Spencer et al. 39 assess what a scientific paper on gamma-ray astrophysics would look like if it were written largely with the assistance of ChatGPT-3. They find that while the language capabilities are good and the model is capable of generating equations, the arguments are often flawed and the references to scientific literature are full of hallucinations.

The general reasoning skills of the models may also not be at the level expected from the benchmarks. For example, Cherian et al. 40 evaluate how well ChatGPT-3 performs on eleven puzzles that second graders should be able to solve and find that ChatGPT is only able to solve them on average in 36.4% of attempts, whereas the second graders achieve a mean of 60.4%. However, their sample size is very small and the problem was posed as a multiple-choice question answering problem, which cannot be directly compared to the NLG we consider.

Research gap

Within this article, we address an important part of the current research gap regarding the capabilities of ChatGPT (and similar technologies), guided by the following research questions:

RQ1: How good is ChatGPT based on GPT-3 and GPT-4 at writing argumentative student essays?

RQ2: How do AI-generated essays compare to essays written by students?

RQ3: What are linguistic devices that are characteristic of student versus AI-generated content?

We study these aspects with the help of a large group of teaching professionals who systematically assess a large corpus of student essays. To the best of our knowledge, this is the first large-scale, independent scientific assessment of ChatGPT (or similar models) of this kind. Answering these questions is crucial to understanding the impact of ChatGPT on the future of education.

Materials and methods

The essay topics originate from a corpus of argumentative essays in the field of argument mining 41 . Argumentative essays require students to think critically about a topic and use evidence to establish a position on the topic in a concise manner. The corpus features essays for 90 topics from Essay Forum 42 , an active community for providing writing feedback on different kinds of text that is frequented by high-school students seeking feedback from native speakers on their essay-writing capabilities. Information about the age of the writers is not available, but the topics indicate that the essays were written in grades 11–13, meaning that the authors were likely at least 16. Topics range from ‘Should students be taught to cooperate or to compete?’ to ‘Will newspapers become a thing of the past?’. In the corpus, each topic features one human-written essay uploaded and discussed in the forum. The students who wrote the essays are not native speakers. The average length of these essays is 19 sentences with 388 tokens (2,089 characters on average); they are termed ‘student essays’ in the remainder of the paper.

For the present study, we use the topics from Stab and Gurevych 41 and prompt ChatGPT with ‘Write an essay with about 200 words on “[ topic ]”’ to receive automatically-generated essays from the ChatGPT-3 and ChatGPT-4 versions from 22 March 2023 (‘ChatGPT-3 essays’, ‘ChatGPT-4 essays’). No additional prompts for getting the responses were used, i.e. the data was created with a basic prompt in a zero-shot scenario. This is in contrast to the benchmarks by OpenAI, who used an engineered prompt in a few-shot scenario to guide the generation of essays. We note that we decided to ask for 200 words because we noticed a tendency to generate essays that are longer than the desired length by ChatGPT. A prompt asking for 300 words typically yielded essays with more than 400 words. Thus, using the shorter length of 200, we prevent a potential advantage for ChatGPT through longer essays, and instead err on the side of brevity. Similar to the evaluations of free-text answers by OpenAI, we did not consider multiple configurations of the model due to the effort required to obtain human judgments. For the same reason, our data is restricted to ChatGPT and does not include other models available at that time, e.g. Alpaca. We use the browser versions of the tools because we consider this to be a more realistic scenario than using the API. Table 1 below shows the core statistics of the resulting dataset. Supplemental material S1 shows examples for essays from the data set.
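
The study used the browser versions of ChatGPT, so the following is only a hypothetical sketch of how the same zero-shot prompt could be issued programmatically with OpenAI’s Python client; the model identifier and client usage here are assumptions for illustration, not part of the paper’s method.

```python
# Hypothetical sketch: issuing the paper's zero-shot prompt via the OpenAI
# Python client (the study itself used the browser interface, not the API).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

topic = "Should students be taught to cooperate or to compete?"
prompt = f'Write an essay with about 200 words on "{topic}"'

response = client.chat.completions.create(
    model="gpt-4",  # assumed identifier; a GPT-3.5 model for the other variant
    messages=[{"role": "user", "content": prompt}],
)
essay = response.choices[0].message.content
print(essay)
```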

Annotation study

Study participants

The participants had registered for a two-hour online training entitled ‘ChatGPT – Challenges and Opportunities’ conducted by the authors of this paper as a means to provide teachers with some of the technological background of NLG systems in general and ChatGPT in particular. Only teachers permanently employed at secondary schools were allowed to register for this training. Focusing on these experts alone allows us to receive meaningful results as those participants have a wide range of experience in assessing students’ writing. A total of 139 teachers registered for the training, 129 of them teach at grammar schools, and only 10 teachers hold a position at other secondary schools. About half of the registered teachers (68 teachers) have been in service for many years and have successfully applied for promotion. For data protection reasons, we do not know the subject combinations of the registered teachers. We only know that a variety of subjects are represented, including languages (English, French and German), religion/ethics, and science. Supplemental material S5 provides some general information regarding German teacher qualifications.

The training began with an online lecture followed by a discussion phase. Teachers were given an overview of language models and basic information on how ChatGPT was developed. After about 45 minutes, the teachers received both a written and an oral explanation of the questionnaire at the core of our study (see Supplementary material S3 ) and were informed that they had 30 minutes to finish the study tasks. The explanation included information on how the data was obtained, why we collect the self-assessment, how we chose the criteria for the rating of the essays, the overall goal of our research, and a walk-through of the questionnaire. Participation in the questionnaire was voluntary and did not affect the awarding of a training certificate. We further informed participants that all data was collected anonymously and that we would have no way of identifying who participated in the questionnaire. We orally informed participants that they consent to the use of the provided ratings for our research by participating in the survey.

Once these instructions were provided orally and in writing, the link to the online form was given to the participants. The online form was running on a local server that did not log any information that could identify the participants (e.g. IP address) to ensure anonymity. As per instructions, consent for participation was given by using the online form. Due to the full anonymity, we could by definition not document who exactly provided the consent. This was implemented as further insurance that non-participation could not possibly affect being awarded the training certificate.

About 20% of the training participants did not take part in the questionnaire study; the remaining participants consented based on the information provided and participated in the rating of essays. After the questionnaire, we continued with an online lecture on the opportunities of using ChatGPT for teaching as well as AI beyond chatbots. The study protocol was reviewed and approved by the Research Ethics Committee of the University of Passau. We further confirm that our study protocol is in accordance with all relevant guidelines.

Questionnaire

The questionnaire consists of three parts: first, a brief self-assessment regarding the English skills of the participants which is based on the Common European Framework of Reference for Languages (CEFR) 43 . We have six levels ranging from ‘comparable to a native speaker’ to ‘some basic skills’ (see supplementary material S3 ). Then each participant was shown six essays. The participants were only shown the generated text and were not provided with information on whether the text was human-written or AI-generated.

The questionnaire covers the seven categories relevant for essay assessment shown below (for details see supplementary material S3 ):

Topic and completeness

Logic and composition

Expressiveness and comprehensiveness

Language mastery

Complexity

Vocabulary and text linking

Language constructs

These categories are used as guidelines for essay assessment 44 established by the Ministry for Education of Lower Saxony, Germany. For each criterion, a seven-point Likert scale with scores from zero to six is defined, where zero is the worst score (e.g. no relation to the topic) and six is the best score (e.g. addressed the topic to a special degree). The questionnaire included a written description as guidance for the scoring.

After rating each essay, the participants were also asked to self-assess their confidence in the ratings. We used a five-point Likert scale based on the criteria for the self-assessment of peer-review scores from the Association for Computational Linguistics (ACL). Once a participant finished rating the six essays, they were shown a summary of their ratings, as well as the individual ratings for each of their essays and the information on how the essay was generated.

Computational linguistic analysis

In order to further explore and compare the quality of the essays written by students and ChatGPT, we consider the following six linguistic characteristics: lexical diversity, sentence complexity, nominalization, and the presence of modals, epistemic markers and discourse markers. These are motivated by previous work: Weiss et al. 25 observe a correlation between measures of lexical, syntactic and discourse complexity and the essay grades of German high-school examinations, while McNamara et al. 45 explore cohesion (indicated, among other things, by connectives), syntactic complexity and lexical diversity in relation to essay scoring.

Lexical diversity

We identify vocabulary richness by using a well-established measure of textual lexical diversity (MTLD) 46 , which is often used in the field of automated essay grading 25 , 45 , 47 . It takes into account the number of unique words, but unlike the best-known measure of lexical diversity, the type-token ratio (TTR), it is not as sensitive to differences in text length. In fact, Koizumi and In’nami 48 find it to be the measure least affected by differences in text length compared to several other measures of lexical diversity. This is relevant to us due to the difference in average length between the human-written and ChatGPT-generated essays.
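
For readers unfamiliar with the measure, a minimal self-contained sketch of MTLD follows; it uses the standard factor threshold of 0.72 and naive whitespace tokenization, and is an illustration of the measure rather than the implementation used in the study.

```python
# Minimal MTLD sketch (McCarthy & Jarvis, 2010). A "factor" is a stretch of
# text over which the running type-token ratio stays above the threshold;
# MTLD is the mean factor length, averaged over forward and backward passes.

def _mtld_pass(tokens, threshold=0.72):
    factors, types, count = 0.0, set(), 0
    for tok in tokens:
        count += 1
        types.add(tok)
        if len(types) / count <= threshold:   # factor complete: reset
            factors += 1
            types, count = set(), 0
    if count:                                 # partial factor for the tail
        ttr = len(types) / count
        factors += (1 - ttr) / (1 - threshold)
    return len(tokens) / factors if factors else float("inf")

def mtld(tokens):
    return (_mtld_pass(tokens) + _mtld_pass(tokens[::-1])) / 2

essay_tokens = "the setting serves as a powerful symbol of the setting".split()
print(round(mtld(essay_tokens), 1))
```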

Syntactic complexity

We use two measures in order to evaluate the syntactic complexity of the essays. One is based on the maximum depth of the sentence dependency tree which is produced using the spaCy 3.4.2 dependency parser 49 (‘Syntactic complexity (depth)’). For the second measure, we adopt an approach similar in nature to the one by Weiss et al. 25 who use clause structure to evaluate syntactic complexity. In our case, we count the number of conjuncts, clausal modifiers of nouns, adverbial clause modifiers, clausal complements, clausal subjects, and parataxes (‘Syntactic complexity (clauses)’). The supplementary material in S2 shows the difference between sentence complexity based on two examples from the data.

Nominalization is a common feature of a more scientific style of writing 50 and is used as an additional measure for syntactic complexity. In order to explore this feature, we count occurrences of nouns with suffixes such as ‘-ion’, ‘-ment’, ‘-ance’ and a few others which are known to transform verbs into nouns.
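
A sketch of how these syntactic measures can be computed with spaCy is shown below; the dependency labels and noun suffixes follow the description above, but the exact lists and the example sentence are illustrative approximations, not the authors’ code.

```python
# Sketch of the syntactic measures using spaCy 3.x (requires the model
# installed via: python -m spacy download en_core_web_sm).
import spacy

nlp = spacy.load("en_core_web_sm")

CLAUSE_DEPS = {"conj", "acl", "advcl", "ccomp", "csubj", "parataxis"}
NOMINAL_SUFFIXES = ("ion", "ment", "ance", "ence", "ity")  # illustrative subset

def depth(token):
    """Maximum depth of the dependency subtree rooted at `token`."""
    kids = list(token.children)
    return 1 if not kids else 1 + max(depth(k) for k in kids)

doc = nlp("The industrialization of the town underscores the transformation "
          "that the characters experience, although some readers disagree.")

max_depth = max(depth(sent.root) for sent in doc.sents)   # depth measure
clauses = sum(tok.dep_ in CLAUSE_DEPS for tok in doc)     # clause measure
nominalizations = sum(tok.pos_ == "NOUN" and
                      tok.text.lower().endswith(NOMINAL_SUFFIXES)
                      for tok in doc)
print(max_depth, clauses, nominalizations)
```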

Semantic properties

Both modals and epistemic markers signal the commitment of the writer to their statement. We identify modals using the POS-tagging module provided by spaCy as well as a list of epistemic expressions of modality, such as ‘definitely’ and ‘potentially’, also used in other approaches to identifying semantic properties 51 . For epistemic markers we adopt an empirically-driven approach and utilize the epistemic markers identified in a corpus of dialogical argumentation by Hautli-Janisz et al. 52 . We consider expressions such as ‘I think’, ‘it is believed’ and ‘in my opinion’ to be epistemic.
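
A corresponding sketch for the modal and epistemic counts follows; the epistemic list here is a small illustrative subset, not the full list drawn from prior work.

```python
# Sketch: modals carry the Penn Treebank tag "MD" in spaCy's English tagger;
# epistemic markers are matched as phrases. Both lists are illustrative.
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("I think the ending could definitely be read as a critique.")

modal_count = sum(tok.tag_ == "MD" for tok in doc)   # counts "could"
EPISTEMIC = ("i think", "i believe", "in my opinion", "it is believed",
             "definitely", "potentially")
epistemic_count = sum(doc.text.lower().count(m) for m in EPISTEMIC)
print(modal_count, epistemic_count)
```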

Discourse properties

Discourse markers can be used to measure the coherence quality of a text. This has been explored by Somasundaran et al. 53 who use discourse markers to evaluate the story-telling aspect of student writing while Nadeem et al. 54 incorporated them in their deep learning-based approach to automated essay scoring. In the present paper, we employ the PDTB list of discourse markers 55 which we adjust to exclude words that are often used for purposes other than indicating discourse relations, such as ‘like’, ‘for’, ‘in’ etc.

Statistical methods

We use a within-subjects design for our study. Each participant was shown six randomly selected essays. Results were submitted to the survey system after each essay was completed, in case participants ran out of time and did not finish scoring all six essays. Cronbach’s \(\alpha\) 56 allows us to determine the inter-rater reliability for the rating criterion and data source (human, ChatGPT-3, ChatGPT-4) in order to understand the reliability of our data not only overall, but also for each data source and rating criterion. We use two-sided Wilcoxon-rank-sum tests 57 to confirm the significance of the differences between the data sources for each criterion. We use the same tests to determine the significance of the linguistic characteristics. This results in three comparisons (human vs. ChatGPT-3, human vs. ChatGPT-4, ChatGPT-3 vs. ChatGPT-4) for each of the seven rating criteria and each of the seven linguistic characteristics, i.e. 42 tests. We use the Holm-Bonferroni method 58 for the correction for multiple tests to achieve a family-wise error rate of 0.05. We report the effect size using Cohen’s d 59 . While our data is not perfectly normal, it also does not have severe outliers, so we prefer the clear interpretation of Cohen’s d over the slightly more appropriate, but less accessible non-parametric effect size measures. We report point plots with estimates of the mean scores for each data source and criterion, incl. the 95% confidence interval of these mean values. The confidence intervals are estimated in a non-parametric manner based on bootstrap sampling. We further visualize the distribution for each criterion using violin plots to provide a visual indicator of the spread of the data (see Supplementary material S4 ).
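
The test pipeline can be summarized in a few lines of Python; the sketch below uses made-up toy scores rather than the study’s data, and the library calls (scipy’s rank-sum test, statsmodels’ Holm correction) stand in for whatever the authors’ exact code looks like.

```python
# Self-contained sketch of the described pipeline: two-sided Wilcoxon
# rank-sum tests for each pairwise comparison, Holm-Bonferroni correction,
# and Cohen's d as the effect size. Scores are toy data, not study data.
import numpy as np
from scipy.stats import ranksums
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(0)
scores = {"human": rng.normal(3.9, 1.1, 90),
          "chatgpt3": rng.normal(5.0, 1.0, 90),
          "chatgpt4": rng.normal(5.3, 1.0, 90)}

def cohens_d(x, y):
    """Cohen's d with a pooled standard deviation."""
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * x.var(ddof=1) + (ny - 1) * y.var(ddof=1)) / (nx + ny - 2)
    return (x.mean() - y.mean()) / np.sqrt(pooled)

pairs = [("human", "chatgpt3"), ("human", "chatgpt4"), ("chatgpt3", "chatgpt4")]
pvals = [ranksums(scores[a], scores[b]).pvalue for a, b in pairs]
effects = [cohens_d(scores[a], scores[b]) for a, b in pairs]

# Holm-Bonferroni keeps the family-wise error rate at 0.05
reject, p_adj, _, _ = multipletests(pvals, alpha=0.05, method="holm")
for (a, b), p, d in zip(pairs, p_adj, effects):
    print(f"{a} vs {b}: adjusted p={p:.2e}, d={d:.2f}")
```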

Further, we use the self-assessment of the English skills and confidence in the essay ratings as confounding variables. Through this, we determine if ratings are affected by the language skills or confidence, instead of the actual quality of the essays. We control for the impact of these by measuring Pearson’s correlation coefficient r 60 between the self-assessments and the ratings. We also determine whether the linguistic features are correlated with the ratings as expected. The sentence complexity (both tree depth and dependency clauses), as well as the nominalization, are indicators of the complexity of the language. Similarly, the use of discourse markers should signal a proper logical structure. Finally, a large lexical diversity should be correlated with the ratings for the vocabulary. Same as above, we measure Pearson’s r . We use a two-sided test for the significance based on a \(\beta\) -distribution that models the expected correlations as implemented by scipy 61 . Same as above, we use the Holm-Bonferroni method to account for multiple tests. However, we note that it is likely that all—even tiny—correlations are significant given our amount of data. Consequently, our interpretation of these results focuses on the strength of the correlations.
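
For the confounder analysis, a corresponding sketch with scipy, using toy arrays in place of the real self-assessments and ratings:

```python
# Sketch: Pearson's r between self-assessed confidence and given ratings,
# with scipy's two-sided p-value. Arrays are illustrative toy values.
from scipy.stats import pearsonr

confidence = [3, 4, 2, 5, 4, 3, 5, 2, 4, 3]
ratings    = [4, 5, 3, 5, 4, 4, 6, 3, 5, 4]
r, p = pearsonr(confidence, ratings)   # two-sided test by default
print(f"r={r:.2f}, p={p:.3f}")
```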

Our statistical analysis of the data is implemented in Python. We use pandas 1.5.3 and numpy 1.24.2 for the processing of data, pingouin 0.5.3 for the calculation of Cronbach’s \(\alpha\) , scipy 1.10.1 for the Wilcoxon rank-sum tests and Pearson’s r , and seaborn 0.12.2 for the generation of plots, incl. the calculation of error bars that visualize the confidence intervals.

Out of the 111 teachers who completed the questionnaire, 108 rated all six essays, one rated five essays, one rated two essays, and one rated only one essay. This results in 656 ratings for 270 essays (90 topics for each essay type: human-, ChatGPT-3-, and ChatGPT-4-generated), with three ratings for 121 essays, two ratings for 144 essays, and one rating for five essays. The inter-rater agreement is consistently excellent ( \(\alpha >0.9\) ), with the exception of language mastery, where we have good agreement ( \(\alpha =0.89\) , see Table  2 ). Further, the correlation analysis depicted in supplementary material S4 shows weak positive correlations ( \(r \in [0.11, 0.28]\) ) between the self-assessment of the English skills, respectively the self-assessment of the confidence in ratings, and the actual ratings. Overall, this indicates that our ratings are reliable estimates of the actual quality of the essays, with a potential small tendency that higher confidence in ratings and better language skills yield better ratings, independent of the data source.

Table  2 and supplementary material S4 characterize the distribution of the ratings for the essays, grouped by the data source. We observe that for all criteria, we have a clear order of the mean values, with students having the worst ratings, ChatGPT-3 in the middle rank, and ChatGPT-4 with the best performance. We further observe that the standard deviations are fairly consistent and slightly larger than one, i.e. the spread is similar for all ratings and essays. This is further supported by the visual analysis of the violin plots.

The statistical analysis of the ratings reported in Table  4 shows that differences between the human-written essays and the ones generated by both ChatGPT models are significant. The effect sizes for human versus ChatGPT-3 essays are between 0.52 and 1.15, i.e. a medium ( \(d \in [0.5,0.8)\) ) to large ( \(d \in [0.8, 1.2)\) ) effect. On the one hand, the smallest effects are observed for the expressiveness and complexity, i.e. when it comes to the overall comprehensiveness and complexity of the sentence structures, the differences between the humans and the ChatGPT-3 model are smallest. On the other hand, the difference in language mastery is larger than all other differences, which indicates that humans are more prone to making mistakes when writing than the NLG models. The magnitude of differences between humans and ChatGPT-4 is larger with effect sizes between 0.88 and 1.43, i.e., a large to very large ( \(d \in [1.2, 2)\) ) effect. Same as for ChatGPT-3, the differences are smallest for expressiveness and complexity and largest for language mastery. Please note that the difference in language mastery between humans and both GPT models does not mean that the humans have low scores for language mastery (M=3.90), but rather that the NLG models have exceptionally high scores (M=5.03 for ChatGPT-3, M=5.25 for ChatGPT-4).

When we consider the differences between the two GPT models, we observe that while ChatGPT-4 has consistently higher mean values for all criteria, only the differences for logic and composition, vocabulary and text linking, and complexity are significant. The effect sizes are between 0.45 and 0.5, i.e. small ( \(d \in [0.2, 0.5)\) ) and medium. Thus, while GPT-4 seems to be an improvement over GPT-3.5 in general, the only clear indicator of this is a better and clearer logical composition and more complex writing with a more diverse vocabulary.

We also observe significant differences in the distribution of linguistic characteristics between all three groups (see Table 3). Sentence complexity (depth) is the only category without a significant difference between humans and ChatGPT-3, as well as between ChatGPT-3 and ChatGPT-4. There is also no significant difference in the category of discourse markers between humans and ChatGPT-3. The magnitude of the effects varies considerably, between 0.39 and 1.93, i.e., between small (\(d \in [0.2, 0.5)\)) and very large. However, in comparison to the ratings, there is no clear tendency regarding the direction of the differences. For instance, while the ChatGPT models write more complex sentences and use more nominalizations, humans tend to use more modals and epistemic markers instead. The lexical diversity of humans is higher than that of ChatGPT-3 but lower than that of ChatGPT-4. While there is no difference in the use of discourse markers between humans and ChatGPT-3, ChatGPT-4 uses significantly fewer discourse markers.
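
To illustrate what the sentence complexity (depth) feature measures, the following is a sketch of a maximum dependency tree depth computation with spacy, the NLP library cited in the methods; the paper’s exact feature definitions may differ:

```python
# Sketch of one linguistic feature: maximum dependency tree depth per essay.
# Requires: pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")


def token_depth(token) -> int:
    """Distance from a token to the root of its dependency tree."""
    depth = 0
    while token.head is not token:
        token = token.head
        depth += 1
    return depth


def max_tree_depth(text: str) -> int:
    """Maximum dependency depth over all tokens of an essay."""
    doc = nlp(text)
    return max(token_depth(tok) for tok in doc)
```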

We detect the expected positive correlations between the complexity ratings and the linguistic markers for sentence complexity ( \(r=0.16\) for depth, \(r=0.19\) for clauses) and nominalizations ( \(r=0.22\) ). However, we observe a negative correlation between the logic ratings and the discourse markers ( \(r=-0.14\) ), which counters our intuition that more frequent use of discourse indicators makes a text more logically coherent. However, this is in line with previous work: McNamara et al. 45 also find no indication that the use of cohesion indices such as discourse connectives correlates with high- and low-proficiency essays. Finally, we observe the expected positive correlation between the ratings for the vocabulary and the lexical diversity ( \(r=0.12\) ). All observed correlations are significant. However, we note that the strength of all these correlations is weak and that the significance itself should not be over-interpreted due to the large sample size.

Our results provide clear answers to the first two research questions that consider the quality of the generated essays: ChatGPT performs well at writing argumentative student essays and significantly outperforms the quality of the human-written essays. The ChatGPT-4 model has (at least) a large effect and is on average about one point better than humans on a seven-point Likert scale.

Regarding the third research question, we find that there are significant linguistic differences between humans and AI-generated content. The AI-generated essays are highly structured, which for instance is reflected by the identical beginnings of the concluding sections of all ChatGPT essays (‘In conclusion, [...]’). The initial sentences of each essay are also very similar starting with a general statement using the main concepts of the essay topics. Although this corresponds to the general structure that is sought after for argumentative essays, it is striking to see that the ChatGPT models are so rigid in realizing this, whereas the human-written essays are looser in representing the guideline on the linguistic surface. Moreover, the linguistic fingerprint has the counter-intuitive property that the use of discourse markers is negatively correlated with logical coherence. We believe that this might be due to the rigid structure of the generated essays: instead of using discourse markers, the AI models provide a clear logical structure by separating the different arguments into paragraphs, thereby reducing the need for discourse markers.

Our data also shows that hallucinations are not a problem in the setting of argumentative essay writing: the essay topics are not really about factual correctness, but rather about argumentation and critical reflection on general concepts which seem to be contained within the knowledge of the AI model. The stochastic nature of the language generation is well-suited for this kind of task, as different plausible arguments can be seen as a sampling from all available arguments for a topic. Nevertheless, we need to perform a more systematic study of the argumentative structures in order to better understand the difference in argumentation between human-written and ChatGPT-generated essay content. Moreover, we also cannot rule out that subtle hallucinations may have been overlooked during the ratings. There are also essays with a low rating for the criteria related to factual correctness, indicating that there might be cases where the AI models still have problems, even if they are, on average, better than the students.

One of the issues with evaluations of recent large language models is that they often do not account for the impact of tainted data, i.e. benchmark content that was part of the training data. While it is certainly possible that the essays that were sourced by Stab and Gurevych 41 from the internet were part of the training data of the GPT models, the proprietary nature of the model training means that we cannot confirm this. However, we note that the generated essays did not resemble the corpus of human essays at all. Moreover, the topics of the essays are general in the sense that any human should be able to reason and write about them, just by understanding concepts like ‘cooperation’. Consequently, a taint on these general topics, i.e. the fact that they might be present in the training data, is not only possible but actually expected and unproblematic, as it relates to the capability of the models to learn about concepts rather than the memorization of specific task solutions.

While we did everything we could to ensure a sound construct and high validity for our study, certain issues may still affect our conclusions. Most importantly, neither the writers of the essays nor their raters were native English speakers. However, the students purposefully used a forum for English writing frequented by native speakers to ensure the language and content quality of their essays. This indicates that the resulting essays are likely above average for non-native speakers, as they went through at least one round of revisions with the help of native speakers. The teachers were informed that part of the training would be in English to prevent registrations from people without English language skills. Moreover, the self-assessment of the language skills was only weakly correlated with the ratings, indicating that the threat to the soundness of our results is low. While we cannot definitively rule out that our results would differ with other human raters, the high inter-rater agreement indicates that this is unlikely.

However, our reliance on essays written by non-native speakers affects the external validity and the generalizability of our results. It is certainly possible that native speaking students would perform better in the criteria related to language skills, though it is unclear by how much. However, the language skills were particular strengths of the AI models, meaning that while the difference might be smaller, it is still reasonable to conclude that the AI models would have at least comparable performance to humans, but possibly still better performance, just with a smaller gap. While we cannot rule out a difference for the content-related criteria, we also see no strong argument why native speakers should have better arguments than non-native speakers. Thus, while our results might not fully translate to native speakers, we see no reason why aspects regarding the content should not be similar. Further, our results were obtained based on high-school-level essays. Native and non-native speakers with higher education degrees or experts in fields would likely also achieve a better performance, such that the difference in performance between the AI models and humans would likely also be smaller in such a setting.

We further note that the essay topics may not be an unbiased sample. While Stab and Gurevych 41 randomly sampled the essays from the writing feedback section of an essay forum, it is unclear whether the essays posted there are representative of the general population of essay topics. Nevertheless, we believe that this threat is fairly low because our results are consistent and do not seem to be influenced by certain topics. Further, we cannot conclude with certainty how our results generalize beyond ChatGPT-3 and ChatGPT-4 to similar models like Bard (https://bard.google.com/?hl=en), Alpaca, and Dolly. The results for the linguistic characteristics are especially hard to predict. However, since the general approach behind these models is similar (to the best of our knowledge, given the proprietary nature of some of them), the trends for essay quality should hold for models of comparable size and training procedures.

Finally, we want to note that the current speed of progress with generative AI is extremely fast and we are studying moving targets: ChatGPT 3.5 and 4 today are already not the same as the models we studied. Due to a lack of transparency regarding the specific incremental changes, we cannot know or predict how this might affect our results.

Our results provide a strong indication that the fear many teaching professionals have is warranted: the way students do homework and teachers assess it needs to change in a world of generative AI models. For non-native speakers, our results show that when students want to maximize their essay grades, they could easily do so by relying on results from AI models like ChatGPT. The very strong performance of the AI models indicates that this might also be the case for native speakers, though the difference in language skills is probably smaller. However, this is not and cannot be the goal of education. Consequently, educators need to change how they approach homework. Instead of just assigning and grading essays, we need to reflect more on the output of AI tools regarding their reasoning and correctness. AI models need to be seen as an integral part of education, but one which requires careful reflection and training of critical thinking skills.

Furthermore, teachers need to adapt strategies for teaching writing skills: as with the use of calculators, it is necessary to critically reflect with the students on when and how to use those tools. For instance, constructivists 62 argue that learning is enhanced by the active design and creation of unique artifacts by students themselves. In the present case this means that, in the long term, educational objectives may need to be adjusted. This is analogous to teaching good arithmetic skills to younger students and then allowing and encouraging students to use calculators freely in later stages of education. Similarly, once a sound level of literacy has been achieved, strongly integrating AI models in lesson plans may no longer run counter to reasonable learning goals.

In terms of shedding light on the quality and structure of AI-generated essays, this paper makes an important contribution by offering an independent, large-scale and statistically sound account of essay quality, comparing human-written and AI-generated texts. By comparing different versions of ChatGPT, we also offer a glance into the development of these models over time in terms of their linguistic properties and the quality they exhibit. Our results show that while the language generated by ChatGPT is considered very good by humans, there are also notable structural differences, e.g. in the use of discourse markers. This demonstrates that an in-depth consideration is required not only of the capabilities of generative AI models (i.e. which tasks they can be used for), but also of the language they generate. For example, if we read many AI-generated texts that use fewer discourse markers, it raises the question if and how this would affect our human use of discourse markers. Understanding how AI-generated texts differ from human-written ones enables us to look for these differences, to reason about their potential impact, and to study and possibly mitigate this impact.

Data availability

The datasets generated during and/or analysed during the current study are available in the Zenodo repository, https://doi.org/10.5281/zenodo.8343644

Code availability

All materials are available online in form of a replication package that contains the data and the analysis code, https://doi.org/10.5281/zenodo.8343644 .

Ouyang, L. et al. Training language models to follow instructions with human feedback (2022). arXiv:2203.02155 .

Ruby, D. 30+ detailed chatgpt statistics–users & facts (sep 2023). https://www.demandsage.com/chatgpt-statistics/ (2023). Accessed 09 June 2023.

Leahy, S. & Mishra, P. TPACK and the Cambrian explosion of AI. In Society for Information Technology & Teacher Education International Conference , (ed. Langran, E.) 2465–2469 (Association for the Advancement of Computing in Education (AACE), 2023).

Ortiz, S. Need an ai essay writer? here’s how chatgpt (and other chatbots) can help. https://www.zdnet.com/article/how-to-use-chatgpt-to-write-an-essay/ (2023). Accessed 09 June 2023.

OpenAI chat interface. https://chat.openai.com/. Accessed 09 June 2023.

OpenAI. Gpt-4 technical report (2023). arXiv:2303.08774 .

Brown, T. B. et al. Language models are few-shot learners (2020). arXiv:2005.14165 .

Wang, B. Mesh-Transformer-JAX: Model-Parallel Implementation of Transformer Language Model with JAX. https://github.com/kingoflolz/mesh-transformer-jax (2021).

Wei, J. et al. Finetuned language models are zero-shot learners. In International Conference on Learning Representations (2022).

Taori, R. et al. Stanford alpaca: An instruction-following llama model. https://github.com/tatsu-lab/stanford_alpaca (2023).

Cai, Z. G., Haslett, D. A., Duan, X., Wang, S. & Pickering, M. J. Does chatgpt resemble humans in language use? (2023). arXiv:2303.08014 .

Mahowald, K. A discerning several thousand judgments: Gpt-3 rates the article + adjective + numeral + noun construction (2023). arXiv:2301.12564 .

Dentella, V., Murphy, E., Marcus, G. & Leivada, E. Testing ai performance on less frequent aspects of language reveals insensitivity to underlying meaning (2023). arXiv:2302.12313 .

Guo, B. et al. How close is chatgpt to human experts? comparison corpus, evaluation, and detection (2023). arXiv:2301.07597 .

Zhao, W. et al. Is chatgpt equipped with emotional dialogue capabilities? (2023). arXiv:2304.09582 .

Keim, D. A. & Oelke, D. Literature fingerprinting: A new method for visual literary analysis. In 2007 IEEE Symposium on Visual Analytics Science and Technology, 115–122, https://doi.org/10.1109/VAST.2007.4389004 (IEEE, 2007).

El-Assady, M. et al. Interactive visual analysis of transcribed multi-party discourse. In Proceedings of ACL 2017, System Demonstrations , 49–54 (Association for Computational Linguistics, Vancouver, Canada, 2017).

El-Assady, M., Hautli-Janisz, A. & Butt, M. Discourse maps - feature encoding for the analysis of verbatim conversation transcripts. In Visual Analytics for Linguistics, CSLI Lecture Notes, Number 220, 115–147 (Stanford: CSLI Publications, 2020).

Foulis, M., Visser, J. & Reed, C. Dialogical fingerprinting of debaters. In Proceedings of COMMA 2020, 465–466, https://doi.org/10.3233/FAIA200536 (Amsterdam: IOS Press, 2020).

Foulis, M., Visser, J. & Reed, C. Interactive visualisation of debater identification and characteristics. In Proceedings of the COMMA workshop on Argument Visualisation, COMMA, 1–7 (2020).

Chatzipanagiotidis, S., Giagkou, M. & Meurers, D. Broad linguistic complexity analysis for Greek readability classification. In Proceedings of the 16th Workshop on Innovative Use of NLP for Building Educational Applications , 48–58 (Association for Computational Linguistics, Online, 2021).

Ajili, M., Bonastre, J.-F., Kahn, J., Rossato, S. & Bernard, G. FABIOLE, a speech database for forensic speaker comparison. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16) , 726–733 (European Language Resources Association (ELRA), Portorož, Slovenia, 2016).

Deutsch, T., Jasbi, M. & Shieber, S. Linguistic features for readability assessment. In Proceedings of the Fifteenth Workshop on Innovative Use of NLP for Building Educational Applications, 1–17, https://doi.org/10.18653/v1/2020.bea-1.1 (Association for Computational Linguistics, Seattle, WA, USA (Online), 2020).

Fiacco, J., Jiang, S., Adamson, D. & Rosé, C. Toward automatic discourse parsing of student writing motivated by neural interpretation. In Proceedings of the 17th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2022) , 204–215, https://doi.org/10.18653/v1/2022.bea-1.25 (Association for Computational Linguistics, Seattle, Washington, 2022).

Weiss, Z., Riemenschneider, A., Schröter, P. & Meurers, D. Computationally modeling the impact of task-appropriate language complexity and accuracy on human grading of German essays. In Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications , 30–45, https://doi.org/10.18653/v1/W19-4404 (Association for Computational Linguistics, Florence, Italy, 2019).

Yang, F., Dragut, E. & Mukherjee, A. Predicting personal opinion on future events with fingerprints. In Proceedings of the 28th International Conference on Computational Linguistics , 1802–1807, https://doi.org/10.18653/v1/2020.coling-main.162 (International Committee on Computational Linguistics, Barcelona, Spain (Online), 2020).

Tumarada, K. et al. Opinion prediction with user fingerprinting. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021) , 1423–1431 (INCOMA Ltd., Held Online, 2021).

Rocca, R. & Yarkoni, T. Language as a fingerprint: Self-supervised learning of user encodings using transformers. In Findings of the Association for Computational Linguistics: EMNLP . 1701–1714 (Association for Computational Linguistics, Abu Dhabi, United Arab Emirates, 2022).

Aiyappa, R., An, J., Kwak, H. & Ahn, Y.-Y. Can we trust the evaluation on chatgpt? (2023). arXiv:2303.12767 .

Yeadon, W., Inyang, O.-O., Mizouri, A., Peach, A. & Testrow, C. The death of the short-form physics essay in the coming ai revolution (2022). arXiv:2212.11661 .

Turing, A. M. Computing machinery and intelligence. Mind LIX, 433–460. https://doi.org/10.1093/mind/LIX.236.433 (1950).

Kortemeyer, G. Could an artificial-intelligence agent pass an introductory physics course? (2023). arXiv:2301.12127 .

Kung, T. H. et al. Performance of chatgpt on usmle: Potential for ai-assisted medical education using large language models. PLOS Digital Health 2 , 1–12. https://doi.org/10.1371/journal.pdig.0000198 (2023).


Frieder, S. et al. Mathematical capabilities of chatgpt (2023). arXiv:2301.13867 .

Yuan, Z., Yuan, H., Tan, C., Wang, W. & Huang, S. How well do large language models perform in arithmetic tasks? (2023). arXiv:2304.02015 .

Touvron, H. et al. Llama: Open and efficient foundation language models (2023). arXiv:2302.13971 .

Chung, H. W. et al. Scaling instruction-finetuned language models (2022). arXiv:2210.11416 .

Workshop, B. et al. Bloom: A 176b-parameter open-access multilingual language model (2023). arXiv:2211.05100 .

Spencer, S. T., Joshi, V. & Mitchell, A. M. W. Can ai put gamma-ray astrophysicists out of a job? (2023). arXiv:2303.17853 .

Cherian, A., Peng, K.-C., Lohit, S., Smith, K. & Tenenbaum, J. B. Are deep neural networks smarter than second graders? (2023). arXiv:2212.09993 .

Stab, C. & Gurevych, I. Annotating argument components and relations in persuasive essays. In Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers , 1501–1510 (Dublin City University and Association for Computational Linguistics, Dublin, Ireland, 2014).

Essay forum. https://essayforum.com/. Accessed 07 September 2023.

Common European Framework of Reference for Languages (CEFR). https://www.coe.int/en/web/common-european-framework-reference-languages . Accessed 09 July 2023.

KMK guidelines for essay assessment. http://www.kmk-format.de/material/Fremdsprachen/5-3-2_Bewertungsskalen_Schreiben.pdf . Accessed 09 July 2023.

McNamara, D. S., Crossley, S. A. & McCarthy, P. M. Linguistic features of writing quality. Writ. Commun. 27 , 57–86 (2010).

McCarthy, P. M. & Jarvis, S. Mtld, vocd-d, and hd-d: A validation study of sophisticated approaches to lexical diversity assessment. Behav. Res. Methods 42 , 381–392 (2010).


Dasgupta, T., Naskar, A., Dey, L. & Saha, R. Augmenting textual qualitative features in deep convolution recurrent neural network for automatic essay scoring. In Proceedings of the 5th Workshop on Natural Language Processing Techniques for Educational Applications , 93–102 (2018).

Koizumi, R. & In’nami, Y. Effects of text length on lexical diversity measures: Using short texts with less than 200 tokens. System 40 , 554–564 (2012).

spaCy: industrial-strength natural language processing in Python. https://spacy.io/ .

Siskou, W., Friedrich, L., Eckhard, S., Espinoza, I. & Hautli-Janisz, A. Measuring plain language in public service encounters. In Proceedings of the 2nd Workshop on Computational Linguistics for Political Text Analysis (CPSS-2022) (Potsdam, Germany, 2022).

El-Assady, M. & Hautli-Janisz, A. Discourse Maps - Feature Encoding for the Analysis of Verbatim Conversation Transcripts. CSLI Lecture Notes (CSLI Publications, Center for the Study of Language and Information, 2019).

Hautli-Janisz, A. et al. QT30: A corpus of argument and conflict in broadcast debate. In Proceedings of the Thirteenth Language Resources and Evaluation Conference , 3291–3300 (European Language Resources Association, Marseille, France, 2022).

Somasundaran, S. et al. Towards evaluating narrative quality in student writing. Trans. Assoc. Comput. Linguist. 6 , 91–106 (2018).

Nadeem, F., Nguyen, H., Liu, Y. & Ostendorf, M. Automated essay scoring with discourse-aware neural models. In Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications , 484–493, https://doi.org/10.18653/v1/W19-4450 (Association for Computational Linguistics, Florence, Italy, 2019).

Prasad, R. et al. The Penn Discourse TreeBank 2.0. In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC’08) (European Language Resources Association (ELRA), Marrakech, Morocco, 2008).

Cronbach, L. J. Coefficient alpha and the internal structure of tests. Psychometrika 16 , 297–334. https://doi.org/10.1007/bf02310555 (1951).


Wilcoxon, F. Individual comparisons by ranking methods. Biom. Bull. 1 , 80–83 (1945).

Holm, S. A simple sequentially rejective multiple test procedure. Scand. J. Stat. 6 , 65–70 (1979).


Cohen, J. Statistical power analysis for the behavioral sciences (Academic press, 2013).

Freedman, D., Pisani, R. & Purves, R. Statistics, 4th edn (international student edition). WW Norton & Company, New York (2007).

Scipy documentation. https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.pearsonr.html . Accessed 09 June 2023.

Windschitl, M. Framing constructivism in practice as the negotiation of dilemmas: An analysis of the conceptual, pedagogical, cultural, and political challenges facing teachers. Rev. Educ. Res. 72 , 131–175 (2002).


Open Access funding enabled and organized by Projekt DEAL.

Author information

Authors and Affiliations

Faculty of Computer Science and Mathematics, University of Passau, Passau, Germany

Steffen Herbold, Annette Hautli-Janisz, Ute Heuer, Zlata Kikteva & Alexander Trautsch


Contributions

S.H., A.HJ., and U.H. conceived the experiment; S.H., A.HJ, and Z.K. collected the essays from ChatGPT; U.H. recruited the study participants; S.H., A.HJ., U.H. and A.T. conducted the training session and questionnaire; all authors contributed to the analysis of the results, the writing of the manuscript, and review of the manuscript.

Corresponding author

Correspondence to Steffen Herbold .

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Information 1.

Supplementary Information 2.

Supplementary Information 3.

Supplementary Tables.

Supplementary Figures.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Cite this article

Herbold, S., Hautli-Janisz, A., Heuer, U. et al. A large-scale comparison of human-written versus ChatGPT-generated essays. Sci Rep 13 , 18617 (2023). https://doi.org/10.1038/s41598-023-45644-9


Received : 01 June 2023

Accepted : 22 October 2023

Published : 30 October 2023

DOI : https://doi.org/10.1038/s41598-023-45644-9



ChatGPT: the latest news and updates on the AI chatbot that changed everything

In the ever-evolving landscape of artificial intelligence , ChatGPT stands out as a groundbreaking development that has captured global attention. From its impressive capabilities and recent advancements to the heated debates surrounding its ethical implications, ChatGPT continues to make headlines.

Whether you’re a tech enthusiast or just curious about the future of AI, dive into this comprehensive guide to uncover everything you need to know about this revolutionary AI tool.

What is ChatGPT?

ChatGPT is a natural language AI chatbot. At its most basic level, that means you can ask it a question and it will generate an answer. As opposed to a simple voice assistant like Siri or Google Assistant, ChatGPT is built on what is called an LLM (Large Language Model). These neural networks are trained on huge quantities of information from the internet for deep learning, meaning they generate altogether new responses rather than just regurgitating specific canned responses. They’re not built for a specific purpose like chatbots of the past, and they’re a whole lot smarter.

This is implied in the name of ChatGPT, which stands for Chat Generative Pre-trained Transformer. In the case of the current version of ChatGPT, it’s based on the GPT-4 LLM. The model behind ChatGPT was trained on all sorts of web content including websites, books, social media, news articles, and more — all fine-tuned in the language model by both supervised learning and RLHF (Reinforcement Learning From Human Feedback). OpenAI says this use of human AI trainers is really what makes ChatGPT stand out.

When was ChatGPT released?

ChatGPT was originally launched to the public in November 2022 by OpenAI. That initial version was based on the GPT-3.5 model, though the system has undergone a number of iterative advancements since then, with the current version of ChatGPT running the GPT-4 model family and GPT-5 reportedly just around the corner. GPT-3 first launched in 2020, and GPT-2 was released the year prior to that.

How to use ChatGPT

First, go to chatgpt.com. If you’d like to maintain a history of your previous chats, sign up for a free account; you can also use the system anonymously without a login if you prefer. Users can opt to connect their ChatGPT login with their Google-, Microsoft-, or Apple-backed accounts as well. At the sign-up screen, you’ll see some basic rules about ChatGPT, including potential errors in data, how OpenAI collects data, and how users can submit feedback. If you want to get started, we have a roundup of the best ChatGPT tips.

Using ChatGPT itself is simple and straightforward: just type in your text prompt and wait for the system to respond. You can be as creative as you like and see how ChatGPT responds to different prompts. If you don’t get the intended result, try tweaking your prompt or giving ChatGPT further instructions. The system understands context based on previous responses from the current chat session, so you can refine your requests rather than starting over fresh every time.

For example, starting with “Explain how the solar system was made” will give a more detailed result with more paragraphs than “How was the solar system made,” even though both inquiries will give fairly detailed results. Take it a step further by giving ChatGPT more guidance about style or tone, saying “Explain how the solar system was made as a middle school teacher.”

You also have the option for more specific input requests, for example, an essay with a set number of paragraphs or a link to a specific Wikipedia page. We got an extremely detailed result with the request “write a four-paragraph essay explaining Mary Shelley’s Frankenstein.”

ChatGPT is capable of automating any number of daily work or personal tasks, from writing emails and crafting business proposals to offering suggestions for fun date night ideas or even drafting a best man’s speech for your buddy’s wedding. So long as the request doesn’t violate the site’s rules on explicit or illegal content, the model will do its best to fulfill it.

Since its launch, people have been experimenting to discover everything the chatbot can and can’t do — and the results have been impressive, to say the least . Learning the kinds of prompts and follow-up prompts that ChatGPT responds well to requires some experimentation though. Much like we’ve learned to get the information we want from traditional search engines, it can take some time to get the best results from ChatGPT. It really all depends on what you want out of it. To start out, try using it to write a template blog post, for example, or even blocks of code if you’re a programmer.

Our writers experimented with ChatGPT too, attempting to see if it could handle holiday shopping or even properly interpret astrological makeup . In both cases, we found limitations to what it could do while still being thoroughly impressed by the results.

Following an update on August 10, you can now use custom instructions with ChatGPT. This allows you to customize how the AI chatbot responds to your inputs so you can tailor it to your needs. You can’t ask just anything, though. OpenAI has safeguards in place in order to “build a safe and beneficial artificial general intelligence.” That means any questions that are hateful, sexist, racist, or discriminatory in any way are generally off-limits.

You shouldn’t take everything that ChatGPT (or any chatbot, for that matter) tells you at face value. When ChatGPT first launched, it was highly prone to “hallucinations,” where the system would repeat erroneous data as fact. The issue has become less prevalent as the model is continually fine-tuned, though mistakes do still happen. Trust, but verify!

What’s more, due to the way that OpenAI trains its underlying large language models (whether that’s GPT-3.5, GPT-4 and GPT-4o, or the upcoming GPT-5), ChatGPT may not be able to answer your question without help from an internet search if the subject is something that occurred recently. For example, GPT-3.5 and 3.5 Turbo cannot answer questions about events after September 2021 without conducting an internet search, because the data those models were initially trained on was produced before that “knowledge cutoff date.” Similarly, GPT-4 and GPT-4 Turbo have cutoff dates of December 2023, though GPT-4o (despite being released more recently) has a cutoff of October 2023.

While ChatGPT might not remember all of recorded history, it will remember what you were discussing with it in previous chat sessions. Logged in users can access their chat history from the navigation sidebar on the left of the screen, and manage these chats, renaming, hiding or deleting them as needed. You can also ask ChatGPT follow up questions based on those previous conversations directly through the chat window. Users also have the option to use ChatGPT in dark mode or light mode.

ChatGPT isn’t just a wordsmith. Users paying the $20/month subscription for ChatGPT Plus, or $30/month per user for ChatGPT Business, gain access to the Dall-E image generator, which converts text prompts into lifelike generated images. Unfortunately, this feature is not currently available on the free tier. Regardless of subscription status, all users can use image or voice inputs for their prompts.

How to use the ChatGPT iPhone, Android, and Mac apps

ChatGPT is available through the OpenAI website, as well as a mobile app for both iOS and Android devices. The iOS version was an immediate hit when it arrived in the App Store, topping half a million downloads in less than a week.

If you can use ChatGPT on the web, you can use it on your phone. Logging in or signing up through the app is nearly identical to the web version, and nearly all of the features found on the desktop have been ported to the mobile versions. The app lets you toggle between GPT-3.5, GPT-4, and GPT-4o as well. The clean interface shows your conversation with GPT in a straightforward manner, hiding the chat history and settings behind the menu in the top right.

Some devices go beyond just the app, too. For instance, the Infinix Folax is an Android phone that integrated ChatGPT throughout the device. Instead of just an app, the phone replaces the typical smart assistant (Google Assistant) with ChatGPT.

There’s even an official ChatGPT app released for the Mac that can be used for free . The app is capable of all sorts of new things that bring Mac AI capabilities to new levels — and you don’t even have to wait for macOS Sequoia later this year.

Is ChatGPT free to use?

Yes, ChatGPT is completely free to use, though with some restrictions. Even with a free tier account, users have access to the GPT-3.5 and GPT-4o models, though the number of queries that users can make of the more advanced model is limited. Upgrading to a paid subscription drastically increases that query limit, grants access to other generative AI tools like Dall-E image generation, and unlocks the GPT store.

It’s not free for OpenAI to continue running it, of course. Initial estimates suggested that OpenAI spends around $3 million per month to keep ChatGPT running, which is around $100,000 per day, while a report from April 2023 indicated that the price of operation is closer to $700,000 per day.

Beyond the cost of the servers themselves, some troubling information and accusations have come to light regarding what else has been done to safeguard the model from producing offensive content.

Who created ChatGPT?

OpenAI, a San Francisco-based AI research lab, created ChatGPT and released the very first version of the LLM in 2018. The organization started as a non-profit meant for collaboration with other institutions and researchers, funded by high-profile figures like Peter Thiel and Elon Musk, the latter of whom left the company after an internal power struggle to found the rival firm xAI.

OpenAI later transitioned to a for-profit structure in 2019 and is now led by CEO, Sam Altman. It runs on Microsoft’s Azure system infrastructure and is powered by Nvidia’s GPUs, including the new supercomputers just announced this year . Microsoft has invested heavily in OpenAI since 2019 as well, expanding its partnership with the AI startup in 2021 and again in 2023, when Microsoft announced a multi-billion dollar round of investments that included naming its Azure cloud as OpenAI’s exclusive cloud provider.

ChatGPT’s continuous confounding controversies

Although ChatGPT is an extremely capable digital tool, it isn’t foolproof. The AI is known for making mistakes or “hallucinations,” where it makes up an answer to something it doesn’t know. Early on, a simple example of how unreliable it can sometimes be involved misidentifying the prime minister of Japan.

Beyond just making mistakes, many people are concerned about what this human-like generative AI could mean for the future of the internet, so much so that thousands of tech leaders and prominent public figures have signed a petition to slow down the development. It was even banned in Italy due to privacy concerns, alongside complaints from the FTC — although that’s now been reversed. Since then, the FTC has reopened investigations against OpenAI on questions of personal consumer data is being handled.

In addition, JPMorgan Chase has threatened to restrict the use of the AI chatbot for workers, especially for generating emails, which companies like Apple have also prohibited internally. Following Apple’s announcement at WWDC 2024 that it would be integrating OpenAI’s technology into its mobile and desktop products, Tesla CEO and sore loser Elon Musk similarly threatened to ban any device running the software from his businesses — everything from iPhones to Mac Studios. Other high-profile companies have been disallowing the use of ChatGPT internally, including Samsung, Amazon, Verizon, and even the United States Congress .

There’s also the concern that generative AI like ChatGPT could result in the loss of many jobs — as many as 300 million worldwide, according to Goldman Sachs. In particular, it’s taken the spotlight in Hollywood’s writer’s strike , which wants to ensure that AI-written scripts don’t take the jobs of working screenwriters.

In 2023, many people attempting to use ChatGPT received an “at capacity” notice when trying to access the site. This likely drove the move toward unofficial paid apps, which had already flooded app stores and scammed thousands into paying for a free service.

Because of how much ChatGPT costs to run, it seems as if OpenAI has been limiting access when its servers are “at capacity.” It can take as long as a few hours to wait out, but if you’re patient, you’ll get through eventually. Of the numerous growing pains ChatGPT has faced , “at capacity” errors had been the biggest hurdle keeping people from using the service more. In some cases, demand had been so high that the entire ChatGPT website has gone down for several hours for maintenance multiple times over the course of months.

Multiple controversies have also emerged from people using ChatGPT to handle tasks that should probably be handled by an actual person. One of the worst cases of this is generating malware, which the FBI recently warned ChatGPT is being used for. More startling, Vanderbilt University’s Peabody School came under fire for generating an email about a mass shooting and the importance of community.

There are also privacy concerns. A recent GDPR complaint says that ChatGPT violates users’ privacy by collecting data from them without their knowledge and using that data to train the AI model. ChatGPT was even made to generate Windows 11 keys for free, according to one user. Of course, this is not how ChatGPT was meant to be used, but it’s significant that it could be “tricked” into generating the keys in the first place.

Can ChatGPT’s outputs be detected by anti-plagiarism systems?

Teachers, school administrators, and developers are already finding different ways around this, including banning the use of ChatGPT in schools. Others are more optimistic about how ChatGPT might be used for teaching, but plagiarism is undoubtedly going to remain an issue in education for the foreseeable future. There are some ideas about how ChatGPT could “watermark” its text and fix this plagiarism problem, but as of now, detecting ChatGPT output is still incredibly difficult to do.

OpenAI launched an updated version of its own plagiarism detection tool in January 2023, with hopes that it would squelch some of the criticism around how people are using the text generation system. It uses a feature called “AI text classifier,” which operates in a way familiar to other plagiarism software. According to OpenAI, however, the tool is a work in progress and remains “imperfect.” Since the advent of GPTs in April 2024, third-party developers have also stepped in with their own offerings, such as Plagiarism Checker.

What are ChatGPT plugins?

They’re a feature that doesn’t exist anymore. The announcement of ChatGPT plugins caused a great stir in the developer community, with some calling it “the most powerful developer platform ever created.” AI enthusiasts compared it to the surge of interest in the iOS App Store when it first launched, which greatly expanded the capabilities of the iPhone.

Essentially, developers would be able to build plugins directly for ChatGPT, to open it up to have access to the whole of the internet and connect directly to the APIs of specific applications. Some of the examples provided by OpenAI include applications being able to perform actions on behalf of the user, retrieve real-time information, and access knowledge-based information.

However, in 2024, OpenAI reversed course on its plugin plans, sunsetting the feature and replacing it with GPT applets. OpenAI’s GPT applets were released in conjunction with the unveiling of GPT-4o. They’re small, interactive JavaScript applications generated by GPT-4 and available on the ChatGPT website. These applets are various tools designed to perform specific, often singular, tasks such as acting as calculators, planners, widgets, image apps, and text transformation utilities.

Is there a ChatGPT API?

Yes. APIs are a way for developers to access ChatGPT and plug its natural language capabilities directly into apps and websites. We’ve seen it used in all sorts of different cases, ranging from suggesting parts in Newegg’s PC builder to building out a travel itinerary with just a few words. Many apps have been announced as partners with OpenAI using the ChatGPT API. Of the initial batch, the most prominent example is Snapchat’s MyAI.

Recently, OpenAI made the ChatGPT API available to everyone, and we’ve seen a surge in tools leveraging the technology, such as Discord’s Clyde chatbot or Wix’s website builder . Most recently, GPT-4 has been made available as an API “for developers to build applications and services.” Some of the companies that have already integrated GPT-4 include Duolingo, Be My Eyes, Stripe, and Khan Academy. 
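
As a rough, hedged sketch of what such an integration looks like with the official openai Python package (v1-style client), assuming the OPENAI_API_KEY environment variable is set and using “gpt-4” purely as an example model name:

```python
# Minimal chat completion with the official openai Python package (v1 client).
# Assumes the OPENAI_API_KEY environment variable is set.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4",  # example model name
    messages=[
        {"role": "system", "content": "You are a helpful travel assistant."},
        {"role": "user", "content": "Draft a two-day itinerary for Lisbon."},
    ],
)
print(response.choices[0].message.content)
```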

What’s the future of ChatGPT?

There’s no doubt that the tech world has become obsessed with ChatGPT right now, and it’s not slowing down anytime soon. But the bigger development will be how ChatGPT continues to be integrated into other applications.

GPT-5 is the rumored next significant step up for the GPT family, one that has been teased and talked about ad nauseam over the past year. Some say that it will finish training as early as December 2024, paving the way toward AGI (artificial general intelligence). OpenAI CTO Mira Murati has compared it to having Ph.D.-level intelligence, while others have said it will lead to AI with better memory and reasoning. The timing remains very uncertain, but it could launch sometime in 2025.

Beyond GPT-5, plenty of AI enthusiasts and forecasters have predicted where this technology is headed. Last year, Shane Legg, Google DeepMind’s co-founder and chief AGI scientist,  told Time Magazine  that he estimates there to be a 50% chance that AGI will be developed by 2028. Dario Amodei, co-founder and CEO of Anthropic, is even more bullish,  claiming last August  that “human-level” AI could arrive in the next two to three years. For his part, OpenAI CEO Sam Altman argues that AGI could be achieved  within the next half-decade .

All that to say, if you think AI is a big deal now, we’re clearly still in the early days.

ChatGPT alternatives worth trying

ChatGPT remains the most popular AI chatbot, but it’s not without competition. Microsoft’s Copilot is a significant rival, even though Microsoft has invested heavily in the AI startup and Copilot itself leverages the GPT-4 model for its answers.

Google’s Gemini AI (formerly Google Bard) is another such competitor. Built on Google’s own transformer architecture, this family of multimodal AI models can both understand and generate text, images, audio, video, and code. First released in March 2023, Gemini is available in 46 languages and in 239 countries and territories. One of its big advantages is that Gemini can generate images for free, while you’ll have to upgrade to ChatGPT Plus for that in OpenAI’s ecosystem.

Anthropic’s Claude family of AI models has also emerged as a serious challenger to ChatGPT’s dominance. In June 2024, the AI startup announced that its recently released Claude 3.5 Sonnet model outperformed both GPT-4o and Gemini Pro 1.5 on a host of industry benchmarks, and outperformed the older Claude 3.0 Opus by double digits while consuming 50 percent less energy.

Meta, the parent company of Facebook, has also spent the last few years developing its own AI chatbot based on its family of Llama large language models. The company finally revealed its chatbot, dubbed Meta AI, in April 2024, and said that it leverages the company’s latest model to date, Llama 3. The assistant is available in more than a dozen countries and operates across Meta’s app suite, including Facebook, Instagram, WhatsApp, and Messenger.

Lastly, Apple had long been rumored to be working on an artificial intelligence system of its own, and confirmed those rumors at WWDC 2024 in June, where the company revealed Apple Intelligence. The AI is “comprised of highly capable large language and diffusion models specialized for your everyday tasks” and designed to help iPhone, iPad, and Mac users streamline many of their most common everyday tasks across apps.

For example, the system will autonomously prioritize specific system notifications so as to minimize distractions while you focus on a task, while writing aids can proofread your work, revise it at your command, and even summarize text for you. Apple’s AI is expected to begin rolling out to users alongside the iOS 18, iPadOS 18, and macOS Sequoia software releases in fall 2024.

Other things to know about ChatGPT

Are ChatGPT chats private?

It depends on what you mean by private. All chats with ChatGPT are used by OpenAI to further tune the models, which can actually involve the use of human trainers. No, that doesn’t mean a human is looking through every question you ask ChatGPT, but there’s a reason OpenAI warns against providing any personal information to ChatGPT.

It should be noted that if you don’t delete your chats, the conversations will appear in the left sidebar. Unlike with other chatbots, individual messages within a conversation cannot be deleted, though they can be edited using the pencil icon that appears when you hover over them. When you delete a conversation, however, it’s not that ChatGPT forgets it ever happened; it just disappears from the sidebar chat history.

Fortunately, OpenAI has recently announced a way to make your chats hidden from the sidebar . These “hidden” chats won’t be used to train AI models either. You can also opt out of allowing OpenAI to train its models in the settings.

Will ChatGPT replace Google Search?

Rather than replace it, generative AI features are being integrated directly into search. Microsoft started things off by integrating Copilot into its own search engine, putting a “chat” tab right into the menu of Bing search. Google, of course, made its big move with AI Overviews, which uses AI-generated answers in place of traditional search results. It launched first through the Search Generative Experience, but rolled out widely in May 2024.

To be clear, this kind of AI is different from just Gemini or ChatGPT. And yet, it’s also undeniable that AI will play an important role in search in the near future. Despite all the problems with AI Overviews, Google seems committed to making it work.

Is Copilot the same as ChatGPT?

Although Copilot and ChatGPT are capable of similar things, they’re not exactly the same. Copilot, even though it runs the same GPT-4 model as ChatGPT, is an entirely separate product that has been fine-tuned by Microsoft.

Microsoft, as part of its multi-billion dollar investment into OpenAI, originally brought ChatGPT to Bing in the form of Bing Chat. But unlike ChatGPT, Bing Chat required downloading the latest version of Edge at the time.

Bing Chat has since been completely retooled into Copilot, which has seemingly become Microsoft’s most important product. It’s integrated into Microsoft 365 apps through Copilot Pro, while Copilot+ PCs extend the AI capabilities deep into Windows and laptop hardware.

Can you write essays with ChatGPT?

The use of ChatGPT has been full of controversy, with many onlookers considering how the power of AI will change everything from search engines to novel writing. It’s even demonstrated the ability to earn students surprisingly good grades in essay writing.

Essay writing for students is one of the most obvious examples of where ChatGPT could become a problem. ChatGPT might not write this article all that well, but it feels particularly easy to use for essay writing. Some generative AI tools, such as Caktus AI , are built specifically for this purpose.

Can ChatGPT write and debug code?

Absolutely. It’s one of the most powerful features of ChatGPT. As with everything with AI, you’ll want to double-check everything it produces, because it won’t always get your code right. But it’s certainly powerful at both writing code from scratch and debugging code. Developers have used it to create websites, applications, and games from scratch — all of which are made more powerful with GPT-4, of course.

What is the ChatGPT character limit?

ChatGPT doesn’t have a hard character limit. However, the size of the context window (essentially, how long you can make your prompt) depends on the tier of ChatGPT you’re using, and is measured in tokens rather than characters. Free tier users receive just an 8K-token window, while Plus and Teams subscribers receive 32K-token context windows, and Enterprise users get a whopping 128K tokens to play with.
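
Because the window is counted in tokens, you can estimate how much of it a prompt consumes with OpenAI’s tiktoken library; a minimal sketch (the model name is an example):

```python
# Count the tokens a prompt consumes, using OpenAI's tiktoken library.
import tiktoken

encoding = tiktoken.encoding_for_model("gpt-4")  # example model name
prompt = "Explain how the solar system was made as a middle school teacher."
print(len(encoding.encode(prompt)))  # tokens, not characters
```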

What is Auto-GPT?

Built on GPT-4, Auto-GPT is the latest evolution of AI technology to cause a stir in the industry. It’s not directly related to ChatGPT or OpenAI — instead, it’s an open-source Python application that got into the hands of developers all over the internet when it was published on GitHub .

With ChatGPT or ChatGPT Plus, the capabilities of the AI are limited to a single chat window. Auto-GPT, at its simplest, makes AI autonomous. It can be given a set of goals and then take the necessary steps toward accomplishing them across the internet, including connecting up with applications and software.

According to the official description on GitHub, Auto-GPT is an “experimental open-source application showcasing the capabilities of the GPT-4 language model. This program, driven by GPT-4, chains together LLM ‘thoughts’, to autonomously achieve whatever goal you set. As one of the first examples of GPT-4 running fully autonomously, Auto-GPT pushes the boundaries of what is possible with AI.”

The demo used on the GitHub page is simple — just create a recipe appropriate for Easter and save it to a file. What’s neat is how Auto-GPT breaks down the steps the AI is taking to accomplish the goal, including the “thoughts” and “reasoning” behind its actions. Auto-GPT is already being used in a variety of different applications, with some touting it as the beginning of AGI (Artificial General Intelligence) due to its autonomous nature.
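
To make the idea of “chaining thoughts” concrete, here is a deliberately simplified, hypothetical agent loop in Python. It is not Auto-GPT’s actual code; the run_tool helper and the JSON action format are invented for illustration, and a real agent would need error handling around the model’s replies:

```python
# A toy autonomous loop in the spirit of Auto-GPT -- not the real project's code.
# run_tool() and the JSON action schema are hypothetical, for illustration only.
import json

from openai import OpenAI

client = OpenAI()  # assumes the OPENAI_API_KEY environment variable is set


def run_tool(name: str, argument: str) -> str:
    """Hypothetical tool dispatcher; a real agent would wire up search, files, etc."""
    return f"(result of {name}({argument!r}))"


def agent(goal: str, max_steps: int = 5) -> None:
    history = [
        {"role": "system", "content": (
            "Pursue the user's goal step by step. Reply with JSON, either "
            '{"thought": "...", "tool": "...", "argument": "..."} '
            'or {"finish": "..."} when done.'
        )},
        {"role": "user", "content": goal},
    ]
    for _ in range(max_steps):
        reply = client.chat.completions.create(model="gpt-4", messages=history)
        action = json.loads(reply.choices[0].message.content)
        if "finish" in action:
            print(action["finish"])
            return
        observation = run_tool(action["tool"], action["argument"])
        history.append({"role": "assistant", "content": json.dumps(action)})
        history.append({"role": "user", "content": f"Observation: {observation}"})
```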

Who owns the copyright to content created by ChatGPT?

This is a question open to debate. Much of the conversation around copyright and AI is ongoing, with some saying generative AI is “stealing” the work of the content it was trained on. This has become increasingly contentious in the world of AI art. Companies like Adobe are finding ways around this by only training models on stock image libraries that already have proper artist credit and legal boundaries.

According to OpenAI, however, you have the right to reprint, sell, and merchandise anything that was created with ChatGPT or ChatGPT Plus. So, you’re not going to get sued by OpenAI.

The larger topic of copyright law regarding generative AI is still to be determined by various lawmakers and interpreters of the law, especially since copyright law as it currently stands technically only protects content created by human beings.


A college student got a top grade for an essay written with the help of ChatGPT, report says

  • A college student said he was given the highest grade on an essay written with the help of ChatGPT.
  • The student told BBC News he adapted and tweaked the content generated by the chatbot.
  • He said the first-class mark was the highest grade he'd received at Cardiff University in Wales.  


A student at Cardiff University in Wales said he received a top grade on an essay written with the help of OpenAI's ChatGPT.

The student, who was given the pseudonym "Tom" to protect his identity, told BBC News he conducted an experiment with the AI chatbot during January's assessment period.

The student, who averages a 2.1 grade, said he submitted one 2,500-word essay written with the help of ChatGPT, and another without any help from the bot.


For the essay he wrote himself, "Tom" said he received a low 2.1 grade. However, for the AI-assisted essay he was given a first – the highest grade he'd ever been awarded, per the report.

He told BBC News he did not use the chatbot to write his essay in its entirety, but adapted and tweaked the content the chatbot generated.

"I didn't copy everything word for word, but I would prompt it with questions that gave me access to information much quicker than usual," the student said.

"Tom" added he was likely to use the bot to plan future essays. 

Schools and colleges have been scrambling to deal with the fallout from the viral AI chatbot since it launched in late November. Students quickly seized on the bot's ability to generate well-structured essays and several were caught trying to pass the bot's content off as their own just months after it was released.

Professors previously told Insider that although they considered using AI to write essays to be a form of plagiarism, it can be hard to prove unless a student admits doing so.

A representative for Cardiff University told Insider: "Although not specifically referenced, the improper use of AI would be covered by our existing academic integrity policy.

"We are aware of the potential impact of AI programmes, like ChatGPT, on our assessments and coursework. Maintaining academic integrity is our main priority and we actively discourage any student from academic misconduct in its many forms. Our aim is also to educate our students about the proper use of AI and its many benefits.

"As a result, we are currently reviewing our existing policies and will be issuing new University-wide guidance shortly."


OpenAI Wants AI to Help Humans Train AI


One of the key ingredients that made ChatGPT a runaway success was the army of human trainers who gave the artificial intelligence model behind the bot guidance on what constitutes good and bad outputs. OpenAI now says that adding even more AI into the mix—to help assist human trainers—could help make AI helpers smarter and more reliable.

In developing ChatGPT, OpenAI pioneered the use of reinforcement learning from human feedback, or RLHF. This technique uses input from human testers to fine-tune an AI model so that its output is judged to be more coherent, less objectionable, and more accurate. The ratings the trainers give feed into an algorithm that drives the model’s behavior. The technique has proven crucial both to making chatbots more reliable and useful and to preventing them from misbehaving.
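In outline, that pipeline looks something like the toy sketch below: human rankings train a reward model, and the reward model's scores then steer the chat model. This is a conceptual illustration only, not OpenAI's implementation.

```python
# Toy illustration of the RLHF idea: learn a reward from human preferences,
# then fine-tune the policy against it. Nothing here is OpenAI code.
from dataclasses import dataclass

@dataclass
class Comparison:
    prompt: str
    preferred: str  # the answer the human trainer ranked higher
    rejected: str   # the answer the trainer ranked lower

def train_reward_model(comparisons: list[Comparison]):
    # The real pipeline fits a neural network so that, for each comparison,
    # reward(prompt, preferred) > reward(prompt, rejected).
    def reward(prompt: str, output: str) -> float:
        return float(len(output) < 80)  # toy stand-in scoring function
    return reward

reward = train_reward_model(
    [Comparison("What is the capital of France?", "Paris.", "Hard to say...")]
)
# Step 2 (not shown): reinforcement learning nudges the chat model toward
# outputs that the learned reward function scores highly.
print(reward("What is the capital of France?", "Paris."))
```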

“RLHF does work very well, but it has some key limitations,” says Nat McAleese, a researcher at OpenAI involved with the new work. For one thing, human feedback can be inconsistent. For another, it can be difficult for even skilled humans to rate extremely complex outputs, such as sophisticated software code. The process can also optimize a model to produce output that seems convincing rather than actually being accurate.

OpenAI developed a new model by fine-tuning its most powerful offering, GPT-4, to assist human trainers tasked with assessing code. The company found that the new model, dubbed CriticGPT, could catch bugs that humans missed, and that human judges found its critiques of code to be better 63 percent of the time. OpenAI will look at extending the approach to areas beyond code in the future.

“We’re starting work to integrate this technique into our RLHF chat stack,” McAleese says. He notes that the approach is imperfect, since CriticGPT can also make mistakes by hallucinating, but he adds that the technique could help make OpenAI’s models as well as tools like ChatGPT more accurate by reducing errors in human training. He adds that it might also prove crucial in helping AI models become much smarter, because it may allow humans to help train an AI that exceeds their own abilities. “And as models continue to get better and better, we suspect that people will need more help,” McAleese says.

The new technique is one of many now being developed to improve large language models and squeeze more abilities out of them. It is also part of an effort to ensure that AI behaves in acceptable ways even as it becomes more capable.

Earlier this month, Anthropic, a rival to OpenAI founded by ex-OpenAI employees, announced a more capable version of its own chatbot, called Claude, thanks to improvements in the model’s training regimen and the data it is fed. Anthropic and OpenAI have both also recently touted new ways of inspecting AI models to understand how they arrive at their output in order to better prevent unwanted behavior such as deception.

The new technique might help OpenAI train increasingly powerful AI models while ensuring their output is more trustworthy and aligned with human values, especially if the company successfully deploys it in more areas than code. OpenAI has said that it is training its next major AI model, and the company is evidently keen to show that it is serious about ensuring that it behaves. This follows the dissolution of a prominent team dedicated to assessing the long-term risks posed by AI. The team was co-led by Ilya Sutskever, a cofounder of the company and former board member who briefly pushed CEO Sam Altman out of the company before recanting and helping him regain control. Several members of that team have since criticized the company for moving riskily as it rushes to develop and commercialize powerful AI algorithms.

Dylan Hadfield-Menell, a professor at MIT who researches ways to align AI, says the idea of having AI models help train more powerful ones has been kicking around for a while. “This is a pretty natural development,” he says.

Hadfield-Menell notes that the researchers who originally developed techniques used for RLHF discussed related ideas several years ago. He says it remains to be seen how generally applicable and powerful it is. “It might lead to big jumps in individual capabilities, and it might be a stepping stone towards sort of more effective feedback in the long run,” he says.



Ex-OpenAI employee writes AI essay: war with China, resources, and robots

The race for AGI has begun, but according to Leopold Aschenbrenner, the race for the necessary resources and power is even more important.


By Eva-Maria Weiß

Leopold Aschenbrenner looks a bit like the villain of a Bravo photo love story from the 90s: blond quiff, broad smile. But the scientist, who originally comes from Germany, is above all a high-flyer: according to his LinkedIn profile, Aschenbrenner graduated from Columbia University in the USA at 19, top of his class. After a few jobs in research and investment, Aschenbrenner joined OpenAI in 2023 in the Superalignment department – the team that deals with the security of superintelligence. One and a half years and a few squabbles at the AI company later, Aschenbrenner was let go.

Shortly afterward, the economist published a 165-page essay that is currently attracting a lot of attention. Aschenbrenner is apparently one of those people who can be described as AI ultras. He assumes, for example, that in just one or two years, AI will be smarter than most college graduates. By the end of the decade, he says, AI will be smarter than all of us, including Aschenbrenner himself. "We will have a superintelligence, in the true meaning of that word," he writes, for example. It also sounds a little dubious when he writes about "The Project", which he capitalizes. This project could either lead to a race with the Chinese Communist Party or to "total war".

Reasons for Aschenbrenner's dismissal from OpenAI

This concern is also said to have been one of the reasons why Aschenbrenner was kicked out of OpenAI. In a podcast, he said that there had been a security incident at OpenAI, after which he compiled a memo that he claims to have shared with some members of the board. The memo was about the possible theft of "important algorithmic secrets from foreign actors", writes Business Insider. The HR department described worrying about espionage by the Chinese Communist Party as "racist and unconstructive" and warned him. In addition, according to Aschenbrenner, he was accused of passing on confidential information. This allegedly involved a brainstorming document that he shared with scientists to get their opinion.

That willingness to share is probably also what makes the essay exciting. Aschenbrenner is in contact with various tech entrepreneurs, employees, and investors in San Francisco and Silicon Valley. In his essay, you can read how some of these people see the future, what they talk about – and what mindset prevails among them.

Artificial intelligence: what comes after ChatGPT?

While some scientists assume that it will be a long time before AI is truly intelligent and reliable, Aschenbrenner believes that the next breakthrough is just around the corner. This is despite the fact that he himself writes that "the magic of deep learning is that it simply works". Other experts call this magic a "black box" because no one is quite sure how and why it works so well. And that's why it's also clear to many that progress will not necessarily be linear. Meta's head of AI, Naila Murray, for example, has said that she sees clear limits to large language models and that scaling does not reproduce the kind of intelligence that we humans have. And even if AI continues to make great progress, there are limiting factors: energy and data.

Where does the energy for AI come from?

Aschenbrenner is also clear when it comes to the resources that AI needs. He believes that by the end of the decade, individual training clusters will consume more than 20 percent of all the electricity generated in the USA. Nvidia's soaring share price and sales are just the beginning, he says. "Trillions of dollars of investment will produce hundreds of millions of GPUs per year", the essay says. "'Where can I find 10GW?' (power for the $100B+, trending 2028 cluster) is a popular topic of conversation in SF," writes Aschenbrenner, who calculates the demand and investments in detail: "Surprisingly, even 100GW clusters are easy to implement." Aschenbrenner is relying on natural gas, saying that only around 1,200 new boreholes would be needed. For comparison: according to EnBW, the total net output of all generation plants in Germany is around 252.8 gigawatts (GW).
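A quick back-of-the-envelope check puts those figures in context; the arithmetic below is ours, not Aschenbrenner's, and simply converts continuous power draw into annual energy consumption.

```python
# Back-of-the-envelope: continuous power draw -> annual energy consumption.
# Our arithmetic, illustrating the scale of the figures quoted above.
HOURS_PER_YEAR = 24 * 365  # 8,760

for cluster_gw in (10, 100):
    twh_per_year = cluster_gw * HOURS_PER_YEAR / 1_000  # GWh -> TWh
    print(f"{cluster_gw} GW running year-round ≈ {twh_per_year:,.0f} TWh per year")

# For scale, the text puts Germany's total net generation capacity at
# ~252.8 GW, so a single 100 GW cluster would tie up well over a third of it.
```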


Aschenbrenner's calculations also assume that OpenAI's revenue keeps climbing. While the company was taking in around one billion US dollars on an annualized basis in August 2023, that figure had already reached two billion US dollars by February 2024. Aschenbrenner therefore expects ten billion US dollars by the end of 2024 or early 2025 – and that, he writes, without a new "next generation model".

Bill Gates also recently commented on the subject of energy. His calculations are likewise typical of Silicon Valley. Gates believes that AI will recoup what it consumes: AI algorithms will lead to savings through better calculations and progress. OpenAI's CEO Sam Altman is already investing in a start-up that relies on fusion power plants. There are said to be talks with OpenAI about contracts to purchase this energy, which does not yet exist. Helion has already signed contracts with Microsoft. The power plant in the state of Washington is expected to be up and running by 2028.

According to his own website, Aschenbrenner himself has now founded an investment company that, of course, specializes in AI. He may therefore also benefit if the hype continues and expectations remain high.

Incidentally, in Aschenbrenner's vision, the data that AI needs comes from robots. Built by robots, in factories created by robots.


Meet CriticGPT, OpenAI's new fix for ChatGPT's biggest coding blunders


ChatGPT might be able to write lines of code, but it's often full of errors.

OpenAI thinks it's found a solution to the problem: CriticGPT.

The new model is designed to help AI trainers get sharper in identifying ChatGPT's coding mistakes.

OpenAI is starting to get serious about the hunt for bugs lurking in code generated by ChatGPT.

Since its release, the AI chatbot has impressed the developer community with its ability to produce code in programming languages such as Python and Ruby. Yet it's also given developers reason to be dubious: code produced by ChatGPT is often full of mistakes.

A study published in August 2023 by researchers at Purdue University found ChatGPT's responses to questions about code in the developer forum Stack Overflow were wrong 52% of the time, after assessing it for "correctness, consistency, comprehensiveness and conciseness."

What's worse, the researchers found, was that the mistakes were often tough to identify: ChatGPT's seemingly "articulate" responses made it difficult to spot the errors.

OpenAI seems to have recognized the problem, and is responding with a new solution: CriticGPT.

The new model, revealed by the startup last week, has been built to "catch errors in ChatGPT's code output." In OpenAI's telling, the tool based on its GPT-4 model sounds like it could be a huge help to developers.

"We found that when people get help from CriticGPT to review ChatGPT code they outperform those without help 60% of the time," the company said in a blog post.

'Fewer hallucinated bugs'

To begin, OpenAI is limiting access to CriticGPT to AI trainers. In practice, that means humans whose job involves reviewing answers produced by ChatGPT — through reinforcement learning from human feedback (RLHF) — will get assistance from CriticGPT to assess for accuracy.

OpenAI says CriticGPT will offer humans AI that "augments their skills, resulting in more comprehensive critiques than when people work alone" in reviewing ChatGPT answers and "fewer hallucinated bugs than when the model works alone."

Their hope is that this will shore up a review process that is becoming increasingly tricky. OpenAI has acknowledged that the more advanced its AI models become, the harder it can be for AI trainers "to spot inaccuracies when they do occur."

The bigger problem this can lead to, the company said, is increased difficulty in aligning models with their intended objectives "as they gradually become more knowledgeable than any person that could provide feedback."

A future where models produced by AI companies actually become more knowledgeable than a person using them is not yet in sight, but AI researchers focused on safety have been busy thinking about how to keep such models in check to ensure they pose no threats.

Such researchers include Jan Leike, who quit OpenAI in May over safety concerns and happened to be one of several coauthors on the paper explaining how CriticGPT works.

Very exciting that this is out now (from my time at OpenAI): We trained an LLM critic to find bugs in code, and this helps humans find flaws on real-world production tasks that they would have missed otherwise. A promising sign for scalable oversight! https://t.co/e6CiXXoCeG pic.twitter.com/EJ6OSfUN9p — Jan Leike (@janleike) June 27, 2024

OpenAI admits there are some limitations to CriticGPT.

For now, it's only handling answers from ChatGPT "that are quite short." AI models are also still susceptible to hallucinations that can be missed by AI trainers, OpenAI said.

Still, Sam Altman's company seems keen to boost its chatbot's coding chops by trying to catch its errors. CriticGPT clearly has a long way to go, but shows that OpenAI is at least trying to tackle the problem.


LLMs now write lots of science. Good

Easier and more lucid writing will make science faster and better


Many people are busily experimenting with chatbots in the hope that generative artificial intelligence (AI) can improve their daily lives. Scientists, brainy as they are, are several steps ahead. As we report, 10% or more of abstracts for papers in scientific journals now appear to be written at least in part by large language models. In fields such as computer science that figure rises to 20%. Among Chinese computer scientists, it is a third.

Some see this enthusiastic adoption as a mistake. They fear that vast quantities of poor-quality papers will introduce biases, boost plagiarism and jam the machinery of scientific publication. Some journals, including the Science family, are imposing onerous disclosure requirements on the use of LLMs. Such attempts are futile and misguided. LLMs cannot easily be policed. Even if they could be, many scientists find that their use brings real benefits.


Research scientists are not just devoted to laboratory work or thinking big thoughts. They face great demands on their time, from writing papers and teaching to filling out endless grant applications. LLMs help by speeding up the writing of papers, thereby freeing up time for scientists to develop new ideas, collaborate or check for mistakes in their work.

The technology can also help level a playing-field that is tilted towards native English speakers, because many of the prestigious journals are in their tongue. LLMs can help those who do not speak the language well to translate and edit their text. Thanks to LLMs, scientists everywhere should be able to disseminate their findings more easily, and be judged by the brilliance of their ideas and the ingenuity of their research, rather than their skill in avoiding dangling modifiers.

As with any technology, there are worries. Because LLMs make it easier to produce professional-sounding text, they will make it easier to generate bogus scientific papers. Science received 10,444 submissions last year, of which 83% were rejected before peer review. Some of these are bound to have been AI-generated fantasies.

LLMs could also export, through their words, the cultural environment in which they were trained. Their lack of imagination may spur inadvertent plagiarism, in which they directly copy past work by humans. “Hallucinations” that are obviously wrong to experts, but very believable to everyone else, could also make their way into the text. And most worrying of all, writing can be an integral part of the research process, by helping researchers clarify and formulate their own ideas. An excessive reliance on LLMs could therefore make science poorer.

Trying to restrict the use of LLMs is not the way to deal with these problems. In the future they are rapidly going to become more prevalent and more powerful. They are already embedded in word processors and other software, and will soon be as common as spell-checkers. Researchers tell surveys that they see the benefits of generative AI not just for writing papers but for coding and doing administrative tasks. And crucially, their use cannot easily be detected. Although journals can impose all the burdensome disclosure requirements they like, it would not help, because they cannot tell when their rules have been broken. Journals such as Science should abandon detailed disclosures for the use of LLMs as a writing tool, beyond a simple acknowledgment.

Science already has many defences against fabrication and plagiarism. In a world where the cost of producing words falls to nothing, these must become stronger still. Peer review, for instance, will become even more important in a gen-AI world. It must be beefed up accordingly, perhaps by paying reviewers for the time they sacrifice to scrutinise papers. There should also be more incentives for researchers to replicate experiments. Hiring and promotion committees at universities should ensure that scientists are rewarded based on the quality of their work and the quantity of new insights they generate. Curb the potential for misuse, and scientists have plenty to gain from their LLM amanuenses. ■


This article appeared in the Leaders section of the print edition under the headline “Can you make this clearer?”


Has Microsoft's AI chief just made Windows free?


Microsoft's AI chief suggested anything found online is freeware

Content that has been posted on the open web should be treated as “freeware”, according to Microsoft’s AI chief. That being the case, he appears to have just ripped up the licensing agreement for software such as Microsoft Windows and Office.

Mustafa Suleyman, the CEO of Microsoft AI since March this year, made his eyebrow-raising comments during an interview with CNBC. Asked if the training of AI models on internet content was tantamount to intellectual property theft, Suleyman made the argument that anything posted on the web was fair game.

"I think that with respect to content that's already on the open web, the social contract of that content since the nineties has been that it is fair use,” said Suleyman. “Anyone can copy it, recreate with it, reproduce with it. That has been freeware, if you like, that's been the understanding."

Windows Licensing Terms

If that is “the understanding,” Microsoft’s licensing department seems to have a very different one when it comes to many of the products it posts on the open web.

For example, you can download the Windows 11 operating system on the open web from the Microsoft website. However, Microsoft is very protective of its intellectual property, as it makes clear in its terms of use, which are linked to at the bottom of the download site.


In fact, those terms include an FAQ on copyright, which directly contradicts the statement made by Suleyman in his CNBC interview. “If a work is in the public domain, the work may be freely used without permission from the creator of the work,” the FAQs state. “However, just because a work is available online does not mean it's in the public domain or free to use.”

As for the notion that you could “copy it, recreate with it, reproduce with it,” that is at odds with the Windows licensing agreement, which expressly states you must not “publish, copy (other than the permitted backup copy), rent, lease, or lend the software,” nor “work around any technical restrictions or limitations in the software.”

Microsoft seemingly thinks you’re free to do what you like with content you find on the web, unless it’s Microsoft’s content.

Microsoft has been approached for comment.

Copyright Law

You might argue this point is somewhat facile, that there’s a clear distinction between the type of written or image content you might use to train an AI model and software that is being sold commercially.

However, U.S. copyright law makes no such distinction. As the FAQ page for the U.S. Copyright Office states: “Copyright, a form of intellectual property law, protects original works of authorship including literary, dramatic, musical, and artistic works, such as poetry, novels, movies, songs, computer software, and architecture.”

Nor does publishing it online automatically invalidate copyright law. “Your work is under copyright protection the moment it is created and fixed in a tangible form that it is perceptible either directly or with the aid of a machine or device,” the FAQ further states.

This is, of course, the reason why many AI companies are facing lawsuits for scraping data from the open web to train their large language models. In December, The New York Times announced it was suing ChatGPT creator OpenAI and Microsoft (which uses OpenAI’s products to power its own AI offerings) for “billions of dollars” in damages for unlawful use of its content. Other lawsuits have followed.

So, it seems we will get to find out whether the “social contract” cited by Microsoft’s AI chief actually exists. In the meantime, it’s probably best to avoid doing what you like with Windows, or else you might find yourself subject to a lawsuit of your own.

Barry Collins


OpenAI's new tool says it can spot text written by AI

But ChatGPT maker warns the tool isn't that reliable yet


OpenAI has announced a new tool that it says can tell the difference between text written by a human and that of an AI writer - some of the time.

The Microsoft-backed company says the new classifier, as it is called, has been developed to combat the malicious use of AI content generators, such as its very own and very popular ChatGPT, in "running automated misinformation campaigns, … academic dishonesty, and positioning an AI chatbot as a human."

So far, it claims that the classifier has a success rate of 26% in identifying AI-generated content, correctly labelling it as 'likely AI-written', and a 9% false positive rate, mislabelling the work of humans as being artificially created.
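What those two rates mean in practice depends heavily on how much of the checked text is actually AI-written, which a quick Bayes-style calculation makes clear (our arithmetic, not OpenAI's):

```python
# P(text is AI | flagged "likely AI-written"), from the detection rate (26%),
# the false-positive rate (9%), and an assumed share of AI-written texts.
def flag_precision(tpr: float, fpr: float, ai_share: float) -> float:
    flagged_ai = tpr * ai_share
    flagged_human = fpr * (1 - ai_share)
    return flagged_ai / (flagged_ai + flagged_human)

print(flag_precision(0.26, 0.09, 0.5))  # ~0.74 if half the texts are AI
print(flag_precision(0.26, 0.09, 0.1))  # ~0.24: mostly false alarms at a 10% base rate
```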


Spot the difference

OpenAI notes that the classifier performs better the longer the text, and that compared to previous versions, the newer version is "significantly more reliable" at detecting autogenerated text from more recent AI tools.

The classifier is now publicly available, and OpenAI will use the feedback it gets to determine the usefulness of it and to help improve further developments of AI detection tools going forward. 

OpenAI is keen to point out that the classifier has its limitations and should not be relied upon as a "primary decision-making tool", a sentiment shared by most involved in all fields of AI.

As mentioned, the length of the text is important for the classifier's success, with OpenAI stating that it is "very unreliable" on pieces with fewer than a thousand characters.
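Given that caveat, any tool built on the classifier would sensibly refuse to score short inputs at all, along these lines (the threshold comes from OpenAI's statement; classify() is a hypothetical stand-in, not a real OpenAI API):

```python
# Guard against the classifier's known weakness on short texts.
MIN_CHARS = 1_000  # per OpenAI's "very unreliable" threshold quoted above

def classify(text: str) -> str:
    return "likely AI-written"  # hypothetical stand-in for the real classifier

def classify_if_long_enough(text: str) -> str:
    if len(text) < MIN_CHARS:
        return "too short to classify reliably"
    return classify(text)
```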


Even longer texts can be incorrectly identified, and human-written content can be "incorrectly but confidently labeled as AI-written". It also performs worse on text written in non-English languages, as well as on computer code.

Predictable text where the content can only realistically be written one way is also unable to be labelled reliably, such as a list of the first one thousand prime numbers, to give OpenAI's example.

What's more, OpenAI points out that AI text can be edited to fool the classifier, and although the classifier can be updated and learn from being tricked like this, interestingly, the company says it is "unclear whether detection has an advantage in the long-term."

Text that is also very different from that which it has been trained on can cause the classifier issues too, with it "sometimes [being] extremely confident in a wrong prediction."


As for training data, OpenAI says that it used pairs of written text on the same topic, one AI-produced and the other believed to be written by a human – some gathered from human responses to prompts used to train InstructGPT, the AI model from the company that is primarily used by researchers and developers.

The development of the classifier comes amid numerous concerns and debates surrounding the use of AI chatbots, such as OpenAI's own ChatGPT, in academic institutions such as high schools and universities.

Accusations of cheating are mounting, as students are using the chatbot to write their assignments for them. Essay submission platform Turnitin has even developed its own AI-writing detection system in response. 

OpenAI acknowledges this fact, and has even produced its own set of guidelines for educators to understand the uses and limitations of ChatGPT. It hopes its new classifier will not only be of benefit to these institutions, but also to "journalists, mis/dis-information researchers, and other groups."

The company wants to engage with educators to hear about their experiences with ChatGPT in the classroom, and they can use this form to submit feedback to OpenAI. 

AI writing tools have been causing a stir elsewhere too. Tech site CNET recently came under fire for using an AI tool to write articles as part of an experiment, but was accused of failing to distinguish these articles from those written by actual people. Such articles were also found to contain some basic factual errors.



What Is ChatGPT, and How Does It Make Money?

The popular chatbot has become a symbol of the promises, perils, and potential profits of artificial intelligence


ChatGPT, the free chatbot released in November 2022 by the artificial intelligence (AI) research company OpenAI, has taken the internet by the proverbial storm. In its first years of existence, ChatGPT has been used to write essays and articles, summarize long texts, explain complicated arguments, write code, translate text, make workout plans, and even create bedtime stories for children.

Some artificial intelligence experts believe that ChatGPT could revolutionize both the way that humans interact with chatbots and AI more broadly. Others have expressed serious concerns about the technology's potential for harm.

Key Takeaways

  • ChatGPT is a chatbot released in November 2022 by OpenAI, an artificial intelligence research company.
  • Chatbots are AI systems that engage in spoken or written dialogue and are commonly used in the customer service sections of company websites as well as virtual assistants like Siri or Alexa.
  • The "GPT" in ChatGPT stands for "Generative Pre-trained Transformer," referring to the way it processes language.
  • A free version of ChatGPT is available on OpenAI's website, and the company also sells a more advanced version on a subscription basis.
  • ChatGPT generates billions in annualized revenues but continues to be loss-making.

ChatGPT is an AI model that engages in conversational dialogue. It is an example of a chatbot , akin to the automated chat services found on some companies' customer service websites.

The technology was developed by OpenAI, a tech research company that has said it is dedicated to ensuring "that artificial general intelligence benefits all of humanity." The "GPT" in ChatGPT stands for "generative pre-trained transformer," referring to the way that ChatGPT processes language.

What sets ChatGPT apart from chatbots of the past is that ChatGPT was trained using reinforcement learning from human feedback (RLHF). RLHF involves the use of human AI trainers and reward models to develop ChatGPT into a bot capable of challenging incorrect assumptions, answering follow-up questions, and admitting mistakes.


To put ChatGPT to the test, Investopedia asked it to "write a journalistic-style article explaining what ChatGPT is." The bot responded that it was "designed to generate human-like text based on a given prompt or conversation." It added that, because it is trained on a data set of human conversations, it can understand context and intent and is able to have more natural, intuitive conversations.

In its response to our prompt, ChatGPT said that its applications could include customer service bots, the creation of content for social media or blogs, and the translation of text from one language to another.

OpenAI launched as a nonprofit in 2015, relying on donations for funding because its founders didn't like the idea of artificial intelligence being used for commercial purposes. Over time, however, the company said "it became increasingly clear that donations alone would not scale with the cost of computational power and talent required to push core research forward, jeopardizing our mission." So, in 2019, it created a for-profit subsidiary overseen by the nonprofit's board of directors.

A lot has changed since then. OpenAI's sales have been rocketing and were reported to have surpassed $2 billion on an annualized basis in December 2023 off the back of the soaring popularity of ChatGPT. This figure is expected to continue soaring in the future as businesses flock to use OpenAI’s technology to adopt generative AI tools in the workplace.

A free version of ChatGPT remains available on OpenAI's website. In addition, the company offers a premium version, ChatGPT Plus, with a subscription price of $20 per month. The free version comes with limited access to data analysis, file uploads, vision, web browsing, custom GPTs, and the latest model, GPT-4o. And management currently has no plans to make money off non-paying users through selling ads.

In addition, the company sells its application programming interface on a subscription basis to organizations looking to use the model for their own purposes, according to Investopedia's query to ChatGPT itself.


Despite rising substantially, revenues haven't been enough to cover expenses. ChatGPT continues to be loss-making because of the huge costs associated with building and running its models. And with continued investment needed to develop more sophisticated models in the future, this isn't expected to change anytime soon.

That means the company will continue to need to raise more money. OpenAI's biggest backer Microsoft has pledged up to $13 billion of funding. There have also been stock sales, the most recent of which valued OpenAI at $86 billion.

ChatGPT may be the leader in its field, but it is not without competitors, some with vast resources behind them, including Amazon, Google, and Meta.

While chatbots have existed for some years, ChatGPT is viewed as a significant improvement on the intelligibility, fluidity, and thoroughness of prior models.

As mentioned, ChatGPT has numerous potential uses. They range from relatively direct, chatbot-type functions to much more obscure applications, and it is likely that users will explore a host of other possible ways to utilize this technology in the future. Recent additions include the ability to upload photos for the bot to analyze and real-time language translations.

One demonstration of the sophistication of ChatGPT provided by OpenAI includes a prompt that was designed to trick the bot: asking about when Christopher Columbus (supposedly) came to the United States in 2015. ChatGPT's response easily avoided the trap, clarifying that while Columbus did not come to the U.S. in 2015, it could posit some of the ways he might have reacted to the visit had he made it.

OpenAI itself lists some of the limitations of ChatGPT as it currently exists. They include ChatGPT sometimes writing coherent but incorrect statements, making assumptions about ambiguous queries, and tending to be excessively verbose.

In the first weeks of its public release, ChatGPT made headlines for its alleged use among students in creating AI-written papers and other assignments. Concerns about the misuse of ChatGPT for academic cheating grew large enough that a computer science student at Princeton University created an app designed to identify and expose writing created by the bot.

In September 2023, the Authors Guild and more than a dozen prominent fiction writers—including Jonathan Franzen, John Grisham, George R.R. Martin, Jodi Picoult, George Saunders, and Scott Turow—filed a lawsuit charging OpenAI with copyright infringement. It alleged that in order to train ChatGPT, the company had copied the authors' works "wholesale, without permission or consideration." Perhaps more ominously, the suit pointed to the potential for AI to "spit out derivative works," mimicking and competing with the human authors and potentially depriving them of their livelihood.

Several newspapers have also sued OpenAI and Microsoft for copyright infringement.

To some observers, ChatGPT poses additional and potentially even more serious risks. For instance, analysts have predicted that the bot could be used to make malware and phishing attacks more sophisticated, or that hackers may utilize the technology to develop their own AI models that may be less well-controlled. As concerns about misinformation have proliferated, some are especially sensitive to the possibility that ChatGPT could be used to create and share convincing but misleading material of a political nature.

Worse still are concerns that AI might ultimately take on a life of its own, leading even to human extinction—a worry shared by many legitimate scientists and industry experts, not simply a crackpot fringe.

OpenAI's reputation also hasn't been helped by inner turmoil, fueled in part by a clash between nonprofit and for-profit values.

In a fast-moving chain of events characterized by The New York Times as "a fight between dueling visions of artificial intelligence," OpenAI's board fired Sam Altman, the company's CEO and its most public face, on Nov. 17, 2023, only to restore him to the position less than a week later.

"Mr. Altman's departure follows a deliberative review process by the board, which concluded that he was not consistently candid in his communications with the board, hindering its ability to exercise its responsibilities," the company said in a statement on Nov. 17. "The board no longer has confidence in his ability to continue leading OpenAI."

Within a matter of hours, Altman appeared to have been hired by Microsoft, a major investor in OpenAI, and virtually the entire staff of OpenAI threatened to resign unless he was rehired. The board relented on Nov. 22, and all but one of its members left the nonprofit. Altman was officially reinstated as OpenAI's CEO a week later.

The upshot of the turmoil, according to many industry commentators, was that the subsidiary's profit motive had won out over the nonprofit board's high-minded mission of serving "all of humanity."

What Makes ChatGPT Stand Out?

ChatGPT's functionality, including its ability to understand context and detail and to recognize when it has made mistakes, is seen as standing out among rivals.

What Is a Large Language Model (LLM)?

A large language model (LLM) is a way of training artificial intelligence tools like ChatGPT. Google, which is deeply involved in the development of AI, defines it as "a statistical language model, trained on a massive amount of data, that can be used to generate and translate text and other content, and perform other natural language processing (NLP) tasks."

What Is ChatGPT Used For?

ChatGPT is primarily used for natural language understanding and generation, making it valuable for tasks like content creation, chatbot development, language translation, and more. It can be applied to a wide variety of tasks, and its value largely depends on how each user chooses to use it.

ChatGPT is the leading chatbot technology available today. But, not surprisingly, given all of the money likely to be made from artificial intelligence, numerous competitors have developed alternatives to challenge its dominance.



OpenAI develops AI model to critique its AI models

When your chatbots outshine their human trainers, you could pay for expertise ... or just augment your crowdsourced workforce.

To help catch code errors made by ChatGPT, OpenAI uses human AI trainers in the hope of improving the model. To help the human trainers, OpenAI has developed another AI model called CriticGPT – in case the humans don't spot the mistakes.

The Microsoft-championed super lab on Thursday issued a paper [PDF] titled, "LLM Critics Help Catch LLM Bugs," that explains the approach.

Generative AI models like GPT-4o get trained on massive amounts of data and then go through a refinement process called Reinforcement Learning from Human Feedback (RLHF).

This commonly involves human workers, often hired through crowdsourcing platforms, interacting with models and annotating their responses to various questions. When Time Magazine looked into this last year, it found OpenAI using Kenyan workers paid less than $2 per hour to improve its models.


The goal is to teach the model which answer is preferred, so it performs better. But RLHF becomes less effective as models become more capable. Human AI trainers find it harder to identify flawed answers, particularly when the chatbot reaches the point that it knows more than its teachers.

So as an aid to the people tasked with providing feedback to make its models more capable of generating programming code, OpenAI created another model – to critique those generative responses.

"We've trained a model, based on GPT-4, called CriticGPT, to catch errors in ChatGPT's code output," the AI startup explained in a blog post . "We found that when people get help from CriticGPT to review ChatGPT code they outperform those without help 60 percent of the time."

[Screenshot: diagram from OpenAI's paper, "LLM Critics Help Catch LLM Bugs"]

In other words, this isn't an autonomous feedback loop from one chatbot to another – it's a way to augment the knowledge of those administering reinforcement learning.

This approach apparently leads to better results than just relying on crowdsourced workers, who at $2 per hour (or whatever the prevailing annotation rate happens to be) probably aren't computer science professors or trenchant technical writers.


According to the paper, the results show "that LLMs catch substantially more inserted bugs than qualified humans paid for code review, and further that model critiques are preferred over human critiques more than 80 percent of the time."

The finding that CriticGPT enables AI trainers to write better model response critiques isn't entirely surprising. Mediocre office temps presumably would write better crafted email messages with the help of generative AI too.

But AI help comes with a cost. When human contractors work in conjunction with CriticGPT, the resulting critiques of ChatGPT responses have a lower rate of hallucinations (invented bugs) than CriticGPT responses alone – but that error rate is still higher than if a human AI trainer had been left to respond without AI assistance.

"Unfortunately, it's not obvious what the right tradeoff between hallucinations and bug detection is for an overall RLHF system that uses critiques to enhance model performance," the paper concedes. ®

And speaking of Microsoft-backed things, a study has demonstrated that the Windows giant's Bing translation and web search engine in China censors more aggressively than its Chinese competitors. 谢谢, Redmond!


Critic’s Notebook

The Voices of A.I. Are Telling Us a Lot

Even as the technology advances, stubborn stereotypes about women are re-encoded again and again.

Credit: Illustration by Petra Péterffy


By Amanda Hess

Amanda Hess is a critic at large who writes about internet culture.

June 28, 2024

What does artificial intelligence sound like? Hollywood has been imagining it for decades. Now A.I. developers are cribbing from the movies, crafting voices for real machines based on dated cinematic fantasies of how machines should talk.

Last month, OpenAI revealed upgrades to its artificially intelligent chatbot. ChatGPT, the company said, was learning how to hear, see and converse in a naturalistic voice — one that sounded much like the disembodied operating system voiced by Scarlett Johansson in the 2013 Spike Jonze movie “Her.”

ChatGPT’s voice, called Sky, also had a husky timbre, a soothing affect and a sexy edge. She was agreeable and self-effacing; she sounded like she was game for anything. After Sky’s debut, Johansson expressed displeasure at the “eerily similar” sound, and said that she had previously declined OpenAI’s request that she voice the bot. The company protested that Sky was voiced by a “different professional actress,” but agreed to pause her voice in deference to Johansson. Bereft OpenAI users have started a petition to bring her back.

Voices compared in the article's audio clips:

  • The A.I. operating system in the film “Her,” voiced by the actress Scarlett Johansson.
  • Sky, a voice of OpenAI’s chatbot, ChatGPT, paused by the company in May.
  • Enterprise Computer, the onboard starship computer in the original “Star Trek” series, voiced by the actress Majel Barrett-Roddenberry.
  • Siri, the A.I. virtual assistant on Apple devices (iOS 9).
  • HAL 9000, the A.I. computer in “2001: A Space Odyssey,” voiced by the actor Douglas Rain.
  • A voice generated by TikTok’s text-to-speech feature.



News nonprofit alleges copyright infringement in lawsuit against OpenAI, Microsoft

The Center for Investigative Reporting has filed a lawsuit against OpenAI and Microsoft alleging copyright infringement in a new fight against unauthorized use of news content in building artificial intelligence (AI).

The news nonprofit, which produces Mother Jones and Reveal, alleges in the lawsuit that OpenAI, the creator of the popular ChatGPT tool, used its content without permission or offering compensation.

OpenAI’s platforms are trained on human works – in particular, human-made journalism – to “attempt to mimic how humans write and speak,” the lawsuit said.

Filed in the Southern District of New York, the suit argues that the company competes for consumers’ attention to earn profit. It said OpenAI has used “hundreds of thousands, if not millions” of journalistic articles, undermining the Center for Investigative Reporting’s standing with the public.

Monika Bauerlein, the nonprofit’s CEO, told The Associated Press that “it’s immensely dangerous.”

“Our existence relies on users finding our work valuable and deciding to support it,” she said, voicing concerns that users will now create a relationship with the AI tool instead.

The AI companies are battling other copyright lawsuits from various news organizations that also argue their work has been used without permission to “fuel the commercialization” of AI.

The AP noted that instead of battling the new AI wave, some news organizations are partnering with AI companies. For example, Time magazine announced Thursday that OpenAI will receive access to its archives from the past 100 years.

The Hill has reached out to OpenAI and Microsoft for comment.

