The ability of the new version of ChatGPT to interpret images and “tutor” students poses new challenges and opportunities for higher education, according to experts.
With universities still scrambling to understand the impact artificial intelligence (AI) chatbots might have on assessments and research, OpenAI has released GPT-4, just 15 weeks after its previous version caused such a stir around the world.
Although full access to the technology was still restricted, it appeared to represent another “step change” in what generative AI was capable of, according to Thomas Lancaster, a senior teaching fellow in computing at Imperial College London.
In its publicity around the launch, OpenAI highlighted GPT-4’s increased ability to pass popular examinations compared with ChatGPT, claiming it scored 298 out of 400 in the bar exam and 163 in the Law School Admission Test, compared with 213 and 149 with the previous version. The risk of students using AI to write essays or complete exams has already led to many universities banning ChatGPT on campuses.
The reported results were “incredible”, said Dr Lancaster, but it was the ability to process and respond to images that set GPT-4 apart from what came before.
“That means you can take a photo of an exam question with pictures and diagrams and expect GPT-4 to be able to formulate a response.
“That might not be useful in an invigilated exam where students don’t have easy access to GPT-4, but there are all kinds of real-world uses of this technology that we need to be preparing students for.”
Mike Sharples, emeritus professor at the Open University’s Institute of Educational Technology, agreed that the image-interpretation facility presented several potential new uses of generative AI.
“Students will be able to use it to answer complex exam questions containing figures and illustrations. They will also be able to use GPT to summarise entire research papers,” he said.
“It could interpret concept maps, flowcharts, even cuneiform text. It can turn a sketch of a computer interface into a working computer programme.”
Professor Sharples said GPT-4 could also act as a “knowledgeable guide” to students because it could respond to queries in an intellectual way, guiding and shaping their learning.
Education company Khan Academy has already announced that it will use the new technology to power a virtual tutor, and Professor Sharples said this “transformation shouldn’t be underestimated”.
Academics could also use this new feature in their work, he added, and he himself was using GPT-4 to “augment my thinking in preparing talks and papers”.
The fact that GPT-4 can now generate up to 25,000 words in one go was also a “huge opportunity and a huge challenge”, according to Professor Sharples.
“It means that academics, and students, can ask GPT-4 to generate works up to 50 pages in length,” he said, adding that this was a “big challenge to the entire academic publishing process”.
He said he had used GPT-4 to generate an entire “conference paper” on comparisons between college student essays and AI-written versions, giving it only a title and a list of main research questions to address as prompts.
It successfully “concocted a research study, with methods and results, including a table of findings and a statistical analysis of the invented data”, Professor Sharples said, adding that this should force journal editors and conference chairs to rethink their acceptance criteria for papers.
For example, they might now need to require access to the full data gathered by the researcher, which would raise potential privacy issues, he said. They might also need to develop new guidelines that clearly state what is and isn’t acceptable in an author’s use of AI.
Dr Lancaster said the writing ability of GPT-4 also appeared to have advanced, addressing one of the issues identified with ChatGPT, which tended to produce formulaic sentences that could then be picked up by detectors built to identify AI-generated text.
David Cyrus, one of the creators of an AI detector that has emerged recently, AICheatCheck, said the tool had been trained to prepare for GPT-4 and it appeared to work just as well, based on a test of 400 articles.
But Dr Lancaster said GPT-4 could “write much more in the style of an individual author, as well as being able to write longform”, meaning it would be harder for detection systems trained on ChatGPT to function.
“It also means that academics comparing student writing styles between assessments might no longer use that approach to detect academic misconduct,” he added.
“I’d expect to see continual advances in this technology – so, more than ever, we need to work with this in the future, rather than against it. This is an incredibly exciting time to be working in education.”
Postscript
Print headline: ChatGPT-4 brings step change, but also a challenge for universities