The Creative Musical Achievement of AI Systems Compared to Music Students: A Replication of the Study by Schreiber et al. (2024)
Die kreativen musikalischen Leistungen von KI-Systemen im Vergleich zu Musikstudierenden: Eine Replikation der Studie von Schreiber et al. (2024)

Nicholas Meier
Kilian Sander
Anton Schreiber
Reinhard Kopiez

Abstract

Although AI systems have made significant progress over the last two years in
generating cultural products such as literature, poems, or music, it remains unclear
whether the aesthetic quality of these products increases in tandem with the
performance gains of the underlying large language models (LLMs). We replicated the study
by Schreiber et al. (2024) to test whether the creative performance of selected LLMs had improved
over the past two years in the musical domain. In an online rating experiment based on a melody
continuation paradigm, 75 melodic continuations generated by the AI systems Qwen 2 (Version 72B
Instruct), Llama 3 (Version 70B Instruct), and ChatGPT (Version 4) were compared to 23 solutions
composed by humans. The aesthetic quality of the sound examples was then evaluated by N = 54
listeners (music students) using four criteria (convincing, logical and meaningful, interesting, and
liking). First, human-based creative solutions outperformed all three AI
systems on all four dependent variables (large effect sizes, 1.11 ≤ d_z ≤ 2.51), confirming the
finding of Schreiber et al. (2024). Second, listeners showed a meaningful mean
discrimination sensitivity of d′ = 1.09 between AI- and human-based solutions. We conclude that merely
increasing the volume of training of AI systems does not guarantee a corresponding improvement in
the creative musical output produced under controlled conditions.
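
For readers unfamiliar with the two statistics reported above, the following is a minimal sketch of how such values are conventionally computed. It assumes the standard equal-variance signal detection definition of the discrimination sensitivity (z-transformed hit rate minus z-transformed false-alarm rate) and the usual paired-samples definition of the effect size (mean difference divided by the standard deviation of the differences). The abstract does not state the exact procedure used in the study, and the function names and example numbers are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def d_prime(hit_rate: float, false_alarm_rate: float) -> float:
    """Discrimination sensitivity under the equal-variance signal detection model.

    Assumption: a "hit" is an AI-generated melody judged as AI-generated and a
    "false alarm" is a human-composed melody judged as AI-generated. Rates of
    exactly 0 or 1 must be corrected beforehand, because their z-scores are
    infinite.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)


def cohens_dz(condition_a: list[float], condition_b: list[float]) -> float:
    """Effect size for paired ratings: mean difference / SD of the differences."""
    diffs = [a - b for a, b in zip(condition_a, condition_b)]
    return mean(diffs) / stdev(diffs)


# Hypothetical illustration only; these are not data from the study.
sensitivity = d_prime(hit_rate=0.70, false_alarm_rate=0.30)
effect = cohens_dz([3.0, 4.0, 5.0], [1.0, 1.0, 3.0])
print(f"d' = {sensitivity:.2f}, d_z = {effect:.2f}")
```

Under this definition, a sensitivity of 0 means listeners cannot tell the two sources apart, and larger positive values indicate better discrimination.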

Keywords: artificial intelligence; AI; generative AI; composition; empirical aesthetics; melody rating; musical creativity; large language models

Article Details

Section
Research Reports