Paper Title | Authors |
A Contrastive Self-Supervised Learning scheme for beat tracking amenable to few-shot learning | Antonin Gagneré (LTCI - Télécom Paris, IP Paris)*; Slim Essid (LTCI - Télécom Paris, IP Paris); Geoffroy Peeters (LTCI - Télécom Paris, IP Paris) |
A Critical Survey of Research in Music Genre Recognition | Owen Green (Max Planck Institute for Empirical Aesthetics)*; Bob L. T. Sturm (KTH Royal Institute of Technology); Georgina Born (University College London); Melanie Wald-Fuhrmann (Max Planck Institute for Empirical Aesthetics) |
A Kalman Filter model for synchronization in musical ensembles | Hugo T Carvalho (Federal University of Rio de Janeiro)*; Min Susan Li (University of Birmingham); Massimiliano Di Luca (University of Birmingham); Alan M. Wing (University of Birmingham) |
A Method for MIDI Velocity Estimation for Piano Performance by a U-Net with Attention and FiLM | Hyon Kim (Universitat Pompeu Fabra)*; Xavier Serra (Universitat Pompeu Fabra ) |
A New Dataset, Notation Software, and Representation for Computational Schenkerian Analysis | Stephen Hahn (Duke University)*; Weihan Xu (Duke University); Zirui Yin (Duke University); Rico Zhu (Duke University); Simon Mak (Duke University); Yue Jiang (Duke University); Cynthia Rudin (Duke University) |
A Stem-Agnostic Single-Decoder System for Music Source Separation Beyond Four Stems | Karn N Watcharasupat (Georgia Institute of Technology)*; Alexander Lerch (Georgia Institute of Technology) |
Audio Conditioning for Music Generation via Discrete Bottleneck Features | Simon Rouard (Meta AI Research)*; Alexandre Defossez (Kyutai); Yossi Adi (Facebook AI Research ); Jade Copet (Meta AI Research); Axel Roebel (IRCAM) |
Audio Prompt Adapter: Unleashing Music Editing Abilities for Text-to-Music with Lightweight Finetuning | Fang Duo Tsai (National Taiwan University)*; Shih-Lun Wu (Carnegie Mellon University); Haven Kim (University of California San Diego); Bo-Yu Chen (National Taiwan University, Rhythm Culture Corporation); Hao-Chung Cheng (National Taiwan University); Yi-Hsuan Yang (National Taiwan University) |
Augment, Drop & Swap: Improving Diversity in LLM Captions for Efficient Music-Text Representation Learning | Ilaria Manco (Queen Mary University of London)*; Justin Salamon (Adobe); Oriol Nieto (Adobe) |
Automatic Detection of Moral Values in Music Lyrics | Vjosa Preniqi (Queen Mary University of London)*; Iacopo Ghinassi (Queen Mary University of London); Julia Ive (Queen Mary University of London); Kyriaki Kalimeri (ISI Foundation); Charalampos Saitis (Queen Mary University of London) |
Automatic Estimation of Singing Voice Musical Dynamics | Jyoti Narang (Student)*; Nazif Can Tamer (Universitat Pompeu Fabra); Viviana De La Vega (Escola Superior de Música de Catalunya); Xavier Serra (Universitat Pompeu Fabra ) |
Beat this! Accurate beat tracking without DBN postprocessing | Francesco Foscarin (Johannes Kepler University Linz)*; Jan Schlüter (JKU Linz); Gerhard Widmer (Johannes Kepler University) |
Between the AI and Me: Analysing Listeners' Perspectives on AI- and Human-Composed Progressive Metal Music | Pedro Pereira Sarmento (Centre for Digital Music); Jackson J Loth (Queen Mary University of London)*; Mathieu Barthet (Queen Mary University of London) |
CADENZA: A Generative Framework for Expressive Musical Ideas and Variations | Julian Lenz (Lemonaide ); Anirudh Mani (Lemonaide)* |
Can LLMs "Reason" in Music? An Evaluation of LLMs' Capability of Music Understanding and Generation | Ziya Zhou (HKUST)*; Yuhang Wu (Multimodal Art Projection); Zhiyue Wu (Shenzhen University); Xinyue Zhang (Multimodal Art Projection); Ruibin Yuan (CMU); Yinghao MA (Queen Mary University of London); Lu Wang (Shenzhen University); Emmanouil Benetos (Queen Mary University of London); Wei Xue (The Hong Kong University of Science and Technology); Yike Guo (Hong Kong University of Science and Technology) |
Classical Guitar Duet Separation using GuitarDuets - a Dataset of Real and Synthesized Guitar Recordings | Marios Glytsos (National Technical University of Athens)*; Christos Garoufis (Athena Research Center); Athanasia Zlatintsi (Athena Research Center); Petros Maragos (National Technical University of Athens) |
Cluster and Separate: a GNN Approach to Voice and Staff Prediction for Score Engraving | Francesco Foscarin (Johannes Kepler University Linz)*; Emmanouil Karystinaios (Johannes Kepler University); Eita Nakamura (Kyoto University); Gerhard Widmer (Johannes Kepler University) |
Combining audio control and style transfer using latent diffusion | Nils Demerlé (IRCAM)*; Philippe Esling (IRCAM); Guillaume Doras (Ircam); David Genova (Ircam) |
Composer's Assistant 2: Interactive Multi-Track MIDI Infilling with Fine-Grained User Control | Martin E Malandro (Sam Houston State University)* |
ComposerX: Multi-Agent Music Generation with LLMs | Qixin Deng (University of Rochester); Qikai Yang (University of Illinois at Urbana-Champaign); Ruibin Yuan (CMU)*; Yipeng Huang (Multimodal Art Projection Research Community); Yi Wang (CMU); Xubo Liu (University of Surrey); Zeyue Tian (Hong Kong University of Science and Technology); Jiahao Pan (The Hong Kong University of Science and Technology); Ge Zhang (University of Michigan); Hanfeng Lin (Multimodal Art Projection Research Community); Yizhi Li (The University of Sheffield); Yinghao MA (Queen Mary University of London); Jie Fu (HKUST); Chenghua Lin (University of Manchester); Emmanouil Benetos (Queen Mary University of London); Wenwu Wang (University of Surrey); Guangyu Xia (NYU Shanghai); Wei Xue (The Hong Kong University of Science and Technology); Yike Guo (Hong Kong University of Science and Technology) |
Computational Analysis of Yaredawi YeZema Silt in Ethiopian Orthodox Tewahedo Church Chants | Mequanent Argaw Muluneh (Academia Sinica; National Chengchi University; Debre Markos University)*; Yan-Tsung Peng (National Chengchi University); Li Su (Academia Sinica) |
Content-based Controls for Music Large-scale Language Modeling | Liwei Lin (New York University Shanghai)*; Gus Xia (New York University Shanghai); Junyan Jiang (New York University Shanghai); Yixiao Zhang (Queen Mary University of London) |
Continual Learning for Music Classification | Pedro González-Barrachina (University of Alicante); María Alfaro-Contreras (University of Alicante); Jorge Calvo-Zaragoza (University of Alicante)* |
Controlling Surprisal in Music Generation via Information Content Curve Matching | Mathias Rose Bjare (Johannes Kepler University Linz)*; Stefan Lattner (Sony Computer Science Laboratories, Paris); Gerhard Widmer (Johannes Kepler University) |
Cue Point Estimation using Object Detection | Giulia Arguello (ETH Zurich); Luca A Lanzendoerfer (ETH Zurich)*; Roger Wattenhofer (ETH Zurich) |
Deep Recombinant Transformer: Enhancing Loop Compatibility in Digital Music Production | Muhammad Taimoor Haseeb (Mohamed bin Zayed University of Artificial Intelligence)*; Ahmad Hammoudeh (Mohamed bin Zayed University of Artificial Intelligence); Gus Xia (Mohamed bin Zayed University of Artificial Intelligence) |
Diff-A-Riff: Musical Accompaniment Co-Creation via Latent Diffusion Models | Javier Nistal (Sony CSL)*; Marco Pasini (Queen Mary University of London); Cyran Aouameur (Sony CSL); Stefan Lattner (Sony Computer Science Laboratories, Paris); Maarten Grachten (Machine Learning Consultant) |
Diff-MST: Differentiable Mixing Style Transfer | Soumya Sai Vanka (QMUL)*; Christian J. Steinmetz (Queen Mary University of London); Jean-Baptiste Rolland (Steinberg Media Technologies GmbH); Joshua D. Reiss (Queen Mary University of London); George Fazekas (QMUL) |
Discogs-VI: A Musical Version Identification Dataset Based on Public Editorial Metadata | Recep Oguz Araz (Universitat Pompeu Fabra)*; Xavier Serra (Universitat Pompeu Fabra ); Dmitry Bogdanov (Universitat Pompeu Fabra) |
DITTO-2: Distilled Diffusion Inference-Time T-Optimization for Music Generation | Zachary Novack (UC San Diego)*; Julian McAuley (UCSD); Taylor Berg-Kirkpatrick (UCSD); Nicholas J. Bryan (Adobe Research) |
Do Music Generation Models Encode Music Theory? | Megan Wei (Brown University)*; Michael Freeman (Brown University); Chris Donahue (Carnegie Mellon University); Chen Sun (Brown University) |
Efficient Adapter Tuning for Joint Singing Voice Beat and Downbeat Tracking with Self-Supervised Learning Features | Jiajun Deng (The Chinese University of Hong Kong)*; Yaolong Ju (Huawei); Jing Yang (Huawei 2012 Labs); Simon Lui (Huawei); Xunying Liu (The Chinese University of Hong Kong) |
El Bongosero: A Crowd-sourced Symbolic Dataset of Improvised Hand Percussion Rhythms Paired with Drum Patterns | Behzad Haki (Universitat Pompeu Fabra); Nicholas Evans (Universitat Pompeu Fabra)*; Daniel Gómez (MTG); Sergi Jordà (Universitat Pompeu Fabra) |
Emotion-driven Piano Music Generation via Two-stage Disentanglement and Functional Representation | Jingyue Huang (New York University)*; Ke Chen (University of California San Diego); Yi-Hsuan Yang (National Taiwan University) |
End-to-end automatic singing skill evaluation using cross-attention and data augmentation for solo singing and singing with accompaniment | Yaolong Ju (Huawei)*; Chun Yat Wu (Huawei); Betty Cortinas Lorenzo (Huawei); Jing Yang (Huawei 2012 Labs); Jiajun Deng (Huawei); Fan Fan (Huawei); Simon Lui (Huawei) |
End-to-end Piano Performance-MIDI to Score Conversion with Transformers | Tim Beyer (Technical University of Munich)*; Angela Dai (Technical University of Munich) |
Enhancing predictive models of music familiarity with EEG: Insights from fans and non-fans of K-pop group NCT127 | Seokbeom Park (KAIST); Hyunjae Kim (KAIST); Kyung Myun Lee (KAIST)* |
Exploring GPT's Ability as a Judge in Music Understanding | Kun Fang (McGill University)*; Ziyu Wang (NYU Shanghai); Gus Xia (New York University Shanghai); Ichiro Fujinaga (McGill University) |
Exploring Internet Radio Across the Globe with the MIRAGE Online Dashboard | Ngan V.T. Nguyen (University of Science, Vietnam National University Ho Chi Minh City); Elizabeth Acosta (Texas Tech University); Tommy Dang (Texas Tech University); David Sears (Texas Tech University)* |
Exploring Musical Roots: Applying Audio Embeddings to Empower Influence Attribution for a Generative Music Model | Julia Barnett (Northwestern University)*; Bryan Pardo (Northwestern University); Hugo Flores García (Northwestern University) |
Exploring the inner mechanisms of large generative music models | Marcel A Vélez Vásquez (University of Amsterdam)*; Charlotte Pouw (University of Amsterdam); John Ashley Burgoyne (University of Amsterdam); Willem Zuidema (ILLC, UvA) |
Field Study on Children's Home Piano Practice: Developing a Comprehensive System for Enhanced Student-Teacher Engagement | Seikoh Fukuda (PTNA Research Institute of Music)*; Yuko Fukuda (Kyoritsu Women’s University, To-on Kikaku Company); Ami Motomura (To-on Kikaku Company); Eri Sasao (To-on Kikaku Company); Masamichi Hosoda (NTT East Corporation); Masaki Matsubara (University of Tsukuba); Masahiro Niitsuma (Keio University) |
Formal Modeling of Structural Repetition using Tree Compression | Zeng Ren (École Polytechnique Fédérale de Lausanne)*; Yannis Rammos (EPFL); Martin A Rohrmeier (Ecole Polytechnique Fédérale de Lausanne) |
From Audio Encoders to Piano Judges: Benchmarking Performance Understanding for Solo Piano | Huan Zhang (Queen Mary University of London)*; Jinhua Liang (Queen Mary University of London); Simon Dixon (Queen Mary University of London) |
From Real to Cloned Singer Identification | Dorian Desblancs (Deezer Research)*; Gabriel Meseguer Brocal (Deezer); Romain Hennequin (Deezer Research); Manuel Moussallam (Deezer) |
FruitsMusic: A Real-World Corpus of Japanese Idol-Group Songs | Hitoshi Suda (National Institute of Advanced Industrial Science and Technology (AIST))*; Shunsuke Yoshida (The University of Tokyo); Tomohiko Nakamura (National Institute of Advanced Industrial Science and Technology (AIST)); Satoru Fukayama (National Institute of Advanced Industrial Science and Technology (AIST)); Jun Ogata (AIST) |
GAPS: A Large and Diverse Classical Guitar Dataset and Benchmark Transcription Model | Xavier Riley (C4DM)*; Zixun Guo (Singapore University of Technology and Design); Andrew C Edwards (QMUL); Simon Dixon (Queen Mary University of London) |
Generating Sample-Based Musical Instruments Using Neural Audio Codec Language Models | Shahan Nercessian (Native Instruments)*; Johannes Imort (Native Instruments); Ninon Devis (Native Instruments); Frederik Blang (Native Instruments) |
GraphMuse: A Library for Symbolic Music Graph Processing | Emmanouil Karystinaios (Johannes Kepler University)*; Gerhard Widmer (Johannes Kepler University) |
Green MIR? Investigating computational cost of recent music-AI research in ISMIR | Andre Holzapfel (KTH Royal Institute of Technology in Stockholm)*; Anna-Kaisa Kaila (KTH Royal Institute of Technology, Stockholm); Petra Jääskeläinen (KTH) |
Harmonic and Transposition Constraints Arising from the Use of the Roland TR-808 Bass Drum | Emmanuel Deruty (Sony Computer Science Laboratories)* |
Harnessing the Power of Distributions: Probabilistic Representation Learning on Hypersphere for Multimodal Music Information Retrieval | Takayuki Nakatsuka (National Institute of Advanced Industrial Science and Technology (AIST))*; Masahiro Hamasaki (National Institute of Advanced Industrial Science and Technology (AIST)); Masataka Goto (National Institute of Advanced Industrial Science and Technology (AIST)) |
Hierarchical Generative Modeling of the Melodic Voice in Hindustani Classical Music | Nithya Nadig Shikarpur (Mila; University of Montreal)*; Krishna Maneesha Dendukuri (Mila); Yusong Wu (Mila, University of Montreal); Antoine CAILLON (IRCAM); Cheng-Zhi Anna Huang (Google Brain) |
Human Pose Estimation for Expressive Movement Descriptors in Vocal Musical Performance | Sujoy Roychowdhury (Indian Institute of Technology Bombay)*; Preeti Rao (Indian Institute of Technology Bombay); Sharat Chandran (IIT Bombay) |
Human-AI Music Process: A Dataset of AI-Supported Songwriting Processes from the AI Song Contest | Lidia J Morris (University of Washington)*; Rebecca Leger (Fraunhofer IIS); Michele Newman (University of Washington); John Ashley Burgoyne (University of Amsterdam); Ryan Groves (Self-employed); Natasha Mangal (CISAC); Jin Ha Lee (University of Washington) |
I can listen but cannot read: An evaluation of two-tower multimodal systems for instrument recognition | Yannis Vasilakis (Queen Mary University of London)*; Rachel Bittner (Spotify); Johan Pauwels (Queen Mary University of London) |
Improved symbolic drum style classification with grammar-based hierarchical representations | Léo Géré (Cnam)*; Nicolas Audebert (IGN); Philippe Rigaux (Cnam) |
In-depth performance analysis of the ADTOF-based algorithm for automatic drum transcription | Mickaël Zehren (Umeå University)*; Marco Alunno (Universidad EAFIT Medellín); Paolo Bientinesi (Umeå Universitet) |
Inner Metric Analysis as a Measure of Rhythmic Syncopation | Brian Bemman (Durham University)*; Justin Christensen (The University of Sheffield) |
Investigating Time-Line-Based Music Traditions with Field Recordings: A Case Study of Candomblé Bell Patterns | Lucas S Maia (Universidade Federal do Rio de Janeiro)*; Richa Namballa (New York University); Martín Rocamora (Universidad de la República); Magdalena Fuentes (New York University); Carlos Guedes (NYU Abu Dhabi) |
Joint Audio and Symbolic Audio Conditioning For Temporally Controlled Text-to-Music Generation | Or Tal (The Hebrew University of Jerusalem)*; Alon Ziv (The Hebrew University of Jerusalem); Felix Kreuk (Bar-Ilan University); Itai Gat (Meta); Yossi Adi (The Hebrew University of Jerusalem) |
Just Label the Repeats for In-The-Wild Audio-to-Score Alignment | Irmak Bukey (Carnegie Mellon University)*; Michael Feffer (Carnegie Mellon University); Chris Donahue (CMU) |
Learning Multifaceted Self-Similarity over Time and Frequency for Music Structure Analysis | Tsung-Ping Chen (Kyoto University)*; Kazuyoshi Yoshii (Kyoto University) |
Lessons learned from a project to encode Mensural music on a large scale with Optical Music Recognition | David Rizo (University of Alicante. Instituto Superior de Enseñanzas Artísticas de la Comunidad Valenciana)*; Jorge Calvo-Zaragoza (University of Alicante); Teresa Delgado-Sánchez (Biblioteca Nacional de España); Patricia García-Iasci (University of Alicante) |
Leveraging Unlabeled Data to Improve Automatic Guitar Tablature Transcription | Andrew F Wiggins (Drexel University)*; Youngmoo Kim (Drexel University) |
Long-form music generation with latent diffusion | Zach Evans (Stability AI); Julian D Parker (Stability AI)*; CJ Carr (Stability AI); Zachary Zuckowski (Stability AI); Josiah Taylor (Stability AI); Jordi Pons (Stability AI) |
Looking for Tactus in All the Wrong Places: Statistical Inference of Metric Alignment in Rap Flow | Nathaniel Condit-Schultz (Georgia Institute of Technology)* |
Lyrically Speaking: Exploring the Link Between Lyrical Emotions, Themes and Depression Risk | Pavani B Chowdary (International Institute of Information Technology, Hyderabad)*; Bhavyajeet Singh (International Institute of Information Technology, Hyderabad ); Rajat Agarwal (International Institute of Information Technology); Vinoo Alluri (IIIT - Hyderabad) |
Lyrics Transcription for Humans: A Readability-Aware Benchmark | Ondřej Cífka (AudioShake)*; Hendrik Schreiber (AudioShake); Luke Miner (AudioShake); Fabian-Robert Stöter (AudioShake) |
MelodyT5: A Unified Score-to-Score Transformer for Symbolic Music Processing | Shangda Wu (Central Conservatory of Music); Yashan Wang (Central Conservatory of Music); Xiaobing Li (Central Conservatory of Music); Feng Yu (Central Conservatory of Music); Maosong Sun (Tsinghua University)* |
Mel-RoFormer for Vocal Separation and Vocal Melody Transcription | Ju-Chiang Wang (ByteDance)*; Wei-Tsung Lu (New York University); Jitong Chen (ByteDance) |
MidiCaps: A Large-scale MIDI Dataset with Text Captions | Jan Melechovsky (Singapore University of Technology and Design); Abhinaba Roy (SUTD)*; Dorien Herremans (Singapore University of Technology and Design) |
MIDI-to-Tab: Guitar Tablature Inference via Masked Language Modeling | Andrew C Edwards (QMUL)*; Xavier Riley (C4DM); Pedro Pereira Sarmento (Centre for Digital Music); Simon Dixon (Queen Mary University of London) |
MMT-BERT: Chord-aware Symbolic Music Generation Based on Multitrack Music Transformer and MusicBERT | Jinlong ZHU (Hokkaido University)*; Keigo Sakurai (Hokkaido University); Ren Togo (Hokkaido University); Takahiro Ogawa (Hokkaido University); Miki Haseyama (Hokkaido University) |
Mosaikbox: Improving Fully Automatic DJ Mixing Through Rule-based Stem Modification and Precise Beat-Grid Estimation | Robert Sowula (TU Wien)*; Peter Knees (TU Wien) |
MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models | Benno Weck (Music Technology Group, Universitat Pompeu Fabra (UPF))*; Ilaria Manco (Queen Mary University of London); Emmanouil Benetos (Queen Mary University of London); Elio Quinton (Universal Music Group); George Fazekas (QMUL); Dmitry Bogdanov (Universitat Pompeu Fabra) |
Music Discovery Dialogue Generation Using Human Intent Analysis and Large Language Model | Seungheon Doh (KAIST)*; Keunwoo Choi (Genentech); Daeyong Kwon (KAIST); Taesoo Kim (KAIST); Juhan Nam (KAIST) |
Music Proofreading with RefinPaint: Where and How to Modify Compositions given Context | Pedro Ramoneda (Universitat Pompeu Fabra)*; Martín Rocamora (Universidad de la República); Taketo Akama (Sony CSL) |
Music2Latent: Consistency Autoencoders for Latent Audio Compression | Marco Pasini (Queen Mary University of London)*; Stefan Lattner (Sony Computer Science Laboratories, Paris); George Fazekas (QMUL) |
MusiConGen: Rhythm and Chord Control for Transformer-Based Text-to-Music Generation | Yun-Han Lan (Taiwan AI Labs)*; Wen-Yi Hsiao (Taiwan AI Labs); Hao-Chung Cheng (National Taiwan University); Yi-Hsuan Yang (National Taiwan University) |
Nested Music Transformer: Sequentially Decoding Compound Tokens in Symbolic Music and Audio Generation | Jiwoo Ryu (Sogang University); Hao-Wen Dong (University of Michigan); Jongmin Jung (Sogang University); Dasaem Jeong (Sogang University)* |
Note-Level Transcription of Choral Music | Huiran Yu (University of Rochester)*; Zhiyao Duan (University of Rochester) |
Notewise Evaluation of Source Separation: A Case Study For Separated Piano Tracks | Yigitcan Özer (International Audio Laboratories Erlangen)*; Hans-Ulrich Berendes (International Audio Laboratories Erlangen); Vlora Arifi-Müller (International Audio Laboratories Erlangen ); Fabian-Robert Stöter (AudioShake, Inc.); Meinard Müller (International Audio Laboratories Erlangen) |
On the validity of employing ChatGPT for distant reading of music similarity | Arthur Flexer (Johannes Kepler University Linz)* |
PiCoGen2: Piano cover generation with transfer learning approach and weakly aligned data | Chih-Pin Tan (National Taiwan University)*; Hsin Ai (National Taiwan University); Yi-Hsin Chang (National Taiwan University); Shuen-Huei Guan (KKCompany Technologies); Yi-Hsuan Yang (National Taiwan University) |
PolySinger: Singing-Voice to Singing-Voice Translation from English to Japanese | Silas Antonisen (University of Granada)*; Iván López-Espejo (University of Granada) |
Purposeful Play: Evaluation and Co-Design of Casual Music Creation Applications with Children | Michele Newman (University of Washington)*; Lidia J Morris (University of Washington); Jun Kato (National Institute of Advanced Industrial Science and Technology (AIST)); Masataka Goto (National Institute of Advanced Industrial Science and Technology (AIST)); Jason Yip (University of Washington); Jin Ha Lee (University of Washington) |
Quantitative Analysis of Melodic Similarity in Music Copyright Infringement Cases | Saebyul Park (KAIST)*; Halla Kim (KAIST); Jiye Jung (Heinrich Heine University Düsseldorf); Juyong Park (KAIST); Jeounghoon Kim (KAIST); Juhan Nam (KAIST) |
RNBert: Fine-Tuning a Masked Language Model for Roman Numeral Analysis | Malcolm Sailor (Yale University)* |
Robust and Accurate Audio Synchronization Using Raw Features From Transcription Models | Johannes Zeitler (International Audio Laboratories Erlangen)*; Ben Maman (Tel Aviv University); Meinard Müller (International Audio Laboratories Erlangen) |
Robust lossy audio compression identification | Hendrik Vincent Koops (Universal Music Group)*; Gianluca Micchi (Universal Music Group); Elio Quinton (Universal Music Group) |
Sanidha: A Studio Quality Multi-Modal Dataset for Carnatic Music | Venkatakrishnan Vaidyanathapuram Krishnan (Georgia Institute of Technology)*; Noel Alben (Georgia Institute Of Technology); Anish A Nair (Georgia Institute of Technology ); Nathaniel Condit-Schultz (Georgia Institute of Technology) |
Saraga Audiovisual: a large multimodal open data collection for the analysis of Carnatic Music | Adithi Shankar Sivasankar (Music Technology Group- Universitat Pompeu Fabra)*; Genís Plaja-Roglans (Music Technology Group); Thomas Nuttall (Universitat Pompeu Fabra, Barcelona); Martín Rocamora (Universitat Pompeu Fabra); Xavier Serra (Universitat Pompeu Fabra ) |
Scoring Time Intervals Using Non-Hierarchical Transformer for Automatic Piano Transcription | Yujia Yan (University of Rochester)*; Zhiyao Duan (University of Rochester) |
Semi-Supervised Contrastive Learning of Musical Representations | Julien PM Guinot (Queen Mary University of London)*; Elio Quinton (Universal Music Group); George Fazekas (QMUL) |
Semi-Supervised Piano Transcription Using Pseudo-Labeling Techniques | Sebastian Strahl (International Audio Laboratories Erlangen)*; Meinard Müller (International Audio Laboratories Erlangen) |
Six Dragons Fly Again: Reviving 15th-Century Korean Court Music with Transformers and Novel Encoding | Danbinaerin Han (KAIST); Mark R H Gotham (Durham); DongMin Kim (Sogang University); Hannah Park (Sogang University); Sihun Lee (Sogang University); Dasaem Jeong (Sogang University)* |
SpecMaskGIT: Masked Generative Modelling of Audio Spectrogram for Efficient Audio Synthesis and Beyond | Marco Comunita (Queen Mary University of London); Zhi Zhong (Sony Group Corporation)*; Akira Takahashi (Sony Group Corporation); Shiqi Yang (Sony); Mengjie Zhao (Sony Group Corporation); Koichi Saito (Sony Group Corporation); Yukara Ikemiya (Sony Research); Takashi Shibuya (Sony AI); Shusuke Takahashi (Sony Group Corporation); Yuki Mitsufuji (Sony AI) |
Stem-JEPA: A Joint-Embedding Predictive Architecture for Musical Stem Compatibility Estimation | Alain Riou (Sony CSL Paris); Stefan Lattner (Sony Computer Science Laboratories, Paris); Gaëtan Hadjeres (Sony CSL)*; Michael Anslow (Sony Computer Science Laboratories, Paris); Geoffroy Peeters (LTCI - Télécom Paris, IP Paris) |
ST-ITO: Controlling audio effects for style transfer with inference-time optimization | Christian J. Steinmetz (Queen Mary University of London)*; Shubhr Singh (Queen Mary University of London); Marco Comunita (Queen Mary University of London); Ilias Ibnyahya (Queen Mary University of London); Shanxin Yuan (Queen Mary University of London); Emmanouil Benetos (Queen Mary University of London); Joshua D. Reiss (Queen Mary University of London) |
STONE: Self-supervised tonality estimator | Yuexuan KONG (Deezer)*; Vincent Lostanlen (LS2N, CNRS); Gabriel Meseguer Brocal (Deezer); Stella Wong (Columbia University); Mathieu Lagrange (LS2N); Romain Hennequin (Deezer Research) |
Streaming Piano Transcription Based on Consistent Onset and Offset Decoding with Sustain Pedal Detection | Weixing Wei (Kyoto University)*; Jiahao Zhao (Kyoto University); Yulun Wu (Fudan University); Kazuyoshi Yoshii (Kyoto University) |
SymPAC: Scalable Symbolic Music Generation With Prompts And Constraints | Haonan Chen (Bytedance Inc.)*; Jordan B. L. Smith (TikTok); Janne Spijkervet (University of Amsterdam); Ju-Chiang Wang (ByteDance); Pei Zou (Bytedance Inc.); Bochen Li (University of Rochester); Qiuqiang Kong (ByteDance); Xingjian Du (University of Rochester) |
The Changing Sound of Music: An Exploratory Corpus Study of Vocal Trends Over Time | Elena Georgieva (NYU)*; Pablo Ripollés (New York University); Brian McFee (New York University) |
The Concatenator: A Bayesian Approach To Real Time Concatenative Musaicing | Christopher J Tralie (Ursinus College)*; Ben Cantil (DataMind Audio) |
The ListenBrainz Listens Dataset | Kartik Ohri (MetaBrainz Foundation Inc.)*; Robert Kaye (MetaBrainz Foundation Inc.) |
TheGlueNote: Learned Representations for Robust and Flexible Note Alignment | Silvan Peter (JKU)*; Gerhard Widmer (Johannes Kepler University) |
Toward a More Complete OMR Solution | Guang Yang (University of Washington)*; Muru Zhang (University of Washington); Lin Qiu (University of Washington); Yanming Wan (University of Washington); Noah A Smith (University of Washington and Allen Institute for AI) |
Towards Assessing Data Replication in Music Generation with Music Similarity Metrics on Raw Audio | Roser Batlle-Roca (Universitat Pompeu Fabra)*; Wei-Hsiang Liao (Sony Group Corporation); Xavier Serra (Universitat Pompeu Fabra ); Yuki Mitsufuji (Sony AI); Emilia Gomez (Joint Research Centre, European Commission & Universitat Pompeu Fabra) |
Towards Automated Personal Value Estimation in Song Lyrics | Andrew M. Demetriou (Delft University of Technology)*; Jaehun Kim (Pandora / SiriusXM); Cynthia Liem (Delft University of Technology); Sandy Manolios |
Towards Explainable and Interpretable Musical Difficulty Estimation: A Parameter-efficient Approach | Pedro Ramoneda (Universitat Pompeu Fabra)*; Vsevolod E Eremenko (Music Technology Group at Universitat Pompeu Fabra); Alexandre D'Hooge (Université de Lille); Emilia Parada-Cabaleiro (Nuremberg University of Music); Xavier Serra (Universitat Pompeu Fabra ) |
Towards Musically Informed Evaluation of Piano Transcription Models | Patricia Hu (Johannes Kepler University)*; Lukáš Samuel Marták (Johannes Kepler University Linz); Carlos Eduardo Cancino-Chacón (Johannes Kepler University Linz); Gerhard Widmer (Johannes Kepler University) |
Towards Universal Optical Music Recognition: A Case Study on Notation Types | Juan Carlos Martinez-Sevilla (University of Alicante)*; David Rizo (University of Alicante. Instituto Superior de Enseñanzas Artísticas de la Comunidad Valenciana); Jorge Calvo-Zaragoza (University of Alicante) |
Towards Zero-Shot Amplifier Modeling: One-to-Many Amplifier Modeling via Tone Embedding Control | Yu-Hua Chen (NTU)*; Yen-Tung Yeh (National Taiwan University); Yuan-Chiao Cheng (Positive Grid); Jui-Te Wu (Positive Grid); Yu-Hsiang Ho (Positive Grid ); Jyh-Shing Roger Jang (National Taiwan University); Yi-Hsuan Yang (National Taiwan University) |
Transcription-based lyrics embeddings: simple extraction of effective lyrics embeddings from audio | Jaehun Kim (Pandora / SiriusXM)*; Florian Henkel (SiriusXM + Pandora); Camilo Landau (Pandora / SiriusXM); Samuel E. Sandberg (SiriusXM + Pandora); Andreas F. Ehmann (SiriusXM + Pandora) |
Unsupervised Composable Representations for Audio | Giovanni Bindi (IRCAM)*; Philippe Esling (IRCAM) |
Unsupervised Synthetic-to-Real Adaptation for Optical Music Recognition | Noelia N Luna-Barahona (Universidad de Alicante); Adrián Roselló (Universidad de Alicante); María Alfaro-Contreras (University of Alicante); David Rizo (University of Alicante. Instituto Superior de Enseñanzas Artísticas de la Comunidad Valenciana); Jorge Calvo-Zaragoza (University of Alicante)* |
Using Item Response Theory to Aggregate Music Annotation Results of Multiple Annotators | Tomoyasu Nakano (National Institute of Advanced Industrial Science and Technology (AIST))*; Masataka Goto (National Institute of Advanced Industrial Science and Technology (AIST)) |
Using Pairwise Link Prediction and Graph Attention Networks for Music Structure Analysis | Morgan Buisson (Telecom-Paris)*; Brian McFee (New York University); Slim Essid (Telecom Paris - Institut Polytechnique de Paris) |
Utilizing Listener-Provided Tags for Music Emotion Recognition: A Data-Driven Approach | Joanne Affolter (Ecole Polytechnique Fédérale de Lausanne (EPFL))*; Yannis Rammos (EPFL); Martin A Rohrmeier (Ecole Polytechnique Fédérale de Lausanne) |
Variation Transformer: New datasets, models, and comparative evaluation for symbolic music variation generation | Chenyu Gao (University of York)*; Federico Reuben (University of York); Tom Collins (University of York; MAIA, Inc.) |
Which audio features can predict the dynamic musical emotions of both composers and listeners? | Eun Ji Oh (KAIST); Hyunjae Kim (KAIST); Kyung Myun Lee (KAIST)* |
Who's Afraid of the 'Artyfyshall Byrd'? Historical Notions and Current Challenges of Musical Artificiality | Nicholas Cornia (Orpheus Instituut)*; Bruno Forment (Orpheus Instituut) |
X-Cover: Better music version identification system by integrating pretrained ASR model | Xingjian Du (University of Rochester)*; Zou Pei (ByteDance); Mingyu Liu (ByteDance); Xia Liang (Bytedance); Huidong Liang (University of Oxford); Minghang Chu (Bytedance); Zijie Wang (ByteDance); Bilei Zhu (ByteDance AI Lab) |