“Music is the one incorporeal entrance into the higher world of knowledge,
which comprehends mankind but which mankind cannot comprehend.”
— Ludwig van Beethoven
Our Vision
Music manifests some of the most complex and subtle creations of the human mind. Musical composition is almost free from the limitations of the physical world, fully leveraging imagination and creative intelligence. For this reason, Beethoven calls music “incorporeal” and calls creative intelligence, the magnificent faculty accessible to humans, the “higher world of knowledge.” From an AI perspective, the best way to uncover the mystery of creative intelligence is to realize it through computational effort: to conceive being from void, to develop many from one, to construct a whole from parts, and to draw analogies between seemingly distant scenarios. Music is a perfect subject for this endeavor.
The appreciation of music, on the other hand, involves profound subjective experience, especially aesthetic perception, which goes beyond the utilities and cost functions that static equations can easily measure. The inner feelings, the dynamic notions of beauty, taste, and good, and the self “I” are what make us “mankind” and what machines have yet to encompass. Hence, teaching machines to perceive the structures, expressions, and representations of music, and to appreciate music with taste, is essentially to incorporate humanity into intelligent agents.
Our Teams and Projects
On the one hand, we are musicians, curious about how gifted musicians actually understand and create music. On the other hand, we are computer scientists, and we believe that the best way to uncover the mystery of musicianship is to re-create it through computational effort in a human-centered way. That is why we have been developing intelligent systems that help people better create, perform, and learn music.
Our three most representative projects are: 1) deep music representation learning and style transfer, 2) human-computer interactive performance, and 3) computer-aided multimodal music tutoring. The first is a young field (and a hot topic since 2018) that lies at the core of deep learning and relates to many other domains, such as NLP and CV; we were fortunate to be among the teams that laid its groundwork. The other two projects have great practical value and call for truly interdisciplinary effort (music practice, educational psychology, hardware and interface design, real-time systems, etc.). We were proud to help promote them as the host of NIME 2021 through the conference theme “learning to play, playing to learn.”
In the big picture, these three projects aim to seamlessly “close the loop” for the next-generation AI-extended music experience. We envision a workflow as follows: i) a user first sketches a short melody segment or a motif; ii) a music-generation agent extends it into a full song with accompaniment, while the user is free to transfer the style of any part of the piece; iii) a tutoring agent helps the user learn to play the piece via interactive visual and haptic feedback; and finally, iv) the user and an accompaniment agent perform the piece together on a (virtual) stage. A sketch of this loop appears below.
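To make the envisioned loop concrete, here is a minimal, purely illustrative Python sketch. Every class, method, and data structure below is hypothetical; each stage stands in for a full learned system and a real-time interface, and none of it describes a released implementation. Only the data flow between the four stages is fixed.

# Hypothetical sketch of the four-stage "closed loop" described above.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Piece:
    melody: List[str]                        # note names of the sketched motif
    accompaniment: List[str] = field(default_factory=list)
    style: str = "original"

class GenerationAgent:
    """Stage ii: extend a motif into a full song with accompaniment."""
    def extend(self, motif: List[str]) -> Piece:
        # Placeholder for a learned generator: naive repetition + fixed chords.
        return Piece(melody=motif * 4, accompaniment=["C", "G", "Am", "F"])

class StyleTransferAgent:
    """Stage ii (continued): transfer the style of any part of the piece."""
    def transfer(self, piece: Piece, style: str) -> Piece:
        # Placeholder: a real system would rewrite the notes, not just the label.
        return Piece(piece.melody, piece.accompaniment, style)

class TutoringAgent:
    """Stage iii: guide practice (here, textual feedback in place of
    interactive visual and haptic feedback)."""
    def give_feedback(self, piece: Piece, played: List[str]) -> List[str]:
        return [f"expected {e}, heard {p}"
                for e, p in zip(piece.melody, played) if e != p]

class AccompanimentAgent:
    """Stage iv: accompany the user in a joint performance."""
    def accompany(self, piece: Piece) -> List[str]:
        return piece.accompaniment

def closed_loop(motif: List[str]) -> None:
    piece = GenerationAgent().extend(motif)                        # ii) generate
    piece = StyleTransferAgent().transfer(piece, "jazz")           # ii) restyle
    feedback = TutoringAgent().give_feedback(piece, piece.melody)  # iii) practice
    chords = AccompanimentAgent().accompany(piece)                 # iv) perform
    print(piece.style, feedback, chords)

closed_loop(["C4", "E4", "G4", "E4"])  # i) the user's sketched motif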
Suggested follow-up reading
Analogy-making vs. prediction: a debate on the philosophy of automated music generation