The French National Research Agency Projects for science


ANR funded project

(DS0705) 2016

Web Audio and Semantic Aggregated in the Browser for Indexation

Deezer, Spotify, Pandora and Apple Music enrich music listening with data such as artist biographies or other albums by the same artist, and suggest "similar" artists or songs (without explicitly defining similarity). A journalist or radio presenter often uses the Web and media data to prepare programs. A professor teaching a Master's course in engineering uses analysis tools to explain production techniques to students. All of them rely on knowledge that ranges from the most empirical sources (the press, Google) to more formalized, machine-accessible ones (Spotify uses LastFM, MusicBrainz, DBPedia and audio extractors from The Echo Nest, a startup acquired by Spotify in 2014). There is a strong need for richer musical knowledge bases and for tools to exploit them.

WASABI's originality lies in combining several approaches and in offering methods for enriching their results; this joint implementation aims to produce a richer and better-equipped knowledge base:

1) By leveraging Semantic Web databases (e.g. DBPedia, MusicBrainz, LastFM), we can extract structured data and link a song to elements such as its producer, the studio where it was recorded, its composer, its year, its lyrics, a description from the song's Wikipedia page, etc.

2) By analyzing free-text data (the song's lyrics, pages of text related to the song), we can extract non-explicit data (themes of the song, places, people, events, dates, emotions conveyed). The data obtained by these four methods may be linked, compared, and confirmed or refuted against hypotheses. For example, the description of a rock band and its producer can be used to configure the initial settings of the audio analysis and facilitate unmixing.

3) By jointly using this information from the Semantic Web and from the lyrics analysis together with the information contained in the audio signal, we can improve the automatic extraction of music information (the temporal structure, the presence and characterization of the voice, the musical emotion, or the presence of plagiarism).

4) When a song is available as separate tracks, we can perform a more accurate analysis and extract richer audio data (notes, instruments, reverberation type, etc.). In this project we will study how unmixing can be achieved and how its results can be used in the context of the browser, even when the unmixing is imperfect.

5) We can also encourage serendipity and find non-trivial data with a tool such as Discovery Hub (and answer questions such as: what connects Radiohead to Pink Floyd?).

Based on use cases specified by the project and co-designed with collaborating users of our research, WASABI aims to offer a suite of open source software components and open data online services for:

1) visualization of audio metadata and Music Information Retrieval results, and listening to songs with separate tracks, using tools that run in a Web context,

2) the automatic processing of song lyrics, recognition of linked named entities, annotation and collaborative correction,

3) access to a Web service with an API offering an environment for studying musical similarities, based on audio analysis on the one hand and on semantic and textual analysis on the other.

These software modules will allow us to develop demonstrators designed with the help of external collaborators: composers, musicologists, journalists (Radio France), and engineers from a leading online streaming service (Deezer).


CNRS DR20 Centre National de la Recherche Scientifique Délégation Côte d'Azur / I3S


IRCAM Institut de recherche et coordination acoustique/musique


ANR grant: 734 706 euros
Beginning and duration: October 2016 - 42 months


ANR Programme: (DS0705) 2016

Project ID: ANR-16-CE23-0017

Project coordinator:
Mr. Michel Buffa (Centre National de la Recherche Scientifique Délégation Côte d'Azur / I3S)




The project coordinator is the author of this abstract and is therefore responsible for the content of the summary. The ANR disclaims all responsibility in connection with its content.