ISMIR 2003


Recent Submissions

  • Item
    Music identification by Leadsheets
    (Johns Hopkins University, 2003-10-26) Frank Seifert; Wolfgang Benn; Holger H. Hoos; David Bainbridge
    Most experimental research on content-based automatic recognition and identification of musical documents is founded on the statistical distribution of timbre or on simple retrieval mechanisms such as the comparison of melodic segments. As a result, a vast number of relevant and irrelevant hits, including multiple appearances of the same document, is often returned, or the actual document cannot be found at all. To improve this situation we propose a model for recognition of music that enables identification and comparison of musical documents without dependence on their actual instantiation. The resulting structures enclose musical meaning and can be used for estimation of identity and semantic relationship between musical documents.
  • Item
    Quantitative Comparisons into Content-Based Music Recognition with the Self Organising Map
    (Johns Hopkins University, 2003-10-26) Gavin Wood; Simon O'Keefe; Holger H. Hoos; David Bainbridge
    With so much modern music being so widely available both in electronic form and in more traditional physical formats, a great opportunity exists for the development of a general-purpose recognition and music classification system. We describe an ongoing investigation into the subject of musical recognition purely by the sonic content from a standard recording.
  • Item
    Automatic Music Transcription from Multiphonic MIDI Signals
    (Johns Hopkins University, 2003-10-26) Haruto Takeda; Takuya Nishimoto; Shigeki Sagayama; Holger H. Hoos; David Bainbridge
    For automatically transcribing human-performed polyphonic music recorded in the MIDI format, rhythm and tempo are decomposed through probabilistic modeling, using Viterbi search in an HMM to recognize the rhythm and the EM algorithm to estimate the tempo. Experimental evaluations are also presented.
  • Item
    Design Patterns in XML Music Representation
    (Johns Hopkins University, 2003-10-26) Perry Roland; Holger H. Hoos; David Bainbridge
    Design patterns attempt to formalize the discussion of recurring problems and their solutions. This paper introduces several XML design patterns and demonstrates their usefulness in the development of XML music representations. The patterns have been grouped into several categories of desirable outcome of the design process – modularity, separation of data and meta-data, reduction of learning requirements, assistance to tool development, and increase in legibility and understandability. The Music Encoding Initiative (MEI) DTD, from which the examples are drawn, the examples, and other materials related to MEI are available at ~pdr4h/.
  • Item
    Using morphological description for generic sound retrieval
    (Johns Hopkins University, 2003-10-26) Julien Ricard; Perfecto Herrera; Holger H. Hoos; David Bainbridge
    Systems for sound retrieval are usually “source-centred”. This means that retrieval is based on using the proper keywords that define or specify a sound source. Although this type of description is of great interest, it is very difficult to implement in realistic automatic labelling systems because of the necessity of dealing with thousands of categories, hence with thousands of different sound models. Moreover, digitally synthesised or transformed sounds, which are frequently used in much contemporary popular music, have no identifiable sources. We propose a description framework based on Schaeffer’s research on a generalised solfeggio that could be applied to any type of sound. He defined morphological description criteria, based on intrinsic perceptual qualities of sound, which do not refer to the cause or the meaning of a sound. We describe more specifically experiments on automatic extraction of morphological descriptors.
  • Item
    Music Notation as a MEI Feasibility Test
    (Johns Hopkins University, 2003-10-26) Baron Schwartz; Holger H. Hoos; David Bainbridge
    This project demonstrated that enough information can be retrieved from MEI, an XML format for musical information representation, to transform it into music notation with good fidelity. The process involved writing an XSLT script to transform files into Mup, an intermediate format, then processing the Mup into PostScript, the de facto page description language for high-quality printing. The results show that the MEI format represents musical information such that it may be retrieved simply, with good recall and precision.
  • Item
    Key-specific Shrinkage Techniques for Harmonic Models
    (Johns Hopkins University, 2003-10-26) Jeremy Pickens; Holger H. Hoos; David Bainbridge
    Statistical modeling of music is rapidly gaining acceptance as a viable approach to a host of Music Information Retrieval related tasks, from transcription to ad hoc retrieval. As music may be viewed as an evolving pattern of notes over time, models which capture some statistical notion of sequence are preferred. The focus of this paper is on Markov models for ad hoc retrieval. In particular, we use Harmonic Models as our baseline retrieval system and explain how they may be improved through better shrinkage procedures for parameter estimation.
  • Item
    Rhythmic Similarity through Elaboration
    (Johns Hopkins University, 2003-10-26) Mitchell Parry; Irfan Essa; Holger H. Hoos; David Bainbridge
    Rhythmic similarity techniques for audio tend to evaluate how close to identical two rhythms are. This paper proposes a similarity metric based on rhythmic elaboration that matches rhythms that share the same beats regardless of tempo or identicalness. Elaborations can help an application decide where to transition between songs. Potential applications include automatically generating a non-stop music mix or sonically browsing a music library.
  • Item
    Chopin Early Editions: Construction and Usage of Online Digital Scores
    (Johns Hopkins University, 2003-10-26) Tod Olson; J. Stephen Downie; Holger H. Hoos; David Bainbridge
    The University of Chicago Library has digitized a collection of 19th century music scores. The online collection is generated programmatically from the scanned images and human-created descriptive and structural metadata, encoded as METS objects, and delivered using the Greenstone Digital Library software. Use statistics are analyzed and possible future directions for the collection are discussed.
  • Item
    A HMM-Based Pitch Tracker for Audio Queries
    (Johns Hopkins University, 2003-10-26) Nicola Orio; Matteo Sisti Sette; Holger H. Hoos; David Bainbridge
    In this paper we present an approach to the transcription of musical queries based on an HMM. The HMM is used to model the audio features related to the singing voice, and the transcription is obtained through Viterbi decoding. We report our preliminary work on evaluation of the system.
  • Item
    An Auditory Model Based Transcriber of Vocal Queries
    (Johns Hopkins University, 2003-10-26) Tom De Mulder; Jean-Pierre Martens; Micheline Lesaffre; Marc Leman; Bernard De Baets; Hans De Meyer; Holger H. Hoos; David Bainbridge
    In this paper a new auditory model-based transcriber of melodic queries produced by a human voice is presented. The newly presented system is tested systematically, together with some other state-of-the-art systems, on three types of vocal queries: singing with syllables, singing with words and whistling. The experimental results show that the new system can transcribe these queries with an accuracy between 76% (whistling) and 85% (singing with syllables), and that it clearly outperforms the other systems included in the test on all three query modes.
  • Item
    The Importance of Cross Database Evaluation in Sound Classification
    (Johns Hopkins University, 2003-10-26) Arie Livshin; Xavier Rodet; Holger H. Hoos; David Bainbridge
    In numerous articles (Martin and Kim, 1998; Fraser and Fujinaga, 1999; and many others) sound classification algorithms are evaluated using "self classification" - the learning and test groups are randomly selected out of the same sound database. We will show that "self classification" is not necessarily a good statistic for the ability of a classification algorithm to learn, generalize or classify well. We introduce the alternative "Minus-1 DB" evaluation method and demonstrate that it does not have the shortcomings of "self classification".
  • Item
    A SVM-Based Classification Approach to Musical Audio
    (Johns Hopkins University, 2003-10-26) Namunu Chinthaka Maddage; Changsheng Xu; Ye Wang; Holger H. Hoos; David Bainbridge
    This paper describes an automatic hierarchical music classification approach based on support vector machines (SVM). Based on the proposed method, the music is classified into coarse classes such as vocal, instrumental or vocal mixed with instrumental music. These main classes are further sub-classed according to gender and instrument type. A novel method, Correction Algorithm for Music Sequence (CAMS), has been developed to improve the classification efficiency.
  • Item
    The C-BRAHMS project
    (Johns Hopkins University, 2003-10-26) Kjell Lemström; Veli Mäkinen; Anna Pienimäki; Mika Turkia; Esko Ukkonen; Holger H. Hoos; David Bainbridge
    The C-BRAHMS project develops computational methods for content-based retrieval and analysis of music data. A summary of the recent algorithmic and experimental developments of the project is given. A search engine developed by the project is available at
  • Item
    Detecting Emotion in Music
    (Johns Hopkins University, 2003-10-26) Tao Li; Mitsunori Ogihara; Holger H. Hoos; David Bainbridge
    Detection of emotion in music sounds is an important problem in music indexing. This paper studies the problem of identifying emotion in music by sound signal processing. The problem is cast as a multiclass classification problem, decomposed into multiple binary classification problems, and is resolved with the use of Support Vector Machines trained on the timbral textures, rhythmic contents, and pitch contents extracted from the sound data. Experiments were carried out on a data set consisting of 499 30-second long music sounds over ambient, classical, fusion, and jazz. Classification into the ten adjective groups of Farnsworth (plus three additional groups) as well as classification into six supergroups that are formed by combining these basic groups was attempted. For some groups and supergroups reasonably accurate performance was achieved.
  • Item
    Music Scene Description Project: Toward Audio-based Real-time Music Understanding
    (Johns Hopkins University, 2003-10-26) Masataka Goto; Holger H. Hoos; David Bainbridge
    This paper reports a research project intended to build a real-time music-understanding system producing intuitively meaningful descriptions of real-world musical audio signals, such as the melody lines and chorus sections. This paper also introduces our efforts to add correct descriptions (metadata) to the pieces in a music database.
  • Item
    Automatic Segmentation, Learning and Retrieval of Melodies Using A Self-Organizing Neural Network
    (Johns Hopkins University, 2003-10-26) Steven Harford; Holger H. Hoos; David Bainbridge
    We introduce a neural network, known as SONNET-MAP, capable of automatic segmentation, learning and retrieval of melodies. SONNET-MAP is a synthesis of Nigrin’s SONNET (Self-Organizing Neural NETwork) architecture and an associative map derived from Carpenter, Grossberg and Reynolds’ ARTMAP. SONNET-MAP automatically segments a melody based on pitch and rhythmic grouping cues. Separate SONNET modules represent the pitch and rhythm dimensions of each segmented phrase independently, with two associative maps fusing these representations at the phrase level. Further SONNET modules aggregate these phrases forming a hierarchical memory structure that encompasses the entire melody. In addition, melodic queries may be used to retrieve any encoded melody. As far as we are aware, SONNET-MAP is the first self-organizing neural network architecture capable of automatically segmenting and retrieving melodies based on both pitch and rhythm.
  • Item
    Three Dimensional Continuous DP Algorithm for Multiple Pitch Candidates in Music Information Retrieval System
    (Johns Hopkins University, 2003-10-26) Sungphil Heo; Motoyuki Suzuki; Akinori Ito; Shozo Makino; Holger H. Hoos; David Bainbridge
    This paper treats theoretical and practical issues in implementing a music information retrieval system based on query by humming. In order to extract accurate features from the user's humming, we propose a new retrieval method based on multiple pitch candidates. Extracted multiple pitches have been shown to be very important parameters in determining melodic similarity, but it is also clear that the confidence measure features obtained from the power are important as well. Furthermore, we propose extending the traditional DP algorithm to three dimensions so that multiple pitch candidates can be treated. In addition, for the melody representation, we propose changing the DP paths dynamically so that they can take relative values and thus respond to inserted or omitted notes.
  • Item
    Position Indexing of Adjacent and Concurrent N-Grams for Polyphonic Music Retrieval
    (Johns Hopkins University, 2003-10-26) Shyamala Doraisamy; Stefan Rüger; Holger H. Hoos; David Bainbridge
    In this paper we examine the retrieval performance of adjacent and concurrent n-grams generated from polyphonic music data. We deploy a method to index polyphonic music using a word position indexer with the n-gram approach. Using all possible combinations of monophonic sequences from polyphonic music data, “overlaying” word locations within a document are obtained, as needed with polyphony (i.e. where more than one word can assume the same word position). The feasibility of utilising the position information of polyphonic ‘musical words’ is investigated using various proximity-based and structured query operators available in text retrieval systems. Our experiments show that nested phrase operators improve the retrieval performance and we present the results of our comparative study on a collection of 5456 polyphonic pieces encoded in the MIDI format.
  • Item
    RWC Music Database: Music Genre Database and Musical Instrument Sound Database
    (Johns Hopkins University, 2003-10-26) Masataka Goto; Hiroki Hashiguchi; Takuichi Nishimura; Ryuichi Oka; Holger H. Hoos; David Bainbridge
    This paper describes the design policy and specifications of the RWC Music Database, a copyright-cleared music database (DB) compiled specifically for research purposes. Shared DBs are common in other research fields and have made significant contributions to progress in those fields. The field of music information processing, however, has lacked a common DB of musical pieces or a large-scale DB of musical instrument sounds. We therefore recently constructed the RWC Music Database comprising four original component DBs: the Popular Music Database (100 pieces), Royalty-Free Music Database (15 pieces), Classical Music Database (50 pieces), and Jazz Music Database (50 pieces). In this paper we report the construction of two additional component DBs: the Music Genre Database (100 pieces) and Musical Instrument Sound Database (50 instruments). For all 315 musical pieces, we prepared original audio signals, corresponding standard MIDI files, and text files of lyrics (for songs). For all 50 instruments, we recorded individual sounds at half-tone intervals with several variations of playing styles, dynamics, instrument manufacturers, and musicians. It is our hope that our DB will make a significant contribution to future advances in the field of music information processing.
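
Illustrative note: two of the abstracts above (the MIDI transcription paper and the HMM-based pitch tracker) mention Viterbi decoding in an HMM. The sketch below is a generic, minimal version of that technique and is not taken from either paper. The two states (`silence`, `note`), the two observation symbols (`low`, `high`), and all probabilities are hypothetical values chosen only for illustration.

```python
import math


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return the most likely hidden-state path for the observation sequence."""
    # best[t][s]: log-probability of the best path that ends in state s at time t
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = []  # back[t-1][s]: predecessor of s on the best path at time t
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state to recover the full path
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


def to_log(table):
    """Convert a (possibly nested) dict of probabilities to log-probabilities."""
    return {k: to_log(v) if isinstance(v, dict) else math.log(v)
            for k, v in table.items()}


# Hypothetical toy model (assumed values, not from any listed paper)
STATES = ["silence", "note"]
LOG_START = to_log({"silence": 0.8, "note": 0.2})
LOG_TRANS = to_log({"silence": {"silence": 0.7, "note": 0.3},
                    "note": {"silence": 0.2, "note": 0.8}})
LOG_EMIT = to_log({"silence": {"low": 0.9, "high": 0.1},
                   "note": {"low": 0.2, "high": 0.8}})

decoded = viterbi(["low", "high", "high", "low"],
                  STATES, LOG_START, LOG_TRANS, LOG_EMIT)
print(decoded)
```

Log-probabilities are used so that long observation sequences do not underflow; the systems described in the abstracts would use far richer state spaces and audio or timing features than this toy model.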