The identification and modelling of a percussion ‘language’

James Kippen & Bernard Bel

Computers and the Humanities (1989), 23, 3: 199-214

Abstract

In experimental research into percussion ‘languages’, an interactive computer system, the Bol Processor, has been developed by the authors to analyse the performances of expert musicians and to generate its own musical items, which were then assessed for quality and accuracy by the informants. The problem of transferring knowledge from a human expert to a machine in this context is the focus of this paper. A prototypical grammatical inferencer named QAVAID (Question Answer Validated Analytical Inference Device, an acronym also meaning ‘grammar’ in Arabic/Urdu) is described and its operation in a real experimental situation is demonstrated. The paper concludes with a discussion of the nature of the knowledge acquired and the scope and limitations of a cognitive-computational approach to music.

Excerpts of an AI review of this paper (Academia, June 2025)

Summary

This paper explores a novel approach to modeling North Indian tabla drumming as a “percussion language” by applying formal language theory, machine learning, and interactive generative/analytic computer methods. The authors discuss two systems, the Bol Processor and QAVAID, that each play a distinct role in analyzing and generating rhythmic patterns (termed “sentences”) under the guidance of expert informants. They examine how knowledge is incrementally acquired and formalized as a grammar, how alternative segmentations can be evaluated, and how probabilistic modeling may be employed to generate original musical sentences for expert evaluation. The work’s ethnomusicological perspective unites computational formalization with the real-world practice of tabla improvisation and teaching, raising broader questions about the nature of knowledge transfer between human expert, machine learner, and cultural context.

Contribution and Strengths

Interdisciplinary Framework

The paper positions itself at the intersection of musicology, cognitive science, computational linguistics, and ethnography. This breadth underscores the complexity of “music as language” and effectively highlights the idea that music may be formally scrutinized with methods akin to those in computer science.

Formal Language Techniques

By grounding the analysis in the Chomskian hierarchy (regular and context-free grammars) and referencing Gold’s concept of “identification in the limit,” the authors tie their ethnomusicological observations to well-established theoretical underpinnings. These connections help clarify why a systematic, incremental approach to grammar inference is suitable for modeling the improvisational components of North Indian tabla drumming.
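Gold's notion of "identification in the limit" can be illustrated with a deliberately tiny sketch. It assumes a hypothetical hypothesis class that is not taken from the paper: language number k contains every sentence of at most k bols. The learner sees positive examples one at a time and always guesses the smallest k consistent with what it has seen; once the longest sentence has appeared, the guess never changes again.

```python
def learner(examples_so_far):
    """Return the smallest bound k consistent with every example seen.

    Hypothesis k (an assumption made for this sketch) is the language of
    all sentences containing at most k bols.
    """
    return max(len(sentence.split()) for sentence in examples_so_far)


# A stream of positive examples, presented one by one (illustrative data).
stream = ["dha ge", "dha", "dha ge ti", "ti ge", "dha ti ge"]

# The learner's guess after each new example.
guesses = [learner(stream[: i + 1]) for i in range(len(stream))]
print(guesses)  # [2, 2, 3, 3, 3]
```

The guess stabilises at 3 after the third example. The learner is never told that it has converged, which is the sense in which identification happens only "in the limit".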

Attention to Vocabulary and Segmentation

The discussion on how the system learns segmentation and defines “words” in the drumming lexicon is illuminating. Though segmenting tabla phrases is not analogous to segmenting words in spoken languages, the authors show how incremental analysis can propose, refine, or discard potential lexical boundaries in a principled manner.
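One simple way to propose candidate "words" is to count how often each run of bols recurs across the example sentences. The sketch below is only an illustration of frequency-based segmentation under that assumption; the function name `ngram_counts` and the example sentences are invented here and do not reproduce QAVAID's actual procedure.

```python
from collections import Counter


def ngram_counts(sentences, n):
    """Count every run of n consecutive bols across all sentences."""
    counts = Counter()
    for sentence in sentences:
        bols = sentence.split()
        for i in range(len(bols) - n + 1):
            counts[tuple(bols[i : i + n])] += 1
    return counts


# Illustrative sentences built from bols named in the review.
examples = [
    "dha ti dha ge ti ge",
    "dha ti dha ge dha ge",
    "ti ge dha ti dha ge",
]

counts = ngram_counts(examples, 3)

# Runs that occur in every example are the strongest candidates for "words".
frequent = sorted(gram for gram, c in counts.items() if c == len(examples))
print(frequent)
```

Here the runs "dha ti dha" and "ti dha ge" each occur three times, so both would be put forward as possible lexical units. Choosing between overlapping candidates like these is where further evidence, such as the informant's judgement, becomes necessary.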

Interactive and Incremental Learning

A significant feature is the interactive model: the system generates output strings that are validated or rejected by the human informant, thereby triggering incremental adjustments to the grammar. This mimics student-teacher interactions and demonstrates a strong attempt to reflect authentic learning and teaching processes.
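The generate-validate-adjust cycle can be sketched as follows. Everything in this example is a simplifying assumption: the grammar is a hypothetical list of slots with alternative words, the `informant` function stands in for the human expert with an invented acceptance rule, and the adjustment step simply drops any alternative that never appears in an accepted sentence. It is not a description of how QAVAID revises its rules.

```python
from itertools import product

# Hypothetical over-general grammar: a sentence picks one word per slot.
SLOTS = [["dha ge", "ti ge"], ["ti ti", "dha ti"], ["dha"]]


def candidates(slots):
    """Enumerate every sentence the current hypothesis allows."""
    return [" ".join(words) for words in product(*slots)]


def informant(sentence):
    """Stand-in for the expert; this acceptance rule is invented."""
    return "ge ti ti" not in sentence


def prune(slots, oracle):
    """Keep a word only if some sentence that uses it is accepted."""
    pruned = []
    for i, slot in enumerate(slots):
        kept = []
        for word in slot:
            restricted = slots[:i] + [[word]] + slots[i + 1 :]
            if any(oracle(s) for s in candidates(restricted)):
                kept.append(word)
        pruned.append(kept)
    return pruned


# Machine outputs rejected by the informant serve as negative examples.
rejected = [s for s in candidates(SLOTS) if not informant(s)]
revised = prune(SLOTS, informant)
print(rejected)
print(revised)
```

After one round the alternative "ti ti" is removed from the second slot, because every sentence containing it was rejected, and the revised grammar generates only accepted sentences.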

Probabilistic Aspect

Introducing stochasticity in synthesis breaks from purely deterministic methods. It points to a more realistic reflection of the ways in which live performance might involve creative, non-deterministic choices, while maintaining constraints guided by the learned grammar.
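Stochastic synthesis from a grammar can be sketched by attaching a weight to each rule and sampling expansions in proportion to those weights. The toy grammar, its non-terminal names, and its weights below are all hypothetical and chosen only to show the mechanism.

```python
import random

# Hypothetical weighted grammar: each non-terminal maps to a list of
# (weight, expansion) pairs. Symbols without an entry are terminal bols.
GRAMMAR = {
    "S": [(3.0, ["A", "B"]), (1.0, ["A", "A", "B"])],
    "A": [(2.0, ["dha", "ge"]), (1.0, ["ti", "ge"])],
    "B": [(1.0, ["dha", "ti", "dha"])],
}


def generate(symbol, rng):
    """Expand a symbol into a list of bols, sampling rules by weight."""
    if symbol not in GRAMMAR:
        return [symbol]
    weights = [w for w, _ in GRAMMAR[symbol]]
    expansions = [e for _, e in GRAMMAR[symbol]]
    chosen = rng.choices(expansions, weights=weights, k=1)[0]
    bols = []
    for part in chosen:
        bols.extend(generate(part, rng))
    return bols


rng = random.Random(0)  # seeded so that a run can be repeated
sentence = generate("S", rng)
print(" ".join(sentence))
```

Every sampled sentence still obeys the grammar (here, five or seven bols ending in "dha ti dha"), while the weights determine how often each variant is offered to the informant.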

Methodological Observations

Data Representation

The authors clearly define the symbol inventory (bols like dha, ge, ti, etc.) and acknowledge the complexity of how these symbols relate to sonic events. By restricting itself to frequency-based segmentation and grammar inference within a “text presentation protocol”, the system keeps its analysis suitably rigorous.

User–System Dialogue

Illustrations of the QAVAID question–answer mechanism highlight practical aspects of grammar construction. This is valuable for explaining how the system backs up, modifies rules, or infers new chunks based on partial disagreements from the expert and how it tests repeated merges or segmentations for consistency.

Scalability Considerations

The experiments presented involve a limited number of examples. The authors note computational constraints and carefully frame how repeated merges, lexical expansions, and negative examples (machine outputs the user rejects) unfold in realistic time on a microcomputer. This transparency about performance considerations is commendable.

Comparison to Existing Tools

While the authors reference formal language theory, it could be helpful to situate the QAVAID approach more explicitly alongside other grammar-inference systems (or music cognition models) in terms of efficiency and success rates. This might provide additional context about how QAVAID’s tight-fit methodology differs from existing machine-learning strategies in music.

Suggestions for Future Work

Integration of Connectionist Approaches

A deeper investigation into how sub-symbolic learning algorithms (e.g., neural networks) might coexist with or complement a symbolic grammar-inference approach could shed light on whether deeper hierarchical or pattern-based musical structures can be discovered automatically.

Temporal and Metric Awareness

Incorporating real-time constraints, including an explicit model of cycle boundaries and tempo variations, might enable QAVAID or a successor system to handle performances that deviate subtly from rigorously measured durations.

Generative Evaluation

Extending the system to produce longer performance sequences and evaluating how coherent or context-appropriate they sound in extended improvisation might reveal new facets of pattern synergy that short examples do not expose.

Cross-Cultural Applicability

The strategies deployed here for tabla might prove adaptable to other deeply mnemonic or improvisatory musical traditions (e.g., West African drumming, Middle Eastern percussion). Investigating how the model generalizes across cultures could underscore the method’s versatility and reveal new limitations.

Conclusion

By merging formal language theory with ethnomusicological fieldwork and machine learning, the authors propose a powerful model for capturing core aspects of tabla improvisation. The framework encourages close human–computer collaboration through dynamic questioning and incremental grammar building. This approach not only advances a cognitive-computational perspective on music but also opens a pathway for further inquiries into cross-cultural applications, time-sensitive performance modeling, and creative composition within implicit musical grammars.
