Audio Recognition of Percussion Sounds via Machine Learning with an Automated Sound Generation System (3D Auto-Drum Machine)

We developed an automated system (3D Auto-Drum Machine) that generates and records percussion sounds from metallic surfaces (e.g., cymbals, aluminum sheets) under fully controlled striking conditions. The impact force and striking position are recorded alongside the sound, creating a database used to train machine learning models capable of recognizing the material and geometry of the object from its sound.
Keywords: 3D Auto-Drum Machine, audio recognition, machine learning, percussive instruments
Overview
This research aims to bridge the gap between the physical characteristics of musical instruments and the sound they produce. Traditional percussion instruments, such as cymbals, produce complex sounds that depend on material, shape, and striking technique. To study this relationship, we constructed a precision robotic system capable of striking a surface at predetermined points with controlled force. The system simultaneously records the emitted sound, the impact force, and the exact coordinates of the striking point.
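As a rough illustration of what one entry in such a database could look like, the sketch below pairs each recording with its striking position and impact force. The field names, units (millimetres, newtons), file names, grid spacing, and the 5.0 force value are all hypothetical placeholders, not details of the actual system.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class StrikeRecord:
    """One strike: what was hit, where, how hard, and the recorded audio file."""
    material: str    # e.g. "aluminum", "MS63", "B8"
    geometry: str    # e.g. "flat sheet", "cymbal"
    x_mm: float      # strike position (assumed unit: millimetres)
    y_mm: float
    force_n: float   # impact force (assumed unit: newtons)
    audio_path: str  # hypothetical path to the recorded sound


def strike_grid(x_points, y_points):
    """Return every (x, y) combination as a list of predetermined strike positions."""
    return list(product(x_points, y_points))


# Placeholder grid and force value, for illustration only.
positions = strike_grid([0.0, 50.0, 100.0], [0.0, 50.0])
records = [
    StrikeRecord("aluminum", "flat sheet", x, y, force_n=5.0,
                 audio_path=f"strike_{i:03d}.wav")
    for i, (x, y) in enumerate(positions)
]
```

Keeping position and force next to each audio file is what would later allow samples to be compared at the same striking position across materials.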
In this way, we created a database of audio samples from different materials (aluminum, MS63 alloy, B8 alloy) and different geometries (flat sheet, cymbal). We then used a pre-trained deep learning model (DistilHuBERT) to analyze the audio samples. The results showed that samples from different materials and geometries are separable in the model's feature space, while repeated measurements of the same strike were consistent with one another, confirming the reliability of the method. This capability paves the way for large databases that can be used to train models to predict the sound of new materials and designs, reducing the need for time-consuming and costly manufacturing.
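One simple way to put a number on "separable in the feature space" is a leave-one-out nearest-centroid check, sketched below. It assumes the embeddings have already been extracted (for example, one fixed-length vector per audio sample from the pre-trained model). The metric is an illustrative choice, not the analysis used in the publication, and the two-dimensional vectors are toy values rather than measured data.

```python
import math
from collections import defaultdict


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def loo_nearest_centroid_accuracy(embeddings, labels):
    """Leave-one-out accuracy of a nearest-centroid classifier.

    Each sample is held out, class centroids are computed from the remaining
    samples, and the held-out sample is assigned to the closest centroid
    (Euclidean distance).
    """
    correct = 0
    for i, (vec, label) in enumerate(zip(embeddings, labels)):
        by_class = defaultdict(list)
        for j, (other, other_label) in enumerate(zip(embeddings, labels)):
            if j != i:
                by_class[other_label].append(other)
        predicted = min(
            by_class,
            key=lambda c: math.dist(vec, centroid(by_class[c])),
        )
        correct += predicted == label
    return correct / len(embeddings)


# Toy embeddings standing in for real feature vectors (hypothetical values).
embeddings = [[0.0, 0.1], [0.1, 0.0], [0.05, 0.05],
              [1.0, 1.1], [1.1, 1.0], [1.05, 1.05]]
labels = ["aluminum"] * 3 + ["B8"] * 3
accuracy = loo_nearest_centroid_accuracy(embeddings, labels)
```

An accuracy close to 1.0 would indicate that the classes form well-separated groups. Tight clusters among repeated strikes would point to consistent measurements.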
Examples
Comparative sound spectra: A graph comparing the sound spectra of three different materials (aluminum, MS63, B8) at the same striking position, highlighting the differences due to material and geometry.
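A spectrum comparison of this kind can be sketched with a plain discrete Fourier transform. The example below uses a naive DFT for readability (an FFT routine would normally be used on real recordings) and two synthetic sine tones as stand-ins for recorded strikes. The sample rate, signal length, and tone frequencies are arbitrary assumptions.

```python
import cmath
import math


def magnitude_spectrum(samples, sample_rate):
    """Return (frequencies_hz, magnitudes) for the non-negative DFT bins."""
    n = len(samples)
    freqs, mags = [], []
    for k in range(n // 2 + 1):
        coeff = sum(
            s * cmath.exp(-2j * math.pi * k * t / n)
            for t, s in enumerate(samples)
        )
        freqs.append(k * sample_rate / n)
        mags.append(abs(coeff) / n)
    return freqs, mags


def peak_frequency(samples, sample_rate):
    """Frequency of the strongest non-DC bin in the magnitude spectrum."""
    freqs, mags = magnitude_spectrum(samples, sample_rate)
    k = max(range(1, len(mags)), key=mags.__getitem__)  # skip the DC bin
    return freqs[k]


def tone(freq_hz, sample_rate, n):
    """Synthetic sine tone used here in place of a recorded strike."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]


SAMPLE_RATE = 8000  # Hz, arbitrary for this sketch
N = 256             # bin width = 8000 / 256 = 31.25 Hz

peak_a = peak_frequency(tone(500.0, SAMPLE_RATE, N), SAMPLE_RATE)
peak_b = peak_frequency(tone(1000.0, SAMPLE_RATE, N), SAMPLE_RATE)
```

With real recordings, the same function would be applied to samples taken at the same striking position on each material, and the resulting spectra overlaid for comparison.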

Machine learning visualization: A t-SNE plot showing how the 96 audio samples cluster by material and geometry, demonstrating that the model can separate them.

Publications
Brezas, S., Skoulakis, A., Kaliakatsos-Papakostas, M., Sarantis-Karamesinis, A., Orphanos, Y., Tatarakis, M., Papadogiannis, N.A., Bakarezos, M., Kaselouris, E., Dimitriou, V. (2024). Audio Recognition of the Percussion Sounds Generated by a 3D Auto-Drum Machine System via Machine Learning. Electronics, 13(9), 1787. https://doi.org/10.3390/electronics13091787
Research Team
Vasileios Dimitriou, Professor
Maximos Kaliakatsos-Papakostas, Associate Professor
Evaggelos Kaselouris, Assistant Professor
Chrisoula Alexandraki, Associate Professor
Nektarios Papadogiannis, Professor
Makis Bakarezos, Professor
Giannis Orphanos, Laboratory Teaching Staff
Spyros Brezas, Postdoctoral Researcher, Acoustics Consultant
Despina Grigoriou, PhD Candidate
Michalis Starakis, PhD Candidate
Nikos Charalampidis, MSc Student
Lampros Kariotoglou, Undergraduate Student
