Hi everyone,
I'm working on a project on the automatic recognition of bats and birds from audio recordings in Italy, with a possible extension to other European countries. So far, I've experimented with well-known open-source tools like BatDetect2, Bat2Web, and BirdNET, but I’m wondering whether there are more recent solutions, perhaps leveraging newer architectures such as Vision Transformers applied to spectrograms.
Does anyone have experience with new open source models or frameworks that could be useful for this task? Any suggestions, recent approaches or alternatives to the ones I’ve mentioned would be greatly appreciated!
Thanks in advance!
7 February 2025 5:06pm
Hi Lorenzo,
I highly recommend the OpenSoundscape package (developed by the Kitzes Lab at the University of Pittsburgh): it has workflows for building your own CNNs, the documentation is really thorough, and the team are very responsive to inquiries. They also have a bioacoustics 'model zoo' that lists relevant models. The Perch model from Google would be good to look into as well.
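For context, CNN workflows and spectrogram-based transformers both start from the same input: a time-frequency representation of the recording. Below is a minimal, standard-library-only sketch of that step, not the preprocessing used by any of the tools named in this thread. The function name `spectrogram`, the frame and hop sizes, and the 256 kHz / 40 kHz test tone are all hypothetical values chosen for illustration.

```python
import cmath
import math


def spectrogram(samples, frame_len=64, hop=32):
    """Magnitude spectrogram: one list of frame_len // 2 + 1 bins per frame."""
    # Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            # Naive DFT for readability; a real pipeline would use an FFT library.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


# Hypothetical test signal: a pure 40 kHz tone sampled at 256 kHz (assumed values).
frame_len = 64
sample_rate = 256_000
tone_hz = 40_000
signal = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(512)]

spec = spectrogram(signal, frame_len=frame_len, hop=32)

# Each bin spans sample_rate / frame_len Hz, so the loudest bin gives the tone frequency.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / frame_len
print(len(spec), "frames x", len(spec[0]), "bins; peak at", peak_hz, "Hz")
```

The resulting frames-by-bins grid is what a CNN treats as an image, and what a Vision Transformer would cut into patches. Bat recordings need a sample rate high enough to cover ultrasonic calls, which is why the assumed rate here is much higher than typical bird-audio settings.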
Some recent papers I've seen that might also be worth checking out:
- Advanced montane bird monitoring using self-supervised learning and transformer on passive acoustic data
- A good horizon scan paper - The potential for AI to revolutionize conservation: a horizon scan
- Challenges and solutions for ecologists adopting AI
- And perhaps not directly related, but a new framework for deploying edge models onto recorders was just released - acoupi: An Open-Source Python Framework for Deploying Bioacoustic AI Models on Edge Devices
Hope that helps a bit!
Carly Batist