Hi all!
Background: I did bioacoustic sampling at 12 sites, roughly 18 days of recording at each site, over the course of a season (both birds and bats). In reviewing our species accumulation curves for avian species, and comparing our beaver pond sites vs the control sites, I can see a lot of instances of just one vocalization of a species captured over the course of the study. (We used BirdNET software for avian data).
I'm struggling to find literature talking about specific numbers of calls necessary to differentiate between birds using the habitat vs passing through (e.g., we have a loon call at a tiny stream with no pond or lake nearby; it was most likely moving through the area, not an actual resident of the study site).
Do any of you know of literature or venues where this has been discussed or thresholds proposed? Is anything >1 a reasonable threshold?
Thanks in advance for your time and replies!
-Cortney
7 March 2025 9:53pm
In my experience, it's better to look at timing as well as the number of calls. If you have recordings of a species over a number of days, that increases the likelihood that it is using the site as habitat. If it's super active for a short period but then never again, then it's likely passing through. I had a similar issue at a site recently: random marsh birds visiting the woodland for short periods. It would be an interesting piece of research to put together.
8 March 2025 8:25pm
Or the isolated calls are false alarms / misclassifications.
Tessa Rhinehart
University of Pittsburgh
11 March 2025 2:44pm
Hey Cortney, this is such a good question!
First off - I'd be very careful about interpreting the number of BirdNET detections of a species as a vocalization index for that species. For example, let's say a recorder captures a lot of Species A, which sounds like Species B, but Species B isn't present at that site. You will often find that BirdNET (or any other sound ID model) reports many detections for both Species A and Species B, making it seem like both species are present. So, a high number of detections of a species at a site isn't necessarily a reliable indicator that the species is present at that site.
One alternative to consider would be to verify species presence at each site and each day, then use the number of days that species was present as a threshold for habitat use. What threshold to use depends on the species - we've used three consecutive days of presence for some songbirds, for instance.
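To make the consecutive-days idea concrete, here is a minimal sketch in Python. It assumes you already have a list of manually verified presence dates for one species at one site; the function names, the example dates, and the default of three days are illustrative only, and the right threshold will depend on the species.

```python
from datetime import date, timedelta


def longest_consecutive_run(days):
    """Return the length of the longest run of consecutive calendar days."""
    unique_days = sorted(set(days))  # ignore duplicate dates
    best = 0
    current = 0
    previous = None
    for day in unique_days:
        if previous is not None and day - previous == timedelta(days=1):
            current += 1  # run continues
        else:
            current = 1  # run starts over
        best = max(best, current)
        previous = day
    return best


def meets_habitat_threshold(days, min_consecutive_days=3):
    """True if the species was verified on enough consecutive days.

    min_consecutive_days=3 is only an example value, not a general rule.
    """
    return longest_consecutive_run(days) >= min_consecutive_days


# Hypothetical verified-presence dates for one species at one site.
verified_days = [date(2024, 6, 1), date(2024, 6, 2), date(2024, 6, 3), date(2024, 6, 10)]
single_visit = [date(2024, 6, 5)]

print(meets_habitat_threshold(verified_days))
print(meets_habitat_threshold(single_visit))
```

With these made-up dates, the first species clears a three-day threshold and the single-day visitor (like the loon example) does not.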
You could also remove dates when the species is expected solely to be migratory. However, this can get tricky... For example, I've observed that Cerulean Warblers that breed at a particular site will arrive there a week or two before migrants pass through at nearby non-breeding migration hotspots.
Doing a day-by-day analysis of presence enables you to more easily check machine learning detections than if you were checking for the number of detections for each species. You can verify the presence of the species at each site and day by listening to the highest-scoring clips for each day for each species.
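As a rough sketch of how that clip selection could work, the Python below groups detections by site, day, and species and keeps the highest-scoring few in each group. The dictionary keys (`site`, `day`, `species`, `score`, `clip`) and the example values are hypothetical, not BirdNET's actual output format, so you would need to map your own detection table onto them.

```python
import heapq
from collections import defaultdict


def top_clips_per_group(detections, n=5):
    """Keep the n highest-scoring detections per (site, day, species)."""
    groups = defaultdict(list)
    for det in detections:
        key = (det["site"], det["day"], det["species"])
        groups[key].append(det)
    return {
        key: heapq.nlargest(n, clips, key=lambda d: d["score"])
        for key, clips in groups.items()
    }


def max_clips_to_review(n_sites, n_days, n_per_day):
    """Upper bound on clips to listen to for a single species."""
    return n_sites * n_days * n_per_day


# Hypothetical detections: seven clips of one species at one site on one day.
detections = [
    {"site": "pond_01", "day": "2024-06-01", "species": "Common Loon",
     "score": score, "clip": f"clip_{i}.wav"}
    for i, score in enumerate([0.35, 0.91, 0.42, 0.88, 0.15, 0.67, 0.73])
]

to_review = top_clips_per_group(detections, n=5)
for key, clips in to_review.items():
    print(key, [c["clip"] for c in clips])

# Review workload for 12 sites and 18 recording days, 5 clips per day.
print(max_clips_to_review(12, 18, 5))
```

If any of the top clips for a site and day is a true positive, you can mark the species as present for that day and move on without checking the rest.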
If you have 18 days of recording * 12 sites and you listen to the top 5 highest-scoring clips per species and day, that's a maximum of 1,080 five-second clips per species (18 * 12 * 5). In my experience, reviewing that number of clips takes me only a few hours using a Jupyter notebook like this one:
bioacoustics-cookbook/tdl-notebook at main · kitzeslab/bioacoustics-cookbook · GitHub (Jupyter Notebooks for interactive review and annotation of audio files)
Ryan Smith