I'm looking for more information about the specific needs in conservation biology and other fields that use acoustic monitoring for research. I've reviewed various discussions and surveys, such as the WILDLABS State of Conservation Technology white paper. However, distinguishing between existing technologies and genuine needs from an engineering perspective remains challenging, primarily because I am not intimately familiar with certain elements of the research itself.
The real question is about "data richness" requirements with audio. I am trying to determine whether long-term audio recordings are actually important for research in most cases, or whether they are just the current status quo for detecting a small number of animals because of the available hardware. For instance, when conducting surveys with devices like the AudioMoth or other continuous-monitoring devices, is the focus typically on analyzing a small number of animal calls (for a single animal / a few animals), or is it crucial to record the entire range of audio (and all potential animals)?
In other words, would a system capable of identifying 10-30 distinct animal sounds using AI, which then records a timestamp, geodata, and the animal type/call detected into a CSV file, provide a useful depth of information, or would it be too incomplete to provide value in many/some/most use cases?
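For concreteness, here is a minimal sketch of the kind of detection-only record I have in mind. All field names, labels, and values are hypothetical, not from any real device:

```python
import csv
from datetime import datetime, timezone

# Hypothetical detection-only log: one CSV row per on-device AI detection,
# no audio stored. Species labels and coordinates are purely illustrative.
with open("detections.csv", "a", newline="") as f:
    writer = csv.writer(f)
    writer.writerow([
        datetime.now(timezone.utc).isoformat(),  # timestamp
        -2.1463, 117.4280,                       # geodata (lat, lon)
        "species_07",                            # classifier label
        "territorial_call",                      # call type
        0.91,                                    # model confidence
    ])
```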
Another way of framing this might be: what is the comparative volume of research that simply seeks to determine space, time, and detection data for a small number of animals, versus research that detects a large range of animals or needs to understand context from sounds, which must come from full recordings?
I can imagine scenarios like detecting a periodically present endangered species, where you would not want to miss a detection; monitoring very biodiverse areas, where capturing the depth and variety of a plethora of wildlife would require full audio recordings; situations where you want to hear vocal tone and expression; etc. However, being an engineer, I am not quite sure how often one versus another is actually implemented in the field.
I ask these questions because I believe a different paradigm for how detection is done may allow engineers to create significantly cheaper devices with much longer operating timescales, ultimately providing significantly more total detections at less time and cost to researchers by using emerging edge AI tech. But I have not seen it done yet.
Any knowledge would be greatly appreciated!
6 July 2024 9:17am
I also wanted to piggyback on this conversation. What is the best way to start if you want to build a library of acoustic communication from semi-aquatic species? Any advice is welcome, thanks!
9 July 2024 5:58pm
Hi @TenX_Lab !
There are a number of different use cases for passive acoustic monitoring (PAM), each of which may require a particular dataset. We always strongly recommend developing a specific research question first and then basing the sampling design and analysis plan on that. In a very general sense, PAM analyses can be broken into two primary pipelines: species detection and soundscape analysis (i.e., via acoustic indices). The latter does require recording the entire range of audio that you're interested in and then calculating acoustic indices over particular timespans and per site to compare (rough sketch below).
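For a rough sense of what the index side looks like, here's a minimal numpy/scipy sketch of one widely used index, the Acoustic Complexity Index. It is simplified (no temporal sub-windowing, or 'clumping', as in the full method) and the parameters are illustrative; in practice you'd likely reach for a maintained library such as scikit-maad:

```python
import numpy as np
from scipy.signal import spectrogram

# Simplified sketch of the Acoustic Complexity Index (ACI; Pieretti et al. 2011).
def acoustic_complexity_index(samples, fs, nperseg=512):
    _, _, Sxx = spectrogram(samples, fs=fs, nperseg=nperseg)  # freq x time
    diffs = np.abs(np.diff(Sxx, axis=1)).sum(axis=1)  # intensity change per bin
    totals = Sxx.sum(axis=1) + 1e-12                  # avoid division by zero
    return float((diffs / totals).sum())              # summed over all bins

# Indices like this are computed per site and per timespan (e.g. per file)
# and compared relatively, rather than interpreted as absolute values.
```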
For species detection, yes, there are projects that just focus on a subset of species (endangered/endemic/keystone, etc.). But there is also the consideration that PAM data is an audio archive of sorts, so even if someone is only focused on a particular species or set of taxa, that same dataset may be useful in the future for other taxa or use cases. For example, my dissertation project focused on just one lemur species in Madagascar, but I did continuous recording and now have collaborators using the same dataset to study birds. It's common for people to record on a schedule like 1 minute in every 5 as a trade-off between extending battery life and still getting comprehensive coverage throughout the day/sampling period (back-of-envelope numbers below).
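As a back-of-envelope illustration of that trade-off (all figures are hypothetical, and battery life doesn't scale perfectly linearly, since sleep current still draws power):

```python
# Rough duty-cycle arithmetic for an ARU; every number here is illustrative.
wav_mb_per_min = 5.5           # ~48 kHz, 16-bit mono WAV
sd_card_mb = 128 * 1024        # 128 GB card
battery_days_continuous = 12   # hypothetical figure for continuous recording

duty_cycle = 1 / 5             # record 1 minute in every 5
mb_per_day = wav_mb_per_min * 60 * 24 * duty_cycle
storage_days = sd_card_mb / mb_per_day
battery_days = battery_days_continuous / duty_cycle

print(f"storage: ~{storage_days:.0f} days, battery: ~{battery_days:.0f} days")
# -> storage: ~83 days, battery: ~60 days
```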
I'm not entirely sure what you mean by 'vocal tone and expressions'. Typically, PAM data isn't used for acoustic analysis in the sense of vocal communication, as PAM studies usually focus just on the loud/long calls of a species (those most likely to be picked up on ARUs, i.e., autonomous recording units) and not the entire repertoire. The majority of calls you get in PAM recordings are from animals far away, so there's not much you can do to analyze acoustic structure and the like. You also don't know the distance from the calling animal to the recorder, so you can't say whether two calls are actually different or just appear different as an artifact of varying distances.
Happy to chat more or hop on a call if you'd like! You can DM me here or feel free to email: cbatist@gradcenter.cuny.edu.
-Carly

Vanesa Reyes
WILDLABS
Wildlife Conservation Society (WCS)
10 July 2024 2:28pm
Hi @TenX_Lab !
Great to see this discussion. From my point of view, the depth of data collection varies a lot based on specific goals. There are advantages to both approaches discussed:
AI-driven systems that focus on detecting specific animal sounds and efficiently logging that data may be particularly important for real-time monitoring and mitigation in places where immediate action is needed.
Conversely, as Carly highlighted, data collected more broadly can be reused for future studies, which is a great advantage. Furthermore, long-term audio recordings are crucial to understanding wider ecological contexts, which is essential for effective conservation efforts. Recording a full range of audio helps reveal interactions between species, changes due to environmental factors, and even impacts from human activities like shipping or construction. While detecting specific species is helpful, extensive recordings can provide crucial insights into an ecosystem's health or uncover unexpected behavioral changes or presences that focused approaches might overlook.
Perhaps a mixed approach could be useful: smart ARUs capable of both targeted detection and broad recording, with the flexibility to adjust based on specific project needs or unexpected findings (a sketch follows). However, as Walter mentioned, analyzing the cost-effectiveness of these devices is important.
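To make that mixed approach concrete, here is a hypothetical sketch of such a control loop; the capture and classifier functions are stand-in stubs, not any real device's API:

```python
import random

# Stubs standing in for real audio capture and a real edge model.
def record_chunk(seconds):
    return [random.random() for _ in range(seconds * 1000)]

def run_classifier(audio):
    return random.choice(["species_a", "background"]), random.random()

TARGETS = {"species_a", "species_b"}
BROAD_EVERY_N = 5  # archive every 5th chunk regardless of detections

for i in range(100):  # one pass per 60-s chunk in a real deployment
    audio = record_chunk(60)
    label, conf = run_classifier(audio)
    if label in TARGETS and conf > 0.8:
        print(f"detection: {label} ({conf:.2f})")  # would log CSV + snippet
    elif i % BROAD_EVERY_N == 0:
        print("archiving soundscape chunk")        # would save full audio
```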

Jamie Macaulay
Sea Mammal Research Unit Univ' St Andrews
19 July 2024 10:43am
There are lots of really excellent answers here. We have often thought about this for species monitoring, so here are a few points, perhaps more from the marine bioacoustics perspective.
- The deployment of underwater recording devices at sea is often (not always) quite expensive, so the aim for a device is often to record for as long as possible and, as @carlybatist said, continuous recordings provide context and an archive you can always go back to. We only consider using on-board detectors when it means a device can run for longer periods, thereby saving deployment cost (e.g., see F-PODs from Chelonia).
- On that note, we would be VERY wary of running a complex AI algorithm on a device, because any complex algorithm has the potential to fall over when it encounters situations outside of what it has been trained on, and the ocean is a highly dynamic and unpredictable environment for which we have comparatively very little training data. We therefore prefer lower-performance but predictable algorithms. For example, say we are interested in echolocation clicks from porpoises. We could run a sophisticated porpoise detection algorithm on board, but we could also just run a very simple energy detector which saves little waveform snippets (see the sketch after these points). The energy detector might have a huge false positive rate, but it's low power and still reduces data by around 99% compared to continuous recordings, versus say 99.9% using an AI approach. A 99% data reduction is more than sufficient to save storage space and power, and we can always run the more sophisticated algorithms in post-processing.
- There is a small subset of use cases for edge AI computing that I can think of, primarily when you have a limited data connection and are working in a real-time context. This could, for example, be a buoy with a satellite connection in a remote area. In these cases it makes sense to have real-time systems, but they are a comparatively small subset of bioacoustics work. In situations where you have 5G or similar data rates, we would still advocate simple, high-false-positive-rate detectors which are then processed by an algorithm on shore, allowing a human-in-the-loop approach.
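For illustration, here is a minimal numpy sketch of the kind of simple energy detector mentioned in the second point above; the window size and threshold factor are illustrative, not a production detector:

```python
import numpy as np

# Flag short windows whose RMS energy exceeds an adaptive threshold,
# and keep only those waveform snippets.
def energy_detector(samples, fs, win_s=0.01, factor=4.0):
    win = int(win_s * fs)
    n = len(samples) // win
    frames = samples[: n * win].reshape(n, win)
    rms = np.sqrt((frames ** 2).mean(axis=1))
    threshold = factor * np.median(rms)              # crude adaptive noise floor
    hits = np.flatnonzero(rms > threshold)
    return [(i * win, (i + 1) * win) for i in hits]  # snippet sample ranges

# Storing only flagged snippets yields the ~99% data reduction mentioned above,
# while the sophisticated classification happens later, in post-processing.
```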
In summary, there is a use for edge AI but, in the marine context, I would suggest it is quite niche. What we do need is cheaper, more open-source hardware: low-power, low-noise, multi-channel recording devices with the availability and support of something like the AudioMoth are sorely needed. The current price for a long-term marine recording device is around 5,000-6,000 USD.
Herdhanu Jayanto
KONKLUSI (Kolaborasi Inklusi Konservasi - Yayasan)