I'm looking for more information about the specific needs in conservation biology and other fields that use acoustic monitoring for research. I've reviewed various discussions and surveys, such as the WILDLABS State of Conservation Technology white paper. However, distinguishing between existing technologies and genuine needs from an engineering perspective remains challenging, primarily because I am not intimately familiar with certain elements of the research itself.
The real question is about "data richness" requirements with audio. I am trying to determine whether long-term audio recordings are actually important for research in most cases, or whether they are just the current status quo for detecting a small number of animals because of the available hardware. For instance, when conducting surveys with devices like the AudioMoth or other continuous-monitoring devices, is the focus typically on analyzing a small number of animal calls (for a single animal / a few animals), or is it crucial to record the entire range of audio (and all potential animals)?
In other words, would a system capable of identifying 10-30 distinct animal sounds using AI, which then records a timestamp, geodata, and the animal type/call detected into a CSV file, provide a useful depth of information, or would it be too incomplete to provide value in many/some/most use cases?
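For concreteness, here is a minimal sketch of the kind of detection-only record I have in mind. All field names, labels, and values are hypothetical, not from any real device:

```python
import csv
from datetime import datetime, timezone

# Hypothetical detection-only log: one CSV row per on-device AI detection,
# no audio stored. Species labels and coordinates are purely illustrative.
with open("detections.csv", "a", newline="") as f:
    writer = csv.writer(f)
    writer.writerow([
        datetime.now(timezone.utc).isoformat(),  # timestamp
        -2.1463, 117.4280,                       # geodata (lat, lon)
        "species_07",                            # classifier label
        "territorial_call",                      # call type
        0.91,                                    # model confidence
    ])
```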
Another way of framing this might be: what is the comparative volume of research that simply seeks to determine space, time, and detection data for a small number of animals, versus research that detects a large range of animals or needs to understand context from sounds, which must come from full recordings?
I can imagine scenarios like detecting a periodically present endangered species, where you would not want to miss a detection; monitoring very biodiverse areas, where capturing the depth and variety of a plethora of wildlife would require full audio recordings; situations where you want to hear vocal tone and expression; etc. However, being an engineer, I am not quite sure how often one versus another is actually implemented in the field.
I ask these questions because I believe a different paradigm for how detection is done may allow engineers to create significantly cheaper devices with much longer operating timescales, ultimately providing significantly more total detections at less time and cost to researchers by using emerging edge AI tech. But I have not seen it done yet.
Any knowledge would be greatly appreciated!
6 July 2024 9:17am
I also wanted to piggyback on this conversation. What is the best way to start if you want to build a library of acoustic communication from semi-aquatic species? Any advice is welcome, thanks!
9 July 2024 5:58pm
Hi @TenX_Lab !
There are a number of different use cases for passive acoustic monitoring (PAM), each of which may require a particular dataset. We always strongly recommend developing a specific research question first and then basing the sampling design and analysis plan on that. In a very general sense, PAM analyses can be broken into two primary pipelines: species detection and soundscape analysis (i.e., via acoustic indices). The latter does require recording the entire range of audio that you're interested in and then calculating acoustic indices over particular timespans and per site to compare (rough sketch below).
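For a rough sense of what the index side looks like, here's a minimal numpy/scipy sketch of one widely used index, the Acoustic Complexity Index. It is simplified (no temporal sub-windowing, or 'clumping', as in the full method) and the parameters are illustrative; in practice you'd likely reach for a maintained library such as scikit-maad:

```python
import numpy as np
from scipy.signal import spectrogram

# Simplified sketch of the Acoustic Complexity Index (ACI; Pieretti et al. 2011).
def acoustic_complexity_index(samples, fs, nperseg=512):
    _, _, Sxx = spectrogram(samples, fs=fs, nperseg=nperseg)  # freq x time
    diffs = np.abs(np.diff(Sxx, axis=1)).sum(axis=1)  # intensity change per bin
    totals = Sxx.sum(axis=1) + 1e-12                  # avoid division by zero
    return float((diffs / totals).sum())              # summed over all bins

# Indices like this are computed per site and per timespan (e.g. per file)
# and compared relatively, rather than interpreted as absolute values.
```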
For species detection, yes, there are projects that just focus on a subset of species (endangered/endemic/keystone, etc.). But there is also the consideration that PAM data is an audio archive of sorts, so even if someone is only focused on a particular species or set of taxa, that same dataset may be useful in the future for other taxa or use cases. For example, my dissertation project focused on just one lemur species in Madagascar, but I did continuous recording and now have collaborators using the same dataset to study birds. It's common for people to record on a schedule like 1 minute in every 5 as a trade-off between extending battery life and still getting comprehensive coverage throughout the day/sampling period (back-of-envelope numbers below).
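As a back-of-envelope illustration of that trade-off (all figures are hypothetical, and battery life doesn't scale perfectly linearly, since sleep current still draws power):

```python
# Rough duty-cycle arithmetic for an ARU; every number here is illustrative.
wav_mb_per_min = 5.5           # ~48 kHz, 16-bit mono WAV
sd_card_mb = 128 * 1024        # 128 GB card
battery_days_continuous = 12   # hypothetical figure for continuous recording

duty_cycle = 1 / 5             # record 1 minute in every 5
mb_per_day = wav_mb_per_min * 60 * 24 * duty_cycle
storage_days = sd_card_mb / mb_per_day
battery_days = battery_days_continuous / duty_cycle

print(f"storage: ~{storage_days:.0f} days, battery: ~{battery_days:.0f} days")
# -> storage: ~83 days, battery: ~60 days
```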
I'm not entirely sure what you mean by 'vocal tone and expressions'. Typically, PAM data isn't used for acoustic analysis in the sense of vocal communication, as PAM studies usually focus just on the loud/long calls of a species (those most likely to be picked up on ARUs, i.e., autonomous recording units) and not the entire repertoire. The majority of calls you get in PAM recordings are from animals far away, so there's not much you can do to analyze acoustic structure and the like. You also don't know the distance from the calling animal to the recorder, so you can't say whether two calls are actually different or just appear different as an artifact of varying distances.
Happy to chat more or hop on a call if you'd like! You can DM me here or feel free to email: cbatist@gradcenter.cuny.edu.
-Carly

Vanesa Reyes
WILDLABS
Wildlife Conservation Society (WCS)
10 July 2024 2:28pm
Hi @TenX_Lab !
Great to see this discussion. From my point of view, the depth of data collection varies a lot based on specific goals. There are advantages to both approaches discussed:
AI-driven systems that focus on detecting specific animal sounds and efficiently logging that data may be particularly important for real-time monitoring and mitigation in places where immediate action is needed.
Conversely, as Carly highlighted, data collected more broadly can be reused for future studies, which is a great advantage. Furthermore, long-term audio recordings are crucial to understanding wider ecological contexts, which is essential for effective conservation efforts. Recording a full range of audio helps reveal interactions between species, changes due to environmental factors, and even impacts from human activities like shipping or construction. While detecting specific species is helpful, extensive recordings can provide crucial insights into an ecosystem's health or uncover unexpected behavioral changes or presences that focused approaches might overlook.
Perhaps a mixed approach could be useful: smart ARUs capable of both targeted detection and broad recording, with the flexibility to adjust based on specific project needs or unexpected findings (a sketch follows). However, as Walter mentioned, analyzing the cost-effectiveness of these devices is important.
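To make that mixed approach concrete, here is a hypothetical sketch of such a control loop; the capture and classifier functions are stand-in stubs, not any real device's API:

```python
import random

# Stubs standing in for real audio capture and a real edge model.
def record_chunk(seconds):
    return [random.random() for _ in range(seconds * 1000)]

def run_classifier(audio):
    return random.choice(["species_a", "background"]), random.random()

TARGETS = {"species_a", "species_b"}
BROAD_EVERY_N = 5  # archive every 5th chunk regardless of detections

for i in range(100):  # one pass per 60-s chunk in a real deployment
    audio = record_chunk(60)
    label, conf = run_classifier(audio)
    if label in TARGETS and conf > 0.8:
        print(f"detection: {label} ({conf:.2f})")  # would log CSV + snippet
    elif i % BROAD_EVERY_N == 0:
        print("archiving soundscape chunk")        # would save full audio
```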

Jamie Macaulay
Sea Mammal Research Unit Univ' St Andrews
19 July 2024 10:43am
There are lots of really excellent answers here. We have often thought about this for species monitoring, so here are a few points, perhaps more from the marine bioacoustics perspective.
- The deployment of underwater recording devices at sea is often (not always) quite expensive, so the aim for a device is often to record for as long as possible and, as @carlybatist said, continuous recordings provide context and an archive you can always go back to. We only consider using on-board detectors when it means a device can run for longer periods, thereby saving deployment cost (e.g., see F-PODs from Chelonia).
- On that note, we would be VERY wary of running a complex AI algorithm on a device, because any complex algorithm has the potential to fall over when it encounters situations outside of what it has been trained on, and the ocean is a highly dynamic and unpredictable environment for which we have comparatively very little training data. We therefore prefer lower-performance but predictable algorithms. For example, say we are interested in echolocation clicks from porpoises. We could run a sophisticated porpoise detection algorithm on board, but we could also just run a very simple energy detector which saves little waveform snippets (see the sketch after these points). The energy detector might have a huge false positive rate, but it's low power and still reduces data by around 99% compared to continuous recordings, versus say 99.9% using an AI approach. A 99% data reduction is more than sufficient to save storage space and power, and we can always run the more sophisticated algorithms in post-processing.
- There is a small subset of use cases for edge AI computing that I can think of, primarily when you have a limited data connection and are working in a real-time context. This could, for example, be a buoy with a satellite connection in a remote area. In these cases it makes sense to have real-time systems, but they are a comparatively small subset of bioacoustics work. In situations where you have 5G or similar data rates, we would still advocate simple, high-false-positive-rate detectors which are then processed by an algorithm on shore, allowing a human-in-the-loop approach.
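For illustration, here is a minimal numpy sketch of the kind of simple energy detector mentioned in the second point above; the window size and threshold factor are illustrative, not a production detector:

```python
import numpy as np

# Flag short windows whose RMS energy exceeds an adaptive threshold,
# and keep only those waveform snippets.
def energy_detector(samples, fs, win_s=0.01, factor=4.0):
    win = int(win_s * fs)
    n = len(samples) // win
    frames = samples[: n * win].reshape(n, win)
    rms = np.sqrt((frames ** 2).mean(axis=1))
    threshold = factor * np.median(rms)              # crude adaptive noise floor
    hits = np.flatnonzero(rms > threshold)
    return [(i * win, (i + 1) * win) for i in hits]  # snippet sample ranges

# Storing only flagged snippets yields the ~99% data reduction mentioned above,
# while the sophisticated classification happens later, in post-processing.
```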
In summary, there is a use for edge AI but, in the marine context, I would suggest it is quite niche. What we do need is cheaper, more open-source hardware: low-power, low-noise, multi-channel recording devices with the availability and support of something like the AudioMoth are sorely needed. The current price for a long-term marine recording device is around 5,000-6,000 USD.
Herdhanu Jayanto
KONKLUSI (Kolaborasi Inklusi Konservasi - Yayasan)