discussion / Acoustics / 21 July 2025

A technical and conceptual curiosity... Could generative AI help us simulate or decode animal communication?

Hi everyone,

I recently watched a talk by Aza Raskin where he discusses the idea of using generative models to explore communication with whales. While the conversation wasn’t deeply technical, it sparked a line of thought I’ve been quietly exploring: can we use generative AI to better understand, replicate, and eventually "translate" the communicative structures in animal vocalisations?
I work primarily in conservation and ecology, and I've applied several predictive models (CNNs, LSTMs, and some transformer-based architectures with transfer learning), but I've never trained a generative model myself. Even so, I've been sketching a conceptual pipeline that could link bioacoustics and behavioural ecology.
The idea is roughly this: start with a large, unlabelled dataset of animal vocalisations (e.g. bird songs, whale calls, primate vocalisations), then use unsupervised learning (e.g. embeddings + UMAP + HDBSCAN, or even plain k-means) to group vocalisations by structure or spectral similarity.
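To make that clustering step concrete, here is a minimal sketch of what I'm imagining (the file paths are placeholders, and the log-mel mean/std embedding is just the simplest thing that could work; a learned embedding such as BirdNET or VGGish features would likely separate call types better):

```python
# Minimal sketch: cluster vocalisation clips by spectral similarity.
# Assumes a directory of short WAV clips; the feature choice is
# illustrative, not prescriptive.
import glob
import numpy as np
import librosa   # pip install librosa
import umap      # pip install umap-learn
import hdbscan   # pip install hdbscan

def embed(path, sr=22050, n_mels=64):
    """Crude fixed-length embedding: mean + std of each log-mel band."""
    y, _ = librosa.load(path, sr=sr)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    logmel = librosa.power_to_db(mel)
    return np.concatenate([logmel.mean(axis=1), logmel.std(axis=1)])

files = sorted(glob.glob("clips/*.wav"))  # hypothetical dataset
X = np.stack([embed(f) for f in files])

# Reduce to a low-dimensional manifold, then density-cluster.
Z = umap.UMAP(n_neighbors=15, min_dist=0.1).fit_transform(X)
labels = hdbscan.HDBSCAN(min_cluster_size=10).fit_predict(Z)
print(f"{labels.max() + 1} clusters, {np.sum(labels == -1)} noise points")
```

HDBSCAN marks outliers as -1 rather than forcing them into a cluster, which seems useful here since a lot of field recordings will be noise or overlapping calls.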
Next, based on field knowledge or ethological observations, manually label some clusters with plausible communicative functions (e.g. alarm, contact, courtship), but only where the assignment genuinely makes sense. Then use these labelled clusters to fine-tune a generative model (SpecGAN, AudioLDM, or even an autoregressive model like WaveNet or VALL-E) to create synthetic sounds conditioned on function.
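I haven't built this part, but conceptually the conditioning interface could look something like the toy sketch below. The model is a stand-in, not SpecGAN or AudioLDM, and the reconstruction loss is only there to make the example self-contained; a real pipeline would use an adversarial or diffusion objective:

```python
# Toy sketch of class-conditional generation: a function label
# (alarm / contact / courtship) is embedded and concatenated with
# a latent noise vector to condition the generated spectrogram.
import torch
import torch.nn as nn

N_FUNCTIONS = 3          # assumed labels: alarm, contact, courtship
N_MELS, N_FRAMES = 64, 128

class ToyConditionalGenerator(nn.Module):
    def __init__(self, latent_dim=32):
        super().__init__()
        self.label_emb = nn.Embedding(N_FUNCTIONS, latent_dim)
        self.net = nn.Sequential(
            nn.Linear(latent_dim * 2, 512), nn.ReLU(),
            nn.Linear(512, N_MELS * N_FRAMES),
        )

    def forward(self, z, labels):
        h = torch.cat([z, self.label_emb(labels)], dim=-1)
        return self.net(h).view(-1, N_MELS, N_FRAMES)

gen = ToyConditionalGenerator()
opt = torch.optim.Adam(gen.parameters(), lr=1e-4)

# One reconstruction-style update on a (spectrogram, label) batch.
spec = torch.randn(8, N_MELS, N_FRAMES)     # placeholder cluster data
labels = torch.randint(0, N_FUNCTIONS, (8,))
z = torch.randn(8, 32)
loss = nn.functional.mse_loss(gen(z, labels), spec)
opt.zero_grad(); loss.backward(); opt.step()
```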
Finally, explore whether this can help us simulate, or even engage in, meaningful communicative loops, perhaps as a tool for playback experiments or for probing animal perception in a controlled field or ex situ setting.
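For the playback side, the experimental loop itself could be as simple as this sketch (the stimulus file is hypothetical, and field hardware, calibration, and permits are a separate conversation entirely):

```python
# Hedged sketch of a playback trial loop: play a synthesised call,
# record a fixed response window, and save it for later scoring.
import time
import sounddevice as sd   # pip install sounddevice
import soundfile as sf     # pip install soundfile

stimulus, sr = sf.read("synthetic_contact_call.wav")  # hypothetical file
RESPONSE_SECONDS = 10
N_TRIALS = 5

for trial in range(N_TRIALS):
    sd.play(stimulus, sr)
    sd.wait()  # block until playback finishes
    response = sd.rec(int(RESPONSE_SECONDS * sr), samplerate=sr, channels=1)
    sd.wait()  # block until recording finishes
    sf.write(f"response_trial_{trial}.wav", response, sr)
    time.sleep(60)  # inter-trial interval to limit habituation
```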
This is still a very early-stage idea, more of a sketch than a plan, but I'm curious whether anyone in this community has tried something similar. Not necessarily for whales; any species or sound system would be relevant.




Hi, Jorge,

There was a milestone study a while ago that used AI to show that elephants use individually specific calls, effectively names, to address one another. That could give you a starting point and some researchers to reach out to. Animal translation is a great idea worth pursuing.

I think you will also appreciate what is going on at the Interspecies Internet.

Good luck!

 

Hi Jorge, 

 

I think you'll find this research interesting: https://blog.google/technology/ai/dolphingemma/

Google's researchers did essentially that. They trained an LLM-style model on dolphin vocalizations to predict continuations of a call sequence, much like the autoregressive models you mentioned, WaveNet and VALL-E.

I think they plan to test it in the field this summer to see whether it produces any interesting interactions.

Looking forward to seeing what they'll find :)

Also, here are two more cool organizations working on decoding animal communication with AI:

https://www.projectceti.org/

https://www.earthspecies.org/

This is a really fascinating concept. I’ve been thinking about similar overlaps between AI and animal communication, especially for conservation applications. Definitely interested in seeing where this kind of work goes.

This is such a compelling direction, especially the idea of linking unsupervised vocalisation clustering to generative models for controlled playback. I haven’t seen much done with SpecGAN or AudioLDM in this space yet, but the potential is huge. Definitely curious how the field might adopt this for species beyond whales. Following this thread closely!