Group

Data management and processing tools / Feed

Conservation tech work doesn't stop after data is collected in the field. Equally important to success is navigating data management and processing tools. For the many community members who deal with enormous datasets, this group will be an invaluable resource for trading advice, discussing workflows and tools, and sharing what works for you.

discussion

The 100KB Challenge!

If you could send/receive 100KB of data from anywhere on the planet via satellite, what would you send? I work for a company called Ground Control; we design, build and manufacture...

11 3
Dan

Nice one - what kind of thing would you use this for? 

Dan

~500mA peak current; it has a similar power profile to the current RockBLOCK product, in that it needs lots of juice for a small period of time (to undertake the transmission). We include onboard circuitry to help smooth this over. I'll be able to share more details on this once the product is officially launched!

 

Hi Dan, 

Not right now, but I can envision many uses. A key problem in RS is data streams for validation and training of ML models; it's really not yet a solved problem. Any kind of system that you deploy and "forget" while it collects and streams data is a good opportunity.

 

If you want, we can have a talk so you can tell me about what you developed and I'll see if it fits future projects.

 

All the best

 

See full post
discussion

Camera Trap Data Visualization Open Question

Hi there, I would like to get some feedback and insight into how practitioners manage and visualize their camera trap data. We realized that there already exist many web-based...

6 0

Hey Ed! 

Great to see you here and thanks a lot for your thorough answer.
We will be checking out Trapper for sure - cc @Jeremy_! A standardized data exchange format like Camtrap DP makes a lot of sense, and we will keep it in mind as we build the first prototypes.
 

Our main requirements are the following:

  • Integrate with the camtrap ecosystem (via standardized data formats)
  • Make it easy to run for non-technical users (most likely an Electron application that works across OSes).
  • Make it useful for exploring camtrap data and generating reports.

 

In the first prototyping stage, it is useful for us to keep it lean while keeping in mind the interface (data exchange format) so that we can move fast.


Regards,
Arthur
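For anyone new to Camtrap DP, here is a minimal sketch of what consuming such a package could look like, assuming the usual layout (datapackage.json alongside deployments.csv, media.csv and observations.csv) and a scientificName column in the observations table; column names in a real export may differ.

```python
# Minimal sketch, not a reference implementation: count observations per
# species in a Camtrap DP package. Assumes observations.csv exists in the
# package directory and has a 'scientificName' column.
from pathlib import Path

import pandas as pd


def observations_per_species(package_dir: str) -> pd.DataFrame:
    obs = pd.read_csv(Path(package_dir) / "observations.csv")
    return (
        obs.groupby("scientificName")
        .size()
        .rename("n_observations")
        .sort_values(ascending=False)
        .reset_index()
    )


if __name__ == "__main__":
    # 'my_camtrap_dp_export' is a placeholder path used for illustration.
    print(observations_per_species("my_camtrap_dp_export"))
```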

Quick question on this topic, to take advantage of those who already know a lot about it. Once you have extracted all your camera data and are going through the AI object detection phase that identifies the animal types, what file format, containing all of the time + location + label data for the photos, do most people consider the most useful? I imagine it's whatever format the most expressive visualization software around uses. Is this correct?

A quick look at the Trapper format suggested to me that it holds metadata from the camera traps, used to drive the AI matching phase. But it was a quick look; maybe it's something else? Is the Trapper format also for holding the labelled results? (I might actually be asking the same question as the person who started this thread, just in different words.)
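For concreteness, detections from tools like MegaDetector are commonly exchanged as a JSON file listing, per image, categories, confidences and normalised bounding boxes, which downstream tools then join with deployment coordinates and timestamps. The snippet below is a simplified, illustrative record written as a Python dict; field names are approximate and the exact schema varies between versions.

```python
# Illustrative only: a simplified MegaDetector-style batch output record.
# All values are made up; exact schema details vary between tool versions.
example_output = {
    "detection_categories": {"1": "animal", "2": "person", "3": "vehicle"},
    "images": [
        {
            "file": "deployment_01/IMG_0042.JPG",
            "detections": [
                {
                    "category": "1",  # i.e. "animal"
                    "conf": 0.92,     # detector confidence
                    # normalised [x_min, y_min, width, height]
                    "bbox": [0.41, 0.37, 0.18, 0.22],
                }
            ],
        }
    ],
}
```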

Another question. Right now pretty much all camera traps trigger on either PIR sensors or small AI models. Small AI models tend to be limited to accurately detecting and recognising animal types at close distances, where the animal is very large in the frame, and I have question marks over whether small models avoid making a lot of classification errors even in those circumstances (I expect that they do make errors, and that these are simply sorted out back at the office, so to speak). PIR sensors would typically only see animals within, say, 6-10 m; maybe an elephant could be detected a bit further, and small animals only even closer.

But what about when camera traps can reliably see and recognise objects across a whole field, perhaps hundreds of meters?

Then, for a start, you don't in principle have to deploy as many traps. But I would expect you would need a different approach to how you report and visualize this, as the coordinates of the trap itself are not going to give you much information. We would potentially have much more accurate and rich biodiversity information.

Maybe it's even possible to determine, to a greater degree of accuracy, where several different animals in the same camera trap image are spatially located, by knowing the 3D layout of what the camera can see and the position and size of each animal.

I expect that current camera trap data formats may fall short of expressing that information in a sufficiently useful way, given that more information is available in principle and that multiple coordinates per species may need to be registered for each image.

I'm likely going to be confronted with this soon, as the systems I build use state-of-the-art, large-parameter-count models that can recognise species over much greater distances. In a recent discussion here I showed detection of a polar bear at a distance of 130-150 m.

Right now I would say it's unknown how much more information about species we will be able to gather with this approach, since images have not been triggered in this manner until now. Maybe it's far more than we would expect? We have no idea right now.
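As a rough illustration of the geometric idea above (locating animals from what the camera can see), a single-camera distance estimate can be sketched with a pinhole-camera approximation. The focal length in pixels and the assumed animal height below are made-up example values; a real deployment would need calibration and terrain information.

```python
# Back-of-envelope sketch (not from the original post): estimate how far an
# animal is from the camera using a pinhole-camera approximation,
# distance ~= focal_length_px * true_height_m / apparent_height_px.
def estimate_distance_m(focal_length_px: float,
                        assumed_animal_height_m: float,
                        bbox_height_px: float) -> float:
    return focal_length_px * assumed_animal_height_m / bbox_height_px


# Illustrative numbers: a ~1.6 m tall bear appearing 25 px tall through a lens
# with a 2800 px focal length would be roughly 180 m away.
print(round(estimate_distance_m(2800, 1.6, 25)))  # -> 179
```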

See full post
discussion

Machine learning for bird pollination syndromes

I am a PhD student working on bird pollination syndromes in South Africa, looking specifically at urbanization's effect on sunbirds and sugarbirds. By drawing from a large...

2 2

Hi @craigg, my background is in machine learning and deep neural networks, and I'm also actively involved in developing global geospatial ecological models, which I believe could be very useful for your PhD studies.

First of all, to your direct challenges: I think there will be many different approaches, which could serve your interests to varying degrees.

As one idea that came up, I think it will be possible in the coming months, through a collaboration, to "fine-tune" a general purpose "foundation model" for ecology that I'm developing with University of Florida and Stanford University researchers.  More here.

You may also find the 1+ million plant trait inferences at Ecodash.ai, searchable by native plant habitats, to be useful. A collaborator at Stanford is actually from South Africa, and I was just about to send him this, e.g. https://ecodash.ai/geo/za/06/johannesburg

I'm happy to chat about this, just reach out!  I think there could also be a big publication in Nature (or something nice) by mid-2025, with dozens of researchers demonstrating a large number of applications of the general AI techniques I linked to above.

See full post
discussion

Free/open-source app for field data collection

Hi all, I know something similar was asked a year ago but I'd like some advice on free applications or software for collecting data in the field on an Android device (for eventual...

11 2

Thanks! Essentially, field technicians, students, researchers, etc. go out into the field, find one of our study groups, and from early in the morning until evening record the behaviour of individual animals at short intervals (e.g., individual traits like age-sex class and ID, what the animal is doing, how many conspecifics it has within a certain radius, and what kind of food it is eating if it happens to be foraging). Right now things work well in our system, but we are using an app that is somewhat expensive, so we want to move towards open source.

See full post
Link

Accessing the Global Register of Introduced and Invasive Species (GRIIS)

Signposts to different ways to access the new/updated GRIIS (through GBIF or open-access country lists) - as part of the IAS Toolkit for GBF target 6

1
discussion

Which LLMs are most valuable for coding/debugging?

Hello! I'm curious to get folks' impressions on the most useful LLMs for helping with data analysis/debugging code? I've been using ChatGPT, which is at times helpful and at times...

4 2

When it comes to coding and debugging, several large language models (LLMs) stand out for their value. Here are a few of the most valuable LLMs for these tasks:

1. OpenAI's Codex: This model is specifically trained for programming tasks, making it excellent for generating code snippets, suggesting improvements, and even debugging existing code. It powers tools like GitHub Copilot, which developers find immensely helpful.

2. Google's PaLM: Known for its versatility, PaLM excels in understanding complex queries, making it suitable for coding-related tasks as well. Its ability to generate and refine code snippets is particularly useful for developers.

3. Meta's LLaMA: This model is designed to be adaptable and can be fine-tuned for specific coding tasks. Its open-source nature allows developers to customize it according to their needs, making it a flexible option for coding and debugging.

4. Mistral: Another emerging model that shows promise in various tasks, including programming. It’s being recognized for its capabilities in generating and understanding code.

These LLMs are gaining traction not just for their coding capabilities but also for their potential to streamline the debugging process, saving developers time and effort. If you want to dive deeper into the features and strengths of these models, you can check out the full article here: Best Open Source Large Language Models (LLMs)

See full post
discussion

Video evidence for the evaluation of behavioral state predictions

Hi all, glad to share two of our contributions to the current e-obs newsletter in the context of the evaluation of behavioral state predictions and the mapping of...

6 0

Currently, the main focus is visual footage, as we don't render audio data in the same way as we do for acceleration (also, the very different frequencies can be hard to show sensibly side by side).


But in this sense, yes, the new module features 'quick adjust knobs' for time shifts: you can roll over a timestamp and use a combination of shift/control and the mouse wheel to adjust the offset of the video by 1/10/60 seconds, or simply enter the target timestamp manually down to the millisecond level. This work can also be saved in a custom mapping file to continue synchronisation work later on.

 

No, not yet. The player we attached does support slower/faster replay up to a certain precision, but I'm not sure that this will be sufficiently precise for the kind of offsets we are talking about. Adding an option on the frontend to adjust this is quite easy, but understanding the impact of this on internal timestamp handling will add a level of complexity that we need to experiment with first. 

As you said, for a reliable estimate on this kind of drift we need at least 2 distinct synchronized markers with sufficient distance to each other, e.g. a precise start timestamp and some recognizable point event later on.

I fully agree that providing an easy-to-use solution makes perfect sense. We'll definitely look into this.
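To make the two-marker idea concrete, here is a small illustrative sketch (not Firetail code) of estimating a linear drift from two synchronised markers and mapping sensor timestamps onto the video timeline.

```python
# Illustrative sketch (not part of Firetail): estimate linear clock drift from
# two synchronised markers and map sensor timestamps to the video timeline.
def make_drift_corrector(sensor_t0: float, video_t0: float,
                         sensor_t1: float, video_t1: float):
    """Assume the offset changes linearly between the two markers."""
    rate = (video_t1 - video_t0) / (sensor_t1 - sensor_t0)
    return lambda sensor_t: video_t0 + (sensor_t - sensor_t0) * rate


# Example: the video clock runs ~0.1% slow relative to the sensor clock.
to_video = make_drift_corrector(0.0, 2.0, 3600.0, 3598.4)
print(to_video(1800.0))  # sensor time 1800 s lands at ~1800.2 s in the video
```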

See full post
discussion

Firetail 13 - now available

Thanks to our wonderful user community and a lot of feedback, shared sample data and fruitful discussions, I am glad to announce that Firetail 13 is now available, featuring a...

10 2
See full post
discussion

Detecting Thrips and Larger Insects Together

Hello everyone, I'm reaching out to discuss a challenge we're tackling here in Brazil related to pest monitoring in agriculture. Thrips (Frankliniella spp., Thrips spp...

6 1
Hi Kim, the yellow sticky paper I have today is around 10 cm by 30 cm. I took a picture of the whole paper with a really good cellphone and the resolution was not good enough (3072 × 4080 in Ultra HDR). This gave me about 10 pixels/mm, but I could not get a precise enough YOLO model for the thrips at this resolution... I will play around with the cellphone a bit more and see if 2 or 4 pictures are enough. We even made a support for the cellphone so it is always at the same distance (though if I could avoid this in the future for practical reasons, that would be fine too). With the digital microscope we got over 50 pixels/mm and so got quite a good model for identifying them (but it is time consuming); sometimes dust also shows up, and with the phone camera you can't differentiate thrips from dust ;)))) Let's see if I can edit my post to include some cellphone images.

Yeah, I would expect that you might need higher resolution if the critters are very small. It still might be just a lens choice, but I'm not up on this amount of lens difference, so I don't know how hard it would be.

So, I updated the text a bit with images cropped at 100% zoom :) We are already happy with the time reductions we got, but... we would like to get at least a 90% time reduction instead of 70% :))) We know that with a very expensive, high-powered camera we could probably do it, so one approach we are thinking of is just taking a closer macro picture with a cellphone of, let's say, 1/3 or 1/4 of the sticky paper and using that data instead of everything... or taking 2-3 pictures (but we don't want to waste time stitching the images together).
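To put some numbers on the resolution trade-off discussed in this thread, here is a back-of-envelope helper; the figures are illustrative and assume the long image side spans the stated length, so real values will depend on framing and margins.

```python
# Back-of-envelope check of the resolution trade-off (illustrative only;
# assumes the 4080 px side of the phone image spans the stated length).
def pixels_per_mm(pixels_long_side: int, length_mm: float) -> float:
    return pixels_long_side / length_mm


# Whole 30 cm trap across the long side of the phone image:
print(round(pixels_per_mm(4080, 300), 1))  # ~13.6 px/mm
# Photographing only a third of the trap (10 cm) with the same sensor:
print(round(pixels_per_mm(4080, 100), 1))  # ~40.8 px/mm, approaching the microscope's ~50
```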

See full post
discussion

Mirror images - to annotate or not?

We are pushing hard to start training our custom model for the PolarBearWatchdog! soon. This includes lots of dataset curation and annotation work. A question that has come up is...

18 0

I made a few rotation experiments with MD5b.

Here is the original image (1152x2048):

When saving this as a copy in Photoshop, the confidence on the mirror image changes slightly:

and when just cropping to a 1152x1152 square it changes quite a bit:

The mirror image confidence drops below my chosen threshold of 0.2 but the non-mirrored image now gets a confidence boost.

Something must be going on with overall scaling under the hood in MD as the targets here have the exact same number of pixels. 

I tried resizing to 640x640:


This bumped the mirror image confidence back over 0.2... but lowered the non-mirrored confidence a bit... huh!?

My original hypothesis was that the confidence could be somewhat swapped just by turning the image upside down (180 degree rotation):

Here is the 1152x1152 crop rotated 180 degrees:

The mirror part now got a higher confidence, but it is interpreted as a sub-part of a larger organism. The non-mirrored polar bear had a drop in confidence.

So my hypothesis was somewhat confirmed...

This leads me to believe that MD is not trained on many upside-down animals...

 

It seems like we should include some rotations in our image augmentations, as the real world can be seen a bit tilted - as this cropped corner view from our fisheye at the zoo shows.
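For what it's worth, a minimal sketch of that kind of flip/rotation augmentation in a PyTorch/torchvision pipeline might look like the following; the parameters are illustrative and would need tuning for the actual dataset.

```python
# Minimal sketch of flip/rotation augmentation with torchvision; parameters
# are illustrative, not tuned for any particular dataset.
from torchvision import transforms

train_augmentations = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),  # mirror left/right
    transforms.RandomRotation(degrees=15),   # small tilts, like a fisheye corner view
    transforms.ColorJitter(brightness=0.2),  # optional lighting variation
    transforms.ToTensor(),
])
```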

See full post
discussion

Conservation Data Strategist?

Hello everyone – long time lurker, first time poster... I'm the founder of a recently funded tech startup working on an exciting venture that bridges consumer technology and...

9 1

Great resources being shared! Darwin Core is a commonly used bio-data standard as well.  

For bioacoustic data, there are some metadata standards (GUANO is used by pretty much all the terrestrial ARU manufacturers). Some use Tethys as well.

Recordings are typically captured as .WAV files, but many store them as .flac (a lossless compression format) to save space.

For ethics, acoustic data platforms with a public-facing component (e.g., Arbimon, WildTrax) will usually mask presence/absence geographical data for species listed on the IUCN Red List, CITES, etc., so that you're not giving away the location of a species to someone who would use it to hunt them, for example.
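As a toy illustration of that masking idea (not the actual logic used by Arbimon or WildTrax), one simple approach is to coarsen coordinates for records of species flagged as sensitive before publishing; the species set and rounding precision below are placeholders.

```python
# Toy illustration of location masking for sensitive species; the species set
# and rounding precision are placeholders, not a real policy.
SENSITIVE_SPECIES = {"Diceros bicornis", "Manis temminckii"}  # example entries only


def mask_location(species: str, lat: float, lon: float, decimals: int = 1):
    """Round coordinates to roughly 10 km precision for sensitive species."""
    if species in SENSITIVE_SPECIES:
        return round(lat, decimals), round(lon, decimals)
    return lat, lon


print(mask_location("Diceros bicornis", -23.98765, 31.55432))  # (-24.0, 31.6)
```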

 

Hello, I am experienced in conservation data strategy. If you want to have a conversation you can reach me at SustainNorth@gmail.com.

See full post