Need advice on data for an app that recommends plants.

This is going to show how little I know, but I'm doing my best! 

Basically I could use help understanding what I should do from a data standpoint in order to enable a developer to build out an application. The focus is on native plants. 

The data will come from different sources, such as the USDA hardiness zone map, the USDA PLANTS Database and the NWF Native Plant Finder, and potentially other sources as well. Many but not all have an API.

The end game is to have an app where users can input information about sun/moisture, add preferences like height, perennial/annual and maybe additional data like which pollinators it supports or color of plant (depending on how realistic this is for an MVP).

Considering I'm on a budget and self-funding this, what is the easiest way to go about it? Do I first find the developer and then ask them to advise on what type of pre-work needs to be done on the data so that it can be integrated into the app?

I do not have a tech background, and I'm trying to make this all as efficient as possible. 

Native plants are low-hanging fruit for supporting biodiversity. Lawn culture is the enemy, and I'm hoping to create something that gets people more connected with the plants they see around them and simplifies the process of selecting native plants.

Any and all advice is much appreciated. Thank you!  




Frank van der Most
@Frank_van_der_Most  | He, him
RubberBootsData
Field data app developer, with an interest in funding and finance

It's a really good question, Colleen!

Ideally, I would work with two developers, or at least one developer and one other party who knows what developing an app like this involves. This lowers the risk that the developer answers your question too much to their own advantage. So this is one reason why it is a good question, and maybe this is why you ask it. However, if you really trust your developer not to take advantage, then go with just them.

It's also a good question because there is no easiest way to go about this if you're on a budget. You mention that you do not have a tech background, so here comes some explaining. If I misunderstood, then please skip and continue at "Back to 'the easiest way to go about this'"

There are at least two things you should be really aware of when it comes to software development.

The first is that once the basic data structure is defined and the software is built around it, it is extremely costly to change the data structure. It's like deciding, after the car has been built, that the engine should go in the back of the car instead of the front.

The second thing is that the basic data structure depends on (among other things, but I'd say these are the two most important factors) the complexity of what needs to be achieved and the speed at which it needs to be done.

The difficulty is that the required complexity and speed may change over time, which brings one back to the car and engine situation. Changes in complexity and speed requirements may be the result of many things, one of which is success. You get far more clients than anticipated, so the system needs to be scaled up. In addition, with more users come more feature requests (this can work both ways: new features result in more users, and more users may result in more feature requests).

There is no real solution to this problem (well, except not growing beyond the point that the first design can handle). When it comes to scaling up, a vendor may claim that their database backbone scales up easily. Maybe so, but it may come at a price, and they may also underestimate your and their own future needs. When it comes to changes in complexity, additional features can, in the beginning, probably be added on without changing the basic data structure. Maybe an additional row of seats at the back of the car, a trailer hook, bigger lamps, a roof rack, a trailer, suitcases on the rack. At some point the car will need a new and bigger engine to carry all those add-ons and keep the same speed.

Here is a prediction: the more you stress cheap and efficient at the beginning, the bigger these problems will be later on. You might object that once the business is successful, there will be more money to invest in scaling up and redesigning. Obviously yes, but in terms of the car metaphor, you may find that you want the car to keep running with all its added-on features while the engine is replaced and moved to the back. It may be possible, but it may also be beyond your patience and the increased income.

 

Back to 'the easiest way to go about this':

Invest a little effort to find out not only what the minimum requirements for the MVP are, but also what else you or your clients may want in the future. The developer should then have these future requirements (and future upscaling) in mind when they start developing for the minimum ones. This means developing a somewhat more generic data structure than is necessary for the MVP. This will cost somewhat more at the beginning but should save a lot later on. I'm writing 'should', not 'will', on purpose. It's a balancing act, because taking too much into account carries the risk of over-engineering for a future that may not happen, or that develops differently than expected. Like I said, there is no easiest way out.
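To make 'a somewhat more generic data structure' a bit more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not a recommendation for a specific design: the field names, the `traits` dictionary and the `matches` function are all hypothetical, and the two plant records are placeholders rather than real data. The idea is that the fields the MVP filters on are fixed, while anything that may come later (pollinators, color) goes into an open-ended part that can grow without restructuring.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Plant:
    # Core fields the MVP filters on. All names here are hypothetical.
    name: str
    sun: set[str]        # e.g. {"full_sun", "part_shade"}
    moisture: set[str]   # e.g. {"dry", "medium"}
    max_height_cm: int
    life_cycle: str      # "perennial" or "annual"
    # Open-ended traits for features that may come later (pollinators,
    # color, ...). A new trait name needs no change to the structure.
    traits: dict[str, set[str]] = field(default_factory=dict)


def matches(plant, sun, moisture, max_height_cm=None, life_cycle=None,
            **wanted_traits):
    """Return True if the plant fits the site conditions and preferences."""
    if sun not in plant.sun or moisture not in plant.moisture:
        return False
    if max_height_cm is not None and plant.max_height_cm > max_height_cm:
        return False
    if life_cycle is not None and plant.life_cycle != life_cycle:
        return False
    # Optional traits: a plant without the trait simply does not match.
    for trait_name, wanted_value in wanted_traits.items():
        if wanted_value not in plant.traits.get(trait_name, set()):
            return False
    return True


# Placeholder records for illustration only, not real plant data.
plants = [
    Plant("Example plant A", {"full_sun"}, {"dry", "medium"}, 90, "perennial",
          {"pollinators": {"bees", "butterflies"}, "color": {"orange"}}),
    Plant("Example plant B", {"part_shade", "shade"}, {"moist"}, 40,
          "perennial"),
]

# An MVP-style query: site conditions plus a height preference.
mvp_result = [p.name for p in plants
              if matches(p, "full_sun", "dry", max_height_cm=100)]

# A later feature: the same function, now with a pollinator preference.
later_result = [p.name for p in plants
                if matches(p, "full_sun", "dry", pollinators="butterflies")]
```

Whether an open-ended part like this is worth it is exactly the balancing act described above: it makes adding traits cheap, but it is also a bit more work to build and to keep fast than fixed fields only.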

 

A few more detailed comments

If there is no API for a source, try to do without that source at the beginning, if possible. APIs are made with some long-term stability in mind. Websites and web pages are not necessarily, or at least less so. Web scraping routines built on them will require more monitoring and maintenance.

The developer may not be aware of the pre-work that needs to be done if that pre-work depends on biological knowledge: the knowledge needed to transform the data from your sources into data that allows easy (and fast) calculation of results for the users.
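As an illustration of what such pre-work could look like, here is a minimal sketch in Python. Everything in it is an assumption on my part: the record layout, the synonym table and the function names are hypothetical, and the records are placeholders. The point is that different sources may describe the same thing in different words, and deciding which words mean the same (is 'sun' the same as 'full sun'?) takes plant knowledge, not programming knowledge. Once that is settled, the data can be reorganised so that a lookup for the user is fast.

```python
from collections import defaultdict

# Hypothetical synonym table. Deciding what belongs here is the part
# that needs biological knowledge; the entries below are assumptions.
SUN_SYNONYMS = {
    "full sun": "full_sun",
    "sun": "full_sun",
    "part shade": "part_shade",
    "partial shade": "part_shade",
    "shade": "shade",
    "full shade": "shade",
}


def normalize_sun(raw):
    """Map a source's wording to one canonical code, or fail loudly."""
    key = raw.strip().lower()
    if key not in SUN_SYNONYMS:
        raise ValueError(f"Unknown sun description: {raw!r}")
    return SUN_SYNONYMS[key]


def build_sun_index(records):
    """Group plant names by canonical sun code for fast lookup."""
    index = defaultdict(list)
    for record in records:
        for raw in record["sun"]:
            index[normalize_sun(raw)].append(record["name"])
    return dict(index)


# Placeholder records as they might arrive from different sources.
raw_records = [
    {"name": "Example plant A", "sun": ["Full Sun"]},
    {"name": "Example plant B", "sun": ["partial shade", "Shade "]},
    {"name": "Example plant C", "sun": ["sun", "Part Shade"]},
]

sun_index = build_sun_index(raw_records)
```

Failing loudly on an unknown description is a deliberate choice in this sketch: it surfaces the cases where someone with plant knowledge has to decide, instead of silently dropping them.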