The power of public data in aquaculture prediction
How we built a powerful data set to fuel prediction models for Norwegian salmon farmers—like our new treatment recommendation tool on the app.
“Public data holds no value, because anyone can access it. We need to collect more precise data before we can analyze it and get any value.”
I’ve heard this from countless investors, customers, prospects and industry peers in aquaculture.
Yes, publicly available data can be retrieved by anyone, and yes, it is sometimes less accurate or precise. But there is much more to the story.
When the proper skills and expertise are applied, raw public data can be turned into an extremely valuable and reliable tool to drive important decisions on the farm. Depending on the problem at hand, it can actually provide better insights than a farmer’s own production data.
Before starting Manolin with Tony Chen, my career spanned activities that ranged from detecting anomalous events on satellites to scraping, mining, and analyzing data via natural language processing and spatial analysis to identify unique trends. Tony worked at the US Food and Drug Administration building systems to monitor and assess medicine for unforeseen infectious diseases. After finding a mutual passion for seafood production, we took this combined experience to an industry that we saw as data-rich but insight-poor: Norwegian aquaculture.
While Norway has an incredible amount of aquaculture data, it holds very little value on its own. The value is in mining the data for unique information that is hidden in a sandstorm of variables, impossible to see with the human eye.
For example, one of the biggest problems facing our Norwegian salmon farming customers is sea lice outbreaks. The government imposes strict regulations on sea lice levels at aquaculture farms. To get their levels down, farmers must use costly chemical or mechanical treatments that can stress the fish. A mechanical treatment requires a wellboat, for instance, which can cost up to US$150,000 per day.
What we commonly hear is, “How do I know when to treat my farm for sea lice, and which treatment to use?”
Normally, a farmer would tackle this by checking various sources and relying on some intuition:
- Check which of my farms has the highest sea lice level.
- Check if nearby farms have high levels.
- Estimate the infection pressure by guessing the number and size of these farms’ fish.
- Check if any nearby farms have recently done treatments for lice.
- Assess the risk to my own farm based on this information.
- Check on the fish health for my at-risk sites.
- Find the cages with the worst lice levels and those with the healthiest fish (treating healthier fish leads to the least stress and mortality).
- Once a cage is chosen, look back at how effective previous treatments have been here.
- If the typical options don’t fit the current fish profile (they may be too small or particularly vulnerable), then choose a different treatment.
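The first few steps of this checklist can be sketched in code. The example below is a minimal illustration, not how any farmer or Manolin actually scores risk: the field names (`lice_per_fish`, `est_fish_count`), the 30 km radius and the linear distance weighting are all assumptions made for the sake of the example.

```python
from math import radians, sin, cos, asin, sqrt


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def infection_pressure(site, neighbors, radius_km=30.0):
    """Sum nearby farms' lice loads, weighted down linearly with distance.

    A neighbor's load is its lice per fish times its estimated fish count.
    The radius and the linear weight are illustrative assumptions.
    """
    total = 0.0
    for n in neighbors:
        d = haversine_km(site["lat"], site["lon"], n["lat"], n["lon"])
        if d < radius_km:
            total += n["lice_per_fish"] * n["est_fish_count"] * (1 - d / radius_km)
    return total


def rank_sites(own_sites, neighbors):
    """Order own sites by lice level first, then by pressure from neighbors."""
    def score(site):
        return (site["lice_per_fish"], infection_pressure(site, neighbors))
    return sorted(own_sites, key=score, reverse=True)
```

Sorting on a tuple mirrors the order of the checklist: a farm's own lice level decides first, and the pressure from neighbors only breaks ties. A real scoring rule would likely blend the two.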
Here, a farmer’s private data can say a lot about treatments: mortality rate, the effect on sea lice, dosage, equipment used and environmental variables. But coming to the above decision requires a lot of time and energy.
With some analysis, we can see how this farmer’s own treatments perform on average over time. This is valuable and will save time sifting through different sources and making estimations. But unless the farmer calls neighbors to share data, the insights are limited to the farmer’s own production.
Nearby farms’ production data isn’t publicly available, but their lice levels, treatment times and maximum allowable biomass are. With basic prediction models, we fill the gaps to see average lice levels across all farms. Then, we use publicly available weekly values to calculate the production season and its length, how many events (lice treatments, environmental anomalies, etc.) have happened, and more, all of which feed into our models to predict each farm’s fish size, quantity and mortality.
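To make the gap-filling and season-length ideas concrete, here is a small sketch. It uses plain linear interpolation as a stand-in for the prediction models mentioned above, and it assumes a hypothetical weekly "has fish" flag per farm. Both function names and the data shapes are illustrative, not Manolin’s actual pipeline.

```python
def fill_gaps(weekly):
    """Linearly interpolate None entries between known weekly values.

    Leading and trailing gaps stay None, since there is nothing to
    interpolate between.
    """
    filled = list(weekly)
    known = [i for i, v in enumerate(weekly) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (weekly[right] - weekly[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = weekly[left] + step * (i - left)
    return filled


def production_cycles(has_fish):
    """Return (start_week_index, length_in_weeks) for each run of stocked weeks."""
    cycles, start = [], None
    for i, stocked in enumerate(has_fish):
        if stocked and start is None:
            start = i  # a new cycle begins
        elif not stocked and start is not None:
            cycles.append((start, i - start))  # the cycle ended last week
            start = None
    if start is not None:
        cycles.append((start, len(has_fish) - start))  # cycle still running
    return cycles
```

For example, `fill_gaps([1.0, None, None, 4.0])` fills the two missing weeks with `2.0` and `3.0`, and `production_cycles` turns a list of weekly flags into the start and length of each cycle, which can then be used as model features.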
Further data scraping shows information like wellboat port calls, their unique call sign, and the equipment used. Combining GPS data with some of our algorithms, we determine boat movements from cage to cage to estimate how many cages were actually treated.
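One simple way to turn a GPS track into a count of treated cages is to look for long stops next to known cage positions. The sketch below assumes a time-sorted list of position fixes and a dictionary of cage coordinates. The 50 m radius and 30 minute minimum dwell time are made-up thresholds, and this is an illustration of the idea rather than the algorithm we run in production.

```python
from math import radians, sin, cos, asin, sqrt


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in metres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371000.0 * asin(sqrt(a))


def cages_visited(track, cages, radius_m=50.0, min_dwell_s=1800):
    """Return ids of cages the boat stayed beside for at least min_dwell_s.

    track: list of (unix_seconds, lat, lon) fixes, sorted by time.
    cages: non-empty dict mapping cage id -> (lat, lon).
    """
    visits = []  # each entry: [cage_id or None, first_fix_time, last_fix_time]
    for t, lat, lon in track:
        # Find the nearest cage to this fix and its distance.
        cage_id, dist = min(
            ((cid, haversine_m(lat, lon, clat, clon)) for cid, (clat, clon) in cages.items()),
            key=lambda pair: pair[1],
        )
        if dist > radius_m:
            cage_id = None  # boat is in transit, not beside any cage
        if visits and visits[-1][0] == cage_id:
            visits[-1][2] = t  # extend the current visit
        else:
            visits.append([cage_id, t, t])  # start a new visit
    treated = []
    for cage_id, start, end in visits:
        if cage_id is not None and end - start >= min_dwell_s and cage_id not in treated:
            treated.append(cage_id)
    return treated
```

The number of cages treated is then `len(cages_visited(track, cages))`. Short passes near a cage are ignored because they fall under the dwell threshold.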
By manipulating, formatting and mining more and more of these factors, Manolin has built a robust data set on the disease levels and treatment events of salmon farms across Norway. When combined with a farmer’s private data, this fuels a powerful treatment mortality model that can answer questions such as, “What would my mortality rate be if I treated my farm right now?”
We also use benchmarks from other farmers’ anonymized data to fine-tune this even more. When a farmer sees risk of sea lice, the model can instantly determine how particular treatments will perform at that time, given the current conditions, so they can make their own educated decision.
This is not only far more precise and accurate; it also saves time. A farmer typically has anywhere from a few hours to a week to make a treatment decision when something changes on the farm. Conditions can change quickly in water, and farmers are often managing many cages. When the wrong decision could mean millions of dollars lost, there is no time to lose.
The Norwegian salmon industry is already collecting the data necessary for accurate and early disease prediction. With the right digital tools, farmers can make high-stakes decisions more quickly and easily than ever—starting right now.
Interested in exploring the industry insights? Create a free account at www.manolinaqua.com with no data commitment necessary.