Idea Four
Since I have been working in agriculture for the past several years, the examples come from that domain. However, the approach described here applies to any field and any dataset.
Imagine you have several datasets covering a certain period. The sampling frequency is not critical here, because time series can (and should) be normalized using basic mathematics: Newton interpolation polynomials, Gaussian curves, and other standard tools help here. Store the data in the format {id, date, value}. Once the dates are normalized, all series are brought to a common form and become ready for comparison.
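The normalization step can be sketched like this. The records and the use of plain linear interpolation are simplifications for the demo (Newton or Gaussian interpolation could be swapped in), and the series values are invented:

```python
import numpy as np

# Two hypothetical series sampled at different frequencies, stored as
# {id, date, value} records (dates as day offsets for brevity).
series_a = [{"id": "milk", "date": d, "value": v}
            for d, v in [(0, 20.0), (7, 22.0), (14, 21.0)]]
series_b = [{"id": "temp", "date": d, "value": v}
            for d, v in [(0, 10.0), (3, 12.0), (9, 15.0), (14, 13.0)]]

def normalize(series, common_dates):
    """Interpolate a series onto a shared date grid (linear here;
    a fancier interpolation scheme could replace np.interp)."""
    dates = [r["date"] for r in series]
    values = [r["value"] for r in series]
    return np.interp(common_dates, dates, values)

common = np.arange(0, 15)      # one point per day
milk = normalize(series_a, common)
temp = normalize(series_b, common)
# Both series now have one value per day and can be compared point by point.
```

After this step every series lives on the same date grid, which is what makes the later dependency analysis possible.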
Using machine-learning algorithms and mathematical models such as SARIMAX, you can analyze the data and identify dependencies between comparable series in the form of functions. But rather than diving into a long and boring explanation of AI and neural network training, let’s skip that part. I am sure your own "neural network," located in your head, can do just fine.
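To keep things concrete without a full SARIMAX walkthrough, here is a minimal stand-in: ordinary least squares recovering a linear dependency between two series. The data is synthetic (the "true" coefficients 0.5 and 15 are invented for the demo):

```python
import numpy as np

# A deliberately simple stand-in for heavier models like SARIMAX:
# fit yield = a * temperature + b on synthetic, slightly noisy data.
rng = np.random.default_rng(0)
temperature = np.linspace(5, 25, 50)
milk_yield = 0.5 * temperature + 15 + rng.normal(0, 0.1, 50)

# np.polyfit returns coefficients from highest degree down: [slope, intercept].
a, b = np.polyfit(temperature, milk_yield, deg=1)
# a and b land close to the true coefficients 0.5 and 15.
```

In practice a seasonal model would account for trend and lag structure, but the idea is the same: express one series as a function of the others.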
Suppose we have several time series:
— Milk yield: the average daily milk yield per cow in the region.
— Air temperature: the average air temperature in the region.
— Feed: the amount of feed given to the animals.
— Subsidy: an abstract indicator of the subsidy amount available per animal at a given yield level.
Of course, I partly made these indicators up, but they are good enough for an example.
Now imagine that you have visualized these data on a chart. I used Chart.js for that, but you can choose any library you like. Looking at the chart, you might assume that as air temperature rises, milk yield rises too. Apparently cows like warmth, and sunshine seems to support lactation as well. You can also see that increasing feed has a positive effect on yield.
Using methods of mathematical analysis, I identified linear dependencies between the indicators that can be described by the following formulas:
1) Yield = Temperature - Feed/10
2) Yield = Temperature * Feed / 10
3) Subsidy = Yield / 4
I chose the coefficients completely at random, purely by intuition. However, the tool I built lets you create more complex dependencies and experiment with them directly on the chart.
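As a rough sketch (in Python rather than the tool's JavaScript), here is how formulas 1 and 3 propagate through the series once applied point by point; the input values are made up:

```python
# Hypothetical per-day series; the numbers are invented for the demo.
temperature = [18.0, 20.0, 22.0]
feed = [50.0, 55.0, 60.0]

# Formula 1: Yield = Temperature - Feed / 10
milk_yield = [t - f / 10 for t, f in zip(temperature, feed)]
# Formula 3: Subsidy = Yield / 4
subsidy = [y / 4 for y in milk_yield]

print(milk_yield)  # [13.0, 14.5, 16.0]
print(subsidy)     # [3.25, 3.625, 4.0]
```

This is exactly what happens on the chart: move an input indicator up or down, and every dependent indicator recomputes through its formula.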
As an example, open the link at the end of this article and click the "Load predefined data" button to see a set of numbers and indicators. Then add the formulas above one by one and watch the chart. By moving the indicator values up and down, you will see how the dependent values change. If your target metric is milk yield, you can search for optimal values of the other variables. Increase feed and yield jumps up; decrease temperature and you adjust for autumn; and so on.
Another interesting problem that can be solved with this approach is matching fuzzy and crisp data. Imagine that instead of exact values you have ranges or qualitative descriptions. For example, in one farm air temperature is measured in ranges ("warm," "cold"), while in another it is recorded to tenths of a degree. How do you combine them? Very simply. This is where the concept of a membership function helps. We define rules describing how precise temperature values correspond to fuzzy categories. For example, "warm temperature" may be represented as a range from 20 to 30°C with a membership function that gradually decreases toward the edges of the range.
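The "warm temperature" membership function described above might look like the following sketch; the 20–30 °C core comes from the text, while the 5-degree edge width is an assumption for illustration:

```python
def warm_membership(t, low=20.0, high=30.0, slope=5.0):
    """Trapezoidal membership for 'warm': full membership inside
    [low, high], decreasing linearly over `slope` degrees at each edge."""
    if low <= t <= high:
        return 1.0
    if low - slope < t < low:
        return (t - (low - slope)) / slope
    if high < t < high + slope:
        return ((high + slope) - t) / slope
    return 0.0

print(warm_membership(25.0))  # 1.0  (fully warm)
print(warm_membership(17.5))  # 0.5  (partially warm)
print(warm_membership(10.0))  # 0.0  (not warm)
```

A farm that only records "warm"/"cold" and a farm that records 23.7 °C can now both be mapped onto the same scale of membership degrees.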
Now suppose your feeding indicators are precise—for example, 5 kg per cow per day. But yield data contains some uncertainty because different farms keep records differently. We can match these data by normalizing them onto a common scale: we transform crisp data (for example, "5 kg") into a membership function (for example, "heavy feeding"). This is done through predefined intervals or statistical estimates. Then we apply fuzzy logic rules to evaluate how "warm weather" and "heavy feeding" jointly affect yield.
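A minimal sketch of that matching step, with invented thresholds: the crisp feed value is fuzzified into "heavy feeding", and the classic min operator plays the role of fuzzy AND when combining rules:

```python
def heavy_feeding(kg, low=4.0, high=6.0):
    """Fuzzify a crisp feed value: 0 below `low` kg/day, 1 above `high`,
    linear in between. The thresholds are illustrative, not agronomic."""
    if kg <= low:
        return 0.0
    if kg >= high:
        return 1.0
    return (kg - low) / (high - low)

def warm_weather(t):
    """Simplified 'warm' membership: ramps up between 15 and 20 degrees C."""
    return min(max((t - 15.0) / 5.0, 0.0), 1.0)

# Rule: IF weather is warm AND feeding is heavy THEN yield is high.
# Fuzzy AND is taken as the minimum of the two membership degrees.
def high_yield_degree(temp_c, feed_kg):
    return min(warm_weather(temp_c), heavy_feeding(feed_kg))

print(high_yield_degree(18.0, 5.0))  # min(0.6, 0.5) = 0.5
```

The output is not "yield will be X" but "the rule fires to degree 0.5", which is exactly the kind of graded answer fuzzy matching is meant to give.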
When matching fuzzy and crisp data, it is important to remember that even the strictest numbers sometimes fail to capture context. For example, "5 kg of feed" for one cow in winter may be insufficient, while in summer it may be excessive. Using the flexibility of fuzzy logic, dependencies can be modeled in a way that reflects real scenarios: temperature ranges, seasonal factors, and data variation. The resulting analysis gives not only average values, but also ranges that reflect the degree of uncertainty and enable better forecasting.
In truth, the tool is primitive. More than that—it is not even a tool, just a prototype. But I think this approach has potential. What do you think?
P.S. I execute the formulas with eval. Yes, it is unsafe. Yes, it is rough. But it works. For a prototype, that is good enough.
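For what it is worth, a safer pattern than raw eval (sketched here in Python; the same idea carries over to JavaScript with an expression parser) is to parse the formula and walk only whitelisted arithmetic nodes:

```python
import ast
import operator

# Safer alternative to eval for formulas like "Temperature - Feed / 10":
# parse the expression and evaluate only whitelisted node types.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}

def safe_eval(formula, variables):
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name) and node.id in variables:
            return variables[node.id]
        raise ValueError("disallowed expression")  # function calls, attributes, etc.
    return walk(ast.parse(formula, mode="eval"))

print(safe_eval("Temperature - Feed / 10",
                {"Temperature": 20.0, "Feed": 50.0}))  # 15.0
```

Anything outside plain arithmetic on known variables raises an error instead of executing.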
Example here: https://stukalin.com/df/