By Robby Edwards, director of communications
Dale Bumpers College of Agricultural, Food and Life Sciences
FAYETTEVILLE, Ark. — Igor Fernandes, a master's degree student in crop, soil and environmental sciences in the University of Arkansas’ Dale Bumpers College of Agricultural, Food and Life Sciences, recently placed second in an international prediction contest conducted by the Genomes to Fields Initiative.
The initiative, a public-private partnership also called G2F, collected data on more than 180,000 corn field plots, including 2,500 hybrids and 162 unique environments. Competitors developed models to predict maize yield based on genetic and environmental data from trials, datasets and other publicly available information. From Nov. 15 to Dec. 15, contestants had access to training data, and they had to submit their predictions by Jan. 15.
The Genotype by Environment contest was open to teams and individuals, and Fernandes developed his model individually. He is now working with his adviser, Sam Fernandes, assistant professor of agricultural statistics and quantitative genetics, to improve his prediction model.
Fernandes is a researcher with the Agricultural Statistics Laboratory, a program of the Arkansas Agricultural Experiment Station, the research arm of the U of A System Division of Agriculture. His research ties into work with the departments of crop, soil and environmental sciences and horticulture.
A team from Corteva Agriscience won the contest and the $4,000 prize with a Mean Root Mean Square Error score of 2.328863. Fernandes was second among 33 entries with a score of 2.345147. For this contest and this RMSE metric, lower scores are better. A model with a lower RMSE produces predicted maize yields that are closer to the actual yields than a model with a higher RMSE.
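To make the metric concrete, the following is a minimal sketch of how an RMSE score can be computed and why a lower value indicates closer predictions. The yield numbers are hypothetical and are not contest data. The `mean_rmse` helper assumes that "Mean RMSE" means the average of RMSE values computed separately for each environment; the release does not spell out the contest's exact scoring procedure.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def mean_rmse(by_environment):
    """Average of per-environment RMSE values.

    Assumption: this is one plausible reading of "Mean RMSE";
    the contest's exact definition is not given in the article.
    """
    scores = [rmse(actual, predicted) for actual, predicted in by_environment.values()]
    return sum(scores) / len(scores)


# Hypothetical yields for four plots (illustrative only).
actual = [10.0, 12.0, 9.0, 11.0]
model_a = [10.5, 11.5, 9.5, 10.5]  # every prediction is off by 0.5
model_b = [12.0, 10.0, 11.0, 9.0]  # every prediction is off by 2.0

print(rmse(actual, model_a))  # 0.5
print(rmse(actual, model_b))  # 2.0
```

In this toy case the first model has the lower RMSE, so its predictions are closer to the observed yields.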
"We used trial data from 2014 to 2021 to build the prediction models and had to evaluate the predictions on unseen trials from 2022," Fernandes says. "We had to make predictions for different environments and different maize hybrids. My solution consisted in creating meaningful predictor variables, the so-called feature engineering process, and building a gradient boosting machine learning model with those variables."
He said his solution included computing aggregations, such as the mean and standard deviation, of time-series climate variables to summarize climate patterns for each season in each environment.
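The aggregation step can be illustrated with a short sketch. It is not Fernandes' code: the environment names, the temperature readings and the choice of summary statistics are all hypothetical, and the sketch uses only the Python standard library.

```python
from statistics import mean, stdev


def summarize_season(daily_values):
    """Collapse one season of daily readings into a few summary features."""
    return {
        "mean": mean(daily_values),
        "std": stdev(daily_values),  # sample standard deviation
        "min": min(daily_values),
        "max": max(daily_values),
    }


# Hypothetical daily temperatures keyed by (environment, year).
daily_temperature = {
    ("ENV_A", 2021): [20.0, 22.0, 24.0],
    ("ENV_B", 2021): [15.0, 15.0, 18.0],
}

# One row of engineered features per environment and season.
features = {key: summarize_season(values) for key, values in daily_temperature.items()}

for key, row in features.items():
    print(key, row)
```

Each environment and season ends up with a fixed number of predictor values regardless of how many daily readings were recorded, which is what lets a tabular model use them.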
"Another useful technique used was the adoption of lagged variables, which means that we take a variable and look at its pattern in a previous window, which could be from the previous year or the previous two years, and use it as a predictor," Fernandes says.
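A lagged variable of the kind described in the quote can be sketched as follows. The rainfall figures, the function name `add_lags` and the one-year and two-year windows are assumptions chosen for illustration; the article does not say which variables were lagged.

```python
def add_lags(yearly_values, lags=(1, 2)):
    """Build lagged predictors from a {year: value} mapping.

    For each year, look up the value from `k` years earlier for every
    `k` in `lags`. Years with no earlier record get None.
    """
    rows = {}
    for year in sorted(yearly_values):
        rows[year] = {f"lag_{k}": yearly_values.get(year - k) for k in lags}
    return rows


# Hypothetical seasonal rainfall totals for one environment.
seasonal_rain = {2019: 500.0, 2020: 620.0, 2021: 480.0, 2022: 550.0}

lagged = add_lags(seasonal_rain)
print(lagged[2022])  # {'lag_1': 480.0, 'lag_2': 620.0}
print(lagged[2019])  # {'lag_1': None, 'lag_2': None}
```

The 2022 row uses only values observed in 2021 and 2020, so the predictor stays available even when the target season itself is unseen.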
Genomes to Fields focuses on efficiently and sustainably producing a safe, dependable food supply for a growing world population, which requires the development and management of crop varieties that will perform well in spite of increased weather variability. A widescale plant phenotyping initiative is proposed, which will expand understanding of the interacting roles of crop genomes and crop environments (including weather and management practices) on crop performance. By improving the ability to predict crop performance in diverse environments, the initiative will enhance capabilities to develop new varieties and manage the effects of weather variability on crop productivity.
To learn more about Division of Agriculture research, visit the Arkansas Agricultural Experiment Station website: https://aaes.uada.edu/. Follow us on Twitter at @ArkAgResearch and on Instagram at @ArkAgResearch. To learn more about the Division of Agriculture, visit https://uada.edu/. Follow us on Twitter at @AgInArk.