ALY

February 2023

Python
Pandas
FastAPI
Next.js
PostgreSQL
SQLAlchemy

ALY is a tool that connects farm producers with consumers to reduce food waste and improve the efficiency of the food supply chain. We built it during the Carlson Analytics for Good Hackathon, where our team received the Most Engaged and Inquisitive Award. My primary responsibilities were data disambiguation, feature engineering on two datasets, database schema design, building a pipeline for training a machine learning model, and integrating the backend with the frontend.

ALY Project Planning
Our team planning the project on a whiteboard at the hackathon.

Background

Our team was provided with operational data from The Good Acre, consisting of orders and contracts between local farmers and consumers, along with information on fulfillment rates. The objective was to use this data to better predict the completion rate of orders and contracts, thereby improving producer-consumer matching and reducing food waste. The datasets were unstructured and spread across multiple sources. My task was to use pandas to standardize, clean, and merge them into a single dataset that could be stored in a database accessible to the frontend.
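The standardize-and-merge step can be illustrated with a minimal pandas sketch. The column names, the sample rows, and the `standardize` helper are all hypothetical stand-ins, not the actual datasets or code from the hackathon.

```python
import pandas as pd

# Hypothetical raw inputs with inconsistent headers, casing, and whitespace.
orders = pd.DataFrame({
    "Farmer Name": [" Green Farm", "SUNNY ACRES "],
    "Product": ["Carrots", "kale"],
    "Qty Ordered": ["10", "20"],
})
deliveries = pd.DataFrame({
    "farmer_name": ["green farm", "sunny acres"],
    "product": ["carrots", "Kale "],
    "qty_delivered": [10, 15],
})


def standardize(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with snake_case headers and normalized key columns."""
    out = df.copy()
    out.columns = (
        out.columns.str.strip().str.lower().str.replace(" ", "_", regex=False)
    )
    # Normalize the text columns used as join keys (assumed names).
    for col in ("farmer_name", "product"):
        out[col] = out[col].str.strip().str.lower()
    return out


orders_clean = standardize(orders)
# Quantities arrived as strings in this made-up example; convert to numbers.
orders_clean["qty_ordered"] = pd.to_numeric(orders_clean["qty_ordered"])

# A left join keeps every order, even one without a recorded delivery.
merged = orders_clean.merge(
    standardize(deliveries), on=["farmer_name", "product"], how="left"
)
```

Normalizing the join keys before merging is what lets rows such as " Green Farm" and "green farm" line up; without it the left join would silently produce missing values.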

ALY Data Display
Data after being cleaned and combined as shown on our web-application.

Development

Our team defined three key metrics to evaluate the performance of farmer and contract relationships: On Time & In Full, On Time, and Fulfillment Average.

  • On Time & In Full refers to the farmer delivering the exact quantity of produce on the specified date.
  • On Time indicates timely delivery, though the quantity might be insufficient.
  • Fulfillment Average measures the percentage of requested produce delivered, even if it was not on the requested date.
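As a rough illustration, the three metrics could be computed along these lines. The `Delivery` record, its field names, and the sample values are hypothetical. The sketch also assumes that "on time" means delivery on the requested date, that over-delivery still counts as "in full", and that each delivery's fulfillment ratio is capped at 100%.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Delivery:
    # Hypothetical record pairing one request with what was delivered.
    requested_qty: float
    delivered_qty: float
    requested_date: date
    delivered_date: date


def is_on_time(d: Delivery) -> bool:
    # Assumption: "on time" means delivered on the requested date.
    return d.delivered_date == d.requested_date


def is_in_full(d: Delivery) -> bool:
    # Assumption: delivering at least the requested quantity counts as full.
    return d.delivered_qty >= d.requested_qty


def summarize(deliveries: list[Delivery]) -> dict[str, float]:
    """Return the share of deliveries meeting each metric."""
    n = len(deliveries)
    return {
        "on_time_in_full": sum(is_on_time(d) and is_in_full(d) for d in deliveries) / n,
        "on_time": sum(is_on_time(d) for d in deliveries) / n,
        # Mean share of the requested quantity delivered, regardless of date.
        "fulfillment_average": sum(
            min(d.delivered_qty / d.requested_qty, 1.0) for d in deliveries
        ) / n,
    }


# Made-up sample data for demonstration only.
sample = [
    Delivery(10, 10, date(2022, 6, 1), date(2022, 6, 1)),  # on time, in full
    Delivery(10, 5, date(2022, 6, 1), date(2022, 6, 1)),   # on time, short
    Delivery(10, 10, date(2022, 6, 1), date(2022, 6, 4)),  # late, in full
    Delivery(20, 10, date(2022, 6, 1), date(2022, 6, 9)),  # late, short
]
metrics = summarize(sample)
```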

While calculating these metrics was relatively straightforward, pairing contracts with the corresponding produce deliveries was more challenging because contract dates and delivery dates were not guaranteed to align. We implemented a greedy algorithm that matched each farmer’s produce delivery to the nearest contract. This approach was not ideal, but it was the most feasible solution given the available data.
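One way such a greedy pairing might look is sketched below. This is not the code we wrote at the hackathon: the `Contract` and `Delivery` records, their fields, and the rule of matching only within the same farmer and product are assumptions made for illustration. The sketch repeatedly takes the contract-delivery pair with the smallest date gap among those still unmatched.

```python
from datetime import date
from typing import NamedTuple


class Contract(NamedTuple):
    contract_id: str
    farmer: str
    product: str
    due: date


class Delivery(NamedTuple):
    delivery_id: str
    farmer: str
    product: str
    delivered: date


def greedy_match(contracts: list[Contract], deliveries: list[Delivery]) -> dict[str, str]:
    """Map delivery_id -> contract_id, closest date gaps first."""
    candidates = []
    for c in contracts:
        for d in deliveries:
            # Assumption: only pair records for the same farmer and product.
            if (c.farmer, c.product) == (d.farmer, d.product):
                gap = abs((d.delivered - c.due).days)
                candidates.append((gap, c.contract_id, d.delivery_id))

    candidates.sort()  # smallest date gap first
    used_contracts: set[str] = set()
    used_deliveries: set[str] = set()
    pairs: dict[str, str] = {}
    for gap, cid, did in candidates:
        if cid in used_contracts or did in used_deliveries:
            continue  # each contract and each delivery is matched at most once
        pairs[did] = cid
        used_contracts.add(cid)
        used_deliveries.add(did)
    return pairs


# Made-up sample data for demonstration only.
contracts = [
    Contract("C1", "A", "carrots", date(2022, 6, 1)),
    Contract("C2", "A", "carrots", date(2022, 6, 15)),
    Contract("C3", "B", "kale", date(2022, 6, 1)),
]
deliveries = [
    Delivery("D1", "A", "carrots", date(2022, 6, 3)),
    Delivery("D2", "A", "carrots", date(2022, 6, 12)),
    Delivery("D3", "A", "carrots", date(2022, 6, 20)),
    Delivery("D4", "B", "kale", date(2022, 6, 2)),
]
pairs = greedy_match(contracts, deliveries)
```

A greedy pass like this is simple and fast, but it can lock in an early pairing that forces a worse match later, which is the trade-off that makes it less than ideal.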

ALY Metrics
Metrics used to measure success on our dashboard.

Machine Learning Model

Once the data was cleaned and stored in our database, I developed three machine learning models to predict the success of farmer-contract relationships:

  • Stochastic Gradient Descent Classifier
  • Multivariate Regression Model
  • Random Forest Classifier

Because the available data spanned only two years, the Stochastic Gradient Descent Classifier and the Multivariate Regression Model produced suboptimal results. The Random Forest Classifier, an ensemble method, achieved over 80% accuracy. We serialized this model with pickle, deployed it on a FastAPI server, and integrated it into our frontend.
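The train-then-serialize step can be sketched with scikit-learn and pickle as follows. The features here are synthetic data from `make_classification`, standing in for the real farmer-contract features, and the hyperparameters are placeholder assumptions rather than the values we used. The FastAPI serving layer is omitted.

```python
import pickle

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; the real features came from the cleaned database.
X, y = make_classification(n_samples=500, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Placeholder hyperparameters, chosen only for the sake of the example.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Serialize to bytes and restore, as a server would do at startup.
blob = pickle.dumps(model)
restored = pickle.loads(blob)

# The restored model should reproduce the original model's predictions.
same_predictions = bool((restored.predict(X_test) == model.predict(X_test)).all())
```

Pickled models should only be loaded from trusted sources and with a scikit-learn version matching the one used for training.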

Outcome

This project offered valuable insights into how analytics can address real-world challenges. I gained significant experience working with my team, and I am proud that our efforts were recognized. Below is the final product we presented at the hackathon.

ALY Final Product
Our final product at the hackathon.