Predicting road traffic flow

In this project, we applied the entire data science pipeline with the objective of developing various machine learning models that could predict road traffic flow in the city of Porto.

Python · Machine Learning · Data Science

This project aims to develop and optimize Machine Learning models capable of predicting road traffic flow at a specific hour in the city of Porto. While the main objective was to analyze urban data to support better transport management and planning, we also participated in a Kaggle competition as a supplementary way to evaluate our models' performance on unseen data and compare the results with other participants.

The project followed the CRISP-DM methodology, as illustrated in the following image:

The application contained several models that performed prediction tasks, trained on historical records. These tasks relied on data about internal traffic patterns and external events, such as school holidays, weather conditions, and football games involving local teams like FC Porto and Boavista. We processed these inputs and passed the results to various "agents" (specifically, Ensemble Learning algorithms like XGBoost, CatBoost, and HistGradientBoosting) to calculate the traffic severity.
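As a rough illustration of how external events could be attached to hourly traffic records before modeling, here is a minimal pandas sketch. The column names (`timestamp`, `flow`, `school_holiday`, `football_game`), the sample values, and the helper `add_calendar_features` are all hypothetical and are not taken from the project's code; the repository contains the actual pipeline.

```python
import pandas as pd

# Hypothetical hourly traffic records (column names and values are assumptions).
traffic = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2023-03-04 17:00", "2023-03-04 18:00", "2023-03-05 09:00",
    ]),
    "flow": [420, 610, 150],
})

# Hypothetical per-day calendar of external events (1 = event on that day).
calendar = pd.DataFrame({
    "date": pd.to_datetime(["2023-03-04"]),
    "school_holiday": [0],
    "football_game": [1],
})


def add_calendar_features(traffic, calendar):
    """Join day-level event flags onto hour-level traffic rows."""
    out = traffic.copy()
    # Derive the join key and simple time features from the timestamp.
    out["date"] = out["timestamp"].dt.normalize()
    out["hour"] = out["timestamp"].dt.hour
    out["weekday"] = out["timestamp"].dt.dayofweek
    # Left join keeps every traffic row, even on days with no calendar entry.
    out = out.merge(calendar, on="date", how="left")
    # Days missing from the calendar are treated as "no event".
    flag_cols = [c for c in calendar.columns if c != "date"]
    out[flag_cols] = out[flag_cols].fillna(0).astype(int)
    return out.drop(columns="date")


features = add_calendar_features(traffic, calendar)
print(features)
```

The left join is a deliberate choice in this sketch: hours that fall on days without any recorded event keep their row and simply get zero-valued flags.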

A major focus of this project was designing the data preparation pipeline and the ensemble architecture. We tested complex Time Series preparations, but the approach we designed around "Merge Days", which integrated calendar events directly, proved more effective. We also implemented a Stacking Regressor, based on the idea that combining multiple predictive models (like Random Forest and LightGBM) can compensate for their individual weaknesses.

We were very happy with the result of this project. As a future goal, with more data to capture long-term patterns, we could improve our LSTM (Deep Learning) implementation, which showed great potential but was limited by the dataset size.

For the full report and project files, check out this project's repository at https://github.com/luis25franca/daa_grupo_20.