14 April 2026
09:00 Master's Defense, IC, Room 85
Topic
Selection and Optimization of Pre-Trained Deep Learning Models for Camera-Based Smart Parking Systems on Edge Devices
Student
Gustavo Pessoa Caixeta Pinto da Luz
Advisor
Juliana Freitag Borin. Co-advisor: Luis Fernando Gomez Gonzalez
Brief summary
Urban growth and the increasing number of vehicles circulating in cities, along with the time spent searching for parking spaces, contribute to increased fuel consumption and carbon emissions. Smart parking systems are emerging as a solution in the context of the Internet of Things (IoT), traditionally relying on a sensor in each parking space. An alternative approach is to use cameras combined with deep learning models to analyze parking space availability. Camera-based smart parking systems present challenges, such as the need for reliable vehicle detection across different scenarios and the need to process images locally without sending them to external servers. One way to address these challenges is to run pre-trained models on edge devices. Despite the extensive literature in the field, there is still no clear methodology to guide the selection, evaluation, and optimization of these models in smart parking systems with edge inference.
This thesis proposes a methodology inspired by CRISP-DM, which consists of data preparation, selection of system components, execution of a benchmark, evaluation of each architecture to answer research questions, and a search for an architecture compatible with computationally constrained devices. The method was validated in two case studies in parking lots at the State University of Campinas (Unicamp), achieving a mean absolute error (MAE) of less than three vehicles across five cameras with distinct characteristics, and an edge inference time of less than one minute.
In the first case study, conducted in the low-complexity parking lot at the Institute of Computing, the use of region of interest (ROI) selection as a post-processing step reduced MAE by up to 96.6%. Models with a larger number of parameters were not compatible with all edge devices, while lighter models achieved inference times of less than 2 seconds. The YOLOv11m model presented the best balance between accuracy and inference time, achieving an MAE of 0.08 vehicles with an inference time of 8.5 seconds on an edge device when combined with enhancement and optimization techniques.
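As an illustration of the two ideas in this paragraph, the sketch below counts only the detections whose box center falls inside an ROI polygon and then computes the MAE between predicted and actual vehicle counts. It is a minimal sketch, not the thesis implementation: the center-in-polygon rule, the function names, and the sample coordinates are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
Point = Tuple[float, float]


def point_in_polygon(point: Point, polygon: Sequence[Point]) -> bool:
    """Ray-casting test: True if the point lies inside the polygon."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_in_roi(boxes: List[Box], roi: Sequence[Point]) -> int:
    """Count detections whose box center lies inside the ROI (assumed rule)."""
    centers = [((b[0] + b[2]) / 2, (b[1] + b[3]) / 2) for b in boxes]
    return sum(point_in_polygon(c, roi) for c in centers)


def mean_absolute_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Mean absolute difference between predicted and actual vehicle counts."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


# Hypothetical data: a square ROI and three detections, one of them inside it.
roi = [(0, 0), (100, 0), (100, 100), (0, 100)]
boxes = [(10, 10, 30, 30), (90, 90, 130, 130), (200, 200, 220, 220)]
vehicles_in_roi = count_in_roi(boxes, roi)
```

Filtering by ROI discards detections outside the monitored lot, such as vehicles on an adjacent street, before the count is compared with the ground truth.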
In the second case study, conducted in the University Rectorate parking lot, which has over 100 spaces and high complexity, patching and Slicing Aided Hyper Inference (SAHI) techniques allowed lighter deep learning models to achieve performance comparable to larger models, reducing MAE by up to 73.1%. However, this improvement in accuracy comes with an increase in inference time, which is particularly evident in Transformer-based models. The YOLOv11m model once again demonstrated the best balance on edge devices, achieving an MAE ranging from 1 to 2.95 vehicles, with inference times between 3.47 and 39.15 seconds. Finally, an inference strategy was proposed that reduced the system's computational load, saving up to 10 hours of inference time per day, while maintaining an average MAE of 1.61 vehicles across the four cameras.
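The general idea behind patching and sliced inference can be sketched as follows: the image is split into overlapping tiles, the detector runs on each tile, and the resulting boxes are shifted back to full-image coordinates. This is a minimal sketch of the tiling step only, under assumed tile size and overlap values; it does not use the SAHI library, and the function names are hypothetical. Merging duplicate detections from overlapping tiles (for example with non-maximum suppression) is omitted.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)


def tile_starts(length: int, tile: int, stride: int) -> List[int]:
    """Start offsets along one axis so that tiles cover the whole length."""
    if length <= tile:
        return [0]
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # last tile is aligned with the image border
    return starts


def generate_tiles(width: int, height: int, tile_size: int, overlap_ratio: float) -> List[Box]:
    """Return overlapping tile boxes in row-major order."""
    stride = max(1, int(tile_size * (1 - overlap_ratio)))
    tiles = []
    for y in tile_starts(height, tile_size, stride):
        for x in tile_starts(width, tile_size, stride):
            tiles.append((x, y, min(x + tile_size, width), min(y + tile_size, height)))
    return tiles


def to_full_image(box: Box, tile: Box) -> Box:
    """Shift a box detected in tile-local coordinates to full-image coordinates."""
    dx, dy = tile[0], tile[1]
    return (box[0] + dx, box[1] + dy, box[2] + dx, box[3] + dy)


# Assumed values for illustration: a 1000x700 image, 400-pixel tiles, 25% overlap.
tiles = generate_tiles(1000, 700, 400, 0.25)
```

Each tile is processed by the detector separately, which is why accuracy on small, distant vehicles can improve while total inference time grows with the number of tiles.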
Examination Board
Full members:
| Juliana Freitag Borin | IC / UNICAMP |
| Luis Henrique Maciel Kosmalski Costa | POLI/UFRJ |
| Hélio Pedrini | IC / UNICAMP |
Alternates:
| Allan Mariano de Souza | IC / UNICAMP |
| Rodolfo Ipolito Meneguette | ICMC / USP |