Final Graduation Projects published in 2024

  • IC-PFG-24-44 pdf bib
    Framework for visualization and teaching of deep learning models.
    Matheus Esteves Zanoto.
    December 2024. In Portuguese, 14 pages.

    Summary: This Final Graduation Project centers on the development of a framework/platform for visualizing and teaching deep learning models. Teaching Machine Learning (and Deep Learning in particular) can be severely hampered by the complexity of the models, the very large number of layers and parameters in a neural network architecture, and algorithms that require very large amounts of data for training.

    The development of this final undergraduate project was extremely important for the practical consolidation and review of all Deep Learning concepts absorbed in the Machine Learning discipline, such as Transformers and Self-Attention Mechanism. The initial proposed architecture needed to be revised due to time and scope limitations for project development, becoming a more specialized approach for training Transformers with only the BERT model. However, given the complexity of the models and the communication process between the execution of the cells in the Notebook and the final result on the Web page, a good result was obtained in the final objective of the Project, which was precisely the visual representation for teaching deep learning models, regardless of which model or algorithm was used for the training process.

    As a next step for this project, possibly as a Scientific Initiation, we would like to offer the end user a more structured visual approach in the Web application, with a wider and more detailed set of parameters and with intermediate, real-time visual representations throughout the training iterations, and to generalize the mechanism to other models such as CNNs, RNNs and MLPs.

  • IC-PFG-24-43 pdf bib
    Semi-Complete Sorts by Genome Rearrangement Operations.
    Gabriel Siqueira, and Zanoni Dias.
    December 2024. In Portuguese, 22 pages.

    Summary: This work presents a variation of the classic Rearrangement Distance problem, whose objective is to identify a sequence of rearrangements that transforms one genome into another, respecting specific proximity criteria based on thresholds. Variations based on inversions, $\lambda$-permutations and entropy are explored. For each of the variations, we studied the impact of the threshold value on the problem, proposed an approximation approach, and developed a heuristic that can assist the approximation approach.

    Abstract: This work presents a variation of the classic Rearrangement Distance problem called the Semi-Complete Sorting by Rearrangement Events problem. This variation aims to identify a sequence of rearrangements that transforms one genome into another, adhering to specific proximity criteria based on thresholds. Variations based on inversions, $\lambda$-permutations, and entropy are explored. For each variation, a study was conducted on the impact of the threshold value on the problem, an approximation approach was proposed, and a heuristic that can support the approximation approach was developed.

  • IC-PFG-24-42 pdf bib
    Study of deep learning models for cell segmentation in microscopic images.
    Aureo Henrique E Silva Marques and Esther Luna Colombini.
    December 2024. In Portuguese, 19 pages.

    Summary: Microscopic image segmentation is an essential task in the biomedical field, being widely used in cell analysis. This Final Graduation Project presents a study on the use of deep learning models for cell segmentation, focusing on the Stardist and Cellpose models, both based on the U-Net architecture. Initially, a review of microscopy concepts and cell segmentation techniques was carried out, covering traditional and modern methods. Then, an experiment was conducted using the Cellpose and Stardist models on two public Kaggle datasets: one more diverse and the other with a more specific context. The evaluation metric chosen was the "mean average precision" (mAP). The results showed that, for the first dataset, the Cellpose model presented the best performance, while the Stardist model stood out for being more efficient in runtime. For the second dataset, the performance of all models was low due to excessive segmentation, indicating the need for specific adaptations for this type of data.

  • IC-PFG-24-40 pdf bib
    Machine Learning-Based Energy Consumption Model for Cellular Network Interfaces.
    Leonardo Novaes Do Nascimento and Gabriel Sudo Enoki.
    December 2024. In Portuguese, 18 pages.

    Summary: The report presents a machine learning-based energy consumption model for cellular network devices. Two approaches were implemented in this study. The first focuses on predicting peak power consumption, simplifying the problem by focusing on moments of highest energy demand, while the second considers power consumption as a function of time, allowing for more detailed predictions and capturing dynamic patterns. The process included data preprocessing steps, such as applying low-pass filters to smooth signals, removing anomalies with wavelet transform, and organizing the data for training in neural networks. The models use parameters such as bandwidth, transmission and reception power to predict energy consumption. The results demonstrated high performance of the models, with a coefficient of determination (R-squared) above 0.94 and a mean absolute error (MAE) below 0.0135 for both models. The second model stood out for its ability to predict the shape of the curve and maximum consumption with greater accuracy. It is concluded that these approaches can contribute significantly to the evaluation of performance and energy efficiency in cellular devices. These models can promote greater sustainability and efficiency in cellular networks, where most devices operate with batteries with limited capacity.
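
    As an illustration of the two metrics quoted above, the following is a minimal sketch of how R-squared and MAE are commonly computed from measured and predicted values. The function names and the sample numbers are hypothetical and are not taken from the report.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between measured and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical measurements and predictions, for illustration only.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
mae = mean_absolute_error(measured, predicted)
r2 = r_squared(measured, predicted)
```

    An R-squared close to 1 means the predictions explain most of the variance in the measurements, while MAE reports the typical error in the same unit as the predicted quantity.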

  • IC-PFG-24-39 pdf bib
    Machine Learning-Based Energy Consumption Model for Cellular Network Interfaces.
    Leonardo Novaes Do Nascimento and Gabriel Sudo Enoki.
    December 2024. In Portuguese, 18 pages.

    Summary: The report presents a machine learning-based energy consumption model for cellular network devices. Two approaches were implemented in this study. The first focuses on predicting peak power consumption, simplifying the problem by focusing on moments of highest energy demand, while the second considers power consumption as a function of time, allowing for more detailed predictions and capturing dynamic patterns. The process included data preprocessing steps, such as applying low-pass filters to smooth signals, removing anomalies with wavelet transform, and organizing the data for training in neural networks. The models use parameters such as bandwidth, transmission and reception power to predict energy consumption. The results demonstrated high performance of the models, with a coefficient of determination (R-squared) above 0.94 and a mean absolute error (MAE) below 0.0135 for both models. The second model stood out for its ability to predict the shape of the curve and maximum consumption with greater accuracy. It is concluded that these approaches can contribute significantly to the evaluation of performance and energy efficiency in cellular devices. These models can promote greater sustainability and efficiency in cellular networks, where most devices operate with batteries with limited capacity.

  • IC-PFG-24-38 pdf bib
    Enhancing Linux Kernel Test Result Analysis: Automated Log Clustering in the KernelCI Database.
    Gabriela Bittencourt and Islene Calciolari Garcia.
    December 2024. In English, 20 pages.

    Abstract: The Linux kernel is one of the largest collaborative efforts of software development in the world, powering a large majority of the infrastructure that runs computing workloads of all scales - from embedded systems to HPC clusters. As such, improving the testing ecosystem for the Linux kernel is critical to ensuring the longevity of the project. The KernelCI project is a recent initiative that looks to provide a unified testing infrastructure for all the kernel subsystems. This work aims to improve the automatic evaluation and labeling of test results for the kernel in the context of the KernelCI project, through the use of modern clustering and data aggregation techniques. In particular, we propose a frequency-based algorithm for filtering and labeling logs from kernel tests as a way to facilitate their analysis by kernel maintainers, greatly improving the efficiency of the review process.

  • IC-PFG-24-37 pdf bib
    Low-cost computers for teaching programming in public schools using TV Boxes.
    Bruno Gonçalves Rodrigues, Giovanna Magario Adamo, Maria Vitória Rodrigues Oliveira, and Islene Calciolari Garcia.
    December 2024. In Portuguese, 23 pages.

    Summary: Access to education in Brazil is still not democratic, as is access to technology. Many students do not have the infrastructure that meets current educational needs, especially among students in public schools. Contact with modern technologies and access to content such as programming lessons is still extremely limited. In this same scenario of scarcity, the Federal Revenue Service has confiscated a large number of pirated TV Boxes in recent years, equipment used to access content illegally, and which would have been destroyed, generating electronic waste and equipment waste, if they had not been donated to educational institutions, such as Unicamp, in order to enable projects that sought to reuse them in a way that would benefit society. Thus, this project aims to use these TV Boxes as low-cost computers that can be used in public schools to teach programming, through its own initiatives or those of partners. By installing a new Operating System and customizing it, aimed at educational use, it is possible to offer students an environment that allows them to learn about computing and programming.

  • IC-PFG-24-36 pdf bib
    Modeling and generation of tests for critical embedded systems.
    Ariadne Bigheti and Eliane Martins.
    December 2024. In Portuguese, 20 pages.

    Summary: This work, developed in collaboration with the Technological Institute of Aeronautics (ITA) and the Budapest University of Technology and Economics (BME), investigates systematic approaches for modeling and generating test cases of a critical embedded system, called Train Door Controller.

    The combination of STPA and CoFI techniques with the GraphWalker tool was employed to create automated test cases based on state models that describe the system.

    The test cases, implemented in JUnit, validated the system behavior without identifying failures. However, some requirements were not completely covered due to the lack of clear specifications, highlighting both the importance of well-defined requirements and the ability of the approach to identify such gaps.

    The results reinforce the effectiveness of integrating modeling techniques with automated tools to ensure the quality of critical systems, promoting greater reliability and security in the development of these systems.

  • IC-PFG-24-35 pdf bib
    Evaluating and improving state model-based tests for REST APIs.
    Joed Vicente da Silva and Eliane Martins.
    December 2024. In Portuguese, 33 pages.

    Summary: One way to determine the quality of test suites is by assessing their coverage. In the automatic generation of test cases based on state models, several model coverage criteria are implemented by the tools, such as state coverage or transition coverage. Do test suites that satisfy the state model coverage criteria guarantee good coverage of the REST API under test? An API specification contains requests, responses, resources or endpoints, among others, that need to be exercised during testing. This work seeks to evaluate how well the test suites generated according to the different criteria based on state models are able to adequately cover the elements of an API. By using a set of coverage metrics for APIs, the goal is to have a way to help testing teams determine where to introduce improvements in the generated test suites.
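
    To make the two model coverage criteria named above concrete, here is a minimal sketch that measures state coverage and transition coverage of a test suite against a state model. The model, the path representation (a list of visited state names) and the function name are assumptions made for illustration; they do not reproduce the tools used in the report.

```python
from typing import List, Set, Tuple

Transition = Tuple[str, str]


def model_coverage(
    states: Set[str],
    transitions: Set[Transition],
    test_paths: List[List[str]],
) -> Tuple[float, float]:
    """Return (state coverage, transition coverage) as fractions in [0, 1]."""
    visited_states: Set[str] = set()
    visited_transitions: Set[Transition] = set()
    for path in test_paths:
        # Every state that appears in a test path counts as visited.
        visited_states.update(s for s in path if s in states)
        # Consecutive pairs of states are the transitions the test exercised.
        for src, dst in zip(path, path[1:]):
            if (src, dst) in transitions:
                visited_transitions.add((src, dst))
    return (
        len(visited_states) / len(states),
        len(visited_transitions) / len(transitions),
    )


# Hypothetical three-state model of a resource lifecycle.
states = {"idle", "created", "deleted"}
transitions = {("idle", "created"), ("created", "deleted"), ("deleted", "idle")}
suite = [["idle", "created", "deleted"]]
state_cov, transition_cov = model_coverage(states, transitions, suite)
```

    In this example the suite reaches every state but exercises only two of the three transitions, which shows how a suite can satisfy one criterion while leaving elements untested under another.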

  • IC-PFG-24-34 pdf bib
    Fairness testing strategies for recommender systems.
    Isabella Garcia Fagioli and Eliane Martins.
    December 2024. In Portuguese, 13 pages.

    Summary: This paper studies impartiality in recommendation systems, which have become increasingly common on digital platforms for recommending movies, series, music, books, products, among others. Such systems, although efficient in personalizing the user experience, can incur biases that affect the impartiality of recommendations, often harming specific groups or unintentionally favoring others based on sensitive attributes.

    To assess the impartiality of these systems, specific tests and metrics are required, which is not as simple as accuracy tests, as it is difficult to establish what is expected in terms of impartiality and even more challenging to measure it accurately.

    However, there is a wide range of tools available to perform such tests, and choosing the most appropriate one can be a challenge. This work aims to understand the main factors that should be considered when selecting a tool to test fairness in recommender systems.

  • IC-PFG-24-32 pdf bib
    Blockchain data analysis with generative language models.
    Otavio Anovazzi, Lucas Eduardo Ramos de Oliveira, and Allan M. de Souza.
    December 2024. In Portuguese, 17 pages.

    Summary: This work aims to democratize access to public and technical information about blockchains, using data analysis techniques with generative language models. The proposal includes the use of computer vision and multimodal language models to explore data extracted from the blockchain network through APIs, allowing a more accessible and detailed interpretation of this information. The study was based on the open-source project mempool, available on GitHub, which provides essential data for implementing analyses and developing innovative solutions for blockchain queries. Through a series of tests, several configurations for blockchain data analysis and text generation were compared in order to identify the most efficient approach. The configuration with the best performance was the GPT-4-Vision-Preview model integrated with web search and temperature 0.8, standing out for its accuracy and speed in data analysis. The difference in performance between this configuration and the least efficient one reached approximately 24%, according to evaluation parameters such as length score, similarity to human language (human_likeness), relevance and factual accuracy (factual_accuracy). These results demonstrate the importance of choosing appropriate parameters and models to optimize data interpretation in blockchain networks.

  • IC-PFG-24-29 pdf bib
    Content Steering Simulations: Exploring Continuous Computing and Reinforcement Learning to Support Adaptive Video Streaming.
    The authors of this article are:
    December 2024. In Portuguese, 46 pages.

    Summary: With the growing demand for video in recent years and the growth of cloud computing, we are seeing increased investment in adaptive video streaming. One popular protocol that has emerged in this context is Dynamic Adaptive Streaming over HTTP (DASH). It proposes that a video player can dynamically adjust the bitrate based on available bandwidth, maximizing the efficient use of resources and enabling a personalized experience - depending on the user's device and network conditions. The architecture proposed in this project is based on the edge-cloud continuum, highlighting how video streaming services can be optimized to handle mobile users. To this end, the Multi-Armed Bandits problem was explored and reinforcement learning algorithms, such as Epsilon-Greedy and UCB1, were used, combined with latency metrics to better redirect client requests to cache servers. The scenarios explored were a stationary client with one server under stress, and a mobile client with and without server stress. The results show that the quality of user experience does not depend solely on the proximity to the cache server that stores the video content. It is necessary to capture additional metrics. In our case, we captured server congestion and concluded that request redirection also acts as a load balancer in adaptive video streaming systems. In addition, we saw how the Content Steering architecture can allow the end user to have lower latency during their streaming experience by exploiting the computing continuum.

    Abstract: With the growing demand for video in recent years and the growth of cloud computing, we are seeing increased investment in adaptive video streaming. One popular protocol that has emerged in this context is Dynamic Adaptive Streaming over HTTP (DASH). It proposes that a video player can dynamically adjust the bitrate based on available bandwidth, maximizing the efficient use of resources and enabling a personalized experience - depending on the user's device and network conditions. The architecture proposed in this project is based on the edge-cloud continuum, highlighting how video streaming services can be optimized to handle mobile users. To this end, the Multi-Armed Bandits problem was explored and reinforcement learning algorithms, such as Epsilon-Greedy and UCB1, were used, combined with latency metrics to better redirect client requests to cache servers. The scenarios explored were a stationary client with one server under stress, and a mobile client with and without server stress. The results show that the quality of user experience does not depend solely on the proximity to the cache server that stores the video content. It is necessary to capture additional metrics. In our case, we captured server congestion and concluded that request redirection also acts as a load balancer in adaptive video streaming systems. In addition, we saw how the Content Steering architecture can allow the end user to have lower latency during their streaming experience by exploiting the computing continuum.
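
    The abstract mentions UCB1 combined with latency metrics for redirecting requests to cache servers. The following is a minimal sketch of that general idea, not the project's implementation: the class name, the mapping from latency to a reward in [0, 1], the `worst_ms` cap and the simulated latencies are all assumptions made for illustration.

```python
import math
from typing import List


def latency_to_reward(latency_ms: float, worst_ms: float = 500.0) -> float:
    """Map a latency to a reward in [0, 1]; lower latency gives higher reward.

    The cap `worst_ms` is a hypothetical normalization constant.
    """
    return 1.0 - min(latency_ms, worst_ms) / worst_ms


class UCB1Selector:
    """UCB1 bandit in which each arm is a candidate cache server."""

    def __init__(self, n_servers: int) -> None:
        self.counts: List[int] = [0] * n_servers    # requests sent to each server
        self.means: List[float] = [0.0] * n_servers  # running mean reward per server
        self.total = 0                               # total requests so far

    def select(self) -> int:
        # Try every server once before applying the UCB1 index.
        for server, count in enumerate(self.counts):
            if count == 0:
                return server
        # UCB1 index: mean reward plus an exploration bonus that shrinks
        # as a server accumulates observations.
        return max(
            range(len(self.counts)),
            key=lambda s: self.means[s]
            + math.sqrt(2.0 * math.log(self.total) / self.counts[s]),
        )

    def update(self, server: int, reward: float) -> None:
        self.counts[server] += 1
        self.total += 1
        # Incremental update of the running mean.
        self.means[server] += (reward - self.means[server]) / self.counts[server]


# Simulated, fixed per-server latencies in milliseconds (illustrative only).
simulated_latency_ms = [200.0, 50.0, 400.0]
selector = UCB1Selector(len(simulated_latency_ms))
for _ in range(1000):
    chosen = selector.select()
    selector.update(chosen, latency_to_reward(simulated_latency_ms[chosen]))
```

    With these simulated latencies the selector sends most requests to the lowest-latency server while still probing the others occasionally, which is the behavior that lets request redirection react when a server becomes congested.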

  • IC-PFG-24-27 pdf bib
    MobFogSim Simulator with Federated Training and Machine Learning Support.
    Leonardo Livrare Martins, Luiz Fernando Bittencourt, and Diogo Machado Gonçalves.
    December 2024. In Portuguese, 24 pages.

    Summary: Federated Learning (FL) is an emerging approach in Machine Learning (ML) that enables collaborative training of models without the need to centralize data. This technique has gained prominence in the literature due to its ability to preserve privacy and enable training in distributed environments, such as mobile devices connected to the edge of the network. The increased interest in FL reflects the relevance of exploring decentralized solutions to problems involving large volumes of data and privacy constraints. Federated Learning (FL) is a good solution because it allows preprocessing and training of data to be performed directly on local devices, preserving privacy by avoiding the transfer of raw data to central servers. Instead, only updated model parameters (such as weights and gradients) are shared across the network, significantly reducing data traffic and the risks associated with exposing sensitive information. The MobFogSim simulator is recognized for its ability to replicate distributed environments, simulating the mobility of mobile devices running applications connected to servers in edge computing architecture. These characteristics make the simulator a suitable environment for federated training studies, given its ability to represent complex scenarios of distributed systems and dynamic interactions between devices and servers. Therefore, in this project, we aimed to enable the integration of Machine Learning (ML) models into the MobFogSim simulator, expanding its functionalities to support federated training studies in scenarios with mobile devices and decentralized data. 
    To achieve these goals, the project was structured in three main modules: 1) the simulator was adapted to generate detailed logs per device, enabling data collection for federated training using the Flower framework; 2) a Machine Learning model was trained in a collaborative and decentralized manner; and 3) an API, developed with FastAPI and PyTorch, was implemented to integrate the trained model with the simulator, enabling dynamic decision-making during simulations.
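
    The summary describes clients sharing only model parameters with a server. A common way to combine such updates is federated averaging (FedAvg), sketched below in plain Python under the assumption that each client reports a flat list of weights and its number of local training samples. This illustrates the general technique only; it is not the project's code and does not use the Flower API.

```python
from typing import List, Sequence


def federated_average(
    client_weights: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Weighted average of client parameters, weighted by local dataset size."""
    total_samples = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(weights[i] * size for weights, size in zip(client_weights, client_sizes))
        / total_samples
        for i in range(n_params)
    ]


# Two hypothetical clients: the second holds three times as much data,
# so its parameters weigh three times as much in the global model.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

    Only the parameter lists and sample counts cross the network in this scheme; the raw training data stays on each device.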

  • IC-PFG-24-26 pdf bib
    Hive monitoring with the Internet of Things.
    C. V. M. Gomes, J. F. Theodoro, L. Q. Borges, J. V. Dos S. Oliveira, and L. F. Bittencourt.
    December 2024. In Portuguese, 28 pages.

    Summary: This is the report of the Final Course Project carried out in partnership with Prof. Roberto Greco from the Institute of Geosciences under the guidance of Prof. Luiz F. Bittencourt from the Institute of Computing, whose objective is to develop a system for collecting data from beehives that will be installed in schools for student learning. The system collects data from the hive environment via Wi-Fi with an embedded board, maintaining the safety of the bees and the integrity of the board. The project had four boards provided, in addition to the sensors, by Prof. Fabiano Fruett from the School of Electrical and Computer Engineering. The sensors measure temperature, humidity, sound, pressure and proximity. Data on weather conditions from the OpenWeatherMap API is also collected.

  • IC-PFG-24-25 pdf bib
    Exploring Content Steering Strategies for Adaptive Video Broadcasts.
    Lucas Jacinto Goncalves.
    December 2024. In Portuguese, 24 pages.

    Summary: This is the report of the project carried out as a Final Course Work at the Institute of Computing, in partnership with Prof. Luiz Fernando Bittencourt and Co-Advisor Eduardo de Souza Gama, whose objective is to develop a content steering system in the edge-cloud. The work describes the development of an adaptive content steering system to optimize video streaming in edge-cloud environments, addressing challenges in content distribution by integrating dynamic management of cache servers with adaptive network control. The platform offers an interactive web dashboard to monitor and control critical parameters such as latency, packet loss and bandwidth in real time, using Docker containers to simulate a distributed infrastructure. With two server selection methods, one traditional and one based on Artificial Intelligence, and six presets (2G to Fiber) to simulate different network conditions, the system was shown to maintain the Quality of Experience (QoE) at adequate levels even in adverse networks. Its modular architecture and robust logging and graph generation system facilitate detailed performance analysis and provide a solid foundation for future expansions, contributing to the advancement of adaptive streaming technologies in heterogeneous networks.

  • IC-PFG-24-24 pdf bib
    Analysis of Energy Performance in Autonomous Drone Fleets.
    Fernando Bittencourt, and Fabiola M. C. de Oliveira.
    December 2024. In Portuguese, 16 pages.

    Summary: This work analyzes the energy performance of autonomous drone fleets in urban delivery scenarios, evaluating collision detection and avoidance strategies applied to different energy consumption models. Thus, this work highlights the importance of studying and researching empirical energy consumption models, in addition to the care in choosing the models and calibrating their parameters to ensure more realistic and accurate autonomous drone simulations.

  • IC-PFG-24-23 pdf bib
    Simulating the environmental cost of scenarios in federated learning.
    Matheus Casanova, Tobias Zorzetto, and Victor Dominguite.
    December 2024. In Portuguese, 25 pages.

    Summary: The exponential growth of mobile devices and the increasing complexity of machine learning models have driven interest in decentralized approaches such as federated learning. This methodology allows collaborative training of models without the need to transfer sensitive data, promoting greater privacy and efficiency. However, the impact of federated learning on the energy consumption of mobile devices is still an underexplored challenge, especially in real-world usage scenarios. Furthermore, there are still no widespread tools to measure the impact of this distributed training across different devices.

    In view of this, this project presents the development of an Android application designed to enable on-device training of machine learning models using custom datasets. The application collects detailed energy consumption metrics during the training process, providing support for the analysis of the impact of federated learning on mobile devices. In addition, a federated learning simulation based on the obtained measurements was implemented, with the aim of evaluating the environmental and performance impact of this approach in different scenarios.

    Thus, this work provides means for measuring the energy impact of federated learning in customized scenarios. The results demonstrate the feasibility of the application to assist researchers and developers in analyzing trade-offs between performance, energy consumption and environmental impact in federated learning architectures.

  • IC-PFG-24-22 pdf bib
    Internet of Things and computer vision for hive monitoring and bee flow counting.
    Carlos Alberto Astudillo Trujillo, and Luiz Fernando Bittencourt.
    December 2024. In Portuguese, 37 pages.

    Summary: This work is a report carried out as a Final Graduation Project at the Institute of Computing, in partnership with Prof. Roberto Greco from the Institute of Geosciences. The objective of the project is to develop a system for collecting and processing data from beehives, for study in schools. The system is composed of embedded boards and a Web application. The boards collect data on temperature, humidity, pressure and sound, communicating with the application via Wi-Fi through HTTP requests. The application allows the observation of data from different beehives and sensors, as well as the upload of a video showing the flow of bees around the entrance of the hive. This video is then processed to count the bees that enter and leave, displaying the results in graphs.

  • IC-PFG-24-20 pdf bib
    Exploring technical debt in code samples: A detailed analysis of the implications for code maintainability.
    Rafael Santa Rosa Alves and Thaina Milene de Oliveira.
    December 2024. In Portuguese, 32 pages.

    Summary: Context: This work investigates the detection and analysis of code smells and refactorings in code sample projects, addressing how these practices impact code quality over time. The presence of code smells (code characteristics that may indicate design problems and compromise maintainability) is directly related to the quality of a software system and its ability to evolve. Refactorings, in turn, are modifications to the code that do not change its external behavior, but aim to improve its internal structure, facilitating maintenance and reducing the possibility of future bugs. Objective: The objective of this research is to understand the presence, evolution and effectiveness of code smells and refactorings in code sample repositories. The Goal Question Metric (GQM) approach is used to structure the study, defining specific goals and creating metrics to quantify the impact of refactorings on code quality, as well as to analyze how code smells change over time in response to these refactorings. Method: Six representative and widely used software ecosystems were selected, such as spring-guides, spring-cloud-samples, googlesamples, among others, applying criteria such as minimum number of contributors, LOC between 500 and 100,000, and recent activity in the repositories. Data collection uses the GitHub API to identify the repositories and SonarQube (version 10.7) together with SonarScanner (version 6.2.1) to analyze code smells. RefactoringMiner was used to identify refactorings. Results: The results indicate that code smells are frequent in code samples, but their resolution is not always prioritized. Refactorings, in turn, present cyclical patterns and their frequency is related to factors such as code complexity and collaboration. Conclusion: The study demonstrates the importance of tools and processes for managing code smells and refactorings in code samples. It is crucial to promote code quality and make developers aware of the importance of these practices to ensure the effectiveness and longevity of projects.

  • IC-PFG-24-18 pdf bib
    Using Sparse Voxel Octrees for Real-Time Ray Tracing.
    Luiz Henrique Aguiar de Lima Alves, Helio Pedrini, and Jose Mario De Martino.
    December 2024. In Portuguese, 11 pages.

    Summary: This work explores the application of the rendering paradigm by Ray Tracing for real-time visualization of voxel-described models. We first describe an implementation using Compute Shaders with the data organized in a three-dimensional (3D) array. We then introduce an optimization by changing the data structure to a Sparse Voxel Octree (SVO), explaining its construction, how it can be traversed, as well as the efficiency gains in terms of memory and graphics processing achieved with it. Finally, the importance of this acceleration structure to achieve satisfactory performance metrics in interactive applications is demonstrated.

  • IC-PFG-24-17 pdf bib
    Coloring Drawings.
    Lucca Ferreira Paiva and Helio Pedrini.
    November 2024. In Portuguese, 20 pages.

    Summary: Drawings and animations are composed of well-defined lines that separate objects from the background. Therefore, colorization methods for natural images are not efficient in this scenario. Thus, several approaches have emerged in recent years with different strategies to solve this problem. This work investigates the main techniques developed for colorization of animations, highlighting their similarities and differences. In addition, it includes a discussion of the datasets used and the evaluation methodologies, as well as the main challenges in the area.

  • IC-PFG-24-16 pdf bib
    Comparative Analysis of Spotted Text Classification Techniques.
    Mateus Batista and Jacques Wainer.
    July 2024. In English, 14 pages.

    Summary: This work seeks to implement and compare different spotted text classification techniques. Spotteds are pages that publish messages and texts anonymously on social networks. For this study, a set of texts was separated, labeled as 'postable' and 'non-postable'. From this data, three different classification approaches were tested: using OpenAI's ready-made moderation service (Moderation Service), using a ready-made language model with a specific prompt, and using embeddings in traditional classification models, such as SVM, Naive Bayes and Random Forest. The recall metric was used to evaluate the initial performance of each approach, followed by hyperparameter adjustments to optimize the results.

    Abstract: This work seeks to implement and compare different spotted text classification techniques. Spotted are pages that publish messages and texts anonymously on social networks. For this study, a set of texts was separated, labeled as 'postable' and 'non-postable'. From this data, three different classification approaches were tested: using OpenAI's ready-made moderation service (Moderation Service), using a ready-made language model with a specific prompt, and using embeddings in traditional classification models, such as SVM, Naive Bayes and RandomForest. The recall metric was used to evaluate the initial performance of each approach, followed by hyperparameter adjustments to optimize the results.

  • IC-PFG-24-14 pdf bib
    Integer Linear Programming Formulations for Variants of the String Partition Problem.
    Felipe Romeiro, Gabriel Siqueira, and Zanoni Dias.
    July 2024. In English, 18 pages.

    Summary: This report analyzes adaptations of two models from the literature for variants of the Minimum Common String Partition problem on balanced strings. As this problem originates from Computational Biology, where strings represent genomes, we propose extensions of the models that handle variants of the problem. In these variants we consider the case in which the strings are not balanced, representing genomes with different sets of genes, and the case in which the characters have positive or negative signs, representing the orientation of the genes. We also take into account variations that consider the number of nucleotides between genes. Finally, the different models were tested and compared in terms of execution time and solution quality.

  • IC-PFG-24-10 pdf bib
    Study and Implementation of Full Waveform Inversion (FWI).
    Fábio de Andrade Barboza and Hervé Cédric Yviquel.
    July 2024. In English, 14 pages.

    Summary: Detailed knowledge of the Earth's subsurface is extremely important for areas such as the exploration of natural resources, carrying out environmental studies and infrastructure planning. This work explores Full Waveform Inversion (FWI), a geophysical imaging technique capable of generating detailed models of the Earth's subsurface from seismic wave propagation data collected at the surface. The simulation of wave propagation, fundamental for geophysical imaging, and the experimental implementation of FWI using the computing infrastructure of the Computing Systems Laboratory (LSC) of the Computing Institute (IC) at Unicamp are also covered. Seismic modeling and seismic data simulation complement the analysis. It concludes with a discussion of future improvements, including parallel and distributed execution of the application and the use of machine learning techniques to improve the quality of the models generated by the application.

    Abstract: Detailed knowledge of the Earth's subsurface is of utmost importance for areas such as natural resource exploration, environmental studies, and infrastructure planning. This work explores Full Waveform Inversion (FWI), a geophysical imaging technique capable of generating detailed models of the Earth's subsurface from seismic wave propagation data collected on the surface. It also addresses the simulation of wave propagation, which is fundamental for geophysical imaging, and the experimental implementation of FWI using the computing infrastructure of the Laboratory of Computer Systems (LSC) at the Institute of Computing (IC) of Unicamp. Seismic modeling and seismic data simulation complement the analysis. The study concludes with a discussion on future improvements, including the parallel and distributed execution of the application and the use of machine learning techniques to enhance the quality of the models generated by the application.

  • IC-PFG-24-09 pdf bib
    SQISign: A post-quantum signature scheme.
    David Afonso Borges dos Santos and Julio López.
    July 2024. In English, 19 pages.

    Abstract: This work is a study of the post-quantum signature scheme SQISign, one of the candidates in the NIST Post-Quantum Cryptography Standardization contest. The SQISign algorithm assumes the hardness of finding a path in supersingular isogeny graphs and uses the Deuring correspondence to operate in the quaternion algebra world during signing and in the elliptic curve world during verification. Among the other candidates in the same category, SQISign has relatively small public key and signature sizes, which is an important advantage. The recent SIDH attacks showed new ways of efficiently representing isogenies. This resulted in some new variants of SQISign, now using 2-, 4-, and 8-dimensional isogenies. Among the available variants, we discuss SQISign2D-West and SQISignHD.

  • IC-PFG-24-08 pdf bib
    Implementation of a Distributed System for the Smart Parking Solution.
    André Luis R. Gouvêa, Lucas B. A. Farias, Tiago P. Dall'Oca, and Juliana Freitag Borin.
    July 2024. In English, 20 pages.

    Summary: In this work, we present an extension of the Smart Parking project, a parking space monitoring system implemented on the Unicamp campus. The objective was to develop an efficient technological solution for parking management, increasing user convenience and optimizing the use of available spaces. We used technologies such as MongoDB for data management, MQTT for communication between IoT devices, and React for the front end, creating a modular, scalable and easy-to-maintain system. The architecture follows software design practices that ensure low coupling and high cohesion between components. Unit tests with Jest ensured the reliability of the system, which was validated through controlled simulations with multiple parking lots.

  • IC-PFG-24-07 pdf bib
    InventIo: Equipment Tracking with RFID.
    Cristiano Sampaio Pinheiro, Lucas Ribeiro Rodrigues, and Juliana Freitag Borin.
    June 2024. In English, 26 pages.

    Summary: InventIo is a platform for exclusive use by the Unicamp Institute of Computing, designed for tracking objects in order to manage the movement of critical items. Essentially, the application enables the registration of objects with radio frequency identification (RFID) tags and the management of sensors for their detection. Whenever a sensor identifies a tag (object), a history is generated containing its location and the time at which the item was identified.

    This stage of the project seeks to advance the existing system by adding new functionalities, understanding and validating the use of RFID technology and, finally, installing the solution on the premises of the Institute of Computing. Furthermore, it is also of interest to carry out evaluations to expand the scope of use of this technology on campus.

    After making the planned improvements and conducting a series of experiments, the results obtained demonstrate the potential of the InventIo platform but also expose considerable limitations. In particular, RFID technology proved to be highly sensitive to interference, which restricts the use of the system. Regarding the application of this technology in other contexts, the possibility of automating the Institute's inventory process was evaluated. The results are promising, and, despite limitations regarding the use of tags on metallic surfaces, means to circumvent this restriction have been explored.

    Abstract: InventIo is a platform exclusively used by the Institute of Computing at Unicamp, designed for tracking objects to manage the movement of critical items. Essentially, the application enables the registration of objects with radio frequency identification (RFID) tags and the management of sensors for their detection. Whenever a sensor identifies a tag (object), a history is generated containing its location and the time the item was identified.

    The current phase of the project aims to advance the existing system by adding new functionalities, understanding and validating the use of RFID technology, and finally, installing the solution within the Institute of Computing's premises. Additionally, there is an interest in conducting evaluations to expand the scope of this technology's use on campus.

    After implementing the planned improvements and conducting a series of experiments, the results demonstrate the potential of the InventIo platform but also reveal significant limitations. In particular, RFID technology proved to be highly sensitive to interference, which restricts the system's use. Regarding the application of this technology in other contexts, the possibility of automating the Institute's inventory process was evaluated. The results are promising, and despite limitations concerning the use of tags on metallic surfaces, methods to overcome this restriction were explored.

  • IC-PFG-24-06 pdf bib
    Performance and feasibility analysis of federated learning algorithms using Flower.
    Thiago dos Santos Solera, Pedro Strambeck Nogueira, Allan Mariano de Souza, Joahannes Bruno Dias da Costa, and Luiz Fernando Bittencourt.
    July 2024. In English, 14 pages.

    Summary: Federated learning is a common solution to the problem of training artificial intelligence models in environments where data cannot be easily centralized, while also preserving client privacy. However, implementing large-scale federated learning solutions for real-world applications can be particularly challenging. The Flower framework proposes to deal with problems such as hardware heterogeneity and implementation language, while facilitating the efficient execution of algorithms in larger-scale tests. This work evaluates the performance and feasibility of FedProx and FedAvgM, two algorithms used for non-IID data, when implemented with the Flower framework. It measures their accuracy and scalability, as well as the time needed to achieve those results, in order to classify the feasibility of using each solution.
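
    As a rough illustration of the two algorithms named above, the sketch below shows the server-side momentum update usually associated with FedAvgM and the proximal term that FedProx adds to each client's gradient. It is plain Python, not Flower's API; the function names, the list-of-floats model representation, and the default hyperparameters are assumptions made only for this example.

```python
from typing import List, Sequence, Tuple

Vector = List[float]  # a model is represented as a flat list of weights (assumption)


def weighted_average(client_ws: Sequence[Vector], num_examples: Sequence[int]) -> Vector:
    """FedAvg-style aggregation: average client weights by local dataset size."""
    total = sum(num_examples)
    dim = len(client_ws[0])
    return [
        sum(w[i] * n for w, n in zip(client_ws, num_examples)) / total
        for i in range(dim)
    ]


def fedavgm_step(
    global_w: Vector,
    client_ws: Sequence[Vector],
    num_examples: Sequence[int],
    momentum_buf: Vector,
    server_lr: float = 1.0,
    beta: float = 0.9,
) -> Tuple[Vector, Vector]:
    """One server round with momentum applied to the averaged client update."""
    avg = weighted_average(client_ws, num_examples)
    # Pseudo-gradient: how far the averaged clients moved away from the global model.
    pseudo_grad = [g - a for g, a in zip(global_w, avg)]
    new_buf = [beta * m + d for m, d in zip(momentum_buf, pseudo_grad)]
    new_w = [g - server_lr * m for g, m in zip(global_w, new_buf)]
    return new_w, new_buf


def fedprox_gradient(local_grad: Vector, local_w: Vector, global_w: Vector, mu: float) -> Vector:
    """Client gradient plus the proximal term mu * (w_local - w_global)."""
    return [g + mu * (w - gw) for g, w, gw in zip(local_grad, local_w, global_w)]
```

    With beta set to 0 and server_lr set to 1, fedavgm_step reduces to plain weighted averaging, which makes the effect of the momentum term easy to isolate.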

  • IC-PFG-24-04 pdf bib
    Impact of Differential Privacy on distance measurement techniques.
    Filipe Maciel Roberto, Bruno Henrique Emidio Leite, and Luiz Fernando Bittencourt.
    July 2024. In English, 18 pages.

    Summary: In recent years, several algorithms and metrics have been developed to efficiently define the distance between different groups of data, techniques generally applied in user clustering algorithms. At the same time, there is growing concern about how to guarantee the security of data received from a user, so that their identity is not compromised and their information is not reproduced by third parties. In this context, this work analyzes how different privacy techniques impact commonly used distance metrics, applying different degrees of privacy to them, in order to verify whether privacy can be implemented alongside these metrics. Using simulation libraries, such as Flower, and simpler data sets, it was observed that the metrics are very sensitive to the application of privacy, but still tolerate less severe degrees of privacy.
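
    To make the interaction between privacy noise and a distance metric concrete, the sketch below perturbs two vectors and compares their Euclidean distance before and after. The summary does not say which privacy mechanism or metrics the authors used; the Laplace mechanism, the function names, and the `sensitivity` parameter (taken here as the L1 sensitivity of the whole vector) are assumptions chosen only for illustration.

```python
import math
import random
from typing import List


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize(vec: List[float], epsilon: float, sensitivity: float, rng: random.Random) -> List[float]:
    """Add independent Laplace noise with scale sensitivity / epsilon to each coordinate."""
    scale = sensitivity / epsilon
    return [x + laplace_noise(scale, rng) for x in vec]


def euclidean(a: List[float], b: List[float]) -> float:
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


if __name__ == "__main__":
    rng = random.Random(0)
    a, b = [0.0, 0.0], [3.0, 4.0]
    print("true distance:", euclidean(a, b))
    # Smaller epsilon means stronger privacy and a larger expected distortion.
    for eps in (10.0, 1.0, 0.1):
        noisy = euclidean(privatize(a, eps, 1.0, rng), privatize(b, eps, 1.0, rng))
        print(f"epsilon={eps}: noisy distance={noisy:.3f}")
```

    Running the loop with decreasing epsilon gives a feel for how quickly a distance value degrades as the privacy level becomes stricter.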

  • IC-PFG-24-03 pdf bib
    Hive monitoring with the Internet of Things.
    J. C. Gonçalves, D. M. Dos Santos, L. JS Dos S. P. Monroe, V. A. Scholze, F. Fruett, and L. F. Bittencourt.
    July 2024. In English, 40 pages.

    Summary: This report describes the Final Graduation Project carried out at the Institute of Computing, in partnership with Prof. Roberto Greco from the Geosciences Institute and Prof. Fabiano Fruett from the Faculty of Electrical and Computer Engineering. Its objective is to develop a system for collecting data from hives that will be installed in schools so that students can learn from them. The system collects data from the hive's environment over a wireless network using an embedded board, all at a distance from the bees themselves, thus preserving both their safety and the integrity of the board. Prof. Fabiano supplied four boards and the sensors, which measure temperature, humidity, sound, pressure and proximity; weather conditions are obtained from the OpenWeatherMap API.

  • IC-PFG-24-02 pdf bib
    Distribution of on-demand replicas for emergent software.
    Igor Fernando Mandello, Luiz Fernando Bittencourt, and Roberto Rodrigues Filho.
    July 2024. In English, 12 pages.

    Summary: This project aims to develop a methodology for streamlining the remote distribution of components in the context of self-distributed systems. The chosen approach seeks to optimize the scalability and adaptability of systems, taking advantage of the infrastructure of container management systems to keep running only what is strictly necessary. With this, it is possible to observe how an application can use runtime adjustments to improve its turnaround time while keeping costs under control.

  • IC-PFG-24-01 pdf bib
    Comparative analysis of load balancing algorithms in heterogeneous systems.
    Júlia Alves de Arruda and Lucas Hideki Carvalho Dinnouti.
    July 2024. In English, 17 pages.

    Summary: This work is a comparative analysis of different load balancing techniques: Round Robin, Weighted Round Robin, metadata-based, and machine-learning-based. The architecture was based on a message processing platform that transports content of different types, using real data. The objective was to find the best strategy for processing a large volume of messages with different types and sizes of instances, seeking to understand whether algorithms customized for the application domain perform better. For the proposed problem, it was concluded that such algorithms, for example the metadata-based ones, can be more efficient. On the other hand, machine-learning-based algorithms did not perform well when compared to simpler techniques, due to their computational cost.
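
    For readers unfamiliar with the two baseline techniques, the sketch below contrasts Round Robin with a naive Weighted Round Robin. It is a minimal illustration, not the authors' implementation; the server names, the weights, and the helper `take` are hypothetical.

```python
from itertools import cycle
from typing import Dict, Iterator, List


def round_robin(servers: List[str]) -> Iterator[str]:
    """Rotate through the servers in a fixed order, ignoring their capacity."""
    return cycle(servers)


def weighted_round_robin(weights: Dict[str, int]) -> Iterator[str]:
    """Give each server `weight` consecutive turns per rotation (naive, non-interleaved)."""
    expanded = [name for name, weight in weights.items() for _ in range(weight)]
    return cycle(expanded)


def take(it: Iterator[str], n: int) -> List[str]:
    """Collect the next n assignments from a balancer."""
    return [next(it) for _ in range(n)]


if __name__ == "__main__":
    # Hypothetical heterogeneous pool: "big" has twice the capacity of "small".
    print(take(round_robin(["big", "small"]), 4))
    print(take(weighted_round_robin({"big": 2, "small": 1}), 6))
```

    In a heterogeneous pool, the weighted variant sends proportionally more messages to larger instances, which is the property the simpler Round Robin lacks.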


  • Instituto de Computação :: State University of Campinas
    Av. Albert Einstein, 1251 - Cidade Universitária Zeferino Vaz • 13083-852 Campinas, SP - Brazil • Phone: [19] 3521-5838