Marcelo S. Reis

Publications
Up-to-date, comprehensive lists of publications from my research group, including papers, books and book chapters, are provided in the following resources:

Moreover, below I provide an overview of my thesis, which links several of its results to more recent works and publications.

Doctoral thesis
My doctoral thesis, advised by Prof. Dr. Junior Barrera, was in the field of Machine Learning. More specifically, I worked on dealing with the feature selection problem through an approximation called the U-curve problem, which enables us to deal with the search space of features subsets in a more structured, hence more efficient, way. In the following, I briefly present the chapters of my thesis and also publications associated to some of them. My thesis can be downloaded from here (in Portuguese only).

"Minimization of decomposable in U-shaped curves functions defined on poset chains -- algorithms and applications"

Chapter 1: Introduction
Motivation of the problem and thesis outline.

Chapter 2: Literature review and fundamental concepts
Review of the feature selection problem and proposed methods to tackle it; presentation of concepts and definitions used throughout the thesis.

Chapter 3: the U-curve problem
Formalization of the U-curve problem and a proof that this problem is NP-hard. An improved version of that proof was published in Estrela et al. (2020).

Chapter 4: The UCS algorithm
Demonstration that an algorithm previously introduced to tackle the U-curve problem is suboptimal; presentation of the U-Curve-Search (UCS), which is actually optimal to solve the U-curve problem. Some principles of the UCS algorithm were presented in Reis and Barrera (2013), and that algorithm was fully published in Reis et al. (2019).

Chapter 5: The path initiation problem
In UCS and other similar algorithms, finding an element of the search space that was not pruned is known as the path initiation problem. In this chapter we analyzed such problem, including some computational complexity aspects. After this thesis, we tackled this problem using reduced, ordered binary decision diagrams. This latter work was developed during Gustavo Estrela's first scientific initiation (FAPESP scholarship 14/23564-0) and also reported it in Reis et al. (2019).

Chapter 6: Branch-and-bound algorithms for the U-curve problem
We introduced the Poset-Forest Search (PFS), a branch-and-bound algorithm to tackle the U-curve problem that manages the search space through two forests of partially-ordered sets (posets). Further improvements in PFS were accomplished later, through Gustavo Estrela's second scientific initiation (FAPESP scholarship 16/25959-7).

Chapter 7: Experiments
Presentation of benchmarking experiments with the algorithms developed during the doctoral period, including computational assays using hard instances (polynomial reduction from the subset sum problem) and also from real-world data (filtering of binary images using W-operators).

Chapter 8: Generalization of the U-curve problem
Here, we provided some theoretical results in Machine Learning. Consider, for a fixed number N, the Boolean lattice composed of all possible Boolean functions with N variables (inputs). Consider the joint distribution Pr[Y = y, X = x], where X and Y are random variables that draw from the input and Boolean target domains, respectively. We demonstrated that using the mean absolute error (MAE) to estimate the error of choosing each of those Boolean functions to classify data produced by the aforementioned joint distribution results in a search space in which chains that pass through the global minimum are U-shaped, therefore prone to be solved by algorithms designed for the U-curve problem.

Chapter 9: Conclusion
Typical conclusion chapter: we recalled the main contributions of this thesis and provided some perspectives of future works.

Appendix A: The featsel framework
Description of featsel, a framework for benchmarking of feature selection algorithms and cost functions. This framework was coded in C++ and was developed especially for the execution of the computational experiments depicted throughout the thesis. featsel was published in Reis et al. (2017).