@techreport{TR-IC-PFG-16-17, number = {IC-PFG-16-17}, author = {Felipe Lemes Galvão and Alexandre Xavier Falcão}, title = {{Evaluating Active Learning Strategies for Image Annotation of Intestinal Parasites}}, month = {December}, year = {2016}, institution = {Institute of Computing, University of Campinas}, note = {In English, 14 pages. \par\selectlanguage{english}\textbf{Abstract} Manually annotating large datasets is infeasible, and annotating them automatically with a pattern classifier depends on the quality of a much smaller training set. Active learning techniques have been proposed to select the relevant samples from large datasets by prompting a user with label suggestions to be confirmed or corrected. \par In this work we explore variations of an active learning methodology that, given an organization of the data computed beforehand and only once, allows interactive response times during the active learning process involving an expert. \par We use the optimum-path forest (OPF) data clustering algorithm for the \emph{a priori} organization and several combinations of active learning algorithms and classifiers to test the methodology. The active learning algorithms considered are root distance-based sampling (RDS), a new variation of it that we call root path-weight-based sampling (RWS), and two additional random selection baselines. The classifiers included are the OPF-based supervised and semi-supervised learning methods and an ensemble of logistic regression classifiers. \par We tested each combination of active learning and classification algorithm on a dataset extracted from images of intestinal parasites, in which the presence of a large, diverse class, namely impurities, mixed with the actual parasites poses a challenge for learning methods.} }