@techreport{TR-IC-PFG-18-25, number = {IC-PFG-18-25}, author = {Guilherme {Pereira Gribeler}}, title = {{Semantic Metadata Extraction from Subtitles of Video Lectures}}, month = {December}, year = {2018}, institution = {Institute of Computing, University of Campinas}, note = {In English, 25 pages. \par\selectlanguage{english}\textbf{Abstract} Video lectures can stimulate learning experiences that take individual needs and learning styles into account. Extracting relevant information from video lectures can be useful for recommendation purposes and for interpreting a concept at the exact moment of a lecture that a student may be interested in watching. Extracting semantic metadata from a video's natural-language subtitles involves challenges in dealing with informal aspects of language and in detecting semantic classes in free text. In this work, we propose a technique for extracting semantic metadata, which consists of developing a tool that extracts the subtitles of a YouTube video into a text file and then using semantic annotation tools to identify semantic classes in that text file. We conduct an evaluation to compare the effectiveness of distinct semantic annotation tools on this task. The results indicate that both the AutoMêta tool and our proposed SubAnnotator tool perform well at semantically annotating relevant terms, whereas Ontotext and NCBO are not very effective at this task. The difference between SubAnnotator and AutoMêta lies in the ability to annotate multiple occurrences of a term throughout the input text: SubAnnotator annotated a higher number of occurrences than AutoMêta. The results also indicate that the biggest challenge in extracting semantic metadata from video lectures is the definition of the ontology used by the tools. } }