Technical Reports Published in 2015

  • IC-15-08 pdf bib
    Adaptive multiscale function approximation - I: General discrete bases.
    Gilcélia Regiâne de Souza and Jorge Stolfi.
    December 2015. In English, 22 pages.

    Resumo: We describe efficient algorithms for adaptive multiscale approximation of functions that are sampled with non-uniform density and/or have important small-scale detail confined to small regions of their domain. Our algorithms are quite general, independent of the domain shape, the mesh, and the approximation basis. A multiscale basis is defined generically as a hierarchy of single-scale bases whose elements have progressively smaller supports and are arranged more densely over the domain. An adaptive multiscale basis is a subset of a full basis that excludes elements contributing very little to the approximation. We tested our algorithms with a hierarchical basis of Bézier finite elements on regular triangular meshes. The tests show that the size of the adaptive basis can be 1/100 of the size of a full single-scale basis with the same spatial resolution, resulting in enormous savings of computation time. (Earlier versions of these algorithms were described in the first author's Ph.D. thesis.)

    Abstract: We describe efficient algorithms for adaptive multiscale approximation of functions that are sampled with uneven density and/or have important small-scale detail limited to small portions of their domain. Our algorithms are very general, independent of domain shape, mesh, and approximation basis. A full multiscale basis is defined generically as a hierarchy of single-scale bases, whose elements have progressively smaller supports and are arranged more densely over the domain. An adaptive multiscale basis is a subset of a full one that excludes elements contributing very little to the approximation. We tested our algorithms with a hierarchical basis of Bézier finite elements on regular triangular meshes. The tests show that the size of the adaptive basis can be about 1/100 of the size of a full single-scale basis of the same spatial resolution, leading to enormous savings in computation time. (Earlier versions of these algorithms were described in the Ph.D. thesis of the first author.)
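
    For concreteness, the sketch below illustrates the general idea of adaptive basis pruning in Python; the 1-D hat-function hierarchy and the contribution threshold are assumptions made for this illustration only and are not the Bézier finite elements or the selection rule used in the report.

      import numpy as np

      def hat(x, center, width):
          # Piecewise-linear "hat" bump with the given center and support width.
          return np.clip(1.0 - np.abs(x - center) / width, 0.0, None)

      def multiscale_basis(x, levels):
          # Hierarchy of single-scale bases: each level halves the support width
          # and doubles the density of the centers over [0, 1].
          cols = []
          for lev in range(levels):
              n = 2 ** lev + 1
              width = 1.0 / (n - 1)
              for c in np.linspace(0.0, 1.0, n):
                  cols.append(hat(x, c, width))
          return np.column_stack(cols)

      def adaptive_fit(x, y, levels, rel_tol=1e-2):
          # Full least-squares fit, then drop elements whose contribution to the
          # fitted values is below rel_tol of the overall signal norm.
          A = multiscale_basis(x, levels)
          coef, *_ = np.linalg.lstsq(A, y, rcond=None)
          contrib = np.abs(coef) * np.linalg.norm(A, axis=0)
          keep = contrib >= rel_tol * np.linalg.norm(y)
          coef_ad, *_ = np.linalg.lstsq(A[:, keep], y, rcond=None)
          return keep, coef_ad, A[:, keep] @ coef_ad

      x = np.linspace(0.0, 1.0, 400)
      y = np.exp(-((x - 0.7) / 0.02) ** 2) + 0.3 * x   # small-scale detail near x = 0.7
      keep, coef, fit = adaptive_fit(x, y, levels=7)
      print(f"kept {keep.sum()} of {keep.size} elements, "
            f"rmse = {np.sqrt(np.mean((fit - y) ** 2)):.2e}")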

  • IC-15-07 pdf bib
    New Dissimilarity Measures for Image Phylogeny Reconstruction.
    Filipe de O. Costa, Alberto Oliveira, Pasquale Ferrara, Zanoni Dias, Siome Goldenstein, and Anderson Rocha.
    November 2015. In English, 20 pages.

    Resumo: Image Phylogeny is the problem of reconstructing the structure that represents the generation history of semantically similar images, for example near-duplicate images. Typical Image Phylogeny algorithms split the problem into two steps: estimating the dissimilarity between each pair of images, and reconstructing the phylogenetic structure. Since the dissimilarity computation directly impacts the phylogeny reconstruction, we propose new approaches to the standard formulation of dissimilarity computation in Image Phylogeny, seeking to improve the accuracy of estimating the history of relationships among near-duplicate images. This new formulation explores different families of color adjustment functions, local gradients, and mutual information as dissimilarity. The results obtained with this new formulation outperform the existing methods in the literature, allowing a better analysis of the parenthood relationships in a set of images, which enables better phylogeny-based solutions to problems such as copyright enforcement and digital document forensics.

    Abstract: Image Phylogeny is the problem of reconstructing the structure that represents the history of generation of semantically similar images (e.g., near-duplicate images). Typical Image Phylogeny approaches break the problem into two steps: the estimation of the dissimilarity between each pair of images, and the actual reconstruction of the phylogeny structure. Given that the dissimilarity calculation directly impacts the phylogeny reconstruction, we propose new approaches to the standard dissimilarity calculation formulation in image phylogeny, aiming at improving the accuracy when estimating the historical relationships between near-duplicates. The new formulations explore different families of color adjustment functions, local gradients, and mutual information as dissimilarity measures. The results obtained with the new formulations remarkably outperform their existing counterparts in the literature, allowing a much better analysis of the parenthood relationships in a set of images and better enabling the deployment of phylogeny solutions to tackle traitor tracing, copyright enforcement, and digital forensics problems.
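
    As an illustration only, the sketch below computes one ingredient mentioned above, a mutual-information-based dissimilarity between two grayscale images; the histogram estimator, the sign convention, and the toy images are assumptions for this example, not the report's actual measures.

      import numpy as np

      def mutual_information(img_a, img_b, bins=64):
          # MI estimated from the joint gray-level histogram of two equal-size images.
          joint, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
          pxy = joint / joint.sum()
          px = pxy.sum(axis=1, keepdims=True)
          py = pxy.sum(axis=0, keepdims=True)
          nz = pxy > 0                          # skip empty cells to avoid log(0)
          return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

      def dissimilarity(img_a, img_b):
          # Higher when the images share less information; values like this would
          # fill the pairwise dissimilarity matrix before phylogeny reconstruction.
          return -mutual_information(img_a, img_b)

      rng = np.random.default_rng(0)
      original = rng.integers(0, 256, size=(128, 128)).astype(float)
      duplicate = np.clip(original * 0.9 + 10.0, 0.0, 255.0)   # simulated color adjustment
      unrelated = rng.integers(0, 256, size=(128, 128)).astype(float)
      print(dissimilarity(original, duplicate), dissimilarity(original, unrelated))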

  • IC-15-06 pdf bib
    Robust least squares by iterative Bayesian data adjustment.
    Gilcélia Regiâne de Souza and Jorge Stolfi.
    September 2015. In English, 9 pages.

    Resumo: We present an efficient algorithm for multivariate weighted least squares approximation in the presence of outliers (anomalous data). The algorithm iteratively builds a self-consistent probabilistic interpretation of the data; specifically, the mean and variance of the two populations (inliers and outliers), the least-squares fitted approximation, and the probability that each point is an inlier. An original feature of this algorithm is that, at each iteration, it adjusts the given sampled values according to the estimated probabilities before recomputing the least-squares approximation. This approach is more efficient than the more obvious alternative of including the probabilities in the matrix of the least-squares system. The convergence of the method is demonstrated by empirical tests.

    Abstract: We describe an efficient algorithm for robust multivariate weighted least squares approximation in the presence of outliers. The algorithm iteratively constructs a self-consistent probabilistic interpretation of the data; specifically, the mean and variance of the two populations (inliers and outliers), the least squares fitted approximation, and the probability that each data point is an inlier. An original feature of this algorithm is that, at each iteration, it adjusts the given sampled values according to the estimated probabilities, before re-computing the least squares approximation. This approach is more efficient than the more obvious alternative of including the probabilities in the matrix of the least squares system. The convergence of the method is demonstrated with empirical tests.
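
    The sketch below mimics the iteration pattern described above; the Gaussian inlier/outlier model, the scale estimates, and the value-adjustment rule are assumptions made for this illustration and do not reproduce the report's exact formulas.

      import numpy as np

      def robust_lstsq(A, y, iters=20):
          # Iteratively: fit, estimate inlier/outlier scales, compute the posterior
          # inlier probability of each point, then pull each sampled value toward
          # the current fit in proportion to its outlier probability and refit.
          y_adj = y.copy()
          for _ in range(iters):
              coef, *_ = np.linalg.lstsq(A, y_adj, rcond=None)
              r = y - A @ coef                              # residuals of the raw data
              s_in = np.median(np.abs(r)) / 0.6745 + 1e-12  # robust inlier scale
              s_out = max(3.0 * s_in, r.std() + 1e-12)      # broad outlier scale
              p_in = np.exp(-0.5 * (r / s_in) ** 2) / s_in
              p_out = np.exp(-0.5 * (r / s_out) ** 2) / s_out
              w = p_in / (p_in + p_out + 1e-300)            # inlier probability
              y_adj = w * y + (1.0 - w) * (A @ coef)        # adjust values, not the matrix
          return coef, w

      rng = np.random.default_rng(1)
      x = np.linspace(0.0, 1.0, 200)
      y = 2.0 * x + 1.0 + 0.05 * rng.standard_normal(x.size)
      y[rng.choice(x.size, 40, replace=False)] += 5.0       # 20% gross outliers
      A = np.column_stack([x, np.ones_like(x)])
      coef, w = robust_lstsq(A, y)
      print("fitted slope and intercept:", coef)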

  • IC-15-05 pdf bib
    Anais do X Workshop de Teses, Dissertações e Trabalhos de Iniciação Científica em Andamento do IC-Unicamp.
    Allan Pinto, Ariadne Carvalho, Carlos A. Petry, Edelson H. Constantino, Eliana Moreira, Elisa Rodrigues, Ewerton Silva, Ewerton Martins, Fabrício M. Gonçalves, Fagner L. Pantoja, Greice Mariano, Gustavo Alkmim, Jacqueline Santo, Julian Posada, Juliana Borin, Luis A. Marins, Roberto Pereira, Samuel Buchdid, and Vanessa Maike.
    August 2015. Partly in English, partly in Portuguese, 114 pages.

    Resumo: This technical report contains the abstracts of 21 works presented at the 2015 edition of the X Workshop de Teses, Dissertações e Trabalhos de Iniciação Científica em Andamento (WTD) of the Instituto de Computação (IC) of the Universidade Estadual de Campinas (Unicamp). The workshop took place on August 3 and 4, 2015, and drew more than 250 participants, including attendees, presenters, and organizers. On that occasion there were 27 oral presentations and 8 posters. Students could choose the form of presentation (oral, poster, or both), as well as whether to publish their work in the proceedings of the event. The publication of the abstracts, in the form of a technical report, aims to disseminate the works in progress and to record, concisely, the state of the art of the research at the Instituto de Computação in 2015.

  • IC-15-04 pdf bib
    Dependable dynamic software product line – a systematic mapping study.
    Jane D.A.S. Eleuterio, Felipe N. Gaia, Andrea Bondavalli, Paolo Lollini, Genaína N. Rodrigues, and Cecília M.F. Rubira.
    July 2015. In English, 60 pages.

    Abstract: Software Product Lines (SPLs) are an emerging technique in which several artifacts are reused (the domain) and some are customized (the variation points). An SPL can bind variation points statically (at compilation time) or dynamically (at runtime). Dynamic Software Product Lines (DSPLs) use dynamic binding to adapt to environment or requirement changes. DSPLs are commonly used to build dependable systems, defined as systems with the ability to avoid service failures that are more frequent or more severe than is acceptable. The main dependability attributes are availability, confidentiality, integrity, reliability, maintainability, and safety. To better understand this context, we conducted a Systematic Mapping Study (SMS) searching for proposals that include dependability attributes in DSPLs. Our results suggest that few studies handle dependability in the DSPL context; we selected only nine primary studies. Four of them provide explicit support for SOA and were analyzed as a secondary result.
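
    The toy sketch below (not taken from the surveyed studies) illustrates the contrast between static and dynamic binding: a single variation point whose variant can be rebound at runtime, for example to keep a service available in a degraded mode.

      from typing import Callable, Dict

      class VariationPoint:
          # Holds the currently bound variant and allows dynamic (runtime) rebinding.
          def __init__(self, variants: Dict[str, Callable[[], str]], initial: str):
              self._variants = variants
              self._bound = initial

          def rebind(self, name: str) -> None:
              self._bound = name                  # dynamic binding

          def __call__(self) -> str:
              return self._variants[self._bound]()

      storage = VariationPoint(
          {"replicated": lambda: "write to 3 replicas",
           "local-only": lambda: "write to local disk"},
          initial="replicated",
      )
      print(storage())                 # normal operation
      storage.rebind("local-only")     # adaptation triggered by an environment change
      print(storage())                 # degraded but still available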

  • IC-15-03 pdf bib
    Chasing the tail of atomic broadcast protocols.
    Daniel Cason, Parisa J. Marandi, Luiz E. Buzato, and Fernando Pedone.
    May 2015. In English, 20 pages.

    Abstract: Many applications today rely on multiple services, whose results are combined to form the application's response. In such contexts, the most unreliable service and the slowest service determine the application's reliability and response time, respectively. State-machine replication and atomic broadcast are fundamental abstractions to build highly available services. In this paper, we consider the latency variability of atomic broadcast protocols. This is important because atomic broadcast has a direct impact on the response time of services. We study four high-performance atomic broadcast protocols representative of different classes of protocol design and characterize their latency tail distributions under different workloads. Next, we assess how key design features of each protocol may be related to the observed latency tail distributions. Our observations hint at request batching as a simple yet effective way to shorten the latency tails of some of the studied protocols, an improvement within the reach of application implementers. Indeed, this observation is not only verified experimentally, but also allows us to assess which of the protocols' key design principles favor the construction of latency-predictable protocols.
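
    The sketch below shows the kind of latency-tail summary such a characterization relies on; the log-normal toy data and the percentile choices are assumptions for this illustration, not the measured behavior of the four protocols.

      import numpy as np

      def latency_tail(latencies_ms):
          # Summarize the tail of a latency sample: median, p99, p99.9, and maximum.
          q = np.percentile(latencies_ms, [50, 99, 99.9, 100])
          return dict(zip(["p50", "p99", "p99.9", "max"], q))

      # Toy data: a log-normal body plus a small fraction of slow, tail-inflating requests.
      rng = np.random.default_rng(2)
      lat = rng.lognormal(mean=0.0, sigma=0.3, size=100_000)
      lat[rng.choice(lat.size, 500, replace=False)] *= 20.0
      print(latency_tail(lat))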

  • IC-15-02 pdf bib
    Invariants-based architecture for semantic malware resistance.
    Rachid Rebiha and Arnaldo V. Moura.
    January 2015. In English, 36 pages.

    Abstract: In this work we provide a theoretical basis for the design of static and dynamic platforms with an architecture suitable for automatic in-depth malware analysis. We show how formal methods involving static and dynamic program analysis, decision procedures, and automated verification techniques can be used to build such architectures. We present automatic formal specification, classification, and detection methods for malware analysis. We provide several methods to generate formal malware signatures capable of capturing and modeling the core of the malicious behavior in a sound and concise way. We propose to generate malware invariants and abstract models directly from the specified malware code in order to use them as semantics-aware signatures. Importantly, these invariants and abstracted structures remain unchanged in many obfuscated versions of the code. We then present formal model extraction methods to model the program being inspected. These computational models are essentially abstract call graphs and pushdown systems annotated with invariants at each system call location. Thus, for each specification method and formal model introduced, we are able to reduce the problem of malware detection to a classical problem that is standard in the formal methods research community. We also present a host-based intrusion detection system in which system calls are preceded by checks of pre-computed invariants. We prove that any malware analysis or intrusion detection system will be strongly reinforced by the presence of pre-computed invariants, and will also be weakened by their absence. The security research community will benefit from mathematical formalisms that could serve as a scientific basis for malware classification. The other main purpose of this paper is to enlighten the formal methods research community about writing static malware analyzers using several formal techniques and combined logics.
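
    The schematic sketch below conveys the idea of checking a pre-computed invariant immediately before a guarded system call; the invariant, the state variables, and the call names are invented for this illustration and do not represent the report's invariant-generation method.

      def invariant_holds(state) -> bool:
          # Hypothetical pre-computed invariant for this call site: the number of
          # bytes written so far never exceeds the declared length.
          return 0 <= state["written"] <= state["declared_len"]

      def guarded_write(state, data: bytes) -> int:
          # The monitor checks the invariant before letting the system call proceed.
          if not invariant_holds(state):
              raise RuntimeError("invariant violated before system call: write blocked")
          state["written"] += len(data)      # the real system call would happen here
          return len(data)

      state = {"written": 0, "declared_len": 8}
      guarded_write(state, b"ok")            # passes the check
      state["written"] = 99                  # simulated tampering by malicious code
      try:
          guarded_write(state, b"boom")
      except RuntimeError as err:
          print(err)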

  • IC-15-01 pdf bib
    FDPE (Fast and Accurate Dynamic Power Estimation): Implementation and User Guidelines.
    Daniel Vidal and Mário Lúcio Côrtes.
    January 2015. In English, 20 pages.

    Abstract: This Technical Report presents the implementation details and usage guidelines of the FDPE framework. FDPE provides fast and accurate dynamic power estimation for gate-level digital circuits using Liberty characterization data and a Verilog simulator. The framework enables accurate power estimation and achieves a speed-up of three orders of magnitude over Spice. FDPE also enables dynamic power estimation for large-scale circuits that are infeasible to run in Spice simulations. In addition, FDPE has the following advantages over similar tools: ease of use, flexibility, low cost, and public availability.
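
    The sketch below is not FDPE's actual flow; it only illustrates the standard switching-power relation P = alpha * C * Vdd^2 * f that gate-level estimators build on, with made-up per-net activities and load capacitances.

      def dynamic_power_mw(toggles_per_cycle, load_cap_pf, vdd=1.0, freq_mhz=500.0):
          # Sum alpha * C * Vdd^2 * f over all nets; returns milliwatts.
          total_w = 0.0
          for net, alpha in toggles_per_cycle.items():
              c_farads = load_cap_pf[net] * 1e-12
              total_w += alpha * c_farads * vdd ** 2 * freq_mhz * 1e6
          return total_w * 1e3

      activity = {"clk": 1.0, "data[0]": 0.18, "data[1]": 0.12}   # toggles per cycle
      caps = {"clk": 0.050, "data[0]": 0.012, "data[1]": 0.011}   # load capacitance in pF
      print(f"estimated dynamic power: {dynamic_power_mw(activity, caps):.3f} mW")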

