Leonardo Villalobos Arias

Student: No

Projects

Publications

Evaluation of a model-based testing platform for Java applications

Description:

Model-based testing (MBT) automates the design and generation of test cases from a model. This process comprises model building, test selection criteria, test case generation, and test case execution stages. Current tools support this process at various levels of automation, most of them supporting three of the four stages. Among them is MBT4J, a platform that extends ModelJUnit with several techniques, offering a high level of automation for testing Java applications. In this study, the authors evaluate the efficacy of the MBT4J platform in terms of the number of test cases generated, errors detected, and coverage metrics. A case study was conducted using two open-source Java systems from public repositories and 15 different configurations. MBT4J was able to automatically generate five models from the source code. It also generated up to 2025 unique test cases for one system and up to 1044 for the other, resulting in 167 and 349 failed tests, respectively. Transition and transition-pair coverage reached 100% for all models. Code coverage ranged between 72% and 84% for one system and between 59% and 76% for the other. The study found that Greedy and Random were the most effective testers for finding errors.

Publication type: Journal Article

Published in: IET Software

Model-based testing areas, tools and challenges: A tertiary study

Description:

Context: Model-based testing is one of the approaches most studied by secondary studies in the area of software testing. Aggregating knowledge from secondary studies on model-based testing can be useful for both academia and industry.

Objective: The goal of this study is to characterize secondary studies in model-based testing in terms of the areas, tools, and challenges they have investigated.

Method: We conducted a tertiary study following the guidelines for systematic mapping studies. Our mapping included 22 secondary studies, of which 12 were literature surveys and 10 were systematic reviews, over the period 1996–2016.

Results: A hierarchy of model-based testing areas and subareas was built based on existing taxonomies as well as data that emerged from the secondary studies themselves. This hierarchy was then used to classify studies, tools, challenges and their tendencies in a unified classification scheme. We found that the two most studied areas are UML models and transition-based notations, both being modeling paradigms. Regarding tendencies of areas over time, we found two areas with constant activity through time, namely, test objectives and model specification. With respect to tools, we found only five studies that compared and classified model-based testing tools. These tools have been classified into common dimensions that mainly refer to the model type and the phases of the model-based testing process they support. We reclassified all the tools into the hierarchy of model-based testing areas we proposed and found that most tools were reported within the modeling paradigm area. With regard to tendencies of tools, we found that tools for testing the functional behavior of software have prevailed over time. Another finding was the shift from tools that support the generation of abstract tests to those that support the generation of executable tests. For analyzing challenges, we used six categories that emerged from the data (based on a grounded analysis): efficacy, availability, complexity, professional skills, investment cost & effort, and evaluation & empirical evidence. We found that most challenges were related to availability. In addition, we also classified challenges according to our hierarchy of model-based testing areas and found that most challenges fell in the model specification area. With respect to tendencies in challenges, we found they have moved from the complexity of the approaches to the lack of approaches for specific software domains.

Conclusions: Only a few systematic reviews on model-based testing could be found; therefore, some areas still lack secondary studies, particularly test execution aspects, language types, model dynamics, as well as some modeling paradigms and generation methods. We thus encourage the community to perform further systematic reviews and mapping studies, following known protocols and reporting procedures, in order to increase the quality and quantity of empirical studies in model-based testing.

Publication type: Journal Article

Published in: CLEI Electronic Journal

MBT4J: Automating the Model-Based Testing Process for Java Applications

Description:

Model-based testing is a process that can reduce the cost of software testing by automating the design and generation of test cases, but it usually involves some time-consuming manual steps. Current model-based testing tools automate the generation of test cases but offer limited support for the model creation and test execution stages. In this paper, we present MBT4J, a platform that automates most of the model-based testing process for Java applications by integrating several existing tools and techniques. It automates the model building, test case generation, and test execution stages of the process. First, a model is extracted from the source code; then an adapter between this model and the software under test is generated; and finally, test cases are generated and executed. We evaluated our platform with 12 configurations using an existing Java application from a public repository. Empirical results show that MBT4J is able to generate up to 2,438 test cases, detect up to 289 defects, and achieve code coverage ranging between 72% and 84%. In the future, we plan to expand our evaluation to include more software applications and perform error seeding in order to analyze the false positive and false negative rates of our platform. Improving the automation of oracles is another avenue for future research.
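The pipeline described above (build a model, connect it to the system under test, then generate and execute tests) can be illustrated with a minimal sketch of the core loop. The finite-state model, its actions, and the random-walk tester below are hypothetical stand-ins, not MBT4J's actual implementation:

```python
import random

# Illustrative finite-state model of a bounded counter (a hypothetical
# system under test); states are integers 0..3, actions are transitions.
class CounterModel:
    def __init__(self):
        self.state = 0

    def actions(self):
        # Enabled transitions from the current state.
        acts = []
        if self.state < 3:
            acts.append("inc")
        if self.state > 0:
            acts.append("dec")
        return acts

    def step(self, action):
        # Fire a transition and return the (source, action, target) triple.
        src = self.state
        self.state += 1 if action == "inc" else -1
        return (src, action, self.state)

def random_walk_tester(model, steps, seed=0):
    # Core MBT loop: pick an enabled action at random, fire it,
    # and record each distinct transition for coverage measurement.
    rng = random.Random(seed)
    covered = set()
    for _ in range(steps):
        action = rng.choice(model.actions())
        covered.add(model.step(action))
    return covered

transitions = random_walk_tester(CounterModel(), 200)
print(len(transitions))  # number of distinct transitions exercised
```

A platform like MBT4J automates the surrounding steps as well: extracting such a model from source code and generating the adapter that maps abstract actions onto real method calls of the application under test.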

Publication type: Conference Paper

Published in: Advances in Intelligent Systems and Computing

Evaluating hyper-parameter tuning using random search in support vector machines for software effort estimation

Description:

Studies in software effort estimation (SEE) have explored the use of hyper-parameter tuning for machine learning algorithms (MLA) to improve the accuracy of effort estimates. In other contexts, random search (RS) has shown results similar to grid search (GS) while being less computationally expensive. In this paper, we investigate to what extent the random search hyper-parameter tuning approach affects the accuracy and stability of support vector regression (SVR) in SEE. Results were compared to those obtained from ridge regression models and grid search-tuned models. A case study with four data sets extracted from the ISBSG 2018 repository shows that random search exhibits performance similar to grid search, rendering it an attractive alternative technique for hyper-parameter tuning. RS-tuned SVR achieved an increase of 0.227 in standardized accuracy (SA) with respect to default hyper-parameters. In addition, random search improved the prediction stability of SVR models to a minimum ratio of 0.840. The analysis showed that RS-tuned SVR attained performance equivalent to GS-tuned SVR. Future work includes extending this research to cover other hyper-parameter tuning approaches and machine learning algorithms, as well as using additional data sets.
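The comparison at the heart of this study (sampling hyper-parameter values at random versus evaluating an exhaustive grid, under the same budget) can be sketched in a few lines. The error surface and the parameter names below (`c` and `epsilon`, echoing SVR's C and epsilon) are synthetic illustrations, not the paper's actual experimental setup:

```python
import random

def validation_error(c, epsilon):
    # Hypothetical smooth error surface standing in for SVR
    # cross-validation error; its minimum lies near c=10, epsilon=0.1.
    return (c - 10) ** 2 / 100 + (epsilon - 0.1) ** 2 * 50

def grid_search(budget_per_axis):
    # Exhaustive search over an evenly spaced grid of both parameters.
    cs = [1 + i * (99 / (budget_per_axis - 1)) for i in range(budget_per_axis)]
    eps = [0.01 + i * (0.99 / (budget_per_axis - 1)) for i in range(budget_per_axis)]
    return min((validation_error(c, e), c, e) for c in cs for e in eps)

def random_search(budget, seed=0):
    # Sample hyper-parameters uniformly; same budget, no grid structure.
    rng = random.Random(seed)
    trials = ((rng.uniform(1, 100), rng.uniform(0.01, 1.0)) for _ in range(budget))
    return min((validation_error(c, e), c, e) for c, e in trials)

g = grid_search(10)      # 100 evaluations on a 10x10 grid
r = random_search(100)   # 100 random evaluations
print(round(g[0], 3), round(r[0], 3))
```

With the same number of error evaluations, random search typically lands close to the grid-search optimum, which is why it is attractive when each evaluation (a model fit) is expensive.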

Publication type: Conference Paper

Published in: Proceedings of the 16th ACM International Conference on Predictive Models and Data Analytics in Software Engineering

Hyper-parameter tuning techniques for machine learning algorithms in effort estimation: a literature mapping

Description:

Various machine learning (ML) algorithms have been used to support software development effort estimation (SEE) processes. However, the performance of these algorithms can be affected by several factors, one of which is the choice of hyper-parameters. In recent years, hyper-parameter tuning has emerged as a research area of interest for SEE that seeks to optimize the performance of ML models. In this work, we conducted a systematic literature mapping to characterize the automatic hyper-parameter tuning techniques for ML algorithms used in the context of SEE. We present the results of 67 studies identified between 2010 and 2019 and classify the hyper-parameter tuning techniques, the ML algorithms, and the datasets to which they have been applied. We also report the challenges identified, as a roadmap for future research in the area.

Publication type: Journal Article

Published in: Revista Ibérica de Sistemas e Tecnologias de Informação

Hyper-Parameter Tuning of Classification and Regression Trees for Software Effort Estimation

Description:

Classification and regression trees (CART) have been reported to be competitive machine learning algorithms for software effort estimation. In this work, we analyze the impact of hyper-parameter tuning on the accuracy and stability of CART using the grid search, random search, and DODGE approaches. We compared the results of CART with support vector regression (SVR) and ridge regression (RR) models. Results show that tuning improves the performance of CART models by up to 0.153 in standardized accuracy and reduces their stability ratio to a minimum of 0.819. Also, CART proved to be as competitive as SVR and outperformed RR.

Publication type: Book Chapter

Published in: Advances in Intelligent Systems and Computing

Comparative study of Random Search Hyper-Parameter Tuning for Software Effort Estimation

Description:

Empirical studies on software effort estimation have employed hyper-parameter tuning algorithms to improve model accuracy and stability. While these tuners can improve model performance, some might be overly complex or costly for the low-dimensionality datasets used in SEE. In such cases, a method like random search can potentially provide benefits similar to those of existing tuners, with the advantage of requiring few resources and being simple to implement. In this study, we evaluate the impact on model accuracy and stability of 12 state-of-the-art hyper-parameter tuning algorithms against random search, on 9 datasets from the PROMISE repository and 4 sub-datasets from the ISBSG R18 dataset. This study covers 2 traditional exhaustive tuners (grid and random searches), 6 bio-inspired algorithms, 2 heuristic tuners, and 3 model-based algorithms. The tuners are used to configure support vector regression, classification and regression trees, and ridge regression models. We aim to determine 1) the techniques and datasets for which certain tuners were more effective than default hyper-parameters, 2) those for which they were more effective than random search, and 3) which model(s) can be considered "the best" for which datasets. The results of this study show that hyper-parameter tuning was effective (increased accuracy and stability) in 862 (51%) of the 1,690 studied scenarios. The 12 state-of-the-art tuners were more effective than random search in 95 (6%) of the 1,560 studied (non-random-search) scenarios. Although not effective on every dataset, the combination of flash tuning, logarithm transformation, and support vector regression obtained the top ranking in accuracy on the highest number (8 out of 13) of datasets. Hyperband-tuned ridge regression with logarithm transformation obtained the top ranking in accuracy on the highest number (10 out of 13) of datasets. We endorse the use of random search as a baseline for comparison in future studies that consider hyper-parameter tuning.
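Standardized accuracy (SA), the metric quoted in the tuning studies above, is commonly computed as one minus the ratio of the model's MAE to the mean MAE of random guessing (Shepperd and MacDonell's formulation). A small sketch with hypothetical effort values; the exact baseline used in these papers may differ:

```python
import random
from statistics import mean

def mae(actual, predicted):
    # Mean absolute error between actual and predicted efforts.
    return mean(abs(a - p) for a, p in zip(actual, predicted))

def sa(actual, predicted, runs=1000, seed=0):
    # Standardized accuracy: 1 - MAE_model / MAE_p0, where MAE_p0 is the
    # mean MAE of random guessing (predicting a randomly drawn actual
    # value for each case). Higher is better; values near or below zero
    # mean the model is no better than guessing.
    rng = random.Random(seed)
    baseline = mean(
        mae(actual, [rng.choice(actual) for _ in actual]) for _ in range(runs)
    )
    return 1 - mae(actual, predicted) / baseline

efforts = [120, 80, 200, 150, 60]   # hypothetical actual efforts
preds = [110, 95, 180, 160, 70]     # hypothetical model predictions
print(round(sa(efforts, preds), 3))
```

A gain such as the 0.227 SA reported for RS-tuned SVR means the tuned model moved that much further away from the random-guessing baseline than the default-parameter model did.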

Publication type: Conference Paper

Published in: International Conference on Predictive Models and Data Analytics in Software Engineering (PROMISE 21)

Development and evaluation of a mobile application prototype for managing COVID-19 patient transfers

Description:

In this article, we present a prototype of a mobile application for managing the transfers of COVID-19 patients carried out by the PRIME team of the CEACO medical center in Costa Rica. We describe the design of the application, the technical aspects of its implementation, and the results of the user experience evaluation performed by members of the PRIME team. The evaluation of the prototype shows the usefulness of the mobile application for supporting the PRIME team's processes, and the results of the user experience study indicate a very positive perception in the categories of attractiveness, transparency, efficiency, controllability, and stimulation.

Publication type: Journal Article

Published in: Revista Ibérica de Sistemas e Tecnologias de Informação