Hybrid Algorithms for the Search of Test Data in SPL – Javier Ferrer

In Software Product Lines (SPLs) it is not possible, in general, to test all products of the family. The number of products denoted by an SPL is typically huge due to the combinatorial explosion of features. For this reason, coverage criteria have been proposed that aim to test at least all feature interactions without testing every product, e.g., all pairs of features (pairwise coverage). In addition, it is desirable to first test products composed of a set of priority features. This problem is known as the Prioritized Pairwise Test Data Generation Problem.

In our work we propose hybrid algorithms to generate prioritized test suites. The first is based on an integer linear formulation and the second on an integer quadratic (nonlinear) formulation. We compare these techniques with two state-of-the-art algorithms, the Parallel Prioritized Genetic Solver (PPGS) and a greedy algorithm called prioritized-ICPL. Our study reveals that our hybrid nonlinear approach is clearly the best in both solution quality and computation time.
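
The greedy flavour of the problem is easy to illustrate. The sketch below uses illustrative data and a simple greedy strategy (in the spirit of prioritized-ICPL, not our hybrid ILP/quadratic formulations): it repeatedly picks the product covering the most yet-uncovered weighted feature pairs, so interactions between priority features are covered first.

```python
# Toy sketch of prioritized pairwise coverage (hypothetical features and
# weights): greedily pick products covering the most uncovered weighted pairs.
from itertools import combinations

# Each product is a tuple of selected features; weights mark priority features.
products = [("A", "B", "C"), ("A", "C", "D"), ("B", "D"), ("A", "B", "D")]
weight = {"A": 3, "B": 1, "C": 1, "D": 2}  # higher = higher priority

def weighted_pairs(product):
    """All feature pairs of a product, weighted by both features' priorities."""
    return {(f, g): weight[f] * weight[g]
            for f, g in combinations(sorted(product), 2)}

uncovered = {}
for p in products:
    uncovered.update(weighted_pairs(p))

suite = []
while uncovered:
    # Pick the product whose pairs add the most still-uncovered weight.
    best = max(products, key=lambda p: sum(
        w for pair, w in weighted_pairs(p).items() if pair in uncovered))
    suite.append(best)
    for pair in weighted_pairs(best):
        uncovered.pop(pair, None)

print(suite)  # products ordered so priority interactions are covered first
```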

Presentation: ferrer-SS-SBSE2017

Deep Parameter Optimisation on Android Smartphones for Energy Minimisation – Markus Wagner

With power demands of mobile devices rising, it is becoming increasingly important to make mobile software applications more energy efficient. Unfortunately, mobile platforms are diverse and very complex, which makes energy behaviour difficult to model. This complexity challenges the effectiveness of off-line optimisation of mobile applications. In this paper, we demonstrate that it is possible to automatically optimise an application for energy on a mobile device by evaluating energy consumption “in vivo”. In contrast to previous work, we use only the device’s own internal meter. Our approach involves many technical challenges but represents a realistic path toward learning hardware-specific energy models for program code features.
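
To make the in-vivo search loop concrete, here is a minimal sketch; the parameter names are hypothetical, and `measure_energy` stands in for deploying the variant on the handset and reading its internal energy meter (simulated here so the sketch runs standalone).

```python
# Sketch of in-vivo search over exposed "deep" parameters (all names made up):
# each candidate is evaluated on the device itself and scored by measured energy.
import random

RANGES = {"buffer_size": (1, 64), "poll_interval_ms": (10, 500)}

def measure_energy(params):
    """Placeholder for the in-vivo measurement: in practice, run the app built
    with `params` on the phone and read the device's own battery counters.
    Here we simulate a noisy response surface."""
    return ((params["buffer_size"] - 32) ** 2
            + 0.1 * params["poll_interval_ms"]
            + random.gauss(0, 1))

def neighbour(params):
    """Mutate one deep parameter at random within its allowed range."""
    key = random.choice(list(params))
    lo, hi = RANGES[key]
    return {**params, key: random.randint(lo, hi)}

# Simple hill climber with a fixed evaluation budget.
current = {"buffer_size": 8, "poll_interval_ms": 100}
best_energy = measure_energy(current)
for _ in range(100):
    candidate = neighbour(current)
    energy = measure_energy(candidate)   # noisy: repeat/average in practice
    if energy < best_energy:             # keep strictly better candidates
        current, best_energy = candidate, energy

print(current, best_energy)
```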

Presentation: sbse-deepparameter

Automated Software Development Support through Optimized Code History Models – Francisco Servant

Software developers regularly need to find diverse information to perform their tasks successfully. Example information needs are: “why was this code implemented in this way?” or “who has expertise in this functionality?” Unfortunately, finding such information requires high effort, and the answers found are often inaccurate, which decreases not only software development productivity but also software quality. In my research, I follow the insight that many of the questions developers ask can be answered automatically by analyzing the data they produced in past software development tasks. I will present a series of techniques that automate the multi-revision, fine-grained analysis of source-code history. These techniques achieve high accuracy by optimizing code similarity over its modeled history. I will also demonstrate how they help software developers find relevant information about software development tasks efficiently and accurately.
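
As a rough illustration of fine-grained history analysis (an illustration only, not the technique presented in the talk), the sketch below maps each line of one revision to its most similar line in the next, so per-line history can survive renames and edits.

```python
# Sketch: match each line of an old revision to its most similar line in the
# new revision, so fine-grained (line-level) history is preserved across edits.
from difflib import SequenceMatcher

old_rev = ["def total(xs):",
           "    s = 0",
           "    for x in xs: s += x",
           "    return s"]
new_rev = ["def total(values):",
           "    result = 0",
           "    for v in values:",
           "        result += v",
           "    return result"]

def best_match(line, candidates, threshold=0.5):
    """Index of the most similar candidate line, or None below the threshold."""
    scored = [(SequenceMatcher(None, line, c).ratio(), i)
              for i, c in enumerate(candidates)]
    ratio, idx = max(scored)
    return idx if ratio >= threshold else None

# Map old line numbers to new line numbers; None means the line vanished.
mapping = {i: best_match(line, new_rev) for i, line in enumerate(old_rev)}
print(mapping)  # e.g. {0: 0, 1: 1, 2: 2, 3: 4}
```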

Maintenance of the logical consistency in Cassandra – Pablo Suárez-Ótero

Unlike relational databases, NoSQL databases like Cassandra commonly duplicate data across tables. Tables are usually query-driven (designed around the queries they serve) and have no relationships between them, which improves query performance. Consequently, if the data is not updated properly, inconsistencies can appear in the stored information. It is quite easy to introduce defects that cause data inconsistencies in Cassandra, especially as a system evolves and new tables are created, and these defects are hard to detect with conventional dynamic testing techniques. The developer is responsible for maintaining this consistency by incorporating and updating the proper procedures. This session introduces a preventive approach to these problems, establishing the procedures required to ensure the quality of the data from the point of view of consistency and thus helping the developer. These procedures include a static analysis using the conceptual model, the queries, and the logical model of the application, as well as the determination and execution of the operations that guarantee the consistency of the information.
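
As an illustration of the underlying problem (hypothetical schema, not the procedures proposed in the session), the sketch below keeps two query-driven Cassandra tables consistent by always writing both inside a logged batch; forgetting one of the inserts is exactly the kind of defect a preventive analysis aims to catch.

```python
# Minimal sketch with the Python cassandra-driver: the same video row lives in
# two query-driven tables, so every insert must hit both; a logged batch makes
# the pair of writes atomic.
from cassandra.cluster import Cluster
from cassandra.query import BatchStatement

session = Cluster(["127.0.0.1"]).connect("media")  # hypothetical keyspace

insert_by_author = session.prepare(
    "INSERT INTO videos_by_author (author, video_id, title) VALUES (?, ?, ?)")
insert_by_title = session.prepare(
    "INSERT INTO videos_by_title (title, video_id, author) VALUES (?, ?, ?)")

def add_video(author, video_id, title):
    """Insert into both tables atomically. Omitting one of these statements
    would silently break the logical consistency of the duplicated data."""
    batch = BatchStatement()  # logged batch: all statements or none
    batch.add(insert_by_author, (author, video_id, title))
    batch.add(insert_by_title, (title, video_id, author))
    session.execute(batch)
```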

Discovery of design patterns based on good practices – Rafael Barbudo

The complexity of current software systems obliges software engineers to learn from the good practices employed in previous projects. The use of design patterns is no exception, as they give developers a tool to improve the reusability and modularisation of their code. In this context, this talk will introduce a three-step prototypical model aimed at helping software engineers implement design patterns based on previous examples and successful experiences. This model uses machine learning techniques such as frequent pattern mining. A suitable representation of this knowledge will allow us to identify code chunks that might become a design pattern.
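
As a toy illustration of the mining step (all code characteristics are hypothetical), the sketch below counts feature sets that recur across classes known to participate in design patterns; sufficiently frequent sets become candidate indicators of a pattern.

```python
# Toy frequent-itemset pass: each "transaction" lists characteristics observed
# in one class; itemsets recurring across known pattern instances are kept.
from itertools import combinations
from collections import Counter

transactions = [
    {"private_ctor", "static_instance", "static_accessor"},  # Singleton-ish
    {"private_ctor", "static_instance", "lazy_init"},
    {"interface_impl", "delegation", "wraps_component"},     # Decorator-ish
    {"private_ctor", "static_accessor", "static_instance"},
]

min_support = 2
counts = Counter()
for t in transactions:
    for size in (1, 2, 3):
        counts.update(frozenset(c) for c in combinations(sorted(t), size))

frequent = {items: n for items, n in counts.items() if n >= min_support}
print(frequent)  # e.g. {'private_ctor', 'static_instance'} occurs 3 times
```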

Presentation: 2ss-sbse

Agile Effort Estimation – Natasha Nigar

Software projects that run over budget, are delivered late, and fall short of users’ expectations have been a challenge in software engineering for decades. The success or failure of a software project depends heavily on the accuracy of effort estimation. Software project cost is primarily estimated from effort, defined as the time taken by the development team members to complete individual tasks. Accurate effort estimation has therefore become increasingly important given the exponential growth of large-scale software applications.
This research contributes a novel approach to effort estimation in Agile Software Development (ASD). In ASD, changes in customer requirements are proactively incorporated while delivering software projects within budget and time. We formulate effort estimation as a search-based problem and use computational intelligence techniques, such as evolutionary algorithms (a toy sketch of this formulation follows the list below), to address the following limitations of current research on agile effort estimation:

  1. Datasets used for effort estimation contain project data from a single company. We will use cross-company data to validate our model.
  2. Apart from Scrum and XP, no other agile method has been investigated. We will study the Kanban agile method in our research.
  3. We will be the first to use lines of code (LOC) as the size metric. The benefit of this research is that it will reduce the risk of software projects falling behind schedule by providing realistic estimates.
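
The following toy sketch (made-up sprint data, not one of the datasets above) shows what the search-based formulation can look like: a (1+1) evolution strategy evolves the weights of a simple linear effort model to minimise the mean absolute error on historical sprints.

```python
# Effort estimation as search (illustrative): evolve weights of the model
# effort = w0 + w1*story_points + w2*team_size, fitness = MAE on past sprints.
import random

# (story_points, team_size) -> actual effort in person-hours (made up).
history = [((20, 4), 90), ((35, 5), 150), ((12, 3), 60), ((28, 4), 120)]

def fitness(w):
    """Mean absolute error of the linear model over the historical sprints."""
    return sum(abs(w[0] + w[1] * sp + w[2] * ts - actual)
               for (sp, ts), actual in history) / len(history)

def mutate(w, sigma=1.0):
    return [x + random.gauss(0, sigma) for x in w]

# (1+1) evolution strategy: the simplest evolutionary algorithm.
best = [0.0, 4.0, 0.0]
for _ in range(5000):
    child = mutate(best)
    if fitness(child) <= fitness(best):
        best = child

print(best, fitness(best))  # learned weights and their error on the history
```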

José Antonio Parejo and Aurora Ramírez

QoS-Aware web service composition with multi-objective evolutionary algorithms

Service-based applications often invoke web services provided by third parties in their workflows. The Quality of Service offered by a provider is usually expressed in a Service Level Agreement, which specifies cost, performance, availability, etc. In this scenario, intelligent systems can help engineers scrutinize the service market in order to select the service configurations that best fit their needs.
This search problem, known as QoS-aware web service composition, must simultaneously take into account multiple quality attributes that may be in conflict; for instance, a faster response time entails a higher cost. Therefore, several quality properties must be optimized simultaneously using multi-objective or many-objective approaches, which require computationally efficient algorithms.
This session presents the QoS-aware web service composition problem and its variants, as well as a comparative experimental study of multi-objective and many-objective algorithms. Specifically, we explore the suitability of various evolutionary algorithms to address the problem on a set of real web services with 9 quality properties. We observe that some algorithms achieve a better balance between the quality properties, or even promote specific properties while maintaining high values for the rest. Furthermore, this search can be performed at a reasonable computational cost, allowing its adoption by intelligent systems and enabling decision support in service-oriented computing.
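
To make the multi-objective core concrete, the sketch below (with hypothetical QoS numbers) computes the Pareto-optimal set of candidate compositions under three objectives to minimise; an evolutionary algorithm such as NSGA-II searches for exactly this kind of trade-off front in much larger spaces.

```python
# Pareto filter over candidate compositions: each QoS vector is
# (cost, response_time, 1 - availability), all to be minimised.
candidates = {
    "comp_a": (10.0, 120.0, 0.02),
    "comp_b": (15.0, 80.0, 0.01),
    "comp_c": (12.0, 130.0, 0.03),  # dominated by comp_a
    "comp_d": (30.0, 60.0, 0.05),
}

def dominates(u, v):
    """u dominates v if it is no worse in every objective, better in one."""
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))

pareto = {name: q for name, q in candidates.items()
          if not any(dominates(other, q) for other in candidates.values()
                     if other is not q)}
print(pareto)  # the trade-off front an engineer would then choose from
```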

Daniel Rodríguez

Software Defect Prediction in Software Engineering

In this talk, we present how data mining techniques are used to predict or rank error-prone modules in software engineering. Classifying or ranking software components according to their probability of being defective helps with the testing and maintenance phases of a project, for example to allocate resources, prioritise modules to be tested, or perform regression testing activities. We will review the publicly available datasets, the machine learning techniques, and some of the machine learning problems faced in the process. On the one hand, from the software engineering point of view, we need to deal with data quality and decide which software engineering metrics can be used. On the other hand, from the data mining point of view, we may need to deal with feature selection and issues such as noise and missing values, imbalanced data, and the evaluation measures used to assess and compare machine learning algorithms.
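
As a flavour of such a pipeline (synthetic stand-in data, not one of the public defect datasets), the sketch below trains a classifier with class weighting to counter the imbalance between clean and defect-prone modules, and evaluates it with ROC AUC, a threshold-independent measure suited to ranking.

```python
# Typical defect-prediction pipeline in miniature, using scikit-learn.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for a module-level metrics dataset: ~10% defective.
X, y = make_classification(n_samples=1000, n_features=20,
                           weights=[0.9, 0.1], random_state=0)

# class_weight="balanced" counteracts the class imbalance.
clf = RandomForestClassifier(n_estimators=100, class_weight="balanced",
                             random_state=0)

# ROC AUC evaluates how well modules are ranked by defect-proneness.
scores = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
print(scores.mean())
```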

Talks

Participants attending the summer school will have the opportunity to give a short talk on their ongoing research project and will receive feedback from senior SBSE researchers during the summer school. Information about the talks will be available after the registration deadline.

Talks I

Wednesday, June 28 (12:00 – 13:30)

  • Agile Effort Estimation (Natasha Nigar)
  • Discovery of design patterns based on good practices (Rafael Barbudo)
  • Maintenance of the logical consistency in Cassandra (Pablo Suárez-Ótero)
  • Hybrid Algorithms for the Search of Test Data in SPL (Javier Ferrer)

Talks II

Friday, June 30 (12:00 – 13:00)

  • Automated Software Development Support through Optimized Code History Models (Francisco Servant)
  • Deep Parameter Optimisation on Android Smartphones for Energy Minimisation (Markus Wagner)

Justyna Petke

Genetic Improvement – a new direction in SBSE

Software engineering problems can often be reformulated as search problems: one seeks an optimal or near-optimal solution in a wide space of candidate solutions, guided by a fitness function that distinguishes between good and bad solutions.

The talk will cover an exciting new direction in search-based software engineering, namely genetic improvement (GI). GI uses automated search to improve existing software. It has yielded dramatic improvements for a diverse set of properties such as execution time, energy, and memory consumption, as well as results in fixing and extending existing system functionality. Work on genetic improvement has won several awards, ranging from best-paper awards to several ‘Humie’ awards given for human-competitive results produced by genetic and evolutionary computation.

I will give an overview of genetic improvement and present key components of a GI framework. This keynote is based on work conducted at the CREST centre at UCL.
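
A genetic-improvement loop can be sketched in a few lines (purely illustrative, not the CREST framework): candidate patches delete statements from a working program, and fitness prefers variants that still pass all tests, then shorter code, standing in for the runtime and energy objectives mentioned above.

```python
# Tiny GI-flavoured sketch: the program is a list of source lines, a patch
# deletes one line, and fitness = (tests passed, -program length).
import random

original = [
    "def dedup(xs):",
    "    seen = []",
    "    xs = list(xs)",   # a redundant copy a patch could safely remove
    "    out = []",
    "    for x in xs:",
    "        if x not in seen:",
    "            seen.append(x)",
    "            out.append(x)",
    "    return out",
]

tests = [(([1, 1, 2, 3, 2],), [1, 2, 3]), ((["a", "a"],), ["a"])]

def fitness(lines):
    """Run the variant against the test suite; broken variants score worst."""
    env = {}
    try:
        exec("\n".join(lines), env)
        passed = sum(env["dedup"](*args) == want for args, want in tests)
    except Exception:
        return (-1, 0)
    return (passed, -len(lines))  # pass all tests first, then be shorter

best = original
for _ in range(200):
    variant = list(best)
    del variant[random.randrange(1, len(variant))]  # keep the def line
    if fitness(variant) >= fitness(best):
        best = variant

print("\n".join(best))  # the improved (shorter, still correct) variant
```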

Brief Bio:

Justyna Petke is a Senior Research Associate at the Centre for Research on Evolution, Search and Testing (CREST) at University College London (UCL). She is interested in the connections between constraint satisfaction and search-based software engineering. Her current research focuses on genetic improvement and combinatorial interaction testing. She has won several awards for her work on GI: Silver and Gold ‘Humie’ awards at GECCO 2014 and 2016, and an ACM SIGSOFT Distinguished Paper Award at ISSTA 2015.


Presentation: sbsekeynote_v2.compressed