Hybrid Algorithms for the Search of Test Data in SPL – Javier Ferrer

In Software Product Lines (SPLs) it is generally not possible to test all products of the family, because the number of products denoted by an SPL is very high due to the combinatorial explosion of features. For this reason, coverage criteria have been proposed that try to test at least all feature interactions without having to test all products, e.g., all pairs of features (pairwise coverage). In addition, it is desirable to first test products composed of a set of priority features. This problem is known as the Prioritized Pairwise Test Data Generation Problem.
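As a rough illustration of the pairwise criterion, the sketch below measures how many of the feature pairs realisable by valid products are covered by a small test suite. The feature names, the list of valid products and the helper functions are hypothetical, and feature priorities are not modelled here.

```python
from itertools import combinations

def pairs_of(product, features):
    """Return every pair of feature assignments exhibited by one product.

    A product is the set of selected features; each pair records two
    features together with whether each is selected (True) or not (False).
    """
    assignment = [(f, f in product) for f in sorted(features)]
    return set(combinations(assignment, 2))

def pairwise_coverage(suite, valid_products, features):
    """Fraction of the pairs realisable by valid products that the suite covers."""
    required = set().union(*(pairs_of(p, features) for p in valid_products))
    covered = set().union(*(pairs_of(p, features) for p in suite))
    return len(covered & required) / len(required)

# Hypothetical toy product line: three features, four valid products.
FEATURES = {"base", "gui", "net"}
VALID = [{"base"}, {"base", "gui"}, {"base", "net"}, {"base", "gui", "net"}]

# A suite of two products covers 6 of the 8 realisable pairs.
suite = [{"base", "gui"}, {"base", "net"}]
coverage = pairwise_coverage(suite, VALID, FEATURES)
print(coverage)
```

In this toy case, testing two of the four products already covers three quarters of the pairs, which is the effect that pairwise criteria exploit on much larger product lines.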

In our work we propose hybrid algorithms to generate prioritized test suites. The first is based on an integer linear formulation and the second on an integer quadratic (nonlinear) formulation. We compare these techniques with two state-of-the-art algorithms, the Parallel Prioritized Genetic Solver (PPGS) and a greedy algorithm called prioritized-ICPL. Our study reveals that our hybrid nonlinear approach is clearly the best in both solution quality and computation time.


Deep Parameter Optimisation on Android Smartphones for Energy Minimisation – Markus Wagner

With power demands of mobile devices rising, it is becoming increasingly important to make mobile software applications more energy efficient. Unfortunately, mobile platforms are diverse and very complex, which makes energy behaviours difficult to model. This complexity presents challenges to the effectiveness of off-line optimisation of mobile applications. In this paper, we demonstrate that it is possible to automatically optimise an application for energy on a mobile device by evaluating energy consumption “in-vivo”. In contrast to previous work, we use only the device’s own internal meter. Our approach involves many technical challenges but represents a realistic path toward learning hardware-specific energy models for program code features.


Automated Software Development Support through Optimized Code History Models – Francisco Servant

Software developers regularly need to find diverse information to perform their tasks successfully. Some examples of software information needs are: “why was this code implemented in this way?”, or “who has expertise in this functionality?” Unfortunately, finding such information requires high effort, and what is found is often inaccurate, which decreases not only software development productivity but also software quality. In my research, I follow the insight that many of the questions that developers ask can be answered automatically by analyzing the data that they produced in past software development tasks. I will present a series of techniques that automate the multi-revision, fine-grained analysis of source code history. These techniques provide high accuracy by optimizing code similarity over its modeled history. I will also demonstrate how these techniques help software developers to find relevant information about software development tasks efficiently and accurately.

Maintenance of the logical consistency in Cassandra – Pablo Suárez-Otero

Unlike relational databases, in NoSQL databases such as Cassandra it is very common for data to be duplicated across different tables. This is because tables are usually query-driven (designed around the queries) and have no relationships between them, in order to increase query performance. Therefore, if the data is not updated properly, inconsistencies can appear in the stored information. It is quite easy to introduce defects that cause data inconsistencies in Cassandra, especially during the evolution of a system in which new tables are created, and these defects are hard to detect using conventional dynamic testing techniques. The developer is responsible for maintaining this consistency by incorporating and updating the proper procedures. In this session, a preventive approach to these problems is introduced, establishing the procedures required to ensure the quality of the data from the point of view of its consistency, and thus helping the developer. These procedures include a static analysis using the conceptual model, the queries and the logical model of the application. They also include the determination and execution of the operations that guarantee the consistency of the information.

Discovery of design patterns based on good practices – Rafael Barbudo

The complexity of current software systems obliges software engineers to learn about the good practices employed in previous projects. The use of design patterns is no exception, as they can provide developers with a tool to improve the reusability and modularisation of their code. In this context, this talk will introduce a three-step prototypical model aimed at supporting software engineers in implementing design patterns based on previous examples and successful experiences. This model makes use of machine learning techniques such as frequent pattern mining. A suitable representation of this knowledge will allow us to identify potential code chunks which might become a design pattern.


Agile Effort Estimation – Natasha Nigar

Software projects that are over budget, delivered late, and fall short of users’ expectations have been a challenge in software engineering for decades. The success or failure of a software project depends heavily on the accuracy of effort estimation. The software project cost is primarily estimated based on effort, which is defined as the time taken by the software development team members to complete individual tasks. Therefore, accurate effort estimation has gained great importance due to the exponential growth of large-scale software applications.
This research contributes by presenting a novel approach for effort estimation in “Agile Software Development” (ASD). In ASD, changes in customer requirements are proactively incorporated while delivering software projects within budget and time. We shall formulate effort estimation as a search-based problem and use computational intelligence techniques, such as evolutionary algorithms, to address the following limitations in the current research on agile effort estimation.

  1. Datasets used for effort estimation contain project data from a single company. We will use cross-company data to validate our model.
  2. No agile method other than Scrum and XP has been investigated. We will use the Kanban agile method in our research.
  3. We will be the first to use lines of code (LOC) as the size metric.

The benefit of this research is that it will reduce the risk of software projects falling behind schedule by providing realistic estimation figures.

José Antonio Parejo and Aurora Ramírez

QoS-Aware web service composition with multi-objective evolutionary algorithms

Service-based applications often invoke web services provided by third parties in their workflows. The Quality of Service (QoS) offered by service providers is usually expressed in terms of a Service Level Agreement that specifies the cost, performance, availability, etc. In this scenario, intelligent systems can help engineers to scrutinize the service market in order to select the service configurations that best fit their needs.
This search problem, also known as QoS-aware web service composition, needs to take into account multiple quality attributes that may be in conflict. For instance, a faster response time usually entails a higher cost. Therefore, several quality properties must be optimized simultaneously using multi-objective or many-objective approaches, which require computationally efficient algorithms.
This session presents the QoS-aware web service composition problem and its variants, as well as a comparative experimental study of multi-objective and many-objective algorithms. Specifically, we explore the suitability of various evolutionary algorithms to address the problem on the basis of a set of real web services with 9 quality properties. It is observed that some algorithms can achieve a better balance between the quality properties, or even promote specific properties while maintaining high quality values for the rest. Furthermore, this search process can be performed at a reasonable computational cost, allowing its adoption by intelligent systems and enabling decision support in the field of service-oriented computing.
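The notion of conflicting quality attributes can be made concrete with Pareto dominance, the comparison on which multi-objective algorithms commonly rely. The following is a minimal sketch, assuming two objectives that are both minimised; the candidate compositions and their (response time, cost) values are invented for illustration and are not taken from the study.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one.

    Both tuples hold objective values to be minimised.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better

def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]

# Hypothetical compositions scored as (response time in ms, cost).
compositions = [(120, 9.0), (200, 4.0), (250, 5.0), (90, 15.0)]
front = pareto_front(compositions)
print(front)
```

Here the composition (250, 5.0) is discarded because (200, 4.0) is both faster and cheaper, while the remaining three represent different trade-offs that a decision maker would have to choose between.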

Daniel Rodríguez

Software Defect Prediction in Software Engineering

In this talk, we present how data mining techniques are used to predict or rank error-prone modules in software engineering. Classifying or ranking software components according to their probability of being defective helps with the testing and maintenance phases of a project, for example to allocate resources, prioritise the modules to be tested or perform regression testing activities. We will review the publicly available datasets, machine learning techniques and some of the machine learning problems that we face during the process. On the one hand, from the software engineering point of view, we need to deal with data quality and with which software engineering metrics can be used. On the other hand, from the data mining point of view, we may need to deal with feature selection and with issues such as noise and missing values, imbalanced data, and the evaluation measures of the machine learning algorithms and their comparison.


Participants attending the summer school will have the opportunity to give a host talk on their ongoing research project. They will receive feedback from senior researchers in SBSE during the summer school. Information about talks will be available after the registration deadline.

Talks I

Wednesday, June 28 (12:00 – 13:30)

  • Agile Effort Estimation (Natasha Nigar)
  • Discovery of design patterns based on good practices (Rafael Barbudo)
  • Maintenance of the logical consistency in Cassandra (Pablo Suárez-Otero)
  • Hybrid Algorithms for the Search of Test Data in SPL (Javier Ferrer)

Talks II

Friday, June 30 (12:00 – 13:00)

  • Automated Software Development Support through Optimized Code History Models (Francisco Servant)
  • Deep Parameter Optimisation on Android Smartphones for Energy Minimisation (Markus Wagner)

Justyna Petke

Genetic Improvement – new direction in SBSE

Software engineering problems can often be reformulated as search problems. One seeks to find an optimal or near optimal solution in a wide range of candidate solutions, guided by a fitness function that distinguishes between good and bad solutions.
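To illustrate what a search guided by a fitness function looks like, here is a minimal sketch of a first-improvement hill climber over bit strings. The target string and the `matches` fitness function are hypothetical stand-ins for a real software engineering objective, and hill climbing is only one of many search techniques that could be used.

```python
def hill_climb(start, fitness):
    """First-improvement local search over bit strings (higher fitness is better)."""
    current, score = list(start), fitness(start)
    improved = True
    while improved:
        improved = False
        for i in range(len(current)):
            neighbour = current.copy()
            neighbour[i] = 1 - neighbour[i]  # flip one bit
            neighbour_score = fitness(neighbour)
            if neighbour_score > score:
                current, score = neighbour, neighbour_score
                improved = True
    return current, score

# Hypothetical objective: agree with a hidden target in as many positions as possible.
TARGET = [1, 0, 1, 1, 0, 1]

def matches(candidate):
    """Fitness: number of positions where the candidate equals the target."""
    return sum(c == t for c, t in zip(candidate, TARGET))

best, best_score = hill_climb([0] * len(TARGET), matches)
print(best, best_score)
```

The search never inspects the target directly; it only compares fitness values of neighbouring candidates, which is what allows the same loop to be reused with a different fitness function.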

The talk will cover a new exciting direction in search-based software engineering, namely genetic improvement (GI). GI uses automated search in order to improve existing software. It has resulted in dramatic performance improvements for a diverse set of properties such as execution time, energy and memory consumption, as well as results for fixing and extending existing system functionality. Work on genetic improvement has led to several awards, ranging from best paper awards to several ‘Humie’ awards given for human-competitive results produced by genetic and evolutionary computation.

I will give an overview of genetic improvement and present key components of a GI framework. This keynote is based on work conducted at the CREST centre at UCL.

Brief Bio:

Justyna Petke is a Senior Research Associate at the Centre for Research on Evolution, Search and Testing (CREST) at University College London (UCL). She is interested in the connections between constraint satisfaction and search-based software engineering. Her current research focuses on genetic improvement and combinatorial interaction testing. She has won several awards for her work on GI: Silver and Gold ‘Humie’ awards at GECCO 2014 and 2016 and an ACM SIGSOFT Distinguished Paper Award at ISSTA 2015.