Associate professor in the Computer Science department at the University of Verona, Italy
Abstract: Web APIs are a cornerstone of modern web architectures: they are essential for integrating systems and building microservices. Web APIs are increasingly adopted to enable different services to communicate and share data seamlessly over the web. While in other programming domains (e.g., smartphone apps or websites) a GUI is typically available to suggest which interactions can be taken next (e.g., through available widgets or links), web APIs lack a graphical user interface, and all operations are equally available to a fuzzer even when they are not logically meaningful at a given moment. Automatically fuzzing web APIs therefore requires addressing peculiar challenges: not only picking the most appropriate input data, but also fuzzing operations in an appropriate order, even though no GUI is available to suggest a logical sequence of interactions. In this keynote I will cover the main research challenges involved in automatically fuzzing web APIs. Moreover, I will touch on some recent research achievements, including the use of deep reinforcement learning to train a fuzzing agent for functional testing, security testing based on test patterns, and the reusable research tools available for the research community to build on.
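To make the ordering challenge concrete, the sketch below (a hypothetical /users endpoint on an assumed local test deployment, not a system or tool discussed in the keynote) shows a stateful call sequence: GET /users/{id} is only meaningful after POST /users has created the resource, yet nothing in the API surface signals this dependency to a fuzzer.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Minimal sketch: a stateful call sequence against a hypothetical /users API.
// A fuzzer that picks operations at random would mostly hit "not found" responses;
// an order-aware fuzzer first creates the resource, then exercises the dependent call.
public class OrderAwareSequence {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        String base = "http://localhost:8080"; // assumed local test deployment

        // Step 1: create a resource (POST /users) with fuzzed input data.
        HttpRequest create = HttpRequest.newBuilder(URI.create(base + "/users"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString("{\"name\":\"<FUZZED>\"}"))
                .build();
        HttpResponse<String> created = client.send(create, HttpResponse.BodyHandlers.ofString());

        // Step 2: only now does GET /users/{id} make sense; the id comes from step 1's response.
        String id = created.body().replaceAll("\\D+", ""); // crude id extraction, for the sketch only
        HttpRequest read = HttpRequest.newBuilder(URI.create(base + "/users/" + id))
                .GET()
                .build();
        HttpResponse<String> fetched = client.send(read, HttpResponse.BodyHandlers.ofString());
        System.out.println(fetched.statusCode() + " " + fetched.body());
    }
}
```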
Faculty member at the Max Planck Institute for Security and Privacy (MPI-SP), Germany
Abstract: How do we know how well our tool solves a problem, like bug finding, compared to other state-of-the-art tools? We run a benchmark. We choose a few representative instances of the problem, define a reasonable measure of success, and identify and mitigate various threats to validity. Finally, we implement (or reuse) a benchmarking framework, and compare the results for our tool with those for the state-of-the-art. For many important software engineering problems, we have seen new sparks of interest and serious progress made whenever a (substantially better) benchmark became available. Benchmarks are our measure of progress. Without them, we have no empirical support for our claims of effectiveness. Yet, time and again, we see practitioners disregard entire technologies as "paper-ware"---far from solving the problem they set out to solve. In this keynote, I will discuss our recent efforts to systematically study the degree to which our evaluation methodologies allow us to measure those capabilities that we aim to measure. We shed new light on a long-standing dispute about code coverage as a measure of testing effectiveness, explore the impact of the specific benchmark configuration on the evaluation outcome, and call into question the actual versus measured progress of an entire field (ML4VD) just as it gains substantial momentum and interest.
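The benchmarking recipe described above can be illustrated with a minimal, hypothetical harness (the tools, subjects, and numbers below are placeholders, not the studies or data discussed in the keynote): fix representative subjects, give every tool the same budget, and compare a single measure of success.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Minimal sketch of the benchmarking recipe: fixed subjects, one success measure
// (bugs found within a time budget), and the same budget for every tool.
public class BenchmarkSketch {
    interface Tool { int bugsFound(String subject, long budgetSeconds); }

    public static void main(String[] args) {
        List<String> subjects = List.of("libA", "libB", "libC"); // representative instances
        long budget = 3600;                                      // same budget for every tool
        Map<String, Tool> tools = Map.of(
                "ourTool",  (s, b) -> 0 /* invoke our tool here */,
                "baseline", (s, b) -> 0 /* invoke the state of the art here */);

        // Run every tool on every subject and aggregate the chosen measure of success.
        Map<String, Integer> totals = new HashMap<>();
        for (var entry : tools.entrySet())
            for (String subject : subjects)
                totals.merge(entry.getKey(), entry.getValue().bugsFound(subject, budget), Integer::sum);

        totals.forEach((tool, bugs) -> System.out.println(tool + ": " + bugs + " bugs in total"));
    }
}
```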
Associate professor at the IMDEA Software Institute in Madrid, Spain
Abstract: Many techniques in formal verification and software testing rely on repOk routines to verify the consistency and validity of software components with complex data representations. A repOk function encodes the state properties necessary for an instance to be a valid object of the class under analysis, enabling early error detection and simplifying debugging. However, writing a correct and complete repOk can be challenging, even for advanced Large Language Models (LLMs). In this talk I will introduce Express, the first search-based algorithm designed to automatically generate a correct repOk for a given class. Express leverages simulated annealing, using the source code and test suite of the class under analysis to iteratively construct a repOk.
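As a concrete illustration of what a repOk encodes, here is a minimal, hypothetical example for a singly linked list (not an example taken from the talk or from Express): the routine accepts an instance only if the reachable structure is acyclic and the cached size field is consistent.

```java
import java.util.HashSet;
import java.util.Set;

// Minimal sketch of a repOk for a hypothetical singly linked list:
// the method returns true only if the instance is a structurally valid object
// (no cycles, and the cached size matches the number of reachable nodes).
public class LinkedList {
    static class Node { int value; Node next; }

    private Node head;
    private int size;

    public boolean repOk() {
        Set<Node> visited = new HashSet<>();
        Node current = head;
        while (current != null) {
            if (!visited.add(current)) return false; // cycle detected
            current = current.next;
        }
        return visited.size() == size;               // cached size must be consistent
    }
}
```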