Empirical Studies on Automated Software Testing Practices   

Monday, October 31, 2022 - 10:30 am


Author : Alireza Salahirad

Advisor : Dr. Gregory Gay

Date : Oct 31, 2022

Time :  10:30 am

Place : Virtual (Teams/Zoom link below)

Meeting Link



Software testing is notoriously difficult and expensive, and improper testing carries economic, legal, and even environmental or medical risks. Research in software testing is critical to enabling the development of the robust software that our society relies upon. This dissertation aims to lower the cost of software testing through automation, without decreasing the quality of testing. It consists of three empirical studies, which focus on (1) mapping the connections between research topics, and the evolution of those topics, in the field of software testing; (2) assessing the metrics used to guide automated test generation and the factors that suggest when automated test generation can detect real faults; and (3) examining the semantic coupling between synthetic and real faults, in service of improving our ability to cost-effectively generate synthetic faults for use in assessing test case quality.

Project 1 (Mapping): Our main goal for this project is to better understand the emergence of individual research topics and the connections between these topics within the broad field of software testing, enabling the identification of new topics and connections in future research. To achieve this goal, we applied co-word analysis to characterize the topology of software testing research over three decades of studies, based on the author-provided keywords of studies indexed in the Scopus database.
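
As a rough illustration of the counting step behind co-word analysis, the sketch below tallies how often two keywords appear on the same paper. It is a minimal example under stated assumptions, not the pipeline used in the dissertation: the function name `cooccurrence_counts` and the sample keyword lists are hypothetical, and real Scopus records would need additional cleaning.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(keyword_lists):
    """Count how often each unordered keyword pair appears on the same paper."""
    counts = Counter()
    for keywords in keyword_lists:
        # Normalize case and whitespace, and de-duplicate, so that a pair
        # is counted at most once per paper.
        unique = sorted({k.strip().lower() for k in keywords if k.strip()})
        for pair in combinations(unique, 2):
            counts[pair] += 1
    return counts


# Hypothetical author keywords, invented for illustration only.
papers = [
    ["Mutation Testing", "Test Generation"],
    ["test generation", "Search-Based Testing"],
    ["mutation testing", "test generation", "search-based testing"],
]

pair_counts = cooccurrence_counts(papers)
for pair, n in pair_counts.most_common():
    print(pair, n)
```

Pairs with high counts can be read as strongly connected topics. Turning these counts into a topic network or clusters is a further step that this sketch does not attempt.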

Project 2 (Automated Input Generation): We have assessed the fault-detection capabilities of unit test suites generated by automated tools with the goal of satisfying eight fitness functions representing common testing goals. Our purpose was not only to identify the particular fitness functions that detect the most faults, but also to explore the factors that influence fault detection. To do this, we gathered observations on the generated test suites and metrics describing the source code of the faulty classes, and applied a rule-learning algorithm to identify the factors with the strongest influence on fault detection.

Project 3 (Mutant-Fault Coupling): Synthetic faults (mutants), which can be inserted into code through transformative mutation operators, offer an automated means to assess the effectiveness of test suites and create new test cases. However, mutants can be expensive to utilize and may not realistically model real faults. To enable the cost-effective generation of mutants, we investigate the semantic relationship between mutation operators and real faults.
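
To make the idea of a mutant concrete, the sketch below applies one common kind of mutation operator by hand, replacing a relational operator, and checks whether a set of test inputs can tell the mutant apart from the original. The functions `is_adult`, `is_adult_mutant`, and `kills_mutant` are hypothetical names chosen for illustration; this is not one of the operators or subjects studied in the dissertation.

```python
def is_adult(age):
    """Original code under test."""
    return age >= 18


def is_adult_mutant(age):
    """Mutant: the relational operator >= has been replaced with >."""
    return age > 18


def kills_mutant(test_inputs):
    """A test suite kills the mutant if any input makes the two versions disagree."""
    return any(is_adult(x) != is_adult_mutant(x) for x in test_inputs)


weak_suite = [10, 30]          # never exercises the boundary value
strong_suite = [10, 18, 30]    # includes the boundary value 18

print(kills_mutant(weak_suite))    # False: the mutant survives
print(kills_mutant(strong_suite))  # True: the mutant is killed
```

A suite that leaves the mutant alive is missing a boundary check, which is how mutants are used to assess test quality.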