Using NLP-Based Traceability Recovery to Reduce Architectural Complexity from Requirement Volatility in Software Systems
Abstract
This study addresses the problem of requirements–architecture traceability deterioration in software-intensive systems. Frequent changes in requirements weaken architectural documentation and design rationale, reducing visibility into change impact and contributing to architectural drift and increased system complexity. Building on advances in traceability link recovery based on natural language processing (NLP), this research adopts a design science methodology to design and evaluate an automated traceability recovery system. The proposed system reconstructs missing links between requirements and architectural artifacts, including Software Architecture Document sections, Architecture Decision Records, and component descriptions. It integrates transformer-based semantic models such as BERT and RoBERTa, similarity search using FAISS, and ontology-supported reasoning through a Neo4j knowledge graph. Together, these components enable cross-artifact traceability and support explainable, human-in-the-loop validation of recovered links. Artifacts obtained from open-source repositories and expert-validated requirements are used to establish a reference dataset. Trace link accuracy is evaluated using precision, recall, and F1-score, complemented by a task-based evaluation of change impact analysis under volatile requirement conditions. The results are expected to demonstrate that automated traceability recovery improves change impact understanding and helps control architectural complexity during system evolution.
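As a minimal illustration of the link-level evaluation named in the abstract, the sketch below scores a set of recovered trace links against a reference set using precision, recall, and F1-score. The function name `trace_link_scores`, the representation of a link as a (requirement ID, artifact ID) pair, and all identifiers in the example data are assumptions made for illustration; they are not taken from the study's implementation or dataset.

```python
from typing import Dict, Set, Tuple

# Assumption: a trace link is a (requirement_id, architecture_artifact_id) pair.
TraceLink = Tuple[str, str]


def trace_link_scores(predicted: Set[TraceLink],
                      reference: Set[TraceLink]) -> Dict[str, float]:
    """Score recovered links against a reference (ground-truth) link set."""
    # A true positive is a recovered link that also appears in the reference set.
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    # F1 is the harmonic mean of precision and recall; define it as 0 when both are 0.
    if precision + recall > 0:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical example data, invented purely for illustration.
reference_links = {
    ("REQ-1", "ADR-3"),
    ("REQ-1", "SAD-2.1"),
    ("REQ-2", "COMP-auth"),
    ("REQ-3", "ADR-7"),
}
recovered_links = {
    ("REQ-1", "ADR-3"),
    ("REQ-2", "COMP-auth"),
    ("REQ-2", "ADR-7"),  # a false positive: not in the reference set
}

scores = trace_link_scores(recovered_links, reference_links)
```

In this toy case two of the three recovered links are correct and two of the four reference links are found, so precision is 2/3, recall is 1/2, and F1 is 4/7. Treating links as set members keeps the scoring independent of how candidates are ranked or retrieved.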
Copyright (c) 2026 Edward Obeng Adu, Mary Immaculate Sheela L, Sasikala P

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

