Using NLP-Based Traceability Recovery to Reduce Architectural Complexity from Requirement Volatility in Software Systems

Edward Obeng Adu; Mary Immaculate Sheela L; Sasikala P

doi:10.34293/sijash.v13iS2-Jan.10469

Edward Obeng Adu, Master's Student, Department of Computing and Engineering, Heritage Christian University, Accra, Greater Accra, Ghana
Mary Immaculate Sheela L, Faculty, Department of Computing and Engineering, Heritage Christian University, Accra, Greater Accra, Ghana
Sasikala P, Faculty, Department of Computer Science, Lal Bahadur Shastri Government First Grade College, Bengaluru, Karnataka, India

Keywords: Natural Language Processing (NLP), Large Language Models (LLM), Traceability Link Recovery (TLR), Artifacts, Software Architecture Document (SAD), Architecture Decision Record (ADR)

Abstract

This study addresses the problem of requirements–architecture traceability deterioration in software-intensive systems. Frequent changes in requirements weaken architectural documentation and design rationale, reducing visibility into change impact and contributing to architectural drift and increased system complexity. Leveraging advances in NLP-based traceability link recovery, this research adopts a design science methodology to design and evaluate an automated traceability recovery system. The proposed system reconstructs missing links between requirements and architectural artifacts, including Software Architecture Document sections, Architecture Decision Records, and component descriptions. It integrates transformer-based semantic models such as BERT and RoBERTa, similarity search using FAISS, and ontology-supported reasoning through a Neo4j knowledge graph. These components enable cross-artifact traceability and support explainable validation with human involvement. Artifacts obtained from open-source repositories and expert-validated requirements are used to establish a reference dataset. Trace link accuracy is evaluated using precision, recall, and F1-score, along with task-based evaluation of change impact analysis under volatile requirement conditions. The results are expected to demonstrate that automated traceability recovery improves change impact understanding and helps control architectural complexity during system evolution.
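
To make the recovery and evaluation steps concrete, the following is a minimal sketch, not the system described in this study. It assumes a bag-of-words vector as a stand-in for BERT or RoBERTa embeddings and a brute-force comparison in place of FAISS search; the artifact identifiers, texts, reference links, and the 0.5 threshold are all hypothetical. It proposes trace links by cosine similarity and scores them against a reference set using precision, recall, and F1-score.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a transformer embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recover_links(requirements, artifacts, threshold):
    """Propose (requirement_id, artifact_id) pairs whose similarity meets the threshold."""
    links = set()
    for r_id, r_text in requirements.items():
        for a_id, a_text in artifacts.items():
            if cosine(embed(r_text), embed(a_text)) >= threshold:
                links.add((r_id, a_id))
    return links


def precision_recall_f1(predicted, reference):
    """Score predicted trace links against a reference set of links."""
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical requirements and architectural artifacts (illustration only).
requirements = {
    "R1": "users authenticate with a password and token",
    "R2": "system exports audit reports as pdf",
}
artifacts = {
    "ADR-1": "authenticate users with a signed token",
    "SAD-3": "reporting component exports pdf audit reports",
    "CMP-7": "cache component stores session data",
}
# Hypothetical reference links; (R1, CMP-7) shares no words with R1,
# so a purely lexical model cannot recover it.
reference = {("R1", "ADR-1"), ("R2", "SAD-3"), ("R1", "CMP-7")}

predicted = recover_links(requirements, artifacts, threshold=0.5)
precision, recall, f1 = precision_recall_f1(predicted, reference)
print(sorted(predicted))
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this toy data the lexical stand-in recovers two of the three reference links and misses the one with no shared vocabulary, which is the kind of gap that semantic embeddings and ontology-supported reasoning are intended to narrow.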