Retrospective Semi-automated Software Feature Extraction from Natural Language User Manuals

Quirchmayr, Thomas

PDF, English

Citation of documents: Please do not cite the URL displayed in your browser's address bar. Instead, use the DOI, URN, or persistent URL below, whose long-term accessibility we can guarantee.

DOI: 10.11588/heidok.00025322
URN: urn:nbn:de:bsz:16-heidok-253221
URL: http://www.ub.uni-heidelberg.de/archiv/25322

Abstract

Mature software systems comprise a vast number of heterogeneous system capabilities, which are usually requested by different groups of stakeholders and evolve over time. Software features describe and logically bundle low-level capabilities on an abstract level and thus provide a structured and comprehensive overview of the entire capabilities of a software system. Software features are often not explicitly managed. On the contrary, software feature-relevant information is often spread across several software engineering artifacts (e.g., user manuals, issue tracking systems). It requires huge manual effort to (1) identify and extract software feature-relevant information from these artifacts in order to make software feature knowledge explicit and (2) determine which software features the disclosed software feature-relevant information belongs to. This thesis presents a three-step approach to semi-automatically enhance software features with software feature-relevant information from a user manual. First, a domain terminology is semi-automatically extracted from a natural language user manual based on linguistic patterns. Second, the extracted domain terminology, structural sentence information, and natural language processing techniques are used to automatically identify and extract atomic software feature-relevant information with an F1-score of at least 92.00%. Finally, the determined atomic software feature-relevant information is semi-automatically assigned to existing and logically related software features. The approach is empirically evaluated by means of a user manual and corresponding gold standards of an industrial partner. This thesis provides tool support to identify and extract atomic software feature-relevant information from user manuals and, furthermore, to recommend logically related software features.
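
For readers unfamiliar with the F1-score reported above, the following minimal sketch shows how such a score is typically computed when a tool's output is compared against a gold standard. It is not taken from the thesis: the function name `precision_recall_f1` and the sentence identifiers are hypothetical and serve only as an illustration of the standard definition (the harmonic mean of precision and recall).

```python
def precision_recall_f1(extracted, gold):
    """Return (precision, recall, F1) for extracted items versus a gold standard."""
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    # Precision: share of extracted items that are correct.
    precision = true_positives / len(extracted) if extracted else 0.0
    # Recall: share of gold-standard items that were found.
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical sentence IDs, invented purely for illustration.
gold_standard = {"s1", "s2", "s3", "s4"}
tool_output = {"s1", "s2", "s3", "s5"}

p, r, f1 = precision_recall_f1(tool_output, gold_standard)
print(f"precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```

In this toy case three of the four extracted items are correct and three of the four gold-standard items are found, so precision, recall, and F1 all equal 0.75.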

Document type:	Dissertation
Supervisor:	Paech, Prof. Dr. Barbara
Date of thesis defense:	26 July 2018
Date Deposited:	05 Oct 2018 16:30
Date:	2018
Faculties / Institutes:	The Faculty of Mathematics and Computer Science > Department of Computer Science
DDC-classification:	004 Data processing Computer science
Controlled Keywords:	Requirements Engineering, Software Feature, Natural Language Processing