Finding common parts in software source code versions represented by graph models
Authors: Pogrebnoy A. V.
Annotation: Analyzing the structural similarity of objects on the basis of graph models (GM) is an active research direction in applied graph theory. In this paper, versions of program source code serve as the objects of similarity analysis. Existing version control systems (VCS) work directly with the text of the source code. Enabling a VCS to work with GMs that adequately represent the source code of program versions opens new ways to improve traditional VCS functions and to extend program development tools. It is proposed to base the similarity estimate for program versions on finding the largest common part of their source code. The aim of the study is to develop a method for finding the common part of program versions represented as GMs. The research applies the theory of graph vertex differentiation, in which the model of a graph's structure is represented as a network of state machines. An important advantage of the method is that it adapts easily to the features of program source code. The method comprises three stages: forming and encoding the list of attributes of GM vertices and edges; transforming the GM into a standard form as a directed graph; and finding the common part of the two directed graphs that represent the compared program versions. The algorithm for finding the common part is invariant with respect to the features of the GM description and to the set of attributes used. An example of finding the common part of two compared GM fragments is provided.
Keywords: software graph model, version control system, common part of two graphs, graph similarity substitution, graph nodes differentiation
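The three stages named in the annotation can be illustrated with a rough sketch. This is not the author's algorithm: the state-machine-based vertex differentiation is replaced here by plain neighbourhood label refinement (in the style of Weisfeiler-Lehman colouring). The graph encoding, the function names `encode`, `differentiate` and `common_vertices`, and the sample attributes are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict


def encode(signature):
    """Stage 1 (assumed form): turn any attribute or signature into a short code."""
    return hashlib.sha1(repr(signature).encode("utf-8")).hexdigest()[:12]


def differentiate(vertices, edges, rounds=1):
    """Refine vertex labels using the labels of in- and out-neighbours.

    vertices: dict mapping vertex id -> attribute (hypothetical encoding)
    edges:    list of (source, target, edge_kind) for a directed graph (stage 2)
    """
    labels = {v: encode(attr) for v, attr in vertices.items()}
    for _ in range(rounds):
        outgoing = defaultdict(list)
        incoming = defaultdict(list)
        for src, dst, kind in edges:
            outgoing[src].append((kind, labels[dst]))
            incoming[dst].append((kind, labels[src]))
        # New label = old label plus sorted neighbour signatures.
        labels = {
            v: encode((labels[v], sorted(outgoing[v]), sorted(incoming[v])))
            for v in vertices
        }
    return labels


def common_vertices(graph_a, graph_b, rounds=1):
    """Stage 3 (stand-in): pair vertices of the two graphs whose labels coincide."""
    labels_a = differentiate(*graph_a, rounds=rounds)
    labels_b = differentiate(*graph_b, rounds=rounds)
    by_label = defaultdict(list)
    for v, label in sorted(labels_b.items()):
        by_label[label].append(v)
    pairs = []
    for v, label in sorted(labels_a.items()):
        if by_label[label]:
            pairs.append((v, by_label[label].pop(0)))
    return pairs


# Two hypothetical program versions: "parse" calls "read" in both,
# but calls "log" in version A and "trace" in version B.
version_a = (
    {"a1": "func:parse", "a2": "func:read", "a3": "func:log"},
    [("a1", "a2", "call"), ("a1", "a3", "call")],
)
version_b = (
    {"b1": "func:parse", "b2": "func:read", "b3": "func:trace"},
    [("b1", "b2", "call"), ("b1", "b3", "call")],
)

print(common_vertices(version_a, version_b, rounds=0))
print(common_vertices(version_a, version_b, rounds=1))
```

With `rounds=0` only the vertex attributes are compared, so both `parse` and `read` are paired. With `rounds=1` the changed neighbourhood of `parse` gives it a different label in each version, and only the `read` vertices remain paired. Refinement of this kind is therefore a conservative approximation: it can exclude vertices that a genuine largest-common-part algorithm would keep.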