Technique for determining the author of software code based on multi-view
DOI: 10.21293/1818-0442-2025-28-3-59-65
Abstract: This paper presents a new method for identifying the author of software code based on a multi-view approach. The aim of the study is to improve the accuracy and robustness of authorship identification by combining different representations of software code: source code, abstract syntax tree, control flow graph, and disassembled code. Modern machine learning methods were used to build models that integrate and analyze complex features from these different sources. The experiments showed that the developed multi-view architecture provides a significant improvement in identification quality compared to traditional approaches that use only one representation of the code. In tasks with a closed set of authors, accuracy and macro-averaged F1 values of up to 0.97 were achieved, and on open sets, high resistance to the emergence of new authors and to variability of programming styles was noted. In the author verification task, complex features made it possible to achieve accuracy of up to 0.98 and to reduce the equal error rate (EER) to 0.04.
Keywords: verification, authorship, graph representation, disassembler, source code, software
For citation:
Kurtukova A. V. Technique for determining the author of software code based on multi-view. Doklady Tomskogo gosudarstvennogo universiteta sistem upravleniya i radioelektroniki, 2025, vol. 28, no. 3, pp. 59–65. DOI: 10.21293/1818-0442-2025-28-3-59-65
Executive Secretary of the Editor’s Office
Editor’s Office: 40 Lenina Prospect, Tomsk, 634050, Russia
Phone / Fax: +7 (3822) 701-582
Viktor N. Maslennikov
Executive Secretary of the Editor’s Office
Phone / Fax: +7 (3822) 51-21-21 / 51-43-02