Computer Science

Cross-Language Code Clone Detection Using Abstract Syntax Tree and Graph Neural Network

Zeina Swilam, The British University in Egypt
Abeer Hamdy, The British University in Egypt
Andreas Pester

Document Type

Conference Proceeding

Publication Date

Winter 1-23-2024

Abstract

Code clones refer to code fragments that have similar functionality but may differ in syntax. When code duplication occurs, it can pose challenges during system maintenance and necessitate fixing errors in multiple locations. Existing methods for detecting code clones typically focus on clones within the same programming language. However, as the use of multiple programming languages becomes more prevalent, clones across different languages are becoming increasingly common. Recent research studies have explored the detection of cross-language code clones using Recurrent Neural Networks (RNN), specifically variants like LSTM and GRU. This paper presents an approach that combines the strengths of Abstract Syntax Trees (AST) and Graph Neural Networks (GNN) to identify cross-language code clones. The AST represents the code as a graph structure, while GNNs are capable of learning the state embeddings of each graph node, capturing information about its surroundings and the overall graph structure. Utilizing GNNs in the context of cross-language clone detection helps capture additional semantic information about the code fragments. Notably, GNNs have not been previously applied to the detection of cross-language code clones. Experimental results demonstrate that the proposed approach outperforms LSTM, GRU, and other state-of-the-art methods in terms of F1 score, precision, and recall.
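The two ideas the abstract combines, treating an AST as a graph and letting each node absorb information from its neighbours, can be sketched in a few lines. What follows is a minimal illustration and not the authors' model. It assumes Python's standard `ast` module as the parser, one-hot node-type features, and untrained mean aggregation in place of a learned GNN layer. The function names (`ast_to_graph`, `message_pass`, `graph_embedding`) are hypothetical. A real cross-language setup would additionally need one parser per language and a shared node vocabulary, which this sketch does not attempt.

```python
import ast
import math


def ast_to_graph(source):
    """Parse Python source into (node type labels, parent-child edge list)."""
    labels, edges = [], []

    def visit(node, parent):
        idx = len(labels)
        labels.append(type(node).__name__)
        if parent is not None:
            edges.append((parent, idx))
        for child in ast.iter_child_nodes(node):
            visit(child, idx)

    visit(ast.parse(source), None)
    return labels, edges


def build_vocab(*sources):
    """Collect the sorted set of node types seen across all snippets."""
    return sorted({label for s in sources for label in ast_to_graph(s)[0]})


def message_pass(features, edges):
    """One round of mean aggregation over each node and its neighbours."""
    neighbours = [[i] for i in range(len(features))]  # include the node itself
    for a, b in edges:
        neighbours[a].append(b)
        neighbours[b].append(a)
    dim = len(features[0])
    return [
        [sum(features[j][d] for j in nbrs) / len(nbrs) for d in range(dim)]
        for nbrs in neighbours
    ]


def graph_embedding(source, vocab, rounds=2):
    """Average the node states after a few rounds of aggregation."""
    labels, edges = ast_to_graph(source)
    # Assumption: one-hot node-type features; a trained model would learn these.
    h = [[1.0 if label == v else 0.0 for v in vocab] for label in labels]
    for _ in range(rounds):
        h = message_pass(h, edges)
    return [sum(row[d] for row in h) / len(h) for d in range(len(vocab))]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Two functionally similar snippets with different syntax.
loop_sum = "def f(xs):\n    t = 0\n    for x in xs:\n        t += x\n    return t\n"
builtin_sum = "def g(values):\n    return sum(values)\n"

vocab = build_vocab(loop_sum, builtin_sum)
similarity = cosine(
    graph_embedding(loop_sum, vocab), graph_embedding(builtin_sum, vocab)
)
print(f"structural similarity: {similarity:.3f}")
```

Because nothing here is trained, the printed score reflects only the overlap of node types in local neighbourhoods. A learned GNN would replace the fixed averaging with trainable weights so that the embeddings of true clones are pulled together.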

Recommended Citation

Z. Swilam, A. Hamdy and A. Pester, "Cross-Language Code Clone Detection Using Abstract Syntax Tree and Graph Neural Network," 2023 International Conference on Computer and Applications (ICCA), Cairo, Egypt, 2023, pp. 1-5, doi: 10.1109/ICCA59364.2023.10401783.

Keywords: Computer languages; Codes; Recurrent neural networks; Semantics; Cloning; Syntactics; Graph neural networks; Cross-Language code clones; Clone detection; Deep Learning; Abstract syntax tree
