Formation Networks of Terms for Identifying Semantic Similarity or Difference Degree of Texts in Cybersecurity


  • Oleh Dmytrenko National Technical University of Ukraine «Igor Sikorsky Kiev Polytechnic Institute», Ukraine



This paper is devoted to the problem of identifying the degree of semantic similarity or difference between texts in the cybersecurity field. It presents a method for comparing text documents based on the formation and comparison of their corresponding semantic networks. A semantic network is taken to be a directed weighted network of terms, in which the nodes are key terms of the text and the edges are semantic relationships between these terms. An algorithm for forming semantic networks, regarded as one type of ontology, is also presented. Forming a network of terms includes pre-processing the text data, extracting key terms, constructing an undirected network of terms (using the horizontal visibility graph algorithm), determining the undirected connections between terms, and then determining the directions of the connections and their weights. The semantic networks are compared using the Frobenius norm of the difference of the matrices that correspond to them. Identifying critically different texts, which can share similar keywords but differ in the semantic relationships between them, is important for ensuring cybersecurity. The proposed approach can also help with the problem of accumulating text data that are semantically similar in content. More generally, it can be used in automatic information retrieval systems to determine the degree of similarity or difference in the structure and semantics of texts and to identify sources of information that have a destructive impact on the information space.
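The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not specify how term weights, edge directions, or edge weights are computed, so the sketch assumes that each token carries a numeric weight supplied by the caller, that an edge is directed from the earlier term to the later one in the text, and that its weight counts how often that link occurs. The function names `horizontal_visibility_edges`, `term_network`, and `frobenius_distance` are hypothetical.

```python
import math

def horizontal_visibility_edges(values):
    """Undirected horizontal visibility graph edges (i, j), i < j.

    Positions i and j are linked when every value strictly between
    them is lower than min(values[i], values[j]).
    """
    edges = []
    n = len(values)
    for i in range(n - 1):
        highest_between = float("-inf")
        for j in range(i + 1, n):
            if highest_between < min(values[i], values[j]):
                edges.append((i, j))
            highest_between = max(highest_between, values[j])
            if highest_between >= values[i]:
                break  # nothing further right is visible from i
    return edges

def term_network(tokens, weights, vocabulary):
    """Adjacency matrix of a directed weighted network of terms.

    Assumptions (not taken from the paper): direction follows text
    order, and the edge weight is the number of visibility links.
    """
    index = {term: k for k, term in enumerate(vocabulary)}
    size = len(vocabulary)
    matrix = [[0.0] * size for _ in range(size)]
    series = [weights[t] for t in tokens]
    for i, j in horizontal_visibility_edges(series):
        a, b = index[tokens[i]], index[tokens[j]]
        if a != b:  # merge repeated terms, skip self-loops
            matrix[a][b] += 1.0
    return matrix

def frobenius_distance(m1, m2):
    """Frobenius norm of the difference of two equally sized matrices."""
    return math.sqrt(sum((x - y) ** 2
                         for r1, r2 in zip(m1, m2)
                         for x, y in zip(r1, r2)))

# Two toy texts with the same keywords in a different order.
vocabulary = ["attacker", "exploits", "server"]
weights = {"attacker": 3.0, "exploits": 1.0, "server": 2.0}
net_a = term_network(["attacker", "exploits", "server"], weights, vocabulary)
net_b = term_network(["server", "exploits", "attacker"], weights, vocabulary)
distance = frobenius_distance(net_a, net_b)
```

Both toy texts share the same vocabulary, so a keyword-only comparison would treat them as identical, while the distance between their networks is nonzero because the links point in opposite directions. Both matrices must be built over the same vocabulary for the difference to be meaningful.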






Mathematical methods, models and technologies for secure cyberspace functioning research