THE MODEL AND METHOD OF TEXT THEMATIC STRUCTURIZATION ON THE BASIS OF STOCHASTIC AUTOMATICS

Abstract

Purpose. To propose a model for analyzing text structure that uses stochastic matrices and automata to trace the text. This makes it possible to study how the significance of individual aspects and story lines changes along the length of the text and to reveal the program by which verbal images are presented. Results. The stages of the text content analysis method are given. The method differs in that thematic aspects are distinguished in the content, the change in their significance is traced, stochastic matrices of associative connections between entities are formed, and an annotation is formulated for each significant aspect of the content. This allows the analyzed text to be structured thematically. The method and model were verified experimentally: graphs of the change in the significance of aspects were obtained and annotations were formed for each significant aspect. To test the workability of the proposed method and model, two texts were selected: a technical text of 2400 words (excluding the title, conclusions, and list of references) and a literary text of 1500 words. The keywords were selected using the TextAnalyst program. Graphs of the change in the significance of aspects were constructed for the plot lines. Significant aspects were highlighted and annotations were formed. The compression ratio was 10%. Practical value. The experimental results confirm that the method is workable. The advantages of the proposed approach are the relative simplicity of implementation and 100% completeness, and even some redundancy, in how the collection of annotations represents relevant information. Redundancy is easily eliminated by threshold processing of the material found for each aspect. The original structure of the document is of practically no importance: the input can be arbitrary texts or sets of field values from a heterogeneous database.
In future studies, it is planned to analyze trends in the significance of aspects, to study the correlation of aspects and their consistency, and to construct the phase structure of the text. With the use of knowledge bases, it becomes possible to identify contradictions and collisions and to build meaningful interpretations. References 12, figure 1, table 2.
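
The abstract does not give the formulas behind the method, so the following is only an illustrative sketch of two of the ideas it names: tracing how the significance of an aspect changes along the text, and forming a stochastic matrix of associative connections between keywords. Every detail here is an assumption rather than the authors' definition: the function names, the naive tokenizer, the fixed-size windows, measuring significance as the share of window tokens that belong to an aspect, and treating two keywords as associated when one follows the other in the keyword sequence.

```python
import re
from collections import Counter


def tokenize(text):
    """Naive tokenizer (assumption): lowercase runs of ASCII letters."""
    return re.findall(r"[a-z]+", text.lower())


def significance_trace(tokens, aspects, window):
    """Share of tokens per aspect in each consecutive, non-overlapping window.

    `aspects` maps a hypothetical aspect name to a set of its keywords.
    Returns one dict per window, in text order.
    """
    trace = []
    for start in range(0, len(tokens), window):
        chunk = tokens[start:start + window]
        counts = Counter(chunk)
        trace.append({
            name: sum(counts[word] for word in words) / len(chunk)
            for name, words in aspects.items()
        })
    return trace


def stochastic_matrix(tokens, keywords):
    """Row-normalized matrix of transitions between successive keywords.

    Entry [i][j] estimates how often keyword j is the next keyword after
    keyword i. A keyword that is never followed by another keyword keeps
    an all-zero row, so such a row is not stochastic.
    """
    index = {word: i for i, word in enumerate(keywords)}
    size = len(keywords)
    counts = [[0] * size for _ in range(size)]
    sequence = [t for t in tokens if t in index]
    for current, following in zip(sequence, sequence[1:]):
        counts[index[current]][index[following]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


if __name__ == "__main__":
    sample = ("The pump feeds the valve and the valve feeds the pump, "
              "then the pump stops.")
    tokens = tokenize(sample)
    print(significance_trace(tokens, {"hydraulics": {"pump", "valve"}}, 5))
    print(stochastic_matrix(tokens, ["pump", "valve"]))
```

Plotting the values from `significance_trace` against the window index gives a curve in the spirit of the significance graphs mentioned above. A threshold on those values would be one possible way to pick the text fragments that feed an aspect's annotation, though the abstract does not say how the authors do this.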

Authors and Affiliations

I. Shevchenko, A. Lebedinets, D. Vasilyev

  • EP ID EP659434
  • DOI 10.30929/1995-0519.2018.1.29-37

How To Cite

I. Shevchenko, A. Lebedinets, D. Vasilyev (2018). THE MODEL AND METHOD OF TEXT THEMATIC STRUCTURIZATION ON THE BASIS OF STOCHASTIC AUTOMATICS. Вісник Кременчуцького національного університету імені Михайла Остроградського, 1(108), 29-37. https://www.europub.co.uk/articles/-A-659434