Patent documents contain graphic images such as device drawings, graphs, and chemical and mathematical formulas; the formulas often need to be recognized and converted to a unified standard. This work analyzes graphic images extracted from patent descriptions published by FIPS of Rospatent. It provides thematic filtering of the mathematical and chemical formulas contained in patent documents, together with their recognition. The theoretical value of the work lies in the developed algorithms for parsing patents in the Yandex.Patents system, recognizing chemical and mathematical formulas among graphic patent images, translating graphic images of chemical formulas into SMILES format, and converting graphic images of mathematical formulas into LaTeX format. The practical significance of the work lies in the developed software module for analyzing graphic images from patent documents. The developed system is intended for patent research and for reducing graphic images to a unified standard in order to solve patent search problems.
Keywords: patent, image, mathematical formula, chemical formula, LaTeX, SMILES