8 research outputs found
Technologies for complex intelligent analysis of clinical data
The paper presents a system for the intelligent analysis of clinical information. The authors describe the methods implemented in the system for clinical information retrieval, intelligent diagnosis of chronic diseases, assessment of the importance of patient features, and detection of hidden dependencies between features. Results of an experimental evaluation of these methods are also presented.
Background: Healthcare facilities generate a large flow of both structured and unstructured data containing important information about patients. Test results are usually stored as structured data, but some data are stored as natural language texts (medical history, results of physical examinations, and results of other examinations such as ultrasound, ECG, or X-ray studies). Many tasks arising in clinical practice can be automated by applying methods for the intelligent analysis of the accumulated structured and unstructured data, which improves the quality of healthcare.
Aims: To create a complex system for intelligent data analysis in a multidisciplinary pediatric center.
Materials and methods: The authors propose methods for information extraction from clinical texts in Russian. The methods are based on deep linguistic analysis. They extract mentions of diseases, symptoms, body areas, and drugs. The methods can also recognize additional attributes such as "negation" (the disease is absent), "not patient" (the disease refers to a member of the patient's family rather than to the patient), "severity of illness", "disease course", and "body region to which the disease refers". The authors use medical thesauri, a set of hand-crafted templates, and various machine learning techniques to extract this information. The extracted information is used to solve the problem of automatic diagnosis of chronic diseases.
A machine learning method for classifying patients with similar nosologies and a method for determining the most informative patient features are also proposed.
Results: The authors processed anonymized health records from the pediatric center to evaluate the proposed methods. The results show that the information extracted from the texts is applicable to solving practical problems. The records of patients with allergic, glomerular, and rheumatic diseases were used for the experimental assessment of the automatic diagnosis method. The authors also determined the most appropriate machine learning methods for classifying patients in each group of diseases, as well as the most informative disease signs. Using additional information extracted from clinical texts together with structured data improved the quality of chronic disease diagnosis compared with using the available structured data alone. The authors also obtained pattern combinations of disease signs.
Conclusions: The proposed methods have been implemented in the intelligent data processing system of a multidisciplinary pediatric center. The experimental results indicate that the system is a promising tool for improving the quality of pediatric healthcare.
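The attribute recognition mentioned in this abstract ("negation", "not patient") can be illustrated with a small sketch. This is not the authors' method: the paper relies on deep linguistic analysis of Russian text, a medical thesaurus, and machine learning, whereas the example below is a simplified regular-expression stand-in on English sentences. The cue lists, the disease list, and the function name `annotate` are all hypothetical.

```python
import re

# Hypothetical cue and term lists; the actual templates and thesaurus
# used in the paper are not given in the abstract.
NEGATION_CUES = ("no", "denies", "without", "absence of")
FAMILY_CUES = ("mother", "father", "brother", "sister")
DISEASES = ("asthma", "atopic dermatitis", "glomerulonephritis", "arthritis")


def has_cue(cues, context):
    """True if any cue occurs as a whole word or phrase in the context."""
    return any(re.search(r"\b" + re.escape(cue) + r"\b", context) for cue in cues)


def annotate(sentence):
    """Return disease mentions with 'negation' and 'not_patient' flags."""
    text = sentence.lower()
    mentions = []
    for disease in DISEASES:
        match = re.search(r"\b" + re.escape(disease) + r"\b", text)
        if match is None:
            continue
        # Only the words preceding the mention are inspected for cues.
        left_context = text[: match.start()]
        mentions.append(
            {
                "disease": disease,
                "negation": has_cue(NEGATION_CUES, left_context),
                "not_patient": has_cue(FAMILY_CUES, left_context),
            }
        )
    return mentions
```

For instance, `annotate("Mother has atopic dermatitis.")` flags the mention as `not_patient`, while `annotate("The patient denies asthma.")` flags it as negated. A left-context window this crude would misfire on longer sentences, which is one reason a real system needs syntactic analysis.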
Monitoring emergency incidents through the analysis of social media data
The paper presents a prototype of a system for monitoring emergency events in a particular geographic region by analyzing social media data. We consider the architecture and main components of the system, as well as the methods for crawling and processing emergency-related messages. The methods provide functionality for collecting emergency reports, extracting information (including the names of geographical locations and vessels), classifying texts, detecting new emergencies, and visualizing the extracted events on a geographical map. As one possible future function of the system, we propose evaluating the informativeness of messages published in social networks and other sources. Such an evaluation could be useful both in data collection and in calculating the relevance of answers when searching for information in the system.
Towards Automated Identification of Technological Trajectories
The paper presents a text mining approach to identifying technological trajectories. The main problem addressed is the selection of documents related to a particular technology, which are needed to identify the trajectory of that technology. Two methods were compared: one based on word2vec and one based on lexical-morphological and syntactic search. The aim of the developed approach is to retrieve more information about a given technology and about technologies that could affect its development. We present the results of experiments on a dataset of over 4.4 million documents from the USPTO patent database. Self-driving car technology was chosen as an example. The results show that the developed methods are useful for automated information retrieval as the first stage of the analysis and identification of technological trajectories. © Springer Nature Switzerland AG 2019
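One common way to select documents with word embeddings, shown here only as a minimal sketch, is to average the word vectors of a query and of each document and keep the documents whose cosine similarity to the query exceeds a threshold. The abstract does not say how the word2vec-based method works in detail, so this is an assumption, not the authors' procedure. The three-dimensional vectors in `TOY_VECTORS`, the threshold value, and the function names are hypothetical placeholders for a trained word2vec model.

```python
import math

# Toy vectors standing in for trained word2vec embeddings (assumed values).
TOY_VECTORS = {
    "autonomous": (0.9, 0.1, 0.0),
    "vehicle": (0.8, 0.2, 0.1),
    "lidar": (0.7, 0.3, 0.0),
    "steering": (0.6, 0.2, 0.2),
    "polymer": (0.0, 0.1, 0.9),
    "coating": (0.1, 0.0, 0.8),
}


def embed(tokens):
    """Average the vectors of known tokens; return None if none is known."""
    known = [TOY_VECTORS[t] for t in tokens if t in TOY_VECTORS]
    if not known:
        return None
    return tuple(sum(dim) / len(known) for dim in zip(*known))


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def select_documents(query, documents, threshold=0.9):
    """Return ids of documents whose averaged vector is close to the query's."""
    query_vec = embed(query.lower().split())
    selected = []
    for doc_id, text in documents.items():
        doc_vec = embed(text.lower().split())
        if query_vec is None or doc_vec is None:
            continue
        if cosine(query_vec, doc_vec) >= threshold:
            selected.append(doc_id)
    return selected
```

With `documents = {"p1": "lidar steering vehicle", "p2": "polymer coating"}`, the query `"autonomous vehicle"` selects only `p1`. In practice the threshold would have to be tuned on labeled examples.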
Towards automated meta-analysis of biomedical texts in the field of cell-based immunotherapy
Cell-based immunotherapy is a promising approach to the treatment of chronic infections, autoimmune disorders, and malignant tumors. There are many strategies for cell-based immunotherapy of cancer; these include the injection of various immune effector cells that have been propagated and "trained" in a cell culture. Alternatively, cells presenting tumor antigens on their surface in a form recognized by the immune system can be used to achieve a therapeutic effect. The research results in this field are presented in thousands of texts, and analyzing them manually is difficult. We have developed an approach for automated text analysis in this area of biomedical science. Here we present the first results of the automated analysis of data extracted from abstracts of scientific articles available in PubMed. These results demonstrate the associations between types of tumors and the most commonly used methods of their cell-based immunotherapy.
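A simple way to surface associations between tumor types and therapy methods is to count how often the two kinds of terms co-occur in the same abstract. The sketch below illustrates that idea only; the abstract does not describe the authors' extraction pipeline, and the term lists and the function name `count_associations` are hypothetical.

```python
from collections import Counter
from itertools import product

# Hypothetical term lists; a real system would use curated vocabularies.
TUMOR_TERMS = ("melanoma", "glioma", "leukemia")
THERAPY_TERMS = ("dendritic cell", "car-t", "nk cell")


def count_associations(abstracts):
    """Count abstracts in which a tumor term and a therapy term co-occur."""
    counts = Counter()
    for abstract in abstracts:
        text = abstract.lower()
        tumors = [t for t in TUMOR_TERMS if t in text]
        therapies = [m for m in THERAPY_TERMS if m in text]
        # Each (tumor, therapy) pair is counted at most once per abstract.
        counts.update(product(tumors, therapies))
    return counts
```

Calling `count_associations(...).most_common()` on a corpus of abstracts then lists the pairs from most to least frequent. Raw substring matching ignores synonyms and negation, so the counts are only a first approximation.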