International Conference "Mathematical and Information Technologies, MIT-2016"

August 28 – September 5, 2016

Vrnjačka Banja, Serbia - Budva, Montenegro

Nugumanova A., Mansurova M.E., Alimzhanov Y., Baiburin Y.

Using Non-Negative Matrix Factorization for Text Segmentation

Speaker: Nugumanova A.

Applied to topic modeling, non-negative matrix factorization maps documents into the topic domain. The basis matrix reduces the dimensionality of the initial vector representations of documents, a property widely used in text classification, clustering, and information retrieval. The feature matrix makes it possible to estimate how the words occurring in the collection are distributed over topics. Sorting the elements (words) of each extracted topic in decreasing order of weight identifies the most valuable (highest-weighted) words of that topic. The aim of this work is to study whether the quality of topic segmentation of documents can be improved by using such valuable words, which we call topic representatives.
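
The factorization and the ranking of words by weight can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a document-by-term count matrix V factored as V ≈ WH with the standard multiplicative update rules for the squared-error objective, and the toy vocabulary, the counts, and the helper names (`nmf`, `top_words`, `reconstruction_error`) are all hypothetical. Here the rows of W give each document's weights over topics and the rows of H give each topic's weights over words; how these map onto the "basis" and "feature" matrices of the abstract is likewise an assumption.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def nmf(v, k, iters=500, seed=0, eps=1e-9):
    """Factor V (docs x terms) into non-negative W (docs x k) and H (k x terms)
    with multiplicative updates for the squared (Frobenius) error."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random initialization (an arbitrary choice for this sketch).
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return w, h


def top_words(h, vocab, n_top=2):
    """For each topic (row of H), list the n_top words with the largest weights."""
    result = []
    for row in h:
        order = sorted(range(len(vocab)), key=lambda j: row[j], reverse=True)
        result.append([vocab[j] for j in order[:n_top]])
    return result


def reconstruction_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


# Hypothetical toy collection: rows are documents, columns are term counts.
vocab = ["matrix", "vector", "factor", "river", "lake", "water"]
docs_terms = [
    [3, 2, 2, 0, 0, 0],
    [2, 3, 1, 0, 0, 0],
    [0, 0, 0, 3, 2, 2],
    [0, 0, 0, 2, 3, 1],
]

w, h = nmf(docs_terms, k=2)
representatives = top_words(h, vocab)
for topic_id, words in enumerate(representatives):
    print(topic_id, words)
```

The highest-weighted words of each row of H play the role of the topic representatives described above; how they are then used to segment a document is not specified in the abstract and is not shown here.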



© 1996-2019, Institute of Computational Technologies SB RAS, Novosibirsk