About Problems with Finding Outliers in a Single-Class Problem
DOI:
https://doi.org/10.71310/pcam.3_67.2025.11
Keywords:
loss functions, unsupervised learning, diverse data
Abstract
Outlier detection is considered within a one-class classification problem. The difficulties of detection stem from the choice of loss functions and optimization criteria and from the transformation of features of different types. Examples are given from subject areas where such problems arise. The paper argues that clustering methods for outlier detection, and the possibility of interpreting their results unambiguously, are lacking. The variety of possible analyses stems from uncertainty in the choice of clustering parameters. As additional domain knowledge, the nature of each outlier object's surroundings is analyzed. For this purpose, kernel density estimates of outliers are calculated over local areas of fixed size. A method is presented for locating the objects closest to the class center. The approach is applied to the analysis of meteorological data for the city of Tashkent. To enable a prompt response to environmental violations, a transition to a two-class classification problem is proposed, in which one class describes an acceptable state and the other a deviation from environmental standards.
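The two computational steps named in the abstract, a kernel density estimate over a local area of fixed size and the search for objects closest to the class center, can be illustrated with a minimal sketch. The abstract does not specify the kernel, the size of the local area, or how the class center is defined, so the Gaussian kernel, the `radius` parameter, the use of the sample mean as the center, and the function names `local_kernel_density` and `closest_to_center` are all assumptions made for illustration, not the author's method.

```python
import numpy as np

def local_kernel_density(X, radius):
    """Gaussian-kernel density score of each row of X, restricted to
    neighbors within a fixed radius (an assumed local-area definition)."""
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances, shape (n, n).
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Gaussian kernel with bandwidth equal to the radius (an assumption).
    k = np.exp(-0.5 * (d / radius) ** 2)
    # Keep only neighbors inside the local area and drop each point itself.
    inside = (d <= radius) & ~np.eye(len(X), dtype=bool)
    return (k * inside).sum(axis=1) / max(len(X) - 1, 1)

def closest_to_center(X, k):
    """Indices of the k rows nearest to the sample mean (assumed class center)."""
    X = np.asarray(X, dtype=float)
    dist = np.linalg.norm(X - X.mean(axis=0), axis=1)
    return np.argsort(dist)[:k]

# Toy data: a tight group of four points plus one isolated point.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [5.0, 5.0]])
density = local_kernel_density(X, radius=1.0)
suspect = int(np.argmin(density))   # object with the emptiest local area
core = closest_to_center(X, k=2)    # objects nearest the assumed center
```

In this toy example the isolated point has no neighbors inside its local area, so its density score is zero and it is the natural outlier candidate. The sample mean is itself pulled toward outliers, so a more robust center (for example the coordinate-wise median) may be preferable in practice.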

License
Copyright (c) 2025 N.A. Ignatiev

This work is licensed under a Creative Commons Attribution 4.0 International License.