
Anuarbekova A.A.

  


PREDICTING YOUTUBE MUSIC HITS WITH VISUAL AND BEHAVIORAL DATA

  


Abstract:
In today's digital world, machine learning is used to find out what popular and new songs on platforms such as YouTube Music have in common. We leveraged a dataset of music videos, extracted relevant features from common YouTube metrics and video thumbnails, and developed supervised machine learning models to predict the hit potential of a song. We found that engagement metrics (likes, comments, popularity) and visual features jointly act as strong predictors of a music video's success.

Keywords:
machine learning, image features, social media features


DOI 10.24412/2712-8849-2025-586-1268-1277

Streaming platforms have become the go-to medium for how music is received and consumed globally, reshaping the entire music industry. YouTube Music is one such hybrid platform that fuses audio and visual elements and can influence user behavior in multiple ways. YouTube remains one of the most popular music discovery tools, with millions of music videos allowing both major stars and independent creators to reach a global audience. In such a competitive environment, the ability to predict which songs will succeed and achieve "hit" status has become prized knowledge for producers, marketers, and streaming services alike. Historically reliant on radio airplay, record label promotion, and cultural trends, popularity in the music business is being reshaped by an emerging music economy driven by digital consumption. In the digital realm, new variables have been added to the mix, including user interaction (likes, comments, views), algorithmic recommendations, and visual appeal (video thumbnails).

Numerous studies have focused on the behavioral signals that digital content needs in order to succeed. For example, Sagiroglu [11] and Koseoglu [5] showed that likes and views are effective measures of popularization. According to Beer [1], the interaction patterns of digital platforms are increasingly shaping the landscape of digital consumption. North and Alluri [10], in turn, suggest that visual characteristics have a significant impact on forming user impressions and increasing engagement. While interest in these individual factors has grown, little research has addressed predictive modeling that integrates visual and behavioral data. This study bridges that gap by combining business analytics with machine learning to predict the success potential of YouTube music. We aim to construct a prediction framework using YouTube engagement metrics and visual features derived from video thumbnails. In doing so, we contribute to the computer music community and offer applied knowledge for industry practitioners. We also compare the results of different machine learning models to identify the best technique for this specific task, predicting hit tracks with the help of visual analytics.

Predicting music popularity on digital platforms has emerged as an important research agenda at the intersection of music information retrieval, social media analytics, and business analytics. Past research has studied user engagement statistics, metadata, social dynamics, and multimedia. Platform data has supported multiple lines of inquiry into the relationship between engagement features such as likes, views, and comments and the success of online music content. Sagiroglu [11] proposed a track popularity prediction model incorporating metadata from YouTube and Spotify. Koseoglu [5] emphasized the role of YouTube marketing in the music industry and showed how usage metrics directly influence popularity and revenue. Beer [1] emphasizes that, alongside emerging interface design and recommendation algorithms, we are witnessing what new "audiences" look like as they watch many hours of video content on platforms such as YouTube. According to Cunningham and Craig [3], [2], likes and comments are meaningful for analyzing user engagement and content resonance, particularly when comparing user-generated and professional content.
As Hesmondhalgh [4] points out, such metrics do not only measure success: in an algorithmic environment they can also increase the visibility granted to artists on the platform.

Besides behavioral data, the role of visual aspects of video clips, especially thumbnails, is also acknowledged. North and Alluri [10] demonstrated that the visual presentation of sound affects user perception and emotional response. Kumar [6] as well as Smith and Johnson [12] focused on the role of thumbnails and demonstrated that elements like color, brightness, and composition influence click-through rates and initial attention. Zangerle [16] proposed a model of visual trend analysis of music streams by genre, showing that certain tones or images are more prevalent in certain music categories. Nguyen and Liu [9], [7] additionally noted that image metadata (such as faces, emotion, and text) also serve as useful features for estimating popularity.

Predicting music popularity can also be approached with different machine learning techniques. Wares [14] used decision trees and logistic regression on social media and graph data, while Wilson and Martinez [15] applied convolutional neural networks (CNNs) to detect temporal engagement patterns. Brown and Davis [12] discussed popularity bias in recommender systems and argued for more balanced prediction systems. Methods aiming at higher prediction accuracy are also actively explored. For example, Tran [13] proposed a methodology that combines user behavior, content attributes, and platform properties to maximize prediction accuracy. Similarly, Morales and Chen [8] emphasize the importance of understanding how different categories of users interact with content, and what their communication goals are, in order to improve modeling performance. However, little effort has been put into combining activity and visual data in a single prediction model. This study addresses that gap by combining YouTube engagement data and video thumbnail analytics to improve the prediction of future views and thus popularity.

Methods. In this study, we applied machine learning algorithms to predict whether a piece of music is popular or not. First, we performed data cleaning and preprocessing. Then we extracted two types of features from the data: user engagement metrics (likes, views, comments, etc.) and visual features (e.g., color palette, image dynamics), and used them to train several machine learning models. This allowed us to compare their effectiveness in predicting the popularity of music videos; the purpose of the comparison was to select the most accurate classifier. The entire design and implementation of the study was carried out in the Python programming language.

The data was collected from publicly available sources. Using Google Cloud Platform and YouTube Music, we pulled the available data with the help of the official YouTube Data API v3. We then used the Billboard Hot 100 chart dataset available on Kaggle, which contains the most trending songs, and combined it with content from YouTube. Metadata for those titles was extracted via the YouTube API, collecting the following attributes: videoId, title, channelTitle, viewCount, likeCount, commentCount, publishedAt. We stored these values in a CSV file, which was further enriched by manually assigning popularity scores based on rank and engagement.
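As a rough illustration of this collection step, the sketch below pulls the listed attributes for a set of video IDs through the YouTube Data API v3 (here via the google-api-python-client library, one possible client) and writes them to a CSV file. The API key, the video IDs, and the output filename are placeholders, not values from the study.

# Minimal sketch of the metadata collection step, assuming google-api-python-client
# is installed; API_KEY and VIDEO_IDS are placeholders, not the study's actual inputs.
import csv
from googleapiclient.discovery import build

API_KEY = "YOUR_API_KEY"        # placeholder
VIDEO_IDS = ["dQw4w9WgXcQ"]     # placeholder list of videoId values

youtube = build("youtube", "v3", developerKey=API_KEY)

fieldnames = ["videoId", "title", "channelTitle", "viewCount",
              "likeCount", "commentCount", "publishedAt"]
rows = []

# The videos().list endpoint accepts up to 50 ids per request.
for start in range(0, len(VIDEO_IDS), 50):
    batch = VIDEO_IDS[start:start + 50]
    response = youtube.videos().list(
        part="snippet,statistics",
        id=",".join(batch),
    ).execute()
    for item in response.get("items", []):
        snippet = item["snippet"]
        stats = item.get("statistics", {})
        rows.append({
            "videoId": item["id"],
            "title": snippet["title"],
            "channelTitle": snippet["channelTitle"],
            "viewCount": stats.get("viewCount", 0),
            "likeCount": stats.get("likeCount", 0),
            "commentCount": stats.get("commentCount", 0),
            "publishedAt": snippet["publishedAt"],
        })

# Store the collected attributes in a CSV file for later labeling.
with open("youtube_metadata.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)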
The dataset contains both numerical and categorical features for the 98 tracks. For visual component analysis, we extracted the video thumbnail (cover image) of each song. For modeling, we chose four commonly used supervised machine learning models for comparison: logistic regression, random forest classifier, K-nearest neighbors (KNN), and multi-layer perceptron (MLP). These models were chosen for their simplicity, efficiency, and previous application in similar classification problems with numerical and categorical data. The dataset was randomly divided into a training set (80%) and a test set (20%). Every model was trained on the training data and assessed on the test data using two main metrics: accuracy and F1 score. Accuracy measures the overall correctness of the model, while the F1 score balances precision and recall, which is especially important when dealing with imbalanced datasets like the one used here. The metrics are calculated as follows:
\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}

F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}, \quad \text{Precision} = \frac{TP}{TP + FP}, \quad \text{Recall} = \frac{TP}{TP + FN}

where TP, TN, FP, and FN denote true positives, true negatives, false positives, and false negatives, respectively.

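A minimal sketch of the comparison described above is given below. It assumes the engagement metrics and a binary popularity label (named is_hit here for illustration) sit in the CSV file produced by the collection step, and that thumbnails are stored locally in a thumbnails/ folder named after each videoId; the mean-color and brightness features, the file paths, and the model hyperparameters are illustrative assumptions, not the study's exact configuration.

# Sketch of the model comparison: four classifiers, 80/20 split, accuracy and F1.
# File paths, column names, and the simple color features are assumptions.
import numpy as np
import pandas as pd
from PIL import Image
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, f1_score

def thumbnail_features(path):
    """Mean RGB values and overall brightness of a thumbnail image."""
    img = np.asarray(Image.open(path).convert("RGB").resize((120, 90)), dtype=float)
    mean_rgb = img.mean(axis=(0, 1))   # average color per channel
    brightness = img.mean()            # overall brightness
    return np.concatenate([mean_rgb, [brightness]])

df = pd.read_csv("youtube_metadata.csv")   # output of the collection step
engagement = df[["viewCount", "likeCount", "commentCount"]].astype(float).values
visual = np.vstack([thumbnail_features(f"thumbnails/{vid}.jpg") for vid in df["videoId"]])
X = np.hstack([engagement, visual])
y = df["is_hit"].values                    # assumed binary popularity label

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

models = {
    "Logistic regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "Random forest": RandomForestClassifier(n_estimators=200, random_state=42),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "MLP": make_pipeline(StandardScaler(),
                         MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=42)),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    print(f"{name}: accuracy={accuracy_score(y_test, pred):.3f}, "
          f"F1={f1_score(y_test, pred):.3f}")

Scaling is applied to the distance- and gradient-based models because raw view counts span several orders of magnitude; the tree-based random forest does not require it.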

Citation:

Anuarbekova A.A. PREDICTING YOUTUBE MUSIC HITS WITH VISUAL AND BEHAVIORAL DATA // Вестник науки №5 (86) vol. 4. 2025. pp. 1268-1277. ISSN 2712-8849 // URL: https://www.вестник-науки.рф/article/23427 (accessed: 08.07.2025)


Alternative link in Latin characters: vestnik-nauki.com/article/23427








