
Spotting the Shift: Real-Time Change Detection with K-NN Density Estimation and KL Divergence

2026/02/14 06:10
Reading time: 5 min

Sergei Nasibian is a Quantitative Strategist at Rothesay, a London-based asset management company, where he developed from scratch the entire risk calculations framework that serves as the main source of analytics for hedging market exposure. Previously, Sergei worked as a Senior Data Scientist at Yandex Eats, where he developed the company’s delivery pricing system from the ground up and supported business expansion into new geographies. He also worked as a Data Scientist at McKinsey & Company and as a Quantitative Researcher at WorldQuant, where he won the global alpha building competition. Sergei holds a degree in Mathematics from Lomonosov Moscow State University and specializes in stochastic processes and clustering algorithms.

A model-agnostic method to catch subtle shifts in data distribution before your metrics degrade.

Machine learning models rarely fail all at once. Instead, their performance deteriorates gradually: drifts in metric values, confidence measures, and prediction accuracy typically begin well before anyone notices them.

One reason a model gradually becomes unfit is a change in the input data distribution. Even slight shifts can make the model less reliable, so spotting them early has become vital for maintaining production systems.

Our guest expert, Sergei Nasibian, offers a real-time solution for data drift detection that is both intuitive and mathematically grounded. His approach combines k-nearest-neighbor density estimation with Kullback-Leibler divergence to detect whenever incoming data deviates from the training distribution. The method makes no assumptions about the shape of the data distribution and requires no knowledge of the model's internals.

The Silent Saboteur: Why Data Drift Matters

In production machine learning, data distributions are rarely constant: market behavior and other factors can cause the input data to drift. Traditional monitoring focuses on model outputs such as recall, accuracy, and precision, but by the time those numbers drop, the problem has already happened. Sergei instead monitors the input data.

The Dynamic Duo: K-NN and KL Divergence

Sergei’s method combines two complementary techniques:

K-Nearest Neighbors for Density Estimation: k-NN density estimation relies solely on the data itself rather than assuming how the data should look (Gaussian parametric family, anybody?). It estimates the probability density at a point in feature space from the distance between that point and its k-th nearest neighbor.
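A minimal sketch of such an estimator could look like the snippet below; NumPy, SciPy, and scikit-learn are implementation choices assumed here, and the function name and signature are illustrative rather than the expert's actual code.

```python
import numpy as np
from scipy.special import gamma
from sklearn.neighbors import NearestNeighbors

def knn_density(query_points, sample, k):
    """k-NN density estimate: p_hat(x) = k / (n * V_d * r_k(x)**d),
    where r_k(x) is the distance from x to its k-th nearest neighbour
    in `sample` and V_d is the volume of the unit ball in d dimensions."""
    sample = np.asarray(sample, dtype=float)
    query_points = np.asarray(query_points, dtype=float)
    n, d = sample.shape
    nn = NearestNeighbors(n_neighbors=k).fit(sample)
    r_k = nn.kneighbors(query_points)[0][:, -1]          # distance to the k-th neighbour
    unit_ball_volume = np.pi ** (d / 2) / gamma(d / 2 + 1)
    return k / (n * unit_ball_volume * np.maximum(r_k, 1e-12) ** d)
```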

KL Divergence: Kullback-Leibler divergence measures how much one probability distribution differs from another. The greater the KL divergence between the current and reference data distributions, the stronger the evidence of drift.
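Formally, for a current density p and a reference density q, D_KL(P || Q) = E_P[log p(x) / q(x)]. With sample-based density estimates, this expectation can be approximated by a simple Monte Carlo average; the snippet below is one such plug-in estimate, an illustration rather than necessarily the exact estimator used in practice.

```python
import numpy as np

def kl_divergence(p_current, p_reference):
    """Plug-in Monte Carlo estimate of D_KL(P || Q) = E_P[log p(x) / q(x)],
    given density estimates of both distributions evaluated at points drawn from P."""
    p_current = np.asarray(p_current, dtype=float)
    p_reference = np.asarray(p_reference, dtype=float)
    eps = 1e-12  # guard against log(0) and division by zero
    return float(np.mean(np.log((p_current + eps) / (p_reference + eps))))
```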

The Method: Simple Yet Effective

Our expert’s detection system functions as follows (a minimal end-to-end sketch follows the steps):

Define the baseline: Training samples serve as the baseline distribution. Estimate the reference probability density with the k-NN algorithm.

Form the Sliding Window: As new observations flow in, maintain a sliding window of recent ones; this window represents the “current” distribution. Apply the k-NN algorithm, with the same parameter k, to estimate the probability density of the observations in the window.

Calculate the KL divergence: Compare the two estimated distributions using KL divergence. Higher values indicate drift, while smaller values indicate that the distributions are similar.

Trigger the alert: An alert should be triggered when the KL divergence goes above the pre-set threshold.
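Putting the four steps together, an end-to-end detector might look like the following sketch; the class name, defaults, and the self-exclusion handling are illustrative assumptions, not the expert's implementation.

```python
import numpy as np
from collections import deque
from scipy.special import gamma
from sklearn.neighbors import NearestNeighbors


class KnnKlDriftDetector:
    """Fixed reference sample, sliding window of recent observations,
    k-NN density estimates for both, KL divergence compared to a threshold."""

    def __init__(self, reference, window_size, k, threshold):
        self.reference = np.asarray(reference, dtype=float)
        self.window = deque(maxlen=window_size)
        self.k = k
        self.threshold = threshold

    @staticmethod
    def _knn_density(points, sample, k, exclude_self=False):
        n, d = sample.shape
        # If the query points are the sample itself, skip the zero self-distance.
        nn = NearestNeighbors(n_neighbors=k + 1 if exclude_self else k).fit(sample)
        r_k = nn.kneighbors(points)[0][:, -1]             # distance to the k-th neighbour
        v_d = np.pi ** (d / 2) / gamma(d / 2 + 1)          # unit-ball volume in d dimensions
        return k / ((n - 1 if exclude_self else n) * v_d * np.maximum(r_k, 1e-12) ** d)

    def update(self, x):
        """Add one observation; return (kl_value, alert) once the window is full."""
        self.window.append(np.asarray(x, dtype=float))
        if len(self.window) < self.window.maxlen:
            return None, False
        current = np.stack(self.window)
        p_cur = self._knn_density(current, current, self.k, exclude_self=True)  # "current" density
        p_ref = self._knn_density(current, self.reference, self.k)              # "reference" density
        eps = 1e-12
        kl = float(np.mean(np.log((p_cur + eps) / (p_ref + eps))))
        return kl, kl > self.threshold
```

In use, `threshold` can be set with the percentile-based calibration described in the next section, and `k` can start at the square root of the window size, as suggested below.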

The Devil in the Details: Practical Considerations

Window Size Selection: If the window is too small, you’ll end up chasing noise; if it is too large, rapid changes will be smoothed out and missed. The right choice depends on your dataset and on how quickly you need to spot a change.

Threshold calibration: Choosing the KL divergence threshold is just as important. Like window size, it can produce false positives if set too low or miss real drifts if set too high. Sergei recommends splitting a homogeneous part of the sample into n sequential windows, calculating the pairwise KL divergences between them, and taking the 95th or 99th percentile of the resulting values as the threshold.
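A sketch of that calibration is shown below; the function and parameter names are illustrative, and the density helper is the same k-NN estimate used in the earlier snippets.

```python
import numpy as np
from itertools import combinations
from scipy.special import gamma
from sklearn.neighbors import NearestNeighbors


def knn_density(points, sample, k, exclude_self=False):
    """Same k-NN density estimate as in the earlier sketches."""
    n, d = sample.shape
    nn = NearestNeighbors(n_neighbors=k + 1 if exclude_self else k).fit(sample)
    r_k = nn.kneighbors(points)[0][:, -1]
    v_d = np.pi ** (d / 2) / gamma(d / 2 + 1)
    return k / ((n - 1 if exclude_self else n) * v_d * np.maximum(r_k, 1e-12) ** d)


def calibrate_threshold(homogeneous_sample, n_windows, k, percentile=95):
    """Split a drift-free sample into n sequential windows, compute pairwise
    KL divergences between them, and take a high percentile as the alert threshold."""
    windows = np.array_split(np.asarray(homogeneous_sample, dtype=float), n_windows)
    eps = 1e-12
    kl_values = []
    for a, b in combinations(range(len(windows)), 2):
        p = knn_density(windows[a], windows[a], k, exclude_self=True)  # window a under itself
        q = knn_density(windows[a], windows[b], k)                     # window a under window b
        kl_values.append(np.mean(np.log((p + eps) / (q + eps))))
    return float(np.percentile(kl_values, percentile))
```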

Determining the value of k: The larger k is, the smoother and less localized the density estimates become. Smaller values of k emphasize local irregularities in the distribution but are more sensitive to noise in the data. A good starting point is the square root of the sample size.

Real-World Application: E-commerce Recommendation Systems

For example, consider a recommendation system for an online retailer. If the model was trained on pre-pandemic shopping data, but customer behavior has since changed (e.g., increased purchases of home goods and decreased interest in travel accessories), the input data distribution will shift.

Traditional monitoring might show declining click-through rates days after the shift began. Our expert’s k-NN approach would flag the change much earlier by detecting that incoming customer feature vectors (in-app behavior) no longer match the training distribution.

When KL divergence spikes, you know something’s changed. Maybe it’s a seasonal trend, a marketing campaign effect, or a fundamental shift in customer preferences. Either way, you’re alerted in time to investigate and adapt.

The Scalability Question

Sergei notes that the technique can scale to large workloads with the right engineering choices:

Sampling Strategies: When dealing with large-scale data, operate on a representative sample rather than the full dataset.

Approximate Nearest Neighbors: Use libraries such as Annoy or Faiss for approximate nearest-neighbor search (a sketch follows this list).

Parallel Processing: Both the density estimation and the KL divergence computation can be distributed across multiple machines.

Incremental Updates: Update rolling statistics rather than recomputing everything.
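As one illustration of that point, the k-th neighbour distances that feed the density estimate can be obtained from an approximate index instead of an exact brute-force search. The Faiss HNSW index and its parameters below are assumptions chosen for the sketch, not a prescribed setup.

```python
import numpy as np
import faiss  # pip install faiss-cpu


def knn_distances_faiss(query, sample, k, hnsw_m=32):
    """Approximate distance to the k-th nearest neighbour using a Faiss HNSW index.
    These distances can replace the exact ones in the k-NN density estimate."""
    sample = np.ascontiguousarray(sample, dtype=np.float32)
    query = np.ascontiguousarray(query, dtype=np.float32)
    index = faiss.IndexHNSWFlat(sample.shape[1], hnsw_m)  # graph-based approximate index
    index.add(sample)
    distances, _ = index.search(query, k)   # squared L2 distances, shape (len(query), k)
    return np.sqrt(distances[:, -1])        # Euclidean distance to the k-th neighbour
```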

