Total points 6 1. Question 1 Many ML models suffer from declining predicting capabilities over time. A common solution used to overcome this deterioration is to keep retraining your model with new data. As part of this process you may encounter a phenomenon called concept emergence. Which of the following statements accurately describes this emergent phenomenon? 1 / 1 point The persistent appearance of stationary data that remains immutable over time. > The appearance of new patterns in data distribution that were not previously present in your dataset. The lack of covariate shift. The loss of prediction quality over time. Correct That’s right! These new patterns are the new concepts emerging in your data. Continuous evaluation and monitoring will help you address them properly. 2. Question 2 Statistical process control is a technique that detects concept drift assuming that the errors follow a binomial distribution. Would the system trigger an alarm if p_{t}=\sigma_{t}=0.3p t=σt=0.3 and p_{min}=\sigma_{min}=0.12p min=σmin=0.12? 1 / 1 point > Yes No Correct That’s right! The values provided satisfy the alarm rule: p_t +\sigma_t \geq p_{min}+3\sigma_{min}p t+σt≥pmin+3σmin 3. Question 3 In sequential analysis you detect concept drift by calculating the negative predictive value, precision, recall, and specificity of the system based on a standard contingency table. If the data is stationary these quantities should not change over time. This analysis is tedious as it requires recomputing all these metrics each time we get a new sample. Which of the following approaches is usually implemented to overcome this problem? 1 / 1 point > Incremental update rule Monte Carlo sampling Adaptive windowing Recursive computation and caching Correct That’s right! The incremental update rule is P_{*}^{t}\leftarrow \eta_{*}P_{*}^{t-1}+ (1-\eta_{*}) I_{y_{t}=\hat{y}_t}P ∗t​ ←η∗​ P∗t−1 +(1−η∗)Iyt =y^t​ 4. Question 4 Drift detection techniques in unsupervised settings typically suffer from the curse of dimensionality. Which of the following techniques is an appropriate solution to mitigate the effects of this curse? (Check all that apply) 1 / 1 point SVD (Singular Value Decomposition) > PCA (Principal components analysis) Correct That’s right! Principal components analysis is the right tool to reduce the number of features to detect drift more efficiently. K-means > NMF (Non Negative Matrix Factorization) Correct That’s right! NMF is a very useful dimensionality reduction technique when the features are constrained to be all non-negative. 5. Question 5 In unsupervised settings, clustering is a very useful method to detect novelty in your data. In this method, you cluster the incoming batches of data to one of the known classes. If you observe that the features of the new data are lying far away from the classes of known features, you can term it as an emerging concept. The downside of this method is that it detects only __________ drift and not ___________ changes. 1 / 1 point population-based, cluster-based cluster-based, population-based. feature, cluster-based > cluster-based, feature Correct That’s right! Drift is detected only on a cluster centric view, 6. Question 6 It is a sad truth that most of the machine learning models are trained with a fixed set of stationary data. It is very likely that in this process you may have slightly biased your model in favor of your limited data at training. Consequently, as time progresses, your ML model's performance will deteriorate with time. Monitoring helps prevent this performance decay in which ways? (Check all that apply) 1 / 1 point Allows you to establish ground truth labels By retraining your model constantly By performing dimensionality reduction > Reduces false alarm rates Correct Correct! If applied wisely, monitoring can detect data drift early on and adjust the model accordingly to adapt to these changes and hence improving model’s performance. > Identify regions in latent space where the model performs poorly Correct That’s right! Monitoring will help you identify areas in latent space where your model struggles at classification. You can further use this knowledge to refine your model. > Allows you to identify distribution changes close to the classification boundaries Correct Yes! Data drift can change classification boundaries quite drastically and monitoring will he