1. Evaluation Metrics
- Accuracy
- Definition: The ratio of correctly predicted instances to the total instances.
- Strengths: Simple and easy to interpret.
- Limitations: Misleading in imbalanced datasets where one class dominates.
- Example: A model predicting all instances as the majority class can achieve high accuracy but fail to capture minority class instances.
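- Code sketch: a minimal illustration of the accuracy pitfall above, assuming scikit-learn is available; the 95/5 class split and the always-majority predictor are toy choices.

```python
# Majority-class baseline on an imbalanced label set:
# high accuracy, zero minority-class detections.
import numpy as np
from sklearn.metrics import accuracy_score

y_true = np.array([0] * 95 + [1] * 5)   # toy data: 95% negatives, 5% positives
y_pred = np.zeros_like(y_true)          # always predict the majority class

print(accuracy_score(y_true, y_pred))   # 0.95, despite missing every positive
```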
- Precision
- Definition: The ratio of true positive predictions to the total predicted positives.
- Formula: $\text{Precision} = \frac{\text{TP}}{\text{TP} + \text{FP}}$
- Use Case: Important when false positives are costly (e.g., spam detection).
- Example: In fraud detection, high precision means few legitimate transactions are misclassified as fraudulent.
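- Code sketch: computing precision with scikit-learn on made-up labels; the counts in the comment follow the formula above.

```python
# Toy labels: TP = 3, FP = 1, so precision = 3 / (3 + 1) = 0.75.
from sklearn.metrics import precision_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]

print(precision_score(y_true, y_pred))   # 0.75
```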
- Recall (Sensitivity or True Positive Rate)
- Definition: The ratio of true positive predictions to the total actual positives.
- Formula: $\text{Recall} = \frac{\text{TP}}{\text{TP} + \text{FN}}$
- Use Case: Critical when missing a positive instance has high consequences (e.g., disease detection).
- Example: In cancer diagnosis, high recall ensures that as few cancer-positive cases as possible go undetected.
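- Code sketch: recall with scikit-learn on made-up labels; a cautious predictor that raises no false alarms can still miss half the positives.

```python
# Toy labels: TP = 2, FN = 2, so recall = 2 / (2 + 2) = 0.5,
# even though precision is a perfect 1.0.
from sklearn.metrics import precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 0, 0, 1, 0]   # cautious predictions: no false positives

print(recall_score(y_true, y_pred))      # 0.5
print(precision_score(y_true, y_pred))   # 1.0
```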
- F1-Score
- Definition: Harmonic mean of precision and recall, balancing both metrics.
- Formula: $\text{F1} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$
- Use Case: Suitable when precision and recall are equally important.
- Example: Useful in cases where class distribution is imbalanced.
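- Code sketch: the F1-score computed both from its definition and with scikit-learn, on made-up labels.

```python
# Harmonic mean of precision and recall on toy labels.
from sklearn.metrics import f1_score, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]

p = precision_score(y_true, y_pred)   # 0.75
r = recall_score(y_true, y_pred)      # 0.75
print(2 * p * r / (p + r))            # 0.75, from the formula above
print(f1_score(y_true, y_pred))       # same value via scikit-learn
```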
- Confusion Matrix
- Components:
- True Positive (TP): Correctly identified positive cases.
- True Negative (TN): Correctly identified negative cases.
- False Positive (FP): A negative case incorrectly predicted as positive.
- False Negative (FN): A positive case incorrectly predicted as negative.
- Applications: A foundational tool for evaluating model predictions.
- Example: A two-class confusion matrix provides a clear snapshot of prediction errors.
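- Code sketch: a two-class confusion matrix with scikit-learn on made-up labels; note that scikit-learn orders the cells as [[TN, FP], [FN, TP]] for labels [0, 1].

```python
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]

# Unpack the 2x2 matrix into the four cells named above.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tn, fp, fn, tp)   # 3 1 1 3
```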
- ROC Curves and AUC
- ROC Curve:
- Definition: A graph showing the trade-off between true positive rate (recall) and false positive rate at various thresholds.
- Use: Visualizes the classifier’s ability to distinguish between classes.
- AUC (Area Under the Curve):
- Definition: A single scalar value summarizing the ROC curve.
- Interpretation: Higher AUC indicates better classifier performance.
- Example: An AUC close to 1 signifies a model with near-perfect separability.
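- Code sketch: ROC points and AUC from predicted scores, assuming scikit-learn; the scores below are invented for illustration.

```python
from sklearn.metrics import roc_curve, roc_auc_score

y_true   = [0, 0, 1, 1, 0, 1, 0, 1]
y_scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70, 0.05, 0.90]   # model confidence for class 1

# Each threshold yields one (false positive rate, true positive rate) point.
fpr, tpr, thresholds = roc_curve(y_true, y_scores)
print(roc_auc_score(y_true, y_scores))   # closer to 1.0 means better separability
```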
2. Model Comparison
- Methods for Model Comparison
- Cross-validation:
- Definition: Repeatedly splitting the dataset into training and validation folds and averaging performance across the splits to evaluate consistency.
- Benefit: Provides reliable performance metrics by reducing variance due to dataset splits.
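- Code sketch: 5-fold cross-validation with scikit-learn; the synthetic dataset and logistic regression model are placeholders for whatever is being evaluated.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, random_state=0)   # synthetic placeholder data
model = LogisticRegression(max_iter=1000)

scores = cross_val_score(model, X, y, cv=5, scoring="f1")   # one F1 score per fold
print(scores.mean(), scores.std())   # average performance and its spread across splits
```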
- Metrics-based comparison:
- Comparing models using metrics like accuracy, precision, recall, F1-score, and AUC.
- Statistical significance tests:
- Techniques such as paired t-tests check whether observed differences between models are larger than chance alone would explain.
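- Code sketch: a paired t-test (scipy.stats.ttest_rel) on per-fold scores from two models, pairing by fold; the models and data are placeholders.

```python
from scipy.stats import ttest_rel
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, random_state=0)   # synthetic placeholder data
scores_a = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=10)
scores_b = cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=10)

t_stat, p_value = ttest_rel(scores_a, scores_b)   # scores are paired by fold
print(p_value)   # a large p-value means the observed gap could plausibly be chance
```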
- Limitations of Specific Metrics
- Accuracy:
- Misleading in imbalanced datasets where it may favor majority class predictions.
- Example: In a dataset with 99% of one class, predicting only the majority class achieves 99% accuracy.
- Precision vs. Recall Trade-offs:
- Optimizing for higher precision often lowers recall, and vice versa.
- Example: Tuning a spam filter for higher precision (fewer legitimate emails flagged as spam) typically lets more spam slip through, lowering recall.
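- Code sketch: the trade-off made visible by sweeping the decision threshold over predicted scores (scikit-learn's precision_recall_curve); the scores are invented.

```python
from sklearn.metrics import precision_recall_curve

y_true   = [0, 0, 1, 1, 0, 1, 0, 1]
y_scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70, 0.05, 0.90]

precision, recall, thresholds = precision_recall_curve(y_true, y_scores)
for p, r, t in zip(precision, recall, thresholds):
    print(f"threshold={t:.2f}  precision={p:.2f}  recall={r:.2f}")
# Raising the threshold tends to push precision up and recall down.
```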
- F1-Score:
- May not fully capture the relative importance of precision and recall in some cases.
- Example: F1-score does not differentiate cases where precision is more important than recall or vice versa.
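- Code sketch: one common response to this limitation is the more general F-beta score (scikit-learn's fbeta_score), where beta > 1 weights recall more heavily and beta < 1 weights precision more heavily; the labels are made up.

```python
from sklearn.metrics import f1_score, fbeta_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 0, 0, 1, 0]   # precision = 1.0, recall = 0.5

print(f1_score(y_true, y_pred))               # ~0.67: equal weighting
print(fbeta_score(y_true, y_pred, beta=2))    # ~0.56: recall weighted more (e.g., disease screening)
print(fbeta_score(y_true, y_pred, beta=0.5))  # ~0.83: precision weighted more (e.g., spam filtering)
```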
- Use of Cost-Sensitive Metrics
- Cost Matrix:
- Assigning weights to different types of prediction errors (e.g., FP vs. FN) based on their impact.
- Example: In credit card fraud detection, a false positive (blocking legitimate transactions) may have a lower cost than a false negative (missing fraud).
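- Code sketch: cost-weighted evaluation of prediction errors; the costs (FP = 1, FN = 20) are illustrative assumptions reflecting the fraud example above, not standard values.

```python
from sklearn.metrics import confusion_matrix

y_true = [0, 0, 1, 1, 0, 1, 0, 1]
y_pred = [0, 1, 1, 0, 0, 1, 0, 0]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

cost_fp, cost_fn = 1, 20                  # missing fraud hurts far more than a blocked card
total_cost = fp * cost_fp + fn * cost_fn  # 1*1 + 2*20 = 41 on this toy data
print(total_cost)                         # compare models by total cost, not raw accuracy
```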