Yellowbrick — Analyst Tool
Show a PrecisionRecallCurve to prove it handles your imbalanced data.
from yellowbrick.model_selection import LearningCurve, ValidationCurve
from yellowbrick.classifier import ROCAUC, ClassificationReport
Use the ValidationCurve or LearningCurve to show that your model isn't over-fitting.
Yellowbrick also handles unsupervised learning through its clustering visualizers. Choosing the number of clusters in K-Means is often a guessing game, but Yellowbrick’s elbow method (KElbowVisualizer) and silhouette (SilhouetteVisualizer) visualizers give a quantitative and visual basis for choosing "K." The elbow plot shows where adding clusters stops improving the fit, and the silhouette plot shows how well separated the groups are.
Generate ConfusionMatrix, ROCAUC, or ResidualsPlot for your results section.
2. Exporting for Your Document
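import numpy as np
from sklearn.linear_model import LogisticRegression
# X, y: your feature matrix and labels
vc = ValidationCurve(LogisticRegression(), param_name="C", param_range=np.logspace(-4, 1, 6))
vc.fit(X, y)
vc.show()  # Find the C where the validation score peaks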
Yellowbrick is a machine learning visualization library that extends the Scikit-Learn API to allow human steering of the model selection process. Essentially, it provides "visual analysis" tools that help data scientists make sense of high-dimensional data, model performance, and feature relationships.
from yellowbrick.classifier import ConfusionMatrix
from sklearn.ensemble import RandomForestClassifier
Yellowbrick: Visualizing the Scikit-Learn Model Selection Process
In ~10 lines of code, you’ve performed learning curve analysis, hyperparameter tuning, and multi-class ROC evaluation. That’s 2 hours of manual plotting compressed into 5 minutes.
This is where Yellowbrick changes the game.
If you are "producing a paper" about a machine learning project, here is how you can use Yellowbrick to generate the necessary content.
1. Generate Publication-Ready Figures