Khanz9664/README.md

Most teams stop at accuracy. I ask: why did the model decide that — and can you prove it?

I'm a Machine Learning Engineer specializing in Explainable AI (XAI) and clinical ML systems. My work sits at the boundary between statistical rigor and software engineering — building pipelines that are not only high-performing but accountable.

My research uncovered what I call the Explainability Paradox: visually convincing saliency maps that fail causal validity tests. That finding is now under peer review.


TrustLens — Open-Source ML Reliability Framework

Most evaluation stops at accuracy_score. TrustLens goes deeper.

A single analyze() call surfaces calibration drift, subgroup bias, failure patterns, and representation quality — the things that matter in production, but don't appear on leaderboards.

from trustlens import analyze

report = analyze(model, X_val, y_val, y_prob=proba)
# → Calibration · Bias · Failure Modes · Representation
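
To make the calibration part concrete, here is a minimal sketch of one common calibration measure, the expected calibration error (ECE) for a binary classifier. It is an illustration only and is not taken from TrustLens: the function name `expected_calibration_error`, the equal-width binning, and the default of 10 bins are all assumptions made for this example.

```python
import numpy as np

def expected_calibration_error(y_true, y_prob, n_bins=10):
    """Binary ECE: bin-weighted mean gap between predicted and observed positive rate."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Interior edges only, so bin ids fall in 0..n_bins-1 and p == 1.0 lands in the last bin.
    bin_ids = np.digitize(y_prob, edges[1:-1])
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            gap = abs(y_prob[mask].mean() - y_true[mask].mean())
            ece += mask.mean() * gap  # weight by the fraction of samples in the bin
    return ece

# A perfectly confident, perfectly correct model has zero calibration error.
perfect = expected_calibration_error([0, 0, 1, 1], [0.0, 0.0, 1.0, 1.0])
# Predicting 0.5 when every label is positive leaves a gap of 0.5.
underconfident = expected_calibration_error([1, 1, 1, 1], [0.5, 0.5, 0.5, 0.5])
```

A model can score well on accuracy and still show a large gap here, which is the kind of issue a reliability report is meant to surface.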

Live on PyPI · Built with production CI/CD (multi-Python testing, Ruff, MyPy) · Active contributor community


Full writeup  |  PyPI package  |  Repository


Research

Paper Under Review

Quantitative Faithfulness Benchmarking of CNNs vs. Vision Transformers: Implications for Clinical Trustworthiness

I trained three models (VGG16, ViT-B/16, and a custom CNN), ran GradCAM++ and EigenCAM on a chest X-ray dataset, and found something counterintuitive: visually plausible heatmaps lacked causal validity. A 6-dimensional benchmark, together with Pixel Deletion (AOPC/AUC), showed that patch-based Transformer attention was causally faithful where the CNNs were not, even though the CNN heatmaps looked more "correct" to the human eye. I call this the Explainability Paradox.
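
The pixel-deletion idea can be sketched in a few lines. This is a toy illustration, not the benchmark code from the paper: the helper names `deletion_curve` and `aopc`, the zero baseline, the 10 deletion steps, and the linear stand-in "model" are all assumptions chosen so the example runs without a trained network.

```python
import numpy as np

def deletion_curve(predict, image, saliency, n_steps=10, baseline=0.0):
    """Scores after progressively replacing the most salient pixels with `baseline`."""
    order = np.argsort(saliency.ravel())[::-1]  # most salient pixels first
    perturbed = image.astype(float).ravel().copy()
    scores = [predict(perturbed.reshape(image.shape))]
    for chunk in np.array_split(order, n_steps):
        perturbed[chunk] = baseline
        scores.append(predict(perturbed.reshape(image.shape)))
    return np.array(scores)

def aopc(scores):
    """Area over the perturbation curve: mean drop from the unperturbed score."""
    return float(np.mean(scores[0] - scores[1:]))

# Hypothetical linear "model" whose true pixel importances are known exactly.
rng = np.random.default_rng(0)
weights = rng.random((8, 8))
image = np.ones((8, 8))

def predict(x):
    return float((weights * x).sum())

# A saliency map equal to the true weights versus an unrelated random map.
faithful_aopc = aopc(deletion_curve(predict, image, saliency=weights))
random_aopc = aopc(deletion_curve(predict, image, saliency=rng.random((8, 8))))
```

Deleting pixels in the order given by a faithful map makes the score fall faster, so its AOPC is at least as high as that of the random map. A heatmap that only looks plausible would not earn that advantage.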

Metrics used: Sparsity · Entropy · Inter-Method Agreement · AOPC/AUC · Bonferroni-corrected non-parametric testing
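
For readers unfamiliar with the map-level metrics, the sketch below shows one plausible way to compute entropy, sparsity, and inter-method agreement for saliency maps. The exact definitions used in the paper are not stated here, so the choices below (Shannon entropy of the normalised map, the Gini index as a sparsity score, and top-k intersection-over-union as agreement) are assumptions, as are the function names.

```python
import numpy as np

def saliency_entropy(saliency, eps=1e-12):
    """Shannon entropy (in nats) of the map normalised to a probability distribution."""
    p = np.abs(saliency).ravel()
    p = p / (p.sum() + eps)
    return float(-(p * np.log(p + eps)).sum())

def gini_sparsity(saliency):
    """Gini index of absolute attributions: 0 for a uniform map, near 1 for one hot pixel."""
    v = np.sort(np.abs(saliency).ravel())
    n = v.size
    if v.sum() == 0:
        return 0.0
    ranks = np.arange(1, n + 1)
    return float(2 * (ranks * v).sum() / (n * v.sum()) - (n + 1) / n)

def topk_iou(map_a, map_b, k):
    """Agreement between two methods: IoU of their k most salient pixel indices."""
    a = set(np.argsort(map_a.ravel())[-k:].tolist())
    b = set(np.argsort(map_b.ravel())[-k:].tolist())
    return len(a & b) / len(a | b)

uniform = np.ones((8, 8))
one_hot = np.zeros((8, 8))
one_hot[3, 4] = 1.0

uniform_entropy = saliency_entropy(uniform)   # diffuse map, high entropy
one_hot_entropy = saliency_entropy(one_hot)   # focused map, entropy close to 0
uniform_gini = gini_sparsity(uniform)
one_hot_gini = gini_sparsity(one_hot)
self_agreement = topk_iou(one_hot, one_hot, k=5)
```

A diffuse map scores high entropy and low sparsity, a focused map the reverse. Neither number says anything about causal faithfulness, which is why these metrics sit alongside the deletion-based ones.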

Project writeup  |  Repository


Deployed Systems

| System | Stack | Status | Highlight |
| --- | --- | --- | --- |
| CardioSense-AI | XGBoost · FastAPI · Docker · Optuna | 🟢 Live | 90.16% acc · 0.9524 AUC · "Least Effort Path" optimizer for patient intervention |
| Breast Cancer MLOps Suite | Random Forest · Z-Score Drift · Streamlit | 🟢 Live | 98.2% acc · Real-time out-of-distribution detection |
| Respiratory Disease Classifier | VGG16 · ViT-B/16 · GradCAM++ · LIME | Research | 99% recall for COVID-19 · Explainability Paradox discovery |
| Apple Sales Intelligence | Scikit-Learn · SciPy SLSQP · Streamlit | 🟢 Live | Constrained optimization for hardware-mix revenue maximization |
| Patient Safety Guardian | Gemini 2.5 Pro · Google ADK · Streamlit | 🟢 Live | Kaggle Agents Intensive · Multi-agent clinical safety net · 100% critical interaction detection |
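
As an illustration of the constrained-optimization approach named for Apple Sales Intelligence, here is a minimal SLSQP sketch using `scipy.optimize.minimize`. It is not the project's code: the two-product revenue function, its coefficients, and the capacity of 10 units are hypothetical values picked so the optimum can be checked by hand.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical revenue with diminishing returns for two products (x[0], x[1] in units).
def revenue(x):
    return 10 * x[0] + 8 * x[1] - 0.5 * x[0] ** 2 - 0.5 * x[1] ** 2

result = minimize(
    lambda x: -revenue(x),                 # SLSQP minimises, so negate the revenue
    x0=np.array([1.0, 1.0]),
    method="SLSQP",
    bounds=[(0, None), (0, None)],         # no negative production
    constraints=[{"type": "ineq", "fun": lambda x: 10 - x.sum()}],  # x0 + x1 <= 10
)

best_mix = result.x
best_revenue = revenue(best_mix)
```

With the capacity constraint active, the Lagrange conditions give a mix of 6 and 4 units and a revenue of 66, which the solver should reproduce to within numerical tolerance.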

Mathematical Foundations of ML

I write derivation-first articles — intuition before formulas, full proofs included. No hand-waving.

| Article | What It Covers |
| --- | --- |
| Gradient Descent | Partial derivatives · learning rates · convergence from first principles |
| Lagrange Multipliers | Constrained optimization · dual problems · geometric intuition |
| Bias–Variance Trade-Off | The fundamental tension between model simplicity and prediction accuracy |
| Linear Regression | OLS derivation · normal equations · assumption breakdown |
| Logistic Regression | Sigmoid · MLE · cross-entropy loss gradient derivation |
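
Two of the topics above meet in a small numerical check: for ordinary least squares, gradient descent on the mean squared error should converge to the same weights as the normal equations. The sketch below is not drawn from the articles; the synthetic data, learning rate, and iteration count are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # intercept + 2 features
true_w = np.array([1.0, 2.0, -3.0])
y = X @ true_w + 0.1 * rng.normal(size=n)

# Closed form: solve the normal equations (X^T X) w = X^T y.
w_closed = np.linalg.solve(X.T @ X, X.T @ y)

# Gradient descent on MSE(w) = (1/n) * ||X w - y||^2.
w_gd = np.zeros(3)
learning_rate = 0.1
for _ in range(2000):
    gradient = (2.0 / n) * X.T @ (X @ w_gd - y)
    w_gd -= learning_rate * gradient
```

Both routes land on the same minimiser because the MSE is convex in `w`. The learning rate only needs to be small enough relative to the largest eigenvalue of the Hessian `(2/n) X^T X` for the iteration to stay stable.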

All articles →


Technical Stack

ML / DL          PyTorch · XGBoost · Scikit-Learn · VGG16 · ViT · Optuna
XAI              SHAP · LIME · GradCAM++ · EigenCAM · Pixel Deletion (AOPC/AUC)
MLOps            FastAPI · Docker · GitHub Actions CI/CD · Streamlit · REST APIs
Data Engineering Python · SQL · Pandas · NumPy · PCA · K-Means · Plotly
Drift Detection  Z-Score · Counterfactual Analysis · Synthetic Stress Testing
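
As a rough picture of Z-score drift detection, the sketch below flags features of an incoming sample that sit far from the training distribution. It is a simplified assumption about how such a check can work, not the deployed implementation: the function names and the threshold of 3 standard deviations are illustrative.

```python
import numpy as np

def fit_reference(X_train):
    """Per-feature mean and sample standard deviation of the training data."""
    return X_train.mean(axis=0), X_train.std(axis=0, ddof=1)

def zscore_flags(x, mean, std, threshold=3.0):
    """Boolean mask of features whose absolute z-score exceeds `threshold`."""
    z = np.abs((np.asarray(x, dtype=float) - mean) / std)
    return z > threshold

rng = np.random.default_rng(0)
X_train = rng.normal(loc=5.0, scale=2.0, size=(1000, 3))
mean, std = fit_reference(X_train)

typical = mean.copy()                          # a sample at the centre of the training data
shifted = mean + np.array([0.0, 10.0, 0.0]) * std   # second feature moved 10 std devs away

typical_flags = zscore_flags(typical, mean, std)
shifted_flags = zscore_flags(shifted, mean, std)
```

A per-feature check like this is cheap enough to run on every request, but it ignores correlations between features, so it works best as a first-line filter.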


"In God we trust. All others must bring data." — W. Edwards Deming


If your model can't explain itself, it has no business making decisions.

Pinned

  1. TrustLens Public

    Open-source Python library for evaluating ML model reliability beyond accuracy — with calibration, failure, and fairness diagnostics for informed deployment decisions.

    Python 12 14

  2. Transfer-Learning-for-Respiratory-Disease-Classification Public

    Transfer learning is a powerful enabler for medical imaging AI. By reusing knowledge from general images, we circumvent one of medicine’s biggest bottlenecks – limited annotated data.

    Python 2 1

  3. CardioSense-AI Public

    CardioSense AI — Where clinical trust meets high-performance ML. An explainable CDSS with automated risk roadmaps, AHA/ACC-aligned safety guardrails, and production-ready CI/CD pipelines.

    HTML 2

  4. Breast-Cancer-Prediction Public

    Breast cancer is one of the most common cancers in women worldwide. Early detection and accurate diagnosis are crucial for effective treatment and improved survival rates. This project utilizes mac…

    Jupyter Notebook 1 1

  5. Python-For-DataScience Public

    This repository offers a collection of Jupyter notebooks that chronicle my journey in learning Python for Data Science. The notebooks cover a range of topics, from basic programming concepts to adv…

    Jupyter Notebook 1

  6. portfolio Public

    This GitHub repository contains a personal portfolio project. It is built using HTML, CSS, and JavaScript. The primary purpose of this repository is to showcase my skills and projects through a…

    HTML 4 2