Fraud Detection & Anomaly Analytics Engine
This project is an anomaly detection system built with unsupervised machine learning techniques. Focused on fraud detection in financial transactions, the engine uses Isolation Forest, Local Outlier Factor, and deep learning autoencoders to flag suspicious behavior without requiring labeled data for training. The models are compared against each other and tuned for recall on the minority (fraud) class, since a missed fraud case is the costlier error.
Tools & Technologies Used
- Python (Pandas, NumPy, Scikit-learn, Matplotlib, Seaborn)
- Machine Learning: Isolation Forest, Local Outlier Factor
- Deep Learning: Autoencoders (Keras + TensorFlow)
- Streamlit for building an interactive fraud monitoring dashboard
- Model Deployment: Joblib for exporting models
- Dataset: Public anonymized Credit Card Transactions dataset
What I did
- Scaled and preprocessed a highly imbalanced credit card transactions dataset with over 284,000 entries.
- Built and compared three unsupervised anomaly detection models: Isolation Forest, LOF, and Autoencoder.
- Tuned model thresholds and evaluated performance using precision, recall, F1-score, and confusion matrix.
- Visualized fraud vs. non-fraud clusters and reconstruction error distributions.
- Exported trained models and deployed an interactive Streamlit dashboard for real-time fraud scoring.
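The scaling and model-comparison steps in this section can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it uses a small synthetic dataset in place of the credit card data, and the `contamination` value, feature count, and variable names are all assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.metrics import precision_score, recall_score
from sklearn.neighbors import LocalOutlierFactor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)

# Synthetic stand-in for the real dataset (assumption): 2,000 "normal" rows
# clustered near the origin and 20 scattered "fraud" rows.
X_normal = rng.normal(loc=0.0, scale=1.0, size=(2000, 5))
X_fraud = rng.uniform(low=-8.0, high=8.0, size=(20, 5))
X = np.vstack([X_normal, X_fraud])
y = np.concatenate([np.zeros(2000, dtype=int), np.ones(20, dtype=int)])

# Standardize features so distance-based models see comparable scales.
X_scaled = StandardScaler().fit_transform(X)

# Assumed expected share of anomalies; in practice this is a tuning knob.
contamination = 0.01

# Both models return -1 for outliers and 1 for inliers; map to 1 = fraud.
iso = IsolationForest(n_estimators=100, contamination=contamination, random_state=42)
iso_pred = (iso.fit_predict(X_scaled) == -1).astype(int)

lof = LocalOutlierFactor(n_neighbors=20, contamination=contamination)
lof_pred = (lof.fit_predict(X_scaled) == -1).astype(int)

# Labels are used only for evaluation, never for fitting.
results = {}
for name, pred in {"IsolationForest": iso_pred, "LOF": lof_pred}.items():
    results[name] = {
        "precision": precision_score(y, pred, zero_division=0),
        "recall": recall_score(y, pred, zero_division=0),
    }
    print(name, results[name])
```

On the real data, the same pattern would apply with the transaction features loaded from the dataset, and `contamination` (or a score threshold) adjusted until recall on the fraud class is acceptable.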
Key Highlights
- Real-World Dataset: Analyzed over 280,000 credit card transactions to detect rare fraud instances (only 0.17% fraud).
- Multi-Model Approach: Used Isolation Forest, Local Outlier Factor (LOF), and Autoencoders for unsupervised fraud detection.
- Explainable AI: Integrated SHAP (SHapley Additive Explanations) to interpret Autoencoder-based fraud predictions.
- Risk Scoring System: Developed a combined risk scoring engine based on model consensus and reconstruction error.
- Performance Evaluation: Assessed model performance using confusion matrix, precision-recall, and classification reports.
- Final Predictions Exported: Generated a clean, deployable dataset with fraud predictions for app use.
- Ready for Deployment: Exported trained models and built a Streamlit app for interactive fraud inspection.
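One way a risk score based on model consensus and reconstruction error might be combined is sketched below. The exact formula used in this project is not documented here, so the function name `combined_risk_score`, the equal weighting, and the min-max normalization are all hypothetical choices for illustration.

```python
import numpy as np


def combined_risk_score(model_flags, recon_error, consensus_weight=0.5):
    """Blend model consensus with normalized reconstruction error.

    model_flags: array of shape (n_samples, n_models), 1 = flagged as fraud.
    recon_error: array of shape (n_samples,), autoencoder reconstruction error.
    consensus_weight: hypothetical weight in [0, 1] given to the consensus term.
    Returns a score in [0, 1] per sample; higher means riskier.
    """
    flags = np.asarray(model_flags, dtype=float)
    errors = np.asarray(recon_error, dtype=float)

    # Fraction of models that flagged each transaction.
    consensus = flags.mean(axis=1)

    # Min-max scale the reconstruction error to [0, 1].
    span = errors.max() - errors.min()
    if span > 0:
        error_norm = (errors - errors.min()) / span
    else:
        error_norm = np.zeros_like(errors)

    return consensus_weight * consensus + (1.0 - consensus_weight) * error_norm


# Three example transactions scored by three models (made-up values).
flags = np.array([[0, 0, 0], [1, 0, 1], [1, 1, 1]])
errors = np.array([0.02, 0.40, 1.50])
scores = combined_risk_score(flags, errors)
print(np.round(scores, 3))
```

A transaction that no model flags and that reconstructs well scores near 0, while one flagged by every model with the largest reconstruction error scores near 1. Min-max scaling is sensitive to extreme errors, so a percentile-based normalization may be a better fit on real data.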