Sogana Pawan Kalyan
Data Scientist | BI Analyst | Data Analyst

Specialising in causal inference, ML explainability, anomaly detection, and AI-powered analytics pipelines — turning raw operational data into decisions that actually get made.

iwe24001@uconn.edu
PROJECTS SHIPPED BY TYPE
ML, BI, AI, SQL, ETL: 5 projects shipped
🔬Model Autopsy Suite

Forensic ML debugging console — investigates why a model got a prediction wrong using SHAP attribution, nearest-neighbour forensics, confidence decomposition, and an AI-written failure report across credit risk, medical, and equipment datasets.

SHAP, ML Explainability, scikit-learn, XGBoost, Streamlit, Ollama
📡AnomalyAI — Investigation Console

Production-grade anomaly detection across 5 correlated IoT signals using triple-layer detection (Z-score / IQR / CUSUM) and Granger causality for root cause ranking. Features an evidence timeline and AI-written incident report.

Anomaly Detection, Granger Causality, IoT Analytics, statsmodels, Plotly
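
To illustrate the triple-layer idea, here is a minimal sketch rather than the project's actual code. It assumes a single signal, a majority vote across the three detectors, and hypothetical defaults (`threshold`, `k`, `slack`, `limit`, `baseline`, `min_votes`) that are not taken from the project. Granger causality ranking is left out.

```python
from statistics import mean, stdev, quantiles


def zscore_flags(values, threshold=3.0):
    """Flag points more than `threshold` standard deviations from the mean."""
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return [False] * len(values)
    return [abs(v - mu) / sigma > threshold for v in values]


def iqr_flags(values, k=1.5):
    """Flag points outside the Tukey fences built from the quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v < low or v > high for v in values]


def cusum_flags(values, baseline, slack=0.5, limit=5.0):
    """Two-sided CUSUM on values standardised against the first `baseline` points.

    Assumes the baseline window is anomaly-free and not constant.
    """
    mu, sigma = mean(values[:baseline]), stdev(values[:baseline])
    pos = neg = 0.0
    flags = []
    for v in values:
        z = (v - mu) / sigma
        pos = max(0.0, pos + z - slack)   # accumulates upward drift
        neg = max(0.0, neg - z - slack)   # accumulates downward drift
        flags.append(pos > limit or neg > limit)
    return flags


def detect(values, baseline=20, min_votes=2):
    """Return indices flagged by at least `min_votes` of the three layers."""
    layers = [
        zscore_flags(values),
        iqr_flags(values),
        cusum_flags(values, baseline),
    ]
    votes = [sum(column) for column in zip(*layers)]
    return [i for i, v in enumerate(votes) if v >= min_votes]
```

Requiring agreement between layers is one way to trade recall for fewer false alarms: CUSUM alone keeps firing after a large spike, while the Z-score and IQR layers only fire on the spike itself.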
📈CausalLens — Impact Analyzer

Bayesian structural time series tool that answers "did this business decision actually work?" — models a statistical counterfactual and measures the true causal lift of interventions with 95% credible intervals across 4 real business scenarios.

Causal Inference, BSTS, Bayesian Methods, statsmodels, Streamlit
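
The counterfactual logic can be shown with a deliberately simplified stand-in. The sketch below is not a Bayesian structural time series model and produces no credible intervals: it fits an ordinary least squares line on the pre-intervention period, using a hypothetical control series that the intervention is assumed not to affect, and reports the average gap between observed and predicted values afterwards. The names `fit_line`, `estimate_lift`, `control`, `target`, and `start` are illustrative.

```python
from statistics import mean


def fit_line(x, y):
    """Least squares slope and intercept for y as a function of x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return slope, my - slope * mx


def estimate_lift(control, target, start):
    """Estimate the average post-intervention lift.

    control: series assumed unaffected by the intervention
    target:  series the intervention is meant to move
    start:   index of the first post-intervention observation
    """
    # Learn the pre-period relationship between control and target.
    slope, intercept = fit_line(control[:start], target[:start])
    # Project what the target would have looked like without the intervention.
    counterfactual = [slope * c + intercept for c in control[start:]]
    # Pointwise effect = observed value minus counterfactual value.
    effects = [obs - cf for obs, cf in zip(target[start:], counterfactual)]
    return mean(effects), counterfactual
```

A full BSTS treatment would replace the fitted line with a state space model and report a posterior distribution over the lift instead of a single number.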
🗄️Text-to-SQL Query Engine

Natural language interface over a multi-table relational database using live schema injection and LLM inference. Validates SQL via DuckDB EXPLAIN before execution and returns results with an AI business explanation.

LLM, NL-to-SQL, DuckDB, Prompt Engineering, Streamlit
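
The validate-before-execute step might look like the sketch below. To stay dependency-free it uses Python's built-in `sqlite3` module as a stand-in for DuckDB, so it shows the pattern and not the project's code. The `orders` table and the `validate_then_run` helper are hypothetical.

```python
import sqlite3


def validate_then_run(conn, sql):
    """Plan the statement with EXPLAIN first; run it only if planning succeeds.

    Returns (rows, None) on success or (None, error_message) on failure.
    """
    try:
        # EXPLAIN compiles the statement without returning its result rows,
        # so unknown tables, unknown columns, and syntax errors surface here.
        conn.execute("EXPLAIN " + sql)
    except sqlite3.Error as exc:
        return None, str(exc)
    return conn.execute(sql).fetchall(), None


# Illustrative in-memory database standing in for the real schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 20.0), (2, 35.5)])

rows, error = validate_then_run(conn, "SELECT SUM(amount) FROM orders")
bad_rows, bad_error = validate_then_run(conn, "SELECT total FROM orders")
```

Returning the error message instead of raising makes it easy to feed a failed plan back to the LLM for a retry, which is one common design for this kind of loop.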
📊AI Dashboard Narrator

Auto-charting pipeline that profiles any CSV, infers column types, routes each variable to the right Plotly chart, and uses a local LLM to generate plain-English business narratives per chart — zero cloud API required.

Generative AI, BI Automation, Pandas, Plotly, Ollama
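
The profiling and routing step can be sketched with the standard library alone. The type rules and the chart mapping in `CHART_FOR_TYPE` below are assumptions chosen for illustration, not the project's actual rules. The sketch returns chart names as strings and does not call Plotly or an LLM.

```python
import csv
import io
from datetime import datetime


def infer_type(values):
    """Classify a column as numeric, datetime, categorical, or empty."""
    non_empty = [v for v in values if v.strip()]
    if not non_empty:
        return "empty"

    def all_parse(parser):
        try:
            for v in non_empty:
                parser(v)
            return True
        except ValueError:
            return False

    if all_parse(float):
        return "numeric"
    if all_parse(datetime.fromisoformat):  # ISO 8601 dates only in this sketch
        return "datetime"
    return "categorical"


# Hypothetical routing table: one default chart per inferred type.
CHART_FOR_TYPE = {
    "numeric": "histogram",
    "datetime": "line",
    "categorical": "bar",
}


def route_columns(csv_text):
    """Map each CSV column name to a chart type, or "skip" if nothing fits."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return {}
    plan = {}
    for column in rows[0].keys():
        kind = infer_type([row[column] or "" for row in rows])
        plan[column] = CHART_FOR_TYPE.get(kind, "skip")
    return plan
```

For a file with `date`, `region`, and `sales` columns, `route_columns` would return one chart name per column, which a plotting layer could then dispatch on.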
JUN 2021 - JUL 2022
Data Scientist Intern
AVISHKAR TECH SOLUTIONS
  • Deployed supervised ML models (Random Forest, Gradient Boosting) improving robotic system accuracy by 15%.
  • Built real-time anomaly detection pipelines processing 50GB+ of IoT sensor data, contributing to a 25% drop in unplanned downtime.
  • Performed root cause analysis using Granger causality, cross-correlation, and lag analysis to diagnose cascading failures.
  • Designed 6+ Power BI & Tableau dashboards integrating live model outputs with operational KPIs for engineering teams.
  • Engineered time-series features (rolling statistics, spectral decomposition) that improved F1-scores by 12% on imbalanced datasets.
15% ML accuracy ↑
25% Downtime ↓
60% Prep time ↓
6+ Dashboards
ML & Statistics: scikit-learn, XGBoost, SHAP, Causal Inference, Bayesian, UMAP, NLP
AI & LLMs: PyTorch, TensorFlow, Ollama, LangChain, RAG, Prompt Eng.
Data & SQL: Python, SQL, DuckDB, PostgreSQL, Pandas, statsmodels
Visualisation & BI: Power BI, Tableau, Streamlit, Plotly, DAX, Power Query
Cloud & Eng.: Azure, AWS, GCP, Kafka, Airflow, ETL/ELT, Git
Certifications: Power BI Associate, Azure AZ-900, Oracle OCI
Proficiency by domain
Python / SQL: 92%
Power BI / Tableau: 88%
ML / Statistics: 84%
LLMs / GenAI: 78%
Cloud / ETL: 72%

Open to full-time roles in Data Science, Data Analysis, and BI. Reach out via any of the channels below.
