Pawan Kalyan Sogana — Data Scientist · BI Analyst

Data Scientist · BI Analyst · Data Analyst

Specialising in causal inference, ML explainability, anomaly detection, and AI-powered analytics pipelines — turning raw operational data into decisions that actually get made.

✉ iwe24001@uconn.edu

Projects shipped by type (chart): SQL, ETL. 5 projects shipped.

01 Selected Projects

🔬 Model Autopsy Suite

Forensic ML debugging console — investigates why a model got a prediction wrong using SHAP attribution, nearest-neighbour forensics, confidence decomposition, and an AI-written failure report across credit risk, medical, and equipment datasets.

SHAP · ML Explainability · scikit-learn · XGBoost · Streamlit · Ollama

📡 AnomalyAI: Investigation Console

Production-grade anomaly detection across 5 correlated IoT signals using triple-layer detection (Z-score / IQR / CUSUM) and Granger causality for root cause ranking. Features an evidence timeline and AI-written incident report.

Anomaly Detection · Granger Causality · IoT Analytics · statsmodels · Plotly
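The triple-layer idea described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the project's actual code: the function names, thresholds (3 sigma, 1.5 x IQR, CUSUM slack 0.5 and limit 5), the sample trace, and the "at least two layers agree" voting rule are all assumptions chosen for the example.

```python
from statistics import mean, stdev, quantiles

def zscore_flags(xs, threshold=3.0):
    """Layer 1: flag points more than `threshold` standard deviations from the mean."""
    mu, sigma = mean(xs), stdev(xs)
    if sigma == 0:
        return [False] * len(xs)
    return [abs(x - mu) / sigma > threshold for x in xs]

def iqr_flags(xs, k=1.5):
    """Layer 2: flag points outside the Tukey fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(xs, n=4)
    lo, hi = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [x < lo or x > hi for x in xs]

def cusum_flags(xs, target, sigma, slack=0.5, limit=5.0):
    """Layer 3: two-sided tabular CUSUM on standardised values, reset after each alarm."""
    up = down = 0.0
    flags = []
    for x in xs:
        z = (x - target) / sigma
        up = max(0.0, up + z - slack)
        down = max(0.0, down - z - slack)
        alarm = up > limit or down > limit
        flags.append(alarm)
        if alarm:
            up = down = 0.0
    return flags

# Hypothetical sensor trace: steady readings with one spike at index 20.
signal = [10.0, 10.2, 9.8] * 10
signal[20] = 25.0

baseline = signal[:10]  # assumed to be anomaly-free
layers = [
    zscore_flags(signal),
    iqr_flags(signal),
    cusum_flags(signal, mean(baseline), stdev(baseline)),
]
# Count how many layers flagged each point; keep points where at least two agree.
votes = [sum(point) for point in zip(*layers)]
suspects = [i for i, v in enumerate(votes) if v >= 2]
print(suspects)
```

Requiring agreement between layers is one way to trade sensitivity for fewer false alarms: Z-score and IQR react to single outliers, while CUSUM is better suited to small sustained drifts.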

📈 CausalLens: Impact Analyzer

Bayesian structural time series tool that answers "did this business decision actually work?" — models a statistical counterfactual and measures the true causal lift of interventions with 95% credible intervals across 4 real business scenarios.

Causal Inference · BSTS · Bayesian Methods · statsmodels · Streamlit
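The counterfactual logic can be illustrated with a much simpler stand-in. The sketch below is not BSTS and produces no credible intervals: it fits an ordinary least-squares line between a control series and the treated series on the pre-intervention period, projects it forward as the "what would have happened" baseline, and reads the lift as actual minus projected. All series, names, and the intervention point are made up for the example.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx

# Hypothetical weekly metrics; the intervention happens after the first 5 points.
control = [100.0, 102.0, 104.0, 106.0, 108.0, 110.0, 112.0, 114.0]  # unaffected series
treated = [50.0, 51.0, 52.0, 53.0, 54.0, 60.0, 61.0, 62.0]          # series that received the change
pre = 5

# Learn the pre-period relationship, then project it into the post-period.
slope, intercept = fit_line(control[:pre], treated[:pre])
counterfactual = [slope * c + intercept for c in control[pre:]]

# Lift is the gap between what was observed and what the baseline predicts.
lift = [actual - expected for actual, expected in zip(treated[pre:], counterfactual)]
avg_lift = mean(lift)
print(counterfactual, avg_lift)
```

A Bayesian structural time series model replaces the straight line with a state-space model and reports the lift as a posterior distribution, which is where the 95% credible intervals come from.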

🗄️ Text-to-SQL Query Engine

Natural language interface over a multi-table relational database using live schema injection and LLM inference. Validates SQL via DuckDB EXPLAIN before execution and returns results with an AI business explanation.

LLM · NL-to-SQL · DuckDB · Prompt Engineering · Streamlit
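The "validate with EXPLAIN before executing" step might look roughly like the sketch below. To keep it dependency-free it uses the standard library's sqlite3 as a stand-in for DuckDB, so the exception types and error text differ from the real project. The table, the queries, and the helper names `schema_prompt` and `validate_sql` are hypothetical, and the LLM call that would produce the SQL is omitted.

```python
import sqlite3

def schema_prompt(con):
    """Collect the live CREATE TABLE statements to inject into an LLM prompt."""
    rows = con.execute("SELECT sql FROM sqlite_master WHERE type = 'table'").fetchall()
    return "\n".join(sql for (sql,) in rows)

def validate_sql(con, sql):
    """Ask the engine to plan the query without running it; return (ok, error)."""
    try:
        con.execute("EXPLAIN " + sql)
        return True, None
    except sqlite3.Error as exc:
        return False, str(exc)

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
con.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "EU", 120.0), (2, "US", 80.0), (3, "EU", 40.0)],
)

# Two candidate queries, as a model might generate them (both made up).
good = "SELECT region, SUM(amount) AS total FROM orders GROUP BY region ORDER BY region"
bad = "SELECT region, SUM(amount) FROM order_items GROUP BY region"  # table does not exist

print(schema_prompt(con))
print(validate_sql(con, bad))

# Only execute a query that passed planning.
ok, error = validate_sql(con, good)
rows = con.execute(good).fetchall() if ok else []
print(rows)
```

Planning the query first catches unknown tables, unknown columns, and syntax errors before any data is touched, and the error string can be fed back to the model for a retry.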

📊 AI Dashboard Narrator

Auto-charting pipeline that profiles any CSV, infers column types, routes each variable to the right Plotly chart, and uses a local LLM to generate plain-English business narratives per chart — zero cloud API required.

Generative AI · BI Automation · Pandas · Plotly · Ollama
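As a rough illustration of the profiling and routing step, the sketch below infers a coarse type for each CSV column and maps it to a chart type. It uses only the standard library rather than Pandas and Plotly, and the type rules, the `CHART_FOR` mapping, and the sample data are assumptions for the example, not the project's actual logic.

```python
import csv
import io
from datetime import datetime

def infer_kind(values):
    """Classify a column of raw CSV strings as numeric, datetime, or categorical."""
    non_empty = [v for v in values if v.strip()]
    if not non_empty:
        return "empty"

    def all_parse(parser):
        try:
            for v in non_empty:
                parser(v)
            return True
        except ValueError:
            return False

    if all_parse(float):
        return "numeric"
    if all_parse(datetime.fromisoformat):
        return "datetime"
    return "categorical"

# Hypothetical routing table: which chart to draw for each inferred kind.
CHART_FOR = {"numeric": "histogram", "datetime": "line", "categorical": "bar"}

sample = """date,region,revenue
2024-01-01,EU,120.5
2024-01-02,US,98.0
2024-01-03,EU,101.25
"""

rows = list(csv.DictReader(io.StringIO(sample)))
columns = {name: [row[name] for row in rows] for name in rows[0]}

# Columns with no usable values fall back to a plain table.
plan = {name: CHART_FOR.get(infer_kind(vals), "table") for name, vals in columns.items()}
print(plan)
```

Each entry in `plan` would then drive one chart, and a summary of that chart's data could be passed to a local model to write the narrative.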

02 Experience

JUN 2021 - JUL 2022

Data Scientist Intern

AVISHKAR TECH SOLUTIONS

Deployed supervised ML models (Random Forest, Gradient Boosting), improving robotic system accuracy by 15%.
Built real-time anomaly detection pipelines processing 50 GB+ of IoT sensor data, cutting unplanned downtime by 25%.
Performed root cause analysis using Granger causality, cross-correlation, and lag analysis to diagnose cascading failures.
Designed 6+ Power BI & Tableau dashboards integrating live model outputs with operational KPIs for engineering teams.
Engineered time-series features (rolling statistics, spectral decomposition), improving F1 scores by 12% on imbalanced datasets.
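For readers unfamiliar with rolling-statistics features, here is a minimal standard-library sketch. The window size, the sample readings, and the function name are made up, and it covers only trailing mean and standard deviation, not the spectral features mentioned above.

```python
from collections import deque
from statistics import mean, pstdev

def rolling_features(xs, window=3):
    """Trailing-window mean and population std; None until the window is full."""
    buf = deque(maxlen=window)  # oldest value drops out automatically
    out = []
    for x in xs:
        buf.append(x)
        if len(buf) < window:
            out.append((None, None))
        else:
            out.append((mean(buf), pstdev(buf)))
    return out

# Hypothetical sensor readings with a jump at the end.
readings = [1.0, 2.0, 3.0, 4.0, 10.0]
feats = rolling_features(readings, window=3)
for value, (roll_mean, roll_std) in zip(readings, feats):
    print(value, roll_mean, roll_std)
```

Because each window only looks backwards, the features at time t use no future data, which matters when they feed a model that must predict in real time.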