F1 Prediction Engine
A triple-model ensemble (XGBoost, Monte Carlo, Bayesian) predicting the 2026 F1 era with 38.9% accuracy.
Data Scientist · Designer · Builder
Building things that live at the intersection of
data, design, and human experience.

Open to opportunities
About
During my first year of engineering, two people I love survived sudden cardiac arrest. Watching a cardiologist trace every millisecond on an ECG print-out — and pinpoint the exact moment each heart faltered — was a revelation: data, interpreted with precision, saves lives.
Those late-night explorations into HMMs, SVMs, and arrhythmia detection set my compass. Coming from a family of professors, I learned that knowledge is worthless unless it's shared for social good. Today I'm a data-science practitioner who enjoys connecting the dots — ideas across disciplines, people across teams, applications across industries.
With 4+ years of experience translating complex datasets into clear actions for stakeholders in telecom, logistics, and e-commerce, I'm now pursuing an M.S. in Applied Data Science at SJSU. I'm sharpening my statistical rigor and entrepreneurial vision toward a healthcare-focused analytics startup that empowers NGOs to improve outcomes for underserved communities.
Research
Architected a fault-tolerant LangGraph multi-agent pipeline with GPT-4 tool-calling for geospatial data, analyzing structural inequity in Large Language Models (LLMs) for non-English scripts. Optimized downstream causality pipelines using PyTorch and R, directly improving risk analytics precision by 15%.
"Authored comprehensive research on LLMs & engineered multi-agent risk ingestion pipelines"
Internship
Fine-tuned NASA Prithvi Vision Transformer via MAE pretraining in PyTorch for high-risk satellite detection. Architected scalable, cloud-native REST APIs on AWS EC2 & Lambda with Kubernetes orchestration and Prometheus observability, cutting spatial inference latency by 25%.
"Fine-tuned Vision Transformers processing 2TB daily via robust AWS/Kubernetes pipelines"
Full-Time
Developed Kafka-backed event-driven microservices on Azure Databricks with exactly-once semantics. Deployed RESTful XGBoost inference APIs via blue-green deployments, slashing infrastructure costs by $400K and reducing unplanned operational failures by 20%.
"Designed event-driven Azure tracking algorithms for 1.8M IoT signals & ML optimization"
Full-Time
Built low-latency RESTful APIs in Python/C++ for production messaging architectures. Engineered stateful stream processing via Kafka, refactored PostgreSQL materialized views to cut query latency by 28%, and deployed zero-downtime microservices using GitLab CI/CD and Docker.
"Stabilized concurrent PostgreSQL databases handling 15k+ req/sec & modernized CI/CD"
Work
A triple-model ensemble (XGBoost, Monte Carlo, Bayesian) predicting the 2026 F1 era with 38.9% accuracy.
Quantifying the structural 'Token Tax' and economic inequality disadvantaging non-English languages in global LLMs.
An interactive data visualization platform that transforms raw datasets into cinematic 3D narratives.
Research & Engineering
Season opener under the biggest regulatory reset in F1 history. XGBoost loved Hadjar at 35.2%; Monte Carlo gave Russell 75.8% from pole physics. The ensemble resolved at 38.9% — and documented three V1 failures for V2.
Post-Australia recalibration tightened the model's variance. Antonelli's first career pole at 19, Verstappen buried in P14 from an ERS error. Mercedes 1-2 probability: 84.5%. The Suzuka circuit amplifies what qualifying says.
First sprint weekend of 2026. V2 rebuilt with live sprint race results as the highest-weight input layer. Hamilton re-rated to 21.2% after his V1 underestimation. The fp1_to_quali_divergence feature corrected a systematic blind spot.
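As a rough illustration of how per-model win probabilities can be blended into a single ensemble figure, here is a minimal sketch. It assumes a plain weighted average followed by renormalization; the function name, the equal weights, and the placeholder drivers and numbers are all hypothetical and are not taken from the actual V1 or V2 models.

```python
def blend_win_probabilities(model_probs, weights):
    """Weighted average of per-driver win probabilities, renormalized to sum to 1.

    model_probs: {model_name: {driver: probability}}
    weights:     {model_name: non-negative weight}  (hypothetical values)
    """
    total_weight = sum(weights[name] for name in model_probs)
    drivers = set()
    for probs in model_probs.values():
        drivers.update(probs)

    blended = {}
    for driver in drivers:
        # A driver missing from one model's output is treated as probability 0.
        weighted_sum = sum(
            weights[name] * probs.get(driver, 0.0)
            for name, probs in model_probs.items()
        )
        blended[driver] = weighted_sum / total_weight

    norm = sum(blended.values())
    return {driver: p / norm for driver, p in blended.items()}


# Placeholder inputs for illustration only (not real predictions).
example_probs = {
    "xgboost": {"A": 0.5, "B": 0.3, "C": 0.2},
    "monte_carlo": {"A": 0.2, "B": 0.6, "C": 0.2},
    "bayesian": {"A": 0.3, "B": 0.3, "C": 0.4},
}
example_weights = {"xgboost": 1.0, "monte_carlo": 1.0, "bayesian": 1.0}
ensemble = blend_win_probabilities(example_probs, example_weights)
```

Renormalizing at the end keeps the output a valid probability distribution even when one model omits a driver.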
Kannada requires 3-5× more tokens than English for equivalent semantic content. Quantifying the Token Tax, the $262K-$365K annual cost premium for Hindi API services, and the three architectural bottlenecks that cascade from tokenization.
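The arithmetic behind a token-based cost premium can be sketched as follows. This is a simplified model that assumes API cost scales linearly with token count; the function names, traffic volume, and price per million tokens are hypothetical placeholders, not the figures behind the $262K-$365K estimate.

```python
def token_ratio(tokens_other_language, tokens_english):
    """How many times more tokens a language needs for the same content."""
    return tokens_other_language / tokens_english


def annual_cost_premium(monthly_english_tokens, ratio, usd_per_million_tokens):
    """Extra yearly spend caused by the additional tokens alone.

    Assumes the same content would cost `monthly_english_tokens` in English
    and `ratio` times that in the other language.
    """
    extra_tokens_per_year = monthly_english_tokens * (ratio - 1) * 12
    return extra_tokens_per_year / 1_000_000 * usd_per_million_tokens


# Illustrative values only: 50M English-equivalent tokens per month,
# a 4x token ratio, and a price of $2 per million tokens.
ratio = token_ratio(tokens_other_language=400, tokens_english=100)
premium = annual_cost_premium(50_000_000, ratio, 2.0)
```

With these placeholder inputs the premium is the cost of the 3x surplus tokens over twelve months, which is the part of the bill an English-language service would not pay.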
A privacy-first, multimodal AI system built at TreeHacks 2026. MediaPipe arm-drift + facial asymmetry detection fused with a PubMed-grounded RAG pipeline feeding Claude 3 for neurologic speech grading — all from a 60-second smartphone video.
Academic Work
Vishnu S. Pendyala, Mayank Kapadia, Basanth Periyapatna Roopa Kumar, Manav Anandani, Nischitha Nagendran
A dual big-data pipeline for real-time pandemic triage combining a streaming epidemiological risk prediction system (XGBoost + Bloom filter pre-screening on 3M+ CDC COVID-19 records) with a chest X-ray classification pipeline (EfficientNet-B0 + Grad-CAM). A lightweight GPT-based reasoning layer generates auditable ALERT/FLAG/LOG triage comments. CTGAN validates streaming robustness under synthetic load. Provides scalable, explainable, near-real-time decision support for public health readiness.
XGBoost Minority F1
0.76
Chest X-Ray Accuracy
99.5%
CDC Records Processed
3M+
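For readers unfamiliar with Bloom filter pre-screening, here is a minimal standard-library sketch. It assumes the filter is used to cheaply skip record IDs that have already been seen before they reach the classifier; that usage, the class design, the sizing parameters, and the record IDs are illustrative assumptions rather than the paper's implementation.

```python
import hashlib


class BloomFilter:
    """Probabilistic set: no false negatives, but false positives are possible."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Double hashing: derive k bit positions from two 64-bit hash values.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd so the step is never zero
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def prescreen(record_ids, bloom):
    """Yield only IDs the filter has not seen, recording each one as it passes."""
    for record_id in record_ids:
        if record_id not in bloom:
            bloom.add(record_id)
            yield record_id


# Hypothetical record IDs, for illustration only.
bloom = BloomFilter(num_bits=10_000, num_hashes=4)
passed = list(prescreen(["rec-1", "rec-2", "rec-1"], bloom))
```

Because a Bloom filter can return false positives, a small fraction of genuinely new records may be skipped; whether that trade-off is acceptable depends on the bit-array size and the number of hash functions chosen.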
Mayank Kapadia, Basanth Periyapatna Roopa Kumar, Nischitha Nagendran
A multimodal generative AI framework combining histopathology image classification, clinical note classification via ClinicalBERT with SFT/TAPT/LoRA fine-tuning, and prompt-driven clinical captioning using RAG. Establishes baselines on PatchCamelyon (262K images) and curates TCGA BRCA clinical notes into a balanced dataset of 2,380 reports for unified multimodal cancer diagnostics.
ViT-Base/16 ROC-AUC
0.9601
ClinicalBERT LoRA F1
0.94
PCam Test Images
32,768
Basanth Periyapatna Roopa Kumar, Nischitha Nagendran, Nandhakumar Apparsamy, Nitya Rondla
A deep learning framework for automated wildfire detection in Landsat-8 satellite imagery. Combines ResNet34-UNet with Convolutional Block Attention Modules (CBAM) across all 10 spectral bands, introduces soft labels from multi-annotator consensus to capture boundary uncertainty, and applies temperature scaling for calibrated probability estimates ready for operational deployment.
Mean IoU
69.6%
vs. Classical Baselines
+45.6%
ECE Reduction
86.1%
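Temperature scaling and Expected Calibration Error (ECE) can be sketched in a few lines. The version below assumes binary (fire vs. no-fire) logits, hard labels, and a simple grid search over the temperature; the function names, the grid, and the toy data are hypothetical, and the paper's per-pixel, soft-label setup may differ.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def negative_log_likelihood(logits, labels, temperature):
    """Mean binary cross-entropy of the temperature-scaled logits."""
    eps = 1e-12
    total = 0.0
    for z, y in zip(logits, labels):
        p = sigmoid(z / temperature)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(logits)


def fit_temperature(logits, labels, candidates=None):
    """Pick the temperature that minimizes NLL on held-out data (grid search)."""
    if candidates is None:
        candidates = [0.5 + 0.1 * i for i in range(46)]  # roughly 0.5 to 5.0
    return min(candidates, key=lambda t: negative_log_likelihood(logits, labels, t))


def expected_calibration_error(probs, labels, num_bins=10):
    """Binary ECE: weighted gap between confidence and accuracy per confidence bin."""
    bins = [[] for _ in range(num_bins)]
    for p, y in zip(probs, labels):
        confidence = max(p, 1.0 - p)
        correct = 1.0 if (p >= 0.5) == (y == 1) else 0.0
        index = min(int(confidence * num_bins), num_bins - 1)
        bins[index].append((confidence, correct))
    ece = 0.0
    for members in bins:
        if members:
            avg_conf = sum(c for c, _ in members) / len(members)
            accuracy = sum(a for _, a in members) / len(members)
            ece += len(members) / len(probs) * abs(accuracy - avg_conf)
    return ece


# Toy validation set: overconfident logits that are right only 75% of the time.
val_logits = [4.0, 4.0, 4.0, 4.0, -4.0, -4.0, -4.0, -4.0]
val_labels = [1, 1, 1, 0, 0, 0, 0, 1]

temperature = fit_temperature(val_logits, val_labels)
ece_before = expected_calibration_error([sigmoid(z) for z in val_logits], val_labels)
ece_after = expected_calibration_error(
    [sigmoid(z / temperature) for z in val_logits], val_labels
)
```

Dividing logits by a temperature above 1 softens overconfident probabilities without changing which class is predicted, so accuracy is unaffected while the calibration gap shrinks.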
Anshu Reddy Dhamana, Basanth Periyapatna Roopa Kumar, Manav Rajesh Anandani, Nischitha Nagendran, Nitya Rondla, Srithareddy Devireddy, Vinuthna Papana
A knowledge graph-based traffic optimization system for the San Francisco Bay Area fusing traffic flow, weather, incidents, and events into a unified semantic graph. Node2vec embeddings power a stacking ensemble achieving 92% prediction accuracy with the connects_to relationship, delivering actionable insights into weekday vs. weekend traffic patterns for city planning and commuter support.
Prediction Accuracy
92%
Graph Relationship Types
6
Graph Nodes
550+
Library
Books read & screens watched in 2025. Hover to read my take.

Don Norman
Design
Yuval Noah Harari
History
Rick Rubin
Creative
Peter Thiel
Business
Daniel Kahneman
Psychology
Eric Jorgenson
Philosophy
Austin Kleon
Creative
Quotes
The people who are crazy enough to think they can change the world are the ones who do.
Steve Jobs
Simplicity is the ultimate sophistication.
Leonardo da Vinci
You don't have to be great to start, but you have to start to be great.
Zig Ziglar
The best way to predict the future is to create it.
Peter Drucker
Stay hungry. Stay foolish.
Stewart Brand
Whole Earth Catalog
Get in touch
Have a project in mind? A collaboration idea? Or just want to say hi?
My inbox is always open.