Publications

You can also find my articles on my Google Scholar profile.

Selected Publications


  • "LLMs and Databases: A Synergistic Approach to Data Utilization.", IEEE Data Engineering Bulletin
  • "Is Long Context All You Need? Leveraging LLM's Extended Context for NL2SQL.", VLDB
  • "High-Fidelity And Complex Test Data Generation For Real-World SQL Code Generation Services.", arXiv
  • "CHASE-SQL: Multi-Path Reasoning and Preference Optimized Candidate Selection in Text-to-SQL.", ICLR
  • "Automated Data Slicing for Model Validation: A Big Data - AI Integration Approach.", TKDE
  • "Slice Finder: Automated Data Slicing for Model Validation.", ICDE
  • "Quantifying Uncertainty in Data Exploration.", Brown University
  • "Democratizing Data Science through Interactive Curation of ML Pipelines.", SIGMOD
  • "Towards Interactive Curation & Automatic Tuning of ML Pipelines.", MLSys
  • "Improved Neighborhood Search for Collaborative Filtering.", IJFIS
  • "Towards Quantifying Uncertainty in Data Analysis & Exploration.", IEEE Data Engineering Bulletin 2018
  • "Estimating the Impact of Unknown Unknowns on Aggregate Query Results.", TODS
  • "Unknown Examples & Machine Learning Model Generalization.", arXiv 2018
  • "Towards Interactive Data Exploration."
  • "Estimating the Impact of Unknown Unknowns on Aggregate Query Results.", SIGMOD
  • "A Data Quality Metric (DQM): How to Estimate The Number of Undetected Errors in Data Sets.", VLDB
  • "Using RDMA for Lock Management."
  • "A Behavior Analysis-Based Game Bot Detection Approach Considering Various Play Styles."
  • "Personalized Expert-Based Recommender System: Training C-SVM for Personalized Expert Identification."

Technical Reports


  • "High-Performance Llama 2 Training and Inference with PyTorch/XLA on Cloud TPUs", PyTorch Blog 2023
  • "PyTorch/XLA SPMD: Scale Up Model Training and Serving with Automatic Parallelization", PyTorch Blog 2023