Publications

You can also find my articles on my Google Scholar profile.

Selected Publications


  • "Is Long Context All You Need? Leveraging LLM's Extended Context for NL2SQL", VLDB 2025
  • "CHASE-SQL: Multi-path reasoning and preference optimized candidate selection in Text-to-SQL", ICLR 2025
  • "Democratizing data science through interactive curation of ML pipelines", SIGMOD 2019
  • "Slice Finder: Automated data slicing for model validation", ICDE 2019
  • "Estimating the impact of unknown unknowns on aggregate query results", SIGMOD 2016 / TODS 2018 [Extended]
  • "Towards Quantifying Uncertainty in Data Analysis & Exploration", IEEE Data Engineering Bulletin 2018
  • "Unknown examples & machine learning model generalization", arXiv 2018
  • "A Data Quality Metric (DQM): How to Estimate The Number of Undetected Errors in Data Sets", VLDB 2017
  • "Game bot detection approach based on behavior analysis and consideration of various play styles", ETRI Journal 2013

Technical Reports


  • "High-Performance Llama 2 Training and Inference with PyTorch/XLA on Cloud TPUs", PyTorch Blog 2023
  • "PyTorch/XLA SPMD: Scale Up Model Training and Serving with Automatic Parallelization", PyTorch Blog 2023