Christopher M. Jermaine
Professor of Computer Science
  • B.A. Math/Comp. Sci (1993) The University of California, San Diego, San Diego, CA
  • M.S. (1997) The Ohio State University, Columbus, OH
  • Ph.D. (2002) Georgia Institute of Technology, Atlanta, GA
Primary Department
   Department of Computer Science
 Professional Webpage
Research Areas
 I work on the design and development of relational database systems and applied statistical methods. I focus on using tools from probability and statistics within a database to permit flexible, scalable analysis of database data, with an emphasis on medical applications.
Selected Publications

R Myers, JR Ruiz, CM Jermaine, JC Frenzel "A Comparison of Mortality Predictors in Cancer Surgery Patients." Anesthesia & Analgesia, 123 (2016)


S. Joshi, C. Jermaine "Robust Stratified Sampling Plans for Low Selectivity Queries." Proc. IEEE ICDE (2008) : p. 471, 12 pages.

 Refereed articles

Florin Rusu, Zixuan Zhuang, Mingxi Wu, Chris Jermaine "Workload-Driven Antijoin Cardinality Estimation." ACM Transactions on Database Systems (TODS), 40 (2015) : 1-41.


Risa B Myers, Christos Lazaridis, Christopher M Jermaine, Claudia S Robertson, Craig G Rusin "Predicting intracranial pressure and brain tissue oxygen crises in patients with severe traumatic brain injury." Critical care medicine, 44 (2016) : 1754-1761.


Risa B Myers, John C Frenzel MD, Joseph R Ruiz MD, Christopher M Jermaine "Do Anesthesiologists Know What They Are Doing? Mining a Surgical Time-Series Database to Correlate Expert Assessment with Outcomes." ACM Transactions on Knowledge Discovery from Data (TKDD), 10 (2016) : 1-27.


Yun Yu, Christopher Jermaine, Luay Nakhleh "Exploring phylogenetic hypotheses via Gibbs sampling on evolutionary networks." BMC Genomics, 17 (2016) : 12-.


Zekai J Gao, Chris Jermaine "Distributed Algorithms for Computing Very Large Thresholded Covariance Matrices." ACM Transactions on Knowledge Discovery from Data (TKDD), 11 (2016) : 1-25.


Supriya Nirkhiwale, Alin Dobra, Christopher M. Jermaine: A Sampling Algebra for Aggregate Estimation. PVLDB 6(14): 1798-1809 (2013)


Graham Cormode, Minos N. Garofalakis, Peter J. Haas, Chris Jermaine: Synopses for Massive Data: Samples, Histograms, Wavelets, Sketches. Foundations and Trends in Databases 4(1-3): 1-294 (2012)


Jayendra Venkateswaran, Bin Song, Tamer Kahveci, Chris Jermaine: TRIAL: A Tool for Finding Distant Structural Similarities. IEEE/ACM Trans. Comput. Biology Bioinform. 8(3): 819-831 (2011)


Niketan Pansare, Vinayak R. Borkar, Chris Jermaine, Tyson Condie: Online Aggregation for Large MapReduce Jobs. PVLDB 4(11): 1135-1145 (2011)


Ravi Jampani, Fei Xu, Mingxi Wu, Luis Leopoldo Perez, Chris Jermaine, Peter J. Haas: The Monte Carlo database system: Stochastic analysis close to the data. ACM Trans. Database Syst. 36(3): 18 (2011)


Mingxi Wu, Chris Jermaine, Sanjay Ranka, Xiuyao Song, John Gums: A Model-Agnostic Framework for Fast Spatial Anomaly Detection. TKDD 4(4): 20 (2010)


Subi Arumugam, Ravi Jampani, Luis Leopoldo Perez, Fei Xu, Christopher M. Jermaine, Peter J. Haas: MCDB-R: Risk Analysis in the Database. PVLDB 3(1): 782-793 (2010)


Mingxi Wu, Chris Jermaine "Guessing the extreme values in a data set: a Bayesian method and its applications." VLDB J 2009, 18(2): 571-597.


Shantanu Joshi, Christopher M. Jermaine "Sampling-based estimators for subset-based queries." VLDB J 2009, 18(1): 181-202.


Abhijit Pol, Christopher M. Jermaine, Subramanian Arumugam "Maintaining very large random samples using the geometric file."  VLDB J, 17(5) (2008) : 997-1018.


Chris Jermaine, Subramanian Arumugam, Abhijit Pol, Alin Dobra "Scalable approximate query processing with the DBO engine." ACM Trans. Database Syst., 33(4) (2008)


Fei Xu, Christopher M. Jermaine, Alin Dobra "Confidence bounds for sampling-based group by estimates." ACM Trans. Database Syst., 33(3) (2008)


Jayendra Venkateswaran, Tamer Kahveci, Christopher M. Jermaine, Deepak Lachwani "Reference-based indexing for metric spaces with costly distance measures." VLDB J, 17(5) (2008) : 1231-1251.


Manas Somaiya, Christopher M. Jermaine, Sanjay Ranka "Learning correlations using the mixture-of-subsets model." ACM Trans. Know. Disc. and Data Mining, 1(4) (2008)


S. Joshi, C. Jermaine "Materialized Sample Views for Database Approximation." IEEE Trans. Knowl. Data Eng., 20 (3) (2008) : p. 337, 14 pages.


C. Jermaine "Online Random Shuffling of Large Database Tables." IEEE Trans. Knowl. Data Eng., 19 (1) (2007) : p. 73, 12 pages + 3 page appendix.


X. Song, M. Wu, C. Jermaine, S. Ranka "Conditional Anomaly Detection." IEEE Trans. Knowl. Data Eng., 19 (5) (2007) : p. 631, 15 pages.


C. Jermaine, E. Omiecinski, W. G. Yee "The Partitioned Exponential File for Database Storage Management." VLDB Journal, 16 (4) (2007) : p. 417, 21 pages.


C. Jermaine, A. Dobra, S. Arumugam, S. Joshi, A. Pol "The Sort-Merge-Shrink join." ACM Trans. Database Syst., 31 (4) (2006) : p. 1382, 35 pages + 10 page appendix.


C. Jermaine "Finding the Most Interesting Correlations in a Database: How Hard Can It Be?" Information Systems, 30 (1) (2005) : p. 21, 25 pages.


W.G. Yee, S.B. Navathe, E. Omiecinski, C. Jermaine "Efficient Data Allocation over Multiple Channels at Broadcast Servers." IEEE Trans. Computers, 51 (10) (2002) : p. 1231, 6 pages.

 Editor of books

Chris Jermaine (Editor) "Special Issue on Approximate Query Processing and Applications." Bulletin of the Technical Committee on Data Engineering, 38 (2015) : 1-112.


Christian S Jensen, Christopher Jermaine, Xiaofang Zhou (Editors) "Special section on the international conference on data engineering (Edited Issue)." IEEE Transactions on Knowledge and Data Engineering, 27 (2015)


Christopher Jermaine, Editor: Databases, Declarative Systems and Machine Learning. The Bulletin of the IEEE Technical Committee on Data Engineering, September, 2014.


Christian S. Jensen, Christopher M. Jermaine, Xiaofang Zhou (Eds.): 29th IEEE International Conference on Data Engineering, ICDE 2013, Brisbane, Australia, April 8-12, 2013. IEEE Computer Society 2013, ISBN 978-1-4673-4909-3

 Refereed conference papers

Risa Myers, John Frenzel, Joseph Ruiz, Christopher M. Jermaine "Correlating Surgical Vital Sign Quality with 30-Day Outcomes Using Regression on Time-Series Segment Features." Proc. SDM 2015: 11.


Yanxin Lu, Joe Warren, Christopher M. Jermaine, Swarat Chaudhuri, Scott Rixner "Grading the Graders: Motivating Peer Graders in a MOOC." Proc. WWW 2015: 11.


Anna Drummond and Christopher Jermaine:  Senders, Receivers and Authors in Document Classification. In Proc. IEEE International Conference on Data Mining 2014 (ICDM).


Anna Drummond, Yanxin Lu, Swarat Chaudhuri, Christopher Jermaine, Scott Rixner, and Joe Warren:  Learning to Grade Student Programs in a Massive Open Online Course. In Proc. IEEE International Conference on Data Mining 2014 (ICDM).


Luis Leopoldo Perez*, Christopher M. Jermaine: History-Driven Query Optimization with Materialized Intermediate Views. Proc. IEEE ICDE, 2014, 12 pages


Zhuhua Cai*, Luis Leopoldo Perez*, Shangyu Luo*, Jacob Gao*, Zografoula Vagena*, Christopher M. Jermaine: An Experimental Comparison of Platforms for Implementing and Executing Large-Scale Machine Learning Codes. Proc. ACM SIGMOD 2014, 12 pages.


Anna Drummond, Chris Jermaine, Zografoula Vagena: Topic Models For Feature Selection in Document Clustering. SDM 2013: 521-529


B. Bue and C. Jermaine: Multiclass Domain Adaptation with Iterative Manifold Alignment. In Proc. IEEE WHISPERS 2013, 4 pages


Zhuhua Cai*, Christopher M. Jermaine, Zografoula Vagena*, Dionysios Logothetis, Luis Leopoldo Perez*: The Pairwise Gaussian Random Field for High-Dimensional Data Imputation. In Proc. ICDM 2013: 61, 10 pages.


Zhuhua Cai*, Zografoula Vagena*, Luis Leopoldo Perez*, Subramanian Arumugam*, Peter J. Haas, Christopher M. Jermaine: Simulation of Database-valued Markov Chains Using SimSQL. In Proc. ACM SIGMOD 2013: 637, 12 pages


Niketan Pansare, Chris Jermaine, Peter J. Haas, Nitendra Rajput: Topic Models over Spoken Language. ICDM 2012: 1062-1067


Zhuhua Cai, Chris Jermaine: The Latent Community Model for Detecting Sybils in Social Networks. NDSS 2012, 10 pages


Alin Dobra, Chris Jermaine, Florin Rusu, Fei Xu "Turbo-Charging Estimate Convergence in DBO." Proc. VLDB, 2010, 1: 419-430.


Fei Xu, Ravi Jampani, Mingxi Wu, Chris Jermaine, Tamer Kahveci: Surrogate ranking for very expensive similarity queries. ICDE 2010: 848-859


Luis Leopoldo Perez, Subi Arumugam, Christopher M. Jermaine: Evaluation of probabilistic threshold queries in MCDB. SIGMOD 2010: 687-698


Manas Somaiya, Christopher M. Jermaine, Sanjay Ranka: Mixture models for learning low-dimensional roles in high-dimensional data. KDD 2010: 909-918


Subi Arumugam, Alin Dobra, Christopher M. Jermaine, Niketan Pansare, Luis Leopoldo Perez: The DataPath system: a data-centric analytic processing engine for large data warehouses. SIGMOD Conference 2010: 519-53


Mingxi Wu, Xiuyao Song, Chris Jermaine, Sanjay Ranka, John Gums "A LRT Framework for Fast Spatial Anomaly Detection." Proc. ACM KDD 2009 (Runner-up Best Paper Award), p 887: 9.


F. Rusu, F. Xu, L.L. Perez, M. Wu, R. Jampani, C. Jermaine, A. Dobra "The DBO Database System." Proc. SIGMOD (2008) : p. 1223, 4 pages.


R. Jampani, F. Xu, M. Wu, L. L. Perez, C.M. Jermaine, P.J. Haas "MCDB: a Monte Carlo approach to Managing Uncertain Data." Proc. SIGMOD (2008) : p. 687, 14 pages.


X. Song, C. Jermaine, S. Ranka, J. Gums "A Bayesian Mixture Model with Linear Regression Mixing Proportions." Proc. ACM KDD (2008) : 9.


C. Jermaine, S. Arumugam, A. Pol, A. Dobra "Scalable Approximate Query Processing with the DBO Engine." Proc. ACM SIGMOD (2007) : p. 725, 12 pages.


F. Xu, C. Jermaine "Randomized Algorithms for Data Reconciliation in Wide Area Aggregate Query Processing." Proc. VLDB (2007) : p 639, 12 pages.


M. Wu, C. Jermaine "A Bayesian Method for Guessing the Extreme Values in a Data Set." Proc. VLDB (2007) : p. 471, 12 pages.


X. Song, M. Wu, C. Jermaine, S. Ranka "Statistical Change Detection for Multidimensional Data." Proc. ACM KDD (2007) : p. 667, 10 pages.


J. Venkateswaran, D. Lachwani, T. Kahveci, C. Jermaine "Reference-based Indexing of Sequence Databases." Proc. VLDB (2006) : p. 906, 12 pages.


M. Wu, C. Jermaine "Outlier Detection by Sampling with Accuracy Guarantees." Proc. ACM KDD (2006) : p. 767, 6 pages.


R. Jin, L. Glimcher, C. Jermaine, G. Agrawal "New Sampling-Based Estimators for OLAP Queries." Proc. IEEE ICDE (2006) : p. 86, 10 pages.


S. Arumugam, C. Jermaine "Closest-Point-of-Approach Join for Moving Object Histories." Proc. IEEE ICDE (2006) : p. 18, 10 pages.


A. Pol, C. Jermaine "Relational Confidence Bounds Are Easy With The Bootstrap." Proc. ACM SIGMOD (2005) : p. 587, 12 pages.


C. Jermaine, A. Dobra, A. Pol "Online Estimation for Subset-Based SQL Queries." Proc. VLDB (2005) : p. 745, 12 pages.


C. Jermaine, A. Dobra, S. Arumugam, S. Joshi, A. Pol "A Disk-Based Join With Probabilistic Guarantees." Proc. ACM SIGMOD (2005) : p. 563, 12 pages.


C. Jermaine, A. Pol, S. Arumugam "Online Maintenance of Very Large Random Samples." Proc. ACM SIGMOD (2004) : p. 299, 12 pages.


C. Jermaine "Playing Hide-and-Seek With Correlations." Proc. ACM KDD (2003) : p. 242, 6 pages.


C. Jermaine "Robust Estimation With Sampling and Approximate Pre-Aggregation." Proc. VLDB (2003) : p. 235, 12 pages.


C. Jermaine, E. Omiecinski "Lossy Reduction for Very High Dimensional Data." Proc. IEEE ICDE (2002) : p. 663, 10 pages.


W.G. Yee, S.B. Navathe, E. Omiecinski, C. Jermaine "Bridging Response Time and Energy-Efficiency in Broadcast Schedule Design." Proc. EDBT (2002) : p. 572, 18 pages.


C. Jermaine "The Computational Complexity of High-Dimensional Correlation Search." Proc. IEEE ICDE (2001) : p. 249, 7 pages.


C. Jermaine, E. Omiecinski, W.G. Yee "Maintaining a Large Spatial Database with T2SM." Proc. ACM GIS (2001) : p. 100, 6 pages.


C. Jermaine "Computing Program Modularizations with the k-Cut Method." Proc. WCRE (1999) : p. 224, 11 pages.


C. Jermaine, A. Datta, E. Omiecinski "A Novel Index Supporting High Volume Data Warehouse Insertion." Proc. VLDB (1999) : p. 235, 11 pages.

 Seminar Speaker

"Large Scale Machine Learning with the SimSQL System," University of Wisconsin, Madison


"Large Scale Machine Learning with the SimSQL System," University of Michigan


"Large Scale Bayesian Machine Learning with the SimSQL System," University of Edinburgh, UK


"Large Scale Bayesian Machine Learning with the SimSQL System," EPFL, Switzerland

Editorial Positions
 Associate Editor, Proceedings of the Very Large Database Endowment. (2016 - 2016)

 Associate Editor, IEEE Transactions on Knowledge and Data Engineering. (2008 - 2013)

 Associate Editor, ACM Transactions on Database Systems. (2010 - 2015)

 Associate Editor, Very Large Database Journal. (2007 - 2013)

 Other, ACM SIGMOD Tutorials Track. (2014 - 2014)

 Other, IEEE ICDE Conference Technical Program. (2013 - 2013)

 Other, ACM SIGMOD Demonstrations Track. (2011 - 2011)

 Other, VLDB PhD Workshop. (2013 - 2013)

 Other, VLDB Workshop Program. (2011 - 2011)

Supervised Theses & Dissertations
 Laukik Chitnis, Ph.D. n/a. (2008) (Committee Member)

 Mark McKenney, Ph.D. n/a. (2008) (Committee Member)

 Alexandra Martinez, Ph.D. n/a. (2008) (Committee Member)

 Jun Liu, Ph.D. n/a. (2008) (Committee Member)

 Hyun Jung Park, Doctoral Towards Accurate Reconstruction of Phylogenetic Networks. (2012) (Committee Member)

 Yun Yu, Master of Science From Gene Trees to Species Trees: Algorithms for Parsimonious Reconciliation. (2012) (Committee Member)

 Anna Drummond, Doctoral Statistical Machine Learning for Text Mining with Markov Chain Monte Carlo Inference. (2014) (Committee Member)

 Niketan Pansare, PhD Large-Scale Online Aggregation Via Distributed Systems. (2014) (Thesis or Dissertation Director)

 Zekai Gao, Masters Distributed Algorithms for Computing Very Large Thresholded Covariance Matrices. (2014) (Thesis or Dissertation Director)

 Ruiqi Liu, MS Performance Analysis and Configuration Selection for Applications in the Cloud. (2014) (Committee Member)

 Rahul Kumar, MS-Thesis Learning to Highlight Relevant Text in Binary Classified. (2014) (Thesis or Dissertation Director)

 Troy Ruths, PhD Population Regulomics: Applying population genetics to the cis-regulome. (2014) (Committee Member)

 Yanxin Lu, MS Improving the efficiency & Quality of Massive Open Online Course by using the Automated Grading & Feedback generation. (2014) (Committee Member)

 Anna Drummond, PhD Extension of Topic Models in Text Analysis. (2014) (Thesis or Dissertation Director)

 Yun Yu, Doctoral Models and Methods for Evolutionary Histories Involving Hybridization and Incomplete Lineage Sorting. (2014) (Committee Member)

 Troy Ruths, Doctoral Population Regulomics: Applying population genetics to the cis-regulome. (2014) (Committee Member)

 Zhuhua Cai, Doctoral Very Large Scale Bayesian Machine Learning. (2014) (Committee Member)

 Anhei Shu, Masters Data Mining of Chinese Social Media. (2014) (Committee Member)

 Luis Perez, Doctoral Query Processing and Optimization for Stochastic Analytics. (2015) (Thesis or Dissertation Director)

 Yiting Xia, PhD Control Plane Design and Performance Analysis for Optical Multicast-capable Datacenter Networks. (2016) (Committee Member)

Awards, Prizes, & Fellowships
 (T+R)^2 Award, Rice George R. Brown School of Engineering (May, 2016)

 Alfred P. Sloan Foundation Research Fellowship, (2008-2010)

 ACM SIGMOD Conference Best Paper Award, (2007)

 National Science Foundation CAREER Award, NSF (2004)

Positions Held
 Associate Editor, Very Large Database Journal. (2011 - 2011)

 Program Committee Member, ACM SIGMOD 2012. (2012 - 2013)

 Program Co-Chair, 2013 IEEE ICDE Conference Technical Program. (2013 - 2013)

 Program Committee member, ACM SIGMOD 2013. (2013 - 2013)

 Program Committee Member, SIAM Data Mining 2013. (2013 - 2013)

 Program Committee member, ACM KDD 2013. (2013 - 2013)

 Program Committee Member, VLDB 2012. (2013 - 2013)

 Program Committee Member, IEEE ICDM 2012. (2012 - 2012)

 Program Committee Member, ACM SoCC 2013. (2013 - 2013)

 Associate Editor, IEEE Transactions on Knowledge and Data Engineering. (2011 - 2011)

 Associate Editor, ACM Transactions on Database Systems. (2011 - 2011)

 Program Committee Member, CIKM 2012. (2012 - 2012)

 Program Committee Member, VLDB. (2004 - 2004)

 Program Committee Member, VLDB. (2006 - 2006)

 Program Committee Member, VLDB. (2008 - 2008)

 Program Committee Member, VLDB. (2005 - 2005)

 Program Committee Member, VLDB. (2007 - 2007)

 Program Committee Member, ACM KDD. (2007 - 2007)

 Program Committee Member, ACM KDD. (2005 - 2005)

 Program Committee Member, ACM KDD. (2004 - 2004)

 Program Committee Member, ACM SIGMOD. (2006 - 2006)

 Program Committee Member, ACM KDD. (2008 - 2008)

 Program Committee Member, ACM KDD. (2006 - 2006)