Dr. Jun Huan is an internationally recognized investigator in data science. He has published more than 130 peer-reviewed papers in leading conferences and journals. He was a recipient of the US National Science Foundation Faculty Early Career Development Award in 2009. His group won the Best Student Paper Award at the IEEE International Conference on Data Mining in 2011 and the Best Paper Award (runner-up) at the ACM International Conference on Information and Knowledge Management in 2009. Dr. Huan’s work has appeared in media publications including Science Daily, R&D Magazine, and EurekAlert. He regularly serves on the program committees of top-tier international conferences on machine learning, data mining, big data, and bioinformatics. Before joining Baidu Research, Dr. Huan served as a program director in the Information and Intelligent Systems division at the National Science Foundation. Before that, he was the Spahr Professor in EECS at the University of Kansas. Google Scholar: https://scholar.google.com/citations?user=9X2ThuAAAAAJ&hl=en&oi=sra
Interactions Modeling in Multi-Task Multi-View Learning with Consistent Task Diversity
Xiaoli Li, Jun Huan
Multi-task Multi-view (MTMV) learning has recently undergone noticeable development for dealing with heterogeneous data. To exploit information from both related tasks and related views, a common strategy is to model task relatedness and view consistency separately. The drawback of this strategy is that it does not consider the interactions between tasks and views. To remedy this, we propose a novel method, racBFA, which adds rank constraints to asymmetric bilinear factor analyzers (aBFA). We then adapt racBFA to the MTMV learning problem and design a new MTMV learning algorithm, racMTMV. We evaluated racMTMV on 3 real-world data sets, and the experimental results demonstrate the effectiveness of our proposed method.
Data Science and Data Mining, Machine Learning and Deep Learning
DBSDA: Lowering the Bound of Misclassification Rate for Sparse Linear Discriminant Analysis via Model Debiasing
Haoyi Xiong, Wei Cheng, Wenqing Hu, Jiang Bian, Zeyi Sun, and Zhishan Guo
Linear discriminant analysis (LDA) is a well-known technique for linear classification, feature extraction, and dimension reduction. To improve the accuracy of LDA under high dimension low sample size (HDLSS) settings, shrunken estimators, such as Graphical Lasso, can be used to strike a balance between bias and variance. Although a sparsity-inducing estimator obtains a faster convergence rate, the introduced bias may also degrade performance. In this paper, we theoretically analyze how the sparsity and the convergence rate of the precision matrix (also known as the inverse covariance matrix) estimator affect classification accuracy, by proposing an analytic model of the upper bound on the LDA misclassification rate.
Machine Learning and Deep Learning
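As an illustration of the setting this paper analyzes, the sketch below builds an LDA classifier whose precision matrix is estimated with scikit-learn's Graphical Lasso on synthetic two-class data. This is a minimal sketch of sparse LDA in general, not of the DBSDA debiasing procedure itself; the data, the regularization strength `alpha=0.1`, and all variable names are illustrative assumptions.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)

# Synthetic two-class Gaussian data with a shared identity covariance
n, d = 40, 20
mu0, mu1 = np.zeros(d), np.full(d, 0.8)
X0 = rng.normal(mu0, 1.0, size=(n, d))
X1 = rng.normal(mu1, 1.0, size=(n, d))

# Sparse precision-matrix estimate: Graphical Lasso shrinks small partial
# correlations toward zero, trading extra bias for a faster convergence rate
X_pooled = np.vstack([X0 - X0.mean(axis=0), X1 - X1.mean(axis=0)])
theta = GraphicalLasso(alpha=0.1).fit(X_pooled).precision_

# LDA rule with the sparse precision: assign class 1 iff w^T (x - midpoint) > 0
m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
w = theta @ (m1 - m0)
midpoint = (m0 + m1) / 2

def predict(x):
    return int(w @ (x - midpoint) > 0)

acc = np.mean([predict(x) == 0 for x in X0] + [predict(x) == 1 for x in X1])
print(f"training accuracy: {acc:.2f}")
```

The bias/variance trade-off the abstract discusses lives in `alpha`: larger values give a sparser, more biased precision estimate with lower variance, which is exactly the tension the paper's misclassification-rate bound quantifies.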
An Efficient Algorithm for Graph Edit Distance Computation, Knowledge-Based Systems
Xiaoyang Chen, Hongwei Huo, Jun Huan, Jeffrey Scott Vitter
The graph edit distance (GED) is a well-established distance measure widely used in many applications, such as bioinformatics, data mining, pattern recognition, and graph classification. However, existing solutions for computing the GED suffer from several drawbacks: large search spaces, excessive memory requirements, and many expensive backtracking calls. In this paper, we present BSS_GED, a novel vertex-based mapping method that calculates the GED in a reduced search space created by identifying invalid and redundant mappings. BSS_GED employs the beam-stack search paradigm, a widely utilized search algorithm in AI, combined with two specially designed heuristics to improve the GED computation, achieving a trade-off between memory utilization and expensive backtracking calls.
Machine Learning and Deep Learning
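To make the measure concrete, the sketch below computes the GED of two tiny graphs with NetworkX's exhaustive-search implementation. This only illustrates what GED means, not the BSS_GED algorithm; the exhaustive search's combinatorial blow-up on larger graphs is precisely the search-space problem BSS_GED is designed to reduce.

```python
import networkx as nx

# Two small graphs: a triangle and a 3-node path differ by a single edge
G1 = nx.Graph([(0, 1), (1, 2), (2, 0)])
G2 = nx.Graph([(0, 1), (1, 2)])

# NetworkX computes the exact GED by searching over node mappings, with
# unit cost per node/edge insertion, deletion, or substitution
d = nx.graph_edit_distance(G1, G2)
print(d)  # 1.0: deleting one edge turns the triangle into the path
```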
Uniprocessor Mixed-Criticality Scheduling with Graceful Degradation by Completion Rate
Zhishan Guo, Kecheng Yang, Sudharsan Vaidhun, Samsil Arefin, Sajal K. Das, and Haoyi Xiong
The scheduling of mixed-criticality (MC) systems with graceful degradation is considered, where LO-criticality tasks are guaranteed some service in HI mode in the form of minimum cumulative completion rates. First, we present an easy-to-implement admission-control procedure to determine which LO-criticality jobs to complete in HI mode. Then, we propose a demand-bound-function-based MC schedulability test that runs in pseudo-polynomial time for such systems under EDF-VD scheduling, wherein two virtual-deadline-setting heuristics are considered. Furthermore, we discuss a mechanism for the system to switch back from HI to LO mode and quantify the maximum time such a recovery process would take. Finally, we show the effectiveness of our proposed method by experimental evaluation in comparison to state-of-the-art MC schedulers.
Machine Learning and Deep Learning
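For context on the EDF-VD baseline this paper builds on, the sketch below implements the classic utilization-based EDF-VD schedulability test for implicit-deadline sporadic tasks. This is the simpler textbook test, not the paper's demand-bound-function test, and it omits the graceful-degradation extension (here LO tasks get no HI-mode guarantee); the task set and all names are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class Task:
    c_lo: float    # LO-mode worst-case execution time
    c_hi: float    # HI-mode WCET (equals c_lo for LO-criticality tasks)
    period: float  # implicit-deadline sporadic task: deadline == period
    hi_crit: bool  # True for HI-criticality tasks

def edf_vd_test(tasks):
    """Classic utilization-based EDF-VD test. Returns (schedulable, x),
    where x <= 1 scales HI tasks' virtual deadlines to x * period so a
    mode switch leaves enough slack to serve the extra HI-mode demand."""
    u_lo_lo = sum(t.c_lo / t.period for t in tasks if not t.hi_crit)
    u_hi_lo = sum(t.c_lo / t.period for t in tasks if t.hi_crit)
    u_hi_hi = sum(t.c_hi / t.period for t in tasks if t.hi_crit)
    if u_lo_lo + u_hi_hi <= 1.0:
        return True, 1.0  # plain EDF already works; no deadline shrinking needed
    x = u_hi_lo / (1.0 - u_lo_lo)          # smallest safe scaling factor
    return x * u_lo_lo + u_hi_hi <= 1.0, x

# One LO task (utilization 0.3) and one HI task (0.3 in LO mode, 0.8 in HI mode)
tasks = [Task(3, 3, 10, False), Task(3, 8, 10, True)]
ok, x = edf_vd_test(tasks)
print(ok, round(x, 3))  # True 0.429
```

The paper's two virtual-deadline heuristics and its recovery mechanism refine exactly the choice of `x` and the HI-to-LO switch that this baseline leaves fixed.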