Learning Root Source with Marked Multivariate Hawkes Processes

Wei Zhang, Fan Bu, Derek Owens-Oas, Xiaojin Zhu, Katherine Heller
Preprint


The latent structure of mutually-exciting event sequences on networks often resembles a forest, with root events prompting subsequent event trees. Root source identification, the task of identifying the network node on which a root event occurs, is of interest in settings where finding the “initial responsible party” is important. In this paper, we introduce a novel concept, root source probability, to quantify the uncertainty of the root source based on multivariate Hawkes processes, and develop a linear-time algorithm to compute this quantity. A concretely specified model under our proposed framework is applied to text cascade data. Experiments on synthetic and real-world datasets show that our model identifies root sources that match the ground truth and human intuition.
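The paper's exact algorithm is not reproduced here, but the standard cluster (branching) representation of a Hawkes process already yields a notion of root probability: each event is either an immigrant (a root) or was triggered by an earlier event, and the immigrant probability is the baseline rate's share of the total intensity. A minimal sketch, assuming an exponential kernel g(t) = β·exp(−βt) (my choice for illustration, which is what makes a linear-time recursion possible; the function and variable names are hypothetical):

```python
import math

def root_probabilities(events, mu, alpha, beta):
    """Probability that each event is a root (immigrant) under the
    cluster representation of a multivariate Hawkes process:
        p_root(n) = mu[d_n] / lambda_{d_n}(t_n).
    events: list of (time, node) pairs sorted by time.
    mu[i]: baseline rate of node i; alpha[i][j]: excitation from i to j.
    With exponential kernels the excitation state decays multiplicatively,
    so it can be updated in O(1) per source node per event (linear time
    in the number of events, rather than the naive quadratic sum).
    """
    D = len(mu)
    # R[i] accumulates sum over past events on node i of beta*exp(-beta*(t - t_k))
    R = [0.0] * D
    probs = []
    t_prev = None
    for t, d in events:
        if t_prev is not None:
            decay = math.exp(-beta * (t - t_prev))
            R = [r * decay for r in R]
        excitation = sum(alpha[i][d] * R[i] for i in range(D))
        lam = mu[d] + excitation
        probs.append(mu[d] / lam)
        R[d] += beta  # current event starts exciting only future events
        t_prev = t
    return probs
```

The first event in a sequence necessarily has root probability 1; later events split their probability mass between the baseline and the triggering kernels of their predecessors.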

A Hidden Markov Model’s Agreement with Clinical Diagnoses and its Indication of Additional Preclinical Cognitive Deficits in the Wisconsin Registry for Alzheimer’s Prevention

Wei Zhang, Rebecca L. Koscik, Lindsay R. Clark, Vikas Singh, Xiaojin Zhu, Sterling C. Johnson
Alzheimer’s & Dementia: The Journal of the Alzheimer’s Association, 13.7 (2017): P687-P688


Clinicians determine cognitive status based on review of multidimensional data (e.g., neuropsychological tests, medical history, daily functioning). Reliable determination of status, potentially including at preclinical stages of decline, is essential. In this study, we used a constrained hidden Markov model (HMM) to investigate the consistency of clinicians’ cognitive status assignments in the Wisconsin Registry for Alzheimer’s Prevention (WRAP).
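The abstract does not spell out the constraint, but a common choice for disease-progression HMMs is a left-to-right structure: states are ordered by severity and transitions back to healthier states are forbidden. A minimal sketch of Viterbi decoding under such a constraint (the left-to-right assumption and all names here are illustrative, not the paper's specification):

```python
import math

def viterbi(log_init, log_trans, log_emit):
    """Most likely hidden-state path for an HMM in log space.
    log_init[s]: log prior of state s; log_trans[p][s]: log P(s | p);
    log_emit[t][s]: log-likelihood of observation t under state s.
    A left-to-right constraint is imposed simply by setting
    log_trans[p][s] = -inf for s < p (no recovery to earlier states).
    """
    T, S = len(log_emit), len(log_init)
    delta = [log_init[s] + log_emit[0][s] for s in range(S)]
    back = []
    for t in range(1, T):
        ptr, new = [], []
        for s in range(S):
            best_prev = max(range(S), key=lambda p: delta[p] + log_trans[p][s])
            ptr.append(best_prev)
            new.append(delta[best_prev] + log_trans[best_prev][s] + log_emit[t][s])
        delta = new
        back.append(ptr)
    # Backtrack from the best final state.
    path = [max(range(S), key=lambda s: delta[s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path
```

Decoding a participant's longitudinal visit record with such a model gives a monotone status trajectory that can then be compared against clinicians' per-visit assignments.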

Supervised Hashing with Latent Factor Models

Peichao Zhang, Wei Zhang, Wu-jun Li, Minyi Guo
In Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2014.


Due to its low storage cost and fast query speed, hashing has been widely adopted for approximate nearest neighbor search in large-scale datasets. Traditional hashing methods try to learn the hash codes in an unsupervised way where the metric (Euclidean) structure of the training data is preserved. Very recently, supervised hashing methods, which try to preserve the semantic structure constructed from the semantic labels of the training points, have exhibited higher accuracy than unsupervised methods. In this paper, we propose a novel supervised hashing method, called latent factor hashing (LFH), to learn similarity-preserving binary codes based on latent factor models. An algorithm with a convergence guarantee is proposed to learn the parameters of LFH. Furthermore, a linear-time variant with stochastic learning is proposed for training LFH on large-scale datasets. Experimental results on two large datasets with semantic labels show that LFH can achieve higher accuracy than state-of-the-art methods with comparable training time.
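The core latent-factor idea can be sketched compactly: model the probability that two points share a semantic label as a logistic function of the inner product of their continuous latent factors, then obtain binary hash codes by thresholding those factors at zero. The sketch below assumes a sigmoid of half the inner product; the scaling, function names, and 0/1 code convention are my illustrative choices, not necessarily the paper's exact formulation:

```python
import math

def pairwise_similarity_prob(u_i, u_j):
    """Latent-factor likelihood that points i and j are semantically
    similar: sigma(theta_ij) with theta_ij = 0.5 * <u_i, u_j>.
    Maximizing this likelihood over observed label agreements pulls
    same-label points toward aligned latent factors.
    """
    theta = 0.5 * sum(a * b for a, b in zip(u_i, u_j))
    return 1.0 / (1.0 + math.exp(-theta))

def to_hash_code(u):
    """Binarize a learned latent factor into a hash code by sign."""
    return [1 if x >= 0 else 0 for x in u]
```

Aligned factors give a similarity probability above 0.5 and identical codes; anti-aligned factors give a probability below 0.5 and complementary codes, which is the similarity-preserving behavior the learned codes are meant to inherit.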