Semi-Supervised Learning Software

Assembled by Jake Yang and Xiaojin Zhu
Last updated on March 18, 2013

Disclaimer: This collection is provided to facilitate machine learning research.
It is not an endorsement of any listed software. We are not responsible for the
content of any listed software -- access at your own risk. The information is
gathered with our best effort, and may contain mistakes. If you notice an error
or want to alert us to new software, please contact us at jerryzhu@cs.wisc.edu.

(Sorted by title)

Title: Active Semi-Supervised Learning
- Author:
  - Xiaojin Zhu (University of Wisconsin-Madison)
- URL:
  - http://www.cs.cmu.edu/~zhuxj/pub/semisupervisedcode/active_learning/
- Description:
  - Implementation of semi-supervised learning combined with active learning. In active learning, the algorithm chooses which examples to label in the hopes of reducing the overall amount of data required for learning.
- Related papers:
  - Xiaojin Zhu, John Lafferty, and Zoubin Ghahramani. Combining active learning and semi-supervised learning using Gaussian fields and harmonic functions. In ICML 2003 workshop on The Continuum from Labeled to Unlabeled Data in Machine Learning and Data Mining, 2003.
Title: Aleph 0.6: graph label propagation, discrete regularization
- Author: Jose Iria
- Link: http://www.mloss.org/software/view/172/
- Description:
  - Aleph is both a multi-platform machine learning framework aimed at simplicity and performance, and a library of selected state-of-the-art algorithms.
  - Features:
    - semi-supervised algorithms: graph label propagation, discrete regularization, etc.
Title: DUALIST: Utility for Active Learning with Instances and Semantic Terms
- Developer:
  - Burr Settles
- Link:
  - https://code.google.com/p/dualist/
- Type:
  - Active Learning
  - Semi-supervised Learning
- Related papers:
  - B. Settles. Closing the Loop: Fast, Interactive Semi-Supervised Annotation With Queries on Features and Instances. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1467-1478. ACL, 2011. (addendum)
  - B. Settles and X. Zhu. Behavioral Factors in Interactive Training of Text Classifiers. In Proceedings of the North American Chapter of the Association for Computational Linguistics - Human Language Technologies (NAACL HLT), pages 563-567. ACL, 2012.
- Comment:
  - Well-documented.
- Video:
  - http://vimeo.com/21671958
Title: Gaussian Process Learning
- Author: Neil Lawrence (The University of Sheffield)
- URL:
  - http://staffwww.dcs.shef.ac.uk/people/N.Lawrence/fgplvm/
- Description:
  - Implementation of semi-supervised learning with Gaussian processes.
- Related papers:
  - Neil D. Lawrence and Michael I. Jordan. Semi-supervised learning via Gaussian processes. In Lawrence K. Saul, Yair Weiss, and Léon Bottou, editors, Advances in Neural Information Processing Systems 17. MIT Press, Cambridge, MA, 2005.
Title: Harmonic Function
- Author:
  - Xiaojin Zhu (University of Wisconsin-Madison)
- URL:
  - http://pages.cs.wisc.edu/~jerryzhu/pub/harmonic_function.m
- Description:
  - Matlab implementation of the harmonic function formulation of graph-based semi-supervised learning.
- Related papers:
  - Xiaojin Zhu, Zoubin Ghahramani, and John Lafferty. Semi-supervised learning using Gaussian fields and harmonic functions. In The 20th International Conference on Machine Learning (ICML), 2003.
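- Illustrative sketch: the harmonic function formulation has a closed-form solution on the unlabeled nodes, f_u = (D_uu - W_uu)^{-1} W_ul f_l. The NumPy code below is a minimal sketch of that formula, not the listed Matlab file; the function name `harmonic_function` and its arguments are assumptions made for illustration.

```python
import numpy as np


def harmonic_function(W, labeled_idx, f_labeled):
    """Harmonic label scores for the unlabeled nodes of a weighted graph.

    W           : (n, n) symmetric, non-negative weight matrix
    labeled_idx : indices of the labeled nodes
    f_labeled   : (l, c) label values of the labeled nodes (e.g. one-hot rows)

    Assumes every connected component contains at least one labeled node,
    otherwise the linear system below is singular.
    """
    n = W.shape[0]
    labeled_idx = np.asarray(labeled_idx)
    unlabeled_idx = np.setdiff1d(np.arange(n), labeled_idx)

    # Combinatorial graph Laplacian L = D - W.
    L = np.diag(W.sum(axis=1)) - W
    L_uu = L[np.ix_(unlabeled_idx, unlabeled_idx)]
    W_ul = W[np.ix_(unlabeled_idx, labeled_idx)]

    # Solve L_uu f_u = W_ul f_l instead of forming an explicit inverse.
    f_u = np.linalg.solve(L_uu, W_ul @ f_labeled)
    return unlabeled_idx, f_u


# Chain graph 0 - 1 - 2 - 3 with unit weights; node 0 has label 1, node 3 has label 0.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
idx, f_u = harmonic_function(W, [0, 3], np.array([[1.0], [0.0]]))
print(idx, f_u.ravel())  # nodes 1 and 2 get scores of about 2/3 and 1/3
```

Each unlabeled score is the weighted average of its neighbors' scores, which is why the values interpolate linearly along the chain.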
Title: HSSR Hessian based semi supervised regression and dimensionality reduction
- Developers:
  - Kwang In Kim
  - Florian Steinke
  - Matthias Hein
- Language: Matlab
- Link:
  - http://www.mloss.org/software/view/223/
- Description from the Developers:
  - This package contains Matlab code for semi-supervised regression using the Hessian energy.
  - Semi-supervised regression based on the graph Laplacian suffers from the fact that the solution is biased towards a constant and from a lack of extrapolating power (cf. project web for examples). Based on these observations, we propose using the second-order Hessian energy for semi-supervised regression which overcomes both these problems. If the data lies on or close to a low-dimensional submanifold in feature space, the Hessian energy prefers functions whose values vary linearly with respect to geodesic distance, which makes it particularly suited for semi-supervised dimensionality reduction.
  - The code provides a more stable estimate of the Hessian operator than the one used in "Hessian eigenmaps" proposed by Donoho and Grimes, and thus its eigenvectors can also be used for unsupervised dimensionality reduction.
Title: Java Implementation of Label Propagation
- Language: Java
- Author: smly@GitHub
- Link: https://github.com/smly/java-labelpropagation
- Related papers:
  - Chapelle O, Schölkopf B and Zien A: Semi-Supervised Learning, 508, MIT Press, Cambridge, MA, USA, (2006).
Title: The Junto Label Propagation Toolkit
- Author: Partha Pratim Talukdar - Carnegie Mellon University
- Language: Java
- Link: https://github.com/parthatalukdar/junto
- Type:
  - Gaussian Random Fields (GRF)
  - Adsorption and Modified Adsorption (MAD) algorithms
    - As described in:
      - Weakly Supervised Acquisition of Labeled Class Instances using Graph Random Walks. Talukdar et al., EMNLP 2008
      - New Regularized Algorithms for Transductive Learning. Partha Pratim Talukdar, Koby Crammer, ECML-PKDD 2009
      - Experiments in Graph-based Semi-Supervised Learning Methods for Class-Instance Acquisition. Partha Pratim Talukdar, Fernando Pereira, ACL 2010
  - LP_ZGL
    - As described in:
      - Xiaojin Zhu and Zoubin Ghahramani. Learning from labeled and unlabeled data with label propagation. Technical Report CMU-CALD-02-107, Carnegie Mellon University, 2002.
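- Illustrative sketch: iterative label propagation repeatedly pushes label mass along graph edges while keeping the known labels fixed. The NumPy code below is a simplified variant (row-normalized transitions with clamping), not Junto's Java implementation; the function name `label_propagation` and its parameters are assumptions made for illustration.

```python
import numpy as np


def label_propagation(W, Y_init, labeled_mask, n_iter=1000, tol=1e-9):
    """Propagate labels over a graph, clamping the labeled nodes.

    W            : (n, n) non-negative weight matrix, every node has degree > 0
    Y_init       : (n, c) initial label matrix (one-hot rows for labeled nodes,
                   zeros for unlabeled nodes)
    labeled_mask : (n,) boolean array, True where the label is known
    """
    # Row-stochastic transition matrix: each node averages its neighbors.
    P = W / W.sum(axis=1, keepdims=True)
    Y = Y_init.astype(float).copy()
    for _ in range(n_iter):
        Y_new = P @ Y
        # Clamp: labeled nodes always keep their original labels.
        Y_new[labeled_mask] = Y_init[labeled_mask]
        converged = np.abs(Y_new - Y).max() < tol
        Y = Y_new
        if converged:
            break
    return Y


# Chain graph 0 - 1 - 2 - 3; node 0 belongs to class 0, node 3 to class 1.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
Y0 = np.array([[1, 0], [0, 0], [0, 0], [0, 1]], dtype=float)
mask = np.array([True, False, False, True])
Y = label_propagation(W, Y0, mask)
print(Y.argmax(axis=1))  # predicted class per node
```

On a connected graph with at least one labeled node, this clamped iteration converges to the same fixed point as the closed-form harmonic solution, so the loop is mainly useful when the graph is too large to solve directly.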
Title: Layout-Aware Text Extraction from Full-text PDF of Scientific Articles
- Developers:
  - Biomedical Knowledge Engineering group @ the Information Sciences Institute
  - Gully Burns (gully@usc.edu)
  - Cartic Ramakrishnan (rcartic@gmail.com)
  - Ed Hovy (hovy@isi.edu)
- Link:
  - https://code.google.com/p/lapdftext/
- Description:
  - The PDF format is widely used for online scientific publications. However, it is notoriously difficult to read and handle computationally, which presents challenges for developers of biomedical text mining or biocuration informatics systems that use the published literature as an information source. The software implements semi-supervised learning and Layout-Aware PDF Text Extraction (LA-PDFText).
- Paper:
  - Ramakrishnan, C., A. Patnia, E. Hovy and G. Burns (2012). "Layout-Aware Text Extraction from Full-text PDF of Scientific Articles." Source Code for Biology and Medicine 7(1): 7. [http://www.scfbm.org/content/7/1/7/abstract]
- Note:
  - Well-documented.
Title: Low Density Separation
- Authors:
  - Olivier Chapelle (Yahoo! Research)
  - Alexander Zien (Friedrich Miescher Laboratory of the Max Planck Society)
- URL: http://olivier.chapelle.cc/lds/
- Description:
  - Matlab/C implementation of the low density separation algorithm. This algorithm tries to place the decision boundary in regions of low density, similar to Transductive SVMs.
- Related papers:
  - Olivier Chapelle and Alexander Zien. Semi-supervised classification by low density separation. In Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics (AISTATS 2005), 2005.
Title: ManifoldLearn
- Author: Vikas Sindhwani (IBM T.J. Watson Research Center)
- URLs:
- Description:
  - Matlab code that implements manifold regularization and contains several other functions useful for different types of graph-based learning.
- Related Papers:
  - Mikhail Belkin, Partha Niyogi, and Vikas Sindhwani. Manifold regularization: A geometric framework for learning from labeled and unlabeled examples. Journal of Machine Learning Research, 7:2399-2434, November 2006.
Title: Maximum Variance Unfolding
- Author: Kilian Q. Weinberger
- URL:
  - http://www.cse.wustl.edu/~kilian/code/code.html
- Description:
  - Implements variations of the dimensionality reduction technique known as maximum variance unfolding. This is a graph-based, spectral method that can use unlabeled data in a preprocessing step for classification or other tasks.
- Related papers:
  - L. K. Saul, K. Q. Weinberger, J. H. Ham, F. Sha, and D. D. Lee. Spectral methods for dimensionality reduction. In O. Chapelle, B. Schoelkopf, and A. Zien, editors, Semi-Supervised Learning. MIT Press, 2006.
Title: Naive Bayes EM Algorithm 1.0.0
- Author: Rui Xia
- Language: C++
- URL:
  - http://www.mloss.org/software/view/357/
- Description:
  - A C++ implementation of the Naive Bayes classifier that performs both supervised and semi-supervised learning. The Naive Bayes algorithm requires the probability distribution to be discrete. The multinomial event model is used for representation. For supervised learning the maximum likelihood estimate is used, and the expectation-maximization estimate is used for semi-supervised and unsupervised learning.
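- Illustrative sketch: semi-supervised multinomial Naive Bayes alternates between estimating class posteriors for the unlabeled documents (E-step) and re-estimating the parameters from labeled plus softly labeled documents (M-step). The NumPy code below is a minimal sketch of that idea, not the listed C++ package; the function name `naive_bayes_em`, the smoothing parameter `alpha`, and the toy data are assumptions made for illustration.

```python
import numpy as np


def naive_bayes_em(X_l, y_l, X_u, n_classes, n_iter=20, alpha=1.0):
    """EM for multinomial Naive Bayes with labeled and unlabeled count data.

    X_l, X_u : (docs, vocab) word-count matrices (labeled / unlabeled)
    y_l      : integer class labels of the labeled documents
    alpha    : additive (Laplace) smoothing constant
    Returns log priors, log word probabilities, and unlabeled posteriors.
    """
    R_l = np.eye(n_classes)[np.asarray(y_l)]      # fixed one-hot responsibilities
    R_u = np.zeros((X_u.shape[0], n_classes))     # first M-step uses labeled data only
    X = np.vstack([X_l, X_u])
    for _ in range(n_iter):
        # M-step: smoothed estimates from (soft) class memberships.
        R = np.vstack([R_l, R_u])
        class_mass = R.sum(axis=0) + alpha
        log_prior = np.log(class_mass) - np.log(class_mass.sum())
        word_counts = R.T @ X + alpha             # (classes, vocab)
        log_word = np.log(word_counts) - np.log(
            word_counts.sum(axis=1, keepdims=True))
        # E-step: posterior class probabilities of the unlabeled documents.
        log_post = X_u @ log_word.T + log_prior
        log_post -= log_post.max(axis=1, keepdims=True)  # numerical stability
        R_u = np.exp(log_post)
        R_u /= R_u.sum(axis=1, keepdims=True)
    return log_prior, log_word, R_u


# Toy data: class 0 mostly uses words 0-1, class 1 mostly uses words 2-3.
X_l = np.array([[3, 2, 0, 0], [0, 0, 2, 3]], dtype=float)
y_l = [0, 1]
X_u = np.array([[4, 1, 0, 0], [0, 1, 5, 0], [2, 2, 0, 1]], dtype=float)
log_prior, log_word, R_u = naive_bayes_em(X_l, y_l, X_u, n_classes=2)
print(R_u.argmax(axis=1))  # most probable class for each unlabeled document
```

Starting with zero responsibilities for the unlabeled documents means the first parameter estimate comes from the labeled data alone, which gives EM a sensible starting point.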
Title: Nonparametric Transforms of Graph Kernels
- Author:
  - Xiaojin Zhu (University of Wisconsin-Madison)
- URL:
  - http://pages.cs.wisc.edu/~jerryzhu/pub/nips04.tgz
- Description:
  - Implementation of an approach to building a kernel for semi-supervised learning. A non-parametric kernel is derived from spectral properties of graphs of labeled and unlabeled data. This formulation simplifies the optimization problem to be solved and can scale to large data sets.
- Related papers:
  - Xiaojin Zhu, Jaz Kandola, Zoubin Ghahramani, and John Lafferty. Nonparametric transforms of graph kernels for semi-supervised learning. In Lawrence K. Saul, Yair Weiss, and Léon Bottou, editors, Advances in Neural Information Processing Systems (NIPS) 17. MIT Press, Cambridge, MA, 2005.
Title: OMSA - opinion mining and sentiment analysis
- Author: lampts@GitHub
- Link: https://github.com/lampts/OMSA
- Approaches:
  - Supervised
  - Unsupervised
  - Semi-supervised
- Key words:
  - Methods: k-NN, NB, ME, SVM
  - Terms: frequency, tf/idf, cosine distance, similarity
Title: Parallel Semi-Supervised Latent Dirichlet Allocation
- Author:
  - David Andrzejewski - University of Wisconsin-Madison
- Language: C
- Link:
  - https://github.com/davidandrzej/pSSLDA
- Related papers:
  - Andrzejewski, D. and Zhu, X. (2009). Latent Dirichlet Allocation with Topic-in-Set Knowledge. NAACL 2009 Workshop on Semi-supervised Learning for NLP (NAACL-SSLNLP 2009)
  - Blei, D. M., Ng, A. Y., and Jordan, M. I. (2003). Latent Dirichlet Allocation. Journal of Machine Learning Research (JMLR) 3 (Mar. 2003), 993-1022.
  - Newman, D., Asuncion, A., Smyth, P., and Welling, M. Distributed Algorithms for Topic Models. Journal of Machine Learning Research (JMLR) 10 (Aug. 2009), 1801-1828.
Title: Percolator
- Language: C++ / Python
- Author:
  - SciLifeLab (scilifelab.se)
  - Stockholm University
  - Karolinska Institutet
  - The Royal Institute of Technology (KTH)
  - Uppsala University
- Description:
  - Semi-supervised learning for peptide identification from shotgun proteomics datasets
- Link:
  - https://github.com/percolator/percolator
Title: SemiL
- Authors:
  - Te-Ming Huang (Yottamine Analytics LLC)
  - Vojislav Kecman (University of Auckland)
- URL:
  - http://www.learning-from-data.com/te-ming/semil.htm
- Description:
  - Graph-based semi-supervised learning implementations optimized for large-scale data problems. The code combines and extends the seminal works in graph-based learning.
- Related papers:
  - Xiaojin Zhu, Zoubin Ghahramani, and John Lafferty. Semi-supervised learning using Gaussian fields and harmonic functions. In The 20th International Conference on Machine Learning (ICML), 2003.
  - Dengyong Zhou, Olivier Bousquet, Thomas Lal, Jason Weston, and Bernhard Schölkopf. Learning with local and global consistency. In Advances in Neural Information Processing Systems 16, 2004.
  - Te-Ming Huang, Vojislav Kecman, and Ivica Kopriva. Kernel Based Algorithms for Mining Huge Data Sets: Supervised, Semi-supervised, and Unsupervised Learning (Studies in Computational Intelligence). Springer-Verlag New York, Inc., Secaucus, NJ, USA, 2006.
Title: Semi-Supervised Clustering
- Author:
  - Sugato Basu (Google)
- URL: http://www.cs.utexas.edu/users/ml/risc
- Description:
  - Code that performs metric pairwise constrained k-means clustering. Must-link and cannot-link constraints specify requirements for how examples should be placed in clusters.
- Related papers:
  - Sugato Basu, Mikhail Bilenko, Arindam Banerjee, and Raymond J. Mooney. Probabilistic semi-supervised clustering with constraints. In O. Chapelle, B. Schoelkopf, and A. Zien, editors, Semi-Supervised Learning, pages 71-98. MIT Press, 2006.
  - Sugato Basu, Ian Davidson, and Kiri Wagstaff, editors. Constrained Clustering: Advances in Algorithms, Theory, and Applications. Chapman & Hall/CRC Press, 2008.
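- Illustrative sketch: pairwise constraints can be folded into k-means by adding a penalty to the assignment cost whenever a must-link or cannot-link constraint would be violated. The NumPy code below is a simplified penalty-based variant without the metric learning that the listed code performs; the function name `constrained_kmeans`, the penalty weight `w`, and the explicit `init_centers` argument are assumptions made for illustration.

```python
import numpy as np


def constrained_kmeans(X, init_centers, must_link=(), cannot_link=(),
                       w=100.0, n_iter=20):
    """k-means with soft must-link / cannot-link penalties.

    X            : (n, d) data matrix
    init_centers : (k, d) initial cluster centers
    must_link    : pairs (a, b) that should share a cluster
    cannot_link  : pairs (a, b) that should be in different clusters
    w            : cost added for each violated constraint
    """
    centers = np.array(init_centers, dtype=float)
    k = len(centers)
    labels = np.full(len(X), -1)  # -1 means "not assigned yet"
    for _ in range(n_iter):
        for i in range(len(X)):
            # Squared Euclidean distance to each center.
            cost = ((X[i] - centers) ** 2).sum(axis=1)
            for a, b in must_link:
                if i in (a, b):
                    j = b if i == a else a
                    if labels[j] >= 0:
                        # Penalize every cluster except the partner's.
                        cost = cost + w * (np.arange(k) != labels[j])
            for a, b in cannot_link:
                if i in (a, b):
                    j = b if i == a else a
                    if labels[j] >= 0:
                        # Penalize the partner's cluster.
                        cost[labels[j]] += w
            labels[i] = cost.argmin()
        # Update step: move each non-empty cluster's center to its mean.
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels, centers


X = np.array([[0.0], [1.0], [4.0], [5.0]])
init = [[0.5], [4.5]]
labels, _ = constrained_kmeans(X, init, cannot_link=[(0, 1)])
print(labels)  # points 0 and 1 are forced into different clusters
```

Passing the initial centers explicitly keeps the example deterministic; in practice the result depends on initialization, as with ordinary k-means.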
Title: Semi-Supervised Recursive Autoencoders for Predicting Sentiment Distributions
- Author: Sanjeev Satheesh - Stanford University
- Language: Java
- Link:
  - https://github.com/sancha/jrae
- WikiPage:
  - http://www.socher.org/index.php/Main/Semi-SupervisedRecursiveAutoencodersForPredictingSentimentDistributions
- Related papers:
  - Semi-Supervised Recursive Autoencoders for Predicting Sentiment Distributions
- Note:
  - Parallelized to run on a multi-core machine.
Title: SGTlight (Spectral Graph Transducer)
- Author: Thorsten Joachims (Cornell University)
- URL: http://sgt.joachims.org/
- Description:
  - Implements the spectral graph transducer, which is a transductive learning method based on a combination of minimum cut problems and spectral graph theory.
- Related papers:
  - Thorsten Joachims. Transductive learning via spectral graph partitioning. In Proceedings of ICML-03, 20th International Conference on Machine Learning, 2003.
Title: Shogun - The Shogun Machine Learning Toolbox
- Author: Shogun Toolbox Foundation
- Language: C++, Matlab
- URL:
  - http://www.shogun-toolbox.org/page/home/
- Comment:
  - Comprehensive machine learning toolbox with user-friendly GUI. Well maintained, and well documented with an active development team.
Title: SVMlight
- Author: Thorsten Joachims (Cornell University)
- Language: C
- URL: http://svmlight.joachims.org/
- Features:
  - Includes an algorithm for approximately training large transductive SVMs (TSVMs) (see also Spectral Graph Transducer)
- Description:
  - General purpose support vector machine solver. Performs transductive classification by iteratively refining predictions on unlabeled instances.
- Related papers:
  - Thorsten Joachims. Making large-scale svm learning practical. In B. Schölkopf, C. Burges, and A. Smola, editors, Advances in Kernel Methods - Support Vector Learning. MIT Press, 1999.
  - Thorsten Joachims. Transductive inference for text classification using support vector machines. In Proc. 16th International Conf. on Machine Learning, pages 200–209. Morgan Kaufmann, San Francisco, CA, 1999.
Title: SVMlin: Fast Linear SVM Solvers for Supervised and Semi-supervised Learning
- Authors:
  - Vikas Sindhwani (IBM T.J. Watson Research Center)
  - S. Sathiya Keerthi (Cloud and Information Services Lab, Microsoft)
- Language: C++, Matlab
- URL:
  - http://vikas.sindhwani.org/svmlin.html
- Description:
  - Large-scale linear support vector machine package that can incorporate unlabeled examples using two different techniques for solving the non-convex S3VM problem.
  - Algorithms:
    - Supervised
      - Linear Regularized Least Squares (RLS) Classification
      - Modified Finite Newton Linear L2-SVMs
    - Semi-supervised
      - Linear Transductive L2-SVMs with multiple switchings
      - Deterministic Annealing (DA) for Semi-supervised Linear L2-SVMs
- Related papers:
  - Vikas Sindhwani, Partha Niyogi, Mikhail Belkin, and Sathiya Keerthi. Linear manifold regularization for large scale semi-supervised learning. In Proc. of the 22nd ICML Workshop on Learning with Partially Classified Training Data, August 2005.
  - Vikas Sindhwani and S. Sathiya Keerthi. Large scale semi-supervised linear SVMs. In SIGIR 2006, 2006. DOI: 10.1145/1148170.1148253
Title: Supervised Distance Metric Learning with R
- Author:
  - Gao Tao (joegaotao@gmail.com)
  - Xiao Nan (road2stat@gmail.com)
- Language: R
- Link:
  - https://github.com/road2stat/sdml
- Description from the Authors:
  - Distance metrics are widely used in the machine learning literature. We used to choose a distance metric a priori (Euclidean distance, L1 distance, etc.) or according to the result of cross-validation within a small class of functions (e.g. choosing the order of the polynomial for a kernel). With prior knowledge of the data, however, we can learn a more suitable distance metric with (semi-)supervised distance metric learning techniques. sdml is an R package that aims to implement the state-of-the-art algorithms for supervised distance metric learning. These distance metric learning methods are widely applied in feature extraction, dimensionality reduction, clustering, classification, information retrieval, and computer vision problems.
  - Algorithms planned in the first development stage:
    - Supervised Global Distance Metric Learning:
      - Relevant Component Analysis (RCA)
      - Discriminative Component Analysis (DCA)
      - Kernel Discriminative Component Analysis (KDCA)
      - Global Distance Metric Learning by Convex Programming (GDMLCP)
      - Probabilistic Global Distance Metric Learning (PGDM)
    - Supervised Local Distance Metric Learning:
      - Local Fisher Discriminant Analysis (LFDA)
      - Kernel Local Fisher Discriminant Analysis (KLFDA)
      - Localized Distance Metric Learning (LDM)
      - Information-Theoretic Metric Learning (ITML)
      - Neighbourhood Components Analysis (NCA)
      - Large Margin Nearest Neighbor Classifier (LMNN)
Title: TexNLP
- Authors:
  - Jason Baldridge
  - Taesun Moon
  - Elias Ponvert - University of Texas
- Language: Java
- Link:
  - https://github.com/utcompling/texnlp
- Description from the Developers:
  - The code supports supervised and semi-supervised learning for Hidden Markov Models for tagging, and standard supervised Maximum Entropy Markov Models (using the TADM toolkit). There is additional support for working with categories of Combinatory Categorial Grammar, especially with respect to supertagging for CCGbank.
- Papers:
  - Jason Baldridge. 2008. Weakly supervised supertagging with grammar-informed initialization. In Proceedings of COLING-2008. Manchester, UK.
  - Jason Baldridge and Alexis Palmer. 2009. How well does active learning actually work? Time-based evaluation of cost-reduction strategies for language documentation. In Proceedings of EMNLP-09. Singapore.
  - Alexis Palmer, Taesun Moon, Jason Baldridge, Katrin Erk, Eric Campbell, and Telma Can. 2010. Computational strategies for reducing annotation effort in language documentation: A case study in creating interlinear texts for Uspanteko. Linguistic Issues in Language Technology. 3(4):1-42.
Title: UniverSVM
- Author:
  - Fabian Sinz (Max Planck Institute for Biological Cybernetics)
- Language: C, C++
- URL:
  - http://www.epagoge.de/software/universvm/
- Description:
  - Large-scale support vector machine implementation. Performs transductive classification using the Universum technique, trained with the concave-convex procedure (CCCP).
- Related papers:
  - Jason Weston, Ronan Collobert, Fabian Sinz, Leon Bottou, and Vladimir Vapnik. Inference with the universum. In ICML06, 23rd International Conference on Machine Learning, Pittsburgh, USA, 2006. DOI: 10.1145/1143844.1143971