Java Libraries

Rampart - Distribution

org.apache.rampart : rampart-dist

WS-Security, WS-Trust and WS-SecureConversation implementation for Apache Axis2

Last Version: 1.6.2

Release Date:

Rampart - Documentation

org.apache.rampart : rampart-documentation

WS-Security, WS-Trust and WS-SecureConversation implementation for Apache Axis2

Last Version: 1.6.2

Release Date:

Rampart - Samples

org.apache.rampart : rampart-sample

WS-Security, WS-Trust and WS-SecureConversation implementation for Apache Axis2

Last Version: 1.6.2

Release Date:

Rampart - Integration

org.apache.rampart : rampart-integration

WS-Security, WS-Trust and WS-SecureConversation implementation for Apache Axis2

Last Version: 1.6.2

Release Date:

Rampart - Test Suite

org.apache.rampart : rampart-tests

WS-Security, WS-Trust and WS-SecureConversation implementation for Apache Axis2

Last Version: 1.6.2

Release Date:

wavelet

nz.ac.waikato.cms.weka : wavelet

A filter for wavelet transformation. For more information see: Wikipedia (2004). Discrete wavelet transform. Kristian Sandberg (2000). The Haar wavelet transform. University of Colorado at Boulder, USA.
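
The Haar transform cited above replaces each adjacent pair of values with a scaled sum and a scaled difference. A minimal single-level sketch in plain Java, offered as an independent illustration of the cited transform rather than the filter's own code:

    public class HaarSketch {

        // Returns [approximation | detail] coefficients for an even-length
        // input: pairwise scaled sums followed by pairwise scaled differences.
        static double[] haar(double[] x) {
            int half = x.length / 2;
            double[] out = new double[x.length];
            double s = Math.sqrt(2.0);
            for (int i = 0; i < half; i++) {
                out[i]        = (x[2 * i] + x[2 * i + 1]) / s; // low-pass (average)
                out[half + i] = (x[2 * i] - x[2 * i + 1]) / s; // high-pass (detail)
            }
            return out;
        }

        public static void main(String[] args) {
            double[] signal = {4, 6, 10, 12, 8, 6, 5, 5};
            System.out.println(java.util.Arrays.toString(haar(signal)));
        }
    }

Applying haar recursively to the approximation half yields the multi-level discrete wavelet transform.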

Last Version: 1.0.2

Release Date:

votingFeatureIntervals

nz.ac.waikato.cms.weka : votingFeatureIntervals

Classification by voting feature intervals. Intervals are constructed around each class for each attribute (basically discretization). Class counts are recorded for each interval on each attribute. Classification is by voting. For more information see: G. Demiroz, A. Guvenir: Classification by voting feature intervals. In: 9th European Conference on Machine Learning, 85-92, 1997.
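
A compact, independent sketch of the interval-voting idea; equal-width bins stand in here for the class-based interval construction described above, which is a simplifying assumption:

    public class VfiSketch {

        final double[] lo, hi;        // per-attribute observed range
        final double[][][] counts;    // [attribute][bin][class]
        final int bins;

        VfiSketch(double[][] x, int[] y, int numClasses, int bins) {
            this.bins = bins;
            int numAttrs = x[0].length;
            lo = new double[numAttrs];
            hi = new double[numAttrs];
            java.util.Arrays.fill(lo, Double.POSITIVE_INFINITY);
            java.util.Arrays.fill(hi, Double.NEGATIVE_INFINITY);
            for (double[] row : x)
                for (int a = 0; a < numAttrs; a++) {
                    lo[a] = Math.min(lo[a], row[a]);
                    hi[a] = Math.max(hi[a], row[a]);
                }
            counts = new double[numAttrs][bins][numClasses];
            for (int i = 0; i < x.length; i++)
                for (int a = 0; a < numAttrs; a++)
                    counts[a][bin(a, x[i][a])][y[i]]++;   // record class counts
        }

        int bin(int a, double v) {
            if (hi[a] == lo[a]) return 0;
            int b = (int) ((v - lo[a]) / (hi[a] - lo[a]) * bins);
            return Math.max(0, Math.min(bins - 1, b));    // clamp to valid bins
        }

        int classify(double[] row) {
            double[] votes = new double[counts[0][0].length];
            for (int a = 0; a < row.length; a++) {
                double[] c = counts[a][bin(a, row[a])];
                double total = 0;
                for (double v : c) total += v;
                if (total > 0)
                    for (int k = 0; k < votes.length; k++)
                        votes[k] += c[k] / total;         // normalized vote per attribute
            }
            int best = 0;
            for (int k = 1; k < votes.length; k++)
                if (votes[k] > votes[best]) best = k;
            return best;
        }

        public static void main(String[] args) {
            double[][] x = {{1, 10}, {2, 11}, {8, 3}, {9, 2}};
            int[] y = {0, 0, 1, 1};
            System.out.println(new VfiSketch(x, y, 2, 4).classify(new double[]{8.5, 2.5})); // 1
        }
    }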

Last Version: 1.0.2

Release Date:

simpleEducationalLearningSchemes

nz.ac.waikato.cms.weka : simpleEducationalLearningSchemes

Simple learning schemes for educational purposes (Prism, Id3, IB1 and NaiveBayesSimple).

Last Version: 1.0.2

Release Date:

simpleCART

nz.ac.waikato.cms.weka : simpleCART

Class implementing minimal cost-complexity pruning. Note: when dealing with missing values, the "fractional instances" method is used instead of the surrogate split method. For more information, see: Leo Breiman, Jerome H. Friedman, Richard A. Olshen, Charles J. Stone (1984). Classification and Regression Trees. Wadsworth International Group, Belmont, California.

Last Version: 1.0.2

Release Date:

sequentialInformationalBottleneckClusterer

nz.ac.waikato.cms.weka : sequentialInformationalBottleneckClusterer

Cluster data using the sequential information bottleneck algorithm. Note: only the hard clustering scheme is supported. sIB assigns each instance to the cluster that has the minimum cost/distance to the instance. The trade-off parameter beta is set to infinity, so 1/beta is zero. For more information, see: Noam Slonim, Nir Friedman, Naftali Tishby: Unsupervised document classification using sequential information maximization. In: Proceedings of the 25th International ACM SIGIR Conference on Research and Development in Information Retrieval, 129-136, 2002.

Last Version: 1.0.2

Release Date:

scriptingClassifiers

nz.ac.waikato.cms.weka : scriptingClassifiers

Wrapper classifiers for Jython and Groovy code. Even though the classifier is serializable, the trained classifier cannot be stored persistently. That is, one cannot save a model file and re-load it later to make predictions.

Last Version: 1.0.2

Release Date:

ridor

nz.ac.waikato.cms.weka : ridor

An implementation of a RIpple-DOwn Rule learner. It generates a default rule first and then the exceptions for the default rule with the least (weighted) error rate. Then it generates the "best" exceptions for each exception and iterates until pure. Thus it performs a tree-like expansion of exceptions. The exceptions are a set of rules that predict classes other than the default. IREP is used to generate the exceptions. For more information about Ripple-Down Rules, see: Brian R. Gaines, Paul Compton (1995). Induction of Ripple-Down Rules Applied to Modeling Large Databases. J. Intell. Inf. Syst. 5(3):211-228.

Last Version: 1.0.2

Release Date:

realAdaBoost

nz.ac.waikato.cms.weka : realAdaBoost

Class for boosting a 2-class classifier using the Real AdaBoost method. For more information, see J. Friedman, T. Hastie, R. Tibshirani (2000). Additive Logistic Regression: a Statistical View of Boosting. Annals of Statistics. 28(2):337-407.

Last Version: 1.0.2

Release Date:

racedIncrementalLogitBoost

nz.ac.waikato.cms.weka : racedIncrementalLogitBoost

Classifier for incremental learning of large datasets by way of racing logit-boosted committees. For more information see: Eibe Frank, Geoffrey Holmes, Richard Kirkby, Mark Hall: Racing committees for large datasets. In: Proceedings of the 5th International Conference on Discovery Science, 153-164, 2002.

Last Version: 1.0.2

Release Date:

raceSearch

nz.ac.waikato.cms.weka : raceSearch

Races the cross-validation error of competing attribute subsets. Use in conjunction with a ClassifierSubsetEval. RaceSearch has four modes: forward selection races all single-attribute additions to a base set (initially no attributes), selects the winner to become the new base set, and then iterates until there is no improvement over the base set. Backward elimination is similar, but the initial base set has all attributes included and races all single-attribute deletions. Schemata search is a bit different: each iteration, a series of races is run in parallel. Each race in a set determines whether a particular attribute should be included or not, i.e. the race is between the attribute being "in" or "out". The other attributes for this race are included or excluded randomly at each point in the evaluation. As soon as one race has a clear winner (i.e. it has been decided whether a particular attribute should be in or not), the next set of races begins, using the result of the winning race from the previous iteration as the new base set. Rank race first ranks the attributes using an attribute evaluator and then races the ranking. The race includes no attributes, the top-ranked attribute, the top two attributes, the top three attributes, etc. It is also possible to generate a ranked list of attributes through the forward racing process: if generateRanking is set to true then a complete forward race will be run, that is, racing continues until all attributes have been selected. The order in which they are added determines a complete ranking of all the attributes. Racing uses paired and unpaired t-tests on cross-validation errors of competing subsets. When there is a significant difference between the means of the errors of two competing subsets, the poorer of the two can be eliminated from the race. Similarly, if there is no significant difference between the mean errors of two competing subsets and they are within some threshold of each other, then one can be eliminated from the race.

Last Version: 1.0.2

Release Date:

paceRegression

nz.ac.waikato.cms.weka : paceRegression

Class for building pace regression linear models and using them for prediction. Under regularity conditions, pace regression is provably optimal when the number of coefficients tends to infinity. It consists of a group of estimators that are either overall optimal or optimal under certain conditions. The current work on pace regression theory, and therefore also this implementation, does not handle: missing values; non-binary nominal attributes; the case that n - k is small, where n is the number of instances and k is the number of coefficients (the threshold used in this implementation is 20). For more information see: Wang, Y. (2000). A new approach to fitting linear models in high dimensional spaces. Hamilton, New Zealand. Wang, Y., Witten, I. H.: Modeling for optimal probability prediction. In: Proceedings of the Nineteenth International Conference in Machine Learning, Sydney, Australia, 650-657, 2002.

Last Version: 1.0.2

Release Date:

naiveBayesTree

nz.ac.waikato.cms.weka : naiveBayesTree

Class for generating a decision tree with naive Bayes classifiers at the leaves. For more information, see Ron Kohavi: Scaling Up the Accuracy of Naive-Bayes Classifiers: A Decision-Tree Hybrid. In: Second International Conference on Knowledge Discovery and Data Mining, 202-207, 1996.

Last Version: 1.0.2

Release Date:

multiBoostAB

nz.ac.waikato.cms.weka : multiBoostAB

Class for boosting a classifier using the MultiBoosting method. MultiBoosting is an extension to the highly successful AdaBoost technique for forming decision committees. MultiBoosting can be viewed as combining AdaBoost with wagging. It is able to harness both AdaBoost's high bias and variance reduction with wagging's superior variance reduction. Using C4.5 as the base learning algorithm, MultiBoosting is demonstrated to produce decision committees with lower error than either AdaBoost or wagging significantly more often than the reverse over a large representative cross-section of UCI data sets. It offers the further advantage over AdaBoost of suiting parallel execution. For more information, see Geoffrey I. Webb (2000). MultiBoosting: A Technique for Combining Boosting and Wagging. Machine Learning. 40(2).

Last Version: 1.0.2

Release Date:

leastMedSquared

nz.ac.waikato.cms.weka : leastMedSquared

Implements a least median squared linear regression utilizing the existing Weka LinearRegression class to form predictions. Least squared regression functions are generated from random subsamples of the data. The least squared regression with the lowest median squared error is chosen as the final model. The basis of the algorithm is Peter J. Rousseeuw, Annick M. Leroy (1987). Robust regression and outlier detection.
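
A sketch of the subsampling idea in one dimension, as an independent illustration rather than the package's code: ordinary least squares is fitted to many random subsamples, and the fit with the lowest median squared residual over the full data wins, which is what makes the method robust to the outlier below.

    import java.util.Arrays;
    import java.util.Random;

    public class LmsSketch {

        public static void main(String[] args) {
            double[] x = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
            double[] y = {1.1, 2.0, 2.9, 4.2, 5.1, 5.9, 7.2, 8.0, 30.0, 9.8}; // one outlier
            Random rnd = new Random(42);
            double bestA = 0, bestB = 0, bestMed = Double.POSITIVE_INFINITY;
            for (int trial = 0; trial < 200; trial++) {
                // ordinary least squares on a random subsample of four points
                int[] idx = rnd.ints(0, x.length).distinct().limit(4).toArray();
                double sx = 0, sy = 0, sxx = 0, sxy = 0;
                for (int i : idx) { sx += x[i]; sy += y[i]; sxx += x[i] * x[i]; sxy += x[i] * y[i]; }
                int n = idx.length;
                double b = (n * sxy - sx * sy) / (n * sxx - sx * sx); // slope
                double a = (sy - b * sx) / n;                         // intercept
                // score the candidate by its median squared residual on ALL data
                double[] r2 = new double[x.length];
                for (int i = 0; i < x.length; i++) {
                    double r = y[i] - (a + b * x[i]);
                    r2[i] = r * r;
                }
                Arrays.sort(r2);
                double med = r2[r2.length / 2];
                if (med < bestMed) { bestMed = med; bestA = a; bestB = b; }
            }
            System.out.printf("y = %.2f + %.2f x (median squared residual %.3f)%n",
                              bestA, bestB, bestMed);
        }
    }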

Last Version: 1.0.2

Release Date:

isotonicRegression

nz.ac.waikato.cms.weka : isotonicRegression

Learns an isotonic regression model. Picks the attribute that results in the lowest squared error. Missing values are not allowed. Can only deal with numeric attributes. Considers the monotonically increasing case as well as the monotonically decreasing case.
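
The monotonically increasing case is classically fitted with the pool-adjacent-violators algorithm; a minimal sketch of that standard algorithm follows (an independent illustration, not the package's code; the decreasing case follows by negating the targets):

    import java.util.Arrays;

    public class PavaSketch {

        // y must already be ordered by the chosen attribute's values.
        static double[] isotonic(double[] y) {
            int n = y.length;
            double[] mean = new double[n];
            double[] w = new double[n];
            int[] size = new int[n];
            int blocks = 0;
            for (int i = 0; i < n; i++) {
                mean[blocks] = y[i]; w[blocks] = 1; size[blocks] = 1; blocks++;
                // pool adjacent blocks while they violate monotonicity
                while (blocks > 1 && mean[blocks - 2] > mean[blocks - 1]) {
                    double tw = w[blocks - 2] + w[blocks - 1];
                    mean[blocks - 2] = (mean[blocks - 2] * w[blocks - 2]
                                      + mean[blocks - 1] * w[blocks - 1]) / tw;
                    w[blocks - 2] = tw;
                    size[blocks - 2] += size[blocks - 1];
                    blocks--;
                }
            }
            double[] fit = new double[n];
            for (int b = 0, i = 0; b < blocks; b++)
                for (int k = 0; k < size[b]; k++) fit[i++] = mean[b];
            return fit;
        }

        public static void main(String[] args) {
            System.out.println(Arrays.toString(isotonic(new double[]{1, 3, 2, 4, 6, 5})));
            // -> [1.0, 2.5, 2.5, 4.0, 5.5, 5.5]
        }
    }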

Last Version: 1.0.2

Release Date:

hyperPipes

nz.ac.waikato.cms.weka : hyperPipes

Class implementing a HyperPipe classifier. For each category a HyperPipe is constructed that contains all points of that category (essentially recording the attribute bounds observed for each category). Test instances are classified according to the category that "most contains the instance". Does not handle a numeric class or missing values in test cases. An extremely simple algorithm, but it has the advantage of being extremely fast, and it works quite well when you have "smegloads" of attributes.
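
A sketch of the "pipe" idea as an independent illustration; scoring a test instance by how many of its attribute values fall inside a class's recorded bounds is a simplifying reading of "most contains the instance":

    public class HyperPipeSketch {

        final double[][] lo, hi; // [class][attribute] observed bounds

        HyperPipeSketch(double[][] x, int[] y, int numClasses) {
            int numAttrs = x[0].length;
            lo = new double[numClasses][numAttrs];
            hi = new double[numClasses][numAttrs];
            for (double[] row : lo) java.util.Arrays.fill(row, Double.POSITIVE_INFINITY);
            for (double[] row : hi) java.util.Arrays.fill(row, Double.NEGATIVE_INFINITY);
            for (int i = 0; i < x.length; i++)
                for (int a = 0; a < numAttrs; a++) {
                    lo[y[i]][a] = Math.min(lo[y[i]][a], x[i][a]); // widen class pipe
                    hi[y[i]][a] = Math.max(hi[y[i]][a], x[i][a]);
                }
        }

        int classify(double[] row) {
            int best = 0, bestScore = -1;
            for (int c = 0; c < lo.length; c++) {
                int contained = 0;
                for (int a = 0; a < row.length; a++)
                    if (row[a] >= lo[c][a] && row[a] <= hi[c][a]) contained++;
                if (contained > bestScore) { bestScore = contained; best = c; }
            }
            return best;
        }

        public static void main(String[] args) {
            double[][] x = {{1, 5}, {2, 6}, {7, 1}, {8, 2}};
            int[] y = {0, 0, 1, 1};
            System.out.println(new HyperPipeSketch(x, y, 2).classify(new double[]{7.5, 1.5})); // 1
        }
    }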

Last Version: 1.0.2

Release Date:

grading

nz.ac.waikato.cms.weka : grading

Implements Grading. The base classifiers are "graded". For more information, see A.K. Seewald, J. Fuernkranz: An Evaluation of Grading Classifiers. In: Advances in Intelligent Data Analysis: 4th International Conference, Berlin/Heidelberg/New York/Tokyo, 115-124, 2001.

Last Version: 1.0.2

Release Date:

filteredAttributeSelection

nz.ac.waikato.cms.weka : filteredAttributeSelection

This package provides two meta attribute selection evaluators that can apply an arbitrary filter to the input data before executing the actual attribute selection scheme. One filters data and then passes it to an attribute evaluator (FilteredAttributeEval), and the other filters data and then passes it to a subset evaluator (FilteredSubsetEval).

Last Version: 1.0.2

Release Date:

SVMAttributeEval

nz.ac.waikato.cms.weka : SVMAttributeEval

Evaluates the worth of an attribute by using an SVM classifier. Attributes are ranked by the square of the weight assigned by the SVM. Attribute selection for multiclass problems is handled by ranking attributes for each class separately using a one-vs-all method and then "dealing" from the top of each pile to give a final ranking. For more information see: I. Guyon, J. Weston, S. Barnhill, V. Vapnik (2002). Gene selection for cancer classification using support vector machines. Machine Learning. 46:389-422.
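
The "dealing" step lends itself to a short sketch, given one ranking per class (best attribute first, e.g. by squared SVM weight); this is an independent illustration of the merging idea, not the package's code:

    import java.util.ArrayList;
    import java.util.LinkedHashSet;
    import java.util.List;

    public class DealRankings {

        // Take from the top of each per-class pile in turn, skipping
        // attributes already dealt, to produce one multiclass ranking.
        static List<Integer> deal(int[][] perClassRanking) {
            LinkedHashSet<Integer> merged = new LinkedHashSet<>();
            int len = perClassRanking[0].length;
            for (int pos = 0; pos < len; pos++)
                for (int[] pile : perClassRanking)
                    merged.add(pile[pos]); // duplicates ignored, insertion order kept
            return new ArrayList<>(merged);
        }

        public static void main(String[] args) {
            int[][] piles = {{2, 0, 1, 3}, {0, 3, 2, 1}, {3, 2, 1, 0}};
            System.out.println(deal(piles)); // [2, 0, 3, 1]
        }
    }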

Last Version: 1.0.2

Release Date:

SPegasos

nz.ac.waikato.cms.weka : SPegasos

Implements the stochastic variant of the Pegasos (Primal Estimated sub-GrAdient SOlver for SVM) method of Shalev-Shwartz et al. (2007). This implementation globally replaces all missing values and transforms nominal attributes into binary ones. It also normalizes all attributes, so the coefficients in the output are based on the normalized data. Can either minimize the hinge loss (SVM) or log loss (logistic regression). For more information, see S. Shalev-Shwartz, Y. Singer, N. Srebro: Pegasos: Primal Estimated sub-GrAdient SOlver for SVM. In: 24th International Conference on Machine Learning, 807-814, 2007.
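
The published Pegasos update is compact enough to sketch directly; this is an independent illustration of that update rule for the hinge loss, not the Weka implementation (the preprocessing described above is omitted):

    import java.util.Random;

    public class PegasosSketch {

        // At step t, draw a random example, shrink w by the regularization
        // factor, and add the hinge-loss sub-gradient if the example is
        // inside the margin. Labels are in {-1, +1}.
        static double[] train(double[][] x, int[] y, double lambda, int steps, long seed) {
            double[] w = new double[x[0].length];
            Random rnd = new Random(seed);
            for (int t = 1; t <= steps; t++) {
                int i = rnd.nextInt(x.length);
                double eta = 1.0 / (lambda * t);        // step size 1/(lambda*t)
                double margin = y[i] * dot(w, x[i]);    // evaluated before the update
                for (int j = 0; j < w.length; j++) {
                    w[j] *= (1 - eta * lambda);         // regularization shrink
                    if (margin < 1) w[j] += eta * y[i] * x[i][j];
                }
            }
            return w;
        }

        static double dot(double[] a, double[] b) {
            double s = 0;
            for (int j = 0; j < a.length; j++) s += a[j] * b[j];
            return s;
        }

        public static void main(String[] args) {
            double[][] x = {{1, 1}, {2, 1}, {-1, -1}, {-2, -1}};
            int[] y = {1, 1, -1, -1};
            System.out.println(java.util.Arrays.toString(train(x, y, 0.1, 1000, 7)));
        }
    }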

Last Version: 1.0.2

Release Date:

winnow

nz.ac.waikato.cms.weka : winnow

Implements the Winnow and Balanced Winnow algorithms by Littlestone. For more information, see N. Littlestone (1988). Learning quickly when irrelevant attributes abound: A new linear threshold algorithm. Machine Learning. 2:285-318; N. Littlestone (1989). Mistake bounds and logarithmic linear-threshold learning algorithms. University of California, Santa Cruz. Does classification for problems with nominal attributes (which it converts into binary attributes).
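
A sketch of the plain (unbalanced) Winnow update for binary attributes, as an independent illustration; the threshold of n/2 is one common choice, not necessarily the package's default:

    public class WinnowSketch {

        final double[] w;
        final double alpha, theta;

        WinnowSketch(int numAttrs, double alpha) {
            w = new double[numAttrs];
            java.util.Arrays.fill(w, 1.0);     // weights start at 1
            this.alpha = alpha;
            this.theta = numAttrs / 2.0;       // assumed threshold choice
        }

        int predict(int[] x) {
            double s = 0;
            for (int i = 0; i < x.length; i++) s += w[i] * x[i];
            return s > theta ? 1 : 0;
        }

        void update(int[] x, int label) {
            if (predict(x) == label) return;           // mistake-driven
            double factor = (label == 1) ? alpha : 1.0 / alpha;
            for (int i = 0; i < x.length; i++)
                if (x[i] == 1) w[i] *= factor;         // only active attributes move
        }

        public static void main(String[] args) {
            WinnowSketch w = new WinnowSketch(4, 2.0);
            int[][] xs = {{1, 0, 0, 1}, {0, 1, 1, 0}, {1, 1, 0, 0}};
            int[] ys = {1, 0, 1};
            for (int epoch = 0; epoch < 5; epoch++)
                for (int i = 0; i < xs.length; i++) w.update(xs[i], ys[i]);
            System.out.println(w.predict(new int[]{1, 0, 0, 1})); // 1
        }
    }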

Last Version: 1.0.2

Release Date:

tertius

nz.ac.waikato.cms.weka : tertius

Finds rules according to confirmation measure (Tertius-type algorithm). For more information see: P. A. Flach, N. Lachiche (1999). Confirmation-Guided Discovery of first-order rules with Tertius. Machine Learning. 42:61-95.

Last Version: 1.0.2

Release Date:

tabuAndScatterSearch

nz.ac.waikato.cms.weka : tabuAndScatterSearch

Search methods contributed by Adrian Pino (ScatterSearchV1, TabuSearch). ScatterSearch: performs a Scatter Search through the space of attribute subsets. Starts with a population of many significant and diverse subsets and stops when the result is higher than a given threshold or there is no more improvement. For more information see: Felix Garcia Lopez (2004). Solving feature subset selection problem by a Parallel Scatter Search. Elsevier. Tabu Search: Abdel-Rahman Hedar, Jue Wang, Masao Fukushima (2006). Tabu Search for Attribute Reduction in Rough Set Theory.

Last Version: 1.0.2

Release Date:

rotationForest

nz.ac.waikato.cms.weka : rotationForest

An ensemble learning method inspired by bagging and random sub-spaces. Trains an ensemble of decision trees on random subspaces of the data, where each subspace has been transformed using principal components analysis.

Last Version: 1.0.3

Release Date:

probabilisticSignificanceAE

nz.ac.waikato.cms.weka : probabilisticSignificanceAE

Evaluates the worth of an attribute by computing the Probabilistic Significance as a two-way function (attribute-classes and classes-attribute association). For more information see: Amir Ahmad, Lipika Dey (2004). A feature selection technique for classificatory analysis.

Last Version: 1.0.2

Release Date:

prefuseTree

nz.ac.waikato.cms.weka : prefuseTree

A visualization component for displaying tree structures from those schemes that can output trees (e.g. decision tree learners, Cobweb clusterer etc.). This component is available from the popup menu in the Explorer's classify and cluster panels. The component uses the prefuse visualization library.

Last Version: 1.0.3

Release Date:

ordinalStochasticDominance

nz.ac.waikato.cms.weka : ordinalStochasticDominance

An implementation of the Ordinal Stochastic Dominance Learner. Further information regarding the OSDL-algorithm can be found in: S. Lievens, B. De Baets, K. Cao-Van (2006). A Probabilistic Framework for the Design of Instance-Based Supervised Ranking Algorithms in an Ordinal Setting. Annals of Operations Research; Kim Cao-Van (2003). Supervised ranking: from semantics to algorithms; Stijn Lievens (2004). Studie en implementatie van instantie-gebaseerde algoritmen voor gesuperviseerd rangschikken.

Last Version: 1.0.2

Release Date:

ordinalLearningMethod

nz.ac.waikato.cms.weka : ordinalLearningMethod

An implementation of the Ordinal Learning Method (OLM). Further information regarding the algorithm and variants can be found in: Arie Ben-David (1992). Automatic Generation of Symbolic Multiattribute Ordinal Knowledge-Based DSSs: Methodology and Applications. Decision Sciences. 23:1357-1372.

Last Version: 1.0.2

Release Date:

normalize

nz.ac.waikato.cms.weka : normalize

An instance filter that normalizes instances, considering only numeric attributes and ignoring the class index.
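
A sketch of per-instance normalization as an independent illustration; scaling each row to unit Euclidean norm is an assumption of this sketch, as the description does not state the norm used:

    public class NormalizeSketch {

        // Scales each row so its numeric attributes have unit Euclidean
        // norm; the class column is excluded from both norm and scaling.
        static void normalize(double[][] data, int classIndex) {
            for (double[] row : data) {
                double sumSq = 0;
                for (int a = 0; a < row.length; a++)
                    if (a != classIndex) sumSq += row[a] * row[a];
                double norm = Math.sqrt(sumSq);
                if (norm == 0) continue;                 // all-zero row: leave as-is
                for (int a = 0; a < row.length; a++)
                    if (a != classIndex) row[a] /= norm; // class column untouched
            }
        }

        public static void main(String[] args) {
            double[][] data = {{3, 4, 1}, {6, 8, 0}};
            normalize(data, 2); // class is the last column
            System.out.println(java.util.Arrays.deepToString(data));
            // -> [[0.6, 0.8, 1.0], [0.6, 0.8, 0.0]]
        }
    }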

Last Version: 1.0.2

Release Date:

multilayerPerceptronCS

nz.ac.waikato.cms.weka : multilayerPerceptronCS

An extension of the standard MultilayerPerceptron classifier in Weka that adds context-sensitive Multiple Task Learning (csMTL).

Last Version: 1.0.2

Release Date:

linearForwardSelection

nz.ac.waikato.cms.weka : linearForwardSelection

Extension of BestFirst. Takes a restricted number of k attributes into account. Fixed-set selects a fixed number k of attributes, whereas k is increased in each step when fixed-width is selected. The search uses either the initial ordering to select the top k attributes, or performs a ranking (with the same evaluator the search uses later on). The search direction can be forward, or floating forward selection (with optional backward search steps). For more information see: Martin Guetlein (2006). Large Scale Attribute Selection Using Wrappers. Freiburg, Germany.

Last Version: 1.0.2

Release Date:

levenshteinEditDistance

nz.ac.waikato.cms.weka : levenshteinEditDistance

Computes the Levenshtein edit distance between two strings.
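
The distance is classically computed with dynamic programming over prefix pairs; a minimal sketch (an independent illustration, not the package's code):

    public class LevenshteinSketch {

        // Insertions, deletions, and substitutions each cost 1; only two
        // rows of the DP table are kept at a time.
        static int distance(String s, String t) {
            int[] prev = new int[t.length() + 1];
            int[] cur = new int[t.length() + 1];
            for (int j = 0; j <= t.length(); j++) prev[j] = j; // "" -> t[0..j)
            for (int i = 1; i <= s.length(); i++) {
                cur[0] = i;                                    // delete i chars of s
                for (int j = 1; j <= t.length(); j++) {
                    int sub = prev[j - 1] + (s.charAt(i - 1) == t.charAt(j - 1) ? 0 : 1);
                    cur[j] = Math.min(sub, Math.min(prev[j] + 1, cur[j - 1] + 1));
                }
                int[] tmp = prev; prev = cur; cur = tmp;       // roll rows
            }
            return prev[t.length()];
        }

        public static void main(String[] args) {
            System.out.println(distance("kitten", "sitting")); // 3
        }
    }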

Last Version: 1.0.2

Release Date:

lazyBayesianRules

nz.ac.waikato.cms.weka : lazyBayesianRules

Lazy Bayesian Rules Classifier. The naive Bayesian classifier provides a simple and effective approach to classifier learning, but its attribute independence assumption is often violated in the real world. Lazy Bayesian Rules selectively relaxes the independence assumption, achieving lower error rates over a range of learning tasks. LBR defers processing to classification time, making it a highly efficient and accurate classification algorithm when small numbers of objects are to be classified. For more information, see: Zijian Zheng, G. Webb (2000). Lazy Learning of Bayesian Rules. Machine Learning. 4(1):53-84.

Last Version: 1.0.2

Release Date:

kfPMMLClassifierScoring

nz.ac.waikato.cms.weka : kfPMMLClassifierScoring

A Knowledge Flow plugin that provides a Knowledge Flow step for scoring test sets or instance streams using a PMML classifier.

Last Version: 1.0.3

Release Date:

hiddenNaiveBayes

nz.ac.waikato.cms.weka : hiddenNaiveBayes

Constructs a Hidden Naive Bayes classification model with high classification accuracy and AUC. For more information refer to: H. Zhang, L. Jiang, J. Su: Hidden Naive Bayes. In: Twentieth National Conference on Artificial Intelligence, 919-924, 2005.

Last Version: 1.0.2

Release Date:

generalizedSequentialPatterns

nz.ac.waikato.cms.weka : generalizedSequentialPatterns

Class implementing a GSP algorithm for discovering sequential patterns in a sequential data set. The attribute identifying the distinct data sequences contained in the set can be determined by the respective option. Furthermore, the set of output results can be restricted by specifying one or more attributes that have to be contained in each element/itemset of a sequence. For further information see: Ramakrishnan Srikant, Rakesh Agrawal (1996). Mining Sequential Patterns: Generalizations and Performance Improvements.

Last Version: 1.0.2

Release Date:

fuzzyLaticeReasoning

nz.ac.waikato.cms.weka : fuzzyLaticeReasoning

The Fuzzy Lattice Reasoning Classifier uses the notion of Fuzzy Lattices for creating a Reasoning Environment. The current version can be used for classification using numeric predictors. For more information see: I. N. Athanasiadis, V. G. Kaburlasos, P. A. Mitkas, V. Petridis: Applying Machine Learning Techniques on Air Quality Data for Real-Time Decision Support. In: 1st Intl. NAISO Symposium on Information Technologies in Environmental Engineering (ITEE-2003), Gdansk, Poland, 2003; V. G. Kaburlasos, I. N. Athanasiadis, P. A. Mitkas, V. Petridis (2003). Fuzzy Lattice Reasoning (FLR) Classifier and its Application on Improved Estimation of Ambient Ozone Concentration.

Last Version: 1.0.2

Release Date:

fastCorrBasedFS

nz.ac.waikato.cms.weka : fastCorrBasedFS

Feature selection method based on correlation measure and relevance and redundancy analysis. Use in conjunction with an attribute set evaluator (SymmetricalUncertAttributeEval). For more information see: Lei Yu, Huan Liu: Feature Selection for High-Dimensional Data: A Fast Correlation-Based Filter Solution. In: Proceedings of the Twentieth International Conference on Machine Learning, 856-863, 2003.

Last Version: 1.0.2

Release Date:

ensembleLibrary

nz.ac.waikato.cms.weka : ensembleLibrary

Manages a library of ensemble classifiers.

Last Version: 1.0.4

Release Date:

decorate

nz.ac.waikato.cms.weka : decorate

DECORATE is a meta-learner for building diverse ensembles of classifiers by using specially constructed artificial training examples. Comprehensive experiments have demonstrated that this technique is consistently more accurate than the base classifier, Bagging and Random Forests. Decorate also obtains higher accuracy than Boosting on small training sets, and achieves comparable performance on larger training sets. For more details see: P. Melville, R. J. Mooney: Constructing Diverse Classifier Ensembles Using Artificial Training Examples. In: Eighteenth International Joint Conference on Artificial Intelligence, 505-510, 2003; P. Melville, R. J. Mooney (2004). Creating Diversity in Ensembles Using Artificial Data. Information Fusion: Special Issue on Diversity in Multiclassifier Systems.

Last Version: 1.0.3

Release Date:

citationKNN

nz.ac.waikato.cms.weka : citationKNN

Modified version of the Citation kNN multi-instance classifier. For more information see: Jun Wang, Jean-Daniel Zucker: Solving the Multiple-Instance Problem: A Lazy Learning Approach. In: 17th International Conference on Machine Learning, 1119-1125, 2000.

Last Version: 1.0.2

Release Date:

associationRulesVisualizer

nz.ac.waikato.cms.weka : associationRulesVisualizer

A visualization component for displaying association rules that uses a modified version of the Association Rules Viewer from DESS IAGL of Lille. Requires Java 3D to be installed.

Last Version: 1.0.2

Release Date:

NNge

nz.ac.waikato.cms.weka : NNge

Nearest-neighbor-like algorithm using non-nested generalized exemplars (which are hyperrectangles that can be viewed as if-then rules). For more information, see Brent Martin (1995). Instance-Based learning: Nearest Neighbor With Generalization. Hamilton, New Zealand. Sylvain Roy (2002). Nearest Neighbor With Generalization. Christchurch, New Zealand.

Last Version: 1.0.2

Release Date: