Feedzai OpenML Providers for Python
Implementations of the Feedzai OpenML API that add support for machine learning models written in Python, using Java Embedded Python (Jep).
Modules
Generic Python
The openml-generic-python module contains a provider that allows developers to load Python code that conforms to a simple API. This is the most powerful approach, though also the most cumbersome, since models can hold state.
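As a rough illustration of what "holding state" can mean, the sketch below shows a Python class that keeps a threshold and a call counter between invocations. The class name `Classifier` and the method names `classify` and `get_class_distribution` are hypothetical, chosen for illustration only; they are not the provider's confirmed contract, so check the module's sources for the API it actually expects.

```python
from typing import List, Sequence


class Classifier:
    """Hypothetical stateful model; all names here are illustrative assumptions."""

    def __init__(self, threshold: float = 0.5) -> None:
        # State that survives across calls: a decision threshold and a counter.
        self.threshold = threshold
        self.calls = 0

    def get_class_distribution(self, instance: Sequence[float]) -> List[float]:
        """Return [P(class 0), P(class 1)] using a toy score: the clamped feature mean."""
        self.calls += 1
        mean = sum(instance) / max(len(instance), 1)
        score = min(max(mean, 0.0), 1.0)
        return [1.0 - score, score]

    def classify(self, instance: Sequence[float]) -> int:
        """Return 1 when the toy score reaches the threshold, otherwise 0."""
        return int(self.get_class_distribution(instance)[1] >= self.threshold)
```

A real model would typically load its trained parameters once in the constructor and reuse them for every scoring call, which is the kind of state this approach makes possible.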
Pull the provider from Maven Central:
<dependency>
  <groupId>com.feedzai</groupId>
  <artifactId>openml-generic-python</artifactId>
  <!-- See project tags for latest version -->
  <version>0.3.0</version>
</dependency>
Scikit-learn
The implementation in the openml-scikit module adds support for models built with scikit-learn.
Pull this module from Maven Central:
<dependency>
  <groupId>com.feedzai</groupId>
  <artifactId>openml-scikit</artifactId>
  <!-- See project tags for latest version -->
  <version>0.3.0</version>
</dependency>
Building
This is a Maven project which you can build using the following command:
mvn clean install
Environment
To use these providers you need to have Python 3.6 with the following packages installed in your environment:
* numpy
* scipy
* jep (this requires JAVA_HOME to be configured)
* scikit-learn (for the scikit provider)
Note that this section only describes the known prerequisites that are common to any model generated in Python. Before importing a model you need to ensure that the required packages for that model are also installed.
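A quick way to confirm that the common prerequisites are visible to a given interpreter is sketched below. It only checks that each package can be located, not that its version is compatible. The helper name `find_missing` is made up for this example, and `sklearn` is the import name of the scikit-learn package.

```python
import importlib.util
from typing import Iterable, List

# Import names of the prerequisites listed above (scikit-learn imports as "sklearn").
REQUIRED = ("numpy", "scipy", "jep", "sklearn")


def find_missing(packages: Iterable[str]) -> List[str]:
    """Return the top-level packages that cannot be located by this interpreter."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All common prerequisites were found.")
```

Run it with the same Python interpreter that Jep will embed, and extend the tuple with any extra packages a specific model depends on.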
Running the tests
To run the tests, two additional settings may be necessary for Jep to work properly:
- The java.library.path property needs to point to the Jep library. An approach that typically works is setting the LD_LIBRARY_PATH environment variable:
  export LD_LIBRARY_PATH=/...path to.../python3.6/site-packages/jep:$LD_LIBRARY_PATH
- Depending on the environment and package manager, it may also be necessary to set the LD_PRELOAD variable to include the Python library:
  export LD_PRELOAD=/...path to.../lib/libpython3.6m.so
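The paths in the two settings above depend on the local installation. The sketch below is one possible way to discover candidate values from the interpreter itself, assuming Jep is installed as a regular package for that interpreter. The helper names `package_dir` and `libpython_candidate` are made up for this example, and the reported library path is only a candidate: the file may not exist, or may be a static library, depending on how Python was built.

```python
import importlib.util
import os
import sysconfig
from typing import Optional


def package_dir(name: str) -> Optional[str]:
    """Return the directory of an installed package or module, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # Packages expose their directories through submodule_search_locations.
    locations = list(spec.submodule_search_locations or [])
    if locations:
        return locations[0]
    # Plain modules: fall back to the directory of the source file, if there is one.
    if spec.origin and os.path.isfile(spec.origin):
        return os.path.dirname(spec.origin)
    return None


def libpython_candidate() -> Optional[str]:
    """Return a candidate path for the Python library from the build configuration."""
    libdir = sysconfig.get_config_var("LIBDIR")
    ldlibrary = sysconfig.get_config_var("LDLIBRARY")
    if not libdir or not ldlibrary:
        return None  # Not reported on this platform or build.
    return os.path.join(libdir, ldlibrary)


if __name__ == "__main__":
    print("Jep directory (for LD_LIBRARY_PATH):", package_dir("jep"))
    print("Python library candidate (for LD_PRELOAD):", libpython_candidate())
```

If either value prints as `None`, the paths need to be located manually for that environment.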
Feedzai has built a Docker image for testing, available on Docker Hub, which is used in this repository's continuous integration. See the Travis CI configuration for the commands needed to use it. The image's Dockerfile also provides an example of how to install the environment.