zementisr Quickstart Guide

zementisr is an R client for the Zementis Server API. Zementis Server is an execution engine for PMML models which also comes with model management capabilities.

In this quickstart guide we will show how you can use zementisr to deploy PMML models to Zementis Server, predict new values by sending data to the server and manage the entire PMML model life cycle without leaving your preferred R development environment.

Authentication

Zementis Server’s REST API uses HTTP Basic Authentication, so the client must provide a username and password with every request.

Typing your password in the console is risky (you might accidentally share the .Rhistory file), and being prompted for it on every call quickly becomes cumbersome. The zementisr package therefore requires that you store your credentials and the base URL of your Zementis Server as environment variables in the .Renviron file in your home directory.

Set the environment variables below in your .Renviron file before using functions from the zementisr package. You can edit .Renviron with usethis::edit_r_environ(). R reads .Renviron at startup, so restart your R session after editing the file.

ZEMENTIS_base_url = https://localhost:9083/adapars
ZEMENTIS_usr = guybrush.threepwood
ZEMENTIS_pwd = bigwhoop
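
To confirm that the three variables above are visible to the current R session, a quick check along the following lines may help. This is a minimal sketch in base R; the helper name check_zementis_env() is hypothetical and not part of zementisr.

```r
# Hypothetical helper (not part of zementisr): return the names of the
# required environment variables that are missing or empty in this session.
check_zementis_env <- function(vars = c("ZEMENTIS_base_url",
                                        "ZEMENTIS_usr",
                                        "ZEMENTIS_pwd")) {
  values <- Sys.getenv(vars, unset = "")
  unset_vars <- vars[!nzchar(values)]
  if (length(unset_vars) > 0) {
    message("Not set: ", paste(unset_vars, collapse = ", "),
            ". Edit .Renviron and restart R.")
  }
  invisible(unset_vars)
}

check_zementis_env()
```

The function returns its result invisibly, so a correctly configured session prints nothing.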

Preparation

Before we get started using the zementisr package, we will create two simple prediction models and convert them to PMML using pmml() from the pmml package. The first PMML model will be saved to disk:

library(rpart)
library(pmml)

iris_lm <- lm(Sepal.Length ~ ., data = iris)
iris_pmml <- pmml(iris_lm, model_name = "iris_model")
saveXML(iris_pmml, "iris_pmml.xml")
#> [1] "iris_pmml.xml"

kyphosis_fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)
kyphosis_pmml <- pmml(kyphosis_fit, model_name = "kyphosis_model")

Model deployment

Now, we will start using functions from the zementisr package. We will begin with upload_model() to upload our PMML models to the server.

upload_model() accepts either a path to a PMML file on disk or an XMLNode object created with pmml::pmml(). Below we demonstrate both options by uploading the two models to the server. A successful upload always returns a list with the model name and its activation status.

library(zementisr)

upload_model("iris_pmml.xml")
#> $model_name
#> [1] "iris_model"
#> 
#> $is_active
#> [1] TRUE
upload_model(kyphosis_pmml)
#> $model_name
#> [1] "kyphosis_model"
#> 
#> $is_active
#> [1] TRUE

Basic model operations

After deployment you might want to know which PMML models are currently deployed to Zementis Server:

get_models()
#> [1] "iris_model"     "kyphosis_model"

Use get_model_properties() to get a PMML model’s name, description, creation date, activation status, and input and output field properties:

get_model_properties("kyphosis_model")
#> $modelName
#> [1] "kyphosis_model"
#> 
#> $description
#> [1] "RPart Decision Tree Model"
#> 
#> $creationDate
#> [1] "2020-01-07 20:52:18"
#> 
#> $isActive
#> [1] TRUE
#> 
#> $inputFields
#>     name   type  usage
#> 1    Age DOUBLE ACTIVE
#> 2 Number DOUBLE ACTIVE
#> 3  Start DOUBLE ACTIVE
#> 
#> $outputFields
#>                  name   type  usage
#> 1  Predicted_Kyphosis STRING OUTPUT
#> 2  Probability_absent DOUBLE OUTPUT
#> 3 Probability_present DOUBLE OUTPUT
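
The inputFields component can be used to check a data frame before sending it for scoring. The sketch below assumes the list structure printed above; missing_inputs() is a hypothetical helper, and the props object is a hand-built stand-in for the result of get_model_properties() so that the example runs without a server.

```r
# Hypothetical helper (not part of zementisr): list the model's input
# fields that are absent from a data frame. `props` is assumed to contain
# an `inputFields` data frame with a `name` column, as printed above.
missing_inputs <- function(props, data) {
  setdiff(as.character(props[["inputFields"]][["name"]]), names(data))
}

# Stand-in for get_model_properties("kyphosis_model").
props <- list(
  modelName = "kyphosis_model",
  inputFields = data.frame(
    name  = c("Age", "Number", "Start"),
    type  = "DOUBLE",
    usage = "ACTIVE",
    stringsAsFactors = FALSE
  )
)

new_data <- data.frame(Age = 71, Number = 3)
missing_inputs(props, new_data)
#> [1] "Start"
```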

To deactivate a PMML model without removing it from the server, do the following:

deactivate_model("iris_model")
#> $model_name
#> [1] "iris_model"
#> 
#> $is_active
#> [1] FALSE
deactivate_model("kyphosis_model")
#> $model_name
#> [1] "kyphosis_model"
#> 
#> $is_active
#> [1] FALSE

You can even add some magrittr and purrr flavor to chain several zementisr functions together. For instance, the following line of code activates all your PMML models at once:

library(magrittr)  # provides the %>% pipe
get_models() %>% purrr::map_df(activate_model)
#> # A tibble: 2 x 2
#>   model_name     is_active
#>   <chr>          <lgl>    
#> 1 iris_model     TRUE     
#> 2 kyphosis_model TRUE
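
If you prefer not to depend on magrittr and purrr, the same pattern can be approximated in base R. This is only a sketch: apply_to_models() is a hypothetical helper, and fake_activate() stands in for activate_model() so that the example runs without a server. It assumes each call returns a list with model_name and is_active, as shown above.

```r
# Hypothetical helper: apply `fn` to each model name and stack the returned
# lists (model_name, is_active) into one data frame.
apply_to_models <- function(model_names, fn) {
  rows <- lapply(model_names, function(m) {
    as.data.frame(fn(m), stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

# Stand-in for activate_model(), used here only for illustration.
fake_activate <- function(model_name) {
  list(model_name = model_name, is_active = TRUE)
}

res <- apply_to_models(c("iris_model", "kyphosis_model"), fake_activate)
res

# With a server available, the equivalent call would be:
# apply_to_models(get_models(), activate_model)
```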

Model predictions

To predict a single new input record, use predict_pmml(). It takes a one-row data frame as its data input and the name of the deployed PMML model that should produce the prediction. If executed successfully, predict_pmml() returns a list with the following components:

model: A length-one character vector containing the name of the PMML model that was executed on the server.
outputs: A data frame containing the prediction results. The values returned depend on the type of prediction model being executed on the server. You can spot the difference between a regression and a classification model in the output below.

predict_pmml(iris[42, ], "iris_model")
#> $model
#> [1] "iris_model"
#> 
#> $outputs
#>   Predicted_Sepal.Length
#> 1               4.295281
predict_pmml(kyphosis[23, ], "kyphosis_model")
#> $model
#> [1] "kyphosis_model"
#> 
#> $outputs
#>   Probability_present Probability_absent Predicted_Kyphosis
#> 1           0.5714286          0.4285714            present
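
As a sanity check, the server's regression result can be compared with a local prediction from the original lm object. The sketch below refits the model from the Preparation section and hard-codes the server value printed above; it assumes the server evaluates the exported PMML the same way R evaluates the fitted model.

```r
# Refit the linear model from the Preparation section and predict locally.
iris_lm <- lm(Sepal.Length ~ ., data = iris)
local_pred <- unname(predict(iris_lm, newdata = iris[42, ]))

# Value reported by the server in the output above (rounded as printed).
server_pred <- 4.295281

# Compare both values, allowing for the rounding of the printed number.
isTRUE(all.equal(local_pred, server_pred, tolerance = 1e-5))
```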

To predict multiple new input records at once, use predict_pmml_batch(), which accepts data frames, .csv files and .json files as data input. .csv and .json files can also be sent in compressed format (.zip or .gzip).

predict_pmml_batch(iris[23:25, ], "iris_model")
#> $model
#> [1] "iris_model"
#> 
#> $outputs
#>   Predicted_Sepal.Length
#> 1               4.722679
#> 2               5.059837
#> 3               5.369821
jsonlite::write_json(iris[23:25, ], "iris.json")
predict_pmml_batch("iris.json", "iris_model")
#> $model
#> [1] "iris_model"
#> 
#> $outputs
#>   Predicted_Sepal.Length
#> 1               4.722679
#> 2               5.059837
#> 3               5.369821
write.csv(iris[23:25, ], "iris.csv", row.names = FALSE)
predict_pmml_batch("iris.csv", "iris_model")
#> $model
#> [1] "iris_model"
#> 
#> $outputs
#>   Predicted_Sepal.Length
#> 1               4.722679
#> 2               5.059837
#> 3               5.369821

As the output above shows, predict_pmml_batch() also returns a list with the two components model and outputs.
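
For the compressed input mentioned earlier in this section, a gzip-compressed .csv file can be produced with base R alone. The sketch below writes the file to a temporary directory and reads it back to confirm its content. The file name iris.csv.gz is an assumption; whether the server expects this exact extension is not confirmed here, so the final call is left commented out.

```r
# Write the same three records as a gzip-compressed CSV using base R only.
gz_path <- file.path(tempdir(), "iris.csv.gz")
con <- gzfile(gz_path, open = "wt")
write.csv(iris[23:25, ], con, row.names = FALSE)
close(con)

# Reading the file back confirms that the compressed content is intact.
check <- read.csv(gzfile(gz_path))
nrow(check)
#> [1] 3

# Assumption: the server recognises this file name and extension.
# predict_pmml_batch(gz_path, "iris_model")
```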

Downloading models

download_model() lets you download the PMML source of a deployed model. You might choose to download the PMML model source before deleting the model permanently from the server with delete_model(), which is described in the next section. download_model() returns a list with two components:

The model_name of the downloaded model including the suffix “.pmml”
The model_source represented as an S3 object of class XMLInternalDocument created by parsing the server response using XML::xmlParse()

After downloading the model of your choice, you can use XML::saveXML() to store it on disk:

iris_download <- download_model("iris_model")
XML::saveXML(iris_download[["model_source"]], file = iris_download[["model_name"]])

Again using some tidyverse ingredients, you can easily download all deployed models at once and store them in a data frame:


downloads <- get_models() %>% purrr::map(download_model)

tibble::tibble(
  model_name = purrr::map_chr(downloads, "model_name"), 
  source = purrr::map(downloads, "model_source"))
#> # A tibble: 2 x 2
#>   model_name          source    
#>   <chr>               <list>    
#> 1 iris_model.pmml     <XMLIntrD>
#> 2 kyphosis_model.pmml <XMLIntrD>

To store the downloaded models on disk instead, do this:

purrr::walk2(purrr::map(downloads, "model_source"),
             purrr::map_chr(downloads, "model_name"),
             XML::saveXML)

Deleting models

After a PMML model has reached the end of its life cycle, you might want to remove it from the server using delete_model(), which always returns a character vector with the names of the models still deployed on the server:

delete_model("iris_model")
#> [1] "kyphosis_model"
delete_model("kyphosis_model")
#> character(0)
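
Because deletion is permanent, it can be useful to combine the download and delete steps so that a model is only removed once a local copy exists. The following is a sketch, not part of zementisr: backup_and_delete() is a hypothetical helper, and its function arguments default to the zementisr and XML functions used in this guide so that they can be swapped out for testing.

```r
# Hypothetical helper (not part of zementisr): save a model's PMML source
# to `dir`, and delete the model from the server only if the file exists.
backup_and_delete <- function(model_name, dir = ".",
                              download_fn = zementisr::download_model,
                              save_fn = XML::saveXML,
                              delete_fn = zementisr::delete_model) {
  dl <- download_fn(model_name)
  # download_model() returns the file name including the ".pmml" suffix.
  path <- file.path(dir, dl[["model_name"]])
  save_fn(dl[["model_source"]], file = path)
  if (!file.exists(path)) {
    stop("Backup of '", model_name, "' failed; model not deleted.")
  }
  delete_fn(model_name)
}

# With a server available:
# backup_and_delete("iris_model")
```

The defaults are evaluated lazily, so the function can be defined even when zementisr is not installed, as long as replacements are supplied when it is called.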

The `...` argument

Each function from the zementisr package comes with a ... (dot-dot-dot) argument. It is used to pass on additional arguments to the underlying HTTP method from the httr package. This might be necessary if you need to set some curl options explicitly via httr::config().
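
The forwarding mechanism itself can be illustrated with two toy functions. Neither is part of zementisr or httr: fake_http_get() stands in for an httr verb and fake_get_models() for a client function that passes ... through unchanged. The commented calls at the end show what real usage might look like, assuming httr is installed and a server is reachable.

```r
# Toy stand-in for an httr verb: it only records what it received.
fake_http_get <- function(url, ...) {
  list(url = url, extra = list(...))
}

# Toy stand-in for a client function that forwards `...` unchanged.
fake_get_models <- function(...) {
  fake_http_get("https://localhost:9083/adapars", ...)
}

res <- fake_get_models(timeout = 10)
res$extra$timeout
#> [1] 10

# Possible real usage (not run here):
# get_models(httr::timeout(10))
# get_models(httr::config(ssl_verifypeer = FALSE))
```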

Alexander Lemm
