TensorFlow Estimators in Python Machine Learning
In this tutorial, we will learn about TensorFlow Estimators using the Python programming language. Estimators are a high-level API that simplifies the task of machine learning: after the data is set up, the model is defined using a TensorFlow Estimator. The tf.estimator module provides a wide range of estimators for our use.
The tf.estimator module provides a wide variety of classes for use in a model, including LinearRegressor, LinearClassifier, and others. Premade estimators such as these ship with the library, and custom estimators can also be written; every estimator derives from the tf.estimator.Estimator base class. The Estimator interface consists of a train-evaluate-predict loop similar to that of scikit-learn.
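To make that analogy concrete, here is a minimal sketch of the same fit-evaluate-predict loop in scikit-learn; the model and tiny dataset here are purely illustrative, not part of the tutorial's data:

```python
from sklearn.linear_model import LogisticRegression

# Tiny illustrative dataset: two well-separated clusters.
X = [[0.0, 0.0], [0.1, 0.2], [2.0, 2.0], [2.1, 1.9]]
y = [0, 0, 1, 1]

model = LogisticRegression()
model.fit(X, y)                      # train
score = model.score(X, y)            # evaluate
pred = model.predict([[2.0, 2.1]])   # predict
print(score, pred)
```

An Estimator follows the same rhythm, except that each step takes an input function that builds a tf.data pipeline instead of taking arrays directly.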
[Schematic diagram of an Estimator]
Estimator Classes
For this tutorial, we create a synthetic dataset with some class overlap to feed the models. Below are the various classes in the TensorFlow Estimator API.
import numpy as np
from sklearn.datasets import make_classification

np.random.seed(42)
X, y = make_classification(n_samples=100000, n_features=2, n_informative=2, n_redundant=0)
n_train_samples = 1000
X_train, y_train = X[:n_train_samples], y[:n_train_samples]
X_test, y_test = X[n_train_samples:], y[n_train_samples:]
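As a quick sanity check (not required by the tutorial itself), we can confirm the shapes of the resulting splits:

```python
import numpy as np
from sklearn.datasets import make_classification

np.random.seed(42)
X, y = make_classification(n_samples=100000, n_features=2,
                           n_informative=2, n_redundant=0)
n_train_samples = 1000
X_train, y_train = X[:n_train_samples], y[:n_train_samples]
X_test, y_test = X[n_train_samples:], y[n_train_samples:]

print(X_train.shape, X_test.shape)  # expected: (1000, 2) (99000, 2)
# make_classification produces roughly balanced classes by default:
print(np.bincount(y))
```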
- BaselineClassifier ignores the input features entirely and simply predicts the dominant (most frequent) class from the training labels.
import tensorflow as tf
def input_fn(X, y):
    dataset = tf.data.Dataset.from_tensor_slices(({'X': X[:, 0], 'Y': X[:, 1]}, y))
    dataset = dataset.shuffle(1000).batch(1000)
    return dataset
from tensorflow.estimator import BaselineClassifier
clss = BaselineClassifier(n_classes=2)
clss.train(input_fn=lambda: input_fn(X_train, y_train), max_steps=10)
y_pred = clss.predict(input_fn=lambda: input_fn(X_test, y_test))
y_pred = np.array([p['class_ids'][0] for p in y_pred])
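What BaselineClassifier learns can be sketched in plain NumPy. The hypothetical mini-implementation below is for intuition only: it "trains" by finding the majority class of the labels and then predicts that class for every input, regardless of the features.

```python
import numpy as np

def majority_class_baseline(y_train, n_predictions):
    # "Train": find the most frequent class in the training labels.
    majority = np.bincount(y_train).argmax()
    # "Predict": emit that class regardless of the input features.
    return np.full(n_predictions, majority)

y_tr = np.array([0, 1, 1, 1, 0])
preds = majority_class_baseline(y_tr, 3)
print(preds)  # -> [1 1 1]
```

Any model we build should beat this baseline; otherwise it has learned nothing from the features.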
- LinearClassifier trains a linear model to classify instances into one of multiple classes; when the number of classes is exactly two, it acts as a binary classifier.
from tensorflow.estimator import LinearClassifier
feature_columns = [
    tf.feature_column.numeric_column(key='X', dtype=tf.float32),
    tf.feature_column.numeric_column(key='Y', dtype=tf.float32)
]
clss = LinearClassifier(n_classes=2, feature_columns=feature_columns)
clss.train(input_fn=lambda: input_fn(X_train, y_train), max_steps=10)
y_pred = clss.predict(input_fn=lambda: input_fn(X_test, y_test))
y_pred = np.array([p['class_ids'][0] for p in y_pred])
- DNNClassifier is the class for a neural network classifier; it implements a multilayer perceptron network.
from tensorflow.estimator import DNNClassifier
feature_columns = [
    tf.feature_column.numeric_column(key='X', dtype=tf.float32),
    tf.feature_column.numeric_column(key='Y', dtype=tf.float32)
]
clss = DNNClassifier(n_classes=2, feature_columns=feature_columns, hidden_units=[32, 32])
clss.train(input_fn=lambda: input_fn(X_train, y_train), max_steps=10000)
y_pred = clss.predict(input_fn=lambda: input_fn(X_test, y_test))
y_pred = np.array([p['class_ids'][0] for p in y_pred])
- BoostedTreesClassifier implements a tree ensemble for structured data; these classifiers train quickly, require relatively little tuning, and do not need a large dataset to work well.
from tensorflow.estimator import BoostedTreesClassifier
feature_columns = [
    tf.feature_column.numeric_column(key='X', dtype=tf.float32),
    tf.feature_column.numeric_column(key='Y', dtype=tf.float32)
]
clss = BoostedTreesClassifier(n_classes=2, feature_columns=feature_columns, n_trees=100, n_batches_per_layer=1)
clss.train(input_fn=lambda: input_fn(X_train, y_train), max_steps=10000)
y_pred = clss.predict(input_fn=lambda: input_fn(X_test, y_test))
y_pred = np.array([p['class_ids'][0] for p in y_pred])
Model Implementation Using Estimators
import pandas as pd
import tensorflow as tf
import numpy as np
train = pd.read_csv('./train-ready.csv')
test = pd.read_csv('./test-ready.csv')
def my_model(features, labels, mode, params):
    n = tf.feature_column.input_layer(features, params['feature_columns'])
    for units in params['hidden_units']:
        n = tf.layers.dense(n, units=units, activation=tf.nn.relu)
    logits = tf.layers.dense(n, params['n_classes'], activation=None)
    predicted_classes = tf.argmax(logits, 1)
    if mode == tf.estimator.ModeKeys.PREDICT:
        predictions = {
            'class_ids': predicted_classes[:, tf.newaxis],
            'probabilities': tf.nn.softmax(logits),
            'logits': logits,
        }
        return tf.estimator.EstimatorSpec(mode, predictions=predictions)
    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
    accuracy = tf.metrics.accuracy(labels=labels,
                                   predictions=predicted_classes,
                                   name='acc_op')
    metrics = {'accuracy': accuracy}
    tf.summary.scalar('accuracy', accuracy[1])
    if mode == tf.estimator.ModeKeys.EVAL:
        return tf.estimator.EstimatorSpec(
            mode, loss=loss, eval_metric_ops=metrics)
    # Create training op.
    assert mode == tf.estimator.ModeKeys.TRAIN
    optimizer = tf.train.AdagradOptimizer(learning_rate=0.1)
    train_op = optimizer.minimize(loss, global_step=tf.train.get_global_step())
    return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)
def train_input_fn(features, labels, batch_size):
    """Training input function."""
    dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))
    dataset = dataset.shuffle(10).repeat().batch(batch_size)
    return dataset
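Note that shuffle(10) only shuffles within a sliding buffer of 10 elements, which is quite small relative to the training set. A rough pure-Python sketch of the buffered-shuffle idea (an approximation of what tf.data does internally, shown here for intuition only):

```python
import random

def buffered_shuffle(items, buffer_size, seed=0):
    # Keep at most `buffer_size` elements in a buffer; each time the
    # buffer overflows, emit one randomly chosen element from it.
    rng = random.Random(seed)
    buffer, out = [], []
    for item in items:
        buffer.append(item)
        if len(buffer) > buffer_size:
            out.append(buffer.pop(rng.randrange(len(buffer))))
    while buffer:  # drain whatever remains in the buffer
        out.append(buffer.pop(rng.randrange(len(buffer))))
    return out

print(buffered_shuffle(range(20), buffer_size=10))
```

With a buffer this small, elements can only move a limited distance from their original position, so a buffer size closer to the dataset size gives a more thorough shuffle.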
from sklearn.model_selection import train_test_split
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.1)
units = 64  # width of the first hidden layer (chosen here for illustration)
classifier = tf.estimator.Estimator(
    model_fn=my_model,
    params={
        'feature_columns': feature_columns,
        'hidden_units': [units, int(units / 2)],
        'n_classes': 2,
    })
Training and Evaluating the Model
batch_size = 100
train_steps = 400
for i in range(100):
    classifier.train(
        input_fn=lambda: train_input_fn(X_train, y_train, batch_size),
        steps=train_steps)
That covers the different APIs in the TensorFlow Estimator module, along with their use in a simple model that demonstrates Estimators in Python.
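As a final, framework-independent check, the accuracy of any of the classifiers above can be computed directly in NumPy once predictions are collected; the small arrays below are placeholders standing in for y_test and y_pred:

```python
import numpy as np

# Placeholder arrays standing in for y_test and y_pred from the tutorial.
y_true = np.array([0, 1, 1, 0, 1])
y_hat = np.array([0, 1, 0, 0, 1])

accuracy = np.mean(y_true == y_hat)
print(accuracy)  # -> 0.8
```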