Using C# to run Python Scripts with Machine Learning Models

Ernest Bonat, Ph.D.
10 min read · Jul 24, 2018

The Python Data Ecosystem is the most popular collection of libraries and frameworks for Data Science projects that use Machine Learning (ML) algorithms today. It includes more than 1,000 libraries and frameworks. A large international Python community constantly maintains the existing ones and develops new ones.

Being able to apply Machine Learning algorithms to business decisions is very important for companies today. When and how should companies apply Machine Learning algorithms given their existing software and hardware infrastructure? Suppose you work for a company whose main programming language is C++, C#, Java or any other compiled language, rather than Python or R. A simple question follows: can that main language cover all of the company’s Data Science needs? If not, what do you do when the company wants to keep its main development language? I would like to propose a very simple solution to this very common problem. It is simple because, with a couple of lines of C# code, for example, I can run any Python script file with any developed Machine Learning model, and pass the required model parameters in and out from C# as well.

Image Classification Project

The image classification project contains a dataset of thousands of predefined grayscale images. Based on specific project requirements, these images need to be classified into two categories, 0 or 1, so this is a simple binary classification project. To build the Machine Learning model I decided to use the scikit-learn MLPClassifier() classification model as my first option. In general, I have found that many companies start their image classification Data Science projects with the eXtreme Gradient Boosting (XGBoost) algorithm. I think it’s important for every Data Scientist to know that XGBoost has won many Data Science and Machine Learning competitions, including on Kaggle. I’ll cover this algorithm in future blog posts. The dataset was split into train/valid/test sets as 80/10/10. To tune the MLPClassifier() hyperparameters, the GridSearchCV() method was used with the following candidate values:

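from sklearn.model_selection import KFold
from sklearn.neural_network import MLPClassifier

ml_model = MLPClassifier()
# each hidden_layer_sizes value is a plain int: one hidden layer of that many units
hyper_parameter_candidates = [{"hidden_layer_sizes": [20, 50, 100],
                               "max_iter": [500, 800, 1000],
                               "activation": ["identity", "logistic", "tanh", "relu"],
                               "solver": ["lbfgs", "sgd", "adam"]}]
scoring_parameter = "accuracy"
# 5-fold cross-validation with a fixed seed for reproducible shuffling
cv_fold = KFold(n_splits=5, shuffle=True, random_state=1)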
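
To get a feel for how much work this grid implies, the candidates can be enumerated with the standard library alone. The sketch below is only an illustration, not part of the original project: the helper enumerate_candidates() is a hypothetical name, the values are copied from the grid above, and it assumes that a grid search fits every combination once per fold, without counting any final refit on the full training set.

```python
from itertools import product

# Same candidate values as in the grid above (copied here for illustration).
grid = {
    "hidden_layer_sizes": [20, 50, 100],
    "max_iter": [500, 800, 1000],
    "activation": ["identity", "logistic", "tanh", "relu"],
    "solver": ["lbfgs", "sgd", "adam"],
}
n_splits = 5  # number of folds used by KFold above


def enumerate_candidates(grid):
    """Return every hyperparameter combination as a dict (hypothetical helper)."""
    names = list(grid)
    return [dict(zip(names, values)) for values in product(*grid.values())]


candidates = enumerate_candidates(grid)
total_fits = len(candidates) * n_splits

print("candidates:", len(candidates))
print("cross-validation fits:", total_fits)
```

With these values the grid has 3 × 3 × 4 × 3 = 108 candidates, so 5-fold cross-validation means 540 model fits. That is worth knowing before starting the search on thousands of images.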

In my previous blog post, “Refactoring Python Code for Machine Learning Projects. Python ‘Spaghetti Code’ Everywhere!”, I presented a generic function, tune_hyperparameter_model(), that creates an optimized classification model using the GridSearchCV() and RandomizedSearchCV() methods.

Using the print_searchcv_result() method of the ImageFileConvertFlattenArrayClass Python class shown in the code below, the best score and hyperparameters were determined as:

mean:0.985 std:(+/-0.006) for {"activation":"identity", "hidden_layer_sizes":100, "max_iter":500, "solver":"sgd"}

The table shows good statistical results, with a 97.34% accuracy score on the image test dataset.

Using the Python pickle library, the classification model was saved locally as image_classification.pkl. Now that the model has been created, let’s find out how C# can call it and pass input and output parameters to and from it.
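
The save and load steps can be sketched with pickle alone. This is a minimal illustration under stated assumptions: a plain dictionary of made-up parameters stands in for the trained MLPClassifier, save_model(), load_model() and predict() are hypothetical helpers, and the file is written to a temporary folder rather than the project directory.

```python
import os
import pickle
import tempfile


def save_model(model, file_path):
    """Serialize the model object to a binary pickle file."""
    with open(file_path, "wb") as model_file:
        pickle.dump(model, model_file)


def load_model(file_path):
    """Read the model object back from a binary pickle file."""
    with open(file_path, "rb") as model_file:
        return pickle.load(model_file)


def predict(model, rows):
    """Label a row 1 when its mean value exceeds the stored threshold."""
    return [1 if sum(row) / len(row) > model["threshold"] else 0 for row in rows]


# Stand-in for a trained classifier (assumption: not the article's real model).
stand_in_model = {"name": "threshold_classifier", "threshold": 0.5}

model_path = os.path.join(tempfile.mkdtemp(), "image_classification.pkl")
save_model(stand_in_model, model_path)
restored_model = load_model(model_path)

predictions = predict(restored_model, [[0.9, 0.8], [0.1, 0.2]])
print(predictions)
```

Two practical points apply to the real model as well: a pickle file should only be loaded from a trusted source, because unpickling can execute arbitrary code, and the machine that loads a pickled scikit-learn model needs scikit-learn installed.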

Running Python Script File in C#

I’ll be using the .NET System.Diagnostics Namespace to run a Python script file as a system process. Here is what the help documentation says. “The System.Diagnostics namespace provides classes that allow you to interact with system processes, event logs, and performance counters. The Process class provides functionality to monitor system processes across the network, and to start and stop local system processes. In addition to retrieving lists of running processes (by specifying either the computer, the process name, or the process id) or viewing information about the process that currently has access to the processor, you can get detailed knowledge of process threads and modules both through the Process class itself, and by interacting with the ProcessThread and ProcessModule classes. The ProcessStartInfo class enables you to specify a variety of elements with which to start a new process, such as input, output, and error streams, working directories, and command line verbs and arguments. These give you fine control over the behavior of your processes. Other related classes let you specify window styles, process and thread priorities, and interact with collections of threads and modules.”

The code below is a simple C# class that runs a Python script file with any developed Machine Learning model. The public method ExecutePythonScript() does all the work. As you can see, the Python script file is passed as the argument of the process StartInfo object (Arguments = filePythonScript). The StandardOutput property exposes a stream with the text written by the Python print() function, which evaluates each expression in turn and writes the resulting object to standard output. The StandardError property exposes a stream with any errors raised by the Python script file. In the catch block, I capture the exception message to keep the code consistent.

using System;
using System.Diagnostics;

namespace RunPythonScript
{
    /// <summary>
    /// Machine Learning C# - Python
    /// </summary>
    public class MLSharpPython : IMLSharpPython
    {
        public readonly string filePythonExePath;

        /// <summary>
        /// ML Sharp Python class constructor
        /// </summary>
        /// <param name="exePythonPath">Python EXE file path</param>
        public MLSharpPython(string exePythonPath)
        {
            filePythonExePath = exePythonPath;
        }

        /// <summary>
        /// Execute Python script file
        /// </summary>
        /// <param name="filePythonScript">Python script file and input parameter(s)</param>
        /// <param name="standardError">Output standard error</param>
        /// <returns>Output text result</returns>
        public string ExecutePythonScript(string filePythonScript, out string standardError)
        {
            string outputText = string.Empty;
            standardError = string.Empty;
            try
            {
                using (Process process = new Process())
                {
                    process.StartInfo = new ProcessStartInfo(filePythonExePath)
                    {
                        Arguments = filePythonScript,
                        UseShellExecute = false,
                        RedirectStandardOutput = true,
                        RedirectStandardError = true,
                        CreateNoWindow = true
                    };
                    process.Start();
                    // Read stderr asynchronously so a full error buffer cannot block the stdout read
                    var standardErrorTask = process.StandardError.ReadToEndAsync();
                    outputText = process.StandardOutput.ReadToEnd();
                    outputText = outputText.Replace(Environment.NewLine, string.Empty);
                    standardError = standardErrorTask.Result;
                    process.WaitForExit();
                }
            }
            catch (Exception ex)
            {
                // Surface the failure (for example, a wrong Python path) to the caller
                standardError = ex.Message;
            }
            return outputText;
        }
    }
}

The interface of the MLSharpPython class is defined below.

namespace RunPythonScript
{
    public interface IMLSharpPython
    {
        string ExecutePythonScript(string filePythonScript, out string standardError);
    }
}
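
The redirect-and-read pattern used by ExecutePythonScript() is not specific to .NET. As a point of comparison, here is a rough Python analogue built on the standard subprocess module. It is a sketch, not a port of the class: execute_python_script() is a hypothetical name, and an inline one-line program stands in for the real script file.

```python
import subprocess
import sys


def execute_python_script(python_exe_path, script_arguments):
    """Run the interpreter with the given argument list.

    Returns (output_text, standard_error), mirroring ExecutePythonScript().
    """
    completed = subprocess.run(
        [python_exe_path] + list(script_arguments),
        capture_output=True,  # redirect both stdout and stderr
        text=True,            # decode the captured bytes to str
    )
    # Strip line breaks, as the C# class does with Environment.NewLine.
    output_text = completed.stdout.replace("\r\n", "").replace("\n", "")
    return output_text, completed.stderr


# Stand-in for a real script: an inline program that prints a category.
output_text, standard_error = execute_python_script(
    sys.executable, ["-c", "print('1')"]
)
print(repr(output_text), repr(standard_error))
```

Passing the arguments as a list, rather than one concatenated string, avoids quoting problems when a path contains spaces.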

Because a C# class was created, it is best practice to also implement a unit test class for it. I often see class libraries created without interfaces or unit tests, and the results are usually poor or suboptimal performance with no way of knowing why. That is why we implement unit tests as a standard practice. The code below implements a simple unit test for the ExecutePythonScript() method. There is not a lot to explain here; it’s standard C# unit test code.

using Microsoft.VisualStudio.TestTools.UnitTesting;

namespace RunPythonScript.Tests
{
    [TestClass()]
    public class MLSharpPythonTests
    {
        private static string filePythonExePath = Properties.Settings.Default.FilePythonExePath;
        private static string folderImagePath = Properties.Settings.Default.FolderImagePath;
        private static string filePythonNamePath = Properties.Settings.Default.FilePythonNamePath;
        private static string filePythonParameterName = Properties.Settings.Default.FilePythonParameterName;

        [TestMethod()]
        public void ExecutePythonScriptTest()
        {
            string standardError;
            string expectedOutputText = "1";
            string fileNamePythonExe = filePythonExePath;
            MLSharpPython mlSharpPython = new MLSharpPython(fileNamePythonExe);
            string imagePathName = folderImagePath + "Image_Test_Name.png";
            string fileNameParameter = $"{filePythonNamePath} {filePythonParameterName} {imagePathName}";
            string actualOutputText = mlSharpPython.ExecutePythonScript(fileNameParameter, out standardError);
            Assert.AreEqual(expectedOutputText, actualOutputText);
        }
    }
}

Here is a passed test result after running.

Summary
Last Test Run Passed (Total Run Time 0:00:02.120212)
1 Test Passed

Now that I have developed the class and its interface and unit test, I can create a simple program to run a Python script file. Because we should not hardcode values in any programming language, I created the app.config file shown below to set the Python executable path, the image folder path, the Python script file path and name, and the input parameter name.

<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <applicationSettings>
    <RunPythonScript.Properties.Settings>
      <setting name="FilePythonExePath" serializeAs="String">
        <value>C:\FilePythonExePath\python.exe</value>
      </setting>
      <setting name="FolderImagePath" serializeAs="String">
        <value>C:\FolderImagePath\</value>
      </setting>
      <setting name="FilePythonNamePath" serializeAs="String">
        <value>C:\FilePythonNamePat\image_use_model_pkl_classification_csharp.py</value>
      </setting>
      <setting name="FilePythonParameterName" serializeAs="String">
        <value>-image_path_name</value>
      </setting>
    </RunPythonScript.Properties.Settings>
  </applicationSettings>
</configuration>

Let’s look at a C# console application that shows how to run a Python script file using the MLSharpPython class. I believe I have commented it well enough that anyone should be able to understand it and use it in their company’s C# Data Science projects. As you can see from the code, the output text of the ExecutePythonScript() method determines whether the test image file passed in belongs to category 0 or 1.

using System;

namespace RunPythonScript
{
    public class Program
    {
        // Get config settings
        private static string filePythonExePath = Properties.Settings.Default.FilePythonExePath;
        private static string folderImagePath = Properties.Settings.Default.FolderImagePath;
        private static string filePythonNamePath = Properties.Settings.Default.FilePythonNamePath;
        private static string filePythonParameterName = Properties.Settings.Default.FilePythonParameterName;

        static void Main(string[] args)
        {
            string outputText, standardError;

            // Instantiate Machine Learning C# - Python class object
            IMLSharpPython mlSharpPython = new MLSharpPython(filePythonExePath);
            // Test image
            string imagePathName = folderImagePath + "Image_Test_Name.png";
            // Define Python script file and input parameter name
            string fileNameParameter = $"{filePythonNamePath} {filePythonParameterName} {imagePathName}";
            // Execute the python script file
            outputText = mlSharpPython.ExecutePythonScript(fileNameParameter, out standardError);
            if (string.IsNullOrEmpty(standardError))
            {
                switch (outputText.ToLower())
                {
                    case "1":
                        Console.WriteLine("Image category 1");
                        break;
                    case "0":
                        Console.WriteLine("Image category 0");
                        break;
                    default:
                        Console.WriteLine(outputText);
                        break;
                }
            }
            else
            {
                Console.WriteLine(standardError);
            }
            Console.ReadKey();
        }
    }
}

Below is an example of the program result for category 1 image.

Result: Image category 1
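
On the Python side, the argument string built in the C# program arrives as a list of tokens that argparse matches against the -image_path_name option. The sketch below illustrates this with token lists written by hand. The second path, which contains a space, is a made-up example, and build_parser() is a hypothetical helper. It suggests why a path with spaces would need to be wrapped in double quotes when the C# argument string is assembled.

```python
import argparse


def build_parser():
    """Create a parser with the same option name as FilePythonParameterName."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-image_path_name")
    return parser


parser = build_parser()

# A path without spaces arrives as a single token after the option name.
ok_args, ok_extra = parser.parse_known_args(
    ["-image_path_name", r"C:\FolderImagePath\Image_Test_Name.png"]
)

# Hypothetical path with a space, left unquoted: it arrives as two tokens,
# so only the first part is bound to the option.
bad_args, bad_extra = parser.parse_known_args(
    ["-image_path_name", r"C:\My", r"Images\Image_Test_Name.png"]
)

print(ok_args.image_path_name, ok_extra)
print(bad_args.image_path_name, bad_extra)
```

In the unquoted case the script would look for a file named C:\My and report it as not found, even though the image exists.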

Python Script File

It’s time to look at the Python script. Let me start by mentioning that, based on my Data Science consulting experience, the most commonly used image processing libraries in Python are, in order: OpenCV, scikit-image and the Python Imaging Library (PIL or Pillow). If you’ve never used the OpenCV library before, the OpenCV-Python Tutorials documentation should be a good starting point.

Let’s look at the Python code below and explain what it does. First of all, we’ll need NumPy, OpenCV and pickle (the Python object serialization library, which ships with Python), plus pandas, which main() uses to build the data frame. The class ImageFileConvertFlattenArrayClass includes the image_flatten_pixel_nparray_opencv() method, which resizes the image and flattens its array from 2D to 1D. The generic print_exception_message() method prints the details of any exception that occurs.

To run this code, the parameter -image_path_name needs to be passed as a command-line argument; it is parsed in the ‘__main__’ block at the bottom of the script. The main() function opens, loads and closes the classifier model file image_classification.pkl. The predict method of the loaded classifier model determines whether the image passed in is category 0 or 1. The Python code is commented to make it easy to follow.

# python libraries
import os
import sys
import time
import traceback
import argparse
import pickle

# python data ecosystem libraries
import numpy as np
import pandas as pd
import cv2

# project settings module (provides X_MIN and X_MAX)
import config

class ImageFileConvertFlattenArrayClass(object):
    """
    image file converter to flatten array class
    """

    def __init__(self):
        pass

    def image_flatten_pixel_nparray_opencv(self, image_path_name, nparray_dimension="1d", image_resize_factor=None):
        """
        resize and flatten the image from 2d to 1d array
        :param image_path_name: image path and file name
        :param nparray_dimension: nparray dimension type
        :param image_resize_factor: image resize (reshape) factor
        :return image pixels as a numpy array, or None on error
        """
        image_nparray = None
        try:
            image = cv2.imread(image_path_name, cv2.IMREAD_GRAYSCALE)
            image_height, image_width = image.shape
            if image_resize_factor is not None:
                # shrink the width by the factor and keep the aspect ratio
                image_width_resize = int(image_width / image_resize_factor)
                aspect_ratio = image_height / image_width
                image_height_resize = int(aspect_ratio * image_width_resize)
                image_dimension = (image_width_resize, image_height_resize)
                image = cv2.resize(image, image_dimension, interpolation=cv2.INTER_AREA)
                image_width, image_height = image_width_resize, image_height_resize
            # np.frombuffer replaces the deprecated binary mode of np.fromstring
            image_nparray = np.frombuffer(image.tobytes(), dtype=np.uint8)
            if nparray_dimension == "1d":
                # use the current size so this also works without a resize factor
                image_nparray = image_nparray.reshape((1, image_width * image_height))
        except Exception:
            exception_message = sys.exc_info()[0]
            print("An error occurred. {}".format(exception_message))
        return image_nparray

    def print_searchcv_result(self, classifier_model):
        """
        print grid or randomized search cv results: best score and best parameters
        :param classifier_model: defined classifier model
        :return none
        """
        try:
            print("Scores:")
            means = classifier_model.cv_results_["mean_test_score"]
            standard_deviations = classifier_model.cv_results_["std_test_score"]
            for mean, standard_deviation, parameter in zip(means, standard_deviations, classifier_model.cv_results_["params"]):
                mean = float("{0:0.3f}".format(mean))
                # report two standard deviations; double once here, not again when printing
                standard_deviation = float("{0:0.3f}".format(standard_deviation * 2))
                print("mean:{} std:(+/-{}) for {}".format(mean, standard_deviation, parameter))
            print()
            print("Best Score:")
            print(float("{0:0.3f}".format(classifier_model.best_score_)))
            print()
            print("Best Parameters:")
            print(classifier_model.best_params_)
            print()
        except Exception:
            exception_message = sys.exc_info()[0]
            print("An error occurred. {}".format(exception_message))

    def print_exception_message(self, message_orientation="horizontal"):
        """
        print full exception message
        :param message_orientation: horizontal or vertical
        :return None
        """
        try:
            exc_type, exc_value, exc_tb = sys.exc_info()
            file_name, line_number, procedure_name, line_code = traceback.extract_tb(exc_tb)[-1]
            time_stamp = " [Time Stamp]: " + str(time.strftime("%Y-%m-%d %I:%M:%S %p"))
            file_name = " [File Name]: " + str(file_name)
            procedure_name = " [Procedure Name]: " + str(procedure_name)
            error_message = " [Error Message]: " + str(exc_value)
            error_type = " [Error Type]: " + str(exc_type)
            line_number = " [Line Number]: " + str(line_number)
            line_code = " [Line Code]: " + str(line_code)
            if message_orientation == "horizontal":
                print("An error occurred:{};{};{};{};{};{};{}".format(time_stamp, file_name, procedure_name, error_message, error_type, line_number, line_code))
            elif message_orientation == "vertical":
                print("An error occurred:\n{}\n{}\n{}\n{}\n{}\n{}\n{}".format(time_stamp, file_name, procedure_name, error_message, error_type, line_number, line_code))
        except Exception:
            pass

def main(image_path_name):
    """
    main function start program
    :param image_path_name: image path and file name
    """
    # instantiate the object for image array flatten class
    # (outside the try block so the except handler can always use it)
    image_array_flatten_class = ImageFileConvertFlattenArrayClass()
    try:
        # check if the image path file name exists
        if not os.path.exists(image_path_name):
            print("File {} not found.".format(image_path_name))
            sys.exit()

        # resize and flatten the image from 2d to 1d array
        nparray_dimension = "1d"
        image_resize_factor = 10
        image_1d_nparray = image_array_flatten_class.image_flatten_pixel_nparray_opencv(image_path_name, nparray_dimension, image_resize_factor)

        # create the data frame
        X_real = pd.DataFrame(image_1d_nparray)

        # data frame normalization
        X_min = config.X_MIN
        X_max = config.X_MAX
        X_real = (X_real.astype("float32") - X_min) / (X_max - X_min)

        # open, load and close the mlp classifier pickle model
        project_directory_path = os.path.dirname(os.path.realpath(__file__))
        mlp_classifier_model_path = os.path.join(project_directory_path, "image_classification.pkl")
        with open(mlp_classifier_model_path, "rb") as mlp_classifier_model_pkl:
            mlp_classifier_model_file = pickle.load(mlp_classifier_model_pkl)

        # run the predict method and validate for image category 1 or 0
        y_predict_file = mlp_classifier_model_file.predict(X_real)
        if y_predict_file[0] == 1:
            print("1")
        else:
            print("0")
    except Exception:
        # SystemExit from sys.exit() is not an Exception, so it is not reported as an error
        image_array_flatten_class.print_exception_message()

# main top-level start program
if __name__ == '__main__':
    arg_parse = argparse.ArgumentParser()
    arg_parse.add_argument("-image_path_name")
    arguments = arg_parse.parse_args()
    image_path_name = arguments.image_path_name
    main(image_path_name)
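
The resize arithmetic, the flattening and the min-max normalization performed by the script can be followed on a tiny example in plain Python, without OpenCV or NumPy. This is a sketch under stated assumptions: the three helper functions are hypothetical, the 2x2 image is made up, and the bounds 0 and 255 are assumed values for 8-bit grayscale, since the article reads X_MIN and X_MAX from a config module that is not shown.

```python
def resized_dimensions(image_height, image_width, image_resize_factor):
    """Return (width, height) after shrinking by the factor, keeping aspect ratio."""
    width_resize = int(image_width / image_resize_factor)
    aspect_ratio = image_height / image_width
    height_resize = int(aspect_ratio * width_resize)
    return width_resize, height_resize


def flatten(image_rows):
    """Turn a 2D list of pixel rows into one 1D list in row-major order."""
    return [pixel for row in image_rows for pixel in row]


def min_max_normalize(values, x_min, x_max):
    """Scale values linearly so x_min maps to 0.0 and x_max maps to 1.0."""
    return [(value - x_min) / (x_max - x_min) for value in values]


# Assumed bounds for 8-bit grayscale; the real values live in config.
X_MIN, X_MAX = 0, 255

tiny_image = [[0, 51], [102, 255]]  # made-up 2x2 grayscale image
new_size = resized_dimensions(480, 640, 10)
normalized = min_max_normalize(flatten(tiny_image), X_MIN, X_MAX)

print(new_size)
print(normalized)
```

Whatever bounds were used to normalize the training data must be used again at prediction time, which is presumably why the script reads them from a shared config module instead of recomputing them from a single image.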

Deployment Requirements

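To run a Python script file from C#, as I explained before, the client PC needs the Python interpreter installed along with the Python Data Ecosystem libraries the script relies on: NumPy, OpenCV, pandas and scikit-learn (needed to load the pickled MLPClassifier model). Pickle itself ships with Python. The image_classification.pkl model file and the config module must be deployed next to the script. That’s all, very simple!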

Conclusion

Running Python script files with any developed Machine Learning model from Microsoft C# could be a good solution for companies that want to keep their main development infrastructure. The C# code can pass any necessary model parameters in and read the results back. This solution enables any Microsoft C# business application to integrate Machine Learning models developed in Python with any of the Python Data Ecosystem frameworks, an excellent idea for a robust .NET system architecture. The same approach could be applied from Java, C++, etc.

I really believe that after this blog post is published, many .NET shops will use the power of the Python Data Ecosystem to tackle their Data Science project needs, along with well-known Machine Learning frameworks like scikit-learn, TensorFlow, Caffe, PyTorch, Keras, Neon, etc. Feel free to contact me with your feedback and let me know how I can help!

Ernest Bonat, Ph.D.

I’m a Senior Data Scientist and Engineer consultant. I work on Machine Learning application projects for Life Sciences using Python and Python Data Ecosystem.