esball登录
  咨询电话:15254611038

esball世博客户端

ML.NET速览

什么是ML.NET?

ML.NET是由微软创建,为.NET开发者准备的开源机器学习框架。它是跨平台的,可以在macOS,Linux及Windows上运行。

机器学习管道

ML.NET通过管道(pipeline)方式组合机器学习过程。整个管道分为以下四个部分:

Load Data 加载数据Transform Data 转换数据Choose Algorithm 选择算法Train Model 训练模型

示例

建立一个控制台项目。

dotnet new console -o myAppcd myApp

添加ML.NET类库包。

dotnet add package Microsoft.ML

在工程文件夹下创建一个名为iris-data.txt的文本文件,内容如下:

5.1,3.5,1.4,0.2,Iris-setosa4.9,3.0,1.4,0.2,Iris-setosa4.7,3.2,1.3,0.2,Iris-setosa4.6,3.1,1.5,0.2,Iris-setosa5.0,3.6,1.4,0.2,Iris-setosa5.4,3.9,1.7,0.4,Iris-setosa4.6,3.4,1.4,0.3,Iris-setosa5.0,3.4,1.5,0.2,Iris-setosa4.4,2.9,1.4,0.2,Iris-setosa4.9,3.1,1.5,0.1,Iris-setosa5.4,3.7,1.5,0.2,Iris-setosa4.8,3.4,1.6,0.2,Iris-setosa4.8,3.0,1.4,0.1,Iris-setosa4.3,3.0,1.1,0.1,Iris-setosa5.8,4.0,1.2,0.2,Iris-setosa5.7,4.4,1.5,0.4,Iris-setosa5.4,3.9,1.3,0.4,Iris-setosa5.1,3.5,1.4,0.3,Iris-setosa5.7,3.8,1.7,0.3,Iris-setosa5.1,3.8,1.5,0.3,Iris-setosa5.4,3.4,1.7,0.2,Iris-setosa5.1,3.7,1.5,0.4,Iris-setosa4.6,3.6,1.0,0.2,Iris-setosa5.1,3.3,1.7,0.5,Iris-setosa4.8,3.4,1.9,0.2,Iris-setosa5.0,3.0,1.6,0.2,Iris-setosa5.0,3.4,1.6,0.4,Iris-setosa5.2,3.5,1.5,0.2,Iris-setosa5.2,3.4,1.4,0.2,Iris-setosa4.7,3.2,1.6,0.2,Iris-setosa4.8,3.1,1.6,0.2,Iris-setosa5.4,3.4,1.5,0.4,Iris-setosa5.2,4.1,1.5,0.1,Iris-setosa5.5,4.2,1.4,0.2,Iris-setosa4.9,3.1,1.5,0.1,Iris-setosa5.0,3.2,1.2,0.2,Iris-setosa5.5,3.5,1.3,0.2,Iris-setosa4.9,3.1,1.5,0.1,Iris-setosa4.4,3.0,1.3,0.2,Iris-setosa5.1,3.4,1.5,0.2,Iris-setosa5.0,3.5,1.3,0.3,Iris-setosa4.5,2.3,1.3,0.3,Iris-setosa4.4,3.2,1.3,0.2,Iris-setosa5.0,3.5,1.6,0.6,Iris-setosa5.1,3.8,1.9,0.4,Iris-setosa4.8,3.0,1.4,0.3,Iris-setosa5.1,3.8,1.6,0.2,Iris-setosa4.6,3.2,1.4,0.2,Iris-setosa5.3,3.7,1.5,0.2,Iris-setosa5.0,3.3,1.4,0.2,Iris-setosa7.0,3.2,4.7,1.4,Iris-versicolor6.4,3.2,4.5,1.5,Iris-versicolor6.9,3.1,4.9,1.5,Iris-versicolor5.5,2.3,4.0,1.3,Iris-versicolor6.5,2.8,4.6,1.5,Iris-versicolor5.7,2.8,4.5,1.3,Iris-versicolor6.3,3.3,4.7,1.6,Iris-versicolor4.9,2.4,3.3,1.0,Iris-versicolor6.6,2.9,4.6,1.3,Iris-versicolor5.2,2.7,3.9,1.4,Iris-versicolor5.0,2.0,3.5,1.0,Iris-versicolor5.9,3.0,4.2,1.5,Iris-versicolor6.0,2.2,4.0,1.0,Iris-versicolor6.1,2.9,4.7,1.4,Iris-versicolor5.6,2.9,3.6,1.3,Iris-versicolor6.7,3.1,4.4,1.4,Iris-versicolor5.6,3.0,4.5,1.5,Iris-versicolor5.8,2.7,4.1,1.0,Iris-versicolor6.2,2.2,4.5,1.5,Iris-versicolor5.6,2.5,3.9,1.1,Iris-versicolor5.9,3.2,4.8,1.8,Iris-versicolor6.1,2.8,4.0,1.3,Iris-versicolor6.3,2.5,4.9,1.5,Iris-versicolor6.1,2.8,4.7,1.2,Iris-versicolor6.4,2.9,4.3,1.3,Iris-versicolor6.6,3.0,4.4,1.4,Iris-versicolor6.8,2.8,4.8,1.4,Iris-versicolor6.7,3.0,5.0,1.7,Iris-versicolor6.0,2.9,4.5,1.5,Iris-versicolor5.7,2.6,3.5,1.0,Iris-versicolor5.5,2.4,3.8,1.1,Iris-versicolor5.5,2.4,3.7,1.0,Iris-versicolor5.8,2.7,3.9,1.2,Iris-versicolor6.0,2.7,5.1,1.6,Iris-versicolor5.4,3.0,4.5,1.5,Iris-versicolor6.0,3.4,4.5,1.6,Iris-versicolor6.7,3.1,4.7,1.5,Iris-versicolor6.3,2.3,4.4,1.3,Iris-versicolor5.6,3.0,4.1,1.3,Iris-versicolor5.5,2.5,4.0,1.3,Iris-versicolor5.5,2.6,4.4,1.2,Iris-versicolor6.1,3.0,4.6,1.4,Iris-versicolor5.8,2.6,4.0,1.2,Iris-versicolor5.0,2.3,3.3,1.0,Iris-versicolor5.6,2.7,4.2,1.3,Iris-versicolor5.7,3.0,4.2,1.2,Iris-versicolor5.7,2.9,4.2,1.3,Iris-versicolor6.2,2.9,4.3,1.3,Iris-versicolor5.1,2.5,3.0,1.1,Iris-versicolor5.7,2.8,4.1,1.3,Iris-versicolor6.3,3.3,6.0,2.5,Iris-virginica5.8,2.7,5.1,1.9,Iris-virginica7.1,3.0,5.9,2.1,Iris-virginica6.3,2.9,5.6,1.8,Iris-virginica6.5,3.0,5.8,2.2,Iris-virginica7.6,3.0,6.6,2.1,Iris-virginica4.9,2.5,4.5,1.7,Iris-virginica7.3,2.9,6.3,1.8,Iris-virginica6.7,2.5,5.8,1.8,Iris-virginica7.2,3.6,6.1,2.5,Iris-virginica6.5,3.2,5.1,2.0,Iris-virginica6.4,2.7,5.3,1.9,Iris-virginica6.8,3.0,5.5,2.1,Iris-virginica5.7,2.5,5.0,2.0,Iris-virginica5.8,2.8,5.1,2.4,Iris-virginica6.4,3.2,5.3,2.3,Iris-virginica6.5,3.0,5.5,1.8,Iris-virginica7.7,3.8,6.7,2.2,Iris-virginica7.7,2.6,6.9,2.3,Iris-virginica6.0,2.2,5.0,1.5,Iris-virginica6.9,3.2,5.7,2.3,Iris-virginica5.6,2.8,4.9,2.0,Iris-virginica7.7,2.8,6.7,2.0,Iris-virginica6.3,2.7,4.9,1.8,Iris-virginica6.7,3.3,5.7,2.1,Iris-virginica7.2,3.2,6.0,1.8,Iris-virginica6.2,2.8,4.8,1.8,Iris-virginica6.1,3.0,4.9,1.8,Iris-virginica6.4,2.8,5.6,2.1,Iris-virginica7.2,3.0,5.8,1.6,Iris-virginica7.4,2.8,6.1,1.9,Iris-virginica7.9,3.8,6.4,2.0,Iris-virginica6.4,2.8,5.6,2.2,Iris-virginica6.3,2.8,5.1,1.5,Iris-virginica6.1,2.6,5.6,1.4,Iris-virginica7.7,3.0,6.1,2.3,Iris-virginica6.3,3.4,5.6,2.4,Iris-virginica6.4,3.1,5.5,1.8,Iris-virginica6.0,3.0,4.8,1.8,Iris-virginica6.9,3.1,5.4,2.1,Iris-virginica6.7,3.1,5.6,2.4,Iris-virginica6.9,3.1,5.1,2.3,Iris-virginica5.8,2.7,5.1,1.9,Iris-virginica6.8,3.2,5.9,2.3,Iris-virginica6.7,3.3,5.7,2.5,Iris-virginica6.7,3.0,5.2,2.3,Iris-virginica6.3,2.5,5.0,1.9,Iris-virginica6.5,3.0,5.2,2.0,Iris-virginica6.2,3.4,5.4,2.3,Iris-virginica5.9,3.0,5.1,1.8,Iris-virginica

粘贴下面的代码到Program文件中。

using System;using Microsoft.ML;using Microsoft.ML.Runtime.Api;using Microsoft.ML.Runtime.Data;namespace myApp{ class Program { public class IrisData { public float SepalLength; public float SepalWidth; public float PetalLength; public float PetalWidth; public string Label; } public class IrisPrediction { [ColumnName("PredictedLabel")] public string PredictedLabels; } static void Main(string[] args) { var mlContext = new MLContext(); string dataPath = "iris-data.txt"; var reader = mlContext.Data.TextReader(new TextLoader.Arguments() { Separator = ",", HasHeader = true, Column = new[] { new TextLoader.Column("SepalLength", DataKind.R4, 0), new TextLoader.Column("SepalWidth", DataKind.R4, 1), new TextLoader.Column("PetalLength", DataKind.R4, 2), new TextLoader.Column("PetalWidth", DataKind.R4, 3), new TextLoader.Column("Label", DataKind.Text, 4) } }); IDataView trainingDataView = reader.Read(new MultiFileSource(dataPath)); var pipeline = mlContext.Transforms.Categorical.MapValueToKey("Label") .Append(mlContext.Transforms.Concatenate("Features", "SepalLength", "SepalWidth", "PetalLength", "PetalWidth")) .Append(mlContext.MulticlassClassification.Trainers.StochasticDualCoordinateAscent(label: "Label", features: "Features")) .Append(mlContext.Transforms.Conversion.MapKeyToValue("PredictedLabel")); var model = pipeline.Fit(trainingDataView); var prediction = model.MakePredictionFunction<IrisData, IrisPrediction>(mlContext).Predict( new IrisData() { SepalLength = 3.3f, SepalWidth = 1.6f, PetalLength = 0.2f, PetalWidth = 5.1f, }); Console.WriteLine($"Predicted flower type is: {prediction.PredictedLabels}"); } }}

通过dotnet run命令运行程序后可得到预测结果。

Predicted flower type is: Iris-virginica

解例

例子中定义了两个类,IrisData与IrisPrediction。IrisData类是用于训练的数据结构,而IrisPrediction则用于预测。

MLContext类用于定义ML.NET的上下文(context),可以理解为是它的运行时环境。

接着,创建一个TextReader,用于读取数据集文件,可以看到其中规定了读取的格式。这里即是机器学习管道的第一步。

第二步,转换IrisData类中Label属性的类型,使之成为数值类型,因为只有数值类型的数据才能在模型训练中被使用。再将SepalLength,SepalWidth,PetalLength与PetalWidth合并为一,统合为数据集的Features。

第三步,为训练选择合适的算法,并传入标签(Label)和特征(Features)。

第四步,训练模型。

完成模型后,就可以用它进行预测了。因为最后预测的结果是字符串类型,所以在上述第三步的操作后有必要加上转换操作,把结果从数值类型再转回字符串类型。