- Python Machine Learning Cookbook(Second Edition)
- Giuseppe Ciaburro Prateek Joshi
- 2021-06-24 15:40:56
Getting ready
Before we build the SVM, let's visualize our data to understand the problem at hand. We will use the svm.py file and the data_multivar.txt file, both of which are already provided to you. Let's see how to visualize the data:
- Create a new Python file and add the following lines to it (the full code is in the svm.py file, which has already been provided to you):
    import numpy as np
    import matplotlib.pyplot as plt

    import utilities

    # Load input data
    input_file = 'data_multivar.txt'
    X, y = utilities.load_data(input_file)
- We just imported a couple of packages and named the input file. Let's look at the load_data() function, which lives in the utilities module:
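    # NumPy is needed for the array conversion at the end of load_data()
    import numpy as np

    # Load multivar data in the input file
    def load_data(input_file):
        X = []
        y = []
        with open(input_file, 'r') as f:
            for line in f.readlines():
                # Each line holds comma-separated numbers; the last one is the label
                data = [float(x) for x in line.split(',')]
                X.append(data[:-1])
                y.append(data[-1])

        X = np.array(X)
        y = np.array(y)

        return X, y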
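As a point of comparison, the same parsing can be sketched with NumPy's own text loader. This is not the recipe's code: load_data_loadtxt is a hypothetical name, and the sample rows are made up and written to a temporary file so the sketch runs on its own. It assumes, as load_data() does, that every line is a comma-separated list of numbers with the label in the last column.

```python
import os
import tempfile

import numpy as np


def load_data_loadtxt(input_file):
    """Hypothetical alternative to load_data(), built on np.loadtxt."""
    # ndmin=2 keeps the result two-dimensional even for a one-row file
    data = np.loadtxt(input_file, delimiter=',', ndmin=2)
    # Every column except the last is a feature; the last column is the label
    return data[:, :-1], data[:, -1]


# Made-up sample rows, written to a temporary file for the demonstration
sample = "1.0,2.0,0\n3.5,4.5,1\n5.0,6.0,0\n"
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as f:
    f.write(sample)
    tmp_path = f.name

X_demo, y_demo = load_data_loadtxt(tmp_path)
os.remove(tmp_path)

print(X_demo.shape)
print(y_demo)
```

Either version returns the features and the labels as two separate NumPy arrays, which is all the rest of the recipe relies on.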
- We need to separate the data into classes, as follows:
    class_0 = np.array([X[i] for i in range(len(X)) if y[i]==0])
    class_1 = np.array([X[i] for i in range(len(X)) if y[i]==1])
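The class split can also be written with NumPy boolean masks. The sketch below uses small made-up arrays in place of the X and y returned by load_data(), and checks that the mask gives the same rows as the list comprehension used in the recipe.

```python
import numpy as np

# Made-up stand-ins for the X and y returned by load_data()
X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]])
y = np.array([0.0, 1.0, 0.0, 1.0])

# List-comprehension version, as in the recipe
class_0_loop = np.array([X[i] for i in range(len(X)) if y[i] == 0])

# Boolean-mask version: y == 0 is an array of True/False values, one per row
class_0 = X[y == 0]
class_1 = X[y == 1]

print(np.array_equal(class_0, class_0_loop))
print(class_1)
```

One practical difference: if a class has no points, the mask still returns a two-dimensional array with zero rows, so column slices such as class_0[:,0] keep working, whereas the list comprehension produces an empty one-dimensional array.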
- Now that we have separated the data, let's plot it:
    plt.figure()
    plt.scatter(class_0[:,0], class_0[:,1], facecolors='black', edgecolors='black', marker='s')
    plt.scatter(class_1[:,0], class_1[:,1], facecolors='None', edgecolors='black', marker='s')
    plt.title('Input data')
    plt.show()
If you run this code, you will see the following:
The preceding plot consists of two types of points: solid squares (class_0) and empty squares (class_1). In machine learning lingo, we say that our data consists of two classes. Our goal is to build a model that can separate the solid squares from the empty squares.
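A quick way to confirm the two-class claim numerically, alongside the plot, is to count the distinct labels. This is a minimal sketch with made-up labels standing in for the y returned by load_data(); the counts for data_multivar.txt will differ.

```python
import numpy as np

# Made-up labels standing in for the y returned by load_data()
y = np.array([0.0, 1.0, 1.0, 0.0, 1.0])

# np.unique returns the sorted distinct labels and how often each occurs
labels, counts = np.unique(y, return_counts=True)
for label, count in zip(labels, counts):
    print(f"class {int(label)}: {count} points")
```

If more than two labels show up here, the class_0 and class_1 split used for plotting would silently leave some points out.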