ML library in Golang

Elahe Dastan
3 min read · Aug 29, 2020

Machine learning is used more and more every day. The language most commonly used for it is Python, but Python is often not fast enough to satisfy users, so I tried implementing ML algorithms in Go, which is generally much faster. Take a look at https://github.com/elahe-dastan/newborn if you are interested.


package main

import (
	"fmt"
	"strconv"

	"github.com/elahe-dastan/newborn/data"
)

// Read a CSV file and draw a scatter plot of its first two columns.
func main() {
	// content maps each header name to its column of string values.
	headers, content := data.ReadCSVData("./data/dataset_test.csv")
	fmt.Println(headers)

	x := make([]float64, len(content[headers[0]]))
	y := make([]float64, len(content[headers[1]]))

	// Convert both columns to float64 (parse errors are ignored for brevity).
	for i := range x {
		x[i], _ = strconv.ParseFloat(content[headers[0]][i], 64)
		y[i], _ = strconv.ParseFloat(content[headers[1]][i], 64)
	}

	data.ScatterPlot(x, y, headers[0], headers[1], "example")
}
This prints the headers:

[x y]

Regression
Among the first things everybody studying ML has to learn are linear regression, logistic regression and nonlinear regression. Regression is used both for predicting a value and for classification.
Let's look at regression a little more closely: we want to fit a line or curve to a set of data.
Check here if you are interested in the math calculations behind it.

In this repository I haven't implemented linear regression separately, because linear regression is just polynomial regression with degree equal to 1 and lambda equal to 0.

I made a very small dataset myself using the relation y = 5 + 3x1 + x1^2 + 2x2 + 4x2^2, so the answer for x1=0.051 and x2=1.251 should be around 13.917605. Let's see how it works.

package main

import (
	"fmt"
	"strconv"

	"github.com/elahe-dastan/newborn/data"
	"github.com/elahe-dastan/newborn/regression"
)

func main() {
	headers, content := data.ReadCSVData("./regression/dataset_test.csv")

	// Every column except the last one is a feature (parse errors are ignored for brevity).
	features := make([][]float64, len(content[headers[0]]))
	for i := range features {
		f := make([]float64, len(headers)-1)
		for j := range f {
			f[j], _ = strconv.ParseFloat(content[headers[j]][i], 64)
		}
		features[i] = f
	}

	// The last column holds the value to predict.
	values := make([]float64, len(content[headers[0]]))
	for i := range values {
		values[i], _ = strconv.ParseFloat(content[headers[len(headers)-1]][i], 64)
	}

	p := regression.New()

	// The 2 is presumably the polynomial degree, matching the quadratic relation above;
	// see the repository for the exact meaning of the remaining training arguments.
	p.Train(features, values, 2, 0.00028, 4000, 0)
	fmt.Println(p.Predict([]float64{0.051, 1.251}))
}

The answer was “14.041355047034626”, which is about 0.9% off the expected value. Not too bad, I think.

K Nearest Neighbour
This algorithm is used for classification. To choose a label for a new data point, the algorithm finds the k nearest data points to it and picks the most frequent label among them. Different methods can be used to calculate the distance between the new data point and the existing ones; in this repository I use one of the simplest approaches, Euclidean distance.


package main

import (
	"fmt"
	"strconv"

	"github.com/elahe-dastan/newborn/data"
	"github.com/elahe-dastan/newborn/knn"
)

func main() {
	headers, content := data.ReadCSVData("./knn/dataset_test.csv")

	// Every column except the last one is a feature (parse errors are ignored for brevity).
	currentData := make([][]float64, len(content[headers[0]]))
	for i := range currentData {
		f := make([]float64, len(headers)-1)
		for j := range f {
			f[j], _ = strconv.ParseFloat(content[headers[j]][i], 64)
		}
		currentData[i] = f
	}

	// The last column holds the integer class label.
	labels := make([]int, len(content[headers[0]]))
	for i := range labels {
		labels[i], _ = strconv.Atoi(content[headers[len(headers)-1]][i])
	}

	// The new data point to classify, one value per feature column.
	newData := []float64{57.0, 1.0, 4.0, 140.0, 192.0, 0.0, 0.0, 148.0, 0.0, 0.4, 2.0, 0.0, 6.0}

	// k = 3
	label := knn.KNN(currentData, labels, newData, 3)
	fmt.Println(label)
}
The program prints the predicted label:

0
