# Binary numbers as input

This is based on [this question](https://stackoverflow.com/questions/46012622/check-divisibility-using-machine-learning/46016389#46016389) from Stack Overflow.

It doesn't make any real-world sense, but it's an easy introductory example, and more complex issues can be explained using this problem.

The problem is to classify whether numbers from a fixed range are divisible by a fixed number M.

The key idea is to encode each number not as a single integer but as its binary representation, with one feature per bit.
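
As a small illustration of this encoding, the sketch below turns a few integers into fixed-width bit vectors with `np.binary_repr`. The helper name `to_bits` and the width of 4 are chosen here for illustration only and are not part of the original notebook.

```python
import numpy as np

def to_bits(x, width):
    """Return the binary digits of x as an integer vector, most significant bit first."""
    return np.array(list(np.binary_repr(x, width=width)), dtype=int)

# Hypothetical helper and width, used only to illustrate the encoding.
for x in (5, 10, 12):
    print(x, to_bits(x, width=4))
```

With this representation, divisibility by 2 can be read off the last bit alone, whereas divisibility by 10 depends on a combination of bits.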

## Checking divisibility with machine learning

In [2]:
import numpy as np

from sklearn.model_selection import train_test_split

In [3]:
N = 1024
logN = int(np.log2(N))
M = 10 

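# Encode each number as a fixed-width vector of its binary digits.
X_bin = [list(np.binary_repr(x, width=logN)) for x in range(N)]
# Convert the '0'/'1' characters to integers; recent scikit-learn versions reject string features.
X = np.array(X_bin, dtype=int)
# Label: True when the number is divisible by M.
y = np.arange(N) % M == 0

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.125, stratify=y, random_state=0)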

## SVM

In [4]:
from sklearn.svm import SVC

svc = SVC(kernel='linear', C=1)
svc.fit(X_train, y_train)

svc.score(X_test, y_test)

0.8984375
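
To put this score in context, it may help to compare it with a classifier that always predicts the majority class ("not divisible"). The reported 0.8984375 equals 115/128, and only about one number in ten is a multiple of 10, so a majority-class baseline should land at or very near the same value. The sketch below rebuilds the same data and split and uses scikit-learn's `DummyClassifier`; the variable names are illustrative and not part of the original notebook.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import train_test_split

N, M = 1024, 10
width = int(np.log2(N))

# Same binary encoding and labels as above, with bits stored as integers.
X = np.array([list(np.binary_repr(x, width=width)) for x in range(N)], dtype=int)
y = np.arange(N) % M == 0

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.125, stratify=y, random_state=0
)

# Baseline that ignores the features and always predicts the most frequent class.
baseline = DummyClassifier(strategy="most_frequent")
baseline.fit(X_train, y_train)
baseline_score = baseline.score(X_test, y_test)

print("positives in test set:", int(y_test.sum()), "of", len(y_test))
print("majority-class accuracy:", baseline_score)
```

If the linear SVM does no better than this baseline, it has probably learned to predict "not divisible" for every input, which is a common outcome on imbalanced data.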

## Decision tree 

In [10]:
from sklearn.tree import DecisionTreeClassifier 

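tree = DecisionTreeClassifier(max_features='log2', max_depth=2)
# The target passed here is a two-column (multi-output) array of shape (896, 2).
tree.fit(X_train, np.vstack([y_train, y_train]).T)

# Scoring against the 1-D y_test then fails, because the predictions are two-dimensional.
tree.score(X_test, y_test)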

ValueError: Can't handle mix of binary and multilabel-indicator

In [9]:
np.vstack([y_train, y_train]).shape

(2, 896)
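
The error above appears to come from a shape mismatch: the tree is fitted on a two-column target (`np.vstack([y_train, y_train]).T` has shape `(896, 2)`), so its predictions are two-dimensional, while `y_test` is one-dimensional. A minimal sketch of one way to avoid it, assuming a single-output classifier is what was intended, is to fit on `y_train` directly. The hyperparameters are copied from the cell above; `random_state=0` is added here only for reproducibility and is not in the original.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

N, M = 1024, 10
width = int(np.log2(N))

# Same binary encoding, labels, and split as in the earlier cells.
X = np.array([list(np.binary_repr(x, width=width)) for x in range(N)], dtype=int)
y = np.arange(N) % M == 0
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.125, stratify=y, random_state=0
)

# Fit on the 1-D target so predictions have the same shape as y_test.
# random_state is an assumption added for reproducibility.
tree = DecisionTreeClassifier(max_features="log2", max_depth=2, random_state=0)
tree.fit(X_train, y_train)

y_pred = tree.predict(X_test)
print(y_pred.shape)  # one prediction per test sample
print(tree.score(X_test, y_test))
```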