Lesson-01: Building a Linear Regression to Understand Loss Functions, Gradient Descent, and Function Fitting

    Tech, 2022-07-10

    Load Dataset

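    # Note: load_boston was removed in scikit-learn 1.2; this cell needs an older version.
    from sklearn.datasets import load_boston

    data = load_boston()
    X, y = data['data'], data['target']

    X[1]
    y[1]
    X.shape
    len(y)

    %matplotlib inline
    import matplotlib.pyplot as plt

    # Column 5 is the number of rooms; y is the house price.
    plt.scatter(X[:, 5], y)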
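Because `load_boston` is no longer available in recent scikit-learn releases, the cell above may fail. As a hedged workaround, the sketch below builds a synthetic stand-in with the same layout (506 rows, 13 columns, room count in column 5) so that the later cells can still run. The helper name `make_synthetic_housing` and the slope, intercept, and noise values are assumptions chosen for illustration; they are not the real housing data.

```python
import numpy as np

def make_synthetic_housing(n_samples=506, n_features=13, seed=0):
    """Return (X, y) shaped like the housing data, filled with made-up values."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, n_features))
    # Column 5 plays the role of the number of rooms.
    X[:, 5] = rng.normal(loc=6.0, scale=0.7, size=n_samples)
    # Hypothetical linear relation between rooms and price, plus noise.
    y = 9.0 * X[:, 5] - 34.0 + rng.normal(scale=5.0, size=n_samples)
    return X, y

X, y = make_synthetic_housing()
print(X.shape, len(y))
```

Any numbers obtained with this stand-in only illustrate the method and say nothing about real house prices.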

    Goal: find the "best" straight line that fits the relationship between the number of rooms and the house price.

    import random

    k, b = random.randint(-100, 100), random.randint(-100, 100)

    def func(x):
        return k*x + b

    X_rm = X[:, 5]
    y_hat = [func(x) for x in X_rm]

    plt.scatter(X[:, 5], y)
    plt.plot(X_rm, y_hat)

    We drew a random line, and it turns out to be far away from the data. 🙁

    def draw_room_and_price():
        plt.scatter(X[:, 5], y)

    def price(x, k, b):
        return k*x + b

    k, b = random.randint(-100, 100), random.randint(-100, 100)
    price_by_random_k_and_b = [price(r, k, b) for r in X_rm]
    print('the random k : {}, b: {}'.format(k, b))

    draw_room_and_price()
    plt.scatter(X_rm, price_by_random_k_and_b)

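    The goal is to find the "best" k and b.

    We need a criterion that measures how good a given pair of k and b actually is.

    We have the true values y_true and the predictions $\hat{y}$.

    Measuring the gap between y_true and $\hat{y}$ -> loss function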

    y_true = [1, 4, 1, 4, 1, 4, 1, 4]
    y_hat = [2, 3, 1, 4, 1, 41, 31, 3]

    L1-Loss

    $$loss = \frac{1}{n} \sum_{i=1}^{n} \left| y_{true,i} - \hat{y}_i \right|$$

    y_true = [3, 4, 4]
    y_hat_1 = [1, 1, 4]
    y_hat_2 = [3, 4, 0]

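    What is the L1-Loss of y_hat_1? The absolute errors sum to |3 - 1| + |4 - 1| + |4 - 4| = 2 + 3 + 0 = 5, so the loss is 5/3.

    For $\hat{y}_2$ the absolute errors sum to |3 - 3| + |4 - 4| + |4 - 0| = 4, so the L1-Loss is 4/3.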
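The L1 formula can be checked with a few lines of Python. This is a minimal sketch; the function name `l1_loss` is an assumption for illustration and is not defined elsewhere in this lesson. It divides the sum of absolute errors by n, as in the formula.

```python
def l1_loss(y_true, y_hat):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_hat)) / len(y_true)

y_true = [3, 4, 4]
y_hat_1 = [1, 1, 4]
y_hat_2 = [3, 4, 0]

print(l1_loss(y_true, y_hat_1))  # (2 + 3 + 0) / 3
print(l1_loss(y_true, y_hat_2))  # (0 + 0 + 4) / 3
```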

    L2-Loss (mean squared error)

    $$loss = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

    def loss(y, y_hat):
        # mean squared error
        sum_ = sum([(y_i - y_hat_i) ** 2 for y_i, y_hat_i in zip(y, y_hat)])
        return sum_ / len(y)

    y_true = [3, 4, 4]
    y_hat_1 = [1, 1, 4]
    y_hat_2 = [3, 4, 0]

    print(loss(y_true, y_hat_1))
    print(loss(y_true, y_hat_2))

    def price(x, k, b):
        return k*x + b

    k, b = random.randint(-100, 100), random.randint(-100, 100)
    price_by_random_k_and_b = [price(r, k, b) for r in X_rm]
    print('the random k : {}, b: {}'.format(k, b))

    draw_room_and_price()
    plt.scatter(X_rm, price_by_random_k_and_b)

    cost = loss(list(y), price_by_random_k_and_b)
    print('The Loss of k: {}, b: {} is {}'.format(k, b, cost))

    Loss: once you know how to judge whether something is good or bad, you have basically done half of the work.
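To illustrate why the choice of criterion matters, the sketch below scores the two example predictions under both losses. The helper names `l1_loss` and `l2_loss` are assumptions made for this illustration. Under the mean absolute error the second prediction scores better, while under the mean squared error the first one does, because squaring penalizes the single large miss more heavily.

```python
def l1_loss(y_true, y_hat):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_hat)) / len(y_true)

def l2_loss(y_true, y_hat):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_hat)) / len(y_true)

y_true = [3, 4, 4]
candidates = {'y_hat_1': [1, 1, 4], 'y_hat_2': [3, 4, 0]}

for name, loss_fn in [('L1', l1_loss), ('L2', l2_loss)]:
    # Pick the candidate with the smallest loss under this criterion.
    best = min(candidates, key=lambda c: loss_fn(y_true, candidates[c]))
    print(name, 'prefers', best)
```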

    The simplest method: randomly generate a number of (k, b) pairs, then keep the pair with the lowest loss.

    def price(x, k, b):
        return k*x + b

    trying_times = 5000

    best_k, best_b = None, None
    min_cost = float('inf')
    losses = []

    for i in range(trying_times):
        # Sample k and b uniformly from [-100, 100).
        # (The original expression, random.random() * 100 - 200, only covered [-200, -100).)
        k = random.random() * 200 - 100
        b = random.random() * 200 - 100
        price_by_random_k_and_b = [price(r, k, b) for r in X_rm]
        # draw_room_and_price()
        # plt.scatter(X_rm, price_by_random_k_and_b)

        cost = loss(list(y), price_by_random_k_and_b)

        if cost < min_cost:
            min_cost = cost
            best_k, best_b = k, b
            print('At step {}, k and b were updated'.format(i))
            losses.append(min_cost)

    We can add a visualization.

    min_cost
    best_k, best_b

    def plot_by_k_and_b(k, b):
        price_by_random_k_and_b = [price(r, k, b) for r in X_rm]
        draw_room_and_price()
        plt.scatter(X_rm, price_by_random_k_and_b)

    plot_by_k_and_b(best_k, best_b)

    2nd method: adjusting the direction

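    k can change in two ways: it can increase or decrease.

    b can also change in two ways: it can increase or decrease.

    Changing the pair (k, b) therefore gives 4 combinations: both increase, both decrease, k increases while b decreases, or k decreases while b increases.

    When k and b move along some direction $d_n$ and the loss goes down, they keep moving along $d_n$ in the next step; otherwise we switch to another direction.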

    directions = [
        (+1, -1),
        (+1, +1),
        (-1, -1),
        (-1, +1),
    ]

    def price(x, k, b):
        return k*x + b

    trying_times = 10000

    # random starting point for k and b
    best_k = random.random() * 100 - 200
    best_b = random.random() * 100 - 200

    next_direction = random.choice(directions)
    min_cost = float('inf')
    losses = []
    scala = 0.3  # step size

    for i in range(trying_times):
        current_direction = next_direction
        k_direction, b_direction = current_direction

        current_k = best_k + k_direction * scala
        current_b = best_b + b_direction * scala

        price_by_random_k_and_b = [price(r, current_k, current_b) for r in X_rm]
        cost = loss(list(y), price_by_random_k_and_b)

        if cost < min_cost:
            # the loss went down: accept the step and keep this direction
            min_cost = cost
            best_k, best_b = current_k, current_b
            print('At step {}, k and b were updated'.format(i))
            losses.append((i, min_cost))
            next_direction = current_direction
        else:
            # the loss did not go down: pick one of the other directions
            next_direction = random.choice(list(set(directions) - {current_direction}))

    len(losses)
    min_cost

    3rd method: gradient descent

    Can we make every single step move in a direction that decreases the loss?

    In other words, can we always find such a direction?

    $$loss = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

    $$loss = \frac{1}{n} \sum_{i=1}^{n} (y_i - (k x_i + b))^2$$
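The direction comes from the partial derivatives of the loss with respect to k and b. As a short worked step, applying the chain rule to the squared error of a single sample gives:

```latex
\begin{align*}
\frac{\partial}{\partial k}\,(y_i - (k x_i + b))^2 &= 2\,(y_i - (k x_i + b)) \cdot (-x_i) \\
\frac{\partial}{\partial b}\,(y_i - (k x_i + b))^2 &= 2\,(y_i - (k x_i + b)) \cdot (-1)
\end{align*}
```

Averaging these per-sample terms over the n samples yields the partial derivatives of the loss.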

    $$\frac{\partial loss}{\partial k} = -\frac{2}{n} \sum_{i=1}^{n} (y_i - (k x_i + b)) x_i$$

    $$\frac{\partial loss}{\partial b} = -\frac{2}{n} \sum_{i=1}^{n} (y_i - (k x_i + b))$$

    With $\hat{y}_i = k x_i + b$:

    $$\frac{\partial loss}{\partial k} = -\frac{2}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i) x_i$$

    $$\frac{\partial loss}{\partial b} = -\frac{2}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)$$
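One way to gain confidence in these formulas is a numerical gradient check: compare the analytic partial derivatives with central finite differences of the loss. The sketch below does this on a handful of made-up points. The names `mse`, `grad_k`, and `grad_b` are assumptions for this illustration, and they take k and b directly, unlike the lesson's own helper functions.

```python
def mse(k, b, xs, ys):
    """Mean squared error of the line k*x + b on the given points."""
    return sum((y - (k * x + b)) ** 2 for x, y in zip(xs, ys)) / len(ys)

def grad_k(k, b, xs, ys):
    """Analytic partial derivative of the loss with respect to k."""
    return -2 / len(ys) * sum((y - (k * x + b)) * x for x, y in zip(xs, ys))

def grad_b(k, b, xs, ys):
    """Analytic partial derivative of the loss with respect to b."""
    return -2 / len(ys) * sum(y - (k * x + b) for x, y in zip(xs, ys))

# Made-up sample points, used only for the check.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.1, 5.9, 8.2]
k, b, eps = 0.5, -1.0, 1e-6

# Central finite differences approximate the partial derivatives.
numeric_k = (mse(k + eps, b, xs, ys) - mse(k - eps, b, xs, ys)) / (2 * eps)
numeric_b = (mse(k, b + eps, xs, ys) - mse(k, b - eps, xs, ys)) / (2 * eps)

print(abs(numeric_k - grad_k(k, b, xs, ys)))
print(abs(numeric_b - grad_b(k, b, xs, ys)))
```

Both printed differences should be tiny, since the loss is quadratic in k and b and central differences are then exact up to rounding.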

    def partial_k(x, y, y_hat):
        # partial derivative of the loss with respect to k
        gradient = 0
        for x_i, y_i, y_hat_i in zip(list(x), list(y), list(y_hat)):
            gradient += (y_i - y_hat_i) * x_i
        return -2 / len(y) * gradient

    def partial_b(y, y_hat):
        # partial derivative of the loss with respect to b
        gradient = 0
        for y_i, y_hat_i in zip(list(y), list(y_hat)):
            gradient += (y_i - y_hat_i)
        return -2 / len(y) * gradient

    def price(x, k, b):
        # Operation: CNN, RNN, LSTM, Attention are more complex mappings than k*x + b
        return k*x + b

    trying_times = 50000
    min_cost = float('inf')
    losses = []
    scala = 0.3

    # random starting point for k and b
    k, b = random.random() * 100 - 200, random.random() * 100 - 200

    This is the parameter initialization problem, also known as the weight initialization problem!

    best_k, best_b = None, None
    learning_rate = 1e-3  # Optimizer Rate

    for i in range(trying_times):
        price_by_random_k_and_b = [price(r, k, b) for r in X_rm]
        cost = loss(list(y), price_by_random_k_and_b)

        if cost < min_cost:
            # print('At step {}, k and b were updated'.format(i))
            min_cost = cost
            best_k, best_b = k, b
            losses.append((i, min_cost))

        k_gradient = partial_k(X_rm, y, price_by_random_k_and_b)  # direction of change
        b_gradient = partial_b(y, price_by_random_k_and_b)

        # Optimizer step (see also: Adam, momentum)
        k = k + (-1 * k_gradient) * learning_rate
        b = b + (-1 * b_gradient) * learning_rate

    Package these pieces into reusable blocks, so that other people do not have to write them from scratch again.
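One possible reading of "reusable blocks" is a small class with `fit` and `predict` methods. The sketch below is only an illustration under stated assumptions: the class name `LinearRegressionGD`, its parameters, the zero initialization, and the synthetic demo data are all hypothetical and not part of the lesson's code. It uses NumPy arrays instead of Python lists, but applies the same update rule as the loop above.

```python
import numpy as np

class LinearRegressionGD:
    """Hypothetical wrapper: fits y ~ k*x + b by batch gradient descent on the MSE."""

    def __init__(self, learning_rate=1e-2, n_steps=5000):
        self.learning_rate = learning_rate
        self.n_steps = n_steps
        self.k = 0.0
        self.b = 0.0

    def predict(self, x):
        return self.k * np.asarray(x, dtype=float) + self.b

    def fit(self, x, y):
        x = np.asarray(x, dtype=float)
        y = np.asarray(y, dtype=float)
        for _ in range(self.n_steps):
            error = y - self.predict(x)
            # Same partial derivatives as in the formulas for k and b.
            k_gradient = -2 * np.mean(error * x)
            b_gradient = -2 * np.mean(error)
            self.k -= self.learning_rate * k_gradient
            self.b -= self.learning_rate * b_gradient
        return self

# Synthetic data with a known slope and intercept (values are made up).
rng = np.random.default_rng(0)
x_demo = rng.uniform(0, 5, size=200)
y_demo = 3.0 * x_demo + 1.0 + rng.normal(scale=0.1, size=200)

model = LinearRegressionGD().fit(x_demo, y_demo)
print(model.k, model.b)
```

With such a wrapper, a caller only needs `fit` and `predict`; the learning rate and the number of steps remain adjustable and may need tuning for data on a different scale.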

    len(losses)
    print(min_cost)
    best_k, best_b

    def square(x):
        return 10 * x**2 + 5 * x + 5

    import numpy as np

    _X = np.linspace(-100, 100)
    _y = [square(x) for x in _X]
    plt.plot(_X, _y)

    plot_by_k_and_b(k=best_k, b=best_b)