
How a regression network is traditionally trained

This network is trained using a data set $D = \{{\bf x}^{(n)}, {\bf t}^{(n)}\}$ by adjusting ${\bf w}$ so as to minimize an error function, e.g.,

$$E_D({\bf w}) = \sum_n\sum_i \left(y_i({\bf x}^{(n)};{\bf w}) - t_i^{(n)}\right)^2$$

This objective function is a sum of terms, one for each input/target pair $\{{\bf x}, {\bf t}\}$, measuring how close the output ${\bf y}({\bf x}; {\bf w})$ is to the target ${\bf t}$:

$$E_D({\bf w}) = \sum_n E_{\bf x}^{(n)}, \quad E_{\bf x}^{(n)}=\sum_i \left(y_i({\bf x}^{(n)};{\bf w}) - t_i^{(n)}\right)^2$$

This minimization is based on repeated evaluation of the gradient of $E_D$. This gradient can be computed efficiently using the backpropagation algorithm, which applies the chain rule to find the derivatives, as we discuss below.
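Before introducing backpropagation, note that any such gradient can always be checked (or, inefficiently, computed) by central finite differences. Below is a minimal sketch of such a gradient check; the name grad_fd and the tiny one-parameter model $y = wx$ are our own illustration, not part of the network code in this notebook.

import numpy as np

def grad_fd(loss, w, h=1.e-6):
    """Central finite-difference estimate of the gradient of loss(w)."""
    g = np.zeros_like(w)
    for i in range(w.size):
        wp = w.copy(); wp[i] += h
        wm = w.copy(); wm[i] -= h
        g[i] = (loss(wp) - loss(wm))/(2.*h)
    return g

# Toy example: E_D(w) = sum_n (w*x_n - t_n)^2 for a one-parameter "network" y = w*x
x_n = np.array([0., 1., 2.])
t_n = np.array([0., 2., 4.])
loss = lambda w: np.sum((w[0]*x_n - t_n)**2)
print(grad_fd(loss, np.array([1.])))   # analytic value: 2*sum(x_n*(w*x_n - t_n)) = -10

Backpropagation computes the same derivatives exactly, at a cost comparable to a single evaluation of the network, which is why it is used in practice.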

Often, regularization (also known as weight decay) is included, modifying the objective function to:

$$M({\bf w})=\alpha E_D({\bf w}) + \beta E_W({\bf w}),$$

where $E_W = \frac{1}{2}\sum_i w_i^2$.
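Since $\nabla E_W = {\bf w}$, including the regularizer simply adds a term that shrinks every weight toward zero at each step, hence the name "weight decay". A minimal sketch of one gradient step on $M$, assuming a function grad_ED that returns $\nabla E_D({\bf w})$ (the names and parameter values here are illustrative):

def regularized_step(w, grad_ED, eta=0.01, alpha=1.0, beta=0.1):
    """One gradient-descent step on M(w) = alpha*E_D(w) + beta*E_W(w).

    Since grad(E_W) = w, the beta term decays the weights toward zero.
    """
    return w - eta*(alpha*grad_ED(w) + beta*w)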

Gradient descent

(From Wikipedia) Cool animations at http://www.benfrederickson.com/numerical-optimization/

Gradient descent is a first-order iterative optimization algorithm for finding the minimum of a function. To find a local minimum of a function using gradient descent, one takes steps proportional to the negative of the gradient (or of the approximate gradient) of the function at the current point.

Gradient descent is based on the observation that if the multi-variable function $F(\mathbf{x})$ is defined and differentiable in a neighborhood of a point $\mathbf{a}$, then $F(\mathbf{x})$ decreases fastest if one goes from $\mathbf{a}$ in the direction of the negative gradient of $F$ at $\mathbf{a}$, $-\nabla F(\mathbf{a})$. It follows that, if

$$\mathbf{a}_{n+1}=\mathbf{a}_{n}-\eta \nabla F(\mathbf{a}_{n})$$

for $\eta$ small enough, then $F(\mathbf{a}_{n})\geq F(\mathbf{a}_{n+1})$. In other words, the term $\eta \nabla F(\mathbf{a})$ is subtracted from $\mathbf{a}$ because we want to move against the gradient, namely down toward the minimum. With this observation in mind, one starts with a guess $\mathbf{x}_{0}$ for a local minimum of $F$, and considers the sequence $\mathbf{x}_{0},\mathbf{x}_{1},\mathbf{x}_{2},\dots$ such that

$$\mathbf{x}_{n+1}=\mathbf{x}_{n}-\eta_{n}\nabla F(\mathbf{x}_{n}),\quad n\geq 0.$$

We have

$$F(\mathbf{x}_{0})\geq F(\mathbf{x}_{1})\geq F(\mathbf{x}_{2})\geq \cdots,$$

so hopefully the sequence $(\mathbf{x}_{n})$ converges to the desired local minimum. Note that the value of the step size $\eta_n$ is allowed to change at every iteration.

This process is illustrated in the adjacent picture. Here $F$ is assumed to be defined on the plane, and its graph is assumed to have a bowl shape. The blue curves are the contour lines, that is, the regions on which the value of $F$ is constant. A red arrow originating at a point shows the direction of the negative gradient at that point. Note that the (negative) gradient at a point is orthogonal to the contour line going through that point. We see that gradient descent leads us to the bottom of the bowl, that is, to the point where the value of the function $F$ is minimal.

Illustration of the gradient descent procedure over a series of iterations down a bowl-shaped surface

The "Zig-Zagging" nature of the method is also evident below, where the gradient descent method is applied to F(x,y)=sin(12x214y2+3)cos(2x+1ey)F(x,y)=\sin \left({\frac {1}{2}}x^{2}-{\frac {1}{4}}y^{2}+3\right)\cos(2x+1-e^{y})

%matplotlib inline
from matplotlib import pyplot
pyplot.rcParams['image.cmap'] = 'jet'
import numpy as np

x0 = -1.4
y0 = 0.5
x = [x0]   # The algorithm starts at x0, y0
y = [y0]
eta = 0.1  # step size multiplier
precision = 0.00001

def f(x, y):
    f1 = x**2/2 - y**2/4 + 3
    f2 = 2*x + 1 - np.exp(y)
    return np.sin(f1)*np.cos(f2)

def gradf(x, y):
    f1 = x**2/2 - y**2/4 + 3
    f2 = 2*x + 1 - np.exp(y)
    dx = np.cos(f1)*np.cos(f2)*x - np.sin(f1)*np.sin(f2)*2.
    dy = np.cos(f1)*np.cos(f2)*(-y/2.) - np.sin(f1)*np.sin(f2)*(-np.exp(y))
    return (dx, dy)

err = 100.
while err > precision:
    (step_x, step_y) = gradf(x0, y0)
    x0 -= eta*step_x
    y0 -= eta*step_y
    x.append(x0)
    y.append(y0)
    err = eta*(abs(step_x)+abs(step_y))

print(x0, y0)

#### All this below is just to visualize the process
dx = 0.05
dy = 0.05
xx = np.arange(-1.5, 1.+dx, dx)
yy = np.arange(0., 2.+dy, dy)
V = np.zeros(shape=(len(yy), len(xx)))
for iy in range(0, len(yy)):
    for ix in range(0, len(xx)):
        V[iy, ix] = f(xx[ix], yy[iy])

X, Y = np.meshgrid(xx, yy)
pyplot.contour(X, Y, V)
#pyplot.plot(x, y, linestyle='--', lw=3);
pyplot.scatter(x, y);
pyplot.ylabel("y")
pyplot.xlabel("x");
0.3226478037930326 1.602369170618785
Image in a Jupyter notebook

Stochastic gradient descent (SGD)

Stochastic gradient descent (often shortened to SGD), also known as incremental gradient descent, is a stochastic approximation of gradient descent: an iterative method for minimizing an objective function that is written as a sum of differentiable functions.

There are a number of challenges in applying the gradient descent rule. To understand what the problem is, let's look back at the quadratic cost $E_D$. Notice that this cost function has the form $E=\sum_n E_{\bf x}^{(n)}$. In practice, to compute the gradient $\nabla E_D$ we need to compute the gradients $\nabla E_{\bf x}^{(n)}$ separately for each training input ${\bf x}^{(n)}$, and then average them. Unfortunately, when the number of training inputs is very large this can take a long time, and learning thus occurs slowly.

Stochastic gradient descent can be used to speed up learning. The idea is to estimate the gradient $\nabla E$ by computing $\nabla E_{\bf x}$ for a small sample of randomly chosen training inputs. By averaging over this small sample it turns out that we can quickly get a good estimate of the true gradient.
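A quick numerical illustration of this claim (a self-contained sketch with synthetic data, not part of the notebook's own code): the mean of the per-example gradients over a random mini-batch is already close to the mean over the full training set.

import numpy as np

rng = np.random.default_rng(0)
x = rng.random(10000)
t = 2.*x + 1.                  # synthetic targets for the model y = w*x
w = 0.5
g = 2.*x*(w*x - t)             # per-example gradients dE_n/dw at the current w
batch = rng.choice(g, size=50, replace=False)
print(g.mean(), batch.mean())  # the mini-batch average estimates the full average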

To connect this explicitly to learning in neural networks, suppose $w_k$ and $b_l$ denote the weights and biases in our neural network. Then stochastic gradient descent works by picking out a randomly chosen mini-batch of training inputs, and training with those:

$$w_k \rightarrow w_k - \eta \sum_{j=1}^m \frac{\partial E_{\bf x}^{(j)}}{\partial w_k}$$

$$b_l \rightarrow b_l - \eta \sum_{j=1}^m \frac{\partial E_{\bf x}^{(j)}}{\partial b_l}$$

where the sums are over all the training examples in the current mini-batch. Then we pick out another randomly chosen mini-batch and train with those. And so on, until we have exhausted the training inputs, which is said to complete an epoch of training. At that point we start over with a new training epoch.

The pseudocode would look like:

Choose an initial vector of parameters $w$ and a learning rate $\eta$.

Repeat until an approximate minimum is obtained:

Randomly shuffle examples in the training set. For $i=1,2,\dots,n$, do:

$$w:=w-\eta \nabla E_{i}(w).$$
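A direct transcription of this pseudocode into Python might look as follows (a sketch: grad_Ei is assumed to return $\nabla E_i(w)$ for training example $i$):

import numpy as np

def sgd(w, grad_Ei, n, eta=0.01, epochs=100):
    """Plain SGD: visit the n examples one at a time, in shuffled order."""
    idx = np.arange(n)
    for _ in range(epochs):
        np.random.shuffle(idx)          # randomly shuffle examples in the training set
        for i in idx:
            w = w - eta*grad_Ei(w, i)   # w := w - eta * grad E_i(w)
    return w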

Example: linear regression

As seen previously, the objective function to be minimized is:

$$E(w)=\sum_{i=1}^{n}E_{i}(w)=\sum_{i=1}^{n}\left(w_{1}+w_{2}x_{i}-y_{i}\right)^{2}.$$

And the gradient descent equations can be written in matrix form as:

$$\begin{bmatrix}w_{1}\\w_{2}\end{bmatrix}:=\begin{bmatrix}w_{1}\\w_{2}\end{bmatrix}-\eta \begin{bmatrix}2(w_{1}+w_{2}x_{i}-y_{i})\\2x_{i}(w_{1}+w_{2}x_{i}-y_{i})\end{bmatrix}.$$
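In code, a single-example update therefore reads as below (an illustrative sketch; note that the mini-batch code later in this notebook absorbs the constant factor of 2 into the learning rate):

def sgd_step(w1, w2, xi, yi, eta):
    """One SGD step for the linear model y = w1 + w2*x on the example (xi, yi)."""
    r = w1 + w2*xi - yi           # residual
    return w1 - eta*2.*r, w2 - eta*2.*xi*r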

We'll generate a series of 100 random points aligned more or less along the line $y=a+bx$ with $a=1$ and $b=2$.

%matplotlib inline
from matplotlib import pyplot
import numpy as np

a = 1
b = 2
num_points = 100
np.random.seed(637163)   # we make sure we always generate the same sequence
x_data = np.random.rand(num_points)*20.
y_data = x_data*b + a + 3*(2.*np.random.rand(num_points) - 1)

pyplot.scatter(x_data, y_data)
pyplot.plot(x_data, b*x_data + a)

#### Least squares fit
sum_x = np.sum(x_data)
sum_y = np.sum(y_data)
sum_x2 = np.sum(x_data**2)
sum_xy = np.sum(x_data*y_data)
det = num_points*sum_x2 - sum_x**2
fit_a = (sum_y*sum_x2 - sum_x*sum_xy)/det
fit_b = (num_points*sum_xy - sum_x*sum_y)/det
print(fit_a, fit_b)

pyplot.xlim(-1, 22)
pyplot.ylim(-1, 24)
pyplot.plot(x_data, fit_b*x_data + fit_a);
1.1637760980701564 2.001777141438794
Image in a Jupyter notebook

We now write an SGD code for this problem. The training_data is a list of tuples (x, y) representing the training inputs and corresponding desired outputs. The variables epochs and mini_batch_size are what you'd expect: the number of epochs to train for, and the size of the mini-batches to use when sampling. eta is the learning rate, $\eta$. (In Nielsen's full implementation an optional test_data argument makes the program evaluate the network after each epoch of training and print out partial progress; this is useful for tracking progress, but slows things down substantially. Our simplified code below just prints the current coefficients after each epoch.)

The code works as follows. In each epoch, it starts by randomly shuffling the training data, and then partitions it into mini-batches of the appropriate size. This is an easy way of sampling randomly from the training data. Then for each mini_batch we apply a single step of gradient descent. This is done by the call update_mini_batch(mini_batch, eta), which updates the coefficients according to a single iteration of gradient descent, using just the training data in mini_batch.

epochs = 1000
mini_batch_size = 10
eta = 0.01/mini_batch_size   # scale the step so the sum over the batch acts like an average
a = 3.
b = 3.

def update_mini_batch(mini_batch, eta):
    global a, b
    a0 = a   # evaluate the gradient at the coefficients from the start of the mini-batch
    b0 = b
    for x, y in mini_batch:
        e = eta*(a0 + b0*x - y)   # the constant factor of 2 is absorbed into eta
        a -= e
        b -= x*e

training_data = list(zip(x_data, y_data))
for j in range(epochs):
    np.random.shuffle(training_data)
    mini_batches = [training_data[k:k+mini_batch_size]
                    for k in range(0, len(training_data), mini_batch_size)]
    for mini_batch in mini_batches:
        update_mini_batch(mini_batch, eta)
    print("Epoch {0}: {1} {2}".format(j, a, b))
Epoch 0: 2.851091506514999 1.7876531781856742 Epoch 1: 2.827178748563592 1.963573397229449 Epoch 2: 2.7801714492150196 1.8619634110879069 Epoch 3: 2.756668947213542 2.0288955459781213 Epoch 4: 2.7076355088367525 1.880872788736252 Epoch 5: 2.6709938639494206 1.9094861291196814 Epoch 6: 2.6459565191117687 2.083968448182042 Epoch 7: 2.592571941252409 1.867303977943171 Epoch 8: 2.5668650012203917 1.9282486257504263 Epoch 9: 2.5260855862259803 1.8254093290530535 Epoch 10: 2.4977119308055653 1.8393082159024072 Epoch 11: 2.461504007375818 1.8187446556619598 Epoch 12: 2.440510503849123 1.9639665902712513 Epoch 13: 2.4192661745417667 2.056408083610859 Epoch 14: 2.3778520184589746 1.9200785689241224 Epoch 15: 2.352636754920605 1.9557702061187 Epoch 16: 2.32668522997869 1.985365726392641 Epoch 17: 2.29093524056995 1.9302755099169442 Epoch 18: 2.264373631096743 1.9014850942257417 Epoch 19: 2.240116842945883 1.935758005663645 Epoch 20: 2.2192529930462856 2.0084713102139746 Epoch 21: 2.1882155215215175 1.954607851080249 Epoch 22: 2.1610945265017794 1.871860736902566 Epoch 23: 2.1460869019987685 2.0191055043901693 Epoch 24: 2.1134743012209274 1.917120770897182 Epoch 25: 2.093876809726992 1.9695147135445203 Epoch 26: 2.0732373408257976 2.009659333749805 Epoch 27: 2.0545056929658063 1.9990817647672674 Epoch 28: 2.0358849885566888 2.0684880037349314 Epoch 29: 2.0127961055519528 1.981549450254667 Epoch 30: 1.9912110003344137 1.9953084240830377 Epoch 31: 1.9702356965968113 1.9337881228020706 Epoch 32: 1.958931435902377 1.9703136538892496 Epoch 33: 1.9320635446309764 1.8826305504717291 Epoch 34: 1.9183934975941435 1.9108781471703922 Epoch 35: 1.9002148717519194 1.933657775282379 Epoch 36: 1.8765188176622247 1.8501242260953568 Epoch 37: 1.8636240317182582 1.8732118340436419 Epoch 38: 1.8580542540860803 2.043103451641688 Epoch 39: 1.8404926684574494 1.9833198887817372 Epoch 40: 1.818471335258018 1.885853165755979 Epoch 41: 1.8104060287044144 1.946463310907274 Epoch 42: 1.7945224315136403 1.955383791204891 Epoch 43: 1.782283463654412 2.0023405102821905 Epoch 44: 1.749520717458214 1.8281773503780525 Epoch 45: 1.7453813582714632 2.003640838074842 Epoch 46: 1.728623006260412 1.9766876117635683 Epoch 47: 1.7084093749222484 1.934436123013543 Epoch 48: 1.6882385559184097 1.8316669036210635 Epoch 49: 1.6872347550750442 1.9027956243651138 Epoch 50: 1.6799350406816318 2.0067968405782595 Epoch 51: 1.6544625765529073 1.8881985255370595 Epoch 52: 1.6437316979583543 1.9255046345948543 Epoch 53: 1.6381622976293797 2.0030543868483233 Epoch 54: 1.619288123392432 1.9054952135108394 Epoch 55: 1.6081107386909623 1.8842249840685168 Epoch 56: 1.606385744453914 1.995354114039655 Epoch 57: 1.590951010569201 1.935539287963154 Epoch 58: 1.5907159977092078 2.0039496119285936 Epoch 59: 1.583681348958318 2.0712979889802097 Epoch 60: 1.5630265005446518 1.9298084540897775 Epoch 61: 1.564720140787769 2.0808310251200517 Epoch 62: 1.5466686232306155 1.9187231079896427 Epoch 63: 1.5388524762367308 1.9621349285397989 Epoch 64: 1.5405490598559062 2.0291358831335318 Epoch 65: 1.5208453821228052 1.8248931026367041 Epoch 66: 1.5213798489903287 1.8930706301200002 Epoch 67: 1.529153914875216 2.1490931535209667 Epoch 68: 1.5022716493694421 1.9262311478789624 Epoch 69: 1.500068778772673 2.0144696446422423 Epoch 70: 1.4895292591761131 1.9988696565984023 Epoch 71: 1.4750010294727158 1.9334703018634691 Epoch 72: 1.468734739081002 1.9509592360886463 Epoch 73: 1.4602541491491496 1.9535588676349709 Epoch 74: 1.4629985193934691 2.087981232637853 Epoch 75: 
1.4482903637054756 1.9769627087994208 Epoch 76: 1.4441169372446487 1.9911884358343 Epoch 77: 1.4374914169928699 1.9697513594833118 Epoch 78: 1.4290881203003065 1.9810815298729503 Epoch 79: 1.4246361485906045 2.021522320675344 Epoch 80: 1.416276663539712 1.997969411686069 Epoch 81: 1.4088098511794314 1.9930406863592567 Epoch 82: 1.4068185837259672 2.0164782110987383 Epoch 83: 1.3962324622369873 1.8977840351410191 Epoch 84: 1.401436092664027 1.9950621893919915 Epoch 85: 1.387471145834617 1.8824887521430227 Epoch 86: 1.4014093218694679 2.1053832817280766 Epoch 87: 1.3846125002362668 1.961879809671709 Epoch 88: 1.3803176460734246 1.9693072109387666 Epoch 89: 1.3733167787595022 1.9193370938253795 Epoch 90: 1.3719395040783267 1.9529063749285855 Epoch 91: 1.3657851409032489 1.9519469364018414 Epoch 92: 1.365731695836303 2.0299060306472323 Epoch 93: 1.352704037805701 1.9068562249348497 Epoch 94: 1.358872635570531 1.9505330043142486 Epoch 95: 1.352406342772186 1.924937800945274 Epoch 96: 1.3529921471002395 1.9582904963736072 Epoch 97: 1.3467058766687356 1.964421345479529 Epoch 98: 1.3441056012495736 1.9585772301210544 Epoch 99: 1.338307075761737 1.9641242330382802 Epoch 100: 1.3389750343328595 2.0331089097262622 Epoch 101: 1.3292056062449498 1.970316569830166 Epoch 102: 1.3286221627701793 2.0548925899273556 Epoch 103: 1.3249933581558049 2.0102278112773337 Epoch 104: 1.3141363246433797 1.9167494489192494 Epoch 105: 1.3112904405797088 1.9230166472931127 Epoch 106: 1.3155463891174177 2.0363423473576034 Epoch 107: 1.3066507352411303 2.002921226309002 Epoch 108: 1.3012267385619987 1.992156333232618 Epoch 109: 1.2998811762793303 2.0836475234636054 Epoch 110: 1.2869768488136015 2.0326476069467847 Epoch 111: 1.2782409042449163 1.955308271763259 Epoch 112: 1.2702227407769626 1.843641680619616 Epoch 113: 1.2841364346903865 2.0428258403939537 Epoch 114: 1.2799771181355302 2.027691157552291 Epoch 115: 1.272749198706809 1.9838436304512466 Epoch 116: 1.2751164588056392 2.048106822415394 Epoch 117: 1.2709851254712097 1.9917962774665723 Epoch 118: 1.2762427149034599 2.136723976667836 Epoch 119: 1.2614286952500717 1.9537916421283819 Epoch 120: 1.268051811501334 2.003348786243271 Epoch 121: 1.2705624559612398 2.05644114955335 Epoch 122: 1.2488428579582682 1.824502349600589 Epoch 123: 1.2609746890519609 2.046167589066578 Epoch 124: 1.2649024819516497 2.073455837171682 Epoch 125: 1.2553435081447994 1.9933812628787015 Epoch 126: 1.2598887147760574 2.0313796669882813 Epoch 127: 1.2599147404360962 2.062627907835911 Epoch 128: 1.249692892431034 1.975233671612488 Epoch 129: 1.2442021578463776 1.9552628960704093 Epoch 130: 1.2386105150995876 1.982099666924853 Epoch 131: 1.2355586187412317 1.9757114294223843 Epoch 132: 1.2443294895643744 2.101623933536461 Epoch 133: 1.2299004597831222 1.9768187344562476 Epoch 134: 1.2276038542035699 1.9822626470879492 Epoch 135: 1.2277446415672508 1.9615810105868339 Epoch 136: 1.2286379676808046 2.0355041539229313 Epoch 137: 1.2211714193515004 1.9701469343010707 Epoch 138: 1.2298287656838378 2.024296033663489 Epoch 139: 1.227733776450261 2.035072556742687 Epoch 140: 1.2188039214924324 1.9586362914585522 Epoch 141: 1.221958994511311 2.018941730208499 Epoch 142: 1.2157233735073354 1.9671959386100284 Epoch 143: 1.2124366156263169 1.9592948983993996 Epoch 144: 1.2147369220284092 2.0442536562207776 Epoch 145: 1.205955712341693 1.9694423842214128 Epoch 146: 1.2049971023807025 1.9859694713761042 Epoch 147: 1.2075221330905666 2.009002905561291 Epoch 148: 1.2115221193191605 2.059401540299177 Epoch 
149: 1.2025709919887162 1.982096858804897 Epoch 150: 1.2025820405123533 1.9921201608490595 Epoch 151: 1.2044366839866598 2.0146807988967903 Epoch 152: 1.206892541459285 2.037165558406217 Epoch 153: 1.2084186753224047 2.0514455070803828 Epoch 154: 1.2089888077827122 2.051536688143954 Epoch 155: 1.2076227373011457 1.9988099365677854 Epoch 156: 1.2037083483319981 1.9950064987376799 Epoch 157: 1.206526240140426 2.074083444964129 Epoch 158: 1.1969956856897452 1.9783641111066796 Epoch 159: 1.1988308898232631 1.9543139803979717 Epoch 160: 1.1995691446186438 1.937559081054288 Epoch 161: 1.2038438870466412 2.015030834060468 Epoch 162: 1.1985588246541923 1.9698469983800948 Epoch 163: 1.1958191873606725 1.9663497539993224 Epoch 164: 1.186928424021493 1.86455626410724 Epoch 165: 1.1928061814986093 2.0497622851939536 Epoch 166: 1.1865071366714997 2.0408127489658647 Epoch 167: 1.182851618741269 2.0110138481333046 Epoch 168: 1.18782422094513 2.0552683461751196 Epoch 169: 1.1897908718277397 2.0792826146138967 Epoch 170: 1.1895586261808533 2.020543573678281 Epoch 171: 1.1825696897860893 1.9801100085432106 Epoch 172: 1.181820009229163 1.9799370528048312 Epoch 173: 1.1868381738615787 2.0495345787064054 Epoch 174: 1.1839766393778706 2.0473350252553804 Epoch 175: 1.1850485909968755 2.0716526425398674 Epoch 176: 1.1824535090862134 2.011224705734455 Epoch 177: 1.180767591043016 2.004658322219915 Epoch 178: 1.1821388960311063 1.988800626421598 Epoch 179: 1.1880522464685088 2.067303807558048 Epoch 180: 1.1759645896758149 1.9546667268531908 Epoch 181: 1.1790533790141011 1.9842002311282236 Epoch 182: 1.1836537100256175 2.0460500757626106 Epoch 183: 1.1822863317308494 2.025085525889144 Epoch 184: 1.172530841497514 1.9242741669889663 Epoch 185: 1.1823675397686417 2.0441537662254428 Epoch 186: 1.177468047390862 1.9588840402287566 Epoch 187: 1.1801243667511367 1.970880411546426 Epoch 188: 1.1867122725568682 2.071167968819989 Epoch 189: 1.175747819017282 1.9120770900835262 Epoch 190: 1.178488236180694 1.9711391056919325 Epoch 191: 1.1834159610241293 2.076521736901644 Epoch 192: 1.1840720138080807 2.0695169723793168 Epoch 193: 1.1878662150844443 2.1054361965492254 Epoch 194: 1.172204080715583 1.9193349078207402 Epoch 195: 1.1796266578616825 2.021273557852105 Epoch 196: 1.1756283433183137 1.978397476038657 Epoch 197: 1.179347870801012 2.057485247243262 Epoch 198: 1.1771510299609378 1.9839566352038045 Epoch 199: 1.1734142144784492 1.9413638781685132 Epoch 200: 1.1796858491714215 2.0257042728471295 Epoch 201: 1.1828027565096388 2.073049797618982 Epoch 202: 1.1836229294305753 2.105296552012404 Epoch 203: 1.1748330785870187 1.97842429934267 Epoch 204: 1.1710094849765988 1.9225888313967678 Epoch 205: 1.1750019318878657 2.0161089101204266 Epoch 206: 1.1688822960668843 1.9173751852467291 Epoch 207: 1.171000593729568 1.9496922473163611 Epoch 208: 1.1719395299814825 1.9846125819542637 Epoch 209: 1.1653718472989494 1.9657401115851765 Epoch 210: 1.1666777708673137 1.9969331887938948 Epoch 211: 1.1647159687365278 1.9909138128673325 Epoch 212: 1.1754452887943576 2.0857264475141384 Epoch 213: 1.169688967309767 2.054910054131244 Epoch 214: 1.169127580623336 2.0601206215450856 Epoch 215: 1.1652074626875797 1.9973034692573273 Epoch 216: 1.1702852193446573 2.0519593792424358 Epoch 217: 1.1646162669879867 1.968465416447997 Epoch 218: 1.1707704669426298 2.0249056512171877 Epoch 219: 1.1724593556038376 2.082380488563087 Epoch 220: 1.1600188422806863 1.9533301129439185 Epoch 221: 1.1603990597387488 1.9908556837124307 Epoch 222: 
1.172306977616543 2.149832223460105 Epoch 223: 1.1533139870665958 1.897368373790175 Epoch 224: 1.1603077278460978 1.9740605449537711 Epoch 225: 1.1610865668922927 2.0156478512874445 Epoch 226: 1.1598149090178131 2.0062030866511695 Epoch 227: 1.1629546106433142 2.010542546698093 Epoch 228: 1.1636366920994934 2.0645587753976815 Epoch 229: 1.158441789962946 1.9732275213983936 Epoch 230: 1.1594068941620521 2.0211588360306543 Epoch 231: 1.157992091904049 1.9899786212760588 Epoch 232: 1.1586025563761901 1.9669302651828673 Epoch 233: 1.158935020774732 1.9624153218725968 Epoch 234: 1.1780146089004881 2.2084053499839253 Epoch 235: 1.1624086023948017 1.9904183594049172 Epoch 236: 1.1687114571640016 2.0243325338089573 Epoch 237: 1.163780074230273 2.0115919918004996 Epoch 238: 1.1593855584185195 1.9610645305963619 Epoch 239: 1.1557913057904843 1.9313205589116578 Epoch 240: 1.1604861873712111 2.0147368305307953 Epoch 241: 1.1643855379493093 2.0308919296252905 Epoch 242: 1.1684656530568671 2.0836439974908796 Epoch 243: 1.156170351707462 1.9437864267698877 Epoch 244: 1.1632490253614063 2.0331559937553387 Epoch 245: 1.1625416949346654 1.9878681854250837 Epoch 246: 1.1743225954897536 2.139668084828036 Epoch 247: 1.160922425872473 1.9673081971292015 Epoch 248: 1.158370283489093 1.9186952135315374 Epoch 249: 1.1590132165288352 1.9575008370562035 Epoch 250: 1.1692487934322693 2.047704473254694 Epoch 251: 1.1661088708383454 2.0272174059642936 Epoch 252: 1.164244940978814 1.9922728331857886 Epoch 253: 1.1740138877833493 2.144034922963351 Epoch 254: 1.1627478961724935 1.9765857849724182 Epoch 255: 1.157342812461918 1.9176534632661586 Epoch 256: 1.167310602022143 2.023937327632803 Epoch 257: 1.1664092885855957 2.0225094450202428 Epoch 258: 1.166403114048588 2.0214241796771435 Epoch 259: 1.1640833336509528 1.9501414768095904 Epoch 260: 1.172884748612123 2.047402314044348 Epoch 261: 1.1733694938120844 2.0709007251204166 Epoch 262: 1.1703883076248496 1.9846698128270954 Epoch 263: 1.1721027556301264 2.037220774305596 Epoch 264: 1.17839549649102 2.1112297457663614 Epoch 265: 1.1654972084590798 1.9444411088542253 Epoch 266: 1.1770102224348051 2.0984330461842684 Epoch 267: 1.1661324837193872 1.9735288321449527 Epoch 268: 1.1684228835987054 2.0136744531303976 Epoch 269: 1.162849000284166 1.9527235425674798 Epoch 270: 1.1734050786860557 2.0986326550647374 Epoch 271: 1.160698086530356 1.968205613519547 Epoch 272: 1.1718205074971704 2.087588337895705 Epoch 273: 1.1707801461385712 2.05541134597421 Epoch 274: 1.1691796980186806 2.0359205380058287 Epoch 275: 1.1681172510742326 2.0465446260179836 Epoch 276: 1.1668021632958008 2.055020724857271 Epoch 277: 1.151646609766444 1.9027054110156603 Epoch 278: 1.1630268716903074 2.0320960711046787 Epoch 279: 1.1532204157848187 1.9267965144776447 Epoch 280: 1.1545101829463142 1.9504711296639747 Epoch 281: 1.1539752359172013 1.9155125002617734 Epoch 282: 1.1605993185017476 1.9984865154380536 Epoch 283: 1.16832157879913 2.0808959291191385 Epoch 284: 1.1571226626221311 1.9496603321626576 Epoch 285: 1.1637392503157475 1.966644596396405 Epoch 286: 1.1703061954315055 2.0508837977273955 Epoch 287: 1.1661373161151396 1.9581123052424907 Epoch 288: 1.1716489545640907 2.0660943356351633 Epoch 289: 1.162210911845686 2.000755633795154 Epoch 290: 1.1665233458674846 2.103888995353846 Epoch 291: 1.152381620806598 2.041867334236522 Epoch 292: 1.1533313698555427 2.0270061674169306 Epoch 293: 1.1540479557480243 2.0564307797064973 Epoch 294: 1.1530786586022215 2.015652244367046 Epoch 295: 1.1586921442683673 
2.0977453777519637 Epoch 296: 1.1466277814629358 1.9802845942481904 Epoch 297: 1.1484481049977502 1.9713593809492327 Epoch 298: 1.145805258441764 1.9447097646418092 Epoch 299: 1.1476998763032493 1.9471699750330864 Epoch 300: 1.1475673482226003 1.946809335925523 Epoch 301: 1.1450321497622773 1.9366124455646785 Epoch 302: 1.1610396227941684 2.084448184315205 Epoch 303: 1.15617334776939 2.066528471403558 Epoch 304: 1.157186834932992 2.039126665822382 Epoch 305: 1.1518226564498195 2.0724202431858814 Epoch 306: 1.1475072952738203 2.0187104774884315 Epoch 307: 1.144911046485268 1.9691664109997982 Epoch 308: 1.1436634433193602 1.9981925681435333 Epoch 309: 1.1440556245610136 1.9958492184228103 Epoch 310: 1.1416845744986281 1.9567346183934688 Epoch 311: 1.1465242589209208 1.9965803402508104 Epoch 312: 1.1402822568281583 1.9572394426584003 Epoch 313: 1.1444256744712749 1.9917700404530183 Epoch 314: 1.1341670575342746 1.8563959255071167 Epoch 315: 1.1451735809680927 1.9990284579202906 Epoch 316: 1.1448462684191392 1.946829763991842 Epoch 317: 1.1460292976769055 1.9591527701272857 Epoch 318: 1.1524783638351412 2.0395256491258884 Epoch 319: 1.1481255014746177 1.955782751168601 Epoch 320: 1.1535458673634562 2.018376299089275 Epoch 321: 1.1564232403894963 2.077434874460244 Epoch 322: 1.147254973345927 1.9583338917817799 Epoch 323: 1.1516395714656882 2.009468785579987 Epoch 324: 1.1557673220619185 2.063228215257118 Epoch 325: 1.1534744287737841 1.9935416955109886 Epoch 326: 1.150181727132005 1.949609916674194 Epoch 327: 1.1441988979348574 1.869514673151032 Epoch 328: 1.1501336543379006 1.9673135310095184 Epoch 329: 1.1511906713198927 1.9199337926780946 Epoch 330: 1.1494617877918734 1.9073749427857871 Epoch 331: 1.1469113985657917 1.9805302170310206 Epoch 332: 1.1541405113958176 2.0325597852045463 Epoch 333: 1.1402980906326248 1.8390940750249538 Epoch 334: 1.1531753874448596 1.9989140653485673 Epoch 335: 1.1567599623902975 2.017653960628913 Epoch 336: 1.1581622391493147 2.015007567430858 Epoch 337: 1.1590143212810402 2.0609980775168237 Epoch 338: 1.1510751732507323 2.001375491561636 Epoch 339: 1.1481402813804618 1.9660002677298571 Epoch 340: 1.1493973273100468 1.9400570063599067 Epoch 341: 1.1583282367595962 2.0790651999854237 Epoch 342: 1.1546969443141344 2.019734985064594 Epoch 343: 1.1527831700666127 1.9674135205163883 Epoch 344: 1.1549645414004304 2.0057042393205355 Epoch 345: 1.1576926450047145 2.0576406101941864 Epoch 346: 1.161981814267257 2.0685419449014533 Epoch 347: 1.1618604580758003 2.0649695637689844 Epoch 348: 1.1562319540905006 1.9578249376360854 Epoch 349: 1.1625951733893936 1.9825105915855945 Epoch 350: 1.157983377883249 1.9215557186068493 Epoch 351: 1.1642282348431414 1.9658373090480363 Epoch 352: 1.1528841361924658 1.8699415500500947 Epoch 353: 1.1588619403644715 1.973303504547195 Epoch 354: 1.1644781193218436 2.0546850735644964 Epoch 355: 1.1603904375276721 2.025002971930877 Epoch 356: 1.1549509698733604 1.9856711113014198 Epoch 357: 1.1448840032833865 1.8643261327266034 Epoch 358: 1.1526553868308187 1.9910940344948775 Epoch 359: 1.1590624478940748 2.0905241266197585 Epoch 360: 1.1438425028437778 1.8713452738982141 Epoch 361: 1.1401137304292206 1.8118780505269045 Epoch 362: 1.1495959026983391 1.9370362546355204 Epoch 363: 1.1484417214953213 1.9652754975670976 Epoch 364: 1.1488545201409164 1.990225168446491 Epoch 365: 1.1446320914497148 1.9126347675717308 Epoch 366: 1.149701200028218 1.9804858077778578 Epoch 367: 1.143652647759062 1.867703685220218 Epoch 368: 1.156057605455258 
2.0361489292896566 Epoch 369: 1.1523585704588584 2.0066252854364075 Epoch 370: 1.1545886779431278 2.0608739349648895 Epoch 371: 1.1513732490007436 1.9781191873402264 Epoch 372: 1.153112913284324 1.9985838970000227 Epoch 373: 1.1520276656465205 1.9937025191088742 Epoch 374: 1.153933670869458 2.031596796668612 Epoch 375: 1.1538351175449262 2.037474678249571 Epoch 376: 1.1520646242862507 2.017049789453241 Epoch 377: 1.1466829516420303 1.9419215391653144 Epoch 378: 1.163010896069329 2.079237976454214 Epoch 379: 1.1500419394047414 1.9335876359293516 Epoch 380: 1.157359178919251 2.0173619548181474 Epoch 381: 1.1553387148870204 1.9869264142006289 Epoch 382: 1.150777670711536 1.89310014361363 Epoch 383: 1.1604969603315158 2.0407271442227106 Epoch 384: 1.1523358843246543 1.9426426565179926 Epoch 385: 1.1635756840442424 2.004076157846764 Epoch 386: 1.1673614420569172 2.0596411901504927 Epoch 387: 1.1585544994201846 1.9538972375490153 Epoch 388: 1.1635871495397658 2.02419123848167 Epoch 389: 1.1626716170881763 1.9844917942397182 Epoch 390: 1.1640985607676797 1.9942870242566482 Epoch 391: 1.1667433246462153 2.012710781308925 Epoch 392: 1.166057448474898 2.013157912048576 Epoch 393: 1.1625924218958799 1.9780533169102879 Epoch 394: 1.1677546394375398 2.0679503119367917 Epoch 395: 1.165376031086863 2.040136965156445 Epoch 396: 1.158296853822276 1.9564699927587872 Epoch 397: 1.1606700470871392 1.9480817414503098 Epoch 398: 1.168912601731494 2.041150736836026 Epoch 399: 1.1706081338896337 2.021184077249987 Epoch 400: 1.1801842754484062 2.1239415606354117 Epoch 401: 1.1677212057163933 2.012395110885921 Epoch 402: 1.1672601497312562 2.0311179189639055 Epoch 403: 1.1446005340238736 1.745576950488077 Epoch 404: 1.16644155750266 2.0371263086647105 Epoch 405: 1.160096559777111 1.9573745798179463 Epoch 406: 1.1561184001485691 1.9270445802835297 Epoch 407: 1.1564428794892025 1.9534130611724356 Epoch 408: 1.157832333324879 1.9990185814286516 Epoch 409: 1.1542536578440337 1.9603170999631656 Epoch 410: 1.1579641202641253 2.0197871817756243 Epoch 411: 1.1483763991029288 1.9513250981321313 Epoch 412: 1.1404922532824875 1.9209794035824443 Epoch 413: 1.1507849734020454 2.0644281753554914 Epoch 414: 1.1522729146307125 2.014082678488246 Epoch 415: 1.1539226907918823 2.0452288186933716 Epoch 416: 1.1553935535896283 2.0516110385635793 Epoch 417: 1.1522612541246897 2.0132793132417066 Epoch 418: 1.1445911719870643 1.8998976720242748 Epoch 419: 1.15111755242973 1.9942442880638716 Epoch 420: 1.1492918953099753 1.9723618584340097 Epoch 421: 1.1508398588156288 2.0285906972912917 Epoch 422: 1.1457613876618533 1.9442137226381164 Epoch 423: 1.1544611539760208 2.033125521968535 Epoch 424: 1.158597511625977 2.0883628018192923 Epoch 425: 1.1571397943655337 2.030533907318707 Epoch 426: 1.1485143165837068 1.9784233945860508 Epoch 427: 1.154701391933309 1.9971076092250635 Epoch 428: 1.153064423920393 1.9959524721314952 Epoch 429: 1.1567851179708466 2.019159503275338 Epoch 430: 1.1571300814502 2.0130031186865063 Epoch 431: 1.1557711599737877 1.961595185581823 Epoch 432: 1.1546534087018567 1.9417015113839 Epoch 433: 1.1600875830711923 1.9985945740120918 Epoch 434: 1.1621550174809294 2.0081173436065765 Epoch 435: 1.1647858119071102 2.0069931180021467 Epoch 436: 1.171922539030513 2.0754662230190997 Epoch 437: 1.167478389331363 1.9709286580435443 Epoch 438: 1.1584175000364914 1.8390082675323287 Epoch 439: 1.1702472757144489 1.9891024656575065 Epoch 440: 1.171445495576249 2.0087085929409083 Epoch 441: 1.1707096902503504 1.9762975241864922 Epoch 
442: 1.1637849791651869 1.9026433988163767 Epoch 443: 1.171386642660623 1.9761437663208812 Epoch 444: 1.1779935021986057 2.046505549169825 Epoch 445: 1.1671231314509358 1.9652553164672595 Epoch 446: 1.1684116342751572 2.0495697045197634 Epoch 447: 1.1647003781266703 1.9978788500721527 Epoch 448: 1.1622405847121584 1.9532584123270598 Epoch 449: 1.163145617594456 1.97476781288562 Epoch 450: 1.1660072402971797 1.9742775233108107 Epoch 451: 1.1754628552358457 2.0751210976336836 Epoch 452: 1.1735655090109642 2.0469302285938347 Epoch 453: 1.169287003886958 2.002149720380278 Epoch 454: 1.1635890708472378 1.9611595279040226 Epoch 455: 1.1715093753124384 2.069328916004224 Epoch 456: 1.1680462317101417 2.0319407602859423 Epoch 457: 1.172094998018456 2.0318137992892273 Epoch 458: 1.160355532966944 1.8619506742616743 Epoch 459: 1.173269399075153 2.027583132221297 Epoch 460: 1.1684714341198927 1.963550735443599 Epoch 461: 1.170713830400304 2.044528597373633 Epoch 462: 1.1771672554199253 2.152225397121842 Epoch 463: 1.165837688657203 2.0185833196372154 Epoch 464: 1.1737217995420748 2.1373669706733183 Epoch 465: 1.1578672638403995 1.9871013310791956 Epoch 466: 1.155718232390874 1.9557080632755361 Epoch 467: 1.1557765390859782 1.9816049106330598 Epoch 468: 1.1618268652269887 2.070405227570197 Epoch 469: 1.161491829599467 2.055096537614472 Epoch 470: 1.1501254958475469 1.9661249599952544 Epoch 471: 1.1572012124584645 2.0097660872796412 Epoch 472: 1.1589604474043982 2.0302610947822126 Epoch 473: 1.1570301975614823 2.023791417809254 Epoch 474: 1.154208144563905 1.9919144916032592 Epoch 475: 1.1519935009767621 1.9600780058834755 Epoch 476: 1.1523555948256758 1.9765881895154538 Epoch 477: 1.157402986756551 1.9904666517287626 Epoch 478: 1.1633983269171395 2.052477744992646 Epoch 479: 1.1599612891715394 1.977522581958186 Epoch 480: 1.1630221140374917 2.0175865662949612 Epoch 481: 1.1603236425247734 2.0134881202380304 Epoch 482: 1.1622428147093367 2.041719852029051 Epoch 483: 1.1663182676893415 2.053700356965903 Epoch 484: 1.1715021454571897 2.1250912513664217 Epoch 485: 1.1693220355043468 2.096288016694156 Epoch 486: 1.162186265069271 2.007679351313094 Epoch 487: 1.156178143974997 1.977949535688021 Epoch 488: 1.1602071602457047 1.994003641212097 Epoch 489: 1.1628305001357682 2.015400203164338 Epoch 490: 1.161280006087401 2.003279628817588 Epoch 491: 1.1599510493939122 2.002753704350214 Epoch 492: 1.1601241418913693 1.98766468931923 Epoch 493: 1.1622395894987398 1.9936463538182703 Epoch 494: 1.157830575420715 1.9005282558070824 Epoch 495: 1.1624944319141972 1.9714438683768225 Epoch 496: 1.1622015660822151 1.9369276388742183 Epoch 497: 1.1688861721589185 1.9886871017866947 Epoch 498: 1.1696200308857707 2.0001774086437893 Epoch 499: 1.1751592349446145 2.0805927706849436 Epoch 500: 1.1726821241284315 2.0608937547750488 Epoch 501: 1.1693244711262787 2.0588277961979427 Epoch 502: 1.164318861403886 1.9568837861962476 Epoch 503: 1.1674620721851714 2.001643097605654 Epoch 504: 1.1686390240785633 2.0403778965299755 Epoch 505: 1.160558346790356 1.9947000458052078 Epoch 506: 1.1550101382922815 1.9553919771250783 Epoch 507: 1.1633863462773795 2.047664305171632 Epoch 508: 1.161234130215018 2.004647106635081 Epoch 509: 1.163735588214571 2.0240105576567906 Epoch 510: 1.154880290041301 1.9186242647350635 Epoch 511: 1.165342490479345 2.0470743379697582 Epoch 512: 1.1643114503808094 2.0418126189070493 Epoch 513: 1.157790152976131 1.9940304144216299 Epoch 514: 1.16612453996357 2.0405570445462846 Epoch 515: 1.1624344094319572 
1.9741268566492267 Epoch 516: 1.1655260982797748 2.001969136200936 Epoch 517: 1.1724045019844676 2.091767385080294 Epoch 518: 1.1716510265738609 2.0843143345325106 Epoch 519: 1.1686269961900118 2.073748595118983 Epoch 520: 1.1638694305193875 2.0280956923017217 Epoch 521: 1.1636480533268407 2.0332406500489677 Epoch 522: 1.1603089324484057 1.9955956478220662 Epoch 523: 1.1655246105364943 2.0737222183350674 Epoch 524: 1.1662126905080017 2.076627691101107 Epoch 525: 1.1568865515059628 1.974100652504382 Epoch 526: 1.168518075087314 2.116093664338233 Epoch 527: 1.1739213365871974 2.1025021700328166 Epoch 528: 1.1640961527037137 2.0044410213216626 Epoch 529: 1.1672718428903104 2.0129850523107686 Epoch 530: 1.161172217703352 1.9262876678915872 Epoch 531: 1.1585188388036571 1.9180818748253303 Epoch 532: 1.1653540395516186 2.0225110646631754 Epoch 533: 1.157019824202749 1.9078821673186914 Epoch 534: 1.1661352704968753 2.005397500866252 Epoch 535: 1.1627653941287703 1.9065647246932755 Epoch 536: 1.1578514209429478 1.8395637428079765 Epoch 537: 1.1662191899152756 2.00156943830715 Epoch 538: 1.168201945668109 2.023471497362851 Epoch 539: 1.1665228526276126 2.0644991500670145 Epoch 540: 1.1658816613974488 2.058665702998422 Epoch 541: 1.1643447355846515 1.986067069948219 Epoch 542: 1.162476253404076 1.9906159550427986 Epoch 543: 1.1747341048941513 2.15537441684528 Epoch 544: 1.158303334439897 1.940222766571229 Epoch 545: 1.167750097576072 2.047987879505625 Epoch 546: 1.161499969447288 1.9796463160817586 Epoch 547: 1.1618397809182983 2.0145491275750587 Epoch 548: 1.1530248211180012 1.9542380628345206 Epoch 549: 1.1531222017080704 1.9157917081991016 Epoch 550: 1.1635858828490209 2.0209805841913973 Epoch 551: 1.1579552887038964 1.9522790453630605 Epoch 552: 1.1649831808558848 2.046686158613785 Epoch 553: 1.1640074373615312 2.003159153474343 Epoch 554: 1.1651457482691459 2.0102017022980583 Epoch 555: 1.1715635029370517 2.0164581257660914 Epoch 556: 1.1764307339401388 2.0724774274837436 Epoch 557: 1.1669823043140815 1.9717705460185897 Epoch 558: 1.1744484834107063 1.997186379174134 Epoch 559: 1.169286604704071 1.9482003074049572 Epoch 560: 1.1708232269684675 1.9670896160116247 Epoch 561: 1.177918040159993 2.0833842383961665 Epoch 562: 1.175450588919051 2.0409732812502694 Epoch 563: 1.1757013259264324 2.036028007875955 Epoch 564: 1.169852341967423 1.9178011529882681 Epoch 565: 1.1839854524569744 2.1097700742101275 Epoch 566: 1.1699901890362303 1.9443972610595948 Epoch 567: 1.1784189477903166 2.0134108575047995 Epoch 568: 1.1724111060404596 1.9501752190008588 Epoch 569: 1.1793644855150411 2.058851187496867 Epoch 570: 1.1710704634361726 1.9844881108378327 Epoch 571: 1.170637349147944 2.0128491214203437 Epoch 572: 1.1674859871552459 1.995827220735414 Epoch 573: 1.1602092563525992 1.937681733563966 Epoch 574: 1.1562973452235896 1.9081278689182501 Epoch 575: 1.1601815573497287 1.9425865690783073 Epoch 576: 1.1618195147540886 1.9472110697687859 Epoch 577: 1.1708741364461133 2.043250021263253 Epoch 578: 1.1676068093394385 2.003299759582118 Epoch 579: 1.1651953384573908 1.9480119950750352 Epoch 580: 1.1729481290413073 2.0578489946616196 Epoch 581: 1.1756518105924687 2.081163220123037 Epoch 582: 1.1665099587772048 1.9747648880773814 Epoch 583: 1.167405836182548 1.9592179552107116 Epoch 584: 1.1746998654284155 2.0028697591643874 Epoch 585: 1.1773365248807184 2.0719121468963726 Epoch 586: 1.174884035757471 2.0468189831104544 Epoch 587: 1.166853113745297 1.9770294711982148 Epoch 588: 1.164852928893286 1.9380153940723435 
Epoch 589: 1.1641376525904508 1.9377478395940595 Epoch 590: 1.1599106893044446 1.9057579256872386 Epoch 591: 1.1709976070406567 2.029604329245804 Epoch 592: 1.1716528984848746 2.0380568684025455 Epoch 593: 1.1686722556889133 2.0331959031481532 Epoch 594: 1.1632339738919797 1.9820065033475909 Epoch 595: 1.1564164096740799 1.9271097108640185 Epoch 596: 1.1666694828568867 2.0724520045260295 Epoch 597: 1.1537707827222201 1.9181065128036707 Epoch 598: 1.1571075122251715 1.9705058791561187 Epoch 599: 1.1721611168487915 2.1617614972119306 Epoch 600: 1.155183953593385 1.9749704550186957 Epoch 601: 1.1554366688928757 1.9844079087563904 Epoch 602: 1.1647280640469377 2.081337702842052 Epoch 603: 1.1658496108379754 2.1060936098291947 Epoch 604: 1.14758503546801 1.8717290219119311 Epoch 605: 1.1510322758027156 1.9657134666956575 Epoch 606: 1.1535227944197683 1.989369661635029 Epoch 607: 1.1538889452189671 2.0018450944565944 Epoch 608: 1.1635852958354194 2.1179681261409202 Epoch 609: 1.1554217599794048 1.9053692359601369 Epoch 610: 1.1697281053661879 2.063683085157363 Epoch 611: 1.1654273181045691 2.0749523365744253 Epoch 612: 1.1532446422864804 1.9181586657893415 Epoch 613: 1.163051758110536 2.046855736204119 Epoch 614: 1.151106979641301 1.9490453567949784 Epoch 615: 1.1589446331299682 2.0621825131187985 Epoch 616: 1.156008845475296 2.0593705707766103 Epoch 617: 1.1561201349749786 2.077364593638815 Epoch 618: 1.1479035736525844 2.008954274865795 Epoch 619: 1.144524703842288 1.980690356318616 Epoch 620: 1.136992737736608 1.920267931915191 Epoch 621: 1.1449055065162845 2.0256836009563326 Epoch 622: 1.1369548617973584 1.9139524101299534 Epoch 623: 1.1394590802641777 1.9412799156168985 Epoch 624: 1.1440634230707285 1.9759853212911 Epoch 625: 1.1377341050508412 1.8056331460596537 Epoch 626: 1.154621556420888 2.0517240402014973 Epoch 627: 1.1437171562739505 1.9261258346472023 Epoch 628: 1.150385307559757 2.027778902322227 Epoch 629: 1.1530967199002244 2.0626847385825093 Epoch 630: 1.1531172636195939 2.065527734414501 Epoch 631: 1.1490742275494856 1.9638324313626334 Epoch 632: 1.155315310074967 2.102823402912834 Epoch 633: 1.1417601690775259 1.9172810647771976 Epoch 634: 1.1510159534567739 2.010997298421388 Epoch 635: 1.149005741676144 1.9937017899446772 Epoch 636: 1.155222921847631 2.01113908577509 Epoch 637: 1.150820535642392 1.9526606615164677 Epoch 638: 1.1551409338435366 2.000433609996005 Epoch 639: 1.1578834103979951 2.0468988071021372 Epoch 640: 1.1565305856473265 2.0128183635846515 Epoch 641: 1.1591606428192935 2.01835623644454 Epoch 642: 1.1569524851263293 2.045253459625615 Epoch 643: 1.1514787531975645 1.9551687657003172 Epoch 644: 1.163649621435769 2.0838699038768436 Epoch 645: 1.1472411948456802 1.9178902458696894 Epoch 646: 1.1447468750038605 1.8467790295194404 Epoch 647: 1.165686895471076 2.1111351565553234 Epoch 648: 1.1576923239875891 1.9899596857567805 Epoch 649: 1.161485272855309 2.025151149884347 Epoch 650: 1.1597597440083867 1.908835408981916 Epoch 651: 1.1650400581070193 2.0420247761455457 Epoch 652: 1.1675470577075604 2.062979899616677 Epoch 653: 1.159614412837413 1.994903243221295 Epoch 654: 1.1582505049578176 1.9941648150426208 Epoch 655: 1.1634176578590663 2.057592596278545 Epoch 656: 1.1611885297523123 2.007311825230074 Epoch 657: 1.154168506409637 1.9096176697993068 Epoch 658: 1.1658267998253125 2.0735413463909116 Epoch 659: 1.1622960295547577 1.9920497664560168 Epoch 660: 1.1605113725197804 1.9769764654165227 Epoch 661: 1.1688240775030034 2.0720466891643783 Epoch 662: 
1.166474635389868 2.043905242100123 Epoch 663: 1.1561023168581162 1.891245447819258 Epoch 664: 1.170902381424022 2.0453216165097943 Epoch 665: 1.1735445476514155 2.0428657951344187 Epoch 666: 1.172054013314705 2.0262399185341877 Epoch 667: 1.1708034841598174 2.0146171031004765 Epoch 668: 1.1671142915453008 1.9348631427426084 Epoch 669: 1.1775445636369115 2.0477901564859926 Epoch 670: 1.1644419422264496 1.919474895221312 Epoch 671: 1.1741559338165275 2.074967660671925 Epoch 672: 1.1731059472314012 2.0687478375680235 Epoch 673: 1.1625465689120127 1.9400170806618209 Epoch 674: 1.1621233434513776 1.9331084122401405 Epoch 675: 1.172690681588269 2.078074424231667 Epoch 676: 1.1673711696972886 2.0179073495059945 Epoch 677: 1.1706102870624606 2.0697200328933634 Epoch 678: 1.1703580494763104 1.9874324564763488 Epoch 679: 1.1619811693165076 1.8821263438324085 Epoch 680: 1.1760639541364406 2.0506844372123743 Epoch 681: 1.1768429181245823 2.0749670618770857 Epoch 682: 1.1636128240680195 1.8879778704396915 Epoch 683: 1.1709499629950786 1.9835964000124204 Epoch 684: 1.176966935674532 2.0451598736206993 Epoch 685: 1.1748268450232888 2.010486718394343 Epoch 686: 1.172105455781458 1.9888604323678445 Epoch 687: 1.1773147388707024 2.0552232315545726 Epoch 688: 1.174981523032786 2.025749691408062 Epoch 689: 1.175802764810576 2.0272108047081865 Epoch 690: 1.1741388731261873 2.0438495359399673 Epoch 691: 1.1645637415696226 1.914852964763105 Epoch 692: 1.1775437201929273 2.0837327430593087 Epoch 693: 1.171021545689541 2.0380823234432075 Epoch 694: 1.1720299111368093 2.077853785353434 Epoch 695: 1.166607405867305 2.0266663943476817 Epoch 696: 1.1616206086636185 2.0041200728231723 Epoch 697: 1.1650256609464475 2.0303823418621296 Epoch 698: 1.164814414701397 1.9838895901176739 Epoch 699: 1.1743731417049934 2.121649344069409 Epoch 700: 1.1729325183302142 2.090510587257664 Epoch 701: 1.167460302805175 1.9853618003596634 Epoch 702: 1.171378151734284 2.053763389219159 Epoch 703: 1.1702239862054735 2.055377866431336 Epoch 704: 1.1655974083142302 1.99677623915332 Epoch 705: 1.1637079902216065 1.9796444231160475 Epoch 706: 1.1657164709848369 1.9887143991472445 Epoch 707: 1.1675292832336142 2.0241719840109442 Epoch 708: 1.1698232737370793 2.035350913839469 Epoch 709: 1.1734883786457513 2.058245749818177 Epoch 710: 1.178176428028091 2.0956221073814563 Epoch 711: 1.1707212472550819 2.0256728162977207 Epoch 712: 1.1575486624890918 1.8194403182077832 Epoch 713: 1.177523685965058 2.0673595592156992 Epoch 714: 1.1747159994855776 2.043155876208187 Epoch 715: 1.1710098799379844 1.9762692647653097 Epoch 716: 1.1640454907921152 1.8903064571396975 Epoch 717: 1.1762622377686232 2.0574860911681907 Epoch 718: 1.1787825645581391 2.002981652644582 Epoch 719: 1.1760841297560203 1.9935265856839561 Epoch 720: 1.1809428938255737 2.0650923158709085 Epoch 721: 1.1765205975787607 1.9578502910758795 Epoch 722: 1.1774642175778434 2.019709090157791 Epoch 723: 1.1680699833280164 1.898312445095025 Epoch 724: 1.1761840770108003 2.0462077852453637 Epoch 725: 1.1718479222055256 2.0058071438445033 Epoch 726: 1.1664817877999039 1.9096610999007464 Epoch 727: 1.1724817886067012 1.9513913121667488 Epoch 728: 1.177878687532817 2.0048043751218154 Epoch 729: 1.1805702427200604 2.038253110489647 Epoch 730: 1.1731762805250165 1.9773643250294353 Epoch 731: 1.1719481861256338 1.9681530248838734 Epoch 732: 1.170958734170868 1.931736000361683 Epoch 733: 1.1809837229509716 2.076098698308606 Epoch 734: 1.1814920816011996 2.1008648023647707 Epoch 735: 1.1739589961909729 
1.9850473133869209 Epoch 736: 1.175209416452425 2.010759548096611 Epoch 737: 1.17394430015845 1.980847883428966 Epoch 738: 1.1742166094964246 1.9785416003352714 Epoch 739: 1.176104874697144 1.9923927012366787 Epoch 740: 1.1713424931242489 1.8749667813350874 Epoch 741: 1.179252499460744 1.9830882619410506 Epoch 742: 1.1719552117063976 1.8948350185906075 Epoch 743: 1.1823893626823063 2.0678006992277806 Epoch 744: 1.1780966097356345 1.998036497360876 Epoch 745: 1.179165780753876 2.004630972509234 Epoch 746: 1.1835778630200546 2.0950285707921648 Epoch 747: 1.1766786813161292 2.044100782536718 Epoch 748: 1.160163503077738 1.9177065378896447 Epoch 749: 1.1675883035849126 2.014212936737423 Epoch 750: 1.1634300571662926 1.9721611100038532 Epoch 751: 1.1715877858447237 2.043941039719496 Epoch 752: 1.1584171500053826 1.8746409041865049 Epoch 753: 1.1763127461801561 2.0918118518975723 Epoch 754: 1.1659716692903694 1.9742224619202564 Epoch 755: 1.1646604634177344 1.9361122857977011 Epoch 756: 1.160620833248367 1.8662711354233066 Epoch 757: 1.1661982405581681 1.968819352593975 Epoch 758: 1.1629045745496742 1.9550035485315502 Epoch 759: 1.1647548221153303 1.9730341239745295 Epoch 760: 1.1663056539956702 1.965890594274193 Epoch 761: 1.1779151897816287 2.081608721625768 Epoch 762: 1.1693962316786333 1.941286889886656 Epoch 763: 1.1805267074161756 2.0743145136767094 Epoch 764: 1.1800985705084266 2.0668825274432767 Epoch 765: 1.1726861160529571 1.956012847673854 Epoch 766: 1.1877647813139627 2.083771880721844 Epoch 767: 1.1759959529950046 1.9502345357584432 Epoch 768: 1.180106600845589 2.030203465317662 Epoch 769: 1.1785547996239625 2.0075038810866097 Epoch 770: 1.1721198204166714 1.9509697428767063 Epoch 771: 1.1748246698164564 1.9928669913981039 Epoch 772: 1.1730788942530228 1.9950697028851319 Epoch 773: 1.175246930868462 1.9668602729713875 Epoch 774: 1.1734417451383476 1.9514775324234275 Epoch 775: 1.1865747041781565 2.098993134805362 Epoch 776: 1.188501736263222 2.0506954757968603 Epoch 777: 1.182852656139179 1.9646481114324323 Epoch 778: 1.185967165253824 2.013231001970683 Epoch 779: 1.1844395490522908 2.0910341881077166 Epoch 780: 1.1737915821547968 1.9646164138616604 Epoch 781: 1.191758869935296 2.18962453706787 Epoch 782: 1.1831959356758552 2.1198312684358753 Epoch 783: 1.1723311593442 1.9859166627501075 Epoch 784: 1.1744433846975588 2.019754884079452 Epoch 785: 1.1719566576658207 2.019318801231734 Epoch 786: 1.1710418213323723 1.988913309003219 Epoch 787: 1.1620316669598556 1.8707335762957873 Epoch 788: 1.1703379354775691 1.997393590357857 Epoch 789: 1.1719740907252663 2.025390894214763 Epoch 790: 1.1751951340536277 2.073895434476763 Epoch 791: 1.1649917139926171 1.9780477435131272 Epoch 792: 1.1703869198290446 2.025718385462993 Epoch 793: 1.1702765273424296 2.0614739778898983 Epoch 794: 1.1625825879514962 1.9226274177688791 Epoch 795: 1.1655086323726378 1.9208708143722197 Epoch 796: 1.1691174033705247 1.9006829597676416 Epoch 797: 1.1729928727329737 1.9640889316484784 Epoch 798: 1.1778463811902438 1.9921573108254367 Epoch 799: 1.1773514612461151 1.9667117251989434 Epoch 800: 1.177124613518903 2.0343926448254734 Epoch 801: 1.1749925961056273 1.985964658981048 Epoch 802: 1.179957250611606 2.047084039569667 Epoch 803: 1.178775628421662 2.0210086054432637 Epoch 804: 1.175220208981832 2.02350131862926 Epoch 805: 1.1673621446963132 1.960305978293802 Epoch 806: 1.1791041023504112 2.1120544016014757 Epoch 807: 1.178017568433982 2.0771402984542813 Epoch 808: 1.165647287412458 1.9602070874952493 Epoch 809: 
1.1581522666958617 1.9120867281927372 Epoch 810: 1.1574695893847933 1.9553021737969036 Epoch 811: 1.1532611898613325 1.9119749320993886 Epoch 812: 1.1614700136038754 1.98716206728456 Epoch 813: 1.1671994116082518 2.0275299821752375 Epoch 814: 1.1666677665583243 2.0417365532450047 Epoch 815: 1.1697040029415398 2.0734529891667592 Epoch 816: 1.1686166547942265 2.0558238071783412 Epoch 817: 1.1601948873090822 2.010241210464497 Epoch 818: 1.150618385029723 1.9258408855428137 Epoch 819: 1.1636429174708893 2.0233682903580497 Epoch 820: 1.1622884142555794 2.0014636002126447 Epoch 821: 1.1555748556733847 1.906875567540085 Epoch 822: 1.1664335200140223 2.0571278055304023 Epoch 823: 1.1583224588173926 1.9869757506956676 Epoch 824: 1.1702089173813757 2.1543993462777062 Epoch 825: 1.1616518864819756 2.0773275184740485 Epoch 826: 1.1551967684697095 2.0063270888888423 Epoch 827: 1.152287628406903 1.9779246087420448 Epoch 828: 1.157850677359235 2.057287178586857 Epoch 829: 1.1556715647828006 2.0239885485233216 Epoch 830: 1.1565890096397193 2.0503171254560946 Epoch 831: 1.1595389084512515 1.98795193268577 Epoch 832: 1.1589800571798163 2.011033267696134 Epoch 833: 1.1556691202822504 1.9693737537898452 Epoch 834: 1.157665567164785 2.0208205094235487 Epoch 835: 1.1595353260212122 2.063046521990212 Epoch 836: 1.1557062861486722 1.9994191106493338 Epoch 837: 1.1565809689923208 1.983921711149201 Epoch 838: 1.1623608098363054 2.0646902809727394 Epoch 839: 1.1581531743891427 1.987295732082542 Epoch 840: 1.1590402311588939 1.9689879988643828 Epoch 841: 1.1646506455863843 2.0480950713908213 Epoch 842: 1.162215740353453 1.9799758074439537 Epoch 843: 1.1571640072959855 1.921655076066351 Epoch 844: 1.1580560500600952 1.9277735300278482 Epoch 845: 1.1649186113385384 1.9996409990007527 Epoch 846: 1.1646426722474732 2.003677002909093 Epoch 847: 1.163233802067977 1.9591389964503692 Epoch 848: 1.1707356847623558 2.063112166983802 Epoch 849: 1.1649031859386627 1.9707133701977613 Epoch 850: 1.1669768604197768 2.003781069027834 Epoch 851: 1.1666918279123084 1.9670802100651719 Epoch 852: 1.177111771987998 2.082291134554474 Epoch 853: 1.1714490445121664 1.9742670079602334 Epoch 854: 1.1690847946719853 1.9717368670033464 Epoch 855: 1.1704196198091974 2.001142604751988 Epoch 856: 1.1643601918228275 1.9346720864323468 Epoch 857: 1.169987727850821 1.9859981783905607 Epoch 858: 1.1679002555552322 1.9558523607403306 Epoch 859: 1.1700790689829543 1.9659696296969515 Epoch 860: 1.1702633562751767 2.035529723782505 Epoch 861: 1.1668381815422824 1.9730851200413737 Epoch 862: 1.1710780627357755 2.0021502399044393 Epoch 863: 1.1778865245313674 2.0584634789808436 Epoch 864: 1.1706303717546014 1.9793857058274713 Epoch 865: 1.1720638650705177 2.002020048112528 Epoch 866: 1.177027344548512 2.094128393383114 Epoch 867: 1.1590711620519718 1.930179604717959 Epoch 868: 1.1595880558962133 1.9178769953065604 Epoch 869: 1.163754641289815 1.9709385703174107 Epoch 870: 1.1626672733179608 1.9655228206886322 Epoch 871: 1.1630797077085382 2.0002004967166447 Epoch 872: 1.1660986739522754 2.071909935186307 Epoch 873: 1.160184645422692 1.9995763508452455 Epoch 874: 1.160804198929034 2.0300445505981894 Epoch 875: 1.1585661002935455 2.0156828588956004 Epoch 876: 1.1608319160411553 2.0008201812945963 Epoch 877: 1.1675099494614467 2.11487769035713 Epoch 878: 1.1590479574905752 2.005462536710277 Epoch 879: 1.1599498451300045 2.0019678492327615 Epoch 880: 1.1551786199453369 2.0250694826853666 Epoch 881: 1.1446999750255613 1.9146614964687192 Epoch 882: 
1.1607304073807792 2.100712842396314 Epoch 883: 1.1587528035888388 2.075430378710201 Epoch 884: 1.1450873628445297 1.9458641513033033 Epoch 885: 1.1516276042155942 1.9869954395285911 Epoch 886: 1.1558943781481392 1.9879195865884405 Epoch 887: 1.1588996968992071 2.0238636993884156 Epoch 888: 1.1589243138387106 1.9685674227858476 Epoch 889: 1.1574874914236424 1.9914169005594886 Epoch 890: 1.1458185129034988 1.8453262503642136 Epoch 891: 1.149657736705517 1.9360649678219166 Epoch 892: 1.1525349621243426 2.0003436914457113 Epoch 893: 1.1566643190248354 2.0899291699728164 Epoch 894: 1.1538013519485895 2.0285061124476362 Epoch 895: 1.1494894712829409 1.9841138725718106 Epoch 896: 1.1467280116705567 1.9875843391309578 Epoch 897: 1.1457867423798365 1.965955036334711 Epoch 898: 1.1500922673760217 2.0171676734984088 Epoch 899: 1.1461515369907496 1.9924543962760368 Epoch 900: 1.1561367888381597 2.124404315519627 Epoch 901: 1.143623481118157 1.9879421847563254 Epoch 902: 1.1371309103687557 1.9300487974583553 Epoch 903: 1.1392961111037059 1.923134522407426 Epoch 904: 1.1407954023431104 1.9720533933578297 Epoch 905: 1.1447308147935196 1.9998299887445616 Epoch 906: 1.1464585126021198 2.0148125135693653 Epoch 907: 1.1404403740371 1.9297813616045891 Epoch 908: 1.1463488767108843 2.0096770631354377 Epoch 909: 1.1546029622929534 2.0832874132502397 Epoch 910: 1.150169572833268 2.03405456699914 Epoch 911: 1.1461421224307 1.940212083531217 Epoch 912: 1.1461235515941184 1.9879469239195826 Epoch 913: 1.1437277165448931 1.9458425306722948 Epoch 914: 1.1465826613349566 1.963993027082908 Epoch 915: 1.153700095135566 2.061082796293308 Epoch 916: 1.145535162769503 1.9399497199454492 Epoch 917: 1.1514181013196756 2.00126064285679 Epoch 918: 1.145316405324972 1.8983230429796274 Epoch 919: 1.14742180546 1.9248133926280373 Epoch 920: 1.1453945133821228 1.9258806454553379 Epoch 921: 1.1564822331850053 2.0417494917942447 Epoch 922: 1.16532666842355 2.111873507098976 Epoch 923: 1.1540591309061792 1.9727663628126484 Epoch 924: 1.1511551928026063 1.9398995635622858 Epoch 925: 1.153896463580002 1.958796187945312 Epoch 926: 1.1585792016906362 1.9770359681201741 Epoch 927: 1.1638372257027831 1.987788305134844 Epoch 928: 1.1693475765031072 2.0466449244121923 Epoch 929: 1.1613297289146602 1.9509056905792623 Epoch 930: 1.1602256163595408 1.907791824877854 Epoch 931: 1.1762509516408628 2.111297411782571 Epoch 932: 1.1699169121556179 2.0205901983640597 Epoch 933: 1.1684207492079968 1.9272494985274782 Epoch 934: 1.1775275555358624 2.0234431331006006 Epoch 935: 1.1835057205367496 2.096536323344183 Epoch 936: 1.1808714871626944 2.06793651984597 Epoch 937: 1.1711764282577117 1.944775568850033 Epoch 938: 1.1765847908624938 2.059669845699896 Epoch 939: 1.1689618571823044 1.99094115269792 Epoch 940: 1.1702445180640104 2.0274149394537373 Epoch 941: 1.1673014771928076 2.006856804873804 Epoch 942: 1.1680741730951625 2.0138214259752067 Epoch 943: 1.1632470351155013 1.9746100270936742 Epoch 944: 1.164775740105387 1.9920699533172106 Epoch 945: 1.1688593880528286 2.0088814901064453 Epoch 946: 1.1747648520414604 2.064169474455046 Epoch 947: 1.166189301475967 1.9512623293150524 Epoch 948: 1.1723293093984961 1.9991619097278583 Epoch 949: 1.1724157247810891 2.0119349303671377 Epoch 950: 1.1749165447698378 2.056822162737071 Epoch 951: 1.1736709326235968 2.0236610959914625 Epoch 952: 1.1750017021555834 2.0385020828048606 Epoch 953: 1.1767926748338744 2.078712474345494 Epoch 954: 1.1712354911892442 1.988543257494375 Epoch 955: 1.1740383287480203 
2.053345404770091 Epoch 956: 1.1751871446465831 2.0933373481889297 Epoch 957: 1.1641647182469066 1.9155137587626623 Epoch 958: 1.1691004333247237 1.9414331144327648 Epoch 959: 1.1772536725425993 2.067633262103231 Epoch 960: 1.1779986870370482 2.086082028772622 Epoch 961: 1.1738897563079478 2.047915322270108 Epoch 962: 1.1619161143374852 1.8923934588954494 Epoch 963: 1.1697457741957606 1.9844759110197736 Epoch 964: 1.1697351096822857 1.9741557080092809 Epoch 965: 1.1674910142105261 1.970702981080298 Epoch 966: 1.1708913097774833 2.035510742775975 Epoch 967: 1.163560206727453 1.959034424532293 Epoch 968: 1.1645150122282628 2.0045453242661675 Epoch 969: 1.1679107423551707 2.063905013621921 Epoch 970: 1.1677195713875725 2.0433237073388213 Epoch 971: 1.17002548380566 2.087239453154267 Epoch 972: 1.1669677656803277 2.00776004658578 Epoch 973: 1.1693951222961703 2.0054310660716648 Epoch 974: 1.163591461957682 1.9957651646191144 Epoch 975: 1.1631281436776817 1.996814233291856 Epoch 976: 1.1657782005156223 2.0337180441688014 Epoch 977: 1.1613277185536075 1.9761369889578149 Epoch 978: 1.1715340660463078 2.0889472256757897 Epoch 979: 1.1643274955849614 2.0325757476315434 Epoch 980: 1.169053706321655 2.1097168551532635 Epoch 981: 1.162086973500796 1.9555235042178425 Epoch 982: 1.1659416789106982 2.0106789994605276 Epoch 983: 1.169102413154393 2.0281705562256747 Epoch 984: 1.1554108232659615 1.9228184709910991 Epoch 985: 1.1588930741118912 1.921251824994387 Epoch 986: 1.1550587111137056 1.8840277756684038 Epoch 987: 1.1638976152823144 2.025125792607056 Epoch 988: 1.1671422605177448 2.097307271135845 Epoch 989: 1.1556821951123852 1.9115550838186852 Epoch 990: 1.160829285965992 1.940623122785074 Epoch 991: 1.165446317556802 2.000186893753661 Epoch 992: 1.1718972107106718 2.076132788705562 Epoch 993: 1.163709861587923 1.9829789758500083 Epoch 994: 1.1624906108985786 2.000615035032854 Epoch 995: 1.1613760832770657 1.9794077021416954 Epoch 996: 1.1678270786062839 2.0712825199502247 Epoch 997: 1.1632112053389412 2.0243115044550732 Epoch 998: 1.1598619426016632 1.9867915874642743 Epoch 999: 1.1667041528951323 2.064176860885111

Challenge 14.2

Use SGD to train the single neuron in the previous notebook using a linearly separable set of 100 points, divided by the line $-\frac{5}{2}x+\frac{3}{2}y+3=0$

### We provide a set of randomly generated training points
num_points = 100
w1 = -2.5
w2 = 1.5
w0 = 3.

np.random.seed(637163) # we make sure we always generate the same sequence

x_data = np.random.rand(num_points)*10.
y_data = np.random.rand(num_points)*10.
z_data = np.zeros(num_points)
for i in range(len(z_data)):
    if (y_data[i] > (-w0-w1*x_data[i])/w2):
        z_data[i] = 1.

pyplot.scatter(x_data,y_data,c=z_data,marker='o',linewidth=1.5,edgecolors='black')
pyplot.plot(x_data,(-w1*x_data-w0)/w2)
pyplot.gray()
pyplot.xlim(0,10)
pyplot.ylim(0,10);
[Figure: scatter plot of the 100 training points, colored by class, together with the separating line.]

You will need the following auxiliary functions:

def sigmoid(z):
    """The sigmoid function."""
    return 1.0/(1.0+np.exp(-z))

def sigmoid_prime(z):
    """Derivative of the sigmoid function."""
    return sigmoid(z)*(1-sigmoid(z))
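As a starting point, here is a minimal sketch of one possible SGD loop for this challenge. It assumes the data and functions defined above; the learning rate eta, the number of epochs, and the quadratic cost are illustrative choices, not prescribed by the challenge.

import numpy as np

# Minimal SGD sketch (one possible approach, not the only one).
# Assumes x_data, y_data, z_data, num_points, sigmoid and sigmoid_prime
# from the cells above; eta and n_epochs are hypothetical choices.
eta = 0.5
n_epochs = 1000
w = np.random.randn(2)  # weights multiplying x and y
b = np.random.randn()   # bias

for epoch in range(n_epochs):
    for i in np.random.permutation(num_points):  # one point at a time
        z = w[0]*x_data[i] + w[1]*y_data[i] + b
        a = sigmoid(z)
        # gradient of the quadratic cost (a - t)^2 / 2 via the chain rule
        delta = (a - z_data[i]) * sigmoid_prime(z)
        w[0] -= eta * delta * x_data[i]
        w[1] -= eta * delta * y_data[i]
        b -= eta * delta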

A simple network to classify handwritten digits

Most of this section has been taken from M. Nielsen's free on-line book: "Neural Networks and Deep Learning" http://neuralnetworksanddeeplearning.com/

In this section we discuss a neural network that can solve a more interesting and difficult problem: recognizing individual handwritten digits.

The input layer of the network contains neurons encoding the values of the input pixels. Our training data for the network will consist of many 28 by 28 pixel images of scanned handwritten digits, and so the input layer contains 784=28×28 neurons. The input pixels are greyscale, with a value of 0.0 representing white, a value of 1.0 representing black, and in between values representing gradually darkening shades of grey.
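For concreteness, a minimal sketch of how one such image becomes an input vector (img here is a made-up array standing in for a real scanned digit):

import numpy as np

img = np.random.rand(28, 28)  # a made-up greyscale image, values in [0, 1]
x = img.reshape(784, 1)       # the 784-dimensional column vector fed to the input layer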

The second layer of the network is a hidden layer. We denote the number of neurons in this hidden layer by $n$, and we'll experiment with different values for $n$. The example shown illustrates a small hidden layer, containing just $n=15$ neurons.

The output layer of the network contains 10 neurons. If the first neuron fires, i.e., has an output $\sim 1$, then that will indicate that the network thinks the digit is a 0. If the second neuron fires, then that will indicate that the network thinks the digit is a 1. And so on. A little more precisely, we number the output neurons from 0 through 9, and figure out which neuron has the highest activation value. If that neuron is, say, neuron number 6, then our network will guess that the input digit was a 6. And so on for the other output neurons.
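This decoding step amounts to a single argmax over the output activations; a small illustration with made-up numbers:

import numpy as np

# Hypothetical activations of the 10 output neurons
output = np.array([0.05, 0.02, 0.10, 0.03, 0.01, 0.08, 0.92, 0.04, 0.07, 0.02])
guess = np.argmax(output)  # index of the most active neuron
print(guess)               # prints 6: the network's guess is the digit 6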

Network to identify single digits. The output layer has 10 neurons, one for each digit.

The first thing we'll need is a data set to learn from - a so-called training data set. We'll use the MNIST data set, which contains tens of thousands of scanned images of handwritten digits, together with their correct classifications. MNIST's name comes from the fact that it is a modified subset of two data sets collected by NIST, the United States' National Institute of Standards and Technology. Here are a few images from MNIST:

The MNIST data comes in two parts. The first part contains 60,000 images to be used as training data. These images are scanned handwriting samples from 250 people, half of whom were US Census Bureau employees, and half of whom were high school students. The images are greyscale and 28 by 28 pixels in size. The second part of the MNIST data set is 10,000 images to be used as test data. Again, these are 28 by 28 greyscale images. We'll use the test data to evaluate how well our neural network has learned to recognize digits. To make this a good test of performance, the test data was taken from a different set of 250 people than the original training data (albeit still a group split between Census Bureau employees and high school students). This helps give us confidence that our system can recognize digits from people whose writing it didn't see during training.

In practice, we are going to split the data a little differently. We'll leave the test images as is, but split the 60,000-image MNIST training set into two parts: a set of 50,000 images, which we'll use to train our neural network, and a separate 10,000-image validation set.

We'll use the notation $x$ to denote a training input. It'll be convenient to regard each training input $x$ as a $28\times 28 = 784$-dimensional vector. Each entry in the vector represents the grey value for a single pixel in the image. We'll denote the corresponding desired output by $y=y(x)$, where $y$ is a 10-dimensional vector. For example, if a particular training image, $x$, depicts a 6, then $y(x)=(0,0,0,0,0,0,1,0,0,0)^T$ is the desired output from the network. Note that $T$ here is the transpose operation, turning a row vector into an ordinary (column) vector.
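For instance, the target vector for an image of a 6 can be built in a couple of lines (the loader code below wraps exactly this in its vectorized_result helper):

import numpy as np

y = np.zeros((10, 1))
y[6] = 1.0  # desired output y(x) for a training image depicting a 6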

""" mnist_loader ~~~~~~~~~~~~ A library to load the MNIST image data. For details of the data structures that are returned, see the doc strings for ``load_data`` and ``load_data_wrapper``. In practice, ``load_data_wrapper`` is the function usually called by our neural network code. """ #### Libraries # Standard library import pickle import gzip # Third-party libraries import numpy as np def load_data(): """Return the MNIST data as a tuple containing the training data, the validation data, and the test data. The ``training_data`` is returned as a tuple with two entries. The first entry contains the actual training images. This is a numpy ndarray with 50,000 entries. Each entry is, in turn, a numpy ndarray with 784 values, representing the 28 * 28 = 784 pixels in a single MNIST image. The second entry in the ``training_data`` tuple is a numpy ndarray containing 50,000 entries. Those entries are just the digit values (0...9) for the corresponding images contained in the first entry of the tuple. The ``validation_data`` and ``test_data`` are similar, except each contains only 10,000 images. This is a nice data format, but for use in neural networks it's helpful to modify the format of the ``training_data`` a little. That's done in the wrapper function ``load_data_wrapper()``, see below. """ f = gzip.open('data/mnist.pkl.gz', 'rb') training_data, validation_data, test_data = pickle.load(f, encoding='latin1') f.close() return (training_data, validation_data, test_data) def load_data_wrapper(): """Return a tuple containing ``(training_data, validation_data, test_data)``. Based on ``load_data``, but the format is more convenient for use in our implementation of neural networks. In particular, ``training_data`` is a list containing 50,000 2-tuples ``(x, y)``. ``x`` is a 784-dimensional numpy.ndarray containing the input image. ``y`` is a 10-dimensional numpy.ndarray representing the unit vector corresponding to the correct digit for ``x``. ``validation_data`` and ``test_data`` are lists containing 10,000 2-tuples ``(x, y)``. In each case, ``x`` is a 784-dimensional numpy.ndarry containing the input image, and ``y`` is the corresponding classification, i.e., the digit values (integers) corresponding to ``x``. Obviously, this means we're using slightly different formats for the training data and the validation / test data. These formats turn out to be the most convenient for use in our neural network code.""" tr_d, va_d, te_d = load_data() training_inputs = [np.reshape(x, (784, 1)) for x in tr_d[0]] training_results = [vectorized_result(y) for y in tr_d[1]] training_data = list(zip(training_inputs, training_results)) validation_inputs = [np.reshape(x, (784, 1)) for x in va_d[0]] validation_data = list(zip(validation_inputs, va_d[1])) test_inputs = [np.reshape(x, (784, 1)) for x in te_d[0]] test_data = list(zip(test_inputs, te_d[1])) return (training_data, validation_data, test_data) def vectorized_result(j): """Return a 10-dimensional unit vector with a 1.0 in the jth position and zeroes elsewhere. This is used to convert a digit (0...9) into a corresponding desired output from the neural network.""" e = np.zeros((10, 1)) e[j] = 1.0 return e

Note also that the biases and weights are stored as lists of Numpy matrices. So, for example, net.weights[1] is a Numpy matrix storing the weights connecting the second and third layers of neurons. (It's not the first and second layers, since Python's list indexing starts at 0.) Since net.weights[1] is rather verbose, let's just denote that matrix $w$. It's a matrix such that $w_{jk}$ is the weight for the connection between the $k^{th}$ neuron in the second layer and the $j^{th}$ neuron in the third layer. This ordering of the $j$ and $k$ indices may seem strange. The big advantage of using this ordering is that it means that the vector of activations of the third layer of neurons is: $a'=\mathrm{sigmoid}(wa+b)$

There's quite a bit going on in this equation, so let's unpack it piece by piece. $a$ is the vector of activations of the second layer of neurons. To obtain $a'$ we multiply $a$ by the weight matrix $w$, and add the vector $b$ of biases. We then apply the function sigmoid elementwise to every entry in the vector $wa+b$.
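As a toy shape check with made-up numbers, consider a second layer with 2 neurons feeding a third layer with 3 neurons:

import numpy as np

def sigmoid(z):
    """The sigmoid function (repeated here so the snippet is self-contained)."""
    return 1.0/(1.0+np.exp(-z))

w = np.array([[0.1, 0.2],
              [0.3, 0.4],
              [0.5, 0.6]])           # 3x2: row j holds the weights into neuron j
a = np.array([[1.0], [2.0]])         # activations of the 2 second-layer neurons
b = np.array([[0.0], [0.1], [0.2]])  # one bias per third-layer neuron
a_prime = sigmoid(np.dot(w, a) + b)  # 3x1 vector of third-layer activations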

Of course, the main thing we want our Network objects to do is to learn. To that end we'll give them an SGD method which implements stochastic gradient descent.

Most of the work is done by the line

delta_nabla_b, delta_nabla_w = self.backprop(x, y)

This invokes something called the backpropagation algorithm, which is a fast way of computing the gradient of the cost function. So update_mini_batch works simply by computing these gradients for every training example in the mini_batch, and then updating self.weights and self.biases appropriately.
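Concretely, for a mini-batch of $m$ training examples the update implemented in update_mini_batch is

$$w^l \rightarrow w^l - \frac{\eta}{m}\sum_{x}\frac{\partial C_x}{\partial w^l}, \qquad b^l \rightarrow b^l - \frac{\eta}{m}\sum_{x}\frac{\partial C_x}{\partial b^l},$$

where the sum runs over the training examples $x$ in the mini-batch and $\eta$ is the learning rate (eta in the code).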

The activation $a^l_j$ of the $j^{th}$ neuron in the $l^{th}$ layer is related to the activations in the $(l-1)^{th}$ layer by the equation $a^l_j=\mathrm{sigmoid}\left(\sum_k w^l_{jk} a^{l-1}_k+b^l_j\right)$, where the sum is over all neurons $k$ in the $(l-1)^{th}$ layer. To rewrite this expression in matrix form we define a weight matrix $w^l$ for each layer $l$. The entries of the weight matrix $w^l$ are just the weights connecting to the $l^{th}$ layer of neurons, that is, the entry in the $j^{th}$ row and $k^{th}$ column is $w^l_{jk}$. Similarly, for each layer $l$ we define a bias vector $b^l$. You can probably guess how this works: the components of the bias vector are just the values $b^l_j$, one component for each neuron in the $l^{th}$ layer. And finally, we define an activation vector $a^l$ whose components are the activations $a^l_j$.

With these notations in mind, these equations can be rewritten in the beautiful and compact vectorized form $a^l=\mathrm{sigmoid}(w^l a^{l-1}+b^l)$. This expression gives us a much more global way of thinking about how the activations in one layer relate to activations in the previous layer: we just apply the weight matrix to the activations, then add the bias vector, and finally apply the sigmoid function.

Apart from self.backprop the program is self-explanatory - all the heavy lifting is done in self.SGD and self.update_mini_batch, which we've already discussed. The self.backprop method makes use of a few extra functions to help in computing the gradient, namely sigmoid_prime, which computes the derivative of the sigmoid function, and self.cost_derivative. You can get the gist of these (and perhaps the details) just by looking at the code and documentation strings. Note that while the program appears lengthy, much of the code is documentation strings intended to make the code easy to understand. In fact, the program contains just 74 lines of non-whitespace, non-comment code.

""" network.py ~~~~~~~~~~ A module to implement the stochastic gradient descent learning algorithm for a feedforward neural network. Gradients are calculated using backpropagation. Note that I have focused on making the code simple, easily readable, and easily modifiable. It is not optimized, and omits many desirable features. """ #### Libraries # Standard library import random # Third-party libraries import numpy as np class Network(object): def __init__(self, sizes): """The list ``sizes`` contains the number of neurons in the respective layers of the network. For example, if the list was [2, 3, 1] then it would be a three-layer network, with the first layer containing 2 neurons, the second layer 3 neurons, and the third layer 1 neuron. The biases and weights for the network are initialized randomly, using a Gaussian distribution with mean 0, and variance 1. Note that the first layer is assumed to be an input layer, and by convention we won't set any biases for those neurons, since biases are only ever used in computing the outputs from later layers.""" self.num_layers = len(sizes) self.sizes = sizes self.biases = [np.random.randn(y, 1) for y in sizes[1:]] self.weights = [np.random.randn(y, x) for x, y in zip(sizes[:-1], sizes[1:])] def feedforward(self, a): """Return the output of the network if ``a`` is input.""" for b, w in zip(self.biases, self.weights): a = sigmoid(np.dot(w, a)+b) return a def SGD(self, training_data, epochs, mini_batch_size, eta, test_data=None): """Train the neural network using mini-batch stochastic gradient descent. The ``training_data`` is a list of tuples ``(x, y)`` representing the training inputs and the desired outputs. The other non-optional parameters are self-explanatory. If ``test_data`` is provided then the network will be evaluated against the test data after each epoch, and partial progress printed out. This is useful for tracking progress, but slows things down substantially.""" if test_data: n_test = len(test_data) n = len(training_data) for j in range(epochs): random.shuffle(training_data) mini_batches = [ training_data[k:k+mini_batch_size] for k in range(0, n, mini_batch_size)] for mini_batch in mini_batches: self.update_mini_batch(mini_batch, eta) if test_data: print ("Epoch {0}: {1} / {2}".format( j, self.evaluate(test_data), n_test)) else: print ("Epoch {0} complete".format(j)) def update_mini_batch(self, mini_batch, eta): """Update the network's weights and biases by applying gradient descent using backpropagation to a single mini batch. The ``mini_batch`` is a list of tuples ``(x, y)``, and ``eta`` is the learning rate.""" nabla_b = [np.zeros(b.shape) for b in self.biases] nabla_w = [np.zeros(w.shape) for w in self.weights] for x, y in mini_batch: delta_nabla_b, delta_nabla_w = self.backprop(x, y) nabla_b = [nb+dnb for nb, dnb in zip(nabla_b, delta_nabla_b)] nabla_w = [nw+dnw for nw, dnw in zip(nabla_w, delta_nabla_w)] self.weights = [w-(eta/len(mini_batch))*nw for w, nw in zip(self.weights, nabla_w)] self.biases = [b-(eta/len(mini_batch))*nb for b, nb in zip(self.biases, nabla_b)] def backprop(self, x, y): """Return a tuple ``(nabla_b, nabla_w)`` representing the gradient for the cost function C_x. 
``nabla_b`` and ``nabla_w`` are layer-by-layer lists of numpy arrays, similar to ``self.biases`` and ``self.weights``.""" nabla_b = [np.zeros(b.shape) for b in self.biases] nabla_w = [np.zeros(w.shape) for w in self.weights] # feedforward activation = x activations = [x] # list to store all the activations, layer by layer zs = [] # list to store all the z vectors, layer by layer for b, w in zip(self.biases, self.weights): z = np.dot(w, activation)+b zs.append(z) activation = sigmoid(z) activations.append(activation) # backward pass delta = self.cost_derivative(activations[-1], y) * \ sigmoid_prime(zs[-1]) nabla_b[-1] = delta nabla_w[-1] = np.dot(delta, activations[-2].transpose()) # Note that the variable l in the loop below is used a little # differently to the notation in Chapter 2 of the book. Here, # l = 1 means the last layer of neurons, l = 2 is the # second-last layer, and so on. It's a renumbering of the # scheme in the book, used here to take advantage of the fact # that Python can use negative indices in lists. for l in range(2, self.num_layers): z = zs[-l] sp = sigmoid_prime(z) delta = np.dot(self.weights[-l+1].transpose(), delta) * sp nabla_b[-l] = delta nabla_w[-l] = np.dot(delta, activations[-l-1].transpose()) return (nabla_b, nabla_w) def evaluate(self, test_data): """Return the number of test inputs for which the neural network outputs the correct result. Note that the neural network's output is assumed to be the index of whichever neuron in the final layer has the highest activation.""" test_results = [(np.argmax(self.feedforward(x)), y) for (x, y) in test_data] return sum(int(x == y) for (x, y) in test_results) def cost_derivative(self, output_activations, y): """Return the vector of partial derivatives \partial C_x / \partial a for the output activations.""" return (output_activations-y) #### Miscellaneous functions def sigmoid(z): """The sigmoid function.""" return 1.0/(1.0+np.exp(-z)) def sigmoid_prime(z): """Derivative of the sigmoid function.""" return sigmoid(z)*(1-sigmoid(z))

We first load the MNIST data:

training_data, validation_data, test_data = load_data_wrapper()

After loading the MNIST data, we'll set up a Network with 30 hidden neurons.

net = Network([784, 30, 10])

Finally, we'll use stochastic gradient descent to learn from the MNIST training_data over 30 epochs, with a mini-batch size of 10, and a learning rate of $\eta=3.0$:

net.SGD(training_data, 30, 10, 3.0, test_data=test_data)
Epoch 0: 9125 / 10000
Epoch 1: 9201 / 10000
Epoch 2: 9285 / 10000
Epoch 3: 9317 / 10000
Epoch 4: 9299 / 10000
Epoch 5: 9388 / 10000
Epoch 6: 9394 / 10000
Epoch 7: 9397 / 10000
Epoch 8: 9425 / 10000
Epoch 9: 9395 / 10000
Epoch 10: 9408 / 10000
Epoch 11: 9440 / 10000
Epoch 12: 9448 / 10000
Epoch 13: 9460 / 10000
Epoch 14: 9445 / 10000
Epoch 15: 9459 / 10000
Epoch 16: 9467 / 10000
Epoch 17: 9466 / 10000
Epoch 18: 9434 / 10000
Epoch 19: 9450 / 10000
Epoch 20: 9463 / 10000
Epoch 21: 9472 / 10000
Epoch 22: 9465 / 10000
Epoch 23: 9482 / 10000
Epoch 24: 9487 / 10000
Epoch 25: 9458 / 10000
Epoch 26: 9481 / 10000
Epoch 27: 9479 / 10000
Epoch 28: 9476 / 10000
Epoch 29: 9479 / 10000

Challenge 14.3

Try creating a network with just two layers - an input and an output layer, no hidden layer - with 784 and 10 neurons, respectively. Train the network using stochastic gradient descent. What classification accuracy can you achieve?
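For example, such a network can be created and trained with the same interface as above; the hyperparameters below are simply the ones used earlier, not necessarily optimal for this architecture:

net = Network([784, 10])  # no hidden layer: 784 inputs connected directly to 10 outputs
net.SGD(training_data, 30, 10, 3.0, test_data=test_data)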

Number of hidden layers

Suppose that we want to approximate a set of functions to a given accuracy. How many hidden layers do we need? The answer: at most two, with arbitrary accuracy attainable given enough units per layer. It has also been shown that a single hidden layer is enough to approximate any continuous function. Of course, these results do not tell us how many units we would need; that is not known in general, and the number may grow exponentially with the number of input units.