Back Prop Algorithm - What remains constant in derivatives
The following are a set of notes on the back-prop algorithm. I happen to work in dynamical systems where one needs to compute the stability of periodic orbits. One could argue a dynamical system is essentially a rule for iteration such as the one below.
\[x_{n+1} = f(x_n)\]
Interestingly, there are systems which come back to the same point after some iterations. Let's say the following is true:
\[x_{n+4} = x_{n} = f(f(f(f(x_n)))) = f^4(x_n)\]
One computes the stability of a discrete system such as the one above from its derivative. If the magnitude of the derivative is less than 1, the system is stable; if it is greater than 1, the system is unstable.
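As a small illustration of the stability test, here is a minimal sketch for the simplest case, a fixed point (an orbit of period 1). The logistic map \(f(x) = r x (1 - x)\) and the parameter values are my own choice for the example and are not part of the notes above; stability is taken to mean that the magnitude of the derivative at the fixed point is below 1.

```python
# Assumption: the logistic map is used purely as an illustrative choice of f.
def logistic(x, r):
    """One iteration of the map, x_{n+1} = f(x_n)."""
    return r * x * (1.0 - x)


def logistic_prime(x, r):
    """Derivative f'(x) of the logistic map."""
    return r * (1.0 - 2.0 * x)


def fixed_point_is_stable(r):
    """Check |f'(x*)| < 1 at the nonzero fixed point x* = 1 - 1/r."""
    x_star = 1.0 - 1.0 / r
    return abs(logistic_prime(x_star, r)) < 1.0


stable_case = fixed_point_is_stable(2.5)    # f'(x*) = 2 - r = -0.5
unstable_case = fixed_point_is_stable(3.2)  # f'(x*) = 2 - r = -1.2
```

Note that the derivative is negative in both cases, so comparing it to 1 without taking the magnitude would label both as stable.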
So to compute the stability of the \(f^4(.)\) function one needs to find the following:
\[\frac{d}{dx_n}(f^4(x_n)) = \frac{d}{dx_n}(x_{n+4}) = \frac{dx_{n+1}}{dx_n} \cdot \frac{dx_{n+2}}{dx_{n+1}} \cdot \frac{dx_{n+3}}{dx_{n+2}} \cdot \frac{dx_{n+4}}{dx_{n+3}}\]
This is essentially the backpropagation update for the function \(f^4(.)\) with respect to the previous layer outputs \(x_n, x_{n+1}, \ldots\). While this is obvious to many, it helps me understand what is kept constant when one takes a layer's derivative.
That is as follows,
\[\frac{d}{dx_n}(x_{n+4}) = \frac{dx_{n+1}}{dx_n} \cdot \frac{dx_{n+2}}{dx_{n+1}} \cdot \frac{dx_{n+3}}{dx_{n+2}} \cdot \frac{dx_{n+4}}{dx_{n+3}} = f'(x_n) \cdot f'(x_{n+1}) \cdot f'(x_{n+2}) \cdot f'(x_{n+3})\]
Therefore, instead of differentiating the overall function \(f^4(.)\) directly, we can take the derivative of the function \(f(.)\), evaluate it at each iterate, and multiply the results.
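The product of single-step derivatives can be checked numerically. The sketch below assumes the logistic map with \(r = 3.9\) as a stand-in for \(f\) (my choice, not something fixed by these notes), stores the iterates in a forward pass, multiplies \(f'\) evaluated at those stored iterates, and compares the result against a central finite difference of the composed function. The helper names are hypothetical.

```python
# Assumption: f is the logistic map with r = 3.9, chosen only for illustration.
R = 3.9


def f(x):
    """One application of the map."""
    return R * x * (1.0 - x)


def f_prime(x):
    """Derivative of a single application of the map."""
    return R * (1.0 - 2.0 * x)


def iterate(x0, steps):
    """Forward pass: return [x_n, x_{n+1}, ..., x_{n+steps}]."""
    orbit = [x0]
    for _ in range(steps):
        orbit.append(f(orbit[-1]))
    return orbit


def derivative_of_composition(x0, steps):
    """Chain rule: multiply f' evaluated at each stored iterate except the last."""
    orbit = iterate(x0, steps)
    product = 1.0
    for x in orbit[:-1]:
        product *= f_prime(x)
    return product


def central_difference(x0, steps, h=1e-6):
    """Numerical derivative of the composed map, used only as a cross-check."""
    return (iterate(x0 + h, steps)[-1] - iterate(x0 - h, steps)[-1]) / (2.0 * h)


chain_rule_value = derivative_of_composition(0.3, 4)
finite_difference_value = central_difference(0.3, 4)
```

The two values should agree up to the finite-difference error. Each factor uses only \(f'\) and an iterate that the forward pass has already produced, so the composed function never has to be differentiated as a whole.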
Consequently, in general, one only needs \(f'(.)\): evaluate it at each of the \(n\) iterates and multiply the results to get the derivative of the \(n^{th}\) application of the function. What is kept constant is also clear here: each factor \(f'(x_{n+k})\) is the derivative of a single application of \(f\), evaluated at the iterate \(x_{n+k}\), which is held fixed while that factor is taken. Cheers.
Reference: Ott, E. (2002). Chaos in Dynamical Systems (2nd ed.). Cambridge: Cambridge University Press. doi:10.1017/CBO9780511803260