<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>MACHINE LEARNING | Dezhi Yu</title><link>https://halfrost.me/categories/machine-learning/</link><atom:link href="https://halfrost.me/categories/machine-learning/index.xml" rel="self" type="application/rss+xml"/><description>MACHINE LEARNING</description><generator>HugoBlox Kit (https://hugoblox.com)</generator><language>en-us</language><lastBuildDate>Sun, 21 Oct 2018 10:05:36 +0000</lastBuildDate><image><url>https://halfrost.me/media/favicon_hu_4db6119fa52e8e17.png</url><title>MACHINE LEARNING</title><link>https://halfrost.me/categories/machine-learning/</link></image><item><title>How to understand gradient descent?</title><link>https://halfrost.me/post/how-to-understand-gradient-descent/</link><pubDate>Sun, 21 Oct 2018 10:05:36 +0000</pubDate><guid>https://halfrost.me/post/how-to-understand-gradient-descent/</guid><description>&lt;p&gt;Gradient descent is a first-order iterative optimization algorithm for finding a local minimum of a differentiable function. To find a local minimum of a function using gradient descent, we take steps proportional to the negative of the gradient (or approximate gradient) of the function at the current point. But if we instead take steps proportional to the positive of the gradient, we approach a local maximum of that function; the procedure is then known as gradient ascent. Gradient descent is generally attributed to Cauchy, who first suggested it in 1847, but its convergence properties for non-linear optimization problems were first studied by Haskell Curry in 1944.&lt;/p&gt;
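&lt;p&gt;The update rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names gradient_descent, grad_f, step_size and num_steps, and the test function f(x) = (x - 3)**2, are hypothetical choices and do not come from the article.&lt;/p&gt;

```python
def gradient_descent(grad, x0, step_size=0.1, num_steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(num_steps):
        # Take a step proportional to the negative of the gradient at x.
        x = x - step_size * grad(x)
    return x


def grad_f(x):
    # Derivative of the illustrative function f(x) = (x - 3)**2,
    # whose only minimum is at x = 3 (an assumption for this sketch).
    return 2.0 * (x - 3.0)


x_min = gradient_descent(grad_f, x0=0.0)
print(x_min)
```

&lt;p&gt;With these assumed settings the iterate moves toward the minimizer x = 3. Replacing the minus sign in the update with a plus would give gradient ascent, which climbs toward a local maximum instead.&lt;/p&gt;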
&lt;p&gt;&lt;a href="https://halfrost.me/post/how-to-understand-gradient-descent/"&gt;Click here&lt;/a&gt; to read the full article.&lt;/p&gt;</description></item></channel></rss>