The L2 update rule is derived from the derivative of the loss function with
respect to the model weights: an L2-regularized loss function contains an
additional additive term involving the weights. This paper provides some
useful mathematical background:
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.58.7377
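
As a rough sketch of where that extra term ends up in the update (the symbols
are my own notation, and the 1/2 factor on the penalty is a common convention
I am assuming here, not necessarily the scaling the code uses): write L(w) for
the unregularized loss, \lambda for the regularization parameter, and \eta_t
for the step size at iteration t.

```latex
\begin{aligned}
J(w) &= L(w) + \frac{\lambda}{2}\,\lVert w \rVert_2^2 \\
\nabla J(w) &= \nabla L(w) + \lambda w \\
w_{t+1} &= w_t - \eta_t \bigl( \nabla L(w_t) + \lambda w_t \bigr)
         = (1 - \eta_t \lambda)\, w_t - \eta_t \nabla L(w_t)
\end{aligned}
```

Setting \lambda = 0 recovers the plain step w_{t+1} = w_t - \eta_t \nabla L(w_t),
so under these assumptions the only difference is that the old weights are
shrunk by the factor (1 - \eta_t \lambda) before the gradient step is applied.
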
The code that computes the new L2 weight is here:
https://github.com/apache/incubatorspark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/optimization/Updater.scala#L90
The compute function calculates the new weights from the current weights
and the gradient computed at each step. Contrast it with the code in the
SimpleUpdater class to get a sense for how the regularization parameter is
incorporated; it's fairly simple.
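
To make the contrast concrete, here is a minimal sketch in Python rather than
the project's Scala. It is not the code in Updater.scala: the function names
are hypothetical, the penalty is assumed to be (reg_param / 2) * ||w||^2, and
the step size is passed in directly instead of following any particular decay
schedule.

```python
def simple_update(weights, gradient, step_size):
    """Unregularized step: w <- w - step_size * gradient."""
    return [w - step_size * g for w, g in zip(weights, gradient)]


def l2_update(weights, gradient, step_size, reg_param):
    """Step on loss + (reg_param / 2) * ||w||^2 (assumed penalty scaling).

    The penalty contributes reg_param * w to the gradient, which amounts to
    shrinking the old weights before applying the usual gradient step.
    """
    shrink = 1.0 - step_size * reg_param
    return [shrink * w - step_size * g for w, g in zip(weights, gradient)]


def l2_penalty(weights, reg_param):
    """Value of the assumed regularization term for the given weights."""
    return 0.5 * reg_param * sum(w * w for w in weights)


if __name__ == "__main__":
    w = [1.0, -2.0]
    grad = [0.5, 0.5]
    print(simple_update(w, grad, step_size=0.1))
    print(l2_update(w, grad, step_size=0.1, reg_param=1.0))
```

With reg_param set to 0 the two functions return the same weights, which is
the sense in which the regularized updater is a small change to the simple one.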
In general, though, I agree it makes sense to include a discussion of the
algorithm and a reference to the specific version we implement in the
scaladoc.
- Evan
On Thu, Jan 9, 2014 at 10:49 AM, Walrus theCat <walrusthecat@gmail.com> wrote:
> No, I'm not, and I appreciate the comment. What I'm looking for is a
> specific mathematical formula that I can map to the source code.
> Personally, specifically, I'd like to see how the loss function gets
> embedded into the w (gradient), in the case of the regularized and
> unregularized operation.
>
> Looking through the source, the "loss history" makes sense to me, but I
> can't see how that translates into the effect on the gradient.
>
> On Thu, Jan 9, 2014 at 10:39 AM, Sean Owen <srowen@gmail.com> wrote:
>
>> L2 regularization just means "regularizing by penalizing parameters
>> whose L2 norm is large", and L2 norm just means squared length. It's
>> not something you would write an ML paper on any more than what the
>> vector dot product is. Are you asking something else?
>>
>> On Thu, Jan 9, 2014 at 6:19 PM, Walrus theCat <walrusthecat@gmail.com>
>> wrote:
>> > Thanks Christopher,
>> >
>> > I wanted to know if there was a specific paper this particular codebase
>> > was based on. For instance, Weka cites papers in their documentation.
>> >
>> > On Wed, Jan 8, 2014 at 7:10 PM, Christopher Nguyen <ctn@adatao.com>
>> > wrote:
>> >>
>> >> Walrus, given the question, this may be a good place for you to start.
>> >> There's some good discussion there as well as links to papers.
>> >>
>> http://www.quora.com/MachineLearning/WhatisthedifferencebetweenL1andL2regularization
>> >>
>> >>
>> >> On Jan 8, 2014 2:24 PM, "Walrus theCat" <walrusthecat@gmail.com>
>> >> wrote:
>> >>>
>> >>> Hi,
>> >>>
>> >>> Can someone point me to the paper that algorithm is based on?
>> >>>
>> >>> Thanks
