from mxnet import gluon, np, npx
npx.set_np()

Most real-world recommender data is implicit: clicks, watches, purchases. There are no explicit ratings, and the unobserved (user, item) pairs are a mix of “didn’t like it” and “haven’t seen it yet”. Fitting a 0/1 target with mean squared error therefore treats every unseen item as a confirmed dislike, which the data does not support.
A better framing is personalized ranking: given an observed positive (user, i), the model should rank i above sampled unobserved items. Treating every unobserved pair as a literal negative target is usually misaligned with ranking because exposure is missing-not-at-random.
Two pairwise losses are commonly used for this: the Bayesian personalized ranking (BPR) loss and the hinge loss. Both turn implicit feedback into pairwise comparisons; the model learns to score positives above negatives.
For each user u, let I_u^+ be observed positives (clicked, watched, bought) and sample negatives j \notin I_u^+ from the item catalog. Training examples are triples:
D = \{(u,i,j): i\in I_u^+,\, j\notin I_u^+\}.
The model never needs an absolute rating target. It only needs the score gap
\Delta_{uij} = \hat r_{ui} - \hat r_{uj}.
Large positive gaps mean the positive item outranks the sampled negative. The sampled negative is a training contrast, not proof that the user would dislike the item.
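The triple construction described above can be sketched in plain Python. This is only a minimal illustration, assuming uniform negative sampling over the whole catalog and items numbered 0 to num_items - 1; the helper name `sample_triples`, the dict-of-sets layout, and the `num_neg` parameter are hypothetical and do not come from the original code.

```python
import random


def sample_triples(positives, num_items, num_neg=1, seed=0):
    """Build (u, i, j) triples with uniformly sampled negatives.

    positives: dict mapping user id -> set of observed item ids (I_u^+).
    num_items: catalog size; items are assumed to be 0..num_items-1.
    num_neg:   negatives drawn per observed positive (hypothetical knob).
    """
    rng = random.Random(seed)
    triples = []
    for u, pos in positives.items():
        # Candidate negatives: every catalog item the user has not interacted with.
        candidates = [j for j in range(num_items) if j not in pos]
        if not candidates:
            continue  # the user interacted with everything; nothing to contrast
        for i in sorted(pos):
            for _ in range(num_neg):
                triples.append((u, i, rng.choice(candidates)))
    return triples


# Toy data: user 0 interacted with items 1 and 3, user 1 with item 0.
positives = {0: {1, 3}, 1: {0}}
triples = sample_triples(positives, num_items=5, num_neg=2)
```

Each positive yields `num_neg` triples, so the toy data produces six in total. Uniform sampling is the simplest choice; other samplers would slot into the same loop.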
BPR loss: for each positive (u, i), sample negatives j and minimize the negative log-sigmoid of the score gap:
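class BPRLoss(gluon.loss.Loss):
    def __init__(self, weight=None, batch_axis=0, **kwargs):
        # Forward the caller's arguments instead of hard-coding the defaults.
        super(BPRLoss, self).__init__(weight=weight, batch_axis=batch_axis, **kwargs)

    def forward(self, positive, negative):
        # Score gap between each positive item and its sampled negative.
        distances = positive - negative
        # Negative log-sigmoid of the gap, summed over the batch axis.
        loss = - np.sum(np.log(npx.sigmoid(distances)), 0, keepdims=True)
        return loss

Hinge loss is the hard-margin alternative, equivalent to a max-margin classifier over score differences: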
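class HingeLossbRec(gluon.loss.Loss):
    def __init__(self, weight=None, batch_axis=0, **kwargs):
        # Forward the caller's arguments instead of hard-coding the defaults.
        super(HingeLossbRec, self).__init__(weight=weight, batch_axis=batch_axis,
                                            **kwargs)

    def forward(self, positive, negative, margin=1):
        # Score gap between each positive item and its sampled negative.
        distances = positive - negative
        # Penalize only the pairs whose gap falls short of the margin.
        loss = np.sum(np.maximum(- distances + margin, 0))
        return loss

Both losses reward positive margins, but their gradients behave differently: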
\ell_\textrm{BPR}(\Delta) = -\log \sigma(\Delta), \qquad \ell_\textrm{hinge}(\Delta) = \max(0, m-\Delta).
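To make the gradient difference concrete, the two losses can be differentiated with respect to the gap: the BPR derivative is -σ(-Δ), while the hinge subgradient is -1 when Δ < m and 0 otherwise. The sketch below evaluates both in plain Python, without MXNet. The function names are hypothetical, and the BPR loss is written in the algebraically equivalent softplus form, an assumption made here for numerical stability and not how the class above computes it.

```python
import math


def bpr_loss(delta):
    # -log(sigmoid(delta)) rewritten as softplus(-delta) to avoid overflow.
    return max(-delta, 0.0) + math.log1p(math.exp(-abs(delta)))


def bpr_grad(delta):
    # d/d(delta) of -log(sigmoid(delta)) = -sigmoid(-delta) = -1 / (1 + exp(delta)).
    if delta >= 0:
        e = math.exp(-delta)
        return -e / (1.0 + e)
    return -1.0 / (1.0 + math.exp(delta))


def hinge_loss(delta, margin=1.0):
    return max(0.0, margin - delta)


def hinge_grad(delta, margin=1.0):
    # Subgradient: -1 while the margin is violated, 0 once it is satisfied.
    return -1.0 if delta < margin else 0.0


for delta in (-2.0, 0.0, 0.5, 2.0, 5.0):
    print(f"delta={delta:+.1f}  "
          f"bpr={bpr_loss(delta):.4f} (grad {bpr_grad(delta):+.4f})  "
          f"hinge={hinge_loss(delta):.4f} (grad {hinge_grad(delta):+.4f})")
```

The BPR gradient shrinks smoothly toward zero as the gap grows but never reaches it, so even well-ranked pairs keep contributing a small update. The hinge gradient is constant while the margin is violated and exactly zero once Δ ≥ m, so satisfied pairs drop out of training entirely.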