## Does regret change when the loss function is dependent on the previous predictions?

In the prediction-with-expert-advice problem (or any online learning problem), the loss of each expert depends on the time $$t$$ and on that expert's advice at that time, $$f_{t}(i)$$. Suppose that in this problem the loss also depends on the algorithm's previous predictions:

$$l_{t}(i) = p_{1} p_{2} \cdots p_{t-1} f_{t}(i),$$

where $$p_{s}$$ denotes the algorithm's prediction at round $$s$$.

Does the upper bound of regret change?
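For context, here is a sketch of the standard setup this question builds on (the notation follows the question; the bound quoted is the classical one for the independent-loss experts problem, not a claim about this variant):

$$\mathrm{Regret}_T \;=\; \sum_{t=1}^{T} l_{t}(p_{t}) \;-\; \min_{i} \sum_{t=1}^{T} l_{t}(i),$$

and with bounded, adversarially chosen but prediction-independent losses, multiplicative weights achieves $$\mathrm{Regret}_T = O(\sqrt{T \log N})$$ over $$N$$ experts. The question is whether this bound survives once $$l_{t}$$ itself depends on $$p_{1}, \dots, p_{t-1}$$.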

## SVM regret using SGD

I was reading the Online Convex Optimization book by Elad Hazan, and in Section 3.4.1 I was wondering how he finds the optimal value of the step size ($$\eta_t$$) using the SGD regret analysis, and also what the regret of solving SVM with SGD is.
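As a concrete illustration, here is a minimal sketch of online projected SGD on the regularized hinge loss, using the step size the standard SGD analysis suggests: with a gradient-norm bound $$G$$ and decision-set diameter $$D$$, choosing $$\eta_t = D/(G\sqrt{t})$$ yields $$O(GD\sqrt{T})$$ regret. The names `lam`, `D`, and the crude bound `G` below are illustrative assumptions, not code from the book:

```python
import numpy as np

def sgd_svm(X, y, lam=0.1, D=10.0):
    """One pass of projected SGD on lam/2 * ||w||^2 + hinge loss.

    Step size eta_t = D / (G * sqrt(t)), as in the standard SGD
    regret analysis (G = gradient bound, D = diameter bound).
    """
    n, d = X.shape
    w = np.zeros(d)
    # crude gradient-norm bound: data norm plus regularizer contribution
    G = np.max(np.linalg.norm(X, axis=1)) + lam * D
    for t in range(1, n + 1):
        x_t, y_t = X[t - 1], y[t - 1]
        eta = D / (G * np.sqrt(t))
        # subgradient of lam/2 * ||w||^2 + max(0, 1 - y_t * <w, x_t>)
        g = lam * w
        if y_t * w.dot(x_t) < 1:
            g = g - y_t * x_t
        w = w - eta * g
        # project back onto the ball of radius D to keep the diameter bound
        norm = np.linalg.norm(w)
        if norm > D:
            w *= D / norm
    return w
```

Averaging the iterates (rather than returning the last one) is what converts the regret bound into a convergence guarantee for the SVM objective.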

## How to implement minimization of maximum regret?

We run a report, but for one of our big clients it is taking too long. There is a loop that iterates over a list and fetches each item from an upstream service one by one. We're making this a bulk operation now, but some items in the list might be invalid, and the upstream service doesn't tell us which ones. I now need to handle the failures silently, remove the invalid items from the list, and send the request again. I figured the best way to do this is to split the list and retry with a recursive function.

Having looked at the two-egg problem, I wish to do the same here: minimize the number of requests made. That problem seems to use some quadratic function; I'm not familiar with those techniques or algorithms. Any help or ideas on how to implement this?
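One common approach for this "batch fails but you don't know which item is bad" situation is binary splitting: retry each half of a failing batch, recursing until single bad items are isolated and dropped. Here is a sketch; `fetch_bulk` is a hypothetical stand-in for the real upstream call, assumed to raise on any batch containing an invalid item:

```python
def fetch_valid(items, fetch_bulk):
    """Return results for all valid items, silently dropping invalid ones.

    If a batch fails, split it in half and retry each half; a failing
    single item is discarded. With k invalid items out of n, this makes
    roughly O(k * log n) requests instead of n one-by-one requests.
    """
    if not items:
        return []
    try:
        return fetch_bulk(items)
    except ValueError:          # assumed failure signal from upstream
        if len(items) == 1:
            return []           # isolated invalid item: drop it
        mid = len(items) // 2
        return (fetch_valid(items[:mid], fetch_bulk)
                + fetch_valid(items[mid:], fetch_bulk))
```

When invalid items are rare (the usual case), almost every batch succeeds on the first try, so the recursion only pays the logarithmic cost along the paths leading to bad items.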

## Why is the objective in a multi-armed bandit problem to minimize (cumulative) regret?

In a multi-armed bandit, the objective is to maximize the expected cumulative reward. This objective is usually (equivalently?) stated in terms of expected cumulative regret.

Question: Why not just deal with the reward? Why formulate the objective in terms of regret?
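One way to see the relationship concretely: regret is just the gap between the best fixed arm's expected total and what the policy actually collected, so maximizing reward and minimizing regret differ by the constant $$T\mu^{*}$$. The sketch below runs a simple epsilon-greedy policy and reports both quantities; the arm means and policy are illustrative assumptions, not from the question:

```python
import numpy as np

def run_bandit(means, T, eps=0.1, seed=0):
    """Epsilon-greedy on Gaussian arms; returns (reward, regret)."""
    rng = np.random.default_rng(seed)
    k = len(means)
    counts = np.zeros(k)
    estimates = np.zeros(k)
    total_reward = 0.0
    for t in range(T):
        if t < k or rng.random() < eps:
            a = int(rng.integers(k))        # explore a random arm
        else:
            a = int(np.argmax(estimates))   # exploit current estimates
        r = rng.normal(means[a], 1.0)       # unit-variance Gaussian reward
        counts[a] += 1
        estimates[a] += (r - estimates[a]) / counts[a]  # running mean
        total_reward += r
    # regret = best fixed arm's expected total minus what we collected
    regret = T * max(means) - total_reward
    return total_reward, regret
```

Note that total reward grows linearly in $$T$$ even for a poor policy, while regret stays near zero for a good one, which is why regret is the more informative yardstick.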