I can’t criticize that—it’s incredible.
It must have always been the plan, and we can expect this ‘limit’ to keep increasing.
Remember that the hidden chains of thought (CoTs), which are generated at your expense but not available to you, are the real gold here (for further reinforcement learning).
This is how the model will ‘learn’ to produce increasingly better CoTs. We are now the human annotators for these simulated reasoning chains.
Finally. 50 a week is pretty low. 50 a day is good enough
Fifty a day is quite good, provided it’s not filled with unnecessary content. More would always be better, but this number is a solid starting point. Fifty a week feels restrictive; I hope they find ways to reduce costs so it can be used more freely.
This has made me very very happy today!