[ML.1] AML - Reinforcement Learning assignment typos
I would like to start a new blog series - Machine Learning [ML], as I am slowly re-qualifying from pure science to data science.
Recently I finished the fourth course on Coursera in a row from Advanced Machine Learning specialization - Practical Reinforcement Learning.
I will copy my review of the course (mark 5/5):
"This is my fourth AML course, and for now I would say it is the best one. It connects lectures and practice in the best way. On the other hand, there are mistakes all around, as it is beta-version. In my opinion, it is not fair to put the beta-version course into paid specialization."
Yes, you read it right - it is the beta-version. Hence, a few practical tasks (or assignments), had some mistakes where they should not be, i.e. in the part of the code that should not be changed by the student. I will list all of them here and try to make somebodies life easier :)
Week 1 / Assignment 1 - "OpenAI Gym":
There is a wrong hint:
Hint: your action at each step should depend either on t or on s.
One should only use t!!!
Week 6 / Assignment 8 - "Bandirs & exploration":
Change 1 - class BernoulliBandit:
class BernoulliBandit:
def pull(self, action):
line
if np.random.random() > self._probs[action]:
changed to:
if np.any(np.random.random() > self._probs[action]):
i.e. the condition is put under np.any().
Change 2 - def plot_regret:
def plot_regret(scores):
line
plt.legend([agent.name for agent in scores])
changed to
plt.legend([agent for agent in scores])
i.e. agent.name is changed to solely agent.
Change 3 - submission:
Instead of submit_bandits, make new function submit_bandits2 in submit.py:
def submit_bandits2(agents, scores, email, token):
epsilon_greedy_agent = None
ucb_agent = None
thompson_sampling_agent = None
for agent in agents:
if "EpsilonGreedyAgent" in agent.name:
epsilon_greedy_agent = agent.name
if "UCBAgent" in agent.name:
ucb_agent = agent.name
if "ThompsonSamplingAgent" in agent.name:
thompson_sampling_agent = agent.name
assert epsilon_greedy_agent is not None
assert ucb_agent is not None
assert thompson_sampling_agent is not None
grader = grading.Grader("VL9tBt7zEeewFg5wtLgZkA")
grader.set_answer("YQLYE", (int(scores[epsilon_greedy_agent][int(1e4) - 1]) - int(scores[epsilon_greedy_agent[int(5e3) - 1])))
grader.set_answer("FCHOZ", (int(scores[epsilon_greedy_agent][int(1e4) - 1]) - int(scores[ucb_agent][int(1e4) - 1])))
grader.set_answer("0JWHl", (int(scores[epsilon_greedy_agent][int(5e3) - 1]) - int(scores[ucb_agent][int(5e3) - 1])))
grader.set_answer("4rH5M", (int(scores[epsilon_greedy_agent][int(1e4) - 1]) - int(scores[thompson_sampling_agent][int(1e4) - 1])))
grader.set_answer("TvOqm", (int(scores[epsilon_greedy_agent][int(5e3) - 1]) - int(scores[thompson_sampling_agent][int(5e3) - 1])))
grader.submit(email, token)
Week 6 / Assignment 9 - "MCTS":
class WithSnapshots(Wrapper):
def load_snapshot(self,snapshot):
line
self.render(close=True) # close popup windows since we can't load into them
changed to
self.close()
I hope somebody will find this helpful and please let me know if you think that something else should be here.
If you are new to SteemIT and decide to join based on this article - let me know using any media you prefer.