This repository has been archived by the owner on Jul 27, 2023. It is now read-only.

Total epoch rewards positive, but total money less? #114

Open
windowshopr opened this issue Dec 25, 2021 · 1 comment

Comments

@windowshopr

Anybody come across these kinds of results?

image

This run starts with a balance of 10,000, which is the default. The total reward is positive, suggesting the agent is profiting, but the total money is less than the initial 10,000. Why might that be?

Looking at the logic for both, I can't figure out why that's happening. When we "buy" a stock:

if action == 1 and starting_money >= self.trend[t] and t < (len(self.trend) - self.half_window):
    inventory.append(self.trend[t])
    starting_money -= self.trend[t]

...and when we "sell" a stock:

elif action == 2 and len(inventory):
    bought_price = inventory.pop(0)
    total_profit += self.trend[t] - bought_price
    starting_money += self.trend[t]

So the total profit is just each trade's profit, which can be positive or negative, added to a running total. The starting money is reduced on each buy and increased on each sell, so the two SHOULD match, but they don't?

When the total money after the epoch is greater than 10,000, the total reward matches: if the final total money is 10,230, the total reward says 230. That doesn't hold when the total money is less than 10,000.
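The bookkeeping quoted above can be replayed in isolation to see how the two numbers relate. This is only a minimal sketch, not the notebook's code: `simulate`, the price list, and the fixed action sequence are made up for illustration, and the `half_window` condition on buys is left out. Under those assumptions, it shows one way the figures can drift apart: a share still sitting in `inventory` when the loop ends has already been subtracted from `starting_money`, but it never reaches `total_profit` because it was never sold.

```python
def simulate(trend, actions, initial_money=10_000):
    """Replay the buy/sell bookkeeping from the snippets above.

    `trend` is a list of prices and `actions` a list of 0 (hold),
    1 (buy) or 2 (sell). Both are hypothetical inputs, not notebook data.
    """
    starting_money = initial_money
    total_profit = 0
    inventory = []
    for t, action in enumerate(actions):
        if action == 1 and starting_money >= trend[t]:
            # Buy: the cash leaves the balance immediately.
            inventory.append(trend[t])
            starting_money -= trend[t]
        elif action == 2 and len(inventory):
            # Sell: profit is only recorded once a position is closed.
            bought_price = inventory.pop(0)
            total_profit += trend[t] - bought_price
            starting_money += trend[t]
    return starting_money, total_profit, inventory


# Hypothetical run: buy, sell at a gain, buy again, then hold to the end.
trend = [100, 110, 120, 130]
actions = [1, 2, 1, 0]
money, profit, inventory = simulate(trend, actions)

print(money, profit, inventory)  # 9890 10 [120]
# The gap is exactly the cost of the share that was never sold.
print(money == 10_000 + profit - sum(inventory))  # True
```

In this toy run the profit is +10 while the balance ends at 9,890, because the unsold share bought at 120 is still held. Whether that is what happens in the screenshot depends on whether the agent ends the epoch with a non-empty inventory, which the snippets alone don't show.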

You can look at the code for any of the agents, it's all the same, but I'm looking specifically at 4.policy-gradient-agent.ipynb for this example. Maybe it's something blatantly obvious, but I sure don't see it.

@ArtificialZeng

How about now?
