DDPG: help with problems and questions
Hey reader,
So I have been studying a lot of things in reinforcement learning and came to the conclusion that I want to try the DDPG algorithm. I tried the code from https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/blob/master/contents/9_Deep_Deterministic_Policy_Gradient_DDPG/DDPG_update2.py, but when using this code I immediately had a million questions. I first ran it as-is to see if it works, and it worked well enough with the Pendulum example environment, but then I realized that I couldn't get my own environment hooked up to / working with this existing code.
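To show what I mean by "hooked up": as far as I can tell, that DDPG script builds its networks from a few numbers it reads off the gym environment, roughly like this (I'm writing this from memory, so the exact variable names in the script may differ):

import gym

env = gym.make('Pendulum-v0')
s_dim = env.observation_space.shape[0]   # length of one state vector
a_dim = env.action_space.shape[0]        # number of continuous action components
a_bound = env.action_space.high          # upper bound of each action component
# the DDPG agent is then constructed from these three values, something like:
# ddpg = DDPG(a_dim, s_dim, a_bound)
# so my own environment would also have to expose observation_space and action_space.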
My environment consists of at least a reset function, an action function, and an __init__ function. The reset function should* return pictures of 1080×720 pixels as, I believe, an RGB array(?). My action function takes the environment and chooses an action based on the Q-values; I understand I will need to change that to something more suitable for this algorithm. The actions are "up", "down" and "right"; when, for example, "up" is called, it calls a function that does its job, so what the actions actually do is not really relevant here(?). In my __init__ function I do some initialization, but nothing really related to the DDPG algorithm. This environment file was first used for plain Q-learning; I understand some things need to change, but I don't get how or where.
My __init__ currently looks like:
def __init__(self):
    super(Environment, self).__init__()
    self.action_space = ['up', 'down', 'right']
    self.n_actions = len(self.action_space)
    # self.n_features = 3  # I think this is needed for DDPG, but I don't know which value it should be either
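To make the question more concrete, here is a rough gym-style skeleton of what I imagine the environment would need to look like so an algorithm can attach to it (the class layout and values are placeholders/guesses, not my real code):

import numpy as np
import gym
from gym import spaces

class Environment(gym.Env):
    def __init__(self):
        super(Environment, self).__init__()
        # three discrete actions: 0=up, 1=down, 2=right
        self.action_names = ['up', 'down', 'right']
        self.action_space = spaces.Discrete(len(self.action_names))
        # one RGB frame of the game as the observation: (height, width, channels)
        self.observation_space = spaces.Box(low=0, high=255,
                                            shape=(720, 1080, 3), dtype=np.uint8)

    def reset(self):
        # should return the first frame as a numpy array of shape (720, 1080, 3)
        return np.zeros(self.observation_space.shape, dtype=np.uint8)

    def step(self, action):
        # 'action' is an index (0, 1 or 2) chosen by the algorithm, not inside step
        # apply it to the game here, then grab the next frame
        next_frame = np.zeros(self.observation_space.shape, dtype=np.uint8)
        reward = 0.0    # reward for this step
        done = False    # whether the episode is over
        return next_frame, reward, done, {}

Whether DDPG can actually work with a Discrete action space like this is part of what I'm asking.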
That's about everything I have in my environment… if I need more in my environment, just let me know.
I know that my environment has continuous states (the images) and that I will then need a "Box" space for them, but I have no idea how to set this up with the DDPG code I currently have; the actions themselves are only the three discrete ones (0, 1, 2).
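The only workaround I can think of for the action side is to let DDPG output one continuous value and map it to an action index myself, something like this (purely my own guess, maybe there is a proper way):

import numpy as np

def continuous_to_index(a, n_actions=3):
    # assume the actor outputs a single value in [-1, 1];
    # split that range into n_actions equal bins and return the bin index
    a = float(np.clip(a, -1.0, 1.0))
    index = int((a + 1.0) / 2.0 * n_actions)
    return min(index, n_actions - 1)

print(continuous_to_index(-0.9))  # 0 -> 'up'
print(continuous_to_index(0.1))   # 1 -> 'down'
print(continuous_to_index(0.95))  # 2 -> 'right'

Is that a reasonable way to do it, or does it defeat the point of DDPG?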
End goal:
1: The AI should be built properly and should have all the DDPG aspects in its algorithm
2: The AI is attached to the environment, with images as input and actions as 0, 1, 2 (0=up, 1=down, 2=right)
3: The AI should save its progress somewhere (like a pickle); see the sketch right after this list
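For point 3, the only saving methods I know are pickle for plain Python objects and, assuming the linked DDPG code keeps its weights in a TensorFlow 1.x session, tf.train.Saver for the network itself. A minimal sketch of the pickle part (the values are made up):

import pickle

# for the network weights I think tf.train.Saver is the usual way, e.g.
#   saver = tf.train.Saver()
#   saver.save(sess, './checkpoints/ddpg')      # save
#   saver.restore(sess, './checkpoints/ddpg')   # load again later
# but that needs the session from the DDPG code, so here is only the pickle part:

progress = {'episode': 120, 'rewards': [1.0, 2.5, 3.0]}   # made-up example values
with open('progress.pkl', 'wb') as f:
    pickle.dump(progress, f)
with open('progress.pkl', 'rb') as f:
    progress = pickle.load(f)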
At last: I can't share my environment code with you, but there is nothing strange in it, just a reset and a step function, with reset (should*) returning a picture and step handling the actions 0, 1, 2. I know something should be switched so that the action isn't chosen inside step but in the algorithm, but at least step should return the chosen action, right?
Should*: I have pictures that are time frames from the game; they are .png files of 1080×720 (I will change the size later). I want the AI to get to see these pictures, but I don't have a good way yet to feed them to the environment; I think I've read something about arrays and RGB modes?
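About the array/RGB part: the only way I know to turn a .png frame into an array is something like this with Pillow (the file name is just an example):

import numpy as np
from PIL import Image

frame = Image.open('frame_0001.png').convert('RGB')  # force 3-channel RGB
frame = np.array(frame, dtype=np.uint8)              # shape (height, width, 3)
print(frame.shape)                                   # e.g. (720, 1080, 3) for my frames

Is that the right kind of array to return from reset/step?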
If you know anything about this, let me know what might help (please don't just post links to documentation, I've read them all; explaining is allowed?). If you know all or some of the answers, please let me know, ideally with some example code as well, so I can understand it. If you really want to help me out over live chat, we can use Skype/Discord; just send a PM with your Discord/Skype name so we can talk about it.
Thanks in advance for reading and responding!
Jan