Reinforcement Learning Algorithms As Function Optimizers
Any nonassociative reinforcement learning algorithm can be viewed as a method for performing function optimization through (possibly noise-corrupted) sampling of function values. This paper describes simulations in which variants of REINFORCE algorithms were used to seek the optima of several deterministic functions studied by D. H. Ackley (Ph.D. Diss., Carnegie-Mellon Univ., 1987). Results obtained for certain of these algorithms compare favorably to the best results found by Ackley.
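To make the idea in the abstract concrete, the following is a minimal sketch of how a REINFORCE-style rule can act as a function optimizer over bit strings. It is an illustration under stated assumptions, not the variants evaluated in the paper: it assumes independent Bernoulli units with logistic parameters, an exponentially averaged reward baseline (reinforcement comparison), and a toy objective that rewards the fraction of ones. The names `reinforce_maximize` and `one_max`, and the values of `alpha`, `decay`, and `steps`, are hypothetical choices made for this example.

```python
import math
import random


def sigmoid(z):
    """Logistic function; the input is clamped to avoid overflow in exp."""
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def reinforce_maximize(f, n_bits, steps=2000, alpha=0.1, decay=0.9, seed=0):
    """Search for a bit string that maximizes f using only sampled values of f.

    Assumed setup (not taken from the paper): one Bernoulli-logistic unit per
    bit, updated by theta_i += alpha * (r - baseline) * (x_i - p_i).
    """
    rng = random.Random(seed)
    theta = [0.0] * n_bits          # one parameter per bit; p_i starts at 0.5
    baseline = 0.0                  # running average of past rewards
    best_x, best_r = None, float("-inf")

    for _ in range(steps):
        probs = [sigmoid(t) for t in theta]
        # Sample a candidate point: each bit is 1 with probability p_i.
        x = [1 if rng.random() < p else 0 for p in probs]
        r = f(x)                    # the only feedback is the function value

        if r > best_r:
            best_x, best_r = x, r

        # (x_i - p_i) is the derivative of log-probability of x_i w.r.t. theta_i
        # for a Bernoulli-logistic unit.
        for i in range(n_bits):
            theta[i] += alpha * (r - baseline) * (x[i] - probs[i])

        # Exponentially weighted average of rewards, used as the baseline.
        baseline = decay * baseline + (1.0 - decay) * r

    return best_x, best_r


def one_max(x):
    """Toy objective for illustration: fraction of bits set to 1."""
    return sum(x) / len(x)


if __name__ == "__main__":
    best_x, best_r = reinforce_maximize(one_max, n_bits=10)
    print(best_x, best_r)
```

The optimizer never inspects the structure of `f`; it only samples points and uses the returned values as reinforcement, which is the sense in which a nonassociative reinforcement learner performs function optimization. Swapping in a different objective only requires passing another function that maps a list of bits to a number.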
MSU Digital Commons Citation
Williams, Ronald J. and Peng, Jing, "Reinforcement Learning Algorithms As Function Optimizers" (1989). Department of Computer Science Faculty Scholarship and Creative Works. 513.