Heuristic dynamic programming using echo state network as online trainable adaptive critic
Academic Article