1 Introduction
In recent years, distributed optimization has attracted extensive attention in various fields [1–5], owing to its wide practicability in numerous areas such as distributed coordinated control [6], economic dispatch in power grids [7], and internet of things applications [8].
A variety of convex optimization problems arising in practical applications have been solved in recent years. The incremental sub-gradient algorithm is one of the earlier methods for solving convex optimization problems [9]. Furthermore, by combining the consensus algorithm with the sub-gradient algorithm, it is proved in [10] and [11] that all agents reach the optimal consensus state. In [11] and [12], consensus-based sub-gradient algorithms are proposed for distributed convex optimization problems with and without bounded closed convex constraints. In [13], a continuous-time sub-gradient algorithm is proposed for solving distributed convex optimization problems with general constraints. Moreover, for the multi-agent rendezvous problem, where the dynamics of each robot change continuously, a distributed sub-gradient shortest-distance rendezvous algorithm is proposed in [14]. In [15], a distributed optimization algorithm based on gradient tracking is proposed for distributed convex optimization problems with set constraints. It is worth noting that all the above works rely on real gradient information.
However, computing gradient information accurately is often infeasible or costly in practical applications. For example, in the internet of things [16], fog computing cannot obtain a closed-form expression of the delay, since its online decision-making needs to adapt to user preferences and the availability of resources is temporarily unpredictable. Moreover, in bandit optimization [17], the player can only observe the value of the function, not the objective function itself. To address these difficulties, zeroth-order stochastic optimization has recently attracted researchers' attention. In [18] and [19], by employing a two-point gradient estimator, several different zeroth-order optimization algorithms have been proposed for unconstrained problems. In [20], an adaptive distributed bandit primal-dual algorithm based on a two-point gradient estimator is proposed for distributed online stochastic convex optimization with time-varying constraints. In [21], a distributed stochastic approximation method of Kiefer–Wolfowitz type is developed for zeroth-order optimization of stochastic networks. In [22], a distributed gradient descent algorithm is proposed for multi-agent optimization problems with set constraints, where the real gradient information is replaced by locally constructed random gradients. In [23], the convex optimization problem over time-varying networks is discussed in detail, and an algorithm that approximately substitutes the sub-gradient with a two-point random gradient is proposed.
It is worth noting that all the aforementioned investigations concern convex optimization problems. However, pseudoconvex optimization problems [24] arise widely in reality, for example in fractional programming [25], economics [26], and solid mechanics [27]. Compared with convex optimization, pseudoconvex optimization is more general and also covers some nonconvex cases [25]. In fact, the online distributed optimization problem with strongly pseudoconvex cost functions has been studied in [28], where real gradient information of the objective functions is required. However, real gradient information is often difficult to obtain in practical applications, or computing real gradients incurs high costs. Motivated by [18, 19, 28], we develop zeroth-order methods to deal with the online distributed optimization problem with strongly pseudoconvex cost functions.
In this paper, the problem of online distributed optimization with strongly pseudoconvex cost functions is studied, where the real gradients of the objective functions are not available. Different from [29–32], where the cost functions are assumed to be convex, here the cost functions are assumed to be strongly pseudoconvex, which is more general. Both the zeroth-order information and the pseudoconvexity of the cost functions decrease the convergence rate of the online algorithm and enlarge the bound of the dynamic regret; achieving an effective regret bound in this setting is a great challenge. To solve the problem, an online zeroth-order stochastic optimization algorithm is proposed. In this algorithm, an auxiliary optimization-based strategy is adopted to update the states of the agents, and a single-point gradient estimator is used to estimate the local gradients. This algorithm uses less real information than the algorithm in [28], which makes it more practical. Moreover, different from [18] and [19], where each agent estimates the true gradient with a two-point gradient estimator, the single-point gradient estimator can be realized even when the values of some random variables change rapidly. The performance of this algorithm is measured by the expectation of the dynamic regret. We prove that if the graph is $B$-strongly connected and the cumulative deviation of the minimizer sequence grows at a certain rate, then the expectation of the dynamic regret grows sublinearly.
This paper is organized as follows. In Section 2, we formulate the problem and propose an online zeroth-order stochastic optimization algorithm based on a single-point gradient estimator. In Section 3, we state the main results of this paper and give the proofs. A simulation example is given in Section 4. Section 5 concludes the paper.
Notations: The absolute value of a scalar $a$ is denoted by $|a|$. The set of real numbers is denoted by $\mathbb{R}$, and $\mathbb{N}_+$ is used to represent the set of positive integers. For any $m \in \mathbb{N}_+$, we denote the set $[m] = \{1, 2, \ldots, m\}$. We use $\mathbb{R}^m$ to denote the $m$-dimensional real vector space. For given vectors $x, y \in \mathbb{R}^m$ and a matrix $A \in \mathbb{R}^{m \times m}$, we denote $\langle x, y \rangle = x^{\top} y$, $\|x\| = \sqrt{x^{\top} x}$, $\|x\|_1 = \sum_{l=1}^{m} |[x]_l|$ and $\|x\|_A^2 = x^{\top} A x$, where $[x]_l$ represents the $l$-th entry of vector $x$. For a differentiable function $f$, we use $\nabla f$ to represent the gradient of $f$, and use $\nabla^2 f$ to denote its Hessian matrix. $I_m$ is the $m \times m$ identity matrix. $\mathbf{1}_n$ is an $n$-dimensional vector whose elements are all ones. For a matrix $A$, $[A]_{ij}$ denotes the matrix entry in the $i$-th row and $j$-th column, and $[A]_i$ represents the $i$-th row of the matrix $A$. $\lambda_{\max}(A)$ and $\lambda_{\min}(A)$ represent the maximum and minimum eigenvalues of $A$, respectively.
2 Problem Formulation
2.1 Basic Graph Theory
A time-varying directed communication graph is set as $\mathcal{G}(t) = (\mathcal{V}, \mathcal{E}(t), A(t))$, where $\mathcal{V} = \{1, \ldots, n\}$ is a set of vertices and $\mathcal{E}(t) \subseteq \mathcal{V} \times \mathcal{V}$ is an edge set. $A(t) = [a_{ij}(t)] \in \mathbb{R}^{n \times n}$ is a non-negative matrix representing the weights of adjacent edges, where $a_{ij}(t) \geq \eta$ for some $\eta \in (0, 1)$ if $(j, i) \in \mathcal{E}(t)$ and $a_{ij}(t) = 0$ otherwise. We denote $N_i(t) = \{ j \in \mathcal{V} : (j, i) \in \mathcal{E}(t) \}$ as the neighbor set of agent $i$ at time $t$, where $i \in N_i(t)$ for any $i \in \mathcal{V}$. For a fixed topology $\mathcal{G}$, $d(i, j)$ represents the length of the path between node $i$ and node $j$, where a path is a sequence of distinct nodes $i_1, \ldots, i_k$ with $(i_l, i_{l+1}) \in \mathcal{E}$ for $l \in [k-1]$. $\mathcal{G}$ is strongly connected if there is a path between any pair of distinct nodes. For $B \in \mathbb{N}_+$, a $B$-edge set is defined as $\mathcal{E}_B(t) = \bigcup_{k=tB}^{(t+1)B-1} \mathcal{E}(k)$ for some constant $B$. Under the above conditions, if $\mathcal{G}_B(t) = (\mathcal{V}, \mathcal{E}_B(t))$ is strongly connected for any $t \geq 0$, then $\mathcal{G}(t)$ is called $B$-strongly connected.
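The window-union condition can be checked mechanically. Below is a minimal sketch of our own (not part of the original analysis), encoding digraphs as 0/1 adjacency matrices; the function names are hypothetical:

```python
import numpy as np

def is_strongly_connected(adj):
    """Strong connectivity test for a digraph given as a 0/1 adjacency matrix."""
    n = adj.shape[0]
    # (I + adj)^(n-1) is everywhere positive iff every node reaches every node.
    reach = np.linalg.matrix_power(np.eye(n, dtype=int) + adj.astype(int), n - 1)
    return bool(np.all(reach > 0))

def is_B_strongly_connected(adj_seq, B):
    """Check that the union of each block of B consecutive edge sets is
    strongly connected, following the definition above."""
    for b in range(len(adj_seq) // B):
        union = np.zeros_like(adj_seq[0])
        for k in range(b * B, (b + 1) * B):
            union = np.maximum(union, adj_seq[k])
        if not is_strongly_connected(union):
            return False
    return True

# Example: a directed 3-cycle split across two time steps is 2-strongly connected.
G1 = np.array([[0, 1, 0], [0, 0, 0], [0, 0, 0]])
G2 = np.array([[0, 0, 0], [0, 0, 1], [1, 0, 0]])
print(is_B_strongly_connected([G1, G2], B=2))   # True
```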
The following assumption is made for the graph.
Assumption 1 For any $t \geq 0$, $\mathcal{G}(t)$ is a $B$-strongly connected graph and $A(t)$ is a doubly stochastic matrix.
The connectivity of the graph in Assumption 1 plays an important role in facilitating the agents to reach a common state [12, 33]. The following lemma from [12] is recalled.
Lemma 1 [12] Under Assumption 1, for any $i, j \in \mathcal{V}$ and $t \geq s \geq 0$, there exist certain $C > 0$ and $\gamma \in (0, 1)$ satisfying
$$\left| [\Phi(t, s)]_{ij} - \frac{1}{n} \right| \leq C \gamma^{t - s},$$
where $\Phi(t, s) = A(t) A(t-1) \cdots A(s)$ is the state transition matrix of the consensus model $x(t+1) = A(t) x(t)$ with $\Phi(t, t) = A(t)$ for any $t \geq 0$.
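As a quick numerical illustration of this geometric decay (a toy check of ours, not part of the analysis), multiplying doubly stochastic matrices drives the transition matrix toward $\frac{1}{n} \mathbf{1}_n \mathbf{1}_n^{\top}$:

```python
import numpy as np

n = 5
P_cycle = np.roll(np.eye(n), 1, axis=1)   # cyclic permutation matrix
A = 0.5 * np.eye(n) + 0.5 * P_cycle       # doubly stochastic weight matrix

Phi = np.eye(n)
for t in range(1, 31):
    Phi = A @ Phi                          # Phi(t, 0) = A(t-1) ... A(0)
    if t % 10 == 0:
        # max_ij |[Phi]_ij - 1/n| decays geometrically, as in Lemma 1
        print(t, np.max(np.abs(Phi - 1.0 / n)))
```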
2.2 Online Distributed Pseudoconvex Optimization
Consider a multi-agent system consisting of $n$ agents, labeled by the set $\mathcal{V} = \{1, \ldots, n\}$. Agents communicate with each other through the digraph $\mathcal{G}(t)$. For agent $i \in \mathcal{V}$, a sequence of cost functions $\{ f_{i,t} \}$ is given, where $f_{i,t}$ is twice differentiable for any $t \in [T]$, is unknown to the agents in advance, and $T \in \mathbb{N}_+$ is the learning time. At each iteration time $t$, agent $i$ selects a state $x_{i,t} \in \Omega$. Then, agent $i$ receives a local cost function $f_{i,t}$, which means that the cost function information is not available until the agent updates its action. In this case, at each iteration time $t$, all agents attempt to cooperatively solve the following optimization problem
$$\min_{x \in \Omega} \; f_t(x) = \sum_{i=1}^{n} f_{i,t}(x). \qquad (1)$$
For any $i \in \mathcal{V}$, $t \in [T]$ and some random vector $\xi_{i,t}$, denote $f_{i,t}(\cdot, \xi_{i,t}) : \mathbb{R}^m \to \mathbb{R}$ as the local cost function of agent $i$. We use $\xi_{i,t}$ to describe some nonadditive random processes, and the expected local cost function is introduced here as $F_{i,t}(x) = \mathbb{E}[f_{i,t}(x, \xi_{i,t})]$. Problem (1) is then rewritten as follows
$$\min_{x \in \Omega} \; F_t(x) = \sum_{i=1}^{n} F_{i,t}(x). \qquad (2)$$
Here some basic assumptions are made for the problem.
Assumption 2 The set $\Omega$ is non-empty and bounded, i.e., for any $x \in \Omega$, there holds $\|x\| \leq K$, where $K$ is a positive constant. Moreover, $\Omega$ is a closed convex set.
Assumption 3 For any $i \in \mathcal{V}$ and $t \in [T]$, each $F_{i,t}$ is $\beta$-strongly pseudoconvex on $\Omega$, i.e., for any $x, y \in \Omega$, $\langle \nabla F_{i,t}(y), x - y \rangle \geq 0$ implies $\langle \nabla F_{i,t}(x), x - y \rangle \geq \beta \|x - y\|^2$ with some constant $\beta > 0$.
Remark 1 In Assumption 3, the gradient of the cost function satisfies strong pseudomonotonicity. Note that if $\nabla F_{i,t}$ is strongly pseudomonotone, then $F_{i,t}$ is strongly pseudoconvex [34]. The definitions of pseudoconvex, quasiconvex and convex functions and their relationships are discussed as follows. A given differentiable function $h$ is pseudoconvex on $\Omega$ if for any two different points $x, y \in \Omega$, $\langle \nabla h(y), x - y \rangle \geq 0$ implies $h(x) \geq h(y)$. Moreover, if for any two different points $x, y \in \Omega$, $\langle \nabla h(y), x - y \rangle \geq 0$ implies $h(x) \geq h(y) + \beta \|x - y\|^2$ with some constant $\beta > 0$, then $h$ is called a $\beta$-strongly pseudoconvex function on $\Omega$. And $h$ is convex on $\Omega$ if for any two different points $x, y \in \Omega$, $h(x) \geq h(y) + \langle \nabla h(y), x - y \rangle$. Moreover, $h$ is quasiconvex on $\Omega$ if for any two different points $x, y \in \Omega$, $h(x) \leq h(y)$ implies $\langle \nabla h(y), x - y \rangle \leq 0$. According to the above definitions, it is not difficult to verify that the class of quasiconvex functions includes both convex and pseudoconvex functions, and that a convex function is a special case of a pseudoconvex function. Compared with convex optimization [30–32], strongly pseudoconvex optimization problems are more general. It is undeniable that quasiconvex problems are more complex than pseudoconvex ones, constituting a more general nonconvex class. In our future work, we will consider the quasiconvex case.
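These implications can be probed numerically. In the sketch below (a toy check of ours), $h(x) = x + x^3$ has $h'(x) = 1 + 3x^2 > 0$ and is therefore pseudoconvex on $\mathbb{R}$ although $h''(x) < 0$ for $x < 0$, while $g(x) = x^3$ is quasiconvex (monotone) but fails the pseudoconvexity implication at $y = 0$:

```python
import numpy as np

rng = np.random.default_rng(0)

def respects_pseudoconvexity(f, df, pairs, tol=1e-9):
    """Test the implication  df(y)*(x - y) >= 0  =>  f(x) >= f(y)  on sample pairs."""
    for x, y in pairs:
        if df(y) * (x - y) >= -tol and f(x) < f(y) - tol:
            return False
    return True

pairs = list(zip(rng.uniform(-2, 2, 1000), rng.uniform(-2, 2, 1000)))
pairs.append((-1.0, 0.0))   # include y = 0, where the premise holds for every x

# h(x) = x + x^3 is strictly increasing, hence pseudoconvex, yet not convex.
print(respects_pseudoconvexity(lambda x: x + x**3, lambda x: 1 + 3 * x**2, pairs))  # True

# g(x) = x^3: g'(0) = 0, so the premise holds at y = 0, while g(-1) < g(0).
print(respects_pseudoconvexity(lambda x: x**3, lambda x: 3 * x**2, pairs))          # False
```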
Assumption 4 For any $i \in \mathcal{V}$ and $t \in [T]$, $\| \nabla f_{i,t}(x, \xi_{i,t}) \| \leq L$ on $\Omega$ for some constant $L > 0$.
Any online algorithm should mimic the performance of its offline counterpart, and the gap between them is called regret. If the offline benchmark for the agents is to minimize $\sum_{t=1}^{T} F_t(x)$, then the regret is called static regret [1], which is defined as
$$R_s(i, T) = \sum_{t=1}^{T} F_t(x_{i,t}) - \sum_{t=1}^{T} F_t(x^*), \qquad (4)$$
where $x^* \in \arg\min_{x \in \Omega} \sum_{t=1}^{T} F_t(x)$. Here the offline benchmark for the agents is to minimize $F_t(x)$ at each time, and such regret is called dynamic regret [34], which is defined as
$$R_d(i, T) = \sum_{t=1}^{T} F_t(x_{i,t}) - \sum_{t=1}^{T} F_t(x_t^*), \qquad (5)$$
where $x_t^* \in \arg\min_{x \in \Omega} F_t(x)$ for any $t \in [T]$. The offline benchmark of the dynamic regret (5) is more stringent than that of the static regret. It is undeniable that dynamic regret may fail in the worst case. Inspired by [30–32], we use the following deviation of the minimizer sequence $\{x_t^*\}_{t=1}^{T}$ to describe the difficulty:
$$\Theta_T = \sum_{t=1}^{T-1} \left\| x_{t+1}^* - x_t^* \right\|. \qquad (6)$$
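These two quantities are estimated analytically in Section 3; in experiments, they can also be computed directly from a state trajectory. The following helper functions are a small sketch of our own (the names F, states and minimizers are assumptions):

```python
import numpy as np

def dynamic_regret(F, states, minimizers):
    """R_d(i, T) = sum_t F_t(x_{i,t}) - F_t(x_t^*), cf. (5); F(t, x) evaluates F_t."""
    return sum(F(t, x) - F(t, x_star)
               for t, (x, x_star) in enumerate(zip(states, minimizers), start=1))

def minimizer_deviation(minimizers):
    """Theta_T = sum_{t=1}^{T-1} ||x_{t+1}^* - x_t^*||, cf. (6)."""
    m = np.asarray(minimizers, dtype=float)
    return float(np.sum(np.linalg.norm(np.diff(m, axis=0), axis=1)))
```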
2.3 Online Distributed Algorithms
Consider the following centralized optimization problem
$$\min_{x \in \Omega} F(x), \qquad (7)$$
where $\Omega$ is a closed convex set, $F$ is a strongly pseudoconvex function, and the real gradients of the objective function are not available. To address this problem, motivated by [35], we begin by constructing the gradient estimator.
Consider a gradient estimator based on a single-point evaluation of the local cost function. At each time $t$, agent $i$ estimates the gradient of its local cost function with
$$\hat{\nabla} f_{i,t}(x_{i,t}) = \frac{m}{\delta_t} f_{i,t}(x_{i,t} + \delta_t u_{i,t}, \xi_{i,t}) \, u_{i,t}, \qquad (8)$$
where $\delta_t > 0$ represents the step-size of the estimator and $u_{i,t} \in \mathbb{R}^m$ is a random perturbation vector, uniformly distributed on the unit sphere, which is independently generated by each agent.
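Estimator (8) is the classical single-point (one-evaluation) scheme: the perturbed function value, scaled along the random direction, serves as a gradient surrogate built from a single noisy observation per round. A minimal sketch, with constants following the standard form (they may differ from the paper's (8) up to scaling):

```python
import numpy as np

def single_point_grad(f_value, x, delta, rng):
    """Single-point zeroth-order gradient estimate at x:
    (m / delta) * f(x + delta * u) * u, with u uniform on the unit sphere.
    Only one (possibly noisy) function evaluation is used per call."""
    m = x.size
    u = rng.standard_normal(m)
    u /= np.linalg.norm(u)          # uniform direction on the unit sphere
    return (m / delta) * f_value(x + delta * u) * u
```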
To solve (2), an online zeroth-order stochastic optimization algorithm is proposed as follows:
$$x_{i,t+1} = \arg\min_{y \in \Omega} \left\{ \alpha_t \left\langle \hat{\nabla} f_{i,t}(x_{i,t}), \, y \right\rangle + \left\| y - z_{i,t} \right\|_P^2 \right\}, \qquad z_{i,t} = \sum_{j=1}^{n} a_{ij}(t) \, x_{j,t}, \qquad (9)$$
for any $i \in \mathcal{V}$ and $t \in [T]$, where $x_{i,t}$ is the state of agent $i$ used to estimate the minimizer of the objective function at time $t$, and $P \in \mathbb{R}^{m \times m}$ is a symmetric and positive definite matrix. The term $z_{i,t}$ is obtained by the average-consensus algorithm, $\alpha_t$ and $\delta_t$ are positive and decaying learning rates, and the initial values satisfy $x_{i,1} \in \Omega$.
Motivated by the single-point gradient estimation strategy [35], the consensus algorithm [36], and the auxiliary optimization algorithm [33], we propose algorithm (9). The consensus term $z_{i,t}$ is inspired by the consensus algorithms in [10] and [36]. Under this algorithm, each agent updates its action according to its own state, the state information received from its neighbors at the current moment, and the gradient estimate. This makes algorithm (9) distributed and online.
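To make the update loop concrete, the sketch below performs one round of (9) for all agents, assuming the auxiliary minimization reduces to a $P$-weighted gradient step followed by projection onto $\Omega$ (exact for $P = I_m$ with Euclidean projection, a simplification otherwise); grad_est and project are assumed callables:

```python
import numpy as np

def algorithm_step(x, A_t, grad_est, P, alpha_t, project):
    """One hypothetical round of (9) for all n agents: consensus mixing,
    then a P-weighted zeroth-order step projected back onto the set."""
    z = A_t @ x                               # z_i = sum_j a_ij(t) x_j  (consensus)
    P_inv = np.linalg.inv(P)
    x_next = np.empty_like(x)
    for i in range(x.shape[0]):
        g = grad_est(i, x[i])                 # single-point gradient estimate (8)
        x_next[i] = project(z[i] - alpha_t * (P_inv @ g))
    return x_next
```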
Remark 2 To implement the algorithm, the matrix $P$ should be known in advance by all agents, which may prevent the proposed algorithm from being fully distributed. In fact, over a balanced and periodically strongly connected ($B$-strongly connected) communication graph, it is not difficult for the agents to determine a common matrix by only using local information. For example, let the local initial state of each agent be a symmetric and positive definite matrix $P_i(0)$, $i \in \mathcal{V}$. If each agent updates its state by the average-consensus algorithm
$$P_i(t+1) = \sum_{j=1}^{n} a_{ij}(t) P_j(t),$$
then all agents' states converge to $\frac{1}{n} \sum_{i=1}^{n} P_i(0)$, which helps each agent achieve a common symmetric and positive definite matrix $P$.
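A minimal sketch of this matrix agreement step (function and variable names are ours):

```python
import numpy as np

def agree_on_matrix(P_init, A_seq):
    """Average consensus on local SPD matrices: P_i <- sum_j a_ij(t) P_j.
    With doubly stochastic weights over a B-strongly connected graph, every
    P_i converges to (1/n) * sum_j P_j(0), which is again SPD."""
    P = [Pi.astype(float).copy() for Pi in P_init]
    n = len(P)
    for A in A_seq:
        P = [sum(A[i, j] * P[j] for j in range(n)) for i in range(n)]
    return P
```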
3 Main Results
Theorem 1 Under Assumptions 1–4, if the learning rates $\alpha_t$ and $\delta_t$ are selected as suitably decaying powers of $t$, then for any $i \in \mathcal{V}$ and learning time $T$, the expectation of the dynamic regret $\mathbb{E}[R_d(i, T)]$ satisfies the bound (10), which grows sublinearly in $T$ whenever the deviation $\Theta_T$ does. The constants in (10) depend only on $n$, $m$, the set bound $K$, the pseudoconvexity constant $\beta$, the gradient bound $L$, the eigenvalues of $P$, and the constants $C$ and $\gamma$ from Lemma 1.
Remark 3 By Theorem 1, we can see that $\Theta_T$ plays an important role in the sublinear bound of the dynamic regret. If $\Theta_T$ grows sublinearly with $T$, then $\lim_{T \to \infty} \mathbb{E}[R_d(i, T)] / T = 0$, which implies that the online distributed algorithm (9) solves problem (1) well. Therefore, algorithm (9) is suitable for solving strongly pseudoconvex optimization problems where the real gradients of the objective functions are not available. If the minimizer sequence $\{x_t^*\}$ changes drastically, $\Theta_T$ might grow linearly with $T$, and then the performance of algorithm (9) cannot be guaranteed. This is natural, since the problem is also unsolvable in the worst case, even in online convex optimization [30–32]. In the convergence analysis, there is an error between the estimated gradient and the real gradient, which enlarges the bound of the dynamic regret. To overcome this difficulty, we construct a new smooth function to establish the relationship between the estimated gradient and the real gradient. By selecting an appropriate step-size for the smoothing function and an appropriate learning rate for the online algorithm, we prove that the expectation of the regret function increases sublinearly. In addition, the convergence rate of this algorithm is slower than that of [28]. This is natural, since the algorithm uses less information and does not directly use the real gradient information.
Before giving the proof of Theorem 1, some useful lemmas need to be provided. First, the Karush-Kuhn-Tucker (KKT) condition for pseudoconvex optimization is recalled.
Lemma 2 [1] $x^*$ is a minimum point of a pseudoconvex function $F$ on $\Omega$ if and only if $\langle \nabla F(x^*), x - x^* \rangle \geq 0$, $\forall x \in \Omega$.
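As a small numerical illustration of Lemma 2 (an assumed one-dimensional example of ours), consider $F(x) = (x - 2)^2$ on $\Omega = [-1, 1]$: the constrained minimizer is $x^* = 1$, and $\langle \nabla F(x^*), x - x^* \rangle = -2(x - 1) \geq 0$ for every $x \in \Omega$:

```python
import numpy as np

# Check Lemma 2 for F(x) = (x - 2)^2 on Omega = [-1, 1]:
# the constrained minimizer is x* = 1 and grad F(1) = -2.
xs = np.linspace(-1.0, 1.0, 201)
print(bool(np.all(-2.0 * (xs - 1.0) >= 0)))   # True: the optimality condition holds
```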
Lemma 3 [35] Suppose that for any $i \in \mathcal{V}$ and $t \in [T]$, the random perturbation vector $u_{i,t}$ is uniformly distributed on the unit sphere and $\delta_t > 0$. Then there holds $\mathbb{E}[\hat{\nabla} f_{i,t}(x_{i,t})] = \nabla F_{i,t}(x_{i,t}) + b_{i,t}$, where $b_{i,t}$ represents an estimation bias of the gradient estimator with the property $\lim_{\delta_t \to 0} \|b_{i,t}\| = 0$.
Now we present the following lemma, which gives the upper bound of the error between each agent's state and the average state at each iteration time under algorithm (9).
Lemma 4 Under Assumptions 1, 2 and 4, for any $i \in \mathcal{V}$ and $t \in [T]$, the consensus-error bounds (11) and (12) hold.
Proof. Applying the KKT condition to the first equation of algorithm (9), there holds
for any $i \in \mathcal{V}$ and $t \in [T]$. By the convexity of $\Omega$, we have, for any $x, y \in \Omega$,
Letting $z_{i,t} = \sum_{j=1}^{n} a_{ij}(t) x_{j,t}$, together with the above facts, we have
From (13), there holds the following bound. Moreover,
Let stacked vectors represent the corresponding terms of all agents, respectively. Their relationship is as follows
which implies that
From Assumption 1 we know that $A(t)$ is a doubly stochastic matrix for any $t \geq 0$; then
From (14) and (15), there holds
for any $i \in \mathcal{V}$ and $t \in [T]$. By Lemma 1, there holds
This result proves (11). Moreover, due to the facts that is non-increasing and , we have
Using the Cauchy–Schwarz inequality yields
Substituting (17) into (16) proves the validity of (12).
Then we give the cumulative upper bound of the error between the average state of the agents and the minimizer.
Lemma 5 Under Assumptions 1–4, if the learning rate $\alpha_t$ is non-increasing, the cumulative bound (18) holds.
To prove Lemma 5, we give an auxiliary lemma as follows.
Lemma 6 Under Assumptions 1–4, for any $i \in \mathcal{V}$ and $t \in [T]$, the estimate (19) holds.
Proof. Note that
Then, note that . By using Young's inequality, there holds
Substituting (12) into (20) yields (19).
Proof. (Proof of Lemma 5) Note that,
for any $i \in \mathcal{V}$. Now we denote $\bar{x}_t = \frac{1}{n} \sum_{i=1}^{n} x_{i,t}$; there holds
Then, by the KKT condition, there holds
for any . By (19) we have
Under Assumption 3, by Lemma 2, we have
Note that, since the set $\Omega$ in Assumption 2 is bounded, there holds
Then,
By using Young's inequality, we have
By (22)–(25) and using (9), there holds
Due to the fact that this holds for any $t \in [T]$, by accumulating the two sides of (26) from $t = 1$ to $T$, we achieve (18) in Lemma 5.
Proof. (Proof of Theorem 1) By Lemma 4 and Lemma 5, for any $i \in \mathcal{V}$, there holds
Let and . Note that . Similarly, , and , we have
where the first equality is established by changing the order of summation, and the bound noted above, which holds for any $t \in [T]$, leads to the last inequality. A similar estimate holds for the remaining term.
From Lemma 2, taking the total expectation for yields
Taking expectation on both sides of inequality (27) yields
Then, using Jensen's inequality, we have
Taking expectation on both sides of (31), we get
Due to the fact , and by (32), we have
By inequalities (27)–(33), we have
Note that and due to the Lipschitz continuity of , there holds
for any $i \in \mathcal{V}$. Taking the expectation and substituting (34) into (35) immediately implies (10), which completes the proof.
4 A Simulation Example
In this section, we illustrate the performance of the algorithm through a simulation example. A multi-agent system is set up with six agents, labeled by $\mathcal{V} = \{1, \ldots, 6\}$, where each agent transmits information to its neighbors through a directed graph. As shown in Figure 1, there are four possible digraphs, $\mathcal{G}_1$, $\mathcal{G}_2$, $\mathcal{G}_3$ and $\mathcal{G}_4$, and the graphs switch periodically in the sequence $\mathcal{G}_1, \mathcal{G}_2, \mathcal{G}_3, \mathcal{G}_4, \mathcal{G}_1, \ldots$. The weight of each edge is determined by the number of neighbors of the receiving agent. Obviously, the union of these graphs is strongly connected. Here we set the learning time to be a finite horizon $T$. At each time $t$, the objective function is given as follows
Figure 1 The communication graph
where $\xi_{i,t}$ is randomly selected from a uniform distribution and the remaining parameters are fixed constants. Moreover, the set constraint $\Omega$ is a bounded closed convex set. For any $i \in \mathcal{V}$ and $t \in [T]$, it is not difficult to verify that $F_{i,t}$ is strongly pseudoconvex on $\Omega$. Now suppose that agent $i$ can only access its local objective function and the set $\Omega$. Algorithm (9) is applied to the problem, and $x_{i,t}$ is used to represent agent $i$'s state. Initial states of the agents are chosen in $\Omega$. The trajectories of agents' states are shown in Figure 2. Figure 3 displays the average regrets of all agents and shows that each average regret decays to zero after a period of time. Thus, the observations are consistent with the results established in Theorem 1.
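Since the exact cost functions and parameter values of this experiment are specific to the original setup, the following self-contained Python sketch only mirrors its structure: six agents, fixed doubly stochastic weights, a slowly moving strongly convex (hence strongly pseudoconvex) local cost, and the single-point update of algorithm (9). All numerical values here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)
n, m, T, K = 6, 2, 2000, 5.0

def project(x):                      # Euclidean projection onto the ball {||x|| <= K}
    r = np.linalg.norm(x)
    return x if r <= K else (K / r) * x

def f(t, x, xi):                     # toy strongly convex local cost with noise
    c = np.array([np.sin(0.005 * t), np.cos(0.005 * t)])   # slowly moving minimizer
    return float((x - c) @ (x - c)) + xi * float(x.sum())

x = rng.uniform(-1.0, 1.0, (n, m))
A = np.full((n, n), 1.0 / n)         # fixed doubly stochastic weight matrix
for t in range(1, T + 1):
    alpha, delta = t ** -0.75, t ** -0.25       # decaying learning rates
    z = A @ x                                   # consensus step
    for i in range(n):
        u = rng.standard_normal(m)
        u /= np.linalg.norm(u)
        xi = rng.uniform(0.0, 1.0)
        g = (m / delta) * f(t, x[i] + delta * u, xi) * u   # single-point estimate (8)
        x[i] = project(z[i] - alpha * g)

print(np.round(x, 2))                # states should cluster near the minimizer path
```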
Figure 2 The trajectories of agents' states under algorithm (9)
Figure 3 The trajectories of the average regrets of the agents
5 Conclusions
This paper studies the problem of online distributed optimization with strongly pseudoconvex cost functions. To solve this problem, we propose an online distributed zeroth-order stochastic optimization algorithm based on the single-point gradient estimator. Under this algorithm, each agent updates its actions based on its own state, the state information received from its neighbors at the current moment, and the gradient estimate. The performance of the algorithm is measured by the expectation of the dynamic regret under a $B$-strongly connected time-varying graph. The result shows that the bound of the dynamic regret in expectation grows sublinearly, provided the growth rate of the minimizer sequence deviation is within a certain range. The simulation example in the previous section verifies the effectiveness of the algorithm. Considering practical applications, our future work will focus on how to solve more general pseudoconvex problems.