Ceres Library
Brief introduction
The Ceres library is an open source C++ nonlinear optimization library developed by Google, widely used for solving least squares problems.
The GitHub home page of the Ceres library is https://github.com/ceres-solver/ceres-solver
Install
First, download the source code of Ceres:
git clone https://ceres-solver.googlesource.com/ceres-solver
Second, you need to install dependencies:
sudo apt install liblapack-dev libsuitesparse-dev libcxsparse3 libgflags-dev libgoogle-glog-dev libgtest-dev
Then, enter the source directory for compilation and installation:
cd ceres-solver
mkdir build && cd build
cmake ..
make -j12
sudo make install
Default installation file locations: /usr/local/include/ceres and /usr/local/lib
Engineering configuration
To import the library, fill in the CMakeLists.txt in the project root directory as follows:
# Ceres-solver
find_package(Ceres REQUIRED)
include_directories(${CERES_INCLUDE_DIRS})
The CMakeLists.txt under src is filled in as follows:
# link ceres-solver
target_link_libraries(Study ${CERES_LIBRARIES})
Include the header file in main.h as follows:
// Ceres-solver
#include <ceres/ceres.h>
Problem description
In the Ceres library, the least squares problem is modeled as follows (a robustified least squares problem with bounds):
$$\min_x \frac{1}{2}\sum_i \rho_i\Bigl(\bigl\|f_i(x_{i_1},x_{i_2},\cdots,x_{i_k})\bigr\|^2\Bigr) \qquad \text{s.t.}\quad l_j \leq x_j \leq u_j$$
In the problem expressed above:
- $x_1,\cdots,x_n$ are called the optimization variables, also known as parameter blocks;
- $f_i(\cdot)$ is called the cost function;
- $\rho_i(\cdot)$ is called the loss function, also known as the kernel function;
- $\rho_i\bigl(\|f_i(x_{i_1},x_{i_2},\cdots,x_{i_k})\|^2\bigr)$ is called a residual block;
- $l_j$ and $u_j$ are the lower and upper bounds of the parameter $x_j$.
When the loss function is the identity, $\rho_i(x)=x$, and the parameter bounds are set to $l_j=-\infty$, $u_j=\infty$, the problem becomes an unconstrained least squares problem. In that case the loss function changes nothing, and the residual block is simply the squared norm of the cost function.
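Written out explicitly, with the identity loss and infinite bounds the problem above reduces to the plain unconstrained form solved in the examples below:

$$\min_x \frac{1}{2}\sum_i \bigl\|f_i(x_{i_1},x_{i_2},\cdots,x_{i_k})\bigr\|^2$$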
Ceres library API
Solving a least squares problem with the Ceres library consists of three main parts: constructing the cost function, constructing the least squares problem, and solving the least squares problem.
To make the Ceres library easier to learn, the curve-fitting problem from the handwritten Gauss-Newton post is used again as the example. The problem is described here: https://blog.csdn.net/weixin_45929038/article/details/122973583
As before, sample points from the curve as the observation data set:
int main() {
    /*-------- Initial parameter configuration --------*/
    // Actual curve parameters
    double ar = 1.0, br = 1.0, cr = 1.0;
    // Initial values of the estimated curve parameters
    double ae = 2.0, be = 1.0, ce = 5.0;
    // Number of sampled observation data points
    int N = 100;
    // Noise standard deviation
    double w_sigma = 1.0;
    // Random number generator
    cv::RNG rng;

    /*-------- Observation data generation --------*/
    vector<double> x_data, y_data;
    for(int i = 0; i < N; i++){
        double x = i / 100.0;
        x_data.push_back(x);
        y_data.push_back(exp(ar * x * x + br * x + cr) + rng.gaussian(w_sigma * w_sigma));
    }
    // Parameter block
    double abc[3] = {ae, be, ce};

    return 0;
}
Constructing cost function
The cost function is constructed as a C++ functor: a struct or class that overloads operator() so that it can be called like a function. A struct is used here, consisting mainly of a constructor and the overloaded operator():
For example, the cost function for solving curve parameters is constructed as follows:
/*-------- Build CostFunction --------*/
struct CURVE_FITTING_COST{
    // Constructor
    CURVE_FITTING_COST(double x, double y): _x(x), _y(y){}
    // Overload operator ()
    template<typename T>
    bool operator()(const T *const abc, T *residual) const {
        // y - exp(ax^2 + bx + c)
        residual[0] = T(_y) - ceres::exp(abc[0] * T(_x) * T(_x) + abc[1] * T(_x) + abc[2]);
        return true;
    }
    // Member variables that store the observed values
    const double _x, _y;
};
Constructor
The constructor is used to pass the known quantities (sensor observation data) into the cost function. If the problem has no known quantities, the constructor can be omitted.
In this example, 100 points on the curve are used as input.
Overload operator ()
operator() is a template method that returns a bool. Its arguments are the parameter block(s) and the residual block.
Input

Parameters are passed in as pointers to arrays of variables. For the problem above, the optimization parameters are $a, b, c$, three parameters in total, so the parameter block is passed as the single pointer const T *const abc, which points to an array of three values. A hypothetical variant with the parameters split into two blocks is sketched below.
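As a hypothetical sketch (not used in the example in this post), the same curve parameters could instead be split into two parameter blocks, ab and c; each block then arrives as its own pointer argument, and the block sizes are listed after the residual dimension:

// Hypothetical variant: parameters split into two blocks, ab (size 2) and c (size 1).
struct CURVE_FITTING_COST_SPLIT {
    CURVE_FITTING_COST_SPLIT(double x, double y) : _x(x), _y(y) {}
    template <typename T>
    bool operator()(const T *const ab, const T *const c, T *residual) const {
        // y - exp(a*x^2 + b*x + c), with a,b in one block and c in another
        residual[0] = T(_y) - ceres::exp(ab[0] * T(_x) * T(_x) + ab[1] * T(_x) + c[0]);
        return true;
    }
    const double _x, _y;
};

// Dimensions: 1 residual, then 2 and 1 for the two parameter blocks.
// new ceres::AutoDiffCostFunction<CURVE_FITTING_COST_SPLIT, 1, 2, 1>(
//     new CURVE_FITTING_COST_SPLIT(x, y));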
The above problem is an unconstrained least squares problem, so the residual block is simply the cost function $y_i-\exp(ax_i^2+bx_i+c)$.
Template type T
Ceres evaluates this functor both with plain double values and with its own automatic-differentiation type (ceres::Jet), which carries derivative information alongside the value. Writing operator() as a template lets the same code serve both cases, which is why the inputs and outputs use the template type T.
const keyword
During the solve we do not want operator() to modify the functor's state or the contents of the parameter block, so both the member function and the parameter pointer are marked const.
Constructing least squares problem
The least squares problem is constructed mainly through the ceres::Problem class:
// Instantiation
ceres::Problem problem;
After instantiating this class, the least squares problem is constructed by calling the corresponding member function.
Residual block
Use the following member function to add a residual block:
problem.AddResidualBlock(CostFunction* cost_function,
                         LossFunction* loss_function,
                         const vector<double *> parameter_blocks);
- cost_function: cost function
- loss_function: loss function. If no kernel/loss function is used (the identity loss, i.e. the unconstrained case above), pass nullptr
- parameter_blocks: parameter blocks, that is, the parameters of the final solution
The cost function is described as follows:
Derivative model of cost function
Ceres provides three ways of computing derivatives, trading off accuracy and efficiency:
- AutoDiffCostFunction: automatic differentiation. The derivatives are computed by the Ceres library itself. It is the most commonly used class.
- NumericDiffCostFunction: numeric differentiation. The functor is written with a plain double interface and Ceres differentiates it numerically. It is typically used when automatic differentiation cannot be applied directly; its accuracy and efficiency are lower than automatic differentiation (a sketch follows this list).
- Analytic derivatives: used when the derivatives have a closed analytic form. The user manages the residuals and the Jacobian matrix directly.
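For reference, a minimal sketch of the numeric-differentiation variant of the curve-fitting functor (hypothetical name, assuming <cmath> and the same observation data as above); note the plain double interface instead of the template parameter T:

// Numeric differentiation: the functor only needs to work with double.
struct CURVE_FITTING_COST_NUMERIC {
    CURVE_FITTING_COST_NUMERIC(double x, double y) : _x(x), _y(y) {}
    bool operator()(const double *const abc, double *residual) const {
        residual[0] = _y - std::exp(abc[0] * _x * _x + abc[1] * _x + abc[2]);
        return true;
    }
    const double _x, _y;
};

// Central differences; 1 residual, 3 parameters.
problem.AddResidualBlock(
    new ceres::NumericDiffCostFunction<CURVE_FITTING_COST_NUMERIC, ceres::CENTRAL, 1, 3>(
        new CURVE_FITTING_COST_NUMERIC(x_data[i], y_data[i])),
    nullptr, abc);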
Ceres officially recommends automatic differentiation:
ceres::AutoDiffCostFunction<CostFunctor, int residualDim, int paramDim>(CostFunctor* functor);
- CostFunctor: the cost function built in the first step
- residualDim: dimension of the residual block, i.e., the output dimension of the model
- paramDim: dimension of the parameter block, i.e., the input dimension of the model
- functor: a pointer to the constructed cost functor
For example, the problems in handwritten Gauss Newton method can be constructed as follows:
ceres::AutoDiffCostFunction<CURVE_FITTING_COST, 1, 3>(
    new CURVE_FITTING_COST(x_data[i], y_data[i]))
Here the residual dimension is 1, i.e., the per-point error $e_i$; the parameter dimension is 3, i.e., the dimension of the curve parameters $a, b, c$; the arguments passed in are the observations used to construct the cost functor for each point.
For this problem, the configuration of the construction residual term is as follows:
for(int i = 0; i < 100; i++){
    // Residual term configuration
    problem.AddResidualBlock(
        new ceres::AutoDiffCostFunction<CURVE_FITTING_COST, 1, 3>(
            new CURVE_FITTING_COST(x_data[i], y_data[i])),
        nullptr, abc);
}
Here no loss function is used (the unconstrained case), so nullptr is passed; the parameter block is the three-element array abc.
Other

See the reference blog: Ceres detailed explanation (I) Problem class
It covers Problem::AddParameterBlock() and other member functions.
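As a small sketch of those member functions (the bound values below are made up purely for illustration; they are not part of the curve-fitting example):

// Explicitly register abc as a 3-dimensional parameter block
// (AddResidualBlock would also register it implicitly).
problem.AddParameterBlock(abc, 3);
// Attach bounds l_j <= x_j <= u_j to the first parameter (a); illustrative values only.
problem.SetParameterLowerBound(abc, 0, 0.0);
problem.SetParameterUpperBound(abc, 0, 5.0);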
Solving the least squares problem
Use ceres::Solve to solve, and its function prototype is as follows:
void Solve(const Solver::Options& options, Problem* problem, Solver::Summary* summary)
- options: the solver configuration and solution options
- problem: the problem to be solved, i.e., the least squares problem built above
- summary: stores the optimization information produced during the solve
The configuration of the solver is described as follows:
Solver::Options
- minimizer_type: iterative solution method. Options are as follows:
- TRUST_REGION: trust region method, default value
- LINE_SEARCH: line search method
- This option is usually left at its default value
- trust_region_strategy_type: the trust region strategy. Options are as follows (the defaults can also be set explicitly, as shown in the sketch after this list):
- LEVENBERG_MARQUARDT: Levenberg-Marquardt method, default value
- DOGLEG: Powell's dogleg method
- linear_solver_type: method of solving linear equations
- DENSE_QR: QR decomposition, default value, used for small-scale least squares problems
- DENSE_NORMAL_CHOLESKY and SPARSE_NORMAL_CHOLESKY: Cholesky decomposition, used for large-scale sparse nonlinear least squares problems
- CGNR: conjugate gradient method for solving sparse equations
- DENSE_SCHUR and SPARSE_SCHUR: Schur decomposition, used to solve BA (bundle adjustment) problems
- ITERATIVE_SCHUR: conjugate gradient on the Schur complement, used to solve BA problems
- num_threads: number of threads used for solving
- minimizer_progress_to_stdout: whether to output the optimization information to the terminal
- bool type, false by default. If set to true, the output differs according to the iteration method:
- Trust region method
- cost: the value of the current objective function
- cost_change: objective function change caused by current parameter change
- |gradient|: the norm of the current gradient
- |step|: the norm of the parameter change
- tr_ratio: the ratio of the actual objective function change to the change predicted by the trust region model
- tr_radius: trust region radius
- ls_iter: number of iterations of the linear solver
- iter_time: current iteration time
- total_time: total optimization time
- Line search method
- f: the value of the current objective function
- d: the objective function change caused by the current parameter change
- g: the norm of the current gradient
- h: the norm of the parameter change
- s: the optimal step size found by the line search
- it: current iteration time
- tt: total optimization time
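As referenced above, a minimal sketch that just sets the defaults explicitly (not required for the example below):

ceres::Solver::Options options;
// These are the default values, written out explicitly.
options.minimizer_type = ceres::TRUST_REGION;
options.trust_region_strategy_type = ceres::LEVENBERG_MARQUARDT;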
In the handwritten Gauss-Newton problem, the following configuration is used:
// Options
ceres::Solver::Options options;
// Cholesky decomposition
options.linear_solver_type = ceres::DENSE_NORMAL_CHOLESKY;
// Number of threads
options.num_threads = 4;
// Output optimization information
options.minimizer_progress_to_stdout = true;
Solver::Summary
Solver::Summary stores information about the solver and the solve. Its common member functions are listed below, with a short usage sketch after the list:
- BriefReport(): outputs a simple one-line summary;
- FullReport(): outputs a complete multi-line summary.
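For example (BriefReport() is what the code below uses; swapping in FullReport() is a one-line change):

cout << summary.BriefReport() << endl;   // one-line summary
cout << summary.FullReport() << endl;    // full multi-line report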
Ceres Library: handwritten Gauss-Newton method
Source code
The code is as follows:
#include "main.h" #include <chrono> /*-------- Build CostFunction--------*/ struct CURVE_FITTING_COST{ // Constructor CURVE_FITTING_COST(double x, double y): _x(x), _y(y){} // Overload operator () template<typename T> bool operator()(const T *const abc, T *residual) const { // y - exp(ax^2 + bx + c) residual[0] = T(_y) - ceres::exp(abc[0] * T(_x) * T(_x) + abc[1] * T(_x) + abc[2]); return true; } // Member variables that store values const double _x, _y; }; int main() { /*-------- Initial parameter configuration--------*/ // Actual curve parameters double ar = 1.0, br = 1.0, cr = 1.0; // Initial value of estimated curve parameters double ae = 2.0, be = 1.0, ce = 5.0; // Number of sampling observation data points int N = 100; // Noise standard deviation and its reciprocal double w_sigma = 1.0; // Random number generator cv::RNG rng; /*-------- Observation data generation--------*/ vector<double> x_data, y_data; for(int i = 0; i < N; i++){ double x = i / 100.0; x_data.push_back(x); y_data.push_back(exp(ar * x * x +br * x + cr) + rng.gaussian(w_sigma * w_sigma)); } // Parameter block double abc[3] = {ae, be, ce}; /*-------- Constructing least squares problem--------*/ // instantiation ceres::Problem problem; for(int i = 0; i < N; i++){ // Residual term configuration problem.AddResidualBlock(new ceres::AutoDiffCostFunction<CURVE_FITTING_COST, 1, 3>(new CURVE_FITTING_COST(x_data[i], y_data[i])), nullptr, abc); } /*-------- Solving the least squares problem--------*/ // Options ceres::Solver::Options options; // cholesky decomposition options.linear_solver_type = ceres::DENSE_NORMAL_CHOLESKY; // Number of threads options.num_threads = 4; // Output optimization information options.minimizer_progress_to_stdout = true; // summary ceres::Solver::Summary summary; // Solve start timing t1 chrono::steady_clock::time_point t1 = chrono::steady_clock::now(); // solve ceres::Solve(options, &problem, &summary); // Solution end timing t2 chrono::steady_clock::time_point t2 = chrono::steady_clock::now(); // Total solution time chrono::duration<double> time_used = chrono::duration_cast<chrono::duration<double>>(t2 - t1); cout << "solve time cost = " << time_used.count() << " seconds. " << endl; /*-------- Result output--------*/ cout << summary.BriefReport() << endl; cout << "estimated a,b,c = "; for(auto a:abc) cout << a << " "; cout << endl; return 0; }
Output
The output contents are as follows:
iter      cost      cost_change  |gradient|   |step|    tr_ratio  tr_radius  ls_iter  iter_time  total_time
   0  4.631766e+07    0.00e+00    9.35e+07   0.00e+00   0.00e+00   1.00e+04        0   1.30e-03    1.76e-03
   1  6.190334e+06    4.01e+07    1.27e+07   9.68e-01   8.66e-01   1.65e+04        1   1.29e-03    3.80e-03
   2  8.093942e+05    5.38e+06    1.73e+06   9.43e-01   8.69e-01   2.76e+04        1   1.03e-03    4.89e-03
   3  9.975023e+04    7.10e+05    2.38e+05   8.72e-01   8.77e-01   4.83e+04        1   1.19e-03    6.12e-03
   4  1.050054e+04    8.92e+04    3.31e+04   7.14e-01   8.95e-01   9.53e+04        1   1.34e-03    7.51e-03
   5  7.599604e+02    9.74e+03    4.54e+03   5.14e-01   9.32e-01   2.69e+05        1   1.13e-03    8.70e-03
   6  6.636318e+01    6.94e+02    4.62e+02   5.17e-01   9.78e-01   8.07e+05        1   1.20e-03    9.95e-03
   7  5.102078e+01    1.53e+01    1.40e+01   2.57e-01   9.98e-01   2.42e+06        1   1.13e-03    1.11e-02
   8  5.100209e+01    1.87e-02    1.87e-02   1.44e-02   9.99e-01   7.26e+06        1   9.99e-04    1.22e-02
solve time cost = 0.0123565 seconds. 
Ceres Solver Report: Iterations: 9, Initial cost: 4.631766e+07, Final cost: 5.100209e+01, Termination: CONVERGENCE
estimated a,b,c = 0.877573 1.21245 0.931249