2021SC@SDUSC
visual_feature: Optical flow tracking algorithm
In the previous blog we briefly analyzed the callback function img_callback, but the functions it calls were not analyzed, in particular the optical flow tracking algorithm mentioned before.
This blog mainly analyzes the readImage function introduced in the previous blog.
0. Image processing and preparation
a. Image processing
If the image is too bright or too dark, histogram equalization is required; the flag EQUALIZE indicates whether this is the case.
Histogram equalization is a method to enhance image contrast. Its main idea is to remap the pixel intensities so that the histogram distribution becomes approximately uniform, thereby enhancing the contrast of the image. Although histogram equalization is only a basic method in digital image processing, it is very effective and remains a classic algorithm.
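As a quick illustration (not taken from LVI-SAM; the file name is only a placeholder), global equalization and CLAHE can be compared on a grayscale image like this:

#include <opencv2/opencv.hpp>

int main()
{
    // Load a grayscale image (placeholder path)
    cv::Mat gray = cv::imread("example.png", cv::IMREAD_GRAYSCALE);

    // Global histogram equalization: one intensity mapping for the whole image
    cv::Mat global_eq;
    cv::equalizeHist(gray, global_eq);

    // CLAHE: equalization per 8x8 tile with the contrast gain clipped at 3.0,
    // the same parameters LVI-SAM passes in readImage()
    cv::Ptr<cv::CLAHE> clahe = cv::createCLAHE(3.0, cv::Size(8, 8));
    cv::Mat clahe_eq;
    clahe->apply(gray, clahe_eq);

    cv::imwrite("global_eq.png", global_eq);
    cv::imwrite("clahe_eq.png", clahe_eq);
    return 0;
}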
In LVI-SAM, the algorithm used is Contrast Limited Adaptive Histogram Equalization (CLAHE). After processing, the light and dark contrast of the image is noticeably stronger: bright areas become brighter and dark areas darker, widening the gap between them. The image therefore carries more usable information, which helps the subsequent processing.
// Initialization of some variables needed
cv::Mat img;
TicToc t_r;
cur_time = _cur_time;

// If EQUALIZE is 1, the image is too bright or too dark and histogram equalization is required
if (EQUALIZE)
{
    // Use contrast limited adaptive histogram equalization (CLAHE)
    cv::Ptr<cv::CLAHE> clahe = cv::createCLAHE(3.0, cv::Size(8, 8));
    TicToc t_c;
    clahe->apply(_img, img);
    ROS_DEBUG("CLAHE costs: %fms", t_c.toc());
}
else
    img = _img;
b. Data preparation
Before formally starting the algorithm, it is necessary to clarify some variables and judge whether it is the first frame.
// If forw_img, the image data of the current frame, is empty, this is the first frame read in
if (forw_img.empty())
{
    // Assign the newly read image to the current frame forw_img, and also to prev_img and cur_img
    prev_img = cur_img = forw_img = img;
}
else
{
    // Not the first frame: an image has been read in before, so only the current frame forw_img needs updating
    forw_img = img;
}

// forw_pts may still hold the feature points of the previous frame, so clear it
forw_pts.clear();
1. LK pyramid optical flow tracking
In this part of the code, cv::calcOpticalFlowPyrLK() is called to run LK pyramid optical flow tracking from the previous frame's feature points cur_pts, producing forw_pts. The vector status marks the tracking state of each feature point from cur_img to forw_img; points that cannot be tracked are marked 0 and then eliminated. Finally, the tracking count of the surviving points is updated.
Introduction to LK pyramid optical flow tracking: LK (Lucas-Kanade) optical flow assumes that the brightness of a point stays constant between two frames and that its motion is small, and solves for the displacement of a small window around each feature point. The pyramid variant runs this estimation coarse-to-fine over an image pyramid, so larger displacements can still be tracked.
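The core constraint behind LK (a standard result, not specific to this code) comes from linearizing the brightness constancy assumption $I(x+u,\,y+v,\,t+1) \approx I(x,\,y,\,t)$:

$$ I_x u + I_y v + I_t = 0 $$

where $(u, v)$ is the displacement of the point and $I_x, I_y, I_t$ are the image gradients. LK solves this in a least-squares sense over a small window around each feature point (21x21 pixels here), and the pyramid with 3 levels wraps the estimation coarse-to-fine.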
if (cur_pts.size() > 0)
{
    TicToc t_o;
    vector<uchar> status;
    vector<float> err;

    // Call cv::calcOpticalFlowPyrLK() to track the previous frame's feature points cur_pts
    // with LK pyramid optical flow, obtaining forw_pts.
    // status marks the tracking state of each feature point from cur_img to forw_img;
    // points that cannot be tracked are marked 0.
    cv::calcOpticalFlowPyrLK(cur_img, forw_img, cur_pts, forw_pts, status, err, cv::Size(21, 21), 3);

    // Mark points that fall outside the image border as 0
    for (int i = 0; i < int(forw_pts.size()); i++)
        if (status[i] && !inBorder(forw_pts[i]))
            status[i] = 0;

    // According to status, remove the points whose tracking failed.
    // They are removed not only from the current frame's forw_pts, but also from
    // prev_pts, cur_pts and cur_un_pts (the points in prev_pts and cur_pts correspond one to one),
    // as well as from ids (the feature point ids) and track_cnt (how many times each point has been tracked).
    reduceVector(prev_pts, status);
    reduceVector(cur_pts, status);
    reduceVector(forw_pts, status);
    reduceVector(ids, status);
    reduceVector(cur_un_pts, status);
    reduceVector(track_cnt, status);
    ROS_DEBUG("temporal optical flow costs: %fms", t_o.toc());
}

// For every point that was tracked successfully, increase its tracking count by 1.
// The larger the count, the longer the point has been tracked.
for (auto &n : track_cnt)
    n++;
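The code above relies on two small helpers, inBorder() and reduceVector(), which are defined elsewhere in feature_tracker.cpp and not shown in this post. In the VINS-Mono-style implementation this module is based on, they look roughly like the following sketch (the exact border size may differ):

// Check that a tracked point lies at least BORDER_SIZE pixels inside the image
bool inBorder(const cv::Point2f &pt)
{
    const int BORDER_SIZE = 1;
    int img_x = cvRound(pt.x);
    int img_y = cvRound(pt.y);
    return BORDER_SIZE <= img_x && img_x < COL - BORDER_SIZE &&
           BORDER_SIZE <= img_y && img_y < ROW - BORDER_SIZE;
}

// Compact a vector in place, keeping only the entries whose status flag is non-zero
void reduceVector(vector<cv::Point2f> &v, vector<uchar> status)
{
    int j = 0;
    for (int i = 0; i < int(v.size()); i++)
        if (status[i])
            v[j++] = v[i];
    v.resize(j);
}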
2. Publish feature points
If this frame's feature points are to be published (PUB_THIS_FRAME), outliers are rejected, a mask is built, and new feature points are extracted so that the total number stays at MAX_CNT.
if (PUB_THIS_FRAME)
{
    // Eliminate outliers with the fundamental matrix
    rejectWithF();

    ROS_DEBUG("set mask begins");
    TicToc t_m;
    // Build the mask so that adjacent feature points are at least MIN_DIST pixels apart
    setMask();
    ROS_DEBUG("set mask costs %fms", t_m.toc());

    ROS_DEBUG("detect feature begins");
    TicToc t_t;
    // Compute how many new feature points need to be extracted
    int n_max_cnt = MAX_CNT - static_cast<int>(forw_pts.size());
    if (n_max_cnt > 0)
    {
        if (mask.empty())
            cout << "mask is empty " << endl;
        if (mask.type() != CV_8UC1)
            cout << "mask type wrong " << endl;
        if (mask.size() != forw_img.size())
            cout << "wrong size " << endl;
        cv::goodFeaturesToTrack(forw_img, n_pts, MAX_CNT - forw_pts.size(), 0.01, MIN_DIST, mask);
    }
    else
        n_pts.clear();
    ROS_DEBUG("detect feature costs: %fms", t_t.toc());

    ROS_DEBUG("add feature begins");
    TicToc t_a;
    // Add the newly detected feature points n_pts to forw_pts;
    // their id is initialized to -1 and their track_cnt to 1
    addPoints();
    ROS_DEBUG("selectFeature costs: %fms", t_a.toc());
}
Several helper functions defined in this class are used here; they are analyzed one by one below.
a. rejectWithF()
Eliminate outliers with the fundamental matrix. The matched points are first lifted to the normalized camera plane and re-projected with a fixed virtual focal length (FOCAL_LENGTH), so that the RANSAC threshold F_THRESHOLD is expressed in consistent pixel units regardless of the actual camera intrinsics; cv::findFundamentalMat then marks the outliers in status and they are removed.
void FeatureTracker::rejectWithF()
{
    if (forw_pts.size() >= 8)
    {
        ROS_DEBUG("FM ransac begins");
        TicToc t_f;
        vector<cv::Point2f> un_cur_pts(cur_pts.size()), un_forw_pts(forw_pts.size());
        for (unsigned int i = 0; i < cur_pts.size(); i++)
        {
            Eigen::Vector3d tmp_p;
            // Lift the 2D pixel coordinates to 3D according to the camera model
            m_camera->liftProjective(Eigen::Vector2d(cur_pts[i].x, cur_pts[i].y), tmp_p);
            // Re-project onto a virtual pinhole image with focal length FOCAL_LENGTH
            tmp_p.x() = FOCAL_LENGTH * tmp_p.x() / tmp_p.z() + COL / 2.0;
            tmp_p.y() = FOCAL_LENGTH * tmp_p.y() / tmp_p.z() + ROW / 2.0;
            un_cur_pts[i] = cv::Point2f(tmp_p.x(), tmp_p.y());

            m_camera->liftProjective(Eigen::Vector2d(forw_pts[i].x, forw_pts[i].y), tmp_p);
            tmp_p.x() = FOCAL_LENGTH * tmp_p.x() / tmp_p.z() + COL / 2.0;
            tmp_p.y() = FOCAL_LENGTH * tmp_p.y() / tmp_p.z() + ROW / 2.0;
            un_forw_pts[i] = cv::Point2f(tmp_p.x(), tmp_p.y());
        }

        vector<uchar> status;
        // Call cv::findFundamentalMat on un_cur_pts and un_forw_pts to estimate the F matrix with RANSAC
        cv::findFundamentalMat(un_cur_pts, un_forw_pts, cv::FM_RANSAC, F_THRESHOLD, 0.99, status);
        int size_a = cur_pts.size();
        reduceVector(prev_pts, status);
        reduceVector(cur_pts, status);
        reduceVector(forw_pts, status);
        reduceVector(cur_un_pts, status);
        reduceVector(ids, status);
        reduceVector(track_cnt, status);
        ROS_DEBUG("FM ransac: %d -> %lu: %f", size_a, forw_pts.size(), 1.0 * forw_pts.size() / size_a);
        ROS_DEBUG("FM ransac costs: %fms", t_f.toc());
    }
}
b. setMask()
Sort the tracked feature points by the number of times they have been tracked and keep them in that order. The mask then performs a simple non-maximum suppression with radius MIN_DIST: around every kept point a circle is zeroed out, so nearby points are discarded. This removes dense clusters and keeps the feature points evenly distributed over the image.
void FeatureTracker::setMask()
{
    // For a fisheye lens, start from the precomputed fisheye mask
    if (FISHEYE)
        mask = fisheye_mask.clone();
    else
        mask = cv::Mat(ROW, COL, CV_8UC1, cv::Scalar(255));

    // prefer to keep features that are tracked for long time
    vector<pair<int, pair<cv::Point2f, int>>> cnt_pts_id;

    // Build (cnt, (pts, id)) tuples
    for (unsigned int i = 0; i < forw_pts.size(); i++)
        cnt_pts_id.push_back(make_pair(track_cnt[i], make_pair(forw_pts[i], ids[i])));

    // Sort the tracked feature points forw_pts by their tracking count cnt,
    // from large to small (lambda expression)
    sort(cnt_pts_id.begin(), cnt_pts_id.end(),
         [](const pair<int, pair<cv::Point2f, int>> &a, const pair<int, pair<cv::Point2f, int>> &b)
         {
             return a.first > b.first;
         });

    // Clear cnt, pts and id and refill them in the new order
    forw_pts.clear();
    ids.clear();
    track_cnt.clear();

    for (auto &it : cnt_pts_id)
    {
        if (mask.at<uchar>(it.second.first) == 255)
        {
            // If the mask value at the current feature point is 255, keep the point and
            // store its position, id and tracking count
            forw_pts.push_back(it.second.first);
            ids.push_back(it.second.second);
            track_cnt.push_back(it.first);

            // Zero out a circle of radius MIN_DIST around the current point in the mask;
            // points falling inside this area will not be selected afterwards,
            // so the tracked points do not cluster in one region
            cv::circle(mask, it.second.first, MIN_DIST, 0, -1);
        }
    }
}
c. addPoints()
Add a new tracking point.
void FeatureTracker::addPoints()
{
    for (auto &p : n_pts)
    {
        forw_pts.push_back(p);
        ids.push_back(-1);      // The id of a newly extracted feature point is initialized to -1
        track_cnt.push_back(1); // The tracking count of a newly extracted feature point is initialized to 1
    }
}
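A note on the -1 ids: this function does not assign globally unique ids itself. In the VINS-Mono-style tracker that LVI-SAM's visual_feature module is based on, the ids are filled in afterwards by a separate updateID() pass called from img_callback, roughly like the following sketch (shown only for context; it is not part of readImage()):

// Assign a new global id (n_id is a monotonically increasing counter in FeatureTracker)
// to every point whose id is still -1; returns false once i runs past the end of ids
bool FeatureTracker::updateID(unsigned int i)
{
    if (i < ids.size())
    {
        if (ids[i] == -1)
            ids[i] = n_id++;
        return true;
    }
    else
        return false;
}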
3. End
// When the next frame arrives, the data of the current frame becomes the data of the previous frame
prev_img = cur_img;
prev_pts = cur_pts;
prev_un_pts = cur_un_pts;

// The data of the current frame, forw_img and forw_pts, is assigned to cur_img and cur_pts
cur_img = forw_img;
cur_pts = forw_pts;

// Undistort the points according to the camera model, convert them to the
// normalized coordinate system and compute their velocity
undistortedPoints();
prev_time = cur_time;
a. undistortedPoints()
The image coordinates of the corner points are undistorted and projected onto the normalized plane, and the velocity of each corner point is calculated.
void FeatureTracker::undistortedPoints()
{
    cur_un_pts.clear();
    cur_un_pts_map.clear();
    //cv::undistortPoints(cur_pts, un_pts, K, cv::Mat());
    for (unsigned int i = 0; i < cur_pts.size(); i++)
    {
        Eigen::Vector2d a(cur_pts[i].x, cur_pts[i].y);
        Eigen::Vector3d b;
        // Lift the 2D pixel coordinates to 3D according to the camera model
        m_camera->liftProjective(a, b);
        // Project onto the normalized depth plane (z = 1)
        cur_un_pts.push_back(cv::Point2f(b.x() / b.z(), b.y() / b.z()));
        cur_un_pts_map.insert(make_pair(ids[i], cv::Point2f(b.x() / b.z(), b.y() / b.z())));
        //printf("cur pts id %d %f %f", ids[i], cur_un_pts[i].x, cur_un_pts[i].y);
    }

    // Calculate the velocity of each feature point into pts_velocity
    if (!prev_un_pts_map.empty())
    {
        double dt = cur_time - prev_time;
        pts_velocity.clear();
        for (unsigned int i = 0; i < cur_un_pts.size(); i++)
        {
            if (ids[i] != -1)
            {
                std::map<int, cv::Point2f>::iterator it;
                it = prev_un_pts_map.find(ids[i]);
                if (it != prev_un_pts_map.end())
                {
                    double v_x = (cur_un_pts[i].x - it->second.x) / dt;
                    double v_y = (cur_un_pts[i].y - it->second.y) / dt;
                    pts_velocity.push_back(cv::Point2f(v_x, v_y));
                }
                else
                    pts_velocity.push_back(cv::Point2f(0, 0));
            }
            else
            {
                pts_velocity.push_back(cv::Point2f(0, 0));
            }
        }
    }
    else
    {
        for (unsigned int i = 0; i < cur_pts.size(); i++)
        {
            pts_velocity.push_back(cv::Point2f(0, 0));
        }
    }
    prev_un_pts_map = cur_un_pts_map;
}
Appendix: Code
void FeatureTracker::readImage(const cv::Mat &_img, double _cur_time)
{
    cv::Mat img;
    TicToc t_r;
    cur_time = _cur_time;

    if (EQUALIZE)
    {
        cv::Ptr<cv::CLAHE> clahe = cv::createCLAHE(3.0, cv::Size(8, 8));
        TicToc t_c;
        clahe->apply(_img, img);
        ROS_DEBUG("CLAHE costs: %fms", t_c.toc());
    }
    else
        img = _img;

    if (forw_img.empty())
    {
        prev_img = cur_img = forw_img = img;
    }
    else
    {
        forw_img = img;
    }

    forw_pts.clear();

    if (cur_pts.size() > 0)
    {
        TicToc t_o;
        vector<uchar> status;
        vector<float> err;
        cv::calcOpticalFlowPyrLK(cur_img, forw_img, cur_pts, forw_pts, status, err, cv::Size(21, 21), 3);

        for (int i = 0; i < int(forw_pts.size()); i++)
            if (status[i] && !inBorder(forw_pts[i]))
                status[i] = 0;

        reduceVector(prev_pts, status);
        reduceVector(cur_pts, status);
        reduceVector(forw_pts, status);
        reduceVector(ids, status);
        reduceVector(cur_un_pts, status);
        reduceVector(track_cnt, status);
        ROS_DEBUG("temporal optical flow costs: %fms", t_o.toc());
    }

    for (auto &n : track_cnt)
        n++;

    if (PUB_THIS_FRAME)
    {
        rejectWithF();

        ROS_DEBUG("set mask begins");
        TicToc t_m;
        setMask();
        ROS_DEBUG("set mask costs %fms", t_m.toc());

        ROS_DEBUG("detect feature begins");
        TicToc t_t;
        int n_max_cnt = MAX_CNT - static_cast<int>(forw_pts.size());
        if (n_max_cnt > 0)
        {
            if (mask.empty())
                cout << "mask is empty " << endl;
            if (mask.type() != CV_8UC1)
                cout << "mask type wrong " << endl;
            if (mask.size() != forw_img.size())
                cout << "wrong size " << endl;
            cv::goodFeaturesToTrack(forw_img, n_pts, MAX_CNT - forw_pts.size(), 0.01, MIN_DIST, mask);
        }
        else
            n_pts.clear();
        ROS_DEBUG("detect feature costs: %fms", t_t.toc());

        ROS_DEBUG("add feature begins");
        TicToc t_a;
        addPoints();
        ROS_DEBUG("selectFeature costs: %fms", t_a.toc());
    }

    prev_img = cur_img;
    prev_pts = cur_pts;
    prev_un_pts = cur_un_pts;
    cur_img = forw_img;
    cur_pts = forw_pts;
    undistortedPoints();
    prev_time = cur_time;
}