Opencv learning - basic image operation

Posted by psychowolvesbane on Mon, 17 Jan 2022 08:50:07 +0100

Data reading - pictures

read

The smallest component of each image is pixels. It is composed of many pixels. The range of each pixel is 0-255 (0 is black, the brightness is the darkest, 255 is white, and the brightness is the brightest)
RGB three color space. Generally, pictures are three channels. It can be regarded as represented by a three-layer matrix. The corresponding different pixels will have a corresponding value in the three color channels

import cv2 as cv    #The format read by opencv is BGR
import numpy as np
import matplotlib.pyplot as plt
#You don't need to call show() to generate directly after the drawing is completed
%matplotlib inline 
img = cv.imread('1.jpg')

The reading parameter is: picture name, which will be searched in the current directory
After reading, the result is: a three channel numpy array form

Look at the shape:

img.shape

This is the corresponding result:

(400, 400, 3)

be careful:
1. The form is the form of [h,w,c]
2. The reading order of OpenCV and PIL for three channels is different. The format read by opencv is BG, and RPIL(python image processing library) is opened using RGB

Exhibition

#Image display, you can also create multiple windows
cv.imshow('image',img)
#Waiting / disappearing time, in milliseconds, 0 indicates any key termination, not automatic termination
cv.waitKey(0)
cv.destroyAllWindows()

Here, there are two points to remember:
1.cv.waitKey(0)
Here is the display / disappearance time of the picture. If the parameter is 0, it means that you press any key to exit automatically. If it is not 0, the parameter is the time set by yourself, in milliseconds
2.cv.destroyAllWindows()
Official answer:
You can call destroyWindow() or destroyAllWindows() to close the window and deallocate any associated memory usage. For a simple program, you don't actually need to call these functions, because the operating system will automatically close all resources and windows of the application when you exit
Generally speaking, that is:
If there is no previous operation to free the memory, destroyallWIndows will free the memory occupied by that variable

We aggregate the contents of the whole picture into a function:

def cv_show(name,img):
    cv.imshow(name,img)
    cv.waitKey(0)
    cv.destroyAllWindows()

In this way, you can directly use this function when displaying later pictures

Grayscale

img2 = cv.imread('1.jpg',cv.IMREAD_GRAYSCALE)
cv2 IMREAD_COLOR color image 
cv2 IMREAD_GRAYSCALE Gray image

We can show the pictures to observe the differences between the pictures:
IMG is the initial image. It is a three channel color image, while img2 is a grayscale image. It has only one channel and only needs brightness to express.


View img2's shape

img2.shape

Results obtained:

(400, 400)#Only [h,w], no c

preservation

cv.imwrite('1_gray.jpg',img2)

Look at the picture type

cv.imwrite('1_gray.jpg',img2)

This is the result:

numpy.ndarray

Look at the number of pixels in the picture

img.size

This is the result:

480000

Look at the range of each value of the matrix corresponding to the picture

img.dtype

This is the result:

dtype('uint8')

Data reading – video

vc = cv.VideoCapture('1.mp4')
#Check whether it is opened correctly
if vc.isOpened():
    open,frame = vc.read()
else:
    open = False
while open:
    #Parameters: ret: whether the current frame is read correctly, frame: the three channel matrix corresponding to each frame image
    ret,frame = vc.read()
    if frame is None:
        break
    if ret == True:
        gray = cv.cvtColor(frame,cv.COLOR_BGR2GRAY)
        cv.imshow('result',gray)
        #You can set how long you want him to wait or exit automatically. You can set the key to close. 27 corresponds to the exit key
        if cv.waitKey(10) % 0xFF == 27:
            break
vc.release()
cv.destroyAllWindows()

It is equivalent to reading video frame by frame, which is equivalent to processing pictures

ROI (area of interest)

Intercepting part of image data

img = cv.imread('1.jpg')
tree = img[0:200,0:200]
cv_show('tree',tree)

In this way, we get the height and width we want, and then intercept all three channels

Image color channel extraction

b,g,r = cv.split(img)

See what b is:

array([[190, 192, 194, ..., 175, 165, 159],
       [190, 191, 193, ..., 183, 158, 180],
       [192, 192, 193, ..., 172, 159, 194],
       ...,
       [ 87,  62, 146, ...,  20,  18,  20],
       [ 68,  50, 173, ...,  22,  20,  23],
       [ 72,  54, 179, ...,  21,  22,  25]], dtype=uint8)

The picture above is the representation of single matrix:
To process the channels separately and then combine them together:

img=cv.merge((b,g,r))

Keep only the writing of R
Because the reading method of opencv is the reading method of BGR, if you want to retain a certain layer of channels, just modify the last number of channels.

#Keep only R
cur_img = img.copy()
cur_img[:,:,0]=0
cur_img[:,:,1]=0
cv_show('R',cur_img)
#Keep only G
cur_img = img.copy()
cur_img[:,:,0]=0
cur_img[:,:,2]=0
cv_show('R',cur_img)
#Keep only B
cur_img = img.copy()
cur_img[:,:,1]=0
cur_img[:,:,2]=0
cv_show('R',cur_img)

Boundary fill

When reading data, we usually take the form of reading convolution kernel, that is, [3 * 3] or [5 * 5], that is, we will read the data around it as a reference. Therefore, when we read the edge data, we need to fill the boundary. Yes, we can read all valid information to prevent loss.

top_size,bottom_size,left_size,right_size = (50,50,50,50)
#The difference is, borderType: what is the filling method
replicateto = cv.copyMakeBorder(img,top_size,bottom_size,left_size,right_size,borderType=cv.BORDER_REPLICATE)
reflect = cv.copyMakeBorder(img,top_size,bottom_size,left_size,right_size,borderType=cv.BORDER_REFLECT)
reflect101 = cv.copyMakeBorder(img,top_size,bottom_size,left_size,right_size,borderType=cv.BORDER_REFLECT_101)
wrap = cv.copyMakeBorder(img,top_size,bottom_size,left_size,right_size,borderType=cv.BORDER_WRAP)
#value is the parameter. Here, set the parameter to 0, that is, black. You can also set other colors
constant = cv.copyMakeBorder(img,top_size,bottom_size,left_size,right_size,borderType=cv.BORDER_CONSTANT,value = 0)

1.top_size,bottom_size,left_size,right_size means filling the top, bottom, left and right of the picture respectively
2.borderType

BORDER_REPLICATE: Copy method, that is, copy the most edge pixels
BORDER_REFLECT: Reflection method: copy the pixels in the image of interest on both sides, for example: fedcba|abcdefgh|hgfedcb
BORDER_REFLECT_101: Reflection method: that is, it is symmetrical with the edge pixel as the axis, gfedcb|abcdefgh|gfedcba
BORDER_WRAP: Outer packaging method, cdefgh|abcdefgh|abcdefg
BORDER_CONSTANT: Constant value filling( value That's the parameter. Here, set the parameter to 0, that is, black, or other colors)

The following is the actual operation:

import matplotlib.pyplot as plt
plt.subplot(231),plt.imshow(img,'gray'),plt.title('ORIGINAL')
plt.subplot(232),plt.imshow(replicateto,'gray'),plt.title('REPLICATE')
plt.subplot(233),plt.imshow(reflect,'gray'),plt.title('REFLECT')
plt.subplot(234),plt.imshow(reflect101,'gray'),plt.title('REFLECT_101')
plt.subplot(235),plt.imshow(wrap,'gray'),plt.title('WRAP')
plt.subplot(236),plt.imshow(constant,'gray'),plt.title('CONSTANT')

plt.show()

This is the display result of the code:

numerical calculation

img_tree = cv.imread('1.jpg')
img_flo = cv.imread('2.jpg')
#Addition and subtraction will be performed on each pixel
img_tree2 = img_tree+10
img_tree[:5,:,0]
img_tree[:5,:,0]

Here, because a picture matrix is relatively large, the system can go to the first 5 lines and observe the differences

img_tree[:5,:,0]
array([[190, 192, 194, ..., 175, 165, 159],
       [190, 191, 193, ..., 183, 158, 180],
       [192, 192, 193, ..., 172, 159, 194],
       [194, 194, 193, ..., 168, 180, 176],
       [195, 194, 194, ..., 153, 165, 151]], dtype=uint8)

Add and subtract on each pixel of the original picture

img_tree[:5,:,0]
array([[200, 202, 204, ..., 185, 175, 169],
       [200, 201, 203, ..., 193, 168, 190],
       [202, 202, 203, ..., 182, 169, 204],
       [204, 204, 203, ..., 178, 190, 186],
       [205, 204, 204, ..., 163, 175, 161]], dtype=uint8)

When two pictures are added or subtracted directly:
Note: two pictures must have the same shape to add or subtract

(img_tree+img_tree2)[:5,:,0]
#Equivalent to the sum of their corresponding pixels and% 256
array([[134, 138, 142, ..., 104,  84,  72],
       [134, 136, 140, ..., 120,  70, 114],
       [138, 138, 140, ...,  98,  72, 142],
       [142, 142, 140, ...,  90, 114, 106],
       [144, 142, 142, ...,  60,  84,  56]], dtype=uint8)

When using the add function:

cv.add(img_tree,img_tree2)[:5,:,0]
#When the sum of the corresponding pixels is greater than 255, that is, it is out of bounds, it is 255
array([[255, 255, 255, ..., 255, 255, 255],
       [255, 255, 255, ..., 255, 255, 255],
       [255, 255, 255, ..., 255, 255, 255],
       [255, 255, 255, ..., 255, 255, 255],
       [255, 255, 255, ..., 255, 255, 255]], dtype=uint8)

Image fusion

img_tree.shape
(400, 400, 3)
img_flo.shape
(1080, 1920, 3)

It can be found that their shapes are different, so we need to make the resize method to make their shapes the same

#The addition and subtraction must be the same as the shape. The parameters of resize, w,h
img_flo=cv.resize(img_flo,(400,400))
img_flo.shape

Then they can be fused, and they conform to a formula
R = ax + by + b
X and Y here represent picture 1 and picture 2
a represents the weight of picture 1, and b represents the weight of picture 2
b represents the offset

res = cv.addWeighted(img_tree,0.4,img_flo,0.6,0)

Zoom transformation

#For scaling transformation, the previous value is formulated as (0,0). You can not say the specific value, but you can say the multiple relationship between length and width. The following meaning is, x*3,y*1, that is, the length becomes 3 times and the height remains unchanged
res = cv.resize(img,(0,0),fx = 3,fy = 1)
plt.imshow(res)
#The following meaning is: change the first point into (0,0), then the length remains unchanged and the height becomes three times the original
res = cv.resize(img,(0,0),fx = 1,fy = 3)
plt.imshow(res)

We can also choose equal scale magnification:

res = cv.resize(img,(0,0),fx = 3,fy = 3)
plt.imshow(res)

Topics: OpenCV