Teach you to draw in Excel in Python

Posted by junebug on Thu, 23 Dec 2021 08:21:43 +0100

I've seen many headlines before, saying that someone in which country has insisted on self-study and drawing with excel for many years, and the effect is very amazing. I admire their patience. But as a programmer, I naturally have to challenge myself. This demand can be completed in ten minutes!

Basic ideas

The basic idea to realize this requirement is to read the color value of each pixel of the picture, and then fill each cell in excel with color. Therefore, PIL and openpyxl libraries are mainly used.

PIL use

PIL is a very common library for image processing in Python, and its function is also very powerful. Only a small part of the functions in pil are needed here.

from PIL import Image
img = Image.open(img_path) 
width, height = img.size 
r, g, b = img.getpixel((w - 1, h - 1)) 

Image.open() is a function to open a picture in PIL. It supports multiple picture types img_path is a picture path, which can be a relative path or an absolute path img Size is the size attribute of the image, including the width and height img GetPixel () is a function to obtain the color value of the picture. You need to pass in a tuple or list, and the value is the pixel coordinate xy

openpyxl using

openpyxl is almost the most fully functional library for operating excel files in Python, and only a small part of its functions are needed here.

import openpyxl
from openpyxl.styles import fills

workbook = openpyxl.Workbook()
worksheet = workbook.active
cell.fill = fills.PatternFill(fill_type="solid", fgColor=hex_rgb)
workbook.save(out_file)

openpyxl.Workbook() creates a new excel file workbook Active activates a worksheet cell fill = fills. Patternfill (fill_type="solid", fgColor=hex_rgb) fills a cell, fill_type="solid" is the fill type, fgColor=hex_rgb is the fill color workbook Save() to save a file, you need to pass in the file name to save

Write a piece of code

The core that needs to be used to write this drawing requirement is the usage of PIL and openpyxl described above. However, when actually writing, there will be some other problems, such as:

1. The color value obtained by GetPixel () is rgb decimal, but fills The color value received by the fgColor ` parameter in patternfill is a hexadecimal value

This problem is actually decimal to hexadecimal, which is easy to solve

def int_to_16(num):
    num1 = hex(num).replace('0x', '')
    num2 = num1 if len(num1) > 1 else '0' + num1 # When there is only one digit, fill zero in front
    return num2

2.excel cells are rectangular by default. Changing to square will not deform the picture

if h == 1:
  _w = cell.column
  _h = cell.col_idx
  # Adjust column width
  worksheet.column_dimensions[_w].width = 1

# Adjust row height
worksheet.row_dimensions[h].height = 6
 There's a double for Loop, the outer layer is`width`,The inner layer is`height`,Is a column by column fill color, so judge`if h == 1`,Avoid adjusting column widths multiple times.

3.excel supports a limited number of styles

This problem is quite serious. If you directly operate the high-definition large image, the final output excel file may prompt us that there is a problem with the file and needs to be repaired automatically when it is opened.

But after it is repaired, you will find that all the filled colors have disappeared!


Error message

At first, I thought it was because too many rows and columns were used.

After querying the data, it is found that the number of large rows supported by version 13 excel is 1048576 and the maximum number of columns is 16384. The number of cells we use is far from the limit.

The cause of the problem was found only after various tests such as changing pictures, changing excel version and modifying codes.

The original reason is that the original form of excel consists of multiple xml files, and the filled colors are stored in a style In the xml file, an error will be reported when the file is too large.

So in order to solve this problem, there are two solutions. The first is to reduce the picture and the second is to reduce the picture color. When reducing the picture, it has the function of reducing the picture color. The method of reducing the picture color can adopt grayscale, binarization and other methods.

Generally speaking, it is necessary to control the number of colors * the number of cells < the threshold (about 3300w).

MAX_WIDTH = 300
MAX_HEIGHT = 300
def resize(img):
    w, h = img.size
    if w > MAX_WIDTH:
        h = MAX_WIDTH / w * h
        w = MAX_WIDTH

    if h > MAX_HEIGHT:
        w = MAX_HEIGHT / h * w
        h = MAX_HEIGHT
    return img.resize((int(w), int(h)), Image.ANTIALIAS)

Final effect

Heaven pays off those who have a heart. Open the final output excel and you can see the effect!

Therefore, all problems that can be solved with Python will eventually be solved with Python.

All codes

# draw_excel.py

from PIL import Image
import openpyxl
from openpyxl.styles import fills
import os

MAX_WIDTH = 300
MAX_HEIGHT = 300

def resize(img):
    w, h = img.size
    if w > MAX_WIDTH:
        h = MAX_WIDTH / w * h
        w = MAX_WIDTH

    if h > MAX_HEIGHT:
        w = MAX_HEIGHT / h * w
        h = MAX_HEIGHT
    return img.resize((int(w), int(h)), Image.ANTIALIAS)


def int_to_16(num):
    num1 = hex(num).replace('0x', '')
    num2 = num1 if len(num1) > 1 else '0' + num1
    return num2


def draw_jpg(img_path):

    img_pic = resize(Image.open(img_path))
    img_name = os.path.basename(img_path)
    out_file = './result/' + img_name.split('.')[0] + '.xlsx'
    if os.path.exists(out_file):
        os.remove(out_file)

    workbook = openpyxl.Workbook()
    worksheet = workbook.active

    width, height = img_pic.size

    for w in range(1, width + 1):

        for h in range(1, height + 1):
            if img_pic.mode == 'RGB':
                r, g, b = img_pic.getpixel((w - 1, h - 1))
            elif img_pic.mode == 'RGBA':
                r, g, b, a = img_pic.getpixel((w - 1, h - 1))

            hex_rgb = int_to_16(r) + int_to_16(g) + int_to_16(b)

            cell = worksheet.cell(column=w, row=h)

            if h == 1:
                _w = cell.column
                _h = cell.col_idx
                # Adjust column width
                worksheet.column_dimensions[_w].width = 1
            # Adjust row height
            worksheet.row_dimensions[h].height = 6

            cell.fill = fills.PatternFill(fill_type="solid", fgColor=hex_rgb)

        print('write in:', w, '  |  all:', width + 1)
    print('saving...')
    workbook.save(out_file)
    print('success!')

if __name__ == '__main__':
    draw_jpg('mona-lisa.jpg')

Attach=

When it comes to the number of colors * number of cells < threshold (about 2564), some people may wonder how this 2564 came from.

Of course, I got this from my test.

Since there are two variables: the number of colors and the number of cells, it is natural to have more than two test methods. A number of observation colors and a number of observation cells.

But I only did a test of the number of colors here. (the biggest reason is that it takes too long to generate tens of thousands of rows * tens of thousands of columns of excel... Lazy...)

    count = 0
    MAX_WIDTH = 255
    for w in range(1, MAX_WIDTH + 1):
        for h in range(1, MAX_WIDTH + 1):
            cell = worksheet.cell(column=w, row=h)
            if h == 1:
                _w = cell.column
                _h = cell.col_idx
                # Adjust column width
                worksheet.column_dimensions[_w].width = 1
            # Adjust row height
            worksheet.row_dimensions[h].height = 6

            if count < 255 ** 3:
                back = int_to_16(num=count)
                back = '0' * (6 - len(back)) + back
            else:
                back = ''.join([int_to_16(random.randint(0, 255)) for _ in range(3)])
            cell.fill = fills.PatternFill(fill_type="solid", fgColor=back)
            count += 1

count is a variable for recording colors to ensure that each color is not repeated. However, at present, there are only 256 ^ 3 colors represented by RGB on the computer. Adjust max_ The value of width is used to test the threshold of excel

Finally, the generated test excel is as follows:

Look.!??

last

Due to limited energy, ~ lazy ~, limited ability ~ dishes ~, there is no single color test, and there may be other methods without the limit of this threshold. I like this article and pay attention to it. Thank you for reading.

Topics: Python