MySQL data type

Posted by Dogrox on Sat, 05 Mar 2022 10:33:48 +0100

Reposted from "Explain MySQL data types in detail", with slight modifications.

Integer type

Data type   Bytes  Signed minimum        Signed maximum       Unsigned minimum  Unsigned maximum
TINYINT     1      -128                  127                  0                 255
SMALLINT    2      -32768                32767                0                 65535
MEDIUMINT   3      -8388608              8388607              0                 16777215
INT         4      -2147483648           2147483647           0                 4294967295
BIGINT      8      -9223372036854775808  9223372036854775807  0                 18446744073709551615

In practical development, we should choose the smallest data type that fits each column instead of defaulting to a large one. For instance:

  • If a field only ever holds the values 0 and 1, TINYINT is enough. Using BIGINT for it wastes 7 bytes per row
  • With 1,000,000 rows in the table, that wastes 7,000,000 bytes in total, about 6.7 MB. The more tables do this, the more is wasted
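
The ranges in the table and the waste estimate above both follow from the byte counts. Here is a minimal Python sketch of that arithmetic. The helper names `int_range` and `wasted_bytes` are made up for illustration, and the estimate counts only the raw column bytes, ignoring any per-row overhead of the storage engine.

```python
# Byte sizes of the MySQL integer types, taken from the table above.
SIZES = {"TINYINT": 1, "SMALLINT": 2, "MEDIUMINT": 3, "INT": 4, "BIGINT": 8}


def int_range(num_bytes, signed=True):
    """Return (min, max) of a two's-complement integer stored in num_bytes bytes."""
    bits = 8 * num_bytes
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


def wasted_bytes(rows, used_type, sufficient_type):
    """Raw bytes wasted when used_type is chosen where sufficient_type would do."""
    return rows * (SIZES[used_type] - SIZES[sufficient_type])


for name, size in SIZES.items():
    print(name, int_range(size), int_range(size, signed=False))

waste = wasted_bytes(1_000_000, "BIGINT", "TINYINT")
print(waste, "bytes, about", round(waste / 1024 / 1024, 1), "MB")
```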

Run the following SQL statement:

drop table if exists test_tinyint;
create table test_tinyint (
    num tinyint
);

insert into test_tinyint values(-100);
insert into test_tinyint values(255);

The second insert (line 7 of the script) fails with the error Out of range value for column 'num' at row 1. The inserted value is outside the column's range, which shows that integers in MySQL are signed by default.

If we change the num definition in line 3 to num tinyint unsigned, the insert in line 7 succeeds, but the insert of -100 in line 6 now fails, because unsigned integers cannot represent negative numbers.

The int(N) form

Sometimes we come across integer definitions written as int(11). For int(N), we only need to remember two points:

  • No matter what N is, int always takes up 4 bytes
  • N is only the display width. Combined with zerofill, a value with fewer than N digits is left-padded with zeros up to N digits; a value with more than N digits is displayed as is. (The display width and zerofill are deprecated as of MySQL 8.0.17.)

drop table if exists test_int_width;
create table test_int_width (
    a int(5),
    b int(5) unsigned,
    c int(5) unsigned zerofill,
    d int(8) unsigned zerofill
);

insert into test_int_width values(1, 1, 1, 1111111111);

select * from test_int_width;
+------+------+-------+------------+
| a    | b    | c     | d          |
+------+------+-------+------------+
|    1 |    1 | 00001 | 1111111111 |
+------+------+-------+------------+
1 row in set (0.00 sec)
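
The padding seen in columns c and d can be imitated outside the database. The sketch below is only an illustration of the display rule, not of how MySQL implements it; the function name `zerofill` is invented for this example.

```python
def zerofill(value, width):
    """Left-pad a non-negative integer with zeros up to the display width.

    Values that already have more digits than `width` are returned unchanged,
    which mirrors what column d shows in the query result above.
    """
    return str(value).zfill(width)


print(zerofill(1, 5))           # like column c: int(5) unsigned zerofill
print(zerofill(1111111111, 8))  # like column d: int(8) unsigned zerofill
```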

Floating-point type

Data type  Bytes  Remarks
float      4      Single-precision floating point
double     8      Double-precision floating point

Take float as an example:

drop table if exists test_float;
create table test_float (
    num float(5, 2)
);

insert into test_float values(1.233);
insert into test_float values(1.237);
insert into test_float values(10.233);
insert into test_float values(100.233);
insert into test_float values(1000.233);
insert into test_float values(10000.233);
insert into test_float values(100000.233);

select * from test_float;
+--------+
| num    |
+--------+
|   1.23 |
|   1.24 |
|  10.23 |
| 100.23 |
+--------+
4 rows in set (0.00 sec)

In float(M,D) and double(M, D):

  • D is the number of digits kept after the decimal point. Extra digits are rounded off, so 1.233 becomes 1.23 and 1.237 becomes 1.24
  • M is the total number of digits. With M = 5 and D = 2, only three digits are allowed before the decimal point. That is why 1000.233, 10000.233 and 100000.233 do not appear in the result: each of those inserts fails with the error Out of range value for column 'num' at row 1

When M and D are not specified, values are stored with whatever precision the underlying floating-point type allows.
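
The two rules can be modelled in a few lines of Python. This is a simplified decimal model of the (M, D) check, assuming half-up rounding. It does not reproduce the binary storage of a real float column, and `fit_md` is a hypothetical helper name.

```python
from decimal import Decimal, ROUND_HALF_UP


def fit_md(text, m, d):
    """Round a decimal literal to d places and enforce at most m digits in total.

    Raises ValueError when more than (m - d) digits are needed before the
    decimal point, similar to MySQL's "Out of range value" error.
    """
    quantum = Decimal(10) ** -d                  # e.g. 0.01 for d = 2
    rounded = Decimal(text).quantize(quantum, rounding=ROUND_HALF_UP)
    if abs(rounded) >= Decimal(10) ** (m - d):   # e.g. 1000 for (5, 2)
        raise ValueError("Out of range value: " + text)
    return rounded


for literal in ["1.233", "1.237", "10.233", "100.233", "1000.233"]:
    try:
        print(literal, "->", fit_md(literal, 5, 2))
    except ValueError as err:
        print(literal, "->", err)
```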

Fixed point type

The fixed-point type is decimal. Since we already have floating-point types, why do we need a fixed-point type? Run a bit of SQL and you will see:

drop table if exists test_decimal;
create table test_decimal (
    float_num float(10, 2),
    double_num double(20, 2),
    decimal_num decimal(20, 2)
);

insert into test_decimal values(1234567.66, 1234567899000000.66, 1234567899000000.66);
insert into test_decimal values(1234567.66, 12345678990000000.66, 12345678990000000.66);

select * from test_decimal;
+------------+----------------------+----------------------+
| float_num  | double_num           | decimal_num          |
+------------+----------------------+----------------------+
| 1234567.62 |  1234567899000000.80 |  1234567899000000.66 |
| 1234567.62 | 12345678990000000.00 | 12345678990000000.66 |
+------------+----------------------+----------------------+
2 rows in set (0.00 sec)

We can see that float and double suffer from precision loss: the value stored in the database may not be the value that was inserted. The decimal type does not have this problem for values that fit within its declared precision. This is why the decimal type exists. It is common in banking systems, Internet finance systems and other systems that are sensitive to the digits after the decimal point.
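
The same loss can be reproduced without a database, because it comes from the IEEE 754 binary formats rather than from MySQL itself. A small sketch using only the Python standard library follows; `as_float32` is a helper written for this example. The digits MySQL prints may be rounded differently from what Python prints, but the stored values are no longer the ones that were inserted.

```python
import struct
from decimal import Decimal


def as_float32(x):
    """Round-trip a Python float through a 4-byte single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# 4-byte float: 1234567.66 cannot be represented exactly.
print(as_float32(1234567.66))

# 8-byte double: the fractional part is altered or lost for large magnitudes.
print(float("1234567899000000.66"))
print(float("12345678990000000.66"))

# Decimal arithmetic keeps every digit that was written.
print(Decimal("12345678990000000.66"))
```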

Finally, let's look at the difference between decimal and float/double. I would summarize it in two points:

  • float/double store approximate values, while decimal stores exact values (current MySQL versions keep them in a packed binary format, not as a string)
  • The rules for decimal(M,D) are the same as for float/double. The difference is that when M and D are not specified, float/double are stored according to the actual precision, while decimal defaults to decimal(10, 0)

Date type

MySQL supports five date and time types: date, time, year, datetime and timestamp:

Data type  Bytes  Format               Remarks
date       3      yyyy-MM-dd           Stores a date
time       3      HH:mm:ss             Stores hours, minutes and seconds
year       1      yyyy                 Stores a year
datetime   8      yyyy-MM-dd HH:mm:ss  Stores date + time
timestamp  4      yyyy-MM-dd HH:mm:ss  Stores date + time; can be used as a time stamp

drop table if exists test_time;
create table test_time (
    date_value date,
    time_value time,
    year_value year,
    datetime_value datetime,
    timestamp_value timestamp
);

insert into test_time values(now(), now(), now(), now(), now());

select * from test_time;
+------------+------------+------------+---------------------+---------------------+
| date_value | time_value | year_value | datetime_value      | timestamp_value     |
+------------+------------+------------+---------------------+---------------------+
| 2022-03-05 | 16:05:37   |       2022 | 2022-03-05 16:05:37 | 2022-03-05 16:05:37 |
+------------+------------+------------+---------------------+---------------------+
1 row in set (0.00 sec)

Here we focus on the differences between datetime and timestamp:

  • As listed above, datetime takes up 8 bytes and timestamp takes up 4 bytes (since MySQL 5.6.4, datetime takes 5 bytes plus optional storage for fractional seconds)
  • The two types also cover different ranges. datetime can store 1000-01-01 00:00:00 to 9999-12-31 23:59:59. timestamp can store 1970-01-01 00:00:01 UTC to 2038-01-19 03:14:07 UTC, which is displayed as 1970-01-01 08:00:01 to 2038-01-19 11:14:07 in the UTC+8 time zone
  • The default value of datetime is null, and inserting null stores null. In older MySQL versions, or whenever explicit_defaults_for_timestamp is disabled, a timestamp column is not null by default, and inserting null stores the current time
  • The value stored in datetime is independent of the time zone. The value stored and displayed for timestamp depends on the current time zone
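
The timestamp limits above correspond to second 1 and second 2^31 - 1 after the Unix epoch, that is, the range of a signed 32-bit second counter. The sketch below derives the printed bounds for a given UTC offset. `timestamp_bounds` is a name chosen for this example, and the UTC+8 offset reflects the time zone the original figures appear to be quoted in.

```python
from datetime import datetime, timedelta, timezone

FORMAT = "%Y-%m-%d %H:%M:%S"


def timestamp_bounds(utc_offset_hours=0):
    """Return the (lowest, highest) timestamp values as shown in a given UTC offset."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    low = datetime.fromtimestamp(1, tz)            # first representable second
    high = datetime.fromtimestamp(2 ** 31 - 1, tz)  # last second of a signed 32-bit counter
    return low.strftime(FORMAT), high.strftime(FORMAT)


print("UTC  :", timestamp_bounds(0))
print("UTC+8:", timestamp_bounds(8))
```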

In practice, a table often has two standard fields: one records the creation time and the other records the latest update time. Both can be implemented with the timestamp type, for example by using DEFAULT CURRENT_TIMESTAMP and ON UPDATE CURRENT_TIMESTAMP.

char and varchar types

When it comes to MySQL character types, we should be most familiar with char and varchar. For the comparison between char and varchar, let me summarize:

  1. char is a fixed-length string. Its length ranges from 0 to 255 characters, regardless of the character set. Whatever the actual length of the value, it is stored at the declared length and padded with spaces if it is shorter. varchar is a variable-length string. In a utf8 database, its length ranges from 0 to 21844 characters
  2. The number of bytes actually occupied by char is the number of bytes of the stored characters. The number of bytes actually occupied by varchar is the bytes of the stored characters plus 1 or 2 bytes for the length prefix, plus 1 more byte in the row if the column is nullable (see Research on the space occupied by varchar data and the maximum string limit)
  3. When MySQL reads char data, it strips all trailing spaces, while varchar data keeps them

The first and second points are explained later. For the third point, here is some SQL to verify it:

drop table if exists test_string;
create table test_string (
    char_value char(5),
    varchar_value varchar(5)
);

insert into test_string values('a', 'a');
insert into test_string values(' a', ' a');
insert into test_string values('a ', 'a ');
insert into test_string values(' a ', ' a ');

select length(char_value), length(varchar_value) from test_string;
+--------------------+-----------------------+
| length(char_value) | length(varchar_value) |
+--------------------+-----------------------+
|                  1 |                     1 |
|                  2 |                     2 |
|                  1 |                     2 |
|                  2 |                     3 |
+--------------------+-----------------------+
4 rows in set (0.00 sec)

As you can see, char type data does not retain spaces at the end of the string.
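
The lengths in the result can be predicted with a very small model: reading a char value behaves like stripping trailing spaces, while varchar returns the value unchanged. The Python sketch below only imitates that retrieval rule under MySQL's default settings; `read_char` is an invented name.

```python
def read_char(stored):
    """Imitate reading a char column: trailing spaces are stripped, leading ones kept."""
    return stored.rstrip(" ")


values = ["a", " a", "a ", " a "]

char_lengths = [len(read_char(v)) for v in values]
varchar_lengths = [len(v) for v in values]

for c, v in zip(char_lengths, varchar_lengths):
    print(c, v)
```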

Research on the space occupied by varchar data and the maximum string limit

Next, let's explore how to calculate the space actually occupied by varchar data and the longest string it can hold. The conclusion first: the answer depends on the character set.

First, create a table with the utf8 character set:

drop table if exists test_varchar;
create table test_varchar (
    varchar_value varchar(100000)
) charset=utf8;

Execution error:

Column length too big for column 'varchar_value' (max = 21845); use BLOB or TEXT instead

According to the prompt, we change the size to 21845, and the execution still reports an error:

Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535. This includes storage overhead, check the manual. You have to change some columns to TEXT or BLOBs

If it is changed to 21844, there is no problem. So under utf8, the maximum M for varchar(M) is 21844. What about gbk?

drop table if exists test_varchar;
create table test_varchar (
    varchar_value varchar(100000)
) charset=gbk;

The same error is reported:

Column length too big for column 'varchar_value' (max = 32767); use BLOB or TEXT instead

Changing the size to 32767 still fails, with the same error message as under utf8:

Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535. This includes storage overhead, check the manual. You have to change some columns to TEXT or BLOBs

Changing to M = 32766 works. Why? In detail:

  • MySQL requires that a row be no longer than 65535 bytes, that is, just under 64 KB
  • If a varchar field is not declared not null, 1 byte is needed to record whether the field is null
  • For varchar(M), one byte records the string length when the column needs at most 255 bytes, and two bytes are used when it may need more than 255 bytes. Combined with the previous point, the space actually available out of 65535 bytes is 65535 - 3 = 65532 bytes
  • ASCII characters such as English letters occupy one byte in all of these encodings. Under gbk, however, one Chinese character occupies two bytes, so the maximum M = 65532 / 2 = 32766. Under utf8, one Chinese character occupies 3 bytes, so the maximum M = 65532 / 3 = 21844, which explains what we saw in the experiment
  • By the same reasoning, under utf8mb4 one character may occupy up to 4 bytes, so the maximum M for varchar(M) is 65532 / 4 = 16383, which you can verify yourself

Note that the above applies when the table contains only one varchar column. If the table also contains int, double, char or other columns, the space they occupy must be subtracted before calculating the maximum M for varchar(M).
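
The calculation above is easy to put into a small helper. The Python sketch below assumes the accounting used in this section: a 65535-byte row limit, a 2-byte length prefix, 1 byte for the null flag of a nullable column, and the maximum bytes per character of the character set. The function and parameter names are invented for illustration.

```python
ROW_LIMIT = 65535  # maximum row size in bytes, as reported by the error message

# Maximum bytes per character for a few character sets.
BYTES_PER_CHAR = {"latin1": 1, "gbk": 2, "utf8": 3, "utf8mb4": 4}


def max_varchar_chars(charset, nullable=True, other_columns_bytes=0):
    """Largest M for a single varchar(M) column under the accounting above."""
    length_prefix = 2                   # long varchar columns use a 2-byte length
    null_flag = 1 if nullable else 0    # 1 byte to record whether the value is null
    available = ROW_LIMIT - length_prefix - null_flag - other_columns_bytes
    return available // BYTES_PER_CHAR[charset]


for charset in BYTES_PER_CHAR:
    print(charset, max_varchar_chars(charset))

# A table that also holds an int column (4 bytes) leaves less room for the varchar.
print(max_varchar_chars("utf8", other_columns_bytes=4))
```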

varchar, text and blob

Finally, let's talk about two more data types: text and blob. They were designed to store large data because, as mentioned earlier, a single MySQL row can hold at most about 64 KB.

First of all, text and varchar are both related and different. The relation is that when the M of varchar(M) exceeds certain values, varchar is automatically converted to a text type (the exact behaviour depends on the MySQL version and SQL mode):

  • M > 255: converted to tinytext
  • M > 500: converted to text
  • M > 20000: converted to mediumtext

So a varchar(M) with an oversized M is effectively a text column. The differences between varchar(M) and text are:

  • Of the 65535 bytes in a row, varchar can use only 65532 or 65533 bytes, but text can use all 65535 bytes
  • text(M) can be written, but M does not limit the stored length the way it does for varchar; MySQL only uses it to pick the smallest text type that can hold M bytes
  • A text column cannot have a literal default value, while a varchar column can

When both varchar and text would work, varchar is recommended because its storage is more efficient. The M of varchar(M) has the length limit described earlier; if you need more than that, use mediumtext (16 MB) or longtext (4 GB).

As for text and blob, text stores character strings and blob stores binary strings. In short, blob is used to store the binary data of files such as pictures, audio and video.
