mysql character set and data type

Posted by Greaser9780 on Wed, 15 May 2019 19:48:27 +0200

Lesson_2

MySql-2

1.1 character set

Character Set: A set of binary codes for visible characters when they are saved and transmitted.

Character sets are used in two places

1. Save data
 2. Data transmission

1.2 Use Character Sets for Survival

1. MySQL can specify character encoding on servers, databases, tables and fields
 2. Specifying character encoding on the server is specified when installing the server
 3. The database, tables and fields are specified at the time of creation.
create table test(
-> name varchar(10) charset utf8
->)charset=gbk;

Differences of 1.3 gbk, GB2312 and UTF8

These three kinds of character encoding methods are commonly used. gbk and GB2312 (self-contained) are simplified Chinese encoding, and utf8 is international universal encoding.

Python, MySQL, jQuery and so on are all open source products. Open source products recommend utf8 coding.

Both gbk and gb2312 are simplified Chinese, but the internal coding is different.

In python

Under gbk or gb2312, a Chinese takes up 2 bytes
 Under utf8, a Chinese takes up 3 bytes
Note: In the database, if the character encoding is utf8, the length of a Chinese character is 1, and the byte length of a Chinese character is 2.
create table `emp`(
sex varchar(21844)
)engine=myisam charset=utf8;

65535/3-1


create table `emp2`(
sex varchar(32766)
)engine=myisam charset=gbk;

65535/2-1
If the database is UTF8, then the `table'you created in the library will use the database character set by default if no character set is set.

1.4 Character encoding during transmission

Encoding accepted by mysql server

mysql> show variables like 'character_%';
+--------------------------+-----------------------------------------------+
| Variable_name            | Value                                         |
+--------------------------+-----------------------------------------------+
| character_set_client     | utf8  #The server accepts the client's character code|
| character_set_connection | utf8                                          |
| character_set_database   | gbk                                           |
| character_set_filesystem | binary                                        |
| character_set_results    | utf8  #Character encoding for returning results|
| character_set_server     | utf8                                          |
| character_set_system     | utf8                                          |
| character_sets_dir       | C:\phpStudy\PHPTutorial\MySQL\share\charsets\ |
+--------------------------+-----------------------------------------------+
8 rows in set (0.00 sec)

Now the code sent by the client is inconsistent with the code accepted by the server

Solve:
# windows defaults to GBK
set character_set_client=gbk;

Character encoding returns differently
 Returning is utf8 windows not aware of scrambling code

Solve:
set character_set_results=gbk;

Tips:
Setnames gbk; -> set names can be set to modify three character encoding at a time;

mysql> show variables like 'character_%';
+--------------------------+-----------------------------------------------+
| Variable_name            | Value                                         |
+--------------------------+-----------------------------------------------+
| character_set_client     | gbk                                           |
| character_set_connection | gbk                                           |
| character_set_database   | gbk                                           |
| character_set_filesystem | binary                                        |
| character_set_results    | gbk                                           |
| character_set_server     | utf8                                          |
| character_set_system     | utf8                                          |
| character_sets_dir       | C:\phpStudy\PHPTutorial\MySQL\share\charsets\ |
+--------------------------+-----------------------------------------------+
8 rows in set (0.00 sec)

2 proofreading set

In a character set, the relationship between characters and characters becomes a proofreading set. For example, the size relationship between (ASCII)a and B, if case a > B is distinguished, if case a is not distinguished

#utf8_general_ci uses the _ci proofreading set, which is case-insensitive
create table t1(
name char(1)
)charset=utf8 collate=utf8_general_ci;

create table t2(
name char(1)
)charset=utf8 collate=utf8_bin;

insert into t1 values ('a'),('B');
insert into t2 values ('a'),('B');
mysql> select * from t1 order by name;
+------+
| name |
+------+
| a    |
| B    |
+------+
2 rows in set (0.00 sec)

#After sorting queries, there is no case-sensitive explanation that a and b are case-insensitive.
mysql> select * from t2 order by name;
+------+
| name |
+------+
| B    |
| a    |
+------+
2 rows in set (0.00 sec)

#According to the positive order, compared with ASCLL coding, B < a, B is in front of a, so this binary proofreading set is case-sensitive;
Remember well:
_ bin: Comparisons by binary coding
 _ ci: Case-insensitive comparison
#Display all character sets

mysql> show character set;
+----------+-----------------------------+---------------------+--------+
| Charset  | Description                 | Default collation   | Maxlen |
+----------+-----------------------------+---------------------+--------+
| big5     | Big5 Traditional Chinese    | big5_chinese_ci     |      2 |
| dec8     | DEC West European           | dec8_swedish_ci     |      1 |
| cp850    | DOS West European           | cp850_general_ci    |      1 |
| hp8      | HP West European            | hp8_english_ci      |      1 |
| koi8r    | KOI8-R Relcom Russian       | koi8r_general_ci    |      1 |
| latin1   | cp1252 West European        | latin1_swedish_ci   |      1 |
| latin2   | ISO 8859-2 Central European | latin2_general_ci   |      1 |
| swe7     | 7bit Swedish                | swe7_swedish_ci     |      1 |
| ascii    | US ASCII                    | ascii_general_ci    |      1 |
| ujis     | EUC-JP Japanese             | ujis_japanese_ci    |      3 |
| sjis     | Shift-JIS Japanese          | sjis_japanese_ci    |      2 |
| hebrew   | ISO 8859-8 Hebrew           | hebrew_general_ci   |      1 |
| tis620   | TIS620 Thai                 | tis620_thai_ci      |      1 |
| euckr    | EUC-KR Korean               | euckr_korean_ci     |      2 |
| koi8u    | KOI8-U Ukrainian            | koi8u_general_ci    |      1 |
| gb2312   | GB2312 Simplified Chinese   | gb2312_chinese_ci   |      2 |
| greek    | ISO 8859-7 Greek            | greek_general_ci    |      1 |
| cp1250   | Windows Central European    | cp1250_general_ci   |      1 |
| gbk      | GBK Simplified Chinese      | gbk_chinese_ci      |      2 |
| latin5   | ISO 8859-9 Turkish          | latin5_turkish_ci   |      1 |
| armscii8 | ARMSCII-8 Armenian          | armscii8_general_ci |      1 |
| utf8     | UTF-8 Unicode               | utf8_general_ci     |      3 |
| ucs2     | UCS-2 Unicode               | ucs2_general_ci     |      2 |
| cp866    | DOS Russian                 | cp866_general_ci    |      1 |
| keybcs2  | DOS Kamenicky Czech-Slovak  | keybcs2_general_ci  |      1 |
| macce    | Mac Central European        | macce_general_ci    |      1 |
| macroman | Mac West European           | macroman_general_ci |      1 |
| cp852    | DOS Central European        | cp852_general_ci    |      1 |
| latin7   | ISO 8859-13 Baltic          | latin7_general_ci   |      1 |
| utf8mb4  | UTF-8 Unicode               | utf8mb4_general_ci  |      4 |
| cp1251   | Windows Cyrillic            | cp1251_general_ci   |      1 |
| utf16    | UTF-16 Unicode              | utf16_general_ci    |      4 |
| cp1256   | Windows Arabic              | cp1256_general_ci   |      1 |
| cp1257   | Windows Baltic              | cp1257_general_ci   |      1 |
| utf32    | UTF-32 Unicode              | utf32_general_ci    |      4 |
| binary   | Binary pseudo charset       | binary              |      1 |
| geostd8  | GEOSTD8 Georgian            | geostd8_general_ci  |      1 |
| cp932    | SJIS for Windows Japanese   | cp932_japanese_ci   |      2 |
| eucjpms  | UJIS for Windows Japanese   | eucjpms_japanese_ci |      3 |
+----------+-----------------------------+---------------------+--------+
39 rows in set (0.00 sec)
#Display all proofreading sets
mysql> show collation;

3. MySQL Data Type - Value Type

(1) integer

integer Occupying byte Range
tinyint 1 -2 ^7 ~ 2^7-1 (-128~127)
smallint 2 -2 ^15 ~ 2^15-1 (-32768~32765)
mediumint 3 -2 ^23 ~ 2^23-1 (-8388608~8388607)
int 4 -2 ^31~ 2^31-1 (-2147483648~2147483647)
bigint 8 - 2 ^ 63 ~ 2 ^ 63 - 1 (too big)

(2)unsigned (unsigned)

If a number is an unsigned number, then it must be a non-negative number.

The first record of database mysql must be 1 (or at least 0), absolutely not 0.

tinyint unsigned 2^8-1

The range of unsigned number is equal to twice that of signed number.

mysql> create table test1(
    -> age tinyint unsigned
    -> );
Query OK, 0 rows affected (0.00 sec)

mysql> insert into test1 values (128);
Query OK, 1 row affected (0.00 sec)

(3) Display width

Shaping supports display width, which is the smallest display number, such as int(11) for a minimum of 11.

The number of digits represents this number, if not enough to use 0 as the lead. Display digits do not work by default, and zerofill must be aggregated to work.

create table stu(
id int(5),
age int(5) zerofill
);

#Insert test data
insert into stu values (1,23),(2,123456);
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> select * from stu;
+------+--------+
| id   | age    |
+------+--------+
|    1 |  00023 |
|    2 | 123456 |    #The value itself is displayed when the display width exceeds 5.
+------+--------+

The display width does not determine the display size of the integer, but fills in the leading 0 when the value is not enough. If it exceeds the display number, the leading 0 is not added. The display itself.

(4) floating point type

float Occupying byte Range
float 4 -3.4E+38 ~ 3.4E+38
double 8 -1.8E+308 ~ 1.8E+308
create table test2(
  num1 float,
  num2 double
  );
insert into test2 values(23.123,2.123);

#float theoretically retains only one decimal point, according to the actual version
#In theory, only two decimal points are reserved, according to the actual version.

insert into test2 values(2.2E2,2.123);
#It also supports scientific counting.

insert into test2 values(999.999999999999999999999999999,2.123456);
#Loss accuracy

We don't have double precision in python, float represents floating point number.

(5) fixed-point number

decimal(M,D),M The maximum is 65.,D The maximum is 30.,The default is(10,0)

create table test3(
    num1 float(10,2),
    num2 decimal(10,2)
    );

insert into test3 values(123.57,1234567.00);
mysql> select * from test3;
+-------------+-------------+
| num1        | num2        |
+-------------+-------------+
| 12345679.00 |      123.57 |
| 12345679.00 | 12346623.57 |
+-------------+-------------+
2 rows in set (0.00 sec)


#Fixed-point numbers can retain multiple decimal points, which in some cases will lose accuracy. Floating-point numbers are more efficient than fixed-point numbers.
#Both floating-point and fixed-point support unsigned.

4.MySql Data Type - Character Type

data type describe
char Fixed length (255)
varchar Variable length (65535)
tinytext 2^8-1 (255)
text 2^16-1 (65535)
mediumtext 2^24-1
longtext 2^32-1
char(10) and varchar(10)

    Similarity: No more than 10 characters
    The difference: char, you type in five characters, and it still takes up 10 characters.
        varchar If you type five characters, he will return more space.
        The maximum length of char is 255
        The theoretical length of varchar is 65535
        Actually, I can't. I use GBK (two characters in mysql), so 65535/2.
create table test(
  name varchar(32766)
  )charset=gbk;


#Utf8 was created incorrectly, supporting a maximum of 21845 characters, because utf8 uses three characters for a Chinese character

create table test(
  name varchar(65535)
  )charset=gbk

#utf8 was created incorrectly and supported at most32767Character,because gbk A Chinese character to be used2Character

5.MySql data type - enum

List some options, radio.

create table test(
name varchar(32),
sex enum('male','female','secrecy')
);

insert into test values('Jia Nailiang','male');
#Insert normal
insert into test values('PGONE','Simon?');
#Insert wrong report
Enumerations are managed by integers inside the database. The first value is 1 and the second value is 2, pushing backwards in turn.

insert into test values('Wang Baoqiang', 1);
insert into test values('Liu Yunbin', 3);


Advantages of enumeration type:
1. limit value
 2. Save space
 3. Efficiency of operation
 Consideration: Given that enumeration takes 2 bytes, how many enumeration values can be set?

Two bytes, 16 bits, 2 ^ 16-1, can save up to 65535/2 values

6.MySql Data Type - Set

List data types and save multiple selections

create table test2(
name varchar(32),
hobby set('eat','Sleep','Read a Book','Chi Huang')
);

insert into test2 values('Liu Yun bin','eat,Sleep,Chi Huang');
insert into test2 values('Liu Yun bin-2','eat,Sleep,Li Xiaolu');
Sets, like enumerations, assign a fixed value to each element in a different way.
It is also managed using integers
 This is how collections are allocated:
'Eat','Sleep','Read','Jianhuang'

2 ^ 02 ^ 12 ^ 22 ^ 3
 When saving, add up multiple values (converted to integers) so that the values are very large (waste resources).
A set of its elements is represented by a bit, and several sets of elements require several bits.

Think: Given that collections take up eight bytes, how many options can collections hold?
Only 64 options can be saved.

7.MySql Data Type - Date-Time Type

data type describe
date Date, 8 bytes
time time
datetime Date time
year Year, occupy one byte
timestamp Time stamp, 4 bytes

(1)datetime

#Format year-month-day time: minutes: seconds
create table test4(
create_at datetime
);

insert into test4 values('2018-01-12 15:00:53');
insert into test4 values(now());#Represents the current time.
insert into test4 values('10000-12-31 23:59:59'); #error

(2)date

Date part representing only time

(3)time

Represents the time part and the time interval, ranging from - 838:59:59 to 838:59:59

create table test5(
create_at time
);

insert into test5 values('12:12:12');
insert into test5 values('212:12:12');
insert into test5 values('-800:12:12');
insert into test5 values('-839:59:59');  #False, out of range

#Time supports day-to-day representation of time intervals
insert into test5 values('10 12:59:59');

(4)timestamp

Starting from the Greenwich time of 100:00:00 seconds on January 1, 1970, it is a special type in python, but it is shown in data time format in mysql.

create table test6(
create_at timestamp
);

insert into test6 values(now());
insert into test6 values('2018-01-12 15:43:30');
insert into test6 values('2038-01-19 11:14:07');#Here is a node.

\
insert into test6 values('2038-01-19 11:14:08');#More than 4 bytes
timestamp Format and datetime It's the same.,The difference between them is that:
datetime From 1 to 9999,and timestamp From 1970~2038 year.
timestamp Occupy 4 bytes,More than four bytes by 2038.

(5)year

It takes up only 255 bytes, based on 1900, ranging from 1900 + 1 to 1900 + 255.

create table test7(
create_at year
);

insert into test7 values(1900);#FALSE
insert into test7 values(1901);
insert into test7 values(2155);
insert into test7 values(2156);#Wrong, out of scope

8. Boolean

mysql does not support Boolean, so 1 and 0 are used instead.

create table test8(
num boolean
);

insert into test8 values(True);
insert into test8 values(1);
insert into test8 values(False);
insert into test8 values(0);
insert into test8 values('A');#error

//Boolean type is rarely used. It can be replaced by tinyint or enum.

9. Column attribute - null | not null

create table test9(
id int auto_increment primary key not null comment'Primary key',
username varchar(64) not null,
pwd char(64) null
);

insert into test9 values(null,'admin','');
insert into test9 values(null,'','');
insert into test9 values(null,null,'');
insert into test9 values(null,'',null);

#An empty string is not equal to null

10. Column attribute -- (default)

create table test10(
id int auto_increment primary key not null comment'Primary key',
username varchar(64) not null,
pwd char(64) null default '123456'
);

insert into test10(username) values('admin2');#Recommended SQL statements


create table test11(
age int  default '123456'
);
insert into test11 values(null);
#Enter null and insert null, not write is the default

11. Auto-growth column

auto_increment
 The default field value starts from 1 and increases by 1 at a time. Features: There will be no duplicate values. Primary keys are commonly used.

Friendship Tip: Automatically growing columns in mysql must be the primary key.
The auto-growing intensity can be input (null) at insertion time.

The deleted id cannot be reused:
If the ID value of the touch in the middle is relatively large, then the ID inserted later is +1 in turn according to the maximum value.

12. Column Properties - Primary Key

primary key
//Primary key: Primary key is unique
//Characteristic: Can't be empty, can't repeat
//A table has only one primary key

CREATE TABLE `test12` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `num` int(11) NOT NULL,
  `age` int(11) NOT NULL,
  PRIMARY KEY (`id`,`num`,`age`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

#The above primary keys are'3', in fact, a primary key, which consists of three fields (a joint primary key), and will hardly be used in development.
#(Teaching use)

create table demo1(
id int auto_increment primary key
);

create table demo2(
id int auto_increment ,
primary key(id)
);

create table demo3(
id int(11) 
);

alter table demo3 add primary key(id);

//Principles for selecting primary keys:
1.Minimum:You must select a single key as the primary key
2.stability:This field as the primary key should not be modified

13. Column attributes - unique keys

Characteristic: Can't repeat, can't be empty

A table can have multiple unique keys

create table test13(
uid int auto_increment primary key,
mobile char(11) unique,
email char(32) unique,
pwd char(32)
);

insert into test13 values(null,'13877776666','this_dog@qq.com','123456');

insert into test13 values(null,'13877776665','this_do@qq.com','123456');

#Modifying table attribute methods to add unique keys
alter table test13 add unique `my`(mobile);

#Delete the unique key or use the delete method
alter table test13 drop index mobile;#Alias delete aliases, no aliases delete field names

14. Column Properties - Comments

Notes are used to communicate with programmers.
A little bit: At least you can see what this field is for.

stauts tinyint comment'describes a person's state 1 is dead, 2 is alive, and 0 is half dead. 

15.SQL notes

Notes to python
 # One-line comment
'''

'''Multi-line Comments

Notes to mysql
 # One-line comment
 - One-line comment

/* Content of comment*/multi-line comment
User name " or 1=1 #"
Password

select * from test13 where uname=" " or 1=1 –" " and pwd=" ";

Practice

# What data types are commonly used to store mobile phone numbers?
Character char(11)
# What data type does gender use?
Character type (enumeration type) Boolean type
 # What is age?
Integer tinyint unsigned
 # What are the photos for?
binary characters (save only paths, not resources)
varchar()
# What type of salary do you use?
decimal fixed-point type 
# Is the student's name allowed to be blank?
Not allow
 # Can the home address be empty?
Better not.
# Can E-mail be empty?
For the time being, there is a little weight. It's optional. It's better not to use it.
# Can the exam result be empty?
default 0 is not allowed 
# Can the number entered in the primary key column be null?
null must not be used as a placeholder
 # Can a table have multiple primary keys?
May not
 # In a school database, if the school allows students to rename, the class does not allow rename, then the class and the student's name as a combination of the main key is OK?
primary key(id,name)
Yes, but it's not reasonable.
# Is the identifier column allowed to use string types?
Yes, but not allowed.
# What if there is no suitable column (field) in the table as the primary key?
Add a self-growing field as the primary key

Topics: MySQL encoding Windows Database