Technology sharing calculation method of MySQL field length limit

Posted by foo on Thu, 27 Feb 2020 11:04:37 +0100

Author: kay is good at Oracle, mysql, PostgresSQL and other database fields. He is good at Oracle, MySQL performance optimization, database architecture design, database fault repair, data migration and recovery. He is keen on studying MySQL database kernel source code and sharing technical articles. He has Oracle OCP certification and works in Jiangsu Guotai new point Software Co., Ltd. and is a member of DBA technical team

1, MySQL restrictions review

Prior to Restrictions on MySQL Text fields In this paper, we talked about the limitations of MySQL in the Server layer and InnoDB layer, but the algorithm of the limitations was not expanded in detail. Here is a supplementary explanation. First, we review the two limitations of MySQL

1. The MySQL server layer is limited to a single record with a size of no more than 65535 bytes; 2. The InnoDB layer cannot exceed half of the innodb_page_size (in fact, it should be smaller because some metadata information in the page should be deducted). Take the default 16K setting as an example, and the limit is 8126.

Another: the following calculation methods have been described in MySQL 5.7.27

2, Calculation method of Server layer limitation

2.1. Calculation process

Generally speaking, if the MySQL Server layer has restrictions, an error is returned as follows:

Row size too large. The maximum row size for the used table type, 
not counting BLOBs, is 65535.
This includes storage overhead, check the manual. 
You have to change some columns to TEXT or BLOBs


Therefore, you can locate the calculation method based on this error report. The key function path is as follows:

mysql_prepare_create_table    /* sql/sql_table.cc:3474 */
  --> mysql_create_frm        /* sql/unireg.cc:107     */
   --> pack_header           /* sql/unireg.cc:733     */


This part of code exists in the package header:

if (reclength > (ulong) file->max_record_length()) /* 65535 on the right */
{
  my_error(ER_TOO_BIG_ROWSIZE, /* This is the status value corresponding to the above error message */
            MYF(0),
            static_cast<long>(file->max_record_length()));
  DBUG_RETURN(1);
}


The point is whether the reclength value is greater than 65535. So let's understand the calculation process of reclength:

/* sql/unireg.cc:843 */
if (field->offset + data_offset + length > reclength)
  reclength= field->offset + data_offset + length;

/*
field->offset Calculation method of
sql/sql_table.cc:3816
*/ 
while ((sql_field=it++)) {
     /* Ellipsis part */
    sql_field->offset= record_offset;  /* Data offset of current record */
    if (MTYP_TYPENR(sql_field->unireg_check) == Field::NEXT_NUMBER)
      auto_increment++;
    
    if (sql_field->stored_in_db)
      record_offset+= sql_field->pack_length; /* Accumulate the pack [length] of the current field */
}
	

/*
data_offset Calculation method of
sql/unireg.cc:139     
*/ 
data_offset= (create_info->null_bits + 7) / 8;    


/*
length Calculation method of
sql/unireg.cc:833     
*/ 
length=field->pack_length;
 /* This pack'u length is related to the field type. Take varchar for example*/

/* sql/sql_table.cc:3082 */
sql_field->sql_type=    MYSQL_TYPE_VAR_STRING;
sql_field->pack_length= calc_pack_length(sql_field->sql_type,
                                        (uint) sql_field->length);

//The calculation method of 'Calc pack length' is as follows:


size_t calc_pack_length(enum_field_types type, size_t length)
{
  switch (type) {
  case MYSQL_TYPE_VAR_STRING:
  case MYSQL_TYPE_STRING:
  case MYSQL_TYPE_DECIMAL:     return (length);
  case MYSQL_TYPE_VARCHAR:     return (length + (length < 256 ? 1: 2));
  case MYSQL_TYPE_YEAR:
  case MYSQL_TYPE_TINY  : return 1;
  case MYSQL_TYPE_SHORT : return 2;
  case MYSQL_TYPE_INT24:
  case MYSQL_TYPE_NEWDATE: return 3;
  case MYSQL_TYPE_TIME: return 3;
  case MYSQL_TYPE_TIME2:
    return length > MAX_TIME_WIDTH ?
           my_time_binary_length(length - MAX_TIME_WIDTH - 1) : 3;
  case MYSQL_TYPE_TIMESTAMP: return 4;
  case MYSQL_TYPE_TIMESTAMP2:
    return length > MAX_DATETIME_WIDTH ?
           my_timestamp_binary_length(length - MAX_DATETIME_WIDTH - 1) : 4;
  case MYSQL_TYPE_DATE:
  case MYSQL_TYPE_LONG  : return 4;
  case MYSQL_TYPE_FLOAT : return sizeof(float);
  case MYSQL_TYPE_DOUBLE: return sizeof(double);
  case MYSQL_TYPE_DATETIME: return 8;
  case MYSQL_TYPE_DATETIME2:
    return length > MAX_DATETIME_WIDTH ?
           my_datetime_binary_length(length - MAX_DATETIME_WIDTH - 1) : 5;
  case MYSQL_TYPE_LONGLONG: return 8;   /* Don't crash if no longlong */
  case MYSQL_TYPE_NULL  : return 0;
  case MYSQL_TYPE_TINY_BLOB:    return 1+portable_sizeof_char_ptr;
  case MYSQL_TYPE_BLOB:     return 2+portable_sizeof_char_ptr;
  case MYSQL_TYPE_MEDIUM_BLOB:  return 3+portable_sizeof_char_ptr;
  case MYSQL_TYPE_LONG_BLOB:    return 4+portable_sizeof_char_ptr;
  case MYSQL_TYPE_GEOMETRY: return 4+portable_sizeof_char_ptr;
  case MYSQL_TYPE_JSON:         return 4+portable_sizeof_char_ptr;
  case MYSQL_TYPE_SET:
  case MYSQL_TYPE_ENUM:
  case MYSQL_TYPE_NEWDECIMAL:
    abort(); return 0;                          // This shouldn't happen
  case MYSQL_TYPE_BIT: return length / 8;
  default:
    return 0;
  }
}


Summary of 2.2.

According to the sorting of the above calculation methods, the formula for calculating the length in MySQL Server layer can be written as the following pseudo code:

data_offset = (number_null_field + 7) / 8; /*number_null_field Indicates the number of fields allowed to be null*/

total_length = 0;
for (int i = 0; i < n_fileds; i++) {
    total_length += calc_pack_length(field_type, length)
}

/*
field_type:data type
length:Bytes of data
 Note the number of bytes of varchar(100). In utf8mb4, the number of bytes is 400
*/	

total_length += data_offset;


Through the above calculation, it is required to meet the requirement of total length < = 65535, which can be checked by MySQL Server layer

3, Calculation method of InnoDB layer limitation

3.1 calculation process

If the length of InnooDB layer exceeds the limit, an error is reported as follows:

ERROR 1118 (42000): Row size too large (> 8126). 
Changing some columns to TEXT or BLOB may help.
In current row format, BLOB prefix of 0 bytes is stored inline.


Through the search of error information, the judgment method is as follows:

/* dict0dict.cc:2515 */
/* Check the size limit on leaf pages. */
if (rec_max_size >= page_rec_max) {
    ib::error_or_warn(strict)
        << "Cannot add field " << field->name
        << " in table " << table->name
        << " because after adding it, the row size is "
        << rec_max_size
        << " which is greater than maximum allowed"
        " size (" << page_rec_max
        << ") for a record on index leaf page.";

    return(TRUE);
}


The definition of page [rec] Max is as follows:

/* dict0dict.cc:2431 */
page_rec_max = srv_page_size == UNIV_PAGE_SIZE_MAX
              ? REC_MAX_DATA_SIZE - 1
              : page_get_free_space_of_empty(comp) / 2;

/* srv0srv.cc:222 */
ulong srv_page_size = UNIV_PAGE_SIZE_DEF; 

/* This is the default value (DEF). If page_size is configured as 32K in my.cnf, then this value is 32K */

/* univ.i:343 */
# define UNIV_PAGE_SIZE_DEF	(1 << UNIV_PAGE_SIZE_SHIFT_DEF)

/* univ.i:332 */
# define UNIV_PAGE_SIZE_SHIFT_DEF	14

/* univ.i:341 */
# define UNIV_PAGE_SIZE_MAX	(1 << UNIV_PAGE_SIZE_SHIFT_MAX)

/* univ.i:330 */
# define UNIV_PAGE_SIZE_SHIFT_MAX	16

/* rem0rec.h:1092 */
# define REC_MAX_DATA_SIZE	16384


You can see from the above source definition:

1. The default value of SRV page size is 1 < < 14, which is 16384, and our configuration is 16384; 2. The value of Univ page size Max is 1 < < 16, which is 65536;

Therefore, SRV page size = = Univ page size Max does not hold, so page rec max = page get free space of empty (COMP) / 2

The definition of "page" get "free" space "of" empty "is as follows:

/*************************************************************//**
Calculates free space if a page is emptied.
@return free space */
UNIV_INLINE
ulint
page_get_free_space_of_empty(
/*=========================*/
    ulint   comp)       /*!< in: nonzero=compact page layout */
{
    if (comp) { /* Now the configuration of the release version will follow this logic */
        return((ulint)(UNIV_PAGE_SIZE
                   - PAGE_NEW_SUPREMUM_END
                   - PAGE_DIR
                   - 2 * PAGE_DIR_SLOT_SIZE));
    } /* 16384 - 120 - 8 - 4 = 16252 */

    return((ulint)(UNIV_PAGE_SIZE
               - PAGE_OLD_SUPREMUM_END
               - PAGE_DIR
               - 2 * PAGE_DIR_SLOT_SIZE));
}

/* univ.i:312 */
# define UNIV_PAGE_SIZE		((ulint) srv_page_size)   
/* 16384 in our version */

/* page0page.h:122 */
# define PAGE_NEW_SUPREMUM_END (PAGE_NEW_SUPREMUM + 8)
/* 38 + 36 + 2*10 + 2*5 + 8 + 8 = 120 */

/* page0page.h:119 */
# define PAGE_NEW_SUPREMUM	(PAGE_DATA + 2 * REC_N_NEW_EXTRA_BYTES + 8)

/* page0page.h:104 */
# define PAGE_DATA	(PAGE_HEADER + 36 + 2 * FSEG_HEADER_SIZE)

/* page0page.h:56 */
# define	PAGE_HEADER	FSEG_PAGE_DATA

/* fsp0types.h:68 */
# define FSEG_PAGE_DATA		FIL_PAGE_DATA

/* fil0fil.h:517 */
# define FIL_PAGE_DATA		38U

/* fsp0types.h:81 */
# define FSEG_HEADER_SIZE	10

/* rem0rec.h:52 */
# define REC_N_NEW_EXTRA_BYTES	5

/* page0page.h:152 */
# define	PAGE_DIR		FIL_PAGE_DATA_END

/* fil0fil.h:525 */
# define FIL_PAGE_DATA_END	8

/* page0page.h:155 */
# define	PAGE_DIR_SLOT_SIZE	2


As shown above, the returned value of page get free space of empty (COMP) is 16252, that is, page rec max = 16252 / 2, which is just equal to 8126. In fact, it can be inferred from the above error result (> 8126).

Next, let's look at how a record's actual length (REC ﹐ Max ﹐ size) is calculated:

Let's simplify the code. The calculation process is as follows:

/* The following is about the calculation part of rec max size in the function dict index too big for tree */
/* The auxiliary functions involved are also described below */

rec_max_size = comp /* comp For true */
        ? REC_N_NEW_EXTRA_BYTES  /* rem0rec.h:52:#define REC_N_NEW_EXTRA_BYTES	5 */
        : REC_N_OLD_EXTRA_BYTES; /* rem0rec.h:49:#define REC_N_OLD_EXTRA_BYTES	6 */
	             
rec_max_size += UT_BITS_IN_BYTES(new_index->n_nullable);

/*
The implementation of this macro is as follows:

ut0ut.h:222
#define UT_BITS_IN_BYTES(b) (((b) + 7) / 8)

That is (number of fields allowed to be empty + 7) / 8
*/

for (i = 0; i < new_index->n_fields; i++) {
    const dict_field_t* field
        = dict_index_get_nth_field(new_index, i);
    const dict_col_t*   col
        = dict_field_get_col(field);
    ulint           field_max_size;
    ulint           field_ext_max_size;
    
    /* In dtuple_convert_big_rec(), variable-length columns
    that are longer than BTR_EXTERN_LOCAL_STORED_MAX_SIZE
    may be chosen for external storage.
    
    Fixed-length columns, and all columns of secondary
    index records are always stored inline. */
    
    /* Determine the maximum length of the index field.
    The field_ext_max_size should be computed as the worst
    case in rec_get_converted_size_comp() for
    REC_STATUS_ORDINARY records. */
    
    field_max_size = dict_col_get_fixed_size(col, comp);  

	/* In essence, it is to determine whether the length is fixed and obtain the number of bytes of the fixed length field*/
    
    /*If it is fixed length, directly goto add field size for evaluation*/

    if (field_max_size && field->fixed_len != 0) {
        /* dict_index_add_col() should guarantee this */
        ut_ad(!field->prefix_len
              || field->fixed_len == field->prefix_len);
        /* Fixed lengths are not encoded
        in ROW_FORMAT=COMPACT. */
        field_ext_max_size = 0;
        goto add_field_size;
    }
    
    /*If it's variable length, calculate the maximum value, and then find the length of field ﹣ ext ﹣ Max ﹣ size,
    This is the variable for the user to store the field length
    This is only the default initial value. The overflow page will be set according to the length later
    */

    field_max_size = dict_col_get_max_size(col);
    field_ext_max_size = field_max_size < 256 ? 1 : 2;
    
    if (field->prefix_len) { 
	/* When determining the length of a table, you can use else directly, which will be used in building a secondary index. However, it is not a B tree with a clustered index, so it can be ignored */
        if (field->prefix_len < field_max_size)  {
            field_max_size = field->prefix_len;
        }
    } else if (field_max_size > BTR_EXTERN_LOCAL_STORED_MAX_SIZE 
           && dict_index_is_clust(new_index)) {
         
	/*
	btr0types.h:56
	#define BTR_EXTERN_LOCAL_STORED_MAX_SIZE    \
	         (BTR_EXTERN_FIELD_REF_SIZE * 2)
	
	btr0types.h:53
	#define BTR_EXTERN_FIELD_REF_SIZE   FIELD_REF_SIZE
	
	page0size.h:32
	#define FIELD_REF_SIZE 20
	         
	So this code means:
	If the maximum value of the variable length field is greater than 40 (twice of the overflow page pointer), the field is in the page
	Only 40 bytes are reserved, and the length variable is set to 1, that is, a total of 41 bytes are occupied.*/

	/* In the worst case, we have a locally stored
	column of BTR_EXTERN_LOCAL_STORED_MAX_SIZE bytes.
	The length can be stored in one byte.  If the
	column were stored externally, the lengths in
	the clustered index page would be
	BTR_EXTERN_FIELD_REF_SIZE and 2. */

        field_max_size = BTR_EXTERN_LOCAL_STORED_MAX_SIZE;
        field_ext_max_size = 1;
    }
    
    if (comp) {
        /* Add the extra size for ROW_FORMAT=COMPACT.
        For ROW_FORMAT=REDUNDANT, these bytes were
        added to rec_max_size before this loop. */
        rec_max_size += field_ext_max_size;
    }
	add_field_size:
	rec_max_size += field_max_size;
    
    /* Check the size limit on leaf pages. */
    if (rec_max_size >= page_rec_max) {
        ib::error_or_warn(strict)
            << "Cannot add field " << field->name
            << " in table " << table->name
            << " because after adding it, the row size is "
            << rec_max_size
            << " which is greater than maximum allowed"
            " size (" << page_rec_max
            << ") for a record on index leaf page.";
    
        return(TRUE);
    }

/ * Omit follow-up code, related auxiliary functions:*/

ulint
dict_col_get_fixed_size(
/*====================*/
    const dict_col_t*   col,    /*!< in: column */
    ulint           comp)   /*!< in: nonzero=ROW_FORMAT=COMPACT */
{
    return(dtype_get_fixed_size_low(col->mtype, col->prtype, col->len,
                    col->mbminmaxlen, comp));
    /*
     According to the implementation, it can be simply calculated as:
 1. Fixed length number of characters specified for display * character width
   - For example, char(10), in utf8mb4, the byte is 40
 2. Variable length direct return 0
    */
}

/***********************************************************************//**
Returns the size of a fixed size data type, 0 if not a fixed size type.
@return fixed size, or 0 */
UNIV_INLINE
ulint
dtype_get_fixed_size_low(
/*=====================*/
	ulint	mtype,		/*!< in: main type */
	ulint	prtype,		/*!< in: precise type */
	ulint	len,		/*!< in: length */
	ulint	mbminmaxlen,	/*!< in: minimum and maximum length of
				a multibyte character, in bytes */
	ulint	comp)		/*!< in: nonzero=ROW_FORMAT=COMPACT  */
{
	switch (mtype) {
	case DATA_SYS:
# ifdef UNIV_DEBUG
		switch (prtype & DATA_MYSQL_TYPE_MASK) {
		case DATA_ROW_ID:
			ut_ad(len == DATA_ROW_ID_LEN);
			break;
		case DATA_TRX_ID:
			ut_ad(len == DATA_TRX_ID_LEN);
			break;
		case DATA_ROLL_PTR:
			ut_ad(len == DATA_ROLL_PTR_LEN);
			break;
		default:
			ut_ad(0);
			return(0);
		}
# endif /* UNIV_DEBUG */
	// Fall through.
	case DATA_CHAR:
	case DATA_FIXBINARY:
	case DATA_INT:
	case DATA_FLOAT:
	case DATA_DOUBLE:
	case DATA_POINT:
		return(len);
	case DATA_MYSQL:
# ifndef UNIV_HOTBACKUP
		if (prtype & DATA_BINARY_TYPE) {
			return(len);
		} else if (!comp) {
			return(len);
		} else {
#ifdef UNIV_DEBUG
			ulint	i_mbminlen, i_mbmaxlen;

			innobase_get_cset_width(
				dtype_get_charset_coll(prtype),
				&i_mbminlen, &i_mbmaxlen);

			ut_ad(DATA_MBMINMAXLEN(i_mbminlen, i_mbmaxlen)
			      == mbminmaxlen);
# endif /* UNIV_DEBUG */
			if (DATA_MBMINLEN(mbminmaxlen)
			    == DATA_MBMAXLEN(mbminmaxlen)) {
				return(len);
			}
		}
# else /* !UNIV_HOTBACKUP */
		return(len);
# endif /* !UNIV_HOTBACKUP */
		/* fall through for variable-length charsets */
	case DATA_VARCHAR:
	case DATA_BINARY:
	case DATA_DECIMAL:
	case DATA_VARMYSQL:
	case DATA_VAR_POINT:
	case DATA_GEOMETRY:
	case DATA_BLOB:
		return(0);
	default:
		ut_error;
	}

	return(0);
}

ulint
dict_col_get_max_size(
/*==================*/
 const dict_col_t*   col)    /*!< in: column */
{
   return(dtype_get_max_size_low(col->mtype, col->len));
}

ulint
dtype_get_max_size_low(
/*===================*/
    ulint   mtype,      /*!< in: main type */
    ulint   len)        /*!< in: length */
{
    switch (mtype) {
        case DATA_SYS:
        case DATA_CHAR:
        case DATA_FIXBINARY:
        case DATA_INT:
        case DATA_FLOAT:
        case DATA_DOUBLE:
        case DATA_MYSQL:
        case DATA_VARCHAR:
        case DATA_BINARY:
        case DATA_DECIMAL:
        case DATA_VARMYSQL:
        case DATA_POINT:
            return(len);
        case DATA_VAR_POINT:
        case DATA_GEOMETRY:
        case DATA_BLOB:
            break;
        default:
            ut_error;
    }

    return(ULINT_MAX);
}


3.2 summary

According to the sorting of the above calculation methods, the formula for calculating the length in the InnoDB Server layer can be written as the following pseudo code:

rec_max_size = 5;
/*
  innodb header Two fixed values in
  DB_TRX_ID: 6 Byte
  DB_ROLL_PTR: 7 Byte
*/
rec_max_size += 6+7;

/* rem0rec.h:70
# define REC_NODE_PTR_SIZE	4 

dict0dict.cc:2536
rec_max_size + REC_NODE_PTR_SIZE >= page_ptr_max 
*/
rec_max_size += 4; 

rec_max_size += (number_null_field + 7) / 8; /*number_null_field Indicates the number of fields allowed to be null*/

for (int i = 0; i < n_fileds; i++) {
    /* Fixed length field */
    if (field.type is fixed) {
        rec_max_size += filed.length; 
		/*char Requires utf8mb4 type of x4 bytes*/
        continue;
    }
    
    /* Variable length field */
    field_ext_max_size = 1;
    if (field.type is variable) {
        field_max_size = field.length > 40 ? 40 : field.length;
    }
       
    rec_max_size += field_max_size + field_ext_max_size;
}


Based on the above calculation, it is necessary to meet the rec﹤ max﹤ size < 8126, which can be checked by InnoDB layer

Four, summary

1. The above conditions must be met at both MySQL Server layer and InnoDB layer in order to build the table successfully;
2. If the above error reporting situation occurs, most of it is because the setting of varchar is too large. It is recommended to gradually reduce some fields or replace them with text;

Topics: Database MySQL SQL Oracle