When training a model on a server, you often run into two problems:
- The model is large and training takes a long time, but your own computer cannot stay powered on and connected the whole time. Even if you leave it on, the connection may drop at some point and force you to rerun the job.
- Partway through a long training run on the server, you realize that a parameter is set incorrectly. You need to stop the current run, but you do not know how.
Solution to problem 1
The solution is to keep the model running in the background with the nohup command. Your own computer can then be used normally or shut down; you only need to log in to the server and check the output in nohup.out. Alternatively, the screen command also lets the server keep training after you disconnect.
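A minimal sketch of the nohup approach described above. Here `sleep 30` stands in for the real training command (something like `python train.py`, a hypothetical script name), and `train.log` and `train.pid` are file names chosen for illustration. Because stdout is redirected explicitly, the output goes to `train.log` rather than to nohup.out.

```shell
# Stand-in for the real training command, e.g. `python train.py` (hypothetical name).
# nohup makes the job ignore SIGHUP, so it survives the end of the SSH session.
nohup sleep 30 > train.log 2>&1 &

# $! holds the PID of the most recent background job; save it for later use.
train_pid=$!
echo "$train_pid" > train.pid
echo "training started with PID $train_pid"
```

After logging in again, `tail -f train.log` follows the output and `cat train.pid` recovers the PID.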
For nohup usage, refer to the blogger's article on the nohup command.
For the screen command, refer to the blogger's article "Continue training after the Linux server is disconnected – screen".
I am not familiar with the screen command, so I personally recommend nohup. You can also configure your training script to save the model weights periodically. If training is interrupted, load the most recently saved weights and continue from there instead of starting over.
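One possible way to pick the weights to resume from, as a rough sketch. The `checkpoints` directory, the zero-padded `epoch_NNNN.pt` naming, the `train.py` script, and its `--resume` flag are all assumptions made for illustration; your training script will have its own conventions. The block only prints the command it would run.

```shell
# Hypothetical layout: the training script writes checkpoints/epoch_0001.pt, epoch_0002.pt, ...
ckpt_dir=checkpoints
mkdir -p "$ckpt_dir"
touch "$ckpt_dir/epoch_0001.pt" "$ckpt_dir/epoch_0002.pt"   # demo files only

# Globs expand in sorted order, so with zero-padded names the last match is the newest.
latest=""
for f in "$ckpt_dir"/epoch_*.pt; do
    [ -e "$f" ] && latest=$f
done

if [ -n "$latest" ]; then
    # --resume is an assumed flag of a hypothetical train.py; adapt it to your script.
    echo "would run: nohup python train.py --resume $latest > train.log 2>&1 &"
else
    echo "no checkpoint found, starting from scratch"
fi
```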
Solution to problem 2
Linux provides the kill command to stop a process. If you do not know the command well, though, it is easy to kill the wrong process, which in the worst case can bring down the server. To learn how to use it, refer to the official documentation of the kill command.
You can also refer to other people's articles, such as "Kill Command in Linux".
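To make the risk of killing the wrong process smaller, one common pattern is: look at the process first, send SIGTERM, and only fall back to SIGKILL if it does not exit. The sketch below uses `sleep 300` as a stand-in for the training job, and the 10-second grace period is an arbitrary choice.

```shell
# Stand-in for a training job started earlier; in practice take the PID from
# wherever you saved it (for example a hypothetical train.pid file) or from ps.
sleep 300 &
pid=$!

# 1. Inspect the command line before signaling, to avoid killing the wrong process.
ps -p "$pid" -o pid= -o args= || echo "could not inspect PID $pid with ps"

# 2. Ask politely with SIGTERM (also the default signal of kill).
kill -s TERM "$pid"

# 3. Give the process up to 10 seconds to exit, then fall back to SIGKILL.
tries=0
while kill -s 0 "$pid" 2>/dev/null && [ "$tries" -lt 10 ]; do
    sleep 1
    tries=$((tries + 1))
done
if kill -s 0 "$pid" 2>/dev/null; then
    kill -s KILL "$pid"
fi

# wait only works here because the demo process is a child of this shell.
wait "$pid" 2>/dev/null
stop_status=$?
echo "process $pid exited with status $stop_status"
```

SIGKILL cannot be caught, so the job gets no chance to save its state; that is why it is kept as the last resort here.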
A translation of the manual page follows. It may contain errors; corrections are welcome.
KILL(1P) POSIX Programmer's Manual KILL(1P)
PROLOG
This manual page is part of the POSIX Programmer's Manual. The Linux implementation of this interface may differ (consult the corresponding Linux manual page for details of Linux behavior), or the interface may not be implemented on Linux.
NAME
kill - terminate or signal processes
SYNOPSIS
    kill -s signal_name pid...
    kill -l [exit_status]
    kill [-signal_name] pid...
    kill [-signal_number] pid...
DESCRIPTION
The kill utility shall send a signal to the process or processes specified by each pid operand.
For each pid operand, the kill utility shall perform actions equivalent to the kill() function defined in the System Interfaces volume of POSIX.1-2017, called with the following arguments:
 * The value of the pid operand shall be used as the pid argument.
 * The sig argument is the value specified by the -s option, the -signal_number option, or the -signal_name option, or by SIGTERM if none of these options is specified.
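For the training use case, the practical consequence of SIGTERM being the default is that a job can catch it and shut down cleanly. The following is only a sketch under assumptions: the child shell is a stand-in for a training loop, the message and the `trap_demo.log` file name are made up, and a real job would save its model weights in the handler.

```shell
# Child: a stand-in "training loop" that catches SIGTERM and exits cleanly.
# A real job would save its model weights in the handler instead of echoing.
sh -c 'trap "echo caught TERM, saving state; exit 0" TERM
       while :; do sleep 1; done' > trap_demo.log &
child=$!

sleep 1            # give the child time to install its trap
kill "$child"      # no signal is named, so SIGTERM is sent
wait "$child"
child_status=$?

cat trap_demo.log
echo "child exit status: $child_status"
```

Sending SIGKILL instead (`kill -s KILL`) would bypass the handler, because SIGKILL cannot be caught.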
OPTIONS
The kill utility shall conform to the Base Definitions volume of POSIX.1-2017, Section 12.2, Utility Syntax Guidelines, except that in the last two SYNOPSIS forms, the -signal_number and -signal_name options are usually more than a single character.
The following options shall be supported:
-l
    (The letter ell.) Write all values of signal_name supported by the implementation, if no operand is given. If an exit_status operand is given and it is a value of the '?' shell special parameter (see Section 2.5.2, Special Parameters, and wait) corresponding to a process that was terminated by a signal, the signal_name corresponding to the signal that terminated the process shall be written. If an exit_status operand is given and it is the unsigned decimal integer value of a signal number, the signal_name (the symbolic constant name without the SIG prefix defined in the Base Definitions volume of POSIX.1-2017) corresponding to that signal shall be written. Otherwise, the results are unspecified.
-s signal_name
    Specify the signal to send, using one of the symbolic names defined in the <signal.h> header. Values of signal_name shall be recognized in a case-independent fashion, without the SIG prefix. In addition, the symbolic name 0 shall be recognized, representing the signal value zero. The corresponding signal shall be sent instead of SIGTERM.
-signal_name
    Equivalent to -s signal_name.
-signal_number
    Specify a non-negative decimal integer, signal_number, representing the signal to be used instead of SIGTERM, as the sig argument in the effective call to kill(). The correspondence between integer values and the sig value used is shown in the following list. The effects of specifying any signal_number other than those listed below are undefined.
        0     0
        1     SIGHUP
        2     SIGINT
        3     SIGQUIT
        6     SIGABRT
        9     SIGKILL
        14    SIGALRM
        15    SIGTERM
    If the first argument is a negative integer, it shall be interpreted as a -signal_number option, not as a negative pid operand specifying a process group.
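A quick way to check the table above on your own server is the -l option. This sketch assumes a shell whose built-in kill follows the behavior described here (bash and dash do, as far as I know); the status value 143 assumes the common convention that a process killed by signal N is reported as 128 + N.

```shell
# Map signal numbers from the table to their names (printed without the SIG prefix).
for num in 1 2 9 15; do
    echo "$num -> $(kill -l "$num")"
done

# A shell reports "killed by a signal" as an exit status above 128;
# kill -l accepts such a status as well.
name_from_status=$(kill -l 143)
echo "exit status 143 -> $name_from_status"
```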
OPERANDS
The following operands shall be supported:
pid
    One of the following:
    1. A decimal integer specifying a process or process group to be signaled. The process or processes selected by positive, negative, and zero values of the pid operand shall be as described for the kill() function. If process number 0 is specified, all processes in the current process group shall be signaled. For the effects of negative pid numbers, see the kill() function defined in the System Interfaces volume of POSIX.1-2017. If the first pid operand is negative, it should be preceded by "--" to keep it from being interpreted as an option.
    2. A job control job ID (see the Base Definitions volume of POSIX.1-2017, Section 3.204, Job Control Job ID) that identifies a background process group to be signaled. The job control job ID notation is applicable only for invocations of kill in the current shell execution environment; see Section 2.12, Shell Execution Environment.
exit_status
    A decimal integer specifying a signal number or the exit status of a process terminated by a signal.
STDIN
Not used.
INPUT FILES
None.
ENVIRONMENT VARIABLES
The following environment variables shall affect the execution of kill:
LANG
    Provide a default value for the internationalization variables that are unset or null. (See the Base Definitions volume of POSIX.1-2017, Section 8.2, Internationalization Variables, for the precedence of internationalization variables used to determine the values of locale categories.)
LC_ALL
    If set to a non-empty string value, override the values of all the other internationalization variables.
LC_CTYPE
    Determine the locale for the interpretation of sequences of bytes of text data as characters (for example, single-byte as opposed to multi-byte characters in arguments).
LC_MESSAGES
    Determine the locale that should be used to affect the format and contents of diagnostic messages written to standard error.
NLSPATH
    Determine the location of message catalogs for the processing of LC_MESSAGES.
ASYNCHRONOUS EVENTS
Default.
STDOUT
When the -l option is not specified, the standard output shall not be used.
When the -l option is specified, the symbolic name of each signal shall be written in the following format:
    "%s%c", <signal_name>, <separator>
where the <signal_name> is in uppercase, without the SIG prefix, and the <separator> shall be either a <newline> or a <space>. For the last signal written, <separator> shall be a <newline>.
When both the -l option and the exit_status operand are specified, the symbolic name of the corresponding signal shall be written in the following format:
    "%s\n", <signal_name>
STDERR
The standard error shall be used only for diagnostic messages.
OUTPUT FILES
None.
EXTENDED DESCRIPTION
None.
EXIT STATUS
The following exit values shall be returned:
 0    At least one matching process was found for each pid operand, and the specified signal was successfully processed for at least one matching process.
>0    An error occurred.
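Combined with signal 0, the exit status gives a simple "is my training job still alive?" check. The helper name `is_running` is made up for this sketch, and `sleep 300` again stands in for the training job. Note that a non-zero result can also mean the process exists but belongs to another user.

```shell
# Return 0 if a signal could be delivered to the PID, non-zero otherwise.
# Signal 0 performs the existence and permission checks without sending anything.
is_running() {
    kill -s 0 "$1" 2>/dev/null
}

sleep 300 &
demo_pid=$!

if is_running "$demo_pid"; then before=running; else before=gone; fi

kill "$demo_pid"
wait "$demo_pid" 2>/dev/null   # reap the child so its PID is really released

if is_running "$demo_pid"; then after=running; else after=gone; fi
echo "before: $before, after: $after"
```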
CONSEQUENCES OF ERRORS
Default.
The following sections are informative.
APPLICATION USAGE
Process numbers can be found by using ps.
The job control job ID notation is not required to work as expected when kill is operating in its own utility execution environment. In either of the following examples:
    nohup kill %1 &
    system("kill %1");
the kill operates in a different environment and does not share the shell's understanding of job numbers.
EXAMPLES
Any of the commands:
    kill -9 100 -165
    kill -s kill 100 -165
    kill -s KILL 100 -165
sends the SIGKILL signal to the process whose process ID is 100 and to all processes whose process group ID is 165, assuming the sending process has permission to send that signal to the specified processes, and that they exist.
The System Interfaces volume of POSIX.1-2017 and this volume of POSIX.1-2017 do not require specific signal numbers for any signal_names. Even the -signal_number option provides symbolic (although numeric) names for signals. If a process is terminated by a signal, its exit status indicates the signal that killed it, but the exact values are not specified. The kill -l option, however, can be used to map decimal signal numbers and exit status values into the name of a signal. The following example reports the status of a terminated job:
    job
    stat=$?
    if [ $stat -eq 0 ]
    then
        echo job completed successfully.
    elif [ $stat -gt 128 ]
    then
        echo job terminated by signal SIG$(kill -l $stat).
    else
        echo job terminated with error code $stat.
    fi
To send the default signal to a process group (say 123), an application should use a command similar to one of the following:
    kill -TERM -123
    kill -- -123
RATIONALE
The -l option originated from the C shell, and is also implemented in the KornShell. The C shell output can consist of multiple output lines because the signal names do not always fit on a single line on some terminal screens. The KornShell output also included the implementation-defined signal numbers and was considered by the standard developers to be too difficult for scripts to parse conveniently. The specified output format is intended not only to accommodate the historical C shell output, but also to permit an entirely vertical or entirely horizontal listing on systems for which this is appropriate.
An early proposal invented the name SIGNULL as a signal_name for signal 0 (used by the System Interfaces volume of POSIX.1-2017 to test for the existence of a process without sending it a signal). Since the signal_name 0 can be used in this case unambiguously, SIGNULL has been removed.
An early proposal also required symbolic signal_names to be recognized with or without the SIG prefix. Historical versions of kill have not written the SIG prefix for the -l option and have not recognized the SIG prefix on signal_names. Since neither application portability nor ease of use would be improved by requiring this extension, it is no longer required.
To avoid an ambiguity of an initial negative number argument specifying either a signal number or a process group, POSIX.1-2008 mandates that it is always considered the former by implementations that support the XSI option. It also requires that conforming applications always use the "--" options terminator argument when specifying a process group, unless an option is also specified.
The -s option was added in response to international interest in providing some form of kill that meets the Utility Syntax Guidelines.
The job control job ID notation is not required to work as expected when kill is operating in its own utility execution environment. In either of the following examples:
    nohup kill %1 &
    system("kill %1");
the kill operates in a different environment and does not understand how the shell has managed its job numbers.
FUTURE DIRECTIONS
None.
SEE ALSO
Chapter 2, Shell Command Language, ps(1p), wait(1p)
The Base Definitions volume of POSIX.1-2017, Section 3.204, Job Control Job ID, Chapter 8, Environment Variables, Section 12.2, Utility Syntax Guidelines, signal.h(0p)
The System Interfaces volume of POSIX.1-2017, kill(3p)
COPYRIGHT
Portions of this text are reprinted and reproduced in electronic form from IEEE Std 1003.1-2017, Standard for Information Technology -- Portable Operating System Interface (POSIX), The Open Group Base Specifications Issue 7, 2018 Edition, Copyright (C) 2018 by the Institute of Electrical and Electronics Engineers, Inc and The Open Group. In the event of any discrepancy between this version and the original IEEE and The Open Group Standard, the original IEEE and The Open Group Standard is the referee document. The original Standard can be obtained online at http://www.opengroup.org/unix/online.html .
Any typographical or formatting errors that appear in this page are most likely to have been introduced during the conversion of the source files to man page format. To report such errors, see https://www.kernel.org/doc/man-pages/reporting_bugs.html .