Java interview questions - Nginx

Posted by vickie on Thu, 03 Mar 2022 19:45:20 +0100

1, Why can Nginx handle requests asynchronously and without blocking

Consider the whole life cycle of a request: the request arrives, a connection is established, data is received, and after the data is processed a response is sent.

Specifically, at the bottom the system deals in read and write events. When a read or write event is not ready, the socket cannot be operated on. If you do not call in a non-blocking way, you are left with a blocking call: the call enters the kernel and waits for the event, and the CPU is handed over to others. For a single-threaded worker this is obviously inappropriate: when there are many network events, everyone is waiting, the CPU sits idle with nobody using it, utilization naturally stays low, and high concurrency is out of the question. You may say, just add more processes, but then what is the difference from Apache's thread model? The point is to avoid unnecessary context switching. That is why blocking system calls are the biggest taboo inside Nginx.

Non-blocking calls behave differently: if the event is not ready, the call returns EAGAIN immediately, telling you the event is not ready yet, so come back later. You can check the event again later, doing other things in the meantime, until it is ready. Although nothing blocks, you now have to check the event's status from time to time: you get more done, but the checking is not free either. This is where event notification mechanisms such as epoll come in (see question 36 below): rather than polling, a worker registers interest in events and processes only those reported as ready.
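For reference, a minimal sketch of the directives that switch this event model on; the directive names are standard Nginx, while the values shown are illustrative only:

    # one single-threaded worker per CPU core, each multiplexing many connections
    worker_processes auto;
    events {
        use epoll;                 # non-blocking event notification on Linux 2.6+
        multi_accept on;           # accept all ready connections at once
        worker_connections 10240;  # per-worker connection cap (illustrative value)
    }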

2, Advantages of Nginx

[1] Faster: this shows in two ways: under normal conditions a single request gets a faster response, and during peak periods (for example tens of thousands of concurrent requests) Nginx responds to requests faster than other web servers.
[2] High scalability, cross-platform: Nginx is designed to be highly extensible. It is composed entirely of modules of different functions, levels, and types that are very loosely coupled, so when fixing a bug in or upgrading one module you can focus on that module alone without worrying about the others. Within the HTTP module an HTTP filter module layer is also provided: after a normal HTTP module has processed a request, a chain of HTTP filter modules reprocesses the result. So when developing a new HTTP module, you can not only build on modules of other levels and types, such as the HTTP core module, the events module, and the log module, but also reuse the large body of existing HTTP filter modules unchanged. This excellent low-coupling design has created Nginx's huge ecosystem of third-party modules, and open third-party modules are as easy to use as the officially released ones.
Nginx modules, whether officially released or third-party, are compiled into the binary for execution. This gives third-party modules excellent performance and lets them take full advantage of Nginx's high-concurrency characteristics, which is why many high-traffic websites develop custom modules tailored to their own business.
[3] High reliability: used as a reverse proxy, its probability of going down is very small. High reliability is the most basic condition for choosing Nginx, and Nginx's reliability is plain to see: many high-traffic websites use it at scale on their core servers. This high reliability comes from the excellent design of its core framework code and the simplicity of its module design; in addition, the commonly used official modules are very stable, and each worker process is relatively independent, so when one worker fails the master process can quickly "pull up" a new worker to keep serving.
[4] Low memory consumption: in general, 10,000 inactive HTTP keep-alive connections consume only about 2.5 MB of memory in Nginx, which is the basis of Nginx's support for highly concurrent connections; 10 Nginx processes take roughly 150 MB of memory.
[5] A single machine supports more than 100,000 concurrent connections: this is a very important feature! With the rapid development of the Internet and the doubling of the number of Internet users, companies and websites have to handle large numbers of concurrent requests, and a server that can withstand more than 100,000 concurrent requests at peak time is bound to be favored. Theoretically, the upper limit on concurrent connections Nginx supports depends on memory, so 100,000 is far from the ceiling. Of course, the ability to handle more concurrent requests in time is also closely tied to the characteristics of the business.
[6] Hot deployment: the separation of the master management process from the worker processes gives Nginx hot deployment, i.e. the Nginx executable can be upgraded while maintaining 7 × 24-hour uninterrupted service. It likewise supports updating configuration items and rotating log files without stopping the service.
[7] The most liberal BSD license: this is a powerful driving force behind Nginx's rapid growth. The BSD license not only lets users use Nginx for free, it also lets them use or modify the Nginx source code directly in their own projects and then publish it, which has attracted countless developers to keep contributing their wisdom to Nginx.
Of course, these seven features are not all of Nginx. With countless official and third-party function modules, Nginx can cover most application scenarios, and the modules can be stacked to implement more powerful and complex features. Some modules also support integrating Nginx with scripting languages such as Perl and Lua, greatly improving development efficiency. All of this pushes users to think of Nginx first when looking for a web server; the core reason to choose it is that it sustains high concurrent requests while keeping the service efficient.
[8] High performance: it handles 20,000-30,000 concurrent connections routinely, and official tests show support for up to 50,000 concurrent connections.

3, Tell me about the configuration parameters you used in Nginx

After Nginx is installed there is a corresponding installation directory, and nginx.conf is its main configuration file. The main configuration file is divided into four parts: main (global configuration), server (host configuration), upstream (load-balancing server settings), and location (URL matching for specific locations). The relationship between the four is: server inherits from main, location inherits from server, and upstream neither inherits other settings nor is inherited.

   
    ########### Each directive must end with a semicolon. #################
    # Configure the user or group; the default is nobody.
    user administrator administrators;
    worker_processes 2;              # number of worker processes to spawn; default is 1
    pid /nginx/pid/nginx.pid;        # where the nginx process PID file is stored
    # Error log path and level; can be placed in the global, http, and server blocks. Levels:
    # debug|info|notice|warn|error|crit|alert|emerg
    error_log log/error.log debug;
    events {
        accept_mutex on;             # serialize connection accepts to avoid the thundering herd; default on
        multi_accept on;             # accept multiple new connections at once; default off
        use epoll;                   # event-driven model: select|poll|kqueue|epoll|rtsig|/dev/poll|eventport
        worker_connections 1024;     # maximum connections per worker; default 512
    }
    http {
        include mime.types;          # file-extension-to-MIME-type mapping table
        default_type application/octet-stream;   # default MIME type (nginx's built-in default is text/plain)
        #access_log off;             # disable the access log
        log_format myFormat '$remote_addr-$remote_user [$time_local] $request $status $body_bytes_sent $http_referer $http_user_agent $http_x_forwarded_for';  # custom log format
        access_log log/access.log myFormat;      # "combined" is the default log format
        sendfile on;                 # transfer files with sendfile; default off; allowed in http, server, and location blocks
        sendfile_max_chunk 100k;     # max bytes transferred per sendfile call per process; default 0, i.e. no upper limit
        keepalive_timeout 65;        # connection timeout; default 75s; can be set in http, server, and location blocks
        upstream mysvr {
            server 127.0.0.1:7878;
            server 192.168.10.121:3333 backup;   # hot standby
        }
        error_page 404 https://www.baidu.com;    # error page
        server {
            keepalive_requests 120;  # maximum number of requests per connection
            listen 4545;             # listening port
            server_name 127.0.0.1;   # listening address
            location ~* ^.+$ {       # request URL filter, regex match; ~ is case sensitive, ~* case insensitive
                #root path;          # root directory
                #index vv.txt;       # default page
                proxy_pass http://mysvr;   # forward requests to the servers defined in mysvr
                deny 127.0.0.1;      # denied IP
                allow 172.18.5.54;   # allowed IP
            }
        }
    }

[1] Global block: configures directives that affect Nginx globally: typically the user/group that runs the Nginx server, the PID file path of the Nginx process, the log path, configuration file includes, the number of worker processes allowed to be spawned, and so on.
[2] events block: configuration that affects the Nginx server's network connections with users: the maximum number of connections per process, which event-driven model is used to handle connection requests, whether multiple network connections may be accepted at the same time, and whether connection accepts are serialized across processes.
[3] http block: can nest multiple server blocks and configures most features, such as proxying, caching, log definitions, and third-party modules: e.g. file includes, MIME type definitions, custom logging, whether to use sendfile for file transfer, connection timeouts, and the number of requests per connection.
[4] server block: configures the parameters of a virtual host; one http block may contain multiple server blocks.
[5] location block: configures request routing and the handling of various pages.

4, Differences between Nginx and Apache

[1] Nginx is an event-based web server (select, epoll, etc.), while Apache is a process-based server;
[2] In Nginx one thread handles many requests, while in Apache a single thread handles a single request;
[3] Nginx performs well at load balancing, while Apache rejects new connections once traffic reaches its process limit;
[4] Nginx's scalability and performance do not depend on hardware, while Apache relies on hardware components such as CPU and memory;
[5] Nginx is better in memory consumption and connection handling, areas where Apache has not improved;
[6] Nginx emphasizes speed, Apache emphasizes power (features);
[7] Nginx avoids spawning subprocesses per request, while Apache is based on subprocesses;
[8] Nginx handles requests asynchronously and without blocking, while Apache is blocking; under high concurrency Nginx keeps resource consumption low and performance high;

5, What are the Nginx load balancing algorithms

At present, Nginx's upstream supports the following allocation methods, three built in and two via third-party modules (see the sketch after this list):
[1] Round robin (default): requests are distributed one by one to the back-end servers in chronological order; if a back-end server goes down, it is removed automatically;
[2] weight: specifies a polling weight; the weight is proportional to the share of traffic, used when the back-end servers are unevenly powered;
[3] ip_hash: each request is assigned according to the hash of the client IP, so each visitor consistently reaches the same back-end server, which solves the session problem;
[4] fair (third party): requests are assigned according to the back-end servers' response times, with shorter response times served first;
[5] url_hash (third party): requests are assigned according to the hash of the URL;
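As an illustration, a minimal upstream sketch (the server addresses are placeholders; fair and url_hash would additionally require their third-party modules to be compiled in):

    upstream backend {
        # weighted round robin: the first server gets roughly twice the requests of the second
        server 192.168.0.1:8080 weight=2;
        server 192.168.0.2:8080 weight=1;
        server 192.168.0.3:8080 backup;   # only used when the others are down
        # ip_hash;                        # uncomment to pin each client IP to one server instead
    }
    server {
        listen 80;
        location / {
            proxy_pass http://backend;
        }
    }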

6, Common status codes

499: an Nginx-specific status code: server-side processing took too long and the client actively closed the connection before the response finished.
[reference blog]: https://www.cnblogs.com/kevingrace/p/7205623.html
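Where 499s show up behind a reverse proxy, the timeout directives involved look roughly like this sketch (the upstream name and the values are assumptions for illustration): if the upstream regularly takes longer than clients are willing to wait, the client hangs up first and Nginx logs 499.

    location /api/ {
        proxy_pass http://backend;   # "backend" is a placeholder upstream
        proxy_connect_timeout 5s;    # time allowed to reach the upstream
        proxy_read_timeout 60s;      # upstream replies slower than the client's patience end as 499
    }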

7, Static resource allocation

Nginx works well as a cache server for static resources. For example, in a project where the front end and back end are separated, to speed up the response of front-end pages we can put front-end resources such as HTML, JS, CSS, and images into a directory specified by Nginx; access then only requires an IP plus a path, which is efficient and fast. Let's look at how to use Nginx as a static resource server under Windows.
[1] Modify the nginx.conf configuration file;
[2] The main configuration parameters are as follows (irrelevant parameters have been removed). Note that multiple location blocks can be configured, so paths can be assigned per business need, which eases later operation and maintenance:

   
    server {
        # Serve static html from a local absolute path
        location / {
            #root html;
            root D:/tools/nginx/2/html1;
            index index.html index.htm;
        }
        # The /upload/ path prefix serves images from a local absolute path
        location /upload/ {
            alias D:/tools/nginx/2/image1/;
            autoindex on;
        }
        # The /pages/ path prefix serves static html from a local absolute path
        location /pages/ {
            alias D:/tools/nginx/2/html1/;
            autoindex on;
        }
        # Fine-tune static resource parameters and optimize access to static files (common in production)
        location ~ .*\.(gif|jpg|jpeg|png)$ {
            expires 24h;
            root D:/tools/nginx/2/image1/;            # image storage path
            proxy_store on;
            proxy_temp_path D:/tools/nginx/2/image1/; # temporary path for fetched images
            proxy_redirect off;
            proxy_set_header Host 127.0.0.1;
            client_max_body_size 10m;
            client_body_buffer_size 1280k;
            proxy_connect_timeout 900;
            proxy_send_timeout 900;
            proxy_read_timeout 900;
            proxy_buffer_size 40k;
            proxy_buffers 40 320k;
            proxy_busy_buffers_size 640k;
            proxy_temp_file_write_size 640k;
            if ( !-e $request_filename) {
                proxy_pass http://127.0.0.1;          # default port 80
            }
        }
        error_page 500 502 503 504 /50x.html;
        location = /50x.html {
            root html;
        }
    }

8, Reverse proxy configuration

A reverse proxy server can hide the existence and characteristics of the origin server. It acts as an intermediate layer between the Internet and the web server, which is good for security, especially when you use web hosting services.

[reverse proxy]: https://blog.csdn.net/zhengzhaoyang122/article/details/94303198
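A minimal reverse-proxy sketch along those lines (the domain, upstream address, and header set are illustrative assumptions):

    server {
        listen 80;
        server_name www.example.com;           # placeholder domain
        location / {
            proxy_pass http://127.0.0.1:8080;  # the origin server stays hidden from clients
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }
    }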

9, Nginx process model

By default, Nginx works in multi-process mode. After startup it runs one master process and several worker processes. The master acts as the interface between the whole process group and the user; it monitors the processes and manages the workers, implementing service restart, smooth upgrade, log file replacement, and configuration changes taking effect in real time. The workers handle the actual network events; they are peers and compete with each other to handle requests from clients. The Nginx process model is shown in the figure below:

The master uses signals to control the worker processes.

10, What are the benefits of one master and multiple workers

[1] nginx -s reload can be used for hot deployment: the master reloads the configuration without interrupting service;
[2] Each worker is an independent process; if one worker has a problem, the other workers, being independent, keep competing for and handling requests, so the service is not interrupted;

12, How many workers are appropriate

It is generally best for the number of workers to equal the number of CPU cores on the server;
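On newer Nginx versions this can be expressed directly in the configuration (a sketch; auto sizes the worker count to the CPU count):

    worker_processes auto;       # one worker per CPU core
    #worker_cpu_affinity auto;   # optionally pin each worker to a core (Linux, newer versions)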

13, How many worker connections does a request occupy

When accessing a static resource, two: the request from the client and the return of the static resource, one each way;
When accessing a non-static resource, four: in addition, Nginx must send the request on to a node in the Tomcat cluster for processing, and Tomcat must return the result once it is done, again one each way;

14, Nginx has one master and four workers, each worker supporting at most 1024 connections. What is the maximum number of concurrent connections supported

For ordinary static access, the maximum concurrency is worker_connections * worker_processes / 2; with HTTP used as a reverse proxy it is worker_connections * worker_processes / 4 (as in question 13: a static request occupies two connections, a proxied request four). So with four workers of 1024 connections each, static access supports up to 4 * 1024 / 2 = 2048 concurrent connections and reverse proxying up to 4 * 1024 / 4 = 1024.

15, How Nginx works

Nginx consists of a kernel plus modules. The kernel itself does little: when it receives an HTTP request, it merely maps the request to a location block by consulting the configuration file, and each directive configured in that location starts a different module to do the actual work, so the modules are the real laborers of Nginx. Usually the directives in a location involve one handler module and several filter modules (and multiple locations can of course reuse the same module). The handler module processes the request and generates the response content, while the filter modules post-process that content. Modules developed by users for their own needs are third-party modules. It is the support of all these modules that makes Nginx so powerful. Structurally, Nginx modules fall into core modules, base modules, and third-party modules:
[1] Core modules: the HTTP module, EVENT module, and MAIL module;
[2] Base modules: the HTTP Access module, HTTP FastCGI module, HTTP Proxy module, and HTTP Rewrite module;
[3] Third-party modules: the HTTP Upstream Request Hash module, the Notice module, and the HTTP Access Key module.

Functionally, Nginx modules fall into the following three categories:
[1] Handlers (processor modules). These modules process requests directly, performing operations such as producing output and modifying header information. In general only one handler module can be active per request;
[2] Filters (filter modules). These modules mainly modify the content produced by the handler modules;
[3] Proxies (proxy modules). These are Nginx's HTTP Upstream modules; they mainly interact with back-end services, such as FastCGI, to implement service proxying and load balancing.

16, Load balancing configuration

[blog link]: https://blog.csdn.net/zhengzhaoyang122/article/details/94287448

17, Virtual host configuration

[blog link]: https://blog.csdn.net/zhengzhaoyang122/article/details/93793377

18, Nginx common commands

[1] Start nginx:

[root@LinuxServer sbin]# /usr/local/nginx/sbin/nginx -c /usr/local/nginx/conf/nginx.conf
   

[2] Stop Nginx: nginx -s stop (quick stop) or nginx -s quit (graceful stop);
[3] Reload the configuration: ./sbin/nginx -s reload or service nginx reload;
[4] Reload with a specified configuration file; -c means use the specified file instead of the default conf/nginx.conf:

[root@LinuxServer sbin]# /usr/local/nginx/sbin/nginx -c /usr/local/nginx/conf/nginx.conf ;
   

[5] Show the Nginx version: nginx -v;
[6] Check whether the configuration file is correct: nginx -t;
[7] Show help information: nginx -h;

19, Please explain what the C10K problem is

The C10K problem refers to the inability to handle the network sockets of a large number of clients (10K, i.e. 10,000) at the same time.

20, How does Nginx handle an HTTP request

[1] First, when Nginx starts, it parses the configuration file to obtain the ports and IP addresses to listen on, then initializes the listening sockets in the master process (creating the socket, setting options such as addr and reuse, binding to the specified IP address and port, and calling listen).
[2] Then it forks multiple child processes (an existing process calls fork to create a new process; the new process is called a child process).
[3] After that, the child processes compete to accept new connections. At this point a client can initiate a connection to Nginx: once the client completes the three-way handshake and establishes a connection, one child process accepts successfully, obtains the socket of the established connection, and creates Nginx's encapsulation of the connection, the ngx_connection_t structure.
[4] Next, the read and write event handler functions are set and read/write events are added to exchange data with the client. Part of the logic here continues in the question "How does Nginx achieve high concurrency?" below.
[5] Finally, either Nginx or the client actively closes the connection; at this point the connection comes to its end.

21, The difference between FastCGI and CGI

FastCGI is a scalable, high-speed interface for communication between an HTTP server and a dynamic scripting language. Most popular HTTP servers support FastCGI, including Apache, Nginx, and lighttpd; it is likewise supported by many scripting languages, including PHP.
FastCGI was developed and improved from CGI. The main drawback of the traditional CGI interface is poor performance: every time the HTTP server encounters a dynamic program, the script parser must be restarted to perform the parsing, and the result is then returned to the HTTP server. This is almost unusable under highly concurrent access. The traditional CGI interface also has poor security and is rarely used nowadays.
The FastCGI interface adopts a C/S structure, which separates the HTTP server from the script-parsing server and starts one or more script-parsing daemons on the latter. Whenever the HTTP server encounters a dynamic program, it can hand it directly to a FastCGI process for execution and then return the result to the browser. This lets the HTTP server concentrate on static requests, or on returning the dynamic script server's results to the client, which greatly improves the performance of the whole application system.

[1] CGI: based on the request content, the web server forks a new process to run the external C program (or Perl script, etc.); that process returns the processed data to the web server, the web server sends the content to the user, and the forked process exits. The next time the user requests the dynamic script, the web server forks a new process again, round and round.

[2] FastCGI: when the web server receives a request, it does not fork again (because the process started when the web server started and does not exit). The web server passes the content directly to this process (by inter-process communication; FastCGI generally uses TCP). After receiving the request, the process handles it, returns the result to the web server, and then waits for the next request instead of exiting.

To sum up, the difference lies in whether a process is repeatedly forked to handle requests.

22, How does Nginx achieve high concurrency

One master process and multiple worker processes, each worker able to handle many requests. Every incoming request is handled by a worker, but not from start to finish in one go: the worker processes it up to the point where blocking might occur, for example forwarding the request to the upstream (back-end) server and waiting for the response. Instead of waiting, the worker goes on to handle other requests; once the upstream returns, the corresponding event fires and the worker picks the request up again and carries it forward. Given the nature of a web server's work, most of each request's lifetime is spent in network transmission, and not much time is actually spent on the server machine. This is the secret of how a handful of processes solves high concurrency: a web server is a network-I/O-intensive application, not a compute-intensive one.

23, Why not use multithreading

Apache: creates multiple processes or threads, each of which is allocated its own CPU time and memory (threads are much cheaper than processes, which is why the worker MPM supports higher concurrency than prefork). Under high concurrency this drains server resources.
Nginx: a single thread handles requests asynchronously and without blocking, using epoll (the administrator can configure the number of worker processes under the Nginx master process). It does not allocate CPU and memory per request, which saves a great deal of resources and greatly reduces CPU context switching. That is why Nginx supports higher concurrency.

24, Why is Nginx performance so high

Thanks to its event-handling mechanism: asynchronous, non-blocking event handling using the epoll model, which provides a ready queue of events so that only connections with pending events are processed;

25, Design of memory pool

To avoid memory fragmentation, reduce the number of memory requests made to the operating system, and lower the development complexity of each module, Nginx uses a simple memory pool (allocated as a whole and released as a whole). For example, a memory pool is allocated for each HTTP request and the entire pool is destroyed when the request ends.

26, Platform independent code implementation

The core code is implemented in operating-system-independent code, while the OS-specific system calls are implemented separately for each operating system, which is what makes Nginx portable.

27, Communication between Nginx processes

When the master process starts a worker process, it first builds a one-way channel and passes it to the worker. This channel is unlike an ordinary pipe: it runs in a single direction, from the master process to the worker process, and carries the instructions the master wants to send to the worker, the worker process ID, the worker's index in the worker process table, and the necessary file descriptors.
The master process communicates with the outside world through the signal mechanism. When it receives a signal that must be handled, it sends the appropriate instructions to the relevant worker processes through these channels. Each worker process can capture readable events on its channel; when a readable event occurs, the worker reads and parses the instruction from the channel and then takes the corresponding action. This is how the master process and the worker processes interact.

28, Event driven framework

Nginx's event-driven framework: in an event-driven architecture, briefly, events are generated by event sources, collected and distributed by one or more event collectors (epoll, etc.), and many event handlers register for the events they are interested in and "consume" them. Nginx does not use a process or thread as the event consumer; the consumer can only be a module, which the current process calls.

In traditional web servers (such as Apache), so-called events are limited to the establishment and closing of TCP connections; other reads and writes are not event-driven, at which point the model degenerates into batch mode, executing each operation in sequence. Each request then keeps occupying system resources from the moment the connection is established until it is closed: a great waste of memory, CPU, and other resources. And the event consumer is a whole process or thread. Herein lies the important difference between traditional web servers and Nginx: the former dedicate a process resource to each event consumer, while in Nginx a consumer is only called briefly by the event-distributor process.

29, Multistage asynchronous processing of requests

Multi-stage asynchronous processing of requests can only be implemented on top of an event-driven framework: the processing of a request is divided into multiple stages according to how events are triggered, and each stage can be triggered by the event collector/distributor (epoll, etc.). For example, handling an HTTP request can be divided into seven stages.

30, In Nginx, please explain the difference between break and last in the Rewrite module

The official documentation defines them as follows:
[1] last: stops processing the current round of ngx_http_rewrite_module directives, then searches for a new location matching the changed URI;
[2] break: stops processing the current round of ngx_http_rewrite_module directives;
[Example]: as follows:

   
    location /test1.txt/ {
        rewrite /test1.txt/ /test2.txt break;
    }
    location ~ test2.txt {
        return 500;
    }

Using break, the URI is matched only in the current round: after the rewrite, Nginx stops matching against the locations below and directly serves the request www.xxx.com/test2.txt; because the file test2.txt does not exist, a 404 is returned directly.
Using last, Nginx keeps searching the locations below for one matching the rewritten /test2.txt request, making at most ten such attempts; if there is still no result after ten, it behaves the same as break. Back in the example above, /test2.txt happens to match the second location's condition, the code inside its curly braces {} executes, and 500 is returned.
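For contrast, a sketch of the same example with last instead (same two locations as above):

    location /test1.txt/ {
        rewrite /test1.txt/ /test2.txt last;  # re-match the rewritten URI against the locations
    }
    # the rewritten /test2.txt now matches "location ~ test2.txt" and returns 500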

31, Please list the best uses of Nginx server

The best use of the Nginx server is to serve dynamic HTTP content on the network, using SCGI or WSGI application servers and FastCGI handlers for scripts. It can also serve as a load balancer.

32, Optimization of the nginx.conf configuration file

   
    # The number of Nginx processes to start is generally equal to the number of CPU cores; in practice usually 4 or 8.
    # Each Nginx process consumes about 10 MB of memory.

    worker_cpu_affinity
    # Linux only; binds worker processes to CPUs (does not work on 2.4-kernel machines).
    # With 8 CPUs the assignment is as follows:
    worker_cpu_affinity 00000001 00000010 00000100 00001000 00010000 00100000 01000000 10000000;
    # Nginx can use multiple worker processes, for the following reasons:
    #   - to use SMP
    #   - to decrease latency when workers are blocked on disk I/O
    #   - to limit the number of connections per process when select()/poll() is used
    # worker_processes and worker_connections from the events section let you calculate the max_clients value:
    #   max_clients = worker_processes * worker_connections

    worker_rlimit_nofile 102400;
    # The maximum number of open file descriptors per Nginx process. This should match the system's per-process
    # open-file limit (65535 under the Linux 2.6 kernel, so worker_rlimit_nofile 65535 accordingly). Nginx does
    # not distribute requests across processes perfectly evenly, and exceeding the limit returns a 502 error,
    # so a slightly larger value is used here.

    use epoll;
    # Nginx uses the newest network I/O models: epoll (Linux 2.6 kernel) and kqueue (FreeBSD), whereas Apache
    # uses the traditional select model. For handling the reads and writes of large numbers of connections, the
    # select model Apache adopts is very inefficient; in highly concurrent servers, polling I/O is the most
    # time-consuming operation. Squid and Memcached, which also withstand high concurrency under Linux, both
    # use the epoll network I/O model.

    worker_connections 65535;
    # The maximum number of simultaneous connections allowed per worker process
    # (max_clients = worker_processes * worker_connections).

    keepalive_timeout 75;
    # Keepalive timeout. Note this sentence from the official documentation:
    #   "The parameters can differ from each other. Line Keep-Alive: timeout=time understands Mozilla and
    #    Konqueror. MSIE itself shuts keep-alive connection approximately after 60 seconds."

    client_header_buffer_size 16k;
    large_client_header_buffers 4 32k;
    # Client request header buffer sizes. Nginx by default uses client_header_buffer_size to read header values;
    # if a header is too large it switches to large_client_header_buffers. If the buffers are set too small, an
    # oversized HTTP header/cookie triggers a 400 error (nginx 400 bad request), and a request line exceeding the
    # buffer triggers an HTTP 414 error (URI Too Long). The longest HTTP header Nginx accepts must fit in one
    # buffer, otherwise it reports a 400 HTTP error (Bad Request).

    open_file_cache max=102400 inactive=20s;
    # Context: http, server, location. Specifies whether caching is enabled. If enabled, the following is
    # recorded per file: the open file descriptor, size information, and modification time; existing directory
    # information; and errors during file lookup (no such file, cannot read, etc. -- see open_file_cache_errors).
    #   max - the maximum number of cache entries; when the cache overflows, the longest-unused entries (LRU) are removed.
    # Example: open_file_cache max=1000 inactive=20s; open_file_cache_valid 30s;
    #          open_file_cache_min_uses 2; open_file_cache_errors on;

    open_file_cache_errors
    # Syntax: open_file_cache_errors on | off. Default: off. Context: http, server, location.
    # Specifies whether file-lookup errors are cached.

    open_file_cache_min_uses
    # Syntax: open_file_cache_min_uses number. Default: 1. Context: http, server, location.
    # The minimum number of uses within the inactive interval of open_file_cache for the file
    # descriptor to remain open in the cache.

    open_file_cache_valid
    # Syntax: open_file_cache_valid time. Default: 60. Context: http, server, location.
    # Specifies how often to check the validity of cached open_file_cache entries.

    # Enable gzip:
    gzip on;
    gzip_min_length 1k;
    gzip_buffers 4 16k;
    gzip_http_version 1.0;
    gzip_comp_level 2;
    gzip_types text/plain application/x-javascript text/css application/xml;
    gzip_vary on;

    # Cache static files:
    location ~* ^.+\.(swf|gif|png|jpg|js|css)$ {
        root /usr/local/ku6/ktv/show.ku6.com/;
        expires 1m;
    }
33, Kernel parameter optimization for Nginx

   
Kernel parameter optimization here mainly means tuning Linux system kernel parameters for Nginx applications. An optimization example is given below for reference.

    net.ipv4.tcp_max_tw_buckets = 6000
    net.ipv4.ip_local_port_range = 1024 65000
    net.ipv4.tcp_tw_recycle = 1
    net.ipv4.tcp_tw_reuse = 1
    net.ipv4.tcp_syncookies = 1
    net.core.somaxconn = 262144
    net.core.netdev_max_backlog = 262144
    net.ipv4.tcp_max_orphans = 262144
    net.ipv4.tcp_max_syn_backlog = 262144
    net.ipv4.tcp_synack_retries = 1
    net.ipv4.tcp_syn_retries = 1
    net.ipv4.tcp_fin_timeout = 1
    net.ipv4.tcp_keepalive_time = 30

Add the above kernel parameter values to the /etc/sysctl.conf file, then execute the following command to make them take effect:

    [root@localhost home]# /sbin/sysctl -p

The options in the example mean the following:
net.ipv4.tcp_max_tw_buckets: sets the maximum number of TIME_WAIT sockets; the default is 180000, set to 6000 here.
net.ipv4.ip_local_port_range: sets the range of local ports the system is allowed to open; under high concurrency the range must be widened, otherwise port numbers run out.
net.ipv4.tcp_tw_recycle: enables fast recycling of TIME_WAIT sockets.
net.ipv4.tcp_tw_reuse: enables reuse, allowing TIME-WAIT sockets to be reused for new TCP connections.
net.ipv4.tcp_syncookies: enables SYN cookies, so that cookies handle the case where the SYN wait queue overflows.
net.core.somaxconn: default 128; this parameter adjusts the number of TCP connections the system accepts at the same time. Under highly concurrent requests the default may lead to connection timeouts or retransmissions, so the value must be tuned in combination with the number of concurrent requests.
net.core.netdev_max_backlog: the maximum number of packets allowed to queue when a network interface receives packets faster than the kernel can process them.
net.ipv4.tcp_max_orphans: sets the maximum number of TCP sockets in the system not attached to any user file handle. Beyond this number, orphaned connections are reset immediately and a warning is printed. This limit exists only to prevent simple DoS attacks; do not rely on it too much or artificially lower it. In most cases it should be increased.
net.ipv4.tcp_max_syn_backlog: records the maximum number of connection requests that have not yet received client confirmation. The default is 1024 for systems with 128 MB of memory and 128 for systems with little memory.
net.ipv4.tcp_synack_retries: determines how many SYN+ACK packets the kernel sends before abandoning the connection.
net.ipv4.tcp_syn_retries: how many SYN packets the kernel sends before abandoning the connection.
net.ipv4.tcp_fin_timeout: determines how long a socket stays in the FIN-WAIT-2 state when the sending side has requested that the socket be closed (the peer may err and never close the connection, or even crash unexpectedly). The default is 60 seconds. Setting this value correctly is very important: even a lightly loaded web server risks memory overflow from a large number of dead sockets. FIN-WAIT-2 is less dangerous than FIN-WAIT-1 because it consumes at most 1.5 KB of memory, but it can live longer.
net.ipv4.tcp_keepalive_time: how often TCP sends keepalive messages when keepalive is enabled. The default is 2 (in hours).

34, Optimize the performance of Nginx with TCMalloc

   
TCMalloc, whose full name is Thread-Caching Malloc, is a member of google-perftools, an open-source tool set developed by Google. Compared with the standard glibc library's malloc, the TCMalloc library allocates memory far more efficiently and quickly, which greatly improves server performance under high concurrency and thereby lowers system load. Below is a brief introduction to adding TCMalloc support to Nginx.
To install the TCMalloc library, two packages are needed: libunwind (not required on 32-bit operating systems) and google-perftools. The libunwind library provides basic call-chain and call-register functionality for programs on 64-bit CPUs and operating systems. The specific steps for optimizing Nginx with TCMalloc follow.

1) Install the libunwind library
The corresponding libunwind version can be downloaded from http://download.savannah.gnu.org/releases/libunwind; here it is libunwind-0.99-alpha.tar.gz. The installation goes as follows:

    [root@localhost home]# tar zxvf libunwind-0.99-alpha.tar.gz
    [root@localhost home]# cd libunwind-0.99-alpha/
    [root@localhost libunwind-0.99-alpha]# CFLAGS=-fPIC ./configure
    [root@localhost libunwind-0.99-alpha]# make CFLAGS=-fPIC
    [root@localhost libunwind-0.99-alpha]# make CFLAGS=-fPIC install

2) Install google-perftools
The corresponding google-perftools version can be downloaded from http://google-perftools.googlecode.com; here it is google-perftools-1.8.tar.gz. The installation goes as follows:

    [root@localhost home]# tar zxvf google-perftools-1.8.tar.gz
    [root@localhost home]# cd google-perftools-1.8/
    [root@localhost google-perftools-1.8]# ./configure
    [root@localhost google-perftools-1.8]# make && make install
    [root@localhost google-perftools-1.8]# echo "/usr/local/lib" > /etc/ld.so.conf.d/usr_local_lib.conf
    [root@localhost google-perftools-1.8]# ldconfig

At this point the google-perftools installation is complete.

3) Recompile Nginx
To make Nginx support google-perftools, the "--with-google_perftools_module" option must be added at install time and Nginx recompiled. The installation code is as follows:

    [root@localhost nginx-0.7.65]# ./configure \
    > --with-google_perftools_module --with-http_stub_status_module --prefix=/opt/nginx
    [root@localhost nginx-0.7.65]# make
    [root@localhost nginx-0.7.65]# make install

With that the Nginx installation is complete.

4) Add a thread directory for google-perftools
Create a thread directory, placing the files under /tmp/tcmalloc. The operation is as follows:

    [root@localhost home]# mkdir /tmp/tcmalloc
    [root@localhost home]# chmod 0777 /tmp/tcmalloc

5) Modify the Nginx main configuration file
Modify the nginx.conf file, adding the following code below the pid line:

    #pid logs/nginx.pid;
    google_perftools_profiles /tmp/tcmalloc;

Then restart Nginx to complete the loading of google-perftools.

6) Verify the running state
To verify that google-perftools has loaded normally, check with the following command:

    [root@localhost home]# lsof -n | grep tcmalloc
    nginx 2395 nobody 9w REG 8,8 0 1599440 /tmp/tcmalloc.2395
    nginx 2396 nobody 11w REG 8,8 0 1599443 /tmp/tcmalloc.2396
    nginx 2397 nobody 13w REG 8,8 0 1599441 /tmp/tcmalloc.2397
    nginx 2398 nobody 15w REG 8,8 0 1599442 /tmp/tcmalloc.2398

Because worker_processes is set to 4 in the Nginx configuration file, 4 Nginx worker processes are started, and each has one line in the record. The numeric value at the end of each file name is the PID of the started Nginx worker.
At this point, optimizing Nginx with TCMalloc is complete.

35, Compilation and installation process optimization

   
1) Reduce the size of the compiled Nginx binary
Nginx compiles in debug mode by default, and in debug mode a great deal of tracing and ASSERT information is inserted, so a freshly compiled Nginx runs to several megabytes. Cancelling Nginx's debug mode before compiling yields a binary of only a few hundred kilobytes, so the relevant source can be modified before compiling. Specifically:
After unpacking the Nginx source archive, find the file auto/cc/gcc in the source directory and locate the following lines:

    # debug
    CFLAGS="$CFLAGS -g"

Comment out or delete these two lines to cancel debug mode.

2) Compile with optimizations for a specific CPU type
When compiling Nginx, the default GCC compile parameter is "-O". To optimize the GCC compilation, the following two parameters can be used:

    --with-cc-opt='-O3'
    --with-cpu-opt=CPU  # compile for a specific CPU; valid values include:
                        # pentium, pentiumpro, pentium3, pentium4, athlon, opteron, amd64, sparc32, sparc64, ppc64

To determine the CPU type, the following command can be used:

    [root@localhost home]# cat /proc/cpuinfo | grep "model name"
36, Event model supported by Nginx

Nginx supports the following methods of handling connections (I/O multiplexing methods), which can be specified with the use directive.
[1] select – the standard method, and the compile-time default when the current platform has no more efficient method. It can be enabled or disabled with the configuration parameters --with-select_module and --without-select_module.
[2] poll – the standard method, and the compile-time default when the current platform has no more efficient method. It can be enabled or disabled with the configuration parameters --with-poll_module and --without-poll_module.
[3] kqueue – an efficient method for FreeBSD 4.1+, OpenBSD 2.9+, NetBSD 2.0, and MacOS X; using kqueue on a dual-processor MacOS X system may cause a kernel crash.
[4] epoll – an efficient method for Linux kernel 2.6 and later systems. Some distributions, such as SuSE 8.2, have patches that add epoll support to the 2.4 kernel.
[5] rtsig – queued real-time signals, usable on systems with Linux kernel 2.2.19 and later. By default no more than 1024 POSIX real-time (queued) signals may be pending in the whole system, which is inefficient for highly loaded servers, so the queue size must be increased via the kernel parameter /proc/sys/kernel/rtsig-max. However, since Linux kernel 2.6.6-mm2 this parameter is no longer used and each process has its own independent signal queue, whose size is adjusted with the RLIMIT_SIGPENDING parameter. When the queue becomes too congested, Nginx abandons it and falls back to the poll method for handling connections until things return to normal.
[6] /dev/poll – an efficient method for Solaris 7 11/99+, HP/UX 11.22+ (eventport), IRIX 6.5.15+, and Tru64 UNIX 5.1A+.
[7] eventport – an efficient method for Solaris 10; to prevent kernel crashes, the relevant security patch must be installed.

37, Operating principle of Nginx + FastCGI

Nginx does not support direct invocation or parsing of external programs; all external programs (including PHP) must be invoked through the FastCGI interface. Under Linux the FastCGI interface is a socket (either a file socket or an IP socket).
Wrapper: to invoke a CGI program, a FastCGI wrapper is also needed (a wrapper can be understood as a program used to start another program), bound to a fixed socket such as a port or a file socket. When Nginx sends a CGI request to that socket, the wrapper receives the request through the FastCGI interface and then forks (spawns) a new thread, which invokes the interpreter or external program to process the script and read the returned data; next, the wrapper passes the returned data back to Nginx along the fixed socket through the FastCGI interface; finally, Nginx sends the returned data (an HTML page or an image) to the client. That is the whole flow of Nginx + FastCGI, shown in the figure below.
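As a concrete illustration, a minimal sketch of handing PHP scripts to a FastCGI daemon such as php-fpm (the address, port, and paths are assumptions, not from the original text):

    location ~ \.php$ {
        root           /var/www/html;                  # assumed document root
        fastcgi_pass   127.0.0.1:9000;                 # assumed FastCGI wrapper / php-fpm socket
        fastcgi_index  index.php;
        fastcgi_param  SCRIPT_FILENAME $document_root$fastcgi_script_name;
        include        fastcgi_params;                 # parameter file shipped with Nginx
    }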

38, Nginx's multi-process event model: asynchronous and non-blocking

Although Nginx uses multiple workers to handle requests, there is only one main thread in each worker, so the number of requests it could handle synchronously would be very limited: how much concurrency can so few workers handle, and where would high concurrency come from? The answer is that this is exactly what Nginx is good at: it handles requests asynchronously and without blocking, meaning a single Nginx instance can deal with thousands of requests at the same time, and the number of requests a worker process can handle simultaneously is limited only by memory. Moreover, architecturally there is almost no synchronization-lock contention between worker processes handling concurrent requests, and worker processes normally do not enter the sleep state. Therefore, when the number of Nginx worker processes equals the number of CPU cores (ideally with each worker bound to a specific core), the cost of inter-process switching is minimal.

Compare Apache's common working mode (Apache also has an asynchronous non-blocking version, but it is rarely used because it conflicts with some built-in modules): each process handles only one request at a time. So when concurrency reaches the thousands, thousands of processes are handling requests at once. This is no small challenge for the operating system: the memory occupied by the processes is very large, the CPU overhead of process context switching is very high, and performance naturally cannot rise, while all this overhead is completely meaningless.

Topics: Java Nginx