Analysis of Structured Exception Handling in Windows

Posted by FrancoPaddy on Sat, 11 May 2019 22:56:48 +0200

Recently I have been puzzled by a problem: our program keeps crashing for no apparent reason. In some places we know there may be a problem, but in others we have no way of telling what the problem is. What makes it more painful is that the program has to serve customers 24x7. It does not need real-time, zero-error precision, but it cannot keep dropping connections and losing data. While dealing with this, we found ways to catch errors such as access to an illegal memory address or division by zero, and that is how we came across structured exception handling. Here is a brief introduction to it, so that we first understand the cause of the problem and then how to solve it. Without further ado, let's get to the point.

What is structured exception handling

Structured exception handling (SEH) is a mechanism built into the operating system and is independent of any programming language. Using SEH in our own programs lets us concentrate on developing the key functionality and handle the exceptions that may occur in one uniform way, which makes the program more concise and readable.

Using SEH does not mean that errors in the code can be ignored altogether. It does let us separate the software's main workflow from its exception handling: we concentrate on the important and urgent work first, and then deal with the various errors the program may run into, which is not urgent but is definitely important.

When SEH is used in a program, it becomes compiler-dependent, and most of the burden falls on the compiler. For example, the compiler generates the tables (data structures) that support SEH and provides callback functions.

Note:
Don't confuse SEH with C++ exception handling. C++ exception handling is expressed with the keywords catch and throw, and SEH takes a different form. In Visual C++ on Windows, C++ exception handling is implemented by the compiler on top of the operating system's SEH.

Of all the mechanisms the Win32 operating system provides, SEH is probably the most widely used undocumented one. SEH brings to mind keywords such as __try, __finally and __except. It actually offers two facilities: termination handling and exception handling.

Termination handling

A termination handler guarantees that, no matter how one block of code (the guarded body) exits, another block of code (the termination handler) is always called and executed. The syntax is as follows:

__try
{
    //Guarded body
    //...
}
__finally
{
    //Termination handler
    //...
}

The __try and __finally keywords mark the two parts of the termination handler. The operating system and the compiler work together to ensure that the termination handler is called no matter how the guarded body exits (normally or abnormally), that is, the __finally block always gets to run.

Normal and abnormal exit of the __try block

A __try block may exit abnormally because of a return, a goto, an exception and so on, or it may exit naturally because it ran to completion. Either way, the contents of the __finally block are executed.

#include <iostream>
using namespace std;

int Func1()
{
    cout << __FUNCTION__ << endl;
    int nTemp = 0;
    __try{
        //Normal execution
        nTemp = 22;
        cout << "nTemp = " << nTemp << endl;
    }
    __finally{
        //Termination handler
        cout << "finally nTemp = " << nTemp << endl;
    }
    return nTemp;
}

int Func2()
{
    cout << __FUNCTION__ << endl;
    int nTemp = 0;
    __try{
        //Early exit: the two statements after return never run
        return 0;
        nTemp = 22;
        cout << "nTemp = " << nTemp << endl;
    }
    __finally{
        //Termination handler: still runs before the function returns
        cout << "finally nTemp = " << nTemp << endl;
    }
    return nTemp;
}

The results are as follows:

Func1
nTemp = 22  //assigned during normal execution
finally nTemp = 22  //termination handler runs

Func2
finally nTemp = 0   //termination handler runs despite the early return

As the examples show, a termination handler prevents a return statement from taking effect prematurely. When the return statement tries to leave the __try block, the compiler makes sure the __finally block runs first. This is useful, for example, when a variable is guarded by a semaphore in a multithreaded program: if the semaphore is released in the __finally block, it is released whether or not an exception occurs, so a thread never holds the semaphore forever. Once the __finally block has run, the function returns.

To make this mechanism work, the compiler has to generate extra code and the system has to do extra work (a local unwind). We should therefore avoid return statements inside a __try block, because they affect the application's performance. That hardly matters in a simple demo, but it is better avoided in a program that has to run for a long time without interruption. The __leave keyword, described later, helps us get rid of code that incurs the local-unwind overhead.

A good rule of thumb: don't put statements in a termination handler that make the __try block exit early. In other words, remove return, continue, break and goto statements from the __try and __finally blocks and place them outside the termination handler. The benefit is that the compiler does not need to catch early exits from the __try block, so it generates the least amount of code, which improves both the efficiency and the readability of the program.

#### Cleanup in the __finally block and its effect on program structure

While coding, we have to add checks that verify whether each call succeeded. If it succeeded we carry on; if it failed we have to do some extra cleanup, such as freeing memory and closing handles. With only a few checks this hardly matters, but when there are many checks and the program logic is complex, a lot of effort goes into tedious checking and branching. The program structure then looks more complicated than it is, readability drops sharply, and the code keeps growing.

In the past, when we wrote code that drives Word's VBA object model through COM, we had to obtain the objects layer by layer, check that each one was obtained successfully, perform the operation, and then release the objects again. One or two lines of VBA turn into dozens of lines of C++ (how many depends on which objects the operation touches).

The following shows why some people prefer scripting languages to C++.

To make Office easier to operate in a logical, hierarchical way, Microsoft divides each application into a tree structure according to its logical functions:

Application (Word as the example; only part of the tree is listed)
    Documents (all documents)
        Document (a document)
            ......
    Templates (all templates)
        Template (a template)
            ......
    Windows (all windows)
        Window
        Selection
        View
        ......
    Selection (the object being edited)
        Font
        Style
        Range
        ......
    ......

Only when we understand this logical hierarchy can we drive Office correctly. Take this VBA statement, for example:

Application.ActiveDocument.SaveAs "c:\abc.doc"

So the operation proceeds as follows:

Step 1: Get Application
Step 2: Get ActiveDocument from Application
Step 3: Call the Document's SaveAs function, whose parameter is the file name as a string

This is just the simplest VBA code. For a slightly more complicated example, insert a bookmark at the selection:

 ActiveDocument.Bookmarks.Add Range:=Selection.Range, Name:="iceman"

Here the process is as follows:

Get Application
Get ActiveDocument
Get Selection
Get Range
Get Bookmarks
Call method Add

Each time we acquire an object, we have to check the result, handle errors, release objects and so on. Here is the pseudo code. It is a bit long.

// Release a COM object if it was obtained, then clear the pointer.
#define RELEASE_OBJ(obj) do { if ((obj) != NULL) { \
                        (obj)->Release(); (obj) = NULL; } } while (0)

BOOL InsertBookmarInWord(const string& bookname)
{
    BOOL ret = FALSE;
    IDispatch* pDispApplication = NULL;
    IDispatch* pDispDocument = NULL;
    IDispatch* pDispSelection = NULL;
    IDispatch* pDispRange = NULL;
    IDispatch* pDispBookmarks = NULL;
    HRESULT hr = S_FALSE;

    hr = GetApplication(..., &pDispApplication);
    if (FAILED(hr) || pDispApplication == NULL)
        return FALSE;

    hr = GetActiveDocument(..., &pDispDocument);
    if (FAILED(hr) || pDispDocument == NULL){
        RELEASE_OBJ(pDispApplication);
        return FALSE;
    }

    hr = GetSelection(..., &pDispSelection);
    if (FAILED(hr) || pDispSelection == NULL){
        RELEASE_OBJ(pDispApplication);
        RELEASE_OBJ(pDispDocument);
        return FALSE;
    }

    hr = GetRange(..., &pDispRange);
    if (FAILED(hr) || pDispRange == NULL){
        RELEASE_OBJ(pDispApplication);
        RELEASE_OBJ(pDispDocument);
        RELEASE_OBJ(pDispSelection);
        return FALSE;
    }

    hr = GetBookmarks(..., &pDispBookmarks);
    if (FAILED(hr) || pDispBookmarks == NULL){
        RELEASE_OBJ(pDispApplication);
        RELEASE_OBJ(pDispDocument);
        RELEASE_OBJ(pDispSelection);
        RELEASE_OBJ(pDispRange);
        return FALSE;
    }

    hr = AddBookmark(...., bookname);
    if (SUCCEEDED(hr))
        ret = TRUE;

    // Success or failure of the last step: everything acquired is released.
    RELEASE_OBJ(pDispApplication);
    RELEASE_OBJ(pDispDocument);
    RELEASE_OBJ(pDispSelection);
    RELEASE_OBJ(pDispRange);
    RELEASE_OBJ(pDispBookmarks);
    return ret;
}

This is only pseudo code. The line count can be reduced with goto, but goto is easy to get wrong: with a chain of labels like the one below, one careless moment is enough to jump to the wrong place.

BOOL InsertBookmarInWord2(const string& bookname)
{
    BOOL ret = FALSE;
    IDispatch* pDispApplication = NULL;
    IDispatch* pDispDocument = NULL;
    IDispatch* pDispSelection = NULL;
    IDispatch* pDispRange = NULL;
    IDispatch* pDispBookmarks = NULL;
    HRESULT hr = S_FALSE;

    hr = GetApplication(..., &pDispApplication);
    if (FAILED(hr) || pDispApplication == NULL)
        goto exit6;

    hr = GetActiveDocument(..., &pDispDocument);
    if (FAILED(hr) || pDispDocument == NULL){
        goto exit5;
    }

    hr = GetSelection(..., &pDispSelection);
    if (FAILED(hr) || pDispSelection == NULL){
        goto exit4;
    }

    hr = GetRange(..., &pDispRange);
    if (FAILED(hr) || pDispRange == NULL){
        goto exit3;
    }

    hr = GetBookmarks(..., &pDispBookmarks);
    if (FAILED(hr) || pDispBookmarks == NULL){
        goto exit2;
    }

    hr = AddBookmark(...., bookname);
    if (FAILED(hr)){
        goto exit1;
    }

    ret = TRUE;
    // Labels release in reverse order of acquisition and fall through.
exit1:
    RELEASE_OBJ(pDispBookmarks);
exit2:
    RELEASE_OBJ(pDispRange);
exit3:
    RELEASE_OBJ(pDispSelection);
exit4:
    RELEASE_OBJ(pDispDocument);
exit5:
    RELEASE_OBJ(pDispApplication);
exit6:
    return ret;
}

Now let's rewrite the function with an SEH termination handler. Isn't this clearer?

BOOL InsertBookmarInWord3(const string& bookname)
{
    BOOL ret = FALSE;
    IDispatch* pDispApplication = NULL;
    IDispatch* pDispDocument = NULL;
    IDispatch* pDispSelection = NULL;
    IDispatch* pDispRange = NULL;
    IDispatch* pDispBookmarks = NULL;
    HRESULT hr = S_FALSE;

    __try{
        hr = GetApplication(..., &pDispApplication);
        if (FAILED(hr) || pDispApplication == NULL)
            return FALSE;

        hr = GetActiveDocument(..., &pDispDocument);
        if (FAILED(hr) || pDispDocument == NULL){
            return FALSE;
        }

        hr = GetSelection(..., &pDispSelection);
        if (FAILED(hr) || pDispSelection == NULL){
            return FALSE;
        }

        hr = GetRange(..., &pDispRange);
        if (FAILED(hr) || pDispRange == NULL){
            return FALSE;
        }

        hr = GetBookmarks(..., &pDispBookmarks);
        if (FAILED(hr) || pDispBookmarks == NULL){
            return FALSE;
        }

        hr = AddBookmark(...., bookname);
        if (FAILED(hr)){
            return FALSE;
        }

        ret = TRUE;
    }
    __finally{
        // Runs on every path out of the __try block, including the returns.
        RELEASE_OBJ(pDispApplication);
        RELEASE_OBJ(pDispDocument);
        RELEASE_OBJ(pDispSelection);
        RELEASE_OBJ(pDispRange);
        RELEASE_OBJ(pDispBookmarks);
    }
    return ret;
}

These functions all do the same thing. In InsertBookmarInWord the RELEASE_OBJ calls are scattered everywhere, while in InsertBookmarInWord3 all the cleanup is concentrated in the __finally block. When reading the code, you only need to look at the contents of the __try block to understand the program flow. Both functions are small, so it is worth taking the time to compare them closely.

The __leave keyword

Using the __leave keyword inside a __try block makes execution jump to the end of the __try block, from where it enters the __finally block naturally.
In InsertBookmarInWord3 from the previous example, every return in the __try block can be replaced with __leave. The difference is that return makes the __try block exit early, which forces the system to perform a local unwind and adds overhead, whereas __leave exits the __try block naturally and costs much less.

BOOL InsertBookmarInWord4(const string& bookname)
{
    BOOL ret = FALSE;
    IDispatch* pDispApplication = NULL;
    IDispatch* pDispDocument = NULL;
    IDispatch* pDispSelection = NULL;
    IDispatch* pDispRange = NULL;
    IDispatch* pDispBookmarks = NULL;
    HRESULT hr = S_FALSE;

    __try{
        hr = GetApplication(..., &pDispApplication);
        if (FAILED(hr) || pDispApplication == NULL)
            __leave;

        hr = GetActiveDocument(..., &pDispDocument);
        if (FAILED(hr) || pDispDocument == NULL)
            __leave;

        hr = GetSelection(..., &pDispSelection);
        if (FAILED(hr) || pDispSelection == NULL)
            __leave;

        hr = GetRange(..., &pDispRange);
        if (FAILED(hr) || pDispRange == NULL)
            __leave;

        hr = GetBookmarks(..., &pDispBookmarks);
        if (FAILED(hr) || pDispBookmarks == NULL)
            __leave;

        hr = AddBookmark(...., bookname);
        if (FAILED(hr))
            __leave;

        ret = TRUE;
    }
    __finally{
        // Reached naturally after __leave, so no local unwind is needed.
        RELEASE_OBJ(pDispApplication);
        RELEASE_OBJ(pDispDocument);
        RELEASE_OBJ(pDispSelection);
        RELEASE_OBJ(pDispRange);
        RELEASE_OBJ(pDispBookmarks);
    }
    return ret;
}

Exception handling

Software exceptions are something nobody wants to see, but errors do occur. For example, the CPU traps problems such as illegal memory access and division by zero. When it detects such an error, it raises the corresponding exception, and the operating system gives our application a chance to inspect the exception type and run its own code to handle it. An exception handler has the following structure:

__try {
    // Guarded body
}
__except ( exception filter ) {
    // Exception handler
}

Note the __except keyword. Every __try block must be followed by either a __finally block or an __except block. A __try block cannot have both, and it cannot have more than one __finally or __except block, but the blocks can be nested inside one another.

Basic flow of exception handling

int Func3()
{
    cout << __FUNCTION__ << endl;
    int nTemp = 0;
    __try{
        nTemp = 22;
        cout << "nTemp = " << nTemp << endl;
    }
    __except (EXCEPTION_EXECUTE_HANDLER){
        cout << "except nTemp = " << nTemp << endl;
    }
    return nTemp;
}

int Func4()
{
    cout << __FUNCTION__ << endl;
    int nTemp = 0;
    __try{
        nTemp = 22/nTemp;
        cout << "nTemp = " << nTemp << endl;
    }
    __except (EXCEPTION_EXECUTE_HANDLER){
        cout << "except nTemp = " << nTemp << endl;
    }
    return nTemp;
}

The results are as follows:

Func3
nTemp = 22  //Normal execution

Func4
except nTemp = 0 //Exception caught; nTemp keeps its old value

The __try block in Func3 performs only a simple operation, so it does not raise an exception and the code in the __except block is never executed. The __try block in Func4 attempts to divide 22 by 0, which makes the CPU trap the event and raise an exception. The system locates the __except block and hands the exception to it. The __except keyword is followed by an exception filter expression, which must evaluate to one of three values (defined in the Windows header Excpt.h):

1. EXCEPTION_EXECUTE_HANDLER
    I recognize the exception and have written code to handle it; let that code run. Execution jumps to the __except block, runs it, and then continues with the code after the block.
2. EXCEPTION_CONTINUE_SEARCH
    Keep searching outward for the next enclosing __except block and call its exception filter.
3. EXCEPTION_CONTINUE_EXECUTION
    Return to the place where the exception occurred and re-execute the CPU instruction that raised it.

There are two basic ways to write the filter:

Way 1: Use one of the three filter values directly

__try {
   ......
}
__except ( EXCEPTION_EXECUTE_HANDLER ) {
   ......
}

Way 2: Use a custom filter

```
// Custom filter: handle access violations here, pass everything else on.
LONG MyFilter ( DWORD dwExceptionCode )
{
    if ( dwExceptionCode == EXCEPTION_ACCESS_VIOLATION )
        return EXCEPTION_EXECUTE_HANDLER ;
    else
        return EXCEPTION_CONTINUE_SEARCH ;
}

__try {
    ......
}
__except ( MyFilter( GetExceptionCode() ) )
{
    ......
}
```

## Capture SEH exceptions in .NET 4.0

Starting with .NET 4.0, the CLR singles out certain exceptions (all of them SEH exceptions) and marks them as corrupted state exceptions. By default the CLR does not deliver these exceptions to managed catch blocks, so code like the following cannot catch them:

```
try{
    //....
}
catch(Exception ex)
{
    Console.WriteLine(ex.ToString());
}
```

Not everyone needs to catch these exceptions. If your program is compiled for and runs under .NET 4.0 and you do want to catch SEH exceptions in it, there are two options to try:

- In the managed program's .config file (App.Config), enable the legacyCorruptedStateExceptionsPolicy element by setting its enabled attribute to true.

  This setting tells CLR 4.0 that the entire .NET program uses the old exception-catching behavior.

- Add the HandleProcessCorruptedStateExceptions attribute to the method that needs to catch corrupted state exceptions. The attribute affects only that method and has no effect on the rest of the managed program, for example:

```
// DoWork is a placeholder method name. The attribute is defined in
// System.Runtime.ExceptionServices and applies to a method, not to a try block.
[HandleProcessCorruptedStateExceptions]
static void DoWork()
{
    try{
        //....
    }
    catch(Exception ex)
    {
        Console.WriteLine(ex.ToString());
    }
}
```
