# Chapter IV study notes

Posted by janey on Sat, 30 Oct 2021 15:16:52 +0200

## 1. Introduction to parallel computing

• Early computers had only one processing component, called a processor or central processing unit (CPU). Because of this hardware limitation, computer programs were usually written for serial computing.
To solve a problem, we first design an algorithm that describes how to solve it step by step, and then implement the algorithm as a computer program in the form of a serial instruction stream. With only one CPU, only one instruction, and therefore only one step of the algorithm, can be executed at a time. However, algorithms based on the divide-and-conquer principle (such as binary tree search and quicksort) often exhibit a high degree of parallelism, which parallel or concurrent execution can exploit to speed up the computation. Parallel computing is a computing scheme that tries to solve problems faster by using multiple processors to execute parallel algorithms.
1. Sequential algorithm and parallel algorithm
• Sequential algorithm
```
begin
  step_1
  step_2
  ···
  step_n
end
```
• Parallel algorithm
```
cobegin
  ···
coend
```
• A sequential algorithm in a begin-end code block may contain multiple steps. All steps are performed sequentially by a single task, one step at a time. When all steps are completed, the algorithm ends. In contrast, a parallel algorithm uses a cobegin-coend code block to specify its independent tasks. Within the cobegin-coend block, all tasks execute in parallel. The step that follows the cobegin-coend block is performed only after all of these tasks have completed.
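
The cobegin-coend idea can be approximated in C with POSIX threads, which these notes cover in section 4. The sketch below is only an illustration under assumptions: `NTASKS`, `task`, `results`, and `cobegin_coend` are hypothetical names, and squaring an index stands in for real work. Creating the threads plays the role of cobegin, and joining all of them plays the role of coend.

```c
#include <pthread.h>

#define NTASKS 3                 /* hypothetical number of independent tasks */

static int results[NTASKS];      /* one slot per task, so no locking is needed */

/* Each task squares its own index (a stand-in for real work). */
static void *task(void *arg)
{
    int id = *(int *)arg;
    results[id] = id * id;
    return NULL;
}

/* Runs all tasks between "cobegin" and "coend"; returns 0 on success, -1 on failure. */
int cobegin_coend(void)
{
    pthread_t tid[NTASKS];
    int ids[NTASKS];
    int started = 0;
    int failed = 0;

    /* cobegin: start every task */
    for (int i = 0; i < NTASKS; i++) {
        ids[i] = i;
        if (pthread_create(&tid[i], NULL, task, &ids[i]) != 0) {
            failed = 1;
            break;
        }
        started++;
    }

    /* coend: the next step runs only after every started task has finished */
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    return failed ? -1 : 0;
}
```

Calling `cobegin_coend()` from `main` and compiling with `gcc -pthread` should leave the squares of 0, 1, and 2 in `results`.
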
2. Parallelism and concurrency
• Generally, a parallel algorithm only identifies the tasks that can be executed in parallel; it does not specify how to map tasks to processing components. Ideally, all tasks in a parallel algorithm should execute simultaneously in real time. However, true parallel execution is only possible on systems with multiple processing components, such as multiprocessor or multi-core systems. On a single-CPU system, only one task can execute at a time. In this case, different tasks can only execute concurrently, that is, logically in parallel. On a single-CPU system, concurrency is achieved through multitasking.
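
Whether tasks can run truly in parallel depends on how many processing components are available. As a rough sketch, a program can ask the system for the number of online processors. This assumes a Linux-like system where `sysconf()` accepts `_SC_NPROCESSORS_ONLN`, which is a widely available extension rather than part of the POSIX base standard; `online_processors` and `report_parallelism` are hypothetical helper names.

```c
#include <stdio.h>
#include <unistd.h>

/* Returns the number of processors currently online, or -1 if unknown.
   Assumption: _SC_NPROCESSORS_ONLN is supported (true on Linux/glibc). */
long online_processors(void)
{
    return sysconf(_SC_NPROCESSORS_ONLN);
}

/* Prints whether tasks can run in parallel or only concurrently. */
void report_parallelism(void)
{
    long n = online_processors();

    if (n > 1)
        printf("%ld processors online: tasks can run in parallel\n", n);
    else if (n == 1)
        printf("1 processor online: tasks can only run concurrently\n");
    else
        printf("processor count unavailable\n");
}
```
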

• In the thread model, if one thread is suspended, other threads can continue to execute. In addition to sharing a common address space, threads also share many other resources of the process, such as user IDs, open file descriptors, signals, and so on.
• Advantages of threads
• (1) Faster thread creation and switching
• (3) Threads are more suitable for parallel computing
• Disadvantages of threads
• (1) Because threads share an address space, they need explicit synchronization from the user.
• (2) Many library functions may not be thread safe. For example, the traditional strtok() function, which divides a string into a series of tokens, keeps its parsing position in static storage between calls. In general, any function that uses global variables or depends on static memory contents is not thread safe. Adapting library functions to a threaded environment takes considerable work.
• (3) On a single-CPU system, using threads to solve a problem is actually slower than using a sequential program, because of the run-time overhead of creating threads and switching contexts.
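
To illustrate the thread-safety point about strtok(), here is a minimal sketch that uses the POSIX function strtok_r() instead. strtok_r() takes an extra pointer in which the caller keeps the parsing position, so no hidden static state is shared between threads. The helper name `split_words` is hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* ask for the strtok_r() declaration */
#endif
#include <string.h>

/* Splits text on spaces in place and stores up to max token pointers in out.
   Returns the number of tokens stored. */
int split_words(char *text, char *out[], int max)
{
    char *save = NULL;           /* per-call state, owned by the caller */
    int n = 0;

    for (char *tok = strtok_r(text, " ", &save);
         tok != NULL && n < max;
         tok = strtok_r(NULL, " ", &save))
        out[n++] = tok;

    return n;
}
```

Because each caller supplies its own `save` pointer, two threads can tokenize different strings at the same time without interfering with each other.
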

• The execution path of a thread is similar to that of a process. Threads can execute in kernel mode or user mode.
In user mode, threads execute in the same address space of the process, but each thread has its own execution stack. A thread is an independent execution unit. It can make system calls to the kernel and, according to the scheduling policy of the operating system kernel, be suspended and later activated to resume execution.
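
The following sketch shows what sharing an address space while owning a private stack can look like in practice. It is an illustration under assumptions: `NTHREADS`, `ROUNDS`, `shared_total`, `counting_worker`, and `run_workers` are hypothetical names. Each thread counts in a local variable on its own stack, then adds the result to a global variable that all threads share. The shared update is protected by a mutex, which is one form of the explicit synchronization that threads require.

```c
#include <pthread.h>

#define NTHREADS 4               /* hypothetical thread count */
#define ROUNDS   10000           /* hypothetical work per thread */

static long shared_total = 0;    /* global: visible to every thread */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *counting_worker(void *arg)
{
    long local = 0;              /* lives on this thread's own stack */

    (void)arg;
    for (int i = 0; i < ROUNDS; i++)
        local++;

    pthread_mutex_lock(&lock);   /* serialize access to the shared variable */
    shared_total += local;
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Starts the workers, waits for them, and returns the shared total. */
long run_workers(void)
{
    pthread_t tid[NTHREADS];
    int started = 0;

    shared_total = 0;
    for (int i = 0; i < NTHREADS; i++) {
        if (pthread_create(&tid[i], NULL, counting_worker, NULL) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    return shared_total;
}
```
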

## 4. Thread management functions

```
int pthread_create(pthread_t *pthread_id, const pthread_attr_t *attr,
                   void *(*func)(void *), void *arg);
```
• pthread_id is a pointer to a variable of type pthread_t. It is filled in with a unique thread ID assigned by the operating system kernel. In POSIX, pthread_t is an opaque type. Programmers should not rely on the contents of an opaque object, because they may depend on the implementation. A thread can get its own ID with the pthread_self() function. In Linux, pthread_t is defined as unsigned long, so a thread ID can be printed with %lu.
• attr is a pointer to another opaque data type that specifies thread attributes, which are described in more detail below.
• func is the entry address of the function that the new thread executes. arg is a pointer to the argument passed to that function, so the thread function can be written as void *func(void *arg).
• attr parameter
• (2) Initialize the attribute variable with pthread_attr_init(&attr).
• (3) Set the attribute variable and use it in the pthread_create() call.
• (4) If necessary, free the attr resources with pthread_attr_destroy(&attr).
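
The attr steps above can be sketched as follows. This is a minimal illustration, not a definitive recipe: `mark_ran` and `create_with_attr` are hypothetical names, declaring the attribute variable is assumed as the first step, and setting the detach state to joinable (which is already the default) is only an example of setting an attribute.

```c
#include <pthread.h>

/* Thread function: records that it ran through the int that arg points to. */
static void *mark_ran(void *arg)
{
    *(int *)arg = 1;
    return NULL;
}

/* Returns 1 if the thread ran, -1 if a pthread call failed. */
int create_with_attr(void)
{
    pthread_attr_t attr;         /* assumed first step: declare the attribute variable */
    pthread_t tid;
    int ran = 0;
    int rc;

    if (pthread_attr_init(&attr) != 0)          /* (2) initialize */
        return -1;

    /* (3) set an attribute, then use attr in pthread_create() */
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
    rc = pthread_create(&tid, &attr, mark_ran, &ran);

    pthread_attr_destroy(&attr);                /* (4) free attr resources */

    if (rc != 0)
        return -1;
    pthread_join(tid, NULL);
    return ran;
}
```
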
• The thread ID is an opaque data type that depends on the implementation. Therefore, you should not compare thread IDs directly. If necessary, you can compare them with the pthread_equal() function.
```
int pthread_equal(pthread_t t1, pthread_t t2);
```

It returns 0 if the two IDs refer to different threads; otherwise, it returns a nonzero value.
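
A small sketch of pthread_self() and pthread_equal() used together follows. The names `struct ids`, `check_parent`, and `compare_ids` are hypothetical. The parent passes its own ID to a child thread, and the child compares that ID with its own.

```c
#include <pthread.h>

struct ids {
    pthread_t parent;            /* ID of the creating thread */
    int same_as_parent;          /* result of the child's comparison */
};

/* Child: compares its own ID with the parent's ID. */
static void *check_parent(void *arg)
{
    struct ids *p = arg;

    p->same_as_parent = pthread_equal(pthread_self(), p->parent);
    return NULL;
}

/* Returns 1 if both comparisons behave as described, 0 if not, -1 on failure. */
int compare_ids(void)
{
    struct ids info = { pthread_self(), -1 };
    pthread_t child;
    int self_match;

    if (pthread_create(&child, NULL, check_parent, &info) != 0)
        return -1;
    pthread_join(child, NULL);

    /* A thread compared with itself yields nonzero; the child saw 0. */
    self_match = pthread_equal(info.parent, pthread_self()) != 0;
    return self_match && info.same_as_parent == 0;
}
```
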

```
void pthread_exit(void *status);
```

pthread_exit() explicitly terminates the calling thread, where status is the exit status of the thread. By convention, an exit value of 0 indicates normal termination, and a nonzero value indicates abnormal termination.

```
int pthread_join(pthread_t thread, void **status_ptr);
```

pthread_join() waits for the specified thread to terminate. The exit status of the terminated thread is returned through status_ptr.
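
As a minimal sketch of how pthread_exit() and pthread_join() work together, the thread below exits with a small integer status and the creator reads it back through the status pointer. The names `exiting_worker` and `run_and_collect` are hypothetical, and packing an integer into the void * status via intptr_t is one common convention, assumed here for illustration.

```c
#include <pthread.h>
#include <stdint.h>

/* Thread function: exits explicitly with the status code passed in arg. */
static void *exiting_worker(void *arg)
{
    intptr_t code = (intptr_t)arg;

    pthread_exit((void *)code);  /* does not return */
}

/* Creates a thread that exits with code and returns the status seen by join. */
int run_and_collect(int code)
{
    pthread_t tid;
    void *status = NULL;

    if (pthread_create(&tid, NULL, exiting_worker, (void *)(intptr_t)code) != 0)
        return -1;
    if (pthread_join(tid, &status) != 0)
        return -1;

    return (int)(intptr_t)status;
}
```
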

## 5. Practice

```
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
typedef struct{
    int upperbound;
    int lowerbound;
}PARM;
#define N 10
int a[N]={5,1,6,4,7,2,9,8,0,3};
void print(void){//print current a[] contents
    int i;
    printf("[");
    for(i=0;i<N;i++)
        printf("%d ",a[i]);
    printf("]\n");
}
void *Qsort(void *aptr){
    PARM *ap, aleft, aright;
    int pivot, pivotIndex, left, right, temp;
    int upperbound, lowerbound;
    pthread_t me, leftThread, rightThread;
    me = pthread_self();
    ap = (PARM *)aptr;
    upperbound = ap->upperbound;
    lowerbound = ap->lowerbound;
    if(lowerbound >= upperbound)//zero or one element: nothing to sort
        pthread_exit(NULL);
    pivot = a[upperbound];//use the last element as the pivot
    left = lowerbound - 1;//scan index from left side
    right = upperbound;//scan index from right side
    while(left < right){//partition loop
        do{left++;}while(a[left] < pivot);
        //stop at lowerbound so the scan never leaves the subarray
        do{right--;}while(right > lowerbound && a[right] > pivot);
        if(left < right){
            temp = a[left]; a[left] = a[right]; a[right] = temp;
        }
    }
    print();
    pivotIndex = left;//put pivot back
    temp = a[pivotIndex];
    a[pivotIndex] = pivot;
    a[upperbound] = temp;
    //set up the bounds of the left and right subarrays
    aleft.upperbound = pivotIndex - 1;
    aleft.lowerbound = lowerbound;
    aright.upperbound = upperbound;
    aright.lowerbound = pivotIndex + 1;
    printf("%lu: create left and right threads\n", me);
    pthread_create(&leftThread, NULL, Qsort, (void *)&aleft);
    pthread_create(&rightThread, NULL, Qsort, (void *)&aright);
    //wait for left and right threads to finish
    pthread_join(leftThread, NULL);
    pthread_join(rightThread, NULL);
    printf("%lu: joined with left & right threads\n", me);
    return NULL;
}
int main(int argc, char *argv[]){
PARM arg;
int i, *array; 