2, Probabilistic algorithm
Probabilistic algorithm classification:
- Digital algorithm: approximate solution
- Monte Carlo algorithm: solve the decision problem or the problem with only one correct solution. The answer may not be correct.
- Las Vegas algorithm: never return wrong answers.
- Sherwood algorithm: always give the right answer.
2.1 digital algorithm
Find the approximate solution of the numerical problem
\(\ Pi \) value calculation: dart throwing simulation, unit circle and circumscribed square.
Darts(n){ k=0; for i=1 to n do { x = uniform(0,1); y = uniform(0,1); if(x*x+y*y<=1) then k++; } return 4k/n; }
Digital integration: Monte Carlo integration, but not the MC algorithm in this lesson.
Calculate integral \ (S=\int_0^!f(x)dx \)
HitorMiss algorithm: darts are thrown into a square per unit area. k falls below the integral function, \ (\ frac{k}{n}=\frac{S}{1} \)
HitorMiss(f, n){ k=0; for i=1 to n do { x = uniform(0,1); y = uniform(0,1); if(y<=f(x)) then k++; } return k/n; }
Crude algorithm: randomly take points on the integral interval and multiply the arithmetic mean by the interval width\ (\int_a^bf(x)dx=(b-1)\frac{1}{n}\sum_{i=1}^{n}f(x_i)\)
Crude(f,n,a,b){ sum = 0; for i=1 to n do{ x = uniform(a,b); sum = sum + f(x); } return (b-a)*sum/n; }
Relatively speaking, the variance of Crude algorithm is smaller, but the number of iterations of HitorMiss algorithm is more at the same time.
Solution ideas: \ (\ pi=4\int_0^1\sqrt{1-x^2}dx \)
Trapezoid deterministic algorithm: Trapezoidal algorithm, which divides the interval into n-1 sub intervals
\(S=\delta*(f(a+\delta)+f(a+2\delta)+\cdots+\frac{f(a)+f(b)}{2})\)
Trapezoid(f,n,a,b){ // Suppose n > = 2 delta = (b-a)/(n-1); sum = (f(a)+f(b))/2; for(x=a+delta; x<=b-delta; x+=delta){ sum = sum + f(x); } return sum * delta; }
The deterministic algorithm sometimes can not find the solution, and the probabilistic algorithm is suitable for high-dimensional definite integral.
Probability count: an estimate of an integer value, such as the size of a set or an estimate of the number of different objects in a set.
Find the size of the set: there are randomly selected elements put back, and k is the number selected before the first occurrence of repeated elements.
\(when n is large enough, k=\beta\sqrt{n}=\sqrt{\frac{\pi}{2}n}, that is, n=\frac{2k^2}{\pi} \)
SetCount(X){ k = 0; S = {}; a = uniform(X); do{ k++; S = S U {a}; a = uniform(X); }while(a Not belong to S) return 2k**2/pi }
Estimation of the number of different objects in multiple sets: map the word to the bit string with length m, \ (\ PI (y, b) = \ begin {cases}i, y [i] = the minimum I of B \ \ K + 1, y [i]= b \end{cases}\)
WordCount(){ y[1...m+1] = 0; for Every word on the tape x do { i = pi(h(x), 1); // The minimum position equal to 1 in the hash value of X, indicating that x is expressed in 00 01.... initial y[i] = 1; } return pi(y, 0); // Returns the smallest position in y equal to 0 } 2^(k-2)<=n<=2^k,More accurate n=2^k/1.54703
2.2 Sherwood algorithm
Always give the correct answer and smooth the execution time of different input instances.
Purpose: the average performance is better, but the worst performance is poor. For example, selection and sorting are sensitive to the relative order of elements.
Basic process:
- Transform the solved instance to a random instance. (pretreatment)
- Using the deterministic algorithm to solve the random example, a solution is obtained.
- Transform this solution to the solution of the original instance. (post treatment)
RH(x){ // Calculating f(x) with Sherwood algorithm n = length(x); // The size of x is n r = uniform(A[n]); // Select an element at random y = u(x,r); // Convert original instance x to random instance y s = f(y); // Finding the solution s of y with deterministic algorithm return v(r,s); // Restore the solution of s to the solution of x }
Sherwood algorithm for selection and sorting: scramble the input instances without post-processing. Shuffle the cards before calling the deterministic algorithm
Shuffle(T){ n = length(T); for i=1 to n-1 do { // Randomly select an element in T[1...n] and put it on T[i] j = uniform(1...n); sqap(T[i],T[j]); } }
Discrete logarithm calculation: \ (a=g^x mod \ p, calculate x = log {g, P} a \)
Simple algorithm:
- Calculate \ (g^x mod \ p,0 \le x\le p-1 \), X has limited values, because the discrete logarithm \ (< g > \) is a cyclic group
- Verify that \ (a=g^x mod \ p \) is valid
dlog(g,a,p){ // When such an algorithm does not exist, the algorithm returns p x = 0; y = 1; do{ x++; y = y*g; // Calculate y=g**x }while(a!=y%p && x!=p); return x; }
Theorem:
- \(log_{g,p}(st \% p)=(log_{g,p}s+log_{g,p}t)\%(p-1)\)
- \(log_{g,p}(g^r \% p)=r,0\le r \le p-2\)
Sherwood transformation algorithm:
dlogRH(g, a, p){ r = uniform(0..p-2); b = g^r mod p; // Power module b=g**r%p c = ba mod p; // Theorem 1 y = dlog(g,c,p); // Log using deterministic algorithm_ {g,p}c, y=r+x return (y-r) mod (p-1); // Find x }
Search ordered list: numeric array and pointer array form an ordered static linked list.
Determination algorithm with time \ (O(n) \): direct sequential search, average time \ (O(\frac{n}{2}) \)
Search(x, i){ while x > val[i] do i = ptr[i]; return i; } A(x){ return Search(x, head); }
Probabilistic algorithm with time of \ (O(n) \): start sequential search from about a random point, and the average time is \ (O(\frac{n}{3}) \)
D(x){ i = uniform(1..n); y = val[i]; case{ x<y: return Search(x, head); // Search from scratch x>y: return Search(x, ptr[i]); // Search from next location otherwise: return i; // case3 x==y } }
The determination algorithm with an average time of \ (O(\sqrt{n}) \): find the subscript of the largest integer that is not greater than x in the first \ (\ left \lfloor \sqrt{n} \right \rfloor \) and look back from this number.
B(x){ // Let x be in val[1..n] i = head; max = val[i]; // The initial value of max is the minimum value in table val for j=1 to [sqrt(n)] do { // The subscript of the largest integer y greater than x cannot be found in the number n of the first radical y = val[j]; if max < y <= x then { i = y; max = y; } } return Search(x, i); }
2.3 Las Vegas algorithm
characteristic:
- More efficient algorithm
- The upper bound of time may not exist
- The solution may not be found, but if it is found, it must be correct
- The probability of success increases with the increase of execution time
General form of algorithm: \ (LV(x,y,success),x is input, y is output \)
Convention mark:
- \(p(x), probability of algorithm success \)
- \(s(x), expected time when the algorithm succeeds \)
- \(e(x), expected time when the algorithm fails \)
Generally, the LV algorithm will be run many times until it is successful
Obstinate(x){ repeat LV(x,y,success); until success; return y; }
Expected time to find the correct solution:
2.3.1 eight queens problem
requirement:
- Row no conflict: equal row numbers
- Columns do not conflict: column numbers are equal
- The main diagonals do not conflict: the difference between row and column numbers is equal
- The secondary diagonals do not conflict: the sum of row and column numbers is equal
Traditional backtracking: decide where to put each line separately
i = j = 1; while i ≤ 8 do { // Current line number i ≤ 8 Check current row i: From current column j Try to find the safe train number one by one from the beginning to the back; if Secure column number found then { Place the queen, record the column number on the stack, and set the next row as the current row(i++); j=1; //The current column is set to 1 } else { Backtracking to the previous line, i.e i--; Remove the placed queen in the row and take the next column of the Queen's column as the current column; } }
Las Vegas algorithm:
appointment:
- \(try[1...8] indicates that the ith queen is placed on (i,try[i]) \)
- \(try[1...k] is called k-promising \), if and only if the requirements of K are met
- \(\forall i!=j \in [1,k],try[i]-try[j]\notin \{i-j,0,j-i\}\)
Basic steps:
- Whether it is feasible to traverse the positions of all the columns to change the row, \ (\ frac{1}{nb} \) this column is selected under the probability
- If there is a solution, put it. If there is no solution, exit
QueensLV(success){ // Greedy LV algorithm, all queens are placed randomly col, diag45, diag135 = {}; // Columns and two diagonals are initialized to an empty set k = 0; // Line number repeat nb = 0; // Counter, nb value is the total number of open positions of (k+1)th queen // Traverse the column number to test whether (k+1,i) is safe for i=1 to 8 do { if (i!∈col) and (i-k-1!∈diag45) and (i+k+1!∈diag135) then { // Column i is available for (i+1)th queen, but it is not placed immediately nb = nb + 1; if uniform(1..nb)=1 then // Maybe in column i j = i; } } if(nb>0) then { // No safe position when nb=0 // In all nb secure locations, the probability of (k+1)th queen selecting position j column is 1/nb k = k+1; try[k] = j; col = col ∪ {j}; diag45 = diag45 ∪ {j-k}; diag135 = diag135 ∪ {j+k}; } until nb=0 or k=8; // Exit when the appropriate location cannot be found or try is 8-promising success = (nb>0); }
In line 12 of the algorithm, at all possible column positions, the probability of selecting a column is \ (\ frac{1}{nb} \), which is pushed backward from the last item.
Improved algorithm: Joint backtracking method and LV algorithm, specifying that LV algorithm is used to determine only the position of the first few lines.
...Above algorithm... until nb=0 or k=stepVegas; // stepVegas queens have been placed randomly if nb>0 then backtrace(k, col, diag45, diag135, success); else success = false;
2.3.2 modular p square root
\(x \equiv y^2 (mod \ p),x\in[1,p-1],p is an odd prime number \), then
- x is the quadratic residue of module p
- y is the square root of x module p
Theorem:
- The quadratic residue of any module p has two different square roots \ (b,(p-b) \)
- Quadratic residue of module p when the integer between 1 and p-1 is exactly half
- \(\ forall x \in [1,p-1],p is an odd prime number, with x^{(p-1)/2} \equiv \pm1(mod \ p) \)
- \(x is the quadratic residue of module p if and only if x^{(p-1)/2} \equiv 1(mod \ p) \)
How to determine whether x is the quadratic residue of module p? Calculate whether \ (x ^ {\ frac {P-1} {2} (MOD \ P) \) is 1.
How to calculate the two square roots of x module p?
-
\(if \ p\equiv 3 (mod \ 4) \), then the square root is \ (\ pm x^{(p+1)/4} \)
-
\(if \ p\equiv 1 (mod \ 4) \), only LV algorithm can be used
Example: \ (p=53\equiv1(mod \ 4),x=7 \), find the square root of module 53 of 7.
Basic steps:
- Random \ (a\in[1,p-1] \)
- Calculate c and d, \ ((a+\sqrt{x})^{(p-1)/2} \equiv c+d\sqrt{x} \ (mod \ p) \)
- If \ (d=0 \), false is returned
- If \ (c=0 \), find y, \ (dy \equiv 1 \ (mod \ p) \)
rootLV(x, p, y, success){ // Calculate y a = uniform(1..p-1); // I don't know how much a is if a**2 % p == x % p then { // Very unlikely success = true; y =a; }else{ calculation c and d,Make 0<=c,d<=p-1,(a+sqrt(x))**((p-1)/2) % p == c+d*sqrt(x) % p; if d==0 then success = false; else{ // c=0 success = true; calculation y So, d*y % p == 1, 1<=y<=p-1 // Modified Euclid algorithm } } }
The probability of success of the algorithm is greater than 0.5, so the square root of x can be obtained by calling twice on average.
2.3.3 factorization of integers
Prime factorization of integers, \ (n=p_1^{m_1}p_2^{m_2}\cdots p_k^{m_k} \)
Naive split algorithm: find a nontrivial factor of n
split(n){ // N is a prime number, return 1, otherwise return the minimum nontrivial factor of found n for i=2 to [sqrt(n)] do if n % i == 0 then return i; return 1; }
Quadratic residue of composite number: the quadratic residue in the previous section requires that p is an odd prime number, then there are exactly two different square roots. If \ (n=pq,p and q are different prime numbers \), the quadratic residue has exactly 4 square roots.
Dixon factorization algorithm:
Basic steps:
- Find the integers a and B coprime with n so that \ (a^2 \equiv b^2 \ (mod \ n) and a! \equiv \pm b \ (mod \ n)\)
- Then the greatest common factor of N and a+b is a nontrivial factor, \ (x=gcd(n, a+b) \).
Dixon(n,x,success){ // Find a nontrivial factor of composite number n if n It's an even number then { x = 2; success = true; }else{ for i=2 to [log_{2}n] do { if n**(1/i) Is an integer then { x = n**(1/i); success = true; return; } } find a, b,bring a**2 % n == b**2 % n; if a % n == +-b % n then success = false; else { x = gcd(a+b, n); // Maximum common factor success = true; } } }
K-smoothing: all prime factors of an integer x are in the first k prime numbers.
Determine the values of a and b: select n and k in advance.
-
Randomly select \ (x\in[1,n-1] \).
- If x is not coprime with n, a nontrivial factor is found
- Otherwise, \ (y=x^2 \ mod \ n \), if y is k smooth, the factorization of X and Y is saved in the table.
- Repeat until k+1 different integers x are selected.
-
Find a nonempty subset \ (\ {y_1y_2\cdots y_k \} \) in the k+1 factorization, so that the exponents of the first k primes in the product of the corresponding factorization are even (including 0), such as \ (y_1y_2y_4y_8=2^63^45^47^011^213^217^2 \)
-
If \ (s = \ {y_{s_1}, y_{s_2}, \ cdots, y_{s_k} \}, a = \ prod {y_i \ in s} x_i, B = \ sqrt {\ prod {y_i \ in s} y_i} \), and \ (if \ a! \ equiv \ PM B \ (MOD \ n) \), you only need to find the maximum common factor of a+b and n.
Time analysis: the value of K will affect the possibility that \ (x^2 \ mod \ n \) is k smoothing and the time of factor decomposition y. Usually take the following time
2.4 Monte Carlo algorithm
There are some problems that can not find an effective algorithm to obtain the correct solution. MC finds the correct solution with high probability.
appointment:
- p-correct: an MC algorithm returns a correct solution with a probability not less than p
- Consistent: an MC algorithm returns the same correct solution for the same instance.
Call the same algorithm several times and select the solution with the most occurrences to increase the probability of success of the consistent and p- correct algorithm.
Biased algorithm:
- Partial truth algorithm: the solution of MC algorithm when it returns true must be correct. Only when it returns false can it produce a wrong solution.
- Partial \ (y_0 \) algorithm: it is always correct when the algorithm returns \ (y_0 \), and the p probability is correct when it returns non \ (y_0 \).
Repeat calling a consistent, p-correct, partial \ (y_0 \) MC algorithm k times to get a \ (1-(1-p)^k \) - correct algorithm.
2.4.1 main element problems
An array with n elements. If the number of elements is greater than \ (\ frac{n}{2} \), it is the main element.
Determine whether T has a main element: set an x randomly, and then see how many there are
maj(T){ i = uniform(1..n); x = T[i]; k = 0; for j=1 to n do if T[j] == x then k = k+1; return k>n/2; } Partial truth, 0.5-Correct algorithm
Repeated calls reduce the error rate:
maj2(T){ if maj(T) then return true; // 1 success else return maj(T); // Call twice } Partial truth, 0.75-Correct algorithm
Algorithm improvement: the error probability of the control algorithm is less than \ (\ varepsilon > 0 \), and the number of calls should be \ (k=\left \lceil lg\frac{1}{\varepsilon} \right \rceil \)
// e is the error probability majMC(T, e){ k = [lg(1/e)]; // Rounding up for i=1 to k do if maj(T) then return true; return false; } // O(nlg(1/e))
2.4.2 prime number determination
To determine whether an integer is a prime number, no effective deterministic algorithm or LV algorithm has been found
Simple probabilistic algorithm:
prime(n){ d = uniform(2..[sqrt(n)]); // Rounding down return (n mod d)!=0; }
Fermat's small theorem: \ (if \ n is a prime number, then \ forall a \in [1,n-1], with a^{n-1} mod \ n = 1 \)
Transform proposition: \ (if \ \exist a \in [1,n-1], make a ^ {n-1}mod \ n! = 1, then n is not a prime \)
Primality determination algorithm: use Fermat's small theorem, partial false. That is, the algorithm returns false, which must be false.
Fermat(n){ a = uniform(1..n-1); if a**(n-1) mod n == 1 then return true; // Not necessarily correct else return false; // It must be right }
Pseudo prime number and prime pseudo evidence: the composite number conforming to Fermat's small theorem is called the pseudo prime number with a as the base, and a is called the prime pseudo evidence of n.
Repeating the Fermat test several times does not reduce the error.
Strong pseudo Prime: n is an odd number greater than 4, s and T are integers, so that \ (n-1=2^st,t is an odd number \), and the set \ (B(n) \) satisfies the following conditions:
- N is a prime number, then \ (\ forall a \in [2,n-2], a \in B(n) \)
- N is a composite number, \ (if \ a \in B(n) \), then n is a strong pseudo prime number with a as the base, and a is the strong pseudo evidence of N prime.
Btest(a, n){ // N is an odd number, 2 < = a < = n-2. If true is returned, it means that a belongs to B(n). Then n is a strong pseudo prime or prime s=0; t=n-1; // t starts with an even number repeat s++;t/=2; until t mod 2 = 1; //n-1=(2**s)t, t is an odd number x = a**t mod n; if x=1 or x=n-1 then return true; // Meet condition 1 or 2 for i=1 to s-1 do{ x = x**2 mod n; if x=n-1 then return true; // Meet condition 2 } return false; } 0.75-correct
The number of strong false evidence is much less than that of false evidence.
\(strong false evidence \ Rightarrow false evidence \)
Miller Rabin test: 0.75 - correct, false
MillRab(n){ // Odd n > 4. When it returns true, it represents prime number, and when it returns false, it represents composite number a = uniform(2..n-2); return Btest(a,n); // Test whether n is a strong pseudo prime }
RepeatMillRob: repeat the experiment k times, \ (1-4^{-k} \) - the correct MC algorithm
RepeatMillRob(n,k){ for i=1 to k do if MillRob(n) == false then return false; // It must be a composite number return true; }
If the error probability does not exceed \ (\ varepsilon \), the number of repetitions is \ (k=\left \lceil \frac{1}{2}lg\frac{1}{\varepsilon} \right \rceil \)
Determine n prime time as: \ (O(lg^3n \ lg\frac{1}{\varepsilon}) \)
2.4.3 matrix multiplication verification
Verification \ (matrix AB=C \)
Set \ (X_{1 \times n}=(a_1,a_2,\cdots,a_n),a_i\in\{0,1 \} \), then the problem is converted to \ (XAB=XC \)
Basic steps:
- Calculate \ (XA \)
- Calculate the product of \ (XA \) and \ (B \)
- Calculation \ (XC \)
GoodProduct(A,B,C,n){ for i=1 to n do x[i] = uniform(0..1); if XAB=XC then return true; else return false; }
False, 0.5-correct
Return false must be false
Improved algorithm: repeat k times
RepeatGoodProduct(A,B,C,n,k){ for i=1 to k do // Repeat k times if GoodProduct(A,B,C,n) == false then return false; return true; }
False, \ (1-2^{-k} \) - correct
Give the error probability: $2^{-k}=\varepsilon$
GP(A,B,C,n,e){ k = [lg(1/e)]; // Rounding up return RepeatGoodProduct(A,B,C,n,k); } // O(n**2log(1/e))