Data structure in project dependency analysis

Posted by TechGnome on Thu, 06 Jan 2022 03:43:19 +0100

Common problems in the project

  1. Source-code dependencies may depend on each other, directly or indirectly, and form a cycle. How can we detect this quickly?
  2. For example, when commonLib gets a major version upgrade, we need to release and verify every other aar or source module that depends on it. How can we get that list quickly?
Thinking
  • For the 58 Tongcheng ("same city") and Tongzhen ("town") apps, dependencies are maintained in a config file, so we can build a solution on top of that file. Sensitive project information is replaced with "XX", and version numbers are always shown as "1.0.0".
  1. Whether a lib is consumed as an aar or as source code is controlled by the switchs map. There are 41 libs in total:
switchs = [
        "58AppDependenciesLib"    : dependencyMode == "aar",
        "58ClientHybridLib"       : dependencyMode == "aar",
        "58HouseLib"              : dependencyMode == "aar",
        "58HuangyeLib"            : dependencyMode == "aar",
        "58PincheLib"             : dependencyMode == "aar",
        "58PartTimeLib"           : dependencyMode == "",
        "WubaCommonsLib"          : dependencyMode == "aar",
        "WubaHybridBusinessLib"   : dependencyMode == "aar"
        ...ellipsis...
]
  2. If a lib is set to source mode, its location is declared in the settings configuration. The module lives under the current project path:
include switchs['58ClientHybridLib']?'':'58ClientHybridLib'
include switchs['58HouseLib']?'':'58HouseLib'
include switchs['58AnjukeLib'] ? '' : '58AnjukeLib'
include switchs['58CarLib']?'':'58CarLib'
 ...ellipsis...
  3. The dependencies of each module (lib) are also declared in config:
 WubaPartTimeLib = [
            switchs['58PartTimeLib'] ? "com.wuba.XXX:" + "$WubaPartTimeLibVersion" : findProject(':58PartTimeLib'),
            rootProject.ext.WubaAppDependenciesLib,
            rootProject.ext.WubaTradelineLib,
            rootProject.ext.WubaWebBusinessLib,
            "com.wuba.XXX:$TtSdkVersion",
            'com.wuba.XXX:1.0.0',
            'com.wuba.XXX:1.0.0',
            rootProject.ext.ZCMPublish,
            rootProject.ext.WubaCommonsLib,
            rootProject.ext.WubaPermissionSDK
    ]
  • Looking at the PartTime lib's dependencies, some are plain artifacts such as ttsdk, while others, such as WubaCommonsLib, are modules that can themselves be switched between aar and source, just like the PartTime lib. WubaCommonsLib may in turn depend on other switchable modules. The natural way to handle this kind of nesting is recursion.
Implementation plan
  1. First, collect the names of the projects that can switch between aar and source, i.e. the data in switchs, and store them in a map.
  2. Recursively collect the libs that each lib depends on. The result looks like this:
key : WubaClientHybridLib  -- > [
com.wubaXXX:100.2.36, 
com.wubaXXXLib:100.22.41, 
com.wubanessLib:101.22.82, 
com.wubaILib:100.20.18, 
com.wubaALib:10.19.2, 
]   
key : WubaCommonsLib  -- > 
[com.wubaCommonsLib:10.26.1, 
com.wuba.SourcesLib:1.0.0, 
com.wuba.LogLib:1.0.1
]
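The recursive collection in step 2 can be sketched in Java as follows. This is a minimal sketch, not the project's actual Gradle script: the class name DependencyCollector, the method collect, and the sample map are hypothetical, and it assumes the direct dependencies are already available as a map from lib name to the names it depends on.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class DependencyCollector {
    // Returns every lib reachable from `lib` by following direct dependencies recursively.
    static Set<String> collect(String lib, Map<String, List<String>> direct) {
        Set<String> result = new LinkedHashSet<>();
        visit(lib, direct, result);
        return result;
    }

    private static void visit(String lib, Map<String, List<String>> direct, Set<String> seen) {
        for (String dep : direct.getOrDefault(lib, List.of())) {
            // add() returns false for a lib that was already visited, which also
            // keeps the recursion from looping forever if the config contains a cycle.
            if (seen.add(dep)) {
                visit(dep, direct, seen);
            }
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample data, not taken from the real config.
        Map<String, List<String>> direct = Map.of(
                "WubaClientHybridLib", List.of("WubaCommonsLib"),
                "WubaCommonsLib", List.of("WubaLogLib"),
                "WubaLogLib", List.of());
        System.out.println(collect("WubaClientHybridLib", direct));
    }
}
```

The seen set doubles as the result, so a lib shared by several parents is only expanded once.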
  • Looking at the data above: because the collection is recursive, when WubaClientHybridLib references WubaCommonsLib, the former also picks up all of the latter's dependencies. So which data format is the most convenient and direct for storing the relationships between them?
  • Requirement: it should be easy to look up the dependencies of each lib in both directions. For example, for WubaCommonsLib we want to quickly see both the underlying libraries it depends on and the libs that use it. After comparing the options, we chose a graph. Here is a brief explanation of why.
Graph
  • Definition: a graph is a complex nonlinear structure. In a graph, the relationship between nodes is arbitrary, and any two data elements may be related. A graph G consists of two sets, V (vertices) and E (edges), and is written G = (V, E)
  1. Take the data above as an example, such as the directed graph between WubaClientHybridLib, WubaCommonsLib and WubaLogLib
// Define the vertex set of the three libs as an array; each array index identifies the corresponding lib name
String[] strArr = new String[]{
        "WubaClientHybridLib", // 0 -> V0
        "WubaCommonsLib",      // 1 -> V1
        "WubaLogLib"           // 2 -> V2
};
  • Undirected graph: the degree of a vertex is the number of edges that have the vertex as an endpoint;
  • Directed graph: the degree of a vertex is split into in-degree and out-degree. The in-degree is the number of edges ending at the vertex, the out-degree is the number of edges starting from it, and the degree of the vertex is the sum of the two;
  2. From this, to get the libraries that WubaClientHybridLib (vertex V0) depends on, we only need the other endpoint of each out-edge of V0. Similarly, to find which libs depend on WubaLogLib, we only need the other endpoint of each in-edge of V2.
  • If the aar of WubaLogLib is removed, we can choose to switch only this lib to source dependency, or also every lib that depends on it. We only need to read the in-edges of WubaLogLib!
Representation of edges
  • The actual information of a graph is stored on its edges, which describe the structure of the graph. Anyone familiar with data structures knows there are basically two storage forms: arrays and linked lists. Graphs are no exception: edges are represented with either an adjacency matrix or an adjacency list
Adjacency list
  • It is a storage structure that combines sequential allocation and linked allocation. If the vertex stored in a header node has adjacent vertices, they are stored one after another in the singly linked list that the header node points to;
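For the three-lib example above, an adjacency list might look like the sketch below. It uses a List of Lists rather than hand-written linked nodes, which is a simplification made here for brevity. The class name AdjacencyListDemo and the edges V0→V1, V0→V2, V1→V2 are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class AdjacencyListDemo {
    // Builds one list of out-neighbours per vertex from (from, to) edge pairs.
    static List<List<Integer>> build(int vertexCount, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < vertexCount; i++) {
            adj.add(new ArrayList<>());
        }
        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]); // directed edge e[0] -> e[1]
        }
        return adj;
    }

    public static void main(String[] args) {
        String[] names = {"WubaClientHybridLib", "WubaCommonsLib", "WubaLogLib"};
        int[][] edges = {{0, 1}, {0, 2}, {1, 2}}; // assumed edges
        List<List<Integer>> adj = build(names.length, edges);
        for (int v = 0; v < names.length; v++) {
            System.out.println(names[v] + " -> " + adj.get(v));
        }
    }
}
```

Only existing edges take up space here, which is the main attraction of this form for sparse graphs.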
Adjacency matrix
  1. The idea is to use two arrays: a one-dimensional array for the vertex set and a two-dimensional array for the edge set;
     V0  V1  V2
V0    0   1   1
V1    0   0   1
V2    0   0   0
  • Each row shows which libraries the current lib depends on: a 1 means the row's lib depends on the column's lib
  • Each column shows which libraries depend on the current lib: a 1 means the row's lib depends on the column's lib
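Reading a row or a column of the small matrix above can be sketched like this. The class and method names are hypothetical, and the sketch assumes a square matrix where map[i][j] == 1 means lib i depends on lib j.

```java
import java.util.ArrayList;
import java.util.List;

class MatrixLookup {
    // Libs that `lib` depends on: the 1 entries in its row.
    static List<Integer> dependenciesOf(int[][] map, int lib) {
        List<Integer> out = new ArrayList<>();
        for (int j = 0; j < map.length; j++) {
            if (map[lib][j] == 1) {
                out.add(j);
            }
        }
        return out;
    }

    // Libs that depend on `lib`: the 1 entries in its column.
    static List<Integer> dependentsOf(int[][] map, int lib) {
        List<Integer> in = new ArrayList<>();
        for (int i = 0; i < map.length; i++) {
            if (map[i][lib] == 1) {
                in.add(i);
            }
        }
        return in;
    }
}
```

With the 3x3 matrix above, dependenciesOf(map, 0) returns [1, 2] and dependentsOf(map, 2) returns [0, 1].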
  2. Running the analysis on config gives the index assignment and adjacency matrix below. Interested readers can pick an entry and check whether it matches the description above;
current key : HouseLib  -- > position :   4
 current key : PincheLib  -- > position :   12
 current key : RxDataSourcesLib  -- > position :   36
 current key : BaseUILib  -- > position :   35
 current key : VideoLib  -- > position :   31
 current key : HybridBusinessLib  -- > position :   40
 current key : FinanceLib  -- > position :   14
 current key : TribeLib  -- > position :   16
 current key : JiaoyouLib  -- > position :   25
 current key : AnjukeLib  -- > position :   20
 current key : JobCommonLib  -- > position :   8
 current key : PartTimeLib  -- > position :   13
 current key : TradelineLib  -- > position :   28
 current key : TribeABLib  -- > position :   17
 current key : TownLib  -- > position :   23
 current key : AOPLib  -- > position :   37
 current key : LogLib  -- > position :   38
 current key : WalleExtLib  -- > position :   32
 current key : IMLib  -- > position :   15
 current key : WalleLib  -- > position :   33
 current key : JobLib  -- > position :   6
 current key : HuangyeLib  -- > position :   11
 current key : RNBusinessLib  -- > position :   18
 current key : SaleLib  -- > position :   10
 current key : HouseLib  -- > position :   2
 current key : BasicBusinessLib  -- > position :   29
 current key : CarLib  -- > position :   5
 current key : LoginLib  -- > position :   21
 current key : ClientHybridABLib  -- > position :   19
 current key : TownPowerLib  -- > position :   24
 current key : LocationLib  -- > position :   22
 current key : QigsawLib  -- > position :   39
 current key : JiaoyouIMLib  -- > position :   26
 current key : JobABLib  -- > position :   7
 current key : ClientHybridLib  -- > position :   1
 current key : AppDependenciesLib  -- > position :   0
 current key : DatabaseLib  -- > position :   30
 current key : NewCarLib  -- > position :   9
 current key : WebBusinessLib  -- > position :   27
 current key : HouseAJKMixLib  -- > position :   3
 current key : CommonsLib  -- > position :   34
The directed graph is:
int[][] map = new int[][]{
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,0,1,0,0},
{1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,1,0,0,0,0,1,0,1,1,1,1,1,1,1,1,0,1,1,1},
{0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,1,1,1,1,1,1,1,1,1,1,0,1,0,0},
{0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,1,1,1,1,1,1,1,0,1,0,0},
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},
{1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,0,1,0,0},
{1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,0,1,1,0},
{1,0,0,0,0,0,1,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,0,1,1,0},
{1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,0,1,0,0},
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,1,0,0},
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,1,1,1,1,1,0,1,0,0},
{1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,0,1,0,0},
{1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,0,1,0,0},
{1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,0,1,0,0},
{1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,0,1,0,0},
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,0,1,0,0},
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0},
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0},
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,1,1,1,1,1,1,1,0,1,0,0},
{1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,1,0,0,0,0,1,0,1,1,1,1,1,1,1,1,0,1,1,1},
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,0,1,0,0},
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,1,0,1,0,0},
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,1,0,1,0,0},
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,0,1},
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,0,1,0,0},
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,1,1,1,1,1,1,1,0,1,0,0},
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,1,0,1,1,1,1,1,1,1,1,0,1,0,0},
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,0,1,0,0},
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,1,1,1,1,1,0,1,0,0},
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,1,1,1,1,0,1,0,0},
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,1,0,0},
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,0,1,0,0},
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,1,0,1,0,0},
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0},
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0},
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0},
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0},
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,1,1,1,1,1,0,1,0,0},
{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,1,1,1,1,1,1,1,0,1,0,0}
};
  • For example, WubaClientHybridLib has index 1, and only one entry in column 1 (the second column) is 1. It is in row 19, which means WubaClientHybridABLib depends on it. The config confirms this:
 ClientHybridABLib = [
      switchs['ClientHybridABLib'] ? "com.wuba.XXX.ClientHybridABLib:$ClientHybridABLibVersion" : findProject('ClientHybridABLib'),
      rootProject.ext.ClientHybridLib
    ]
  • Optimization point: for the adjacency matrix above, with 41 libs an int[41][41] takes about 1681 * 4B ≈ 6.57KB, which seems acceptable. However, memory grows quadratically with the number of libs. For example, with 200 libs it takes 200 * 200 * 4B = 156.25KB. How can we reduce it?
  • Replacing int with byte does not help enough: a byte still takes 1 byte per entry. Looking at the data above, 0 means no relationship and 1 means a dependency, so why not use a single bit per entry? For the 41 libs above, one long (8 bytes = 64 bits) per row is more than enough. The whole matrix then fits in a long[41] array, which takes 41 * 8B = 328B ≈ 0.32KB. To test a direct relationship, check whether one bit of the long is 1, i.e. (nums[i] & (1L << (size - 1 - j))) != 0. To find what Vi depends on, collect the positions of the 1 bits in nums[i]. To find what depends on Vi, traverse the long[] array and collect the rows whose bit for position i is 1;
int size = 41;
long[] nums = new long[size];
// Record that lib i depends on lib j (equivalent to arr[i][j] = 1)
nums[i] |= 1L << (size - 1 - j);
// Check whether the bit for (i, j) is 1; use & rather than &= so that nums[i] is not modified
boolean hasDependency = (nums[i] & (1L << (size - 1 - j))) != 0;
assert hasDependency;
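The bit-packed matrix can be wrapped in a small class, sketched below. The class BitMatrix and its method names are hypothetical. It keeps the same bit layout as the snippet above and assumes at most 64 libs, since each row is a single long.

```java
class BitMatrix {
    private final int size;
    private final long[] rows;

    BitMatrix(int size) {
        if (size > 64) {
            throw new IllegalArgumentException("one long per row holds at most 64 libs");
        }
        this.size = size;
        this.rows = new long[size];
    }

    // Bit used for column j, same layout as the snippet above.
    private long mask(int j) {
        return 1L << (size - 1 - j);
    }

    // Records that lib i depends on lib j.
    void addDependency(int i, int j) {
        rows[i] |= mask(j);
    }

    // Read-only test of the (i, j) bit.
    boolean dependsOn(int i, int j) {
        return (rows[i] & mask(j)) != 0;
    }

    // Out-degree of lib i: number of 1 bits in its row.
    int outDegree(int i) {
        return Long.bitCount(rows[i]);
    }

    // In-degree of lib j: number of rows whose bit j is 1.
    int inDegree(int j) {
        int count = 0;
        for (long row : rows) {
            if ((row & mask(j)) != 0) {
                count++;
            }
        }
        return count;
    }
}
```

For more than 64 libs, one option is a java.util.BitSet per row instead of a long.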

Solution

Question one
  • From the above we have the adjacency matrix of the dependency graph. How can we quickly detect whether it contains a cycle? Readers who practice on LeetCode may recall problem 207, Course Schedule. Two traversal approaches are used: breadth-first (BFS) and depth-first (DFS)
  1. To detect a cycle, first compute the in-degree array degrees[n] for all nodes. Every node with in-degree 0 is pushed onto a stack (DFS) or added to a queue (BFS). We take BFS as the example;
  2. Traverse degrees and enqueue every node with degrees[i] equal to 0. Then process the queue: take out the head element and remove its out-edges, which means decrementing the in-degree of the node at the other end of each out-edge. If that in-degree becomes 0, enqueue the node. Repeat until the queue is empty;
// Store the in-degree of each node: the in-degree of node i is the number of 1s in column i
int[] degrees = new int[map.length];
for (int i = 0; i < map.length; i++) {
    for (int j = 0; j < map.length; j++) {
        degrees[i] += map[j][i];
    }
}
BFSSearch(map, degrees);

// Requires: import java.util.LinkedList;
/**
 * BFS (breadth-first traversal) uses a queue; DFS (depth-first traversal) uses a stack.
 * LinkedList can serve as both:
 * Stack: push adds an element at the head, pop removes and returns the head, peek returns the head without removing it
 * Queue: offer adds an element at the tail, poll removes and returns the head, peek returns the head without removing it
 * Note: this method modifies both map and indegree.
 */
public void BFSSearch(int[][] map, int[] indegree) {
    int count = 0; // number of nodes processed; fewer than n means there is a cycle
    // Used as a queue here; the name is kept because the mark points below turn it into a stack
    LinkedList<Integer> stack = new LinkedList<>();
    int n = indegree.length;
    for (int i = 0; i < n; i++) {
        if (indegree[i] == 0) {
            stack.offer(i);     // --------Mark point 1------------
            indegree[i] = -1;   // mark as already enqueued
        }
    }

    while (!stack.isEmpty()) {
        Integer p = stack.poll();  // --------Mark point 2------------
        System.out.print(p + " ");
        count++;
        for (int j = 0; j < n; j++) { // row p holds the out-edges of p
            if (map[p][j] == 1) {
                map[p][j] = 0;        // remove the edge p -> j
                indegree[j]--;
                if (indegree[j] == 0) {
                    stack.offer(j);    // --------Mark point 3------------
                    indegree[j] = -1;
                }
            }
        }
    }
    if (count < n) {
        System.out.println("A cycle currently exists!");
    } else {
        System.out.println("There is currently no cycle!");
    }
}
  3. DFS can be implemented by changing just a few lines of the code:
Mark point 1 and mark point 3: use stack.push to push onto the stack
Mark point 2: use stack.pop to pop from the stack
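Swapping the queue for a stack keeps the in-degree approach and only changes the visiting order. A recursive depth-first search can also detect a cycle directly, by marking each node as unvisited, in progress, or done. The sketch below shows that alternative technique (three-state DFS); it is not the code used in the project, and the class and method names are hypothetical. Unlike BFSSearch, it does not modify map.

```java
class CycleCheck {
    private static final int UNVISITED = 0;
    private static final int IN_PROGRESS = 1;
    private static final int DONE = 2;

    // map[i][j] == 1 means there is an edge i -> j.
    static boolean hasCycle(int[][] map) {
        int[] state = new int[map.length]; // all UNVISITED initially
        for (int v = 0; v < map.length; v++) {
            if (state[v] == UNVISITED && dfs(map, v, state)) {
                return true;
            }
        }
        return false;
    }

    // Returns true if a cycle is reachable from v.
    private static boolean dfs(int[][] map, int v, int[] state) {
        state[v] = IN_PROGRESS;
        for (int w = 0; w < map.length; w++) {
            if (map[v][w] != 1) {
                continue;
            }
            // An edge into a node that is still on the recursion path closes a cycle.
            if (state[w] == IN_PROGRESS) {
                return true;
            }
            if (state[w] == UNVISITED && dfs(map, w, state)) {
                return true;
            }
        }
        state[v] = DONE;
        return false;
    }
}
```

Each node is entered once and each row is scanned once, so the cost on an adjacency matrix is proportional to n * n.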
Question two
  • This problem reduces to finding the in-edges of commonLib, that is, collecting the rows whose value in commonLib's column of the adjacency matrix is 1. Then add a task that switches the libs in that set to source dependency!
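A sketch of this lookup, assuming map[i][j] == 1 means lib i depends on lib j as in the matrix above. It walks in-edges breadth-first, so it also returns indirect dependents when the matrix stores only direct edges. The names DependentsFinder and findDependents are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class DependentsFinder {
    // Returns the indices of all libs that depend on `target`, directly or indirectly.
    static List<Integer> findDependents(int[][] map, int target) {
        boolean[] seen = new boolean[map.length];
        seen[target] = true;
        List<Integer> result = new ArrayList<>();
        Deque<Integer> queue = new ArrayDeque<>();
        queue.offer(target);
        while (!queue.isEmpty()) {
            int current = queue.poll();
            // Column `current` lists the libs that depend on it.
            for (int i = 0; i < map.length; i++) {
                if (map[i][current] == 1 && !seen[i]) {
                    seen[i] = true;
                    result.add(i);
                    queue.offer(i);
                }
            }
        }
        return result;
    }
}
```

The returned indices can be mapped back to lib names through the position table, and the seen array keeps the walk finite even if the graph contains a cycle.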

Reflection

  • Both the adjacency matrix and the adjacency list can represent a graph. How should we choose between them later on, and how do they differ?
  1. Adjacency matrix
    • Advantages: it can quickly tell whether there is an edge between two vertices (fast query), and edges can be added or deleted quickly;
    • Disadvantages: if most entries are unrelated (a sparse graph), the many zeros in the array waste space. Adding a vertex is also cumbersome, because the array has to be resized and its data moved;
  2. Adjacency list
    • Advantages: it saves space because only the actual edges are stored. Adding a new vertex is relatively simple: just append its data to the linked lists;
    • Disadvantages: when you need the in-degree of a vertex, you may have to traverse all the linked lists. There are solutions: 1) add an inverse adjacency list that stores the in-edges of each vertex; 2) use an orthogonal list, which records both out-degree and in-degree information
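The inverse adjacency list mentioned in solution 1) can be derived from the forward list in a single pass. The sketch below assumes the forward list is a List of Lists of vertex indices; the class name InverseAdjacency and the method invert are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class InverseAdjacency {
    // forward.get(v) holds the out-neighbours of v;
    // the result holds, for each vertex, the vertices that point to it.
    static List<List<Integer>> invert(List<List<Integer>> forward) {
        List<List<Integer>> inverse = new ArrayList<>();
        for (int i = 0; i < forward.size(); i++) {
            inverse.add(new ArrayList<>());
        }
        for (int v = 0; v < forward.size(); v++) {
            for (int w : forward.get(v)) {
                inverse.get(w).add(v); // edge v -> w makes v an in-neighbour of w
            }
        }
        return inverse;
    }
}
```

With both lists in memory, the in-degree of a vertex is simply the size of its inverse list, at the cost of storing every edge twice.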

Topics: data structure