An Analysis of How the StringFog Plugin Encrypts Strings in Dex

Posted by dgny06 on Thu, 18 Jul 2019 00:06:54 +0200

Hardening Android applications and reverse engineering them have long been hot research topics, and the contest between protection and cracking is in full swing. Dex packing and resource obfuscation exist, but they are not widely used in practice: first, most anti-reverse-engineering services are paid; second, they have a noticeable performance cost; and third, they complicate the packaging process. Most apps on the market have no defense against reverse engineering at all, so tools such as Jadx and ApkTool expose nearly everything inside them. This post does not go deeper into reverse engineering itself and instead turns to its actual topic: string encryption in Dex.

In the vast majority of Android applications, a lot of sensitive information exists as strings, such as the AppId and AppSecret of integrated third-party platforms and API endpoint addresses, and these are generally stored in plaintext. If we encrypt and replace the strings in Dex at packaging time and decrypt them at runtime, no plaintext remains in the Dex file. This cannot prevent cracking entirely, but it makes the information harder to extract and raises the security of the app.

In fact, similar techniques are already used by large companies, for example in NetEase Cloud Music. Opening that app in Jadx shows that almost all of its strings have been encrypted.

Generally speaking, there are two ways to handle string encryption.

1. During development, developers write encrypted strings and call decryption by hand. This is the simplest approach, but it is hard to maintain and labor-intensive: manually encrypting the thousands of strings in an application is very time-consuming.

2. After compilation, modify the bytecode to replace each string with its encrypted form and insert the decryption call automatically. This is the smartest approach and does not interfere with normal development, but it is somewhat harder to implement.

You have probably used the first approach to some degree, so I will not dwell on it. This post focuses on the second approach, which I call StringFog. The source code is open-sourced on GitHub for reference: https://github.com/MegatronKing/StringFog

I. Encryption

There are many ways to encrypt and decrypt data. Considering performance and ease of implementation, StringFog uses symmetric encryption, specifically a Base64 + XOR scheme.

First, let's look at the classic XOR algorithm. The code below XORs the data to be encrypted (or decrypted) with a key string, cycling through the key, which gives a simple encryption (decryption) routine:

private static byte[] xor(byte[] data, String key) {
    int len = data.length;
    int lenKey = key.length();
    int i = 0;
    int j = 0;
    while (i < len) {
        if (j >= lenKey) {
            j = 0;
        }
        data[i] = (byte) (data[i] ^ key.charAt(j));
        i++;
        j++;
    }
    return data;
}

Encryption XORs the plaintext with the key, and decryption XORs the ciphertext with the same key again to recover the plaintext. Because the XOR output is arbitrary bytes that may not form a valid string, we encode it with Base64 (and decode it before decrypting):

public static String encode(String data, String key) {
    return new String(Base64.encode(xor(data.getBytes(), key), Base64.NO_WRAP));
}

public static String decode(String data, String key) {
    return new String(xor(Base64.decode(data, Base64.NO_WRAP), key));
}

This solves both the character-encoding problem and the encryption and decryption problem (note that Base64 is, strictly speaking, an encoding and not an encryption algorithm), and the performance overhead stays small.
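The round trip described above can be tried outside Android with a minimal sketch. It assumes java.util.Base64 from the JDK in place of android.util.Base64, a non-empty key, and UTF-8 as the string encoding; the class name and the demo key are made up for illustration and are not part of StringFog.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class XorBase64Sketch {

    // XOR every byte with the key, cycling through the key characters.
    // Assumes the key is not empty.
    static byte[] xor(byte[] data, String key) {
        byte[] out = new byte[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = (byte) (data[i] ^ key.charAt(i % key.length()));
        }
        return out;
    }

    // XOR first, then Base64-encode so the result is a printable string.
    static String encode(String data, String key) {
        byte[] cipher = xor(data.getBytes(StandardCharsets.UTF_8), key);
        return Base64.getEncoder().encodeToString(cipher);
    }

    // Base64-decode first, then XOR with the same key to get the plaintext back.
    static String decode(String data, String key) {
        byte[] cipher = Base64.getDecoder().decode(data);
        return new String(xor(cipher, key), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String key = "key"; // hypothetical demo key
        String encoded = encode("abc", key);
        System.out.println(encoded);              // prints Cgca
        System.out.println(decode(encoded, key)); // prints abc
    }
}
```

Since XOR is its own inverse, the same xor routine serves both directions; only the order of the Base64 step changes.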

II. Bytecode Implantation

Finding and replacing strings in Dex is not difficult, but implanting the decryption calls at the same time is not easy. Operating on the class bytecode files before they are compiled into Dex is much easier, and the powerful ASM library is available for that. The well-known hot-fix framework Nuwa handles the CLASS_ISPREVERIFIED flag problem in the same way. Let's look at the implementation.

1. The Transform mechanism of the Android Gradle plugin

The Android Gradle plugin provides a powerful Transform mechanism for customizing bytecode and resource files while an Android project is compiled and packaged with Gradle. Jar merging, MultiDex splitting, code obfuscation and similar steps are all implemented through this mechanism. Attentive readers will have noticed tasks like the following during a build:

:app:transformClassesWithJarMergingForDebug
:app:transformClassesWithMultidexlistForDebug
:app:transformClassesWithDexForDebug

After these tasks run, the corresponding transform folders appear under the build/intermediates/transforms directory. The details are not covered here; interested readers can study them on their own.
We can therefore define a custom Transform and use the ASM library to rewrite the bytecode files. The Android Gradle plugin provides an API for this kind of extension:

def android = project.extensions.android
android.registerTransform(new StringFogTransform(project))

These two lines are written in Groovy, the language used for the custom Gradle plugin, because it is simpler and more convenient than Java for this purpose.
The first line obtains the extension of the Android plugin, which corresponds to this block in the familiar build.gradle script:

android {
    ...
}

The corresponding class is com.android.build.gradle.AppExtension, which inherits the registerTransform method from its parent class. The method registers a Transform implementation; here we register StringFogTransform.

class StringFogTransform extends Transform {

      private static final String TRANSFORM_NAME = 'stringFog'

      @Override
      String getName() {
        return TRANSFORM_NAME
      }

      @Override
      Set<QualifiedContent.ContentType> getInputTypes() {
        return ImmutableSet.of(QualifiedContent.DefaultContentType.CLASSES)
      }

}

Every custom Transform must extend the Transform class and override several methods.
First, define the name of the Transform. We use the project name, stringFog.
Second, define the input types. There are two, CLASSES and RESOURCES; we want to operate on bytecode, so we use CLASSES.
This automatically creates and adds a task named transformClassesWithStringFogFor{variant}, where {variant} corresponds to the build type, usually Debug or Release.
Transform has a few more methods to implement, mainly defining the scopes and the processing mode. We skip those details here and focus on the transform method.
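The task name follows a visible pattern. The sketch below reproduces it, assuming the pattern can be inferred from the task names listed in this post (transform + input types + With + transform name + For + variant); it is an illustration of the naming only, not code taken from the Android Gradle plugin, and the class and method names are hypothetical.

```java
final class TransformTaskNameSketch {

    // Upper-cases the first character, leaving the rest untouched.
    static String capitalize(String s) {
        return s.isEmpty() ? s : Character.toUpperCase(s.charAt(0)) + s.substring(1);
    }

    // Pattern inferred from the task names shown in this post.
    static String taskName(String inputTypes, String transformName, String variant) {
        return "transform" + capitalize(inputTypes)
                + "With" + capitalize(transformName)
                + "For" + capitalize(variant);
    }

    public static void main(String[] args) {
        // prints transformClassesWithStringFogForDebug
        System.out.println(taskName("classes", "stringFog", "debug"));
        // prints transformClassesWithDexForDebug
        System.out.println(taskName("classes", "dex", "debug"));
    }
}
```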

void transform(TransformInvocation transformInvocation) throws TransformException, InterruptedException, IOException {
  def dirInputs = new HashSet<>()
  def jarInputs = new HashSet<>()

  // Collecting inputs.
  transformInvocation.inputs.each { input ->
      input.directoryInputs.each { dirInput ->
          dirInputs.add(dirInput)
      }
      input.jarInputs.each { jarInput ->
          jarInputs.add(jarInput)
      }
  }

  // transform classes and jars
  ...
}

Two kinds of files need to be transformed. One is the class bytecode compiled from the current project's Java sources, whose paths are found in the directoryInputs property; the other is the jar (aar) packages the project depends on, whose paths are found in the jarInputs property. We iterate over both and collect them into the sets we defined for later processing.
With the paths of the class files and jars in hand, we can modify the bytecode with the ASM library by calling the following two methods, respectively:

StringFogClassInjector.doFog2Class(fileInput, fileOutput, mKey)
StringFogClassInjector.doFog2Jar(jarInputFile, jarOutputFile, mKey)

mKey is the encryption key we specified.
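To give a rough idea of what processing a jar involves, here is a minimal sketch that copies an archive entry by entry and passes only the .class entries through a transformer. It is not the StringFog implementation: it works on in-memory bytes with java.util.zip, ignores manifests and signatures, and uses a placeholder transformer where the ASM rewriting would go. All names in it are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.function.UnaryOperator;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class JarRewriteSketch {

    // Copies an in-memory jar entry by entry. Only ".class" entries go through
    // classTransformer, which stands in for the ASM-based rewriting step.
    static byte[] rewriteJar(byte[] jarBytes, UnaryOperator<byte[]> classTransformer) {
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(jarBytes));
             ZipOutputStream out = new ZipOutputStream(result)) {
            ZipEntry entry;
            while ((entry = in.getNextEntry()) != null) {
                byte[] content = readFully(in); // reads the current entry only
                if (!entry.isDirectory() && entry.getName().endsWith(".class")) {
                    content = classTransformer.apply(content);
                }
                out.putNextEntry(new ZipEntry(entry.getName()));
                out.write(content);
                out.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result.toByteArray();
    }

    static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return buffer.toByteArray();
    }

    // Demo helper: builds a small archive from (name, text) pairs.
    static byte[] zipOf(String... namesAndTexts) {
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(result)) {
            for (int i = 0; i + 1 < namesAndTexts.length; i += 2) {
                out.putNextEntry(new ZipEntry(namesAndTexts[i]));
                out.write(namesAndTexts[i + 1].getBytes(StandardCharsets.UTF_8));
                out.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result.toByteArray();
    }

    // Demo helper: returns the text of one entry, or null if it is absent.
    static String entryText(byte[] jarBytes, String name) {
        try (ZipInputStream in = new ZipInputStream(new ByteArrayInputStream(jarBytes))) {
            ZipEntry entry;
            while ((entry = in.getNextEntry()) != null) {
                if (entry.getName().equals(name)) {
                    return new String(readFully(in), StandardCharsets.UTF_8);
                }
            }
            return null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Placeholder transformer: upper-cases the bytes so the change is visible.
    static byte[] shout(byte[] classBytes) {
        return new String(classBytes, StandardCharsets.UTF_8)
                .toUpperCase(Locale.ROOT).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] jar = zipOf("com/demo/A.class", "bytecode", "assets/readme.txt", "text");
        byte[] rewritten = rewriteJar(jar, JarRewriteSketch::shout);
        System.out.println(entryText(rewritten, "com/demo/A.class"));  // prints BYTECODE
        System.out.println(entryText(rewritten, "assets/readme.txt")); // prints text
    }
}
```

Non-class entries are copied through unchanged, which is why the text file in the demo keeps its original content.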

2. Bytecode Modification and Implantation

The two methods provided by StringFogClassInjector, doFog2Class and doFog2Jar, both end up calling the processClass method:

private static void processClass(InputStream classIn, OutputStream classOut, String key) throws IOException {
    ClassReader cr = new ClassReader(classIn);
    ClassWriter cw = new ClassWriter(0);
    ClassVisitor cv = ClassVisitorFactory.create(cr.getClassName(), key, cw);
    cr.accept(cv, 0);
    classOut.write(cw.toByteArray());
    classOut.flush();
}

This part is handled by the ASM library. We use a ClassVisitor to read the bytecode and rewrite it. Because different classes need different processing logic, the static factory ClassVisitorFactory creates the appropriate ClassVisitor.

public final class ClassVisitorFactory {
    public static ClassVisitor create(String className, String key, ClassWriter cw) {
        if (Base64Fog.class.getName().replace('.', '/').equals(className)) {
            return new Base64FogClassVisitor(key, cw);
        }
        if (WhiteLists.inWhiteList(className, WhiteLists.FLAG_PACKAGE) || WhiteLists.inWhiteList(className, WhiteLists.FLAG_CLASS)) {
            return createEmpty(cw);
        }
        return new StringFogClassVisitor(key, cw);
    }

    public static ClassVisitor createEmpty(ClassWriter cw) {
        return new ClassVisitor(Opcodes.ASM5, cw) {
        };
    }
}

The factory creates three kinds of ClassVisitor. The first is Base64FogClassVisitor, which modifies the bytecode of the Base64Fog class; its main purpose is to implant our custom encryption key. The second is an empty ClassVisitor for the whitelist mechanism: well-known public libraries such as android.support do not need string encryption, and neither does the BuildConfig class, so they are passed through unchanged here. The third handles the classes we actually want to modify and is implemented by StringFogClassVisitor.
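The whitelist check itself is not shown in this post. A minimal sketch of how such a check could work is given below, assuming class names arrive in ASM's internal form (with '/' separators) and that the whitelist holds package prefixes and simple class names. The entries and method names are illustrative and are not the contents of the real WhiteLists class.

```java
import java.util.Arrays;
import java.util.List;

final class WhiteListSketch {

    // Hypothetical whitelist entries, based on the two examples named in the text.
    static final List<String> PACKAGE_PREFIXES = Arrays.asList("android/support/");
    static final List<String> CLASS_SIMPLE_NAMES = Arrays.asList("BuildConfig");

    // True if the class lives in (or below) a whitelisted package.
    static boolean inPackageWhiteList(String internalName) {
        for (String prefix : PACKAGE_PREFIXES) {
            if (internalName.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    // True if the simple class name (the part after the last '/') is whitelisted.
    static boolean inClassWhiteList(String internalName) {
        String simpleName = internalName.substring(internalName.lastIndexOf('/') + 1);
        return CLASS_SIMPLE_NAMES.contains(simpleName);
    }

    // Classes for which the factory would return the empty, pass-through visitor.
    static boolean shouldSkip(String internalName) {
        return inPackageWhiteList(internalName) || inClassWhiteList(internalName);
    }

    public static void main(String[] args) {
        System.out.println(shouldSkip("android/support/v4/app/Fragment")); // prints true
        System.out.println(shouldSkip("com/demo/BuildConfig"));            // prints true
        System.out.println(shouldSkip("com/demo/MainActivity"));           // prints false
    }
}
```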

public class Base64FogClassVisitor extends ClassVisitor {
    private static final String CLASS_FIELD_KEY_NAME = "DEFAULT_KEY";
    private String mKey;

    public Base64FogClassVisitor(String key, ClassWriter cw) {
        super(Opcodes.ASM5, cw);
        this.mKey = key;
    }

    @Override
    public FieldVisitor visitField(int access, String name, String desc, String signature, Object value) {
        if (CLASS_FIELD_KEY_NAME.equals(name)) {
            value = mKey;
        }
        return super.visitField(access, name, desc, signature, value);
    }
}

In the Base64Fog class, the key is defined in a static constant named DEFAULT_KEY. We change it by overriding visitField and replacing the assigned value. This step is very simple.
Now let's look at the more complex StringFogClassVisitor. Before discussing it, let's first analyze in what forms strings appear in a Java class.
- A. Static member variables
- B. Ordinary (instance) member variables
- C. Local variables
Broadly speaking, there are these three categories. Form A is initialized in the <clinit> method, form B in the <init> method, and form C in ordinary methods. Accordingly, we can reach all of them by overriding visitMethod.

public MethodVisitor visitMethod(int access, String name, String desc, String signature, String[] exceptions) {
    if ("<clinit>".equals(name)) {
        ... // Handling static member variables
    } else if ("<init>".equals(name)) {
        ... // Processing member variables
    } else {
        ... // Handling local variables
    }
}
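To make the three forms concrete, here is a small illustrative class (the names are made up). The comments describe where javac places each string. One detail worth knowing is that a static final field initialized with a string literal is a compile-time constant: its value is stored directly on the field as a ConstantValue attribute, so a class containing only such fields has no <clinit> method at all.

```java
final class StringFormsSketch {

    // Form A, compile-time constant: the value sits on the field itself
    // (ConstantValue attribute), so no <clinit> code is generated for it.
    static final String STATIC_FINAL = "static-final";

    // Form A, plain static field: assigned inside <clinit>.
    static String staticField = "static";

    // Form B, instance fields: assigned inside every constructor (<init>).
    final String finalField = "final";
    String normalField = "normal";

    // Form C, local variable: the literal is loaded inside the method body.
    String local() {
        String s = "local";
        return s;
    }

    public static void main(String[] args) {
        StringFormsSketch forms = new StringFormsSketch();
        System.out.println(STATIC_FINAL);      // prints static-final
        System.out.println(staticField);       // prints static
        System.out.println(forms.finalField);  // prints final
        System.out.println(forms.normalField); // prints normal
        System.out.println(forms.local());     // prints local
    }
}
```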

For the member variables of forms A and B, we first collect them in the visitField method.

@Override
public FieldVisitor visitField(int access, String name, String desc, String signature, Object value) {
    if (ClassStringField.STRING_DESC.equals(desc) && name != null && !mIgnoreClass) {
        // static final, in this condition, the value is null or not null.
        if ((access & Opcodes.ACC_STATIC) != 0 && (access & Opcodes.ACC_FINAL) != 0) {
            mStaticFinalFields.add(new ClassStringField(name, (String) value));
            value = null;
        }
        // static, in this condition, the value is null.
        if ((access & Opcodes.ACC_STATIC) != 0 && (access & Opcodes.ACC_FINAL) == 0) {
            mStaticFields.add(new ClassStringField(name, (String) value));
            value = null;
        }

        // final, in this condition, the value is null or not null.
        if ((access & Opcodes.ACC_STATIC) == 0 && (access & Opcodes.ACC_FINAL) != 0) {
            mFinalFields.add(new ClassStringField(name, (String) value));
            value = null;
        }

        // normal (neither static nor final), in this condition, the value is null.
        if ((access & Opcodes.ACC_STATIC) == 0 && (access & Opcodes.ACC_FINAL) == 0) {
            mFields.add(new ClassStringField(name, (String) value));
            value = null;
        }
    }
    // Pass the field on with its constant value cleared.
    return super.visitField(access, name, desc, signature, value);
}

Since every string member variable is eventually rewritten into a static decryption call of the form StringFog.decode("xxxx"), the field's value must be set to null, and the string is then rewritten in the visitLdcInsn method of the <clinit> and <init> method visitors:

@Override
public void visitLdcInsn(Object cst) {
    if (cst instanceof String && !TextUtils.isEmptyAfterTrim((String) cst)) {
        // Load the encrypted string, then call decode to restore the plaintext at runtime.
        super.visitLdcInsn(Base64Fog.encode((String) cst, mKey));
        super.visitMethodInsn(Opcodes.INVOKESTATIC, BASE64_FOG_CLASS_NAME, "decode", "(Ljava/lang/String;)Ljava/lang/String;", false);
        return;
    }
    // Non-string or blank constants are emitted unchanged.
    super.visitLdcInsn(cst);
}

One thing to note: if the class has no <clinit> method, we need to create one ourselves in visitEnd and initialize the string constants there:

@Override
public void visitEnd() {
    if (!mIgnoreClass && !isClInitExists && !mStaticFinalFields.isEmpty()) {
        MethodVisitor mv = super.visitMethod(Opcodes.ACC_STATIC, "<clinit>", "()V", null, null);
        mv.visitCode();
        // Here init static final fields.
        for (ClassStringField field : mStaticFinalFields) {
            if (field.value == null) {
               continue; // Should never happen
            }
            mv.visitLdcInsn(Base64Fog.encode(field.value, mKey));
            mv.visitMethodInsn(Opcodes.INVOKESTATIC, BASE64_FOG_CLASS_NAME, "decode", "(Ljava/lang/String;)Ljava/lang/String;", false);
            mv.visitFieldInsn(Opcodes.PUTSTATIC, mClassName, field.name, ClassStringField.STRING_DESC);
        }
        mv.visitInsn(Opcodes.RETURN);
        mv.visitMaxs(1, 0);
        mv.visitEnd();
    }
    super.visitEnd();
}
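In source-level terms, the effect of the rewrite on a constant can be pictured roughly as follows. This is only a sketch of the result, under the assumption of an XOR + Base64 decoder like the one shown earlier in the post; the class names, the nested Fog decoder, the demo key "key" and the constant GREETING are all hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class FoggedConstantSketch {

    // Before the rewrite, the source would have read:
    //     static final String GREETING = "abc";

    // After the rewrite, the class behaves roughly as if it had been written
    // like this: the constant value is gone and <clinit> decodes it instead.
    static final String GREETING;

    static {
        GREETING = Fog.decode("Cgca"); // "abc" XOR-ed with "key", then Base64-encoded
    }

    // Stand-in for the runtime decoder class; the key is a demo value.
    static final class Fog {
        static final String DEFAULT_KEY = "key";

        static String decode(String data) {
            byte[] bytes = Base64.getDecoder().decode(data);
            for (int i = 0; i < bytes.length; i++) {
                bytes[i] = (byte) (bytes[i] ^ DEFAULT_KEY.charAt(i % DEFAULT_KEY.length()));
            }
            return new String(bytes, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) {
        System.out.println(GREETING); // prints abc
    }
}
```

A decompiler then shows only the encoded literal and the decode call, never the original string.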

At this point the bytecode modification is essentially complete, although there are still some details to deal with.

This blog is updated from time to time. You are welcome to follow it and share your thoughts:

http://blog.csdn.net/megatronkings
