How to convert a UTF-8 byte[] to a string?

Posted by elraj on Mon, 16 Dec 2019 05:52:31 +0100

I have a byte[] array that contains UTF-8 encoded text. In some debugging code, I need to convert it to a string. Is there a one-liner that can do this?

Behind the scenes, it should be just an allocation and a memory copy, so it should be possible even if it is not implemented.

Answer #1

string result = System.Text.Encoding.UTF8.GetString(byteArray);
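As a quick check of the one-liner above, here is a minimal sketch that wraps it in two helpers. The names Utf8Demo and Decode are made up for illustration. The second overload uses the GetString(bytes, index, count) form, which helps when only part of a buffer holds text.

```csharp
using System;
using System.Text;

public static class Utf8Demo
{
    // Decodes the whole array as UTF-8.
    public static string Decode(byte[] bytes)
    {
        return Encoding.UTF8.GetString(bytes);
    }

    // Decodes only `count` bytes, starting at `offset`.
    public static string Decode(byte[] bytes, int offset, int count)
    {
        return Encoding.UTF8.GetString(bytes, offset, count);
    }
}
```

For example, Encoding.UTF8.GetBytes("héllo") yields 6 bytes because "é" takes two bytes in UTF-8, and Utf8Demo.Decode on that array returns "héllo" again. Valid UTF-8 input round-trips even when it contains non-ASCII characters.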

Answer #2

There are at least four different ways to accomplish this transformation.

  1. Encoding.GetString
    Simple, but if the bytes are not valid text in the chosen encoding (for example, arbitrary binary data decoded as UTF-8), the original bytes cannot be recovered from the string.

  2. BitConverter.ToString
    The output is a hex string with the bytes separated by "-", but there is no built-in .NET method that converts this string back to a byte array.

  3. Convert.ToBase64String
    You can easily convert the output string back to a byte array using Convert.FromBase64String.
    Note: the output string can contain '+', '/' and '='. If you want to use the string in a URL, you need to encode it explicitly.

  4. HttpServerUtility.UrlTokenEncode
    You can easily convert the output string back to a byte array using HttpServerUtility.UrlTokenDecode. The output string is already URL friendly! The downside is that if your project is not a web project, you need to reference the System.Web assembly.

A complete example:

using System;
using System.Text;
using System.Web; // HttpServerUtility; requires a reference to the System.Web assembly

byte[] bytes = { 130, 200, 234, 23 }; // A byte array that is not valid UTF-8 (non-ASCII, non-readable bytes)

// 1. Encoding.GetString: each invalid sequence becomes U+FFFD, so the round trip is lossy
string s1 = Encoding.UTF8.GetString(bytes); // ���
byte[] decBytes1 = Encoding.UTF8.GetBytes(s1);  // decBytes1.Length == 10 !!
// decBytes1 is not the same as bytes
// Using any other Encoding object gives similar results

// 2. BitConverter.ToString: the hex string has to be parsed back by hand
string s2 = BitConverter.ToString(bytes);   // 82-C8-EA-17
string[] tempAry = s2.Split('-');
byte[] decBytes2 = new byte[tempAry.Length];
for (int i = 0; i < tempAry.Length; i++)
    decBytes2[i] = Convert.ToByte(tempAry[i], 16);
// decBytes2 is the same as bytes

// 3. Base64: lossless round trip
string s3 = Convert.ToBase64String(bytes);  // gsjqFw==
byte[] decBytes3 = Convert.FromBase64String(s3);
// decBytes3 is the same as bytes

// 4. URL token: lossless and URL friendly
string s4 = HttpServerUtility.UrlTokenEncode(bytes);    // gsjqFw2
byte[] decBytes4 = HttpServerUtility.UrlTokenDecode(s4);
// decBytes4 is the same as bytes
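If referencing System.Web is not an option, a URL-friendly variant of Base64 can be built on Convert.ToBase64String alone. The following is only a sketch: the class UrlSafeBase64 and its methods are hypothetical helpers, and the output format (padding stripped, '+' and '/' replaced by '-' and '_') is not identical to what UrlTokenEncode produces, so the two should not be mixed.

```csharp
using System;

public static class UrlSafeBase64
{
    // Base64 with '+' -> '-', '/' -> '_' and the '=' padding removed.
    public static string Encode(byte[] bytes)
    {
        return Convert.ToBase64String(bytes)
            .TrimEnd('=')
            .Replace('+', '-')
            .Replace('/', '_');
    }

    // Reverses Encode: restores the standard alphabet and the padding.
    public static byte[] Decode(string text)
    {
        string s = text.Replace('-', '+').Replace('_', '/');
        switch (s.Length % 4)
        {
            case 2: s += "=="; break;
            case 3: s += "="; break;
        }
        // Malformed input still fails here with a FormatException.
        return Convert.FromBase64String(s);
    }
}
```

With the byte array from the example above, Encode returns gsjqFw and Decode turns that back into the original four bytes.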

Answer #3

Definition:

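// Extension methods must be declared in a static class; the class name here is arbitrary.
public static class ByteArrayExtensions
{
    // Returns null for a null array instead of throwing.
    public static string ConvertByteToString(this byte[] source)
    {
        return source != null ? System.Text.Encoding.UTF8.GetString(source) : null;
    }
}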

Usage:

string result = input.ConvertByteToString();

Answer #4

Using b.ToString("x2") on each byte produces lowercase hex output such as b4b5dfe475e58b67:

using System;
using System.Text;

public static class Ext {

    public static string ToHexString(this byte[] hex)
    {
        if (hex == null) return null;
        if (hex.Length == 0) return string.Empty;

        var s = new StringBuilder();
        foreach (byte b in hex) {
            s.Append(b.ToString("x2"));
        }
        return s.ToString();
    }

    public static byte[] ToHexBytes(this string hex)
    {
        if (hex == null) return null;
        if (hex.Length == 0) return new byte[0];

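        // An odd number of hex digits would otherwise silently drop the last character
        if (hex.Length % 2 != 0)
            throw new ArgumentException("Hex string must have an even number of characters.", nameof(hex));

        int l = hex.Length / 2;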
        var b = new byte[l];
        for (int i = 0; i < l; ++i) {
            b[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
        }
        return b;
    }

    public static bool EqualsTo(this byte[] bytes, byte[] bytesToCompare)
    {
        if (bytes == null && bytesToCompare == null) return true; // two null arrays are treated as equal
        if (bytes == null || bytesToCompare == null) return false;
        if (object.ReferenceEquals(bytes, bytesToCompare)) return true;

        if (bytes.Length != bytesToCompare.Length) return false;

        for (int i = 0; i < bytes.Length; ++i) {
            if (bytes[i] != bytesToCompare[i]) return false;
        }
        return true;
    }

}
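For the encoding direction only, the same lowercase, separator-free output can also be obtained from BitConverter.ToString in one expression. This is a sketch, and HexOneLiner is a made-up name. Unlike ToHexString above, it throws on a null array instead of returning null.

```csharp
using System;

public static class HexOneLiner
{
    // BitConverter.ToString yields "B4-B5-..."; strip the dashes and lowercase it.
    public static string ToHex(byte[] bytes)
    {
        return BitConverter.ToString(bytes).Replace("-", "").ToLowerInvariant();
    }
}
```

Calling HexOneLiner.ToHex on the bytes B4 B5 DF E4 75 E5 8B 67 returns b4b5dfe475e58b67, matching the output quoted above.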

Answer #5

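Converting byte[] to string seems simple, but any kind of encoding can mangle the output string. This small function never fails, because it maps each byte directly to the char with the same value. Note that this is equivalent to Latin-1 (ISO-8859-1) decoding, so multi-byte UTF-8 sequences are not decoded into single characters: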

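private string ToString(byte[] bytes)
{
    // StringBuilder avoids allocating a new string for every appended character
    var response = new System.Text.StringBuilder(bytes.Length);

    foreach (byte b in bytes)
        response.Append((char)b);

    return response.ToString();
}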
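To see how the per-byte cast differs from real UTF-8 decoding, the sketch below puts both side by side. The class and method names are illustrative only.

```csharp
using System;
using System.Text;

public static class CastVersusDecode
{
    // Maps each byte to the char with the same numeric value (no decoding).
    public static string CastEachByte(byte[] bytes)
    {
        var sb = new StringBuilder(bytes.Length);
        foreach (byte b in bytes)
            sb.Append((char)b);
        return sb.ToString();
    }

    // Interprets the bytes as UTF-8 text.
    public static string DecodeUtf8(byte[] bytes)
    {
        return Encoding.UTF8.GetString(bytes);
    }
}
```

The bytes 0xC3 0xA9 are the UTF-8 encoding of "é". DecodeUtf8 returns the single character "é", while CastEachByte returns the two characters "Ã©". For pure ASCII input both methods give the same result.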

Topics: encoding ascii