Use the xxd command to save 0.5 hours

Posted by magi on Sat, 25 Dec 2021 13:56:15 +0100

I How it started

Recently, some colleagues ran into a very strange problem. They spent 0.5 hours on it without finding a clue, and the more they thought about it, the stranger it seemed.

I happened to pass by and asked about it out of curiosity. A single xxd command solved the problem, and everyone was happy.

II A strange problem

The original problem is relatively complex. To keep the story short, let me reduce it to a minimal version.

Opened in an editor, a.txt and b.txt each show the same single line, abc.

The two files look as alike as two peas in a pod. The problem is that the same program gets different results when reading them.

Reading the first line of a.txt gives a length of 3:

#include <fstream>
#include <string>
#include <iostream>
using namespace std;

int main()
{
  ifstream in("a.txt");
  string line;

  if (in) // the file exists
  {
    while (getline(in, line)) // line does not include the newline character
    {
      cout << line.size() << endl; // the result is 3
    }
  }
  else // the file does not exist
  {
    cout << "no such file" << endl;
  }

  return 0;
}

Reading the first line of b.txt gives a length of 6:

#include <fstream>
#include <string>
#include <iostream>
using namespace std;

int main()
{
  ifstream in("b.txt");
  string line;

  if (in) // the file exists
  {
    while (getline(in, line)) // line does not include the newline character
    {
      cout << line.size() << endl; // the result is 6
    }
  }
  else // the file does not exist
  {
    cout << "no such file" << endl;
  }

  return 0;
}

Strange indeed! The files look the same and the reading program is the same, yet the results differ. What is the reason?

During development we often run into problems like this, where the observations seem to contradict each other. When that happens, we have to work out which observation is misleading.

III Bold assumptions

From experience, my guess was that a.txt and b.txt are not really the same.

The apparent sameness seen above is only an illusion; seeing is not always believing. Verification confirmed the guess.

IV Careful verification

The direct way is to look at the raw bytes of the two files. There are many ways to do this; this article introduces a practical Linux command, xxd. Let's run man xxd:

ubuntu@VM-0-15-ubuntu:~$ man xxd
XXD(1)                                                                                                  XXD(1)

NAME
       xxd - make a hexdump or do the reverse.

SYNOPSIS
       xxd -h[elp]
       xxd [options] [infile [outfile]]
       xxd -r[evert] [options] [infile [outfile]]

DESCRIPTION
       xxd  creates  a hex dump of a given file or standard input.  It can also convert a hex dump back to its
       original binary form.  Like uuencode(1) and uudecode(1) it allows the transmission of binary data in  a
       `mail-safe'  ASCII  representation, but has the advantage of decoding to standard output.  Moreover, it
       can be used to perform binary file patching.

As you can see, xxd produces a hex dump of a file, or converts a hex dump back into binary.

Let's look at what xxd prints for the two files:

ubuntu@VM-0-15-ubuntu:~$ xxd a.txt
00000000: 6162 63                                  abc
ubuntu@VM-0-15-ubuntu:~$ xxd b.txt
00000000: efbb bf61 6263                           ...abc
ubuntu@VM-0-15-ubuntu:~$

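Sure enough, the two files differ at the byte level: a.txt holds 3 bytes, while b.txt holds 6, with three extra bytes in front of abc. That accounts for the lengths 3 and 6 reported by the program. The files only look the same, like twins, but their content is different.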
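The same check can also be made from inside a C++ program when xxd is not at hand. The following is only a minimal sketch: hex_bytes is a hypothetical helper name, not part of the original program, and it simply returns every byte of a file as two hex digits.

```cpp
#include <fstream>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper (not from the original program): return the bytes of
// a file as space-separated two-digit hex values, for example "61 62 63".
std::string hex_bytes(const std::string& path)
{
    std::ifstream in(path, std::ios::binary); // binary mode: read bytes as-is
    std::ostringstream out;
    out << std::hex << std::setfill('0');

    bool first = true;
    char c;
    while (in.get(c))
    {
        if (!first)
        {
            out << ' ';
        }
        first = false;
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        out << std::setw(2) << static_cast<int>(static_cast<unsigned char>(c));
    }
    return out.str(); // empty if the file is missing or empty
}
```

For the two files in this article, hex_bytes("a.txt") would return "61 62 63" and hex_bytes("b.txt") would return "ef bb bf 61 62 63", matching the xxd output above.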

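Why? The two files are encoded differently: b.txt is UTF-8 with a BOM (byte order mark), the three bytes ef bb bf at the start of the file, while a.txt has no BOM. Most editors do not display the BOM, which is why the files look the same. Interested readers can read up on the BOM. So, how do you create a file with a BOM?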
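A program can test for the BOM directly by reading the first three bytes of the file. This is a minimal sketch under that assumption; has_utf8_bom is a hypothetical helper name introduced here for illustration.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: true if the file starts with the UTF-8 BOM (EF BB BF).
bool has_utf8_bom(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    char head[3] = {0, 0, 0};
    in.read(head, 3);
    if (in.gcount() != 3)
    {
        return false; // file is missing or shorter than a BOM
    }
    // Compare as unsigned char, since plain char may be signed.
    return static_cast<unsigned char>(head[0]) == 0xEF &&
           static_cast<unsigned char>(head[1]) == 0xBB &&
           static_cast<unsigned char>(head[2]) == 0xBF;
}
```

With the files from this article, has_utf8_bom("a.txt") would be false and has_utf8_bom("b.txt") would be true.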

Many editors have a "Save As" function that can save the file as UTF-8 with BOM, after which the file starts with ef bb bf. A web search turns up more details.
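For reproducing the problem without an editor, a BOM file can also be produced by a few lines of C++. This is only a sketch; write_with_bom is a hypothetical helper name, and it writes the three BOM bytes followed by the given text.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: write `text` to `path`, preceded by the UTF-8 BOM.
// Returns false if the file could not be opened or written.
bool write_with_bom(const std::string& path, const std::string& text)
{
    std::ofstream out(path, std::ios::binary);
    if (!out)
    {
        return false;
    }
    out << "\xEF\xBB\xBF" << text; // BOM bytes first, then the content
    return static_cast<bool>(out);
}
```

Calling write_with_bom("b.txt", "abc") would create a 6-byte file with the same bytes as the b.txt shown by xxd.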

Once the cause is found, the problem is much easier to solve: convert all the files to one format, without the BOM.
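An alternative to converting the files is to make the reading program tolerate the BOM. The sketch below is not the fix used in the original case; strip_utf8_bom is a hypothetical helper that removes a leading BOM from a line that has already been read.

```cpp
#include <string>

// Hypothetical helper: drop a leading UTF-8 BOM (EF BB BF) from `line`,
// if present. Lines without a BOM are left unchanged.
void strip_utf8_bom(std::string& line)
{
    static const std::string bom = "\xEF\xBB\xBF";
    if (line.compare(0, bom.size(), bom) == 0)
    {
        line.erase(0, bom.size());
    }
}
```

Calling strip_utf8_bom(line) on the first line returned by getline would make line.size() report 3 for both a.txt and b.txt. Only the first line of a file needs this treatment, since the BOM appears only at the very start.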

Incidentally, another Linux command, hexdump, gives a similar view:

ubuntu@VM-0-15-ubuntu:~$ hexdump a.txt
0000000 61 62 63
0000003
ubuntu@VM-0-15-ubuntu:~$ hexdump b.txt
0000000 ef bb bf 61 62 63
0000006
ubuntu@VM-0-15-ubuntu:~$

V Last words

In practical work, it is particularly important to master a range of debugging tools: make bold assumptions, then verify them carefully. The Linux xxd command is the first one introduced here.

In follow-up posts, I will share more practical debugging experience for killing bugs. I hope we can all make progress together. One command saves 0.5 hours and cuts down on overtime.

This article is very simple. What matters is not the conclusion itself, but the process and method of solving the problem.