HDFS
1. Shell operation
Upload
- -moveFromLocal: cut and paste (move) from the local file system to HDFS
  - hadoop fs -moveFromLocal <local file> <HDFS directory>
- -copyFromLocal: copy files from the local file system to the HDFS path
  - hadoop fs -copyFromLocal <local file> <HDFS directory>
- -put: equivalent to -copyFromLocal; -put is the more common choice in production environments
  - hadoop fs -put <local file> <HDFS directory>
- -appendToFile: appends a local file to the end of an existing HDFS file
  - hadoop fs -appendToFile <local file> <HDFS file>
Download
- -copyToLocal: copy from HDFS to the local file system
  - hadoop fs -copyToLocal <HDFS file> <local directory>
- -get: equivalent to -copyToLocal; -get is the more common choice in production environments
  - hadoop fs -get <HDFS file> <local directory>
Direct operation (same function as the corresponding Linux commands)
- -ls: display directory information
- -cat: display file contents
- -chmod, -chown: same as in the Linux file system; -chmod changes a file's permissions and -chown changes its owner
- -mkdir: create a path
- -cp: copy from one HDFS path to another HDFS path
- -mv: move files within HDFS
- -tail: display the last 1 KB of a file
- -rm: delete a file
- -rm -r: recursively delete a directory and its contents
- -du: show size statistics for a folder
  - hadoop fs -du -s -h <HDFS directory> (show the total size of the directory)
  - hadoop fs -du -h <HDFS directory> (list the size of each file in the directory)
- -setrep: set the replication factor (number of copies) of a file in HDFS; the example below sets it to 10
  - hadoop fs -setrep 10 <HDFS file>
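Since -chmod accepts octal modes as in Linux, it can help to see how an octal mode maps to the symbolic form that listings display (for example rw-r--r--). The following is a minimal sketch in plain Java; the class name PermissionString and its method are hypothetical helpers for illustration, not part of Hadoop.

```java
class PermissionString {
    /** Converts a three-digit octal mode such as "644" to its symbolic form. */
    static String toSymbolic(String octalMode) {
        if (!octalMode.matches("[0-7]{3}")) {
            throw new IllegalArgumentException("expected three octal digits: " + octalMode);
        }
        StringBuilder sb = new StringBuilder();
        for (char c : octalMode.toCharArray()) {
            int bits = c - '0';                       // one digit each for owner, group, others
            sb.append((bits & 4) != 0 ? 'r' : '-');   // read bit
            sb.append((bits & 2) != 0 ? 'w' : '-');   // write bit
            sb.append((bits & 1) != 0 ? 'x' : '-');   // execute bit
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toSymbolic("644")); // rw-r--r--
        System.out.println(toSymbolic("755")); // rwxr-xr-x
    }
}
```

So a command such as hadoop fs -chmod 644 <HDFS file> leaves the file readable by everyone and writable only by its owner.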
2. API operation
Preparation
- Add dependencies: create a Maven project and add the following dependencies (pom.xml)

    <dependency>
        <!-- The version number should match your own Hadoop version -->
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>3.1.4</version>
    </dependency>
    <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>4.12</version>
    </dependency>
    <dependency>
        <groupId>org.slf4j</groupId>
        <artifactId>slf4j-log4j12</artifactId>
        <version>1.7.30</version>
    </dependency>
- Add logging: log4j.properties

    log4j.rootLogger=INFO, stdout
    log4j.appender.stdout=org.apache.log4j.ConsoleAppender
    log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
    log4j.appender.stdout.layout.ConversionPattern=%d %p [%c] - %m%n
    log4j.appender.logfile=org.apache.log4j.FileAppender
    log4j.appender.logfile.File=target/spring.log
    log4j.appender.logfile.layout=org.apache.log4j.PatternLayout
    log4j.appender.logfile.layout.ConversionPattern=%d %p [%c] - %m%n
File upload
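- FileSystem.copyFromLocalFile(...)

    // Imports used by this test method; the other test methods in this section need them too
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.junit.Test;

    import java.io.IOException;
    import java.net.URI;
    import java.net.URISyntaxException;

    @Test
    public void testCopyFromLocalFile() throws IOException, InterruptedException, URISyntaxException {
        // 1. Get the file system
        Configuration configuration = new Configuration();
        // Adjust the URI and the user name to match your own cluster
        FileSystem fs = FileSystem.get(new URI("hdfs://192.168.37.151:8020"), configuration, "root");

        // 2. Upload the file (local source path, HDFS destination path)
        fs.copyFromLocalFile(new Path("D:\\fzk.txt"), new Path("/fzk"));

        // 3. Close the resource
        fs.close();
    }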
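The URI passed to FileSystem.get(...) is what ties the client to a particular cluster, so it is worth checking how it breaks down. Below is a minimal sketch using only java.net.URI from the JDK; the class name NameNodeUri and the describe method are hypothetical, and the address is simply the one used in these notes.

```java
import java.net.URI;

class NameNodeUri {
    /** Splits a file system address into its scheme, host and port. */
    static String describe(String address) {
        // URI.create is the unchecked counterpart of new URI(String)
        URI uri = URI.create(address);
        return "scheme=" + uri.getScheme()
                + " host=" + uri.getHost()
                + " port=" + uri.getPort();
    }

    public static void main(String[] args) {
        // prints: scheme=hdfs host=192.168.37.151 port=8020
        System.out.println(describe("hdfs://192.168.37.151:8020"));
    }
}
```

When adapting the examples, the host and port are the parts to replace with the address of your own NameNode; the hdfs scheme stays the same.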
File download
- FileSystem.copyToLocalFile(...)

    @Test
    public void testCopyToLocalFile() throws IOException, InterruptedException, URISyntaxException {
        // 1. Get the file system
        Configuration configuration = new Configuration();
        // Adjust the URI and the user name to match your own cluster
        FileSystem fs = FileSystem.get(new URI("hdfs://192.168.37.151:8020"), configuration, "root");

        // 2. Perform the download
        // boolean delSrc: whether to delete the source file
        // Path src: the HDFS path of the file to download
        // Path dst: the local path to download the file to
        // boolean useRawLocalFileSystem: whether to use RawLocalFileSystem as the local
        //   file system; when true, no local .crc checksum file is written
        fs.copyToLocalFile(false, new Path("/xiyou/huaguoshan/sunwukong.txt"), new Path("d:/sunwukong2.txt"), true);

        // 3. Close the resource
        fs.close();
    }
Modify file name
- FileSystem.rename(...)

    @Test
    public void testRename() throws IOException, InterruptedException, URISyntaxException {
        // 1. Get the file system
        Configuration configuration = new Configuration();
        // Adjust the URI and the user name to match your own cluster
        FileSystem fs = FileSystem.get(new URI("hdfs://192.168.37.151:8020"), configuration, "root");

        // 2. Rename the file
        fs.rename(new Path("/xiyou/sunwukong.txt"), new Path("/xiyou/meihouwang.txt"));

        // 3. Close the resource
        fs.close();
    }
Delete files and directories
- FileSystem.delete(...)

    @Test
    public void testDelete() throws IOException, InterruptedException, URISyntaxException {
        // 1. Get the file system
        Configuration configuration = new Configuration();
        // Adjust the URI and the user name to match your own cluster
        FileSystem fs = FileSystem.get(new URI("hdfs://192.168.37.151:8020"), configuration, "root");

        // 2. Delete the path; the second argument (true) deletes directories recursively
        fs.delete(new Path("/xiyou"), true);

        // 3. Close the resource
        fs.close();
    }
View file details
- View file name, permissions, length and block information

    // Additional imports needed by this test
    import org.apache.hadoop.fs.BlockLocation;
    import org.apache.hadoop.fs.LocatedFileStatus;
    import org.apache.hadoop.fs.RemoteIterator;
    import java.util.Arrays;

    @Test
    public void testListFiles() throws IOException, InterruptedException, URISyntaxException {
        // 1. Get the file system
        Configuration configuration = new Configuration();
        // Adjust the URI and the user name to match your own cluster
        FileSystem fs = FileSystem.get(new URI("hdfs://192.168.37.151:8020"), configuration, "root");

        // 2. Get file details; the second argument (true) lists files recursively
        RemoteIterator<LocatedFileStatus> listFiles = fs.listFiles(new Path("/"), true);
        while (listFiles.hasNext()) {
            LocatedFileStatus fileStatus = listFiles.next();
            System.out.println("========" + fileStatus.getPath() + "=========");
            System.out.println(fileStatus.getPermission());        // permissions
            System.out.println(fileStatus.getOwner());             // owner
            System.out.println(fileStatus.getGroup());             // group
            System.out.println(fileStatus.getLen());               // length in bytes
            System.out.println(fileStatus.getModificationTime());  // modification time
            System.out.println(fileStatus.getReplication());       // replication factor
            System.out.println(fileStatus.getBlockSize());         // block size
            System.out.println(fileStatus.getPath().getName());    // file name

            // Get block information: the blocks that make up the file and where they are stored
            BlockLocation[] blockLocations = fileStatus.getBlockLocations();
            System.out.println(Arrays.toString(blockLocations));
        }

        // 3. Close the resource
        fs.close();
    }
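Two of the printed values are easier to read with a little arithmetic: getModificationTime() is in milliseconds since the epoch, and the length together with the block size tells you how many blocks a file occupies. The sketch below uses only the JDK; the class FileStatusMath and its methods are hypothetical helpers, and the 128 MB block size is an assumed value for illustration (read the real one from getBlockSize()).

```java
import java.time.Instant;

class FileStatusMath {
    /** Number of blocks for a file: length divided by block size, rounded up. */
    static long blockCount(long length, long blockSize) {
        if (length < 0 || blockSize <= 0) {
            throw new IllegalArgumentException("length must be >= 0 and blockSize > 0");
        }
        return (length + blockSize - 1) / blockSize;
    }

    /** Renders an epoch-millisecond timestamp as an ISO-8601 instant in UTC. */
    static String formatMillis(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis).toString();
    }

    public static void main(String[] args) {
        long blockSize = 128L * 1024 * 1024;   // assumed block size: 128 MB
        long length = 300L * 1024 * 1024;      // hypothetical 300 MB file
        System.out.println(blockCount(length, blockSize)); // 3
        System.out.println(formatMillis(0L));              // 1970-01-01T00:00:00Z
    }
}
```

The length of the BlockLocation[] array returned by getBlockLocations() should agree with this block count for a given file.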
Distinguish files from folders
- FileStatus.isFile()

    // Additional import needed by this test
    import org.apache.hadoop.fs.FileStatus;

    @Test
    public void testListStatus() throws IOException, InterruptedException, URISyntaxException {
        // 1. Get the file system
        Configuration configuration = new Configuration();
        // Adjust the URI and the user name to match your own cluster
        FileSystem fs = FileSystem.get(new URI("hdfs://192.168.37.151:8020"), configuration, "root");

        // 2. Check whether each entry under the root is a file or a folder
        FileStatus[] listStatus = fs.listStatus(new Path("/"));
        for (FileStatus fileStatus : listStatus) {
            if (fileStatus.isFile()) {
                System.out.println("file:" + fileStatus.getPath().getName());
            } else {
                System.out.println("directory:" + fileStatus.getPath().getName());
            }
        }

        // 3. Close the resource
        fs.close();
    }
Ways to set parameters
- 1. Value set in client code: Configuration.set(key, value)

    // 1. Get the file system
    Configuration configuration = new Configuration();
    // Set dfs.replication to 2
    configuration.set("dfs.replication", "2");
    FileSystem fs = FileSystem.get(new URI("hdfs://192.168.37.151:8020"), configuration, "root");
- 2. User-defined configuration file on the classpath
  - Copy hdfs-site.xml to the resources directory of the project

    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <configuration>
        <!-- Set dfs.replication to 1 -->
        <property>
            <name>dfs.replication</name>
            <value>1</value>
        </property>
    </configuration>
- 3. Custom configuration on the server (xxx-site.xml)
- 4. Default configuration on the server (xxx-default.xml)
Parameter priority
- Priority from high to low:
  - 1. The value set in the client code
  - 2. The user-defined configuration file on the classpath
  - 3. The custom configuration on the server (xxx-site.xml)
  - 4. The default configuration on the server (xxx-default.xml)
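The priority order can be pictured as a lookup through layers, where the first layer that defines a key wins. The following is a minimal plain-Java model of that idea, not Hadoop's actual Configuration implementation; the class LayeredConfig and the sample values are assumptions for illustration (2 and 1 mirror the values set earlier in these notes, and 3 stands in for the default replication factor).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

class LayeredConfig {
    /** Returns the value from the first layer that defines the key, or null if none does. */
    static String resolve(String key, List<Map<String, String>> layersHighToLow) {
        for (Map<String, String> layer : layersHighToLow) {
            String value = layer.get(key);
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, String> clientCode = Collections.singletonMap("dfs.replication", "2");
        Map<String, String> classpathFile = Collections.singletonMap("dfs.replication", "1");
        Map<String, String> serverSite = Collections.emptyMap();   // nothing customized here
        Map<String, String> serverDefault = Collections.singletonMap("dfs.replication", "3");

        // The value set in client code wins: prints 2
        System.out.println(resolve("dfs.replication",
                Arrays.asList(clientCode, classpathFile, serverSite, serverDefault)));

        // Without a value in client code, the classpath file wins: prints 1
        System.out.println(resolve("dfs.replication",
                Arrays.asList(classpathFile, serverSite, serverDefault)));
    }
}
```

A quick way to check the order on a real cluster is to upload the same file with and without the Configuration.set call and compare the replication factor reported for it.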