Recursively searching directories in Java is straightforward with the Files.walkFileTree method, introduced in Java 7 (1.7).
To run arbitrary processing on each file and directory that is visited, pass in a class that implements the FileVisitor interface.
walkFileTree
Pass the path where the search starts as the first argument (start) and a FileVisitor instance as the second argument (visitor).
walkFileTree
public static Path walkFileTree(Path start, FileVisitor<? super Path> visitor) throws IOException {
// ...
}
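As a minimal sketch of the call described above, the following walks a throwaway temporary directory and collects the names of the files it finds. The class name WalkFileTreeSketch and the helper listFileNames are hypothetical names chosen for this illustration; only Files.walkFileTree, SimpleFileVisitor and FileVisitResult come from the JDK.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

public class WalkFileTreeSketch {

    // Collects the file name of every file found under start (hypothetical helper).
    static List<String> listFileNames(Path start) throws IOException {
        final List<String> names = new ArrayList<>();
        Files.walkFileTree(start, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                names.add(file.getFileName().toString());
                return FileVisitResult.CONTINUE; // keep walking
            }
        });
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Build a small temporary tree so the example does not depend on existing files.
        Path root = Files.createTempDirectory("walk-sketch");
        Files.createFile(root.resolve("a.txt"));
        Path sub = Files.createDirectory(root.resolve("sub"));
        Files.createFile(sub.resolve("b.txt"));

        System.out.println(listFileNames(root));
    }
}
```

The order of the printed names depends on how the file system lists directory entries, so no expected output is shown here.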
This demo code collects the relative paths (from the starting path) of the files and directories under the starting path into a List. For files, it also calculates an MD5 checksum.
ComputeFileChecksumVisitor
The SimpleFileVisitor class is a basic visitor class that implements the FileVisitor interface. Extend this class and override the methods you need to process files and directories.
The overridden visitFile method is called back for each file found. Here it gets the relative path of the file and calculates its checksum.
The preVisitDirectory method is called back for each directory found. No checksum is calculated for a directory, so only the relative path is recorded.
There is also a postVisitDirectory method for directories. As the pre and post prefixes suggest, the two differ in when they are called back: preVisitDirectory runs before the entries of a directory are visited, and postVisitDirectory runs after all of them have been visited.
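To make the callback timing concrete, here is a small sketch that records the order in which the three callbacks fire. The class VisitOrderSketch, the trace helper and the "pre:", "file:" and "post:" labels are hypothetical names for this illustration; the tree contains a single subdirectory with a single file so that the order is unambiguous.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

public class VisitOrderSketch {

    // Records each callback as "<callback>:<name>", using "." for the starting directory.
    static List<String> trace(final Path start) throws IOException {
        final List<String> events = new ArrayList<>();
        Files.walkFileTree(start, new SimpleFileVisitor<Path>() {
            private String name(Path p) {
                return p.equals(start) ? "." : p.getFileName().toString();
            }

            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) {
                events.add("pre:" + name(dir));
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                events.add("file:" + name(file));
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc) throws IOException {
                if (exc != null) {
                    throw exc; // iterating the directory failed part-way; propagate the error
                }
                events.add("post:" + name(dir));
                return FileVisitResult.CONTINUE;
            }
        });
        return events;
    }

    public static void main(String[] args) throws IOException {
        // One subdirectory holding one file, so the visiting order is unambiguous.
        Path root = Files.createTempDirectory("visit-order");
        Path sub = Files.createDirectory(root.resolve("sub"));
        Files.createFile(sub.resolve("b.txt"));

        System.out.println(trace(root));
        // prints [pre:., pre:sub, file:b.txt, post:sub, post:.]
    }
}
```

Because postVisitDirectory fires only after everything inside the directory has been handled, it is the usual place for work such as deleting a directory once its contents are gone.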
ComputeFileChecksumVisitor
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;

public class ComputeFileChecksumVisitor extends SimpleFileVisitor<Path> {
    private final Path start;
    private final String hashAlg;
    private final List<FileItem> items = new ArrayList<>();

    public ComputeFileChecksumVisitor(Path start, String hashAlg) {
        this.start = start;
        this.hashAlg = hashAlg;
    }

    public List<FileItem> getResult() {
        return items;
    }

    // Called before the entries of a directory are visited.
    @Override
    public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) throws IOException {
        // Skip the starting directory itself; it has no relative path.
        if (!dir.equals(start)) {
            FileItem item = new FileItem(relativePath(dir), "");
            items.add(item);
        }
        return FileVisitResult.CONTINUE;
    }

    // Called for each file found.
    @Override
    public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
        FileItem item = new FileItem(relativePath(file), checksum(file));
        items.add(item);
        return FileVisitResult.CONTINUE;
    }

    // Returns the part of path that follows the starting path.
    private Path relativePath(Path path) {
        if (path.startsWith(start)) {
            return path.subpath(start.getNameCount(), path.getNameCount());
        }
        throw new RuntimeException("Path is not under the starting path: " + path);
    }

    private String checksum(Path path) throws IOException {
        final MessageDigest digest;
        try {
            digest = MessageDigest.getInstance(hashAlg);
        } catch (NoSuchAlgorithmException e) {
            throw new RuntimeException(e);
        }
        try (InputStream input = Files.newInputStream(path);
             DigestInputStream dInput = new DigestInputStream(input, digest)) {
            // Read in chunks rather than byte by byte; every read updates the digest.
            byte[] buffer = new byte[8192];
            while (dInput.read(buffer, 0, buffer.length) != -1) {
                // nothing else to do with the data
            }
        }
        return toHex(digest.digest());
    }

    private String toHex(byte[] bytes) {
        StringBuilder builder = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            builder.append(String.format("%02x", b));
        }
        return builder.toString();
    }
}
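As a standalone way to check the hashing step, the sketch below computes a checksum by feeding the digest manually with MessageDigest.update instead of wrapping the stream in a DigestInputStream. This is a variant for illustration rather than the code used by the visitor; the class name ChunkedChecksumSketch and the 8192-byte buffer size are assumptions made for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class ChunkedChecksumSketch {

    // Hashes the file in chunks and returns the digest as lowercase hex.
    static String checksum(Path path, String algorithm)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance(algorithm);
        byte[] buffer = new byte[8192]; // buffer size is an arbitrary choice
        try (InputStream input = Files.newInputStream(path)) {
            int read;
            while ((read = input.read(buffer)) != -1) {
                digest.update(buffer, 0, read); // only the bytes actually read
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("checksum-sketch", ".txt");
        Files.write(file, "abc".getBytes(StandardCharsets.US_ASCII));
        System.out.println(checksum(file, "MD5"));
        // prints 900150983cd24fb0d6963f7d28e17f72 (the RFC 1321 test vector for "abc")
        Files.delete(file);
    }
}
```

Checking against a published test vector like this is a quick way to confirm that the hex conversion and the read loop are both correct.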
FileItem
This class stores information about the files and directories found by ComputeFileChecksumVisitor.
The path field stores the relative path (from the starting path) of the file or directory.
The checksum field stores the checksum of the file, or an empty string for a directory.
FileItem
import java.nio.file.Path;

public class FileItem implements Comparable<FileItem> {
    private final Path path;
    private final String checksum;

    public FileItem(Path path, String checksum) {
        this.path = path;
        this.checksum = checksum;
    }

    public Path getPath() {
        return path;
    }

    public String getChecksum() {
        return checksum;
    }

    // Orders by path, then by checksum (the original called itself and never returned).
    @Override
    public int compareTo(FileItem o) {
        int byPath = this.path.compareTo(o.path);
        return byPath != 0 ? byPath : this.checksum.compareTo(o.checksum);
    }

    @Override
    public int hashCode() {
        final int prime = 31;
        int result = 1;
        result = prime * result + ((checksum == null) ? 0 : checksum.hashCode());
        result = prime * result + ((path == null) ? 0 : path.hashCode());
        return result;
    }

    // Two items are equal when both the path and the checksum match.
    @Override
    public boolean equals(Object obj) {
        if (this == obj)
            return true;
        if (obj == null)
            return false;
        if (getClass() != obj.getClass())
            return false;
        FileItem other = (FileItem) obj;
        if (checksum == null) {
            if (other.checksum != null)
                return false;
        } else if (!checksum.equals(other.checksum))
            return false;
        if (path == null) {
            if (other.path != null)
                return false;
        } else if (!path.equals(other.path))
            return false;
        return true;
    }

    // The label "hash" matches the execution result shown below.
    @Override
    public String toString() {
        return "FileItem [path=" + path + ", hash=" + checksum + "]";
    }
}
I created two directories, dir1 and dir2, with the same directory structure, as shown below. The file names and file contents are identical, but the creation timestamps differ.
The demo code walks dir1 and dir2 separately, collects the list of files and directories along with each file's checksum, and compares the two results to check whether the trees have the same structure.
D:\var
├─dir1
│  │  test1.txt
│  │  test2.txt
│  │
│  ├─aaa
│  │  └─ddd
│  ├─bbb
│  │      test3.txt
│  │
│  └─ccc
│          test4.txt
│
└─dir2
    │  test1.txt
    │  test2.txt
    │
    ├─aaa
    │  └─ddd
    ├─bbb
    │      test3.txt
    │
    └─ccc
            test4.txt
Execute
Demo
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

public class Demo {
    public static void main(String[] args) throws Exception {
        Path dir1 = Paths.get("D:", "var", "dir1");
        ComputeFileChecksumVisitor dir1Visit = new ComputeFileChecksumVisitor(dir1, "MD5");
        Files.walkFileTree(dir1, dir1Visit);

        Path dir2 = Paths.get("D:", "var", "dir2");
        ComputeFileChecksumVisitor dir2Visit = new ComputeFileChecksumVisitor(dir2, "MD5");
        Files.walkFileTree(dir2, dir2Visit);

        List<FileItem> dir1Files = dir1Visit.getResult();
        System.out.println("Root : " + dir1.toString());
        dir1Files.forEach(System.out::println);

        List<FileItem> dir2Files = dir2Visit.getResult();
        System.out.println("Root : " + dir2.toString());
        dir2Files.forEach(System.out::println);

        if (dir1Files.equals(dir2Files)) {
            System.out.println("equal");
        } else {
            System.out.println("not equal");
        }
    }
}
Result of execution
Root : D:\var\dir1
FileItem [path=aaa, hash=]
FileItem [path=aaa\ddd, hash=]
FileItem [path=bbb, hash=]
FileItem [path=bbb\test3.txt, hash=ed6e956a3d549303751e3238ab04bb46]
FileItem [path=ccc, hash=]
FileItem [path=ccc\test4.txt, hash=2c97af7af48689fc67a2700d9f051af6]
FileItem [path=test1.txt, hash=ac6a2aaa9317ef1f007c092c6a5fd75e]
FileItem [path=test2.txt, hash=811ad90a8dafc585bb64b23b6200969e]
Root : D:\var\dir2
FileItem [path=aaa, hash=]
FileItem [path=aaa\ddd, hash=]
FileItem [path=bbb, hash=]
FileItem [path=bbb\test3.txt, hash=ed6e956a3d549303751e3238ab04bb46]
FileItem [path=ccc, hash=]
FileItem [path=ccc\test4.txt, hash=2c97af7af48689fc67a2700d9f051af6]
FileItem [path=test1.txt, hash=ac6a2aaa9317ef1f007c092c6a5fd75e]
FileItem [path=test2.txt, hash=811ad90a8dafc585bb64b23b6200969e]
equal
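One caveat: List.equals also compares element order, and directory entries are not guaranteed to be visited in any particular order, so two identical trees could in principle yield lists in different orders. A possible safeguard, sketched below, is to sort copies of both lists before comparing them. The class OrderInsensitiveCompareSketch and the helper sameEntries are hypothetical names, and the "path=hash" strings merely stand in for FileItem entries; applying this to FileItem assumes its compareTo defines a usable ordering.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class OrderInsensitiveCompareSketch {

    // Returns true when both lists hold the same entries, regardless of order.
    static <T extends Comparable<? super T>> boolean sameEntries(List<T> left, List<T> right) {
        List<T> a = new ArrayList<>(left);  // copy so the callers' lists stay untouched
        List<T> b = new ArrayList<>(right);
        Collections.sort(a);
        Collections.sort(b);
        return a.equals(b);
    }

    public static void main(String[] args) {
        // Hypothetical "path=hash" strings standing in for FileItem entries.
        List<String> first = Arrays.asList("aaa=", "test1.txt=ac6a", "bbb=");
        List<String> second = Arrays.asList("bbb=", "aaa=", "test1.txt=ac6a");

        System.out.println(first.equals(second));       // prints false
        System.out.println(sameEntries(first, second)); // prints true
    }
}
```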