As long as the mask target can be extracted with a regular expression, the matching string can be replaced using **`%replace`** in the Logback settings.
By combining Java regular expressions with references to captured substrings, you can mask under more difficult conditions than the example above. Here is an example with the actual code.
In the following log, the user ID, token, and resource ID are all output in *exactly the same format* (32 hexadecimal digits). Of these, I want to mask **only the token**.
Log example
2020-11-14T09:30:52.774+09:00 [main] INFO com.example.Main - UserID: 35f44b06a3cf8dab8355eb8ba5844c73, Token: b9656056c799ab9ba19cebe12b49992b, ResourceID: 945c4f63c61f1bc7ba632fe0ce25aa0d
Logback settings
<configuration debug="true">
  <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
    <encoder>
      <pattern>%date{yyyy-MM-dd'T'HH:mm:ss.SSSXXX} [%thread] %level %logger - %message%n</pattern>
    </encoder>
  </appender>
  <root level="INFO">
    <appender-ref ref="STDOUT" />
  </root>
</configuration>
To extract only the token with a regular expression, you can use the fact that `Token` is written just before it in this example. (Regular expressions alone cannot make a strict judgment in every situation.)
Rewrite `%message` as follows.
%replace(%message){'((?i:token).{0,10}?)\b\p{XDigit}{32}\b','$1****'}
The log then looks like this. Only the token part becomes `****`; the other values are unchanged.
2020-11-14T09:43:31.724+09:00 [main] INFO com.example.Main - UserID: 5457645aaa75b97eb9e2c7b0aec79ca6, Token: ****, ResourceID: c194b0155ac7ece290092c1ee2a73948
`%replace` takes one argument in parentheses and two arguments in braces. You can think of them as corresponding to the receiver and the two arguments of [String#replaceAll()](https://docs.oracle.com/javase/jp/8/docs/api/java/lang/String.html#replaceAll-java.lang.String-java.lang.String-).
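To check the correspondence with `String#replaceAll()` outside of Logback, the same regular expression and replacement can be applied to the sample message in plain Java. This is only a minimal sketch: the class name `ReplaceAllDemo` and the `mask` helper are made up for illustration and are not part of the project shown in this article.

```java
// Hypothetical demo class, not part of the original project.
class ReplaceAllDemo {
    // Same regex and replacement as in the %replace setting.
    // Backslashes are doubled only because this is a Java string literal.
    static final String REGEX = "((?i:token).{0,10}?)\\b\\p{XDigit}{32}\\b";
    static final String REPLACEMENT = "$1****";

    // The message plays the role of the receiver (%message),
    // REGEX and REPLACEMENT the role of the two arguments in braces.
    static String mask(String message) {
        return message.replaceAll(REGEX, REPLACEMENT);
    }

    public static void main(String[] args) {
        String message = "UserID: 35f44b06a3cf8dab8355eb8ba5844c73, "
                + "Token: b9656056c799ab9ba19cebe12b49992b, "
                + "ResourceID: 945c4f63c61f1bc7ba632fe0ce25aa0d";
        System.out.println(mask(message));
    }
}
```

Running it should print the message with only the value after `Token: ` replaced by `****`, which matches the Logback output shown above.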
With a little more work on the regular expression, you can also do things like "keep the first and last 4 digits of the token".
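One possible way to keep the first and last 4 digits is to split the 32 digits into three parts and capture the outer two. The pattern below is my own assumption, not something taken from the settings in this article, and I have only reasoned about it with plain `String#replaceAll()`, not inside a Logback `%replace`. The class name `PartialMaskDemo` is hypothetical.

```java
// Hypothetical demo class, not part of the original project.
class PartialMaskDemo {
    // Group 1: "token" plus up to 10 characters, as in the original pattern.
    // Group 2: first 4 hex digits, group 3: last 4 hex digits.
    // The 24 digits in between are matched but not captured, so they are dropped.
    static final String REGEX =
            "((?i:token).{0,10}?)\\b(\\p{XDigit}{4})\\p{XDigit}{24}(\\p{XDigit}{4})\\b";
    static final String REPLACEMENT = "$1$2****$3";

    static String mask(String message) {
        return message.replaceAll(REGEX, REPLACEMENT);
    }

    public static void main(String[] args) {
        String message = "UserID: 35f44b06a3cf8dab8355eb8ba5844c73, "
                + "Token: b9656056c799ab9ba19cebe12b49992b, "
                + "ResourceID: 945c4f63c61f1bc7ba632fe0ce25aa0d";
        System.out.println(mask(message));
    }
}
```

For the sample message, the token should come out as `b965****992b` while the user ID and resource ID stay as they are. Whether leaving 8 of 32 digits visible is acceptable depends on how the token is used.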
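The regular expression breaks down as follows.
- 32 hexadecimal digits
  - Put `{32}` after `\p{XDigit}` (same as `[0-9A-Fa-f]`) to indicate that it is repeated 32 times.
  - Put `\b` (word boundary) before and after it to exclude runs longer than 32 digits. Note that `_` is treated as part of a word, so this does not work when the value is delimited by `_`.
  - Result: `\b\p{XDigit}{32}\b`
- **`Token` is written immediately before**
  - Capture the prefix with `()` so that the replacement can refer back to it with `$n` (n is the capture group number).
  - Use `(?i:)` to accommodate variations such as `Token`, `token`, and `TOKEN`.
  - After `Token`, `.*` is the longest match, so it cannot mask correctly in this example. `.*?` is not good either: it would also match a message such as `token validation for user 0123 ... is failed`.
  - Result: `((?i:token).{0,10}?)`, which allows at most 10 characters between `Token` and the value.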
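The differences between `.*`, `.*?`, and `.{0,10}?`, and the caveat about `_`, are easier to see side by side. The following sketch applies each variant with plain `String#replaceAll()`. The class name `PrefixPatternDemo` and the concrete 32-digit user ID in the second message are made up for illustration; the original example message elides that value.

```java
// Hypothetical demo class, not part of the original project.
class PrefixPatternDemo {
    static final String HEX32 = "\\b\\p{XDigit}{32}\\b";
    static final String GREEDY = "((?i:token).*)" + HEX32;         // longest match
    static final String RELUCTANT = "((?i:token).*?)" + HEX32;     // shortest match, unbounded
    static final String BOUNDED = "((?i:token).{0,10}?)" + HEX32;  // at most 10 characters

    static final String LOG = "UserID: 35f44b06a3cf8dab8355eb8ba5844c73, "
            + "Token: b9656056c799ab9ba19cebe12b49992b, "
            + "ResourceID: 945c4f63c61f1bc7ba632fe0ce25aa0d";
    // Made-up message in which "token" is far away from an unrelated 32-digit value.
    static final String OTHER =
            "token validation for user 0123456789abcdef0123456789abcdef is failed";
    // Made-up message in which the value is delimited by "_" instead of a space.
    static final String UNDERSCORE = "token_b9656056c799ab9ba19cebe12b49992b";

    static String mask(String regex, String message) {
        return message.replaceAll(regex, "$1****");
    }

    public static void main(String[] args) {
        // Greedy: skips past the token and masks the last 32-digit value instead.
        System.out.println(mask(GREEDY, LOG));
        // Reluctant: fine for LOG, but also masks the unrelated user ID in OTHER.
        System.out.println(mask(RELUCTANT, LOG));
        System.out.println(mask(RELUCTANT, OTHER));
        // Bounded: masks the token in LOG and leaves OTHER alone.
        System.out.println(mask(BOUNDED, LOG));
        System.out.println(mask(BOUNDED, OTHER));
        // No word boundary exists between "_" and a hex digit, so nothing is masked.
        System.out.println(mask(BOUNDED, UNDERSCORE));
    }
}
```

The last case is worth keeping in mind: with the pattern as written, a token that directly follows `_` is silently left unmasked.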
(Logstash also seems to have a setting for masking, but I have not investigated it yet.)
For JSON output, the pattern in the Logback settings is itself written in JSON, so it is necessary to **escape the backslashes**. (Otherwise `\b` is recognized as a backspace and `\p` as an illegal escape.)
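Java string literals happen to have the same pitfall as JSON strings here, so the effect can be reproduced without any JSON library. The following is only an analogy in Java, not a test of the Logback JSON pattern itself, and the class name `BackslashEscapeDemo` is hypothetical.

```java
// Hypothetical demo class, not part of the original project.
class BackslashEscapeDemo {
    // "\b" in a Java string literal is the single character U+0008 (backspace),
    // just as "\b" inside a JSON string is. The regex then looks for literal
    // backspace characters around the digits instead of word boundaries.
    // (A bare \p is not shown because it does not even compile in Java source,
    // which parallels the illegal escape error in JSON.)
    static final String BROKEN = "((?i:token).{0,10}?)\b\\p{XDigit}{32}\b";

    // "\\b" is a backslash followed by 'b', which the regex engine reads as a word boundary.
    static final String CORRECT = "((?i:token).{0,10}?)\\b\\p{XDigit}{32}\\b";

    static String mask(String regex, String message) {
        return message.replaceAll(regex, "$1****");
    }

    public static void main(String[] args) {
        String message = "Token: b9656056c799ab9ba19cebe12b49992b";
        System.out.println(mask(BROKEN, message));   // message has no backspace, so no match
        System.out.println(mask(CORRECT, message));  // token is replaced
    }
}
```

The dangerous part of the unescaped variant is that it fails quietly: the pattern is still a valid regular expression, it simply never matches, and the token is logged in clear text.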
Settings (partial)
{
"timestamp": "%date{yyyy-MM-dd'T'HH:mm:ss.SSSXXX}",
"thread": "%thread",
"level": "%level",
"logger": "%logger",
"message": "%replace(%message){'((?i:token).{0,10}?)\\b\\p{XDigit}{32}\\b','$1****'}"
}
Log
{"timestamp":"2020-11-14T11:31:38.259+09:00","thread":"main","level":"INFO","logger":"com.example.Main","message":"UserID: c610e22e634ed2ff9f1bb27afc81e638, Token: ****, ResourceID: de343ea6405a8c559043c3e3e84f9bcd"}
The code used for this experiment is as follows.
pom.xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.example</groupId>
  <artifactId>logback-sample</artifactId>
  <version>1.0-SNAPSHOT</version>
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <configuration>
          <source>8</source>
          <target>8</target>
        </configuration>
      </plugin>
    </plugins>
  </build>
  <dependencies>
    <dependency>
      <groupId>ch.qos.logback</groupId>
      <artifactId>logback-classic</artifactId>
      <version>1.2.3</version>
    </dependency>
    <dependency>
      <groupId>net.logstash.logback</groupId>
      <artifactId>logstash-logback-encoder</artifactId>
      <version>6.4</version>
    </dependency>
  </dependencies>
</project>
src/main/resources/logback.xml
<configuration debug="true">
  <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
    <encoder>
      <pattern>%date{yyyy-MM-dd'T'HH:mm:ss.SSSXXX} [%thread] %level %logger - %replace(%message){'((?i:token).{0,10}?)\b\p{XDigit}{32}\b','$1****'}%n</pattern>
    </encoder>
  </appender>
  <appender name="STDOUT_JSON" class="ch.qos.logback.core.ConsoleAppender">
    <!-- https://github.com/logstash/logstash-logback-encoder -->
    <encoder class="net.logstash.logback.encoder.LoggingEventCompositeJsonEncoder">
      <providers>
        <pattern>
          <pattern>
            {
              "timestamp": "%date{yyyy-MM-dd'T'HH:mm:ss.SSSXXX}",
              "thread": "%thread",
              "level": "%level",
              "logger": "%logger",
              "message": "%replace(%message){'((?i:token).{0,10}?)\\b\\p{XDigit}{32}\\b','$1****'}"
            }
          </pattern>
        </pattern>
      </providers>
    </encoder>
  </appender>
  <root level="INFO">
    <appender-ref ref="STDOUT" />
    <appender-ref ref="STDOUT_JSON" />
  </root>
</configuration>
src/main/java/com/example/Main.java
package com.example;

public class Main {
    private static final org.slf4j.Logger log =
            org.slf4j.LoggerFactory.getLogger(Main.class);

    public static void main(String[] args) {
        // Log three values that share the same format; only the token should be masked.
        log.info("UserID: {}, Token: {}, ResourceID: {}", hex(), hex(), hex());
    }

    // Returns 32 random hexadecimal digits (16 random bytes, 2 digits each).
    private static String hex() {
        return new java.util.Random().ints(16, 0, 256)
                .mapToObj(x -> String.format("%02x", x))
                .reduce("", (a, b) -> a + b);
    }
}