Masking sensitive data in Logback logs means replacing customer-sensitive data or NPI (non-public personal information), in part or in whole, with some arbitrary encoded text. For example, an SSN can be replaced with asterisks, or it can be removed from the log entirely.
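To make "in part or in whole" concrete, here is a minimal sketch of both styles on a plain string. The class and method names (`MaskingStyles`, `maskAll`, `maskAllButLast`) are made up for illustration and are not part of Logback or of this tutorial's source code.

```java
final class MaskingStyles {
    private MaskingStyles() {}

    /** Masks in whole: every character becomes '*', the length is preserved. */
    static String maskAll(String value) {
        return "*".repeat(value.length());
    }

    /** Masks in part: everything except the last `visible` characters becomes '*'. */
    static String maskAllButLast(String value, int visible) {
        // Clamp so that negative or oversized values of `visible` cannot fail
        int keep = Math.min(Math.max(visible, 0), value.length());
        int hidden = value.length() - keep;
        return "*".repeat(hidden) + value.substring(hidden);
    }

    public static void main(String[] args) {
        String ssn = "856-45-6789";
        System.out.println(maskAll(ssn));           // ***********
        System.out.println(maskAllButLast(ssn, 4)); // *******6789
    }
}
```

The rest of this tutorial uses the first style, where every matched character is replaced with an asterisk.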

1. Mask the NPI in logs

In general, we can mask sensitive data in two ways.

The first approach (not recommended) is to create utility functions that create masked string representations of domain objects with sensitive information.

logger.info("Transaction completed with details : " + CommonUtils.mask(transaction));

This approach is problematic because the masking calls are scattered across the application code. If we are later required to mask data only in production and pre-production environments, we will have to change the code in multiple places.

Similarly, if we find that a domain object has been missed during masking, we may need to change the code in many places and in many logging statements.

The second approach is to separate the masking logic from the application code and place it in the Logback configuration. Changes in masking logic are then confined to the configuration file and the layout class. Application classes do not take part in any masking logic.

Any change in masking logic or scope is handled by Logback through the layout class and the configuration file. This option is easier to manage and should be the preferred way to mask data in logs.

2. How to mask data with Logback

Data masking in Logback is done in two steps.

  1. Define the masking patterns as regular expressions in the logback.xml configuration file.
  2. Define a custom Layout class that reads the masking patterns and applies those regular expressions to the log message.

2.1. Configure the masking patterns in the configuration file

This is the slightly harder part, where you write regex patterns for the information you want to mask. Writing regular expressions that cover the various output formats may not be easy, but once it is done, you will thank yourself later.
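One way to gain confidence in a pattern before it goes into the configuration is to try it in a small standalone program. The sketch below exercises the SSN and email expressions used in this tutorial with `java.util.regex`. The class name `MaskPatternCheck`, the helper `firstMatch`, and the sample inputs are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MaskPatternCheck {
    // The SSN and email expressions from this tutorial, escaped for a Java string literal
    static final Pattern SSN = Pattern.compile(
            "((?!000|666)[0-8][0-9]{2}-(?!00)[0-9]{2}-(?!0000)[0-9]{4})");
    static final Pattern EMAIL = Pattern.compile("(\\w+@\\w+\\.\\w+)");

    private MaskPatternCheck() {}

    /** Returns the first substring matched by the pattern, or null if nothing matches. */
    static String firstMatch(Pattern pattern, String text) {
        Matcher matcher = pattern.matcher(text);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(firstMatch(SSN, "ssn=856-45-6789"));   // 856-45-6789
        System.out.println(firstMatch(SSN, "ssn=000-45-6789"));   // null (area 000 is rejected)
        System.out.println(firstMatch(EMAIL, "user@example.com")); // user@example.com
        // \w does not match '.', so dotted addresses are only partially covered
        System.out.println(firstMatch(EMAIL, "first.last@mail.example.com")); // last@mail.example
    }
}
```

The last line shows why the patterns deserve this kind of check: the simple email expression leaves `first.` and `.com` unmasked for an address with dots in it, so a stricter pattern may be needed for real data.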

Here is a configuration that uses a console appender (for demonstration purposes) to log masked data. It masks only the email and SSN fields.

<appender name="DATA_MASK" class="ch.qos.logback.core.ConsoleAppender">
    <encoder class="ch.qos.logback.core.encoder.LayoutWrappingEncoder">
       <layout class="com.howtodoinjava.demo.logback.DataMaskingPatternLayout">
       <maskPattern>((?!000|666)[0-8][0-9]{2}-(?!00)[0-9]{2}-(?!0000)[0-9]{4})</maskPattern> <!-- SSN -->
       <maskPattern>(\w+@\w+\.\w+)</maskPattern> <!-- Email -->
       <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n</pattern>
       </layout>
    </encoder>
</appender>

Note that we can easily enable or disable masking in a particular environment by using the if-else conditional processing that Logback supports through the Janino library.

<dependency>
    <groupId>org.codehaus.janino</groupId>
    <artifactId>janino</artifactId>
    <version>3.1.6</version>
</dependency>

In the given example, we have enabled data masking in production and disabled it in all other environments. ENV is a system property that holds the name of the environment in which the application is running.

<if condition='property("ENV").equals("prod")'>
	<then>
	<appender name="DATA_MASK" class="ch.qos.logback.core.ConsoleAppender">
        <encoder class="ch.qos.logback.core.encoder.LayoutWrappingEncoder">
           <layout class="com.howtodoinjava.demo.logback.DataMaskingPatternLayout">
		       <maskPattern>((?!000|666)[0-8][0-9]{2}-(?!00)[0-9]{2}-(?!0000)[0-9]{4})</maskPattern> <!-- SSN -->
		       <maskPattern>(\w+@\w+\.\w+)</maskPattern> <!-- Email -->
		       <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n</pattern>
           </layout>
        </encoder>
    </appender>
  </then>
  <else>
  	<appender name="DATA_MASK" class="ch.qos.logback.core.ConsoleAppender">
        <encoder>
			<pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n</pattern>
		</encoder>
    </appender>
  </else>
</if>

2.2. Customize PatternLayout

The second part of the solution is to read the masking patterns from the configuration and apply them to the log message. This is fairly simple and can be implemented with a custom pattern layout.

The layout below builds a single regular expression by joining all the patterns in the configuration with the OR operator. This expression is applied to every log message that the layout formats.

We can customize the logic implemented in this handler to suit our own requirements.

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

import ch.qos.logback.classic.PatternLayout;
import ch.qos.logback.classic.spi.ILoggingEvent;

public class DataMaskingPatternLayout extends PatternLayout 
{
	private Pattern appliedPattern;
	private List<String> maskPatterns = new ArrayList<>();

	// Called by Logback once for each <maskPattern> element in the configuration
	public void addMaskPattern(String maskPattern) {
		maskPatterns.add(maskPattern);
		// Combine all configured patterns into one expression with the OR operator
		appliedPattern = Pattern.compile( maskPatterns.stream()
					.collect(Collectors.joining("|")), Pattern.MULTILINE);
	}

	@Override
	public String doLayout(ILoggingEvent event) {
		return maskMessage(super.doLayout(event));
	}

	private String maskMessage(String message) {
		// No patterns configured, e.g. when masking is disabled in an environment
		if (appliedPattern == null) {
			return message;
		}
		StringBuilder sb = new StringBuilder(message);
		Matcher matcher = appliedPattern.matcher(sb);
		while (matcher.find()) {
			IntStream.rangeClosed(1, matcher.groupCount()).forEach(group -> {
				if (matcher.group(group) != null) {
					// Overwrite every character of the matched group with '*'
					IntStream.range(matcher.start(group),
								matcher.end(group)).forEach(i -> sb.setCharAt(i, '*'));
				}
			});
		}
		return sb.toString();
	}
}
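The masking step itself does not depend on Logback, so it can be tried out in isolation. The following is a minimal sketch of the same idea (join the patterns with `|`, then overwrite each matched capture group with asterisks) as a plain static method. The class name `MaskingLogicDemo` and the sample message are assumptions for illustration, and unlike the layout above it matches against the original string while writing into a separate buffer.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MaskingLogicDemo {
    private MaskingLogicDemo() {}

    /** Replaces every character captured by any of the patterns with '*'. */
    static String mask(List<String> maskPatterns, String message) {
        if (maskPatterns.isEmpty()) {
            return message; // nothing configured, nothing to mask
        }
        // One combined expression: pattern1|pattern2|...
        Pattern combined = Pattern.compile(String.join("|", maskPatterns), Pattern.MULTILINE);
        StringBuilder masked = new StringBuilder(message);
        Matcher matcher = combined.matcher(message);
        while (matcher.find()) {
            // Only the alternative that matched has a start index other than -1
            for (int group = 1; group <= matcher.groupCount(); group++) {
                if (matcher.start(group) == -1) {
                    continue;
                }
                for (int i = matcher.start(group); i < matcher.end(group); i++) {
                    masked.setCharAt(i, '*');
                }
            }
        }
        return masked.toString();
    }

    public static void main(String[] args) {
        List<String> patterns = List.of(
                "((?!000|666)[0-8][0-9]{2}-(?!00)[0-9]{2}-(?!0000)[0-9]{4})", // SSN
                "(\\w+@\\w+\\.\\w+)");                                        // Email
        String message = "Customer found : {\"id\":\"12345\","
                + "\"email\":\"user@example.com\",\"ssn\":\"856-45-6789\"}";
        System.out.println(mask(patterns, message));
    }
}
```

Because each pattern is wrapped in a capture group, only the captured text is overwritten. The `id` value stays readable, and the masked values keep their original length.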

3. Demo

Let’s look at data masking in action. I will execute the demo code in both production and non-production mode.

In non-production mode, we have not set the system property ENV, so data masking does not occur.

Logger logger = LoggerFactory.getLogger(Main.class);

Map<String, String> customer = new HashMap<String, String>();
customer.put("id", "12345");
customer.put("ssn", "856-45-6789");
customer.put("email", "[email protected]");

logger.info("Customer found : {}", new JSONObject(customer));

21:02:18.683 [main] INFO  com.howtodoinjava.demo.slf4j.Main - Customer found : {"id":"12345","email":"[email protected]","ssn":"856-45-6789"}

When we run the application in production mode, we can see the masked output.

//Production mode ON
System.setProperty("ENV", "prod");

Logger logger = LoggerFactory.getLogger(Main.class);

Map<String, String> customer = new HashMap<String, String>();
customer.put("id", "12345");
customer.put("ssn", "856-45-6789");
customer.put("email", "[email protected]");

logger.info("Customer found : {}", new JSONObject(customer));

21:03:07.960 [main] INFO  com.howtodoinjava.demo.slf4j.Main - Customer found : {"id":"12345","email":"***************","ssn":"***********"}

4. Conclusion

In this Logback tutorial, we learned to create a custom PatternLayout that masks sensitive data in the application's logs. The masking patterns are controlled centrally through the configuration file, which makes this technique very useful.

We can extend this functionality with the conditional tags that Logback supports through the Janino library to achieve environment-specific masking.

Happy Learning !!
