The problem background

My log path is /opt/logs/xxxx.log. There are two servers in the cluster.

  • A server’s log files were last updated today,
  • The log files on the other server were last updated yesterday or more ago.

In other words, at some point in the past, a server has stopped printing logs. By the way, I did not build this project from scratch. I took over the maintenance of the project. The logging framework used is SLF4J + Logback.

At the beginning, I found the following solution on Baidu:


      
<jboss-deployment-structure>
    <deployment>
        <! -- Exclusions allow you to prevent the server from automatically adding some dependencies -->
        <exclusions>
            <module name="org.slf4j" />
            <module name="org.slf4j.impl" />
            <! -- If the project uses jCL-over-slf4J, you can kill the following module as well -->
            <module name="org.slf4j.jcl-over-slf4j" />
        </exclusions>
    </deployment>
</jboss-deployment-structure>
Copy the code

I took a look at my project:

Jcl-over-slf4j Jul-to-slf4j log4j-over-slf4j slf4J-API has both, so I tried it

Unfortunately, the problem was not solved, so I had to continue to analyze, I can not guess the reason, and can not see the error, so I wondered if I could find some exceptions in the source code to help solve the problem.

Source code analysis

For example, we have a simple UserService example that calls the log.info method:

package service; import org.slf4j.Logger; import org.slf4j.LoggerFactory; public class UserService { private static final Logger log = LoggerFactory.getLogger("service.UserService"); public String getPhoneByUsername(String username) { String result = "12345678901"; log.info("{}'s phone number is {}", username, result); return result; }}Copy the code

Implement org logback. Slf4j. Logger interface instance is kind of ch. Qos. Logback. Classic. The Logger. We found the trace/debug/info/warn/error method, will be called a private method buildLoggingEventAndAppend:

private void buildLoggingEventAndAppend(final String localFQCN, final Marker marker, final Level level, final String msg, final Object[] params,
                final Throwable t) {
    Create a LoggingEvent object
    LoggingEvent le = new LoggingEvent(localFQCN, this, level, msg, t, params);
    le.setMarker(marker);
    Call Appenders to handle LoggingEvent
    callAppenders(le);
}
Copy the code

Next, look at the source code for callAppenders:

/**
 * Invoke all the appenders of this logger.
 * 
 * @param event
 *          The event to log
 */
public void callAppenders(ILoggingEvent event) {
    int writes = 0;
    for (Logger l = this; l ! =null; l = l.parent) {
        writes += l.appendLoopOnAppenders(event);
        if(! l.additive) {break; }}// No appenders in hierarchy
    if (writes == 0) {
        loggerContext.noAppenderDefinedWarning(this); }}Copy the code

Let’s look at this loop for (Logger L = this; l ! = null; L = l.parent) {}, find the parent Logger from the current Logger, take service.userService as an example:

This tree Logger structure is created during loggerFactory.getLogger (” service.userService “). Class in logback ch. Qos. Logback. Classic. LoggerContext# getLogger assume the task.

Ch. Qos. Logback. Classic. Continue to dig down, Logger code by appendLoopOnAppenders method, this method can continue to perform, the key is whether aai is null?

transient private AppenderAttachableImpl<ILoggingEvent> aai;

private int appendLoopOnAppenders(ILoggingEvent event) {
    if(aai ! =null) {
        return aai.appendLoopOnAppenders(event);
    } else {
        return 0; }}Copy the code

Is still a ch. Qos. Logback. Classic. Logger source, aai is created when addAppender, if there is no a Logger appender, aai is null:

public synchronized void addAppender(Appender<ILoggingEvent> newAppender) {
    if (aai == null) {
        aai = new AppenderAttachableImpl<ILoggingEvent>();
    }
    aai.addAppender(newAppender);
}
Copy the code

Take a look at my configuration, logback.xml in the Resources folder:


      
<configuration>
	<appender name="FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
                <! -- here I paste the official configuration, not paste the configuration I check the problem -->
		<file>logFile.log</file>
		<rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
			<! -- daily rollover -->
			<fileNamePattern>logFile.%d{yyyy-MM-dd}.log</fileNamePattern>
			<! -- keep 30 days' worth of history capped at 3GB total size -->
			<maxHistory>30</maxHistory>
			<totalSizeCap>3GB</totalSizeCap>
		</rollingPolicy>
		<encoder>
			<pattern>%-4relative [%thread] %-5level %logger{35} - %msg%n</pattern>
		</encoder>
	</appender>
	<appender name="ASYNC_FILE" class="ch.qos.logback.classic.AsyncAppender">
		<appender-ref ref="FILE"/>
	</appender>
	<root level="INFO">
		<appender-ref ref="ASYNC_FILE"/>
	</root>
</configuration>
Copy the code

Based on the XML configuration above, the resulting root Logger is parsed as shown below:

As I understand it, the < Logger > tag supports the < Append-ref > child tag, and if it contains at least one < Append-ref >, then the logger has an AAI attribute.

In addition, ch. Qos. Logback. Classic. AsyncAppender bring aai, because it is the parent of the ch. Qos. Logback. Core. Aai AsyncAppenderBase:

public class AsyncAppenderBase<E> extends UnsynchronizedAppenderBase<E> implements AppenderAttachable<E> {
    AppenderAttachableImpl<E> aai = new AppenderAttachableImpl<E>();
}
Copy the code

The succession system:

ConsoleAppender, FileAppender, and RollingFileAppender have no AAI property!

Now, we go back to ch. Qos. Logback. Core. Spi. AppenderAttachableImpl appendLoopOnAppenders method of source code, it calls the doAppend

/** * Call the doAppend method on all attached appenders. */
public int appendLoopOnAppenders(E e) {
    int size = 0;
    for (Appender<E> appender : appenderList) {
        appender.doAppend(e);
        size++;
    }
    return size;
}
Copy the code

Debug key points

AsyncAppender

If you want to figure out the working process of the asynchronous AsyncAppender suggested breakpoint play in two: 1, ch. Qos. Logback. Core. AsyncAppenderBase line 156, put method:

The call stack is shown below:

2, ch. Qos. Logback. Core. AsyncAppenderBase line 265, the inner class Worker within the run method of the cycle of:

The inner Worker class works on threads prefixed with AsyncAppender-Worker-.

RollingFileAppender

Now, we put a breakpoint in ch. Qos. Logback. Core. Rolling. The RollingFileAppender 222 lines, see subAppend executed:

/** * This method differentiates RollingFileAppender from its super class. */ @Override protected void subAppend(E event) { // The roll-over check must precede actual writing. This is the // only correct behavior for time driven triggers. // We need to synchronize on triggeringPolicy so that only one rollover // occurs at a time synchronized (triggeringPolicy) { if (triggeringPolicy.isTriggeringEvent(currentlyActiveFile, event)) { rollover(); }} // Call the subAppend method of OutputStreamAppender, which is the method that actually writes to the log. super.subAppend(event); }Copy the code

Here is the call stack information when I run to subAppend:

The problem I have is this rollover

Let me first post that I have a problem with logback.xml:


      
<configuration>
    <appender name="FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
        <file>logFile.log</file>
        <rollingPolicy class="ch.qos.logback.core.rolling.FixedWindowRollingPolicy">
            <FileNamePattern>biz-%d{yyyy-MM-dd_HH}.%i.txt</FileNamePattern>
            <MinIndex>1</MinIndex>
            <! Keep a maximum of 20 historical logs
            <MaxIndex>20</MaxIndex>
        </rollingPolicy>
        <triggeringPolicy class="ch.qos.logback.core.rolling.SizeBasedTriggeringPolicy">
            <! -- Each file has a maximum size of 1KB.
            <MaxFileSize>1KB</MaxFileSize>
        </triggeringPolicy>
        <encoder class="ch.qos.logback.classic.encoder.PatternLayoutEncoder">
            <pattern>%date{yyyy-MM-dd HH:mm:ss.SSS} %level [%thread] %logger{36} %msg%n</pattern>
        </encoder>
    </appender>
    <appender name="ASYNC_FILE" class="ch.qos.logback.classic.AsyncAppender">
        <appender-ref ref="FILE"/>
    </appender>
    <root level="INFO">
        <appender-ref ref="ASYNC_FILE"/>
    </root>
</configuration>
Copy the code

I then looped the method UserService#getPhoneByUsername to print the log 200 times:

for (int i = 0; i < 100; i++) {
  String result = getPhoneByUsername("Mi");
}
Copy the code

Ch. Qos. Logback. Core. Rolling. SizeBasedTriggeringPolicy isTriggeringEvent method returns true, it’s that one and I don’t print log server is the same:

public boolean isTriggeringEvent(final File activeFile, final E event) {
    long now = System.currentTimeMillis();
    if (invocationGate.isTooSoon(now))
        return false;
    // Returns true when the log file size exceeds the set size (default is 10MB, I changed it to 1KB)
    return (activeFile.length() >= maxFileSize.getSize());
}
Copy the code

Breakpoints can be set atrolloverThis method is in the line. Then, when the file size exceeds the specified size, an exception occurs when calling ROLLBACKch.qos.logback.core.UnsynchronizedAppenderBaseThe doAppend method captures:

Then, the real log method is not implemented, namely ch. Qos. Logback. Core. OutputStreamAppender subAppend method has not been executed:

/**
 * Actual writing occurs here.
 * <p>
 * Most subclasses of <code>WriterAppender</code> will need to override this
 * method.
 * 
 * @since0.9.0 * /
protected void subAppend(E event) {
    if(! isStarted()) {return;
    }
    try {
        // this step avoids LBCLASSIC-139
        if (event instanceof DeferredProcessingAware) {
            ((DeferredProcessingAware) event).prepareForDeferredProcessing();
        }
        // the synchronization prevents the OutputStream from being closed while we
        // are writing. It also prevents multiple threads from entering the same
        // converter. Converters assume that they are in a synchronized block.
        lock.lock();
        try {
            writeOut(event);
        } finally{ lock.unlock(); }}catch (IOException ioe) {
        // as soon as an exception occurs, move to non-started state
        // and add a single ErrorStatus to the SM.
        this.started = false;
        addStatus(new ErrorStatus("IO failure in appender".this, ioe)); }}Copy the code

Then, we can look at the stack information where the exception occurred:

I choose from ch. Qos. Logback. Core. Rolling. The FixedWindowRollingPolicy continue refining analysis problem:

This line also has the variable name fileNamePattern, which reminds me of the < fileNamePattern > tag in XML.

Where the simulation went wrong, FileNamePattern

I then wrote code to simulate the behavior of FileNamePattern:

FileNamePattern pattern = new FileNamePattern("biz-%d{yyyy-MM-dd_HH}.%i.txt".new LoggerContext());
String s = pattern.convertInt(1);
Copy the code

Debug reference point

Breakpoint play inch.qos.logback.core.rolling.helper.FileNamePatternAt line 65, look at line 64 of the parse method:

As shown in the figure, the home-made word splitter in logback parses biz-%d{YYYY-MM-DD_HH}.%i.txt to tokenList.

Next line 67, look at the Node result:

%d{YYYY-MM-DD_hh} and % I are parsed as keywords, while the others retain literals.

Keywords D and I correspond to different converters:

static final Map<String, String> CONVERTER_MAP = new HashMap<String, String>(); The static {/ / I corresponding IntegerTokenConverter CONVERTER_MAP. Put (IntegerTokenConverter CONVERTER_KEY, IntegerTokenConverter.class.getName()); / / d corresponding DateTokenConverter CONVERTER_MAP. Put (DateTokenConverter. CONVERTER_KEY, DateTokenConverter. Class. GetName ()); }Copy the code

%d{YYYY-MM-DD_hh} % D {YYYY-MM-DD_HH} % D {YYYY-MM-DD_HH} % D {YYYY-MM-DD_HH} % D {YYYY-MM-DD_HH}

Case closed!

The solution

FileNamePattern and rollingPolicy do not match.

FixedWindowRollingPolicy

<? The XML version = "1.0" encoding = "utf-8"? > <configuration> <appender name="FILE" class="ch.qos.logback.core.rolling.RollingFileAppender"> <file>logFile.log</file> <rollingPolicy class="ch.qos.logback.core.rolling.FixedWindowRollingPolicy"> <! Begin--> <FileNamePattern>biz.%i.t </FileNamePattern> <! END --> <MinIndex>1</MinIndex> <! Keep a maximum of 20 historical logs --> <MaxIndex>20</MaxIndex> </rollingPolicy> <triggeringPolicy class="ch.qos.logback.core.rolling.SizeBasedTriggeringPolicy"> <! -- Maximum 1KB for each file, <MaxFileSize>1KB</MaxFileSize> </triggeringPolicy> <encoder class="ch.qos.logback.classic.encoder.PatternLayoutEncoder"> <pattern>%date{yyyy-MM-dd HH:mm:ss.SSS} %level [%thread] %logger{36} %msg%n</pattern> </encoder> </appender> <appender name="ASYNC_FILE" class="ch.qos.logback.classic.AsyncAppender"> <appender-ref ref="FILE"/> </appender> <root level="INFO"> <appender-ref ref="ASYNC_FILE"/> </root> </configuration>Copy the code

SizeAndTimeBasedRollingPolicy

<? The XML version = "1.0" encoding = "utf-8"? > <configuration> <appender name="FILE" class="ch.qos.logback.core.rolling.RollingFileAppender"> <file>logFile.log</file> <! - modify part Begin -- > < rollingPolicy class = "ch. Qos. Logback. Core. Rolling. SizeAndTimeBasedRollingPolicy" > <! -- rollover daily --> <fileNamePattern>biz-%d{yyyy-MM-dd_HH}.%i.txt</fileNamePattern> <! -- each file should be at most 100MB, keep 60 days worth of history, but at most 20GB --> <maxFileSize>100MB</maxFileSize> <maxHistory>60</maxHistory> <totalSizeCap>20GB</totalSizeCap> </rollingPolicy> <! - the modified part END -- > < encoder class = "ch. Qos. Logback. Classic. Encoder. PatternLayoutEncoder" > < pattern > % date {yyyy - MM - dd HH:mm:ss.SSS} %level [%thread] %logger{36} %msg%n</pattern> </encoder> </appender> <appender name="ASYNC_FILE" class="ch.qos.logback.classic.AsyncAppender"> <appender-ref ref="FILE"/> </appender> <root level="INFO"> <appender-ref ref="ASYNC_FILE"/> </root> </configuration>Copy the code

conclusion

On one of my servers, the log file logfile.log is smaller than the MaxFileSize (50MB) set, so it works fine; The log file size of the other server is larger than MaxFileSize. An error occurred during the rollback method. As a result, log writing cannot continue.

The reason why ROLLBACK does not work is that FileNamePattern and rollingPolicy do not match.

Reference documentation

  • Logback Manual — Chapter 4: Appenders Reading
  • Jboss Logback SLf4J log files do not generate problem reading
  • Logback performance optimization and source code analysis reading