SOLR-18124: Add tracing spans for UpdateLog replay #4216

Open
openworld-maker wants to merge 4 commits into apache:main from openworld-maker:SOLR-18124-updatelog-replay-tracing

@openworld-maker commented Mar 15, 2026

Description

This PR addresses SOLR-18124 by adding tracing spans around UpdateLog replay/recovery.

What changed

  • Added an overall replay span: updatelog.replay.
  • Added a per-log child span: updatelog.replay.log.
  • Added lightweight replay metadata on spans where already available:
    • replay context (state, active_log, in_sorted_order)
    • core name and db instance
    • total/replayed log counts and replayed op counts
    • per-log file name, size, replayed op count, error count, success flag
    • URP chain summary
  • Added error reporting to spans for replay exceptions.
  • Added UpdateLogReplayTracingTest to verify:
    • parent replay span exists
    • per-log child span exists
    • parent/child relationship is correct
    • key replay metadata attributes are present
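The span layout described above (one parent span for the whole replay, one child span per log file) can be sketched as follows. This is a toy in-memory model, not the PR's code and not the OpenTelemetry API: `ToySpan`, `replay`, and the `file` attribute key are hypothetical names used only for illustration. The span names, the `updatelog.replay.logs_total` key, and the `%s.%019d` file name pattern are taken from the PR.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class ReplaySpanSketch {
  // Toy in-memory span: a stand-in for a real tracing API (assumption, not OpenTelemetry).
  static final class ToySpan {
    final String name;
    final ToySpan parent;
    final Map<String, Object> attributes = new LinkedHashMap<>();
    final List<ToySpan> children = new ArrayList<>();

    ToySpan(String name, ToySpan parent) {
      this.name = name;
      this.parent = parent;
      if (parent != null) {
        parent.children.add(this); // register with the parent so the tree can be inspected
      }
    }
  }

  // Builds one parent span for the whole replay and one child span per log file.
  static ToySpan replay(List<String> logFiles) {
    ToySpan replaySpan = new ToySpan("updatelog.replay", null);
    replaySpan.attributes.put("updatelog.replay.logs_total", logFiles.size());
    for (String file : logFiles) {
      ToySpan logSpan = new ToySpan("updatelog.replay.log", replaySpan);
      logSpan.attributes.put("file", file); // hypothetical attribute key
    }
    return replaySpan;
  }

  public static void main(String[] args) {
    // File names follow the "%s.%019d" pattern quoted in the review below.
    List<String> logs = List.of(
        String.format(Locale.ROOT, "%s.%019d", "tlog", 1L),
        String.format(Locale.ROOT, "%s.%019d", "tlog", 2L));
    ToySpan root = replay(logs);
    System.out.println(root.name + " has " + root.children.size() + " child spans");
  }
}
```

In the real change the spans would come from Solr's tracing utilities rather than a hand-rolled class; the sketch only shows the parent/child shape that the test in this PR checks for.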

Notes

  • No intended functional change to UpdateLog replay behavior.
  • Instrumentation follows existing Solr tracing conventions via TraceUtils.

Testing

  • Attempted: ./gradlew :solr:core:test --tests org.apache.solr.update.UpdateLogReplayTracingTest

@openworld-maker
Contributor Author

Ran local checks after wiring Java 21 in my env:

  • ./gradlew :solr:core:test --tests org.apache.solr.update.UpdateLogReplayTracingTest
  • ./gradlew :solr:core:test --tests org.apache.solr.update.PeerSyncWithBufferUpdatesTest

Both passed. Happy to adjust naming/attributes if there’s a preferred tracing convention for replay spans.

@dsmiley left a comment

Thanks for this!

public static String LOG_FILENAME_PATTERN = "%s.%019d";
public static String TLOG_NAME = "tlog";
public static String BUFFER_TLOG_NAME = "buffer.tlog";
private static final String UPDATELOG_REPLAY_SPAN_NAME = "updatelog.replay";

Honestly, just inline these; the constants serve no purpose. I know this is a matter of taste, but the practice of extracting constants scatters what a reader needs around the file, which reduces readability.

span.setAttribute("updatelog.replay.active_log", activeLog);
span.setAttribute("updatelog.replay.in_sorted_order", inSortedOrder);
span.setAttribute("updatelog.replay.logs_total", initialLogCount);
span.setAttribute("updatelog.replay.core", req.getCore().getName());

Probably remove this, assuming it is redundant with the line below.

TraceUtils.setDbInstance(span, req.getCore().getName());
});

try (Scope scope = replaySpan.makeCurrent()) {

Can we avoid adding another try-finally, given that I see one here already?
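One possible reading of this suggestion is to attach the scope to the existing try statement as a resource instead of nesting a second try. The sketch below assumes that reading; `Scope` here is a hypothetical stand-in class, not the OpenTelemetry type, and `replay` is an illustrative method rather than the PR's code.

```java
import java.util.ArrayList;
import java.util.List;

class SingleTryDemo {
  // Records the order of events so the behavior can be observed.
  static final List<String> events = new ArrayList<>();

  // Hypothetical stand-in for a tracing scope (assumption, not a real tracing class).
  static final class Scope implements AutoCloseable {
    @Override
    public void close() {
      events.add("scope closed");
    }
  }

  // A single try statement carries both the resource and the existing cleanup.
  static int replay() {
    try (Scope scope = new Scope()) {
      events.add("replaying");
      return 3; // pretend three operations were replayed
    } finally {
      events.add("cleanup"); // existing cleanup logic stays in the same statement
    }
  }

  public static void main(String[] args) {
    int replayedOps = replay();
    System.out.println(replayedOps + " ops, events: " + events);
  }
}
```

One thing to keep in mind with this shape: Java closes the resource before the finally block runs, so anything in the finally block executes after the scope is no longer current.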

return replayedOps;
}

private String summarizeProcessorChain(UpdateRequestProcessorChain processorChain) {

Let's put this on UpdateRequestProcessorChain.toString().
BTW, processorChain won't be null.
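A minimal sketch of what moving the summary into toString() could look like. The classes below are hypothetical stand-ins, not Solr's real UpdateRequestProcessorChain or its factories, and the output format is an assumption chosen for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;

class ChainToStringDemo {
  // Hypothetical stand-in for an update processor factory; not Solr's real type.
  interface ProcessorFactory {}

  static final class LogFactory implements ProcessorFactory {}

  static final class DistributedFactory implements ProcessorFactory {}

  static final class RunFactory implements ProcessorFactory {}

  // Hypothetical stand-in for UpdateRequestProcessorChain.
  static final class ProcessorChain {
    private final List<ProcessorFactory> factories;

    ProcessorChain(List<ProcessorFactory> factories) {
      this.factories = List.copyOf(factories);
    }

    // The summary lives on the chain itself, so callers need no helper method.
    @Override
    public String toString() {
      return factories.stream()
          .map(f -> f.getClass().getSimpleName())
          .collect(Collectors.joining(",", "ProcessorChain[", "]"));
    }
  }

  public static void main(String[] args) {
    ProcessorChain chain =
        new ProcessorChain(List.of(new LogFactory(), new DistributedFactory(), new RunFactory()));
    // A span attribute could then be set from chain.toString() directly.
    System.out.println(chain);
  }
}
```

With the summary on the chain, the private summarizeProcessorChain helper in UpdateLog would no longer be needed, and other callers (logging, debugging) get the same description for free.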

import org.junit.BeforeClass;
import org.junit.Test;

public class UpdateLogReplayTracingTest extends SolrTestCaseJ4 {

Thanks... I would've been sufficiently happy to see a pic of it working in your tracing viewer of choice. We mostly don't test logs; traces are a glorified log in the end, and thus I think testing traces is a questionable value trade-off.

@dsmiley left a comment

You added changes here that are totally unrelated to the issue/description; they look related to another PR you are working on. Please don't contribute changes to AGENTS.md or add new .md files.
