SOLR-18147 Make a new Grafana dashboard for Solr 10.x #4210
janhoy wants to merge 8 commits into apache:main
Add traffic generator
Run two solr's in example cluster
Fix several panels
support running your own solr
So the foundation is laid, I believe. Technically it is working, and I generally like the "rows" and panels chosen by AI. But there are probably useful changes to make. Here are some I can think of
Latency graphs should always show the max; p50 is basically useless... https://www.youtube.com/watch?v=lJ8ydIuPFeU Also, update latency is only rarely interesting... throughput is what most folks care about for indexing, that and stuck/failed documents.
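To illustrate the point about p50 versus max, here is a minimal Python sketch. The sample values and the helper name `latency_summary` are made up for illustration and are not taken from the PR or from Solr's metrics; the percentile uses a simple nearest-rank rule.

```python
import statistics


def latency_summary(samples_ms):
    """Summarize request latencies (hypothetical helper, illustration only)."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    n = len(ordered)
    # Nearest-rank p99: smallest value with at least 99% of samples at or below it.
    rank = (99 * n + 99) // 100  # integer ceil(0.99 * n)
    return {
        "p50": statistics.median(ordered),
        "p99": ordered[rank - 1],
        "max": ordered[-1],
    }


# 98 fast requests, one slightly slower, one stuck for 4 seconds (invented data).
samples = [10] * 98 + [12, 4000]
summary = latency_summary(samples)
print(summary)
```

With this invented data the median stays at the fast value, and only the max series reveals the 4-second outlier, which is the kind of request a latency panel is usually meant to surface.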
Thanks Jan, this looks like a great start. I'll find some time to take a look. I really love the docker compose setup making it easy to test. Something we should also add is a way to turn on the tracing module with this, so we can see the exemplars that Solr now supports in these dashboards. Maybe in a second iteration, since that is definitely way out of scope.
Good feedback, adding in a max graph in the search latency panel. Let's do that.
Yea, cause /update is non-blocking, right, so it won't tell much other than how large the payload was and perhaps how busy the server was. Let's use that real estate for something better.
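As a rough sketch of the throughput view suggested for that space, indexing rate can be derived from two readings of a monotonically increasing document counter. The function name and the sample numbers below are hypothetical, not Solr metric names, and unlike Prometheus `rate()` this sketch does not handle counter resets.

```python
def docs_per_second(samples):
    """Average indexing throughput from (timestamp_seconds, cumulative_docs) pairs.

    Hypothetical helper for illustration; assumes the counter never resets.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t0, c0), (t1, c1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("timestamps must be increasing")
    return (c1 - c0) / (t1 - t0)


# Three invented scrapes, 15 seconds apart.
scrapes = [(0, 1000), (15, 1600), (30, 2500)]
throughput = docs_per_second(scrapes)
print(f"{throughput:.1f} docs/s")
```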
Thought of it but wanted to keep scope somewhat low, so I think this PR should focus on a GA dashboard. Then follow up work could add OTEL collector and Jaeger to the |
Looking good! +1 on lacing up an OTEL collector next 👀
Are you ok with the location in the monorepo |
I like |
# ./stack.sh --help # All options
#
# Services (full stack):
# solr1 http://localhost:8983 (SolrCloud node 1, embedded ZooKeeper)
epugh left a comment
Good progress... There is a lot here that I don't quite grok... Is trafficgen coming out of other perf-related efforts, or just "hey, we need some load" ;-)
Trafficgen is just something I wrote earlier, not written for perf at all, just to have something happening in a cluster, as it is boring to view a dashboard or traces with nothing going on. This Do you feel it is too much to add? Should the entire |
https://issues.apache.org/jira/browse/SOLR-18147
monitoring-with-prometheus-and-grafana refguide page, but written from scratch, with a new diagram scraping each solr node.
solr/monitoring/dev folder with a docker-compose file that starts two solr, prometheus, grafana, alertmanager and a traffic ingester container, to easily test metric/grafana changes locally
Want to review?
This is a first draft, the things most ready for review are the mixin build logic and the dev/ compose setup for local testing.
I'd not recommend starting a details-focused review of each dashboard panel, presentation etc. The dashboard and panels themselves I'd categorize as first LLM draft. I have not done more than fixing them so they display data and react to variable dropdowns. Thus, everything related to choice of dashboard ROWs, selection and presentation of what metrics to make panels for, and the design of those panels are up for discussion, so the most useful review feedback on the dashboard at this stage is high-level on what rows and panels we need, and what style.
I give every committer permission to commit fixes and improvements to this branch, after first announcing what you intend to do in a review comment or ordinary comment. I am not strongly attached to the current row+panel selection.
Current dashboard layout (Draft)
The rows are:
Here are some screenshots:





Disclaimer: All of this is built by Claude Code.