SearchTagValuesV2 perf improvements #3650

Merged
joe-elliott merged 8 commits into grafana:main on May 7, 2024

Conversation

joe-elliott

@joe-elliott joe-elliott commented May 7, 2024

What this PR does:
Improves the performance of SearchTagValuesV2 and fixes the benchmark so that it actually exercises SearchTagValuesV2.

Changes:

  • Don't unnecessarily pass null values up through the iterator OtherEntries
  • Stop repeatedly building attribute name strings to double-check the iterators
  • Don't send the same value twice (skips the cost of repeated allocations to and from interface{})
  • Consolidate the three near-identical distinct value collectors into one
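
As a rough illustration of the "don't send the same value twice" idea (this is not the code in this PR), the sketch below shows a single collector that rejects values it has already seen, so the caller can skip any further work for duplicates. The type `distinctCollector` and its methods are hypothetical names chosen for this example, and values are simplified to plain strings.

```go
package main

import "fmt"

// distinctCollector is a hypothetical stand-in for a distinct value
// collector; it is not Tempo's actual type.
type distinctCollector struct {
	seen   map[string]struct{}
	values []string
}

func newDistinctCollector() *distinctCollector {
	return &distinctCollector{seen: make(map[string]struct{})}
}

// Collect records v and reports whether it was new. A caller can use the
// result to avoid boxing or forwarding a value that was already sent.
func (c *distinctCollector) Collect(v string) bool {
	if _, ok := c.seen[v]; ok {
		return false // duplicate: nothing to allocate or send
	}
	c.seen[v] = struct{}{}
	c.values = append(c.values, v)
	return true
}

// Values returns the distinct values in first-seen order.
func (c *distinctCollector) Values() []string { return c.values }

func main() {
	c := newDistinctCollector()
	for _, v := range []string{"tempo-ops", "tempo-ops", "tempo-dev", "tempo-ops"} {
		c.Collect(v)
	}
	fmt.Println(c.Values()) // prints [tempo-ops tempo-dev]
}
```

Having one collector with a cheap "already seen" check is also what makes it possible to fold several near-identical collectors into one, since the deduplication logic lives in a single place.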

Benches:

goos: darwin
goarch: arm64
pkg: github.com/grafana/tempo/tempodb/encoding/vparquet3
                                                                                                           │ before.txt  │           dontsend2x.txt            │
                                                                                                           │   sec/op    │   sec/op     vs base                │
FetchTagValues/tag:_span.http.url,_query:_{resource.namespace="tempo-ops"}-11                                252.7m ± 1%   251.9m ± 1%        ~ (p=0.218 n=10)
FetchTagValues/tag:_span.component,_query:_{resource.namespace="tempo-ops"}-11                               233.6m ± 1%   232.4m ± 1%   -0.51% (p=0.029 n=10)
FetchTagValues/tag:_span.http.url,_query:_{resource.namespace="tempo-ops"_&&_span.http.status_code=200}-11   384.8m ± 1%   388.7m ± 1%   +1.02% (p=0.000 n=10)
FetchTagValues/tag:_resource.namespace,_query:_{span.http.status_code=200}-11                                 2.548 ± 1%    2.141 ± 0%  -15.97% (p=0.000 n=10)
geomean                                                                                                      490.5m        469.8m        -4.21%

                                                                                                           │  before.txt   │            dontsend2x.txt             │
                                                                                                           │     B/op      │     B/op       vs base                │
FetchTagValues/tag:_span.http.url,_query:_{resource.namespace="tempo-ops"}-11                                 99.49Mi ± 2%   102.23Mi ± 2%   +2.75% (p=0.035 n=10)
FetchTagValues/tag:_span.component,_query:_{resource.namespace="tempo-ops"}-11                               11.013Mi ± 4%    9.212Mi ± 5%  -16.35% (p=0.000 n=10)
FetchTagValues/tag:_span.http.url,_query:_{resource.namespace="tempo-ops"_&&_span.http.status_code=200}-11    105.9Mi ± 4%    102.4Mi ± 3%   -3.31% (p=0.004 n=10)
FetchTagValues/tag:_resource.namespace,_query:_{span.http.status_code=200}-11                                417.81Mi ± 0%    48.90Mi ± 1%  -88.30% (p=0.000 n=10)
geomean                                                                                                       83.44Mi         46.60Mi       -44.15%

                                                                                                           │  before.txt  │           dontsend2x.txt            │
                                                                                                           │  allocs/op   │  allocs/op   vs base                │
FetchTagValues/tag:_span.http.url,_query:_{resource.namespace="tempo-ops"}-11                                 194.9k ± 0%   142.9k ± 0%  -26.66% (p=0.000 n=10)
FetchTagValues/tag:_span.component,_query:_{resource.namespace="tempo-ops"}-11                                204.1k ± 0%   139.6k ± 0%  -31.59% (p=0.000 n=10)
FetchTagValues/tag:_span.http.url,_query:_{resource.namespace="tempo-ops"_&&_span.http.status_code=200}-11    220.7k ± 0%   161.6k ± 0%  -26.75% (p=0.000 n=10)
FetchTagValues/tag:_resource.namespace,_query:_{span.http.status_code=200}-11                                14.220M ± 0%   2.474M ± 0%  -82.60% (p=0.000 n=10)
geomean                                                                                                       594.4k        298.9k       -49.71%

Other Thoughts:

For the worst-case scenario in our benchmarks, a much larger perf improvement is still available, but it is harder to realize.

tag: resource.namespace
conditions: span.http.status_code = 200

Currently the span-level iterator finds every span where http.status_code = 200 and passes each one up to the resource-level iterator, which then adds resource.namespace to the returned values. If we could short-circuit the iterator as soon as a single match is found, it would be a massive improvement.
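
To make the potential saving concrete, here is a minimal sketch that compares the two strategies on in-memory data. It assumes the short circuit would apply per resource (one matching span is enough to emit that resource's namespace); the types `span` and `resource` and all function names are hypothetical and do not correspond to Tempo's parquet iterators.

```go
package main

import "fmt"

// span and resource are hypothetical, simplified stand-ins for trace data.
type span struct{ statusCode int }

type resource struct {
	namespace string
	spans     []span
}

// namespacesExhaustive passes every matching span up to the resource level,
// mirroring the behavior described above. It returns the distinct
// namespaces and the number of matching spans that were passed up.
func namespacesExhaustive(rs []resource, code int) (map[string]struct{}, int) {
	out := make(map[string]struct{})
	passed := 0
	for _, r := range rs {
		for _, s := range r.spans {
			if s.statusCode == code {
				passed++
				out[r.namespace] = struct{}{}
			}
		}
	}
	return out, passed
}

// namespacesShortCircuit stops scanning a resource after its first match.
func namespacesShortCircuit(rs []resource, code int) (map[string]struct{}, int) {
	out := make(map[string]struct{})
	passed := 0
	for _, r := range rs {
		for _, s := range r.spans {
			if s.statusCode == code {
				passed++
				out[r.namespace] = struct{}{}
				break // one match is enough for this resource
			}
		}
	}
	return out, passed
}

// sampleResources returns made-up data for the demonstration.
func sampleResources() []resource {
	return []resource{
		{namespace: "tempo-ops", spans: []span{{200}, {200}, {500}, {200}}},
		{namespace: "tempo-dev", spans: []span{{404}}},
		{namespace: "tempo-prod", spans: []span{{200}, {200}}},
	}
}

func main() {
	rs := sampleResources()
	all, passedAll := namespacesExhaustive(rs, 200)
	short, passedShort := namespacesShortCircuit(rs, 200)
	fmt.Println(len(all), passedAll)     // prints 2 5
	fmt.Println(len(short), passedShort) // prints 2 2
}
```

Both strategies return the same set of namespaces, but the short-circuit version passes up only one span per matching resource. Doing this in the real iterators is the hard part, since the span-level iterator would need a way to skip ahead to the next resource.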

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]
Signed-off-by: Joe Elliott <number101010@gmail.com>

@mapno mapno left a comment

Nice improvements 👍

@joe-elliott joe-elliott merged commit 884a344 into grafana:main May 7, 2024
14 checks passed