fix(search): preserve value case for non-lowercased bleve fields by dschmidt · Pull Request #2633 · opencloud-eu/opencloud

dschmidt · 2026-04-20T14:35:04Z

Summary

Replace the unconditional strings.ToLower(v) in the bleve compiler with a small allowlist (Name, Tags, Favorites, Content) that matches the fields whose index mapping uses a lowercasing analyzer.
Update the existing author:("John Smith" Jane) compiler tests; the author field is not in the allowlist, so the compiler now preserves the case of its values.
Add an end-to-end regression in bleve/backend_test.go: Title:"Some Title" returns 1 hit, Title:"some title" returns 0 hits.

Why

The previous compiler lowercased every query value before emitting a bleve QueryStringQuery. That was a workaround for the four fields whose analyzer lowercases at index time — without the compiler-side fold, a query like Name:Report.pdf wouldn't match the indexed report.pdf. But every other field uses the default keyword analyzer, which preserves case. For those, the unconditional lowercasing silently turned audio.artist:Motörhead, Title:"Some Title", Path:"./Some Dir", etc. into lookups against tokens that didn't exist in the index.

The allowlist approach puts the case-folding decision next to the field mapping it mirrors. If a future field opts into a lowercasing analyzer, it gets added to the allowlist; otherwise it preserves case — which is also what value identity and aggregation display need. Names like deadmau5 or Motörhead stay the way the tag writer wrote them.
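The allowlist decision can be sketched in a few lines of Go. This is a minimal illustration rather than the code from the PR: `lowercaseFields` and its four entries come from the diff, while `foldValue` is a hypothetical helper name, and the sketch assumes field names are matched exactly as written.

```go
package main

import (
	"fmt"
	"strings"
)

// lowercaseFields mirrors the four fields whose index mapping uses a
// lowercasing analyzer, as listed in the PR description.
var lowercaseFields = map[string]bool{
	"Name":      true,
	"Tags":      true,
	"Favorites": true,
	"Content":   true,
}

// foldValue is a hypothetical helper: it lowercases the query value only
// when the field is on the allowlist and returns it unchanged otherwise.
func foldValue(field, value string) string {
	if lowercaseFields[field] {
		return strings.ToLower(value)
	}
	return value
}

func main() {
	fmt.Println(foldValue("Name", "Report.pdf"))        // report.pdf
	fmt.Println(foldValue("Title", "Some Title"))       // Some Title
	fmt.Println(foldValue("audio.artist", "Motörhead")) // Motörhead
}
```

Keeping the map next to the field mapping means a field that later opts into a lowercasing analyzer needs only one new entry here.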

Out of scope

OpenSearch backend. Its default dynamic mapping uses text (standard analyzer, tokenised and lowercased) with a .keyword multi-field. The equivalent change there requires switching the TermQuery emission to MatchQuery for text fields (so that the analyzer applies) and using the .keyword sub-field for aggregations. That belongs in a separate PR, once aggregations are actually implemented.
The dotted-key KQL grammar (audio.artist:…) lives in feat(kql): support dotted keys in property restrictions #2632. The regression test in this PR uses Title as a stand-in and has a FIXME pointing at that PR — once it lands, Title can be swapped for audio.artist.

The bleve compiler lowercased every query value (except Hidden) before handing it to the engine. This matched the index tokens for fields whose analyzer folds case — Name, Tags, Favorites, Content — but silently broke matching for every other field, whose default keyword analyzer preserves case. A query like Title:"Some Title" parsed fine, lowercased to "some title", and missed the indexed token "Some Title". Replace the blanket lowercasing with an allowlist of the four fields whose index mapping actually uses a lowercasing analyzer. Every other field now passes through unchanged, which keeps values like "deadmau5" or "Motörhead" intact instead of normalising them to a case the tag writer didn't choose.

Copilot

Pull request overview

Adjust Bleve query compilation to avoid case-folding values for fields whose index mapping preserves case, fixing mismatches for keyword-analyzed fields.

Changes:

Replace unconditional query-value lowercasing with an allowlist matching fields that lowercase at index time.
Update compiler unit tests to reflect preserved case for non-allowlisted fields (e.g., author).
Add an end-to-end regression test asserting case-sensitive behavior for Title.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File	Description
services/search/pkg/query/bleve/compiler.go	Introduces `lowercaseFields` allowlist and applies lowercasing only for those mapped fields.
services/search/pkg/query/bleve/compiler_test.go	Updates expected compiled query strings to preserve case for `author` values.
services/search/pkg/bleve/backend_test.go	Adds regression test ensuring case is preserved for fields not marked lowercase (using `Title`).


Copilot review pointed out that the comment claimed pre-lowercasing makes non-analyzed query types (wildcard, fuzzy) match for every allowlisted field. That is true for Name/Tags/Favorites, whose lowercaseKeyword analyzer emits a single lowercased token, but the Content analyzer also stems terms — so the guarantee doesn't hold there. Drop the specific claim and keep the comment to the intent: stay consistent with the field's analyzer.

sonarqubecloud · 2026-04-20T14:43:07Z

Quality Gate failed

Failed conditions
C Maintainability Rating on New Code (required ≥ A)


Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.


Copilot · 2026-04-20T14:49:09Z

+var lowercaseFields = map[string]bool{
+	"Name":      true,
+	"Tags":      true,
+	"Favorites": true,
+	"Content":   true,


lowercaseFields is being used as a set; in this repo the common pattern is map[string]struct{} (e.g. services/thumbnails/pkg/thumbnail/mimetypes.go:7) to avoid boolean values and make membership checks explicit. Consider switching to map[string]struct{} and checking membership via _, ok := lowercaseFields[k].
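The set-style membership check suggested in this comment could look roughly like the sketch below. Only `lowercaseFields` and the `_, ok :=` idiom come from the review; `isLowercaseField` is a hypothetical wrapper added here for illustration.

```go
package main

import "fmt"

// lowercaseFields as a set: the empty struct value takes no space and
// makes it clear that only key membership matters.
var lowercaseFields = map[string]struct{}{
	"Name":      {},
	"Tags":      {},
	"Favorites": {},
	"Content":   {},
}

// isLowercaseField is a hypothetical wrapper around the comma-ok lookup.
func isLowercaseField(k string) bool {
	_, ok := lowercaseFields[k]
	return ok
}

func main() {
	fmt.Println(isLowercaseField("Tags"))  // true
	fmt.Println(isLowercaseField("Title")) // false
}
```

With `map[string]bool` a plain index expression already yields `false` for missing keys, so both forms behave the same here; the struct form mainly rules out an accidental `false` entry.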

Suggested change

-var lowercaseFields = map[string]bool{
-	"Name":      true,
-	"Tags":      true,
-	"Favorites": true,
-	"Content":   true,
+var lowercaseFields = map[string]struct{}{
+	"Name":      {},
+	"Tags":      {},
+	"Favorites": {},
+	"Content":   {},

dschmidt requested review from Copilot and fschade April 20, 2026 14:35

dschmidt added the Type:Bug label Apr 20, 2026

Copilot started reviewing on behalf of dschmidt April 20, 2026 14:36

Copilot AI reviewed Apr 20, 2026

Comment thread services/search/pkg/query/bleve/compiler.go Outdated

dschmidt requested a review from Copilot April 20, 2026 14:43

Copilot started reviewing on behalf of dschmidt April 20, 2026 14:43

Copilot AI reviewed Apr 20, 2026

dschmidt wants to merge 2 commits into main from fix/search-preserve-value-case
