Lucene search:
https://de.wikipedia.org/w/index.php?title=Spezial:Suche&search=moveInterwikisToTop&fulltext=Suche&profile=all&redirs=1

Cirrus:
https://de.wikipedia.org/w/index.php?title=Spezial:Suche&search=moveInterwikisToTop&fulltext=Suche&profile=all&redirs=1&srbackend=CirrusSearch

Where did all the JS pages go?
Wonder if something went wrong with Gerrit change #115214.
I believe this is caused by us not word breaking foo.bar into foo and bar. The solution to this, as I see it, is to use the word_break token filter _but_ to do that I have to rebuild each analyzer with that filter. That isn't easy, because right now when I want the German analyzer I can ask for:

{"analyzer": {"text": {"type": "german"}}}

but to rebuild it I have to do this:

{
  "analyzer": {
    "text": {
      "filter": [
        "standard",
        "lowercase",
        "german_stop",
        "german_normalization",
        "light_german_stemmer"
      ],
      "tokenizer": "standard",
      "type": "custom"
    }
  },
  "filter": {
    "german_stop": {
      "stopwords": [ "denn", ... "eures", "dies", "bist", "kein" ],
      "type": "stop"
    }
  }
}

Except even that doesn't work, because german_normalization isn't properly exposed! The pull request I've opened upstream exposes all the stuff I'd need and adds an endpoint to Elasticsearch designed to spit this configuration back out for easy customization.
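For illustration only, here's a rough sketch of what the rebuilt chain might look like once a word-splitting filter can be slotted in, written as settings for a throwaway test index. The index name is made up, word_delimiter is the stock Elasticsearch filter that splits foo.bar into foo and bar (whether that's the filter we actually end up using, and where it sits in the chain, is my guess), and the _german_ stopword shorthand stands in for the full list elided above. As above, the definitions of the other custom filters are left out, and none of this works until german_normalization is actually exposed, which is what the upstream pull request is for.

# Hypothetical throwaway index just to try the rebuilt analyzer chain.
# word_delimiter is the built-in filter that breaks foo.bar into foo and bar;
# its position in the chain here is a guess, not the final plan.
curl -XPUT 'localhost:9200/word_break_test' -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "text": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "standard",
            "word_delimiter",
            "lowercase",
            "german_stop",
            "german_normalization",
            "light_german_stemmer"
          ]
        }
      },
      "filter": {
        "german_stop": {
          "type": "stop",
          "stopwords": "_german_"
        }
      }
    }
  }
}'

# Then check what the rebuilt analyzer does with a dotted token:
curl -XGET 'localhost:9200/word_break_test/_analyze?analyzer=text' -d 'foo.bar'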
Interesting. Wonder if we're running into bug 40612 in a different form, then.
I have little doubt.