Last modified: 2013-08-29 00:38:46 UTC
As with many other language wikis (Bug 2981, for example), the Gujarati wiki has the same problem. Can the entire set of Gujarati letters (http://www.unicode.org/charts/PDF/U0A80.pdf) be added to linktrail?
It's not exactly broken, as it was never implemented.
(In reply to comment #0)
> same as many other language wikis (Bug 2981 for example), Gujarati wiki has
> the same problem. can entire set of Gujarati alphabets
> (http://www.unicode.org/charts/PDF/U0A80.pdf) added to linktrail?
Entire set? Even digits? Surely not punctuation, except perhaps hyphen. If you attach a txt list it's easy; that PDF contains some characters I'm unsure about.
Do you want the same for linkprefix?
(In reply to comment #2)
> (In reply to comment #0)
> > same as many other language wikis (Bug 2981 for example), Gujarati wiki has
> > the same problem. can entire set of Gujarati alphabets
> > (http://www.unicode.org/charts/PDF/U0A80.pdf) added to linktrail?
>
> Entire set? Even digits? Surely not punctuation, except perhaps hyphen. If you
> attach a txt list it's easy; that PDF contains some characters I'm unsure about.
>
> Do you want the same for linkprefix?
My bad, you are right, we don't need punctuation, digits, etc. In a nutshell, we need at least the characters below:
ક્ ખ્ ગ્ ઘ્ ચ્ છ્ જ્ ઝ્ ટ્ ઠ્ ડ્ ઢ્ ણ્ ત્ થ્ દ્ ધ્ ન્ પ્ ફ્ બ્ ભ્ મ્ ય્ ર્ લ્ વ્ સ્ શ્ ષ્ હ્ ળ્ ક્ષ્ જ્ઞ્
plus the characters in these two sections:
* [[gu:વિકિપીડિયા:ગુજરાતીમાં કેવી રીતે ટાઇપ કરવું#સ્વર]]
* [[gu:વિકિપીડિયા:ગુજરાતીમાં કેવી રીતે ટાઇપ કરવું#વિશેષ_ચિહ્નો]]
And good suggestion, I had never thought about linkprefix. Yes please, add the same set of characters for linkprefix as well.
So I'm adding these, please check:
ક્ ખ્ ગ્ ઘ્ ચ્ છ્ જ્ ઝ્ ટ્ ઠ્ ડ્ ઢ્ ણ્ ત્ થ્ દ્ ધ્ ન્ પ્ ફ્ બ્ ભ્ મ્ ય્ ર્ લ્ વ્ સ્ શ્ ષ્ હ્ ળ્ ક્ષ્ જ્ઞ્ અ આ ઇ ઈ ઉ ઊ એ ઐ ઓ ઔ અં અઃ અઁ ઍ ઑ ઋ ઁ ઼ ।
Also on https://translatewiki.net/w/i.php?title=MediaWiki%3ALinkprefix%2Fgu&diff=4741186&oldid=2063880, but someone should check which characters are covered by the code range \x80-\xff.
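On the open question of what \x80-\xff covers, a small Python sketch may help. It assumes the wiki text is UTF-8 encoded; the helper name `utf8_bytes_all_high` is made up for illustration. In a byte-oriented pattern the range matches every byte of any multi-byte UTF-8 character, so it is not specific to Gujarati; in a code-point (Unicode-mode) pattern the same escapes mean only U+0080 to U+00FF, which contains no Gujarati at all.

```python
import re

# Byte-oriented pattern: \x80-\xff means raw byte values 0x80..0xFF.
BYTE_RANGE = re.compile(rb'^[\x80-\xff]+$')

# Code-point pattern: the same escapes now mean U+0080..U+00FF
# (the Latin-1 Supplement block), not Gujarati.
CODEPOINT_RANGE = re.compile(r'^[\x80-\xff]+$')


def utf8_bytes_all_high(text: str) -> bool:
    """Hypothetical helper: True if every UTF-8 byte of text is in 0x80..0xFF."""
    return all(0x80 <= b <= 0xFF for b in text.encode('utf-8'))


GUJARATI_KA = '\u0A95'    # Gujarati letter KA
DEVANAGARI_KA = '\u0915'  # Devanagari letter KA, a different script

if __name__ == '__main__':
    for ch in (GUJARATI_KA, DEVANAGARI_KA, 'a'):
        print(repr(ch), ch.encode('utf-8'), utf8_bytes_all_high(ch))
```

Under these assumptions, the byte range accepts Gujarati and Devanagari alike, which is why an explicit Gujarati range or list is the more precise choice.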
Related URL: https://gerrit.wikimedia.org/r/65449 (Gerrit Change I872a9f141f64a664bc3743fcff5f036634445ba0)
(In reply to comment #4)
> So I'm adding these, please check:
>
> ક્ ખ્ ગ્ ઘ્ ચ્ છ્ જ્ ઝ્ ટ્ ઠ્ ડ્ ઢ્ ણ્ ત્ થ્ દ્ ધ્ ન્ પ્ ફ્ બ્ ભ્ મ્ ય્ ર્ લ્
> વ્ સ્ શ્ ષ્ હ્ ળ્ ક્ષ્ જ્ઞ્ અ આ ઇ ઈ ઉ ઊ એ ઐ ઓ ઔ અં અઃ અઁ ઍ ઑ ઋ ઁ ઼ ।
Thank you. Please also add the characters below, as only a combination of the above and the below makes meaningful characters/letters:
્ ા િ ી ુ ૂ ે ૈ ો ૌ ં ઃ ઁ ૅ ૉ ૃ
\x80-\xff looks like a wildcard range (\x), which doesn't seem to work as such, but I would love to be wrong here.
Dhaval, we need anything in Gujarati script as a trail, right? We need not list every letter with virama; we can just use the gu Unicode range, like this:
$linkTrail = "/^([\x{0A80}-\x{0AFF}]+)(.*)$/sDu";
With $wgLanguageCode = 'gu'; it works.
Please confirm that this is what you need.
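To illustrate what the range-based pattern accepts, here is a rough Python counterpart of the regex above. It is only a sketch: `split_trail` is a hypothetical helper, `re.DOTALL` stands in for the `s` flag, and `\Z` stands in for the `D` flag. Note that the whole block U+0A80 to U+0AFF also contains the Gujarati digits, so those would be treated as trail characters too, while the danda (U+0964) sits in the Devanagari block and falls outside the range.

```python
import re

# Rough Python counterpart of "/^([\x{0A80}-\x{0AFF}]+)(.*)$/sDu".
LINK_TRAIL = re.compile(r'^([\u0A80-\u0AFF]+)(.*)\Z', re.DOTALL)


def split_trail(text_after_link: str):
    """Hypothetical helper: return (trail, rest); trail is '' if nothing matches."""
    m = LINK_TRAIL.match(text_after_link)
    if m is None:
        return ('', text_after_link)
    return (m.group(1), m.group(2))


CONJUNCT_KSHA = '\u0A95\u0ACD\u0AB7'  # KA + virama + SSA, a conjunct
GUJARATI_DIGIT_ONE = '\u0AE7'
DANDA = '\u0964'                      # lives in the Devanagari block

if __name__ == '__main__':
    for sample in (CONJUNCT_KSHA + ' abc', GUJARATI_DIGIT_ONE + 'x', DANDA, 'abc'):
        print(repr(sample), split_trail(sample))
```

Because the virama (U+0ACD) is itself inside the range, conjuncts are consumed as part of the trail without listing them one by one.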
(In reply to comment #7)
> Dhaval, we need anything in Gujarati script as a trail right? We need not write
> all alphabets with virama, but we can just use the gu unicode range like this:
>
> $linkTrail = "/^([\x{0A80}-\x{0AFF}]+)(.*)$/sDu";
Yes Santhosh, that is true. I had originally provided the table for the entire gu Unicode range, but as there might be punctuation marks in it, Nemo came up with the idea of defining a character set. I provided the characters with virama so that joint characters (conjuncts) would also work.
> with $wgLanguageCode = 'gu'; it works.
>
> Please confirm that this is what you need.
When you say it works, does it mean it is working somewhere in a test environment? Can I test it?
(In reply to comment #8)
> Yes Santhosh, that is true. I had originally provided the table for the entire
> gu unicode range, but as there might be punctuation marks in it, Nemo came up
> with an idea of character set to be defined. I provided characters with virama
> because, if there are joint characters, it should work.
Conjuncts will still work with my regex too.
> When you say it works, does it mean it is working somewhere in test
> environment?
No, it was my local wiki. :)
(In reply to comment #9)
> (In reply to comment #8)
> Conjuncts will still work with my regex too
Perfect. Let's go ahead and deploy then.
Is there any update on this? It's been 2 months since everything was sorted out and the change was successfully merged into the git repository...
(In reply to comment #11)
> Is there any update to this? Its been 2 months since everything was sorted and
> Change was successfully merged into the git repository...
So it's on your wiki already; didn't it work?
(In reply to comment #12)
> So it's on your wiki already, didn't it work?
Exactly, it never worked on gu.wiki. Can you please check why?
reedy@tin:/a/common$ grep -i linktrail php-1.22wmf12/languages/messages/MessagesGu.php
$linkTrail = '/^((?:[a-z]|ક્|ખ્|ગ્|ઘ્|ચ્|છ્|જ્|ઝ્|ટ્|ઠ્|ડ્|ઢ્|ણ્|ત્|થ્|દ્|ધ્|ન્|પ્|ફ્|બ્|ભ્|મ્|ય્|ર્|લ્|વ્|સ્|શ્|ષ્|હ્|ળ્|ક્ષ્|જ્ઞ્|અ|આ|ઇ|ઈ|ઉ|ઊ|એ|ઐ|ઓ|ઔ|અં|અઃ|અઁ|ઍ|ઑ|ઋ|ઁ|઼|।|્|ા|િ|ી|ુ|ૂ|ે|ૈ|ો|ૌ|ં|ઃ|ઁ|ૅ|ૉ|ૃ)+)(.*)$/sDu';
reedy@tin:/a/common$
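One possible reading of why this deployed alternation behaves oddly (not confirmed anywhere in this thread): every consonant appears only together with a virama, so a bare consonant followed by a vowel sign has no matching alternative. The Python sketch below uses an abridged stand-in for the alternation, not the full deployed list, and compares it with the range form; `trail` is a hypothetical helper.

```python
import re

VIRAMA = '\u0ACD'

# Abridged stand-in for the deployed alternation (illustrative subset only):
# consonants occur solely as consonant+virama pairs, as in the grep output.
ALTERNATIVES = [
    '[a-z]',
    '\u0A95' + VIRAMA,  # KA + virama
    '\u0A96' + VIRAMA,  # KHA + virama
    '\u0A85',           # independent vowel A
    VIRAMA,             # bare virama
    '\u0ABE',           # vowel sign AA
]
ALTERNATION_TRAIL = re.compile(
    r'^((?:' + '|'.join(ALTERNATIVES) + r')+)(.*)\Z', re.DOTALL)

# Range form proposed earlier in the thread, for comparison.
RANGE_TRAIL = re.compile(r'^([\u0A80-\u0AFF]+)(.*)\Z', re.DOTALL)


def trail(pattern, text: str) -> str:
    """Hypothetical helper: the matched trail, or '' when nothing matches."""
    m = pattern.match(text)
    return m.group(1) if m else ''


KA_WITH_VIRAMA = '\u0A95' + VIRAMA
KA_WITH_AA = '\u0A95\u0ABE'  # bare KA followed by vowel sign AA

if __name__ == '__main__':
    for sample in (KA_WITH_VIRAMA, KA_WITH_AA):
        print(repr(sample),
              repr(trail(ALTERNATION_TRAIL, sample)),
              repr(trail(RANGE_TRAIL, sample)))
```

If this reading is right, the alternation would accept characters only in particular sequences, whereas the range form accepts any combination from the Gujarati block.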
Weird, I thought the last patchset by santhosh had converted it to a range as per comment 7. Let's do it now.
(In reply to comment #15)
> Weird, I thought the last patchset by santhosh had converted it to a range as
> per comment 7, let's make it now.
Thanks Nemo, please let me know once you know it is deployed, so that I can test and confirm.
Change 77509 had a related patch set uploaded by Nemo bis: Customise linktrail for Gujarati (gu) https://gerrit.wikimedia.org/r/77509
Change 77509 merged by jenkins-bot: Customise linktrail for Gujarati (gu) https://gerrit.wikimedia.org/r/77509
This is hopefully fixed now and you should see it live on gu.wiki on August 22. Please reopen if it is not fixed after that date.
I checked today and it shows a very weird result (I will wait until 22 August anyhow, but thought I should report it here now so that, if needed, someone can work on it in the meantime). See the test page created on gu.wiki (http://gu.wikipedia.org/wiki/Test): it seems that the character sets provided in Comment 4, Comment 6 and Comment 14 are working, but only in a specific sequence/manner, not in any combination.
Thank you Nemo, it has been working perfectly well since last week.