CMS Search strange behavior with Hungarian, Serbian Cyrillic and Serbian Latin languages
CMS Search strange behavior with Hungarian, Serbian Cyrillic and Serbian Latin languages
Dear Pablo!
I'm from Vojvodina in the north of Serbia, where many languages are spoken, so sites here are multilingual. I'm having difficulties with CMS Search. Everything seems to be OK, but CMS Search behaves strangely. I'm illustrating this with your CMS demo project. There is no custom code. I turned on Unicode support for CMS Admin and CMS View. The editor is the newest CKEditor 4. The database collation is utf8_general_ci and the charset is utf8.
http://www.ntesla.edu.rs/cmseredeti/index.html
Strange behavior with Hungarian language: search can find some of the words but not every word
Strange behavior with Serbian Cyrillic language: search can't find any words
Strange behavior with Serbian Latin language: search can find some of the words but not every word
What is your advice?
I saw your CMS demo at http://www.wysiwygwebbuilder.com/suppor ... php?page=4 and everything is OK there with CMS Search in the Cyrillic article (CMS Search finds EVERY word from the article). The difference is that the language there is Belarusian, so maybe the database configuration is different?
I have completed a new website with CMS tools and noticed the things mentioned above.
http://www.ntesla.edu.rs/sr_intro.html
In this site I used CMS Tools in 3 places:
http://www.ntesla.edu.rs/prosveta/sr_dogadjaji.php
http://www.ntesla.edu.rs/informacije/sr_dok_skole.php
http://www.ntesla.edu.rs/informacije/sr ... abavke.php
(And of course in the other language too, with the Hungarian prefix hu_ in the page names.)
I think the simplest way to track down the error is through your CMS Demo project as I presented it, with almost nothing changed in it. If you still want my project source based on your CMS Demo, tell me and I will upload it somewhere.
Thanks in advance!
Re: CMS Search strange behavior with Hungarian, Serbian Cyrillic and Serbian Latin languages
Are you sure the database is configured as UTF8/unicode?
Do you see the search words in the database?
Re: CMS Search strange behavior with Hungarian, Serbian Cyrillic and Serbian Latin languages
According to the phpMyAdmin screenshot, the database is configured as utf-8:
There are search words in the database in 3 languages...
Serbian cyrillic:
Serbian Latin:
Hungarian:
The screenshots do not show all of them, of course; that would need bigger images...
(But the word "rnrnФранцуз" is not a real word. "Француз" IS the real word.
"rnrnsrbi" isn't a real word, but "Srbi" IS.)
Searching for words from the screenshots:
Serbian Cyrillic - word "индоевропских" is in the word list in the database, but the CMS Search can't find it in the article.
Serbian Latin - word "Mađarskoj" is in the word list in the database, but the CMS Search can't find it in the article.
Hungarian - word "anyanyelvűek" is in the word list in the database, but the CMS Search can't find it in the article.
Re: CMS Search strange behavior with Hungarian, Serbian Cyrillic and Serbian Latin languages
I'm sorry, I don't think I can help you with this.
The CMS script may be incompatible with these languages. Although you are the first user that has reported issues with this.
Re: CMS Search strange behavior with Hungarian, Serbian Cyrillic and Serbian Latin languages
Sorry to hear that. Searching the articles is an important feature. In the meantime I figured out that CMS Search...
...in Hungarian works correctly for words WITHOUT the national characters éáűúőóüí
...in Serbian Latin works correctly for words WITHOUT the national characters čćžđš
...in Serbian Cyrillic does not work at all
I'm not an expert, but I read a little and found this article on the internet, among others with the same content:
https://mathiasbynens.be/notes/mysql-utf8mb4
It's about full Unicode support and using utf8mb4 instead of utf8. I'm not sure, but may I politely ask you to reconsider your CMS script and implement full Unicode support for the languages mentioned above? It could be an option that developers choose in the development interface (like the existing checkbox/property for Unicode support in CMS Admin and CMS View).
Maybe UTF-16 is the solution? I'm asking because I have other difficulties with Hungarian/Serbian Latin/Serbian Cyrillic in tables when I import content from a CSV file.
With a plain table, when I import Cyrillic text from a UTF-8 CSV I see incorrect characters, but when I import from a UTF-16 CSV the content appears correctly. Sadly, at the first attempt to edit it, it changes to garbage graphic characters.
When I use your Responsive Data Table extension, the UTF-8 CSV Cyrillic import is incorrect too, but the UTF-16 CSV Cyrillic export is OK and the characters are shown correctly. I will report this anomaly in the right place and category on this forum. I mention it here because it is the same kind of character encoding problem.
Dear Pablo, please help me to solve this problem (or these problems). If not now, then in a future release. I'm going to make other multilingual projects with CMS tools.
Thanks in advance!
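On the utf8 versus utf8mb4 question, it may help to check how wide the affected characters actually are. MySQL's utf8 charset stores characters of up to three bytes, and utf8mb4 is only needed for four-byte characters such as emoji. The sketch below is not part of the CMS script; the function name max_utf8_char_width is made up for illustration, and it assumes PHP 7 or later and a source file saved as UTF-8. It suggests that the Hungarian, Serbian Latin and Cyrillic letters from this thread need only two bytes each, so the plain utf8 charset should be able to store them.

```php
<?php
// Hypothetical helper (not from the CMS script): returns the byte length
// of the widest UTF-8 character in $text, or 0 for an empty string.
function max_utf8_char_width(string $text): int
{
    // The /u modifier makes PCRE treat $text as UTF-8, so the empty
    // pattern splits the string into whole characters, not bytes.
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        // preg_split() fails when $text is not valid UTF-8.
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    $width = 0;
    foreach ($chars as $char) {
        $width = max($width, strlen($char)); // strlen() counts bytes
    }
    return $width;
}

echo max_utf8_char_width('éáűúőóüí'), "\n"; // Hungarian letters: 2
echo max_utf8_char_width('čćžđš'), "\n";    // Serbian Latin letters: 2
echo max_utf8_char_width('Француз'), "\n";  // Cyrillic: 2
echo max_utf8_char_width('😀'), "\n";       // emoji: 4 (needs utf8mb4)
```

If every character in the content is at most three bytes wide, switching the database to utf8mb4 is unlikely to change the search behavior by itself.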
Re: CMS Search strange behavior with Hungarian, Serbian Cyrillic and Serbian Latin languages
I think the CMS script is Unicode compliant. I have tested it with different Unicode languages.
The documentation you are referring to is all about database configuration.
Unfortunately, I cannot help you with the configuration of the server.
Re: CMS Search strange behavior with Hungarian, Serbian Cyrillic and Serbian Latin languages
Dear Pablo!
I accept your verdict, but let me add a few more points.
Once more, I'm not an expert and I'm not pretending to be one. I'm only trying to use my own brain...
Before I wrote the previous post to you, I of course tried some things with the database configuration.
First attempt:
Instead of the plain utf8 charset and collation, I created the database with the utf8mb4 charset and collation. At the same time I tried to adapt your generated script to utf8mb4 via search/replace. It did not succeed; I probably messed something up, you are the master of your scripts.
Second attempt:
Instead of the plain utf8 charset and collation, I created the database with the utf16 charset and collation. At the same time I tried to adapt your generated script to utf16 via search/replace. As a result I saw Chinese characters, so it did not succeed either; once again, you are the master of your own scripts.
I'm very sad now... Anyway, thanks for your patience in reading my posts...
Re: CMS Search strange behavior with Hungarian, Serbian Cyrillic and Serbian Latin languages
To see if the script supports the characters, I have added the words "Mađarskoj" and "anyanyelvűek" to the test page:
http://www.wysiwygwebbuilder.com/suppor ... php?page=4
As you can see, this seems to work correctly, which indicates that the script works for these languages.
Re: CMS Search strange behavior with Hungarian, Serbian Cyrillic and Serbian Latin languages
Yes, thank you, you are right!! This is crystal clear now. It is something with the database configuration, but what... I'm going crazy... I have spent days trying to figure it out.
Thanks again for giving me a fixed point for further investigation! I'm fond of WYSIWYG Web Builder and I plan to stay with it for the long run.
Re: CMS Search strange behavior with Hungarian, Serbian Cyrillic and Serbian Latin languages
Hello, Bigdenis!
I have exactly the same problem. Did you find the cause of the problem with the Cyrillic alphabet?
If you did, and it is not too much trouble, please share the solution.
https://t.me/webart42
I offer my services for website development in WYSIWYG Web Builder, HTML/CSS/JQuery.
Contact us on telegram @webart42
- BaconFries
Re: CMS Search strange behavior with Hungarian, Serbian Cyrillic and Serbian Latin languages
@NDV it is most unlikely that you will get an answer from the original poster, as he was last active on Tue Mar 05, 2019 9:13 pm.
Re: CMS Search strange behavior with Hungarian, Serbian Cyrillic and Serbian Latin languages
Please make sure Unicode support is enabled.
Re: CMS Search strange behavior with Hungarian, Serbian Cyrillic and Serbian Latin languages
CMS Search
Cyrillic (Ukrainian, Russian): No results
CMS View
Enable Unicode support (Yes)
CMS admin
Search Index: true
MySQL php5.6, php7.3
utf8_unicode_ci: search works only in Latin; Cyrillic (Ukrainian, Russian): no results
utf8_general_ci: does not respond to the search
utf8mb4_general_ci: does not respond to the search
Re: CMS Search strange behavior with Hungarian, Serbian Cyrillic and Serbian Latin languages
I saw it working in the demonstration; Cyrillic search works there.
The question is how to find the error.
The demonstration is just a picture, though. It is not source files and database dumps that one can inspect and compare.
On the Russian forum this question has been asked since 2017, and no solution has been found.
Re: CMS Search strange behavior with Hungarian, Serbian Cyrillic and Serbian Latin languages
- make sure you have the latest version of WWB.
- enable Unicode in all objects (cms admin, cms view).
- set the character set of the page to UTF8
- set the character set of the database to UTF8
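One way to check these settings in practice is to look at the raw bytes of a search word at each stage (form input, PHP script, database row). The snippet below is only a sketch and is not part of the generated CMS code; describe_word is a hypothetical helper name, and it assumes PHP 7 or later and a source file saved as UTF-8.

```php
<?php
// Hypothetical diagnostic helper (not from the CMS script): reports the
// raw bytes of a word and whether they form valid UTF-8.
function describe_word(string $word): array
{
    return [
        'hex'        => bin2hex($word),  // raw bytes in hexadecimal
        'bytes'      => strlen($word),   // strlen() counts bytes, not letters
        // An empty pattern with /u matches any valid UTF-8 string;
        // preg_match() returns false when the subject is not valid UTF-8.
        'valid_utf8' => preg_match('//u', $word) === 1,
    ];
}

// Seven Cyrillic letters stored as UTF-8 take 14 bytes.
print_r(describe_word('Француз'));

// The same first two letters in Windows-1251 are not valid UTF-8.
print_r(describe_word("\xD4\xF0"));
```

If the hex dump of a word read back from the database differs from the hex dump of the same word typed into the search box, the encoding changes somewhere between the page and the database.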
Re: CMS Search strange behavior with Hungarian, Serbian Cyrillic and Serbian Latin languages
WWB 15.2.3
Unicode is enabled in all objects (cms admin, cms view).
Page encoding is set to UTF8
Database encoding is set to UTF8
Example:
utf8_unicode_ci.sql
utf8_general_ci.sql
https://www.dropbox.com/sh/fpchcq69cwku ... q4xFa?dl=0
Re: CMS Search strange behavior with Hungarian, Serbian Cyrillic and Serbian Latin languages
All settings are correct and, as far as I can see, the generated code is Unicode compliant.
So, I think something is wrong on the database side or PHP configuration.
Here is an export of my test database:
https://www.wysiwygwebbuilder.com/support/CMS_PAGES.zip
Re: CMS Search strange behavior with Hungarian, Serbian Cyrillic and Serbian Latin languages
Tried different databases.
Debian v8 MySQL 5.5.62, php5.6
Debian v9 MariaDB 10.3, php7.3
A simple set of statements to create a database:
Code: Select all
CREATE DATABASE cmsdb CHARACTER SET utf8 COLLATE utf8_unicode_ci;
GRANT ALL PRIVILEGES ON cmsdb.* TO 'user'@'localhost';
FLUSH PRIVILEGES;
Also created databases via phpMyAdmin.
To check, I deleted the database and created a new one.
Database collation: utf8_unicode_ci
CMS_SEARCH_WORDS entries are present.
So I can't figure it out. My brain is broken.
What is strange is that with the Bootstrap Table and MySQL CRUD extensions, database search works and displays data.
Re: CMS Search strange behavior with Hungarian, Serbian Cyrillic and Serbian Latin languages
Thanks for the dump (CMS_PAGES). I will look for an error.
Re: CMS Search strange behavior with Hungarian, Serbian Cyrillic and Serbian Latin languages
I transferred a test page from my home server to hosting (php7.2). The hosting provider said the PHP settings are correct. Cyrillic search: no results.
CMS_PAGES: Cyrillic search: no results.
Brain explosion.
Re: CMS Search strange behavior with Hungarian, Serbian Cyrillic and Serbian Latin languages
Did you use my database?
Re: CMS Search strange behavior with Hungarian, Serbian Cyrillic and Serbian Latin languages
I tried your CMS_PAGES database on my Debian server. With Cyrillic, the search gives "no result". Latin search works OK.
Afterwards, I transferred the pages to paid hosting.
Errors appeared on the hosting (php7.2) when exporting the CMS_PAGES database. It loaded with errors, but worked. Same as last time: with Cyrillic the search gives "no result", Latin search works OK. After that, I created a new database and let WWB create its own tables, but still no result.
I will have a lot of time on the weekend. I will work on the CMS.
Re: CMS Search strange behavior with Hungarian, Serbian Cyrillic and Serbian Latin languages
A programmer I know helped.
The faulty line:
Code: Select all
$word = preg_replace('/\W/', '', $word);
\W does not work with Cyrillic in UTF-8.
Replace it, comment it out, or delete it.
I deleted the line. Cyrillic search works, OK!
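The effect of that line can be reproduced outside the CMS. The sketch below is an illustration only; the wrapper name strip_non_word_bytes is hypothetical, and it assumes PHP 7 or later, the default "C" locale, and a source file saved as UTF-8. Without the /u modifier, PCRE works byte by byte, and every byte of a multi-byte UTF-8 character counts as a non-word character, so \W removes it.

```php
<?php
// Hypothetical wrapper around the line quoted in the thread. Without /u the
// pattern is applied to single bytes, so all bytes >= 0x80 match \W and
// are deleted.
function strip_non_word_bytes(string $word): string
{
    return (string) preg_replace('/\W/', '', $word);
}

echo '[', strip_non_word_bytes('индоевропских'), "]\n"; // []  (nothing left)
echo '[', strip_non_word_bytes('Mađarskoj'), "]\n";     // [Maarskoj]
echo '[', strip_non_word_bytes('anyanyelvűek'), "]\n";  // [anyanyelvek]
echo '[', strip_non_word_bytes('Srbi'), "]\n";          // [Srbi]
```

This matches the symptoms reported earlier in the thread: pure ASCII words survive, words with national characters are altered so they no longer match, and Cyrillic words are wiped out entirely.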
Re: CMS Search strange behavior with Hungarian, Serbian Cyrillic and Serbian Latin languages
Changed the line:
Code: Select all
$word = preg_replace('/\W/u', '', $word);
Cyrillic search works, ok!
Or this code:
Code: Select all
$word = preg_replace('/[^\w]/u', '', $word);
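As a rough illustration of why the /u variant helps: with /u, PHP treats pattern and subject as UTF-8 and \w also covers Unicode letters and digits, so only punctuation and similar characters are removed. The sketch below assumes PHP 7 or later and a source file saved as UTF-8; the wrapper name normalize_search_word and the fallback to an empty string are assumptions for illustration, not part of the CMS script.

```php
<?php
// Hypothetical wrapper around the corrected line from the thread.
function normalize_search_word(string $word): string
{
    // /u: pattern and subject are UTF-8, and \W no longer matches
    // Unicode letters or digits.
    $clean = preg_replace('/\W/u', '', $word);
    // With /u, preg_replace() returns null if $word is not valid UTF-8;
    // falling back to '' here is an illustrative choice.
    return $clean === null ? '' : $clean;
}

echo normalize_search_word('"Француз",'), "\n";    // Француз
echo normalize_search_word('Mađarskoj!'), "\n";    // Mađarskoj
echo normalize_search_word('anyanyelvűek.'), "\n"; // anyanyelvűek

// The second variant from the thread gives the same result here.
echo preg_replace('/[^\w]/u', '', '"Француз",'), "\n"; // Француз
```

Because the /u variants keep the punctuation stripping that the original line was presumably there for, they seem preferable to deleting the line outright.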
Re: CMS Search strange behavior with Hungarian, Serbian Cyrillic and Serbian Latin languages
Thanks, I will investigate if this can be implemented in a future version.
Re: CMS Search strange behavior with Hungarian, Serbian Cyrillic and Serbian Latin languages
I hadn't figured out a solution for this problem until today. In the meantime I made some CMS sites with this "half functional" search. I'm glad that NDV was so persistent and found the solution. I tested his solution with the corrected line in the script, and I can say that it works perfectly in CMS Search for Serbian Latin, Serbian Cyrillic and Hungarian!
Pablo, please implement this solution in a future release! Thanks in advance.
Re: CMS Search strange behavior with Hungarian, Serbian Cyrillic and Serbian Latin languages
The modification has now been implemented in the latest build (02/16/2020)
https://www.wysiwygwebbuilder.com/download.html
Re: CMS Search strange behavior with Hungarian, Serbian Cyrillic and Serbian Latin languages
Thank you!