Article

Strings containing non-Western UTF-8 characters are displayed truncated by PHP

Information

 
Article Number
000090488
Environment
CentOS, Red Hat, Oracle Linux 5.x, 6.x, 7.x (64-bit)
OpenEdge 10.2B06 and higher
OpenEdge 11.x
PHP 5 and 7
Question/Problem Description
Strings containing non-Western UTF-8 characters are displayed truncated by PHP.
Steps to Reproduce
Create a UTF-8 OpenEdge database with a table that has a character field with FORMAT "x(40)" and MAX-WIDTH 40.
Enter a row containing '还还还还还还还还还还还还还还还还还还还还还还' (22 Chinese characters) in that new table column.
Clarifying Information
The .df definition of the table column is as follows:

ADD FIELD "ColumnName" OF "TableName" AS character 
  DESCRIPTION ""
  FORMAT "x(40)"
  INITIAL ""
  LABEL "Column description"
  POSITION 15
  MAX-WIDTH 40
  ORDER 140


The SQL SELECT statement used in the PHP code is similar to:
select columnName from PUB.tableName where someUniqueId='0000000001'

Executing an ODBC query outside of PHP with:
$DLC/bin/proenv
export LD_LIBRARY_PATH=$DLC/lib:$DLC/odbc/lib
export ODBCINI=/etc/odbc.ini
cd $DLC/odbc/samples/example

./example
./example DataDirect Technologies, Inc. ODBC Example Application.
Enter the data source name : Progress
Enter the user name : sysprogress
Enter the password : sysprogress
Enter SQL statements (Press ENTER to QUIT)
SQL> select columnName from PUB.tableName where someUniqueId='0000000001'

shows the expected full string with its 22 characters.

Both PHP 5 and PHP 7 show the same truncation behavior for strings containing non-Western UTF-8 characters.

 
Error Message
PHP shows a truncated string such as:
还还还还还还还还还还还还还�
instead of the correct full string:
还还还还还还还还还还还还还还还还还还还还还还
Defect/Enhancement Number
Cause
PHP unified ODBC and PHP PDO ODBC do not support Unicode. These PHP ODBC libraries use the ANSI ODBC APIs (SQLxxxA APIs) instead of the Unicode ODBC APIs (SQLxxxW APIs).

More information is available at https://www.progress.com/tutorials/odbc/unicode

UTF-8 works with PHP when using MySQL (another database often used with PHP) because MySQL access does not go through 'odbc_do' (the unified ODBC PHP library) but through, for example, 'mysqli_query' from a different PHP library (mysqli). More than one PHP library is available for connecting to MySQL, but no such custom PHP extension exists for OpenEdge: only the generic PHP unified ODBC and PHP PDO libraries can be used for OpenEdge ODBC connections from within PHP.

PHP unified ODBC and PHP PDO ODBC use the SQL width value of a column to determine the maximum number of characters they will show from that column, and they assume 1 byte per character. That assumption is not correct for non-Western characters such as Chinese characters, because the UTF-8 encoding of a Chinese character does not fit into a single byte (in the string 还还还还还还还还还还还还还还还还还还还还还还, each Chinese character needs 3 bytes).

The truncated string that was displayed through PHP:
还还还还还还还还还还还还还�
shows only the first 13 Chinese characters plus 1 question mark character because
13 Chinese characters * 3 bytes = 39 bytes
and the SQL width is set to 40 bytes, so the 40th byte is only the first byte of the 14th character and cannot be displayed as a complete character.
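
The arithmetic above can be reproduced with a short Python sketch. It is only a simulation under the assumption that the returned data is cut off at exactly the SQL width counted in bytes; the names `SQL_WIDTH` and `truncate_to_width` are hypothetical and do not come from PHP or OpenEdge.

```python
# Simulation only: assumes the value is cut at SQL_WIDTH bytes, as the article describes.
SQL_WIDTH = 40          # SQL width (MAX-WIDTH) of the column
stored = "还" * 22       # the 22 Chinese characters stored in the column


def truncate_to_width(text: str, width: int) -> str:
    """Cut the UTF-8 bytes of text at width and decode what is left.

    A character that was cut in half is shown as U+FFFD, the
    replacement character that is rendered as a question mark.
    """
    raw = text.encode("utf-8")
    return raw[:width].decode("utf-8", errors="replace")


shown = truncate_to_width(stored, SQL_WIDTH)
print(len(stored.encode("utf-8")))   # bytes needed for the full string
print(shown)                         # what survives the 40-byte cut
print(shown.count("还"), shown.count("\ufffd"))
```

Each 还 occupies 3 bytes, so 13 whole characters fit in 39 bytes and the remaining byte is an incomplete sequence, which matches the output seen from PHP.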
Resolution
None at this time.
Workaround
Although the PHP libraries use the ANSI ODBC APIs rather than the Unicode ones, PHP can still show the characters correctly.

For the example given above, to account for how PHP uses the SQL width, the SQL width value needs to be increased to at least:
22 characters * 3 bytes = 66 bytes
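
For columns holding other values, the same calculation can be sketched in Python as the largest UTF-8 byte length among the values. The function name `required_sql_width` and the sample list are hypothetical, and the sketch assumes the values have already been read into Python strings by some other means.

```python
def required_sql_width(values):
    """Return the largest UTF-8 byte length among the given strings.

    Returns 0 for an empty collection.
    """
    return max((len(v.encode("utf-8")) for v in values), default=0)


# Hypothetical sample data: the string from this article plus a short ASCII value.
sample_values = ["还" * 22, "abc"]
print(required_sql_width(sample_values))
```

For the string in this article the result equals the 66 bytes calculated above.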

The SQL width can be changed with the following steps:

mpro databaseName -U sysprogress -P sysprogress
F3 -> Tools -> Data Dictionary -> Schema -> Adjust Field Width... -> TableName
and change the 'Width' value from 40 to 66 for the 'Field-Name' ColumnName, then select
<Save> -> <Close> -> A. Appl -> Database -> Exit -> F3 -> File -> Exit


Alternatively, when the SQL width needs to be adjusted on all database tables:

dbtool databaseName
2. SQL Width Scan w/Fix Option
1=self-service
Padding % above current max: 300
<table>: (Table number or all)? all
<area>: (Area number or all)? all


If Unicode characters that need 4 bytes are expected, then instead of
Padding % above current max: 300
use
Padding % above current max: 400
with the dbtool command described above.
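
Whether the data contains characters that need 4 bytes can be checked with a small Python sketch like the one below. The helper name `max_bytes_per_char` is hypothetical; the sketch only measures UTF-8 byte lengths and does not call dbtool.

```python
def max_bytes_per_char(text):
    """Return the largest number of UTF-8 bytes used by any single character."""
    return max((len(ch.encode("utf-8")) for ch in text), default=0)


print(max_bytes_per_char("abc"))          # plain ASCII text
print(max_bytes_per_char("还"))            # Chinese character used in this article
print(max_bytes_per_char("\U00020000"))   # a character outside the Basic Multilingual Plane
```

A result of 3 corresponds to the padding value 300 used above, and a result of 4 to the padding value 400.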
Notes
Attachment 
Last Modified Date
8/15/2018 6:57 AM

