Feature #19
Adding 16-bit Unicode support for Component Pascal identifiers
Status: Closed
% Done: 100
Description
BlackBox 1.6 already supports extended ASCII characters from the ISO Latin-1 subset of Unicode in Component Pascal identifiers. For the benefit of the Cyrillic or Greek community, for example, 16-bit Unicode support must be added. To simplify the required changes in the compiler and runtime system, and to provide a compact encoding of plain ASCII identifiers, the UTF-8 encoding shall be used for representing Unicode identifiers both internally in the compiler and runtime system and externally in symbol and object files. This also has the advantage that the symbol and object file formats stay compatible with BB 1.6 as long as only plain ASCII characters are used. Because Unicode characters may also appear in module names, which are mapped to file names, Unicode support must also be added to file name handling in a number of modules.
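The compactness and compatibility argument can be checked with a few lines of code. The following is a minimal sketch in Python rather than Component Pascal; the helper names and the sample identifiers are made up for illustration, and the assumption that BB 1.6 stores one byte per Latin-1 character is an inference from the description, not something this ticket states.

```python
def utf8_size(identifier: str) -> int:
    """Number of bytes the identifier occupies when stored as UTF-8."""
    return len(identifier.encode("utf-8"))


def same_bytes_as_latin1(identifier: str) -> bool:
    """True if the UTF-8 bytes equal the one-byte-per-character Latin-1 bytes.

    Assumption: the older format stores one byte per Latin-1 character.
    """
    return identifier.encode("utf-8") == identifier.encode("latin-1")


# Plain ASCII: one byte per character, identical bytes in both encodings.
print(utf8_size("DevCompiler"), same_bytes_as_latin1("DevCompiler"))

# Latin-1 beyond ASCII: the UTF-8 bytes differ from the one-byte form.
print(utf8_size("Größe"), same_bytes_as_latin1("Größe"))

# Cyrillic: two bytes per letter in UTF-8, not representable in Latin-1 at all.
print(utf8_size("Привет"))
```

Only the plain ASCII identifier keeps exactly the same bytes, which matches the restriction stated in the description.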
Refers to CPC-1.7 change list items 1, 7, 13, 21, 22, 23.
Updated by J. Templ about 11 years ago
- Description updated
- % Done changed from 0 to 50
Updated by J. Templ about 11 years ago
- Subject changed from Adding full Unicode support for Component Pascal identifiers to Adding 16-bit Unicode support for Component Pascal identifiers
- Description updated
Updated by I. Denisov almost 11 years ago
- Status changed from New to In Progress
- % Done changed from 50 to 70
According to RFC 3629, today's de facto standard for UTF-8,
http://tools.ietf.org/html/rfc3629
any valid UTF-8 string must match the following "Syntax of UTF-8 Byte Sequences":
UTF8-octets = *( UTF8-char )
UTF8-char = UTF8-1 / UTF8-2 / UTF8-3 / UTF8-4
UTF8-1 = 00X-7FX
UTF8-2 = C2X-DFX UTF8-tail
UTF8-3 = E0X A0X-BFX UTF8-tail / E1X-ECX 2( UTF8-tail ) /
         EDX 80X-9FX UTF8-tail / EEX-EFX 2( UTF8-tail )
UTF8-4 = F0X 90X-BFX 2( UTF8-tail ) / F1X-F3X 3( UTF8-tail ) /
         F4X 80X-8FX 2( UTF8-tail )
UTF8-tail = 80X-BFX
Alexander Shiryaev wrote an algorithm that converts UTF-8 to Unicode while checking the validity of the input.
http://forum.oberoncore.ru/viewtopic.php?f=127&p=89571#p89571
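The RFC 3629 byte-sequence syntax can be turned into a validating decoder fairly directly. The following is a minimal Python sketch of that idea. It is not Alexander Shiryaev's algorithm from the linked forum post, and the name decode_utf8 and the convention of returning None on invalid input are assumptions made for illustration.

```python
# One row per lead-byte range of the RFC 3629 syntax:
# (lead low, lead high, first tail low, first tail high, number of tail bytes)
LEAD_RANGES = [
    (0xC2, 0xDF, 0x80, 0xBF, 1),  # UTF8-2
    (0xE0, 0xE0, 0xA0, 0xBF, 2),  # UTF8-3, excludes overlong forms
    (0xE1, 0xEC, 0x80, 0xBF, 2),
    (0xED, 0xED, 0x80, 0x9F, 2),  # excludes surrogates D800..DFFF
    (0xEE, 0xEF, 0x80, 0xBF, 2),
    (0xF0, 0xF0, 0x90, 0xBF, 3),  # UTF8-4, excludes overlong forms
    (0xF1, 0xF3, 0x80, 0xBF, 3),
    (0xF4, 0xF4, 0x80, 0x8F, 3),  # excludes values above 10FFFF
]


def decode_utf8(data: bytes):
    """Return the list of code points, or None if data is not valid UTF-8."""
    out = []
    i, n = 0, len(data)
    while i < n:
        b0 = data[i]
        if b0 <= 0x7F:  # UTF8-1: plain ASCII
            out.append(b0)
            i += 1
            continue
        for lead_lo, lead_hi, lo, hi, tails in LEAD_RANGES:
            if lead_lo <= b0 <= lead_hi:
                break
        else:
            return None  # 80..C1 and F5..FF can never start a sequence
        if i + tails >= n:
            return None  # sequence is truncated
        b1 = data[i + 1]
        if not lo <= b1 <= hi:
            return None  # first tail byte outside its restricted range
        cp = ((b0 & (0x3F >> tails)) << 6) | (b1 & 0x3F)
        for k in range(2, tails + 1):
            b = data[i + k]
            if not 0x80 <= b <= 0xBF:
                return None  # remaining tail bytes must be 80..BF
            cp = (cp << 6) | (b & 0x3F)
        out.append(cp)
        i += tails + 1
    return out
```

The restricted range of the first tail byte is what rejects overlong encodings, surrogates and values above 10FFFFX without any separate range check on the decoded code point.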
Updated by I. Denisov almost 11 years ago
Helmut found one error!
Helmut wrote:
Dear BlackBox User,
two days ago I received an e-mail from Hans Klaver:
Dear Helmut,
Today I downloaded BlackBox 1.7-RC4 for Windows.
In case you do not know already: the BB Logo is missing from the About BlackBox dialog and from the Guided Tour document.
Bye,
Hans Klaver
Without his feedback I would not have known that the uploaded version had an error.
I immediately rolled back to the last known good version and searched for the
error.
I caught the error with the following steps:
1. select the word DevCPM.LogWStr
2. Info -> Search in Source
3. click on the link Dev/Mod/CPP.odc
4. Dev -> Compile
command error: cannot load module DevCompiler
Notes:
The error is reported from StdInterpreter.ShowLoaderResult via
CannotLoaderResult.
The module DevCompiler exists, and after a restart of BlackBox I can
compile. So what happens?
I found the error in module StdLoader, in
PROCEDURE (h: Hook) ThisMod (IN name: ARRAY OF SHORTCHAR): Kernel.Module;
at line
VAR m: Kernel.Module; ms: ModSpec; n: Kernel.Name; res: INTEGER;
The variable definition res: INTEGER; must be deleted.
The correct line is
VAR m: Kernel.Module; ms: ModSpec; n: Kernel.Name;
Please have a look at Help -> About BlackBox
Version 1.7-RC4 Built number 10 on 20.10.2014 is OK
Version 1.7-RC4 Built number 11 on 30.10.2014 is erroneous
Version 1.7-RC4 Built number 15 on 11.11.2014 is OK
I apologize for the inconvenience caused by my mistake.
With best regards
Helmut Zinn
Updated by I. Denisov almost 11 years ago
- % Done changed from 70 to 80
luowy suggested a better converter that also handles surrogates and returns res in a clever way.
StdCoder.Decode ..,, ..fv....3QwdONl9RhOO9vRbf9b8R7fJHPNGomCrlAyIhgs,CbKBhZ xi2,CoruKu4qouqm8rtuGfa4.hOO9vRb1Y66wb8RTfQ9vQRtIdvPZHWKqtCa.E.U5Usp,6.5Qw dONlnayKmKKqCLLCJuGqayKm6F9vQ5nsH3.bnayKmKa2,Cor.kay4.qorGqmQCU2,CJuyKtQC9 8P9PP7ONbXmb.2.AdAk5kUm.,6.k39.86.QC18RdfQHfMf9R9vQ7ONb1E.kHE.0.p.,6.jdLL3 0EJYjyC.6.VQ.E4k.8Mtf.2.S02.e,2UgW.Ue.E.mP,UAU0IkmL,6.Y32.I16.j,6.J,U.YLk. 0.85CE,9T3E.0.n00.p.0U.460.J,U.2GE4E.q,CE3U2V1w,61s.VU.64s.T.S.8E0E08Mtf.2 .y20E.c4E.2E2.e0U.2Uw0e.8EOE.a78k8E.a,8k.E.U1o.2U5U3IkmL,6..EBU.YJ2.I3.,6. V2g0MR1U1A20k0u0I,QU,U.A2I.6.FR.QUDU.21gUdU1Y,MD6.1U.QUF2.0k7k0e,0kIE,O,2, ,E,4.0k7k,C,0E1k044Ck0E0G.CE,U8U.Y2QUZU.g2gUPU1Y0gUX,9.7.CE,k0a,0EJ2.5s16. d0zT1H6IZuH5OF7OJZOF,NJdfNl7JTvIdfQHfPDf8,78HeH,NRdfNldC,NEZeI1OK,tHB86b8G TeIduEFOEZuC,tHf8J,tQdfQp761eI.CIY42UmhgnJbUAdCZe3xc3JedQbBAV7QcDpdHZeUAhg ZhZxgVZh0BjohgUgbUAav2YoJipphXBgohgY3Yx2Yl2av2Ze2YmhgnhigZiUIZdgV7AV1,Oqo8 rtGLEqHE0nR0Gu4qomKEqHE4nRWGJ0mtGrkGrmemIqk4ak2OpU8JEWLK0momGEeKK0mq4KweHE aIb.rN1HM0HsMFfC,tIF0UBUnZZUQimIbUAdC,2YcIZUQC66JN8PU7Yiu2Y7,.Grka43PSdPNb 96JN8P.TvON76bPRZfQp763uHT8H9OERuCH66Fd8,tQffQyqn4KuKKE.qk2aEfEIeGcKIcCHQC HJam4aU7Igppgu2Y,,CHEyIX0md..ohg2YhJbUAdC,g,g,3OFDOGR86ZPN0GRqHE0nRqk2ako0 GRqHE66J96pND,,PPMl96pND,7H9eHFtQdfQH76P76XtC,tQ,dCvFnaKtsC,N1HM0j8GH8H986 FNRdfNipoqJECGE0HgaGEOGEWGp0GS0mq4ad2Y2xdUgV7M05HEenSgiopAsCPM0akYOIECLEqH EO42YI3d3pdUgV7A,HcMQfkgfUIbxsMFvC,dP,dCvlMi1Z76pND01bPRZHEenSoc,ZdHhcv2YU gV7A,HsE1uI98659O,tHB86PM0AVw3Yl2fcIZk2feAZioZrocMJbUQioJiPJhR3Yug55nRAdCR ccIhdQbBU7YDVtEZ7KRd9V7FB8Kp76l96pNDyIdGIICKoaGEqGEAbmQbUQiUUoBgdtC,7R,dCv lMgV7k2m598Ale9R7A9eFleC,N1HU76S,dCw7.IamYav2YBU7MGQgc3Yx2Ykgck66d8GsQZ76M AU7MFNuIHeF,tM.78K,7J.ENin4a.HkWuIWin4a.Hkt0GR6R.EN.H6Tock2fio3B8BleC,N1M0 W5w7.6BVtC,N1M0a2.B8A0Ge.UnQbUgV7k2gcA,.GXU..Gn4ak2A,9eH.HktUo,.bVnhCUIJeJ hcvgV7k2KIagcU2ZesMTfPbPRPPN,dNHHuCLu0mom46631,,M0CLu8rh.CIY8JI0HWCIM0HY0m J0mb8JWUdQbUAdC,0GtKqtUl.akWu2UAVBAV7M0THEenSY866PM0AV1,bfA,tHBO1HM0HM0tXu 
2Y7pcU2ZX3hUYbU2as2aMBZDJecQgc3Yy2YkIc43fd2YI3d3lriKEe1B0iX3pd2R5M0tnMeHEa IX.tV,3aMBZD,u1.....aEyIau2Y7p6ES6C.cDAb43fd22....aEyQau2Y7p6ESMCV7KH,cDI6 ....U7YDddC,NG.tVs.UyEQOIgaGE....M09eH22M0HWjRBd8G9WBU76F9uEF7RHtC,7S,dC2j UIZUoao2Yf22Ud224HNWnR0m4k2A7GLEq1,7JF0k2MGQipVI3d3FIeGEC5y4.M06F6SN76X7AV 7AV7GHtCPM0A,QC.Q6EQ0HMWIE6S,7FHeJ,7BV7AFGIemayIW0GO0HMIZdAZv22.P.H.b1.I8Q 6s8MHT8F,,aWUA7UU2ZeghVBjWhgUIhyghV3jU2YeA3M0gcAl4ak2A,QC..lP8r76.g,A,KIbM 1M0QiUUaltQ5k2gcAFE8quOqhuqi0GRsMMGcPHtC,tQI501k2gcC,AV3Z7WGJ0momKq.kt.A,B uHZ86P96pND22T86R96P76X767uHU7kYcO,7Dv76PPMl96d8GsQd1U1Vk.kbElKLnghRBZdQbU .J190,78J767uEl8K,76J,A,9eHg,A,ZPNb96MA0GWKoVWmoamRQiUUa,UUohjZiUQgjpBkomK q.dPMHHEUU.AV3,.U7pd13ZdB3PM0K2kYOYcQiUg5dPMAZa2Zi3Yy2YkAZUY868KLr8rmCrrmK vKKm0Gla5bf8HN1cF.24..k2a2J1.kt.kV....MGcO.00G2.MFEtK4MAq.90PU7luG566EEG3A dCR6ZPNb99,NAVN8,NFR8F0GI6RZPRUYd8kYcOuHEqqkW5..kR0Gp00qqkQbUgcCl4sQd1Uk2f vgV7gcC76f8RBHeQ8a4rN1HcUXDJ9X1xhiZimxhgZhZJinpZHZC58RZ9P7ONbvM,Mwd0.UiQcj pho,YcZRiX3.5011.85...CLL.U2V.IS2U.UIU.U76.2..AU0CyIVGhighgmRiiQ88pum470,M wd0UnpZGhighA70,cw5.0.LJ.w.QI2U.sU.ktumdsIdPSNPN7ONbH.4D.o3aLq.,cwFE.2..F. pG.2U.E,,.RNEd1K5GomCb.6,6..UYU.AU.U.UUQoOF.2Uwpr,6C5H.WnlM.E.cUZj0E..UO., .1.eWwV.E.0t.U...Xi0... --- end of encoding ---
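For readers who cannot decode the attachment, the general idea of a converter that handles surrogates can be sketched as follows. This is a hypothetical Python illustration, not luowy's code: the function name and the meaning of res (0 for success, otherwise the 1-based position of the first offending byte) are assumptions, since the ticket does not describe how res is actually computed.

```python
def utf8_to_utf16_units(data: bytes):
    """Convert UTF-8 bytes to a list of 16-bit code units.

    Returns (units, res). Assumed convention: res = 0 on success,
    otherwise the 1-based position of the first invalid byte.
    """
    try:
        # Python's strict decoder rejects overlong forms and surrogates.
        text = data.decode("utf-8")
    except UnicodeDecodeError as err:
        return [], err.start + 1

    units = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x10000:
            # Fits into one 16-bit character.
            units.append(cp)
        else:
            # Split into a high and a low surrogate.
            cp -= 0x10000
            units.append(0xD800 | (cp >> 10))
            units.append(0xDC00 | (cp & 0x3FF))
    return units, 0


# U+042F fits into one unit, U+1F600 needs a surrogate pair.
units, res = utf8_to_utf16_units("Я\U0001F600".encode("utf-8"))
print([hex(u) for u in units], res)
```

With 16-bit characters, any code point above FFFFX has to be stored as such a pair, which is why a converter for 16-bit identifiers needs to decide how to treat 4-byte UTF-8 sequences.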
Updated by I. Denisov almost 11 years ago
- Status changed from In Progress to Closed
- % Done changed from 80 to 100
Resolved and applied in the master branch.
The final solution uses a simple format check, in accordance with the Center's decision.
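This ticket does not spell out what the simple format check covers. Purely to illustrate the trade-off between a light check and the full RFC 3629 rules, here is a hypothetical Python sketch of a check that only looks at the lead-byte pattern and the number of continuation bytes. The function names are made up, and nothing here is taken from the BlackBox sources.

```python
def simple_format_ok(data: bytes) -> bool:
    """Structural check only: lead-byte pattern plus 10xxxxxx tail bytes."""
    i, n = 0, len(data)
    while i < n:
        b = data[i]
        if b < 0x80:
            tails = 0  # 0xxxxxxx
        elif b & 0xE0 == 0xC0:
            tails = 1  # 110xxxxx
        elif b & 0xF0 == 0xE0:
            tails = 2  # 1110xxxx
        elif b & 0xF8 == 0xF0:
            tails = 3  # 11110xxx
        else:
            return False  # stray tail byte or invalid lead byte
        if i + tails >= n:
            return False  # sequence is truncated
        for k in range(1, tails + 1):
            if data[i + k] & 0xC0 != 0x80:
                return False  # tail byte is not 10xxxxxx
        i += tails + 1
    return True


def strict_ok(data: bytes) -> bool:
    """Full validity check, delegated to Python's strict UTF-8 decoder."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Both checks accept ordinary Cyrillic text.
print(simple_format_ok("Привет".encode("utf-8")), strict_ok("Привет".encode("utf-8")))

# The overlong sequence C0 80 passes the structural check but not the strict one.
print(simple_format_ok(b"\xC0\x80"), strict_ok(b"\xC0\x80"))
```

A structural check of this kind is shorter and easier to port, at the cost of accepting overlong forms and encoded surrogates that a strict validator rejects.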
Updated by I. Denisov over 10 years ago
- Status changed from Closed to In Progress
I found that Externalize and Internalize do not work correctly for Views and Models from modules with Cyrillic identifiers.
Updated by J. Templ about 10 years ago
- Status changed from In Progress to Closed
Updated by I. Denisov about 9 years ago
- Related to Bug #120: The interface is not showing for aliases added
Updated by I. Denisov about 9 years ago
- Related to Bug #57: wrong encoding of "module not found" message added
Updated by I. Denisov about 9 years ago
- Related to Bug #132: Trash in the definitions for extended records with unicode identifiers added