Bug #182

fixing code page conversion in RTF import

Added by J. Templ over 1 year ago. Updated over 1 year ago.

Status: Closed
Start date: 11/21/2017
Priority: Normal
Due date:
Assignee: J. Templ
% Done: 0%
Category: -
Target version: 1.7.1
Forum topic:

Description

RTF texts encoded with, for example, a Russian Windows version are not imported correctly by BlackBox when opened on a non-Russian Windows version.
In addition, East Asian languages (e.g. Simplified Chinese) use multi-byte encodings, which BlackBox does not import correctly either.
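
Both failure modes can be reproduced outside BlackBox. The following Python sketch is illustrative only: the sample strings are not from the report, and it assumes the text was written under code page 1251 while the importer assumed code page 1252.

```python
# Bytes written under code page 1251 (Cyrillic) ...
original = "Привет"
raw = original.encode("cp1251")

# ... but interpreted under code page 1252 (Western European).
wrong = raw.decode("cp1252")   # what a cp1252-only importer would show
right = raw.decode("cp1251")   # what honoring the declared code page gives

# Multi-byte case: in cp936 each of these characters occupies two bytes,
# so a strictly byte-at-a-time conversion cannot produce the right text.
chinese = "你好".encode("cp936")
```

Here `wrong` is mojibake made of accented Latin letters, while `right` equals the original string.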

Reported by Oleg N. Cher and luowy, 2017-11-18.

Associated revisions

Revision 0209cc6c
Added by J. Templ over 1 year ago

fixing code page conversion in RTF import. Refs: #182.
RTF command "\ansicpg" is no longer skipped and command "\u" has been improved.
As proposed by Ivan.

Signed-off-by: Josef Templ <>
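
For readers unfamiliar with the two RTF commands named in this revision, here is a rough Python sketch of what honoring them involves. It is not the BlackBox implementation; the function names and the fallback default of 1252 are assumptions made for illustration.

```python
import re

def ansi_code_page(rtf: str, default: int = 1252) -> int:
    """Return the code page declared by \\ansicpgN, or a default (assumed)."""
    m = re.search(r"\\ansicpg(\d+)", rtf)
    return int(m.group(1)) if m else default

def decode_u(param: int) -> str:
    """Map the parameter of \\uN to a character.

    RTF writes the parameter as a signed 16-bit decimal, so values above
    32767 appear as negative numbers and must be wrapped back.
    """
    return chr(param + 65536 if param < 0 else param)
```

A header such as `{\rtf1\ansi\ansicpg1251` yields 1251, and `\u1055` maps to the Cyrillic letter П.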

Revision b4019e4e
Added by J. Templ over 1 year ago

ansiCodePage initially set to 0. Refs: #182.

Signed-off-by: Josef Templ <>

Revision c86cfe0e
Added by J. Templ over 1 year ago

ansiCodePage initially set to 0. Refs: #182.

Signed-off-by: Josef Templ <>

Revision 9621fef3
Added by J. Templ over 1 year ago

current font set in command \plain and restored at group end. Refs: #182.
+ additional Mac character sets defined for \fcharset
+ special cases 0, 1, 2 refined for \fcharset
+ code page conversion only if using a font with code page different from 1252
+ code page conversion in Write improved, writes ? in case of error
+ ansiCodePage initialized to 1252
+ Write replaced by WriteUnicode for writing multibyte characters
+ command \cpg added for setting the code page of a font
+ type Context moved into ParseRichText because it references FontInfo now

Signed-off-by: Josef Templ <>
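
The mapping from `\fcharset` to a code page and the "writes ? in case of error" behavior can be sketched in Python as follows. The table lists only a subset of the charset values from the RTF specification, and the names `FCHARSET_TO_CODEPAGE` and `convert` are hypothetical; this is not the BlackBox code.

```python
# Subset of \fcharset values and their Windows code pages (RTF specification).
FCHARSET_TO_CODEPAGE = {
    0: 1252,    # ANSI
    128: 932,   # Shift JIS
    129: 949,   # Hangul
    134: 936,   # GB2312 (Simplified Chinese)
    136: 950,   # Big5
    161: 1253,  # Greek
    162: 1254,  # Turkish
    177: 1255,  # Hebrew
    178: 1256,  # Arabic
    186: 1257,  # Baltic
    204: 1251,  # Russian
    238: 1250,  # Eastern European
}

def convert(raw: bytes, code_page: int) -> str:
    """Decode raw bytes with the given code page; return '?' on any error."""
    try:
        return raw.decode(f"cp{code_page}")
    except (UnicodeDecodeError, LookupError):
        # Undefined byte or unknown code page: fall back to a placeholder.
        return "?"
```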

Revision 1f816218
Added by J. Templ over 1 year ago

2-byte encoding added for cp936 (Simplified Chinese). Refs: #182.

Signed-off-by: Josef Templ <>

Revision cb78ba93
Added by J. Templ over 1 year ago

generic multi-byte support added by buffering text runs. Refs: #182.
Mostly as proposed by luowy.
Plus small code cleanups.

Signed-off-by: Josef Templ <>
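
The idea of buffering a text run before converting it can be sketched in Python as below. This is an illustration under assumptions, not the committed code: `decode_run` is a hypothetical name, and the sketch handles only `\'hh` escapes and literal characters, ignoring all other RTF commands.

```python
import re

HEX_ESCAPE = re.compile(r"\\'([0-9a-fA-F]{2})")

def decode_run(run: str, code_page: int) -> str:
    """Collect all bytes of one text run, then decode them together."""
    buf = bytearray()
    pos = 0
    while pos < len(run):
        m = HEX_ESCAPE.match(run, pos)
        if m:
            # \'hh contributes one raw byte
            buf.append(int(m.group(1), 16))
            pos = m.end()
        else:
            # a literal character contributes its ASCII byte
            buf.extend(run[pos].encode("ascii", errors="replace"))
            pos += 1
    # Decoding the whole buffer lets lead and trail bytes pair up correctly.
    return buf.decode(f"cp{code_page}", errors="replace")
```

Decoding per run rather than per byte matters because the trail byte of a double-byte character may itself be written as a literal ASCII character in the RTF source.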

Revision 015afb54
Added by J. Templ over 1 year ago

\charset0 mapped to cp1252. Refs: #182.
As it was before in BB1.6.

Signed-off-by: Josef Templ <>

Revision f47e7053
Added by J. Templ over 1 year ago

error behavior improved and TAB character handled. Refs: #182.
1. Selecting a font not in the font table now returns the default font.
This is because LibreOffice sometimes selects an undefined font but does not use it at all.
The new behavior avoids a TRAP 100 in such cases.
2. 09X (TAB) is no longer ignored. This is used for example in VS Editor.
3. Reference to RTF specification added in header comment.

Signed-off-by: Josef Templ <>
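
The fallback described in item 1 amounts to a lookup with a default instead of a failing lookup. A minimal Python sketch, assuming a font table keyed by font id; the `FontInfo` fields, `DEFAULT_FONT`, and `select_font` are hypothetical and only mirror the idea:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FontInfo:
    name: str
    code_page: int

# Assumed default; the actual default font in BlackBox may differ.
DEFAULT_FONT = FontInfo("Arial", 1252)

def select_font(table: dict, font_id: int) -> FontInfo:
    """Return the font for font_id, or the default if it is not in the table."""
    return table.get(font_id, DEFAULT_FONT)
```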

Revision fd4e5150
Added by J. Templ over 1 year ago

error behavior improved and TAB character handled. Refs: #182.
1. Selecting a font not in the font table now returns the default font.
This is because LibreOffice sometimes selects an undefined font but does not use it at all.
The new behavior avoids a TRAP 100 in such cases.
2. 09X (TAB) is no longer ignored. This is used for example in VS Editor.
3. Reference to RTF specification added in header comment.

Signed-off-by: Josef Templ <>

Revision aa34f96b
Added by J. Templ over 1 year ago

typo fixed in function Font. Refs: #182.
Initialization refined: an unspecified font is now identified by a special id named 'noFontId'.
This avoids inserting two FontInfo nodes with id = 0.

Signed-off-by: Josef Templ <>

History

#1 Updated by J. Templ over 1 year ago

  • Description updated (diff)

#2 Updated by R. Campbell over 1 year ago

  • Status changed from New to Closed
