UNICODE_REPLACE_BAD function

Returns SOURCE with characters that are invalid in UTF-8 encoding replaced with the string REPL.

Prototypes

UNICODE_REPLACE_BAD(SOURCE VARCHAR(4000), REPL VARCHAR(100))
UNICODE_REPLACE_BAD(SOURCE VARCHAR(4000))

RETURNS VARCHAR(4000)

Description

Under certain circumstances, DB2 will permit text containing characters invalid in the UTF-8 encoding scheme to be inserted into a column intended to contain UTF-8 encoded data. While this doesn’t cause a problem for DB2 queries, it can cause issues for down-stream appliations. This function provides a means of stripping or replacing such invalid characters.

Parameters

SOURCE
The string to search for characters invalid in the UTF-8 encoding scheme.
REPL
The string to replace any invalid sequences with. Defaults to the empty string if omitted.

Examples

Replacement of truncated UTF-8 characters:

VALUES
    (UNICODE_REPLACE_BAD('FOO' || X'C2', 'BAR'))
1
--------------------....
FOOBAR

Replacement of invalid characters in the middle of a string:

VALUES
    (UNICODE_REPLACE_BAD('FOO' || X'80' || BAR))
1
--------------------....
FOOBAR