FMDiff™   

FileMaker Business Alliance
View Jürgen Geßwein's profile on LinkedIn

This site is W3C compliant:
Valid XHTML - Valid CSS
Last modified February 24 2016, 19:25:09 CET.

Entering Clean Text (or: avoiding unwanted characters)

Reader Comments at the bottom of this page

A frequently asked question is: Can illegal characters in a field harm my solution? The answer is twofold.

One: There is no such thing as an illegal character in FileMaker. FileMaker can store any utf-8 character (roughly 65,000 different chars) without any negative effects inside FileMaker. Characters of any code whatsoever in a text field can not corrupt a FileMaker file.

Two: There are unwanted characters, those with a character code in the range between 0 and 31 inclusive (decimal), with some exceptions. The Filter function can't be used because you want to only disallow a small number of characters, while the allowed rest is huge and thus can't be put in a filter definition. The issue with those characters becomes apparent when they are transferred to the outside of FileMaker, may it be export, HTML or XML output, printing, SQL, etc. You may encounter problems you can't explain but which have its origin in these invisible characters.


A FileMaker native solution in FileMaker Pro Advanced 10 and up

We create a Custom Function "cleanText" that is defined as follows. FileMaker Pro 10 provides a function to directly generate any bytecode.

Function Name and Parameters:
cleanText ( text )

Function Definition:

Substitute ( text ; 
[ char (  0 ) ; "" ] ; 	// This requires FileMaker Pro 10 or up
[ char (  1 ) ; "" ] ; 
[ char (  2 ) ; "" ] ; 
[ char (  3 ) ; "" ] ; 
[ char (  4 ) ; "" ] ; 
[ char (  5 ) ; "" ] ; 
[ char (  6 ) ; "" ] ; 
[ char (  7 ) ; "" ] ; 
[ char (  8 ) ; "" ] ; 
                        // char(9)  = Tab*
                        // char(10) = LineFeed*
[ char ( 11 ) ; "" ] ; 
[ char ( 12 ) ; "" ] ; 
                        // char(13) = Return*
[ char ( 14 ) ; "" ] ; 
[ char ( 15 ) ; "" ] ; 
[ char ( 16 ) ; "" ] ; 
[ char ( 17 ) ; "" ] ; 
[ char ( 18 ) ; "" ] ; 
[ char ( 19 ) ; "" ] ; 
[ char ( 20 ) ; "" ] ; 
[ char ( 21 ) ; "" ] ; 
[ char ( 22 ) ; "" ] ; 
[ char ( 23 ) ; "" ] ; 
[ char ( 24 ) ; "" ] ; 
[ char ( 25 ) ; "" ] ; 
[ char ( 26 ) ; "" ] ; 
[ char ( 27 ) ; "" ] ; 
[ char ( 28 ) ; "" ] ; 
                        // char ( 29 ) = Group Separator (see table below)
[ char ( 30 ) ; "" ] ; 
[ char ( 31 ) ; "" ]   
)

You may simply copy and paste this Custom Function from this web site with FileMaker Pro 10 Advanced. This does NOT work for earlier FileMaker versions.

Once you have saved this Custom Function you may remove those "unwanted" characters (* see character table below) in FileMaker Pro as

   myTextField = cleanText ( Self )


Apply the Custom Function to an input field

First we select the tab Auto-Enter from the field options, mark Calculate value or click the button Specify... on the right. In FileMaker Pro 9 or up you can enter

cleanText ( Self )

and click OK. Now you have to uncheck "Do not replace existing value of field (if any)" to make sure the function is called every time the field is being modified.


A ready made solution for you

If you want to avoid the hassle of defining all the definitions you can download a file that can easily be integrated in your solutions. The file contains a Custom Function that has to be transferred into every other file that need the character filter. Additionally a File Reference (now External Data Source) has to be created in the Relationship graph, but no relations are necessary.

Download the files (cross platform) (7 kByte) for your personal use. Information and files provided "AS IS" without any guarantee or liability.


Technical Details how FileMaker treats characters 0..31

A NULL character when entered into a calculation as string literal ("") causes FileMaker 9 or below to hang, requiring a force quit - with all its negative consequences. But what about all the other characters in the range 1..31? Notified by a user about difficulties transferring one of these characters as part of a calculation, I tested them all.

Since these problems do not apply for field contents, the tests were extended on how those characters behave on import and export. The results are interesting, to say the least. The table below shows all results in a compacted form.

Numeric ASCII Win2 Mac2 Copy FileMaker related
Dec Hex char Meaning1 Ctrl- Ctrl- Calc3 Exp. Imp. Comment
0 00 NUL Null @ 00 00 DO NOT USE literally in calculations!
1 01 SOH A "?" 01 01
2 02 STX B B 02 02
3 03 ETX End of text C C "?" 03 03
4 04 EOT D "?" 04 04
5 05 ENQ E E "?" 05 05
6 06 ACK F "?" 06 06
7 07 BEL Bell G "?" 07 07 Does not ring any bell today.
8 08 BS Backspace H H 08 08 Deletes character
9 09 HT Horizontal tab I Tab 09 09
10 0A LF Line feed J J CR 0D 0A Does not export as line feed! Must use XSLT instead. Import terminates Record.
11 0B VT Vertical tab K K "?" 0B 0BReturn characters within Text export as 0B
12 0C FF Form feed L "?" 0C 0C
13 0D CR Carriage return M CR 0D Import terminates Record.
14 0E SO N N "?" 0E 0E
15 0F SI O O "?" 0F 0F
16 10 DLE P P "?" 10 10
17 11 DC1 Q Q "?" 11 11
18 12 DC2 R R "?" 12 12
19 13 DC3 S S "?" 13 13
20 14 DC4 T T "?" 14 14
21 15 NAK U U "?" 15 15
22 16 SYN V V "?" 16 16
23 17 ETB W W "?" 17 17
24 18 CAN Cancel X X "?" 18 18
25 19 EM Y Y "?" 19 19
26 1A SUB Z Z "?" 1A 1A
27 1B ESC Escape [ "?" 1B 1B
28 1C FS \ "?" 1C 1C
29 1D GS Group separator ] "?" 1D4 Separates repetitions on export and import.
30 1E RS ^ "?" 1E 1E
31 1F US _ "?" 1F 1F

1 irrelevant entries omitted
2 Entering of control characters could not be fully tested on a German keyboard. The Mac characters listed were confirmed to work here.
3 Control characters listed as either "?" or are left blank, are lost when copied as script step or calculation. They have to be re-entered. A Line feed (0A) is silently converted to a Carriage return (0D) - which I consider a bug.
4 On export and import the character 1D is special. When exporting a repeating field the repetitions are separated by 1D on export to a text field. A field may also contain a character 1D which is simply exported as well. On import 1D separates the text for a field into repetitions if the target field is a repeating field. If the target field has no or not enough repetitions, everything following 1D will be ignored after all repetitions are exhausted. Note: character 1D is normally not visible and requires a special editor to see them.


Keeping Formatting Out On Data Entry

If you just want to make sure your fields will always contain Text Only, regardless of what sources you have copied from to paste in a field, there is a simple method to achieve that:
Define an auto-enter calculation,

in FileMaker Pro 9 and up:

   TextFormatRemove ( Self )

in FileMaker Pro  8, 8.5:

   TextFormatRemove ( <current field name> )

in FileMaker Pro 7:

   Evaluate ( Quote ( <current field name> ) )

and uncheck "Do not replace existing value of field (if any)" to make sure the function is called every time the field is being modified.

Needless to say that this additional functionality could be added to the Custom Function described earlier.

 

 

Any comments are welcome

If you have any comments, critics, or improvements to share, please don't hesitate to send an email to Winfried Huslik.

 

Reader Comments

Thank you, Winfried, for FMMenulet, FMDiff, and the excellent articles on your site. The "under-the-hood" details you provide are an oasis in the desert of technical knowledge that professional FileMaker developers currently inhabit.
Regards, Tom

 

Nice write-up, Winfried
I submitted a bug for FM7 a couple of years ago regarding crashes with null characters ...
Here's an AppleScript that could help with "the tedious part" on Mac; run it then paste in the FM dialog (it only builds part):

  set the_construct to ""
  repeat with i from 1 to 31
    set the_construct to the_construct & "[ \"" & (ASCII character i) & "\" ; \"\" ] ; "
  end repeat
  set the clipboard to the_construct

I haven't tested it with the actual substitution.
Steve

 

Wrong number of records in table Q: I'm looking at your web page and file and one thing confuses me. Why does your file have 32 records? One would be plenty if I understand correctly - or am I missing something?
Crispin

A: Thanks for the hint, you're right! This is the outcome of another bug in FileMaker, let's call it Lazy I/O. If you look at the original script, it does a Delete All Records after the import. In fact we don't need any records at all since the table contains just globals, but the number of records doesn't matter either. (Strange to be on record 1 of 0 while having found 32!) I've now added a Flush Cache to Disk to give FileMaker a break before it deletes the records. You may download the new version now.

 

Q: I have just discovered that the set field step in your import script does not survive a FMP cut and paste! It might be worth warning people.
Crispin

A: Thanks for this important note!

 

Q: Have you any thoughts on how to test the custom function when it is in place?

A: You may use the same method as described to enter the codes. Set up a calculation for the length of the field content and check this after pasting one or more codes in the field. You could also import the provided file with the special codes into the field (make sure to check auto-enter options on).

 

Q: You say that these unwanted characters will not cause corruption ... can you say exactly what IS the problem with these unwanted characters?  I'm left wondering if they are something that can have an impact on performance, stability, etc. ... or if they are merely unwanted because they do nothing for us (in which case I would want to remove them, but wouldn't worry about them quite as much).
Ray

A: The main issue with those characters is when these are transferred to the outside of FileMaker, may it be export, HTML or XML output, printing, SQL, etc. You may later encounter problems you can't explain but which have its cause in these invisible characters.

 

Q: Thanks, Winfried. This information and associated file will be very useful to all developers.
I would caution about the use of the Self function until it is fixed.
Jason

A: Thanks for the heads up. The Self function issue only strikes on related files. In this case it is always applied locally to a table in the same file and should not be an issue.


Examples are provided "AS IS" without warranties of any kind. Use at your own risk.

© 2005 - 2015 Winfried Huslik †. © 2024 Jürgen Geßwein. All Rights Reserved. FMDiff and FMVis are trademarks of Jürgen Geßwein, Augsburg, Germany. FileMaker is a trademark of FileMaker Inc., Santa Clara, CA, USA. Other trademarks mentioned are property of their respective owners. This web site has not been authorised, sponsored, or otherwise approved by FileMaker, Inc.