Coded Character Sets
Resources on the Web
With an emphasis on library automation and the internet environment. Updated June 2008.
- Library of Congress
- MARC 21 Specifications
- MARC Development
- Selected MARC Proposals
- 96-10 USMARC Character Set Issues and Mapping to Unicode/UCS
- 97-10 Use of the Universal Coded Character Set in MARC records
- 97-14 Addition of new characters to existing USMARC sets
- 98-16r Nonfiling characters in all MARC formats
- 98-18 Unicode Identification and Encoding in USMARC records
- 2001-09 Mapping of EACC characters to Unicode/UCS
- 2002-11 Repertoire Expansion in the Universal Character Set for Canadian Aboriginal Syllabics
- 2004-08 Changing the Unicode/UCS Mapping for the Double-Wide Diacritics
- 2005-05 Change of Unicode mapping for the Latin "alif" character
- 2006-04 Technique for conversion of Unicode to MARC-8
- 2006-09 Lossless technique for conversion of Unicode to MARC-8
- Selected MARC Discussion Papers
- DP73 UCS and USMARC Mapping
- DP102 Non-filing characters
- DP118 Nonfiling characters in MARC 21 using the control character technique
- 2002-DP05 Guidelines for the Nonfiling Control Character Technique in the MARC 21 Formats
- 2002-DP06 Repertoire Expansion in the Universal Character Set for Canadian Aboriginal Syllabics
- 2004-DP03 Changing the Unicode/UCS Mapping for the Double-Wide Diacritics
- Selected MARC Reports
- Archives of UNICODE-MARC Discussion List
- MARC in XML
- MARBI (Machine-Readable Bibliographic Information) Committee
- The Structure and Content of MARC 21 Records in the Unicode Environment, by Joan M. Aliprand
- Perl Unicode introduction
- Perl Unicode Tutorial
- Unicode support in Perl
- Encode
- Perl locale handling (internationalization and localization)
- Some relevant Perl modules available on CPAN
- Convert::Recode
- Convert::Translit
- Encode
- Unicode::MapUTF8
- Unicode::Normalize
- MARC::Charset
- MARC::Record
- MARC.pm (deprecated in favor of MARC::Record)
- Internationalization Activity
- Charlint: a Unicode Character Normalization Tool (in Perl)
- RFC 822 Standard for the Format of ARPA Internet Text Messages
- RFC 1345 Character Mnemonics & Character Sets
- RFC 1502 X.400 Use of Extended Character Sets
- RFC 1521 MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies
- RFC 1815 Character Sets ISO-10646 and ISO-10646-J-1
- RFC 2046 Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types
- RFC 2130 Report of the IAB Character Set Workshop
- RFC 2220 The Application/MARC Content-type
- RFC 2277 IETF Policy on Character Sets and Languages
- RFC 2279 UTF-8, a transformation format of ISO 10646
- Multilingual Support in Internet/IT Applications
- Character sets. Letters, Tokens and Codes by Johan W van Wingen