The University of Queensland Library
      Unicode in EndNote 8 and 9
 
 

 

    One of the major enhancements in version 8 of EndNote was the introduction of Unicode support. It enabled EndNote users to insert references using any of the scripts available in Unicode.

    In EndNote 9, there were further enhancements to this feature, allowing users to import Unicode files via filters and connection files.


    Displaying Unicode Characters

    Here is a sample library in EndNote 8, containing references in a variety of scripts:

    [Figure: Sample EndNote 8 library with Unicode characters]

    EndNote uses the default language settings of your computer. You do not need to install a new language pack to view or input Unicode characters in EndNote. If you can already type text in Russian or Japanese in a Microsoft Word document, then you will be able to type text in Russian or Japanese in your EndNote library.

    Those who normally work with non-Western scripts will know how to install additional input languages in Windows: go to Control Panel, select Regional and Language Options, and click on the Languages tab. To install more complex languages (such as Chinese, Japanese, Arabic, Hebrew, Thai or Vietnamese), you will need the CD from which your Windows operating system was originally installed.

    If you are having trouble displaying Unicode characters, you may need to choose a different Display Font in EndNote. Some fonts do not support Unicode characters. To change the display font, click on Edit|Preferences and select Display Fonts.

    EndNote 8 and 9 cannot handle right-to-left scripts (such as Hebrew and Arabic).

    Note that Palm OS is not Unicode compliant, so certain characters that display properly in EndNote may not transfer correctly to your Palm device.

    If you are using a computer which does not have the appropriate language support installed, and you open an EndNote library created on another computer, you will find that the Unicode characters do not display correctly. You must install the necessary input languages. You may then find that you need to use the Recover Library command (go to Tools|Recover Library) before the Unicode characters will display correctly.


    Inputting Unicode Characters

    As explained above, if you have installed the appropriate input languages in Windows, you can type Unicode characters in an EndNote reference, in the same way as you would type Unicode characters in your word processor.

    You can also copy and paste Unicode characters from other Windows applications. If you have references in a word processor document, you can copy and paste the data into EndNote. Similarly, you can copy and paste data from a web-based database, such as a library catalogue.


    Importing Unicode Characters via Filters

    In EndNote 8, you cannot import Unicode characters using a filter. As in previous versions of EndNote, filters can only import plain ASCII text.
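
    To see why a file may fail this plain-ASCII restriction, it can help to list the characters in it that fall outside ASCII. The following is a minimal Python sketch run outside EndNote; the function name and the sample record are illustrative assumptions, not part of EndNote.

```python
def non_ascii_characters(text):
    """Return (position, character) pairs for every character outside plain ASCII."""
    return [(i, ch) for i, ch in enumerate(text) if ord(ch) > 127]


# Illustrative record mixing Latin and Cyrillic text (an assumption for this sketch).
record = "Author: Tolstoy, L. / Толстой, Л."

print(record.isascii())                   # False: the Cyrillic letters are not ASCII
print(non_ascii_characters(record)[:3])   # first few offending characters
```

    An empty list means the text is plain ASCII and falls within what an EndNote 8 filter can import.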

    In EndNote 9, it is possible to use a filter to import text files in encodings other than plain ASCII text. When using the Import dialog box, go to the Text Translation box and click on the downward arrow to see the available options. For Unicode text files, the most widely used encoding is UTF-8.
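
    If you are unsure whether a file is really UTF-8 before choosing a Text Translation option, you can test it outside EndNote. The sketch below is a hedged illustration in Python: the helper name is hypothetical, and Windows-1251 is used only as an example of a non-Unicode encoding.

```python
def decodes_as(data, encoding):
    """Return True if the raw bytes are valid text in the given encoding."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


# The same word saved in two different encodings.
utf8_bytes = "Москва".encode("utf-8")
cp1251_bytes = "Москва".encode("cp1251")   # Windows-1251, a legacy Cyrillic encoding

print(decodes_as(utf8_bytes, "utf-8"))     # True
print(decodes_as(cp1251_bytes, "utf-8"))   # False: not valid UTF-8
```

    This check is most useful for UTF-8. Some single-byte encodings accept almost any sequence of bytes, so a successful decode with one of those does not prove the choice is right.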


    Importing Unicode Characters via Connection Files

    The ability to import Unicode via connection files was introduced in EndNote 8.0.2. In that version of EndNote, it became possible to import non-Latin characters from Z39.50 servers which use UTF-8 encoding.

    In EndNote 9, this feature was extended to allow import of non-Latin characters from Z39.50 servers which use a much wider range of encodings. EndNote 9 can also parse references in a range of formats used in library catalogues in non-English speaking countries.
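
    The text encoding setting matters because the same title is stored as different bytes in different encodings. As a rough illustration in Python (the encodings chosen here, UTF-8, Windows-1251 and KOI8-R, are assumptions for the example and are not taken from EndNote's list):

```python
def encode_variants(text, encodings):
    """Encode the same text in several encodings and return the raw bytes for each."""
    return {name: text.encode(name) for name in encodings}


title = "Война и мир"
variants = encode_variants(title, ["utf-8", "cp1251", "koi8_r"])

for name, raw in variants.items():
    print(f"{name:8} {raw.hex()}")

# Reading KOI8-R bytes as if they were Windows-1251 raises no error,
# but produces the wrong Cyrillic letters.
misread = variants["koi8_r"].decode("cp1251")
print(misread)
```

    The text is recovered correctly only when the bytes are decoded with the encoding that produced them, which is why the connection file has to match the encoding used by the server.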

    The new syntax and text encoding options in EndNote 9 can be seen by clicking on Edit|Connection Files|New Connection. Click on Connection Settings. At Record Syntax, click on the down arrow to see the available options. Similarly, at Text, click on the arrow to see the text encodings supported.

    The number of connection files which will import Unicode is still fairly limited. This is because many library catalogues transliterate non-Roman scripts into the Latin alphabet. They may still use Unicode to represent diacritics which are not available in the basic ASCII character set, but when references are retrieved from the Z39.50 server, they are "downgraded" so that they can be represented in a non-Unicode character set. Unicode diacritics will either not show, or they will be rendered as an asterisk or another special character. As the records are not being sent in Unicode, EndNote cannot import them in Unicode.
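
    The effect of this kind of downgrading can be imitated in a few lines of Python. This is only a sketch of the general idea, not the behaviour of any particular Z39.50 server; the function name and the choice of an asterisk as the replacement character are assumptions.

```python
import unicodedata


def downgrade_to_ascii(text):
    """Drop diacritics where possible and replace any other non-ASCII character with '*'."""
    # Split accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Discard the combining marks, so the diacritic simply does not show.
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Anything still outside ASCII has no simpler form and becomes an asterisk.
    return "".join(ch if ord(ch) < 128 else "*" for ch in without_marks)


print(downgrade_to_ascii("Dvořák"))   # Dvorak
print(downgrade_to_ascii("Łódź"))     # *odz
```

    Both outcomes described above appear here: the accents on "Dvořák" disappear, while the "Ł" in "Łódź", which cannot be split into a plain letter and an accent, becomes an asterisk. Once this has happened, the original characters cannot be recovered from the record.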

    However, some Z39.50 servers are now able to transmit references in Unicode or another non-Latin text encoding, and in such cases the EndNote connection files may be able to import the references. Here are some examples of connection files supplied with EndNote 9 which will import references in non-Latin scripts: Hokkaido U (Japanese), Okayama U (Japanese and some Chinese), Moscow State U (Russian), Russian State Library (Russian), TEI of Messolonghi (Greek).

    The connect log will display the retrieved data correctly if the Z39.50 server uses UTF-8 encoding. With other encodings, the data will not display.


    Searching References Containing Unicode Characters

    The Search Library function in EndNote 8 and 9 can search your EndNote library for any Unicode character.


    Sorting References Containing Unicode Characters

    If you are using an EndNote output style which sorts references alphabetically (e.g. an author/date style like Harvard, or a footnote style with a separate bibliography at the end of the document), then you may encounter problems with the sort order.

    It appears that EndNote is using the Unicode Collation Algorithm to sort references. This means that references will be grouped according to the script. So, for example, if you cite both English-language and Russian-language references, the English references will appear first in your bibliography, followed by the Russian references in a separate sequence. This is probably a useful and logical arrangement.

    With alphabetic scripts such as Russian and Greek, the sort order will usually conform to the normal conventions in those languages.

    However, with non-alphabetic scripts, such as Japanese and Chinese, the sort order used by the Unicode Collation Algorithm does not conform to any of the normal sorting conventions used in those languages.
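
    Both effects can be seen in a small Python experiment. Note the assumption: Python's built-in sort compares Unicode code points rather than applying the Unicode Collation Algorithm, so it is only a stand-in, but for this short sample list it shows the same grouping by script and the same mismatch with Chinese conventions.

```python
# Sample author names in three scripts (chosen for illustration only).
references = ["Tolstoy", "Толстой", "Austen", "Достоевский", "王", "李", "张"]

# sorted() orders strings by code point, used here as a stand-in for EndNote's sort.
ordered = sorted(references)
print(ordered)
# ['Austen', 'Tolstoy', 'Достоевский', 'Толстой', '张', '李', '王']

# The Latin names come first, then the Cyrillic names in normal Russian order.
# The Chinese surnames come out as Zhang, Li, Wang, whereas sorting by
# pinyin romanisation would give Li, Wang, Zhang.
```

    If the order of Chinese or Japanese references matters in your bibliography, it is worth checking the formatted output by hand.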


    Formatting References Containing Unicode Characters

    1. Inserting Citations into a Microsoft Word Document

    References containing Unicode characters can be inserted into a Word document in the normal way.

    As explained above, there may be problems with the sort order of the bibliography.

    In EndNote 8 and 9, all reference types have a Translated Author and a Translated Title field. This gives the user the option of storing more than one form of the author and title, and modifying the output style to include that data.

    In English-language publications it is common practice to transliterate non-Western data, but the vernacular form is sometimes added in brackets. This is particularly the case with languages like Chinese and Japanese, where a transliterated syllable may correspond to multiple characters in the vernacular. If the EndNote user inputs transliterated data in the Author and Title fields, and stores the vernacular forms in the Translated Author and Translated Title fields, an EndNote output style can be modified to produce output like this:

    [Figure: Chinese footnotes formatted with EndNote 8]

    For journal articles, the vernacular title of the journal can be stored in the Alternate Journal field.

    A slightly different approach is to translate titles into English and give the original form in brackets, producing output like this:

    [Figure: Sample translated Chinese footnotes in EndNote 8]
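
    The logic of the first approach (transliterated form followed by the vernacular form in brackets) can be sketched in Python. This is not EndNote's output style syntax, which is edited within EndNote itself. The field names follow the EndNote fields mentioned above, while the function, the punctuation and the use of round brackets are assumptions made for illustration.

```python
def format_with_vernacular(record):
    """Build 'Author (vernacular), Title (vernacular)' from an EndNote-like record."""
    author = record["Author"]
    title = record["Title"]
    # Append the vernacular forms only when they have been entered.
    if record.get("Translated Author"):
        author += f" ({record['Translated Author']})"
    if record.get("Translated Title"):
        title += f" ({record['Translated Title']})"
    return f"{author}, {title}"


record = {
    "Author": "Lu Xun",
    "Title": "Kuangren riji",
    "Translated Author": "鲁迅",
    "Translated Title": "狂人日记",
}

print(format_with_vernacular(record))
# Lu Xun (鲁迅), Kuangren riji (狂人日记)
```

    A record with no vernacular forms is formatted without the brackets, which mirrors how an output style can omit an empty field.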

    2. Using the "Copy Formatted" Function

    If you select a reference in your library and use the Copy Formatted command, you can paste the reference into a word processor document.

    3. Using the "Export" Function

    If you select some references in your library and use the Export command to save them to a file, the Unicode characters will be correctly formatted. Exporting in Rich Text Format will preserve text styles such as italic and bold.

    4. Using the "RTF Document Scan" Function

    For those who do not use Microsoft Word, Unicode characters can be formatted correctly in Rich Text Format documents using the RTF Document Scan function.

©2008 The University of Queensland, Brisbane Australia
Authorised by: University Librarian
Maintained by: UQ Library
  Last Updated: 15 December 2008.