<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>Character Encodings</title><meta name="generator" content="DocBook XSL Stylesheets V1.79.1"><link rel="home" href="index.html" title="jEdit 5.6 User's Guide"><link rel="up" href="files.html" title="Chapter 4. Working With Files"><link rel="prev" href="line-separators.html" title="Line Separators"><link rel="next" href="vfs-browser.html" title="The File System Browser (FSB)"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">Character Encodings</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="line-separators.html">Prev</a> </td><th width="60%" align="center">Chapter 4. Working With Files</th><td width="20%" align="right"> <a accesskey="n" href="vfs-browser.html">Next</a></td></tr></table><hr></div><div class="section"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="encodings"></a>Character Encodings</h2></div></div></div><p>A character encoding is a mapping from a set of characters to
their on-disk representation. jEdit can use any encoding supported by
the Java platform.</p><p>Buffers in memory are always stored in <code class="literal">UTF-16</code>
encoding, which represents each character as one or two 16-bit code
units, each an integer between 0 and 65535. <code class="literal">UTF-16</code> is the native encoding used by
Java, and covers the full Unicode range, so it can represent text in
virtually any language.</p><p>When a buffer is loaded, it is converted from its on-disk
representation to <code class="literal">UTF-16</code> using a specified
encoding.</p><p>The default encoding, used to load files for which no other
encoding is specified, can be set in the
<span class="guibutton"><strong>Encodings</strong></span> pane of the
<span class="guimenu"><strong>Utilities</strong></span>>
<span class="guimenuitem"><strong>Options</strong></span>
dialog box; see <a class="xref" href="global-opts.html#encodings-pane" title="The Encodings Pane">the section called “The Encodings Pane”</a>.
Unless you change this setting, it will be your operating system's
native encoding, for example <code class="literal">MacRoman</code> on Mac OS,
<code class="literal">windows-1252</code> on Windows, and
<code class="literal">ISO-8859-1</code> on Unix.</p><p>An encoding can be explicitly set when opening a file in the file
system browser's
<span class="guimenu"><strong>Commands</strong></span>><span class="guisubmenu"><strong>Encoding</strong></span>
menu.</p><p>Note that there is no general way to auto-detect the encoding used
by a file. However, jEdit supports "encoding detectors"; some are
provided in the core, and others may be provided by plugins
through the services API. From the encodings option pane (see
<a class="xref" href="global-opts.html#encodings-pane" title="The Encodings Pane">the section called “The Encodings Pane”</a>), you can customize which
ones are used and the order in which they are tried. Here are some of the
encoding detectors recognized by jEdit: </p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p> <span class="bold"><strong>BOM</strong></span>: <code class="literal">UTF-16</code> and <code class="literal">UTF-8Y</code>
files are auto-detected, because they begin with a fixed byte
sequence, the byte order mark (BOM). Note that plain UTF-8 does not mandate a
specific header, and thus cannot be auto-detected, unless the
file in question is an XML file.</p></li><li class="listitem"><p> <span class="bold"><strong>XML-PI</strong></span>:
Encodings used in XML files with an XML PI like the
following are auto-detected:</p><pre class="programlisting">&lt;?xml version="1.0" encoding="UTF-8"?&gt;</pre></li><li class="listitem"><p> <span class="bold"><strong>html</strong></span>:
Encodings specified in HTML files with a <code class="literal">content=</code> attribute in a <code class="literal">meta</code> element may be auto-detected:</p><pre class="programlisting">&lt;html&gt;&lt;head&gt;&lt;meta http-equiv="Content-Type" content="text/html; charset=utf-8"&gt; </pre></li><li class="listitem"><p> <span class="bold"><strong>python</strong></span>:
Python has its own way of specifying encoding at the top of
a file.</p><pre class="programlisting"># -*- coding: utf-8 -*- </pre></li><li class="listitem"><p> <span class="bold"><strong>buffer-local-property</strong></span>:
Buffer-local property syntax
(see <a class="xref" href="buffer-local.html" title="Buffer-Local Properties">the section called “Buffer-Local Properties”</a>)
can be used at the top of the file to specify the encoding. </p><pre class="programlisting"># :encoding=ISO-8859-1:
</pre></li></ul></div><p>The encoding that will be used to save the current buffer is shown
in the status bar, and can be changed in the
<span class="guimenu"><strong>Utilities</strong></span>><span class="guimenuitem"><strong>Buffer
Options</strong></span> dialog box. Note that changing this setting has no
effect on the buffer's contents; if you opened a file with the wrong
encoding and got garbage, you will need to reload it.
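</p><p>To illustrate why the buffer has to be reloaded rather than reinterpreted,
the following minimal Java sketch decodes the same on-disk bytes once with
the wrong encoding and once with the right one. It is not part of jEdit; the
class name <code class="literal">WrongEncodingDemo</code> and its
<code class="literal">decode</code> helper are hypothetical names chosen for
this example. Only decoding the original bytes again, which is roughly what a
reload does, recovers the text.</p><pre class="programlisting">import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class WrongEncodingDemo {
    // Hypothetical helper: decode raw bytes using the named charset,
    // similar in spirit to what an editor does when it loads a file.
    static String decode(byte[] onDisk, String charsetName) {
        return new String(onDisk, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // The word "cafe" with an acute accent on the e, as stored on disk
        // in UTF-8. The accented letter occupies two bytes.
        byte[] onDisk = "caf\u00e9".getBytes(StandardCharsets.UTF_8);

        // Wrong encoding: each of the two bytes becomes its own character.
        String garbled = decode(onDisk, "ISO-8859-1");

        // Right encoding: the same bytes yield the original four characters.
        String correct = decode(onDisk, "UTF-8");

        System.out.println(garbled.length()); // prints 5
        System.out.println(correct.length()); // prints 4
    }
}
</pre><p>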
<span class="guimenu"><strong>File</strong></span>><span class="guimenuitem"><strong>Reload with
Encoding</strong></span> is an easy way to reload the buffer with the correct encoding.</p><p>If a file is opened without an explicit encoding specified and it
appears in the recent file list, jEdit will use the encoding last used
when working with that file; otherwise the default encoding will be
used.</p><div class="section"><div class="titlepage"><div><div><h3 class="title"><a name="d0e1850"></a>Commonly Used Encodings</h3></div></div></div><p>While the world is slowly converging on UTF-8 and UTF-16
encodings for storing text, a wide range of older encodings are
still in widespread use and Java supports most of them.</p><p>The simplest character encoding still in use is ASCII, or
<span class="quote">“<span class="quote">American Standard Code for Information Interchange</span>”</span>.
ASCII encodes Latin letters used in English, in addition to numbers
and a range of punctuation characters. Each ASCII character is encoded
in 7 bits, so there are only 128 distinct characters, which makes
it unsuitable for anything other than English text. jEdit will load
and save files as ASCII if the <code class="literal">US-ASCII</code> encoding
is used.</p><p>Because ASCII is unsuitable for international use, most
operating systems use an 8-bit extension of ASCII, with the first
128 values mapped to the ASCII characters, and the rest used to
encode accents, umlauts, and various less commonly used
typographical marks. The three major operating systems all extend
ASCII in a different way. Files written by Macintosh programs can be
read using the <code class="literal">MacRoman</code> encoding; Windows text
files are usually stored as <code class="literal">windows-1252</code>. In the
Unix world, the <code class="literal">8859_1</code> character encoding has
found widespread usage.</p><p>On Windows, various other encodings, referred to as
<em class="firstterm">code pages</em> and identified by number, are used
to store non-English text. The corresponding Java encoding name is
<code class="literal">windows-</code> followed by the code page number, for
example <code class="literal">windows-1251</code>.</p><p>Many common cross-platform international character sets are
also supported: <code class="literal">KOI8_R</code> for Russian text,
<code class="literal">Big5</code> and <code class="literal">GBK</code> for Chinese, and
<code class="literal">SJIS</code> for Japanese.</p></div></div><div class="navfooter"><hr><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="line-separators.html">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="files.html">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="vfs-browser.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">Line Separators </td><td width="20%" align="center"><a accesskey="h" href="index.html">Home</a></td><td width="40%" align="right" valign="top"> The File System Browser (FSB)</td></tr></table></div></body></html>