PHP Tutorials - Tutorial Notes - Non ASCII Characters in HTML Documents

 31 December 18:00   

    



    



    

This chapter explains:

  • Basic Rules
  • French Characters in HTML Documents - UTF-8 Encoding
  • French Characters in HTML Documents - ISO-8859-1 Encoding
  • Chinese Characters in HTML Documents - UTF-8 Encoding
  • Chinese Characters in HTML Documents - GB2312 Encoding
  • Characters of Assorted Languages in HTML Documents



    



    

Basic Rules

    

As you saw in the previous chapters, a Web based application always
delivers information to the user interface as an HTML document. The application
can either take a static HTML document from the file system, or generate
a dynamic HTML document from a PHP script.

Let's concentrate on how to handle non ASCII characters in static HTML documents first.
Here are the steps and technologies involved in entering an HTML document and delivering
it to the user interface:

    

 

    

H1. Key sequences from keyboard
    |
    |- Text editor
    v
H2. HTML document
    |
    |- Web server
    v
H3. HTTP response
    |
    |- Internet TCP/IP connection
    v
H4. HTTP response
    |
    |- Web browser
    v
H5. Visual characters on the screen

    



    

Based on my experience, there are some basic rules related to those steps:

    

1. You must first decide on the character encoding schema to be used in the HTML document.
For most languages, you have two options: (a) use an encoding schema specific to that language;
(b) use a Unicode schema. For example, you can use either GB2312 (a simplified Chinese encoding schema)
or UTF-8 (a Unicode encoding schema) for Chinese characters. My recommendation used to be "a". But today,
I am suggesting "b", because Unicode can support all characters of all languages.
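To make the two options in rule 1 concrete, here is a minimal sketch that shows the same Chinese character stored under each schema. The bytes are written as hex escapes so the result does not depend on how the script file itself is saved. The helper name hexBytes() is hypothetical and used only for this illustration.

```php
<?php
// Hypothetical helper: render a byte string as space-separated hex pairs.
function hexBytes(string $bytes): string
{
    return strtoupper(implode(' ', str_split(bin2hex($bytes), 2)));
}

// The Chinese character U+4E2D, stored under each option from rule 1.
$gb2312 = "\xD6\xD0";      // option "a": GB2312 uses 2 bytes
$utf8   = "\xE4\xB8\xAD";  // option "b": UTF-8 uses 3 bytes

echo "GB2312: ", hexBytes($gb2312), "\n";
echo "UTF-8 : ", hexBytes($utf8), "\n";

// With UTF-8, one string can carry characters of several languages,
// here the Chinese character followed by the French letter U+00E9.
$mixed = $utf8 . "\xC3\xA9";
echo "Mixed : ", hexBytes($mixed), "\n";
```

The same character produces different byte sequences, which is why the schema has to be chosen before anything else in the chain is set up.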

    

2. PHP is convenient for this task. Its string data type is defined as a sequence of bytes,
as in the C language. This differs from the Java language, where a string is defined as a sequence of
Unicode characters. String literals in PHP can hold any sequence of bytes. Therefore you can enter
non ASCII characters as PHP string literals in any encoding schema.
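The byte-oriented behavior described in rule 2 can be observed with a short sketch like the one below. It assumes nothing beyond core PHP functions; the variable names are chosen only for this illustration.

```php
<?php
// The French letter U+00E9 as raw bytes in two encoding schemas.
$utf8   = "\xC3\xA9";  // UTF-8: 2 bytes
$latin1 = "\xE9";      // ISO-8859-1: 1 byte

// strlen() counts bytes, not characters.
echo strlen($utf8), "\n";
echo strlen($latin1), "\n";

// Indexing a string returns one byte, which may be only part of a character.
echo bin2hex($utf8[0]), "\n";

// Concatenation simply joins the byte sequences.
$word = "caf" . $utf8;
echo strlen($word), "\n";
```

Because PHP does not interpret the bytes, the script itself never needs to know which schema was chosen. Only the editor and the browser do.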

    

3. From step "H1" to "H2", you need to select a good text editor that supports the encoding schema you have selected.
The end goal of this step is simple: characters in the HTML document must be stored in a file using the
selected encoding schema. Don't underestimate the difficulty of this step. It can be very frustrating,
because most computer keyboards support alphabetic letters only. You may have to use some language specific
input software to translate alphabetic letters into language specific characters. The editor may
also store characters in memory in one encoding schema, and offer you different encoding schemas when saving
files to the hard disk.
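Since an editor may save a file in a schema other than the one you expect, it can help to inspect the saved bytes. The following is only a rough sketch, assuming the goal is to tell UTF-8 apart from anything else. The helper name describeEncoding() and the file name page.html are hypothetical.

```php
<?php
// Hypothetical helper: give a rough description of how a byte string is encoded.
function describeEncoding(string $bytes): string
{
    // Some editors prepend a byte order mark when saving as UTF-8.
    if (substr($bytes, 0, 3) === "\xEF\xBB\xBF") {
        return 'UTF-8 with BOM';
    }
    // No byte above 0x7F means plain ASCII, which is valid in most schemas.
    if (preg_match('/[\x80-\xFF]/', $bytes) === 0) {
        return 'ASCII only';
    }
    // With the "u" modifier, preg_match() fails on invalid UTF-8 input.
    if (preg_match('//u', $bytes) === 1) {
        return 'valid UTF-8';
    }
    return 'not UTF-8';
}

echo describeEncoding("caf\xC3\xA9"), "\n";  // UTF-8 bytes
echo describeEncoding("caf\xE9"), "\n";      // ISO-8859-1 bytes

// For a saved file, pass its content instead, for example:
// echo describeEncoding(file_get_contents('page.html'));
```

A result of "not UTF-8" does not say which language specific schema was used. It only confirms that the editor did not save the file as UTF-8.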

    

4. The step from "H3" to "H4" is handled between the Web server and the Web browser. The HTTP response is
transmitted as is to the browser. The characters in the HTML document included in the HTTP response are
also preserved as is.

    

5. From step "H4" to "H5", the browser opens the received HTML document and displays the encoded characters
as written characters of the specific language. To do this, the browser needs your help. The first help
is to specify the character encoding name, "charset", used in the HTML document in a <meta> tag.
The second help is to make sure the browser can access a character font file designed for the specified encoding schema.
If no character encoding name is specified in the <meta> tag, some browsers will try to detect the
encoding schema based on the HTML document content. If not successful, browsers will use a default encoding
schema. For example, Internet Explorer (IE) uses "Western European" as the default encoding schema.
"Western European" seems to be related to the "ISO-8859-1" standard.

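As an illustration of the first help in rule 5, the sketch below builds a minimal HTML document that declares its "charset" in a <meta> tag. The http-equiv form shown is one common way to write the tag. The helper name buildHtmlPage() is hypothetical, and the body text is assumed to be encoded in the declared schema already.

```php
<?php
// Hypothetical helper: wrap body text in a minimal HTML document that
// declares its character encoding name in a <meta> tag.
function buildHtmlPage(string $charset, string $body): string
{
    return "<html>\n<head>\n"
        . "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=$charset\">\n"
        . "</head>\n<body>\n$body\n</body>\n</html>\n";
}

// "Caf" followed by U+00E9 as UTF-8 bytes. The declared charset must
// match the bytes actually present in the document.
$page = buildHtmlPage('UTF-8', "Caf\xC3\xA9");
echo $page;
```

If the declared name and the actual bytes disagree, the browser will decode the document with the wrong schema and show incorrect characters.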
    

(Continued on next part...)

    



    



    

 

