Chinese font embedding in the PDF file format: current status and problems (Part 2)

V. Asian font embedding status

First, notes on font encoding:

Embedding Asian fonts in PDF files requires more software support and more sophisticated technology than embedding English fonts. Before going further, we must explain how Asian fonts differ from Western fonts and how they differ across platforms. Although Chinese characters are used throughout the Asia-Pacific region, each country or region uses a different character encoding.

(a) Taiwan

1. Big5: commonly known as the "Big Five" code

2. Big5+: an extended Big5 encoding

3. CNS 11643: Chinese Standard Interchange Code

(b) Mainland China

1. GB 2312-80: commonly known as the national standard (GB) code

2. GBK: Chinese Internal Code Extension Specification

(c) Japan

1. Shift JIS

2. EUC-JIS

(d) South Korea

1. KS C 5601

(e) Others:
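
To illustrate the point that each region uses a different code, the following minimal sketch encodes one Chinese character with several of the encodings listed above. The codec names (`big5`, `gb2312`, `gbk`, `shift_jis`, `euc_jp`, `euc_kr`) are those of Python's standard library and are an assumption of this example, not something named in the original text; `euc_kr` stands in here for the usual byte encoding of KS C 5601.

```python
# Encode the same character under several regional encodings.
# The codec names are Python standard-library names (an assumption of this sketch).
CODECS = ["big5", "gb2312", "gbk", "shift_jis", "euc_jp", "euc_kr"]


def encode_everywhere(char):
    """Return {codec name: encoded bytes} for a single character."""
    return {name: char.encode(name) for name in CODECS}


if __name__ == "__main__":
    # Print the hexadecimal byte values of one character in each encoding.
    for name, raw in encode_everywhere("中").items():
        print(f"{name:10s} {raw.hex().upper()}")
```

The same character yields a different byte pair under Big5 than under GB 2312, which is why text produced under one regional encoding is unreadable when interpreted under another.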

Second, coding principles:

1. Characters and their forms are selected according to the "Guozi Standard Font Table".

2. Each Chinese character code is 2 bytes long and is written in hexadecimal digits.

3. The code complies with the communication rules of CNS 5205 and CNS 7654.

4. Characters are arranged into separate sets according to how frequently they are used.

5. Within each set, character codes are assigned in stroke order.
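
Assuming the principles above describe Big5 (the original does not name the encoding explicitly), principles 2 and 4 can be sketched as follows. The code ranges used for the frequently used and less frequently used sets (0xA440 to 0xC67E and 0xC940 to 0xF9D5) are the commonly published Big5 layout rather than figures from this article, and the function names are hypothetical.

```python
# Sketch of principles 2 and 4, assuming they refer to Big5.
# The ranges below are the commonly published Big5 layout (an assumption here).
FREQUENT = (0xA440, 0xC67E)       # frequently used characters
LESS_FREQUENT = (0xC940, 0xF9D5)  # less frequently used characters


def big5_code(char):
    """Return the 2-byte Big5 code of a character as an integer."""
    raw = char.encode("big5")
    if len(raw) != 2:
        raise ValueError("not a double-byte Big5 character")
    return int.from_bytes(raw, "big")


def big5_set(code):
    """Classify a Big5 code by the frequency-based set it falls in.

    This is a plain range check; it does not validate the second byte.
    """
    if FREQUENT[0] <= code <= FREQUENT[1]:
        return "frequent"
    if LESS_FREQUENT[0] <= code <= LESS_FREQUENT[1]:
        return "less frequent"
    return "other"


if __name__ == "__main__":
    code = big5_code("中")
    # Show the code in hexadecimal, as principle 2 describes.
    print(f"{code:04X}", big5_set(code))
```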

Third, the relationship between CID fonts, TrueType fonts, and PostScript fonts:

CID is short for Character ID. The character ID serves as an index for looking up and retrieving characters, which greatly improves efficiency. This method is best suited to large character sets such as the East Asian double-byte languages: Chinese, Japanese, Korean, and so on. Acrobat does not require CID fonts; TrueType fonts can also be used. However, if you use TrueType fonts and choose to convert them to Type 1 in the printer driver's font options, characters can no longer be added to the Chinese text embedded in the PDF. If you choose to convert the font to Type 42, the Chinese text embedded in the PDF can be modified, added to, or deleted (the same font must be used when making the changes). Chinese PostScript fonts cannot currently be used in the PDF file format.

The biggest advantage of embedding fonts is that it solves the problem of missing or wrongly mapped characters between the party that produces the file and the party that outputs it. In the current PostScript workflow, character codes correspond only when both sides use the same font series from the same font company. Font embedding solves this problem once and for all.
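
The lookup role of the character ID can be sketched as a table that maps ranges of character codes to consecutive CIDs, in the spirit of the range entries in a CMap. This is a toy illustration only: the ranges and CID numbers below are invented and do not come from any real font or CMap, and the function name is hypothetical.

```python
# Toy code-to-CID lookup. The ranges and CIDs are hypothetical,
# invented for illustration; real CMaps are far larger.
import bisect

# (first code, last code, CID of first code), sorted by first code
CID_RANGES = [
    (0x0020, 0x007E, 1),
    (0xA440, 0xA47E, 595),
    (0xA4A1, 0xA4FE, 658),
]
_STARTS = [first for first, _, _ in CID_RANGES]


def code_to_cid(code):
    """Return the CID for a character code, or 0 if the code is unmapped."""
    # Find the last range whose first code is <= the requested code.
    i = bisect.bisect_right(_STARTS, code) - 1
    if i >= 0:
        first, last, cid = CID_RANGES[i]
        if code <= last:
            return cid + (code - first)
    return 0
```

Because each range stores only its endpoints and a starting CID, thousands of characters can be covered by a short table, which is one reason the approach suits large character sets.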

Fourth, Acrobat Reader can display PDF files containing CJK text:

There are two ways to display CJK text. In the first, the author embeds all the fonts used in the file, including CJK fonts, when creating the PDF file, provided the fonts can be embedded. Any language version of Acrobat Reader can display PDF files with embedded CJK fonts. However, because PDF files with embedded fonts may take up too much space, the author may choose not to embed every font used in the file. This is the second way: to view such a PDF file in Acrobat or Acrobat Reader, the user must have the correct Asian font set installed.
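
The two cases can be sketched as a check on a font's descriptor. The sketch assumes the descriptor has already been parsed into a plain Python dictionary (no particular PDF library is implied), and it relies on the `FontFile`, `FontFile2`, and `FontFile3` keys that the PDF specification uses for embedded font programs. The function names and the sample font names are hypothetical.

```python
# Decide how a font can be displayed from its (already parsed) font descriptor.
# Assumption: the descriptor is a plain dict keyed by PDF names without the slash.
EMBEDDED_KEYS = ("FontFile", "FontFile2", "FontFile3")


def is_embedded(descriptor):
    """True if the descriptor carries an embedded font program."""
    return any(key in descriptor for key in EMBEDDED_KEYS)


def viewing_requirement(descriptor):
    """Describe what the reader needs in order to display text in this font."""
    if is_embedded(descriptor):
        return "displayable in any language version of the reader"
    return "reader needs the matching Asian font set installed"


if __name__ == "__main__":
    embedded = {"FontName": "SampleMing", "FontFile2": b"..."}  # hypothetical
    referenced = {"FontName": "SampleMing"}                     # hypothetical
    print(viewing_requirement(embedded))
    print(viewing_requirement(referenced))
```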

Fifth, fonts currently included in the Asian font sets:

There are currently four Asian font sets (Traditional Chinese, Simplified Chinese, Japanese, and Korean), each of which includes serif and sans-serif fonts. As for the types of Asian fonts that can be embedded in PDF files, TrueType fonts can be embedded on the Windows platform, and Adobe PostScript fonts in the CID format can be embedded on both the Macintosh and Windows platforms. Older OCF-format PostScript fonts cannot be embedded. In addition, the fonts in the file must permit embedding (see Notes).

How much embedding fonts changes the size of a PDF file depends on the number of fonts and characters the file contains. In general, each embedded C, J, or K font adds 2 MB to 3 MB to a typical file.

The "MakeCID" utility in the Macintosh version of Acrobat 1.0 can convert TrueType fonts or older OCF-format PostScript fonts into "Width-Only" CID fonts. These CID fonts contain only the width information for the Roman characters used in the TrueType or OCF fonts, which Distiller needs when creating PDF files that reference the native TrueType or OCF fonts. For more information, see the files in the MakeCID utility folder. Width-Only CID fonts come in handy only if you need to create a PDF file containing CJK text on a Roman-language operating system.

On a Roman-language operating system, the PDF Writer in Acrobat cannot create a PDF file containing CJK text. However, Distiller can create one as long as it has access to the fonts referenced in the PostScript file it is processing. If the PostScript file contains embedded TrueType fonts, the PostScript file can be created on any platform. In addition, Distiller ships with Width-Only CID fonts for all CJK fonts in the current Adobe Type Library, as well as for the TrueType fonts common on Macintosh and Windows systems. With this font information, Distiller can successfully process PostScript files created on any platform. If the PostScript file uses other fonts that need to be converted, you can use the "MakeCID" utility to create Width-Only CID fonts for them.
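
The size figure quoted above (2 MB to 3 MB per embedded C, J, or K font) gives a quick way to estimate the size of a finished file. This is only back-of-the-envelope arithmetic under that rule of thumb; the function name and the sample base size are hypothetical.

```python
# Rough size estimate using the 2 MB to 3 MB per embedded CJK font rule of thumb.
def estimate_size_mb(base_mb, cjk_fonts, per_font_mb=(2.0, 3.0)):
    """Return (low, high) estimated file size in MB.

    base_mb   -- size of the PDF without embedded CJK fonts
    cjk_fonts -- number of embedded C, J, or K fonts
    """
    if cjk_fonts < 0:
        raise ValueError("cjk_fonts must not be negative")
    low, high = per_font_mb
    return base_mb + cjk_fonts * low, base_mb + cjk_fonts * high


if __name__ == "__main__":
    # Hypothetical 0.5 MB document that embeds two CJK fonts.
    print(estimate_size_mb(0.5, 2))
```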

VI. Status of Traditional Chinese font embedding and its problems

Currently, the only fonts that support embedding a Chinese character set in PDF files are Arphic's CID ATM fonts on the Mac and Arphic's CID PostScript fonts on the PC. Two problems remain. The first is file size. The second is that Internet users generally do not have the relevant viewing software installed on their personal computers. For Asian fonts in particular, users need not only the viewing software but also the Asian font set in order to open PDF files that use Asian fonts. In addition, outputting from PDF files has not yet been widely adopted in the printing industry.

VII. Future solutions

PDF is a very promising file format. All kinds of graphic and text files can be converted to PDF, and in the future the Microsoft, Corel, and Adobe product lines alike will support producing, modifying, editing, and outputting PDF files. For now, however, only the Adobe Acrobat series offers complete functionality. Other software is not yet fully capable, and the range of related applications is too fragmented, so software companies still need to develop more integrated applications. There are only two approaches to the file size problem. The stopgap is to compress PDF files at a higher ratio to ease transmission. The permanent solution is to build higher-bandwidth networks so that large amounts of information can be transmitted over the Internet.

VIII. Conclusion

Acrobat's PDF is not the only file format of its kind. For example, Wahang's DynaDoc file format is also a type of portable document. Its general features are similar to those of Acrobat's PDF file format, differing only slightly in functionality. The purpose, the way files are produced, and the way they are browsed are essentially the same.

PDF files still have considerable room for development and application. E-newsletters are a good example. Most e-newsletters consist of large amounts of text, with few pictures and no lively layout, so readers face a full page of stiff text and lose the will to read. If an e-newsletter were laid out like a regular newspaper and presented to readers as a PDF file, the lively layout and pictures would likely increase their willingness to read. In addition, the hyperlink function within a PDF file lets the reader jump directly from a headline to the text of the article, saving the time of browsing page by page. Of course, this idea must wait until users are accustomed to reading files with Acrobat Reader and the problems described above are solved before it can enter the experimental and application phases. It would be another new outlet for online publishing.

IX. Notes

[Note 1] Acrobat 4.0 Electronic Document New Century, pp. 59-60

[Note 2] The picture is taken from the Wending Company website.

[Note 3] Arphic fonts currently support embedding with Distiller; Arphic allows users to embed its fonts as long as they are legally licensed originals.

[Note 4] When Distiller is used to embed the characters, the text cannot be searched or copied because the text character information is lost.

Glossary

(1) PostScript: PostScript is a page description technology published by Adobe in the United States in 1985. Adobe also used this technology to create well-known fonts that conform to PostScript, which transformed the entire printing industry. PostScript can accurately describe any text and graphics on a page. Today PostScript is very widely used in the printing field, across output devices including displays, laser printers, imagesetters, and digital printing presses. The most important part of PostScript technology is the PostScript font. Through PostScript, users can adjust parameters to change the font size and apply special effects such as shadow, 3D, outline, and weight. Because of PostScript's excellent printing performance, most of the world's major documents are currently stored in PostScript form. In April 1997 Adobe released the more advanced PostScript 3, which emphasizes improved quality and color capabilities as well as Internet printing functions.

(2) TrueType: The TrueType font format was jointly developed by Apple and Microsoft in the United States. It was first used in Apple's Macintosh series and in Microsoft Windows 3.1. At present, Apple's OS 8.0 and Microsoft Windows 95/NT also use TrueType as their font format. Basically, TrueType and PostScript both use Bezier curves.
