Synopse PDF engine 1.7.2


By AB4327-GANDI, 2010-05-06.

The Unicode part of the Synopse PDF engine has been updated in order to support right-to-left languages and ligatures (i.e. glyph shaping).

Therefore, our Open Source engine is one of the few PDF producers able to natively handle Arabic languages. Even most commercial engines don't implement this nice feature.

"Glyph Shaping" is needed e.g. for the Arabic language, because each character changes its shape depending on its position in the word.

The Synopse PDF engine can now optionally call the Windows Uniscribe API to handle the Ordering (e.g. the text direction) and the Shaping of the text. The USE_UNISCRIBE conditional must be defined, and the UseUniscribe property of the TPdfDocument class must be set to true.

The UseUniscribe property can be temporarily set to TRUE when using the Canvas property, then back to FALSE when the Uniscribe API is no longer needed. The Windows Uniscribe library can be slower, which is why it is disabled by default.

Note that the UseUniscribe property must be set appropriately before content generation if you use any TPdfDocumentGDI.VCLCanvas text output with such scripts (since the PDF rendering is done only once, just before saving, e.g. in the SaveToFile() or SaveToStream() method calls).
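
As a minimal sketch of the sequence described above (not taken from the engine's own samples), the following program assumes that SynPdf was compiled with the USE_UNISCRIBE conditional, that the Arial font installed on the system contains the Arabic glyphs, and that a Unicode-aware Delphi version is used so that the literal is stored correctly. The output file name, the coordinates and the sample text are placeholders.

```pascal
program UniscribeSketch;
{$APPTYPE CONSOLE}
uses
  Windows, Graphics, SynPdf;
var
  doc: TPdfDocumentGDI;
  txt: WideString;
begin
  doc := TPdfDocumentGDI.Create;
  try
    // must be set before any VCLCanvas text output is recorded
    doc.UseUniscribe := true;
    doc.AddPage;
    // assumption: this font contains every glyph of the text (no font fallback)
    doc.VCLCanvas.Font.Name := 'Arial';
    doc.VCLCanvas.Font.Size := 12;
    txt := 'مكتبة SynPDF بدعم اللغة العربية';
    // plain GDI call on the canvas handle; coordinates are canvas pixels
    Windows.TextOutW(doc.VCLCanvas.Handle, 100, 100, PWideChar(txt), Length(txt));
    // the recorded canvas content is converted to PDF at saving time
    doc.SaveToFile('c:\test-arabic.pdf');
  finally
    doc.Free;
  end;
end.
```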

Our Synopse PDF engine doesn't handle Font Fallback yet: the font you use must contain ALL glyphs necessary for the supplied Unicode text. Squares or blanks will be drawn for any missing glyph/character.

The Windows Uniscribe API works very well (it is used internally by Windows for drawing complex text) and is reasonably well documented on MSDN, but there is very little sample code available. So writing this code was not so easy... but I hope it will be useful for some users! Thanks to Mohammed Nasman for your comments and the link to the MSDN site!

You can download the updated full source code of this unit, with the other needed Synopse units, from synpdf.zip, licensed under an MPL/GPL/LGPL tri-license.

The Unicode sample code has been updated also, and can be downloaded from here.

Here is the resulting PDF content. Feedback is welcome!

37 reactions

1 From Mohammed Nasman - 06/05/2010, 17:51

You are amazing Arnaud :-).

I was writing an email about the Delphi header file of usp10, but I saw you released a newer version with Uniscribe support.

Thank you very much Arnaud for your great work.
2 From Mitch - 07/05/2010, 11:19

Bonjour Arnaud,

unfortunately that's about it for my French. I was looking at your
SynopsePDF code and it looks tres bien but as far as I can tell it
appears to be limited to 72dpi. Am I missing something or is that the only output option?

Merci,
Mitch
3 From A.Bouchez - 07/05/2010, 11:26

What do you call "resolution"? Such a concept doesn't exist in PDF, since all coordinates are floating point values expressed in logical "units", so the resolution depends on the number of decimals used (2 decimals for the current version of the engine, corresponding to a 7200 dpi resolution, which is big enough IMHO for printing).

If you use direct TPdfCanvas methods, all coordinates are declared as single type. So there is no resolution issue here.

If you use a VCLCanvas, the resolution of the canvas is, by default, the screen resolution, because the temporary HDC used during PDF creation is the Desktop HDC, i.e. GetDC(0). I think this is OK for common usage. You can hack the code if you really want to modify this.

If you want to draw a specified TMetaFile, you have a Scale parameter (1.0 by default), which can be used to fit the resolution used for the TMetaFile creation, with some code like Scale := Screen.PixelsPerInch/fPrinterPxPerInch.x to guess the scaling factor.
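
The arithmetic behind these figures can be checked with a small stand-alone program. It is only a sketch: it relies on the default PDF user space unit being 1/72 inch, and the two dpi values used for the scaling factor are hypothetical examples, not values read from a real screen or printer.

```pascal
program ResolutionSketch;
{$APPTYPE CONSOLE}
const
  UnitsPerInch = 72;  // default PDF user space: 1 unit = 1/72 inch
  Decimals = 2;       // decimals written per coordinate, as stated above
  ScreenDpi = 96;     // hypothetical value of Screen.PixelsPerInch
  PrinterDpi = 600;   // hypothetical value of fPrinterPxPerInch.x
var
  i, stepsPerUnit: Integer;
  scale: Double;
begin
  // number of distinct positions inside one logical unit: 10^Decimals
  stepsPerUnit := 1;
  for i := 1 to Decimals do
    stepsPerUnit := stepsPerUnit * 10;
  // 72 units per inch * 100 positions per unit = 7200 positions per inch
  Writeln('Positions per inch: ', UnitsPerInch * stepsPerUnit);
  // scaling factor for a metafile recorded at printer resolution: 96 / 600
  scale := ScreenDpi / PrinterDpi;
  Writeln('Metafile scale: ', scale:0:2);
end.
```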
4 From d.carstensen - 07/05/2010, 14:56

Hi,

Acrobat still asks for saving testunicode.pdf, so there must still be a syntax error in the document. I can't see the error, obj offsets are OK, maybe trailer root count or 20 bytes of xref entries?
Preflight tool doesn't report problems any longer.

Best regards
Dirk Carstensen
5 From A. Bouchez - 08/05/2010, 12:11

Can you see any difference of content after Acrobat saved the file? Perhaps the whole content changed.

By the way, I've updated the PDF engine, with two issues now fixed. Thanks for your feedback! The download link above didn't change; only the zip content was updated to version 1.7.3.
6 From greymont - 08/05/2010, 15:33

Many, many thanks for your work, Arnaud!

I just discovered Synopse PDF Engine and I am creating a test PDF file over the weekend. If you do not mind, I'd like to ask for some help.

I have filled a page with numbers just to explore what I can do. Now I want to highlight a line of text.

I was going to issue a MoveTextPoint followed by a ShowText('xxx',True) and then a TextRect call. I want the next line of text to be enclosed in a rectangle filled with a background color. But I don't know what to use for rectangle coordinates.

Is there a method for obtaining the "current" cursor position after a MoveTextPoint call?
7 From A. Bouchez - 08/05/2010, 16:31

There is no storing of the MoveTextPoint() position yet.

About filling background text, see the TPdfEnum.TextOut() method and how it implements it.
You can also use VCLCanvas and draw your highlighted text just as usual.
8 From greymont - 08/05/2010, 17:04

Thanks, Arnaud.

I will continue experimenting with it.
9 From d.carstensen - 09/05/2010, 21:32

Hi Arnaud,
comparing the Acrobat saved PDF to the original file is not a proper way because Acrobat saves either with incremental update, linearization or compressed streams (PDF 1.6).
The bug must be in the object structure and not in the PostScript content. Otherwise you would get an error message.
I'll try your update and give you some feedback.
Best regards
Dirk Carstensen
10 From greymont - 10/05/2010, 01:12

One observation: The declaration
/// a PDF coordinates rectangle
TPdfRect = record
Left, Top, Right, Bottom: Single;
end;

appears to me to suggest the four values are all relative to the overall coordinate system the rectangle is in. For example, 10,10,15,15 would describe a square that is 5 units wide and 5 units high.

However, my testing reveals that the Right and Bottom values are actually relative to the Left and Top positions. That would make them more Width and Height than Right and Bottom.

Is this a correct interpretation?
11 From A.Bouchez - 10/05/2010, 09:46

Yes, the TPdfRect Right/Bottom parameters are sometimes used as Width/Height, depending on the context.

Perhaps it would make sense to add another record type for Width/Height parameters.
12 From A.Bouchez - 11/05/2010, 13:59

The Right to Left handling should be performed by the Uniscribe API.
Here is how it works (see TPdfWrite.AddUnicodeHexTextUniScribe method implementation for details, and http://msdn.microsoft.com/en-us/lib...).aspx for reference):
- you provide the engine with some Unicode text
- the Uniscribe API splits this text into several items, each one with its own language
- the Uniscribe API calculates the order to be used for screen rendering
- the Uniscribe API is used to calculate the corresponding glyphs, which are then appended to the PDF file

Note that the PDFString oriented methods (like MeasureText, TextWidth, TextRect...) and even the UnicodeTextWidth() method don't use Uniscribe, so they won't work as expected and should be avoided if you need Uniscribe.

In practice, the TPdfCanvas won't help you with right-to-left writing. But since TextOutW() and similar calls also use Uniscribe internally, you had better use the VCLCanvas and GDI calls to make your page layout (like measuring the text and then aligning it to the right margin), and let the conversion be done by TPdfDocumentGDI.
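
The layout approach suggested here (measure with GDI, then align to the right margin) could look like the following sketch. It assumes that SynPdf was compiled with USE_UNISCRIBE, that VCLCanvasSize is a TSize expressed in VCLCanvas pixels, and that a Unicode-aware Delphi version is used; the margin, the vertical position and the file name are hypothetical.

```pascal
program RightAlignSketch;
{$APPTYPE CONSOLE}
uses
  Windows, Graphics, SynPdf;
const
  Margin = 40; // hypothetical right margin, in VCLCanvas pixels
var
  doc: TPdfDocumentGDI;
  txt: WideString;
  ext: TSize;
begin
  doc := TPdfDocumentGDI.Create;
  try
    doc.UseUniscribe := true;
    doc.AddPage;
    doc.VCLCanvas.Font.Name := 'Arial';
    txt := 'مكتبة SynPDF بدعم اللغة العربية';
    // measure the text with GDI instead of the PDFString oriented methods
    Windows.GetTextExtentPoint32W(doc.VCLCanvas.Handle, PWideChar(txt),
      Length(txt), ext);
    // start the text so that it ends at the right margin
    Windows.TextOutW(doc.VCLCanvas.Handle,
      doc.VCLCanvasSize.cx - Margin - ext.cx, 100, PWideChar(txt), Length(txt));
    doc.SaveToFile('c:\test-rightalign.pdf');
  finally
    doc.Free;
  end;
end.
```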

About your mixed Arabic and English text, it works with version 1.7.3 (the one I uploaded some days ago, not the 1.7.2 version posted with this blog entry). Just download it again and retry.

By the way, I've added a TPdfBox record type, with explicit Width and Height properties, since TPdfRect could be confusing in some usage. Uploaded new 1.7.4 version to the link above.
13 From Jim - 13/05/2010, 02:27

Thanks for your engine. I'm experimenting with it now. How do we change the orientation of the output, that is select Portrait or Landscape?
14 From A. Bouchez - 13/05/2010, 09:03

Just exchange the page width and height parameters. You'll be in Landscape.
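
A minimal sketch of that swap, assuming the document exposes writable DefaultPageWidth and DefaultPageHeight properties that apply to the pages added afterwards (the text, coordinates and file name are placeholders):

```pascal
program LandscapeSketch;
{$APPTYPE CONSOLE}
uses
  SynPdf;
var
  doc: TPdfDocument;
  w: Cardinal;
begin
  doc := TPdfDocument.Create;
  try
    // exchange width and height before adding the page
    w := doc.DefaultPageWidth;
    doc.DefaultPageWidth := doc.DefaultPageHeight;
    doc.DefaultPageHeight := w;
    doc.AddPage; // this page is now wider than it is tall
    doc.Canvas.SetFont('Courier New', 10, []);
    doc.Canvas.BeginText;
    doc.Canvas.MoveTextPoint(50, 50);
    doc.Canvas.ShowText('Landscape page', False);
    doc.Canvas.EndText;
    doc.SaveToFile('c:\test-landscape.pdf');
  finally
    doc.Free;
  end;
end.
```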
15 From Mohammed Nasman - 13/05/2010, 12:15

The combination of Arabic and English is working fine now.

But the RTL orientation is still wrong. I have tried to use VCLCanvas, but it shows the same result. I made a quick test with TImage.Canvas to show you the difference:
=====
Windows.ExtTextOut(img1.Canvas.Handle, 10, 20, img1.Canvas.TextFlags or ETO_RTLREADING , nil, 'مكتبة SynPDF بدعم اللغة العربية',
Length('مكتبة SynPDF بدعم اللغة العربية'), nil);

Windows.ExtTextOut(img1.Canvas.Handle, 10, 80, img1.Canvas.TextFlags , nil, 'مكتبة SynPDF بدعم اللغة العربية',
Length('مكتبة SynPDF بدعم اللغة العربية'), nil);

===

The first one shows the Arabic text in the right order, but the second one shows it in reverse order. That happens only when I have English letters mixed with the Arabic. I tried to change the TextFlags, but it doesn't change anything:

doc.VCLCanvas.TextFlags := doc.VCLCanvas.TextFlags or ETO_RTLREADING;

doc is an instance of TPdfDocumentGDI.
16 From A. Bouchez - 13/05/2010, 18:14

Changing the TextFlags won't change anything, because it is not handled by the VCLCanvas/metafile enumeration during the PDF conversion.

There may be a problem with the Uniscribe implementation used. Since I didn't find any code sample on the Net, I had to implement it just by following the MSDN documentation. The ScriptLayout() function is called as requested, and should handle all the RTL stuff. See http://msdn.microsoft.com/en-us/lib...).aspx

Perhaps the TextFlags handling is faulty. It should be checked and implemented in TPdfEnum.TextOut somehow.
Check http://msdn.microsoft.com/en-us/gog...
I don't have time to do it now, so you could look into it on your own and tell me.
17 From A. Bouchez - 14/05/2010, 14:39

I had some time today to add the ETO_RTLREADING handling. The Synopse PDF Engine should now handle it as expected.

I've uploaded a new version of the engine, 1.7.4.RTL:
- added RightToLeftText property in TPdfCanvas (Uniscribe-only)
- handle ETO_RTLREADING option (Uniscribe-only) in VCLCanvas/TMetaFile

Please give me some feedback about this update. Hope it will meet your needs.
Thanks for your comments!
18 From Mohammed Nasman - 14/05/2010, 20:20

Perfect, working fine now, you did very great work Arnaud,

Thank you for your hard work.

I will do more tests and will give you feedback.
19 From A. Bouchez - 15/05/2010, 14:19

Response to greymont:

"I'm getting a range check error from SynCommons.pas (version 1.7.4) when I turn range checking on (Delphi 7). The function is
procedure WinAnsiToUnicodeBuffer
(....)"

The range checking error is wrong about this. My code is correct.
As stated by the comment " include S[L+1] = last #0 ", I also need to copy the trailing #0, which is at S[i+1] i.e. S[L+1] with i=L: so the for loop is correct.
By compiler design, there is always a #0 at the end of any Delphi string. And the caller expects this #0 to be appended, since most callers will use the destination buffer with the Win32 wide API.
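
To illustrate the point (this is not the WinAnsiToUnicodeBuffer code itself), here is a stand-alone sketch of a loop that copies L characters plus the trailing #0. It indexes the string through a PAnsiChar, which is one way to read the terminator without involving string range checking; the buffer size and the plain widening of each byte are simplifications that are only valid for ASCII text.

```pascal
program TrailingZeroSketch;
{$APPTYPE CONSOLE}
{$R+} // range checking on, as in the report above
var
  S: AnsiString;
  Dest: array[0..15] of WideChar; // hypothetical fixed-size destination buffer
  P: PAnsiChar;
  i, L: Integer;
begin
  S := 'abc';
  L := Length(S);
  P := PAnsiChar(S); // zero-based view of the string content, ending with #0
  // i runs from 0 to L inclusive: L characters plus the trailing #0
  for i := 0 to L do
    Dest[i] := WideChar(Ord(P[i])); // plain widening, not a WinAnsi conversion
  Writeln('Characters copied, terminator included: ', L + 1);
  Writeln('Terminator present: ', Dest[L] = #0);
end.
```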
20 From A. Bouchez - 15/05/2010, 14:40

To greymont, about the Rectangle.

1. It looks like you are putting a "Rectangle" command inside a BeginText/EndText block. This is not correct according to the PDF format.

2. Pay attention to your loops. The SetFont/SetLeading/SetLineWidth calls should be outside the loops, to reduce the PDF content size. You don't need to call BeginText/EndText each time either, if you are just using MoveTextPoint/ShowText.

Try this code:

pPDF.AddPage;
// font, leading and line width are set once, outside the loops
pPDF.Canvas.SetFont('Courier New',10,[]);
pPDF.Canvas.SetLeading(pPDF.Canvas.Page.FontSize);
pPDF.Canvas.SetLineWidth(0.5);
pPDF.Canvas.BeginText;
for vRow := 1 to 10 do begin
  for vCol := 1 to 10 do begin
    vX := (vCol-1)*cWidth+1;
    vY := pPDF.DefaultPageHeight-((vRow)*pPDF.Canvas.Page.FontSize);
    pPDF.Canvas.MoveTextPoint(vX,vY);
    pPDF.Canvas.ShowText('.',False);
    if vRow = vCol then begin
      // leave the text block before drawing the rectangle, then reopen it
      pPDF.Canvas.EndText;
      pPDF.Canvas.Rectangle(vX,vY,cWidth,pPDF.Canvas.Page.FontSize);
      pPDF.Canvas.Stroke;
      pPDF.Canvas.BeginText;
    end;
  end;//for
end;//for
pPDF.Canvas.EndText;
21 From Esmond - 09/06/2010, 01:53

Using lots of images consumes lots of RAM. To reduce this I've tried doing an incremental save by working out the positions of all 'kids' pages of the first page initially. After laying out the first page I run the routines below: 'AddFakePages' then 'FlushAddPage'. By freeing the TJpegImages after each page, in an unscientific test I saved 40% on RAM. Can you give some tips on incrementally freeing the memory used by TPdfDocument, or am I barking up the wrong tree with this method? I also thought about making image loading event driven to reduce memory.

procedure TPdfDocument.AddFakePages(PageCount, ObjPerPage: integer);
var
  p: TPdfObject;
  FKids: TPdfArray;
  i: integer;
begin
  // look up the 'Kids' array once, so that it is assigned even if PageCount = 0
  FKids := FCurrentPages.PdfArrayByName('Kids');
  for i := 0 to PageCount - 1 do begin
    // placeholder carrying the object number the future page is expected to get
    p := TPdfObject.Create;
    p.SetObjectNumber(FXref.ItemCount+(i*ObjPerPage));
    FKids.AddItem(p);
  end;
  FCurrentPages.PdfNumberByName('Count').Value := FKids.ItemCount;
end;

function TPdfDocument.FlushAddPage: TPdfPage;
var
  FResources: TPdfDictionary;
  i, Pos : integer;
begin
  for i := FXrefCount to FXref.ItemCount - 1 do // ignore FXref[0] = root PDF_FREE_ENTRY
    with FXref.Items[i] do begin
      Pos := fPDFWrite.Position;
      Value.WriteValueTo(fPDFWrite);
      ByteOffset := Pos;
    end;
  FXrefCount := FXref.ItemCount;
  if FCurrentPages = nil then
    raise EPdfInvalidOperation.Create('AddPage');
  // create a new page object and add it to the current pages dictionary
  result := fTPdfPageClass.Create(self);
  FXref.AddObject(result);
  fRawPages.Add(result); // pages may be nested
  //_Pages_AddKids(FCurrentPages, result);
  result.AddItem('Type', 'Page');
  result.AddItem('Parent', FCurrentPages);
  // create page resources
  FResources := TPdfDictionary.Create(FXref);
  result.AddItem('Resources',FResources);
  FResources.AddItem('Font',TPdfDictionary.Create(FXref));
  FResources.AddItem('XObject',TPdfDictionary.Create(FXref));
  // create page content
  FResources.AddItem('ProcSet',TPdfArray.CreateNames(FXref,['PDF','Text','ImageC']));
  result.AddItem('Contents',TPdfStream.Create(self));
  // assign this page to the current PDF canvas
  FCanvas.SetPage(result);
end;
22 From Renato - 09/06/2010, 20:23

In SynPdf.pas, line 4046, the font index is returned as -1 if the font is not installed on the system.

I needed to hack it there to set a default font (Arial) if the font index returned is -1.

if (FontIndex<0) then
begin
  AName := 'Arial';
  FontIndex := fDoc.FTrueTypeFonts.IndexOf(AName);
end;

Now I need to set up a resolution for the PDF: 300, 600, 1200 dpi... how is it possible?

Gratz
23 From A.Bouchez - 12/06/2010, 09:03

I had already modified the engine (in its upcoming version 1.8) in order to perform font substitution if the font is not available, and to use 'Arial' as a default font if the specified font is not installed.

About dpi, see comment 3 above.
24 From A.Bouchez - 12/06/2010, 09:08

To Esmond:

Your modification to force producing pages is very interesting, but is a bit tricky. I'll try to implement it with more respect to the engine architecture.

About bitmaps and pictures in general, the engine still lacks auto-compression to JPEG (as an option), and needs some code refactoring to save memory when using a lot of bitmaps.

I didn't need to add any bitmap in my projects, so I didn't do a lot of tuning for them. IMHO PDF is better with vectorial processing, that's why I added the internal EMF enumeration and conversion to PDF. But I understand that you need bitmaps in your PDF.

So OK, I'll make bitmap handling better in the future.
Thanks for your interest.
25 From Renato - 14/06/2010, 15:43

Thank you, Bouchez.

It's working now.
26 From Renato - 14/06/2010, 15:52

Ok,

I will wait for the 1.8 version.

Only one more question, about image printing.
Is it possible to print an image and clone it on other pages, like this:

Page 1: print bitmap
Page 2: print clone bitmap of Page 1
Page 3: print clone bitmap of Page 1
Page 4: print clone bitmap of Page 1
...
27 From A. Bouchez - 14/06/2010, 15:58

You can do it by creating a TPdfImage instance, then using the AddXObject method to register it into the PDF content, then using the DrawXObject method of the TPdfCanvas on every page to clone it in all pages.

Check the TPdfEnum.DrawBitmap method to see how these methods and classes are used. The trick is just to create one TPdfImage instance, and draw it multiple times, once on every page canvas.
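
A rough sketch of that idea follows. It reuses the calls and signatures that appear in the code of comment 31 below (TPdfImage.Create with a document and a TJpegImage, AddXObject, DrawXObject), which may differ in other versions of the unit. Note that comment 31 reports an Acrobat warning with a similar single-page sequence, so treat this as an untested outline; the file names and the page count are hypothetical.

```pascal
program ReuseImageSketch;
{$APPTYPE CONSOLE}
uses
  Graphics, jpeg, SynPdf;
var
  doc: TPdfDocument;
  page: TPdfPage;
  jpg: TJpegImage;
  i: Integer;
begin
  doc := TPdfDocument.Create;
  jpg := TJpegImage.Create;
  try
    jpg.LoadFromFile('C:\image.jpg');
    for i := 1 to 4 do begin
      page := doc.AddPage;
      if i = 1 then
        // register the image only once, under a name shared by all pages
        doc.AddXObject('image1', TPdfImage.Create(doc, jpg));
      // every page draws the same registered object at the top left corner
      doc.Canvas.DrawXObject(0, page.PageHeight - jpg.Height,
        jpg.Width, jpg.Height, 'image1');
    end;
    doc.SaveToFile('c:\test-clone.pdf');
  finally
    jpg.Free;
    doc.Free;
  end;
end.
```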
28 From Renato - 14/06/2010, 15:58

What do you think of creating a forum?
29 From Alexandre - 16/06/2010, 04:45

First, thank you very much for this great library.

I am drawing a bitmap to VCLCanvas. In order to position it correctly, I have to multiply all coordinates by Screen.PixelsPerInch/72, as in (where pagina is the current pdf page - TPDFPage):

pdf.VCLCanvas.StretchDraw(Rect((30*Screen.PixelsPerInch) div 72, (30*Screen.PixelsPerInch) div 72, ((pagina.PageWidth-30)*Screen.PixelsPerInch) div 72, ((pagina.PageHeight-30)*Screen.PixelsPerInch) div 72), bmp);

Is this the correct way to do it?
30 From A. Bouchez - 16/06/2010, 09:51

If you use the VCLCanvas, you don't have to use the PixelsPerInch property: just use the VCLCanvas methods and coordinates as usual. It has its own dpi. In order to get the page size in VCLCanvas pixel units, use the VCLCanvasSize property of the current TPdfDocumentGDI instance.

The framework will do all the resolution and coordinate mapping calculations for you.
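
For instance, the StretchDraw call from the question could be written along these lines. This is only a sketch, assuming that VCLCanvasSize is a TSize holding the page size in VCLCanvas pixels; the margin is a hypothetical value expressed in those pixels (not in points), and the file names are placeholders.

```pascal
program StretchDrawSketch;
{$APPTYPE CONSOLE}
uses
  Windows, Classes, Graphics, SynPdf;
const
  Margin = 30; // hypothetical margin, in VCLCanvas pixels
var
  doc: TPdfDocumentGDI;
  bmp: TBitmap;
  size: TSize;
begin
  doc := TPdfDocumentGDI.Create;
  bmp := TBitmap.Create;
  try
    bmp.LoadFromFile('C:\image.bmp');
    doc.AddPage;
    // page size in canvas pixels: no PixelsPerInch/72 conversion needed
    size := doc.VCLCanvasSize;
    doc.VCLCanvas.StretchDraw(
      Rect(Margin, Margin, size.cx - Margin, size.cy - Margin), bmp);
    doc.SaveToFile('c:\test-stretch.pdf');
  finally
    bmp.Free;
    doc.Free;
  end;
end.
```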

Thanks for your interest.
31 From Esmond - 17/06/2010, 00:23

Thanks for the positive comments above.
I've got a slight problem in that Acrobat reader 9 flashes up a message saying 'The file is damaged and is being repaired.' when opening it. The code I'm using is:
pdfDoc := TPdfDocument.Create;
pdfpage := pdfDoc.AddPage;
JpegImage := TJpegImage.Create;
JpegImage.LoadFromFile('C:\image.jpg');
pdfImage := TPdfImage.Create(pdfDoc, jpegimage);
pdfDoc.AddXObject('image1', pdfimage);
pdfDoc.Canvas.DrawXObject(0, pdfpage.PageHeight-JpegImage.Height,
JpegImage.Width, JpegImage.Height, 'image1');
pdfDoc.SaveToFile('c:\test.pdf');
pdfDoc.Free;
I downloaded the acrobat 9 trial and saved the synopse pdf as a version 1.3 pdf. Apart from re-numbering and reversing the order of the objects, taking out some new lines and adding some xml the only strange thing was that the name 'image1' was changed to 'Im0' in the page object and page content stream but left as 'image1' in the jpeg data stream (the object number reference was still right). Is there a mistake in the code above or is the problem elsewhere?
32 From oldtype's me2DAY - 21/06/2010, 08:38

Burdock's thoughts

Synopse PDF engine 1.7.2: an open source PDF engine that supports Unicode...
33 From A.Bouchez - 21/06/2010, 11:21

Please discuss the Synopse PDF Library in our dedicated forum.
