PDFedit design documentation
This document describes the design and internals of the PDFedit program,
which is intended for manipulating PDF documents. It does not describe
the code or classes precisely, but rather provides ideas and general
information needed to understand the implementation. Anybody who wants to
use or reuse this project, or to understand its current state, should
start with this document and then continue with the automatically
generated doxygen programming documentation.
The document itself is divided into several parts:
This part gives general information about the project internals: which
technologies were used during design and implementation and which helper
(utils) classes were implemented to support particular tasks.
Finally, it describes the reuse of Xpdf code and the modifications
necessary to enable such reuse.
Chapter 1. Used technologies
Our project uses several technologies. All of them are open,
standardized, generally accepted and free. Their licence policy is
compatible with the GPL (GNU General Public License).
Boost is a free, highly portable and de facto standard set of libraries
for the C++ language (see Boost home). Most of the new C++
features which are very likely to become part of the
standard are first implemented and tested here. The
technical report (TR1) is also implemented in Boost.
We mainly use smart pointers,
especially shared pointers, which provide easy-to-use and
safe automatic life cycle maintenance of shared objects.
All objects exported from the kernel to higher layers are wrapped by
shared_ptr smart pointers.
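The shared ownership described above can be sketched as follows. This is only an illustration: it uses std::shared_ptr as a stand-in for boost::shared_ptr (both have the same ownership semantics), and Page and makePage are hypothetical names, not PDFedit classes.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for a kernel object; not a real PDFedit class.
struct Page {
    std::string label;
    explicit Page(const std::string& l) : label(l) {}
};

// Hypothetical kernel-side factory: the caller never sees a raw pointer.
std::shared_ptr<Page> makePage(const std::string& label) {
    return std::make_shared<Page>(label);
}

// Returns the number of owners while two layers hold the same page.
long ownersWhileShared() {
    std::shared_ptr<Page> kernelRef = makePage("first");
    std::shared_ptr<Page> guiRef = kernelRef;  // second owner, Page is not copied
    return guiRef.use_count();
}

// Shows that the object lives as long as any owner is alive.
bool survivesFirstOwner() {
    std::shared_ptr<Page> guiRef;
    {
        std::shared_ptr<Page> kernelRef = makePage("first");
        guiRef = kernelRef;
    }  // kernelRef is gone here, the Page is not
    return guiRef && guiRef->label == "first" && guiRef.use_count() == 1;
}
```

The object is released only when the last owner goes away, so no layer has to decide who deletes it.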
Boost Iostreams make it easy to create standard C++ streams and stream
buffers and provide a framework for defining Filters and
attaching them to standard streams and stream buffers. The
second feature allows creating a flexible, easy to use and
extend solution to support encryption/decryption and
compression/decompression of objects.
STL - Standard Template Library
STL is the standard C++ set of libraries which provides container,
algorithm and iterator (and many more) template classes. Their
implementation is highly portable and optimized for high
performance. We mainly use the mentioned containers for data
storage and iterators for efficient traversal of data
structures. For more information and documentation for STL,
see documentation.
Qt is a multiplatform C++ GUI toolkit created and maintained by
Trolltech. We are using version 3 of the toolkit.
We mainly use the GUI (widget) classes (see
Qt classes) and the QSA framework for the scripting
layer. A slightly modified version of QSA, based on QSA 1.1.4, is
included in our source tree.
CPP Unit automatic testing
CPP Unit is a C++ unit testing framework. We use
this framework for automatic testing of the kernel interface and its
functionality. All test cases are placed in the kernel/tests directory
and they are linked into the kernel/kernel_tests binary. We have
implemented test classes for all interface objects. Each test class is
specialized for one interface class and has a name of the
general form
TestClassName
where ClassName stands for the tested class. A test class implements a test
suite which is identified by its name. The main test program runs all
test suites specified by name (the section called “Test program”).
Each test suite consists of test cases which test particular behavior
of the tested class.
Each test class should inherit from the CPP Unit TestFixture class.
First, the CPPUNIT_TEST_SUITE and CPPUNIT_TEST macros should be
used to prepare the class for the CPP Unit framework and to declare
the test case functions. Finally, the class should be registered with the
framework, so the test program can run it by the specified name (this
name should follow the TEST_CLASSNAME convention).
Each test case should perform operations on the tested class and
check the invariants which have to hold for
such operations. The CPPUNIT_ASSERT macro should be used to check
an invariant condition and CPPUNIT_FAIL should be used to force
failure of the test.
See the following example (it can be used as a template for creating
a new test suite).
#include <cppunit/extensions/HelperMacros.h>

class TestClassName : public CppUnit::TestFixture
{
    // defines this class as a test suite
    CPPUNIT_TEST_SUITE(TestClassName);
    // this method has to be implemented as a test case
    CPPUNIT_TEST(TestMethod);
    // declaration of other test cases
    CPPUNIT_TEST_SUITE_END();
public:
    void setUp()
    {
        // this method should initialize local test class
        // data used in tests
    }
    void tearDown()
    {
        // clean up after setUp method
    }
    void TestMethod()
    {
        // implementation of the test case
        CPPUNIT_ASSERT(expected == op());
        // ...
        // sometimes we need to test that something throws an
        // exception
        try
        {
            // this operation should throw with these parameters
            op(parameters);
            // exception hasn't occurred, we will force failure
            CPPUNIT_FAIL("This operation should have failed.");
        }
        catch (ExceptionType & e)
        {
            // ok, exception has been thrown
        }
    }
};
// registers this class with the CPP Unit framework and assigns
// the given name to it
CPPUNIT_TEST_SUITE_REGISTRATION(TestClassName);
CPPUNIT_TEST_SUITE_NAMED_REGISTRATION(TestClassName, "TEST_CLASSNAME");
There are two sets of input parameters you can specify.
The first set of parameters specifies input pdf files or directories.
If no file with the specified name is found, the parameter is assumed
to be the name of a test suite to run. The result of each test
states whether the test was successful, threw an exception or
failed a condition.
Code and script documentation
We use the Doxygen documentation tool. This means
that documented parts use a special comment format, where
the comment starts with a double star, like:
/** Doxygen comment */. These comments are
then used by doxygen to create HTML pages (or other formats).
Functions exported to scripting use a different kind of documentation
in addition to doxygen comments.
We had to use a format different from doxygen's, otherwise doxygen would
parse our comments and we would parse doxygen comments, which would
lead to confusion. So, besides the ordinary doxygen comment, which
is located above the function body in the .cc file, we write an extra comment
in the corresponding .h file, which is exported to the scripting API documentation.
The content of the two comments is often similar, but the doxygen documentation
for programmers often contains information that is useless or misleading for script
users and vice versa.
Documentation for scripts is in a comment whose first character
is a dash (instead of the second star of doxygen comments), like this comment:
/*- Comment for function */.
The comment has to be put in a header file directly above the
function declaration.
A comment for a class is similar, but it has an equals sign instead of the dash, like:
/*= Comment for class */.
It has to be put directly above the class declaration.
Another difference between doxygen and this format is that the
comment is treated as docbook code - it can contain docbook tags to
format the comment or insert a list or table in it, basically whatever
docbook code can be put between <para> and </para> tags.
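The three comment markers can be seen side by side in the following sketch. ExampleWrapper and pageCount are hypothetical names chosen for illustration, and the header and source parts are shown in one listing only to keep it short.

```cpp
#include <cassert>

// --- would live in the header (.h) file ---

/*= Hypothetical wrapper class, used only to illustrate the comment markers. */
class ExampleWrapper {
public:
    /*-
     Return the number of pages in the <emphasis>current</emphasis> document.
    */
    int pageCount() const;
};

// --- would live in the source (.cc) file ---

/** Doxygen comment for programmers, placed above the function body. */
int ExampleWrapper::pageCount() const {
    return 0;  // placeholder value for this sketch
}
```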
Among the tools for generating documentation there is a perl script which
parses a docbook file and replaces all strings of the form
<!--TYPE: filename.h --> with a generated
chunk of documentation from filename.h that documents the class in it.
This is used for wrapper classes, where class methods correspond to the same
methods in the scripting environment.
Similarly, all strings of the form
<!--BASETYPE: filename.h --> are replaced with a
similarly generated documentation chunk, except that the functions are assumed to
be static functions available to scripts, not methods of the class contained
in the header file. This is used to document the base classes
(Base, BaseGui and
BaseConsole), because slot functions in these
classes are exported to scripting as static functions.
Documentation generated this way is then treated as an ordinary docbook XML
file. The scripting API is documented in the User documentation Appendix.
Design and user documentation
The design documentation and the user manual are written in Docbook, a
standardized, open and free format.
The XML files (with the OASIS docbook 4.2 DTD) which
form this design documentation are stored in the doc/design
directory. The main file is design_doc.xml, which includes all
other files. Files specific to the kernel design are stored in the
doc/design/kernel directory. By the same logic, gui design files
are stored in the doc/design/gui directory.
Chapter 2. PDFedit layers
The PDFedit project is based on a 3 layer model:
- Kernel - which is responsible for maintaining the pdf
document content and provides an interface for changes.
- Script - which is responsible for wrapping the kernel
interface and exporting it to the user or gui.
- Gui - which visualizes all the functionality accessible
directly from the Kernel or Script layers and makes it
comfortable to use.
The kernel layer is built on top of the popular open source Xpdf
project (see Chapter 4, xpdf project in PDFedit).
It reuses xpdf code for low-level pdf document access - reading
and decoding content and parsing it to objects - as well as for
displaying functionality. Xpdf objects are transformed to
internal pdfedit interface objects which provide additional
logic and as such they are exported to higher layers.
The script layer is based on QSA - Qt Script for Applications,
a scripting language based on ECMAScript, developed by Trolltech.
The gui layer uses the Qt framework, also developed by Trolltech.
Most gui parts are based on scripts, which means that the user interface
is very flexible and can be changed without recompiling
the code; most changes can even be made at runtime.
This chapter describes the PDFedit layers, their communication interface and
responsibilities.
The kernel, as the lowest layer, is responsible for maintaining the pdf content from
a file and for providing an object interface for making changes to the higher layers.
We will call these objects cobjects. More
precisely, there are highlevel cobjects (CPdf, CPage, etc.) which provide the logic of higher
pdf entities and lowlevel cobjects which carry pdf data types
(CInt, CArray, CDict, CString, etc.). Values stored in lowlevel cobjects
are also called properties and they are wrapped
by the IProperty class.
Properties are identified by an indirect reference (the way pdf
addresses entities).
A user of the kernel should start with a CPdf instance, which provides all
properties of the document as well as access to document pages and
outlines. Pages then provide access to annotations. All cobjects are
returned wrapped by shared_ptr (see Chapter 1, Used technologies).
The kernel uses Xpdf code for document content parsing.
Xpdf's XRef class provides the fetching and parsing functionality.
The opposite direction (from cobjects to file writing) is provided by an
IPdfWriter
implementation. The Xpdf Stream class is replaced by the StreamWriter kernel
class.
The CXRef class inherits from XRef (xpdf class) and adds internals for
storing changed objects, which are not public to a direct user. XRefWriter
publishes the interface for making changes, which it inherits from CXRef. See
the section called “3 layer model” for more information.
Scripting is the basis of the editor functionality.
Each editor window has its own script context and scripts run independently in them.
When a window is created, its scripting base is constructed (BaseGUI,
which extends Base with GUI specific functions).
The Base constructs some necessary objects:
QSProject
- QSA class for scripting project
QSInterpreter
- script interpreter from QSProject
QSImporter
- Helper class used for adding objects to and removing them from the scripting environment at runtime
QSUtilFactory
- Standard QSA utility factory, provides the File, Dir
and Process objects, which allow scripts to manipulate files and
directories (reading, writing, creating, ...) and processes (running external commands)
QSInputDialogFactory
- Standard QSA input dialog factory, allowing scripts to create simple dialogs for requesting user
input
The window sets a ConsoleWriter object, which handles script output.
There are two classes derived from ConsoleWriter:
ConsoleWriterGui, used in GUI mode, which transfers the output to the command window
ConsoleWriterConsole, used in console mode, which simply writes the output to STDOUT.
The Base class (or BaseGui
or BaseConsole, respectively) exports many functions as slots. These are visible as static
functions in the script and they are main way of communication between the .
The BaseCore class, which the Base class extends,
does not provide any static functions, but it provides basic script functionality - garbage collection,
support for callback functions, running init scripts and handling of script errors.
Wrappers.
Scripts need to manipulate objects in the PDF and in the editor. Due to the limitations of QSA,
every C++ object (except some basic types, such as strings and numbers, and some Qt types, such as QColor)
needs to be derived from QObject to be usable in scripting, and only functions exported as slots are
available to scripts. Because of this limitation, wrappers need to exist around most objects, like
tree items in the object tree (QSTreeItem,
QSTreeItemContentStream), PDF objects
(QSAnnotation, QSArray,
QSContentStream, QSDict,
QSIProperty, QSPage,
QSPdf, QSPdfOperator
and QSStream), the class for invoking popup menus (QSMenu)
and helper classes related to PDF objects (QSIPropertyarray,
QSPdfOperatorIterator
and QSPdfOperatorStack).
All wrappers are derived from the QSCObject class, which provides
some basic functions for memory handling and error handling.
Source of script input.
One source of script input is the init scripts - they are run on application startup.
Another source of script input is menus and toolbars. Each menu or toolbar item has
some associated script code which is run when the item is activated. The user can see
the commands invoked by these scripts in the console window.
The third source of script input is the preview window, as interaction with it can result in
script functions being called, depending on the mode of the window (different functions
will be called in "add new line" and "add new text" mode, for example).
Callbacks are another source of script input. There are some special toolbar items which
manipulate either their internal state or the edited document itself when the user interacts with them
(item to switch revision, select current color, edit text, item to show and edit current page number).
These items use callbacks to notify the script of their action, so the script may react
(for example to another color being selected or to the text in the text edit
toolbar item being changed).
Finally, the user can use the commandline and type in any script code to execute.
Scripting API documentation.
Description of static scripting functions, functions provided by settings and PageSpace objects and
description of scripting objects and their methods is included in the user documentation.
Console mode.
Functionality in console mode is similar, with a few exceptions:
BaseConsole is used instead of
BaseGUI. This class extends the Base
class with a few console-specific functions.
ConsoleWindow is used instead of
PDFEditWindow. This class provides some of the
functionality for running scripts on the console (handling its input and output),
similarly to PDFEditWindow.
The PageSpace object is unavailable.
There is no interactivity. The editor runs the scripts
specified on the commandline and then it exits.
The basic class in the GUI is PdfEditorWindow, which represents the main editor window.
The application can have several such windows open, each of them editing a different document.
At the top of the window are the menubar and toolbar (although the top is only the default position;
the user can move all toolbars freely, and they will dock on any of the four sides of the
editor window or float outside of it).
All toolbars are of the ToolBar class,
derived from the Qt class QToolBar.
The menubar is a standard Qt QMenuBar, although filling the menubar and the
toolbar with items and maintaining them and their association to script code is the responsibility
of the class Menu.
At the bottom there is a statusbar, which can be used to show various information
(class StatusBar, derived from the Qt class QStatusBar).
The rest of the area, not occupied by the statusbar, menu and toolbars, is divided by a movable splitter into a left
and a right part. The left part is divided by another splitter into a part with the preview window
(class PageSpace) on top and the commandline window, providing script input and output
(class CommandWindow), at the bottom.
The right part is also divided by a splitter: the upper part holds the object tree view
(class MultiTreeWindow), the lower part the property editor
(class PropertyEditor).
Every element mentioned above (except the menu and the preview window) can be hidden and shown by the user. The application
remembers the element layout (size and position of the window, position of splitters and position of toolbars)
in the settings when closing and reopens next time with the same layout.
Dialogs are also part of the GUI. Many simple dialogs are handled by scripts, but
as scripts are unable to create more complex dialogs, some of them had to be implemented directly in C++.
They are:
AboutWindow
- window showing version of editor and information about program and its authors
AddItemDialog
- Dialog invoked when adding new properties to dictionary
(CDict) or new elements to array
(CArray) in the tree view.
AnnotDialog
- Dialog invoked when creating new annotation to fill in its data
HelpWindow
- Dialog invoked for displaying help. Basically a very simple HTML browser
MergeDialog
- Dialog invoked when the function "Import pages from another document" is invoked.
The dialog allows selecting pages to import from another document and specifying
the positions at which they should be imported
OptionWindow
- Dialog for editing the user preferences interactively.
The options are organized into tabs, each tab containing
elements derived from the Option class,
which maps one value in the settings to a widget for editing it.
Subclasses of Option are:
BoolOption
- editing with checkbox as boolean value
ComboOption
- editing with combobox, allowing to select from list of predefined values
DialogOption
- generic class for editable string, with "..." button allowing to invoke dialog
to edit the option in some alternative and possibly more comfortable way
FileOption, derived from DialogOption
- editing with possibility to invoke dialog to pick a filename
FontOption, derived from DialogOption
- editing with possibility to invoke dialog to choose the font and
its parameters interactively
StringOption
- editing with classical one line edit box
IntOption, derived from StringOption
- allowed input is limited to integer numbers
RealOption, derived from StringOption
- allowed input is limited to real numbers
When the user presses "Ok" or "Apply" button, each of the option editing widgets
is asked to save its state in the corresponding option.
RefPropertyDialog
- Dialog to interactively select target for reference while adding or editing it.
Also, some standard system dialogs (to pick font, color or name of file) are used.
Chapter 3. General utilities used in PDFedit
This chapter deals with utility classes which were implemented for pdfedit
but can also be reused elsewhere (the implementation
tends to be as independent of pdfedit as possible).
They are stored (with some exceptions) in the src/utils directory and, when
compiled, they are collected in one libutils.a
static library.
Delinearizator is a class which provides a simple interface for pdf
document delinearization (see also Linearized pdf document).
Class instances are created by the factory method
getInstance (see Factory method design pattern) and
one instance handles one pdf file, which has to be linearized. If the
file is not linearized, no instance is created and an exception is
thrown. Once an instance is created, the document can be
delinearized simply by calling the delinearize method.
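The factory method contract can be sketched like this. Everything here is hypothetical: the class name, the signature of getInstance and the exception type are assumptions, and the document is passed as a string with a simplified check for the /Linearized key instead of reading a real file.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical sketch, not the PDFedit Delinearizator class.
class DelinearizatorSketch {
public:
    // Factory method: the constructor is private, so this is the only
    // way to get an instance. Throws if the content is not linearized.
    static std::shared_ptr<DelinearizatorSketch> getInstance(const std::string& content) {
        // Simplified check: look for the /Linearized key anywhere in the text.
        if (content.find("/Linearized") == std::string::npos)
            throw std::invalid_argument("document is not linearized");
        return std::shared_ptr<DelinearizatorSketch>(new DelinearizatorSketch(content));
    }

    const std::string& source() const { return content; }

private:
    explicit DelinearizatorSketch(const std::string& c) : content(c) {}
    std::string content;
};

// Returns true if an instance could be created for the given content.
bool createdFor(const std::string& content) {
    try {
        return static_cast<bool>(DelinearizatorSketch::getInstance(content));
    } catch (const std::invalid_argument&) {
        return false;  // no instance exists for a non-linearized document
    }
}
```

Because the check happens inside the factory, a caller can never hold an instance for a document that is not linearized.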
Like XRefWriter, it uses an IPdfWriter implementation for
content writing (this can be changed at runtime and provides
flexibility of the output format).
Delinearizator itself is built on top of the Xpdf XRef class, which
provides object fetching functionality and cross reference table
maintenance. This is used for fetching all objects (except
those which are specific to linearized content) and the IPdfWriter
implementation is used to write them to the new file.
This class (implemented in src/utils/delinearizator.cc) depends on
xpdf code and the kernel/pdfwriter module.
IConfigurationParser provides an interface for parsing an underlying
stream whose data can be transformed (depending on the format and
parser implementation) to key,
value pairs, where key stands for a data
identifier and value is associated with this key.
The class is a template and abstract, which means that implementers have
to implement all methods and supply data types for key and value.
IConfigurationParser is defined in the src/utils/confparser.h file.
We have implemented a simple parser in the StringConfigurationParser
class, which parses files with a simple format:
# comments are ignored
% this also stands for a comment by default
key : value # this key is associated with value
Both key and value are strings. This parser can be configured
to ignore comments (strings starting with a character from
commentsSet), to use a different delimiter character (the one which
separates key from value) by setting delimiterSet, or to set the
characters which should be considered blank (by
setting blankSet).
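A parser for the format above could look roughly like the following. This is an independent sketch, not the StringConfigurationParser source: only the three set names come from the text, while SimpleConfigParser, trim, parse and parseSample are made up for the example.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Illustrative only. The three character sets mirror commentsSet,
// delimiterSet and blankSet; their default values are assumptions.
struct SimpleConfigParser {
    std::string commentsSet = "#%";
    std::string delimiterSet = ":";
    std::string blankSet = " \t";

    // Removes leading and trailing characters found in blankSet.
    std::string trim(const std::string& s) const {
        const std::string::size_type first = s.find_first_not_of(blankSet);
        if (first == std::string::npos) return "";
        const std::string::size_type last = s.find_last_not_of(blankSet);
        return s.substr(first, last - first + 1);
    }

    // Reads the stream line by line and collects key, value pairs.
    std::map<std::string, std::string> parse(std::istream& in) const {
        std::map<std::string, std::string> result;
        std::string line;
        while (std::getline(in, line)) {
            // Everything from the first comment character on is dropped.
            const std::string::size_type comment = line.find_first_of(commentsSet);
            if (comment != std::string::npos) line.erase(comment);
            const std::string::size_type delim = line.find_first_of(delimiterSet);
            if (delim == std::string::npos) continue;  // no key, value pair here
            const std::string key = trim(line.substr(0, delim));
            if (key.empty()) continue;
            result[key] = trim(line.substr(delim + 1));
        }
        return result;
    }
};

// Parses the sample format from an in-memory stream.
std::map<std::string, std::string> parseSample() {
    std::istringstream in(
        "# comments are ignored\n"
        "% this is also a comment\n"
        "key : value # trailing comment\n");
    return SimpleConfigParser().parse(in);
}
```

Working on std::istream keeps the sketch independent of where the data comes from, which matches the remark that the parser only relies on STL streams.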
The configuration parser code doesn't depend on pdfedit or xpdf code and
can be reused as it is, without any changes. It uses STL streams. In the
PDFedit project it is used e.g. in the ModeController and OperatorHinter
classes.
RulesManager is a simple concept based on the association of a
rule with its target.
The implementation uses the C++ template mechanism to be generic in the
data types of both rule and target. Both types have to
fulfill certain contracts (see the doxygen programming documentation for
more details). Rules are keys in the internal storage and they are
associated with their targets (1:1 relation).
This storage with its data forms the RulesManager configuration.
The second part is based on IRuleMatcher (an implementer of this abstract
class). It is responsible for the logic of choosing rules,
evaluating their compatibility and defining their priority.
When the findMatching method is called, the matcher is expected
to choose the association from the storage which matches the given rule best.
The implementer of the matcher has to implement class Functor
so that it describes when a given rule matches a given original rule
and also provides the priority of this match.
A class user just defines the rule and target data types, implements
IRuleMatcher for the rule data type with the matching logic and uses the class as
it is. RulesManager also enables loading the rule, value configuration
from a configuration file. loadFromFile uses the Parser template data
type. An IConfigurationParser implementer with the rule type for key and
the target type for value can be used here.
This concept is used for the ModeController and OperatorHinter classes in
our project (the section called “ModeController”).
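The rule, target and matcher roles can be sketched as follows. The class names follow the text, but all signatures (matches, addRule, the return type of findMatching) and the StringMatcher example are assumptions made for illustration, not the PDFedit declarations.

```cpp
#include <cassert>
#include <map>
#include <string>

// The matcher decides whether a stored rule matches the queried rule
// and reports the priority of that match (assumed: higher wins).
template <typename Rule>
struct IRuleMatcher {
    virtual ~IRuleMatcher() {}
    virtual bool matches(const Rule& original, const Rule& query,
                         int& priority) const = 0;
};

template <typename Rule, typename Target>
class RulesManager {
public:
    explicit RulesManager(const IRuleMatcher<Rule>& m) : matcher(m) {}

    void addRule(const Rule& rule, const Target& target) { storage[rule] = target; }

    // Stores the target of the best matching rule in out.
    // Returns false if no stored rule matches.
    bool findMatching(const Rule& query, Target& out) const {
        bool found = false;
        int best = 0;
        for (const auto& entry : storage) {
            int priority = 0;
            if (!matcher.matches(entry.first, query, priority)) continue;
            if (!found || priority > best) {
                found = true;
                best = priority;
                out = entry.second;
            }
        }
        return found;
    }

private:
    const IRuleMatcher<Rule>& matcher;
    std::map<Rule, Target> storage;  // rule -> target, 1:1
};

// Example matcher: an empty stored rule matches anything with low
// priority, an exact match wins with high priority.
struct StringMatcher : IRuleMatcher<std::string> {
    bool matches(const std::string& original, const std::string& query,
                 int& priority) const override {
        if (original == query) { priority = 2; return true; }
        if (original.empty()) { priority = 1; return true; }
        return false;
    }
};

// Looks up a (hypothetical) mode name for the given type name.
std::string lookup(const std::string& type) {
    StringMatcher matcher;
    RulesManager<std::string, std::string> manager(matcher);
    manager.addRule("", "normal");
    manager.addRule("Catalog", "readonly");
    std::string mode;
    return manager.findMatching(type, mode) ? mode : "none";
}
```

Keeping the matching logic in a separate matcher lets the same storage serve different matching policies.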
The IObserver class, which stands for an observer in the following context, is a
mechanism which allows an object to announce changes of its internal state to other
objects. The object with internal state which announces changes is the
observer handler and the classes which monitor
(observe) it are called observers. This is the basic
idea of the Observer design pattern. This implementation keeps the basic
contracts of this pattern and adds functionality to be as
flexible as possible.
From the class point of view, an observer handler is a
class which inherits from the ObserverHandler class. This template
class provides an interface for observer registration and
unregistration. Each observer has to be registered before it can be
notified about changes. When it is no longer interested in changes,
it should unregister itself from the observed objects. ObserverHandler
also provides a method which notifies all registered observers about a
change (the notifyObservers method). Observers are called in an order which
depends on their priorities and, for equal priorities, on registration
order.
The observer handler is responsible for calling the
notifyObservers method
whenever its internal state has changed (and it wants to announce
this change) and for providing correct parameters for it.
An observer has to implement the IObserver abstract
class. The most important method to implement is notify. This method
is called by the observer handler after its
state has changed. The observer can use the given newValue parameter, which
holds the current value which has changed. Additional information can be
obtained from the given context (see below). The notify method can use
the given values to update its internal state or to do additional
actions, but in any case it shouldn't modify the given data (this can
lead to an endless loop, because the observer handler announces the
change, so the observer is notified again, and this never stops).
Notification would be rather poor if just newValue was available. So
our observer concept adds a context to the notify method. This context
keeps additional information. The context hierarchy is based on the
IChangeContext abstract class, which provides just information about the
type of context. A notify implementer should check this type,
cast the given context to the correct type and use it as needed. If the observer
handler doesn't want to give any context, the context should be
NULL.
We have implemented the following types of contexts because they were
needed by the project.
- BasicChangeContext - additionally gives the previous value
(originalValue) of newValue.
- ComplexChangeContext - inherits from BasicChangeContext and
additionally gives information about the value id. This can be
used for complex objects or containers where the value is stored
inside and the identifier is the value name (in the complex object
scope) or the id (position or key) in the container. We have
defined the following conventions - if a value was added, then
originalValue is NULL (or something that represents NULL)
and when a value is removed, newValue is NULL.
- ScopedChangeContext - inherits from IChangeContext and adds a
second template parameter for the scope data type. Scope is an
abstraction for the area where the current newValue has changed.
We use this context for the section called “ProgressObserver”.
New context types can be defined very easily - just
inherit from IChangeContext, implement getType to return the correct
type enum value and add the information specific to the context
to the class.
Note that all mentioned classes are C++ template classes and they take a data
type as a template parameter. This type stands for the data type
of the value whose change is announced (e.g. newValue in the notify method).
All instances are wrapped by shared_ptr (boost smart pointer) to
prevent data life cycle problems, because shared_ptr correctly
shares data between multiple users and deallocates them at the moment
when nobody holds a shared pointer to them. If shared pointers are
used correctly (this means that the wrapped object is never
deallocated by hand), no problems should occur with object
instances. This is very important because the observer handler doesn't
have any information on whether the registered observers are
alive at the moment when it calls their notify method.
The second restriction for implementers and users is that the notify
method, as well as all other methods (including constructors and
destructor), must not throw an exception. This is intentional, because the
observer handler has to guarantee that each observer has been called by the time
it finishes the notifyObservers method. It doesn't know anything about the
observers, so it also doesn't know how to handle their exceptions.
The only possible reaction (to keep the contract that all observers are
notified) would be to silently ignore (or to log) the exception. This could lead
to inconsistencies and so it is safer to forbid exceptions altogether.
The notify method implementer has to keep this in mind because, if an
exception is thrown although it is forbidden in the method signature (throw()
clause after the method declaration), the program is forced to terminate by
default.
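The pieces of the observer concept fit together roughly as in the following sketch. The class and method names follow the text, but the exact signatures are assumptions, std::shared_ptr stands in for the boost smart pointer, noexcept is used as the modern counterpart of the throw() clause, and "higher priority value is called first" is an assumed ordering. Recorder and observerDemo are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <string>
#include <vector>

template <typename T>
struct IChangeContext {
    enum Type { BasicChangeContextType };
    virtual ~IChangeContext() {}
    virtual Type getType() const noexcept = 0;
};

// Carries the previous value in addition to newValue.
template <typename T>
struct BasicChangeContext : IChangeContext<T> {
    explicit BasicChangeContext(const T& original) : originalValue(original) {}
    typename IChangeContext<T>::Type getType() const noexcept override {
        return IChangeContext<T>::BasicChangeContextType;
    }
    T originalValue;
};

template <typename T>
struct IObserver {
    typedef std::shared_ptr<const IChangeContext<T> > ContextPtr;
    virtual ~IObserver() {}
    // Must not throw and must not modify the observed value.
    virtual void notify(const T& newValue, ContextPtr context) noexcept = 0;
    virtual int getPriority() const noexcept { return 0; }
};

template <typename T>
class ObserverHandler {
public:
    typedef std::shared_ptr<IObserver<T> > ObserverPtr;

    void registerObserver(const ObserverPtr& o) {
        if (!o) return;
        observers.push_back(o);
        // stable_sort keeps registration order for equal priorities
        std::stable_sort(observers.begin(), observers.end(),
            [](const ObserverPtr& a, const ObserverPtr& b) {
                return a->getPriority() > b->getPriority();
            });
    }
    void unregisterObserver(const ObserverPtr& o) {
        observers.erase(std::remove(observers.begin(), observers.end(), o),
                        observers.end());
    }
    void notifyObservers(const T& newValue,
                         typename IObserver<T>::ContextPtr context) const noexcept {
        for (const ObserverPtr& o : observers) o->notify(newValue, context);
    }
private:
    std::vector<ObserverPtr> observers;
};

// Hypothetical observer which records every notification as text.
struct Recorder : IObserver<int> {
    Recorder(const std::string& n, int p, std::vector<std::string>& l)
        : name(n), priority(p), log(l) {}
    void notify(const int& newValue, ContextPtr context) noexcept override {
        std::string entry = name + ":" + std::to_string(newValue);
        // check the context type before casting, as the text requires
        if (context && context->getType() == IChangeContext<int>::BasicChangeContextType) {
            auto basic = std::static_pointer_cast<const BasicChangeContext<int> >(context);
            entry += "<-" + std::to_string(basic->originalValue);
        }
        log.push_back(entry);
    }
    int getPriority() const noexcept override { return priority; }
    std::string name;
    int priority;
    std::vector<std::string>& log;
};

// Registers two observers and announces a change from 1 to 2.
std::vector<std::string> observerDemo() {
    std::vector<std::string> log;
    ObserverHandler<int> handler;
    handler.registerObserver(std::make_shared<Recorder>("low", 0, log));
    handler.registerObserver(std::make_shared<Recorder>("high", 5, log));
    handler.notifyObservers(2, std::make_shared<BasicChangeContext<int> >(1));
    return log;
}
```

Holding observers through shared pointers means the handler never calls notify on an observer that has already been destroyed.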
Iterator is a specific implementation of the Iterator design pattern.
It is used to traverse an arbitrary linked list that meets a few
requirements. The main goal of this iterator implementation is to be flexible
and easily extensible, because we need many special iterators
iterating only over specific items.
The iterator is bidirectional. Information about the previous and next
item is obtained from the item itself. A container outside the stored
items would be more flexible, but sometimes it is not possible
to have one and the information must be stored in the items themselves.
The item before the first item and the item after the last item are not valid objects.
A new special iterator can easily be created from the base iterator just by
inheriting and overriding one function which selects valid items.
Examples of special iterators can be found in the section called “Pdfoperator iterator”.
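A minimal sketch of this idea follows. All names (Item, BaseIterator, validItem, EvenIterator) are hypothetical and chosen for the example; the sketch also assumes that the item an iterator starts on is accepted by its filter.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Each item knows its own neighbours, so no outer container is needed.
struct Item {
    int value;
    std::weak_ptr<Item> prev;    // weak, to avoid ownership cycles
    std::shared_ptr<Item> next;
    explicit Item(int v) : value(v) {}
};
typedef std::shared_ptr<Item> ItemPtr;

class BaseIterator {
public:
    explicit BaseIterator(const ItemPtr& start) : current(start) {}
    virtual ~BaseIterator() {}

    // False once the iterator has moved before the first or past the last item.
    bool valid() const { return static_cast<bool>(current); }
    ItemPtr getCurrent() const { return current; }

    // Moves forward to the next accepted item (or past the end).
    BaseIterator& next() {
        do { current = current ? current->next : ItemPtr(); }
        while (current && !validItem(*current));
        return *this;
    }
    // Moves backward to the previous accepted item (or before the begin).
    BaseIterator& prev() {
        do { current = current ? current->prev.lock() : ItemPtr(); }
        while (current && !validItem(*current));
        return *this;
    }
protected:
    // Special iterators override only this function.
    virtual bool validItem(const Item&) const { return true; }
    ItemPtr current;
};

// Special iterator which stops only on items with an even value.
struct EvenIterator : BaseIterator {
    explicit EvenIterator(const ItemPtr& start) : BaseIterator(start) {}
protected:
    bool validItem(const Item& item) const override { return item.value % 2 == 0; }
};

// Builds a doubly linked list and returns its first item.
ItemPtr makeList(const std::vector<int>& values) {
    ItemPtr head, tail;
    for (int v : values) {
        ItemPtr item = std::make_shared<Item>(v);
        if (tail) { tail->next = item; item->prev = tail; } else { head = item; }
        tail = item;
    }
    return head;
}

// Walks the list 2,3,4,5,6 with the even-only iterator.
std::vector<int> collectEvenDemo() {
    const std::vector<int> values = {2, 3, 4, 5, 6};
    std::vector<int> out;
    ItemPtr head = makeList(values);
    for (EvenIterator it(head); it.valid(); it.next())
        out.push_back(it.getCurrent()->value);
    return out;
}
```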
Chapter 4. xpdf project in PDFedit
The PDFedit project uses Xpdf code for low level pdf content
operations, such as pdf object parsing (matching the
Adobe pdf specification ver. 1.6 quite well), indirect object resolving,
generation of page and text output devices, stream decoding and so on.
We have tried to reuse as much as possible of the functionality which is not
directly related to the special logic of the xpdf application.
To prevent errors coming from xpdf code, as well as to make the whole project less
dependent on xpdf, xpdf is used in just a few places
(namely the CXref, XRefWriter, CObject, CPdf and CPage classes) and the
rest of our pdfedit code uses only our classes and special objects.
This means that xpdf code can be substituted by something
different with changes in specific places, while the rest of the code
doesn't know about it. The project currently uses xpdf version 3.01.
Changes needed for code reuse
The code of the XPDF project couldn't be reused without modifications
because it is not well prepared for making changes to its objects
(the code assumes that data are read and used, not changed).
Our code changes can be divided into 3 categories:
- Syntactic - changes related to function/method
signatures (const modifiers, private methods changed to
protected, new parameters, non virtual methods changed to
virtual in some classes).
- New features - changes which add new
functionality required by our project (e.g. clone support
for Object and all values used inside).
- Design - changes in the xpdf object hierarchy or in the
meaning of some components, so that they fit our usage better.
For more information see the following detailed description.
The Object class is used as a value keeper in the whole
xpdf code. All value types are stored
in this one object (moreover, in one union in the Object class) and the real
value type is then identified by the Object type enumeration (returned by the
getType method). We consider this design not very good because it
doesn't prevent bad usage: a different value type can be
obtained than the one really stored (no internal checking is done and so on).
Nevertheless this behavior is kept, because a change would require reorganizing
the whole xpdf code. We have focused just on syntactic and new
feature changes here.
Xpdf code uses a kind of optimization for object copying: complex
values (such as dictionaries or arrays) are not copied by the copy method
at all and reference counting is used instead. Our Object usage
(primarily in the CXref class) requires deep copying and so cloning support
is necessary.
We have added a new clone method:
Object::clone()const;
which creates a new Object instance with a deep copy of the value held by the
original [1].
The returned Object instance is independent of the original
and so they don't influence each other when one is changed. Cloning
of complex value types is delegated directly to a specialized clone
method implemented on such a type.
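The difference between the reference counted copy and the deep clone can be illustrated with a small model. This is not the xpdf Object class: Value is a hypothetical type, and std::shared_ptr plays the role of xpdf's own reference counting.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Illustrative model only; the real Object is a tagged union.
struct Value {
    std::shared_ptr<std::vector<int> > array;  // complex value, shared on copy

    // Mimics the original behaviour: the array is shared, not copied.
    Value copy() const { return *this; }

    // Deep copy: the result owns an independent array.
    Value clone() const {
        Value result;
        if (array) result.array = std::make_shared<std::vector<int> >(*array);
        return result;
    }
};

// Returns the original's first element after a shallow copy was modified.
int afterShallowCopy() {
    Value original;
    original.array = std::make_shared<std::vector<int> >(1, 10);  // one element: 10
    Value c = original.copy();
    (*c.array)[0] = 99;  // also visible through original
    return (*original.array)[0];
}

// Returns the original's first element after a clone was modified.
int afterClone() {
    Value original;
    original.array = std::make_shared<std::vector<int> >(1, 10);
    Value c = original.clone();
    (*c.array)[0] = 99;  // original keeps its own data
    return (*original.array)[0];
}
```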
Syntactic changes simply correct parameter modifiers: all methods
with pointer parameters which can have the const modifier are changed
to have it. This change is just cosmetic and should prevent bad
usage of xpdf code.
The XRef class is responsible for fetching pdf objects from the stream. Pdf
defines so called indirect pdf objects. These objects are
identified by their object and generation numbers. XRef keeps and maintains the
cross reference table, which contains the mapping from indirect
object identifiers to the file offset where the object is stored. Internally it uses the
Parser and Lexer classes to parse file stream content to Object.
First, XRef had to be prepared for transparent wrapper usage (see
Wrapper design pattern), so all public methods were changed to
virtual and private elements to protected (to enable access to and
manipulation with them in the inheritance hierarchy). The XRef class is then
wrapped by the CXref class (see the section called “CXRef”) and the rest of xpdf code doesn't notice the
difference and can be used without changes (with respect to XRef usage).
CXref reopen
[2]
functionality requires a correct change of the XRef internal
state (which includes entries array reinitialization, trailer creation
and so on). In the original implementation all of this was done in the
constructor and clean up was done in the destructor. We have added new
protected
void initInternals(Guint pos);
void destroyInternals();
methods, which use the same code as the original, but separated out so
that the internal state can be changed at any time during the XRef instance's life.
XRefWriter (see the section called “XRefWriter”) (descendant of the CXref class, which inherits directly from XRef)
needs to know where it is safe to put data without destroying original
document data when changes are written to the document (as an Incremental update).
To enable this, XRef has a new
Guint eofPos;
field which contains the position of the %%EOF marker or the end of the document. The value is
set in the constructor because it has to be found out anyway, so XRefWriter
doesn't have to do this work again.
The XRef class didn't provide information whether a pdf reference (object and
generation number pair) is known
[3]
and so it wasn't possible to find out whether an object value is the null object
or the object is not present at all. To solve this problem, we have added a new public
virtual RefState knowsRef(Ref ref);
method which returns the state of the given reference. The state is an integer value
which holds one of the following predefined constants:
- UNUSED_REF - if there is no indirect object with given reference.
- RESERVED_REF - if reference is reserved to be used, but no
indirect object is registered yet. This state is used by CXref
class to mark that reference is planned to be used and we are
just waiting for some object to be used for it.
- INITIALIZED_REF - if indirect object with given reference
exists. These objects are counted when the number of objects is
requested.
CXref and XRefWriter descendants override this method to reflect
objects added/reserved by their interface and additional logic (e. g.
current revision and so on).
XRef's getNumObjects returned the size of the allocated entries array.
This is not very clean, because the entries array also contains free
and unused entries. Moreover, the array is allocated in blocks, so
there are more entries than real objects. This method is not used
in xpdf code at all, so it could be reimplemented to return just
the objects really in use (those with state INITIALIZED_REF).
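The reference states and the reimplemented object count can be sketched as follows. The state names come from the text above; Ref, Entry, RefTable and makeSampleTable are simplified, hypothetical stand-ins for the xpdf structures, and treating a generation mismatch as UNUSED_REF is an assumption of this sketch.

```cpp
#include <cassert>
#include <vector>

// State names follow the text; the numeric values are assumptions.
enum RefState { UNUSED_REF, RESERVED_REF, INITIALIZED_REF };

// Hypothetical, simplified reference: object and generation number.
struct Ref { int num; int gen; };

// Hypothetical table entry; the real XRef entries hold more data.
struct Entry { int gen; RefState state; };

class RefTable {
public:
    explicit RefTable(std::vector<Entry> entries) : entries_(entries) {}
    virtual ~RefTable() {}

    // Reports the state of a reference. Unknown object numbers and
    // mismatched generation numbers are treated as unused (assumption).
    virtual RefState knowsRef(Ref ref) const {
        if (ref.num < 0 || ref.num >= static_cast<int>(entries_.size()))
            return UNUSED_REF;
        const Entry& e = entries_[ref.num];
        return e.gen == ref.gen ? e.state : UNUSED_REF;
    }

    // Counts only initialized objects instead of the allocated array size.
    int getNumObjects() const {
        int count = 0;
        for (const Entry& e : entries_)
            if (e.state == INITIALIZED_REF) ++count;
        return count;
    }

private:
    std::vector<Entry> entries_;
};

// Illustrative helper: four allocated entries, two of them initialized.
inline RefTable makeSampleTable() {
    std::vector<Entry> entries;
    entries.push_back(Entry{0, INITIALIZED_REF});
    entries.push_back(Entry{0, RESERVED_REF});
    entries.push_back(Entry{0, UNUSED_REF});
    entries.push_back(Entry{1, INITIALIZED_REF});
    return RefTable(entries);
}
```

With this table the allocated size is 4 while getNumObjects reports 2, which is the distinction the paragraph above describes.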
The Array class represents the pdf array object. It is one of the complex
value types. It may contain any number of Object instances. To enable
Object cloning, a new
Array * clone()const;
is implemented. It returns new Array instance with same number of
elements where each one (Object instance) is cloned (by
Object::clone() method).
The Dict class represents the pdf dictionary data type. It is one of the complex
value types and holds an association of objects with their names (key,
value pairs where the key is a name object and the value is an Object instance).
DictEntry, used as the entry (key, value pair association), kept the value
(Object instance) as a normal instance. This was changed to a pointer
to an instance to enable simpler value updating.
The original code didn't use the const modifier for the key (char * typed)
parameter, so it wasn't clear whether a method uses the given value and
stores it (and so the parameter can't be deallocated after the method
returns) or just uses it to get information (so it can be
deallocated). This could lead to memory leaks or, worse,
to duplicate deallocation of the same memory. To solve these potential
problems, all methods which don't store the key have a const char * key
parameter.
Dict, as a complex object stored in the general Object data keeper, has to
support cloning, so a new
Dict * clone()const;
is added. It returns a new Dict instance with the same number of
entries where each entry is deep copied - the name string is copied and the associated
object (Object instance) is cloned (by the Object::clone() method).
A new method for simpler value updating has been added:
Object * update(char * key, Object * val);
This method adds a new entry if no such entry is in the dictionary, or
replaces the old value with the given one and returns the original.
The original implementation didn't contain any method for removing entries,
so a new
Object * del(const char * key);
has been added. It removes the entry with the given key and returns the
associated value.
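The semantics of update and del can be illustrated with a small sketch. SimpleDict and Value are hypothetical types, and std::unique_ptr is used here to make the ownership transfer explicit; the real Dict works with raw Object pointers and char * keys.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for the xpdf Object.
struct Value { int number; };

class SimpleDict {
public:
    // Adds a new entry or replaces an existing one. Returns the previous
    // value, or an empty pointer if the key was not present.
    std::unique_ptr<Value> update(const std::string& key,
                                  std::unique_ptr<Value> val) {
        assert(val);  // a stored entry always has a value
        std::unique_ptr<Value>& slot = entries_[key];
        std::unique_ptr<Value> old = std::move(slot);
        slot = std::move(val);
        return old;
    }

    // Removes the entry and hands its value back to the caller.
    // Returns an empty pointer if there is no such entry.
    std::unique_ptr<Value> del(const std::string& key) {
        auto it = entries_.find(key);
        if (it == entries_.end())
            return std::unique_ptr<Value>();
        std::unique_ptr<Value> old = std::move(it->second);
        entries_.erase(it);
        return old;
    }

    std::size_t size() const { return entries_.size(); }

private:
    std::map<std::string, std::unique_ptr<Value>> entries_;
};
```

In both cases the caller becomes the owner of the returned value, which mirrors the original methods returning the replaced or removed Object.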
Xpdf code defines the Stream hierarchy to describe pdf stream objects.
Streams (as pdf data types) define a container of pdf objects. This
container is associated with a dictionary which contains information
about its length and the filters which were used for stream encoding.
For example, the XRef class reads data from a stream and a Content stream
object is based on a stream.
Stream is the base class for both normal streams, represented by
the BaseStream (Stream descendant) base class, and the FilterStream (also
a direct Stream descendant) base class used for all filtered streams.
This stream object hierarchy is strictly specialized for reading and
can't be used for making changes to stream data. CXref and XRefWriter,
however, need to make transparent modifications to the stream with pdf
content (so that xpdf code using Streams doesn't have to be changed
very much). This is the reason for some changes in the Stream hierarchy
design.
The problem with stream modification is solved by a new abstract class
(base class for all specialized stream modifiers), StreamWriter.
It defines an interface for stream writing (in the same way as Stream
defines operations for reading). However, the implementation of a concrete
writer (such as FileStreamWriter) requires multiple inheritance,
because it needs the interface from StreamWriter and also access to
the fields of a concrete BaseStream (FileStream in the case of FileStreamWriter).
So the original inheritance of all direct descendants of Stream
and BaseStream had to be changed to virtual (to prevent ambiguity).
This model enables transparent usage of StreamWriter descendants as Stream typed
instances in xpdf code and as StreamWriter typed instances in our
higher level classes (like FileStreamWriter) in pdfedit code for
writing.
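The inheritance diamond described above can be sketched as follows. The classes are heavily simplified stand-ins: MemoryStream and MemoryStreamWriter are hypothetical names which play the role of FileStream and FileStreamWriter, and getChar/putChar are reduced to the bare minimum.

```cpp
#include <cassert>
#include <string>

// Simplified stand-in for the xpdf Stream reading interface.
class Stream {
public:
    virtual ~Stream() {}
    virtual int getChar() = 0;
};

// Virtual inheritance keeps a single shared Stream base in the diamond.
class BaseStream : public virtual Stream {};

// Writing interface; shares the BaseStream base with concrete streams.
class StreamWriter : public virtual BaseStream {
public:
    virtual void putChar(char c) = 0;
};

// Hypothetical concrete stream, in the role of FileStream.
class MemoryStream : public virtual BaseStream {
public:
    explicit MemoryStream(const std::string& data) : buf(data), pos(0) {}
    int getChar() override {
        return pos < buf.size() ? static_cast<unsigned char>(buf[pos++]) : -1;
    }
protected:
    std::string buf;
    std::string::size_type pos;
};

// Hypothetical concrete writer, in the role of FileStreamWriter: it gets
// the interface from StreamWriter and the fields from MemoryStream.
class MemoryStreamWriter : public StreamWriter, public MemoryStream {
public:
    explicit MemoryStreamWriter(const std::string& data) : MemoryStream(data) {}
    void putChar(char c) override { buf.push_back(c); }
};
```

Because the bases are virtual, a MemoryStreamWriter converts to Stream * without ambiguity, so reading code written against Stream needs no changes.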
The design of the FilterStream hierarchy is untouched, because our project
doesn't change filtered streams directly. It works just with the base
stream, because the FilterStream hierarchy is hard to reuse for
encoding. So just the decode functionality is used.
The Stream object, as one of the complex value data types stored in
Object (as all other data types), has to provide cloning support.
We have added an abstract
virtual Stream * clone()=0;
method in the Stream base class. Each specific stream implementer has to
provide its clone implementation. No default implementation is written
in Stream directly, to force all specific filters to provide one. If any
of the filters is not able to create a clone, this method should return NULL.
This should not happen, but a clone implementation has to be aware of
it (and has to check whether the filter stream has been cloned correctly).
FileStream is a stream which reads data directly
from a FILE stream, so cloning has to copy all data (from stream
start to the end - if the stream is limited, then just the first length
bytes) somewhere else. Creating a new file just for a temporarily
created clone is not very efficient and may cause several problems
(not enough free space, creation and removal of the temporary file,
etc.). We have solved this problem by creating a
MemStream with a buffer containing the same data as the
FileStream. This breaks the contract of clone a bit, because the
cloned stream is not precisely the same as the original: it is
represented by another Stream class. Nevertheless it keeps the most
important part of the contract: the user of the Stream interface doesn't know the
difference, and the clone and the original don't affect each other when one
is changed.
MemStream represents a stream stored in a buffer in
memory. So cloning is straightforward and just the buffer is copied
for the new MemStream. All other attributes are set according to the copied
buffer.
Buffer copying starts from the start field position and length bytes
are copied. So the final MemStream will contain just the data used in the
original one. Finally, the needFree field is always set to true, because
we have allocated a new buffer for the clone and so it has to be
deallocated when the MemStream is destroyed.
[4]
EmbedStream is a special stream which is stored in
some other stream. Cloning is also very simple, because this stream
just holds one Stream pointer field and some attributes which don't
change during the instance life cycle. Cloning is then just delegation
to cloning of the stream field and creating a new EmbedStream with the
cloned value and the same parameters which were used for the given instance.
The FilterStream branch represents different
types of filters which can be used to encode stream data (the Pdf
specification describes which filters can be used). The hierarchy and
design of filters follows the Decorator design pattern and each filter
works with an underlying stream (stored as a pointer field)
which is typed as Stream (so it can be either a stream with data -
MemStream, FileStream or
EmbedStream - or another filter stream).
A FilterStream is cloned in a similar way to EmbedStream. Each
filter implementer holds a Stream pointer (as already mentioned).
This is cloned by the clone method (defined in the super type). When the
underlying stream is cloned (and the clone method returned a non NULL
value - which means that this stream supports cloning), the current
stream creates a new filter stream instance of the same type with the same
configuration parameters (these are usually parameters which were
given to it in the constructor - but when such attributes can change
over time they have to be stored somewhere else in the constructor
specially for this purpose).
[5]
General implementation for all filter streams is as follows:
// clones underlying stream
Stream * cloneStream=str->clone();
if(!cloneStream)
// if underlying stream doesn't support cloning, it will fail too
return NULL;
// creates same typed filter stream with same parameters and cloned
// stream
return new ...(cloneStream[, specific_parameters]);
As mentioned above, some filters are not able to reconstruct the
parameters given to them in the constructor, so it is hard to
recreate the same filter. In particular, all filters which hold a
StreamPredictor field have an additional field with a PredictorContext
(added by us):
struct PredictorContext
{
int predictor;
int width;
int comps;
int bits;
};
where all parameters needed for StreamPredictor creation are stored.
This structure is initialized in constructor and never changed. It is
just used for cloning.
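Putting the general clone scheme and the stored PredictorContext together gives roughly the following. Only PredictorContext and the clone contract come from the text; DataStream, OpaqueStream and PredictingFilter are hypothetical classes written for this sketch.

```cpp
#include <cassert>
#include <cstddef>

// Field names follow the PredictorContext structure described above.
struct PredictorContext { int predictor; int width; int comps; int bits; };

class Stream {
public:
    virtual ~Stream() {}
    // Returns a new independent stream, or NULL if cloning is unsupported.
    virtual Stream* clone() = 0;
};

// Hypothetical data stream which can always be cloned.
class DataStream : public Stream {
public:
    Stream* clone() override { return new DataStream(); }
};

// Hypothetical stream which is not able to create a clone.
class OpaqueStream : public Stream {
public:
    Stream* clone() override { return NULL; }
};

// Hypothetical filter which remembers its constructor parameters in a
// PredictorContext so that an equivalent filter can be created later.
class PredictingFilter : public Stream {
public:
    PredictingFilter(Stream* strA, const PredictorContext& ctxA)
        : str(strA), ctx(ctxA) { assert(strA != NULL); }
    PredictingFilter(const PredictingFilter&) = delete;
    PredictingFilter& operator=(const PredictingFilter&) = delete;
    ~PredictingFilter() override { delete str; }

    Stream* clone() override {
        // Clone the underlying stream first; fail if it can't be cloned.
        Stream* cloneStream = str->clone();
        if (!cloneStream)
            return NULL;
        // Same filter type, same stored parameters, cloned stream.
        return new PredictingFilter(cloneStream, ctx);
    }

    const PredictorContext& context() const { return ctx; }
    const Stream* underlying() const { return str; }

private:
    Stream* str;           // owned underlying stream
    PredictorContext ctx;  // set in the constructor, never changed
};
```

The caller owns the returned clone and has to check it for NULL, as required of every clone implementation in the hierarchy.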
Xpdf reads pdf files using two level mechanism. The lowest level, called Lexer, decodes streams if necessary
and reads byte after byte. The second level, called Parser, reads from Lexer byte after byte until one complete
object is read. This can be applied recursively. Then it initializes xpdf Object class with type and data
of the read object. Parser object can read objects either from a single stream or from an array of streams
(simply reading one stream after another could result in incomplete objects returned at the end of a stream).
Parser is used to parse all objects including content stream tags.
Content streams can consist of several streams. Decoding and then concatenating these streams must form a valid
content stream. The problem is that the content stream can be split into streams at almost arbitrary places.
The feature which was missing in the Parser object is the information about when it has finished reading one stream
and started reading another.
Added method
// End of actual stream
bool eofOfActualStream () const { return (1 == endOfActStream); }
where endOfActStream is a new variable indicating how many objects have been buffered from the current stream.
Content streams can consist of many small valid content streams.
When split correctly, the user can easily delete/add small content streams.
Changes made by PdfEditor can be considered such small valid content streams. After saving our changes we
want to see these changes separately from existing content streams.
This new feature is used to split the many streams (which together form one page content stream)
into many small content streams.
Because of the object buffering done by Parser, the new feature had to be implemented in this particular way.
Kernel design description
Kernel is an object layer which provides an interface for manipulation with a
pdf document, its high level entities (like pages, outlines and
annotations) and all properties of these entities. All these objects keep the
document logic inside and provide an interface to higher layers for
simple manipulation. Higher layers (GUI and script in our case)
should use only these objects to get or change document related
information.
All kernel code is stored in the src/kernel directory and consists of a set
of classes. Classes are separated into 2 groups according to the pdf related
logic which is implemented inside the class:
- High level objects - which wrap pdf high
level entities, such as pages (the section called “CPage”), annotations
(the section called “CAnnotation”), outlines, whole
document (the section called “CPdf”).
Each object has certain properties which are defined by the pdf
specification and those are returned in low level
object form. If a pdf entity contains another high level entity in
its substructure (like pages contain annotations etc.), this
entity is responsible for creation and maintenance of such a
high level object. CPdf is then the root of all high
level objects.
- Low level objects - which wrap pdf data
types.
[6]
According to the character of the value (what can be stored inside) we
distinguish 2 categories of data types:
- Simple types which hold simple
data such as integral values, floating point values,
string, name, operator, etc.
- Complex types which hold other
data types inside, such as array, dictionary
and stream.
The following chapters focus more deeply on particular parts of the
kernel. At first, all classes which are used as interface objects for
higher layers are described. Then come some kernel classes which are not
part of the interface but are used internally by interface classes.
Finally there is a description of how document changes and revisions are
handled.
Chapter 5. Interface objects
As described above, the kernel communicates with higher layers (see Chapter 2, PDFedit layers)
through objects called cobjects. These
cobjects can be high level or
low level. This chapter and its sections
describe these objects, their responsibilities and mutual cooperation.
A pdf file consists of objects. These objects are referenced from a special structure forming a tree.
Objects can be either simple (number, string, name, boolean value, reference, null) or
complex (array, dictionary, stream).
CObjects are objects in pdfedit which represent objects in a pdf file.
All cobjects are derived from one base class, IProperty.
Objects form a tree like structure so we can divide objects into single objects (leaves) and composite objects (nodes). This
is an example of the Composite design pattern.
This is a different approach from the xpdf implementation, where every pdf object is represented by the same huge class.
The concept of having exactly one class representing every pdf object leads to many problems:
- inheriting - unclear oo design, unmanageable, it breaks the idiom of one entity for one purpose
- adding changing operations - would result in an even more monstrous class
- sometimes value inside class, sometimes outside - unclear oo design
There are many interesting design decisions in the xpdf object implementation.
For example, memory handling makes it almost unsound to delete objects from
complex types. The memory allocation policy, that is who/when/how is to deallocate an xpdf Object, is a mess which could easily
lead to either memory leaks or memory corruption.
The new design uses a separate class for each different pdf object type. Because of the pdf decoding complexity (pdf can be encoded
using many filters) these objects use xpdf Object just for initialization and for dispatching changes to the CXref object.
Objects can not be simply copied, because it is not clear whether a copy is a deep copy with all indirect objects copied or not.
Every object is derived from one base class - IProperty. This base class is a common handle which can be used to
access child objects uniformly. This class is a read only interface for all properties.
Objects can be created uninitialized or can be initialized from an xpdf Object of the same type, and simple objects
can be initialized directly from a value.
Simple objects are very similar. They share behaviour and because of this also method names (only the value keeper is different). They are represented by one class
using c++ templates. One template class (CObjectSimple), parameterized by object type, represents all 7 types of simple objects.
It is more difficult with complex types. Each complex type must contain references to its children, which are
also pdf objects. A design decision was made to use smart pointers for referencing child objects. The reasons are:
- Allocation and deallocation policy - when an object is deallocated, we cannot be sure that nobody
still holds a pointer to it. This could be solved by implementing reference counting, but why reinvent the wheel.
- Automatic deallocation when not needed.
Pdf objects can be referenced using ids which are similar to pointers. This brings many problems. One of them is the
impossibility to delete such objects. Many of the problems are automatically handled by smart pointers.
CArray stores its children in a simple container indexed by position. CDict stores its children
in a container indexed by a string which uniquely identifies a property. The beauty of smart pointers shows when deallocating
an array: it automatically deallocates its children when they are no longer referenced, and this is done recursively.
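The recursive release of children can be demonstrated with a short sketch. Property and ArrayProperty are hypothetical stand-ins for IProperty and CArray, std::shared_ptr stands in for boost::shared_ptr, and the alive counter exists only to make the effect observable.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical base class standing in for IProperty.
struct Property {
    static int alive;  // number of live instances, for illustration only
    Property() { ++alive; }
    Property(const Property&) = delete;
    Property& operator=(const Property&) = delete;
    virtual ~Property() { --alive; }
};
int Property::alive = 0;

// Hypothetical array type: children are held through shared pointers.
struct ArrayProperty : Property {
    std::vector<std::shared_ptr<Property>> children;
};

// Illustrative helper: an outer array holding an inner array (with one
// simple child) and one simple child of its own, four objects in total.
inline std::shared_ptr<ArrayProperty> makeNested() {
    auto outer = std::make_shared<ArrayProperty>();
    auto inner = std::make_shared<ArrayProperty>();
    inner->children.push_back(std::make_shared<Property>());
    outer->children.push_back(inner);
    outer->children.push_back(std::make_shared<Property>());
    return outer;
}
```

When the last pointer to the outer array goes away, every child which is not referenced from elsewhere is destroyed with it; a child that somebody still holds survives until that holder releases it.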
Streams are the most complicated of all pdf objects. The problem is that xpdf can decode pdf files but it can
not do it the other way round (because it never needs to). The xpdf implementation of streams is very rough.
We use boost filtering iostreams, which provide us with the necessary general concept of
encoded streams. However, we have to implement the filters ourselves (the easiest way is to save decoded streams
without any filters). We do not need to know the filters as long as we do not change the object. We can either modify the raw encoded stream
or save a decoded stream, which is automatically encoded using the available filters when saved.
Each stream consists of a dictionary and a stream buffer. The dictionary can not be accessed directly.
The dictionary interface is simulated by simple methods which delegate the calls to the dictionary object.
The buffer is stored in encoded form, allowing us to return the same byte representation of an
unchanged object as read from a pdf file. At the time of writing, not all reversed filters have been implemented.
We access streams using the Adapter design pattern implementing an open/close interface. We need
to be able to read from several streams byte by byte because content streams can be split anywhere.
We decided to return only xpdf objects.
Every object can be obtained from CXref (see the section called “CXRef”) when its reference number is known, and then changed using the iproperty interface.
The internal state of special objects (cpage, ccontentstream, etc.) depends on these raw objects.
Therefore a mechanism was designed to allow special objects to be notified about raw changes. Objects implement the subject interface from the
Observer design pattern, which allows everyone to register observers on these objects.
An observer gets notified each time the object is changed.
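A minimal sketch of this notification mechanism follows. All names (IObserver, ObservedInt, PageCache) are hypothetical and the real pdfedit observer interface carries more information about the change; the sketch only shows a special object reacting to a change of a raw value.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical observer interface; the real pdfedit one differs in detail.
class IObserver {
public:
    virtual ~IObserver() {}
    virtual void notify(int newValue) = 0;
};

// Hypothetical observed raw value playing the role of a property.
class ObservedInt {
public:
    void registerObserver(const std::shared_ptr<IObserver>& o) {
        assert(o);  // registering an empty observer makes no sense
        observers_.push_back(o);
    }
    // Every change is announced to all registered observers.
    void setValue(int v) {
        value_ = v;
        for (const auto& o : observers_)
            o->notify(v);
    }
    int value() const { return value_; }
private:
    int value_ = 0;
    std::vector<std::shared_ptr<IObserver>> observers_;
};

// Example observer: a page-like object which marks its cached state stale
// whenever the raw value it depends on changes.
class PageCache : public IObserver {
public:
    void notify(int) override { valid = false; ++notifications; }
    bool valid = true;
    int notifications = 0;
};
```

The observed object does not need to know what kind of object depends on it, which is what allows raw changes made through the property interface to reach cpage or ccontentstream.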
The CPdf class is the main pdf document class. It maintains document content
using an XRefWriter field and the document catalog PDF dictionary, and provides
other specialized high level objects such as CPage (see the section called “CPage”)
and COutline.
Its main responsibility is to keep all the objects it provides synchronized
with the data which are used for them. As an example, it has to keep
CPage instances synchronized with the current state of the page tree.
In design terminology, CPdf implements the Facade design pattern for
manipulating the file at document scope. All internal objects
used for particular purposes are hidden from the class user and CPdf
provides an interface for manipulation with them (as an example, CPdf uses
XRefWriter (see the section called “XRefWriter”)
which enables making changes to the document, but exports only
CXref (see the section called “CXRef”)
which enables just object fetching - almost the same interface as the Xpdf
XRef class).
An instance of CPdf can be created only by the getInstance
factory method (see Factory method design pattern) and destroyed
only by the close method. A CPdf instance is a
single purpose object which maintains exactly one document during its
lifetime (between creation and close).
Each document may be opened in one of several modes. Each mode controls the
set of activities which can be done. ReadOnly mode guarantees that
no change can be made to the document. ReadWrite mode enables making
changes but with some restrictions (see programming documentation
for more information). Finally, Advanced mode brings full control
over the document.
Properties changing and revision handling
All changes to the document are done using
XRefWriter as described in Chapter 7, Changes to pdf document. The additional
logic and responsibility of CPdf in this direction is to provide an adapter
from the IProperty interface of a property to the xpdf Object required by
XRefWriter. Moreover, it also provides an interface to get indirect
objects in IProperty form. This means that it hides low level
information about who is doing the parsing and storing and what data
types are used. It also guarantees that every indirect property is
represented by the same IProperty instance, to enable their reasonable
usage.
To also enable inter document data exchange (in the form of properties),
it provides functionality for adding indirect properties. When a
property is from another CPdf (this may mean another document), it does
all the necessary handling for this situation (e. g. all other indirect
objects which are referred to from the added one are added too).
Revision handling is done in a similar way, but in this case without any
special logic. Revision changing, getting the current revision and
cloning are directly delegated to XRefWriter. If a document save
is required, CPdf just checks that the mode is not read only and delegates
the rest to XRefWriter.
Provided high level objects
CPdf provides high level objects, each maintaining some specialized part
of the Pdf document catalog. These objects bring
logic to properties with special meaning in the pdf document.
[7]
Pdf dictionaries referenced from the Pdf page tree are
called page objects. These dictionaries must contain a "Type" entry
and the value must be a name object equal to "Page". Page objects
are the basic building blocks of pdf files. Each page can be independent
with all required information stored in its dictionary. Some of its
properties can be inherited from parent pages. A page object describes
the appearance of the page (width, height, fonts used, rotation etc.).
The core of a page is one or more content streams which specify what
is on the page (text, pictures, graphics ...).
Generally, changing an object can result in many other changes.
Objects often depend on other objects. Changing a page property
can result in
- redraw of the page
- redraw of other pages
This is the reason why cpage implements the observer subject interface (see
the section called “Observers”).
An object can be notified if cpage changes.
The page dictionary can be changed in two ways: either by cpage methods or
by requesting the raw dictionary by its reference number. If we do not
want to parse the whole pdf file, we do not have the information
whether an object is a page or just an object with page type. This
problem is solved by the Observer design pattern. We observe the
underlying dictionary, which implements the observer subject interface.
This way, we know every time the dictionary is changed either by cpage
or by cobject.
Xpdf has the best displaying engine of all tested viewers. All calls
to display methods are delegated to this engine. When displaying a page
CPage creates xpdf Page object from its actual state. Then it uses Page
object method displaySlice() to display a
rectangle of a page. Xpdf creates the graphical environment for drawing
into an output device. Finally it draws the page into supplied device.
Everything on a page is in its content stream(s).
Every operation in a content stream means altering the actual position
by moving the drawing pen from one position to another position creating
a rectangle which can be used to constrain each object. The rectangle can
be used to order all objects into a structure which can be easily
searched. This enables effective selection of only some objects.
The pdf specification does not force pdf converters to keep the text
structure of the converted document. This means that no text element
needs to correspond to an element in the original file, e.g.
paragraphs, lines, words. Even the order of single letters can
be arbitrary. Xpdf (or any other sophisticated viewer) only
guesses which letters form a word or which words form a line.
We use the xpdf text engine to extract and search text.
A pdf file can either refer to external system fonts, which are
specified by the pdf specification and which each viewer should have
available, or it can embed font metrics into
the pdf file. The latter option is very tricky because it
allows the font to contain only those letters which are
actually used. The CPage object supports adding fonts which
can be used on a page. Fonts must be present in the pdf
file or they must refer to system fonts.
An annotation is an interactive entity which is associated with a rectangle
on a page. Annotations are organized in the Annots array
entry in the page dictionary, so CPage is responsible for returning
all available annotations and also for providing an interface to add a new
annotation.
Each annotation is described by a dictionary, which has to contain at
least a Type element with the value Annot
and a Subtype element with the concrete annotation type.
The pdf specification describes several annotation types (e. g.
text annotation - which describes a text box floating over normal page
text, link annotation - which enables a jump to a target within the document
or performs a certain action when the link is activated by a mouse click).
The Rect element should be present as well, because it
specifies where the annotation is placed.
CAnnotation represents such an annotation and provides a simple interface to
manipulate elements in the annotation dictionary. It implements
ObserverHandler (see the section called “Observers”) to enable the user
to be informed about changes inside the annotation dictionary. This class
provides just a simple interface for internal manipulation and it is
intended to be a base class for specific annotation types (no such
specialized classes are available yet, because they are not required by the
project at this moment).
An instance can be created in 2 different ways.
The first possibility is to use an existing annotation dictionary (e. g.
fetched from a document). This way is used in the CPage class (see the section called “CPage”),
where already existing annotations are fetched and used for
CAnnotation instances.
The second way is to use the factory
static boost::shared_ptr<CAnnotation>
createAnnotation(Rectangle rect, std::string annotType);
method (see Factory method design pattern). This method uses the
internal static annotInit initializator (see the section called “Annotation initializator”).
The initializator is responsible for correct annotation dictionary
initialization according to the given type and for the given rectangle.
This is a safe way to create new annotation instances.
An annotation initializator is represented by the IAnnotInitializator
abstract class. It provides a Functor which
initializes the given dictionary with correct data according to the given type.
The initializator is designed as a Composite design pattern, so one
initializator class can support initialization of several annotation
types (getSupportedList returns the annotation types which are supported
by this initializator).
We have implemented UniversalAnnotInitializer
which adds a
bool registerInitializer(std::string annotType, boost::shared_ptr<IAnnotInitializator> impl, bool forceNew=false);
method responsible for registration of another initializator in the
composite of initializators. When
createAnnotation is called,
UniversalAnnotInitializer chooses the registered
implementer which supports the given annotation type.
UniversalAnnotInitializer itself just initializes the elements common
to all annotations (such as the Type, Subtype and Rect elements).
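The composite of initializators can be sketched as follows. The class and method names are taken from the text, but everything else is an assumption: the dictionary is reduced to a string map (AnnotDict), the functor signature is invented, forceNew is assumed to mean "replace an existing registration", the Rect element is left out, and TextAnnotInitializer with its Contents entry is a hypothetical example.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical, simplified annotation dictionary.
typedef std::map<std::string, std::string> AnnotDict;

class IAnnotInitializator {
public:
    virtual ~IAnnotInitializator() {}
    // Functor which fills the dictionary for the given annotation type.
    // The signature is an assumption of this sketch.
    virtual bool operator()(AnnotDict& dict,
                            const std::string& annotType) const = 0;
};

class UniversalAnnotInitializer : public IAnnotInitializator {
public:
    // Registers an implementer for one annotation type. Assumption: an
    // existing registration is replaced only when forceNew is true.
    bool registerInitializer(const std::string& annotType,
                             std::shared_ptr<IAnnotInitializator> impl,
                             bool forceNew = false) {
        assert(impl);
        if (!forceNew && impls_.count(annotType))
            return false;
        impls_[annotType] = impl;
        return true;
    }

    bool operator()(AnnotDict& dict,
                    const std::string& annotType) const override {
        // Elements common to all annotations.
        dict["Type"] = "Annot";
        dict["Subtype"] = annotType;
        // Type specific elements are delegated to the registered implementer.
        auto it = impls_.find(annotType);
        return it == impls_.end() ? true : (*it->second)(dict, annotType);
    }

private:
    std::map<std::string, std::shared_ptr<IAnnotInitializator>> impls_;
};

// Hypothetical implementer for text annotations.
class TextAnnotInitializer : public IAnnotInitializator {
public:
    bool operator()(AnnotDict& dict, const std::string&) const override {
        dict["Contents"] = "";
        return true;
    }
};
```

A caller registers a TextAnnotInitializer under "Text" and then invokes the universal initializer, which fills in the common elements and delegates the rest.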
The CXRef class wraps the Xpdf XRef class and provides additional functionality
with the same interface (see Wrapper design pattern). It provides a
protected interface for making changes and internally stores changed
objects. When an object should be fetched (the fetch
method is called), it checks whether this object has already been changed
and if so, uses the changed value. Otherwise it delegates to the original XRef
implementation (this logic is kept in all methods defined in XRef).
CXRef inherits from XRef and so can be used polymorphically in xpdf code,
and this code doesn't need any changes to use CXref functionality.
The additional interface enables changes, but we want to keep the making of
changes under control, so it is protected and accessible only to
descendants in the inheritance hierarchy.
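The wrapper logic can be sketched as follows. BaseXRef, ChangingXRef and WritableXRef are hypothetical stand-ins for XRef, CXref and XRefWriter; objects are reduced to strings and references to plain object numbers.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical, simplified stand-in for XRef: values are plain strings.
class BaseXRef {
public:
    virtual ~BaseXRef() {}
    virtual std::string fetch(int num) const {
        auto it = stored_.find(num);
        return it == stored_.end() ? std::string("null") : it->second;
    }
protected:
    std::map<int, std::string> stored_;  // objects as read from the file
};

// Stand-in for CXref: same interface, changed objects take precedence.
class ChangingXRef : public BaseXRef {
public:
    std::string fetch(int num) const override {
        auto it = changed_.find(num);
        if (it != changed_.end())
            return it->second;          // already changed: use new value
        return BaseXRef::fetch(num);    // otherwise delegate to original
    }
protected:
    // Change interface is protected, so only descendants can use it.
    void changeObject(int num, const std::string& value) {
        changed_[num] = value;
    }
private:
    std::map<int, std::string> changed_;
};

// Stand-in for a descendant (like XRefWriter) which exposes changes.
class WritableXRef : public ChangingXRef {
public:
    WritableXRef() { stored_[1] = "original"; }
    using ChangingXRef::changeObject;
};
```

Code which only knows the base interface sees changed values without being modified itself, which is the point of the wrapper.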
Added functionality includes:
- new indirect objects creation - creates a new pdf object and
associates it with a reserved reference.
- changing of already existing indirect objects - the
changeObject method replaces the object
with the given reference by the given object.
- changing of document trailer - add, remove or change elements of the
pdf trailer.
- checking for type safety - checks whether the given object can
safely replace the original value (in the document or in the currently saved
form) according to types. A change is considered type safe
when the new value type is the same as the old type, or the dereferenced
types are the same (if any of the types is a reference), or when the original value is
CNull, in which case the new value may have arbitrary type.
- reopen functionality - the
reopen method
is responsible for reopening the document content with the Cross reference table
at the specified position. This is then used to change the current
revision of the document, where the cross reference table position is
specific for the desired revision.
For more information about CXref usage, please see Chapter 7, Changes to pdf document.
A content stream can consist of one or more pdf streams.
It is responsible for everything on a page. If anything visible is
changed, the content stream must be changed. A content stream is
processed sequentially. A page can consist of one or more
content streams and these streams must be concatenated before reading
(objects can be split between two content streams). A content stream
consists of operators and their operands. Each operator updates the
graphical state.
Generally, changing anything visible on a page means changing
something in underlying content stream. Because operators are
processed sequentially changing of one operator/operand may
affect many following operators (e.g. their bounding boxes).
Page needs to know if a content stream is changed because it
must reparse operators. CContentStream implements observer
subject interface so for example cpage (as content stream
maintainer) can be informed when it is changed.
A content stream can be changed in several ways: by
ccontentstream methods, by requesting a raw operator and changing
its operands, or by adding/deleting a whole stream.
This problem is solved by ???.
Operators are processed sequentially and there are many situations
when only some types of operators are needed. A clean solution is to
use the Iterator design pattern. With this pattern we can process
operators one by one. If we need specific iterators we just create
another child of the basic iterator. There are
simple and composite
operators. Operators form a tree-like structure. This is more
readable than a list of all operators. So we implement another
tree-like queue. Only the first level of operators is stored in
ccontentstream. Each composite operator stores its children.
This is an example of the Composite design pattern. Simple and
composite operators are accessed uniformly.
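The relation between the tree of operators and their sequential order can be sketched with a depth-first traversal. This is not the pdfedit iterator itself: Operator, makeOp and collectNames are hypothetical, and treating a text object as a composite is only an example.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical operator node; simple operators have no children.
struct Operator {
    std::string name;
    std::vector<std::shared_ptr<Operator>> children;
};

typedef std::vector<std::shared_ptr<Operator>> OperatorList;

// Illustrative helper which creates an operator with the given name.
inline std::shared_ptr<Operator> makeOp(const std::string& name) {
    auto op = std::make_shared<Operator>();
    op->name = name;
    return op;
}

// Depth-first traversal: simple and composite operators are visited
// uniformly, in the order in which they appear in the content stream.
inline void collectNames(const OperatorList& level,
                         std::vector<std::string>& out) {
    for (const auto& op : level) {
        assert(op);
        out.push_back(op->name);
        collectNames(op->children, out);
    }
}
```

Only the first level would be stored in the content stream object, while the traversal recovers the flat sequential order that an iterator over all operators would produce.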
Deleting and inserting an operator is not easy because it is stored
in two queues. We have information only about one queue (that one
which was used to get the operator) so we need to find out the
position in the second queue and change it adequately.