Re: rtf parsing
From: Fred Kiefer
Subject: Re: rtf parsing
Date: Fri, 14 Feb 2003 13:17:35 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20020903
Hi Alexander,
Alexander Malmberg wrote:
> I've been hacking the rtf parser a bit (trying to give the text system
> something to display :). After a few fixes, it now displays most things
> I give it properly, but there are a bunch of things in the rtf parser
> I'd like to change (there's no character set handling, a fair amount of
> inefficiencies, and I think this parser should be very liberal in what
> it accepts). However, a few things about it are a bit unclear:
As usual with GNUstep, things are the way they are because of their
history and not because of one specific intention. The original RTF
parser code was written by Stefan Böhringer, and a lot of the structure
still goes back to that time. When I started to work on it, I did so
for exactly the same reason you are doing it now: I wanted to be able to
display something in the text system, the part of GNUstep I was working
on at that time. All the changes I made were just to get some small
extras passed on, and as soon as this worked I left the rest of the RTF
parser untouched.
You are right about the character set handling, and it is even worse
than that: currently only ASCII strings get passed on from the scanner.
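To illustrate what character set handling could involve, here is a
minimal sketch in Python (not GNUstep code). RTF writes non-ASCII bytes
as \'hh hex escapes that are interpreted in the document's code page.
The function name and the default code page cp1252 are assumptions made
for this example, and multi-byte code pages would need more care than
this byte-by-byte approach.

```python
import re

# Matches RTF hex escapes of the form \'hh (one byte in the document code page).
HEX_ESCAPE = re.compile(r"\\'([0-9a-fA-F]{2})")


def decode_hex_escapes(text, codepage="cp1252"):
    """Replace each hex escape with the character it encodes in `codepage`.

    Hypothetical helper: a real parser would take the code page from the
    \\ansicpgN control word instead of assuming cp1252.
    """
    def repl(match):
        byte = bytes([int(match.group(1), 16)])
        # Undefined bytes become U+FFFD rather than raising an error.
        return byte.decode(codepage, errors="replace")

    return HEX_ESCAPE.sub(repl, text)
```

For example, decode_hex_escapes(r"B\'f6hringer") yields "Böhringer".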
> 1. The parser's written using bison. This seems odd to me since, afaict,
> there's no syntax in rtf that needs parsing (except blocks) after lexing
> is done, and it makes it harder to handle unknown control words,
> skipping characters, and other odd cases. Is there any specific reason
> why it was done this way?
As far as I know, there is no reason behind this other than having code
that is very easy to maintain and extend. You could see this yourself
when you corrected and extended it. This would get a lot harder with
a hand-crafted parser.
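For comparison, the lexer-only approach raised in the question can be
sketched roughly as follows. This is a Python illustration, not the
GNUstep implementation: after tokenizing, only the group depth is
tracked, unknown control words are ignored, and ignorable destinations
are skipped. The names TOKEN, KNOWN_WORDS, SKIPPED_DESTINATIONS and
rtf_to_text are invented for this example, the control-word table is a
tiny subset, and cp1252 is an assumed code page.

```python
import re

# One token per match: group delimiters, control words (with an optional
# numeric parameter and one delimiting space), hex escapes, control
# symbols, or a run of plain text.
TOKEN = re.compile(
    r"(?P<open>\{)"
    r"|(?P<close>\})"
    r"|\\(?P<word>[a-zA-Z]+)(?P<param>-?\d+)? ?"
    r"|\\'(?P<hex>[0-9a-fA-F]{2})"
    r"|\\(?P<symbol>[^a-zA-Z])"
    r"|(?P<text>[^\\{}]+)"
)

KNOWN_WORDS = {"par": "\n", "line": "\n", "tab": "\t"}  # illustrative subset
SKIPPED_DESTINATIONS = {"fonttbl", "colortbl", "stylesheet", "info"}


def rtf_to_text(rtf, codepage="cp1252"):
    """Extract plain text, silently ignoring anything not understood."""
    out = []
    depth = 0
    skip_depth = None  # depth of the group being skipped, if any
    for m in TOKEN.finditer(rtf):
        if m.group("open"):
            depth += 1
        elif m.group("close"):
            if skip_depth == depth:
                skip_depth = None  # the skipped group ends here
            depth -= 1
        elif skip_depth is not None:
            continue  # inside a skipped group: drop everything
        elif m.group("word"):
            word = m.group("word")
            if word in SKIPPED_DESTINATIONS:
                skip_depth = depth
            else:
                # Unknown control words contribute nothing.
                out.append(KNOWN_WORDS.get(word, ""))
        elif m.group("hex"):
            byte = bytes([int(m.group("hex"), 16)])
            out.append(byte.decode(codepage, errors="replace"))
        elif m.group("symbol"):
            sym = m.group("symbol")
            if sym == "*":
                # Simplification: skip every \* (ignorable) destination.
                skip_depth = depth
            elif sym in "\\{}":
                out.append(sym)  # escaped backslash or brace
        elif m.group("text"):
            # Raw line breaks carry no meaning in RTF.
            out.append(m.group("text").replace("\r", "").replace("\n", ""))
    return "".join(out)
```

The trade-off is the one described above: a table-driven loop like this
is liberal about unknown input, but every new feature has to be added by
hand rather than as a grammar rule.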
> 2. rtfGrammer.y talks about making the grammar easy to use in other
> contexts as well (using c callbacks, it seems). Is anyone using it for
> that?
I don't know of any other application using this. Rather, the opposite
is the case: there are now fairly good RTF parsers available (for
example, as part of AbiWord), and we should rather think about switching
to one of these.
It would be great if somebody took over responsibility for the RTF
parser and came up with a better implementation. But I don't see this as
a high-priority issue. Currently the rest of the text system needs more
attention. That must be sorted out for the next release of GNUstep,
whereas the RTF parser, bad as it is, is good enough to work.
BTW, talking about the text system, is there a specific reason why you
removed the code for ruler markers in NSLayoutManager? This now leads
to a crash when switching on rulers in Ink. This problem doesn't have
the highest priority either; I just never understood the background.
If there is any way I could help to improve the text system, let me
know. Although we still have different positions on a few minor issues
(I still think merging GSLayoutManager and NSLayoutManager would be a
good thing, and I would also prefer to keep changes that break current
code to a minimum), I think that your rework was the best thing that
ever happened to the text system, and I would like to contribute to
making it faster again.
Cheers
Fred