help-smalltalk
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Help-smalltalk] [PATCH] Example Squeak scanner/parser


From: Paolo Bonzini
Subject: [Help-smalltalk] [PATCH] Example Squeak scanner/parser
Date: Sat, 09 Jun 2007 09:20:23 +0200
User-agent: Thunderbird 2.0.0.0 (Macintosh/20070326)

This shows the infrastructure just introduced, by supporting Squeak-to-GST file-in, with less than 100 lines of code (not counting latent bugs exposed by the patch). In the future, this will be accessible from Daniele's conversion tool through a command-line option.

The methods were being parsed twice during the conversion, the second time with an RBParser. This is wrong because Squeak replaces single exclamation marks with an escaped sequence '!!' within methods. We undo this in a special SqueakFileInScanner, but the second parse was making our efforts void. This was easily fixed by storing the method parse tree in the LoadedMethod object.

In turn, this showed that the problems with comments in doits actually applied to the methods too. Hence, I reused the MappedCollection trick for methods too.

Paolo
2007-06-09  Paolo Bonzini  <address@hidden>

        * compiler/Exporter.st: Use #parseNodeAt: to access the method
        parse tree.  Define it for Behavior and #methodParseNode for
        CompiledMethod.
        * compiler/STFileParser.st: Use MappedCollection trick for method
        source too.  Look at the token type to distinguish '!' from other
        tokens.
        * compiler/STLoaderObjs.st: Store an RBMethodNode in LoadedMethod
        objects.
        * compiler/STLoaer.st: Adjust for above change to LoadedMethod.
        * compiler/SqueakParser.st: New.

        * scripts/Convert.st: Add example method to convert from Squeak.

* looking for address@hidden/smalltalk--devo--2.2--patch-392 to compare with
* comparing to address@hidden/smalltalk--devo--2.2--patch-392
A  compiler/.arch-ids/SqueakParser.st.id
A  compiler/SqueakParser.st
M  compiler/STFileParser.st
M  compiler/Exporter.st
M  scripts/Convert.st
M  compiler/ChangeLog
M  compiler/STLoader.st
M  compiler/STLoaderObjs.st
M  packages.xml.in

* modified files

--- orig/compiler/ChangeLog
+++ mod/compiler/ChangeLog
@@ -1,3 +1,18 @@
+2007-06-09  Paolo Bonzini  <address@hidden>
+
+       * compiler/Exporter.st: Use #parseNodeAt: to access the method
+       parse tree.  Define it for Behavior and #methodParseNode for
+       CompiledMethod.
+       * compiler/STFileParser.st: Use MappedCollection trick for method
+       source too.  Look at the token type to distinguish '!' from other
+       tokens.
+       * compiler/STLoaderObjs.st: Store an RBMethodNode in LoadedMethod
+       objects.
+       * compiler/STLoaer.st: Adjust for above change to LoadedMethod.
+       * compiler/SqueakParser.st: New.
+
+       * scripts/Convert.st: Add example method to convert from Squeak.
+       
 2007-06-08  Paolo Bonzini  <address@hidden>
 
        * compiler/RBParser.st: Move file-in related stuff...


--- orig/compiler/Exporter.st
+++ mod/compiler/Exporter.st
@@ -299,14 +299,25 @@ FileOutExporter subclass: FormattingExpo
                     
        source := STInST.RBFormatter new
                      initialIndent: 1;
-                      format: (STInST.RBParser 
-                                  parseMethod: (class sourceCodeAt: selector)
-                                  category: (class compiledMethodAt: selector)
-                                               methodCategory).
+                      format: (class parseNodeAt: selector).
         outStream nextPutAll: source; nl.
     ]
 ]
 
+Behavior extend [
+    parseNodeAt: selector [
+        ^(self compiledMethodAt: selector) methodParseNode
+    ]
+]
+
+CompiledMethod extend [
+    methodParseNode [
+       ^STInST.RBParser 
+            parseMethod: self methodSourceString
+            category: self methodCategory
+    ]
+]
+
 Class extend [
     fileOutDeclarationOn: aFileStream [
         (STInST.FileOutExporter on: self to: aFileStream)


--- orig/compiler/STFileParser.st
+++ mod/compiler/STFileParser.st
@@ -159,7 +159,7 @@ parseMethodDefinitionList
     "Called after first !, expecting a set of bang terminated
      method definitions, followed by a bang"
 
-    | node start stop |
+    | node source start stop |
 
     [   scanner atEnd or: [ currentToken isSpecial
           and: [ currentToken value == $! ] ] ] whileFalse: [
@@ -171,7 +171,9 @@ parseMethodDefinitionList
         "One -1 accounts for base-1 vs. base-0 (as above), the
          other drops the bang because we have a one-token lookahead."
        stop := currentToken start - 2.
-        node source: (scanner stream segmentFrom: start to: stop).
+        source := scanner stream copyFrom: start to: stop.
+       source := MappedCollection collection: source map: (1 - start to: stop).
+        node source: source.
 
         scanner stripSeparators.
         self step.           "gobble method terminating bang"
@@ -258,13 +260,13 @@ RBScanner subclass: #STFileScanner
 !STFileScanner methodsFor: 'accessing'!
 
 next
-    | token ch |
+    | token |
     buffer reset.
     tokenStart := stream position.
-    ch := currentCharacter.
     characterType == #eof ifTrue: [^RBToken start: tokenStart + 1].    "The 
EOF token should occur after the end of input"
     token := self scanToken.
-    ch == $! ifFalse: [ self stripSeparators ].
+    (token isSpecial and: [ token value == $! ])
+        ifFalse: [ self stripSeparators ].
     ^token!
 
 


--- orig/compiler/STLoader.st
+++ mod/compiler/STLoader.st
@@ -214,12 +214,10 @@ endMethodList
 !
 
 defineMethod: node 
+    node category: currentCategory.
     ^currentClass methodDictionary
         at: (node selector asSymbol)
-        put: (LoadedMethod
-                category: currentCategory
-                source: (node source)
-                selector: (node selector asSymbol))
+        put: (LoadedMethod node: node)
 !
 
 compile: node


--- orig/compiler/STLoaderObjs.st
+++ mod/compiler/STLoaderObjs.st
@@ -168,7 +168,7 @@ LoadedBehavior comment:
 by an STClassLoader.'!
 
 Object subclass: #LoadedMethod
-        instanceVariableNames: 'source category selector'
+        instanceVariableNames: 'node'
         classVariableNames: ''
         poolDictionaries: ''
         category: 'System-Compiler'!
@@ -314,8 +314,13 @@ includesSelector: selector
     ^self methodDictionary includesKey: selector
 !
 
+parseNodeAt: selector
+    "Answer the parse tree (if available) for the given selector"
+    ^(self compiledMethodAt: selector) methodParseNode
+!
+
 sourceCodeAt: selector
-    "Answer source code (if available) for the given compiledMethod"
+    "Answer source code (if available) for the given selector"
     | source |
     source := (self compiledMethodAt: selector) methodSourceCode.
     source isNil ifTrue: [ '" *** SOURCE CODE NOT AVAILABLE *** "' copy ].
@@ -822,29 +827,39 @@ superclass: sup name: s instanceVariable
 
 !LoadedMethod class methodsFor: 'instance creation'!
 
-category: category source: source selector: selector
+node: aRBMethodNode
     ^self new
-           category: category
-           source: source
-           selector: selector
+           node: aRBMethodNode
 !
 
 !LoadedMethod methodsFor: 'accessing'!
 
+node
+    ^node
+!
+
+node: aRBMethodNode
+    node := aRBMethodNode
+!
+
+methodParseNode
+    ^self node
+!
+
 methodCategory
-    ^category
+    ^node category
 !
 
 methodSourceCode
-    ^source
+    ^node source
 !
 
 selector
-    ^selector
+    ^node selector asSymbol
 !
 
 methodSourceString
-    ^source asString
+    ^node source domain asString
 ! !
 
 !LoadedMethod methodsFor: 'empty stubs'!
@@ -852,14 +867,6 @@ methodSourceString
 discardTranslation
 ! !
 
-!LoadedMethod methodsFor: 'initializing'!
-
-category: c source: s selector: sel
-    category := c.
-    source := s.
-    selector := sel.
-! !
-
 !PseudoNamespace methodsFor: 'abstract'!
 
 name


--- orig/packages.xml.in
+++ mod/packages.xml.in
@@ -396,6 +396,7 @@
   <filein>STDecompiler.st</filein>
   <filein>STLoaderObjs.st</filein>
   <filein>STLoader.st</filein>
+  <filein>SqueakParser.st</filein>
   <filein>Exporter.st</filein>
 
   <filein>RewriteTests.st</filein>


--- orig/scripts/Convert.st
+++ mod/scripts/Convert.st
@@ -149,6 +149,11 @@ STInST.STClassLoader subclass: SyntaxCon
     entities that have been constructed.'>
     
     
+    SyntaxConverter class >> convertSqueakStream: in to: out [
+        <catogory: 'instance creation'>
+        ^self convertStream: in with: STInST.SqueakFileInParser to: out
+    ]
+    
     SyntaxConverter class >> convertStream: in to: out [
         <catogory: 'instance creation'>
         ^self convertStream: in with: STInST.STFileInParser to: out



--- /dev/null
+++ mod/compiler/SqueakParser.st
@@ -0,0 +1,125 @@
+"======================================================================
+|
+|   Squeak input parser
+|
+|
+ ======================================================================"
+
+
+"======================================================================
+|
+| Copyright 2007 Free Software Foundation, Inc.
+| Written by Paolo Bonzini.
+|
+| This file is part of GNU Smalltalk.
+|
+| GNU Smalltalk is free software; you can redistribute it and/or modify it
+| under the terms of the GNU General Public License as published by the Free
+| Software Foundation; either version 2, or (at your option) any later version.
+| 
+| GNU Smalltalk is distributed in the hope that it will be useful, but WITHOUT
+| ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
+| FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more
+| details.
+| 
+| You should have received a copy of the GNU General Public License along with
+| GNU Smalltalk; see the file COPYING.  If not, write to the Free Software
+| Foundation, 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.  
+|
+ ======================================================================"
+
+
+STFileInParser subclass: #SqueakFileInParser
+    instanceVariableNames: ''
+    classVariableNames: ''
+    poolDictionaries: ''
+    category: 'Refactory-Parser'!
+
+!SqueakFileInParser methodsFor: 'accessing'!
+
+evaluate: node
+    "Convert some Squeak messages to GNU Smalltalk file-out syntax.
+     This avoids that the STParsingDriver need to know about other
+     dialects."
+    | stmt |
+    node statements size == 1 ifFalse: [ ^driver evaluate: node ].
+
+    stmt := node statements first.
+    stmt isMessage ifFalse: [ ^driver evaluate: node ].
+    stmt selector == #addCategory: ifTrue: [ ^false ].
+    stmt selector == #commentStamp:prior: ifTrue: [
+       stmt arguments: { scanner nextRawChunk }.
+       stmt selector: #comment: ].
+
+    stmt selector == #methodsFor:stamp: ifTrue: [
+       stmt arguments first value = 'as yet unclassified'
+           ifTrue: [ stmt arguments first token value: nil ].
+
+       stmt arguments: { stmt arguments first }.
+       stmt selector: #methodsFor: ].
+
+    ^driver evaluate: node
+! !
+
+!SqueakFileInParser methodsFor: 'private-parsing'!
+
+scannerClass
+    "We need a special scanner to convert the double-bangs in strings
+     to single bangs.  Unlike in GNU Smalltalk, all bangs must be
+     `escaped' in Squeak."
+    ^SqueakFileInScanner! !
+
+
+STFileScanner subclass: #SqueakFileInScanner
+    instanceVariableNames: ''
+    classVariableNames: ''
+    poolDictionaries: ''
+    category: 'Refactory-Parser'!
+
+
+
+!SqueakFileInScanner methodsFor: 'accessing'!
+
+scanLiteralString
+    "In theory, this should also be applied to method comments, but the
+     representation of comments in RBParseNode makes it more complicated;
+     not a big deal."
+    | val |
+    val := super scanLiteralString.
+    val value: (val value copyReplaceAll: '!!' with: '!').
+    val value: (val value copyReplacing: 13 asCharacter withObject: 10 
asCharacter).
+    ^val!
+
+scanSpecialCharacter
+    "Treat ! specially, it is a binary operator in Squeak (if properly 
escaped).
+     This actually fails on tokens like +!! which is the binary operator +!,
+     but this are never used in practice."
+
+    | val |
+    currentCharacter == $!
+       ifFalse: [^super scanSpecialCharacter].
+    self step == $!
+       ifFalse: [^RBSpecialCharacterToken value: $! start: tokenStart].
+
+    buffer nextPut: $!.
+    (characterType == #binary and: [currentCharacter ~~ $-]) ifTrue:
+            [buffer nextPut: currentCharacter.
+            self step].
+    val := buffer contents.
+    val := val asSymbol.
+    ^RBBinarySelectorToken value: val start: tokenStart!
+
+nextRawChunk
+    "Return a raw chunk, converting all double exclamation marks to single.
+     This is used for parsing Squeak class comments."
+
+    buffer reset.
+    [ currentCharacter == $! and: [self step ~~ $!] ]
+       whileFalse:
+            [buffer nextPut: currentCharacter.
+            self step].
+
+    self stripSeparators.
+    ^RBLiteralToken
+        value: buffer contents
+        start: tokenStart! !


reply via email to

[Prev in Thread] Current Thread [Next in Thread]