bug#22241: 25.0.50; etags Ruby parser problems
From: Dmitry Gutov
Subject: bug#22241: 25.0.50; etags Ruby parser problems
Date: Sat, 23 Jan 2016 21:23:57 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:44.0) Gecko/20100101 Thunderbird/44.0
On 01/23/2016 07:38 PM, Eli Zaretskii wrote:
I don't speak Ruby. So please give a more detailed spec for the
features you want added. I wrote some questions below, but I'm quite
sure there are more questions I should ask, but don't know about. So
please provide as complete specification for each feature as you
possibly can, TIA.
There's no actual up-to-date language spec, and when in doubt, I fire up
the REPL and try things out (and forget many of the results afterwards).
So there's no "detailed spec" in my head. Let me just try my best
answering your questions, for now.
- Constants are not indexed.
What is the full syntax of a "constant"? Is it just
IDENTIFIER "=" INTEGER-NUMBER
Pretty much. IDENTIFIER should be ALL_CAPS, or CamelCase, with
underscores allowed.
INTEGER-NUMBER should be just EXPRESSION, because it can be any
expression, possibly a multiline one.
CamelCase constants usually are assigned some "anonymous class" value,
like in the following example:
SpecialError = Class.new(StandardError)
(Which is a metaprogramming-y way to define the class SpecialError).
But you probably shouldn't worry about ALL_CAPS vs CamelCase distinction
here, and just treat them the same.
? Is whitespace significant? What about newlines?
No spaces around "=" is fine. Spaces can also be replaced by tabs. A
newline before "=" is not allowed.
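To make the constant rule above concrete, here is a rough sketch in Ruby of a line matcher. The names CONSTANT_ASSIGNMENT and constant_name are made up for illustration and are not part of etags. It assumes the identifier and "=" sit on the same line, treats ALL_CAPS and CamelCase the same, and ignores whatever expression follows the "=":

```ruby
# Hypothetical helper, not part of etags.
# Matches IDENTIFIER "=" at the start of a line, where IDENTIFIER begins
# with an uppercase letter. Spaces or tabs around "=" are optional.
# The lookahead rejects "==", "=~" and "=>", which are not assignments.
CONSTANT_ASSIGNMENT = /\A[ \t]*([A-Z][A-Za-z0-9_]*)[ \t]*=(?![=~>])/

# Returns the constant name, or nil if the line is not a constant assignment.
def constant_name(line)
  match = CONSTANT_ASSIGNMENT.match(line)
  match && match[1]
end
```

Since only the left-hand side is inspected, a multiline expression on the right should not matter for this check.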
- Class methods (def self.foo) are given the wrong name ("self."
shouldn't be included).
Is it enough to remove a single "self.", case-sensitive, at the
beginning of an identifier? Can there be more than one, like
"self.self.SOMETHING"?
Only one "self." is allowed. When you remove it, you should record that
SOMETHING is a method defined on the current class (or module). In Java
terms, say, it would be like a "static" method.
The upshot is, it can be called on the class itself, but not on its
instance:
irb(main):001:0> class C
irb(main):002:1> def self.foo
irb(main):003:2> 3
irb(main):004:2> end
irb(main):005:1> end
=> nil
irb(main):006:0> C.foo
=> 3
irb(main):007:0> C.new.foo
NoMethodError: undefined method `foo' for #<C:0x000000020141e8>
So the qualified name of that method should be "C.foo", as opposed to
"C#foo" for an instance method.
Your other example, i.e.
def ModuleExample.singleton_module_method
indicates that anything up to and including the period should be
removed, is that correct?
More or less. This is an "explicit syntax", which is equivalent to using
"self.". These two declarations are equivalent:
module ModuleExample
def ModuleExample.foo
end
end
module ModuleExample
def self.foo
end
end
Is there only one, or can there be many?
There can be only one dot there. There could be a scope resolution
operator (::) in there, I suppose, but I'm not sure if you want to add
support for that right now, or ever.
Should they all be removed for an unqualified name?
Yes.
- "class << self" blocks are given a separate entry.
What should be done instead? Can't a class be named "<<"?
A class cannot be named "<<". You should not add that line to the index,
but record that the method definitions inside the following scope are
defined on the current class or module. These are equivalent:
class C
def self.foo
end
end
class C
class << self
def foo
end
end
end
- Qualified tag names are never generated.
(Etags never promised qualified names except for C and derived
languages, and also in Java.)
OK, that would be a nice bonus, but we can live without it. ctags
doesn't define qualified names either.
Without qualified names, I suppose you should treat
def self.foo
end
and
def foo
end
and
def Class.foo
end
the same. Only record those as "foo".
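As an illustration of the unqualified case, a minimal Ruby sketch that reduces all three forms to the bare method name. DEF_LINE and unqualified_method_name are hypothetical names, not anything etags defines. It assumes a single receiver followed by one dot, and it does not handle "::" in the receiver or operator methods such as "<=>":

```ruby
# Hypothetical helper, not part of etags.
# Optional receiver ("self." or "SomeName.") followed by the method name,
# which may end in "?", "!" or "=".
DEF_LINE = /\A[ \t]*def[ \t]+(?:[A-Za-z_][A-Za-z0-9_]*\.)?([A-Za-z_][A-Za-z0-9_]*[?!=]?)/

# Returns the method name without its receiver, or nil if the line is
# not a "def" line this sketch understands.
def unqualified_method_name(line)
  match = DEF_LINE.match(line)
  match && match[1]
end
```

With qualified names, the same match could also report whether a receiver was present, which is what marks the method as a class method.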
How to know when a module's or a class's scope ends? Is it enough to
count "end" lines?
Hmm, maybe? I'm guessing etags doesn't really handle heredoc syntax, or
multiline strings defined with percent literals (examples here:
https://en.wikibooks.org/wiki/Ruby_Programming/Syntax/Literals#.22Here_document.22_notation)
The result shouldn't be too bad if you do that, anyway. Except:
Can I assume that "end" will always appear by
itself on a line?
Unfortunately, no. It can also be on the same line, after a semicolon
(or on any other line, I suppose, but nobody writes Ruby like that).
Examples:
class SpecialError < StandardError; end
or
class MyStruct < Struct.new(:a, :b, :c); end
(One could also stick a method definition inside that, but I haven't
seen that in practice yet). So, either:
- 'end' is on a separate line (after ^[ \t]*).
- class/module Name[< ]...; end$
'end' can also be followed by "# some comment" in both cases.
Can I disregard indentation of "end" (and of
everything else) when I determine where a scope begins and ends?
Probably, yes.
Indentation is not significant in Ruby, but heredocs can mess up the
detection of 'end' keywords, so we could use indentation as a way to
detect where each scope ends. But if etags doesn't normally do that,
let's not go there now.
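The two 'end' patterns listed above could be recognized roughly like this. It is only a sketch: END_ALONE, ONE_LINE_DEFINITION and the two predicate methods are names invented here, and heredocs or multiline strings containing a lone "end" would still fool it, as discussed:

```ruby
# Hypothetical helpers, not part of etags.

# 'end' alone on a line, optionally followed by "# some comment".
END_ALONE = /\A[ \t]*end[ \t]*(?:#.*)?\z/

# class/module Name ...; end  (also optionally followed by a comment).
ONE_LINE_DEFINITION =
  /\A[ \t]*(?:class|module)[ \t]+[A-Z][A-Za-z0-9_]*.*;[ \t]*end[ \t]*(?:#.*)?\z/

# True if the line closes a scope by itself.
def end_alone?(line)
  !!(END_ALONE =~ line.chomp)
end

# True if the line opens and closes a class or module on the same line,
# so the scope depth should not change.
def one_line_definition?(line)
  !!(ONE_LINE_DEFINITION =~ line.chomp)
end
```

Indentation is deliberately ignored, in line with the answer above.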
A
A::B
A::B::ABC
A::B#foo!
A::B.bar?
A::B.qux=
Why did 'foo!' get a '#' instead of a '.', as for '_bar'?
It's common to use '#' in the qualified names of instance methods, in
Java, Ruby and JS docstrings. '.' is used for class methods (static
methods, in Java), or methods defined on other singleton objects.
Examples:
http://usejsdoc.org/tags-inline-link.html (search for '#' there)
http://stackoverflow.com/questions/5915992/javadoc-writing-links-to-methods
http://docs.ruby-lang.org/en/2.1.0/RDoc/Markup.html#class-RDoc::Markup-label-Links
(the documentation also says to use ":: for class methods", but let's
not do that)
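To spell out the separator convention, a small Ruby sketch that builds a qualified name from the enclosing scopes. The method name qualified_name and its parameters are hypothetical; the scope is assumed to be an array of enclosing class and module names, outermost first:

```ruby
# Hypothetical helper, not part of etags.
# scope:        enclosing class/module names, outermost first
# method_name:  unqualified method name, e.g. "foo!"
# class_method: true for methods defined on the class or module itself
def qualified_name(scope, method_name, class_method)
  separator = class_method ? "." : "#"
  "#{scope.join('::')}#{separator}#{method_name}"
end
```

For example, qualified_name(%w[A B], "foo!", false) gives "A::B#foo!", and qualified_name(%w[A B], "bar?", true) gives "A::B.bar?".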
> Why doesn't
> "class << self" count as a class scope, and add something to qualified
> names?
It just served to turn 'qux=' into a class (static) method.
should become (the unqualified version):
A
foo
bar=
tee
tee=
qux
All attr_* methods can take a variable number of arguments. The parser
should take each argument, check that it's a symbol and not a variable
(starts with :), and if so, record the corresponding method name.
Why did 'bar' and 'tee' get a '=' appended?
Because 'attr_writer :bar' effectively expands to
def bar=(val)
@bar = val
end
and 'attr_accessor :tee' expands into
def tee
@tee
end
def tee=(val)
@tee = val
end
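Put together, the attr_* handling might look like the following Ruby sketch. ATTR_LINE and attr_method_names are names made up for this example. It assumes all arguments are on one line, accepts optional parentheses, and skips anything that is not a plain symbol literal:

```ruby
# Hypothetical helper, not part of etags.
ATTR_LINE = /\A[ \t]*(attr_reader|attr_writer|attr_accessor)\b(.*)\z/

# Returns the method names that an attr_* line defines, or [] if the
# line is not an attr_* call.
def attr_method_names(line)
  match = ATTR_LINE.match(line.chomp)
  return [] unless match

  macro = match[1]
  # Drop a trailing comment and optional parentheses, then split on commas.
  args = match[2].sub(/#.*\z/, "").delete("()").split(",").map(&:strip)

  args.flat_map do |arg|
    # Only symbol literals (":name") are recorded; variables are skipped.
    next [] unless arg =~ /\A:([A-Za-z_][A-Za-z0-9_]*)\z/
    name = $1
    case macro
    when "attr_reader"   then [name]
    when "attr_writer"   then ["#{name}="]
    when "attr_accessor" then [name, "#{name}="]
    end
  end
end
```

Here attr_method_names("attr_writer :bar") returns ["bar="], and attr_method_names("attr_accessor :tee") returns ["tee", "tee="].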
Are there any other such "append rules"?
There are other macros (any code can define a macro), but let's not
worry about them now.