[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: symbol table changes for private functions, classes, etc.
From: |
David Bateman |
Subject: |
Re: symbol table changes for private functions, classes, etc. |
Date: |
Sat, 12 May 2007 07:46:45 +0200 |
User-agent: |
Thunderbird 1.5.0.7 (X11/20060921) |
John,
What might concern me with this idea is that it will potentially impact
the speed of Octave. The symbols that are out of scope when doing a
symbol lookup aren't useful for lookups in the current scope, but being
in the same map affect the speed of the lookup. I suppose std::map uses
a hash so perhaps this won't be so bad. In any case a few benchmarks of
complex examples are probably in order before and after the change
Also, do you see this change going into 3.0? It might be a bit dangerous
if a 3.0 release is imminent. If it goes in perhaps an additional delay
for 3.0 will be needed..
Regards
D.
John W. Eaton wrote:
> I've been working on implementing private functions and classes and
> it seems to me that there is no reasonable way to handle looking up
> these kinds of functions given the current design of the symbol
> table(s) in Octave.
>
> I propose that we completely replace what we have now (separate tables
> for each function plus tables for global variables, built-in
> functions, and symbols at the top level) with a single symbol table in
> which we keep track of different scopes using unique integer tags for each
> scope (one for each function, one for global variables, and one for
> the top level). Each function will be assigned a scope ID when it is
> constructed.
>
> The symbol table itself will be a std::map object that maps symbol
> names to symbol_record objects (but different from the current
> symbol_record object). The table will also contain a std::map object
> that maps scope IDs to lists of pointers to variable values in the
> given scope. That way, we can also clear variable values from a given
> scope.
>
> A symbol_record object contains all the information for all scopes and
> types of objects that a symbol might have:
>
> // Scope id to variable value.
> std::map<unsigned int, octave_variable> variables;
>
> // Scope id to function object.
> std::map<unsigned int, octave_value> subfunctions;
>
> // Directory name to function object.
> std::map<std::string, octave_value> private_functions;
>
> // Class name to function object.
> std::map<std::string, octave_value> class_constructors;
>
> // Dispatch type to function object.
> std::map<std::string, octave_value> class_methods;
>
> octave_value function_on_path;
>
> octave_value built_in_function;
>
> Variables are represented as an octave_variable object, which is just
> an octave_value plus an integer value that can hold one of
>
> enum CLASS
> {
> local = 0, // generic variable
> automatic = 1, // varargin, argn, __nargin__, __nargout__
> formal = 2, // formal parameter
> global = 4, // global (redirects lookup to global scope)
> persistent = 8, // not cleared at function exit
> };
>
> Lookup of symbol values is done in a member function of the
> symbol_record class.
>
> For example, to get a reference to a variable in the symbol table, you
> would write
>
> octave_value& val = symbol_table::varref ("name");
>
> Then you can modify VAL.
>
> Finding the definition of a value (variable or function) is a bit more
> complex as it requires the argument list (if any) so that dispatching
> for class methods will work. I'm not completely happy with what I've
> come up with so far for this. There are some comments in the code
> attached below that provide more detail about the problem(s).
>
> These changes will be quite invasive as the current design of the
> symbol table code is a bit of a mess and there are many places where
> the tables are accessed directly so I expect they will destabilize
> things for a while. However, in the long term I think the new
> approach will be much better, so I think it is worth the big change
> rather than trying to tack on more kluges to an already weak
> foundation.
>
> A draft of the code I'm working on is attached. It's not complete
> and I haven't tried to actually run it yet, but it does compile (it
> requires some changes to Octave that are already on the
> object-branch). The classes here are named xymbol_record and
> xymbol_table just to avoid conflicting with the current symbol_record
> and symbol_table class names.
>
> Comments welcome.
>
> Thanks,
>
> jwe
>
>
>
>
> ------------------------------------------------------------------------
>
> #include <list>
> #include <map>
> #include <string>
>
> #include "oct-obj.h"
> #include "ov.h"
> #include "symtab.h"
> #include "pt-arg-list.h"
>
> class
> octave_variable
> {
> public:
> enum CLASS
> {
> local = 0, // generic variable
> automatic = 1, // varargin, argn, __nargin__, __nargout__
> formal = 2, // formal parameter
> global = 4, // global (redirects lookup to global scope)
> persistent = 8, // not cleared at function exit
> };
>
> class octave_variable_rep
> {
> public:
>
> octave_variable_rep (const octave_value& v, unsigned int sc)
> : value (v), storage_class (sc), count (1) { }
>
> octave_variable_rep (const octave_variable_rep& ov)
> : value (ov.value), storage_class (ov.storage_class), count (1) { }
>
> octave_value& varref (void) { return value; }
>
> octave_value value;
>
> unsigned int storage_class : 4;
>
> int count;
>
> private:
>
> // No assignment!
>
> octave_variable_rep& operator = (const octave_variable_rep&);
> };
>
> octave_variable (const octave_value& v = octave_value (),
> unsigned int sc = local)
> : rep (new octave_variable_rep (v, sc)) { }
>
> octave_variable (const octave_variable& ov)
> : rep (ov.rep)
> {
> rep->count++;
> }
>
> octave_variable& operator = (const octave_variable& ov)
> {
> if (this != &ov)
> {
> rep = ov.rep;
> rep->count++;
> }
>
> return *this;
> }
>
> octave_value& varref (void) { return rep->value; }
>
> octave_variable_rep *rep;
> };
>
> class
> xymbol_record
> {
> public:
>
> typedef std::map<unsigned int, octave_variable>::iterator int_var_iterator;
> typedef std::map<unsigned int, octave_value>::iterator int_val_iterator;
> typedef std::map<std::string, octave_value>::iterator str_val_iterator;
>
> xymbol_record (void)
> : variables (), subfunctions (), private_functions (),
> class_constructors (), class_methods (),
> function_on_path (), built_in_function () { }
>
> octave_value
> find (const std::string& name, tree_argument_list *args,
> const string_vector& arg_names,
> octave_value_list& evaluated_args,
> bool& args_evaluated, unsigned int scope);
>
> octave_value& varref (unsigned int scope);
>
> private:
>
> // Scope id to variable value.
> std::map<unsigned int, octave_variable> variables;
>
> // Scope id to function object.
> std::map<unsigned int, octave_value> subfunctions;
>
> // Directory name to function object.
> std::map<std::string, octave_value> private_functions;
>
> // Class name to function object.
> std::map<std::string, octave_value> class_constructors;
>
> // Dispatch type to function object.
> std::map<std::string, octave_value> class_methods;
>
> octave_value function_on_path;
>
> octave_value built_in_function;
> };
>
> class
> xymbol_table
> {
> public:
>
> typedef std::map<std::string, xymbol_record*>::const_iterator
> const_table_iterator;
> typedef std::map<std::string, xymbol_record*>::iterator table_iterator;
>
> // Insert a new name in the table.
> static void insert (const std::string& name)
> {
> if (instance_ok ())
> instance->do_insert (name);
> }
>
> // Find a value corresponding to the given name in the table.
> static octave_value
> find (const std::string& name, tree_argument_list *args,
> const string_vector& arg_names,
> octave_value_list& evaluated_args, bool& args_evaluated,
> unsigned int scope = current_scope)
> {
> return instance_ok ()
> ? instance->do_find (name, args, arg_names, evaluated_args,
> args_evaluated, scope)
> : octave_value ();
> }
>
> // Return a reference to the value
> static octave_value&
> varref (const std::string& name, unsigned int scope = current_scope)
> {
> static octave_value foobar;
>
> return instance_ok () ? instance->do_varref (name, scope) : foobar;
> }
>
> static void chain (unsigned int scope, octave_value *val)
> {
> if (instance_ok ())
> instance->do_chain (scope, val);
> }
>
> static const unsigned int global_scope;
>
> static unsigned int current_scope;
>
> private:
>
> static xymbol_table *instance;
>
> xymbol_table (void) : table () { }
>
> ~xymbol_table (void) { }
>
> static bool instance_ok (void)
> {
> bool retval = true;
>
> if (! instance)
> instance = new xymbol_table ();
>
> if (! instance)
> {
> error ("unable to create xymbol_table object!");
> retval = false;
> }
>
> return retval;
> }
>
> void do_insert (const std::string& name);
>
> octave_value
> do_find (const std::string& name, tree_argument_list *args,
> const string_vector& arg_names,
> octave_value_list& evaluated_args, bool& args_evaluated,
> unsigned int);
>
>
> octave_value& do_varref (const std::string& name, unsigned int);
>
> void do_chain (unsigned int scope, octave_value *val)
> {
> variables_by_scope[scope].push_back (val);
> }
>
> // Map from symbol name to object containing all info about the
> // known objects with this name.
> std::map<std::string, xymbol_record*> table;
>
> // Map from scope id to list of pointers to variable values in the
> // scope.
> std::map<unsigned int, std::list<octave_value *> > variables_by_scope;
>
> // Map from scope id to current call depth. Used to decide whether
> // to push new value on stack for recursive function calls.
> // FIXME -- how will this work?
> std::map<unsigned int, unsigned int> call_depth;
> };
>
>
> ------------------------------------------------------------------------
>
> #ifdef HAVE_CONFIG_H
> #include <config.h>
> #endif
>
> #include "oct-time.h"
> #include "file-ops.h"
>
> #include "load-path.h"
> #include "new-symtab.h"
> #include "ov-fcn.h"
> #include "toplev.h"
>
> xymbol_table *xymbol_table::instance = 0;
>
> const unsigned int xymbol_table::global_scope = 0;
>
> unsigned int xymbol_table::current_scope = 1;
>
> static octave_function *
> load_fcn_from_file (const std::string& file_name)
> {
> // WRITEME
> return 0;
> }
>
> // Find the definition of NAME according to the following precedence
> // list:
> //
> // variable
> // subfunction
> // private function
> // class constructor
> // class method
> // function on the path
> // built-in function
>
> // Notes:
> //
> // Code marked with "FIXME -- out of date check here" need to be
> // modified to check the load path to see if file that defined this
> // is still visible. If the file is no longer visible, then erase
> // the definition and move on. If the file is visible, then we also
> // need to check to see whether the file has changed since the the
> // function was loaded/parsed. However, this check should only
> // happen once per prompt (for files found from relative path
> // elements, we also check if the working directory has changed
> // since the last time the function was loaded/parsed).
> //
> // Code marked with "FIXME -- load_fcn_from_file changes" will need
> // a new definition of load_fcn_from_file that only loads the
> // function, doesn't execute script files, and doesn't enter the
> // function in the symbol table (we do that here).
>
> // FIXME -- we need to evaluate the argument list to determine the
> // dispatch type. The method used here works (pass in the args, pass
> // out the evaluated args and a flag saying whether the evaluation was
> // needed), but it seems a bit inelegant. We do need to save the
> // evaluated args in some way to avoid evaluating them multiple times.
> // Maybe evaluated args could be attached to the tree_argument_list
> // object? Then the argument list could be evaluated outside of this
> // function and we could elimnate the arg_names, evaluated_args, and
> // args_evaluated arguments. We would still want to avoid computing
> // the dispatch type unless it is needed, so the args should be passed
> // rather than the dispatch type. But the arguments will need to be
> // evaluated no matter what, so evaluating them beforehand should be
> // OK. If the evaluated arguments are attached to args, then we would
> // need to determine the appropriate place(s) to clear them (for
> // example, before returning from tree_index_expression::rvalue).
>
> octave_value
> xymbol_record::find (const std::string& name, tree_argument_list *args,
> const string_vector& arg_names,
> octave_value_list& evaluated_args,
> bool& args_evaluated, unsigned int scope)
> {
> int_var_iterator p = variables.find (scope);
>
> // Variable.
>
> if (p != variables.end ())
> {
> octave_variable var = p->second;
>
> if (var.global)
> {
> p = variables.find (xymbol_table::global_scope);
>
> // FIXME -- can we guarantee that this should not fail?
> assert (p != variables.end ());
>
> var = p->second;
> }
>
> octave_value& val = var.varref ();
>
> if (val.is_defined ())
> return val;
> }
>
> // Subfunction. I think it only makes sense to check for
> // subfunctions if we are currently executing a function defined
> // from a .m file.
>
> octave_function *curr_fcn = octave_call_stack::current ();
>
> if (curr_fcn && curr_fcn->is_user_function ())
> {
> int_val_iterator q = subfunctions.find (scope);
>
> if (q != subfunctions.end ())
> {
> // FIXME -- out-of-date check here.
>
> return q->second;
> }
> }
>
> // Private function.
>
> if (curr_fcn)
> {
> std::string file_name = curr_fcn->fcn_file_name ();
>
> if (! file_name.empty ())
> {
> size_t pos = file_name.find_last_of (file_ops::dir_sep_chars);
>
> if (pos != NPOS)
> {
> std::string dir_name = file_name.substr (0, pos);
>
> str_val_iterator q = private_functions.find (dir_name);
>
> if (q == private_functions.end ())
> {
> file_name = load_path::find_private_fcn (dir_name, name);
>
> if (! file_name.empty ())
> {
> // FIXME -- load_fcn_from_file changes.
>
> octave_function *fcn = load_fcn_from_file (file_name);
>
> if (fcn)
> {
> octave_value val (fcn);
>
> private_functions[dir_name] = val;
>
> return val;
> }
> }
> }
> else
> {
> // FIXME -- out-of-date check here.
>
> return q->second;
> }
> }
> }
> }
>
> // Class constructors. The class name and function name are the same.
>
> str_val_iterator q = class_constructors.find (name);
>
> if (q == class_constructors.end ())
> {
> std::string file_name = load_path::find_method (name, name);
>
> if (! file_name.empty ())
> {
> // FIXME -- load_fcn_from_file changes.
>
> octave_function *fcn = load_fcn_from_file (file_name);
>
> if (fcn)
> {
> octave_value val (fcn);
>
> class_constructors[name] = val;
>
> return val;
> }
> }
> }
> else
> {
> // FIXME -- out-of-date check here.
>
> return q->second;
> }
>
> // Class methods.
>
> if (args && args->length () > 0)
> {
> evaluated_args = args->convert_to_const_vector ();
>
> if (! error_state)
> {
> args_evaluated = true;
>
> int n = evaluated_args.length ();
>
> if (n > 0)
> evaluated_args.stash_name_tags (arg_names);
>
> string_vector arg_classes (n);
>
> for (int i = 0; i < n; i++)
> arg_classes[i] = evaluated_args(i).class_name ();
>
> // FIXME -- need to choose dispatch type, not just use type
> // of first argument.
>
> std::string dispatch_type = arg_classes[0];
>
> q = class_methods.find (dispatch_type);
>
> if (q == class_methods.end ())
> {
> std::string file_name = load_path::find_method (dispatch_type,
> name);
>
> if (! file_name.empty ())
> {
> // FIXME -- load_fcn_from_file changes.
>
> octave_function *fcn = load_fcn_from_file (file_name);
>
> if (fcn)
> {
> octave_value val (fcn);
>
> class_methods[dispatch_type] = val;
>
> return val;
> }
> }
> }
> else
> {
> // FIXME -- out-of-date check here.
>
> return q->second;
> }
> }
> else
> return octave_value ();
> }
>
> // Function on the path.
>
> if (function_on_path.is_defined ())
> {
> // FIXME -- out-of-date check here.
>
> return function_on_path;
> }
>
> // Built-in function.
>
> if (built_in_function.is_defined ())
> {
> // FIXME -- out-of-date check here.
>
> return built_in_function;
> }
>
> return octave_value ();
> }
>
> octave_value&
> xymbol_record::varref (unsigned int scope)
> {
> int_var_iterator p = variables.find (scope);
>
> if (p == variables.end ())
> {
> octave_variable& v = variables[scope];
>
> octave_value& val = v.varref ();
>
> xymbol_table::chain (scope, &val);
>
> return val;
> }
> else
> {
> octave_variable& v = p->second;
>
> return v.global
> ? variables[xymbol_table::global_scope].varref ()
> : v.varref ();
> }
> }
>
> void
> xymbol_table::do_insert (const std::string& name)
> {
> table[name] = new xymbol_record ();
> }
>
> octave_value
> xymbol_table::do_find (const std::string& name, tree_argument_list *args,
> const string_vector& arg_names,
> octave_value_list& evaluated_args,
> bool& args_evaluated, unsigned int scope)
> {
> octave_value retval;
>
> evaluated_args = octave_value_list ();
> args_evaluated = false;
>
> table_iterator p = table.find (name);
>
> if (p != table.end ())
> {
> xymbol_record *sr = p->second;
>
> retval = sr->find (name, args, arg_names, evaluated_args,
> args_evaluated, scope);
> }
>
> return retval;
> }
>
> octave_value&
> xymbol_table::do_varref (const std::string& name, unsigned int scope)
> {
> table_iterator p = table.find (name);
>
> if (p == table.end ())
> {
> do_insert (name);
>
> p = table.find (name);
> }
>
> xymbol_record *sr = p->second;
>
> return sr->varref (scope);
> }