Re: How to add pseudo vector types

From: Yuan Fu
Subject: Re: How to add pseudo vector types
Date: Thu, 22 Jul 2021 09:47:45 -0400


> Do you want to discuss this?  I'd prefer to have it the other way
> around: use BUF_ZV_BYTE by default.  The callers could widen the
> buffer if they needed to access outside of the narrowing.

Yes, I meant to discuss this. The problem with respecting narrowing is that a user can narrow and widen arbitrarily, and every time they do, Emacs would need to translate the change into insertions and deletions of buffer text for tree-sitter. Plus, if tree-sitter respects narrowing, a user could narrow the buffer and find that the font-locking changes and is no longer correct. Maybe that’s not what the user wants. Also, if someone narrows and widens often, say narrowing to a function for better focus, tree-sitter would need to constantly re-parse most of the buffer. These are not significant disadvantages, but what do we get from respecting narrowing that justifies the code complexity and these small annoyances?
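To make the translation cost concrete, here is a minimal sketch of what narrowing would look like if it were expressed as edits. The struct `byte_edit` and the function `narrow_as_edits` are hypothetical names for illustration only; `byte_edit` mirrors just the three byte fields of tree-sitter's `TSInputEdit` and leaves out the row/column points, and the sketch assumes the buffer was fully visible before narrowing.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the byte fields of tree-sitter's
   TSInputEdit.  The row/column points are omitted.  */
struct byte_edit
{
  uint32_t start_byte;
  uint32_t old_end_byte;
  uint32_t new_end_byte;
};

/* Describe narrowing a fully visible buffer of TOTAL bytes to the
   zero-based byte range [BEG, END) as two deletions.  The tail is
   deleted first, so the offsets of the head deletion are still
   valid when it is applied.  */
static void
narrow_as_edits (uint32_t total, uint32_t beg, uint32_t end,
                 struct byte_edit out[2])
{
  assert (beg <= end && end <= total);
  /* Delete [END, TOTAL).  */
  out[0] = (struct byte_edit) { end, total, end };
  /* Delete [0, BEG).  */
  out[1] = (struct byte_edit) { 0, beg, 0 };
}
```

Widening again would need the reverse, two insertions covering the same ranges, and the parser would then have to re-parse the re-inserted text.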

>   *bytes_read = (uint32_t) len;
>
> Is using uint32_t the restriction of tree-sitter?  Doesn't it support
> reading more than 2 gigabytes?

I’m not sure why it asks for uint32 specifically, but that’s what its API asks for. I don’t think you are supposed to use tree-sitter on files that are gigabytes in size, because the author mentioned that tree-sitter uses over 10x as much memory as the size of the source file [1]. On files larger than a couple of megabytes, I think we had better turn off tree-sitter. Normally those files are not regular source files anyway, and we don’t need a parse tree for a log.
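If the plain cast is a concern, one option would be to saturate the length instead of truncating it. The following is only a sketch: `clamp_chunk_len` is a hypothetical helper, not something in the patch or in tree-sitter, and it assumes the chunk length is held in a `ptrdiff_t`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert a chunk length to the uint32_t that
   the read callback has to report, saturating at UINT32_MAX instead
   of silently wrapping around.  */
static uint32_t
clamp_chunk_len (ptrdiff_t len)
{
  assert (len >= 0);
  if ((uintmax_t) len > UINT32_MAX)
    return UINT32_MAX;
  return (uint32_t) len;
}
```

With this, the assignment would read `*bytes_read = clamp_chunk_len (len);`. Reporting fewer bytes than are available only makes the chunk shorter; it does not corrupt the length.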

That leads to another point. I suspect the memory limit will come before the speed limit, i.e., as the file size increases, memory consumption will become unacceptable before speed does. So it is possible that we will want to disable tree-sitter outright for larger files, in which case we don’t need to do much to improve its responsiveness on large files. And we might want to delete the parse tree if a buffer has been idle for a while. Of course, that’s just my speculation; we’ll see once we can measure the performance.
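A size gate of this kind could be as simple as the sketch below. The function name `parser_fits_budget` and the idea of a per-buffer memory budget are assumptions made for illustration; the factor would be set to about 10 based on the figure quoted in [1], which is a rough lower bound rather than a measurement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical check: return true if a source of SOURCE_BYTES is
   expected to fit in BUDGET_BYTES of parser memory, assuming the
   parse tree needs about FACTOR times the source size.  Dividing
   the budget instead of multiplying the size avoids overflow.  */
static bool
parser_fits_budget (size_t source_bytes, size_t budget_bytes,
                    unsigned int factor)
{
  assert (factor > 0);
  return source_bytes <= budget_bytes / factor;
}
```

For example, with a 64 MiB budget and a factor of 10, a 1 MB file passes the check and a 10 MiB file does not.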

> +DEFUN ("tree-sitter-node-type",
> +       Ftree_sitter_node_type, Stree_sitter_node_type, 1, 1, 0,
> +       doc: /* Return the NODE's type as a symbol.  */)
> +  (Lisp_Object node)
> +{
> +  CHECK_TS_NODE (node);
> +  TSNode ts_node = XTS_NODE (node)->node;
> +  const char *type = ts_node_type(ts_node);
> +  return intern_c_string (type);
> +}

> Why do we need to intern the string each time?  Can't we store the
> interned symbol there, instead of a C string, in the first place?

I’m not sure what you mean by “store the interned symbol there”. Where would I store the interned symbol? (BTW, if you see something wrong, that’s probably because I don’t know the right way to do it, and grepping only got me so far.)

[1]: https://github.com/tree-sitter/tree-sitter/issues/222#issuecomment-435987441

