LLM Experiments, Part 1: Corrections

emacs-devel

From:	Andrew Hyatt
Subject:	LLM Experiments, Part 1: Corrections
Date:	Mon, 22 Jan 2024 00:15:18 -0400
User-agent:	Gnus/5.13 (Gnus v5.13)


Hi everyone,

This email is a demo and a summary of some questions which could use your feedback in the context of using LLMs in Emacs, and specifically the development of the llm GNU ELPA package. If that interests you, read on.

I'm starting to experiment with what LLMs and Emacs, together, are capable of. I've written the llm package to act as a base layer, allowing communication with various LLMs: servers, local LLMs, free, and nonfree. ellama, also a GNU ELPA package, is already showing some interesting functionality: asking about a region, translating a region, adding code, getting a code review, etc.

My goal is to take the basic approach that ellama takes (providing useful functionality beyond chat that only the LLM can give) and expand it to a new set of more complicated interactions. Each new interaction is a new demo, and as I write them, I'll continue to develop a library that can support these more complicated experiences. The demos should be interesting, and more importantly, developing them brings up interesting questions that this mailing list may have some opinions on.

To start, I have a demo showing the user using an LLM to rewrite existing text.

I've created a function that will ask for a rewrite of the current region. The LLM offers a suggestion, which the user can review with ediff, and ask for a revision. This can continue until the user is satisfied, and then the user can accept the rewrite, which will replace the region.


You can see the version of code in a branch of my llm source here:
https://raw.githubusercontent.com/ahyatt/llm/flows/llm-flows.el

And you can see the code that uses it to write the text corrector function here:

https://gist.githubusercontent.com/ahyatt/63d0302c007223eaf478b84e64bfd2cc/raw/c1b89d001fcbe948cf563d5ee2eeff00976175d4/llm-flows-example.el

There are a few questions I'm trying to figure out in all these demos, so let me state them and give my current guesses. These are things I'd love feedback on.

Question 1: Does the llm-flows.el file really belong in the llm package? It does help people code against LLMs, but it expands the scope of the llm package from being just about connecting to different LLMs to offering a higher-level layer necessary for these more complicated flows. I think this probably does make sense; there's no need to have a separate package just for this one part.

Question 2: What's the best way to write these flows with multiple stages, in which some stages sometimes need to be repeated? It's kind of a state machine when you think about it, and there's a state machine GNU ELPA library already (fsm). I opted to not model it explicitly as a state machine, optimizing instead to just use the most straightforward code possible.
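
To make the "straightforward code" option concrete, here is a minimal sketch of a propose/review loop written as an ordinary `while' loop rather than an explicit state machine. It is not the code in llm-flows.el: the function name `my-flow-run' and the PROPOSE and REVIEW arguments are hypothetical, and the LLM call and the ediff session are passed in as plain functions so that the sketch does not assume any particular API.

```elisp
;; Hypothetical sketch; none of these names come from llm-flows.el.
(defun my-flow-run (propose review text)
  "Repeat PROPOSE and REVIEW on TEXT until the user accepts or quits.
PROPOSE is called with TEXT and the list of revision requests so far
\(oldest first) and returns a suggestion.  REVIEW is called with the
suggestion and returns `accept', `quit', or a string asking for a
revision.  Return the accepted suggestion, or nil if the user quit."
  (let ((requests nil)
        (result nil)
        (done nil))
    (while (not done)
      (let* ((suggestion (funcall propose text (reverse requests)))
             (verdict (funcall review suggestion)))
        (cond ((eq verdict 'accept) (setq result suggestion done t))
              ((eq verdict 'quit) (setq done t))
              ;; A string is a revision request: remember it and loop.
              ((stringp verdict) (push verdict requests))
              (t (error "Unexpected verdict: %S" verdict)))))
    result))

;; Usage with stand-ins for the LLM and the review step:
(my-flow-run (lambda (text _requests) (upcase text))
             (lambda (_suggestion) 'accept)
             "hello")
```

The repeated stage is just another trip around the loop, which is the main argument for not reaching for fsm here; a flow with many more states might tip the balance the other way.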

Question 3: How should we deal with context? The text corrector code doesn't include surrounding context (the text before and after the text to rewrite), but it usually is helpful. How much context should we add? The llm package does know about model token limits, but more tokens add more cost in terms of actual money (per-token billing for services, or just the CPU energy costs for local models). Having it be customizable makes sense to some extent, but users are not expected to have a good sense of how much context to include. My guess is that we should just have a small amount of context that won't be a problem for most models. But there are other questions as well when you think about context generally: How would context work in different modes? What about when context is spread across multiple files? It's a problem that I don't have any good insight into yet.
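
As an illustration of the "small, fixed amount of context" guess, the following sketch collects a bounded number of characters on each side of the region. The function name `my-region-context' and the default of 500 characters are assumptions made for the example, and characters are only a crude stand-in for tokens; nothing here consults the model limits that the llm package knows about.

```elisp
;; Hypothetical sketch; the name and the default size are assumptions.
(defun my-region-context (start end &optional max-chars)
  "Return (BEFORE . AFTER) context around the region START..END.
Each side holds at most MAX-CHARS characters (default 500), clipped
to the accessible portion of the current buffer."
  (let ((n (or max-chars 500)))
    (cons (buffer-substring-no-properties
           (max (point-min) (- start n)) start)
          (buffer-substring-no-properties
           end (min (point-max) (+ end n))))))
```

This only handles the single-buffer case; mode-specific context and context spread across files would need something smarter.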

Question 4: Should the LLM calls be synchronous? In general, it's not great to block all of Emacs on a sync call to the LLM. On the other hand, the LLM calls are generally fast enough (a few seconds; the current timeout is 20s) that the user isn't going to accomplish much while the LLM works, and if they do go off to something else, the workflow ends up waiting for their input and we have to bring them back to interacting with it. Streaming calls work well for just getting a response from the LLM, but when we have a workflow, the response isn't useful until it is processed (in the demo's case, until it is an input into ediff-buffers). I think things have to be synchronous here.

Question 5: Should there be a standard set of user behaviors about editing the prompt? In another demo (one I'll send as a followup), with a universal argument, the user can edit the prompt, minus context and content (in this case the content is the text to correct). Maybe that should always be the case. However, that prompt can be long, perhaps a bit long for the minibuffer. Using a buffer instead seems like it would complicate the flow. Also, if the context and content are embedded in that prompt, they would have to be replaced with some placeholder. I think the prompt should always be editable, and we should have some templating system. Perhaps Emacs already has some templating system, and one that can pass arguments for number of tokens from context would be nice.
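
One existing candidate for the placeholder idea is `format-spec', which ships with Emacs and substitutes single-character %-specs from an alist. The sketch below is only an illustration of that approach: the template text, the choice of %t, %b and %a, and the names `my-correction-template' and `my-fill-prompt' are all hypothetical. It does not address passing a token count for the context.

```elisp
(require 'format-spec)

;; Hypothetical template: the user would edit this string and see only
;; the placeholders, never the actual content or context.
(defvar my-correction-template
  (concat "Fix spelling and grammar in the text below. "
          "Reply with only the corrected text.\n\n"
          "Context before:\n%b\n\nText:\n%t\n\nContext after:\n%a")
  "Prompt template; %t is the text, %b and %a the surrounding context.")

(defun my-fill-prompt (template text &optional before after)
  "Fill TEMPLATE with TEXT and the optional BEFORE and AFTER context."
  (format-spec template
               `((?t . ,text)
                 (?b . ,(or before ""))
                 (?a . ,(or after "")))))
```

Because the substituted values are inserted literally, a "%" inside the text being corrected is not reinterpreted as a spec; a literal percent sign in the template itself has to be written as "%%".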

Question 6: How do we avoid having a ton of very specific functions for all the various ways that LLMs can be used? Besides correcting text, I could have had it expand it, summarize it, translate it, etc. Ellama offers all these things (but without the diff and other workflow-y aspects). I think these are too much for the user to remember. It'd be nice to have one function for when the user wants to do something, and we work out what to do in the workflow. But the user shouldn't be developing the prompt themselves; at least at this point, it's kind of hard to think of everything you need to think of in a good prompt. Prompts need to be developed, updated, etc. What might be good is a system in which the user chooses what they want to do to a region as a secondary input, kind of like another kind of execute-extended-command.
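
A rough sketch of the "secondary input" idea: one command that reads the action with completion from a table of maintained prompts, so the user remembers a single entry point rather than one function per task. The table contents and the names `my-region-actions', `my-action-prompt' and `my-region-do' are hypothetical, and the command only builds the prompt string; handing it to the LLM and the review workflow is left out.

```elisp
;; Hypothetical table mapping action names to maintained prompts.
(defvar my-region-actions
  '(("correct"   . "Fix spelling and grammar in the following text.")
    ("summarize" . "Summarize the following text.")
    ("translate" . "Translate the following text to English."))
  "Alist of (ACTION . PROMPT) pairs offered for completion.")

(defun my-action-prompt (action)
  "Return the prompt registered for ACTION, or signal an error."
  (or (cdr (assoc action my-region-actions))
      (user-error "Unknown action: %s" action)))

(defun my-region-do (start end action)
  "Build the prompt for applying ACTION to the region START..END.
Interactively, ACTION is read with completion over `my-region-actions'."
  (interactive
   (list (region-beginning) (region-end)
         (completing-read "Do what to region? " my-region-actions nil t)))
  (concat (my-action-prompt action) "\n\n"
          (buffer-substring-no-properties start end)))
```

New actions are then added by extending the table instead of writing new commands, which keeps the prompts in one place where they can be developed and updated.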

These are the issues as I see them now. As I continue to develop demos, and as people in the list give feedback, I'll try to work through them.

BTW, I plan on continuing these emails, one for every demo, until the questions seem worked out. If this mailing list is not the appropriate place for this, let me know.
