bug-wget
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Bug-wget] [GSoC] Proposal: “Design and implementation of a framewo


From: Akash Rawal
Subject: Re: [Bug-wget] [GSoC] Proposal: “Design and implementation of a framework for plugins.”
Date: Mon, 3 Apr 2017 01:10:08 +0530
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0



On 04/02/2017 10:41 PM, Darshit Shah wrote:
* Akash Rawal <address@hidden> [170331 23:34]:
Hello,

First of all, thanks for the initial review of my proposal!

I have made some changes in response to the comments as well as last
mail on the mailing list by Darshit Shah. Kindly consider reviewing it.
For reference here is the link.
https://docs.google.com/document/d/1SBM-jNG5dJL85C7iK9aid2eDlByrxOsYUNDrk4049vQ/edit?usp=sharing


The proposal is still not final, in the sense that I am analysing the
code for further ideas of what can be customized using plugins.

I was asked to elaborate on rewriting of command line option processing
code. So here are my thoughts on it.

The current code in src/options.c has following drawbacks:
 -  Built around order-insensitive and idempotent model (Supplying
    an option modifies a variable). This might be adequete for wget,
    but might not be the case for plugins. Repeated options or
    order of options can matter for plugins.
Breaking the order-insensitive and idempotent model of command line
processing is an absolute no-go. It will definitely break many scripts
out there. We do wish to make it easy to transition from Wget to Wget2
and this is a huge change in how options are processed.  Also, I see no
clear advantages to why this should be required.
Backward compatibility will only be broken if options are changed,
which I didn't plan to do. wget2 will support all options it
supports now. Existing options shall remain order-insensitive
and idempotent, only new options and plugin-side options may
take advantage of greater flexibility.

Many programs have order-sensitive (e.g. ffmpeg, options
apply to input or output file based on placement) and non-idempotent
(e.g. -v (verbose) option in many program, -vv for higher verbosity)
 -  The need for keeping the list of options sorted. I don't think
    we can expect plugin writers to do the same.
No. Plugins should not be allowed to tinker with certain core
functionalities and that includes how options are processed.
 -  And most importantly, options cannot be added at runtime.
Which is a very good thing. Keeping everything at compile time means we
can have most of that code on the stack without the need for costly
dynamic memory allocations on each invokation. Options processing code
should definitely not be dynamic.
Okay if that is your requirement I'll remove it from my proposal.

So far I think rewriting the option processing code would be easier
than trying to patch existing code to support custom options.
Some parts of the original code could be reused but structure
would undergo a drastic change anyway.

Easier? Have you every tried writing one? The number of use cases and
corner cases you'd have to handle is massive. That alone could be an
entire GSoC project. You can't expect to write it as a byproduct of your
project.
I have written simpler ones. But even in wget's code option processor is not so large (~400 LOC, starting from src/option.c:853)
Cases for long options:
    --option
    --option=value
    --option value
    --no-option
Cases for short options:
    -x
    -x value
    -xvalue
    -xyzw
    -xyzvalue
    -xyz value


The idea I have is to expose an interface for plugins to describe
command line options they want to receive. I have attached a
prototype header. From this interface, wget will have a list of
command line options to support, along with plugin function to
be called for handling plugin specific options. In addition to
using this to parse command line, wget can Use the descriptions
and category information provided to generate --help output.

First let me ask you, how exactly do you envision a user using plugins
with Wget2? How are plugins selected? How does the user even know if a
plugin exists?
How about:
    $ wget_plugins="foo" wget2 --foo-option http://example.com/file.html
Or:
    $ wget2 --plugin=foo --foo-option http://example.com/file.html
Hypothetical example: music site parser that filters by rating
    $ wget2 --plugin=music_rating_filter --music_min_rating=4 \
          --recursive --level=20 \
          http://example_music_site.com/music/1
Multiple plugins:
    $ wget_plugins="foo:bar" wget2 --plugin=foo --plugin=bar \
          --foo-option --bar-option=value http://example.com/file.html
Or:
    $ wget2 --plugin=foo --plugin=bar --foo-option --bar-option=value \
          http://example.com/file.html
    $ #The above is also an example of non-idempotent option --plugin
Loading plugins which are not installed:
    $ wget2 --local-plugin=/path/to/libfoo.so --foo-option \
          http://example.com/file.html

And regarding user knowing which plugins are installed:
    $ wget2 --list-plugins
    foo
    bar
    music_rating_filter
Plugins could be installed in a directory like
 ${PREFIX}/lib/wget/plugins/. Many software do like this.

Regarding the need to support plugin side addition of command line
options at the first place, here are my thoughts.
Plugins might need additional data from the user. I can think
of 3 ways in which it can be done:
 1. Environment variables.
 2. Files with well-known names.
 3. Command line.
Environment variables have a disadvantage that they are automatically
passed to the subprocesses, creating a chance for interference.
Also, name clashes cannot be discovered.

Using files with well-known names has a chance of interference
with parallel running instances of wget.

None of these disadvantages are shared with using command line
options.

However, all three are commonly used to provide options to a program and
should all always be supported. Environment variables allow for all
instances in one subshell to share a set of commands without having to
pass them multiple times and without interfering with instances in other
subshells
Configuration files are useful so the system administrator can set a
bunch of common options for all users, maybe to enforce a policy.
Command line options give the user the power to modify options on a per
invokation level.
Per invocation level is actually my primary concern.


Opinions are welcome.

Regards,
Akash Rawal.




So the summary is, due to the requirement of running the code on
stack as much as possible, I'm removing the customizable option
processing from my proposal.

Regards,
Akash Rawal.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]