Re: Standard example datasets

From: Alois Schloegl
Subject: Re: Standard example datasets
Date: Thu, 2 May 2019 08:49:06 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.6.1

On 28.04.19 14:27, Carnë Draug wrote:
> On Sat, 27 Apr 2019 at 21:02, Andrew Janke <address@hidden> wrote:
>> Hi, Octave maintainers,
>> Some other statistical programs ship with standard example datasets and
>> methods to load or explore them. Does Octave have something like this?
>> For example, R ships with a bunch of example datasets in its "datasets"
>> package, and you can view a list of them by doing `data()`. And Matlab
>> ships with a bazillion example datasets that seem to all be just MAT
>> files in its source code root directories, that you can access with
>> load, like `load patients`.
>> Use case: I'm working on table stuff, and would like to add some example
>> tabular datasets in my package. Wondering if there's a standard
>> mechanism I should integrate with.
> Matlab also comes with such datasets.  Ideally we would have the same
> so that examples that use them work in Octave as well.  It would also
> simplify some test cases which require generation of input data (I
> would argue that would actually enable them, because if generating
> such complex datasets is too complicated then there are no tests for
> them).
> Anyway, there is already an item on the tracker [1] that lists the
> ones in Matlab.  The issue is finding who is the copyright holder of
> such data and contact them.
> [1] https://savannah.gnu.org/patch/?9544


Please also consider load_fisheriris (part of the NaN-toolbox), which
downloads the data from [1] and caches the files in the current working
directory. This approach needs a network connection at runtime, but only
when the data is first downloaded. A similar approach might be suitable
for other data sets.
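
As a rough illustration of the download-and-cache idea (this is not the
actual load_fisheriris code), a minimal sketch in Python could look like
the following. The function name fetch_cached, the ".part" temporary
suffix, and the file name "iris.data" appended to the directory in [1]
are assumptions made for the example.

```python
import os
import urllib.request

# Directory URL taken from [1]; the file name "iris.data" is an assumption.
IRIS_URL = "http://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"


def fetch_cached(url, cache_dir=".", download=urllib.request.urlretrieve):
    """Return a local path for url, downloading only if no cached copy exists."""
    local_path = os.path.join(cache_dir, os.path.basename(url))
    if not os.path.exists(local_path):
        # Download to a temporary name first so that an interrupted transfer
        # does not leave a truncated file that looks like a valid cache entry.
        tmp_path = local_path + ".part"
        download(url, tmp_path)
        os.replace(tmp_path, local_path)
    return local_path
```

The first call to fetch_cached(IRIS_URL) needs network access; later calls
find the file in the cache directory and return its path without
downloading anything.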

If licensing of the data is an issue, such a download-and-cache mechanism
might be a viable solution. The function "load" could then provide this
functionality so that
    load fisheriris
would work out of the box.
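
One possible way for a loader to decide what a bare name refers to is
sketched below in Python. This only illustrates the lookup order and says
nothing about how Octave's "load" is implemented: the registry
KNOWN_DATASETS, the function resolve, and the ".mat" fallback are all
hypothetical, as is the file name "iris.data" appended to the directory
in [1].

```python
import os

# Hypothetical registry mapping dataset names to download locations.
KNOWN_DATASETS = {
    "fisheriris": "http://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data",
}


def resolve(name, search_dir="."):
    """Decide where a request for `name` should read its data from.

    Returns ("local", path) if a matching file already exists, or
    ("remote", url) if the name is a known dataset that has to be fetched.
    """
    # An existing local file always takes precedence over the registry.
    for candidate in (name, name + ".mat"):
        path = os.path.join(search_dir, candidate)
        if os.path.isfile(path):
            return ("local", path)
    if name in KNOWN_DATASETS:
        return ("remote", KNOWN_DATASETS[name])
    raise FileNotFoundError(f"unable to find file or dataset {name!r}")
```

Checking local files first keeps existing behaviour unchanged for users
who already have a file of that name; the registry is consulted only as a
fallback.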

BTW, the site [1] contains a number of other data sets that Octave
might want to support.

[1] http://archive.ics.uci.edu/ml/machine-learning-databases/iris/

