RE: [GNUnet-developers] Namespaces / GNML

From: Jan Marco Alkema
Subject: RE: [GNUnet-developers] Namespaces / GNML
Date: Sun, 19 Jan 2003 16:43:30 -0800

Hello Tom,

The current GNUnet CVS release uses less memory --) An hour ago I had a
segmentation fault. Shame on me, I hadn't run it under gdb.

>There hasn't been much discussion. Either the proposal is
>too good, or nobody is really interested in it.  :)

In my point of view a good search facility is very useful. If you identify
all content properly, fewer bytes will be transferred and the idle bytes
in/out of gnunetd can be significantly reduced. Why should I get search
requests if people already know what I have in store?

> whether something you're going to download is likely to be something
> you want.

If you have a good "map" of what you can find, you can get what you want!

>, rather than content that you do not want, or even files of random

I don't think this is a problem. There will always be a lot of files with
garbage. You only need to know which files are useful and which are garbage,
and you will be very happy ---)

>Not only is it a waste of time to get such bad files, but downloading them
>helps to spread their blocks across the network, reducing performance.

In the ideal situation you could check on the Microsoft site what the
footprint (MD5 + file size) of the Windows 2000 Professional CDs should be.
If someone puts them online, you can check the footprints before transferring
the files to your local computer.

>Obviously, it isn't really possible to verify a file without
>downloading it first.

In the current implementation of GNUnet every file has a file size, a hash
(a kind of MD5) and a CRC. You should be able to verify a file if you know
what its file size, hash and CRC should be.
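
A minimal sketch of the footprint check in Python (a sketch of the idea only;
the function names are mine, and GNUnet's real hash and CRC routines are of
course not this Python code):

```python
import hashlib
import os

def footprint(path, chunk_size=65536):
    """Compute the (file size, MD5) footprint of a file on disk."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        # Read in chunks so large files don't need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    return os.path.getsize(path), md5.hexdigest()

def verify(path, expected_size, expected_md5):
    """Check a downloaded file against its published footprint."""
    size, digest = footprint(path)
    return size == expected_size and digest == expected_md5
```

With a published footprint in hand, a downloader can reject a bad file after
the transfer without trusting the filename at all.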

>but I felt that if you either knew who put it up, or knew of somebody who
>endorsed the file as real, that was prolly 95% as good, as you'd get a feel
>of who's reliable.

I believe in a mix of decentralized and central concepts. People and
companies get a GUID (Globally Unique Identifier). Some kind of trusted party
should issue the GUIDs to them. If a person wants to be anonymous the system
should still work, only those "anonymous people" don't get much priority to
use my system. If a person with a real GUID (for example GUID-X) harms my
database, then I will automatically remove the records GUID-X inserted into
my database. Possibly a blacklist of GUIDs may be built up.
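
The automatic removal could be as simple as one DELETE statement. A minimal
sketch with sqlite3, assuming (hypothetically) that every catalog record also
stores the GUID of the party that inserted it; table layout and data here are
made up for illustration:

```python
import sqlite3

# Hypothetical extension of the catalog: each record carries the GUID
# of the party that inserted it.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE catalog (FileName TEXT, FileSize INTEGER,"
           " MD5 TEXT, GUID TEXT)")
db.executemany("INSERT INTO catalog VALUES (?, ?, ?, ?)", [
    ("mysql-4.0.9.tar.gz", 56522592, "aa" * 16, "GUID-A"),
    ("garbage.tar.gz",     12345,    "bb" * 16, "GUID-X"),
])

blacklist = ["GUID-X"]  # GUIDs known to have harmed the database

# Automatically remove every record a blacklisted GUID inserted.
for guid in blacklist:
    db.execute("DELETE FROM catalog WHERE GUID = ?", (guid,))

remaining = [row[0] for row in db.execute("SELECT FileName FROM catalog")]
```

After the purge only records from non-blacklisted GUIDs survive.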

>An application-layer protocol could be set up, involving a new file
>format, call it GNML (GnuNetMarkupLanguage), which would be a (heavily)
>stripped down HTML-like format, where hyperlinks (and any embedded
>images if used) would not be URIs but exclusively GNUnet key/hash data,
>and the entire file would have a digital signature.

There are other projects for searching URLs of files. Maybe we can use their knowledge.

>Programs such as gnunet-gtk (which is still buggy, btw), would then have
>a browser-like applet, tailored to rendering GNML (actually, GNML would
>be designed with the browser in mind, to avoid having to code anything
>gnarly) and able to check the signatures, remind the user of who the
>author is (it should be possible to map public keys to long contrived
>ask them whether to proceed with downloading+showing any inline
>images, and start downloading any links clicked on.

I think that the best solution is Java for the GUI and C for the GNUnet core,
with an ODBC database layer for the signatures, reminding the user who the
author is, etc.

In my opinion the minimal attributes of a file are ID, Path, Filename,
Filesize and MD5. The minimal footprint is MD5 + file size. Put all
(independent) files in the database and you know which files you have and
which you don't.
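
As a sketch of that idea (attribute names taken from the list above, the
footprint values made up): index the records by their (FileSize, MD5)
footprint, and membership in that index tells you what you have, regardless
of filename.

```python
# Minimal catalog records, following the attribute list above.
# The footprint (FileSize, MD5) identifies a file; ID, Path and
# FileName are merely descriptive. Values are made up for illustration.
records = [
    {"ID": 1, "Path": "/pub", "FileName": "a.tar.gz",
     "FileSize": 1024, "MD5": "aa" * 16},
    {"ID": 2, "Path": "/pub", "FileName": "b.tar.gz",
     "FileSize": 2048, "MD5": "bb" * 16},
]

# Index by footprint: two records with the same footprint are the
# same file, whatever they happen to be named.
have = {(r["FileSize"], r["MD5"]) for r in records}

def is_known(size, md5):
    """Do we already have this file, regardless of its name?"""
    return (size, md5) in have
```

A renamed copy of a.tar.gz still hits the same footprint and is recognized
as already present.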

Maybe the "reference URL" is also an attribute, for example the reference URL
for mysql. With a URL you look at the total distribution. Distributions have
several releases. A release can have several patches. There is, and there
should be, some relation between the independent files.

This file can be stored locally on my server (URL= or
the file can be duplicated on several download servers. It does not matter if
I rename this local file to another name, for instance mydatabase.tar.gz,
because you should check the footprint (MD5 + file size).

If someone downloads this file (mydatabase.tar.gz) from my server,
he can get the original filename by searching for the reference filename in
the catalog database.
How many bytes must be transferred to upgrade mysql from release 7 to
release 9?

catalog mysql-4.0.7-gamma
Number of Files =       3121 Total file size =        56385961!

Number of not inserted files=204! Number of bytes in the not inserted files
= 18591204!

There are 204 files (18 MB) in this 56 MB distribution which are stored more
than once within the distribution.

catalog mysql-4.0.9-gamma
Number of Files =       3124 Total file size =        56522592!

Number of not inserted files=205! Number of bytes in the not inserted files
= 18601325! (=> 205 files, 18 MB, are stored more than once here as well.)

I then inserted all files of mysql-4.0.7-gamma into the catalog database and
checked which files of mysql-4.0.9-gamma are different; the "not inserted"
files are those already present in the 4.0.7 catalog:

Number of not inserted files=3012! Number of bytes in the not inserted files
= 40875413!

To upgrade from mysql release 7 to release 9, (56522592 - 40875413 =)
15647179 bytes (16 MB uncompressed, 27% of the total) must be transferred.
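
The same computation can be sketched over footprint sets: only the bytes of
files whose (size, MD5) footprint is not already in the old catalog need to
travel. The toy footprints below are made up; the final assertion simply
re-checks the mysql arithmetic above.

```python
def transfer_bytes(old_footprints, new_footprints):
    """Bytes to transfer for an upgrade: total size of new-release
    files whose (size, MD5) footprint the old catalog lacks."""
    old = set(old_footprints)
    return sum(size for size, md5 in new_footprints
               if (size, md5) not in old)

# Toy example: one file unchanged between releases, one file new.
old_release = [(100, "aa" * 16), (200, "bb" * 16)]
new_release = [(100, "aa" * 16), (300, "cc" * 16)]

# Only the new 300-byte file must be transferred.
bytes_needed = transfer_bytes(old_release, new_release)

# Re-check of the mysql-4.0.7 -> 4.0.9 arithmetic above.
assert 56522592 - 40875413 == 15647179
```

The unchanged files drop out of the transfer entirely, which is exactly why
the mysql upgrade needs only 27% of the release's bytes.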

In my opinion the mysql example generalizes to all other distributions
(php, apache, audio CDs, etc.).

I have included the catalog program so you can verify my
information/observations.

Greetings Jan Marco

Appendix A: Catalog.sql


CREATE TABLE catalog (
        FileSN    BIGINT(10),
        Path      VARCHAR(255),
        FileName  VARCHAR(255),
        FileSize  BIGINT(15),
        MD5       CHAR(33)
);

CREATE INDEX idx_catalog ON catalog (FileSize, MD5);

Create the SQL database:

Shell> mysqladmin create gnunet

Create the table or flush the database:

Shell> mysql gnunet<catalog.sql

Attachment: catalog-1.01.tar.gz
Description: Binary data
