[bug #66248] Sending wget spider to the background will avoid issue with
From: anonymous
Subject: [bug #66248] Sending wget spider to the background will avoid issue with Bad file descriptor
Date: Tue, 24 Sep 2024 10:40:32 -0400 (EDT)
URL:
<https://savannah.gnu.org/bugs/?66248>
Summary: Sending wget spider to the background will avoid
issue with Bad file descriptor
Group: GNU Wget
Submitter: None
Submitted: Tue 24 Sep 2024 02:40:29 PM UTC
Category: Program Logic
Severity: 3 - Normal
Priority: 5 - Normal
Status: None
Privacy: Public
Assigned to: None
Originator Name: freddieventura
Originator Email: creativefreddieventura@gmail.com
Open/Closed: Open
Discussion Lock: Any
Release: trunk
Operating System: GNU/Linux
Reproducibility: Every Time
Fixed Release: None
Planned Release: None
Regression: None
Work Required: None
Patch Included: None
_______________________________________________________
Follow-up Comments:
-------------------------------------------------------
Date: Tue 24 Sep 2024 02:40:29 PM UTC By: Anonymous
Hi,
I was doing a simple web crawl, trying to gather a list of all the URLs into
a .txt file, doing a spider check first.
I am following the first response in this thread:
https://stackoverflow.com/questions/52610592/wget-spider-a-website-to-collect-all-links
But I am running:
```
wget --spider --force-html --span-hosts -np --limit-rate=20k -e robots=off \
  --wait=3 --random-wait -r -l2 https://developers.google.com -o wget.log &
```
It works. But I wanted to run it in the foreground (not send it to the
background), so I am running this:
```
wget --spider --force-html --span-hosts -np --limit-rate=20k -e robots=off \
  --wait=3 --random-wait -r -l2 https://developers.google.com -o wget.log
```
This last command doesn't work: `wget` exits after about 3 seconds, leaving
lines like these in the log:
```
developers.google.com: No such file or directory
developers.google.com/index.html.tmp.tmp: Bad file descriptor
Cannot write to ‘developers.google.com/index.html.tmp.tmp’ (Bad file descriptor).
Found no broken links.
```
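To tell quickly whether a given run hit the problem, one could count the
failure lines in its log. This is only a minimal sketch, assuming a POSIX
shell with `grep` and `mktemp`; `count_bad_fd` is a hypothetical helper name,
and the sample log is just the excerpt quoted above, not a fresh `wget` run:

```shell
# Hypothetical helper: print how many lines of a wget log mention
# "Bad file descriptor".
count_bad_fd() {
  # grep -c prints the number of matching lines; "|| true" keeps a
  # zero-match result (grep exit status 1) from aborting under "set -e".
  grep -c 'Bad file descriptor' "$1" || true
}

# Sample log copied from the excerpt in this report (assumption: a real
# foreground run would produce wget.log with similar content).
log=$(mktemp)
cat > "$log" <<'EOF'
developers.google.com: No such file or directory
developers.google.com/index.html.tmp.tmp: Bad file descriptor
Cannot write to ‘developers.google.com/index.html.tmp.tmp’ (Bad file descriptor).
Found no broken links.
EOF

count_bad_fd "$log"
```

Running `count_bad_fd wget.log` after each of the two invocations would show
whether only the foreground run logs the error.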
_______________________________________________________
Reply to this item at:
<https://savannah.gnu.org/bugs/?66248>
_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/