From: Paul Beckett (ITCS)
Subject: Re: [Bug-wget] wget mirror site failing due to file / directory name clashes
Date: Tue, 16 Oct 2012 09:50:57 +0000
Tim,
You raise a good point about CMS functionality exceeding that of flat pages,
and I understand that some CMS features wouldn't function. However, for our
public-facing site, the majority of pages are essentially static and would be
reproduced perfectly. Another portion is dynamically generated from a GET
request where all the information is in the URL, so these could probably also
be reproduced. Only a relatively small amount of our content requires more
dynamic interaction with the server and couldn't be flattened.
We do have fairly resilient load-balanced systems, but they are not
infallible. They are also difficult (and expensive in licensing) to replicate
reliably outside of our data centre in order to cater for a total loss of
connectivity to it.
What I would like to do is use Apache's proxy-balancer for some of the current
load balancing, but fail over to flat pages in the event that all of the
load-balanced nodes fail (reasoning that this is far better than nothing), and
to mirror the flattened pages off-site in case our data centre loses network
connectivity.
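
A minimal sketch of that failover arrangement, assuming Apache 2.4's
mod_proxy / mod_proxy_balancer modules; the hostnames are hypothetical. The
flattened wget copy is served by a separate lightweight vhost registered as a
hot-standby member (status=+H), which the balancer only routes to once every
regular member has been marked as failed:

```apache
<Proxy "balancer://liferay">
    # Regular application-server nodes (hypothetical hostnames)
    BalancerMember "http://app1.internal:8080"
    BalancerMember "http://app2.internal:8080"
    # Hot standby serving the wget-flattened copy; used only
    # when all regular members are in an error state
    BalancerMember "http://static.internal:80" status=+H
</Proxy>

ProxyPass        "/" "balancer://liferay/"
ProxyPassReverse "/" "balancer://liferay/"
```

Because the flattened copy keeps the same URL space, the switch to the
standby should be transparent to clients, within the limits discussed above.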
Thanks,
Paul
>-----Original Message-----
>From: Tim Ruehsen [mailto:address@hidden
>Sent: Tuesday, October 16, 2012 9:14 AM
>To: address@hidden
>Cc: Paul Beckett (ITCS)
>Subject: Re: [Bug-wget] wget mirror site failing due to file / directory name
>clashes
>
>On Friday, 12 October 2012, Paul Beckett (ITCS) wrote:
>> I am attempting to use wget to create a mirrored copy of a CMS
>> (Liferay) website. I want to be able to failover to this static copy
>> in case the application server goes offline. I therefore need the
>> URLs to remain absolutely identical. The problem I have is that I
>> cannot figure out how I can configure wget in a way that will cope with:
>> http://www.example.com/about
>> http://www.example.com/about/something
>
>You can't make a failover copy with wget-like tools, except perhaps for very
>simple web sites, and a CMS isn't that simple.
>On a web server there will be many essential resources that are not available
>via remote access (e.g. scripts, servlets, server configuration, database,
>...).
>What I want to say is: even if you solve this (minor) problem of mapping URL
>paths to the local filesystem (a problem that occurs from time to time and
>can generally be solved by transforming the URL into a key/value pair;
>AFAIK, wget doesn't have such a feature yet), you will stumble over the
>next problem that prevents your copy from being a failover copy.
>
>It sounds like you have administrative access to your company's web server.
>So why not use any of the thousands of "professional"
>backup/failover/redundancy mechanisms for such use cases?
>E.g. a filesystem and database cluster - today there should be out-of-the-box
>solutions.
>
>But maybe I don't get your intention...
>
>Tim
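
As an aside, the key/value transformation Tim mentions (which wget does not
implement) could be sketched roughly as follows in Python. The function name
and on-disk layout are illustrative assumptions, not a wget feature: each URL
is stored as a single flat file named by a hash of the URL, so paths like
/about and /about/something can never collide as a file versus a directory.

```python
import hashlib
import os


def url_to_path(url, root="mirror"):
    """Map a URL to a flat, clash-free local file path.

    Every URL becomes one file named by the hex digest of the URL
    itself, so http://www.example.com/about and
    http://www.example.com/about/something cannot collide the way
    they do when URL paths are mapped directly onto directories.
    """
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
    return os.path.join(root, digest + ".html")


# Two URLs that clash as filesystem paths map to distinct flat files:
a = url_to_path("http://www.example.com/about")
b = url_to_path("http://www.example.com/about/something")
assert a != b
```

The obvious trade-off is that the mirror is no longer browsable by URL path
on disk; a serving layer (or a separate URL-to-file index) has to perform the
lookup, which is why this only helps the failover use case, not a plain
offline copy.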