Author New whole-site HTML validation service available
NikitaTheSpider@gmail.com

2006-06-08, 7:28 pm

Hi all,
I would like to announce the alpha release of a service that does
bulk/batch HTML validation, link checking and more. During alpha
testing, the service is free -- I need people to try this out! I invite
anyone interested in whole-site validation to check out
http://www.NikitaTheSpider.com/ where you can learn more about the
service and have your site crawled.

Thanks! Hope to see you there.

Philip
http://www.NikitaTheSpider.com/

Brian Wakem

2006-06-08, 7:28 pm

NikitaTheSpider@XXXXXXXXXX wrote:
> Hi all,
> I would like to announce the alpha release of a service that does
> bulk/batch HTML validation, link checking and more. During alpha
> testing, the service is free -- I need people to try this out! I invite
> anyone interested in whole-site validation to check out
> http://www.NikitaTheSpider.com/ where you can learn more about the
> service and have your site crawled.
>
> Thanks! Hope to see you there.
>
> Philip
> http://www.NikitaTheSpider.com/
>



What happens if you have millions or even an infinite number of pages?


--
Brian Wakem
Email: http://homepage.ntlworld.com/b.wakem/myemail.png

Charles Sweeney

2006-06-08, 7:28 pm

NikitaTheSpider@XXXXXXXXXX wrote

> Hi all,
> I would like to announce the alpha release of a service


A paid-for service after the free enticement. Many would call that spam.

> I need people to try this out!


Of course you do, in the hope that they will become paying customers.

I would imagine that anyone who was concerned about "validation" would
simply run their stuff through W3C.

--
Charles Sweeney
http://CharlesSweeney.com

NikitaTheSpider@gmail.com

2006-06-08, 7:28 pm

Brian Wakem wrote:
> NikitaTheSpider@XXXXXXXXXX wrote:
>
>
> What happens if you have millions or even an infinite number of pages?


Hi Brian,
The service is designed to accommodate large sites. I think the largest
site I tested had somewhere around 5,000 pages, but there's no built-in
limit. It should also support 50,000 or even 500,000 pages. I'm still
working on support for "infinite" though. ;)

If you have a large site that you want to spider, go for it! I'd
welcome the opportunity to see how Nikita holds up. You may want to
adjust the "politeness delay" parameter which controls how long Nikita
waits between requests to your site. It defaults to eight seconds which
is pretty high for big sites.
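
For readers wondering what a "politeness delay" amounts to in code, here is a minimal sketch in Python. It is not Nikita's actual implementation, which isn't shown in this thread; the class name `PolitenessThrottle` and its parameters are made up for illustration, and only the eight-second default comes from the post above.

```python
import time


class PolitenessThrottle:
    """Enforce a minimum gap between successive requests to one site.

    Hypothetical helper: the name and parameters are illustrative only.
    """

    def __init__(self, delay_seconds=8.0, clock=time.monotonic, sleep=time.sleep):
        self.delay_seconds = delay_seconds
        self._clock = clock          # injectable so the class can be tested
        self._sleep = sleep          # without really sleeping
        self._last_request = None    # time of the previous request, if any

    def wait(self):
        """Block until at least delay_seconds have passed since the last call."""
        now = self._clock()
        if self._last_request is not None:
            remaining = self.delay_seconds - (now - self._last_request)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_request = now
```

A crawler loop would call `throttle.wait()` right before each fetch. Lowering `delay_seconds` speeds up the crawl of a big site at the cost of more load on its server.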

Cheers
Philip

NikitaTheSpider@gmail.com

2006-06-08, 7:28 pm

Charles Sweeney wrote:
> NikitaTheSpider@XXXXXXXXXX wrote
>
>
> A paid-for service after the free enticement. Many would call that spam.


Charles,
If you want to call it spam, that's your prerogative. If I was offering
dubious medical products or Nigerian fortunes in this group, I'd agree.
But a whole-site validation service is relevant to Webmasters, no?

>
> Of course you do, in the hope that they will become paying customers.
>
> I would imagine that anyone who was concerned about "validation" would
> simply run their stuff through W3C.


And the W3C Validator is still the best option for single-page
validation. My service doesn't have all of the options that the W3C
Validator does (set encoding, set doctype, file upload, etc.) and
doesn't do single-page validation; it isn't set up for that. It's
geared to validating a whole site (or a chunk of a site) all at once so
as to obviate the need to submit dozens/hundreds/thousands of URLs to
the W3C validator. That's just not practical for large sites. But for
single pages or small batches of them, the W3C Validator (and others,
like Validome) is a better choice.

Cheers
Philip

William Tasso

2006-06-08, 7:28 pm

Fleeing from the madness of the http://groups.google.com jungle
NikitaTheSpider@XXXXXXXXXX <NikitaTheSpider@XXXXXXXXXX> stumbled into
news:alt.www.webmaster
and said:

> Charles Sweeney wrote:


>
> And the W3C Validator is still the best option for single-page
> validation....
> geared to validating a whole site


http://htmlhelp.com/tools/validator/ - seems to work well enough.

--
William Tasso

http://williamtasso.com/words/what-is-usenet.asp

NikitaTheSpider@gmail.com

2006-06-08, 7:28 pm

William Tasso wrote:
> Fleeing from the madness of the http://groups.google.com jungle
> NikitaTheSpider@XXXXXXXXXX <NikitaTheSpider@XXXXXXXXXX> stumbled into
> news:alt.www.webmaster
> and said:
>
>
>
> http://htmlhelp.com/tools/validator/ - seems to work well enough.


Hi William,
You're right that you can validate multiple pages there, but that
validator limits itself to checking 100 pages, and AFAICT it checks
them in arbitrary order. In other words, you can't say "please check
the /foo/ subtree"; it grabs the first 100 pages it sees. Also, I don't
think it has any throttle. It requests pages from sites as fast as
possible which can be hard on a Web server. Aside from these points, it
is a perfectly fine validator and I encourage people to use it if it
suits their needs.
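
To make the "check only the /foo/ subtree" idea concrete, here is a rough Python sketch of how a crawler could restrict itself to one part of a site and optionally cap the page count. The function names `in_subtree` and `select_pages` and the `example.com` URLs are placeholders, not anything taken from either validator.

```python
from urllib.parse import urlsplit


def in_subtree(url, seed):
    """Return True if url is on the seed's host and under the seed's path."""
    u, s = urlsplit(url), urlsplit(seed)
    if (u.scheme, u.netloc.lower()) != (s.scheme, s.netloc.lower()):
        return False
    # Compare against "/foo/" rather than "/foo" so "/foobar/" is not matched.
    prefix = s.path if s.path.endswith("/") else s.path + "/"
    path = u.path or "/"
    return path == s.path or path.startswith(prefix)


def select_pages(urls, seed, max_pages=None):
    """Keep in-scope URLs in the order seen, without duplicates, optionally capped."""
    seen, kept = set(), []
    for url in urls:
        if url in seen or not in_subtree(url, seed):
            continue
        seen.add(url)
        kept.append(url)
        if max_pages is not None and len(kept) >= max_pages:
            break
    return kept
```

For example, `select_pages(found_links, "http://example.com/foo/", max_pages=100)` would keep at most 100 pages, all of them under `/foo/`. Leaving `max_pages` as `None` removes the cap.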

Cheers
Philip

Charles Sweeney

2006-06-08, 7:28 pm

NikitaTheSpider@XXXXXXXXXX wrote

> Charles Sweeney wrote:
>
> Charles,
> If you want to call it spam, that's your prerogative. If I was
> offering dubious medical products or Nigerian fortunes in this group,
> I'd agree. But a whole-site validation service is relevant to
> Webmasters, no?


I didn't actually call it spam, but any "announcement" of a business
service is going to be pretty close to it by many people's definition.

It's certainly relevant, but one could say the same of hosting or domain
names, it doesn't mean that one would welcome daily announcements about
these things for sale.

I would have worded it differently, a simple request for people to try
it out, or a request for feedback, without the "I would like to
announce" part. In other words, is your post a request for help, or an
announcement of a new business?

You've got a good site there. I like the name, I like the logo, I like
the colours, and I generally like the look of it. I haven't tried your
service, so I can't speak for that, but you come across as knowledgeable
and I don't doubt that it works well.

Good luck with it.

--
Charles Sweeney
http://CharlesSweeney.com

Matt Probert

2006-06-11, 3:58 am

On 8 Jun 2006 07:12:46 -0700, "NikitaTheSpider@XXXXXXXXXX"
<NikitaTheSpider@XXXXXXXXXX> wrote:

>Hi all,
>I would like to announce the alpha release of a service that does
>bulk/batch HTML validation, link checking and more. During alpha
>testing, the service is free -- I need people to try this out! I invite
>anyone interested in whole-site validation to check out
>http://www.NikitaTheSpider.com/ where you can learn more about the
>service and have your site crawled.
>
>Thanks! Hope to see you there.
>
>Philip
>http://www.NikitaTheSpider.com/
>


Do you *really* want this tested by a big site with hundreds of
thousands of links?

Matt


--
Veritas Vincti
http://www.probertencyclopaedia.com

Matt Probert

2006-06-11, 3:58 am

On Thu, 08 Jun 2006 20:27:09 GMT, "Red E. Kilowatt"
<kilowattREMOVE@aww-faq.org> wrote:

>David Cary Hart <NO-Traif@TQMcube.com> wrote in message:
>9qaml3-pnt.ln1@news.TQMcube.com,
>
>
>Well.
>I'm glad that's finally settled. :-)
>


"It aint what you do, it's the way that you do it....

It aint what you say, it's the way that you say it..."


Matt


--
Veritas Vincti
http://www.probertencyclopaedia.com

Matt Probert

2006-06-11, 3:58 am

On 8 Jun 2006 07:12:46 -0700, "NikitaTheSpider@XXXXXXXXXX"
<NikitaTheSpider@XXXXXXXXXX> wrote:

>Hi all,
>I would like to announce the alpha release of a service that does
>bulk/batch HTML validation, link checking and more. During alpha
>testing, the service is free -- I need people to try this out! I invite
>anyone interested in whole-site validation to check out
>http://www.NikitaTheSpider.com/ where you can learn more about the
>service and have your site crawled.
>
>Thanks! Hope to see you there.
>
>Philip
>http://www.NikitaTheSpider.com/
>


Philip;

As you will see, I have submitted a URL to you to allow you to test
Nikita on a big site.

You should also perhaps build in some safeguards against people
submitting URLs with a view to "bombing" that site.

What if, when it's automated, I submit the URL of a site I don't like,
50 times, will Nikita start 50 threads sucking that entire site???

Matt


--
Veritas Vincti
http://www.probertencyclopaedia.com

NikitaTheSpider@gmail.com

2006-06-11, 3:58 am

Matt Probert wrote:
> On 8 Jun 2006 07:12:46 -0700, "NikitaTheSpider@XXXXXXXXXX"
> <NikitaTheSpider@XXXXXXXXXX> wrote:
>
>
> Philip;
>
> As you will see, I have submitted a URL to you to allow you to test
> Nikita on a big site.


Thanks Matt! It is in progress now.


> You should also perhaps build in some safeguards against people
> submitting URLs with a view to "bombing" that site.
>
> What if, when it's automated, I submit the URL of a site I don't like,
> 50 times, will Nikita start 50 threads sucking that entire site???


Yes, I've thought of the same thing -- that's exactly why I'm starting
crawls "by hand" at the moment. I have not yet built in safeguards
against someone using this service as a DoS attack on another Web site.
Until I do, I'll evaluate every seed URL and start each crawl myself.
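
One simple safeguard of the kind being discussed might look like the Python sketch below: refuse to start a second crawl of a host while one is already running, so that 50 submissions of the same URL produce one crawl rather than 50. This is an illustration only, not how Nikita works; `CrawlRegistry` and its methods are hypothetical names.

```python
import threading
from urllib.parse import urlsplit


class CrawlRegistry:
    """Allow at most one active crawl per host at a time (hypothetical helper)."""

    def __init__(self):
        self._active_hosts = set()
        self._lock = threading.Lock()  # requests may arrive concurrently

    def try_start(self, seed_url):
        """Mark the host active and return True, or return False if it is busy."""
        host = urlsplit(seed_url).hostname  # lower-cased by urlsplit
        with self._lock:
            if host is None or host in self._active_hosts:
                return False
            self._active_hosts.add(host)
            return True

    def finish(self, seed_url):
        """Release the host once its crawl has completed."""
        host = urlsplit(seed_url).hostname
        with self._lock:
            self._active_hosts.discard(host)
```

This only stops duplicate simultaneous crawls. It does nothing to confirm that the person submitting a URL has any right to have that site crawled, which would need a separate check.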

FYI, the fact that I'm starting crawls by hand delays the start of
crawl requests that come in when I'm not in front of the computer. Once
Nikita handles these requests automatically, you generally won't have
to wait more than a minute for a crawl to start.

Cheers
Philip
