HTTPget

Copyright (C) 2002 Matej Kovačič

HTTPget is a Perl-based program for analysing (parsing) websites.
You put your URLs in an input text file; the program tries to open each link and identifies what type of document is on the other side. If it is an HTML document, the program analyses it.
The program also extracts all links from the document. The result is two tab-delimited text files: the first contains the analysis of the URLs, the second contains the extracted links.
The program can also identify whether a URL uses WebTracker.
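
The workflow described above can be sketched roughly as follows. This is not the author's Perl code from get.pl; it is a minimal illustration in Python using only the standard library. The function names (classify, extract_links, write_tsv), the column layout of the two output tables, and the example URL and page body are all assumptions made for illustration. Fetching over the network is left out so that the sketch runs offline on an already-fetched response.

```python
import csv
import io
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects href targets of <a> tags, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(urljoin(self.base_url, value))


def classify(content_type):
    """Reduce a Content-Type header value to its bare media type."""
    return content_type.split(";", 1)[0].strip().lower()


def extract_links(base_url, html_text):
    """Return all <a href> targets found in html_text as absolute URLs."""
    collector = LinkCollector(base_url)
    collector.feed(html_text)
    return collector.links


def write_tsv(rows, out):
    """Write rows to a file-like object as tab-delimited text."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)


# Hypothetical, already-fetched response (assumed values, not real data).
url = "http://example.com/dir/index.html"
content_type = "text/html; charset=ISO-8859-2"
body = (
    '<html><body><a href="a.html">A</a> '
    '<a href="http://example.org/">B</a></body></html>'
)

# Only HTML documents are parsed for links.
doc_type = classify(content_type)
links = extract_links(url, body) if doc_type == "text/html" else []

# Two tab-delimited outputs: one row per URL, and one row per extracted link.
analysis_out = io.StringIO()
links_out = io.StringIO()
write_tsv([(url, doc_type, len(links))], analysis_out)
write_tsv([(url, link) for link in links], links_out)

print(analysis_out.getvalue(), end="")
print(links_out.getvalue(), end="")
```

In a real run the two StringIO buffers would be replaced by output files, and the response would come from an HTTP request for each URL in the input file.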

Requirements
This program is intended to be installed on a computer connected to the internet. HTTPget has been designed to use Perl with the following libraries:

You can get these libraries from CPAN. If you use ActiveState ActivePerl under Windows, you can use PPM to install the Time::HiRes library: go to the command line and type PPM, then type install Time::HiRes, and then type quit.
Perl and the libraries must be installed before running this program. It is always recommended to use the latest stable versions of Perl and the libraries.

License
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.

The package includes
The HTTPget package is available in ZIP format. The ZIP file includes:
- get.pl - the main script;
- import_get.sps - a script to import the data into SPSS;
- httpget.pdf - the program manual in PDF format;
- install.txt - the program manual in TXT format;
- sample.txt - a sample input file with URLs.

Support
Please note: I do not offer any technical support.

Download
You can download the latest version of the program and the manual in PDF format (get the Acrobat PDF Reader) here:

Thank you?
Yes, you can express your gratitude by offering me some help. For instance, you can help with further development of this program, report bugs, or help translate the manual and this website into Slovenian. If you are willing to help with the development of this software, you can send me an e-mail.

 


 

