Welcome to Techtadka

Web Crawler in Java

Tuesday, 23 February 2010 06:34



A very basic web crawler in Java:

The source code below is a starting point for web parsing projects.

 

import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

public class Crawler
{
    public static void main(String[] argv) {
        try {
            URL url = new URL("http://www.google.com");
            URLConnection urlConnection = url.openConnection();
            urlConnection.setAllowUserInteraction(false);

            // Read from the connection configured above (not a second one);
            // try-with-resources closes the stream even if reading fails.
            try (InputStream urlStream = urlConnection.getInputStream()) {
                ByteArrayOutputStream buffer = new ByteArrayOutputStream();
                byte[] b = new byte[4096];
                int numRead;
                // read() returns -1 at the end of the stream
                while ((numRead = urlStream.read(b)) != -1) {
                    buffer.write(b, 0, numRead);
                }
                // Decode once at the end so multi-byte characters are not split
                // across chunks. UTF-8 is an assumption; the page may use another charset.
                String content = new String(buffer.toByteArray(), StandardCharsets.UTF_8);
                System.out.println(content);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

 

This code fetches a single URL (http://www.google.com), stores the HTML source of the page in a string, and then prints it with System.out.println().
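A natural next step for a parsing project is to pull the links out of the fetched page so they can be visited in turn. The following is only a minimal sketch of that idea: the class name LinkExtractor and the method extractLinks are hypothetical names chosen for this example, and the regular expression is a rough approximation rather than a real HTML parser. It keeps absolute http/https links only and drops duplicates and #fragments.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LinkExtractor {
    // Matches href="..." or href='...'; a rough pattern, not a full HTML parser.
    private static final Pattern HREF =
            Pattern.compile("href\\s*=\\s*[\"']([^\"']+)[\"']", Pattern.CASE_INSENSITIVE);

    // Returns the distinct absolute http/https links found in the given HTML,
    // in the order they first appear.
    public static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            String link = m.group(1);
            // Drop any #fragment so the same page is not listed twice.
            int hash = link.indexOf('#');
            if (hash >= 0) {
                link = link.substring(0, hash);
            }
            boolean absolute = link.startsWith("http://") || link.startsWith("https://");
            if (absolute && !links.contains(link)) {
                links.add(link);
            }
        }
        return links;
    }

    public static void main(String[] args) {
        // Sample markup for illustration; in practice this would be the
        // content string read by the Crawler class.
        String html = "<a href=\"http://example.com/a\">A</a> <a href='/relative'>B</a>";
        for (String link : extractLinks(html)) {
            System.out.println(link);
        }
    }
}
```

Relative links such as /relative are skipped here to keep the sketch short; resolving them against the page URL would be the next refinement.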

INSPIRE Programme

Sunday, 21 February 2010 10:28

INNOVATION IN SCIENCE PURSUIT FOR INSPIRED RESEARCH
 
The INSPIRE programme was developed by the Department of Science & Technology. It awards scholarships at different levels of education.
 
SEATS: This scheme awards Rs 5000 to one million young learners in the 10-15 age group over a period of 5 years and arranges winter and summer camps.
 
Under this scheme, 2 lakh school children in the 10-15 age group (i.e. 6th to 10th standard) will be identified every year for the INSPIRE Award (Rs 5000). The scheme plans to reach at least 2 students per secondary school.
 
Selection for the INSPIRE Internship will be based on performance in the top 1% in the 10th class examination.
 
SHE: This scheme offers a total of 10,000 scholarships of Rs 80,000 every year for Bachelor's and Master's level education in the natural sciences.
 
AORC: Assured Opportunity for Research Careers offers postdoctoral researchers contractual tenure-track positions for 5 years in both basic and applied sciences.
 
For more details, log on to www.inspire-dst.gov.in
