Bots – The Good, The Bad and The Ugly

An Internet bot, also known as a web robot, WWW robot or simply a bot, is a software application that runs automated tasks (scripts) over the Internet. Typically, bots perform tasks that are both simple and structurally repetitive, at a much higher rate than would be possible for a human alone. One well-known example is Web Bot, an Internet bot program whose developers claim it can predict future events by tracking keywords entered on the Internet. The operators of Web Bot interpret the bot’s results and publish them as the “ALTA report”, available on their website to paying subscribers; ALTA stands for “asymmetric language trend analysis”. By some industry estimates, bots account for over 60 percent of all website traffic. This means that the majority of your website traffic could be coming from bots rather than humans.


Bots are routinely used on the Internet where emulation of human activity is required. A simple online question-and-answer exchange may appear to be with another person when it is actually with a chatbot. Browser-based bots, which accept JavaScript and cookies and live inside infected browsers, are becoming more sophisticated. Bad bots perform malicious tasks such as DDoS attacks, website scraping, and comment spam.


Web Crawler – Bot for search engines

A web crawler (also known as a web spider or web robot) is a program or automated script that browses the World Wide Web in a methodical, automated manner. This process is called web crawling or spidering. Many legitimate sites, in particular search engines (Google, Bing, Yandex, Baidu), use spidering as a means of providing up-to-date data. Web crawlers are mainly used to create a copy of all the visited pages for later processing by a search engine, which indexes the downloaded pages to provide fast searches. Instructions for crawlers are stored in a file named “robots.txt”. By placing a robots.txt file at the root of your web server, you can define Allow and Disallow rules that tell crawlers which paths they may fetch. Well-behaved crawlers follow these rules, but the file is only advisory: a bad bot can simply ignore it.
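To make the Allow and Disallow rules concrete, here is a minimal sketch that uses Python's standard `urllib.robotparser` module to check what a crawler may fetch. The robots.txt content, the `BadBot` name, and the example.com URLs are made up for illustration; they are not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /

User-agent: BadBot
Disallow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a parser that can answer fetch queries."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A crawler with no section of its own falls under the "*" rules.
print(parser.can_fetch("Googlebot", "https://example.com/index.html"))
print(parser.can_fetch("Googlebot", "https://example.com/private/data.html"))

# The hypothetical BadBot is asked to stay away from the whole site.
print(parser.can_fetch("BadBot", "https://example.com/index.html"))
```

Note that this only tells a crawler what the site owner has asked for. Nothing in the file itself stops a bot that chooses not to check it.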

Googlebot – Googlebot is Google’s web crawling bot (sometimes also called a “spider”). Googlebot uses an algorithmic process: computer programs determine which sites to crawl, how often, and how many pages to fetch from each site. Other search engines run comparable crawlers, such as Baidu Spider, Bingbot, YandexBot, Soso Spider and DuckDuckBot.
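Crawlers like these usually announce themselves in the User-Agent header of each request. The sketch below shows one simple way a site might recognise them by looking for a known token in that header. The token list is an assumption based on the names these crawlers commonly publish, and the function name `identify_crawler` is hypothetical. Because any client can send any User-Agent string, a match is a hint and not proof of identity.

```python
from typing import Optional

# Assumed User-Agent tokens for well-known search engine crawlers.
# Keys are lowercase so matching can be case-insensitive.
KNOWN_CRAWLER_TOKENS = {
    "googlebot": "Googlebot",
    "bingbot": "Bingbot",
    "yandexbot": "YandexBot",
    "baiduspider": "Baiduspider",
    "duckduckbot": "DuckDuckBot",
}


def identify_crawler(user_agent: str) -> Optional[str]:
    """Return the crawler name if a known token appears in the header, else None."""
    lowered = user_agent.lower()
    for token, name in KNOWN_CRAWLER_TOKENS.items():
        if token in lowered:
            return name
    return None


googlebot_ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
browser_ua = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Gecko/20100101 Firefox/120.0"

print(identify_crawler(googlebot_ua))
print(identify_crawler(browser_ua))
```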


Chat Bots

A chatbot (or chat robot) is a computer program that simulates human conversation, or chat, through artificial intelligence. Typically, a chatbot will communicate with a real person, but applications are being developed in which two chatbots can communicate with each other. Chatbots are used in applications such as e-commerce customer service, call centers, and Internet gaming. Chatbots used for these purposes are typically limited to conversations about a specialized topic and do not cover the entire range of human communication. Chatbot development platforms such as Chatfuel and Gupshup make it fairly simple to build a chatbot without a technical background. Well-known AI-based chatbots available online include Mitsuku, Rose, Poncho, Right Click, Insomno Bot, Dr. AI and Melody.
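As a rough illustration of a chatbot limited to one specialized purpose, here is a toy customer-service bot. It swaps real language understanding for plain keyword matching, so it is far simpler than the AI-based bots named above. The rules, the replies and the `reply` function are all invented for this sketch.

```python
import re

# Hypothetical keyword rules for a narrow customer-service bot.
# Rules are checked in order; the first one that matches wins.
RULES = [
    ({"order", "shipping", "delivery"},
     "I can help with orders. Please share your order number."),
    ({"refund", "return"},
     "I can help with returns. Which item would you like to return?"),
    ({"hello", "hi"},
     "Hello! How can I help you today?"),
]

FALLBACK = "Sorry, I can only help with orders and returns."


def reply(message: str) -> str:
    """Pick a canned reply based on whole-word keyword matches."""
    words = set(re.findall(r"[a-z']+", message.lower()))
    for keywords, response in RULES:
        if words & keywords:
            return response
    return FALLBACK


print(reply("Where is my order?"))
print(reply("Hi, I want a refund"))
print(reply("Tell me a joke"))
```

Anything outside the bot's narrow topic falls through to the fallback reply, which is the behaviour you would expect from a bot built for a single task.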


Bad Bots

Bad bots represent over 35 percent of all bot traffic. Hackers run bad bots to perform simple and repetitive tasks at scale. These bots scan millions of websites, aiming to steal website content, consume bandwidth, and find outdated software and plugins that they can use as a way into your website and database.

Website Scrapers and Spammers

Scrapers are bad bots that “scrape” original content from reputable sites and publish it to another site without permission. Comment spam bots are bad bots that post spam in blog comments promoting items like shoes and cosmetics; if you spend time reading blogs, you have probably come across them in the comment section. Every day millions of useless spam pages are created. Comment spam bots link to the items they are promoting in the hope that a reader will click the link and be redirected to a spam website. Once the user is on the spam site, hackers attempt to gather information such as credit card data.
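Since comment spam tends to combine links with product keywords, a very rough filter can score comments on both. The sketch below is only a heuristic under assumed values: the keyword list, the weights and the threshold are invented for illustration, and a real filter would need much more than this.

```python
import re

LINK_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)

# Hypothetical keyword list; tune it for your own site.
SPAM_KEYWORDS = {"shoes", "cosmetics", "discount", "cheap"}


def spam_score(comment: str) -> int:
    """Score a comment: 2 points per link, 1 point per distinct spam keyword."""
    link_count = len(LINK_PATTERN.findall(comment))
    words = set(re.findall(r"[a-z]+", comment.lower()))
    keyword_hits = len(words & SPAM_KEYWORDS)
    return 2 * link_count + keyword_hits


def looks_like_spam(comment: str, threshold: int = 3) -> bool:
    """Flag a comment whose score reaches the (assumed) threshold."""
    return spam_score(comment) >= threshold


genuine = "Great post, thanks for the overview!"
spammy = ("Cheap shoes here http://spam.example/shoes "
          "and cosmetics http://spam.example/c")

print(looks_like_spam(genuine))
print(looks_like_spam(spammy))
```

A comment flagged this way is better held for moderation than deleted outright, since legitimate readers also post links.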


Here are a couple of good resources in which you can look up popular bad bots, crawlers, and scrapers.


-Prof. Yatin Jog

One comment

  1. Prof. Yatin Jog

    Envirobot is the latest biomimetic creation from Swiss researchers: a robot that autonomously swims around bodies of water and tests them for toxins and other factors. The robot slithers through water like an eel, leaving mud and aquatic life undisturbed. It uses sensors to gather data from various locations, which it transmits to a remote computer in near real time. Compared with conventional propeller-driven underwater robots, it is less likely to get stuck in algae or branches as it moves around.

