The Shodan eye is not as big as one might think
Because the Cooltechzone research team spends much of its time searching the internet for IoT devices, databases, and similar assets, we rely heavily on the output of search engines.
This is where reconnaissance platforms such as Shodan come in handy: they index devices and IT infrastructure connected to the internet and return the entries that match a user's search string.
But sometimes we know for certain that there is much more out there than we can see.
In this article, I shall be exploring such reconnaissance platforms and comparing them to one of the field pioneers, Shodan. For my comparison, I have designed my benchmark with a few critical metrics that will assist in rating the platforms.
Table of Contents
- Brief introduction to reconnaissance platforms
- Setting a benchmark
Reconnaissance platforms are, at their core, search engines capable of indexing devices connected to the internet. Shodan was one of the first websites offering such capabilities to become widely popular in the IT community.
Such platforms for reconnaissance allow a user to search for a general category of devices or an IP/host address specifically to gather more information about it. This information could include the software versions, the hardware used, any open ports, ISP information, etc.
This information can be used by researchers conducting global research on certain IT technologies, IT security professionals, threat hunters looking for new vulnerabilities in a system, and even students who are learning the field.
The focus of this article will be the reconnaissance platform known as Shodan, which was one of the first of its kind in the IT security field.
However, in most of my real-world work, I prefer to use a combination of multiple reconnaissance platforms rather than a single one. Doing so lets me see the different kinds of data each platform gathers and make use of their additional features.
For the rest of the search engines or reconnaissance platforms, I will only give an overview and then step directly into the benchmark evaluation.
Firstly, while comparing any sort of technology or tool, you need to set a few benchmark metrics that will be used to compare the tool in question to the alternatives. How each tool performs in the benchmark test or evaluation decides how efficient the tool is.
Since the tools for comparison in this scenario are search engines, similar to Shodan, that can scan various internet-connected devices, the benchmark should involve analyzing such searches and their results.
While these tools are more than just search engines and are information security research or reconnaissance platforms, I will be referring to them as search engines. It's for ease of understanding and also since, at their core, these tools are, in fact, search engines.
In this section, I will list a few test metrics for the comparison benchmark, and each engine will be scored based on these metrics with similar search terms.
I will first run a benchmark evaluation on Shodan, since it is the baseline engine for our comparison, give it a score, and then do the same for the other engines. Finally, I will also deliver a verdict on every search engine.
First and foremost, we need terms or keywords to be used for all our searches in the benchmark evaluation. Using different keywords for different platforms would make the results impossible to compare.
For my benchmark evaluation, I will use common, easy-to-understand keywords and avoid complicated search terms.
Webcams and IP cameras are among the most commonly exposed devices and are, unfortunately, one of the most searched-for device types on most reconnaissance platforms. A compromised webcam allows an attacker to see into the victim's life in real time.
To look for exposed webcams or IP cameras, the search term “webcams” is used, but to further narrow the search, I will be using the phrase "cgi-bin/guestimage.html".
Image source – sendgrid.com
STARTTLS is a protocol command used by email protocols such as SMTP, POP3, and IMAP to upgrade a plain-text connection to an encrypted one. It eliminates the need to use a separate port for encrypted communications.
However, STARTTLS isn't without its flaws, the main one being the possibility for a man-in-the-middle attacker to inject plain-text commands into the communication before the upgrade. A vulnerable server would then interpret these as part of the encrypted session, allowing credentials to be stolen.
To index devices running the "STARTTLS" protocol, I can enter the name of the protocol itself as the search phrase, which then returns the required list of devices.
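To make the upgrade step a little more concrete: an SMTP server announces STARTTLS support as one of the extensions in its reply to the EHLO command. The sketch below only parses such a reply. The sample reply text and the function name `advertises_starttls` are my own illustrative assumptions, not output from any of the platforms in this benchmark.

```python
def advertises_starttls(ehlo_reply: str) -> bool:
    """Return True if an SMTP EHLO reply lists the STARTTLS extension."""
    for line in ehlo_reply.splitlines():
        # Reply lines look like "250-KEYWORD params"; the last one is "250 KEYWORD".
        if len(line) < 4 or not line.startswith("250"):
            continue
        keyword = line[4:].strip().split(" ")[0]
        if keyword.upper() == "STARTTLS":
            return True
    return False


# Hypothetical reply from a mail server, made up for illustration.
SAMPLE_REPLY = (
    "250-mail.example.test\r\n"
    "250-SIZE 35882577\r\n"
    "250-STARTTLS\r\n"
    "250 HELP"
)

print(advertises_starttls(SAMPLE_REPLY))
```

A banner-based search for "STARTTLS" is essentially matching this kind of advertised keyword in the text a service returns.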
Image source – realtek.com
Realtek has been a significant player in manufacturing various components of computer devices, such as microphones, webcams, and much more. But recently, 4 vulnerabilities were exposed in SDKs used by the manufacturer in their Wi-Fi modules.
These Wi-Fi modules were shipped to over 65 vendors who used them in over 200 different IoT devices. The vulnerability would therefore apply to and affect over a million IoT devices across the globe.
I will use "Realtek" paired with the "v2" version number as the search phrase to look up this particular element.
Image source – whatis.techtarget.com
On most systems, port 80 is the default port for HTTP, one of the most prevalent internet communication protocols. Plain HTTP is unencrypted, so traffic can be read or modified in transit, and a web service exposed on port 80 is a common entry point for attackers.
Even though most systems are switching to the more secure HTTPS (port 443) for internet communication, many still serve HTTP, leaving port 80 open and exposed.
For my search, I will simply use the search term "port" and specify the port number as 80 to list all indexed devices that have port 80 open.
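For a single host that you own or are authorized to test, "port 80 is open" simply means that a TCP connection to that port succeeds. Here is a minimal sketch of that check using Python's standard `socket` module. The function name `is_port_open` and the default timeout are my own choices, and this is not how any of the platforms perform their scans.

```python
import socket


def is_port_open(host: str, port: int = 80, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    # AF_INET limits this sketch to IPv4 addresses and hostnames.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0
```

Calling `is_port_open("192.0.2.10")` would check port 80 on that (documentation-range) address. The platforms in this article do the equivalent at internet scale and store the outcome, which is why a search is so much faster than scanning yourself.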
For any kind of search engine, the number of results returned is a rough indicator of how much of the internet it covers. The more results there are, the higher the chance of finding useful information among them.
When there are more results offered by a platform, there is also a wider field of research or exploration that is possible to you as a user.
The point to be noted here is that just because a platform has maybe 100 times more results than the other, it doesn't guarantee that it is the best platform. The results returned also have to be helpful, which I shall be explaining in my next metric.
But the bottom line is that the larger the pool of results, the greater the variety of devices a user can choose from for their requirement.
Yes, the number of results returned by the tool is essential, but they would be of no use if all the results were not authentic. If I want to find IP cameras using one of the platforms, a search result where no IP cams are returned would be entirely pointless.
This is why search results, besides being numerous, should be authentic and relevant to the user's need or search keyword.
In my benchmark metric, I will analyze the top 10 results returned by each platform in terms of how relevant they are to the keyword that I used. I will also look at how much data about each result is gathered and delivered by the platform.
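One simple way to put a number on "relevance of the top 10" is the fraction of those results that actually mention the search keyword. This is only a sketch of the idea, not the exact procedure behind the scores in this article, and the result snippets below are made up for illustration.

```python
def relevance_at_k(results: list[str], keyword: str, k: int = 10) -> float:
    """Fraction of the first k result snippets that mention the keyword."""
    top = results[:k]
    if not top:
        return 0.0
    hits = sum(1 for text in top if keyword.lower() in text.lower())
    return hits / len(top)


# Hypothetical result snippets, invented for this example.
sample_results = [
    "IP camera login cgi-bin/guestimage.html",
    "Apache default page",
    "Network camera cgi-bin/guestimage.html viewer",
    "nginx welcome page",
]

print(relevance_at_k(sample_results, "guestimage.html"))
```

A keyword match says nothing about whether a device is real or a honeypot, so a manual look at each result is still needed.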
If the platforms in question only return the preliminary results with no other information about the technology used in the systems, it would simply add additional overhead where we'd have to do it ourselves.
Such technology could include operating systems running, version of hardware or software, any supporting companies for hosting, ISP details, and so on. It could also include web application (if any), web server, organization, ports open, and so much more.
The benefit of having information about the technology used by the servers or devices we search for is that it allows us to narrow down the target of our search as well as design a plan tailored to devices using that particular technology.
For example, if we need to filter devices running Linux in our search results, all we need to do is use the operating system's filter and narrow down the results to show only devices running Linux OS.
In my benchmark, I will look at the number of technology categories offered by each platform and how well they allow results to be narrowed down.
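The filtering described above boils down to counting and selecting records by a technology field. Below is a small sketch with hypothetical records. The field names `os`, `server`, and `port`, and the helper names, are assumptions for illustration and do not reflect the schema of any particular platform.

```python
from collections import Counter

# Hypothetical indexed records (addresses are from a documentation-only range).
hosts = [
    {"ip": "192.0.2.10", "os": "Linux", "server": "nginx", "port": 80},
    {"ip": "192.0.2.11", "os": "Windows", "server": "IIS", "port": 80},
    {"ip": "192.0.2.12", "os": "Linux", "server": "Apache", "port": 443},
    {"ip": "192.0.2.13", "server": "nginx", "port": 80},  # OS not detected
]


def facet_counts(records, field):
    """Count how often each value of a field appears, skipping records without it."""
    return Counter(r[field] for r in records if field in r)


def filter_by(records, field, value):
    """Keep only the records whose field equals the given value."""
    return [r for r in records if r.get(field) == value]


linux_hosts = filter_by(hosts, "os", "Linux")
print(facet_counts(hosts, "os"))
print([h["ip"] for h in linux_hosts])
```

The more fields a platform fills in, the more of these facets it can offer; a record with a missing field (like the last one above) simply drops out of that filter.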
Besides the standard and basic features offered by such Shodan-like platforms, there are other functions such as vulnerability database, terminal support, and so on.
Such extra functionalities would be categorized under this section. While it is not critical for such a platform to have any such features, they are always worthy of a few additional points.
Such additional features are usually not a necessity; they are present to improve user convenience and the usefulness of the search results.
In the benchmark evaluation, I shan't give any fixed marks to these additional features per se.
Instead, I would simply score these additional features based on their convenience and/or whether they help improve my experience with the platform at all.
If an additional feature exists purely as a gimmick and doesn't offer the user any benefits, I wouldn't be taking it into account during the final scoring process.
Image source – shodan.io
Shodan is one of my favorite reconnaissance platforms, or IT search engines, and lets users search its index of various types of internet-connected servers. The platform also allows the user to apply various filters to the indexed results.
The platform works mainly by port scanning such internet-connected devices and grabbing their banners to extract helpful information from them.
However, the downside to banner grabbing is that any device or server that references a keyword in the user's search will also be indexed and returned.
As a simple example, imagine you are trying to perform a survey of the Google website. For this, you would enter the search query hostname:google.com, which would return all servers and websites with google.com in the hostname, whether official or not.
This would make the survey a bit tricky, but it can be avoided by using the org filter and modifying the query to hostname:google.com org:"google".
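The effect of adding the org filter can be illustrated with a toy version of the search. The records below are hypothetical, and the plain substring matching is a simplification of my own; Shodan's actual matching rules may differ.

```python
# Hypothetical indexed records, invented for illustration.
records = [
    {"hostname": "www.google.com", "org": "Google LLC"},
    {"hostname": "google.com.example.test", "org": "Example Hosting"},
    {"hostname": "mail.google.com", "org": "Google LLC"},
    {"hostname": "shop.example.test", "org": "Example Hosting"},
]


def search(records, hostname=None, org=None):
    """Return hostnames of records whose fields contain every given term."""
    matches = []
    for rec in records:
        if hostname and hostname.lower() not in rec["hostname"].lower():
            continue
        if org and org.lower() not in rec["org"].lower():
            continue
        matches.append(rec["hostname"])
    return matches


broad = search(records, hostname="google.com")
narrow = search(records, hostname="google.com", org="google")
print(broad)
print(narrow)
```

The broad search picks up the unofficial `google.com.example.test` host, while the narrowed one keeps only hosts attributed to the organization.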
I prefer Shodan over most other similar platforms because Shodan has a detailed description of all the indexed results, along with the same for all open ports in the indexed devices.
The number of results: In Shodan, searching for webcams returned a total of 834 results; the search for STARTTLS returned 8,731,982 results; Realtek SDK v2 returned 11,121 results; port 80 returned 99,247,046 results. The average number of results, therefore, is 26,997,746.
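For transparency, this is how the average is computed from the four counts. The snippet also shows how much of the total comes from the port 80 search, which is worth keeping in mind when comparing averages across platforms.

```python
from statistics import mean

# Result counts reported above for Shodan.
shodan_counts = {
    "webcams": 834,
    "STARTTLS": 8_731_982,
    "Realtek SDK v2": 11_121,
    "port 80": 99_247_046,
}

average = mean(shodan_counts.values())
port80_share = shodan_counts["port 80"] / sum(shodan_counts.values())

print(f"average: {average:,.0f}")
print(f"port 80 share of all results: {port80_share:.0%}")
```

Because a single broad term dominates the sum, the average mostly reflects the port 80 count; the per-term numbers are more telling for narrow searches such as webcams.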
Top 10 results: A look at the top 10 results for the search terms in Shodan showed that while the platform returns ISP, open ports, and location details for almost all of the results, a lot of information isn't available. A few results had individual vulnerabilities for each device.
In webcams, only 2 of the indexed devices were accessible, while the rest were password protected. In the Realtek search, all indexed devices were honeypots, and no real devices were found.
Technology: On the technology side of things, Shodan gathered information about location, open ports, organizations, and products used. Information on vulnerabilities, operating systems, and web technologies used was limited.
Additional features: In terms of additional features, the following features are available on Shodan besides the reconnaissance platform –
- Terminal support that allows running Shodan scans from a Unix terminal
- Network monitor tool allowing personal usage as well as enterprise use to scan IP addresses and entire networks
- Shodan API that can be integrated with a third-party website/platform
- Advanced features such as local Shodan database available to enterprise users
Image source – zoomeye.org
ZoomEye is yet another reconnaissance platform that works quite similarly to Shodan but has a few additional features and lacks a few that Shodan offers.
For instance, Shodan has the feature to show detailed information of every device that it indexes. On the other hand, ZoomEye has a detailed list of the various technologies used by the indexed devices.
I would also go as far as to say that ZoomEye is one of the most advanced engines on the list, indexing several times as many results as Shodan, as we will see. It is also the most suitable platform for penetration testers doing hands-on work.
Number of results: For ZoomEye, the search for webcams returned 56,644 results, STARTTLS returned 24,209,576 results, Realtek returned 100,833 results, and port 80 returned 346,160,625 results coming to an average of 92,631,919 results.
This average is more than 3 times Shodan's, and every individual search term returned more results on ZoomEye than on Shodan.
Top 10 results: For the top 10 results, like Shodan, there were 2 webcams accessible out of the 10 results. While the results seemed to be authentic, a few results were repeated. I was able to, however, index a few IPv6 devices (possibly routers) using the STARTTLS search.
The individual results, however, did not contain as much data as was gathered by Shodan, with only one port being discovered and no information about individual vulnerabilities.
Technology: Even though individual results from ZoomEye were poor compared to those of Shodan, the former shines in its comprehensive report on the technologies. ZoomEye managed to gather information on operating systems, web applications, web servers, products used, and much more for each search term.
While the variety of data found by ZoomEye for each search was greater than Shodan's, the amount of data found per result was smaller.
Additional features: The additional features supported by ZoomEye are as follows –
- Terminal support along with API functionality to be run on a UNIX terminal
- Separate vulnerabilities tab applicable to the indexed devices, which is very useful while researching a specific type of device
- A statistics section where a user can look for various regional device statistics using an interactive globe
- Cyberspace radar system that works as a threat and asset detection system that allows management and supervision of a security system used by an enterprise or individuals
Image source – search.censys.io
Censys is one of the top reconnaissance platforms that can be used to gather and index information from internet-connected devices. It provides dozens of tags and also has an in-built mini-tutorial called “Data Definitions” that works to explain how to use the platform.
Censys can also index more information from internet-connected devices and has an excellent user interface that makes finding information very easy.
Number of results: On Censys, the webcam search returns 41,606 results, STARTTLS returns 6,226,252 results, Realtek SDK v2 returns 91,643 results, and finally, port 80 returns 1,480,775 results.
This makes the average number of results per search 1,960,069. While this is relatively small compared to its competitors, Censys makes up for it by providing better-quality results.
Top 10 results: The individual results included many devices that let me reach the admin page of quite a few webcams, of which 3 were open to public access. I could also reach the admin page of a couple of routers but couldn't get any further due to password protection.
Technology: Censys manages to gather quite a bit of information about the technology running on the indexed devices such as operating systems, software vendors, products used, and ports.
The platform offers data of quite a lot of variety compared to most of its competitors. It even manages to index location information for the devices along with various protocols running on the system.
Additional features: While most additional features supported by Censys are for paid and enterprise users, it does have quite a few interesting ones –
- WHOIS tab that gathers WHOIS information about individual devices and hosts
- History information on each device and hosts about various operations on services and files
- An explore tab that allows users to browse through various files and subdomains of the host
- Vulnerability detection that automatically detects vulnerabilities in the system
Image source – fofa.so
FOFA is a reconnaissance platform that feels very similar to ZoomEye in terms of the user interface and the results returned. Frankly, I do not enjoy this website since its content is in Chinese by default, making it quite hard to follow and to use the features properly.
Like most other similar platforms, FOFA too works using banner grabbing to gain information about devices.
However, compared to Shodan, FOFA indexed more devices that are open to public access without any password barrier. I will be listing these findings among the results to clarify what I mean.
The number of results: The webcam search on FOFA returned 15,444 results; the STARTTLS search returned 13,784,443 results; the Realtek search returned 1,834,439 results, even though I had to modify the search term; finally, the port 80 search returned 5,737,070 results.
This makes an average of 5,342,849 results returned per search, so the Shodan average is roughly 5 times larger than FOFA's.
Top 10 results: For the search in general, the terms had to be quite generalized to get any results at all. The search term for Realtek had to be reduced down to "Realtek" alone as the original did not return anything.
As for individual results, the platform managed to index 4 webcams that weren't password-protected and even 1 router admin page (which needed a password to log in).
Much like ZoomEye, the results returned by FOFA, too, are not of as much variety or number as Shodan. No vulnerabilities for individual devices were returned, nor does FOFA have any additional support to show vulnerabilities.
Technology: In terms of technology data indexed, FOFA manages to gather information about ports, organizations, locations, and web servers in use. But even so, the indexed data isn't of as much variety as Shodan or even ZoomEye, for that matter.
But I feel that what FOFA lacks in variety, it makes up for with its search algorithm, which finds and indexes more open devices than Shodan.
Additional features: While FOFA does not have a lot of additional features, the ones it does have are pretty good and are as below:
- API support to integrate the search engine with third party website
- Cyberspace surveying and monitoring tool that can detect vulnerabilities along with a community POC module
Image source – greynoise.io
While Greynoise is a platform for surveillance, it is better suited to enterprise use than to personal use. The platform has a meager hit rate for all searches in the benchmark evaluation metrics and has poor performance overall.
The product is more suited for enterprise use, with relatively good functionality to detect any malicious activity in an IP block and analyze IPs in bulk.
The website also required many modifications to the search terms. The search for webcams had to be done using the phrase "cam" alone, and that for Realtek was modified to "Realtek" alone.
The number of results: The webcam search returned 22 results, while STARTTLS returned no results at all. Realtek returned 12,320 results and finally, port 80 returned 1,116,056 results.
This brings the average number of results to only 282,099, which is very small compared to the other platforms.
Top 10 results: Considering the top 10 results individually, there isn't much information presented here. The available information is location, ports, and a few other technical details.
Surprisingly, the platform does return the name of the operating system running on the indexed device.
Technology: In terms of technology, besides information about location, organizations, and operating systems, there is no other information about the technology used by the various indexed devices whatsoever.
Additional features: The additional features listed here for Greynoise might as well be their selling features since it doesn't have much else to offer as a reconnaissance platform.
- Alert system where Greynoise alerts the user if it detects any malicious scans or attacks targeted at an IP block that belongs to a user
- IP analysis tool that can analyze & enrich IP addresses in bulk
Image source – leakix.net
LeakIX is a small-scale reconnaissance platform and works mainly by using crawler information that captures various assets and files from every internet-connected device.
The number of results: In LeakIX, there were 575 results for webcams, 436 for STARTTLS, 459,930 results for Realtek SDK v2, and finally 4,144,871 results for port 80. The average number of results per search, therefore, is 1,151,453.
That average is roughly 23 times smaller than Shodan's, so the platform is not as good at indexing devices. There were no publicly accessible devices indexed by the platform either.
Top 10 results: On LeakIX, while the individual results do have a page of their own with additional data much like Shodan, the data that it gathers doesn't compare to the scale of Shodan.
The only information available from the indexed devices is location, web technology used, and a few assets/files found by a web crawler.
Technology: Disappointingly, LeakIX's capability to gather information on the technology used by the various indexed devices is highly limited.
The platform returns minimal information about a device, most of which users could find themselves with an Nmap port scan. The information includes geolocation, web technologies used, and sometimes a hostname.
Additional features: The LeakIX platform, however, isn't without a few features that make using it quite convenient:
- Terminal support allowing usage from a Unix terminal
- The 'l9explore' plugin allows finding leaks, misconfigurations, and vulnerabilities on any network. The plugin also comes with more plugins for customization
Image source – onyphe.io
The Onyphe reconnaissance platform is capable of indexing various internet-connected devices and returning information gathered from their banners in a well-formatted form. The platform has a well-designed UI and is easy to follow.
Onyphe works by running data scans on various internet-connected devices and then gathering banner information from these devices.
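Turning a raw banner into "well-formatted" fields is mostly a parsing job. As a rough illustration, the sketch below splits an HTTP response header into a status line and named fields. The sample banner is made up, the function name `parse_http_banner` is hypothetical, and this is not a description of Onyphe's own pipeline.

```python
def parse_http_banner(banner: str) -> dict:
    """Split a raw HTTP response header into a status line and header fields."""
    lines = banner.strip().splitlines()
    info = {"status": lines[0] if lines else "", "headers": {}}
    for line in lines[1:]:
        if not line.strip():
            break  # A blank line marks the end of the header block.
        name, sep, value = line.partition(":")
        if sep:
            info["headers"][name.strip().lower()] = value.strip()
    return info


# Hypothetical banner text, invented for this example.
RAW_BANNER = (
    "HTTP/1.1 200 OK\r\n"
    "Server: nginx/1.18.0\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
)

parsed = parse_http_banner(RAW_BANNER)
print(parsed["status"])
print(parsed["headers"].get("server"))
```

Fields such as the `Server` header are what end up as the "product" and "version" facets on these platforms.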
The number of results: The search for webcams returned 3,380 results, while that for STARTTLS returned 8,453,275 results. Finally, the Realtek search returned 1,493 results, and port 80 returned 2,956,475 results.
This brings the average number of results per search to 2,853,656, which is a pretty decent number but relatively small compared to Shodan's.
Top 10 results: The individual results contain only a little information about the devices, such as the operating system. The returned results do have a location attached to them and belong to various devices connected to the internet.
Of all the results found, 3 webcams were open to public access, and no other devices were available.
Technology: Regarding the technology information indexed, the platform gathers information about the devices' location, operating systems, organizations, hostnames, and subdomains.
However, there is no information gathered about ports, web technology, web servers, or anything else by the platform.
Additional features: The additional features of Onyphe include:
- API support that can be integrated with third-party platforms
- On-demand scans of requested network or IP block
Image source – app.binaryedge.io
BinaryEdge is quite good in terms of indexing and gathering information on internet-connected devices and systems. The platform has an excellent user interface with advanced search capabilities.
The platform allows users to make custom searches under categories such as host, images, dataleaks, torrents, domains, and sensors. This advanced capability makes finding useful information much more accessible with minimum effort.
The number of results: The platform returned a total of 3,322,409 results for the webcam search, 3,873,675 results for STARTTLS, 1,300 results for Realtek SDK v2, and finally 120,056,100 results for port 80.
This makes the average number of results per search 31,813,371, which is about 18% higher than Shodan's.
Top 10 results: Regarding the results themselves, almost all the results for the webcam opened up an admin login page that unfortunately required a username and password combination to access. Similarly, the search for Realtek devices also returned admin login pages for quite some routers.
Technology: Technology-wise, one of the most exciting features of the platform is that it recognizes the main products that were indexed and returns their info in a separate table. It also returns a few other details such as OS type, the service running along with its version, and a direct link to access the device.
Another feature that I quite liked is the risk rating parameter given for the device based on its vulnerabilities and a metric to show whether or not the device was scanned by any malicious actor recently over the web.
Additional features: The BinaryEdge platform does not have a lot of additional features per se but does offer a few additional services to perform the surveillance:
- Images service that helps a user find images related to the search query
- Dataleaks service that can check whether a given email address is part of any data leak in any cyber attacks
- Torrents service that monitors torrent usage around the globe
- Domains service that can help find all subdomains created under the main domain, some of which aren't usually indexed by web crawlers
- Sensors service that uses various honeypots to find malicious actors or anyone who is poking around the internet trying to find something they aren't supposed to
To be honest, I was not expecting these kinds of results, since I have always been a Shodan fan. I will still use it for some projects, but checking the same query on Censys makes perfect sense, since it can give me more relevant results.
Not that it was the purpose of this research to make a top list, but here is our ranking anyway. Enjoy!
- Rank 1 – Censys
- Rank 2 – ZoomEye
- Rank 3 – Shodan
- Rank 4 – FOFA
- Rank 5 – BinaryEdge
- Rank 6 – Onyphe
- Rank 7 – Greynoise
- Rank 8 – LeakIX
Censys is an excellent reconnaissance platform, and even though it lacks the total number of results, it makes up for it with the quality of the data it indexes. The platform is also very functional from a penetration tester's point of view and gathers information about the target.
In terms of results, ZoomEye indexes and returns more devices than Shodan and gathers a larger variety of data than Shodan. The platform also has exciting additional features that allow users to find vulnerabilities in the indexed devices.
Shodan is the classic reconnaissance platform to gather information about internet-connected devices. The platform, even though it has a lesser variety of data, does manage to gather information from more devices. It also has quite a few additional features that make using the platform quite interesting.
While FOFA uses Chinese as its default language, it has tutorials and a sleek UI that make it quite simple. Furthermore, even though it doesn't gather as much data as the competition, it did manage to index the largest number of publicly accessible devices.
BinaryEdge is one of my favorite platforms just for its user interface that is very well organized and easy to use. It also has a lot of information about web technologies that are collected and vulnerabilities and a risk rating for each indexed device.
Onyphe has quite a few different varieties of data that it collects from the indexed devices and also managed to find some publicly accessible devices. However, in terms of other data collected, Onyphe performs poorly compared to Shodan and a few other competitors.
Greynoise is one of the platforms with a pretty good user interface and also has an additional alerts system to alert a user about any malicious scans for a given IP block. However, the platform performs poorly when it comes to the actual gathering of data and indexing devices.
The Greynoise platform has the lowest average number of results in the benchmark and returned the fewest results for three of the four search terms.
While LeakIX has more results indexed than Greynoise, it is still ranked last due to the poor quality of its returned results. The platform doesn't offer users a lot of additional features either.
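To put the raw numbers side by side, the snippet below recomputes each platform's average from the counts reported in this article and expresses it relative to Shodan. Keep in mind that the search terms had to be modified on FOFA and Greynoise, so those rows are not strictly like-for-like.

```python
# Result counts per platform as reported in this article:
# (webcams, STARTTLS, Realtek, port 80)
counts = {
    "Shodan":     (834, 8_731_982, 11_121, 99_247_046),
    "ZoomEye":    (56_644, 24_209_576, 100_833, 346_160_625),
    "Censys":     (41_606, 6_226_252, 91_643, 1_480_775),
    "FOFA":       (15_444, 13_784_443, 1_834_439, 5_737_070),
    "Greynoise":  (22, 0, 12_320, 1_116_056),
    "LeakIX":     (575, 436, 459_930, 4_144_871),
    "Onyphe":     (3_380, 8_453_275, 1_493, 2_956_475),
    "BinaryEdge": (3_322_409, 3_873_675, 1_300, 120_056_100),
}

averages = {name: sum(c) / len(c) for name, c in counts.items()}
baseline = averages["Shodan"]

# Print platforms from the largest to the smallest average.
for name, avg in sorted(averages.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<11}{avg:>14,.0f}  {avg / baseline:5.2f}x Shodan")
```

Sorting by average alone would not reproduce the ranking above, since the ranking also weighs the quality of the top results, the technology data, and the additional features.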
Reconnaissance platforms, unlike standard search engines, find and index various devices connected to the internet. Most usually work by grabbing banners of the devices to gather information about the device, but a few others use web crawlers instead.
Our article looked at the Shodan reconnaissance platform in detail and then explored 7 other similar platforms that could work as viable replacements to Shodan.
The comparison was done using a benchmark evaluation using 4 different search metrics!
Please, share your opinion about this article in the comments. Was this guide helpful for you? I’m waiting for your feedback!