Open-Source Intelligence


Open-Source Intelligence (OSINT) is the process of finding publicly available information about a target company and/or individuals in order to identify events (e.g., public and private meetings), external and internal dependencies, and connections. OSINT uses public (open-source) information from freely available sources to obtain the desired results. Sources for collecting data include mass media such as television, radio, print media, and the Internet. The information collected from these sources is analyzed, evaluated, and linked together to gain the necessary insights about our target.

The focus of OSINT lies in the word "Intelligence," which means constructing relationships between individual pieces of information from which we can create specific patterns and profiles about the target. The art here is to look behind the scenes and think outside the box.

It is essential to understand that every company is structured and designed differently. Therefore, the focus of this Module is to understand and internalize the principles of the OSINT process as well as possible. The specific resources and tools we use matter less, since they can differ significantly. What matters is how we find the information, how we use it during our assessments, and how we present it to our clients to provide them with the most value.


Definition of OSINT

There are many ways of describing OSINT. Unfortunately, many definitions are wrong, and following them could lead to activities that are illegal in most countries. Most guides and resources only introduce individual resources and tools that are supposed to provide us with the most necessary information. This covers only the "Open-Source" part of OSINT, while the "Intelligence" behind it is ignored or skipped entirely. Knowing which tools are available and choosing the ones that best fit our own approach covers, with a bit of luck, only about 10% of what OSINT requires. We will see and use some resources and tools throughout this Module, but no tool can replace the core elements. Every OSINT tool is built on specific information sources and is therefore limited to them. Tools may help us find some information quickly, but, as mentioned before, this is only a tiny fraction of the whole picture.

Open-source information, or open sources, is any data that anyone can obtain from public sources without restriction, whether for free or commercially, in a legal and ethically acceptable way.

In OSINT, information is categorized and linked together to form a logical connection. For example, a developer employed by a company might reuse specific source code components from their personal GitHub repository in the company infrastructure. Whether any particular tool will even show us this source is questionable. However, the intelligence we apply manually allows us to logically relate such details to our target company.
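To make the idea of linking findings more concrete, the following is a minimal sketch of how individual pieces of information could be stored as a small graph and then connected into a chain. The class name `FindingGraph`, the company "Example Corp", the person "J. Doe", and the repository name are all hypothetical and only mirror the developer example above. It is one possible way to organize notes, not a prescribed method.

```python
from collections import defaultdict, deque


class FindingGraph:
    """Undirected graph that links entities found during OSINT (illustrative only)."""

    def __init__(self):
        # entity -> {neighbouring entity: description of the relation}
        self._edges = defaultdict(dict)

    def link(self, a, b, relation):
        """Record that entities a and b are related."""
        self._edges[a][b] = relation
        self._edges[b][a] = relation

    def path(self, start, goal):
        """Breadth-first search for the shortest chain from start to goal, or None."""
        if start not in self._edges or goal not in self._edges:
            return None
        queue = deque([[start]])
        seen = {start}
        while queue:
            chain = queue.popleft()
            node = chain[-1]
            if node == goal:
                return chain
            for neighbour in self._edges[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(chain + [neighbour])
        return None


# Hypothetical findings, each one collected from a different public source.
graph = FindingGraph()
graph.link("Example Corp", "J. Doe", "employs as developer")
graph.link("J. Doe", "github.com/jdoe-example/deploy-utils", "owns repository")
graph.link("github.com/jdoe-example/deploy-utils",
           "Example Corp infrastructure", "components reused in")

chain = graph.path("Example Corp", "Example Corp infrastructure")
print(" -> ".join(chain))
```

None of the three findings says much on its own. Only the chain that connects them relates the repository to the target, and building that chain is the manual "Intelligence" step that no single tool performs for us.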

It is essential to understand precisely what OSINT is and what it is not. OSINT is based only on the passive gathering of information about the target company from sources that are publicly available and accessible without registration or authentication. To be clear, this restriction concerns interaction with the target, not with third parties. Information that our target company discloses on other platforms (for example, on LinkedIn) can, of course, be viewed with a registered account. If, however, we have to register on internal forums to access internal (confidential and non-public) information, we are actively interacting with our target, which is no longer part of OSINT. There must be no active enumeration of or interaction with the target, such as brute-forcing subdomains, during the OSINT phase. This type of interaction belongs to the active enumeration phase. The OSINT process should consist only of using publicly available information.

Note: In this Module, we will see some examples of how this information can be used in later phases of our penetration test. Examples that involve phone calls or physical access belong to those later phases and are not part of the OSINT process.

It cannot be stressed strongly enough how important it is to differentiate between these categories.

For example, even if the customer has an open S3 bucket on AWS, if that bucket was not contractually defined in the scope of work and we identify it by brute-forcing bucket names, we are already in violation of the contract. This type of activity could lead to a problematic situation with significant consequences.

A distinction must be made between tools that actively interact with the target company's resources and use different methods to retrieve information, and tools that take already disclosed information and connect it to data held in other databases. For example, when a tool tries to find vhosts or subdomains by brute-forcing, it uses an active scanning method that sends requests to the DNS or web servers to confirm the existence of these vhosts or subdomains, and it is therefore no longer part of OSINT.

However, when a tool takes the SSL certificate presented by a web server and looks up its data in Certificate Transparency databases such as crt.sh, we are not performing an active scan against the target company. In this case, we are merely using the viewable information to obtain additional information from third parties.
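As an illustration of such a third-party lookup, the sketch below builds a crt.sh query and extracts hostnames from its JSON response. It assumes that crt.sh accepts the `q` and `output=json` query parameters and returns entries with a newline-separated `name_value` field. The function names and the sample data are hypothetical, and the network call is defined but not executed here.

```python
import json
import urllib.parse
import urllib.request


def crtsh_url(domain):
    """Build a crt.sh query URL for certificates issued under the given domain."""
    # Assumption: crt.sh accepts "%" as a wildcard and "output=json" for JSON results.
    query = urllib.parse.urlencode({"q": f"%.{domain}", "output": "json"})
    return f"https://crt.sh/?{query}"


def extract_hostnames(entries, domain):
    """Collect unique hostnames under `domain` from crt.sh-style JSON entries."""
    hostnames = set()
    for entry in entries:
        # Assumption: "name_value" holds one or more names separated by newlines.
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # drop the wildcard label, keep the parent name
            if name == domain or name.endswith("." + domain):
                hostnames.add(name)
    return sorted(hostnames)


def fetch_hostnames(domain, timeout=30):
    """Ask crt.sh (a third party) for known hostnames; the target is never contacted."""
    with urllib.request.urlopen(crtsh_url(domain), timeout=timeout) as response:
        return extract_hostnames(json.load(response), domain)


# Offline sample in the assumed response shape, so the parsing can be tried safely.
sample = [
    {"name_value": "www.example.com\nmail.example.com"},
    {"name_value": "*.dev.example.com"},
    {"name_value": "WWW.example.com"},
    {"name_value": "unrelated.example.org"},
]
print(extract_hostnames(sample, "example.com"))
```

Every request in `fetch_hostnames` goes to crt.sh. Sending DNS or HTTP requests to the discovered hostnames afterwards would already be interaction with the target and belongs to the active enumeration phase.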

Below is a table with a few example activities, showing whether each one counts as OSINT and whether it is legal.

Activity | OSINT? | Legality
Looking for employees on social networks | Yes | Legal. Open-source information.
Viewing the company's website | Yes | Legal. Open-source information.
Directory brute-forcing the company's website | No | Illegal without permission since it is an active enumeration/scanning process.
Brute-forcing the company's subdomains | No | Illegal without permission since it is an active enumeration/scanning process.
Looking at certificate transparency logs to identify subdomains | Yes | Legal. No direct interaction with the target company.
Using third-party services that have collected information about the target company | Yes | No direct interaction with the target company.
Using third-party services that scan the target company | No | Illegal without permission since it is an active enumeration/scanning process requested by us.
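
The "OSINT?" column above can be expressed as a simple check that we run against any planned activity before we start. The sketch below is only one way to encode it. The `Activity` fields are hypothetical simplifications of the criteria described in this section, and the check says nothing about legality or contractual scope.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    description: str
    # True if we guess or scan resources of the target, either directly
    # or through a third-party service that scans on our request.
    probes_target: bool
    # True if we must register with a non-public system of the target.
    needs_target_account: bool


def is_osint(activity):
    """An activity counts as OSINT only if it stays passive on both criteria."""
    return not (activity.probes_target or activity.needs_target_account)


activities = [
    Activity("Looking for employees on social networks", False, False),
    Activity("Brute-forcing the company's subdomains", True, False),
    Activity("Looking at certificate transparency logs", False, False),
    Activity("Using third-party services that scan the target company", True, False),
    Activity("Registering on an internal forum of the target", False, True),
]

for activity in activities:
    label = "OSINT" if is_osint(activity) else "not OSINT"
    print(f"{label:9} | {activity.description}")
```

Whether a borderline activity "probes" the target remains a judgment we have to make ourselves. The check only forces us to ask the question for every tool and technique we plan to use.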