A "breach" is an incident where data is inadvertently exposed in a vulnerable system, usually due to insufficient access controls or security weaknesses in the software. HIBP aggregates breaches and enables people to assess where their personal data has been exposed.
When email addresses from a data breach are loaded into the site, no corresponding passwords are loaded with them. Separately to the pwned address search feature, the Pwned Passwords service allows you to check if an individual password has previously been seen in a data breach. No password is stored next to any personally identifiable data (such as an email address) and every password is SHA-1 hashed (read why SHA-1 was chosen in the Pwned Passwords launch blog post.)
No. Any ability to send passwords to people puts both them and myself at greater risk. This topic is discussed at length in the blog post on all the reasons I don't make passwords available via this service.
The public search facility cannot return anything other than the results for a single user-provided email address or username at a time. Multiple breached accounts can be retrieved by the domain search feature but only after successfully verifying that the person performing the search is authorised to access assets on the domain.
Occasionally, a breach will be added to the system which doesn't include credentials for an online service. This may occur when data about individuals is leaked and it may not include a username and password. However this data still has a privacy impact; it is data that those impacted would not reasonably expect to be publicly released and as such they have a vested interest in having the ability to be notified of this.
There are often "breaches" announced by attackers which in turn are exposed as hoaxes. There is a balance between making data searchable early and performing sufficient due diligence to establish the legitimacy of the breach. The following activities are usually performed in order to validate breach legitimacy:
A "paste" is information that has been "pasted" to a publicly facing website designed to share content such as Pastebin. These services are favoured by hackers due to the ease of anonymously sharing information and they're frequently the first place a breach appears.
HIBP searches through pastes that are broadcast by the @dumpmon Twitter account and reported as having emails that are a potential indicator of a breach. Finding an email address in a paste does not immediately mean it has been disclosed as the result of a breach. Review the paste and determine if your account has been compromised then take appropriate action such as changing passwords.
Pastes are often transient; they appear briefly and are then removed. HIBP usually indexes a new paste within 40 seconds of it appearing and stores the email addresses that appeared in the paste along with some meta data such as the date, title and author (if they exist). The paste itself is not stored and cannot be displayed if it no longer exists at the source.
Whilst HIBP is kept up to date with as much data as possible, it contains but a small subset of all the records that have been breached over the years. Many breaches never result in the public release of data and indeed many breaches even go entirely undetected. "Absence of evidence is not evidence of absence" or in other words, just because your email address wasn't found here doesn't mean that is hasn't been compromised in another breach.
Some people choose to create accounts using a pattern known as "plus aliasing" in their email addresses. This allows them to express their email address with an additional piece of data in the alias, usually reflecting the site they've signed up to such as firstname.lastname@example.org or email@example.com. There is presently a UserVoice suggestion requesting support of this pattern in HIBP. However, as explained in that suggestion, usage of plus aliasing is extremely rare, appearing in approximately only 0.03% of addresses loaded into HIBP. Vote for the suggestion and follow its progress if this feature is important to you.
The breached accounts sit in Windows Azure table storage which contains nothing more than the email address or username and a list of sites it appeared in breaches on. If you're interested in the details, it's all described in Working with 154 million records on Azure Table Storage – the story of Have I Been Pwned
Nothing is explicitly logged by the website. The only logging of any kind is via Google Analytics, Application Insights performance monitoring and any diagnostic data implicitly collected if an exception occurs in the system.
When you search for a username that is not an email address, you may see that name appear against breaches of sites you never signed up to. Usually this is simply due to someone else electing to use the same username as you usually do. Even when your username appears very unique, the simple fact that there are several billion internet users worldwide means there's a strong probability that most usernames have been used by other individuals at one time or another.
When you search for an email address, you may see that address appear against breaches of sites you don't recall ever signing up to. There are many possible reasons for this including your data having been acquired by another service, the service rebranding itself as something else or someone else signing you up. For a more comprehensive overview, see Why am I in a data breach for a site I never signed up to?
Yes, it has to in order to track who to contact should they be caught up in a subsequent data breach. Only the email address, the date they subscribed on and a random token for verification is stored.
You don't, but it's not. The site is simply intended to be a free service for people to assess risk in relation to their account being caught up in a breach. As with any website, if you're concerned about the intent or security, don't use it.
Sure, you can construct a link so that the search for a particular account happens automatically when it's loaded, just pass the name after the "account" path. Here's an example:
HIBP enables you to discover if your account was exposed in most of the data breaches by directly searching the system. However, certain breaches are particularly sensitive in that someone's presence in the breach may adversely impact them if others are able to find that they were a member of the site. These breaches are classed as "sensitive" and may not be publicly searched.
A sensitive data breach can only be searched by the verified owner of the email address being searched for. This is done via the notification system which involves sending a verification email to the address with a unique link. When that link is followed, the owner of the address will see all data breaches and pastes they appear in, including the sensitive ones.
There are presently 21 sensitive breaches in the system including Adult Friend Finder, Ashley Madison, Beautiful People, Bestialitysextaboo, Brazzers, CrimeAgency vBulletin Hacks, Fling, Florida Virtual School, Freedom Hosting II, Fridae, Fur Affinity, HongFire, Mate1.com, Muslim Match, Naughty America, Non Nude Girls, Rosebutt Board, The Candid Board, The Fappening, xHamster and 1 more.
After a security incident which results in the disclosure of account data, the breach may be loaded into HIBP where it then sends notifications to impacted subscribers and becomes searchable. In very rare circumstances, that breach may later be permanently remove from HIBP where it is then classed as a "retired breach".
A retired breach is typically one where the data does not appear in other locations on the web, that is it's not being traded or redistributed. Deleting it from HIBP provides those impacted with assurance that their data can no longer be found in any remaining locations. For more background, read Have I Been Pwned, opting out, VTech and general privacy things.
There is presently 1 retired breach in the system which is VTech.
Some breaches may be flagged as "unverified". In these cases, whilst there is legitimate data within the alleged breach, it may not have been possible to establish legitimacy beyond reasonable doubt. Unverified breaches are still included in the system because regardless of their legitimacy, they still contain personal information about individuals who want to understand their exposure on the web. Further background on unverified breaches can be found in the blog post titled Introducing unverified breaches to Have I Been Pwned.
Some breaches may be flagged as "fabricated". In these cases, it is highly unlikely that the breach contains legitimate data sourced from the alleged site but it may still be sold or traded under the auspices of legitimacy. Often these incidents are comprised of data aggregated from other locations (or may be entirely fabricated), yet still contain actual email addresses of unbeknownst to the account holder. Fabricated breaches are still included in the system because regardless of their legitimacy, they still contain personal information about individuals who want to understand their exposure on the web. Further background on unverified breaches can be found in the blog post titled Introducing "fabricated" breaches to Have I Been Pwned.
Occasionally, large volumes of personal data are found being utilised for the purposes of sending targeted spam. This often includes many of the same attributes frequently found in data breaches such as names, addresses, phones numbers and dates of birth. The lists are often aggregated from multiple sources, frequently by eliciting personal information from people with the promise of a monetary reward . Whilst the data may not have been sourced from a breached system, the personal nature of the information and the fact that it's redistributed in this fashion unbeknownst to the owners warrants inclusion here. Read more about spam lists in HIBP .
If a password is found in the Pwned Passwords service, it means it has previously appeared in a data breach. HIBP does not store any information about who the password belonged to, only that it has previously been exposed publicly and how many times it has been seen. A Pwned Password should no longer be used as its exposure puts it at higher risk of being used to login to accounts using the now-exposed secret.
The design and build of this project has been extensively documented on troyhunt.com under the Have I Been Pwned tag. These blog posts explain much of the reasoning behind the various features and how they've been implemented on Microsoft's Windows Azure cloud platform.