Find out whether your GitHub repository contains sensitive information such as passwords, secret keys, or confidential data.

Millions of users host and share code on GitHub. That's fantastic, but developers and code owners sometimes accidentally push confidential information to a public repository, which can be a disaster.

There have been many incidents where confidential data was leaked on GitHub. You can't eliminate human error, but you can take action to reduce it.

How do you ensure your repository doesn’t contain a password or key?

The simple answer: don't store them there.

In reality, though, you can't control other people's behavior when working in a team.

Fortunately, the following tools can help you find such mistakes in your repository.

Gittyleaks

A free, Python-based utility that looks for words such as user, password, and email in strings, config files, or JSON.

Gittyleaks can be installed using pip and has an option to find suspicious data.
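
To make the keyword idea concrete, here is a minimal sketch of that kind of matching. It is not Gittyleaks' actual implementation; the keyword list, the regular expression, and the function name find_suspicious are assumptions made for illustration.

```python
import re

# Hypothetical keyword list for illustration; the real tool's patterns may differ.
KEYWORDS = ("user", "password", "passwd", "email", "secret", "token")

# Matches `key = value` or `"key": "value"` where the key contains a keyword.
ASSIGNMENT = re.compile(
    r"""["']?([\w.-]*(?:{kw})[\w.-]*)["']?\s*[:=]\s*["']?([^\s"',;]+)""".format(
        kw="|".join(KEYWORDS)
    ),
    re.IGNORECASE,
)


def find_suspicious(text):
    """Return (line number, key, value) for each keyword-like assignment."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in ASSIGNMENT.finditer(line):
            findings.append((lineno, match.group(1), match.group(2)))
    return findings


if __name__ == "__main__":
    sample = 'db_password = "hunter2"\nname = "demo"\n{"email": "a@b.co"}'
    for finding in find_suspicious(sample):
        print(finding)
```

A matcher this simple will produce false positives (for example, a variable called user_count), so treat its output as a list of places to review rather than confirmed leaks.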

Secrets Scanning

GitHub has a secret scanning feature that checks repositories for accidentally committed secrets. Identifying and fixing such leaks helps prevent attackers from finding the secrets and fraudulently using them to access services with the compromised account's privileges.

Key highlights include:

  • It scans for and detects accidentally committed secrets, helping you prevent data leaks and compromises.
  • It scans both public and private repositories and alerts the service providers that issued the detected secrets so they can mitigate the exposure.
  • For private repositories, GitHub alerts the organization owners or administrators and displays a warning in the repository.

Git Secrets

Released by AWS Labs, Git Secrets does what the name suggests: it scans for secrets. By adding a pattern, you can use it to prevent AWS keys from being committed.

It lets you scan a file, or a folder recursively. If you suspect your project repository may contain an AWS key, this is an excellent place to start.
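
As an illustration of pattern-based recursive scanning, the sketch below walks a folder and reports strings shaped like AWS access key IDs. It is not Git Secrets itself: the regular expression is a simplified assumption (the prefix AKIA or ASIA followed by 16 uppercase letters or digits), and scan_tree is a hypothetical helper name.

```python
import re
from pathlib import Path

# Simplified, assumed shape of an AWS access key ID; real tools use broader patterns.
AWS_KEY_ID = re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")


def scan_tree(root):
    """Recursively scan files under root; return (path, line number, match) tuples."""
    root = Path(root)
    hits = []
    for path in sorted(root.rglob("*")):
        # Skip directories and anything inside the .git metadata folder.
        if not path.is_file() or ".git" in path.relative_to(root).parts:
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file: skip it rather than abort the scan
        for lineno, line in enumerate(text.splitlines(), start=1):
            for match in AWS_KEY_ID.finditer(line):
                hits.append((str(path), lineno, match.group(0)))
    return hits


if __name__ == "__main__":
    for path, lineno, key in scan_tree("."):
        print(f"{path}:{lineno}: possible AWS access key ID {key}")
```

This only looks at the files currently on disk. A key that was committed and later deleted would still sit in the commit history, which is why several of the tools below scan history as well.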

Repo Supervisor

Repo Supervisor by Auth0 lets you find misconfigurations, passwords, and similar issues.

It's a serverless tool that can be installed in a Docker container or on any server using npm.

Truffle Hog

One of the most popular utilities for finding secrets everywhere, including branches and commit history.

Truffle Hog searches using regex and entropy, and prints the results on the screen.

You can install it using pip:

pip install truffleHog
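
The entropy part of that search is worth a short illustration: random-looking strings such as API keys use many different characters, so their Shannon entropy is higher than that of ordinary words. The sketch below is not Truffle Hog's code; the minimum length of 20, the threshold of 4.5 bits per character, and the function names are assumptions chosen for the example.

```python
import math
import re
from collections import Counter

# Assumed values for illustration; tune them for your own codebase.
MIN_LENGTH = 20
ENTROPY_THRESHOLD = 4.5

# Candidate tokens: long runs of base64-style characters.
CANDIDATE = re.compile(r"[A-Za-z0-9+/=]{%d,}" % MIN_LENGTH)


def shannon_entropy(s):
    """Shannon entropy of s in bits per character."""
    if not s:
        return 0.0
    total = len(s)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(s).values()
    )


def find_high_entropy_strings(line):
    """Return long tokens in line whose entropy exceeds the threshold."""
    return [
        token
        for token in CANDIDATE.findall(line)
        if shannon_entropy(token) > ENTROPY_THRESHOLD
    ]


if __name__ == "__main__":
    # AWS's documented example secret key, not a real credential.
    print(find_high_entropy_strings('secret = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"'))
    print(find_high_entropy_strings("a" * 30))
```

Entropy catches secrets that no fixed regex anticipates, at the cost of flagging things like hashes and minified assets, which is one reason to combine it with regex rules.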

Git Hound

Git Hound is a Git plugin written in Go that helps prevent sensitive data from being committed to a repository by checking changes against PCRE (Perl Compatible Regular Expressions) patterns.

It's available as a binary for Windows, Linux, Darwin, and other platforms, which is useful if you don't have Go installed.
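
The idea of checking changes before they are committed can be sketched as a function that inspects only the lines a diff adds. This is not Git Hound's implementation: the two rules and the name check_diff are hypothetical, and Python's re module stands in for PCRE here.

```python
import re

# Hypothetical rules for illustration; a real setup would load these from a config file.
RULES = {
    "private key": re.compile(r"-----BEGIN (?:RSA |EC |DSA |OPENSSH )?PRIVATE KEY-----"),
    "password assignment": re.compile(r"(?i)password\s*[:=]\s*\S+"),
}


def check_diff(diff_text):
    """Return (rule name, added line) for every added line that matches a rule."""
    findings = []
    for line in diff_text.splitlines():
        # Only added lines matter; "+++" marks the file header, not content.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        added = line[1:]
        for name, pattern in RULES.items():
            if pattern.search(added):
                findings.append((name, added.strip()))
    return findings


if __name__ == "__main__":
    import sys

    # Reads a unified diff from standard input and exits non-zero on any finding.
    problems = check_diff(sys.stdin.read())
    for name, added in problems:
        print(f"blocked ({name}): {added}")
    sys.exit(1 if problems else 0)
```

One way to use a script like this is from a pre-commit hook that pipes the output of git diff --cached into it, since Git aborts the commit when the hook exits with a non-zero status.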

Gitrob

Gitrob makes it easy to analyze the findings in a web interface. It's written in Go, so Go is a prerequisite.

Watchtower

An AI-powered scanner that detects API keys, secrets, and other sensitive information. The Watchtower Radar API lets you integrate with public or private GitHub repositories, AWS, GitLab, Twilio, and more. Scan results are available through a web interface or as CLI output.

Repo Security Scanner

Repo Security Scanner is a command-line tool that helps you discover passwords, tokens, private keys, and other secrets that were accidentally committed to a Git repository.

This easy-to-use tool investigates the entire repository history and provides scan results within a short time. Scanning enables you to identify and address the potential security vulnerabilities that exposed secrets introduce in open-source software.

GitGuardian

GitGuardian is a tool that enables developers, security teams, and compliance teams to monitor GitHub activity in real time and identify vulnerabilities caused by exposed secrets such as API tokens, security certificates, and database credentials.

It allows teams to enforce security policies in private and public code as well as in other data sources.

GitGuardian's major features are:

  • Finds sensitive information such as secrets in private source code
  • Identifies and helps fix sensitive data leaks on public GitHub
  • Effective, transparent, and easy-to-set-up secrets detection
  • Wide coverage and a comprehensive database that covers almost any sensitive information at risk
  • Sophisticated pattern-matching techniques that improve the discovery process and its effectiveness

Conclusion

I hope this gives you an idea of how to find sensitive data in a GitHub repository. If you are looking for secret management, check out this article for possible solutions.