Find out whether your GitHub repository contains sensitive information such as passwords, secret keys, or other confidential data.

Millions of users host and share code on GitHub. It’s fantastic, but sometimes you, another developer, or the code owner can accidentally push confidential information to a public repository, which can be a disaster.

There have been many incidents in which confidential data was leaked on GitHub. You can’t eliminate human error, but you can take action to reduce it.

How do you ensure your repository doesn’t contain a password or key?

Simple answer – don’t store them there.

But in reality, you can’t control other people’s behavior when working in a team.

Fortunately, the following tools can help you find such mistakes in your repository.

Gittyleaks

A free Python-based utility that finds words such as user, password, and email in strings, config files, or JSON.

Gittyleaks can be installed using pip and has an option to find suspicious data.
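
As a rough illustration of the keyword matching described above, the following Python sketch flags lines that assign a value to a name containing words such as user, password, or email. The keyword list, the regular expression, and the function name find_suspicious are assumptions made for this example; they are not taken from Gittyleaks itself.

```python
import re

# Hypothetical keyword list for illustration; not Gittyleaks' actual rules.
KEYWORDS = ("user", "username", "password", "passwd", "email", "secret")

# Match `keyword = value` and `"keyword": "value"` style assignments.
ASSIGNMENT = re.compile(
    r"""["']?(?P<key>\w*(?:%s)\w*)["']?\s*[:=]\s*["']?(?P<value>[^"'\s,]+)"""
    % "|".join(KEYWORDS),
    re.IGNORECASE,
)


def find_suspicious(text):
    """Return (line_number, key, value) for each keyword-like assignment."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in ASSIGNMENT.finditer(line):
            hits.append((number, match.group("key"), match.group("value")))
    return hits
```

A matcher this simple will produce false positives (for example, placeholder values), so its output is best treated as a list of lines to review by hand.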

Git Secrets

Released by AWS Labs, it scans for secrets, as you can guess from the name. Git Secrets helps prevent AWS keys from being committed by checking commits against patterns you add.

It lets you scan a single file, or a folder recursively. If you suspect your project repository may contain an AWS key, then this is an excellent place to start.
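
To make the idea of pattern-based scanning concrete, here is a minimal Python sketch that walks a folder recursively and reports strings that look like AWS access key IDs. It is not how Git Secrets is implemented. The pattern is a simplified assumption ("AKIA" followed by 16 uppercase letters or digits), and the names scan_file and scan_tree are hypothetical.

```python
import os
import re

# Simplified, assumed pattern: "AKIA" followed by 16 uppercase letters or digits.
AWS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")


def scan_file(path):
    """Return (line_number, match) pairs found in one text file."""
    hits = []
    with open(path, encoding="utf-8", errors="ignore") as handle:
        for number, line in enumerate(handle, start=1):
            hits.extend((number, m.group(0)) for m in AWS_KEY_ID.finditer(line))
    return hits


def scan_tree(root):
    """Walk root recursively, skipping .git, and map file paths to their hits."""
    results = {}
    for folder, subdirs, files in os.walk(root):
        # Prune .git in place so os.walk does not descend into it.
        subdirs[:] = [d for d in subdirs if d != ".git"]
        for name in files:
            path = os.path.join(folder, name)
            hits = scan_file(path)
            if hits:
                results[path] = hits
    return results
```

Calling scan_tree(".") from the top of a working copy would return a dictionary of file paths and the matching lines. Note that it only inspects the current files, not the commit history.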

Repo Supervisor

Repo Supervisor by Auth0 lets you find misconfigurations, passwords, and similar issues.

It’s a serverless tool that can be installed inside a Docker container or on any server using NPM.

Truffle Hog

One of the popular utilities for finding secrets everywhere, including branches and commit history.

Truffle Hog searches using regex and entropy, and the results are printed on the screen.

You can install it using pip:

pip install truffleHog
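
The entropy check mentioned above can be sketched in a few lines of Python. Randomly generated keys tend to have a higher Shannon entropy per character than ordinary words, so long tokens above some threshold are worth a second look. The token pattern, the threshold of 4.5 bits per character, and the function names below are assumptions chosen for illustration, not values read from Truffle Hog’s source.

```python
import math
import re
from collections import Counter


def shannon_entropy(text):
    """Shannon entropy of text, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Assumed values for illustration: candidate tokens are runs of 20 or more
# base64-like characters, flagged when above 4.5 bits per character.
TOKEN = re.compile(r"[A-Za-z0-9+/=_-]{20,}")
THRESHOLD = 4.5


def high_entropy_tokens(line, threshold=THRESHOLD):
    """Return the tokens in line whose entropy exceeds threshold."""
    return [t for t in TOKEN.findall(line) if shannon_entropy(t) > threshold]
```

A string of one repeated character scores 0 bits, while a 32-character token with no repeated characters scores 5 bits, so only the latter would be reported with this threshold.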

Git Hound

Git Hound is a Git plugin written in Go that helps prevent sensitive data from being committed to a repository by checking changes against PCRE (Perl Compatible Regular Expressions) patterns.

It’s available as a binary for Windows, Linux, Darwin, etc., which is useful if you don’t have Go installed.
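
The general idea of checking a change before it is committed can be sketched as follows: read a unified diff, look only at added lines, and test each one against a set of regular expressions. This is an illustrative Python sketch, not Git Hound’s implementation. The two rules and the function name check_diff are hypothetical, and Python’s re module is similar to PCRE but not identical.

```python
import re

# Hypothetical rules for illustration; Python's re module is not PCRE.
RULES = {
    "private key header": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "hardcoded password": re.compile(r"(?i)password\s*[:=]\s*\S+"),
}


def check_diff(diff_text):
    """Return (file, rule_name, line) for each added line that matches a rule."""
    findings = []
    current_file = None
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            # Header naming the new version of the file, e.g. "+++ b/app.py".
            path = line[4:]
            current_file = path[2:] if path.startswith("b/") else path
        elif line.startswith("+"):
            added = line[1:]
            for name, pattern in RULES.items():
                if pattern.search(added):
                    findings.append((current_file, name, added))
    return findings
```

In a pre-commit hook, the text passed to check_diff could be the output of git diff --cached, and a non-empty result would be a reason to abort the commit.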

Along with the above tools, you may also try Surch and Gitrob.

If your project requires credentials to be kept in the repository, for example because it is open source, then consider using a password vault to manage the secrets.

There are some options available.

HashiCorp Vault – secret storage for storing, leasing, auditing, and revoking tokens, passwords, certificates, API keys, AWS credentials, and more.

It’s open source, so it’s free, and it can be installed from source or from a binary.

BlackBox – works with Git, Subversion, Mercurial, and Perforce. Data is encrypted using GPG (GNU Privacy Guard). It supports the following platforms.

  • Linux
  • macOS
  • MinGW
  • Cygwin

It’s free, so give it a try and see how it goes.

Gitrob

Gitrob makes it easy to analyze the findings in a web interface. It’s written in Go, so Go is a prerequisite.

Watchtower

An AI-powered scanner that detects API keys, secrets, and other sensitive information. The Watchtower Radar API lets you integrate with public or private GitHub repositories, AWS, GitLab, Twilio, etc. The scan results are available in a web interface or as CLI output.

Conclusion

I hope this gives you an idea of how to find sensitive data in a GitHub repository, and of the tools available to encrypt secrets if you need to store them in Git. If you are new to GitHub or interested in learning it, then you may refer to this ultimate course.