Anaconda is a Python distribution used for machine learning and data science, and it ships with an integrated development environment. However, its offerings are not limited to Python.
It supports open-source libraries such as TensorFlow, PyTorch, SciPy, scikit-learn, etc., which are used for data science and machine learning.
Let’s walk through some open-source tools supported by Anaconda and used for scientific computing:
OpenCV – It is a computer vision and machine learning library for C++, Java, and Python with support for all major operating systems.
Bokeh – It is a data visualization library for web browsers providing tools and widgets to better visualize the specifics of your data.
Spyder – An IDE that comes bundled with Anaconda providing a complete development ecosystem for data scientists and machine learning folks.
Conda – Anaconda also provides a package manager named conda, which is used to install and manage packages for various programming languages such as Python, R, and Julia. Python, when installed independently, ships with its own package manager, pip, which is an alternative to conda. The pip package manager downloads packages from the Python Package Index (PyPI); it’s like npm but for Python.
Use cases for Anaconda
What makes Anaconda rich is its support for a variety of packages that can be used for the following domains:
With support for libraries such as OpenCV and scikit-image, Anaconda is well suited to image processing and computer vision projects. These open-source libraries handle image manipulation, analysis, cleaning, restoration, and much more.
Anaconda’s robust ecosystem of libraries and tools can be used for data manipulation, preprocessing, and providing useful insights into data.
Libraries such as Pandas and NumPy make it possible for data scientists to analyze, clean, and manipulate data in a structured and controlled manner.
An Anaconda project named HoloViz is a Python-based data visualization toolset that includes Panel, hvPlot, Datashader, and several other Python packages that make data visualization more powerful.
Data visualization is really helpful for visually communicating ideas and concepts through data. Effective visualizations help in improved decision-making by communicating patterns in the data.
TensorFlow, PyTorch, and scikit-learn are libraries offered by Anaconda for machine learning projects.
Natural Language Processing
For NLP academics and developers, Anaconda offers a suitable environment for experimenting with various algorithms and strategies. NLP libraries supported by Anaconda include NLTK, gensim, and spaCy.
So, to summarize, Anaconda is a bundle or a distribution containing tools and libraries which are useful in data science and machine learning.
With that being said, let’s look at Anaconda’s installation process.
Minimum 5 GB of disk space
Anaconda can be installed by downloading an installer which is technically a bash script, verifying the hash, and running it.
#1. Downloading the script
You can download the installer from Anaconda’s official website and execute it. However, if you want to download an older release, you can do that by using ‘curl’. You can find bash scripts for all Anaconda releases here.
#2. Verifying the hash
Once that’s done, you must verify the hash of the file against the hash listed here. Verifying the hash confirms that the file hasn’t been tampered with and helps prevent a malicious script from running on your system.
To do so, you need the filename of the bash script. You can get the filename of the script using the ls command.
Get the hash by running the sha256sum command with the filename of the script as its argument.
Verify the hash you received with the hash listed on Anaconda’s website for your particular installation type. If they match, you are good to go!
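The comparison described above can also be scripted instead of checked by eye. The following is a minimal sketch, assuming the published hash is a SHA-256 digest and that sha256sum (part of GNU coreutils on Ubuntu) is available; verify_sha256 is a hypothetical helper name, not something Anaconda provides.

```shell
# Hypothetical helper: succeeds only if the file's SHA-256 digest
# equals the expected digest.
verify_sha256() {
    # $1: path to the downloaded installer
    # $2: expected SHA-256 hex digest (lowercase), copied from Anaconda's website
    [ "$(sha256sum "$1" | awk '{print $1}')" = "$2" ]
}

# Usage (replace both placeholders with your own values):
#   verify_sha256 "<installer filename>" "<hash from website>" \
#       && echo "hash matches" || echo "hash MISMATCH, do not run the script"
```

Because the helper returns a non-zero status on a mismatch, it can gate the installation step in a larger script.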
#3. Executing the bash script
Next, run the bash script by passing its filename to the bash command.
Then, you will be prompted to accept the license agreement. Enter “yes” to proceed. After that, it will ask you to confirm the installation location.
The installation will now begin. Once it succeeds, you will be asked whether to initialize Anaconda by running conda init. Type “yes” if you want to do so.
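As a rough sketch of this step, the installer can be launched through a small wrapper that first checks the file exists. The name run_installer is hypothetical; the only real command involved is bash followed by the script's filename.

```shell
# Hypothetical wrapper around "bash <installer filename>".
run_installer() {
    # $1: path to the downloaded installer script
    if [ ! -f "$1" ]; then
        echo "installer not found: $1" >&2
        return 1
    fi
    # Starts the installer; the license and location prompts appear here.
    bash "$1"
}

# Usage (replace the placeholder with the real filename):
#   run_installer "<installer filename>"
```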
#4. Activating Anaconda
If you want to activate Anaconda at a later time, you can use the following command:
source <conda installation path>/bin/activate
Then run conda init and restart your terminal.
#5. Adding the Anaconda installation to PATH
If you chose not to initialize conda at the time of installation, add the path to your Anaconda installation manually. You can do so by adding a line to your ~/.bashrc file that prepends <anaconda installation path>/bin to PATH. Just replace <anaconda installation path> with the actual installation path.
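A line of this kind might look like the sketch below. It assumes Anaconda was installed to $HOME/anaconda3; ANACONDA_HOME is a variable name introduced only for illustration, so adjust both to match your system.

```shell
# Assumption: Anaconda lives in $HOME/anaconda3; replace with your actual path.
ANACONDA_HOME="$HOME/anaconda3"

# Prepend Anaconda's bin directory so its conda and python are found first.
export PATH="$ANACONDA_HOME/bin:$PATH"
```

Prepending rather than appending means Anaconda's executables take precedence over any system-wide ones with the same name.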
That’s it; you have successfully installed Anaconda on Ubuntu! You can verify the installation using the following steps.
#6. Verifying installation
Restart your terminal and type conda list. This command lists all the packages currently installed in the active environment.
Alternatively, you can check the version of Python installed by Anaconda by running python --version.
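The checks above can be bundled into one small function. This is only a sketch; check_conda_install is a hypothetical name, and it reports whichever conda and python come first on your PATH.

```shell
# Hypothetical helper that prints the conda and Python versions.
check_conda_install() {
    # Fail early with a readable message if conda is not on PATH.
    if ! command -v conda >/dev/null 2>&1; then
        echo "conda not found on PATH" >&2
        return 1
    fi
    conda --version     # version of conda itself
    python --version    # Python interpreter that comes first on PATH
}

# Usage:
#   check_conda_install
```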
Setting Up Environments
Environments in Anaconda are a great way to isolate different installations of Python and other packages specifically required for a particular project. Each environment is like an isolated box that has its own version of Python and a set of relevant packages.
#1. Creating Environments
When you activate Anaconda for the first time, you are in the base environment, which is indicated by the (base) prefix right before your terminal prompt.
To create a new environment, use the following command and replace <<env_name>> with the name you want to give the environment:
conda create --name <<env_name>>
During creation, conda lists what it plans to install and asks you to confirm before proceeding.
To use a specific environment, run conda activate <<env_name>>, with <<env_name>> being the name of the environment.
You should then see the name of the environment right before the terminal prompt.
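In an interactive terminal where conda init has already been run, conda activate works directly. Inside a bash script it usually does not, because the shell integration has not been loaded. The sketch below assumes bash and loads conda's shell hook first; activate_env is a hypothetical helper name.

```shell
# Hypothetical helper for activating an environment from a bash script.
activate_env() {
    # $1: name of an existing environment
    # Load conda's shell functions into the current shell session.
    eval "$(conda shell.bash hook)"
    # Switch to the requested environment.
    conda activate "$1"
}

# Usage (the environment must already exist):
#   activate_env "<<env_name>>"
#   echo "$CONDA_DEFAULT_ENV"   # name of the active environment
```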
#2. Creating environments with packages
At the time of environment creation, you can also specify the Python version which will be used inside that environment.
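Pinning the Python version at creation time can be sketched as follows. The helper name create_env is hypothetical, and the version in the usage line is an illustrative value only; the --yes flag skips the confirmation prompt.

```shell
# Hypothetical helper: create an environment with a specific Python version.
create_env() {
    # $1: environment name
    # $2: Python version to install inside the environment (e.g. 3.11)
    conda create --yes --name "$1" "python=$2"
}

# Usage (illustrative values):
#   create_env "<<env_name>>" 3.11
```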
Linux is a multiuser operating system, so multiple users can work on the same computer at the same time. For that reason, Linux must guarantee the security and privacy of the files belonging to different users.