# ps-conda-environment
Setting up Anaconda environment for Plant Science data pipelines
- Manual by Sungchan Oh (Sun)
- First Created: November 13, 2024
- Last Updated: November 13, 2024
- Contact: oh231@purdue.edu
- Prerequisites
- Access to Purdue Compute Resources (e.g., [Negishi](https://www.rcac.purdue.edu/compute/negishi))
- Terminal software ([MobaXterm (Windows)](https://mobaxterm.mobatek.net/), [iTerm2 (macOS)](https://iterm2.com/), [Terminator (Linux)](https://gnome-terminator.org/))
___
## Three ways to get your environment ready for plant science data pipelines
1. [Manually create Anaconda environment](#1-manually-create-anaconda-environment)
- Construct an Anaconda environment step-by-step, customizing it to specific requirements.
- This method offers granular control but requires careful package management.
- Note: This approach has been successfully tested on Negishi (Linux-64) systems.
2. [Create Anaconda environment using `environment.yml`](#2-create-anaconda-environment-using-environmentyml)
- Use an `environment.yml` file to define the environment's dependencies.
- This approach is more automated and reproducible, especially for those using Negishi.
- Note: This method has also been tested on Negishi (Linux-64) systems.
3. [Use prebuilt Apptainer/Docker container](#3-use-prebuilt-apptainerdocker-container)
- Use a pre-configured Apptainer/Docker container that encapsulates the entire pipeline and its environment.
- This method is convenient for immediate use and portability.
- It eliminates the need for manual environment setup and ensures consistency across different systems.
## Important Considerations
- Environment Compatibility: While these methods have been tested on Negishi (Linux-64), compatibility with other systems may vary. Users should be aware that successful installation and execution may depend on specific system configurations and package versions.
- Anaconda and Python Version Compatibility: If manual environment setup is preferred, it's crucial to identify compatible versions of Anaconda and Python that support the required Python packages.
- Apptainer/Docker Container Adoption: As the pre-built Apptainer container is under development, it's recommended to stay updated on its availability and usage instructions. This approach promises a simplified and standardized environment for running plant science data pipelines.
By carefully considering these factors, users can effectively establish the necessary environment for their plant science data analysis tasks.
## 1. Manually create Anaconda environment
- Launch a terminal emulator and connect to the Negishi login node.
```bash
ssh [your_id]@negishi.rcac.purdue.edu
```
- Replace `[your_id]` with your actual Purdue ID (e.g., `oh231`), without the square brackets.
- Treat any other placeholder such as `["something"]` the same way: replace `something` with the appropriate value and remove the square brackets.
- Load the Anaconda module and create a new Anaconda environment.
```bash
module load anaconda/2024.02-py311.lua
conda create --name ps37 python=3.7.9 -y
```
- Note: Python 3.7 is highly compatible with the Python dependencies used in Plant Science pipelines.
- Activate the new Anaconda environment.
```bash
conda activate ps37
```
- Install Python dependencies.
```bash
conda install -c conda-forge ipython numpy gdal pandas matplotlib -y
conda install -c conda-forge laspy lastools rasterio fiona python-pdal pdal -y
conda install -c conda-forge asteval pyproj scikit-image spectral opencv -y
```
- Installation may take several minutes.
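Once the installation finishes, a quick check from inside the activated `ps37` environment can confirm that the packages are visible to Python. The following is a minimal sketch, not part of the original manual: the mapping from conda package names to importable module names (for example `gdal` to `osgeo`, `opencv` to `cv2`) is an assumption based on how these packages are commonly imported, and `lastools` and `pdal` are left out on the assumption that they provide command-line tools and a C++ library rather than Python modules of their own.

```python
from importlib.util import find_spec

# Assumed mapping: conda package name -> top-level Python module it provides.
# "lastools" and "pdal" are omitted (assumed to be non-Python packages);
# the Python bindings for PDAL come from "python-pdal".
CONDA_TO_MODULE = {
    "ipython": "IPython",
    "numpy": "numpy",
    "gdal": "osgeo",
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "laspy": "laspy",
    "rasterio": "rasterio",
    "fiona": "fiona",
    "python-pdal": "pdal",
    "asteval": "asteval",
    "pyproj": "pyproj",
    "scikit-image": "skimage",
    "spectral": "spectral",
    "opencv": "cv2",
}


def missing_modules(modules):
    """Return the module names that cannot be located, without importing them."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(CONDA_TO_MODULE.values())
    if missing:
        print("Not found in this environment:", ", ".join(missing))
    else:
        print("All expected modules were found.")
```

Because `find_spec` only locates a module and does not execute it, this check reports whether a package is installed, not whether it imports cleanly.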
## 2. Create Anaconda environment using `environment.yml`
- Launch a terminal emulator and connect to the Negishi login node as instructed in [1. Manually create Anaconda environment](https://github.itap.purdue.edu/PlantScience/ps-conda-environment?tab=readme-ov-file#1-manually-create-anaconda-environment).
- Load the Anaconda module.
```bash
module load anaconda/2024.02-py311.lua
```
- To proceed, either navigate to the directory containing the `environment.yml` file or download the file to your current working directory.
- Create a new Anaconda environment and install required dependencies using the following command:
```bash
conda env create -f environment.yml
```
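The contents of the repository's `environment.yml` are not reproduced in this manual. For orientation only, the sketch below prints what a file roughly equivalent to the manual steps in Section 1 might look like. The helper `render_environment_yml`, the channel list, and the absence of pinned package versions are all assumptions made for illustration; the real file in the repository should be used for actual installs.

```python
# Packages taken from the manual installation commands in Section 1.
PACKAGES = [
    "ipython", "numpy", "gdal", "pandas", "matplotlib",
    "laspy", "lastools", "rasterio", "fiona", "python-pdal", "pdal",
    "asteval", "pyproj", "scikit-image", "spectral", "opencv",
]


def render_environment_yml(name, python_version, packages):
    """Build the text of a simple conda environment file (hypothetical helper)."""
    lines = [
        f"name: {name}",
        "channels:",
        "  - conda-forge",  # assumed channel order, mirroring `-c conda-forge`
        "  - defaults",
        "dependencies:",
        f"  - python={python_version}",
    ]
    lines += [f"  - {pkg}" for pkg in packages]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Printed rather than written to disk, so an existing environment.yml
    # is never overwritten by this illustration.
    print(render_environment_yml("ps37", "3.7.9", PACKAGES), end="")
```

A file of this shape is what `conda env create -f environment.yml` reads: the `name` field becomes the environment name, and each entry under `dependencies` is resolved from the listed channels.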
## 3. Use prebuilt Apptainer/Docker container
- TODO
## Verify `laspy` dependency and potential workaround
- Run Python and import `laspy`.
```python
import laspy
```
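When the import fails, the exact error message helps decide whether the workaround in this section applies. The snippet below is a small sketch (the helper name `try_import` is made up for illustration) that attempts the import and reports either the package version or the exception that was raised.

```python
import importlib


def try_import(name):
    """Try to import a module; return (ok, detail) instead of raising."""
    try:
        module = importlib.import_module(name)
    except Exception as exc:  # report any import-time failure, not only ImportError
        return False, f"{type(exc).__name__}: {exc}"
    return True, str(getattr(module, "__version__", "unknown version"))


if __name__ == "__main__":
    ok, detail = try_import("laspy")
    print("laspy import succeeded:" if ok else "laspy import failed:", detail)
```

If the reported error mentions `SimpleQueue` or the `queue` module, the steps described in this section are likely relevant; a `ModuleNotFoundError` for `laspy` itself instead suggests the package is not installed in the active environment.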
- If the `laspy` import fails, the following steps describe a possible workaround for an incompatibility between `laspy` and the standard-library `queue` module.
- Locate `copc.py` (environment-specific): The exact path to `copc.py` depends on your Python installation and environment. With the environment activated, use `conda env list` or `which python` to find the environment's location; the site-packages directory sits under the same environment prefix.
- A typical location for the site-packages directory is:
```bash
~/.conda/envs/<module_name>?/<env_name>/lib/python<version>/site-packages
```
- Once in the site-packages directory, search for the `laspy` package and locate the `copc.py` file within its subdirectories. Example:
```bash
~/.conda/envs/2024.02-py311/ps37/lib/python3.7/site-packages/laspy/copc.py
```
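As an alternative to browsing directories by hand, the file can be located from Python itself. This is a minimal sketch with a hypothetical helper, `find_package_file`; it relies on `importlib.util.find_spec`, which locates a top-level package without executing it, so it should work even when `import laspy` fails.

```python
from importlib.util import find_spec
from pathlib import Path


def find_package_file(package, filename):
    """Return the first path to `filename` inside an installed package, or None."""
    spec = find_spec(package)  # locates the package without importing it
    if spec is None or spec.submodule_search_locations is None:
        return None
    for location in spec.submodule_search_locations:
        # Search the package directory and its subdirectories.
        matches = sorted(Path(location).rglob(filename))
        if matches:
            return matches[0]
    return None


if __name__ == "__main__":
    path = find_package_file("laspy", "copc.py")
    print(path if path is not None else "copc.py not found (is laspy installed?)")
```

Run it with the `ps37` environment activated so that the lookup uses that environment's site-packages directory.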
- Edit `copc.py` (with caution :exclamation:)
- Warning: Modifying files inside an installed package can have unintended consequences. Proceed with caution and back up the file before making changes.
- Once you have found `copc.py`, open it in a text editor (Notepad, vim, etc.) and locate the following line:
```python
from queue import Queue, SimpleQueue
```
- Comment out this line with `#`. Here is the modified code:
```python
#from queue import Queue, SimpleQueue
```
- Directly before or after the commented-out line, add the following code:
```python
from queue import Queue
```
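The same edit can be scripted, which also makes the backup automatic. The sketch below is one possible way to do it and is not part of the original procedure: `patch_queue_import` and `patch_file` are hypothetical helper names, and the code only rewrites the import line shown above. It does not check whether `SimpleQueue` is used elsewhere in the file.

```python
import shutil
from pathlib import Path

OLD_IMPORT = "from queue import Queue, SimpleQueue"
NEW_IMPORT = "from queue import Queue"


def patch_queue_import(source):
    """Comment out the original import line and add the reduced import below it."""
    patched = []
    for line in source.splitlines(keepends=True):
        if line.strip() == OLD_IMPORT:
            indent = line[: len(line) - len(line.lstrip())]
            ending = "\n" if line.endswith("\n") else ""
            patched.append(f"{indent}#{OLD_IMPORT}\n")
            patched.append(f"{indent}{NEW_IMPORT}{ending}")
        else:
            patched.append(line)
    return "".join(patched)


def patch_file(path):
    """Back up `path` to `<name>.bak` (once), then apply the patch in place."""
    path = Path(path)
    backup = path.with_name(path.name + ".bak")
    if not backup.exists():  # never overwrite an earlier backup of the original
        shutil.copy2(path, backup)
    text = path.read_text(encoding="utf-8")
    path.write_text(patch_queue_import(text), encoding="utf-8")
    return backup
```

Calling `patch_file` with the path to `copc.py` found earlier applies the change and returns the location of the backup. Running it a second time leaves the file unchanged, because the already commented-out line no longer matches.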
- Run Python and import `laspy` again to confirm that the import now succeeds.
```python
import laspy
```