Installing PyMINEr

Here I'll walk you through the installation process for PyMINEr using a virtual machine running Ubuntu on Windows. The virtual machine software used here is VirtualBox. Note that VirtualBox can run on macOS or Linux as well, so using it will let you run PyMINEr no matter what kind of computer you have.

 

Get the official release of the PyMINEr code here, or get the 'bleeding edge' version.

 

Just type:

    sudo bash PyMINEr_install.sh

    python3 -m pip install -r requirements.txt

#### That's it! ####

 

If you're on a cluster or want to do a local installation, you'll need to install the dependencies, add PyMINEr to your PATH variable, and unzip the library files with:

    python3 -m pip install -r requirements.txt

    PATH=~/bin/pyminer/bin:$PATH

    gunzip ~/bin/pyminer/lib/*.gz

Local installs will only work with the current, bleeding edge version, because you'll need to use the -string_db_dir argument and point it towards the ~/bin/pyminer/lib/ directory, or wherever you put it. It would also be convenient to add the PATH line to your ~/.bashrc file so that you don't need to set the PATH variable every time you run PyMINEr.
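
As a minimal sketch of that last step, the snippet below appends the PATH line to your ~/.bashrc only if it isn't already there. It assumes PyMINEr lives in ~/bin/pyminer, as in the commands above, and that your shell is bash; adjust the path if you put it somewhere else. The BASHRC variable is just a convenience added here so you can try it on a different file first.

```shell
# Assumption: PyMINEr was unpacked to ~/bin/pyminer, as in the commands above.
# BASHRC is a hypothetical override so the snippet can be tried on another file.
BASHRC="${BASHRC:-$HOME/.bashrc}"
LINE='export PATH="$HOME/bin/pyminer/bin:$PATH"'

# Append the line only if an identical line is not already present,
# so running this twice doesn't duplicate the entry.
grep -qxF "$LINE" "$BASHRC" 2>/dev/null || echo "$LINE" >> "$BASHRC"
```

Open a new terminal (or run `source ~/.bashrc`) for the change to take effect.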

Chromium output files and automated analyses have some informatics no-nos in them. Before you can use the Chromium results for PyMINEr analysis, they need some cleaning. I've written up some code and a tutorial to walk you through how to do this.

 
 

A basic example of using PyMINEr

Here I'll walk you through a basic example of how to use PyMINEr with a scRNAseq dataset using the default parameters. Think of this as the "Hello World" for using PyMINEr. The input file is here, and the results that you get should look like this. Note that typically, the expression input you give PyMINEr should be the log-transformed and normalized expression matrix.
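
If your matrix is already normalized but not yet log-transformed, something like the awk command below would apply a log2(x + 1) transform. This is only a sketch of one common transform, not necessarily the one used to prepare the example input. It assumes a tab-delimited file with a header row and gene IDs in the first column, and the file names and values are made up for the demo.

```shell
# Build a tiny demo matrix (hypothetical file names and values).
printf 'gene\tcell_1\tcell_2\n' >  demo_normalized.txt
printf 'GeneA\t1\t3\n'          >> demo_normalized.txt
printf 'GeneB\t0\t7\n'          >> demo_normalized.txt

# Keep the header line and the ID column as-is; replace every other
# value x with log2(x + 1). awk's log() is the natural log, hence the division.
awk 'BEGIN { FS = OFS = "\t" }
     NR == 1 { print; next }
     { for (i = 2; i <= NF; i++) $i = log($i + 1) / log(2); print }' \
    demo_normalized.txt > demo_log2.txt

cat demo_log2.txt
```

Normalization (for example, scaling for sequencing depth) is a separate step that should happen before the log transform; it is not covered by this sketch.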

 

The gProfiler accepted species codes are listed here: https://biit.cs.ut.ee/gprofiler/help.cgi#help_id_2

 

Interpreting the Results

Here I'll give you a walk-through on how to interpret the PyMINEr results.

 

PyMINEr for Large Datasets and Genes of Interest

Sometimes you've got more data than your computer can actually handle all at once. To address this issue, we have a script that will convert your file to a PyMINEr-compatible HDF5 file. I'll show you how to do it here. As a basic example, we'll start out with the same input file as before. I'll also show you how to use the genes of interest option, using this file.

 

To convert a tab-delimited text file into a PyMINEr-compatible HDF5 file, type:

    tab_to_h5.py expression.txt

This makes the HDF5 file as well as the corresponding row and column annotation text files, which need to be passed to PyMINEr with the arguments: -ids <ID_list.txt> -cols <column_IDs.txt>
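
If you want to double-check those annotation files, you can pull the row and column labels straight out of the original text file and compare. This is a rough sketch that assumes the first line holds the column IDs and the first column holds the row IDs. The exact layout that tab_to_h5.py writes may differ (for instance, in whether the top-left header cell is kept), and the file names below are placeholders.

```shell
# Tiny stand-in for expression.txt (hypothetical contents).
printf 'gene\tcell_1\tcell_2\nGeneA\t1\t3\nGeneB\t0\t7\n' > demo_expression.txt

# Row labels: first column of every line after the header.
tail -n +2 demo_expression.txt | cut -f 1 > demo_row_labels.txt

# Column labels: header line, one label per line, skipping the top-left cell.
head -n 1 demo_expression.txt | tr '\t' '\n' | tail -n +2 > demo_col_labels.txt

# Line counts give the number of rows and columns to compare against.
wc -l demo_row_labels.txt demo_col_labels.txt
```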

 

Using custom cell types/sample groups

If you're running PyMINEr on a traditional dataset with a priori groups (e.g. WT vs KO), you can provide these groupings to PyMINEr. If you're doing scRNAseq, but want to use an algorithm not included in PyMINEr for cell type identification, you can provide those groupings here as well. Above is the tutorial for how to do that.

©2018 by Scott Tyler
