Installing PyMINEr

PyMINEr is a Python package for analyzing scRNAseq data (although you could really use it for anything that's a 2D matrix!). To install it, you'll first need Python 3 (preferably >=3.7). You can get it from Anaconda, or you can install it at the command line.
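
If you're not sure which Python you have, a quick check like the one below can tell you. This is only a sketch: the helper name is made up for illustration and is not part of PyMINEr, and the 3.7 cutoff is just the preference mentioned above.

```python
import sys


def python_is_recent_enough(version=None, minimum=(3, 7)):
    """Return True if `version` (major, minor, ...) is at least `minimum`.

    Hypothetical helper for illustration only; not part of PyMINEr.
    """
    if version is None:
        # Default to the interpreter that is running this script
        version = sys.version_info
    return tuple(version[:2]) >= tuple(minimum)


if python_is_recent_enough():
    print("Python version looks fine for PyMINEr")
else:
    print("Consider upgrading to Python 3.7 or newer")
```

Run it with the same python3 you plan to use for pip, since that's the interpreter PyMINEr will be installed into.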

Using pip

Just type:

    python3 -m pip install bio-pyminer

Using Docker

Sometimes getting all the right dependencies installed can be a pain, and the dependencies of one program can break another. To get around this, you can use Docker. There are plenty of tutorials out there on this. Once you're familiar with how to use Docker, you can pull the latest PyMINEr Docker image, start a container with your data directory mounted at /data, and then, inside the container, move to that directory and print the PyMINEr help like so:

    docker pull scottyler89/pyminer

    docker run -it --name first_try -v <path_to_data>:/data scottyler89/pyminer
    cd /data/

    pyminer.py -h

Then, when you're finished using PyMINEr, you can leave the Docker container by typing "exit" at the command line.

Chromium output files and automated analyses have some informatics no-nos in them. Before you can use Chromium results for PyMINEr analysis, they need some cleaning. I've written up some code and a tutorial to walk you through how to do this.

 
 

A basic example of using PyMINEr

Here I'll walk you through a basic example of how to use PyMINEr with a scRNAseq dataset using the default parameters. Think of this as the "Hello World" for using PyMINEr.

The input file is: here

Genes of interest: here

Note that typically, the expression input you give PyMINEr should be the log-transformed and normalized expression matrix.
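
If your matrix is still raw counts, here's a rough sketch of one common way to normalize and log-transform it. A few assumptions to flag: genes are in rows and cells in columns, each cell is scaled to 10,000 total counts, and the transform is log2(x + 1). None of these choices come from PyMINEr itself, and the function name is made up for illustration, so use whatever normalization suits your data.

```python
import math


def log_normalize(counts, scale=10000.0):
    """Scale each cell (column) to `scale` total counts, then take log2(x + 1).

    counts: list of rows (genes), each a list of raw counts per cell.
    Hypothetical helper for illustration only; not part of PyMINEr.
    """
    n_cells = len(counts[0])
    # Total counts per cell, i.e. the column sums
    totals = [sum(row[j] for row in counts) for j in range(n_cells)]
    normalized = []
    for row in counts:
        normalized.append([
            # Cells with zero total counts are left at 0 to avoid dividing by zero
            math.log2(1.0 + scale * value / totals[j]) if totals[j] > 0 else 0.0
            for j, value in enumerate(row)
        ])
    return normalized


# Tiny made-up example: 2 genes x 3 cells
raw = [
    [0, 5, 10],
    [10, 5, 0],
]
for row in log_normalize(raw):
    print("\t".join("%.3f" % value for value in row))
```

For a real dataset you'd read the matrix from your tab-delimited file and write the transformed values back out in the same layout before handing it to PyMINEr.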

The species codes accepted by gProfiler are listed here: https://biit.cs.ut.ee/gprofiler/page/organism-list

 

Interpreting the Results

Here I'll give you a walk-through on how to interpret the PyMINEr results.

 

PyMINEr for Large Datasets

Sometimes you've got more data than your computer can actually handle all at once. To address this issue, we have a script that will convert your file to a PyMINEr-compatible HDF5 file. I'll show you how to do it here. As a basic example, we'll start out with the same input file as before. I'll also show you how to use the genes of interest option, using this file.

 

To convert a tab-delimited text file into a PyMINEr-compatible HDF5 file, type:

   tab_to_h5.py expression.txt

This makes the HDF5 file as well as the corresponding row and column annotation text files, which need to be fed into PyMINEr with the arguments: -ids <ID_list.txt> -cols <column_IDs.txt>

 

Using custom cell types/sample groups

If you're running PyMINEr on a traditional dataset with a priori groups (e.g. WT vs KO), you can provide these groupings to PyMINEr. If you're doing scRNAseq, but want to use an algorithm not included in PyMINEr for cell type identification, you can provide those groupings here as well. Above is the tutorial for how to do that.

©2018 by Scott Tyler
