PyMINEr is a Python package for analyzing scRNAseq data (although you could really use it for anything that's a 2D matrix!). To install it, you'll first need Python 3 (preferably >=3.7). You can get Python from Anaconda or install it from the command line. Once Python is set up, install PyMINEr with pip:
python3 -m pip install bio-pyminer
Sometimes getting all the right dependencies installed can be a pain, and the dependencies of one program can break another. To get around this, you can use Docker. There are plenty of tutorials out there on this. Once you're familiar with Docker, you can pull the latest PyMINEr Docker image and start a container with your data directory mounted at /data like so:
docker pull scottyler89/pyminer
docker run -it --name first_try -v <path_to_data>:/data scottyler89/pyminer
Then, when you're finished using PyMINEr, you can leave the Docker container by typing "exit" at the command line.
A basic example of using PyMINEr
Here I'll walk you through a basic example of how to use PyMINEr with a scRNAseq dataset using the default parameters. Think of this as the "Hello World" for using PyMINEr.
The input file is: here
Genes of interest: here
Note that typically, the expression input you give PyMINEr should be the log-transformed and normalized expression matrix.
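If your matrix still holds raw counts, the sketch below shows one common way to get to a normalized, log-transformed matrix. It is only an illustration, not PyMINEr's own preprocessing: the function name log_normalize, the scale factor of 10,000 counts per cell, and the log2(x + 1) transform are all assumptions chosen for the example, and your pipeline may call for something different.

```python
import math

def log_normalize(counts, scale=10_000):
    """Scale each cell (column) to `scale` total counts, then apply log2(x + 1).

    `counts` is a list of rows (genes); each row holds one value per cell.
    The scale factor and log base are illustrative assumptions, not PyMINEr defaults.
    """
    n_cells = len(counts[0])
    # Total counts per cell (column sums).
    totals = [sum(row[j] for row in counts) for j in range(n_cells)]
    normalized = []
    for row in counts:
        new_row = []
        for j, value in enumerate(row):
            # Guard against empty cells to avoid dividing by zero.
            scaled = value / totals[j] * scale if totals[j] > 0 else 0.0
            new_row.append(math.log2(scaled + 1.0))
        normalized.append(new_row)
    return normalized

# Two genes (rows) by three cells (columns) of made-up raw counts.
raw = [[10, 0, 5],
       [30, 20, 5]]
for row in log_normalize(raw):
    print("\t".join(f"{v:.3f}" for v in row))
```

Genes are rows and cells are columns here only to keep the example small; check that the orientation matches your own input file.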
The species codes accepted by gProfiler are listed here: https://biit.cs.ut.ee/gprofiler/page/organism-list
Interpreting the Results
Here I'll give you a walk-through on how to interpret the PyMINEr results.
PyMINEr for Large Datasets
Sometimes you've got more data than your computer can actually handle all at once. To address this issue, we have a script that converts your file to a PyMINEr-compatible HDF5 file. I'll show you how to do it here. As a basic example, we'll start out with the same input file as before. I'll also show you how to use the genes of interest option, using this file.
To convert a tab-delimited text file into a PyMINEr-compatible HDF5 file, type:
This creates the HDF5 file as well as the corresponding row and column annotation text files, which need to be passed to PyMINEr with the arguments: -ids <ID_list.txt> -cols <column_IDs.txt>
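To make the role of those two annotation files a bit more concrete, here is a minimal Python sketch, not PyMINEr's own converter, that pulls the row IDs and column IDs out of a tab-delimited matrix. It assumes the first line is a header and the first field of every later line is a row ID. The function name split_ids and the example data are hypothetical, and the exact layout PyMINEr expects in the ID files isn't spelled out here, so treat the files written by the conversion script as the reference.

```python
import csv
import io

def split_ids(handle):
    """Return (row_ids, column_ids) from a tab-delimited matrix with a header row.

    Assumes the first line is the header (its first field labels the ID column)
    and the first field of every later line is a row ID.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    column_ids = header[1:]
    # Skip blank lines; keep only the leading ID of each data row.
    row_ids = [line[0] for line in reader if line]
    return row_ids, column_ids

# A tiny in-memory stand-in for an expression file (hypothetical contents).
example = io.StringIO("gene\tcell_1\tcell_2\nGeneA\t1.0\t0.0\nGeneB\t2.5\t3.1\n")
row_ids, column_ids = split_ids(example)
print(row_ids)
print(column_ids)
```

For a real file, you would pass an open file handle instead of the io.StringIO stand-in.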
Using custom cell types/sample groups
If you're running PyMINEr on a traditional dataset with a priori groups (e.g., WT vs. KO), you can provide these groupings to PyMINEr. If you're doing scRNAseq but want to use an algorithm not included in PyMINEr for cell type identification, you can provide those groupings here as well. Above is the tutorial for how to do that.