Below are some tutorials to help you get started:
Installing PyMINEr
What to do with Chromium output
Basic use case
Interpreting the results
Using PyMINEr with big datasets
Using custom cell types / sample groups
Installing PyMINEr
Here I'll walk you through the installation process for PyMINEr using a virtual machine running Ubuntu on a Windows host. The virtual machine software used here is VirtualBox, which also runs on macOS and Linux, so this approach lets you use PyMINEr no matter what kind of computer you have.
Get the official release of PyMINEr code here.
or
Get the 'bleeding edge'
Just type:
sudo bash PyMINEr_install.sh
python3 -m pip install -r requirements.txt
#### That's it! ####
If you're on a cluster or want to do a local installation, you'll need to install the dependencies, add PyMINEr to your PATH variable, and unzip the library files:
python3 -m pip install -r requirements.txt
PATH=~/bin/pyminer/bin:$PATH
gunzip ~/bin/pyminer/lib/*.gz
Local installs only work with the current, bleeding edge version, because you'll need to use the -string_db_dir argument and point it at the ~/bin/pyminer/lib/ directory, or wherever you put it. It's also convenient to add the PATH line to your ~/.bashrc file so that you don't need to set the PATH variable every time you run PyMINEr.
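One way to make the PATH change permanent is sketched below. It is only an illustration: append_once is a hypothetical helper (not part of PyMINEr), RC_FILE is an optional override introduced here for convenience, and the install location ~/bin/pyminer is assumed from the commands above. Adjust the path if you put PyMINEr somewhere else.

```shell
# Hypothetical helper: append a line to a file only if that exact line is absent,
# so running this more than once does not pile up duplicate entries.
append_once() {
    line="$1"
    file="$2"
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Assumes PyMINEr lives in ~/bin/pyminer, as in the local install commands.
# RC_FILE is an assumed override; by default the line goes into ~/.bashrc.
append_once 'export PATH="$HOME/bin/pyminer/bin:$PATH"' "${RC_FILE:-$HOME/.bashrc}"
```

Open a new terminal (or run source ~/.bashrc) for the change to take effect.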
What to do with Chromium output
Chromium output files and automated analyses have some informatics no-nos in them. Before you can use the Chromium results for PyMINEr analysis, they need some cleaning. I've written up some code and a tutorial to walk you through how to do this.
A basic example of using PyMINEr
Here I'll walk you through a basic example of how to use PyMINEr with a scRNAseq dataset using the default parameters. Think of this as the "Hello World" for using PyMINEr. The input file is here, and the results that you get should look like this. Note that typically, the expression input you give PyMINEr should be the log-transformed and normalized expression matrix.
The species codes accepted by gProfiler are listed here: https://biit.cs.ut.ee/gprofiler/help.cgi#help_id_2
PyMINEr for Large Datasets and Genes of Interest
Sometimes you've got more data than your computer can handle all at once. To address this, we have a script that converts your file to a PyMINEr-compatible HDF5 file. I'll show you how to do it here. As a basic example, we'll start out with the same input file as before. I'll also show you how to use the genes of interest option, using this file.
To convert a tab-delimited text file into a PyMINEr-compatible HDF5 file, type:
tab_to_h5.py expression.txt
This makes the HDF5 file as well as the corresponding row and column annotation text files, which need to be passed to PyMINEr with the arguments: -ids <ID_list.txt> -cols <column_IDs.txt>
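Before converting a large file, it can save time to confirm that the tab-delimited input is rectangular. The sketch below is not part of PyMINEr: check_rectangular is a hypothetical helper, and it assumes the header line has the same number of tab-separated fields as every data row (that is, the ID column has a header label too).

```shell
# Hypothetical helper: succeed only if every line of a tab-delimited file
# has the same number of fields as the first (header) line.
check_rectangular() {
    awk -F'\t' '
        NR == 1 { n = NF }
        NF != n { printf "line %d has %d fields, expected %d\n", NR, NF, n; bad = 1 }
        END     { exit bad + 0 }
    ' "$1"
}

# Run the conversion only if the check passes.
check_rectangular expression.txt && tab_to_h5.py expression.txt
```

If the check fails, it prints the offending line numbers and the conversion is skipped.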
Using custom cell types/sample groups
If you're running PyMINEr on a traditional dataset with a priori groups (e.g. WT vs KO), you can provide these groupings to PyMINEr. If you're doing scRNAseq but want to use an algorithm not included in PyMINEr for cell type identification, you can provide those groupings here as well. Above is the tutorial for how to do that.