Cluster FAQ

Frequently Asked Questions (FAQ) oriented to end users of the VHIR Cluster.


1. Where can I ask questions not answered in this FAQ?

Ask at the VHIR forums:
http://forums.vhir.org

If for some reason you can't access or log in to the forums site, ask at the UEB: ueb(a)vhir.org (or phone extension: 4007)

2. Terms Glossary

Glossary of terms related to the Computing Cluster at VHIR.

CentOS
GNU/Linux distribution based on Red Hat: the Red Hat source code is recompiled to maintain a free/open-source version of Red Hat, while keeping stable versions of its packages. It does not ship the latest versions of programs; they are usually even older than those found in the stable versions of the Debian and Ubuntu GNU/Linux distributions, as a reference. The CentOS distribution is often used on computers linked to Roche sequencers, the old VHIR cluster, etc.
CPD
Data Processing Centre, the usual location where an institution's computer servers (calculation, backups, etc.) are housed.
KVM (1)
Kernel-based Virtual Machine, a virtualization system for computers. There are others, such as Xen.
KVM (2)
Keyboard, Video, Mouse. Screen-and-keyboard terminal commonly used in CPD cabinets to access a headless server in the same or another cabinet. We have an HP cabinet at the Hospital CPD, in the "Mediterranean" building of VHIR.
HPC
High Performance Computing.
Miceli "Fredolic"
Name of the VHIR Cluster in the internal documentation.
Master Node
= "Head node" = "Front End" node, the director of the orchestra of the musicians (compute nodes). It usually has more two or more CPU, at least, to manage the queue system. That is the node of the cluster where users connect to in order to send jobs to the queue/s for processings. See the figure below.
Module
Specific version of a program to be loaded in the Cluster prior to its use. Programs are first converted into cluster modules, so that each version of a program has its own module, and each researcher can easily choose which version of that program to use each time. See: https://www.tacc.utexas.edu/research-development/tacc-projects/lmod
Compute Node
Worker node directed and managed by the Master Node. Compute nodes usually have 4 or more CPUs for doing computational tasks. See the diagram in question 3 below.
Rocks
GNU/Linux distribution based on CentOS (see above), adapted for computing clusters. It allows the use of different queue managers, such as SGE or Condor.
Roll
Set of software packages associated with a common functionality in the Rocks GNU/Linux distribution, such as the web server, the SGE or Condor queueing systems, etc. A concept specific to the Rocks GNU/Linux distribution. See: http://central6.rocksclusters.org/roll-documentation/
SGE
Sun Grid Engine (formerly), and probably Son of Grid Engine since 2014 (although the cluster could be using another fork: Open Grid Scheduler). The queue management system of the computer cluster.


3. What's the difference between the 'Head' & 'Compute' Nodes?

See the diagram of the cluster (the head node directing the compute nodes), and check the Glossary again.




4. How to book the Cluster?

4.1. Don't forget to book the cluster in advance!

Remember to add a booking period if you plan to use the cluster, so that we can all know in advance which days/weeks you may be sending jobs to it.

Use the usual bookings system.

First time?

  • Send an email to ueb(a)vhir.org to request a user for the bookings system


Warning:

  • Other users can reserve the same calendar slot in the computer cluster concurrently.
  • When you foresee that there might be a conflict between two concurrent intensive uses of the same working queue at the Cluster, UEB staff will discuss it with the users involved so that each can work in exclusive slots, avoiding competition among their computation-intensive processes.


How to book a resource?:

  1. Log in to the bookings system with the username and password you have been provided
  2. Select the resource you want to book
    • In the case of the Computer Cluster, select:
      • Familia (family): Hardware
      • Recursos (resource): Cluster de Càlcul
  3. Click on the date you are interested in
  4. Fill in the details and save your request.
    • You will receive an email with a copy of your request.
    • When your request is validated, you will receive a second email confirming it. Alternatively, that second email might ask you for more details or further information prior to the approval of your request.

5. What are my username and password to connect to the Cluster?

The same ones that you have for the UEB bookings system (see above).

6. How do I connect to the VHIR Cluster?

You need to connect through a terminal window (also known as a "console") from a computer within the VHIR computer network, and run this type of command (replace "yourusername" with your username for the UEB Bookings system) to connect through "Secure Shell" (ssh) to the IP of the vhircluster. That IP is different depending on whether you connect from inside the VHIR internal (cabled) computer network or from outside (which also includes the VHIR_Externa wifi network):

6.1. From inside the VHIR internal network

Command in a console
ssh yourusername@172.18.50.16

6.2. From outside the VHIR network (VHIR_Externa wifi, from home, from abroad, etc.)

Command in a console
ssh yourusername@193.146.115.180

7. Where can I find a Secure Shell (ssh) program on my computer?

It depends on the operating system that you use on your computer.

7.1. GNU/Linux

Using ssh from a terminal window. Ask at the VHIR forums for more support.

7.2. Mac

Using ssh from a terminal window (open the Terminal, Xterm or X11 program). Ask at the VHIR forums for more support.

7.3. Windows

Using ssh from a terminal window, through the program PuTTY. You can fetch PuTTY from:
http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html

Ask at the VHIR forums for more support.

8. What if I want to connect to the Cluster from outside of the VHIR network?

You can use the public IP of the vhircluster: 193.146.115.180

Command in a console
ssh yourusername@193.146.115.180


Or alternatively, if you have VPN (Virtual Private Network) access to the VHIR network, you can connect using a VPN connection. If you need VPN access, you can request it from informatica at vhir.org:
Manuals to access the VHIR network through VPN:
  • VPN on Windows XP
  • VPN v7 v2

8.1. VPN Client (GNU/Linux)

Use the VPN Client provided by default in your GNU/Linux distribution. No extra documentation is available at this time. Ask at the forums, please, if you need further help or guidance.

8.2. VPN Client (Mac OSX)

You can use the VPN client available for download from https://vpn.vhir.org once you provide your VPN user credentials.

Ask at the forums, please, if you need further help or guidance.

8.3. VPN Client (Windows)

You can use the VPN client available for download from https://vpn.vhir.org once you provide your VPN user credentials.

Ask at the forums, please, if you need further help or guidance.

9. How to run a program in the Cluster?

First check that the program is already installed: see the list below. If it's not, contact the UEB staff indicating which program and version you want to run.

You need to decide:

  1. whether you want to use a whole compute-node queue,
    • which will allow you to run your job on many nodes at once, and use more of the cluster's available resources for your task (although you will need to create a script for the cluster to run your commands on the compute nodes).
      In this case, you will also have to choose whether to send the jobs sequentially (serial job) or in parallel (see below).
    • Note that the work is submitted to a job queuing system called "Sun Grid Engine" (SGE, nowadays "Son of Grid Engine", its descendant). The SGE queuing system manages the cluster's computing resources (e.g. CPU time, software, disk usage, ...) effectively in a parallel (distributed) way.
  2. or whether you want to use an interactive shell (on a single compute node),
    • which will provide a single node for you, so that you can run programs interactively (you will not need to create a script; you can run commands one by one).
      See below for more information.

10. How to send a serial job (on a compute-node queue)?

If you want to run a job without parallelization, create a text file with a header (some lines starting with "#") containing instructions like the ones shown in these examples:

Example 1: 'submit.sh'

Contents of submit.sh
#!/bin/bash

#$ -N test                  # job name shown in the queue
#$ -cwd                     # run from the current working directory

module load Bowtie/1.1.1    # load the version of Bowtie to use

bowtie --version
which bowtie


Example 2: sleep.sh

Contents of sleep.sh
#!/bin/bash 
# 
#$ -cwd 
#$ -j y 
#$ -S /bin/bash 
# 
date 
sleep 20 
date


Lines starting with #$ are treated as options for the job-queueing system.

  • -N test gives the job the name "test" in the queue.
  • -cwd runs the task at the current working directory.
  • -j y merges the standard error and standard output channels, instead of having them as separate output channels.
  • -S /bin/bash specifies that the shell interpreter for this task is the Bash shell.
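
For reference, a slightly fuller header combining several common SGE options could look like this (a sketch; the job name and output file name are illustrative):

Contents of a generic job script
#!/bin/bash

#$ -N myjob        # job name shown in the queue
#$ -cwd            # run from the current working directory
#$ -j y            # merge standard error into standard output
#$ -S /bin/bash    # use Bash as the shell interpreter
#$ -o myjob.log    # write the (merged) output to this file

date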


After that, you can send it to the job queue system in the cluster, with the qsub command:

Command run in a terminal at the front-end node
$ qsub sleep.sh 
Your job 16 ("sleep.sh") has been submitted


The output you see indicates the job number at the queue (job number 16, in this example).
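
If you change your mind, you can remove a job from the queue with the qdel command, giving it the job number (here, the job number from the example above):

Command run in a terminal at the front-end node
$ qdel 16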

11. How to send a parallel job (on a compute-node queue)?

To have a job run in parallel, you need to write a script such as the following:


Example 1: 'hello.qsub'

Contents of hello.qsub
#!/bin/bash 

#$ -cwd 
#$ -j y 
#$ -S /bin/bash 

/opt/openmpi/bin/mpirun hello_c


In this example, the program will use the OpenMPI parallelization libraries.

After that, you can send it through the usual job queue system, with the qsub command:

Command run in a terminal at the front-end node
$ qsub hello.qsub
Your job 14 ("hello.qsub") has been submitted

Command run in a terminal at the front-end node
$ qstat -f 
queuename                  qtype resv/used/tot. load_avg arch          states 
----------------------------------------------------------------------------------------------
all.q@compute-0-0.local    BIP   0/0/24         -NA-     linux-x64     au 
----------------------------------------------------------------------------------------------
all.q@compute-0-1.local    BIP   0/1/4          0.00     linux-x64     
     14 0.55500 hello.qsub test         r     12/23/2014 17:07:04     1        
----------------------------------------------------------------------------------------------
all.q@compute-0-2.local    BIP   0/0/4          0.00     linux-x64     
----------------------------------------------------------------------------------------------
all.q@compute-0-3.local    BIP   0/0/4          0.00     linux-x64     
----------------------------------------------------------------------------------------------
all.q@compute-0-4.local    BIP   0/0/4          0.00     linux-x64     
----------------------------------------------------------------------------------------------
all.q@compute-0-5.local    BIP   0/0/4          0.05     linux-x64     
$


Once finished, we will be able to see the output of the job (any output message plus any error message, since they were merged as set in the job script):

Command run in a terminal at the front-end node
$ cat hello.qsub.o14 
Hello, world, I am 0 of 1




If we want to run the previous example again, but using 5 instances (cores) at the same time, we can do so with the parameter -pe orte 5:

Command run in a terminal at the front-end node
$ qsub -pe orte 5 hello.qsub   
Your job 15 ("hello.qsub") has been submitted


We can see the jobs at the queues:

Command run in a terminal at the front-end node
$ qstat -f 
queuename                  qtype resv/used/tot. load_avg arch          states 
----------------------------------------------------------------------------------------------
all.q@compute-0-0.local    BIP   0/0/24         -NA-     linux-x64     au 
----------------------------------------------------------------------------------------------
all.q@compute-0-1.local    BIP   0/0/4          0.00     linux-x64     
----------------------------------------------------------------------------------------------
all.q@compute-0-2.local    BIP   0/4/4          0.00     linux-x64     
     15 0.55500 hello.qsub test         r     12/23/2014 17:08:49     4        
----------------------------------------------------------------------------------------------
all.q@compute-0-3.local    BIP   0/1/4          0.00     linux-x64     
     15 0.55500 hello.qsub test         r     12/23/2014 17:08:49     1        
----------------------------------------------------------------------------------------------
all.q@compute-0-4.local    BIP   0/0/4          0.00     linux-x64     
----------------------------------------------------------------------------------------------
all.q@compute-0-5.local    BIP   0/0/4          0.04     linux-x64     
$


Once finished, we can see the final output:

Command run in a terminal at the front-end node
$ cat hello.qsub.o15 
Hello, world, I am 0 of 5 
Hello, world, I am 1 of 5 
Hello, world, I am 3 of 5 
Hello, world, I am 4 of 5 
Hello, world, I am 2 of 5
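
Note that, since lines starting with #$ are treated as options of the job-queueing system (see above), you could equivalently request the 5 instances from within the script header instead of on the qsub command line (a sketch):

Contents of hello.qsub (alternative header)
#!/bin/bash

#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -pe orte 5

/opt/openmpi/bin/mpirun hello_c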

11.1. How to submit an R BATCH job to the Cluster

Using the Sun Grid Engine to submit an R BATCH job to the cluster is very simple: only two steps are needed.

  1. First (on the vhircluster frontend node), write a wrapper script
    assuming you have an R program in a file named mycommands.R,
    • you need to create a new file that will invoke and run your R program when it is submitted to the cluster. Let’s call this new file batch.sh .
    • You should put this batch.sh file in the same directory as your mycommands.R file.
    • To run an R BATCH job on the cluster using the mycommands.R file, your batch.sh file needs only this one line in it:
      R CMD BATCH mycommands.R
    • The file might have other lines in it to specify SGE job options or commands to run before or after the “R CMD BATCH …” line.
    • The technical name for this file is “shell script”. Knowing this might help you communicate with the system administrator.

  2. Submit your script to the cluster
    • Once you’ve written your short batch.sh file, you could submit it to the cluster via the command:
      qsub -cwd batch.sh
      • Additional information on how to specify your job’s memory needs can be found below, in the qrsh section, which shares these parameters with qsub.


The -cwd option tells SGE to execute the batch.sh script on the cluster from the current working directory (otherwise, it will run from your home directory, which is probably not what you want).
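
Putting the two steps together, a minimal batch.sh could look like this (a sketch; the SGE header options are optional, and mycommands.R comes from the example above):

Contents of batch.sh
#!/bin/bash

#$ -cwd    # run from the submission directory (equivalent to qsub -cwd)
#$ -j y    # merge output and error streams

module load R              # load the cluster's R module (see the modules sections below)
R CMD BATCH mycommands.R   # run the R program non-interactively

With the -cwd option already in the header, a plain qsub batch.sh is enough to submit it.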

That’s all you have to do! There are a few things to note:

  • You do not have to put an & at the end of the line (don’t worry if you don’t know what the & might be used for).
  • qsub automatically sends your job to the cluster and returns to your command line prompt so that you can do other things.
  • After submitting your job with qsub, you can use the qstat command to see the status of your job(s).

12. How to check that my job is in the queue?

You can do so with the command qstat -f. In this example, it is run after sending the job in the script sleep.sh to the job processing queue.

Command run in a terminal at the front-end node
$ qstat -f 
queuename                  qtype resv/used/tot. load_avg arch          states 
----------------------------------------------------------------------------------------------
all.q@compute-0-0.local  BIP   0/0/24         -NA-     linux-x64     au 
----------------------------------------------------------------------------------------------
all.q@compute-0-1.local  BIP   0/0/4          0.00     linux-x64     
----------------------------------------------------------------------------------------------
all.q@compute-0-2.local  BIP   0/0/4          0.00     linux-x64     
----------------------------------------------------------------------------------------------
all.q@compute-0-3.local  BIP   0/0/4          0.01     linux-x64     
----------------------------------------------------------------------------------------------
all.q@compute-0-4.local  BIP   0/0/4          0.04     linux-x64     
----------------------------------------------------------------------------------------------
all.q@compute-0-5.local  BIP   0/0/4          0.00     linux-x64     
############################################################################ 
 - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS 
############################################################################ 
     16 0.00000 sleep.sh   test         qw    12/23/2014 17:39:38     1        
$


If you run it again a little later, you will see that the job has started to be processed by the queue:

Command run in a terminal at the front-end node
$ qstat -f 
queuename                  qtype resv/used/tot. load_avg arch          states 
----------------------------------------------------------------------------------------------
all.q@compute-0-0.local    BIP   0/0/24         -NA-     linux-x64     au 
----------------------------------------------------------------------------------------------
all.q@compute-0-1.local    BIP   0/0/4          0.00     linux-x64     
----------------------------------------------------------------------------------------------
all.q@compute-0-2.local    BIP   0/0/4          0.00     linux-x64     
----------------------------------------------------------------------------------------------
all.q@compute-0-3.local    BIP   0/0/4          0.01     linux-x64     
----------------------------------------------------------------------------------------------
all.q@compute-0-4.local    BIP   0/0/4          0.04     linux-x64     
----------------------------------------------------------------------------------------------
all.q@compute-0-5.local    BIP   0/1/4          0.00     linux-x64     
     16 0.55500 sleep.sh   test         r     12/23/2014 17:39:49     1


Once finished, we can see the final result at the output file:

Command run in a terminal at the front-end node
$ cat sleep.sh.o16 
Tue Dec 23 17:39:49 CET 2014 
Tue Dec 23 17:40:09 CET 2014
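
To list only your own jobs, instead of the full queue listing, you can filter by user (replace yourusername accordingly):

Command run in a terminal at the front-end node
$ qstat -u yourusername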

13. How to get notified of my job status via email

If you wish to be notified via email when your job’s status changes, include options like the following when submitting your jobs:

qsub  -m e  -M my.email@vhir.org   your_job.sh

which means: send email to the given address(es) when the job ends.

If you want to automatically have such options (or others) always added to your job(s), simply put them in a file named .sge_request in your home directory. You can also have working-directory-specific .sge_request files (see the man page for sge_request - man sge_request).

Lines like this in your .sge_request file:

-M my.email@vhir.org
-m e

will cause an email to be sent, when your job ends, for every cluster job that you start (including, for what it’s worth, a qrsh ‘job’).

You could use -m n on individual qsub job command lines to suppress email notification for certain jobs.

Or better yet, you might put only the -M my.email@vhir.org line in the .sge_request file and simply use the -m e option on jobs for which you want email notification.

Note: You may also invoke the options shown above (and others) by including special lines at the top of your job shell scripts. Lines beginning with #$ are interpreted as qsub options for that job. For example, if the first few lines of your script look like the following:

#!/bin/bash
#$ -M my.email@vhir.org
#$ -m e

The lines beginning with #$ would cause SGE to send email to ‘my.email@vhir.org’ when the job ends.

#$ -m be

would cause an email to be sent when the job begins ('b') and ends ('e'). See the manual page for qsub (type man qsub at a shell prompt) to get more information.

14. How to use interactive shells (on a single compute-node)?

Use the command qrsh:

From http://gridscheduler.sourceforge.net/htmlman/htmlman1/qsub.html

Quote:
qrsh - submit an interactive rsh session to Sun Grid Engine.

14.1. How to connect to any node?

Once connected to the cluster frontend node, type qrsh:

Quote:
username@vhircluster:~$ qrsh

14.2. How to connect to a specific node?

You can choose your node (if it has available resources) with:

username@vhircluster:~$ qrsh -l h='compute-node-name' -now no


You can choose your compute-node-name from the ones available: from compute-0-0 to compute-0-6

14.3. How to leave the interactive shell running when I disconnect?

Use the nohup command.

If you don't want your job to be stopped when you log out from the compute node, or if the network connection breaks for unknown reasons ("broken pipe" type of message), you can prepend the commands sent to the interactive shell with the nohup command.

More information, for instance, at:
https://www.cyberciti.biz/tips/nohup-execute-commands-after-you-exit-from-a-shell-prompt.html
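
For example, to keep a long-running command alive in the interactive shell (a sketch; the script and log file names are illustrative):

Command run in the interactive shell at the compute node
nohup ./my_long_analysis.sh > my_long_analysis.log 2>&1 &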

14.4. How to get R plots shown locally

If your software needs X11 forwarding (for example, you will need this for R plots generated by the R commands run in the interactive shell in the cluster to be shown on your computer), you need to ssh to the cluster IP with the -X option, such as:

ssh username@cluster-ip-address -X


See above for the cluster-ip-address (it's different depending on whether you connect from inside or outside of the VHIR network).
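
For example, from inside the VHIR internal network:

Command in a console
ssh yourusername@172.18.50.16 -X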

14.5. How to start an R session directly

You can start an R session directly using qrsh, and you can ask for specific memory requirements; for instance, 10G of maximum memory (h_vmem) and at least 5G of memory free at the moment (mem_free).

[username@vhircluster ~]$ module load R
[username@vhircluster ~]$ qrsh -V -l h_vmem=10G,mem_free=5G R --save


Alternatively, you can login to a suitable machine using qrsh and then start R. In the example below, qrsh requests 3 slots and 5G per slot maximum memory (3*5=15G max memory).

[username@vhircluster ~]$ module load R
[username@vhircluster ~]$ qrsh -now no -pe mpi 3 -l  h_vmem=5G
[username@compute-0-3:~]$ R

R version 3.3.2 (2016-10-31) -- "Sincere Pumpkin Patch"
Copyright (C) 2016 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

[Previously saved workspace restored]

> q()
[username@compute-0-3:~]$ exit
[username@vhircluster ~]$


14.6. How to customize an interactive shell (qrsh) further?

See this document, for the time being:
https://jhpce.jhu.edu/knowledge-base/how-to/


15. Which programs are already installed?

The Rocks distribution is based on the CentOS GNU/Linux distribution, and many programs come in sets called "Rolls" (see the #Glossary above).

In addition, many other programs are added as modules, so that you can load a specific version of them just before running your script (see the #Glossary above).

16. Which programs are available in the Bio roll?

The "Bio" Roll (http://central6.rocksclusters.org/roll-documentation/bio/6.1.1/) includes this set of programs:

HMMER, NCBI BLAST, MpiBLAST, biopython, ClustalW, MrBayes, T_Coffee, Emboss, Phylip, fasta, Glimmer, TIGR Assembler, perl-bioperl, perl-bioperl-ext, perl-bioperl-run, perl-bioperl-db, foundation-python, flex, readline-devel, foundation-python-extras, xorg-x11-devel, gd, ReportLab, readline, gd-devel.

17. Which programs are available as 'modules'?

To display the full list of available software at any time:

module spider


To display the summarized list of software (just name and app version):

module av


To display just the bioinformatics-related apps:

module av bio


To display the available versions of a specific program (e.g. Bowtie):

module av bowtie

18. Which modules (and versions) were available, as a reference, in March 2015?

See this list:

-----------------------------------------------------------------------------------------------------------------------------------------------------
The following is a list of the modules currently available:
-----------------------------------------------------------------------------------------------------------------------------------------------------
  Autoconf: Autoconf/2.69-GCC-4.8.4
    Autoconf is an extensible package of M4 macros that produce shell scripts to automatically configure software source code packages. These
    scripts can adapt the packages to many kinds of UNIX-like systems without manual user intervention. Autoconf creates a configuration script for
    a package from a template file that lists the operating system features that the package can use, in the form of M4 macro calls. - Homepage:
    http://www.gnu.org/software/autoconf/ 

  Automake: Automake/1.15-GCC-4.8.4
    Automake: GNU Standards-compliant Makefile generator - Homepage: http://www.gnu.org/software/automake/automake.html 

  BamTools: BamTools/2.2.3-goolf-1.7.20, BamTools/2.3.0-goolf-1.7.20
    BamTools provides both a programmer's API and an end-user's toolkit for handling BAM files. - Homepage: https://github.com/pezmaster31/bamtools 

  bamUtil: bamUtil/1.0.13-goolf-1.7.20
    bamUtil is a repository that contains several programs that perform operations on SAM/BAM files. All of these programs are built into a single
    executable, bam. - Homepage: http://genome.sph.umich.edu/wiki/BamUtil 

  BEDTools: BEDTools/2.17.0-goolf-1.7.20
    The BEDTools utilities allow one to address common genomics tasks such as finding feature overlaps and computing coverage. The utilities are
    largely based on four widely-used file formats: BED, GFF/GTF, VCF, and SAM/BAM. - Homepage: http://code.google.com/p/bedtools/ 

  BioPerl: BioPerl/1.6.923-goolf-1.7.20-Perl-5.20.1
    Bioperl is the product of a community effort to produce Perl code which is useful in biology. Examples include Sequence objects, Alignment
    objects and database searching objects. - Homepage: http://www.bioperl.org/ 

  Biopython: Biopython/1.64-goolf-1.7.20-Python-2.7.9
    Biopython is a set of freely available tools for biological computation written in Python by an international team of developers. It is a
    distributed collaborative effort to develop Python libraries and applications which address the needs of current and future work in
    bioinformatics. - Homepage: http://www.biopython.org 

  BLAST: BLAST/2.2.26-Linux_x86_64
    Basic Local Alignment Search Tool, or BLAST, is an algorithm for comparing primary biological sequence information, such as the amino-acid
    sequences of different proteins or the nucleotides of DNA sequences. - Homepage: http://blast.ncbi.nlm.nih.gov/ 

  BLAST+: BLAST+/2.2.30-goolf-1.7.20
    Basic Local Alignment Search Tool, or BLAST, is an algorithm for comparing primary biological sequence information, such as the amino-acid
    sequences of different proteins or the nucleotides of DNA sequences. - Homepage: http://blast.ncbi.nlm.nih.gov/ 

  Boost: Boost/1.55.0-goolf-1.7.20-Python-2.7.9, Boost/1.55.0-goolf-1.7.20
    Boost provides free peer-reviewed portable C++ source libraries. - Homepage: http://www.boost.org/ 

  Bowtie: Bowtie/0.12.7, Bowtie/1.0.0-goolf-1.7.20, Bowtie/1.1.1
    Bowtie is an ultrafast, memory-efficient short read aligner. It aligns short DNA sequences (reads) to the human genome. - Homepage:
    http://bowtie-bio.sourceforge.net/index.shtml 

  Bowtie2: Bowtie2/2.1.0-goolf-1.7.20, Bowtie2/2.2.2-goolf-1.7.20
    Bowtie 2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences. It is particularly good at
    aligning reads of about 50 up to 100s or 1,000s of characters, and particularly good at aligning to relatively long (e.g. mammalian) genomes.
    Bowtie 2 indexes the genome with an FM Index to keep its memory footprint small: for the human genome, its memory footprint is typically around
    3.2 GB. Bowtie 2 supports gapped, local, and paired-end alignment modes. - Homepage: http://bowtie-bio.sourceforge.net/bowtie2/index.shtml 

  BWA: BWA/0.7.10-goolf-1.7.20, BWA/0.7.11-goolf-1.7.20, BWA/0.7.12-goolf-1.7.20
    Burrows-Wheeler Aligner (BWA) is an efficient program that aligns relatively short nucleotide sequences against a long reference sequence such
    as the human genome. - Homepage: http://bio-bwa.sourceforge.net/ 

  CMake: CMake/2.8.12-GCC-4.8.4
    CMake, the cross-platform, open-source build system. CMake is a family of tools designed to build, test and package software. - Homepage:
    http://www.cmake.org 

  Crass: Crass/0.3.12-goolf-1.7.20
    Crass is a program that searches through raw metagenomic reads for Clustered Regularly Interspersed Short Palindromic Repeats - Homepage:
    http://bioinformatics.ninja/crass/ 

  Cufflinks: Cufflinks/2.2.1-goolf-1.7.20
    Transcript assembly, differential expression, and differential regulation for RNA-Seq - Homepage: http://cufflinks.cbcb.umd.edu/ 

  cURL: cURL/7.40.0
    libcurl is a free and easy-to-use client-side URL transfer library, supporting DICT, FILE, FTP, FTPS, Gopher, HTTP, HTTPS, IMAP, IMAPS, LDAP,
    LDAPS, POP3, POP3S, RTMP, RTSP, SCP, SFTP, SMTP, SMTPS, Telnet and TFTP. libcurl supports SSL certificates, HTTP POST, HTTP PUT, FTP uploading,
    HTTP form based upload, proxies, cookies, user+password authentication (Basic, Digest, NTLM, Negotiate, Kerberos), file transfer resume, http
    proxy tunneling and more. - Homepage: http://curl.haxx.se 

  cutadapt: cutadapt/1.7.1-goolf-1.7.20-Python-2.7.9
    cutadapt removes adapter sequences from high-throughput sequencing data. This is usually necessary when the read length of the sequencing
    machine is longer than the molecule that is sequenced, for example when sequencing microRNAs. - Homepage: http://code.google.com/p/cutadapt/ 

  EasyBuild: EasyBuild/1.16.2, EasyBuild/2.0.0
    EasyBuild is a software build and installation framework written in Python that allows you to install software in a structured, repeatable and
    robust way. - Homepage: http://hpcugent.github.com/easybuild/ 

  Eigen: Eigen/3.1.1-goolf-1.7.20
    Eigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms. - Homepage:
    http://eigen.tuxfamily.org/index.php?title=Main_Page 

  EPACTS: EPACTS/3.2.6-goolf-1.7.20
    EPACTS (Efficient and Parallelizable Association Container Toolbox) is a versatile software pipeline to perform various statistical tests for
    identifying genome-wide association from sequence data through a user-friendly interface, both to scientific analysts and to method developers.
    - Homepage: http://genome.sph.umich.edu/wiki/EPACTS 

  FastQC: FastQC/0.11.2-Java-1.7.0_67
    FastQC is a quality control application for high throughput sequence data. It reads in sequence data in a variety of formats and can either
    provide an interactive application to review the results of several different QC checks, or create an HTML based report which can be integrated
    into a pipeline. - Homepage: http://www.bioinformatics.babraham.ac.uk/projects/fastqc/ 

  FASTX-Toolkit: FASTX-Toolkit/0.0.14-goolf-1.7.20
    The FASTX-Toolkit is a collection of command line tools for Short-Reads FASTA/FASTQ files preprocessing. - Homepage:
    http://hannonlab.cshl.edu/fastx_toolkit/ 

  FFTW: FFTW/3.3.4-gompi-1.7.20
    FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions, of arbitrary input size, and of
    both real and complex data. - Homepage: http://www.fftw.org 

  freebayes: freebayes/0.9.18-goolf-1.7.20
    Bayesian haplotype-based polymorphism discovery and genotyping. - Homepage: https://github.com/ekg/freebayes 

  GATK: GATK/2.8-1-Java-1.7.0_67, GATK/3.3-0-Java-1.7.0_67
    The Genome Analysis Toolkit or GATK is a software package developed at the Broad Institute to analyse next-generation resequencing data. The
    toolkit offers a wide variety of tools, with a primary focus on variant discovery and genotyping as well as strong emphasis on data quality
    assurance. Its robust architecture, powerful processing engine and high-performance computing features make it capable of taking on projects of
    any size. - Homepage: http://www.broadinstitute.org/gatk/ 

  GCC: GCC/4.7.2, GCC/4.8.4, GCC/4.9.2
    The GNU Compiler Collection includes front ends for C, C++, Objective-C, Fortran, Java, and Ada, as well as libraries for these languages
    (libstdc++, libgcj,...). - Homepage: http://gcc.gnu.org/ 

  gnuplot: gnuplot/4.6.0-goolf-1.7.20
    gnuplot-4.6.0: Portable interactive, function plotting utility - Homepage: http://gnuplot.sourceforge.net/ 

  gompi: gompi/1.7.20
    GNU Compiler Collection (GCC) based compiler toolchain, including OpenMPI for MPI support. - Homepage: (none) 

  goolf: goolf/1.7.20
    GNU Compiler Collection (GCC) based compiler toolchain, including OpenMPI for MPI support, OpenBLAS (BLAS and LAPACK support), FFTW and
    ScaLAPACK. - Homepage: (none) 

  GSL: GSL/1.16-goolf-1.7.20
    The GNU Scientific Library (GSL) is a numerical library for C and C++ programmers. The library provides a wide range of mathematical routines
    such as random number generators, special functions and least-squares fitting. - Homepage: http://www.gnu.org/software/gsl/ 

  HMMER: HMMER/3.1b1-goolf-1.7.20
    HMMER is used for searching sequence databases for homologs of protein sequences, and for making protein sequence alignments. It implements
    methods using probabilistic models called profile hidden Markov models (profile HMMs). Compared to BLAST, FASTA, and other sequence alignment
    and database search tools based on older scoring methodology, HMMER aims to be significantly more accurate and more able to detect remote
    homologs because of the strength of its underlying mathematical models. In the past, this strength came at significant computational expense,
    but in the new HMMER3 project, HMMER is now essentially as fast as BLAST. - Homepage: http://hmmer.janelia.org/ 

  HTSeq: HTSeq/0.6.1p1-goolf-1.7.20-Python-2.7.9
    HTSeq is a Python package that provides infrastructure to process data from high-throughput sequencing assays. - Homepage:
    http://www-huber.embl.de/users/anders/HTSeq/ 

  HTSlib: HTSlib/1.1-goolf-1.7.20
    A C library for reading/writing high-throughput sequencing data. This package includes the utilities bgzip and tabix - Homepage:
    http://www.htslib.org/ 

  hwloc: hwloc/1.10.1-GCC-4.8.4
    The Portable Hardware Locality (hwloc) software package provides a portable abstraction (across OS, versions, architectures, ...) of the
    hierarchical topology of modern architectures, including NUMA memory nodes, sockets, shared caches, cores and simultaneous multithreading. It
    also gathers various system attributes such as cache and memory information as well as the locality of I/O devices such as network interfaces,
    InfiniBand HCAs or GPUs. It primarily aims at helping applications with gathering information about modern computing hardware so as to exploit
    it accordingly and efficiently. - Homepage: http://www.open-mpi.org/projects/hwloc/ 

  IGV: IGV/2.3.40-Java-1.7.0_67
    The Integrative Genomics Viewer (IGV) is a high-performance visualization tool for interactive exploration of large, integrated genomic
    datasets. It supports a wide variety of data types, including array-based and next-generation sequence data, and genomic annotations. -
    Homepage: http://www.broadinstitute.org/software/igv/ 

  IOR: IOR/2.10.3-goolf-1.7.20-mpiio
    The IOR software is used for benchmarking parallel file systems using POSIX, MPIIO, or HDF5 interfaces. - Homepage:
    http://sourceforge.net/projects/ior-sio/ 

  Java: Java/1.7.0_67
    Java Platform, Standard Edition (Java SE) lets you develop and deploy Java applications on desktops and servers. - Homepage: http://java.com/ 

  libgtextutils: libgtextutils/0.6.1-goolf-1.7.20
    ligtextutils is a dependency of fastx-toolkit and is provided via the same upstream - Homepage: http://hannonlab.cshl.edu/fastx_toolkit/ 

  libStatGen: libStatGen/1.0.13-goolf-1.7.20
    Useful set of classes for creating statistical genetic programs. - Homepage: http://genome.sph.umich.edu/wiki/C%2B%2B_Library:_libStatGen 

  libtool: libtool/2.4.5-GCC-4.8.4
    GNU libtool is a generic library support script. Libtool hides the complexity of using shared libraries behind a consistent, portable interface.
    - Homepage: http://www.gnu.org/software/libtool 

  M4: M4/1.4.17-GCC-4.8.4
    GNU M4 is an implementation of the traditional Unix macro processor. It is mostly SVR4 compatible although it has some extensions (for example,
    handling more than 9 positional parameters to macros). GNU M4 also has built-in functions for including files, running shell commands, doing
    arithmetic, etc. - Homepage: http://www.gnu.org/software/m4/m4.html 

  mdtest: mdtest/1.9.3-goolf-1.7.20
    mdtest is an MPI-coordinated metadata benchmark test that performs open/stat/close operations on files and directories and then reports the
    performance. - Homepage: http://sourceforge.net/projects/mdtest/ 

  miRDeep: miRDeep/2.0.0.7
    miRDeep2 is a completely overhauled tool which discovers microRNA genes by analyzing sequenced RNAs. The tool reports known and hundreds of
    novel microRNAs with high accuracy in seven species representing the major animal clades. - Homepage: https://www.mdc-berlin.de/8551903/en/ 

  muTect: muTect/1.1.4-Java-1.7.0_67
    MuTect is a method developed at the Broad Institute for the reliable and accurate identification of somatic point mutations in next generation
    sequencing data of cancer genomes. - Homepage: http://www.broadinstitute.org/cancer/cga/mutect 

  numactl: numactl/2.0.10-GCC-4.8.4
    The numactl program allows you to run your application program on specific cpu's and memory nodes. It does this by supplying a NUMA memory
    policy to the operating system before running your program. The libnuma library provides convenient ways for you to add NUMA memory policies
    into your own program. - Homepage: http://oss.sgi.com/projects/libnuma/ 

  numpy: numpy/1.9.1-goolf-1.7.20-Python-2.7.9
    NumPy is the fundamental package for scientific computing with Python. It contains among other things: a powerful N-dimensional array object,
    sophisticated (broadcasting) functions, tools for integrating C/C++ and Fortran code, useful linear algebra, Fourier transform, and random
    number capabilities. Besides its obvious scientific uses, NumPy can also be used as an efficient multi-dimensional container of generic data.
    Arbitrary data-types can be defined. This allows NumPy to seamlessly and speedily integrate with a wide variety of databases. - Homepage:
    http://www.numpy.org 

  OpenBLAS: OpenBLAS/0.2.13-GCC-4.8.4-LAPACK-3.5.0
    OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version. - Homepage: http://xianyi.github.com/OpenBLAS/ 

  OpenMPI: OpenMPI/1.8.4-GCC-4.8.4
    The Open MPI Project is an open source MPI-2 implementation. - Homepage: http://www.open-mpi.org/ 

  openmpi-x86_64: openmpi-x86_64

  opt-python: opt-python

  Perl: Perl/5.20.1-goolf-1.7.20
    Larry Wall's Practical Extraction and Report Language - Homepage: http://www.perl.org/ 

  picard: picard/1.119
    A set of tools (in Java) for working with next generation sequencing data in the BAM format. - Homepage: http://sourceforge.net/projects/picard 

  PLINK: PLINK/1.07, PLINK/1.90b
    plink-1.07-src: Whole-genome association analysis toolset - Homepage: http://pngu.mgh.harvard.edu/~purcell/plink/ 

  PLINKSEQ: PLINKSEQ/0.10-goolf-1.7.20
    PLINK/SEQ is an open-source C/C++ library for working with human genetic variation data. The specific focus is to provide a platform for
    analytic tool development for variation data from large-scale resequencing and genotyping projects, particularly whole-exome and whole-genome
    studies. It is independent of (but designed to be complementary to) the existing PLINK package. - Homepage:
    https://atgu.mgh.harvard.edu/plinkseq/ 

  protobuf: protobuf/2.5.0-goolf-1.7.20
    Google Protocol Buffers - Homepage: https://code.google.com/p/protobuf/ 

  pyfasta: pyfasta/0.5.2-goolf-1.7.20-Python-2.7.9
    fast, memory-efficient, pythonic (and command-line) access to fasta sequence files - Homepage: https://github.com/brentp/pyfasta/ 

  pysam: pysam/0.8.1-goolf-1.7.20-Python-2.7.9
    Pysam is a python module for reading and manipulating Samfiles. It's a lightweight wrapper of the samtools C-API. Pysam also includes an
    interface for tabix. - Homepage: https://github.com/pysam-developers/pysam 

  Python: Python/2.7.9-goolf-1.7.20
    Python is a programming language that lets you work more quickly and integrate your systems more effectively. - Homepage: http://python.org/ 

  Qualimap: Qualimap/2.0.2
    Qualimap 2 is a platform-independent application written in Java and R that provides both a Graphical User Inteface (GUI) and a command-line
    interface to facilitate the quality control of alignment sequencing data and its derivatives like feature counts. - Homepage:
    http://qualimap.bioinfo.cipf.es/ 

  Queue: Queue/3.3-0-Java-1.7.0_67
    GATK-Queue is command-line scripting framework for defining multi-stage genomic analysis pipelines combined with an execution manager that runs
    those pipelines from end-to-end. - Homepage: https://www.broadinstitute.org/gatk/guide/article?id=1306 

  R: R/3.1.2-goolf-1.7.20
    R is a free software environment for statistical computing and graphics. - Homepage: http://www.r-project.org/ 

  rocks-openmpi: rocks-openmpi

  rocks-openmpi_ib: rocks-openmpi_ib

  SAMSTAT: SAMSTAT/1.5-goolf-1.7.20
    Displaying sequence statistics for next generation sequencing - Homepage: http://samstat.sourceforge.net/ 

  SAMtools: SAMtools/0.1.18-goolf-1.7.20, SAMtools/0.1.19-goolf-1.7.20, SAMtools/1.1-goolf-1.7.20, SAMtools/1.2-goolf-1.7.20
    SAM Tools provide various utilities for manipulating alignments in the SAM format, including sorting, merging, indexing and generating
    alignments in a per-position format. - Homepage: http://samtools.sourceforge.net/ 

  Scala: Scala/2.10.5-Java-1.7.0_67, Scala/2.11.6-Java-1.7.0_67
    General purpose programming language designed to express common programming patterns in a concise, elegant, and type-safe way. - Homepage:
    http://www.scala-lang.org/ 

  ScaLAPACK: ScaLAPACK/2.0.2-gompi-1.7.20-OpenBLAS-0.2.13-LAPACK-3.5.0
    The ScaLAPACK (or Scalable LAPACK) library includes a subset of LAPACK routines redesigned for distributed memory MIMD parallel computers. -
    Homepage: http://www.netlib.org/scalapack/ 

  scipy: scipy/0.15.1-goolf-1.7.20-Python-2.7.9
    SciPy is a collection of mathematical algorithms and convenience functions built on the Numpy extension for Python. - Homepage:
    http://www.scipy.org 

  snpEff: snpEff/4.0-Java-1.7.0_67, snpEff/4.1-Java-1.7.0_67
    SnpEff is a variant annotation and effect prediction tool. It annotates and predicts the effects of genetic variants (such as amino acid
    changes). This package also includes SnpSift - Homepage: http://snpeff.sourceforge.net/ 

  tabix: tabix/0.2.6-goolf-1.7.20
    Generic indexer for TAB-delimited genome position files - Homepage: http://samtools.sourceforge.net 

  TopHat: TopHat/2.0.13-goolf-1.7.20
    TopHat is a fast splice junction mapper for RNA-Seq reads. - Homepage: http://ccb.jhu.edu/software/tophat/ 

  Trinity: Trinity/2.0.4-goolf-1.7.20
    Trinity assembles transcript sequences from Illumina RNA-Seq data. - Homepage: http://trinityrnaseq.github.io/ 

  VCFtools: VCFtools/0.1.12-goolf-1.7.20-Perl-5.20.1
    The aim of VCFtools is to provide methods for working with VCF files: validating, merging, comparing and calculate some basic population genetic
    statistics. - Homepage: http://vcftools.sourceforge.net/ 

  Velvet: Velvet/1.2.10-goolf-1.7.20-mt-kmer_31, Velvet/1.2.10-goolf-1.7.20-mt-kmer_57, Velvet/1.2.10-goolf-1.7.20-mt-kmer_63
    Sequence assembler for very short reads - Homepage: http://www.ebi.ac.uk/~zerbino/velvet/ 

  ViennaRNA: ViennaRNA/1.8.4-goolf-1.7.20, ViennaRNA/2.1.8-goolf-1.7.20
    The Vienna RNA Package consists of a C code library and several stand-alone programs for the prediction and comparison of RNA secondary
    structures. - Homepage: http://www.tbi.univie.ac.at/~ronny/RNA/vrna2.html 

  Xerces-C++: Xerces-C++/3.1.1-goolf-1.7.20
    Xerces-C++ is a validating XML parser written in a portable subset of C++. Xerces-C++ makes it easy to give your application the ability to read
    and write XML data. A shared library is provided for parsing, generating, manipulating, and validating XML documents using the DOM, SAX, and
    SAX2 APIs. - Homepage: http://xerces.apache.org/xerces-c/

19. Which versions are available for those installed modules (as of March 2015)?

---------------------------------------------------------- /home/soft/modules/bio -----------------------------------------------------------
   BamTools/2.2.3-goolf-1.7.20                     libgtextutils/0.6.1-goolf-1.7.20
   BamTools/2.3.0-goolf-1.7.20              (D)    libStatGen/1.0.13-goolf-1.7.20
   bamUtil/1.0.13-goolf-1.7.20                     miRDeep/2.0.0.7
   BEDTools/2.17.0-goolf-1.7.20                    muTect/1.1.4-Java-1.7.0_67
   BioPerl/1.6.923-goolf-1.7.20-Perl-5.20.1        picard/1.119
   Biopython/1.64-goolf-1.7.20-Python-2.7.9        PLINK/1.07
   BLAST/2.2.26-Linux_x86_64                       PLINK/1.90b                              (D)
   BLAST+/2.2.30-goolf-1.7.20                      PLINKSEQ/0.10-goolf-1.7.20
   Bowtie/0.12.7                                   pyfasta/0.5.2-goolf-1.7.20-Python-2.7.9
   Bowtie/1.0.0-goolf-1.7.20                       pysam/0.8.1-goolf-1.7.20-Python-2.7.9
   Bowtie/1.1.1                             (D)    Qualimap/2.0.2
   Bowtie2/2.1.0-goolf-1.7.20                      Queue/3.3-0-Java-1.7.0_67
   Bowtie2/2.2.2-goolf-1.7.20               (D)    SAMSTAT/1.5-goolf-1.7.20
   BWA/0.7.10-goolf-1.7.20                         SAMtools/0.1.18-goolf-1.7.20
   BWA/0.7.11-goolf-1.7.20                         SAMtools/0.1.19-goolf-1.7.20
   BWA/0.7.12-goolf-1.7.20                  (D)    SAMtools/1.1-goolf-1.7.20
   Crass/0.3.12-goolf-1.7.20                       SAMtools/1.2-goolf-1.7.20                (D)
   Cufflinks/2.2.1-goolf-1.7.20                    snpEff/4.0-Java-1.7.0_67
   cutadapt/1.7.1-goolf-1.7.20-Python-2.7.9        snpEff/4.1-Java-1.7.0_67                 (D)
   EPACTS/3.2.6-goolf-1.7.20                       tabix/0.2.6-goolf-1.7.20
   FastQC/0.11.2-Java-1.7.0_67                     TopHat/2.0.13-goolf-1.7.20
   FASTX-Toolkit/0.0.14-goolf-1.7.20               Trinity/2.0.4-goolf-1.7.20
   freebayes/0.9.18-goolf-1.7.20                   VCFtools/0.1.12-goolf-1.7.20-Perl-5.20.1
   GATK/2.8-1-Java-1.7.0_67                        Velvet/1.2.10-goolf-1.7.20-mt-kmer_31
   GATK/3.3-0-Java-1.7.0_67                 (D)    Velvet/1.2.10-goolf-1.7.20-mt-kmer_57
   HMMER/3.1b1-goolf-1.7.20                        Velvet/1.2.10-goolf-1.7.20-mt-kmer_63    (D)
   HTSeq/0.6.1p1-goolf-1.7.20-Python-2.7.9         ViennaRNA/1.8.4-goolf-1.7.20
   HTSlib/1.1-goolf-1.7.20                         ViennaRNA/2.1.8-goolf-1.7.20             (D)
   IGV/2.3.40-Java-1.7.0_67

-------------------------------------------------------- /home/soft/modules/compiler --------------------------------------------------------
   GCC/4.7.2    GCC/4.8.4 (D)    GCC/4.9.2

--------------------------------------------------------- /home/soft/modules/devel ----------------------------------------------------------
   Autoconf/2.69-GCC-4.8.4    Boost/1.55.0-goolf-1.7.20-Python-2.7.9        CMake/2.8.12-GCC-4.8.4    protobuf/2.5.0-goolf-1.7.20
   Automake/1.15-GCC-4.8.4    Boost/1.55.0-goolf-1.7.20              (D)    M4/1.4.17-GCC-4.8.4

---------------------------------------------------------- /home/soft/modules/lang ----------------------------------------------------------
   Java/1.7.0_67               Python/2.7.9-goolf-1.7.20    Scala/2.10.5-Java-1.7.0_67
   Perl/5.20.1-goolf-1.7.20    R/3.1.2-goolf-1.7.20         Scala/2.11.6-Java-1.7.0_67 (D)

---------------------------------------------------------- /home/soft/modules/lib -----------------------------------------------------------
   libtool/2.4.5-GCC-4.8.4    Xerces-C++/3.1.1-goolf-1.7.20

---------------------------------------------------------- /home/soft/modules/math ----------------------------------------------------------
   Eigen/3.1.1-goolf-1.7.20    numpy/1.9.1-goolf-1.7.20-Python-2.7.9    scipy/0.15.1-goolf-1.7.20-Python-2.7.9

---------------------------------------------------------- /home/soft/modules/mpi -----------------------------------------------------------
   OpenMPI/1.8.4-GCC-4.8.4

--------------------------------------------------------- /home/soft/modules/numlib ---------------------------------------------------------
   FFTW/3.3.4-gompi-1.7.20    OpenBLAS/0.2.13-GCC-4.8.4-LAPACK-3.5.0
   GSL/1.16-goolf-1.7.20      ScaLAPACK/2.0.2-gompi-1.7.20-OpenBLAS-0.2.13-LAPACK-3.5.0

-------------------------------------------------- /home/soft/modules/rockscluster_modules --------------------------------------------------
   openmpi-x86_64    opt-python    rocks-openmpi    rocks-openmpi_ib

--------------------------------------------------------- /home/soft/modules/system ---------------------------------------------------------
   hwloc/1.10.1-GCC-4.8.4

------------------------------------------------------- /home/soft/modules/toolchain --------------------------------------------------------
   gompi/1.7.20    goolf/1.7.20

--------------------------------------------------------- /home/soft/modules/tools ----------------------------------------------------------
   cURL/7.40.0         EasyBuild/2.0.0               (D)    mdtest/1.9.3-goolf-1.7.20
   EasyBuild/1.16.2    IOR/2.10.3-goolf-1.7.20-mpiio        numactl/2.0.10-GCC-4.8.4

---------------------------------------------------------- /home/soft/modules/vis -----------------------------------------------------------
   gnuplot/4.6.0-goolf-1.7.20

  Where:
   (D):  Default Module

20. How to use the program modules?

To load a specific version of a program (e.g. Bowtie):

module load Bowtie/1.1.1


When you load a module of a program (Bowtie, in this example), that version of Bowtie is automatically added to your session's PATH. If you switch to another version of the module, the new version of the program is placed in the PATH instead. Example:

[soft@vhircluster]$ module load Bowtie/1.1.1

[soft@vhircluster]$ bowtie --version
bowtie version 1.1.1

[soft@vhircluster]$ module load Bowtie/0.12.7
The following have been reloaded with a version change:
  1) Bowtie/1.1.1 => Bowtie/0.12.7

[soft@vhircluster]$ bowtie --version
bowtie version 0.12.7


To display the list of modules that you have loaded at any time in the current session:

module list


To unload all modules:

module purge
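
To unload a single module while keeping the others loaded (using Bowtie as an example):

module unload Bowtie/1.1.1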


All this module management can be (and is intended to be) done through scripts sent to the job queue system. See above how to #Send_Jobs.

21. Which libraries have been used to compile the software modules?

As of March 16th, 2015, these are the available libraries, which were used to compile the software modules:

GCC/4.8.4
OpenMPI/1.8.4-GCC-4.8.4
OpenBLAS/0.2.13-GCC-4.8.4-LAPACK-3.5.0
FFTW/3.3.4-gompi-1.7.20
ScaLAPACK/2.0.2-gompi-1.7.20-OpenBLAS-0.2.13-LAPACK-3.5.0

22. Which programming languages are available?

As of March 16th, 2015, these are the main programming languages available:

Perl/5.20.1
Python/2.7.9
R/3.1.2
Java/1.7.0_67


You can see an updated list by running this command in a terminal at the front-end:

module av lang

23. How to have a new custom program installed at the Cluster?

You can try to install your program in your home folder at the front-end node.

If you require your program to be installed elsewhere, or system-wide for some reason, please send your request by email to the UEB: ueb at vhir.org. In that case, you will find the program installed by the UEB in this folder after you connect to the head node (unless explicitly indicated otherwise):

/export/apps/

24. Where are the new custom programs located?

Programs which are not added as 'modules' (see above) are located in a specific folder at the Cluster, and the path to that location depends on whether you are on the head node or on a compute node.

  1. Head node:
    • System programs are available from any folder.
    • Additionally installed custom programs are located by default under this folder:
      /export/apps/
  2. Compute nodes:
    • System programs are available from any folder.
    • Additionally installed custom programs are located by default, in compute nodes, under this folder (see the example below):
      /share/apps/
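
For instance, a job script to be run on a compute node would call a custom program through the /share/apps path (a sketch; FastQC and the input file name are used only as an illustration, and the next question shows how to check what is actually installed there):

Contents of fastqc_job.sh
#!/bin/bash

#$ -cwd
/share/apps/FastQC/fastqc mysample.fastq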

25. Why can't I see the custom programs located there?

The folder where the custom programs are located is mounted on demand in the compute nodes, so a simple command listing the programs under that folder might not be enough to show them. To ensure that the program you requested is available for your scripts, try these steps (replace "myclusteruser" with your own cluster username):

  1. Connect to the head node as usual:
    mylocaluser@mycomputer:~$ ssh myclusteruser@172.18.50.16
    Password: 
    Last login: Tue Mar  3 15:38:52 2015 from mymachinename.ir.vhebron.net
    Rocks 6.1.1 (Sand Boa)
    Profile built 15:21 05-Nov-2014
    
    [myclusteruser@vhircluster ~]$

  2. start an interactive shell in a compute node as usual:
    [myclusteruser@vhircluster ~]$ qrsh
    [myclusteruser@compute-0-4 /]$

  3. run this command in this interactive shell at the compute node you are connected to:
    [myclusteruser@compute-0-4 /]$ ls -l /share/apps
    total 20
    drwxr-xr-x 8 root root 4096 Feb  3 11:33 FastQC
    -rwxr-xr-x 1 root root  652 Nov 24 16:50 install_R_pkgs.sh
    drwxr-xr-x 2 root root 4096 Nov 24 18:01 logs
    drwxr-xr-x 2 root root 4096 Nov 23 13:20 regular-R-pkgs
    drwxr-xr-x 2 root root 4096 Nov 23 13:20 special-R-pkgs
    [myclusteruser@compute-0-4 /]$
    • In this case, you can see that the only program installed at the time of this writing was FastQC, plus some R packages.
      Please note that a simple ls /share or ls /share/apps might not trigger the mount of that folder, so it would seem as if it were empty.

26. How to have a new big database installed at the Cluster?

Databases (such as the ones related to genomes, variants, etc.) and other big files that your program may need might already be available in a shared folder at:

/export/apps/data/


If you require a new big database to be placed there, because you consider that it might be of interest to other cluster users, please send your request by email to the UEB: ueb at vhir.org

27. Why doesn't wget work at the compute nodes?

This is by design, to prevent internet access from the compute nodes. Users connect to the Head Node and download there, with wget or similar tools, whatever they need from the Internet into the folder shared between the head and compute nodes. Users then adapt their scripts so that the compute nodes use that information from there.

If your program needs to access the internet, you need to adapt it to use those files from disk instead: at the front-end node, fetch those files from the internet with wget and place them in a subfolder of /export/apps/. Then you can tell your program, which will run on the compute nodes, to use those files from /share/apps/.
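
For example (a sketch; the subfolder, URL and program name are illustrative):

Command run in a terminal at the front-end node
wget -P /export/apps/data/myproject http://example.org/reference.fa

Line in the job script run at the compute nodes
myprogram --reference /share/apps/data/myproject/reference.fa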

28. Why do I see R version 3.1.1 if the module list says 3.1.2 or newer?

You are loading the system R directly on the frontend node. You need to load the R module first, so that you use the module version of R installed in the cluster, which comes with the extra R packages available.

And remember to launch an interactive shell first (with qrsh, see above) so that you use the compute nodes and not the head node directly.
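
In short (a sketch; the compute node you land on may differ):

Commands run in a terminal at the front-end node
[username@vhircluster ~]$ qrsh
[username@compute-0-4 /]$ module load R
[username@compute-0-4 /]$ R --version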
