Tutorial:RunningSimulations

Revision as of 11:52, 8 May 2005

General overview

The data structures and files used in the workflow are illustrated in the following figure:

Figure: The workflow of an ALPS simulation, illustrated through data structures (yellow) and tools (blue).


Running a Monte Carlo simulation

In the ALPS library, Monte Carlo simulations are based on the scheduler library, which lets you specify the parameters of your simulations, including several values for the same parameter (e.g. if you want to simulate a physical system at several temperatures). The scheduler library then starts a job for every parameter set, on either a serial or a parallel machine, and uses checkpoints to prevent data loss when a machine's walltime limit is exceeded. The scheduler library takes a job file as input, which lists one task file for each set of parameters for which a Monte Carlo simulation is to be run. The job and task files are in XML format, following the schema at http://xml.comp-phys.org. The scheduler reads these files and writes the measured observables into the task files. An example job file could look like this:

<JOB>
  <OUTPUT file="parm.xml"/>
  <TASK status="new">
    <INPUT file="parm.task1.in.xml"/>
    <OUTPUT file="parm.task1.xml"/>
  </TASK>
  <TASK status="new">
    <INPUT file="parm.task2.in.xml"/>
    <OUTPUT file="parm.task2.xml"/>
  </TASK>
  <TASK status="new">
    <INPUT file="parm.task3.in.xml"/>
    <OUTPUT file="parm.task3.xml"/>
  </TASK> 
</JOB>

and an example task file like:

<SIMULATION>
  <PARAMETERS>
    <PARAMETER name="L">100</PARAMETER>
    <PARAMETER name="SWEEPS">10000</PARAMETER>
    <PARAMETER name="T">0.5</PARAMETER>
    <PARAMETER name="THERMALIZATION">100</PARAMETER>
    <PARAMETER name="WORK_FACTOR">SWEEPS * L</PARAMETER>
  </PARAMETERS> 
</SIMULATION>

Before a simulation starts, the task file just lists all simulation parameters. Afterwards results and checkpoint information will be added. See the schema documentation for more details.

Tools

Since the XML format of the job and task files is probably not something you want to deal with on a daily basis, the parameter2xml tool lets you specify the simulation parameters in a plain text file, which is then converted to the XML format for your convenience.

parameter2xml

The parameter2xml tool transforms a plain text parameter file into the above XML format, thereby creating the job file and all necessary task files. The parameter file consists of a number of parameter assignments of the form:

MODEL="Ising";
SWEEPS=1000;
THERMALIZATION=100; 
WORK_FACTOR=[L*SWEEPS];
{ L=10; T=0.1; }
{ L=20; T=0.05; }

where each group of assignments inside a block of curly braces {...} defines the set of parameters for a single simulation. Assignments outside a block of curly braces are valid globally for all simulations defined after the point of assignment. Strings are given in double quotes, as in "Ising".
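
The scoping rule above can be illustrated with a minimal sketch. The function expand_parameter_sets and the type ParameterSet are hypothetical names introduced for this illustration only; they are not part of ALPS and this is not how parameter2xml is implemented. The sketch assumes that every assignment ends with a semicolon, that blocks are not nested, and that values contain no semicolons. It does not evaluate expressions such as [L*SWEEPS].

```cpp
#include <cctype>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// One parameter set: name -> value, both kept as plain strings.
using ParameterSet = std::map<std::string, std::string>;

// Removes leading and trailing whitespace.
std::string trim(const std::string& s) {
  const char* ws = " \t\r\n";
  const std::size_t first = s.find_first_not_of(ws);
  if (first == std::string::npos) return "";
  const std::size_t last = s.find_last_not_of(ws);
  return s.substr(first, last - first + 1);
}

// Hypothetical helper (not an ALPS function): returns one ParameterSet per
// {...} block. Each block starts from a copy of the global assignments seen
// so far, so a global assignment made after a block does not affect it.
std::vector<ParameterSet> expand_parameter_sets(const std::string& text) {
  std::vector<ParameterSet> sets;
  ParameterSet globals;  // assignments made outside any block so far
  ParameterSet local;    // parameters of the block currently being read
  bool in_block = false;
  std::size_t i = 0;
  while (i < text.size()) {
    const char c = text[i];
    if (std::isspace(static_cast<unsigned char>(c)) || c == ';') { ++i; continue; }
    if (c == '{') {
      if (in_block) throw std::runtime_error("nested blocks are not supported");
      in_block = true;
      local = globals;  // inherit everything defined globally up to here
      ++i;
      continue;
    }
    if (c == '}') {
      if (!in_block) throw std::runtime_error("unmatched '}'");
      sets.push_back(local);
      in_block = false;
      ++i;
      continue;
    }
    // Read one assignment of the form NAME=VALUE;
    const std::size_t eq = text.find('=', i);
    const std::size_t end = text.find(';', i);
    if (eq == std::string::npos || end == std::string::npos || eq > end)
      throw std::runtime_error("expected NAME=VALUE;");
    const std::string name = trim(text.substr(i, eq - i));
    std::string value = trim(text.substr(eq + 1, end - eq - 1));
    if (name.empty()) throw std::runtime_error("empty parameter name");
    // Strings are given in double quotes; store them without the quotes.
    if (value.size() >= 2 && value.front() == '"' && value.back() == '"')
      value = value.substr(1, value.size() - 2);
    (in_block ? local : globals)[name] = value;
    i = end + 1;
  }
  if (in_block) throw std::runtime_error("missing '}'");
  return sets;
}
```

Applied to the parameter file shown above, this sketch yields two parameter sets. Both contain MODEL, SWEEPS, THERMALIZATION and WORK_FACTOR, and they differ only in L and T.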

Two parameters have a special meaning:

Parameter   | Default | Meaning
SEED        | 0       | The random number seed used for the next Monte Carlo run that is created. After a seed has been used to create a Monte Carlo run, this value is incremented by one.
WORK_FACTOR | 1       | A factor by which the work that needs to be done for a simulation is multiplied in load balancing.
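
The SEED rule can be pictured with the following minimal sketch. The function assign_seeds is a hypothetical name used only for illustration and is not an ALPS function; the sketch assumes nothing beyond the description above, namely that each newly created run takes the current seed and the seed is then incremented by one.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not an ALPS function): hands out seeds for
// `num_runs` newly created Monte Carlo runs. Each run receives the current
// value of `seed`, which is then incremented by one, so that no two runs
// share a seed. `seed` is updated in place for the next call.
std::vector<unsigned> assign_seeds(unsigned& seed, std::size_t num_runs) {
  std::vector<unsigned> seeds;
  seeds.reserve(num_runs);
  for (std::size_t r = 0; r < num_runs; ++r) {
    seeds.push_back(seed);  // seed used by this run
    ++seed;                 // next run gets the following seed
  }
  return seeds;
}
```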


The syntax to invoke parameter2xml is:

parameter2xml parameterfile [xmlfileprefix]

which converts the parameter file into a set of XML files whose names start with the prefix given as the optional second argument. The default for the second argument is the name of the parameter file.


Invoking the program

Running the simulation on a serial machine

The simulation is started by first creating the job files and then passing the name of the XML job file as an argument to the program. In our example, the program is called main and the sequence for running it is:

parameter2xml parm job 
main job.in.xml

The results will be stored in a file job.out.xml, which refers to the files job.task1.out.xml, job.task2.out.xml and job.task3.out.xml for the results of the three simulations.

Command line options

The program takes a number of command line options that control the behavior of the scheduler:

Option               | Default  | Description
-T timelimit         | infinity | The time (in seconds) for which the program should run before writing a final checkpoint and exiting.
-Tc checkpointtime   | 1800     | The time (in seconds) after which the program should write a checkpoint.
-Tmin checkingtime   | 60       | The minimum time (in seconds) that the scheduler waits before checking (again) whether a simulation is finished.
-Tmax checkingtime   | 900      | The maximum time (in seconds) that the scheduler waits before checking (again) whether a simulation is finished.
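
The following minimal sketch shows one way these four options could interact. All names in it (SchedulerTimes, next_check_interval, checkpoint_due, time_limit_reached) are hypothetical and not part of ALPS. In particular, how the scheduler actually chooses a waiting time between -Tmin and -Tmax is not described here; clamping an estimate of the remaining run time into that range is an assumption of this sketch, as is using a negative value to represent "no time limit".

```cpp
#include <algorithm>

// Hypothetical container for the timing options, with the documented
// defaults. A negative time_limit stands for "infinity" (sketch assumption).
struct SchedulerTimes {
  double time_limit = -1.0;         // -T    (seconds)
  double checkpoint_time = 1800.0;  // -Tc   (seconds)
  double min_check = 60.0;          // -Tmin (seconds)
  double max_check = 900.0;         // -Tmax (seconds)
};

// Assumed policy: wait for the estimated remaining time of a simulation,
// but never less than -Tmin and never more than -Tmax.
double next_check_interval(const SchedulerTimes& t, double estimated_remaining) {
  return std::max(t.min_check, std::min(estimated_remaining, t.max_check));
}

// A checkpoint is due once -Tc seconds have passed since the last one.
bool checkpoint_due(const SchedulerTimes& t, double since_last_checkpoint) {
  return since_last_checkpoint >= t.checkpoint_time;
}

// The program should write a final checkpoint and exit once -T is reached.
bool time_limit_reached(const SchedulerTimes& t, double elapsed) {
  return t.time_limit >= 0.0 && elapsed >= t.time_limit;
}
```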


Running the simulation on a parallel machine

Running the simulation on a parallel machine is as easy as running it on a single machine. We give an example using MPI. After starting the MPI environment (e.g. using lamboot for LAM MPI), you run the program in parallel using mpirun. For example, to run it on four processes:

parameter2xml parm job 
mpirun -np 4 main job.in.xml

Command line options

In addition to the command line options for the sequential program there are two more for the parallel program:

Option          | Default  | Description
-Nmin numprocs  | 1        | The minimum number of processes to assign to a simulation.
-Nmax numprocs  | infinity | The maximum number of processes to assign to a simulation.

If there are more processors available than simulations, more than one Monte Carlo run will be started for each simulation.

Analysing the results of a Monte Carlo simulation

During the Monte Carlo simulation, expectation values of several observables (specified and implemented in the simulation code) are measured and stored in the respective task files. The tools documented below archive the task files produced by a simulation and extract data from these files or from the archive.


Tools

convert2xml

The simulation output files only contain the collected measurements from all runs. Details about the individual Monte Carlo runs for each simulation can be obtained by converting the checkpoint files to XML, using the convert2xml tool, e.g.:

convert2xml run-file

This will produce an XML file of the task, containing the information extracted from this Monte Carlo run.

archivecat

The archivecat tool wraps the specified task files into an archive file.

archivecat task-file [task-file [task-file ...]]

extracttext

The extracttext script extracts data in the form of a plot from an archive or a set of task files. An input plot file in XML format specifies which observables should be extracted; see the example below. The output format is plain text.

extracttext plot-file archive-file 
extracttext plot-file task-file [task-file [task-file ...]]

extractxmgr

The extractxmgr script works similarly to extracttext, but produces output in the xmgrace plot format.

extractxmgr plot-file archive-file 
extractxmgr plot-file task-file [task-file [task-file ...]]

extracthtml

The extracthtml script works similarly to extracttext, but produces output in HTML format.

extracthtml plot-file archive-file 
extracthtml plot-file task-file [task-file [task-file ...]]


Examples

An example plot file describing a plot of energy versus temperature for all system sizes calculated could look like this:

<?xml version="1.0" encoding="UTF-8"?> 
<?xml-stylesheet type="text/xsl" href="http://xml.comp-phys.org/2003/4/plot2html.xsl"?>

<plot name="Energy versus temperature for some model">

  <legend show="true"/>
  <xaxis label="Temperature" type="PARAMETER" name="T"/>
  <yaxis label="Energy"      type="SCALAR_AVERAGE"/>

  <for-each name="SystemSize"/>

  <constraint name="Energy"  type="SCALAR_AVERAGE" condition="&lt;0" />

  <set label="start "/>

</plot>
       

A number of constraints are used in this example to filter data from the archive. The first, <for-each name="SystemSize"/>, describes a loop over all values of the specified parameter found in the archive: in this example, one set is generated for each system size. The second, <constraint name="Energy" type="SCALAR_AVERAGE" condition="&lt;0" />, restricts the energy range to negative values. For more details and further examples, go to the tool page (http://alps.comp-phys.org/software/ALPS/doc/tool/index.html).

Evaluation of observables

Examples

The following example reads the expectation values of the particle number operators n and n^2 from a simulation of a bosonic Hubbard model, calculates the compressibility and writes it back to the checkpoint.

#include <alps/scheduler.h>
#include <iostream>
 
void evaluate(const boost::filesystem::path& p, std::ostream& out) {
  alps::ProcessList nowhere;
  alps::scheduler::MCSimulation sim(nowhere,p);
 
  // read in parameters
  alps::Parameters parms=sim.get_parameters();
  // use beta if it is defined, otherwise fall back to 1/T
  double beta=parms.defined("beta") ? static_cast<double>(parms["beta"]) : (1./static_cast<double>(parms["T"]));
 
  // determine compressibility
  alps::RealObsevaluator n  = sim.get_measurements()["Particle number"];
  alps::RealObsevaluator n2 = sim.get_measurements()["Particle number^2"];
  alps::RealObsevaluator kappa= beta*(n2 - n*n);  
  kappa.rename("Compressibility");
 
  // write compressibility back to checkpoint  
  sim << kappa;
  sim.checkpoint(p);
}
 
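int main(int argc, char** argv)
{
  // the checkpoint file to evaluate is expected as the only argument
  if (argc < 2) {
    std::cerr << "Usage: " << argv[0] << " checkpoint-file\n";
    return 1;
  }
  alps::scheduler::SimpleFactory factory;
  alps::scheduler::init(factory);
  boost::filesystem::path p(argv[1],boost::filesystem::native);
  evaluate(p,std::cout);
  return 0;
}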
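
For reference, the quantity that evaluate computes in the example above can be written as follows. This merely restates the code: the angle brackets denote the Monte Carlo averages stored as "Particle number" and "Particle number^2", and beta falls back to 1/T (temperature in units where k_B = 1, as the code implies) when no beta parameter is defined.

```latex
\kappa = \beta \left( \langle n^2 \rangle - \langle n \rangle^2 \right),
\qquad
\beta = \frac{1}{T}
```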

(c) 2003-2005 by Simon Trebst and Synge Todo