IPS Portal

The IPS portal hosted on the NERSC Spin service, shows the progress and status of IPS runs on a variety of machines. The simulation configuration file and platform configuration file contain entries that allow the IPS to publish events to the portal.

On the top-level page, you will see information about each run including who ran it, the current status, physics time stamp, wall time, and a descriptive comment. From there you can click on a Run ID to see the details of that run, including calls on components, data movement events, task launches and finishes, and checkpoints.

To use the portal include

USE_PORTAL = True
PORTAL_URL = http://lb.ipsportal.production.svc.spin.nersc.org

The source code for the portal can be found one GitHub and issues can be reported using GitHub issues.

in either your Platform Configuration File or your Simulation Configuration File.

Tracing

Note

New in IPS-Framework 0.6.0

IPS has the ability to capture a trace of the workflow to allow analysis and visualizations. The traces are captured in the Zipkin Span format and viewed within IPS portal using Jaeger.

After selecting a run in the portal there will be a link to the trace:

../_images/run_list.png ../_images/trace_link.png

The default view is the Trace Timeline but other useful views are Trace Graph and Trace Statistic which can be selected from the menu in the top-right:

../_images/jaeger_options.png

The statistics can be further broken down by operation.

../_images/statistics_sub_group.png

Note

Self time (ST) is the total time spent in a span when it was not waiting on children. For example, a 10ms span with two 4ms non-overlapping children would have self-time = 10ms - 2 * 4ms = 2ms.

Child Runs

Note

New in IPS-Framework 0.7.0

If you have a workflow where you are running ips as a task of another IPS simulation you can create a relation between them that will allow it to be viewed together in the IPS-portal and get a single trace for the entire collection.

To setup the hierarchical structure between different IPS runs, so if one run starts other runs as a separate simulation, you can set the PARENT_PORTAL_RUNID parameter in the child simulation configuration. This can be done dynamically from the parent simulation like:

child_conf['PARENT_PORTAL_RUNID'] = self.services.get_config_param("PORTAL_RUNID")

This is automatically configured when running ips_dakota_dynamic.py.

The child runs will not appear on the main runs list but will appear on a tab next to the events.

../_images/child_runs.png

The trace of the primary simulation will contain the traces from all the simulations:

../_images/child_runs_trace.png