A Cyberinfrastructure platform to meet the needs of data intensive radio astronomy on route to the SKA

ADASS 2011 - Day 2

Benchmarking the CRBLASTER computational Framework on a 350-MHz 49-core Maestro Development Board

High-performance computing for under 30 W; GPUs use ~300 W.
RAW processor: multicore processors arranged in meshes.

Cosmic rays are normally removed by statistical analysis of a stack of images. What about a single image, say of a gamma-ray burst, where the key point is that you can't repeat the observation? L.A.COSMIC (van Dokkum 2001) can get rid of them using a high-pass filter; what remains is the useful information. The original L.A.COSMIC running in IRAF is slow: 100 s for an 800x800 WFPC2 image. Solution: a C version running in parallel, since the problem is embarrassingly parallel.
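
A minimal sketch of the single-image idea, not the L.A.COSMIC or CRBLASTER algorithm itself (L.A.COSMIC uses Laplacian edge detection). Here a 3x3 median stands in for the smoothing step, "pixel minus median" acts as a crude high-pass filter, and pixels that stand out above a threshold are replaced. The function names, the threshold and the row-band splitting are all illustrative assumptions. Each band of rows is cleaned independently of the others, which is what makes the problem embarrassingly parallel; a thread pool is used only to keep the sketch short.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import median


def clean_band(job):
    """Clean rows [start, stop) of the image (hypothetical helper)."""
    image, start, stop, threshold = job
    n_rows, n_cols = len(image), len(image[0])
    cleaned = []
    for r in range(start, stop):
        row = []
        for c in range(n_cols):
            # 3x3 neighbourhood, clipped at the image edges
            neighbours = [image[rr][cc]
                          for rr in range(max(r - 1, 0), min(r + 2, n_rows))
                          for cc in range(max(c - 1, 0), min(c + 2, n_cols))]
            smooth = median(neighbours)
            high_pass = image[r][c] - smooth  # only sharp features survive
            row.append(smooth if high_pass > threshold else image[r][c])
        cleaned.append(row)
    return cleaned


def remove_spikes(image, threshold, n_bands=4):
    """Split the image into row bands and clean each band independently."""
    n_rows = len(image)
    step = max(1, -(-n_rows // n_bands))  # ceiling division
    jobs = [(image, start, min(start + step, n_rows), threshold)
            for start in range(0, n_rows, step)]
    with ThreadPoolExecutor() as pool:
        bands = list(pool.map(clean_band, jobs))
    return [row for band in bands for row in band]


# Flat 5x5 image with one single-pixel spike, as a stand-in for a cosmic-ray hit
image = [[10.0] * 5 for _ in range(5)]
image[2][2] = 1000.0
cleaned = remove_spikes(image, threshold=50.0)
```

Every band only reads the input image and writes its own rows, so the same split could be handed to separate processes or cores without any communication between workers.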

State-of-the-art flight computer: BAE Systems' RAD750. Low power (<11 W), but slow (132 MHz) and expensive ($150,000). It is slow because it is radiation hardened.
49-core RHBD MAESTRO processor.

Unleashing the power of distributed CPU/GPU Architectures: Massive Astronomical Data Analysis and Visualization case study

grep 1 MB - 1 sec
grep 1 GB - 1 min
and so on...

AAOGlimpse - fun with OpenGL and FITS

A FITS image display program
"it's fun to look on the underneath of your data"
"how many FITS viewers can fly an imperial star cruiser"
Looks very useful for 3D data cubes.
Can use sockets to write plugins for fitting etc.
Uses libjpeg.
OS X executable.
Port to Linux or Windows should be straightforward. Source will be made available.

GPUs and Python: a recipe for lightning-Fast Data pipelines

PyCUDA: the C API allows CUDA code to be easily integrated into existing Python data pipeline frameworks.
PyCUDA SourceModule allows CUDA code to be compiled and easily linked into Python code.
The CUDA code is compiled at import time and can be called as a Python method.
Use Python's native C API to use Thrust, via libcuda.
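
A minimal sketch of the SourceModule pattern described above, not code from the talk. The kernel, the name scale_pixels and the NumPy fallback are assumptions for illustration; the fallback is there only so the example still runs on a machine without PyCUDA, a GPU or nvcc.

```python
import numpy as np

KERNEL_SOURCE = """
__global__ void scale(float *data, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}
"""

try:
    import pycuda.autoinit  # creates a CUDA context when imported
    import pycuda.driver as cuda
    from pycuda.compiler import SourceModule

    # Compiled once, when this module is imported
    _scale_kernel = SourceModule(KERNEL_SOURCE).get_function("scale")
except Exception:  # PyCUDA, a GPU or nvcc is not available
    _scale_kernel = None


def scale_pixels(pixels, factor):
    """Multiply every pixel by factor, on the GPU when one is available."""
    data = np.array(pixels, dtype=np.float32)  # fresh contiguous copy
    if _scale_kernel is None:
        return data * np.float32(factor)  # CPU fallback (assumption)
    threads = 256
    blocks = (data.size + threads - 1) // threads
    _scale_kernel(cuda.InOut(data), np.float32(factor), np.int32(data.size),
                  block=(threads, 1, 1), grid=(blocks, 1))
    return data


scaled = scale_pixels([1.0, 2.0, 3.0], 2.0)
```

From the pipeline's point of view scale_pixels is an ordinary Python function, which is what makes it easy to slot GPU steps into an existing framework.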

The astronomical virtual observatory: lessons learnt, looking forward

This talk is taking a European view.
VO: enable seamless access to the wealth of astronomical resources
IVOA: interoperability standards; procedure adapted from the W3C.
Inclusive definition: data centres populate the VO with data and services; service to the community, added value, sustainability, quality. About 80 data centres answered the census of European data centres (Euro VO-DCA, Euro VO-AIDA, 2009, 2010).


The IVOA Architecture

Each archive and each service has its own data model and access interface. The idea of the VO is to hide the specifics of each archive and service and offer a common framework for access and usage.

Mentioned: Astronomical Data Query Language (ADQL), Table Access Protocol (TAP), and Simple Application Messaging Protocol (SAMP), which enables interconnectivity between VO applications.

See IVOA documents
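
As a rough illustration of how ADQL and TAP fit together, the sketch below builds the URL for a synchronous TAP query. The REQUEST, LANG and QUERY parameters come from the TAP standard; the service address and the table name are hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def tap_sync_url(base_url, adql):
    """Build the URL for a synchronous TAP query (helper name is illustrative)."""
    params = {"REQUEST": "doQuery", "LANG": "ADQL", "QUERY": adql}
    return base_url.rstrip("/") + "/sync?" + urlencode(params)


# Hypothetical service and table
adql = "SELECT TOP 5 ra, dec FROM my_catalogue"
url = tap_sync_url("http://tap.example.org/tap", adql)

# Decoding the query string recovers the original ADQL text
print(parse_qs(urlsplit(url).query)["QUERY"][0])
```

Because every TAP service accepts the same parameters, the same helper works against any archive; only the base URL and the table names change.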


US VO efforts


Plans: develop a dedicated VAO Portal (data discovery tool), scalable cross-matching, SEDs, time-domain astronomy, data linking and semantic astronomy, desktop tool integration, data mining and statistical analysis.

TAP client: Seleste. SED tool: IRIS, which uses Sherpa and lets you use data from NED or upload your own. Positional cross-matching: you can upload your own table; large pre-matched catalogues for the common ones will be available soon.
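
To make "positional cross-matching" concrete, here is a brute-force sketch: for each source in one table, find the nearest source in the other within a search radius. This is not the VAO implementation; a scalable service would use a spatial index instead of comparing every pair. The function names and the example coordinates are made up for illustration.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def cross_match(cat_a, cat_b, radius_arcsec):
    """Return (i, j, sep_arcsec) for the nearest cat_b source to each cat_a source."""
    matches = []
    for i, (ra_a, dec_a) in enumerate(cat_a):
        best = None
        for j, (ra_b, dec_b) in enumerate(cat_b):
            sep = angular_separation_deg(ra_a, dec_a, ra_b, dec_b) * 3600.0
            if sep <= radius_arcsec and (best is None or sep < best[2]):
                best = (i, j, sep)
        if best is not None:
            matches.append(best)
    return matches


# (RA, Dec) in degrees; made-up positions
uploaded = [(10.0, 20.0), (50.0, -30.0)]
reference = [(10.0001, 20.0001), (200.0, 5.0)]
matches = cross_match(uploaded, reference, radius_arcsec=1.0)
```

The pairwise loop costs one comparison per pair of sources, which is why scalability is listed as a goal in its own right.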

VO-IRAF integration: some 700 IRAF tasks will become VO-aware using SAMP. This will be part of the next IRAF release.

Data mining using DAME.


Modeling physical quantities, measurement sets and theories

A data model provides domain-specific concepts. It characterizes a family of datasets and is an instance of a meta-model, possibly a theory. Example: the schema of a database.
A data format has no associated self-describing language; examples are XML with no schema, HTML, or FITS. Such formats are widely used for data exchange. The semantics are carried by documentation, as there is no way to express constraints, so custom code is needed. A measurement set is a set of concrete concepts at different levels.
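
One way to picture the distinction, as a sketch rather than anything shown in the talk: a bare record carries structure only, while a small model class carries its constraints with it. The Measurement class, its fields and its rules are invented for illustration.

```python
from dataclasses import dataclass

# "Data format" style: structure only. Nothing stops a negative error,
# so every consumer needs its own custom checks.
raw_record = {"quantity": "flux", "value": 1.2e-3, "error": -5.0e-5, "unit": "Jy"}


@dataclass(frozen=True)
class Measurement:
    """"Data model" style: the constraints are part of the definition."""
    quantity: str
    value: float
    error: float
    unit: str

    def __post_init__(self):
        if self.error < 0:
            raise ValueError("error must be non-negative")
        if not self.unit:
            raise ValueError("unit is required")


def load(record):
    """Turn a bare record into a validated Measurement."""
    return Measurement(**record)


try:
    load(raw_record)
    rejected = False
except ValueError:
    rejected = True  # the model catches what the bare format cannot

valid = load({**raw_record, "error": 5.0e-5})
```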


The challenges of new observing and operating modes at ground based optical observatories

Operations cost a lot of money; with new technology you can get more science for less money. How: by placing the users first.

Move to queue and service astronomy, and add data pre-processing.

Visibility is not everything

ALMA data rates: 6 MB/s average, 60 MB/s peak.
About 0.1 MB/s is metadata or auxiliary data.
Metadata describes the observation, visibilities, calibration and auxiliary data, e.g. projection number, spectral setup.
Auxiliary data: data required to process and analyze the primary data, e.g. pointing.
ALMA Science Data Model (SDM)
The data model must contain all the information necessary for the astronomical processing of raw data; this is supported by CASA. It is a set of tables represented (mostly) as XML documents.
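
For a sense of scale, the ALMA rates quoted above can be turned into volumes. This is back-of-the-envelope arithmetic only; it assumes the average rate is sustained for a full 24 hours, which is an upper bound rather than a figure from the talk.

```python
AVERAGE_RATE_MB_S = 6.0    # average data rate from the talk
PEAK_RATE_MB_S = 60.0      # peak data rate from the talk
METADATA_RATE_MB_S = 0.1   # metadata plus auxiliary data
SECONDS_PER_DAY = 24 * 60 * 60

# Assumption: observing around the clock at the average rate; 1 GB = 1000 MB
daily_volume_gb = AVERAGE_RATE_MB_S * SECONDS_PER_DAY / 1000.0
metadata_fraction = METADATA_RATE_MB_S / AVERAGE_RATE_MB_S
peak_to_average = PEAK_RATE_MB_S / AVERAGE_RATE_MB_S

print(f"{daily_volume_gb:.1f} GB/day at the average rate")
print(f"metadata and auxiliary data: {metadata_fraction:.1%} of the stream")
print(f"peak is {peak_to_average:.0f}x the average")
```

That is roughly 518 GB per day, with metadata and auxiliary data making up under 2% of the stream by volume.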